THE LETTER OF THE DAY IS...Y?
AUTHOR: MICHAEL RIMLER, GSK
For the past two years, I have been involved in the effort to integrate the R programming language into GSK’s clinical reporting pipeline. Recently, I reached out to programming leaders within GSK Biostatistics to try to understand why R uptake has been slower than expected, given the rich set of resources available to programmers starting their journey. Perhaps the most concerning response identified a hesitancy towards changing from SAS to R. The feedback indicated that some people do not understand WHY we are looking to expand our capabilities to R.
“With the emphasis on shortening timelines and delivering final analyses quicker, WHY should I take the time to use R when I can do it much quicker with SAS?”
SAS works. If it isn’t broken, WHY fix it?
WELL, THEY’RE RIGHT – AND THEY’RE WRONG…
It is true that if both SAS and R can be used for analysis, and you’re already fluent in SAS, it follows that you’ll deliver quicker with SAS.
However, R is not the end game. Collectively, we are not just programmers; we are data scientists. While we may more closely resemble data engineers than data analysts, we are still data scientists.
If you poll data scientists in other industries, ask how many exclusively use SAS. How many exclusively use one programming language? The response to the first question will likely be quite low and to the second, near zero; at least for the top performers.
BECOME A TOP PERFORMER
In our future we will be dealing with all sorts of data, such as wearables and real-world data, not just nice, neat, CRF-extracted data mapped into SDTM tabulation datasets.
In our future we will be working with cutting-edge analysis methods such as machine learning, Bayesian methods and multiple imputation.
In our future we will be creating dynamic reports and potentially modelling our data, both progressive ways to enforce standardisation and enable automation.
When handling these new data or implementing these new analyses requires programming skills beyond SAS, do you want to be a passenger and just run code provided by someone else, or do you want to be a mechanic and be able to develop and manage that code yourself?
The role of clinical statistical programmers is expanding, and with this expansion comes a greater demand for analytical capabilities. In the next few years, we will increasingly be asked to deliver some very progressive capabilities, and if you aren’t positioned to answer the call, someone else will. Now is the time to invest in your capabilities.
Many industry leaders have chosen to begin the path to programming multilingualism with R, but it will undoubtedly continue through other languages such as Python and Julia. You will begin to see new hires onboarding with programming capabilities that aren’t limited to SAS. Now is the time to invest in yourself and remain competitive; three years from now, you may no longer have that luxury, and it may already be too late.
Embrace this opportunity to become a top performer and help do your part to drive the industry forward.
SO, WHY LEARN R?
So you can deliver all that R can deliver (Shiny apps, data science capabilities, interoperability with Python, publishing with R Markdown and bookdown, graphics with ggplot2 and, yes, production and QC of conventional datasets and TFLs).
Your organisation may want you to be ready for the future, but you need to decide whether you want to be ready. And if you don’t, you may find yourself with a relatively limited set of opportunities to contribute to your organisation as a programmer.
As a monolingual SAS programmer, you may be fine in today’s ‘as-is’ state, but you will be behind in tomorrow’s ‘to-be’ state. There may be limited growth potential for pure SAS programmers across the industry in the coming years.
Now is the time to ensure your skills stay relevant and at the forefront of what is expected from a programmer/analyst who specialises in clinical trial data.
The letter of today may be Y; but tomorrow, it will be R.
Don’t forget to check out the expert panel discussion Transformation of the Statistical Programmer Role within Modern SCEs at the PHUSE US Connect 2021, June 14–18.
PHUSE Education covers the broad bandwidth of knowledge a clinical data scientist needs to have to be successful in their job. Check out the section on Programming Languages, including R!