Scaling Data Science with R at Janssen Pharmaceuticals
Satish J. Murthy, IT Manager, Janssen R&D IT
The use of R for statistical computing is growing quickly among statisticians in Janssen Research & Development. They need an environment where they can collaborate in real time with colleagues across the world, run simulations with High Performance Computing (HPC), and deploy Shiny applications to share their analyses. The Janssen R&D IT team developed and implemented the R as a Service for Visualization and Processing (RSVP) platform, which combines RStudio Server, Shiny, cloud-bursted HPC, and RStudio Connect to address the challenges that Janssen scientists presented. This talk will present the design and architecture of RSVP from the IT side, along with a use case from the R&D side: an application that runs on the platform.
Paulo Bargo, Scientific Director, Statistics and Decision Sciences, Janssen R&D
As the pharmaceutical sector increases its interactions with regulatory agencies using R programming as a key instrument for drug development submissions, we face a dilemma: several members of our statistics and statistical programming teams are not yet advanced R programmers. For many years SAS has been a powerful tool in the data analysis repertoire of pharma statisticians. However, the recent development of automation capabilities such as R Markdown and Shiny has created a new avenue to expedite access to consumable information in the form of reports, presentations, or interactive graphics that can be produced efficiently and in a standard format for all phases of a drug development or submission process. At Janssen we aim to improve literacy in R programming and achieve nearly 100% adoption by statistics and statistical programming teams in the coming 2-3 years. To achieve this goal, we are leveraging all types of training formats, from online training, to in-house instructor-led seminars, to one-on-one mentoring. One of the key methods we have been developing is the use of RStudio.Cloud as a platform for internal crowd-led, hands-on workshops where statisticians and programmers are taught to solve real on-the-job problems ranging from visualization to automated reports. In this presentation we will discuss our experience creating this program and share lessons learned, mistakes, and successes.
Only 1,000 live attendees are allowed in the webinar, on a first-come, first-served basis. There will be approximately 60 minutes of presentation. Although we usually hold a question-and-answer session, this presentation has a lot of ground to cover, so time for questions may be limited.
We've started a GitHub repository with all webinar materials. Speakers for this webinar and all future webinars will add their materials to the repository: https://github.com/rstudio/webinars
If you can't attend, don't worry. We record (almost) every webinar and post all materials on our website within 48 hours. See past webinars at https://resources.rstudio.com/webinars.