Accelerating bioinformatic learning
Nicholas Provart / Professor, Plant Cyberinfrastructure & Systems Biology
The University of Toronto worked with Coursera to deliver two massive open online courses (https://www.coursera.org/learn/bioinformatics-methods-1 and https://www.coursera.org/learn/bioinformatics-methods-2) that teach students bioinformatic methods as a way to better understand biology. An important topic in this field is working out when and where genes are active, based on “RNA-seq” data. Students explore this topic in the fifth week of the second course.
RNA-seq data can be analyzed with Bioconductor. In the first iteration of the course, the instructor, Nicholas Provart, had the students install R and Bioconductor on their own computers. This proved to be a stumbling block because of the many different operating systems in use and the way package dependencies differ from platform to platform.
RStudio Server Pro offered a great solution. With a web-accessible instance of Bioconductor running on RStudio on Amazon Web Services, students could get straight to learning the methods for analyzing RNA-seq data instead of spending time futzing with installation issues. “We used an AWS c4.large VM as in the schematic below to run up to 16 RStudio Server Pro instances, depending on the load. A php-based LDAP server was used to handle the authentication for student logins.”
RStudio Server Pro allowed multiple users to connect to the Bioconductor environment, each with their own workspace in which work persists from session to session. Pre-existing data folders could also be shared with all users.
The result: far fewer complaints on the Discussion Forums associated with the RNA-seq lab, and happier students! Professor Provart is also using the online course material to teach the University of Toronto class on which these courses were based. By moving the Bioconductor component to the “cloud” and using RStudio Server Pro, systems administration headaches with University of Toronto computer labs were also alleviated. The University of Toronto is excited to be teaching so many students and educating future bioinformatics professionals in state-of-the-art analytics: 43,000 students have enrolled in the second course so far!
RStudio provides open source and enterprise-ready professional software for data science teams using R and Python.