BLACK LIVES MATTER
Join us and donate
The premier IDE for R
RStudio anywhere using a web browser
Put Shiny applications online
Shiny, R Markdown, Tidyverse and more
Do, share, teach and learn data science
Let us host your Shiny applications
The premier software bundle for data science teams
RStudio for the Enterprise
Connect data scientists with decision makers
Control and distribute packages
Building Spark ML pipelines with sparklyr
March 4, 2018
We provide an overview of the recently implemented Pipelines API in sparklyr, an R package for interfacing with Apache Spark. This new feature allows users to build and tune data transformation and machine learning pipelines that are interoperable with Scala and Python, simplifying handoffs between data science and data engineering. We go over the components of pipelines and walk through practical examples.
Kevin is a software engineer working on open source packages for big data analytics and machine learning. He has held data science positions in a variety of industries and was a credentialed actuary. He likes mixing cocktails and studying about wine.