How RStudio helped Redfin move their data models from spreadsheets to a reproducible data science environment

by Jared Goulart / Director - Operations Analytics, Redfin

The Problem


When Redfin was smaller, much of our planning could be done with basic models implemented in spreadsheets, with input gathered via email or files saved to Google Drive. But as we tried to make the models more complex and scale them to our business’s increased scope, that approach simply stopped working. We couldn’t use more statistical approaches to forecasting, because they don’t work in spreadsheets; maintaining the formulas was error-prone and slow; and the time spent consolidating user input limited the number of iterations we could run of our models. Workbooks took too long to update with refreshed data and were then painfully slow, taking 10 or more minutes to open and actually use, or worse, crashing so that people could not open them at all.

“RStudio Connect was the only solution we saw that allowed us to replicate the interactivity that our users got in spreadsheets, but hosted on a server for ease of access and maintainability, and built around more complex statistical approaches that R enabled.”

- Jared Goulart, Redfin

The Solution


We knew R could be our solution, because R was already our solution in many areas and was the preferred language of almost the whole analytics team. Getting things out of spreadsheet formulas and into source code was itself a big win just in terms of maintenance. We were excited to use packages like data.table to speed up data filtering and aggregation, and we needed more advanced forecasting approaches, such as the prophet package, to make time series projections. Probably the most exciting package was Shiny itself, which gave us the ability to translate R models into an end product people could use with only a web browser. We built functionality into our apps to gather user input and store it in a SQL database with as much metadata as we wished, and this completely streamlined our workflow.

Why RStudio Professional Products?


We expected to use Shiny Server Pro to host our applications, but when we heard the news of RStudio Connect’s debut, we knew it was exactly what we needed. We had a lot of heartburn about how hard it might be to deploy code, manage user access to apps, and administer the overall server once multiple apps were available. We were still willing to work around that, because R+Shiny was the solution we needed, even if getting the server running meant handling some of these challenges. But RStudio Connect’s enterprise-focused features, such as Active Directory integration, allowed us to use existing user groups when managing access to apps, just as we do with other systems, and its one-click publishing and admin functions made it feel achievable rather than daunting.

The Payoff


Our first Shiny app was a tool allowing managers to evaluate different scenarios and plan their budgets for the next year. The launch achieved our goals and left users really impressed: where they had been accustomed to a tool that gave them limited feedback and took minutes to open, they could now launch the Shiny app in seconds, iterate through scenarios, and make their plans in minutes!

This initial payoff was swift and would have been a large win by itself, but having RStudio Connect available to the analytics team allowed us to solve other problems we never anticipated. Now, the RStudio data science platform is a Swiss army knife we use for many things. A Shiny app is still the best solution for us when a spreadsheet model gets out of hand with complexity. We’ve also built tools to help some teams manage their day-to-day workflow, as well as tools that help managers run their markets week to week. When use cases outgrew what Google Maps or Fusion Tables could handle, we built Leaflet-based apps with custom features to help multiple groups of users be more efficient.

About Redfin


Redfin is a technology-powered real estate brokerage, combining its own full-service agents with modern technology to redefine real estate in the consumer’s favor. Founded by software engineers, Redfin has the country’s #1 brokerage website and offers a host of online tools to consumers, including the Redfin Estimate, the automated home-value estimate with the industry’s lowest published error rate for listed homes. Homebuyers and sellers enjoy a full-service, technology-powered experience from Redfin real estate agents, while saving thousands in commissions. Redfin serves more than 90 major metro areas across the U.S. and Canada. The company has helped customers buy or sell homes worth more than $85 billion.

RStudio provides open source and enterprise-ready professional software for the R community. We are inspired by the people who use our products to understand and improve the world through data analysis.