Understanding PCA using Shiny and Stack Overflow data

February 26, 2018

Principal component analysis (PCA) is a powerful approach for exploring high-dimensional data, but it can be challenging for learners to comprehend. In this talk, I will walk through a practical and interactive explanation of what PCA is and how it works. As a case study, I’ll explore a domain that many data analysts and data scientists are familiar with: programming languages and technologies, as understood through traffic to Stack Overflow questions. We will explore how interactive visualization using Shiny gives us insight into the complex, real-world relationships in high-dimensional datasets.

About the speaker

Julia Silge

Julia Silge is a data scientist at Stack Overflow, with a PhD in astrophysics and an abiding love for Jane Austen. She is both an international keynote speaker and a real-world practitioner focusing on data analysis and machine learning. She is the coauthor, with David Robinson, of Text Mining with R. She loves making beautiful charts and communicating about technical topics with diverse audiences.