"There weren’t any tools on the market with a comparable solution"
Udacity is the leading digital talent transformation platform connecting global talent with tech jobs. The company works with 200+ industry leaders, including Google, Meta, Mercedes-Benz, and NVIDIA, to develop real-world content for job-ready competencies in artificial intelligence, machine learning, data science, and other disciplines. These unique collaborations ensure that individuals and professionals learn the technology skills that employers value most, so that Udacity can fulfill its mission of training the world’s workforce in the careers of the future. With 20M registered users across 240 countries and more than 100 enterprise partners, the company is well on its way to achieving its goal of changing lives, businesses, and nations through talent transformation.
- 25% engineer/analyst time saved
- 2X faster data layer rebuilds
- Mike Doll, Head of Data & Analytics
- Simon Dong, Software Engineer, Data
- Anima Sharma, Director of Analytics
In late 2020, Mike Doll, Udacity’s Head of Data & Analytics, invited his team to solve a challenge that had bubbled up around SQL collaboration, namely the proliferation of tools on local machines.
When he first joined the company at the tail end of 2019, his data team of scientists, analysts, and engineers was using a range of separate freeware SQL editors, including Chartio, DBeaver, and editors that either ran only on Mac or supported only a handful of the databases the company used across its business. It was clear to Mike and his team that the company needed to improve its data stack.
Anima Sharma, Udacity’s Director of Analytics, recalls that “it was difficult having everyone in different tools: Jupyter for data science, Chartio for analysts, DBeaver for data engineers.” The tools also each had their own limitations. “Jupyter was a bit too advanced for analysts, DBeaver’s UI was not friendly or easy to use, and Chartio didn’t provide the ability to show tables and columns within the schemas across our three databases (Redshift, Postgres, Athena),” she continues.
Put together, this fragmentation made SQL development at Udacity slow and difficult, and it made querying inconsistent and inefficient. Individual teams and users were writing SQL on their own, getting different answers, and often wasting time rewriting queries that teammates had already written.
The situation came to a head when Mike saw a lot of copy/pasting of SQL queries in his data team’s Slack channel and repeat questions such as:
- Is this the latest query?
- I found a similar query, but with different results – which is correct?
- I need help with this query, can someone hop on a Zoom call and show me what I’m doing wrong?
Mike’s team had played a critical role to date in building a culture of consistency, accuracy, and trust around data at Udacity. So as soon as he caught early signs of that trust eroding, he saw an opportunity to build and consolidate a more efficient and effective data culture by thinking strategically about what tool his team should be using to stay on the same page.
Mike decided to move forward with PopSQL in early 2021. “There weren’t any tools on the market with a comparable solution,” he recalls.
From a collaboration perspective, PopSQL’s core features around folders, charting, autosave, version history, easy-share URL links, and comments have been critical in supporting his growing data team’s need to get insights and ship data products fast.
To meet the enterprise-level needs of Udacity’s IT team, PopSQL’s single sign-on (SSO) integration lets the data team log in to their workspace securely. The “shared credentials” feature also gives them instant access to the database connections they need, so they no longer spend days hunting down credentials just to do their work.
Additionally, every user can now use PopSQL’s Data Catalog feature to search for data across their schemas, tables, and columns, with rich metadata around table and column popularity (like “Top Users” for tables) and common joins, all of which improve team productivity and output.
After a little over a year of using PopSQL, Udacity’s data team has been able to:
- Save 20-25% of data scientists’, data analysts’, and data engineers’ time
- Ship their data layer rebuild to the entire company in one quarter, instead of two or more
- Give end users a clear and comprehensive understanding of their databases and data sources
- Increase throughput of insights by answering more critical business questions faster
“Testing our latest Datamart product has been super quick with PopSQL. We were able to consolidate data from 1,000 production tables to a set of 150 curated tables in a single quarter, thanks to PopSQL,” shares Anima Sharma. In addition, PopSQL enables Udacity’s data teams to quickly run ad hoc analyses and get deep into the data to help inform better decision making.
For Anima’s team, which focuses on the learner’s experience and the enterprise side of the business, PopSQL has been instrumental in understanding learner outcomes and improving their overall experience with Udacity. In one specific instance, the insights generated using PopSQL even helped the team avoid raising a false alarm about usage data.
“Recently we were looking at a metric to understand product utilization rate. We saw that the overall trend was down and were wondering if this was a red signal that needs to be raised to our executive team. However, before raising alarms, we looked at this data in PopSQL, and quickly figured out that it was only two test accounts that were impacting the overall trend negatively, and therefore, there wasn’t a utilization issue across all customers.”
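The check described in that quote can be illustrated with a small sketch. This is not Udacity’s actual query or data: the account names, the `is_test` flag, and every number below are hypothetical, chosen only to show how a couple of outlier accounts can pull an overall average down while the customer-only trend stays flat or rises.

```python
from statistics import mean

# Hypothetical weekly utilization rates per account (all values are made up).
# "is_test" is an assumed flag marking internal test accounts.
accounts = {
    "customer_a": {"is_test": False, "rates": [0.62, 0.63, 0.64]},
    "customer_b": {"is_test": False, "rates": [0.55, 0.55, 0.56]},
    "customer_c": {"is_test": False, "rates": [0.70, 0.71, 0.71]},
    "test_1": {"is_test": True, "rates": [0.60, 0.30, 0.05]},
    "test_2": {"is_test": True, "rates": [0.58, 0.25, 0.00]},
}


def weekly_average(accounts, include_test):
    """Average utilization per week across the selected accounts."""
    selected = [
        account["rates"]
        for account in accounts.values()
        if include_test or not account["is_test"]
    ]
    # zip(*selected) groups the rates week by week across accounts.
    return [mean(week) for week in zip(*selected)]


def is_declining(series):
    """Crude trend check: is the last value lower than the first?"""
    return series[-1] < series[0]


overall = weekly_average(accounts, include_test=True)
customers_only = weekly_average(accounts, include_test=False)

print("All accounts declining:", is_declining(overall))
print("Customers only declining:", is_declining(customers_only))
```

With these illustrative numbers, the overall series declines while the customer-only series does not, which is the pattern the team describes: the drop came from a few accounts rather than from customers as a whole.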