How Snowflake Saved My Life
Let me tell you a story of how the Snowflake Data Cloud saved my life. I know it sounds dramatic, but just hear me out.
A few years ago, I worked with a multi-billion-dollar wholesale distributor that had never implemented a data warehouse. Their main goal? Consolidate all data into one location and enable KPI roll-ups from across their disparate systems. However, in this case, they did not want to invest in additional licensing. So, my team set about building a traditional data warehouse leveraging their current platform, SQL Server. Initially, it was a successful four-layer architecture with Staging, Consolidation, Dimensional Model, and Tabular Cubes, with the end visualization solution being Power BI… but within a few months, issues began to surface.
The number of sources feeding into this platform had increased dramatically, and this increase started to impact load times. Initially, the batch load processes ran for between two and three hours, but over time they grew to 5, 6, sometimes 7 hours! We needed a long-term solution, but in the short term we had to keep the platform running to deliver data to the organization.
What we were experiencing were challenges with constraints, indexing, locks, fragmentation, and more. To mitigate these issues, I personally took the step of waking up every morning at 3:00AM to log in and ensure certain process milestones completed successfully and on time. If those milestones were not achieved, the batch process would either stall, fail, or run excessively long, and the last thing I wanted was to explain to the business why they were not going to have data until 9:00, 10:00, or 11:00AM. After a couple of weeks of doing this, it became apparent: we needed a better solution, and fast!
In the past, I had some experience with Big Data platforms, but I decided to look beyond established technologies, such as Cloudera or other Hadoop-based solutions, and into something new: Snowflake. Snowflake is a cloud data platform where organizations can access, share, and maintain their data, so I thought, why not? Let’s give it a shot!
We set up a proof of concept, initially trying to mimic the four-layer architecture we had built in SQL Server. After seeing limited success, as well as being laughed at for even trying it, we took a step back, reevaluated our approach, and flipped the architecture from ‘Extract Transform Load’ (ETL) to ‘Extract Load Transform’ (ELT). And… Eureka! With this change, we were able to reduce overnight batch runtimes from the 5, 6, or 7 hours we saw on SQL Server to less than 20 minutes. In fact, our average runtimes for our load processes were around 17 minutes, but now I’m just showing off.
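For readers who have not seen the ‘Extract Load Transform’ pattern in practice, here is a minimal sketch of the idea, not the pipeline we actually built. It uses Python's built-in sqlite3 module as a stand-in for the warehouse so it runs anywhere; the table names (stg_orders, fct_sales_by_region) and the sample rows are hypothetical. The point is the order of operations: raw data lands in a staging table untouched, and the transformation happens afterward as SQL inside the database engine.

```python
import sqlite3

# Hypothetical raw rows, as they might arrive from a source system (all text).
RAW_ORDERS = [
    ("1001", "north", "250.00"),
    ("1002", "north", "100.50"),
    ("1003", "south", "75.25"),
]


def run_elt(conn, rows):
    """Load raw rows first, then transform them with SQL inside the database."""
    cur = conn.cursor()

    # Extract + Load: land the data as-is in a staging table, no cleanup yet.
    cur.execute(
        "CREATE TABLE stg_orders (order_id TEXT, region TEXT, amount TEXT)"
    )
    cur.executemany("INSERT INTO stg_orders VALUES (?, ?, ?)", rows)

    # Transform: cast and aggregate using the engine's own compute.
    cur.execute(
        """
        CREATE TABLE fct_sales_by_region AS
        SELECT region,
               COUNT(*) AS order_count,
               SUM(CAST(amount AS REAL)) AS total_amount
        FROM stg_orders
        GROUP BY region
        """
    )
    conn.commit()

    return cur.execute(
        "SELECT region, order_count, total_amount "
        "FROM fct_sales_by_region ORDER BY region"
    ).fetchall()


conn = sqlite3.connect(":memory:")  # in-memory database, assumption for the demo
result = run_elt(conn, RAW_ORDERS)
print(result)
```

In an ETL flow, the casting and aggregation would instead run in a separate tool before anything reached the warehouse. Pushing that work into the database after loading is what lets the platform's own compute do the heavy lifting.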
Not only did this have an incredible effect on our ability to deliver data in a timely manner, but it also let us increase how often we processed data. You see, with SQL Server we were never able to update data more than once a day, but with Snowflake, we could run the batch process every 20 minutes and quickly deliver requested changes to the models, measures, and dimensions.
Implementing those changes went from taking weeks to taking days, or even hours, resulting in some very happy stakeholders. With these results, coupled with the fact that I no longer had to wake up at 3:00AM to verify successful batch processes, Snowflake truly saved my life.
Want to learn more about the Snowflake Data Cloud?
Strive Consulting is a business and technology consulting firm and a proud partner of Snowflake, with direct experience helping our clients understand and capitalize on the benefits the Snowflake Data Platform presents. Our team of experts can work hand-in-hand with you to determine if leveraging Snowflake is right for your organization. Check out Strive’s additional Snowflake thought leadership here.
Snowflake delivers the Data Cloud – a global network where thousands of organizations mobilize data with near-unlimited scale, concurrency, and performance. Inside the Data Cloud, organizations unite their siloed data, easily discover and securely share governed data, and execute diverse analytic workloads. Join the Data Cloud at Snowflake.com.