An Example of a Living Data Mesh: The Snowflake Data Marketplace
The enterprise data world has been captivated by a new trend: Data Mesh. The “What Is Data Mesh” articles have already come out, but in this publication, I want to highlight a live, in-production, worldwide example of a Data Mesh: the Snowflake Data Marketplace.
As with every “new thing” that comes down the pike, people will change the definition to suit their purposes and point of view, and I am no different. Zhamak Dehghani, a Director of Emerging Technologies at ThoughtWorks, writes that Data Mesh must contain the following shifts:
- Organization: From centrally controlled to distributed data owners. From enterprise IT to the domain business owners.
- Technology: It shifts from technology solutions that treat data as a byproduct of running pipeline code to solutions that treat data and code that maintains it as one lively autonomous unit.
- Value: It shifts our value system from data as an asset to be collected to data as a product to serve and delight the data users (internal and external to the organization).
- Architecture: From central warehouses and data lakes to a distributed mesh of data products with a standardized interface.
It is on this principle that I depart and advocate for the Snowflake Data Cloud. I believe that the advantages a centralized data store has always offered can be retained, while the near-infinite scale of Snowflake’s Data Cloud facilitates the rest of the goals behind Data Mesh.
There is so much to understand about the new paradigm and its benefits, and even about what an up-and-running Data Mesh would look like, that to date even simplified overview articles are lengthy. As I wrestled with coming to my own understanding of Data Mesh and how Strive could bring our decades of successful implementations in all things data, software development, and organizational change management to bear, I was hit by a simple notion. There is already a great example of a successfully implemented, worldwide, multi-organization Data Mesh: the Snowflake Marketplace.
There are more than 1,100 data sets from more than 240 providers, available to any Snowflake customer. The data sets from the market become part of the customer’s own Snowflake account and yet are managed and kept up to date by providers. No ETL needed and no scheduling. When providers update their data, it is updated for all subscribers. This is the definition of “data as a product”.
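The sharing model described above can be pictured with a small Python sketch. This is a toy illustration only, not Snowflake's API: the `DataProduct` and `Subscription` classes, the team names, and the sample rows are all hypothetical. It assumes only what the paragraph states, namely that subscribers read the provider's single live copy rather than receiving a scheduled extract.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """A provider-owned data set (hypothetical toy model, not a Snowflake API)."""
    name: str
    owner: str
    rows: list = field(default_factory=list)

    def publish_update(self, new_rows):
        # The provider updates the single authoritative copy in place.
        self.rows.extend(new_rows)


@dataclass
class Subscription:
    """A consumer's read-only view onto a product; no data is copied."""
    consumer: str
    product: DataProduct

    def query(self):
        # Reads go to the provider's live data, so there is no ETL job to schedule.
        return tuple(self.product.rows)


# Illustrative sample data only.
weather = DataProduct(name="daily_weather", owner="weather_team")
weather.publish_update([("2024-01-01", "Chicago", -3)])

retail = Subscription("retail_analytics", weather)
logistics = Subscription("logistics", weather)

# The provider publishes once; every subscriber sees the new row immediately.
weather.publish_update([("2024-01-02", "Chicago", -5)])

print(len(retail.query()), len(logistics.query()))
```

Because each subscription holds a reference to the product rather than a copy, the provider remains the only party that has to keep the data current.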
In effect, the Snowflake Data Cloud is the self-service, data-as-a-platform infrastructure. The Snowflake Marketplace is the discovery and governance tool within it. Everyone who has published data into the Marketplace has become a product owner and delivered data as a product.
We can see the promised benefit of the Snowflake Marketplace as a Data Mesh in one thing: massive scalability. I’m not speaking of the Snowflake platform’s near-infinite scalability, impressive as that is, but of how every team publishing data into the Marketplace has been able to do so without the cooperation of another team. None of the teams that have published data have had to wait in line to have their priorities bubble up to the top of IT’s agenda. A thousand new teams can publish data today. A hundred thousand new teams can publish their data tomorrow.
This satisfies the organizational shift from centralized control to decentralized domain ownership, the value shift to data as a product, and the technology shift to treating data and the code that maintains it as one unit.
Data consumers can go to the Marketplace and find the data that they need, regardless of which organization created it. If it’s in the Snowflake Marketplace, any Snowflake customer can use the data for their own needs. Each consumer of the data brings their own compute, so nobody’s use of the data slows down the performance of another team’s dashboards.
Imagine that instead of weather data published by AccuWeather and financial data by Capital One, it’s your own organization’s customer, employee, marketing, and logistics data. Each data set is owned by the business team that creates the data. They are the team that knows the data best. They curate, cleanse, and productize the data themselves. They do so on their own schedule and with their own resources. That data is then discoverable and usable by anyone else in the enterprise (gated by role-based security). Imagine that you can scale as your business demands, as new businesses are acquired, and as ideation for new products occurs. All of it is facilitated by IT, but never hindered by IT as a bottleneck.
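As a rough sketch of that internal scenario, the Python below models a shared catalog in which each domain team registers its own data product and discovery is gated by role. Everything here is hypothetical and for illustration: the `Catalog` and `CatalogEntry` classes, the product names, and the roles are assumptions, and a real deployment would rely on the platform's own access controls rather than code like this.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CatalogEntry:
    """One domain-owned data product (hypothetical model)."""
    name: str
    owner_team: str
    allowed_roles: frozenset


@dataclass
class Catalog:
    """A shared, enterprise-wide listing of data products."""
    entries: dict = field(default_factory=dict)

    def register(self, entry):
        # Each domain team registers its own product; there is no central queue.
        self.entries[entry.name] = entry

    def discover(self, role):
        # Return the sorted names of products this role is allowed to read.
        return sorted(
            e.name for e in self.entries.values() if role in e.allowed_roles
        )


catalog = Catalog()
catalog.register(
    CatalogEntry("customer_profiles", "marketing", frozenset({"analyst", "marketing"}))
)
catalog.register(CatalogEntry("employee_records", "hr", frozenset({"hr"})))
catalog.register(
    CatalogEntry("shipment_events", "logistics", frozenset({"analyst", "logistics"}))
)

print(catalog.discover("analyst"))  # ['customer_profiles', 'shipment_events']
print(catalog.discover("hr"))       # ['employee_records']
```

The point of the design is that registration is decentralized while discovery and access rules stay in one place, which is the split between domain ownership and IT facilitation described above.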
With Snowflake’s hyper-scalability, its separation of storage and compute, and its handling of structured, semi-structured, and unstructured data, it’s the perfect platform to enable enterprise IT to offer “data as self-serve infrastructure” to the business domain teams. From there, it is a small leap to see how the Snowflake Data Marketplace is, in fact, a living example of a Data Mesh with all the benefits described in Zhamak Dehghani’s papers.
As a data practitioner with over three decades of experience, I am as excited today as ever to see the continuous evolution of how we get value out of data and deal with the explosion in data types and volumes. I welcome Data Mesh and the innovations it promises, along with Data Vault 2.0 and hyper-scale cloud databases like Snowflake, to facilitate the scale and speed to value of today’s data environment.
Strive is a proud partner of Snowflake!
Strive Consulting is a business and technology consulting firm and proud partner of Snowflake, with direct experience in query usage and in helping our clients understand and capitalize on the benefits the Snowflake Data Platform presents. Our team of experts can work hand-in-hand with you to determine if leveraging Snowflake is right for your organization. Check out Strive’s additional Snowflake thought leadership.
Snowflake delivers the Data Cloud – a global network where thousands of organizations mobilize data with near-unlimited scale, concurrency, and performance. Inside the Data Cloud, organizations unite their siloed data, easily discover and securely share governed data, and execute diverse analytic workloads. Join the Data Cloud at SNOWFLAKE.COM.