Crash course in Data Mesh: 4 things to know

Date published: 4.29.2021

Data mesh is a hot topic. In fact, the data mesh has been identified as a top trend of 2021. So what’s all the noise about? A lot. Here’s an introduction to get you started and on the path to learning more.

What is a data mesh?

The data mesh represents a paradigm shift: the concept is to design and develop data architectures around distributed ownership rather than a centralized data warehouse or data lake. At a high level, a data mesh is a decentralized data architecture broken into smaller portions and oriented around data domains. Run by data experts and data owners, it serves up data products and uses a common, self-serve data infrastructure with centralized governance and standardization.
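As a rough illustration of that structure, the Python sketch below models domain teams that own data products and register them in a shared catalog, which applies one centrally defined standard to every domain. All names here (`DataProduct`, `MeshCatalog`, `REQUIRED_METADATA`) are hypothetical and chosen only for illustration; they do not come from any data mesh tool.

```python
from dataclasses import dataclass, field

# Hypothetical central standard: metadata every data product must declare.
REQUIRED_METADATA = {"description", "schema_version"}


@dataclass
class DataProduct:
    """A dataset published and maintained by one domain team."""
    name: str
    domain: str   # owning domain, e.g. "sales"
    owner: str    # accountable data owner
    metadata: dict = field(default_factory=dict)


class MeshCatalog:
    """Shared self-serve catalog; governance rules are applied centrally."""

    def __init__(self):
        self._products = {}

    def register(self, product: DataProduct) -> None:
        # The same standard is enforced for every domain.
        missing = REQUIRED_METADATA - set(product.metadata)
        if missing:
            raise ValueError(f"{product.name}: missing metadata {sorted(missing)}")
        self._products[(product.domain, product.name)] = product

    def by_domain(self, domain: str) -> list:
        """Return all products registered by the given domain."""
        return [p for (d, _), p in self._products.items() if d == domain]


catalog = MeshCatalog()
catalog.register(DataProduct(
    "orders", "sales", "sales-data-team",
    {"description": "Confirmed orders", "schema_version": "1"},
))
```

In this sketch, ownership stays with the domain (the `domain` and `owner` fields), while the catalog is the only place where the shared standard is checked, so a product with incomplete metadata is rejected at registration.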

Where did the idea of data mesh come from?

Thoughtworks technology consultant Zhamak Dehghani first introduced the data mesh concept in the blog post How to Move Beyond a Monolithic Data Lake to a Distributed Data Mesh. The idea spread quickly, and the post has become the de facto reference for describing the data mesh.

“The data mesh platform is an intentionally designed distributed data architecture, under centralized governance and standardization for interoperability, enabled by a shared and harmonized self-serve data infrastructure.”

How can a data mesh help an enterprise?

There are many reasons the data mesh is moving from a “fringe” idea to mainstream consideration. Here are some of the biggest ones.

A giant leap towards data democratization. It signals a dramatic cultural shift that moves the center of power to the employees who use the data. Domain experts, engineers, and customers can collaborate to unlock the data’s full value for meaningful insights.
Eliminates data pipeline bottlenecks. With competing, growing demands from business stakeholders and repeated cycles of data ingestion, transformation, and delivery, the traditional method is full of churn. Offering the distributed infrastructure as a platform opens the way to a universal, automated approach to data standardization, collection, and sharing.
Empowers better service for customers. Business stakeholders can get the information they need faster, without special training and without IT as the “middle-person”. In this model, with data as the product, it becomes much quicker and easier to find what you need. Ultimately, this leads to a more seamless path to value creation.

What does data-as-a-product mean?

With product thinking, data is treated as something with its own intrinsic value rather than as a byproduct, and domain experts manage their own data products. This also removes the middle-person (often IT), allowing direct interaction between those who deeply understand the data and the business stakeholders who deeply understand the use case for it.
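One way to make this concrete is for the owning domain to publish a small contract alongside the data, so consumers can check what they receive without going through an intermediary. The Python sketch below is only an assumption about what such a contract might contain; `DataProductContract`, `validate`, and the example product are hypothetical names, not part of any standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataProductContract:
    """Hypothetical published description of a data product."""
    name: str
    owner: str        # domain team accountable for the data
    description: str
    schema: dict      # column name -> expected Python type


def validate(contract: DataProductContract, rows: list) -> list:
    """Return a list of problems; an empty list means the rows match the contract."""
    problems = []
    for i, row in enumerate(rows):
        for column, expected in contract.schema.items():
            if column not in row:
                problems.append(f"row {i}: missing '{column}'")
            elif not isinstance(row[column], expected):
                problems.append(f"row {i}: '{column}' is not {expected.__name__}")
    return problems


contract = DataProductContract(
    name="customer_churn",
    owner="retention-domain-team",
    description="Monthly churn flag per customer",
    schema={"customer_id": str, "churned": bool},
)

rows = [
    {"customer_id": "c-1", "churned": False},
    {"customer_id": "c-2"},  # missing the 'churned' column
]
print(validate(contract, rows))  # prints ["row 1: missing 'churned'"]
```

The contract names an accountable owner and an explicit schema, which is the sense in which the data is managed like a product rather than left as a byproduct of another system.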
