For over half a century, organizations have assumed that data is an asset to collect more of, and that data must be centralized to be useful. These assumptions have led to centralized and monolithic architectures, such as data warehouses and data lakes, that limit an organization's ability to innovate with data at scale.
Data Mesh is an alternative architecture and organizational structure for managing analytical data. Its objective is to enable access to high-quality data for analytical and machine learning use cases, at scale.
It's an approach that shifts the data culture, technology and architecture: from centralized collection and ownership of data to domain-oriented connection and ownership of data; from data as an asset to data as a product; from proprietary big platforms to an ecosystem of self-serve data infrastructure with open protocols; and from top-down manual data governance to federated computational governance.
In this talk, Zhamak will introduce the principles underpinning Data Mesh and its architecture.
Q&A
Question: I'm an ML/big data consultant and we are working with one of our clients that uses the data mesh architecture. The problem they have is that, since we don't take ownership, the verticals in the organisation won't come to us for datasets and services. Also, since this is a different mentality from the rest of the organisation, we need to introduce ourselves in a different way, but we are not gaining much traction. Have you come across this before, and what is your recommended solution?
Answer: I need to understand this better … In my experience, data scientists and analysts are so poorly served that they are always looking for data, or even better, data products, and there is always friction in getting to them. So if the DM platform removes friction around access control and discoverability, and serves high-quality data in a way that meets the needs of their native tools/processes, I’m curious why they still don’t show up for a bucket?
Question: With the data mesh approach, at a high level it feels like it is domain-driven and ownership will be at the domain level. Will this make implementing master data management solutions or "one source of truth" needs difficult for complex organizations?
Answer: One source of truth is an ever-moving goalpost. Yes, you are right that DM shifts from one source of truth to the most trusted truth, but it does provide certain guardrails that support the notion: (1) a data product is immutable and read-only; data never updates for a particular processing time, which removes a lot of the accidental complexity that comes from differently updated information; (2) a data product has SLOs that it needs to guarantee in terms of quality, accuracy, integrity, etc.; (3) aggregate data products can be created to provide the mastering capability around core concepts, if needed.
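To make the first two guardrails concrete, here is a minimal Python sketch of a data product that publishes read-only snapshots keyed by processing time and reports which SLOs a set of measurements falls short of. It is an illustration only, not an implementation from the talk: the names `DataProduct`, `Snapshot`, `Slo`, `publish`, `read` and `violated_slos` are all hypothetical, and storage is a plain in-memory dictionary.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from types import MappingProxyType
from typing import Dict, Iterable, List, Mapping, Tuple


@dataclass(frozen=True)
class Slo:
    """Hypothetical quality objective: a metric name and the minimum it must reach."""
    metric: str
    minimum: float


@dataclass(frozen=True)
class Snapshot:
    """Read-only records published for a single processing time."""
    processing_time: datetime
    records: Tuple[Mapping[str, object], ...]


class DataProduct:
    """Assumed shape of a data product: append-only snapshots plus declared SLOs."""

    def __init__(self, name: str, slos: Iterable[Slo]) -> None:
        self.name = name
        self.slos = tuple(slos)
        self._snapshots: Dict[datetime, Snapshot] = {}

    def publish(self, processing_time: datetime, records: Iterable[dict]) -> Snapshot:
        # Data never updates for a given processing time: a second publish is rejected.
        if processing_time in self._snapshots:
            raise ValueError("a snapshot already exists for this processing time")
        # Copy each record and wrap it in a read-only view.
        frozen = tuple(MappingProxyType(dict(r)) for r in records)
        snapshot = Snapshot(processing_time, frozen)
        self._snapshots[processing_time] = snapshot
        return snapshot

    def read(self, processing_time: datetime) -> Snapshot:
        return self._snapshots[processing_time]

    def violated_slos(self, measured: Mapping[str, float]) -> List[str]:
        # A metric that was not measured at all counts as a violation.
        return [s.metric for s in self.slos
                if measured.get(s.metric, float("-inf")) < s.minimum]
```

Corrections in this model arrive as a new snapshot at a later processing time rather than as an in-place update, which is what lets consumers reason about "the most trusted truth" at a given point in time.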
Question: With distributed ownership of data, do you have any thoughts on how to incentivize data owners/producers to maintain baseline data usability, data quality, SLAs, etc., and to own the data they produce so that they think of data as a product?
Answer: Great question. Guaranteeing data quality in a way that makes the users happy is part of their job, and their OKRs should include that. NPS and platform observability automation to check data integrity, quality, etc. could be some of the tools. Dave and I from ThoughtWorks gave a talk recently on how to manage and guide the evolution of organizations using “fitness functions”. Check it out: the ThoughtWorks talk is online, titled “Guiding the evolution of Data Mesh with fitness functions”.
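As a rough illustration of how measures like these could be automated, the sketch below treats each one as a named function with a threshold and evaluates them all against a dictionary of raw metrics. This is an assumption about what such a check might look like, not the method presented in the referenced talk; `FitnessFunction`, `evaluate`, the metric keys and the thresholds are all hypothetical. The NPS calculation uses the standard definition (percentage of promoters minus percentage of detractors).

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Mapping


@dataclass(frozen=True)
class FitnessFunction:
    """Hypothetical fitness function: a named measure and the value it must reach."""
    name: str
    measure: Callable[[Mapping[str, float]], float]
    threshold: float


def net_promoter_score(metrics: Mapping[str, float]) -> float:
    # Standard NPS: percentage of promoters minus percentage of detractors.
    return 100.0 * (metrics["promoters"] - metrics["detractors"]) / metrics["responses"]


def integrity_pass_rate(metrics: Mapping[str, float]) -> float:
    # Share of automated integrity checks that passed (metric keys are assumed).
    return metrics["passed_checks"] / metrics["total_checks"]


def evaluate(functions: Iterable[FitnessFunction],
             metrics: Mapping[str, float]) -> Dict[str, bool]:
    """Return, per fitness function, whether the measured value meets its threshold."""
    return {f.name: f.measure(metrics) >= f.threshold for f in functions}
```

A domain team could run such an evaluation on a schedule and tie the results to its OKRs, so that the incentive to maintain quality is visible in the same place as its other objectives.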
Question: In most organizations the data is not owned by the business users. In your experience, what is the best way to educate business owners to own the data within the organization, and to educate them about various aspects of data, including data privacy…
Answer: It’s very hard to do that without bringing all the other 3 pillars to life. It’s very hard to do that without having business-domain-oriented tech (including data) capabilities to support the business, or a platform that empowers them, or a governance model that includes them …
So to get started, we need all of those pieces to come together.
Additionally, this is a “transformation”, so organizational change is part of this and, most importantly, so is top-down support.
Zhamak Dehghani
Principal Consultant, Thoughtworks