The road to data liberation: Data mesh, centralisation and standardisation

Author: Isha Jain

Jonny Dixon, Senior Product Manager at Dremio

Data has become the golden commodity that all enterprises want to get more of to drive business success. And in this frenzy, it’s not surprising that different departments have embraced different platforms and technologies to mine as much data as they can.

Typically, data is centralised and managed by a single data engineering team, which can lead to bottlenecks and inefficiencies. The adoption of different stand-alone platforms and analytical initiatives has created a reality where siloed data engineering teams are shouldering the burden of continuously transferring, duplicating and converting data to provide valuable datasets across every part of an enterprise.

How has this happened when we have seen a remarkable advancement in data management tools?

The demand for more data, more quickly, has become so high that engineers and technologists in siloed departments cannot keep pace with impossible backlogs of data requests, and existing data quickly turns stale as a result. Most of these issues have persisted, carried over in the migration to the cloud.

What is now needed is a decentralised, ‘divide and conquer’ approach to data engineering to speed up development, one where more people in different areas of the business can self-serve access to data and produce high-quality, reusable, compliant datasets. This is exactly what data mesh can offer.

Time to de-mesh-tify

A decentralised data engineering approach created in 2019 by Zhamak Dehghani, data mesh focuses on subject matter experts across the business building pipelines to produce reusable data products (datasets) that they own and make available for others to use across the enterprise. Made up of four principles – data as a product; domain-oriented decentralised data ownership; self-serve common platform; and federated computational data governance – it sets out a roadmap for improved analytical data management.

By breaking down the traditional monolithic approach to data management into smaller, more manageable pieces, similar to how a mesh is made up of interconnected nodes, data mesh democratises data ownership and spreads responsibility across different business units or teams. This makes data more accessible and more agile, whilst also enhancing its quality.

Data mesh further fosters a culture of collaboration among different teams or departments. When data ownership and responsibility are distributed, it encourages cross-functional teams to work together, share insights, and leverage each other’s data assets for more holistic decision-making. This particularly helps less technical users, who can benefit from user-friendly interfaces and intuitive access to data without having to rely on a central data team.

Now the question is, what’s the first step enterprises need to take?

Time for a centralised program office setup

Before anything else, what is needed is a ‘single pane of glass’ for your data products, a ‘storefront’ for data, where data governance policies are applied enterprise-wide, with a central program office that acts as a ‘central hub’ for all data and analytical projects.

This should be led by a chief data officer who will coordinate development activity across multiple decentralised teams and align it to the overall business strategy. This centralised program office in turn will support decentralised data engineering teams and ensure that each data and analytical project is aligned to one or more business objectives in the company’s business strategy. It also enables collaborative planning and coordination of all data products, business intelligence outputs such as reports or dashboards, and machine learning model development projects across multiple domains, while avoiding reinvention and siloed engineering.
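To make the ‘storefront’ idea concrete, here is a minimal sketch in Python of a central catalogue that domain teams register their data products with. Everything in it is an assumption for illustration: the `DataProduct` and `Catalogue` names, the fields, and the two rules enforced (each product must declare at least one business objective, and product names must be unique across domains). It is not a description of any particular product or platform.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class DataProduct:
    """A dataset published by a domain team (all field names are illustrative)."""
    name: str
    domain: str                  # owning business domain, e.g. "sales"
    owner: str                   # accountable team or person
    objectives: tuple[str, ...]  # business objectives the product supports


class Catalogue:
    """Hypothetical central 'storefront' applying the same rules to every product."""

    def __init__(self) -> None:
        self._products: dict[str, DataProduct] = {}

    def register(self, product: DataProduct) -> None:
        # Enterprise-wide rule: every product must map to a business objective.
        if not product.objectives:
            raise ValueError(f"{product.name}: no business objective declared")
        # Avoid reinvention: a product name may only be registered once.
        if product.name in self._products:
            raise ValueError(f"{product.name}: already registered")
        self._products[product.name] = product

    def search(self, objective: str) -> list[str]:
        """Return the names of all products that support the given objective."""
        return sorted(
            p.name for p in self._products.values() if objective in p.objectives
        )
```

A domain team would call `register` once per data product, and anyone in the business could then use `search` to find existing products tied to an objective before commissioning a new one.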

Why standardisation is key

The second major step when ramping up data product development is standardisation, to avoid unnecessary complexity. This should improve development productivity and enable data and metadata sharing across teams of data producers without getting in their way, making everything easier to maintain over the long term.

Standardisation should also make it simpler for different teams and systems to work with and exchange data, reducing the complexity of integrating data from multiple sources. It improves data quality too: when there are clear standards for data entry and validation, errors and inconsistencies are less likely to occur, which enhances the overall trustworthiness of the data.
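As an illustration of what a shared standard for data entry and validation might look like in practice, the sketch below checks a record against a field-and-type definition that every producing team would reuse. The `CUSTOMER_STANDARD` definition, its field names and the `validate` helper are hypothetical, and a real deployment would more likely rely on an established schema or data-quality tool than on hand-written checks.

```python
from datetime import date

# Hypothetical shared standard: field name -> expected Python type.
CUSTOMER_STANDARD = {
    "customer_id": str,
    "country_code": str,
    "signup_date": date,
}


def validate(record: dict, standard: dict) -> list[str]:
    """Return a list of problems; an empty list means the record conforms."""
    problems = []
    # Every field in the standard must be present with the expected type.
    for field_name, expected_type in standard.items():
        if field_name not in record:
            problems.append(f"missing field: {field_name}")
        elif not isinstance(record[field_name], expected_type):
            problems.append(f"{field_name}: expected {expected_type.__name__}")
    # Fields outside the standard are flagged rather than silently accepted.
    for field_name in record:
        if field_name not in standard:
            problems.append(f"unexpected field: {field_name}")
    return problems
```

Because every team validates against the same definition, a record that passes in one domain can be consumed in another without further conversion, which is the integration saving described above.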

In the pursuit of data-driven success, centralised data management has become a bottleneck. The rapid adoption of various data initiatives has led to siloed data engineering teams struggling to meet the soaring demand for data. Data mesh can provide the solution. By enhancing data accessibility, it helps nurture a culture of collaboration and equips organisations to leverage their data assets without bottlenecks, fostering innovation and informed decision-making in today’s dynamic data landscape.