Over the last couple of years, there has been a lot of interest in the data mesh architecture first proposed back in 2018 by Zhamak Dehghani. It offers a practical alternative to the previously dominant notion of a single centralised data warehouse. Trying to gather all of an organisation’s data into one place can work for small and medium-sized organisations, but for organisations over a certain size, or those that are made up of many semi-autonomous divisions, it’s just not practical.
Four principles underpin the data mesh architecture:
- Domain Ownership
- Data as a Product
- Facilitating Self-Service
- Federated Governance
Of these, perhaps the least well understood is the principle of federated governance.
As with the other principles of a data mesh architecture, federated governance can stand alone and deliver benefit, but there are additional synergies if it is implemented as part of a broader strategy to manage data using a distributed data mesh.
“Federated,” according to wiktionary.org, is defined in the context of computing as, “Interconnected, as a federation or supernetwork of independent and interoperable services.”
Data governance is itself a broad term encompassing regulatory compliance, data quality, roles and responsibilities, policy setting, and security. Within a complex organisation, responsibility for some aspects will be handled centrally, with others devolved to teams within separate business domains.
The point of a federated governance approach is to ensure that each aspect of data governance is handled by those who are best positioned to do it (whether centralised or devolved), and to include common protocols and feedback mechanisms so that different entities can communicate effectively and evolve over time.
Many organisations try to centralise governance as much as they can. This has advantages, but it also has drawbacks.
Advantages of Centralised Data Governance
Data Assets are Easy to Locate
When data governance is handled centrally, users can go to one place to request access to an application or data asset.
Consistency of Approach
A centrally managed data governance team has the resources to keep up to date with industry best practices, changing legal constraints and emerging technologies, so it is well positioned to respond to threats and to take advantage of new technologies and standards.
A centrally managed team can implement a single sign-on process that works across multiple systems and doesn’t require separate usernames, passwords and MFA controls for each system.
A centralised user authentication process can be improved more quickly over time as technology evolves, perhaps getting rid of usernames and passwords altogether and using biometric information instead.
In technical terms, the “attack surface” is reduced, as individual systems do not need to be exposed to login attempts from unverified users.
Disadvantages of Centralised Data Governance
Lack of Contextual Understanding of Data
A centralised team won’t have the same understanding of domain data as the domains themselves. Users within individual business functions are bound to have a more nuanced understanding of what their data means, its accuracy and limitations, and how it might be used or abused.
When determining what access to data is required, a devolved team working in the same business domain as the data will have a clearer understanding of what data is absolutely necessary, and what additional data, while not strictly necessary, might provide helpful context. Some centralised data teams follow an “all doors closed” or “need to know” approach, where only the bare minimum of data access is allowed. This can greatly damage both the productivity and morale of end users, who may justifiably feel that overly zealous restrictions imposed by a distant centralised team are making their work more difficult than it needs to be.
Administrative Bottlenecks
Funnelling all data access requests through a single central team can create a bottleneck. Access requests that could be granted in minutes or hours can instead take weeks or even months to process.
Lack of Segmentation of Data
This is not a direct result of centralised governance, but centralised governance can often lead to a concentration of very large amounts of data in one place. If there is any kind of data breach, the impact is then much more serious. A recent example is the Ticketmaster hack in May 2024, where unauthorised access to a single user account allowed the personal details of 560 million customers to be stolen.
So, Where Are We?
Looking at the advantages and disadvantages above, some patterns emerge in the areas of authentication, authorisation, data policies and data labelling, as well as verifying and monitoring data accuracy.
An Example
A central authority is better placed to determine policies around how personally identifiable information (PII) is handled. Where it’s handled in a consistent way across the organisation, a centralised team can develop and implement technical solutions to automate this process and monitor for exceptions.
A devolved team, with a closer understanding of the data, is better placed to identify what data elements should be regarded as PII, particularly where no one data element may be enough to identify a subject but a number of elements together may be.
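As a rough illustration of that split, the sketch below has a central policy decide how PII is handled, while a domain team decides which columns count as PII, including groups of columns that only identify someone in combination. All names here (`CENTRAL_POLICY`, `DomainLabels`, `apply_policy`, the column names and the masking rule) are hypothetical and chosen for the example; they do not come from any particular governance product.

```python
from dataclasses import dataclass, field

# Owned by the central team: one handling rule per sensitivity label.
# The labels and the masking rule are assumptions made for this sketch.
CENTRAL_POLICY = {
    "pii": lambda value: "***",     # mask anything labelled as PII
    "public": lambda value: value,  # pass everything else through
}


@dataclass
class DomainLabels:
    """Labels maintained by the domain team that owns the data."""
    direct_pii: set = field(default_factory=set)
    # Groups of columns that identify a person only when combined.
    quasi_identifier_groups: list = field(default_factory=list)

    def pii_columns(self, requested):
        """Return the requested columns that must be treated as PII."""
        requested = set(requested)
        flagged = self.direct_pii & requested
        for group in self.quasi_identifier_groups:
            if group <= requested:  # the whole group is requested together
                flagged |= group
        return flagged


def apply_policy(row, labels):
    """Apply the central handling rules using the domain's labels."""
    flagged = labels.pii_columns(row.keys())
    return {
        col: CENTRAL_POLICY["pii" if col in flagged else "public"](val)
        for col, val in row.items()
    }


labels = DomainLabels(
    direct_pii={"email"},
    quasi_identifier_groups=[{"postcode", "birth_date", "gender"}],
)
row = {
    "email": "a@example.com",
    "postcode": "AB1 2CD",
    "birth_date": "1990-01-01",
    "gender": "F",
    "order_total": 42,
}
masked = apply_policy(row, labels)
partial = apply_policy({"postcode": "AB1 2CD", "order_total": 42}, labels)
```

In this sketch, `masked` hides the email and all three quasi-identifiers, while `partial` leaves the postcode visible because it is requested on its own. The central team can change the masking rule without knowing the domain's columns, and the domain can relabel its columns without touching the policy.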
Centralised Governance Responsibilities
- User Authentication
- Setting consistent policies across the organisation
- Implementing technical measures to test for data quality issues
- Providing a centralised data catalogue so that users across an organisation know where to start looking for access to data
Devolved Governance Responsibilities
- User Authorisation
- Labelling and classifying data
- Verifying accuracy and timeliness of data input and output
- Managing data as data products, including creation and maintenance of metadata and catalogue entries
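The division between central authentication and devolved authorisation in the two lists above can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference design: `CentralIdentityProvider` stands in for whatever single sign-on service an organisation uses, and `Domain`, `grants` and the user and product names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Identity:
    """Result of a central authentication check."""
    user_id: str
    verified: bool


class CentralIdentityProvider:
    """Hypothetical stand-in for an organisation-wide sign-on service."""

    def __init__(self, known_users):
        self._known_users = set(known_users)

    def authenticate(self, user_id):
        # A real service would check credentials; here we only check membership.
        return Identity(user_id, verified=user_id in self._known_users)


@dataclass
class Domain:
    """A business domain that decides who may use its own data products."""
    name: str
    grants: dict = field(default_factory=dict)  # data product -> set of user ids

    def authorise(self, identity, product):
        # Domains never see unverified users: authentication is central.
        if not identity.verified:
            return False
        return identity.user_id in self.grants.get(product, set())


idp = CentralIdentityProvider({"alice", "bob"})
sales = Domain("sales", {"orders": {"alice"}})

alice_ok = sales.authorise(idp.authenticate("alice"), "orders")    # verified and granted
bob_ok = sales.authorise(idp.authenticate("bob"), "orders")        # verified, not granted
mallory_ok = sales.authorise(idp.authenticate("mallory"), "orders")  # not verified
```

Here only `alice_ok` is true. The domain can grant or revoke access to its own data products without involving the central team, and the central team can change how users are verified without the domain changing anything.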
Who takes responsibility for the technical infrastructure that hosts data or handles user authorisation can depend upon the scale of the federated domains. Where a domain maps to a single department within a larger organisation, it may not have the capability or desire to manage any technology itself. Where a domain represents a larger function, such as a division within an international conglomerate, it may already have staff and infrastructure in place, and be quite comfortable continuing to manage this itself.
Implemented effectively, a federated governance approach can:
- Scale to even the largest and most complex organisations.
- Ensure compliance with the latest legal frameworks and allow an organisation to keep up to date with the latest best practices.
- Provide good security without getting in the way of business functions.
- Allow users to access new data without lengthy waits for access to be granted.
There is a wide variety of tools and services on the market today that can help an organisation with data governance. Despite what individual vendors might tell you, one size definitely does not fit all.
At InterWorks, we work with a wide variety of different product and service vendors and have extensive experience of data handling with clients of all sizes across a variety of industries including manufacturing and retail, professional services, healthcare, law enforcement and finance.
As an independent data consultancy, we can help you navigate through the forest of different product and service offerings, and help you develop and implement data solutions that are appropriate for your organisation.