This blog post is Human-Centered Content: Written by humans for humans.
Here’s a question for you: If you were asked to define “Data Governance,” how confident would you be in your answer?
Considering how quickly the data space is changing, we wouldn’t blame you if your confidence in that answer isn’t as high as you’d like it to be.
For us, we’d say that data governance encompasses how organizations manage, protect, and contextualize their data assets across the entire data lifecycle and technology ecosystem.
We think that’s a fine definition, but it does raise the question: How do you actually do Data Governance? Given the breadth of our definition, that question can be quite daunting! With that said, we’d like to offer an analogy, borrowed from one of our favorite psychologists, to help steer and focus the conversation.
Maslow’s Hierarchy of Needs
You might be familiar with this: In 1943, American psychologist Abraham Maslow proposed a way of ranking human needs in ascending order. Often portrayed as a pyramid, it starts with the most basic human needs at the bottom and works all the way up to realizing one’s full potential at the top.
While we’re certainly not going to claim to be experts in human psychology, it does offer a useful framework that we can step through to understand how to tackle data governance well.
Let’s step through it, starting at the first step: Taking care of your base-level needs.
Step 1: Base Level
The basic necessities lay the foundation for data governance, and they are non-negotiable if a data function is going to survive.
To that end, let’s break the basics down into three main parts:
- Data Acquisition: The processes and technologies used to collect and ingest data from various sources into the data ecosystem.
- Accessibility: The ability for authorized users to retrieve, view, and utilize data when needed. This includes physical access methods, query capabilities, and retrieval mechanisms.
- Availability: Ensuring data systems are operational and data is accessible when needed, including redundancy and reliability mechanisms.
Step 2: Safety and Security
Safety, in this context, means the protective mechanisms that secure data assets and ensure regulatory compliance, creating a trusted data environment. We’ll outline two essentials in this area, as well as two things you need to avoid.
The essentials:
- Role-Based Access Controls (RBAC) with automated provisioning
- Regular security audits and vulnerability assessments
The avoids:
- Not having formal, written policies about data access or sharing
- Lacking adequate data classification levels based on sensitivity
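To make the pairing of role-based access and data classification a bit more concrete, here’s a minimal sketch of how the two might fit together. Everything in it is an assumption for illustration: the four sensitivity levels, the role names, the dataset names and the `can_read` helper are all hypothetical, and a real setup would lean on your platform’s own access controls rather than a dictionary in a script.

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    """Hypothetical classification levels, ordered from least to most sensitive."""
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3
    RESTRICTED = 4


# Hypothetical roles mapped to the highest sensitivity level each may read.
ROLE_CLEARANCE = {
    "analyst": Sensitivity.INTERNAL,
    "finance_manager": Sensitivity.CONFIDENTIAL,
    "security_admin": Sensitivity.RESTRICTED,
}

# Hypothetical datasets, each tagged with a classification level.
DATASET_CLASSIFICATION = {
    "marketing_site_traffic": Sensitivity.PUBLIC,
    "sales_pipeline": Sensitivity.INTERNAL,
    "payroll": Sensitivity.RESTRICTED,
}


def can_read(role: str, dataset: str) -> bool:
    """Deny by default: unknown roles and unclassified datasets get no access."""
    clearance = ROLE_CLEARANCE.get(role)
    classification = DATASET_CLASSIFICATION.get(dataset)
    if clearance is None or classification is None:
        return False
    return clearance >= classification


if __name__ == "__main__":
    for role, dataset in [("analyst", "sales_pipeline"), ("analyst", "payroll")]:
        print(role, dataset, can_read(role, dataset))
```

The deny-by-default branch is the part worth noticing: a dataset that nobody has classified yet is treated as off-limits, which is one way of making a missing classification visible instead of silently permissive.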
Step 3: Connection
Connections, in this case, are less technical and more about capabilities that create shared understanding of data and connect it to business meaning, fostering data communities and collaboration. This can be accomplished in a few ways.
Catalogs are a good way of maintaining an inventory of all your data assets, making them discoverable and understandable through technical and business metadata. Nailing down the Semantics helps standardize the definitions and context that give your data a unified meaning, including shared terminology and data element definitions. Finally, establishing a Center of Excellence (COE) brings together cross-functional groups that collaborate on data governance, sharing knowledge and best practices to improve data usage.
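If it helps to picture what a catalog entry actually holds, here’s a small sketch that keeps technical metadata (column names and types) next to business metadata (an owner, a plain-language description and glossary terms) and makes both searchable. The `CatalogEntry` and `Catalog` classes and the sample datasets are made up for this example, not taken from any particular catalog product.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    name: str
    owner: str
    description: str                 # business metadata: what the data means
    columns: dict[str, str]          # technical metadata: column name -> type
    glossary_terms: list[str] = field(default_factory=list)


class Catalog:
    """A toy in-memory catalog; real catalogs persist and index this metadata."""

    def __init__(self) -> None:
        self._entries: dict[str, CatalogEntry] = {}

    def register(self, entry: CatalogEntry) -> None:
        self._entries[entry.name] = entry

    def search(self, keyword: str) -> list[str]:
        """Return names of entries whose metadata mentions the keyword."""
        kw = keyword.lower()
        hits = []
        for entry in self._entries.values():
            searchable = [entry.name, entry.description,
                          *entry.columns, *entry.glossary_terms]
            if any(kw in text.lower() for text in searchable):
                hits.append(entry.name)
        return sorted(hits)


catalog = Catalog()
catalog.register(CatalogEntry(
    name="orders",
    owner="sales-ops",
    description="One row per confirmed customer order.",
    columns={"order_id": "int", "amount": "decimal"},
    glossary_terms=["Revenue"],
))
catalog.register(CatalogEntry(
    name="customers",
    owner="crm-team",
    description="Master list of active customer accounts.",
    columns={"customer_id": "int", "region": "string"},
))

if __name__ == "__main__":
    print(catalog.search("revenue"))
```

Searching on a glossary term like "revenue" finds the `orders` table even though no column carries that name, which is the kind of link between business meaning and physical data that the Connection step is after.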
Step 4: Self Service
Self service refers to the organizational and technical approaches that enable teams to independently access, understand and utilize data within a governed framework. We’re nearing the top of the hierarchy of needs, so it’s only natural that we’re close to reaching full data independence. This one has some must-haves and avoids as well:
Must-Haves:
- Ownership: Business users can independently access and analyze data
- Accountability: Domain-oriented data ownership with quality accountability
Avoids:
- Lack of Domain-Level Expertise: Some team-level technical expertise is needed.
- Bottlenecks: Possibly due to centralized data team dependencies or inefficient products.
Step 5: Full Potential
This is the tippy top of the hierarchy pyramid: The full extent of what all the effort on the lower steps helps you achieve. This is using your data to the fullest. So, what does that look like?
It means achieving automation: being able to rely on technology to automatically enforce your policies, monitor compliance and remediate issues without manual intervention. It also includes data contracts, which are formal agreements between data producers and consumers that specify data structure, quality, delivery and usage terms. Finally, it unlocks a current hot-button want for most companies: AI integration. This isn’t just using ChatGPT. It’s also the application of artificial intelligence to enhance governance capabilities, including anomaly detection, policy recommendation and adaptive controls.
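As a rough illustration of how a data contract and automated enforcement can meet, the sketch below writes down a contract’s structure and quality terms and checks a delivered batch against them. The `DataContract` class, the `validate` function, the null-fraction threshold and the sample rows are all assumptions we invented for the example. It leaves out delivery and usage terms entirely, so treat it as a starting point and not a full contract.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class DataContract:
    """Hypothetical contract: expected columns and a simple quality threshold."""
    required_columns: dict[str, type]   # structure: column name -> Python type
    max_null_fraction: float            # quality: tolerated share of missing values


def validate(rows: list[dict], contract: DataContract) -> list[str]:
    """Return a list of human-readable violations; an empty list means compliant."""
    if not rows:
        return ["no rows delivered"]
    violations = []
    for column, expected_type in contract.required_columns.items():
        values = [row.get(column) for row in rows]
        nulls = sum(value is None for value in values)
        if nulls / len(rows) > contract.max_null_fraction:
            violations.append(f"{column}: too many nulls ({nulls}/{len(rows)})")
        if any(v is not None and not isinstance(v, expected_type) for v in values):
            violations.append(f"{column}: expected {expected_type.__name__}")
    return violations


contract = DataContract(
    required_columns={"order_id": int, "amount": float},
    max_null_fraction=0.0,
)
good_batch = [{"order_id": 1, "amount": 12.5}, {"order_id": 2, "amount": 3.0}]
bad_batch = [{"order_id": 1, "amount": "12.50"}, {"order_id": None, "amount": 3.0}]

if __name__ == "__main__":
    print(validate(good_batch, contract))
    print(validate(bad_batch, contract))
```

A check like this could run every time a producer delivers data, so a broken batch is caught before consumers ever see it, without anyone having to inspect it by hand.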
To Be Continued
With the hierarchy laid out, our next blog post will talk about actually building that framework piece by piece so you can achieve your full data potential. Keep an eye out here for when that post comes out, and if you have any questions so far, feel free to reach out.