Often, discussions about data governance begin and end with security. Preventing unauthorized access or the loss of proprietary data is crucial. Data governance also ensures that your data systems comply with regulatory requirements and protect privacy.
You are undoubtedly aware of the horror stories regarding data breaches and extortion. Keeping proprietary data safe and secure is a fiduciary requirement and a legal necessity. This is the price of admission to modern analytics.
While data security often takes the spotlight, data governance's role in making data accessible, understandable and easy to find is just as imperative for driving adoption and process improvement. Governance empowers informed decision-making by ensuring your secured data is also accessible and understandable. This aspect of business intelligence systems may not make headlines, but building proficiency here enables cultural change and powers broad adoption, which leads to business process improvement.
Data governance planning is a business process improvement initiative that must consider data lineage, provenance, user personas, living data catalogs, master data management, access and utility.
Reframe your thinking and treat governance as a business process improvement initiative with multiple goals: First, train your team in the basics of data structure, data workflows, and presentation tools. Second, empower your entire workforce with high-quality information that is timely, accurate and accessible. Third, remove wasted effort and redundancy and improve the initial quality of your business data when it is created.
Reducing waste in time and material means improved efficiency and profitability.
Data Lineage
Data lineage refers to tracking and visualizing data flow through various stages of its lifecycle. The goal is to improve the visibility of the data that makes up the data consumption layer (dashboards, reporting, analysis).
Exposing the data components that make up reporting improves confidence in the data and awareness of how it was developed. It also reduces the possibility that people will build redundant sources because they didn't know validated and approved data sources were already available.
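To make the idea concrete, one simple way to represent lineage is as a directed graph mapping each data asset to the upstream sources it is derived from. The asset names below are hypothetical examples, not references to any specific system:

```python
# A minimal sketch of data lineage as a directed graph: each asset maps
# to the upstream assets it is built from. All names are hypothetical.
lineage = {
    "revenue_dashboard": ["fact_sales", "dim_customer"],
    "fact_sales": ["staging_orders"],
    "dim_customer": ["staging_crm_accounts"],
    "staging_orders": ["erp.orders"],          # raw transaction-system table
    "staging_crm_accounts": ["crm.accounts"],  # raw CRM table
}

def trace_upstream(asset, graph):
    """Return every upstream asset that feeds the given asset."""
    upstream = []
    for parent in graph.get(asset, []):
        upstream.append(parent)
        upstream.extend(trace_upstream(parent, graph))
    return upstream

# Trace a dashboard all the way back to its raw source tables.
print(trace_upstream("revenue_dashboard", lineage))
```

Even a sketch this small shows the payoff: anyone looking at the dashboard can see exactly which validated sources feed it, which is what discourages redundant rebuilds.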
Data Provenance
Investors in fine art expend a lot of effort validating the authenticity of a piece of art. Data provenance applies the same concept to data sources. The central idea is to identify the team that controls the process workflow generating the data and the manager responsible for that workflow. This lets everyone know who “owns” the workflow that is the source of the data.
Every business process that creates data uses some transaction system (Salesforce, Great Plains, Oracle, IBM, SAP, etc.). Measure the quality of the first-pass data and hold a manager accountable for improving it. Your business intelligence team can provide this information as a by-product of the data transformation workflows they must create to load your data warehouse.
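As a sketch of what that measurement might look like inside a transformation workflow, the snippet below counts rows that fail simple validation rules and attaches the accountable owner to the result. The field names, rules, and owner label are illustrative assumptions, not a specific tool's API:

```python
# Hypothetical first-pass quality check run during data transformation.
rows = [
    {"order_id": "1001", "amount": 250.0, "customer_id": "C-17"},
    {"order_id": "1002", "amount": None,  "customer_id": "C-09"},  # missing amount
    {"order_id": "",     "amount": 99.5,  "customer_id": "C-09"},  # missing order id
]

# Each rule returns True when a row is acceptable.
rules = {
    "order_id_present": lambda r: bool(r["order_id"]),
    "amount_present": lambda r: r["amount"] is not None,
}

def first_pass_quality(rows, rules, owner):
    """Summarize rule failures and tag the report with its accountable owner."""
    failures = {name: sum(1 for r in rows if not check(r))
                for name, check in rules.items()}
    clean = sum(1 for r in rows if all(check(r) for check in rules.values()))
    return {"owner": owner, "rows": len(rows), "clean_rows": clean,
            "failures": failures}

report = first_pass_quality(rows, rules, owner="Order Entry Manager")
print(report)
```

Because the transformation pipeline already touches every row, producing this report costs almost nothing extra, which is the "by-product" point above.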
Two potential benefits result. First, the first-pass data will improve, and operations will become more efficient because the processes using this data will encounter fewer mistakes.
The second benefit is learning. People working in downstream processes will learn to interpret data with a more critical eye. Training should be provided in the basics of database architecture for the transaction system and the data warehouse. This will deepen and broaden data literacy across your organization.
Personas
Meeting users where they live means understanding who they are, what they do, and their level of technical understanding and need. The user personas, in descending order of technical competence, include:
- Data Scientists, Data Architects (high technical skill, access to unstructured data and data testing).
- Business Analysts (some technical skills, domain knowledge, controlled data access, help from more technical resources).
- Business Managers and non-technical staff (less technical skill but deep operational knowledge). They need dashboards, reports and analysis. Access to analytics in the tools they typically use, via embedding, is best.
- Senior Managers (Operational Leaders) need dashboards, summary information, and drill-down capability. They also need support for answering ad hoc questions and emergent needs.
- Executives (C-Level): summary reporting, interactive dashboards, and ad hoc support.
These personas must be considered when defining reporting, analysis and ad hoc analysis support. Standardizing the look and feel of dashboards for your organization will speed up training new hires and help experienced staff learn and navigate new dashboards more easily.
A Living Data Catalog
Every field, calculation and formula should be defined and stored in a data catalog. In many cases, data catalogs (if they are made) are ad hoc and built into available tools (spreadsheets, dashboards). These are static systems that do not adapt well to rapidly changing requirements.
Fortunately, new third-generation systems like Atlan have emerged, making the business of building and maintaining data catalogs more decentralized without giving up visibility and control.
A data catalog is the primary reference for every dimension, fact, formula or KPI you use in your business. Your business needs and data change constantly, and the data catalog system you utilize must adapt quickly, decentralize the work and be accessible with appropriate controls. If the data catalog software can’t adapt quickly and easily, nobody will rely upon it.
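As an illustration of what a single catalog entry might carry, here is a minimal sketch in Python. The fields and the example KPI are assumptions for illustration; a dedicated catalog tool would manage far more, including lineage and access controls:

```python
# Hypothetical sketch of one data catalog entry for a KPI.
from dataclasses import dataclass, field
from datetime import date

@dataclass
class CatalogEntry:
    name: str              # business-facing name of the dimension, fact, or KPI
    definition: str        # plain-language meaning
    formula: str           # how it is calculated
    owner: str             # team accountable for the definition
    source_tables: list    # where the inputs come from
    last_updated: date = field(default_factory=date.today)

gross_margin = CatalogEntry(
    name="Gross Margin %",
    definition="Revenue minus cost of goods sold, as a share of revenue.",
    formula="(revenue - cogs) / revenue",
    owner="Finance - FP&A",
    source_tables=["fact_sales", "fact_cogs"],
)
print(gross_margin.name, "owned by", gross_margin.owner)
```

The point of the structure is the `last_updated` and `owner` fields: a catalog stays "living" only when someone accountable can change an entry quickly and everyone can see when it last changed.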
Training
Everyone in your company should receive training in the basics of data architecture, data transformation and data visualization. Your goal should be to build organization-wide familiarity with best practices, sound data schema design, and the process of ingesting, transforming, structuring and presenting data as usable information.
Tomorrow’s post will go deeper into education, change management and how incentive plans can be used to expand, deepen and improve the use of data to drive efficiency and profitability.