This weekend I presented an introduction to Agile Data Warehouse Design at SQL Saturday #223 in Oklahoma City, OK. It was a fun talk for a great crowd. At the end, my friend Kristin Ferrier (blog | twitter) asked a great question:
“What is the expected deliverable from an Agile data warehouse sprint?”
I love this question. It’s a bit embarrassing to admit that I’ve had a lot of trouble reworking my thought process to fully embrace using Agile methodologies in my data warehouse design. Before I get to the answer, let’s step back and look at the question some more. This question goes to the heart of one of the key principles behind the Agile Manifesto:
“Deliver working software frequently, from a couple of weeks to a couple of months, with a preference to the shorter timescale.”
My view is that the ideal iteration for a data warehouse design sprint is one to three weeks, combining Agile Dimensional Modeling, ETL, BI Prototyping and finally Review and Release. But release of what? I admitted earlier that I have had some trouble fully embracing the Agile mindset. As a database guy, I love a well-designed schema. It’s SO easy for me to get caught up in the “Agile Dimensional Modeling” step and lose track of the rest.
My first answer to Kristin’s question was that the “working software” we deliver at the end of the sprint is a star schema. Even as I said this I knew I’d gone wrong. Is a star schema “working software”? Does it deliver value to the business? No. It’s a part, but not the whole, of what we aim to deliver. The true goal of an Agile DW sprint should be to deliver to the business the ability to answer a question. Ideally we give them a way to answer a question that they couldn’t even ask before. In my world that means we create three things: a star schema focused on one (and ONLY one) business event, an ETL (Extract, Transform and Load) process to load the data, and visual analytics (with Tableau) to understand the data.
(Note: as we move forward in our DW design, we can begin to answer questions that cross lines of business and/or involve multiple business events.)
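To make that deliverable a little more concrete, here is a minimal sketch of one sprint's output in Python, using the standard-library sqlite3 module. Everything in it is an assumption for illustration: the business event (a sale), the table and column names (dim_date, dim_product, fact_sales), and the sample rows are all hypothetical, and a plain SQL query stands in for the Tableau step. It is not a real design, just the shape of schema plus ETL plus an answered question.

```python
import sqlite3

# Hypothetical source rows for ONE business event: a sale.
# (sale date, product name, quantity, sales amount)
source_rows = [
    ("2013-08-24", "Widget", 2, 20.0),
    ("2013-08-24", "Gadget", 1, 25.0),
    ("2013-08-25", "Widget", 5, 50.0),
]


def build_star_schema(conn):
    """Create one fact table and its two dimensions (illustrative names)."""
    conn.executescript("""
        CREATE TABLE dim_date (
            date_key      INTEGER PRIMARY KEY,
            calendar_date TEXT UNIQUE
        );
        CREATE TABLE dim_product (
            product_key  INTEGER PRIMARY KEY,
            product_name TEXT UNIQUE
        );
        CREATE TABLE fact_sales (
            date_key     INTEGER REFERENCES dim_date(date_key),
            product_key  INTEGER REFERENCES dim_product(product_key),
            quantity     INTEGER,
            sales_amount REAL
        );
    """)


def lookup_key(conn, table, key_col, value_col, value):
    """Insert the dimension member if it is new, then return its surrogate key."""
    conn.execute(
        f"INSERT OR IGNORE INTO {table} ({value_col}) VALUES (?)", (value,)
    )
    row = conn.execute(
        f"SELECT {key_col} FROM {table} WHERE {value_col} = ?", (value,)
    ).fetchone()
    return row[0]


def load(conn, rows):
    """A tiny ETL step: resolve dimension keys, then insert the fact rows."""
    for calendar_date, product_name, quantity, amount in rows:
        date_key = lookup_key(
            conn, "dim_date", "date_key", "calendar_date", calendar_date
        )
        product_key = lookup_key(
            conn, "dim_product", "product_key", "product_name", product_name
        )
        conn.execute(
            "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
            (date_key, product_key, quantity, amount),
        )
    conn.commit()


def sales_by_product(conn):
    """The business question: how much did we sell of each product?"""
    return conn.execute("""
        SELECT p.product_name, SUM(f.quantity), SUM(f.sales_amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_key = f.product_key
        GROUP BY p.product_name
        ORDER BY p.product_name
    """).fetchall()


conn = sqlite3.connect(":memory:")
build_star_schema(conn)
load(conn, source_rows)
result = sales_by_product(conn)
print(result)
```

The schema is the smallest part of this. The sprint only counts as done once the last function returns something a business user can act on.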
As we learn to embrace Agile methodologies in our data warehouse designs, we must constantly remind ourselves that people don’t care about the schema. People care about the questions they can ask of the data and the answers they can get.
It’s our job to make that happen.