Agile Data Warehouse Design: What is a Sprint?

by Tim Costello

This weekend I presented an introduction to Agile Data Warehouse Design at SQL Saturday #223 in Oklahoma City, OK. It was a fun talk for a great crowd. At the end, my friend Kristin Ferrier asked a great question:

“What is the expected deliverable from an Agile data warehouse sprint?”

I love this question. It’s a bit embarrassing to admit, but I’ve had a lot of trouble reworking my thought process to fully embrace Agile methodologies in my data warehouse design. Before I get to the answer, let’s step back and look at the question some more. It goes to the heart of one of the key principles behind the Agile Manifesto:

“Deliver working software frequently, from a couple of weeks to a couple of months, with a preference to the shorter timescale.”

My view is that the ideal data warehouse design sprint runs one to three weeks and combines Agile Dimensional Modeling, ETL, BI Prototyping and, finally, Review and Release. But release of what? I admitted earlier that I have had some trouble fully embracing the Agile mindset. As a database guy, I love a well-designed schema. It’s SO easy for me to get caught up in the “Agile Dimensional Modeling” step and lose track of the rest.

My first answer to Kristin’s question was that the “working software” we deliver at the end of the sprint is a star schema. Even as I said this I knew I’d gone wrong. Is a star schema “working software”? Does it deliver value to the business? No. It’s a part, but not the whole, of what we aim to deliver. The true goal of an Agile DW sprint should be to give the business the ability to answer a question. Ideally we give them a way to answer a question that they couldn’t even ask before. In my world that means we create three things: a star schema focused on one (and ONLY one) business event, an ETL (Extract, Transform and Load) process to load the data, and visual analytics (with Tableau) to understand it.
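To make the three pieces concrete, here is a minimal sketch of what one sprint’s deliverable might look like at toy scale. It is an illustration only, not the process from the talk: it assumes Python with the built-in sqlite3 module, a hypothetical "sale" business event, and made-up table names (dim_date, dim_product, fact_sales) and sample rows. A plain SQL query stands in for the visual analytics layer.

```python
import sqlite3

# Hypothetical source rows for ONE business event: a completed sale.
SOURCE_SALES = [
    {"sold_on": "2013-08-01", "product": "Widget", "category": "Hardware", "qty": 3, "unit_price": 10.0},
    {"sold_on": "2013-08-01", "product": "Gadget", "category": "Hardware", "qty": 1, "unit_price": 25.0},
    {"sold_on": "2013-08-02", "product": "Manual", "category": "Books", "qty": 2, "unit_price": 7.5},
]

# Piece 1: a star schema with one fact table and two dimensions (names are assumptions).
SCHEMA = """
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, full_date TEXT UNIQUE, month TEXT);
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT UNIQUE, category TEXT);
CREATE TABLE fact_sales (
    date_key INTEGER REFERENCES dim_date(date_key),
    product_key INTEGER REFERENCES dim_product(product_key),
    quantity INTEGER,
    revenue REAL
);
"""


def date_key(conn, full_date):
    """Insert the date if it is new, then return its surrogate key."""
    conn.execute(
        "INSERT OR IGNORE INTO dim_date (full_date, month) VALUES (?, ?)",
        (full_date, full_date[:7]),
    )
    row = conn.execute("SELECT date_key FROM dim_date WHERE full_date = ?", (full_date,))
    return row.fetchone()[0]


def product_key(conn, name, category):
    """Insert the product if it is new, then return its surrogate key."""
    conn.execute(
        "INSERT OR IGNORE INTO dim_product (name, category) VALUES (?, ?)",
        (name, category),
    )
    row = conn.execute("SELECT product_key FROM dim_product WHERE name = ?", (name,))
    return row.fetchone()[0]


def run_etl(conn, rows):
    """Piece 2: extract source rows, derive revenue, load the fact table."""
    for row in rows:
        conn.execute(
            "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
            (
                date_key(conn, row["sold_on"]),
                product_key(conn, row["product"], row["category"]),
                row["qty"],
                row["qty"] * row["unit_price"],
            ),
        )
    conn.commit()


def revenue_by_category(conn):
    """Piece 3: the business question, here 'how much revenue per category?'"""
    return conn.execute(
        """
        SELECT p.category, SUM(f.revenue)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_key = f.product_key
        GROUP BY p.category
        ORDER BY p.category
        """
    ).fetchall()


def build_warehouse(rows):
    """Create the in-memory warehouse and load it in one step."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    run_etl(conn, rows)
    return conn


if __name__ == "__main__":
    warehouse = build_warehouse(SOURCE_SALES)
    for category, revenue in revenue_by_category(warehouse):
        print(category, revenue)
```

The point of the sketch is the shape of the deliverable: the schema and the load are only useful once revenue_by_category (or a dashboard in its place) lets someone actually ask the question.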

(Note: as we move forward in our data warehouse design, we can begin to answer questions that cross lines of business and/or involve multiple business events.)

As we learn to embrace Agile methodologies in our data warehouse designs, we must constantly remind ourselves that people don’t care about the schema. People care about the questions they can ask of the data and the answers they can get.

It’s our job to make that happen.

More About the Author

Tim Costello

Analytics Consultant