While teaching an Advanced Tableau session in Washington DC last week, we ran across a great little puzzler that highlighted to me the value of discussing data structures.
Working from one of Tableau’s standard data sets, the Superstore, the question was “Can we use Tableau to show which states had higher or lower sales than another?”
This is a great example of when a specific Business Question or Use Case may mandate a particular Data Structure. If what we’re interested in is comparing any single state to any other single state, then there is actually a nice way to do this using two parameters and a pair of calculated fields. Using a parameter to select State 1 and a calculated field with a formula like SUM(IIF([State] = [State 1], [Sales], 0)), we could isolate sales in State 1. Using an identical second parameter and calculated field, we could isolate sales in State 2. Then, by changing the value of each parameter, we could output the sales in each state and compare.
However, if what we’re interested in is comparing sales in any state to every other state at the same time, then that’s a different question. In this situation [which was the case last week], to correctly answer the question, we need to engineer a data structure that allows a row-row comparison between one state and any other state. Something like:
State | State 2 | State Sales | State 2 Sales |
OK | TX | 1 | 2 |
OK | PA | 1 | 4 |
OK | NY | 1 | 6 |
TX | OK | 2 | 1 |
TX | PA | 2 | 4 |
TX | NY | 2 | 6 |
PA | TX | 4 | 2 |
PA | NY | 4 | 6 |
PA | OK | 4 | 1 |
This seems like a great use case for the Custom SQL option when connecting to data. I ran the following query to produce a specialized State vs. State dataset and then built a quick dashboard based on the resulting Extract.
SELECT * from (
SELECT [Orders$].[Customer State] AS [Customer State], sum([Orders$].[Sales]) AS [Sales] FROM [Orders$] GROUP BY [Orders$].[Customer State]) AS a
INNER JOIN (
SELECT [Orders$].[Customer State] AS [Customer State2], sum([Orders$].[Sales]) AS [Sales2] FROM [Orders$] GROUP BY [Orders$].[Customer State]) AS b
ON [a].[Customer State] [b].[Customer State2]
As you can see, we now have the ability to use Dashboard Actions (or Quick Filters) to filter down to any State while maintaining the ability to compare to any other State (through the State 2 column). I used a very simple data set for this example, but the principle can be extended across many different data structures. Feel free to download the attached workbook – hopefully this will help in approaching your own business questions.