Comparing Tableau Dimension Members without Custom SQL

Comparing Tableau Dimension Members without Custom SQL

Joseph Everett
//

What do you do when a request can’t be solved by traditional reporting tools, and adding new tables or views to your database can take weeks? 

Situation 

A new manager joins the group. While you’re presenting your workbook, she wants to see things a little differently. Instead of top 10 sales by state, she wants to compare one state’s sales against another state’s sales. She gives you an hour to get it done before she hops on a plane to Zurich.

In order to handle our new manager’s request, we want to simulate the results of a CROSS JOIN (Cartesian product). Through this, we can define a relationship of every Column A entry to every column B entry. See the example below:

EXAMPLE: Cartesian product.

Cartesian product

This result set can be quite large when dealing with even a few hundred records because 1000 customers by 50 states suddenly becomes 50000 rows! Because we don’t want a huge result set, we’re going to want to filter by only the relevant information in our analysis.  

These situations are typically handled with a little custom SQL, but what if you don’t know SQL (or even what to Google)? Worst of all, what if your SQL guy is spending the day at ComicCon?

What to do now?  

Enter the Blend  

When bringing additional data sources into your workbook, our mantra is “join if you can; blend if you must.” Sadly, people take this advice to mean that they should avoid blends instead of looking for the circumstances where blends can really advance your analysis.

One of my teammates has already tackled this intra-dimensional comparison and delivered a custom SQL answer (Business Questions and Data Structures). Today we’re going to explore the same challenge via blending, and allow anyone with a basic understanding of blending and data sources to produce some powerful results.

NoSQL “cross join” overview via blending:

  • Connect to data source
  • Duplicate data source
  • Build out view in primary data source
  • To help organize our efforts, we will make our calculated fields in our primary data source 
  • Break the link field so we don’t blend on a single state; we will maintain a higher level of aggregation or grouping
  • Use a filter to identify the state that has our attention (show me only what I want to see)

Step 1: Open Tableau and Connect to Your Data

NOTE: In order to match the original solution (see above) and because I want to filter out international data prior to aggregating my sales data, I’m filtering out the international business from the sample Superstore Data via a data source filter. To create the filter, right click Data Source, select Edit Data Source Filters and add your filter – e.g. Region.

Region Filter

Step 2: Duplicate Your Data Source

Right click your data source and select “Duplicate.”

This blend works like a cross-join since the results will be the same for every state in the primary view.

STANDARD BLEND (on linking field) – only the selected state in the filter is blended.

Standard Blend

Blending with no linking field – all states from the primary data source are shown.

Blending with no linking field

 

NOTE: In order to achieve this blend, from your secondary data source, click the red link icon that is automatically activated on State (break the link). We don’t want to actually blend data on the state level.

Stop using state as linking field

As seen above, if we keep the link activated on State, we’ll only bring in data from our selected state. This is not our desired behavior. We need the unlinked blend, otherwise the post-aggregate join (blend) would filter out the rows in the primary source that match the state (then we’re back where we started).

See the following example:

Our State and Sales data appear in this form in Excel:  

State Sales
Maine 97000
Vermont 40000
Utah 83000

When we filter on Maine with a linking field activated, our query will return the following:

Maine – 97000

In Tableau, it will appear like this:

Blend Active (linking field on): Selected State = Maine (

Blend Active Maine

Without a linking field, we receive the sales information for all the states.

Blend Inactive (linking field off) : Selected State = Maine

Blend Inactive

Step 3:  Build Your Initial View, Establish Your Primary Data Source

Drag Sales to columns.

Drag State to rows.

Sort State by Sales descending.

Step 4: Create Two Calculated Fields

Sales Difference:

From our primary data source, select Analysis then Create Calculated Field.

Take the absolute value of the difference between the Sales field from the secondary data source.

Note: The field is pre-aggregated and so is the Sales field from our primary data source.

ABS(SUM([Sample – Superstore – English (Extract) (copy)].[Sales]) – SUM([Sales]))

Sales Evaluation:

Use our filter to identify our selected state and states with lower and higher sales:

IF MIN([State]) == (ATTR([Sample – Superstore – English (Extract) (copy)].[State])) THEN  “Selected State”

ELSEIF SUM([Sales]) > SUM([Sample – Superstore – English (Extract) (copy)].[Sales]) then “Greater”

ELSE “LESS”

END

Step 5: Finish Out Your View

Drag Sales Difference to Columns (next to Sales).

From our secondary data source, drag State to the Filters shelf. Right click and select Show Quick Filter.

Note: Your view should disappear until you select a choice from the filter.

Disappeared View

Step 6: Format Your Workbook and Pat Yourself on the Back

Pro Tip: To show only states greater than or less than our selected state, select Greater from the legend then select Hide.  

Hide

What’s Going on Under the Hood?

By unlinking on State we send two queries out: one to the primary datasource, which brings back all the states. The other query is sent to the secondary datasource, returning only the state we care about via the filter on the secondary data source.

This scenario actually gives us a performance edge over a standard cross join because of the a la carte comparisons we can perform with our quick filter without bringing over data we’re not interested in.

Because of size and organization of data, your mileage may vary with real world datasets. When performing this example with your data, my recommendation would be to optimize your extracts by rolling up aggregation to a monthly or quarterly level and hide unneeded columns.

There you have it! In the time it would take to locate your SQL guy or to Google how to write this query, you handled the challenge. Now you’re probably wondering about the wonderful chocolate your boss is going to bring back for you from Zurich.

Good luck in your future blending!

UPCOMING WEBINAR

Data Strategy Myths vs Realities: What's Effective and What's Not

InterWorks uses cookies to allow us to better understand how the site is used. By continuing to use this site, you consent to this policy. Review Policy OK

×

Interworks GmbH
Ratinger Straße 9
40213 Düsseldorf
Germany
Geschäftsführer: Mel Stephenson

Kontaktaufnahme: markus@interworks.eu
Telefon: +49 (0)211 5408 5301

Amtsgericht Düsseldorf HRB 79752
UstldNr: DE 313 353 072

×

Love our blog? You should see our emails. Sign up for our newsletter!