Tableau Prep: How to Union and Join Your Data to Infinity and Beyond

Data

Tableau Prep: How to Union and Join Your Data to Infinity and Beyond

As you may recall from my previous post, the year is 3015. As the spaceship sales manager for Toyota, I’m gathering data from the marketplace to analyze it, identify trends and eventually make pricing decisions based on my findings. We’ve been struggling to keep up with other manufacturers as of late, and it’s time to reverse the trend. They just don’t make spaceships like they used to.

I’ve been able to scrounge up planetary sales data from four different sources; Earth, Mars, Earth’s Moon, Io and data from the Andromeda galaxy. While my sales data is coming from different parts of the galaxy, it’s in a standardized format. Additionally, I have some data that lives in SQL Server containing customer information and demographics as well as various spaceship specifications.

Using Tableau Prep, I’ll show you how to combine all these disparate data sources into something easily digested by Tableau Desktop. If you need a refresher on joining data, Zac Heacker wrote a great blog on Combining Data with Joins. Let’s begin!

State of the Union

I’m going to start by opening Tableau Prep and connecting to my .xlsx sales files. For my analysis, I don’t want these datasets to be separated; since these four sources are identical in number of fields, field names and data types, I can union them into one set of data. Far out!

Here are my four standardized sales files waiting in the Flow Pane – all of them have TransactionID, CustomerID, ShipCode, SalePrice and SaleDate:

Tableau Prep - Standardized Files in Flow

To create the union, I’ll simply click on the + sign next to one of my data sources and select Add Union. You can see data profiling is available on the Union step, which is tremendously helpful:

Tableau Prep - Add Union

To add the other data sources to the new union, drag them onto the Union step and drop them on the +Add as shown here:

Tableau Prep - Add to Union

Once I’ve added all four data sets to the union, the Profile Pane shows what I’ve included:

Tableau Prep - Union Complete

Victory! I have now successfully combined the four standardized sales files into one set of data and any new connections to this Union step will treat it as such. Next, I need to join this unioned data to the SQL tables containing customer information and spaceship specifications.

… I know Join-Fu

Tableau Prep - Join Fu

After connecting to the SQL database, I’ll drop the customer_info and spaceship_mfg_info tables onto my Flow Pane. Notice the Custom SQL option here; you can craft custom SQL to retrieve data if necessary, but I just need these tables in their existing form:

Tableau Prep - Add Tables

I’ll be joining customer_info and spaceship_mfg_info to the Union data set sales_1 (just created) by way of their respective foreign keys (CustomerID and ShipCode). Do this by dragging and dropping from the Flow Pane onto the New Join side of sales_1:

Tableau Prep - Customer Info Join

Looking at the Join Panel in the Profile Pane, Tableau Prep smartly recognizes the join will be on Customer_ID. As expected, the data resulting from the join is now available for review in the Profile Pane, and the column headers are color-coded to match their source in the Flow Pane:

Tableau Prep - Customer Join Results

Since I’m just supplementing customer information to the sales data where it’s available, I’m going to adjust my inner join to a left join. Tableau Prep continues to be fantastically helpful with the visual aids and data profiling, even on something as seemingly mundane as applying a join.

Now, I want to join spaceship_mfg_info to this freshly joined sales and customer data. To accomplish this, I’ll drop spaceship_mfg_info onto the Join step I just made as another New Join:

Tableau Prep - Ship Join

Once again, Tableau Prep identifies the correct key field to join on (ShipCode). The Profile Pane shows me there were 15 ShipCodes that were not involved in any sales, and therefore will be excluded from the Join Results. I can even click on the Excluded values to see them profiled, showing exactly which ships were not found:

Tableau Prep - Ships Not Found

Bonus time: Doing a cursory review of my final set of combined data, I noticed the SalePrice and SaleDate columns must have been set to text in the Excel files (so much for standardized format!). I want to change these into decimal and date respectively, so I can use them more efficiently in Tableau Desktop. Both will be easy fixes in Tableau Prep and will require no backtracking because I can simply transform the fields from the “final” Join step itself!

For SaleDate, I’ll just click on the data type indicator within the Profile Pane, select date and “viola!” – a Clean step is added with the appropriate transformation:

Tableau Prep - Ship Date

To fix SalePrice, I’ll try an Automatic Split – to my delight, the $ symbol has been stripped out and my newly created value is in decimal format. Then, I’ll just delete the “bad” field and rename the split value accordingly. Now we’re cooking with lasers!

Tableau Prep - Sale Price

I have now successfully combined my four .xlsx sales files with the two SQL tables containing customer information and spaceship specifications! Adding an output step at the end, here’s the current state of my Flow. From here, I can create a Tableau Data Extract, .csv output file or even publish a data source to Tableau Server:

Tableau Prep - Final Flow

That’s all for now! Stay tuned for more Tableau Prep fun, and please reach out to us today if you have any questions or want to learn more about how InterWorks can help you master your data.

More About the Author

Spencer Parker

Data Engineer
Tableau Prep: How to Union and Join Your Data to Infinity and Beyond As you may recall from my previous post, the year is 3015. As the spaceship sales manager for Toyota, I’m gathering data from the ...
Tableau Prep: How to Cleanse Your Data and Prepare It for World Domination Analysis Hello again, data friends. My focus for this blog post will be the variety of formidable data cleansing options available in Tableau ...

See more from this author →

InterWorks uses cookies to allow us to better understand how the site is used. By continuing to use this site, you consent to this policy. Review Policy OK

×

Interworks GmbH
Ratinger Straße 9
40213 Düsseldorf
Germany
Geschäftsführer: Mel Stephenson

Kontaktaufnahme: markus@interworks.eu
Telefon: +49 (0)211 5408 5301

Amtsgericht Düsseldorf HRB 79752
UstldNr: DE 313 353 072

×

Love our blog? You should see our emails. Sign up for our newsletter!