If you’ve followed any of my previous blog posts, it’s pretty obvious that I am beyond excited about Snowflake. I have been working with the cloud data warehouse for about four months now, and it has consistently made my life easier. Today I am extremely excited to discuss how my love for Snowflake complements one of the first business intelligence tools I was exposed to: Tableau.
Tableau recently published a Whitepaper on using Snowflake as a data source with Tableau, and there are some extremely helpful tips and tricks hidden in the 63-page document. For those of you who don’t have the time or interest to read the whole thing, I invite you to continue reading this post for insights from the Whitepaper, along with my personal tips from the field on almost everything you need to know about powering your Tableau dashboards with Snowflake.
Things You Already Know
The Whitepaper starts off by reviewing some information about Snowflake that’s important to a reporting environment with Tableau:
- No need to install or maintain hardware
- Elasticity allowing the scale up and down of resources
- Snowflake’s fully SQL-compliant architecture
The New Stuff
After connecting to your Snowflake instance, we have the ability to take advantage of the full toolkit Snowflake offers. To get the most out of Snowflake with Tableau, it can occasionally take a little extra love. To use features like Time Travel or to query semi-structured data, you’re going to need to use some custom SQL connections, outlined in detail in the Whitepaper.
The Big Impact
Snowflake makes working with Tableau exponentially easier. From split-second query times to a pricing model designed for savings, the benefits stack up quickly. Looking at the Tableau reporting environment, here are some key areas in which Snowflake complements Tableau:
- Query Time
Snowflake’s automatic optimization techniques translate directly into sub-second query times. This means that for organizations that haven’t made the leap to Tableau Server and the use of Hyper Extracts, you can experience similar performance with a live database connection. Even if you are not planning on using Hyper Extracts, a Snowflake live connection will almost certainly be faster and more efficient than other data warehouse solutions. - Query Cache
I recently wrote a blog post that outlines what Query Cache is. When you partner Tableau with Query Cache, you end up with dashboards that can essentially query your database for free. - Semi-Structured Data
Have you ever tried using the Data Connector for JSON files in Tableau? If you have, then the way Snowflake handles semi-structured data will change your life. By building views in Snowflake, you can kill all your complex ETL processes for parsing JSON. It’s seriously just writing a select query.
Live Action!
I’ll demonstrate the features above by building a quick leaderboard for Destiny 2 in Tableau. Throughout the process of building the dashboard, Tableau will query Snowflake, which will allow me to analyze how they are handled. A great indicator of Tableau’s efficiency in sending queries is to analyze the query profiler in the History tab within Snowflake. Within the History tab, you can identify which queries were run by Tableau based on user, warehouse and time of execution. Below are two query profiles—one for the initial request to Snowflake and another from when I reload the dashboard with a different machine:
Pretty interesting, right? That first query took over three minutes, which is still very quick considering the quantity of data being pulled, but it gets better when Snowflake goes to reuse that query. Calling the queries cache takes less than a second. While the raw performance is incredible, the amount of data retrieved shows that using this query cache method to power a dashboard will cover most Tableau dashboard data requirements. Here’s a detailed breakdown of the view being generated:
You are reading that correctly. That is 1,268,205 distinct players, along with the aggregation of each player’s kills. That’s a healthy serving of information being displayed on my dashboard, and these query profiles are based on a 100% live connection.
Note: Dashboard above is an extract, because Tableau Public…
JSON Handling
The above example covers the query cache component and the query execution time. These both have significant impact on how you build reports and use Tableau. One unsung hero in this dashboard is the use of JSON views. Below is my data connection pane, along with a breakdown of each view’s total row count, to give you an idea of how much JSON Snowflake is shredding with each query:
VW_PLAYERS | VW_GAMESTATS | VW_WEAPONS |
13,367,463 | 13,367,463 | 20,060,697 |
The exciting piece about this JSON shredding is that even in a Tableau dashboard, Snowflake’s ability to parse and shred JSON in real time trumps the ways of the past. Sure, you can send your DBA to work on getting the JSON set up in structured tables, but when something handles JSON as well as Snowflake does … why would you?
Why You Should Care
When you look at choosing a data warehouse that complements Tableau, performance, cost and sustainability are key factors. Looking at Snowflake’s ability to cache query results, execute large transactions at efficient speeds and its ability to parse JSON on the fly, it becomes pretty evident that Snowflake should be an integral piece of anyone’s Tableau reporting environment. Snowflake allows you to maintain live connections to explore your data, meaning that you no longer have to send a request to your data team to have a new data source published as an extract.
On top of the quick execution times, businesses can power dashboards for free while the underlying data hasn’t changed. Distributing reports across a business has never been easier, and it removes a huge amount of concern over what your cloud bill will be at the end of each month. In addition to the obvious benefits of Query Cache and Snowflake’s query optimization engine, the ability to connect to views constructed with semi-structured data allows businesses to push their analytics process a step further.
I believe Snowflake is the solution for businesses looking to move their reporting environment to the cloud. If you have any questions regarding Snowflake with Tableau, please feel free to reach out to me on LinkedIn or leave a comment below!