The Tableau Performance Checklist series is designed to help you streamline your dashboard performance and Tableau Server configuration. Each post expands upon one item listed in the master Tableau Performance Checklist.
Next, we’ll explore the sixth point under the Data section of the Tableau Performance Checklist:
“Use extracts wherever possible to accelerate performance. Hide unused and confidential fields. Roll up data granularity by pre-aggregating or filtering. Break hierarchies to only visible dimensions”
There are many reasons to use extracts, and many opinions about when to choose them over other types of connections. The point above assumes you are already using extracts and focuses on getting the most out of them. Let me explain a little more about how extracts work and why the points above can make a difference.
First, when you create an extract, Tableau applies several techniques to optimize it. Tableau first outlines the structure for the extract and defines a separate file for each column used in the underlying data source. It then sorts and compresses the values for each column and adds them to a columnar store file.
By hiding unused fields, as mentioned in the checklist, you minimize the number of column files that need to be created. These files are then combined with metadata to form a single memory-mapped extract file that holds every column kept from your underlying data source.
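To make the idea of per-column storage more concrete, here is a minimal Python sketch. It is not Tableau's actual extract format or engine: the field names, the `build_extract` helper and the dictionary-encoding scheme are all assumptions made for illustration. It only shows why a hidden field never costs any space: its column is simply never written.

```python
import json
import zlib

# Hypothetical row-level source data; the field names are invented for illustration.
rows = [
    {"region": "East", "product": "Chair", "sales": 120, "internal_note": "checked by QA"},
    {"region": "West", "product": "Desk", "sales": 340, "internal_note": "pending review"},
    {"region": "East", "product": "Desk", "sales": 310, "internal_note": "checked by QA"},
]


def encode_column(values):
    """Dictionary-encode one column, then compress it (illustrative scheme only)."""
    dictionary = sorted(set(values))  # sorted distinct values
    position = {value: index for index, value in enumerate(dictionary)}
    codes = [position[value] for value in values]  # one small integer per row
    payload = json.dumps({"dictionary": dictionary, "codes": codes})
    return zlib.compress(payload.encode("utf-8"))


def decode_column(blob):
    """Reverse encode_column and return the original values in row order."""
    payload = json.loads(zlib.decompress(blob).decode("utf-8"))
    return [payload["dictionary"][code] for code in payload["codes"]]


def build_extract(rows, hidden_fields=()):
    """Return {column name: compressed bytes}, skipping every hidden field."""
    return {
        name: encode_column([row[name] for row in rows])
        for name in rows[0]
        if name not in hidden_fields
    }


full = build_extract(rows)
trimmed = build_extract(rows, hidden_fields={"internal_note"})

print(sorted(full))     # all four columns are stored
print(sorted(trimmed))  # the hidden column is absent
print(sum(map(len, full.values())), sum(map(len, trimmed.values())))  # total bytes
```

In this sketch the trimmed extract is always smaller, because the remaining columns are encoded identically and the hidden one contributes nothing.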
Second, when creating an extract, you have the option to aggregate your data for visible dimensions. The result is commonly referred to as an aggregated extract. Because the data is aggregated, the extract does not bring in every row-level record the way a non-aggregated extract does.
When you interact with an aggregated extract, the aggregations have already been computed, so Tableau has little work left to do to display results within your visualization. You can also choose a roll-up level (for example, rolling dates up to the month) to further reduce the size of the extract, again increasing performance.
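The effect of aggregating for visible dimensions can be sketched in a few lines of Python. This is only an illustration of the concept, not how Tableau builds an extract: the sample rows, the `aggregate_extract` helper and its parameters are hypothetical. Here `customer` plays the role of a hidden field, and dates are rolled up to the month.

```python
from collections import defaultdict
from datetime import date

# Hypothetical row-level data; "customer" stands in for a hidden field.
rows = [
    {"order_date": date(2023, 1, 5), "region": "East", "customer": "A", "sales": 100.0},
    {"order_date": date(2023, 1, 20), "region": "East", "customer": "B", "sales": 50.0},
    {"order_date": date(2023, 2, 3), "region": "East", "customer": "A", "sales": 25.0},
    {"order_date": date(2023, 1, 9), "region": "West", "customer": "C", "sales": 70.0},
]


def aggregate_extract(rows, visible_dimensions, measure, date_field=None):
    """Sum `measure` for each combination of the visible dimensions.

    If `date_field` is given, its values are rolled up to the first day
    of the month before grouping.
    """
    totals = defaultdict(float)
    for row in rows:
        key = []
        for dimension in visible_dimensions:
            value = row[dimension]
            if dimension == date_field:
                value = value.replace(day=1)  # roll the date up to the month
            key.append(value)
        totals[tuple(key)] += row[measure]
    return dict(totals)


monthly = aggregate_extract(
    rows, ["order_date", "region"], "sales", date_field="order_date"
)

print(len(rows), "source rows ->", len(monthly), "aggregated rows")
for key, total in sorted(monthly.items()):
    print(key, total)
```

The two January orders for the East region collapse into a single pre-summed row, so a view at month and region level only has to read the stored totals.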
Data Source Filtering
Lastly, the checklist mentions filtering. Data source filters can also help you control the size of your extracts, but the timing of when the filter is applied is key:
- If a data source filter is in place before the extract is created, the extract will contain only the filtered records.
- If a data source filter is put in place after the extract is created, the filter is applied against the full extracted data set. Your extract still contains all of the data, but only shows what the data source filter allows.
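The two cases above can be sketched as follows. This is a simplified Python model rather than Tableau behavior reproduced exactly: `create_extract`, `visible_rows` and the sample data are hypothetical names introduced for illustration.

```python
# Hypothetical source table; the fields are invented for illustration.
source_rows = [
    {"order_id": 1, "year": 2021},
    {"order_id": 2, "year": 2022},
    {"order_id": 3, "year": 2023},
    {"order_id": 4, "year": 2023},
]


def recent_orders(row):
    """The data source filter: keep only 2023 orders."""
    return row["year"] == 2023


def create_extract(rows, data_source_filter=None):
    """Copy rows into the extract, applying any filter that already exists."""
    if data_source_filter is None:
        return list(rows)
    return [row for row in rows if data_source_filter(row)]


def visible_rows(extract, data_source_filter=None):
    """Rows a view can see when a filter is applied on top of an extract."""
    if data_source_filter is None:
        return list(extract)
    return [row for row in extract if data_source_filter(row)]


# Case 1: the filter exists before extract creation, so the extract is smaller.
small_extract = create_extract(source_rows, recent_orders)

# Case 2: the filter is added afterwards, so the extract keeps every row.
full_extract = create_extract(source_rows)
shown = visible_rows(full_extract, recent_orders)

print(len(small_extract))  # 2 rows stored
print(len(full_extract))   # 4 rows stored
print(len(shown))          # 2 rows shown
```

Both cases show the same two rows in a view, but only the first one reduces what is stored in the extract.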
I hope this brief explanation sheds some light on how extracts work and how you can utilize them to increase performance.
Mastering Best Practices
If you’re interested in becoming a Tableau Server guru, then learning these performance best practices is essential. Check back frequently as we add new posts and dive deeper into each point in the Tableau Performance Checklist.
Another great way to identify best practices is to leverage the insights offered by our Performance Analyzer, part of Workbook Tools for Tableau. It examines all of your workbooks, worksheets, dashboards and data sources against a list of best practices to make sure you're using all the tips and tricks that keep your visualizations moving at light speed.
As always, feel free to get in touch with us if you have any questions regarding performance or anything Tableau related! We’d be happy to help.