Since Tableau’s release of relational data sources (AKA, “Noodle”) and the introduction of a logical layer, conventional wisdom has carried over into leveraging both physical and logical layers when modeling Tableau data sources. Typically, we see physical joins to enforce row-level security. We also know that creating Tableau Hyper extracts to package data increases dashboard performance and responsiveness. However, to truly maximize Hyper’s columnar nature, we should look to eliminate all activities in the physical layer.
Physical & Logical Layers
Consider the following example: Figure 1 contains a logical table,”Orders,” composed of two physical tables — “Orders” and “Orders Entitlement.” We enforce row level security (RLS) through a physical join on Order ID and Username.
Figure 1: Logical Table With Physical Tables Joined
To determine the result set of our logical table, Tableau would need to perform the following steps:
- Determine the Current User through the USERNAME() function.
- For each record in the Orders table, match records with the Orders Entitlement table on the condition of Order ID = Order ID and Current User = Username.
Rather than forcing Tableau to perform row-level actions through an inner join between two tables, we should push the join and materialize the resulting dataset to the database instead. We would enforce RLS through a data source filter on USERNAME() = Username as shown below.
Figure 2: Logical Table With No Physical Joins
To determine the result set of our logical table, Tableau would need to perform the following steps.
- Determine the Current User through the USERNAME() function.
- Find records a user is entitled to in the materialized Orders + Entitlements table.
By materializing our Orders and Entitlements table, we remove row-level actions and leverage hyper columnar indexing to quickly find records for a given user. However, this comes at a cost of increasing Tableau Workbook size since our extract now contains the result of pre-joining two tables as opposed to maintaining each table separately within the extract. For a small group of users and aggregated security levels (e.g. secured by Country), this may be a non-issue. However, in the case where you are needing to enforce security at a granular level (e.g. Order ID) amongst a large group of users (e.g. over 50,000 employees worldwide), this becomes a major issue detrimental not only to Workbook size, but Workbook performance as well. In Part 2 of our series, we’ll cover methods to address workbook size and performance degradation due to the sudden increase in data volume. Stay tuned!