It may happen that we work with really big tables that are not just a few rows but millions or even billions of rows. Tableau’s VizQL’s SQL (what a beginning of a sentence!) has to query all these tables, and that takes time. When we are dealing with seconds of loading time (latency) or even more, that is not just a developer’s issue but a problem that each consumer has to face. And nowadays, nothing frustrates more than waiting on a website to load its data.
It’s not only about frustration. Let’s say:
- We have 2,000 co-workers who all have to visit a specific Tableau dashboard twice a day and adjust a filter each time – so, 4 visits/clicks per day.
- Only 1,500 of them are actually doing that; the rest are too frustrated because of long latencies.
- Each visit or click takes 30 seconds.
- People work on approximately 200 days a year.
Now, we multiply: 1,500 employees x 200 days x 4 visits/clicks per day x 15 seconds = 36,000,000 seconds = 10,000 hours waiting time. That’s the same as putting 10 colleagues into a room in January and forcing them to wait on a website to load – until July.
Getting the 30 seconds down to 15 seconds (still an eternity) will let the poor guys out of the room in April at least. This is where the Performance Options of the Relationships come in. With them, we can help Tableau draft faster queries and give our co-workers time for more relevant things to do than waiting.
Of course, we will never reduce the latency to zero; but every reduction helps, even the minor ones.
The Cardinality defines if the elements of your table are unique or not. It’s kind of difficult to explain without showing it, so here is a tiny animation for you:
We connected both of our tables on the Post_ID, which is what our colorful rectangles are showing. The left table of our posts is designed in a way that every post gets exactly one row, meaning the Post_IDs are unique in there. Every ID occurs exactly ONE time.
The table on our right is different. The IDs of each comment may also be unique; but as we connect our tables on the Post_ID, we have to look at these. And no, the Post_IDs are not unique: Some Post_IDs have several comments and appear MANY times. This is all this Cardinality is about.
Behind the curtain the following is happening: Once we drag fields into our view, Tableau performs joins on the data depending on what we want to see. Usually, Tableau takes the necessary data from the first table, aggregates it and then merges it with the aggregated data of the other table. This is a safe bet as no values are duplicated along the way. This is the standard procedure and is always triggered when the Cardinality is set to MANY in both tables. That is the default.
Now, if we tell Tableau the Cardinality is ONE, it’s the other way round: Tableau joins first and aggregates after. That means there is less aggregation to do and, as such, results in faster performance.
But, be aware: The option ONE shall only be set when you are sure the items in that table are in fact unique. Otherwise, you risk – as Tableau puts it – unpredictable and incorrect values. Equipped with that knowledge, you may already use this option!
Now, if you are curious and want to go down this rabbit hole a little further, there will be another deep dive one blog after the next of this mini-series, “Breaking the Cardinality,” where we will even look at the SQL queries behind our view.
If you are all set, then feel free to jump to the next chapter: “Referential Integrity“