I’m en route to San Jose for some work with a new client then a weekend in Salt Lake City for a few laps on the Great White Icicle! I usually try to have some kind of grander plan for my blogs, but this one is simply designed to save you some time if you ever need to write some Custom SQL against the JET driver.
I’ve been musing whether to condemn the JET driver for its perplexing syntax and lack of simple functions or laud it for providing much of the functionality found in real databases when connecting to something as simple as an Excel document. Ultimately, I’ve settled on grudging appreciation for the work Excel is able to do as a stand-in for a true SQL DB. As JET is the (Slightly Frustrating) way that happens, here are a few Tips and Tricks that may be time savers the day you need to do them. This isn’t intended to be a comprehensive list, these are simply the Gotcha!’s that I’ve had to work around this week alone.
SubQuery Syntax
When writing a join that requires a sub-query, JET requires a somewhat strange syntax. Imagine I have two tables, each with a different metric and want to join their aggregates based on a shared dimension – let’s not worry with why I’m doing this particular query, I want to keep the example simple and the syntax is the point.
Column | Type |
Region | Nvarchar(56) |
Metric One | Float |
Column | Type |
Region | Nvarchar(56) |
Metric Two | Float |
SELECT
QUERYONE.[Region],
QUERYONE.[Metric One Total],
QUERYTWO.[Region_two],
QUERYTWO.[Metric Two Total]
FROM(
SELECT [Table1$].[Region] AS [Region],
Sum([Table1$].[Metric One]) AS [Metric One Total]
FROM [Table1$]
GROUP BY [Table1$].[Region]
) [QUERYONE]
LEFT JOIN
(
SELECT [Table2$].[Region] AS [Region_two],
Sum([Table2$].[Metric Two]) AS [Metric Two Total]
FROM [Table2$]
GROUP BY [Table2$].[Region]
) AS [QUERYTWO]
ON QUERYONE.[Region]=QUERYTWO.[Region_two]
Note the interesting token assignments for the sub-queries – each requires square brackets on the token outside of the sub-query itself. Failing to follow this syntax and using code you’ve seen in other ANSI SQL sources will result in a myriad of errors like “Syntax Error in FROM Clause” or “Syntax Error in JOIN Clause”. You’ll see that I actually used a slightly different token assignment on the second sub-query, including AS in there – as long as the assignor has square brackets, we’re fine. Note also that I had to alias the Region field in the second query to “Region_two” to avoid the dreaded “Duplicate output alias ‘Region’” error – Tableau and JET won’t support two identically named fields, even from two different tables.
Cross Join
Let’s build on that example and consider a real-life problem I may want to solve using Custom SQL (yes there are some other solutions including Blending, Table-Calculations and Pass-Through Calculated fields, but we’re focusing on SQL in JET today). Imagine I want to create a % of Total field where I look at one Region as a percent of the total across regions and wanted to achieve that using Custom SQL to get the grand total on all lines. I could easily achieve this using a Cross Join in most databases (Cross Join will essentially create a Cartesian Product).
Here’s a cross Join that would solve the problem in another DB:
SELECT
ORDERS.[Region],
ORDERS.[Sales],
SUM.[Total Sales]
FROM
(SELECT
[Orders$].[Region] AS [Region],
[Orders$].[Sales] AS [Sales]
FROM [Orders$]) [ORDERS]
CROSS JOIN
(SELECT
sum(Sales) AS [Total Sales]
FROM [Orders$]
) [SUM]
EDIT: I want to thank our good friend Jonathan Drummey, a Tableau Zen Master, for introducing me to the CROSS JOIN syntax in JET. Turns out old dogs [myself] CAN learn new tricks! This is a great help, though utilizes somewhat unintuitive syntax. The generic syntax for a CROSS JOIN against JET is below – note that it effectively produces the CROSS JOIN output without specifying any Join condition.
SELECT
TABLE1.[FieldOne],
TABLE2.[FieldTwo]
FROM TABLE1,TABLE2
So that would yield the following query to produce our necessary effect.
SELECT
ORDERS.[Region],
ORDERS.[Sales],
SUM.[Total Sales]
FROM
(SELECT
[Orders$].[Region] AS [Region],
[Orders$].[Sales] AS [Sales]
FROM [Orders$]
) [ORDERS],
(SELECT
sum(Sales) AS [Total Sales]
from [Orders$]
) [SUM]
Nesting SubQueries
Both of the examples above used two tables and two sub-queries. What if I had three? I know many of you are used to writing pretty SQL like:
SELECT * FROM A,B,C
JOIN B ON A.Field=B.Field
JOIN C ON A.Field=C.Field
Not in JET you’re not. If two queries necessitated the form:
SELECT A.Columns, B.Columns (no * here)
FROM
(SELECT Columns FROM A) AS [A]
LEFT JOIN
(SELECT Columns FROM B) AS [B]
ON A.Key=B.Key
Which I would abbreviate as (A join B), then adding a third table will take the form ((A join B) join C), or in longerhand:
SELECT AB.Columns, C.Columns FROM
(
SELECT A.Columns, B.Columns (no * here)
FROM(
SELECT Columns FROM A) AS [A]
LEFT JOIN
(SELECT Columns FROM B) AS [B]
ON A.Key=B.Key
) AS [AB]
LEFT JOIN
( SELECT C.Columns FROM C
) AS [C]
ON [AB].Key=[C].Key
That’s it for now – don’t even get me started on FULL OUTER joins or some other subtleties. My flight is going to land soon and I’m looking forward to some dinner. Hopefully this will save you a few moments of headache. I’ve included an example of each in the attached TWBX workbook.