Implementing Advanced Capabilities in dbt

Data

Implementing Advanced Capabilities in dbt

by Colin Murray
//

After you’ve finished setting up your dbt environment and begun working in dbt, you may find that you’ll need more than just the basics. As mentioned in The Transformational Power of dbt in Analytics, dbt allows data teams to leverage the capabilities of software engineering teams. Here, we are going to talk about dbt’s more advanced capabilities and how you can implement them in your projects.

Packages

Packages in dbt are standalone dbt projects that can be incorporated into other dbt projects. Packages are comparable to external libraries available in other programming languages. Think of things like Pandas in Python or dplyr in R. This helps make your data models more composable in the way that a Python project can be composed of potentially many libraries. dbt packages can then decrease the amount of time required to build data models by providing code that someone else has already created for common data-modeling scenarios. dbt makes all packages available via their package hub. There you can explore packages to solve common problems, like modeling data from Salesforce.

Jinja and Macros

One problem with SQL in the context of a modern data team is that SQL is almost entirely static. dbt solves for this by incorporating the Jinja templating language and the use of macros. Jinja provides a way to use control structures like “if statements” and for loops. In dbt, you use Jinja to write macros. Macros are similar to functions in other programming languages. They are reusable pieces of code that can be incorporated into your SQL models.

For example, below is a simple way to calculate sales tax using a macro:

{% macro sales_tax(column_name, sales_tax_rate, precision=2) %}
    ({{ column_name }} * (sales_tax_rate / 100))::numeric(16, {{ precision }})
{% endmacro %}

To use this macro inside a model, the following query could be an example:

select
  product,
  sale_price,
  sales_tax(sale_price, 8.75) as total_sales_tax
from product_sales;

Using the above macro, we can now calculate the amount of sales tax owed per product if the sales tax rate is 8.75%.

Hooks and Operations

Hooks are SQL statements that can be executed before or after a model, seed or snapshot is run. When you add hooks to transformations, you have the ability to create dependencies before a model is built. A great example of a pre-hook is before any model is built, you can choose the role prior to running your script. Meanwhile, a post-hook could grant the ability to select values in a newly created table or view to a role after the model is built. Operations are simply a macro that can be run outside of a model. The dbt docs give an example of running an operation and show how to easily use operations inside of dbt.

Implementing dbt

Advanced dbt usage is typically a function of project maturity. As you create more models, add more sources and run more tests, the ability to modularize and reuse parts of your code increases the dependability, consistency and trustworthiness of your data. It’s past time data teams take full advantage of the tools that software teams have had for years.

We have been a part of numerous dbt implementations across a variety of industries; with this depth of experience, we’re knowledgeable of the pitfalls to avoid. If you are having trouble finding the right package, creating macros or fitting dbt within your team, don’t hesitate to reach out. We’d love to help!

More About the Author

Colin Murray

Data Lead
Choosing Data Pipeline Tools: Matillion or Alteryx Designer What are the differences or similarities between Matillion and Alteryx Designer? Both Matillion and Alteryx are popular ...
Moving Objects Between S3 Buckets via AWS Lambda A common scenario that people encounter is that people need to move one object in a S3 bucket to a different bucket. Recently, I ...

See more from this author →

InterWorks uses cookies to allow us to better understand how the site is used. By continuing to use this site, you consent to this policy. Review Policy OK

×

Interworks GmbH
Ratinger Straße 9
40213 Düsseldorf
Germany
Geschäftsführer: Mel Stephenson

Kontaktaufnahme: markus@interworks.eu
Telefon: +49 (0)211 5408 5301

Amtsgericht Düsseldorf HRB 79752
UstldNr: DE 313 353 072

×

Love our blog? You should see our emails. Sign up for our newsletter!