Data modeling with dbt: Part III, SQL best practices

When writing dbt models, following SQL best practices helps keep your data models clean, concise, and easy to follow.

Use joins instead of subqueries

Using joins in dbt is best practice over using subqueries. Joins simplify code and make it more readable. Subqueries can be messy and complicated to follow. SQL best practices aim to streamline and make code easy to read. Use joins instead of subqueries to produce cleaner and more readable code. Replace the following code:

With something like:

This is much easier to read. Writing good SQL functions and prioritising readability is essential for creating high-quality dbt data models.

Use descriptive names

It is essential to give models descriptive names so that it is easier to understand their purpose. An excellent way to name models is by using verbs describing their function. For instance, if a model is created to join two tables, it can be called '_joined' to indicate that it was designed for this specific purpose. This makes it easier for readers to understand why the model was created in the first place.

Using descriptive names and in-line comments is crucial when working on complex transformations, which may need clarification. Doing so will help to communicate the purpose of that particular piece of code to anyone who reads it without requiring them to ask you for clarification.

No abbreviations but full names

When joining tables, clarify which columns are from which table by spelling out the full table names or similar names. Avoid using random letter aliases for tables, as it can be frustrating and confusing to map back to the original tables.

Instead of using aliases, the full name of the table is spelt out in a previous example.

Comparing with the following:

The first option is much cleaner and easier to understand.

Previous
Previous

Data modeling with dbt: Part IV, testing

Next
Next

Data modeling with dbt: Part II, design best practices