PopSQL
Back to tutorials

dbt run Command: Usage & Examples

Introduction

Welcome to this comprehensive tutorial on the dbt run command. dbt is a transformative tool for data professionals who need to manage and transform data in SQL. The dbt run command is a core function of dbt, enabling the execution of compiled SQL model files against a specified target database. This is particularly beneficial in business scenarios where raw data needs to be transformed into a more digestible format for analysis and decision-making.

Understanding the dbt run Command

The dbt run command is designed to execute your compiled SQL model files against the current target database. It connects to your database and executes the necessary SQL to materialize all data models using the strategies you’ve outlined.

Let’s consider a business scenario where you have a raw sales data table and you want to create a model that aggregates this data by product category. We'll assume you already created a staging model for that table, so we can reference the staging model here. Your SQL model file might look like this:

-- sales_by_category.sql

{{
  config(
    materialized='table'
  )
}}

select 
  product_category,
  sum(sales_amount) as total_sales 

from {{ref('stg_sales_data')}}

group by product_category

To run this model, you would use the dbt run command in your terminal:

dbt run --model sales_by_category

The command will execute the SQL in the model file against your target database, creating a new table with the aggregated sales data.

dbt run Model Deployment

When deploying new models, dbt run handles the process smoothly. It first builds each model with a temporary name, then drops the existing model, and finally renames the new model to its correct name. This process is handled within a single database transaction for database adapters that support transactions, ensuring data integrity.

For instance, if you update the sales_by_category.sql model to also include the average sales amount per category, dbt run will create a new version of the sales_by_category table with the additional data, and then replace the old table with the new one.

Using the full-refresh Flag

The --full-refresh flag is a powerful feature of dbt run. When used, dbt treats incremental models as table models. This is especially useful when the schema of an incremental model changes and you need to recreate it, or when you want to reprocess the entirety of the incremental model due to new logic in the model code.

For example, if we had a daily version of our sales_by_category model from earlier and it only added new rows each day, it'd look like this:

--daily_sales_by_category.sql

select 
  sales_date,
  product_category, 
  sum(sales_amount) as total_sales, 
  avg(sales_amount) as average_sales, 
  count(*) as number_of_transactions

from {{ref('stg_sales_data')}} 

{% if is_incremental() %}

  where sales_date > (select max(sales_date) from {{ this }})

{% endif %}

group by sales_date, product_category

In this case if you need to run the model with all your raw data, and not just the days not yet added to the model, you can use the --full-refresh flag to recreate the table with the new data.

dbt run --model sales_by_category --full-refresh

This will ignore the incremental logic in the where statement of the model and run for all days in the source data set.

Selecting Specific Models to Materialize

dbt run also allows you to select which specific models you’d like to materialize. This can be useful in scenarios where you may prefer running a different set of models at various intervals, or when you want to limit the tables materialized while you develop and test new models.

For example, if you have a large number of models but you’re currently only working on a few of them, you can use the m option with dbt run to only run those models. This can save time and resources during the development and testing process.

dbt run -m sales_by_category,daily_sales_by_category

Conclusion

We’ve covered a lot in this tutorial, from understanding the dbt run command and its intelligent multi-threading, to deploying dbt run models and using the --full-refresh flag. Remember, the best way to get comfortable with these concepts is to practice. So, go ahead and start using the dbt run command in your data transformation projects.

database icon
Unified workspace for your dbt workflow
Forget about the painful parts of dbt development, focus on what matters the most - data analysis