dbt docs generate & serve: Command Usage and Examples
Introduction
dbt is a command-line tool that enables data analysts and engineers to transform data in their warehouses more effectively. One of the key features of dbt is its ability to generate documentation for your data models, which is where the dbt docs command comes into play. This tutorial will guide you through the process of generating and serving your project documentation using dbt.
Understanding the dbt docs Command
The dbt docs command has two subcommands: generate and serve. The generate subcommand is used to create your project’s documentation, while the serve subcommand is used to view this documentation in a web browser.
Generating Documentation with dbt docs generate
To generate documentation for your dbt project, navigate to the root of your dbt project in your terminal and run the following command:
dbt docs generate
This command will create a static site with documentation for your project. The site includes information about your models, tests, sources, and more.
Serving Documentation Locally with dbt docs serve
After generating the documentation, you can view it locally using the serve command. Run the following command in your terminal:
dbt docs serve
This will start a web server and open the documentation in your default web browser. You can navigate through the documentation to view information about your dbt project.
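If the default port (8080) is already in use, or you are working on a remote machine where opening a browser makes no sense, the serve subcommand accepts --port and --no-browser. The sketch below wraps both in a small helper; the name serve_docs and the input check are choices made for this illustration, not part of dbt.

```shell
# Serve previously generated docs on a chosen port without opening a browser.
# serve_docs is a hypothetical helper; --port and --no-browser are dbt flags.
# Run `dbt docs generate` first so there is something to serve.
serve_docs() {
  port="${1:-8080}"   # 8080 is the port dbt uses when none is given
  case "$port" in
    ''|*[!0-9]*)
      echo "port must be a number: $port" >&2
      return 2
      ;;
  esac
  dbt docs serve --port "$port" --no-browser
}
```

Calling serve_docs 8001 from the project root would then make the site available at http://localhost:8001 until you stop the server with Ctrl+C.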
Exploring the Documentation
The generated documentation provides a wealth of information about your dbt project. You can switch between a view of your project’s folder hierarchy and a database-focused collection of tables and views using the Project/Database toggle. You can also use the search bar to find specific models in your project.
Visualizing Data Lineage
One of the powerful features of dbt docs is its ability to visualize the relationships between your models. You can open the lineage graph from a model’s page; in the standard dbt docs site this is the graph button in the bottom-right corner. The graph shows all the models that are upstream or downstream of the selected model, providing a clear view of your data’s lineage.
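The same upstream and downstream relationships can also be listed from the terminal with dbt's graph selectors, which is handy when you want to check lineage without opening the site. A minimal sketch: list_lineage is a hypothetical helper name, while dbt ls, --select with the + operator, and --resource-type are standard dbt CLI features.

```shell
# List a model together with its upstream and downstream models.
# list_lineage is a hypothetical helper name used for this sketch.
list_lineage() {
  model="${1:-}"
  if [ -z "$model" ]; then
    echo "usage: list_lineage MODEL_NAME" >&2
    return 2
  fi
  # A leading + selects ancestors, a trailing + selects descendants.
  dbt ls --select "+${model}+" --resource-type model
}
```

For a model named my_model (a placeholder), list_lineage my_model would print my_model along with every model it depends on and every model that depends on it.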
Uploading Documentation to re_cloud
If you want to share your documentation with others, you can upload it to re_cloud. To do this, you need to set up re_cloud and run the following command in your dbt project directory:
re_cloud upload dbt-docs --name your_project_name
This will upload your generated documentation to re_cloud, where it can be accessed by others in your organization.
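Because the upload sends whatever documentation was generated last, it can be convenient to regenerate and upload in one step. The following is a rough sketch that uses only the two commands shown in this tutorial; publish_docs is a hypothetical helper, and the re_cloud invocation is copied from the command above rather than taken from the re_cloud reference.

```shell
# Regenerate the docs and upload them only if generation succeeded.
# publish_docs is a hypothetical helper name used for this sketch.
publish_docs() {
  project_name="${1:-}"
  if [ -z "$project_name" ]; then
    echo "usage: publish_docs PROJECT_NAME" >&2
    return 2
  fi
  # && stops before the upload when `dbt docs generate` fails,
  # so stale or partial docs are not published.
  dbt docs generate && re_cloud upload dbt-docs --name "$project_name"
}
```

Run it from the dbt project directory, for example publish_docs your_project_name.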
dbt docs Example
Assume we have a dbt project with a simple model that transforms raw sales data into a more useful format. The model is defined in a file called sales.sql:
-- models/sales.sql
{{ config(materialized='table') }}

select
    order_id,
    product_id,
    customer_id,
    quantity,
    price,
    quantity * price as total_price,
    order_date
from raw.sales
We also have a schema.yml file that describes this model:
# models/schema.yml
version: 2

models:
  - name: sales
    description: This table contains transformed sales data.
    columns:
      - name: order_id
        description: The unique identifier for each order.
      - name: product_id
        description: The unique identifier for each product.
      - name: customer_id
        description: The unique identifier for each customer.
      - name: quantity
        description: The quantity of the product sold in the order.
      - name: price
        description: The price of the product.
      - name: total_price
        description: The total price of the order, calculated as quantity * price.
      - name: order_date
        description: The date the order was placed.
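Before generating the docs, it can help to build the model so that the table exists in the warehouse, because dbt docs generate queries the warehouse for column types and similar metadata. A minimal sketch, assuming the raw.sales table is already loaded; build_model is a hypothetical helper name, while dbt run and --select are standard dbt CLI features.

```shell
# Build a single model so its table exists before docs are generated.
# build_model is a hypothetical helper name used for this sketch.
build_model() {
  model="${1:-}"
  if [ -z "$model" ]; then
    echo "usage: build_model MODEL_NAME" >&2
    return 2
  fi
  dbt run --select "$model"
}
```

For the example above, that would be build_model sales, run from the project root.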
Now we can generate documentation for this model using dbt docs generate. In your terminal, navigate to the root of your dbt project and run:
dbt docs generate
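This command writes its output to the target directory in your dbt project. Inside this directory, you’ll find index.html, which is the documentation site itself, along with a manifest.json file and a catalog.json file. The manifest describes the resources in your project (models, tests, sources, and the descriptions you wrote for them), while the catalog holds metadata gathered from the warehouse, such as column types. The documentation website reads both files to render its pages.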
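A quick way to confirm that the run produced everything the site needs is to check for the three files. The sketch below does only that; check_docs_artifacts is a hypothetical helper name, and it assumes the default target directory unless another path is passed in.

```shell
# Report any docs artifact missing from the target directory.
# check_docs_artifacts is a hypothetical helper name used for this sketch.
# Returns 0 when all three files exist and 1 otherwise.
check_docs_artifacts() {
  target_dir="${1:-target}"
  missing=0
  for artifact in index.html manifest.json catalog.json; do
    if [ ! -f "$target_dir/$artifact" ]; then
      echo "missing: $target_dir/$artifact"
      missing=1
    fi
  done
  return "$missing"
}
```

Running check_docs_artifacts from the project root after dbt docs generate should print nothing when all three files are present.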
Next, you can serve the documentation locally using dbt docs serve:
dbt docs serve
This command will start a web server and open your default web browser to the documentation site. Here, you’ll see a page for the sales model with all the column descriptions and other information you defined in the schema.yml file.
Conclusion
The dbt docs command is a powerful tool for generating and serving documentation for your dbt projects. By using this command, you can ensure that your team and other stakeholders have access to up-to-date, accurate information about your data models.