Dbt

dbt (Data Build Tool) #

Context #

  • dbt is a SQL-first transformation workflow that lets teams quickly and collaboratively deploy analytics code following software engineering best practices like modularity, portability, CI/CD, and documentation.
  • dbt uses Jinja as the templating engine to generate SQL code for data transformations such as {{ ref() }} function:
select *
from {{ref('model_a')}}

dbt Setup in Cloud #

  • dbt Cloud is a hosted service for managing your dbt projects and deployments launched in 2019. Click here to know more details.
  • dbt Reference
  • Login credentials: @hotmail.com /

dbt Setup on Local #

brew update
brew tap dbt-labs/dbt
brew install dbt-postgres
conda create -e dbt
pip install dbt-postgres
  • Create a dbt project:
dbt init dbt_sandbox
dbt build
dbt run
  • Different between dbt build & run: dbt run = execute your models. dbt build = execute your models and test them.

dbt - Getting started Guides #

Jinja Examples #

{% set payment_methods = ["bank_transfer", "credit_card", "gift_card"] %}

select
order_id,
{% for payment_method in payment_methods %}
sum(case when payment_method = '{{payment_method}}' then amount end) as {{payment_method}}_amount,
{% endfor %}
sum(amount) as total_amount
from {{ ref('raw_payments') }}
group by 1
  • Writing modular macros:
{% macro get_column_values(column_name, relation) %}

{% set relation_query %}
select distinct
{{ column_name }}
from {{ relation }}
order by 1
{% endset %}

{% set results = run_query(relation_query) %}

{% if execute %}
{# Return the first column #}
{% set results_list = results.columns[0].values() %}
{% else %}
{% set results_list = [] %}
{% endif %}

{{ return(results_list) }}

{% endmacro %}


{% macro get_payment_methods() %}

{{ return(get_column_values('payment_method', ref('raw_payments'))) }}

{% endmacro %}