dbt project structure using medallion architecture (bronze/silver/gold layers)...
Transform AI agents into experts on dbt project architecture and medallion layer patterns, providing guidance on structuring production-grade dbt projects with proper layer separation, naming conventions, and configuration strategies.
Activate this skill when users ask about:
Medallion architecture demonstrates how dbt best practices seamlessly integrate with a layered data approach:
stg_) - One-to-one source relationshipsint_) - Business logic transformationsdim_, fct_) - Business-ready data productsEvery recommendation follows both architectural principles and dbt best practices simultaneously.
Bronze (Staging):
stg_{source}__{table}ephemeralSilver (Intermediate):
int_{entity}__{description}ephemeral or tableGold (Marts):
dim_{entity} or fct_{process}table or incrementalAlways enforce these patterns:
ref('stg_*')), never source()
directlystg_, int_, dim_, fct_ prefixesOfficial dbt Documentation: How we structure our dbt projects
Purpose: One-to-one relationship with source tables. Light cleaning and standardization only.
Materialization: ephemeral (compiled as CTEs)
Naming: stg_{source}__{table}.sql
-- models/bronze/stg_tpc_h__customers.sql
{{ config(materialized='ephemeral') }}
select
-- Primary key (renamed)
c_custkey as customer_id,
-- Attributes (cast and renamed)
c_name as customer_name,
c_address as customer_address,
c_phone as phone_number,
c_acctbal as account_balance,
-- Metadata
current_timestamp() as dbt_loaded_at
from {{ source('tpc_h', 'customer') }}
✅ DO:
{{ source() }}❌ DON'T:
Purpose: Reusable business logic and complex transformations. Sits between staging and marts.
Materialization: ephemeral (reusable logic) or table (complex computations)
Naming: int_{entity}__{description}.sql
-- models/silver/int_customers__with_orders.sql
{{ config(materialized='ephemeral') }}
with customers as (
select * from {{ ref('stg_tpc_h__customers') }}
),
orders as (
select * from {{ ref('stg_tpc_h__orders') }}
),
customer_metrics as (
select
customer_id,
count(*) as total_orders,
sum(order_total) as lifetime_value,
min(order_date) as first_order_date
from orders
group by customer_id
)
select
c.customer_id,
c.customer_name,
coalesce(m.total_orders, 0) as total_orders,
coalesce(m.lifetime_value, 0) as lifetime_value,
m.first_order_date
from customers c
left join customer_metrics m on c.customer_id = m.customer_id
✅ DO:
❌ DON'T:
Purpose: Business-ready data products optimized for BI tools and end users.
Materialization: table (dimensions) or incremental (large facts)
Naming: dim_{entity} (dimensions), fct_{process} (facts)
-- models/gold/dim_customers.sql
{{ config(materialized='table') }}
with customers as (
select * from {{ ref('int_customers__with_orders') }}
)
select
-- Primary key
customer_id,
-- Attributes
customer_name,
customer_email,
-- Metrics
total_orders,
lifetime_value,
first_order_date,
-- Business classification
case
when lifetime_value >= 5000 then 'gold'
when lifetime_value >= 1000 then 'silver'
else 'bronze'
end as customer_tier,
-- Metadata
current_timestamp() as dbt_updated_at
from customers
-- models/gold/fct_orders.sql
{{ config(
materialized='incremental',
unique_key='order_id',
cluster_by=['order_date', 'customer_id']
) }}
select
order_id,
customer_id,
order_date,
order_status,
order_total,
current_timestamp() as dbt_updated_at
from {{ ref('stg_tpc_h__orders') }}
{% if is_incremental() %}
where order_date > (select max(order_date) from {{ this }})
{% endif %}
✅ DO:
❌ DON'T:
| Layer | Prefix | Example | Purpose |
|---|---|---|---|
| Bronze/Staging | stg_ |
stg_tpc_h__customers |
Clean source data |
| Silver/Intermediate | int_ |
int_customers__with_orders |
Business logic |
| Gold/Dimensions | dim_ |
dim_customers |
Business entities |
| Gold/Facts | fct_ |
fct_orders |
Business events |
Primary & Foreign Keys:
{entity}_id - customer_id, order_id, product_idBoolean Flags:
is_{condition} - is_active, is_deleted, is_first_orderhas_{attribute} - has_orders, has_discountDates & Timestamps:
{event}_date - order_date, created_date{event}_at - created_at, updated_at, deleted_atMetrics & Aggregates:
{metric}_count - order_count, customer_count{metric}_amount - total_amount, discount_amountRow Numbers & Sequences:
{entity}_row_number - order_row_number{entity}_seq_number - sequence_number✅ DO:
❌ DON'T:
models/
├── bronze/ # Staging layer - one-to-one with sources
│ ├── stg_tpc_h__customers.sql
│ ├── stg_tpc_h__orders.sql
│ └── stg_tpc_h__lineitem.sql
├── silver/ # Intermediate layer - business logic
│ ├── int_customers__with_orders.sql
│ ├── int_fx_rates__daily.sql
│ └── customer_segments.sql
└── gold/ # Marts layer - business-ready analytics
├── dim_customers.sql
├── dim_products.sql
├── fct_orders.sql
└── fct_order_lines.sql
Configure common settings at the folder level to minimize model-level overrides:
models:
your_project:
bronze:
+materialized: ephemeral
+tags: ["bronze", "staging"]
+schema: bronze
silver:
+materialized: ephemeral
+tags: ["silver"]
+schema: silver
gold:
+materialized: table
+tags: ["gold", "marts"]
+schema: gold
Model-Level Configuration: Override folder defaults only for unique requirements (incremental settings, clustering, etc.)
✅ LEVERAGE: dbt's additive tag inheritance
Tags accumulate hierarchically per the dbt documentation. Child folders inherit all parent tags automatically.
# ✅ GOOD: Avoid duplicate tags
bronze:
+tags: ["bronze", "staging"]
subfolder:
+tags: ["subfolder"] # Inherits: bronze, staging, subfolder
# ❌ BAD: Redundant parent tags
bronze:
+tags: ["bronze", "staging"]
subfolder:
+tags: ["bronze", "staging", "subfolder"] # Duplicates parent tags
Common Selection Patterns:
dbt run --select tag:bronze # All bronze models
dbt run --select tag:gold # All gold models
dbt run --select tag:staging # Alternative to bronze
When users ask for architectural guidance:
stg_, int_, dim_, fct_ patterns"How should I structure my project?"
"Where does this model belong?"
"What should I name this model?"
stg_{source}__{table}int_{entity}__{description}dim_{entity}fct_{process}Goal: Transform AI agents into expert dbt architects who guide users through project structure with confidence, clarity, and production-ready patterns.