Medallion Architecture with dbt and Snowflake: A Complete Implementation Guide
Medallion architecture has become the default pattern for modern data lakehouses — and for good reason. It gives your data platform a clear, maintainable structure: raw data flows in at the Bronze layer, gets cleaned and standardised at Silver, and emerges as business-ready aggregates at Gold. When you combine this pattern with dbt and Snowflake, you get one of the most powerful and cost-effective data stacks available today.
This guide walks through a complete implementation: folder structure, naming conventions, dbt model configuration, and the Snowflake objects you need at each layer.
Why Medallion Architecture Works
Before the medallion pattern, most data warehouses had ad-hoc layering — staging tables, reporting tables, and a tangle of views with no consistent logic. The medallion pattern enforces discipline:
- Bronze — raw, unmodified ingestion of source data. No transformations, no joins.
- Silver — cleaned, typed, and deduplicated data. Schema is enforced here.
- Gold — business aggregates and dimensional models. Directly queried by BI tools.
Each layer is idempotent and auditable. If something breaks at Gold, you can always reprocess from Bronze without re-ingesting from source.
Setting Up the dbt Project Structure
Your dbt project folder structure should mirror the medallion layers:
-
models/
bronze/
raw_orders.sql
raw_customers.sql
silver/
stg_orders.sql
stg_customers.sql
gold/
dim_customers.sql
fct_orders.sql
agg_daily_revenue.sql
In your dbt_project.yml, configure each layer to materialise differently in Snowflake:
models:
your_project:
bronze:
+materialized: table
+schema: bronze
silver:
+materialized: table
+schema: silver
gold:
+materialized: table
+schema: gold
The Bronze Layer: Raw Ingestion
Bronze models simply select from your source system tables. If you are using Fivetran, Airbyte, or Snowpipe for ingestion, your source data lands in a raw schema. Bronze models reference these sources directly with minimal transformation:
-
-- models/bronze/raw_orders.sql
{{ config(materialized='table') }}SELECT *
FROM {{ source('raw', 'orders') }}
Define your sources in a sources.yml file so dbt can track freshness and lineage automatically.
The Silver Layer: Cleaning and Standardisation
Silver models apply business logic to clean and standardise the Bronze data:
-- models/silver/stg_orders.sql
{{ config(materialized='table') }}SELECT
order_id::VARCHAR AS order_id,
customer_id::VARCHAR AS customer_id,
order_date::DATE AS order_date,
UPPER(TRIM(status)) AS status,
total_amount::DECIMAL(18,2) AS total_amount,
_loaded_at AS ingested_at
FROM {{ ref('raw_orders') }}
WHERE order_id IS NOT NULL
Add dbt schema tests at the Silver layer to enforce data contracts:
models:
- name: stg_orders
columns:
- name: order_id
tests:
- not_null
- unique
- name: status
tests:
- accepted_values:
values: ['PENDING', 'CONFIRMED', 'SHIPPED', 'CANCELLED']
The Gold Layer: Business Aggregates
Gold models join Silver tables to produce the metrics your business actually uses:
-
-- models/gold/agg_daily_revenue.sql
{{ config(materialized='table') }}SELECT
o.order_date,
c.customer_segment,
COUNT(DISTINCT o.order_id) AS orders,
SUM(o.total_amount) AS revenue,
AVG(o.total_amount) AS avg_order_value
FROM {{ ref('stg_orders') }} o
LEFT JOIN {{ ref('stg_customers') }} c
ON o.customer_id = c.customer_id
WHERE o.status = 'CONFIRMED'
GROUP BY 1, 2
Snowflake-Specific Optimisations
-
- Clustering keys on Gold tables — add clustering on your most common filter columns (date, region) to reduce query costs
- Transient tables for Bronze — Bronze data is reprocessable from source, so use transient=true to save on Snowflake storage (no Fail-safe overhead)
- Dynamic tables at Silver — Snowflake Dynamic Tables can automate Silver refreshes without orchestration overhead
- Zero-copy cloning for development — clone your Gold schema for development environments without duplicating storage costs
Running the Full Stack
Once your models are built, your CI/CD pipeline runs:
-
dbt run --select bronze # refresh raw layer
dbt test --select bronze # validate sources
dbt run --select silver # clean and standardise
dbt test --select silver # enforce schema contracts
dbt run --select gold # build business aggregates
dbt docs generate # update lineage documentation
Need Help Building This?
The medallion architecture pattern is straightforward in theory but requires careful design decisions in practice — particularly around partitioning, incremental processing, and cost management in Snowflake. DataKrypton has implemented this stack for multiple organisations and can help you design, build, and optimise your dbt and Snowflake environment from day one.
Reach out to our team to discuss your data platform needs. We offer hands-on implementation support and architecture reviews.