Create a custom Dagster Component with demo mode support, realistic asset structure, and optional custom scaffolder using the dg CLI...
This skill automates the creation and validation of a new custom Dagster component using the dg CLI, with uv as the package manager. It includes a demo mode so that realistic demonstrations can run locally without external dependencies. Guidance on writing good components is at https://docs.dagster.io/guides/build/components/creating-new-components/creating-and-registering-a-component. For a complex real-world example of a component, see https://github.com/dagster-io/dagster/blob/master/python_modules/libraries/dagster-dbt/dagster_dbt/components/dbt_project/component.py.
When invoked, this skill will:
- Scaffold the component with dg scaffold component ComponentName
- Implement the build_defs() function with both real and demo mode implementations
- Add a demo_mode boolean flag in the component YAML for toggling between real and local demo implementations
- Create a component instance with the dg scaffold defs my_module.components.ComponentName my_component command
- Run dg check defs and dg list defs to ensure that the expected component instances are all loaded
Before running this skill, ensure:
- uv is installed (check with uv --version)
Ask the user for:
- A component name, e.g. MyDagsterComponent. Validate that:
Use dg to create the component
uv run dg scaffold component <ComponentName>
This will:
- Create the component under defs/components
- Generate a component_name.py file
Fill in the build_defs() function in the component file. The component should:
- Accept a demo_mode parameter in the component params (default: False)
- Define any resources in the defs/ folder in a resources.py file, using dg scaffold defs dagster.resource resources.py
- Expose an assets field in the YAML that describes what assets are used in the underlying component. See https://dagster.io/blog/dsls-to-the-rescue for best practices on designing a good DSL. Refer to https://github.com/dagster-io/dagster/blob/master/python_modules/libraries/dagster-dbt/dagster_dbt/components/dbt_project/component.py and https://github.com/dagster-io/dagster/blob/master/python_modules/libraries/dagster-fivetran/dagster_fivetran/components/workspace_component/component.py for two reference architectures for good component design with multi-assets.
- Use the kinds argument to indicate technologies in use
Example asset structure:
CRITICAL: When creating a custom component, consider what will consume your component's assets. The asset keys you generate should align with downstream component expectations to avoid requiring per-asset configuration.
Your component (upstream) should generate asset keys in a structure that downstream components naturally reference. This eliminates the need for meta.dagster.asset_key or complex translation configuration.
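If dbt will consume your assets:
- Use a two-level key: ["<source_name>", "<table_name>"]
- Example: ["fivetran_raw", "customers"] or ["api_raw", "users"]
- dbt then references the asset as source('fivetran_raw', 'customers')
If custom Dagster assets will consume them:
- Use keys that are easy to reference in deps
- Prefer a simple two-level structure (e.g., ["category", "name"])
- Avoid deep nesting such as ["system", "subsystem", "type", "name"] unless necessary
If another integration component will consume them: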
If your assets are intermediate and consumed by your own component:
- Use a consistent progression of prefixes: ["raw", "table"] → ["processed", "table"] → ["enriched", "table"]
import dagster as dg


class APIIngestionComponent(dg.Component, dg.Model, dg.Resolvable):
    """Ingests data from REST APIs."""

    api_endpoint: str
    tables: list[str]
    demo_mode: bool = False

    def _build_asset(self, table: str) -> dg.AssetsDefinition:
        # Built in a helper so each asset captures its own table name;
        # a closure defined directly in the loop would only see the last one.
        # Design key for dbt consumption: ["api_raw", "table_name"]
        # NOT: ["api", "ingestion", "raw", "table_name"]
        @dg.asset(
            key=dg.AssetKey(["api_raw", table]),  # ← Flattened for easy downstream reference
            kinds={"api", "python"},
        )
        def ingest_table(context: dg.AssetExecutionContext):
            if self.demo_mode:
                context.log.info(f"Demo mode: Mocking API call for {table}")
                return {"status": "demo", "rows": 100}
            else:
                # Real API call
                pass

        return ingest_table

    def build_defs(self, context: dg.ComponentLoadContext) -> dg.Definitions:
        return dg.Definitions(assets=[self._build_asset(table) for table in self.tables])
Result: dbt can reference these assets naturally:
# sources.yml
sources:
  - name: api_raw
    tables:
      - name: customers  # Matches ["api_raw", "customers"]
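To make the key-to-source correspondence concrete, here is a minimal sketch of how a two-level asset key lines up with the dbt source() call shown earlier. The helper name dbt_source_call is hypothetical and is not part of Dagster or dbt; it only illustrates the naming convention.

```python
def dbt_source_call(asset_key: list[str]) -> str:
    """Return the dbt source() reference matching a two-level asset key.

    Hypothetical helper for illustration only; not a Dagster or dbt API.
    """
    if len(asset_key) != 2:
        raise ValueError(f"expected [source_name, table_name], got {asset_key!r}")
    source_name, table_name = asset_key
    return f"source('{source_name}', '{table_name}')"


print(dbt_source_call(["api_raw", "customers"]))
```

A key with more than two parts raises an error here, which mirrors the point above: deeper keys no longer match a dbt source by name alone and would need extra configuration.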
Always verify asset keys align with downstream dependencies:
# Check asset keys and their dependencies
uv run dg list defs --json | uv run python -c "
import sys, json
assets = json.load(sys.stdin)['assets']
print('\\n'.join([f\"{a['key']}: deps={a.get('deps', [])}\" for a in assets]))
"
What to verify:
- Each downstream asset lists the expected upstream keys in its deps array
- Keys follow a simple, consistent structure (e.g., ["category", "name"])
❌ Too deeply nested: ["company", "team", "project", "environment", "table"]
❌ Inconsistent structure: Some assets with 2 levels, others with 4
❌ Generic names: ["data", "table1"], ["output", "result"]
✅ Good patterns:
- ["source_system", "entity"]: ["fivetran_raw", "customers"]
- ["integration", "object"]: ["salesforce", "accounts"]
- ["stage", "table"]: ["staging", "orders"]
IMPORTANT: Asset keys should be exactly the same whether demo_mode is True or False. Only the asset implementation (the function body) should differ between modes.
Why this matters:
Example - CORRECT approach:
def build_defs(self, context: dg.ComponentLoadContext) -> dg.Definitions:
    @dg.asset(
        key=dg.AssetKey(["fivetran_raw", "customers"]),  # ← Same key in both modes
        kinds={"fivetran"},
    )
    def customers_sync(context: dg.AssetExecutionContext):
        if self.demo_mode:
            # Demo implementation - mock data
            context.log.info("Demo mode: Creating empty table")
            # ... create mock table
        else:
            # Production implementation - real Fivetran sync
            context.log.info("Production: Syncing from Fivetran")
            # ... call Fivetran API

    return dg.Definitions(assets=[customers_sync])
Example - INCORRECT approach:
def build_defs(self, context: dg.ComponentLoadContext) -> dg.Definitions:
    if self.demo_mode:
        @dg.asset(
            key=dg.AssetKey(["demo", "customers"]),  # ❌ Different key!
        )
        def demo_customers():
            pass

        return dg.Definitions(assets=[demo_customers])
    else:
        @dg.asset(
            key=dg.AssetKey(["fivetran_raw", "customers"]),  # ❌ Different key!
        )
        def prod_customers():
            pass

        return dg.Definitions(assets=[prod_customers])
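One way to catch the incorrect pattern is to compare the asset keys produced in each mode. The sketch below assumes you have already collected the keys as lists of strings for both modes (for example by loading the definitions once with demo_mode set each way); key_parity is a hypothetical helper, not a Dagster function.

```python
def key_parity(demo_keys: list[list[str]], prod_keys: list[list[str]]) -> dict:
    """Report asset keys that exist in only one of the two modes."""
    demo = {tuple(key) for key in demo_keys}
    prod = {tuple(key) for key in prod_keys}
    return {
        "only_in_demo": sorted(demo - prod),
        "only_in_prod": sorted(prod - demo),
    }


# Keys from the INCORRECT example above: the two modes disagree.
report = key_parity([["demo", "customers"]], [["fivetran_raw", "customers"]])
print(report)
```

Both lists in the report should be empty for a component that follows the correct approach.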
Reference Documentation:
uv run dg docs integrations --json
When creating assets in your component, ALWAYS add the kinds parameter to properly categorize assets by their technology/integration type. This helps with:
Common integration kinds:
- kinds={"fivetran"} for Fivetran assets
- kinds={"dbt"} for dbt assets
- kinds={"census"} for Census assets
- kinds={"sling"} for Sling assets
- kinds={"powerbi"} for PowerBI assets
- kinds={"looker"} for Looker assets
- kinds={"airbyte"} for Airbyte assets
- kinds={"python"} for custom Python processing
- kinds={"snowflake"} for Snowflake assets
You can verify kinds are showing correctly by running:
uv run dg list defs
The "Kinds" column should show the integration type for each asset.
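If you prefer a scripted check over reading the table, a small filter can flag assets without kinds. This is only a sketch: it assumes each asset entry is a dict with a key string and a kinds list, which is an assumption about the shape of the parsed listing rather than a documented format. The sample data is made up.

```python
def assets_missing_kinds(assets: list[dict]) -> list[str]:
    """Return the keys of assets whose kinds entry is absent or empty."""
    return [a.get("key", "unknown") for a in assets if not a.get("kinds")]


# Made-up sample entries; the real structure may differ.
sample = [
    {"key": "fivetran_raw/customers", "kinds": ["fivetran"]},
    {"key": "staging/orders", "kinds": []},
    {"key": "reports/summary"},
]
print(assets_missing_kinds(sample))
```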
Example Component Structure:
import dagster as dg


class MyComponent(dg.Component, dg.Model, dg.Resolvable):
    demo_mode: bool = False
    # ... other params

    def build_defs(self, context: dg.ComponentLoadContext) -> dg.Definitions:
        demo_mode = self.demo_mode

        @dg.asset(
            kinds={"fivetran"},  # ← REQUIRED: Add the integration kind
        )
        def raw_data(context: dg.AssetExecutionContext):
            if demo_mode:
                # Demo implementation - local/mocked data
                context.log.info("Running in demo mode with local data")
            else:
                # Real implementation - connect to actual systems
                context.log.info("Running with real data source")

        @dg.asset(
            deps=[raw_data],
            kinds={"dbt"},  # ← REQUIRED: Add the integration kind
        )
        def processed_data(context: dg.AssetExecutionContext):
            if demo_mode:
                context.log.info("Processing demo data")
            else:
                context.log.info("Processing real data")

        # ... more assets
        return dg.Definitions(assets=[raw_data, processed_data])
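For the demo branch, deterministic mock data keeps demonstrations repeatable without any external system. The following is a minimal standard-library sketch; mock_rows, its column names and its value ranges are all hypothetical and would be replaced by whatever shape the real source produces.

```python
import random


def mock_rows(table: str, n: int = 5, seed: int = 0) -> list[dict]:
    """Generate n repeatable fake rows for a table (illustrative columns only)."""
    # Seeding with the table name gives each table its own stable data.
    rng = random.Random(f"{table}:{seed}")
    return [
        {"id": i, "table": table, "amount": round(rng.uniform(1, 100), 2)}
        for i in range(1, n + 1)
    ]


rows = mock_rows("customers")
print(len(rows), rows[0]["id"], rows[-1]["id"])
```

Because the generator is seeded, two runs of the demo produce the same rows, which makes downstream demo assets easier to reason about.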
Use dg scaffold defs to create the component instance:
uv run dg scaffold defs my_module.components.ComponentName my_component
This creates a YAML file that should include the demo_mode parameter:
type: my_module.components.ComponentName
attributes:
  demo_mode: true  # Set to true for local demos, false for real deployments
  # ... other params
If the user requested a custom scaffolder in Step 1, follow the directions here: https://docs.dagster.io/guides/build/components/creating-new-components/component-customization#customizing-scaffolding-behavior
Customize the scaffolder to provide a better developer experience for creating instances of this component.
Run these commands to ensure everything works:
# Check that definitions load without errors
uv run dg check defs
# List all assets to verify they were created
uv run dg list defs
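Beyond reading the listing by eye, dependencies that point at no known asset can be flagged with a short script. This sketch assumes the JSON output has a top-level assets list whose entries carry key and deps fields, the same shape the other snippets in this document rely on; treat that shape as an assumption. The function name unresolved_deps and the sample payload are made up.

```python
import json


def unresolved_deps(defs_json: str) -> dict[str, list[str]]:
    """Map each asset key to the deps that match no asset in the listing."""
    assets = json.loads(defs_json).get("assets", [])
    known = {a.get("key") for a in assets}
    missing = {}
    for a in assets:
        dangling = [d for d in a.get("deps", []) if d not in known]
        if dangling:
            missing[a.get("key", "unknown")] = dangling
    return missing


# Made-up payload: the second asset depends on a key nobody produces.
sample_json = json.dumps({"assets": [
    {"key": "api_raw/customers", "deps": []},
    {"key": "staging/customers", "deps": ["api/raw/customers"]},
]})
print(unresolved_deps(sample_json))
```

A flagged dependency is not always an error, since it may refer to an asset defined outside the project, but it is a quick signal that an upstream key and a downstream reference may have drifted apart.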
Verify that:
- The demo_mode flag toggles between implementations correctly
- The demo_mode: false implementation uses realistic resources and is a production implementation
CRITICAL: Verify Asset Key Alignment
Check that asset dependencies are correct by running:
uv run dg list defs --json | uv run python -c "
import sys, json
data = json.load(sys.stdin)
assets = data.get('assets', [])
print('Asset Dependencies:\n')
for asset in assets:
    key = asset.get('key', 'unknown')
    deps = asset.get('deps', [])
    if deps:
        print(f'{key}')
        for dep in deps:
            print(f'  ← {dep}')
    else:
        print(f'{key} (no dependencies)')
    print()
"
What to verify:
- Each downstream asset lists the expected upstream keys in its deps array
- Keys follow a simple, consistent structure (e.g., ["category", "name"])
Key Principle: Asset keys should be identical between demo mode and production mode. Only the asset implementation (the function body) should differ. This ensures:
If demo mode was implemented:
- Set demo_mode: true in the component YAML
- Run dg check defs to verify it works locally
The component is complete when:
- build_defs() is implemented with proper asset logic
- Assets carry kinds metadata
- dg check defs passes without errors
- dg list defs shows all expected assets
After completion, inform the user: