Smithery Logo
MCPsSkillsDocsPricing
Login
NewFlame, an assistant that learns and improves. Available onTelegramSlack
    databricks-solutions

    databricks-jobs

    databricks-solutions/databricks-jobs
    Data & Analytics
    19

    About

    SKILL.md

    Install

    • Telegram
      Telegram
    • Slack
      Slack
    • Claude Code
      Claude Code
    • Codex
      Codex
    • OpenClaw
      OpenClaw
    • Cursor
      Cursor
    • Amp
      Amp
    • GitHub Copilot
      GitHub Copilot
    • Gemini CLI
      Gemini CLI
    • Kilo Code
      Kilo Code
    • Junie
      Junie
    • Replit
      Replit
    • Windsurf
      Windsurf
    • Cline
      Cline
    • Continue
      Continue
    • OpenCode
      OpenCode
    • OpenHands
      OpenHands
    • Roo Code
      Roo Code
    • Augment
      Augment
    • Goose
      Goose
    • Trae
      Trae
    • Zencoder
      Zencoder
    • Antigravity
      Antigravity
    • Download skill
    ├─
    ├─
    └─
    Smithery Logo

    Give agents more agency

    Resources

    DocumentationPrivacy PolicySystem Status

    Company

    PricingAboutBlog

    Connect

    © 2026 Smithery. All rights reserved.

    About

    Use this skill proactively for ANY Databricks Jobs task - creating, listing, running, updating, or deleting jobs...

    SKILL.md

    Databricks Lakeflow Jobs

    Overview

    Databricks Jobs orchestrate data workflows with multi-task DAGs, flexible triggers, and comprehensive monitoring. Jobs support diverse task types and can be managed via Python SDK, CLI, or Asset Bundles.

    Reference Files

    Use Case Reference File
    Configure task types (notebook, Python, SQL, dbt, etc.) task-types.md
    Set up triggers and schedules triggers-schedules.md
    Configure notifications and health monitoring notifications-monitoring.md
    Complete working examples examples.md

    Quick Start

    Python SDK

    from databricks.sdk import WorkspaceClient
    from databricks.sdk.service.jobs import Task, NotebookTask, Source
    
    w = WorkspaceClient()
    
    job = w.jobs.create(
        name="my-etl-job",
        tasks=[
            Task(
                task_key="extract",
                notebook_task=NotebookTask(
                    notebook_path="/Workspace/Users/user@example.com/extract",
                    source=Source.WORKSPACE
                )
            )
        ]
    )
    print(f"Created job: {job.job_id}")
    

    CLI

    databricks jobs create --json '{
      "name": "my-etl-job",
      "tasks": [{
        "task_key": "extract",
        "notebook_task": {
          "notebook_path": "/Workspace/Users/user@example.com/extract",
          "source": "WORKSPACE"
        }
      }]
    }'
    

    Asset Bundles (DABs)

    # resources/jobs.yml
    resources:
      jobs:
        my_etl_job:
          name: "[${bundle.target}] My ETL Job"
          tasks:
            - task_key: extract
              notebook_task:
                notebook_path: ../src/notebooks/extract.py
    

    Core Concepts

    Multi-Task Workflows

    Jobs support DAG-based task dependencies:

    tasks:
      - task_key: extract
        notebook_task:
          notebook_path: ../src/extract.py
    
      - task_key: transform
        depends_on:
          - task_key: extract
        notebook_task:
          notebook_path: ../src/transform.py
    
      - task_key: load
        depends_on:
          - task_key: transform
        run_if: ALL_SUCCESS  # Only run if all dependencies succeed
        notebook_task:
          notebook_path: ../src/load.py
    

    run_if conditions:

    • ALL_SUCCESS (default) - Run when all dependencies succeed
    • ALL_DONE - Run when all dependencies complete (success or failure)
    • AT_LEAST_ONE_SUCCESS - Run when at least one dependency succeeds
    • NONE_FAILED - Run when no dependencies failed
    • ALL_FAILED - Run when all dependencies failed
    • AT_LEAST_ONE_FAILED - Run when at least one dependency failed

    Task Types Summary

    Task Type Use Case Reference
    notebook_task Run notebooks task-types.md#notebook-task
    spark_python_task Run Python scripts task-types.md#spark-python-task
    python_wheel_task Run Python wheels task-types.md#python-wheel-task
    sql_task Run SQL queries/files task-types.md#sql-task
    dbt_task Run dbt projects task-types.md#dbt-task
    pipeline_task Trigger DLT/SDP pipelines task-types.md#pipeline-task
    spark_jar_task Run Spark JARs task-types.md#spark-jar-task
    run_job_task Trigger other jobs task-types.md#run-job-task
    for_each_task Loop over inputs task-types.md#for-each-task

    Trigger Types Summary

    Trigger Type Use Case Reference
    schedule Cron-based scheduling triggers-schedules.md#cron-schedule
    trigger.periodic Interval-based triggers-schedules.md#periodic-trigger
    trigger.file_arrival File arrival events triggers-schedules.md#file-arrival-trigger
    trigger.table_update Table change events triggers-schedules.md#table-update-trigger
    continuous Always-running jobs triggers-schedules.md#continuous-jobs

    Compute Configuration

    Job Clusters (Recommended)

    Define reusable cluster configurations:

    job_clusters:
      - job_cluster_key: shared_cluster
        new_cluster:
          spark_version: "15.4.x-scala2.12"
          node_type_id: "i3.xlarge"
          num_workers: 2
          spark_conf:
            spark.speculation: "true"
    
    tasks:
      - task_key: my_task
        job_cluster_key: shared_cluster
        notebook_task:
          notebook_path: ../src/notebook.py
    

    Autoscaling Clusters

    new_cluster:
      spark_version: "15.4.x-scala2.12"
      node_type_id: "i3.xlarge"
      autoscale:
        min_workers: 2
        max_workers: 8
    

    Existing Cluster

    tasks:
      - task_key: my_task
        existing_cluster_id: "0123-456789-abcdef12"
        notebook_task:
          notebook_path: ../src/notebook.py
    

    Serverless Compute

    For notebook and Python tasks, omit cluster configuration to use serverless:

    tasks:
      - task_key: serverless_task
        notebook_task:
          notebook_path: ../src/notebook.py
        # No cluster config = serverless
    

    Job Parameters

    Define Parameters

    parameters:
      - name: env
        default: "dev"
      - name: date
        default: "{{start_date}}"  # Dynamic value reference
    

    Access in Notebook

    # In notebook
    dbutils.widgets.get("env")
    dbutils.widgets.get("date")
    

    Pass to Tasks

    tasks:
      - task_key: my_task
        notebook_task:
          notebook_path: ../src/notebook.py
          base_parameters:
            env: "{{job.parameters.env}}"
            custom_param: "value"
    

    Common Operations

    Python SDK Operations

    from databricks.sdk import WorkspaceClient
    
    w = WorkspaceClient()
    
    # List jobs
    jobs = w.jobs.list()
    
    # Get job details
    job = w.jobs.get(job_id=12345)
    
    # Run job now
    run = w.jobs.run_now(job_id=12345)
    
    # Run with parameters
    run = w.jobs.run_now(
        job_id=12345,
        job_parameters={"env": "prod", "date": "2024-01-15"}
    )
    
    # Cancel run
    w.jobs.cancel_run(run_id=run.run_id)
    
    # Delete job
    w.jobs.delete(job_id=12345)
    

    CLI Operations

    # List jobs
    databricks jobs list
    
    # Get job details
    databricks jobs get 12345
    
    # Run job
    databricks jobs run-now 12345
    
    # Run with parameters
    databricks jobs run-now 12345 --job-params '{"env": "prod"}'
    
    # Cancel run
    databricks jobs cancel-run 67890
    
    # Delete job
    databricks jobs delete 12345
    

    Asset Bundle Operations

    # Validate configuration
    databricks bundle validate
    
    # Deploy job
    databricks bundle deploy
    
    # Run job
    databricks bundle run my_job_resource_key
    
    # Deploy to specific target
    databricks bundle deploy -t prod
    
    # Destroy resources
    databricks bundle destroy
    

    Permissions (DABs)

    resources:
      jobs:
        my_job:
          name: "My Job"
          permissions:
            - level: CAN_VIEW
              group_name: "data-analysts"
            - level: CAN_MANAGE_RUN
              group_name: "data-engineers"
            - level: CAN_MANAGE
              user_name: "admin@example.com"
    

    Permission levels:

    • CAN_VIEW - View job and run history
    • CAN_MANAGE_RUN - View, trigger, and cancel runs
    • CAN_MANAGE - Full control including edit and delete

    Common Issues

    Issue Solution
    Job cluster startup slow Use job clusters with job_cluster_key for reuse across tasks
    Task dependencies not working Verify task_key references match exactly in depends_on
    Schedule not triggering Check pause_status: UNPAUSED and valid timezone
    File arrival not detecting Ensure path has proper permissions and uses cloud storage URL
    Table update trigger missing events Verify Unity Catalog table and proper grants
    Parameter not accessible Use dbutils.widgets.get() in notebooks
    "admins" group error Cannot modify admins permissions on jobs
    Serverless task fails Ensure task type supports serverless (notebook, Python)

    Related Skills

    • databricks-bundles - Deploy jobs via Databricks Asset Bundles
    • databricks-spark-declarative-pipelines - Configure pipelines triggered by jobs

    Resources

    • Jobs API Reference
    • Jobs Documentation
    • DABs Job Task Types
    • Bundle Examples Repository
    Recommended Servers
    DataForB2B
    DataForB2B
    Laddro Career
    Laddro Career
    StudioMeyer-Crew
    StudioMeyer-Crew
    Repository
    databricks-solutions/ai-dev-kit
    Files