Serving Kedro pipelines over HTTP

Kedro includes a built-in HTTP server that lets external systems interact with a Kedro project over REST — triggering pipeline runs, inspecting project metadata, and more. It is backed by KedroServiceSession, which keeps the session alive across multiple requests.

The HTTP server requires optional dependencies. Install them with:

pip install 'kedro[server]'

Note

The HTTP server is intentionally minimal and meant to provide an interface that can be extended for custom use cases. It does not include authentication, authorisation, request queuing, async job execution, run history, or per-request session isolation. Do not expose it publicly without adding appropriate security controls.

Starting the server

From inside a Kedro project, run:

kedro server start

This starts the server at http://127.0.0.1:8000 by default.

Options

| Option | Short | Default | Description |
| --- | --- | --- | --- |
| --host | -H | 127.0.0.1 | Host to bind the server to |
| --port | -p | 8000 | Port to bind the server to |
| --reload | | False | Enable auto-reload on code changes. Intended for development; do not use in production. |
| --env | -e | | Kedro configuration environment |
| --conf-source | | | Path to a custom configuration directory |

Examples:

# Bind to localhost on port 8080
kedro server start --host 127.0.0.1 --port 8080

# Use the staging environment with auto-reload
kedro server start --env staging --reload

Endpoints

GET /health

Returns server status and the Kedro version in use.

curl http://127.0.0.1:8000/health
{
  "status": "healthy",
  "kedro_version": "<installed-kedro-version>"
}

kedro_version is the version of the Kedro package running the server, not the version declared in the project's pyproject.toml.
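
A client can use /health as a readiness probe before sending work to the server. The following is a minimal sketch using only the Python standard library; the function names parse_health and check_health are illustrative and not part of Kedro, and the base URL assumes the default host and port.

```python
import json
import urllib.request


def parse_health(body: bytes) -> str:
    """Return the reported Kedro version, or raise if the status is not 'healthy'."""
    payload = json.loads(body)
    if payload.get("status") != "healthy":
        raise RuntimeError(f"unexpected health status: {payload.get('status')!r}")
    return payload["kedro_version"]


def check_health(base_url: str = "http://127.0.0.1:8000", timeout: float = 5.0) -> str:
    """Call GET /health and return the version of Kedro running the server."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as response:
        return parse_health(response.read())
```

Calling check_health() against a running server returns the version string, or raises if the server is unreachable or reports another status.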

GET /snapshot

Returns a snapshot of the project structure: metadata, registered pipelines, catalog datasets, and parameter keys.

curl http://127.0.0.1:8000/snapshot
{
  "status": "success",
  "metadata": {
    "project_name": "My Project",
    "package_name": "my_project",
    "kedro_version": "1.0.0"
  },
  "pipelines": [
    {
      "name": "__default__",
      "nodes": [
        {
          "name": "split_data_node",
          "inputs": ["example_iris_data"],
          "outputs": ["X_train", "X_test"],
          "tags": [],
          "namespace": null
        }
      ],
      "inputs": ["example_iris_data"],
      "outputs": ["example_predictions"]
    }
  ],
  "datasets": {
    "example_iris_data": {
      "name": "example_iris_data",
      "type": "pandas.CSVDataset",
      "filepath": "data/01_raw/iris.csv"
    }
  },
  "parameters": ["example_learning_rate", "example_num_train_iter"]
}

If the snapshot cannot be built (for example, due to a catalog error), the response still returns HTTP 200 with "status": "failure". The error field contains the exception type and message, and the data fields (metadata, pipelines, datasets, parameters) are absent:

{
  "status": "failure",
  "error": {
    "type": "MissingConfigException",
    "message": "No config files found matching the pattern(s) 'catalog*'"
  }
}
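
Because a failed snapshot still arrives with HTTP 200, a client cannot rely on the status code alone and has to inspect the status field. The sketch below shows one way to do that with the standard library; SnapshotError, unwrap_snapshot, pipeline_node_names, and fetch_snapshot are hypothetical helper names, and the payload shape is taken from the examples above.

```python
import json
import urllib.request


class SnapshotError(RuntimeError):
    """Raised when /snapshot answers HTTP 200 but reports status 'failure'."""


def unwrap_snapshot(payload: dict) -> dict:
    """Return the payload unchanged on success; raise SnapshotError otherwise."""
    if payload.get("status") != "success":
        error = payload.get("error") or {}
        raise SnapshotError(
            f"{error.get('type', 'UnknownError')}: {error.get('message', '')}"
        )
    return payload


def pipeline_node_names(snapshot: dict) -> dict[str, list[str]]:
    """Map each registered pipeline name to the names of its nodes."""
    return {
        pipeline["name"]: [node["name"] for node in pipeline["nodes"]]
        for pipeline in snapshot["pipelines"]
    }


def fetch_snapshot(base_url: str = "http://127.0.0.1:8000", timeout: float = 10.0) -> dict:
    """Call GET /snapshot and raise if the server could not build the snapshot."""
    with urllib.request.urlopen(f"{base_url}/snapshot", timeout=timeout) as response:
        return unwrap_snapshot(json.load(response))
```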

Note

The /snapshot endpoint uses the environment and configuration source configured at server startup (--env / KEDRO_SERVER_ENV and --conf-source / KEDRO_SERVER_CONF_SOURCE). It does not accept per-request env or conf_source parameters.

See Inspect a Kedro project for the programmatic API and details on the snapshot structure.

POST /run

Triggers a pipeline run. All fields are optional; send an empty JSON object ({}) to run the default pipeline with default settings.

Run the default pipeline:

curl -X POST http://127.0.0.1:8000/run \
  -H "Content-Type: application/json" \
  -d '{}'

Run a specific pipeline with runtime parameters:

curl -X POST http://127.0.0.1:8000/run \
  -H "Content-Type: application/json" \
  -d '{"pipeline_names": ["training"], "params": {"n_splits": 5}}'

Key request fields:

| Field | Type | Description |
| --- | --- | --- |
| from_inputs | list[str] | Start the pipeline from these dataset names |
| to_outputs | list[str] | End the pipeline at these dataset names |
| from_nodes | list[str] | Start the pipeline from these node names |
| to_nodes | list[str] | End the pipeline at these node names |
| node_names | list[str] | Run specific nodes |
| runner | str | Runner class name or full dotted path; the class should be a subclass of kedro.runner.AbstractRunner (default: SequentialRunner) |
| is_async | bool | Load and save node inputs and outputs asynchronously with threads (default: false) |
| tags | list[str] | Run nodes with these tags |
| load_versions | dict[str, str] | Pin specific dataset versions for loading, as {"dataset_name": "version"} |
| pipeline_names | list[str] | Pipelines to run (default pipeline if omitted) |
| namespaces | list[str] | Run nodes in these namespaces |
| params | dict | Runtime parameters passed to the context |
| only_missing_outputs | bool | Skip nodes whose outputs already exist and are persisted |
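
The same requests can be built from Python. The sketch below constructs a POST /run request with the standard library and checks field names on the client before sending. RUN_FIELDS and build_run_request are illustrative names; the set only mirrors the key fields listed above, so it is an assumption that may be narrower than what the server actually accepts.

```python
import json
import urllib.request

# Field names copied from the "Key request fields" list; assumed, not exhaustive.
RUN_FIELDS = frozenset({
    "from_inputs", "to_outputs", "from_nodes", "to_nodes", "node_names",
    "runner", "is_async", "tags", "load_versions", "pipeline_names",
    "namespaces", "params", "only_missing_outputs",
})


def build_run_request(base_url: str, **fields) -> urllib.request.Request:
    """Build a POST /run request, rejecting field names not in RUN_FIELDS."""
    unknown = sorted(set(fields) - RUN_FIELDS)
    if unknown:
        raise ValueError(f"unknown run fields: {unknown}")
    return urllib.request.Request(
        f"{base_url}/run",
        data=json.dumps(fields).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen sends the request; calling build_run_request with no keyword arguments produces the empty JSON object that runs the default pipeline.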

On success the response contains run_id, status, and duration_ms:

{
  "status": "success",
  "run_id": "2024-01-01T00.00.00.000Z",
  "duration_ms": 142.3
}

On failure the response additionally contains an error object with the exception type and message:

{
  "status": "failure",
  "run_id": "2024-01-01T00.00.00.000Z",
  "duration_ms": 12.1,
  "error": {
    "type": "DatasetError",
    "message": "Failed to load dataset 'raw_data'"
  }
}
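
On the client side, both response shapes can be folded into one small type so that callers branch on a single flag. This is a sketch based only on the two example payloads above; RunResult and parse_run_response are hypothetical names, not part of Kedro.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RunResult:
    """Client-side view of a /run response."""

    status: str
    run_id: str
    duration_ms: float
    error_type: Optional[str] = None
    error_message: Optional[str] = None

    @property
    def ok(self) -> bool:
        return self.status == "success"


def parse_run_response(payload: dict) -> RunResult:
    """Convert the decoded JSON body of a /run response into a RunResult."""
    error = payload.get("error") or {}  # absent on success
    return RunResult(
        status=payload["status"],
        run_id=payload["run_id"],
        duration_ms=float(payload["duration_ms"]),
        error_type=error.get("type"),
        error_message=error.get("message"),
    )
```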

Note

The RunRequest model uses strict validation: unknown fields return an error rather than being ignored.

The first /run request creates a KedroServiceSession which the following requests reuse. The endpoint runs in a thread pool, so concurrent /run requests share the same session and pipeline runs are not isolated from each other.
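
If overlapping runs are a concern, one option is to serialize /run calls in the caller. The sketch below wraps any function that sends a run request in a lock. SerializedRunClient is a hypothetical helper, and it only serializes calls made through the same instance in one process; it does nothing about other clients hitting the same server.

```python
import threading
from typing import Callable


class SerializedRunClient:
    """Allow only one in-flight /run call at a time from this process."""

    def __init__(self, send: Callable[[dict], dict]):
        # `send` is whatever function actually posts the payload to /run.
        self._send = send
        self._lock = threading.Lock()

    def run(self, payload: dict) -> dict:
        # Block until any earlier run issued through this client has returned.
        with self._lock:
            return self._send(payload)
```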

Note

env and conf_source are not accepted per-request. Set them at server startup through the --env and --conf-source options instead.

Runner security

Short names (for example, SequentialRunner) always resolve against kedro.runner. Fully-qualified names (for example, mypackage.runners.MyRunner) must belong to kedro.runner, the project's own package, or a module listed in RUNNER_MODULES_WHITELIST in settings.py. If the module is not in one of these locations, it is never imported.

# settings.py
RUNNER_MODULES_WHITELIST = ["external_lib.runners"]
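
The rule can be illustrated with a small standalone function. This is a sketch of the check as described above, not Kedro's actual implementation: is_runner_allowed is a hypothetical name, and treating submodules of an allowed module as allowed is an assumption.

```python
def is_runner_allowed(runner: str, package_name: str, whitelist: list[str]) -> bool:
    """Decide whether a runner name may be resolved, without importing anything."""
    if "." not in runner:
        # Short names always resolve against kedro.runner.
        return True
    module_name = runner.rpartition(".")[0]
    allowed_roots = ["kedro.runner", package_name, *whitelist]
    # Assumption: a module matches if it equals an allowed root or sits beneath it.
    return any(
        module_name == root or module_name.startswith(root + ".")
        for root in allowed_roots
    )
```

With the settings shown above, external_lib.runners.FastRunner (a made-up class name) would pass the check, while the same name would be rejected if the whitelist were empty.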

Interactive API reference

When the server is running, FastAPI automatically generates interactive API documentation at http://127.0.0.1:8000/docs. This page lists all available endpoints, their request and response schemas, and lets you try them out directly in the browser.

Using create_http_server programmatically

You can create the FastAPI application directly and serve it. If project_path is not provided, it is resolved from the KEDRO_PROJECT_PATH environment variable. env and conf_source can be set in the create_http_server arguments or through the KEDRO_SERVER_ENV and KEDRO_SERVER_CONF_SOURCE environment variables.

import uvicorn

from kedro.server import create_http_server

# Build the FastAPI application for the project
app = create_http_server(
    project_path="/path/to/project",
    env="prod",
)

# Serve with uvicorn
uvicorn.run(app, host="127.0.0.1", port=8000)

Extending the server

create_http_server returns a standard FastAPI application, so you can mount additional routes or add middleware directly onto it.
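
Since the server ships without authentication, middleware is a natural place to add a basic check. The following is a minimal sketch of a pure ASGI middleware that rejects HTTP requests lacking a shared API key. ApiKeyMiddleware and the x-api-key header name are illustrative choices, not part of Kedro, and a shared key is only a starting point rather than a complete security model.

```python
import hmac


class ApiKeyMiddleware:
    """Pure ASGI middleware: answer 401 unless the request carries the expected key."""

    def __init__(self, app, api_key: str, header_name: str = "x-api-key"):
        self.app = app
        self.api_key = api_key.encode()
        # ASGI delivers header names as lowercase bytes.
        self.header_name = header_name.lower().encode()

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            # Pass lifespan and other non-HTTP events through untouched.
            await self.app(scope, receive, send)
            return
        headers = dict(scope.get("headers", []))
        supplied = headers.get(self.header_name, b"")
        # Constant-time comparison avoids leaking key contents through timing.
        if hmac.compare_digest(supplied, self.api_key):
            await self.app(scope, receive, send)
            return
        body = b'{"detail": "invalid or missing API key"}'
        await send({
            "type": "http.response.start",
            "status": 401,
            "headers": [
                (b"content-type", b"application/json"),
                (b"content-length", str(len(body)).encode()),
            ],
        })
        await send({"type": "http.response.body", "body": body})
```

It can be attached to the application returned by create_http_server with app.add_middleware(ApiKeyMiddleware, api_key=...), which is the standard FastAPI call for registering middleware. Loading the key from an environment variable or secret store rather than hard-coding it is advisable.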

Adding a custom endpoint

If you need to expose project-specific information — for example, the list of registered pipelines — add an extra route after creating the app:

from kedro.framework.project import pipelines
from kedro.server import create_http_server

app = create_http_server(project_path="/path/to/project")


@app.get("/pipelines")
def list_pipelines() -> dict:
    return {"pipelines": list(pipelines.keys())}

The new /pipelines endpoint sits alongside the built-in /health, /snapshot, and /run routes and benefits from the same session lifecycle.