
Kedro's command line interface

Kedro's command line interface (CLI) is used to give commands to Kedro via a terminal shell (such as the terminal app on macOS, or cmd.exe or PowerShell on Windows). You need to use the CLI to set up a new Kedro project, and to run it.

Autocompletion (optional)

If you are using macOS or Linux, you can set up your shell to autocomplete kedro commands. If you don't know the type of shell you are using, first type the following:

echo $0


If you use Bash, add the following to your ~/.bashrc (or just run it on the command line):

eval "$(_KEDRO_COMPLETE=bash_source kedro)"


If you use Zsh, add the following to ~/.zshrc:

eval "$(_KEDRO_COMPLETE=zsh_source kedro)"


If you use Fish, add the following to ~/.config/fish/completions/kedro.fish:

eval (env _KEDRO_COMPLETE=fish_source kedro)

Invoke Kedro CLI from Python (optional)

You can invoke the Kedro CLI as a Python module:

python -m kedro
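The module form is handy when a script has to launch Kedro with the same interpreter it is running under. The following is a minimal sketch, assuming Kedro is installed in the active environment; `kedro_module_command` and `run_kedro` are hypothetical helper names, not part of Kedro.

```python
import subprocess
import sys


def kedro_module_command(*args: str) -> list:
    """Build the argument list for `python -m kedro <args>`.

    sys.executable is used so the command runs under the current interpreter.
    """
    return [sys.executable, "-m", "kedro", *args]


def run_kedro(*args: str) -> int:
    """Run the Kedro CLI as a module and return its exit code."""
    completed = subprocess.run(kedro_module_command(*args), check=False)
    return completed.returncode


# Example: run_kedro("--help") is equivalent to typing `python -m kedro --help`.
```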

Kedro commands

Kedro CLI commands are either global or project-specific. Project-specific commands are called from within a project directory and apply to that particular project. Global commands can be run anywhere and don't apply to any particular project.

Global Kedro commands

Kedro

Usage:

kedro [OPTIONS] COMMAND [ARGS]...

Options:

Name Type Description Default
--help, -h boolean Show this message and exit. False

Kedro new

Create a new kedro project.

Usage:

kedro new [OPTIONS]

Options:

Name Type Description Default
--verbose, -v boolean See extensive logging and error stack traces. False
--config, -c path Non-interactive mode, using a configuration yaml file. This file must supply the keys required by the template's prompts.yml. When not using a starter, these are project_name, repo_name and python_package. None
--starter, -s text Specify the starter template to use when creating the project. This can be the path to a local directory, a URL to a remote VCS repository supported by cookiecutter or one of the aliases listed in kedro starter list. None
--checkout text An optional tag, branch or commit to checkout in the starter repository. None
--directory text An optional directory inside the repository where the starter resides. None
--name, -n text The name of your new Kedro project. None
--tools, -t text Select which tools you'd like to include. By default, none are included. None
--example, -e text Enter y to enable, n to disable the example pipeline. None
--telemetry, -tc choice (yes | no | y | n) Allow or not allow Kedro to collect usage analytics. We cannot see nor store information contained into a Kedro project. Opt in with "yes" and out with "no". None
--help, -h boolean Show this message and exit. False

Tools that can be selected with --tools:

1) Linting: Provides a basic linting setup with Ruff

2) Testing: Provides basic testing setup with pytest

3) Custom Logging: Provides more logging options

4) Documentation: Basic documentation setup with Sphinx

5) Data Structure: Provides a directory structure for storing data

6) PySpark: Provides set up configuration for working with PySpark

Example usage:

kedro new --tools=lint,test,log,docs,data,pyspark (or any subset of these options)

kedro new --tools=all

kedro new --tools=none

For more information on using tools, see https://docs.kedro.org/en/stable/starters/new_project_tools.html

Kedro starter

Commands for working with project starters.

Usage:

kedro starter [OPTIONS] COMMAND [ARGS]...

Options:

Name Type Description Default
--help, -h boolean Show this message and exit. False
Kedro starter list

List all official project starters available.

Usage:

kedro starter list [OPTIONS]

Options:

Name Type Description Default
--help, -h boolean Show this message and exit. False

Customise or override project-specific Kedro commands

Note

All project related CLI commands should be run from the project’s root directory.

Kedro's command line interface (CLI) allows you to associate a set of commands and dependencies with a target, which you can then execute from inside the project directory.

The commands a project supports are specified on the framework side. If you want to customise any of the Kedro commands you can do this either by adding a file called cli.py or by injecting commands into it via the plugin framework. Find the template for the cli.py file below.

Project Kedro commands

Kedro

Usage:

kedro [OPTIONS] COMMAND [ARGS]...

Options:

Name Type Description Default
--help, -h boolean Show this message and exit. False

Kedro catalog

Commands for working with catalog.

Usage:

kedro catalog [OPTIONS] COMMAND [ARGS]...

Options:

Name Type Description Default
--help, -h boolean Show this message and exit. False
Kedro catalog list-datasets

Show datasets grouped by type for the specified pipelines.

This method lists datasets used in the specified pipelines, categorizing them into three groups:

  • datasets: Datasets explicitly defined in the catalog.
  • factories: Datasets resolved from dataset factory patterns.
  • defaults: Datasets that do not match any pattern or explicit definition.

Usage:

kedro catalog list-datasets [OPTIONS]

Options:

Name Type Description Default
--env, -e text Kedro configuration environment name. Defaults to local. None
--pipeline, -p text Name of the modular pipeline to run. If not set, the project pipeline is run by default. ``
--help, -h boolean Show this message and exit. False
Kedro catalog list-patterns

List all dataset factory patterns in the catalog, ranked by priority.

This method retrieves all dataset factory patterns defined in the catalog, ordered by the priority in which they are matched.

Usage:

kedro catalog list-patterns [OPTIONS]

Options:

Name Type Description Default
--env, -e text Kedro configuration environment name. Defaults to local. None
--help, -h boolean Show this message and exit. False
Kedro catalog resolve-patterns

Resolve dataset factory patterns against pipeline datasets.

This method resolves dataset factory patterns for datasets used in the specified pipelines. It includes datasets explicitly defined in the catalog as well as those resolved from dataset factory patterns.

Usage:

kedro catalog resolve-patterns [OPTIONS]

Options:

Name Type Description Default
--env, -e text Kedro configuration environment name. Defaults to local. None
--pipeline, -p text Name of the modular pipeline to run. If not set, the project pipeline is run by default. ``
--help, -h boolean Show this message and exit. False

Kedro ipython

Open IPython with project specific variables loaded.

Usage:

kedro ipython [OPTIONS] [ARGS]...

Options:

Name Type Description Default
--verbose, -v boolean See extensive logging and error stack traces. False
--env, -e text Kedro configuration environment name. Defaults to local. None

Kedro jupyter

Open Jupyter Notebook / Lab with project specific variables loaded.

Usage:

kedro jupyter [OPTIONS] COMMAND [ARGS]...

Options:

Name Type Description Default
--help, -h boolean Show this message and exit. False
Kedro jupyter lab

Open Jupyter Lab with project specific variables loaded.

Usage:

kedro jupyter lab [OPTIONS] [ARGS]...

Options:

Name Type Description Default
--verbose, -v boolean See extensive logging and error stack traces. False
--env, -e text Kedro configuration environment name. Defaults to local. None
Kedro jupyter notebook

Open Jupyter Notebook with project specific variables loaded.

Usage:

kedro jupyter notebook [OPTIONS] [ARGS]...

Options:

Name Type Description Default
--verbose, -v boolean See extensive logging and error stack traces. False
--env, -e text Kedro configuration environment name. Defaults to local. None
Kedro jupyter setup

Initialise the Jupyter Kernel for a kedro project.

Usage:

kedro jupyter setup [OPTIONS] [ARGS]...

Options:

Name Type Description Default
--verbose, -v boolean See extensive logging and error stack traces. False

Kedro package

Package the project as a Python wheel.

Usage:

kedro package [OPTIONS]

Options:

Name Type Description Default
--help, -h boolean Show this message and exit. False

Kedro pipeline

Commands for working with pipelines.

Usage:

kedro pipeline [OPTIONS] COMMAND [ARGS]...

Options:

Name Type Description Default
--help, -h boolean Show this message and exit. False
Kedro pipeline create

Create a new modular pipeline by providing a name.

Usage:

kedro pipeline create [OPTIONS] NAME

Options:

Name Type Description Default
--verbose, -v boolean See extensive logging and error stack traces. False
--skip-config boolean Skip creation of config files for the new pipeline(s). False
-t, --template directory Path to cookiecutter template to use for pipeline(s). Will override any local templates. None
--env, -e text Environment to create pipeline configuration in. Defaults to base. None
--help, -h boolean Show this message and exit. False
Kedro pipeline delete

Delete a modular pipeline by providing a name.

Usage:

kedro pipeline delete [OPTIONS] NAME

Options:

Name Type Description Default
--verbose, -v boolean See extensive logging and error stack traces. False
--env, -e text Environment to delete pipeline configuration from. Defaults to 'base'. None
-y, --yes boolean Confirm deletion of pipeline non-interactively. False
--help, -h boolean Show this message and exit. False

Kedro registry

Commands for working with registered pipelines.

Usage:

kedro registry [OPTIONS] COMMAND [ARGS]...

Options:

Name Type Description Default
--help, -h boolean Show this message and exit. False
Kedro registry describe

Describe a registered pipeline by providing a pipeline name. Defaults to the __default__ pipeline.

Usage:

kedro registry describe [OPTIONS] [NAME]

Options:

Name Type Description Default
--verbose, -v boolean See extensive logging and error stack traces. False
--help, -h boolean Show this message and exit. False
Kedro registry list

List all pipelines defined in your pipeline_registry.py file.

Usage:

kedro registry list [OPTIONS]

Options:

Name Type Description Default
--help, -h boolean Show this message and exit. False

Kedro run

Run the pipeline.

Usage:

kedro run [OPTIONS]

Options:

Name Type Description Default
--from-inputs text A list of dataset names which should be used as a starting point. ``
--to-outputs text A list of dataset names which should be used as an end point. ``
--from-nodes text A list of node names which should be used as a starting point. ``
--to-nodes text A list of node names which should be used as an end point. ``
--nodes, -n text Run only nodes with specified names. ``
--runner, -r text Specify a runner that you want to run the pipeline with. Available runners: 'SequentialRunner', 'ParallelRunner' and 'ThreadRunner'. None
--async boolean Load and save node inputs and outputs asynchronously with threads. If not specified, load and save datasets synchronously. False
--env, -e text Kedro configuration environment name. Defaults to local. None
--tags, -t text Construct the pipeline using only nodes which have this tag attached. Option can be used multiple times, what results in a pipeline constructed from nodes having any of those tags. ``
--load-versions, -lv text Specify a particular dataset version (timestamp) for loading. ``
--pipeline, -p text Name of the registered pipeline to run. If not set, the 'default' pipeline is run. None
--namespaces, -ns text Run only node namespaces with specified names. ``
--config, -c file Specify a YAML configuration file to load the run command arguments from. If command line arguments are provided, they will override the loaded ones. None
--conf-source text Path of a directory where project configuration is stored. None
--params text Specify extra parameters that you want to pass to the context initialiser. Items must be separated by comma, keys - by colon or equals sign, example: param1=value1,param2=value2. Each parameter is split by the first comma, so parameter values are allowed to contain colons, parameter keys are not. To pass a nested dictionary as parameter, separate keys by '.', example: param_group.param1:value1. ``
--help, -h boolean Show this message and exit. False

Project setup

Install all package dependencies

The following runs pip to install all package dependencies specified in requirements.txt:

pip install -r requirements.txt

For further information, see the documentation on installing project-specific dependencies.

Run the project

Call the run() method of the KedroSession defined in kedro.framework.session.

kedro run
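The same run can be started from Python code, which may help when Kedro is embedded in another application. The following is a minimal sketch, assuming Kedro is installed and `project_path` points at the project root; `run_project` is a hypothetical helper name, while `KedroSession` and `bootstrap_project` come from Kedro itself.

```python
from pathlib import Path
from typing import Optional


def run_project(project_path: Path, env: Optional[str] = None):
    """Programmatic counterpart of `kedro run`, executed from Python."""
    # Imported lazily so that this module can be imported without Kedro installed.
    from kedro.framework.session import KedroSession
    from kedro.framework.startup import bootstrap_project

    # Load the project's settings and pipeline registry.
    bootstrap_project(project_path)
    with KedroSession.create(project_path=project_path, env=env) as session:
        # With no arguments, the session runs the __default__ pipeline.
        return session.run()
```

Calling `run_project(Path.cwd())` from the project root is intended to behave like a plain `kedro run`.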

KedroContext can be extended in run.py (src/<package_name>/run.py). In order to use the extended KedroContext, you need to set context_path in the pyproject.toml configuration file.

Modifying a kedro run

Kedro has options to modify pipeline runs. Below is a list of CLI arguments supported out of the box. Note that the names inside angle brackets (<>) are placeholders, and you should replace these values with the names of relevant nodes, datasets, envs, etc. in your project.

CLI command Description
kedro run --from-inputs=<dataset_name1>,<dataset_name2> A list of dataset names which should be used as a starting point
kedro run --to-outputs=<dataset_name1>,<dataset_name2> A list of dataset names which should be used as an end point
kedro run --from-nodes=<node_name1>,<node_name2> A list of node names which should be used as a starting point
kedro run --to-nodes=<node_name1>,<node_name2> A list of node names which should be used as an end point
kedro run --nodes=<node_name1>,<node_name2> Run only nodes with specified names.
kedro run --runner=<runner_name> Run the pipeline with a specific runner
kedro run --async Load and save node inputs and outputs asynchronously with threads
kedro run --env=<env_name> Run the pipeline in the env_name environment. Defaults to local if not provided
kedro run --tags=<tag_name1>,<tag_name2> Run only nodes which have any of these tags attached.
kedro run --load-versions=<dataset_name>:YYYY-MM-DDThh.mm.ss.sssZ Specify particular dataset versions (timestamp) for loading.
kedro run --pipeline=<pipeline_name> Run the whole pipeline by its name
kedro run --namespaces=<namespace> Run only nodes with the specified namespace
kedro run --config=<config_file_name>.yml Specify all command line options in a named YAML configuration file
kedro run --conf-source=<path_to_config_directory> Specify a new source directory for configuration files
kedro run --conf-source=<path_to_compressed_file> Only possible when using the OmegaConfigLoader. Specify a compressed config file in zip or tar format.
kedro run --params=<param_key1>=<value1>,<param_key2>=<value2> Does a parametrised run with {"param_key1": "value1", "param_key2": 2}. These will take precedence over parameters defined in the conf directory. Additionally, dot (.) syntax can be used to address nested keys like parent.child:value
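The --load-versions row above expects a timestamp laid out as YYYY-MM-DDThh.mm.ss.sssZ. As a small sketch, the helpers below render a Python datetime in that layout and build the option string for one dataset. The names `format_load_version` and `load_version_option` are hypothetical, and the sketch assumes the timestamp is in UTC with millisecond precision, which is how the trailing Z and the .sss field read.

```python
from datetime import datetime, timezone


def format_load_version(moment: datetime) -> str:
    """Render a timezone-aware datetime as YYYY-MM-DDThh.mm.ss.sssZ in UTC."""
    utc = moment.astimezone(timezone.utc)
    # %f yields six microsecond digits; drop the last three to keep milliseconds.
    return utc.strftime("%Y-%m-%dT%H.%M.%S.%f")[:-3] + "Z"


def load_version_option(dataset_name: str, moment: datetime) -> str:
    """Build a `--load-versions=<dataset_name>:<timestamp>` option string."""
    return f"--load-versions={dataset_name}:{format_load_version(moment)}"
```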

You can also combine these options together, so the following command runs all the nodes from split to predict and report:

kedro run --from-nodes=split --to-nodes=predict,report
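When run commands are assembled by a script, for example in a scheduler, the options can be combined programmatically. The following is an illustrative sketch only; `kedro_run_command` is a hypothetical helper that maps keyword arguments onto the option names listed in the table above.

```python
def kedro_run_command(**options) -> list:
    """Turn keyword options into a `kedro run` argument list.

    Underscores in keyword names become hyphens (from_nodes -> --from-nodes),
    list or tuple values are joined with commas, and True yields a bare flag.
    """
    command = ["kedro", "run"]
    for name, value in options.items():
        flag = "--" + name.replace("_", "-")
        if value is None or value is False:
            continue  # option not requested
        if value is True:
            command.append(flag)
        elif isinstance(value, (list, tuple)):
            command.append(f"{flag}={','.join(value)}")
        else:
            command.append(f"{flag}={value}")
    return command
```

For instance, `kedro_run_command(from_nodes=["split"], to_nodes=["predict", "report"])` produces the argument list for the command shown above, ready to pass to `subprocess.run`.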

This functionality is extended to the kedro run --config=config.yml command, which allows you to specify run commands in a configuration file.

A parameterised run is best used for dynamic parameters, i.e. running the same pipeline with different inputs. For static parameters that do not change, we recommend following the Kedro project setup methodology.
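For dynamic parameters that already live in a Python dictionary, the --params value can be generated rather than typed by hand. The sketch below flattens a nested dictionary into the comma-separated key=value form, using dot syntax for nested keys. `format_params` is a hypothetical helper, and it assumes that no key or value contains a comma.

```python
def format_params(params: dict, prefix: str = "") -> str:
    """Flatten a (possibly nested) dict into comma-separated key=value pairs."""
    items = []
    for key, value in params.items():
        full_key = f"{prefix}{key}"
        if isinstance(value, dict):
            # Nested keys are addressed with dot syntax: parent.child=value
            nested = format_params(value, prefix=f"{full_key}.")
            if nested:
                items.append(nested)
        else:
            items.append(f"{full_key}={value}")
    return ",".join(items)
```

The result can be appended to a run command as `f"--params={format_params(my_params)}"`, where `my_params` is your own dictionary.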

Deploy the project

The following packages your application as one .whl file within the dist/ folder of your project. It packages the project configuration separately in a tar.gz file:

kedro package

See the Python documentation for further information about packaging.

Project quality

Project development

Modular pipelines

Create a new modular pipeline in your project

kedro pipeline create <pipeline_name>

Delete a modular pipeline

The following command deletes all the files related to a modular pipeline in your Kedro project.

kedro pipeline delete <pipeline_name>

Registered pipelines

Describe a registered pipeline

kedro registry describe <pipeline_name>

The output includes all the nodes in the pipeline. If no pipeline name is provided, this command returns all nodes in the __default__ pipeline.

List all registered pipelines in your project

kedro registry list

Data Catalog

List all datasets used in the specified pipelines

This command lists all datasets used in the specified pipeline(s), grouped by how they are defined.

  • datasets: Explicitly defined in catalog.yml
  • factories: Resolved using dataset factory patterns
  • defaults: Handled by user catch-all or default runtime patterns

kedro catalog list-datasets

The command also accepts an optional --pipeline argument that allows you to specify the pipeline name(s) (comma-separated values) in order to filter datasets used only by those named pipeline(s). For example:

kedro catalog list-datasets --pipeline=ds,de

Note

If no pipelines are specified, the __default__ pipeline is used.

Resolve dataset factories in the catalog

This command resolves datasets used in the pipeline against all dataset patterns, returning their full catalog configuration. It includes datasets explicitly defined in the catalog as well as those resolved from dataset factory patterns.

kedro catalog resolve-patterns

The command also accepts an optional --pipeline argument that allows you to specify the pipeline name(s) (comma-separated values).

kedro catalog resolve-patterns --pipeline=ds,de

Note

If no pipelines are specified, the __default__ pipeline is used.

List all dataset factory patterns defined in the catalog ordered by priority

kedro catalog list-patterns

The output lists any dataset factory patterns in the catalog, ranked by the priority in which they are matched.
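This page does not spell out the ranking rule itself. To make the idea of a priority order concrete, the sketch below assumes one plausible rule: patterns with more literal characters outside their {placeholders} come first, then patterns with more placeholders, then alphabetical order. Both the rule and the names `pattern_specificity` and `rank_patterns` are assumptions for illustration; check the dataset factory documentation for the behaviour of your Kedro version.

```python
import re


def pattern_specificity(pattern: str) -> int:
    """Count the characters that remain once {placeholder} segments are removed."""
    return len(re.sub(r"\{.*?\}", "", pattern))


def rank_patterns(patterns: list) -> list:
    """Order patterns from most to least specific under the assumed rule."""
    return sorted(
        patterns,
        key=lambda p: (-pattern_specificity(p), -p.count("{"), p),
    )
```

Under this assumed rule, a catch-all pattern such as `{default}` has no literal characters and therefore sorts last.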

Notebooks

To start a Jupyter Notebook:

kedro jupyter notebook

To start JupyterLab:

kedro jupyter lab

To start an IPython shell:

kedro ipython

The Kedro IPython extension makes the following variables available in your IPython or Jupyter session:

  • catalog
  • context
  • pipelines
  • session

To reload these variables (e.g. if you updated catalog.yml) use the %reload_kedro line magic, which can also be used to see the error message if any of the variables above are undefined.