Kedro plugins allow you to create new features for Kedro and inject additional commands into the CLI. Plugins are developed as separate Python packages that exist outside of any Kedro project.
Kedro’s extension mechanism is built on pluggy, a solid plugin management library that was created for the pytest ecosystem. pluggy relies on entry points, a Python mechanism for packages to provide components that can be discovered by other packages using importlib.metadata.
Example of a simple plugin¶
Here is a simple example of a plugin that prints the pipeline as JSON:

```python
import click

from kedro.framework.project import pipelines


@click.group(name="JSON")
def commands():
    pass


@commands.command()
@click.pass_obj
def to_json(metadata):
    """Display the pipeline in JSON format"""
    pipeline = pipelines["__default__"]
    print(pipeline.to_json())
```
From version 0.18.14, Kedro replaced setup.py with pyproject.toml. The plugin needs to provide entry points in either file. If you are using setup.py, please refer to the 0.18.13 version of the documentation.
To add the entry point to pyproject.toml, the plugin needs to provide the following configuration:

```toml
[project.entry-points."kedro.project_commands"]
kedrojson = "kedrojson.plugin:commands"
```
Once the plugin is installed, you can run it as follows:
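Given the to_json command defined in the plugin above, the invocation would presumably be:

```bash
kedro to_json
```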
Extend starter aliases¶
It is possible to extend the list of starter aliases built into Kedro. This means that a custom Kedro starter can be used directly through the starter argument in kedro new rather than needing to explicitly provide the template and directory arguments. A custom starter alias behaves in the same way as an official Kedro starter alias and is also picked up by kedro starter list.
You need to extend the starters by providing a list of KedroStarterSpec; in this example it is defined in a file called plugin.py.
Example for a non-git repository starter:
```python
# plugin.py
from kedro.framework.cli.starters import KedroStarterSpec

starters = [
    KedroStarterSpec(
        alias="test_plugin_starter",
        template_path="your_local_directory/starter_folder",
    )
]
```
Example for a git repository starter:
```python
# plugin.py
from kedro.framework.cli.starters import KedroStarterSpec

starters = [
    KedroStarterSpec(
        alias="test_plugin_starter",
        template_path="https://github.com/kedro-org/kedro-starters/",
        directory="pandas-iris",
    )
]
```
The directory argument is optional and should be used when you have multiple templates in one repository, as for the official kedro-starters. If you only have one template, your top-level directory will be treated as the template. For an example, see the pandas-iris starter.
In your pyproject.toml, you need to register the specifications to the kedro.starters entry point:

```toml
[project.entry-points."kedro.starters"]
starter = "plugin:starters"
```
After that you can use this starter with kedro new --starter=test_plugin_starter.
If your starter lives on a git repository, by default Kedro attempts to use a tag or branch labelled with your version of Kedro, e.g. 0.18.12. This means that you can host different versions of your starter template on the same repository, and the correct one will automatically be used. If you do not wish to follow this structure, you should override it with the --checkout flag, e.g. kedro new --starter=test_plugin_starter --checkout=main.
Commands must be provided as click Groups. The click Group will be merged into the main CLI Group. In the process, the options on the group are lost, as is any processing that was done as part of its callback function.
When they run, plugins may request information about the current project by creating a session and loading its context:
```python
from pathlib import Path

from kedro.framework.session import KedroSession

project_path = Path.cwd()
session = KedroSession.create(project_path=project_path)
context = session.load_context()
```
If the plugin initialisation needs to occur prior to Kedro starting, it can declare the entry point kedro.init. This entry point must refer to a function that currently has no arguments, but for future proofing you should declare it with **kwargs.
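A minimal sketch of such an init function (the function name and the printed message are illustrative; the **kwargs signature is the future-proof form described above):

```python
# plugin.py -- hypothetical target of a "kedro.init" entry point
def init_plugin(**kwargs):
    """Runs once before Kedro starts; currently called with no arguments,
    but **kwargs keeps the signature future-proof."""
    print("plugin initialised")
```

It would then be registered under [project.entry-points."kedro.init"] in the plugin's pyproject.toml.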
Plugins may also add commands to the Kedro CLI, which supports two types of commands:

global - available both inside and outside a Kedro project. Global commands use the kedro.global_commands entry point.

project - available only when a Kedro project is detected in the current directory. Project commands use the kedro.project_commands entry point.
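As a sketch, a plugin exposing both kinds of commands would declare two entry points in its pyproject.toml (the kedrojson names and the global_commands/project_commands attributes are illustrative; each must point at a click group defined in the plugin module):

```toml
[project.entry-points."kedro.global_commands"]
kedrojson = "kedrojson.plugin:global_commands"

[project.entry-points."kedro.project_commands"]
kedrojson = "kedrojson.plugin:project_commands"
```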
Suggested command convention¶
We use the following command convention: kedro <plugin-name> <command>, with kedro <plugin-name> acting as a top-level command group. This is our suggested way of structuring your plugin but it is not necessary for your plugin to work.
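A minimal sketch of that convention, assuming a hypothetical plugin named myplugin (the group and command names are illustrative):

```python
import click


@click.group(name="myplugin")
def commands():
    """Top-level group, surfaced as `kedro myplugin`."""


@commands.command()
def hello():
    """Runs as `kedro myplugin hello`."""
    click.echo("Hello from myplugin")
```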
You can develop hook implementations and have them automatically registered to the project context when the plugin is installed.
To enable this for your custom plugin, simply add the following entry in your pyproject.toml:

```toml
[project.entry-points."kedro.hooks"]
plugin_name = "plugin_name.plugin:hooks"
```
plugin.py is the module where you declare hook implementations:
```python
# plugin.py
import logging

from kedro.framework.hooks import hook_impl


class MyHooks:
    @hook_impl
    def after_catalog_created(self, catalog):  # pylint: disable=unused-argument
        logging.info("Reached after_catalog_created hook")


hooks = MyHooks()
```
hooks should be an instance of the class defining the Hooks.
You can also develop Hook implementations to extend Kedro’s CLI behaviour in your plugin. To find available CLI Hooks, please visit kedro.framework.cli.hooks. To register CLI Hooks developed in your plugin with Kedro, add the following entry in your project’s pyproject.toml:

```toml
[project.entry-points."kedro.cli_hooks"]
plugin_name = "plugin_name.plugin:cli_hooks"
```
plugin.py is the module where you declare Hook implementations:
```python
# plugin.py
import logging

from kedro.framework.cli.hooks import cli_hook_impl


class MyCLIHooks:
    @cli_hook_impl
    def before_command_run(self, project_metadata, command_args):
        logging.info(
            "Command %s will be run for project %s", command_args, project_metadata
        )


cli_hooks = MyCLIHooks()
```
When you are ready to submit your code:

- Create a separate repository using our naming convention for plugins (kedro-<plugin-name>)
- Choose a command approach: global and / or project commands:
  - All global commands should be provided as a single click group
  - All project commands should be provided as another click group
  - The click groups are declared through the entry points mechanism
- Include a README.md describing your plugin’s functionality and all dependencies that should be included
- Use GitHub tagging to tag your plugin as a kedro-plugin so that we can find it
Supported Kedro plugins¶
Kedro-Datasets, a collection of all of Kedro’s data connectors. These data connectors are implementations of the AbstractDataset interface.
Kedro-Docker, a tool for packaging and shipping Kedro projects within containers
Kedro-Airflow, a tool for converting your Kedro project into an Airflow project
Kedro-Viz, a tool for visualising your Kedro pipelines
There are many community-developed plugins available and a comprehensive list of plugins is published on the awesome-kedro GitHub repository. The list below is a small snapshot of some of those under active maintenance.
Your plugin needs to have an Apache 2.0 compatible license to be considered for this list.
kedro-mlflow, by Yolan Honoré-Rougé and Takieddine Kadiri, facilitates MLflow integration within a Kedro project. Its main features are modular configuration, automatic parameters tracking, datasets versioning, Kedro pipelines packaging and serving and automatic synchronisation between training and inference pipelines for high reproducibility of machine learning experiments and ease of deployment. A tutorial is provided in the kedro-mlflow-tutorial repo. You can find more information in the kedro-mlflow documentation.