Package an entire Kedro project¶
This section explains how to build your project documentation and how to bundle your entire project into a Python package.
Kedro also has an advanced feature that supports packaging at the pipeline level, allowing you to share and reuse pipelines across projects. To read more about this, see the section on micro-packaging.
Add documentation to your project¶
Kedro uses the Sphinx framework and creates a docs directory containing a basic template for project-specific documentation. We recommend that you add your project-specific documentation as Markdown in docs/source.
If you want to customise your documentation beyond the basic template, refer to the Sphinx documentation for details of how to extend docs/source/conf.py.
Once you have added any documentation you need, run the following from the project root directory:
kedro build-docs --open
The HTML documentation is built to docs/build/html and opens automatically in a browser tab.
Note
The build-docs command creates documentation based on the code structure of your project. Documentation includes any docstrings defined in your code.
Package your project¶
To package your project, run the following in your project root directory:
kedro package
Kedro builds the package into the dist folder of your project, and creates one .egg file and one .whl file, which are Python packaging formats for binary distribution.
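To confirm what the packaging step produced, a short script can list the artifacts. This is a minimal sketch, not part of Kedro: the helper name find_artifacts is hypothetical, and it assumes only what is stated above, namely that the .egg and .whl files sit directly inside the dist folder.

```python
from pathlib import Path


def find_artifacts(dist_dir):
    """Return the .egg and .whl files found directly inside dist_dir, sorted by file name."""
    dist = Path(dist_dir)
    return sorted(
        (
            path
            for path in dist.iterdir()
            if path.is_file() and path.suffix in {".egg", ".whl"}
        ),
        key=lambda path: path.name,
    )
```

Calling find_artifacts("dist") from the project root after kedro package should return one path per packaging format. The function raises FileNotFoundError if the dist folder does not exist yet.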
The resulting package contains only the Python source code of your Kedro pipeline, not the conf, data and logs subfolders. This means that you can distribute the project to run elsewhere, such as on a separate computer with different configuration information, dataset and logging locations.
We recommend that you document the configuration required (parameters and catalog) in the local README.md file for any project recipients.
Package recipients¶
Recipients of the .egg and .whl files need to have Python and pip on their machines, but do not need to have Kedro installed.
A recipient can install the project by calling:
pip install <path-to-wheel-file>
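If the installation needs to happen from a script rather than a shell, the same pip command can be issued from Python. The sketch below is an illustration only: the helper names pip_install_command and install_wheel are hypothetical, and the wheel path is whatever file the recipient was given.

```python
import subprocess
import sys


def pip_install_command(wheel_path):
    """Build the pip command that installs wheel_path for the current interpreter."""
    # Invoking pip as `python -m pip` targets the interpreter running this script.
    return [sys.executable, "-m", "pip", "install", str(wheel_path)]


def install_wheel(wheel_path):
    """Run pip and raise CalledProcessError if the installation fails."""
    subprocess.run(pip_install_command(wheel_path), check=True)
```

Building the command separately from running it makes it easy to log or inspect the exact call before anything is installed.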
An executable, kedro-tutorial, is placed in the bin subfolder of the Python install folder, so the project can be run as follows:
python -m kedro_tutorial
Note
The recipient will need to add a conf subfolder. They also need to add data and logs if the pipeline loads/saves local data or uses logging.
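The folders a recipient needs can be created in one step. The following is a minimal sketch under stated assumptions: create_run_folders is a hypothetical helper, and it only creates empty conf, data and logs folders. The configuration files themselves (parameters and catalog) still have to be supplied by the recipient.

```python
from pathlib import Path


def create_run_folders(run_dir, with_data=True, with_logs=True):
    """Ensure conf, and optionally data and logs, exist under run_dir; return their paths."""
    names = ["conf"]
    if with_data:
        names.append("data")
    if with_logs:
        names.append("logs")

    folders = []
    for name in names:
        folder = Path(run_dir) / name
        # exist_ok=True leaves an existing folder and its contents untouched.
        folder.mkdir(parents=True, exist_ok=True)
        folders.append(folder)
    return folders
```

For a pipeline that neither reads local data nor writes logs, create_run_folders(".", with_data=False, with_logs=False) creates only conf.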
Once your project is installed, to run your pipelines from any Python code, simply import it:
from kedro_tutorial.__main__ import main
main(
["--pipeline", "__default__"]
) # or simply main() if you don't want to provide any arguments
This is equivalent to running kedro run, and you can provide all the parameters described by kedro run --help.
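When the arguments vary between calls, it can help to build the list programmatically instead of writing it by hand. The sketch below assumes the kedro_tutorial package is already installed; build_run_args and run_packaged_project are hypothetical helper names, and only the --pipeline and --env options of kedro run are covered.

```python
def build_run_args(pipeline=None, env=None):
    """Return a list of `kedro run`-style arguments to pass to the packaged main()."""
    args = []
    if pipeline is not None:
        args += ["--pipeline", pipeline]
    if env is not None:
        args += ["--env", env]
    return args


def run_packaged_project(pipeline=None, env=None):
    """Import the installed package lazily and run it with the built arguments."""
    # The import is deferred so this module can be loaded even where
    # kedro_tutorial has not been installed yet.
    from kedro_tutorial.__main__ import main

    main(build_run_args(pipeline=pipeline, env=env))
```

For example, run_packaged_project(pipeline="__default__") passes the same arguments as the call shown earlier.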
Docker, Airflow and other deployment targets¶
There are various methods to deploy packaged pipelines via Kedro plugins:
Kedro-Docker plugin for packaging and shipping Kedro projects within Docker containers.
Kedro-Airflow to convert your Kedro project into an Airflow project.
The Deployment guide touches on other deployment targets such as AWS Batch and Prefect, and there is a range of third-party plugins for deployment.