Get started with Kedro-Viz

Kedro-Viz is a key part of Kedro. It displays data and nodes, and the connections between them, to visualise the structure of the pipelines in a Kedro project.

This section assumes you are familiar with the basic Kedro concepts described in the spaceflights tutorial. If you have not yet worked through the tutorial, you can still follow this example.

Generate a copy of the spaceflights tutorial project with all the code in place by using the Kedro starter for the spaceflights tutorial:

kedro new --starter=spaceflights

When prompted for a project name, you can enter any name, but we will assume Kedro Tutorial throughout.

When your project is ready, navigate to the root directory of the project and install the dependencies for the project, which include Kedro-Viz:

pip install -r src/requirements.txt

Visualise the spaceflights project

To run Kedro-Viz, type the following into your terminal from the project directory:

kedro viz

The command starts a local server and automatically opens a browser tab that shows the visualisation at http://127.0.0.1:4141/.
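If the browser tab does not open on its own, one way to confirm that the server is up is to request the address from Python. The following is a minimal sketch that uses only the standard library. The helper name `is_viz_reachable` is hypothetical and not part of Kedro-Viz, and the check only tells you that some HTTP server answers at the address, not that it is Kedro-Viz.

```python
import urllib.error
import urllib.request


def is_viz_reachable(url: str = "http://127.0.0.1:4141/", timeout: float = 2.0) -> bool:
    """Return True if an HTTP server answers at `url` (hypothetical helper)."""
    # Bypass any proxy configured in the environment, since the server is local.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except urllib.error.HTTPError:
        # A server answered, although with an error status code.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timed out, or otherwise unreachable.
        return False
```

Calling `is_viz_reachable()` while `kedro viz` is running in another terminal should return `True`, assuming the default address shown above.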

You should see a visualisation of the spaceflights project pipeline in the browser.

If the visualisation panel opens but no pipeline is visible, and you did not generate the project from the starter template, check that your tutorial project code is complete.

Need help?

If you still can’t see the visualisation, the Kedro community can help!

Exit an open visualisation

To exit the visualisation, close the browser tab. To regain control of the terminal, press Ctrl+C, which interrupts the running process on macOS, Windows, and Linux.

Automatic visualisation updates

You can use the --autoreload flag to autoreload Kedro-Viz when a Python or YAML file changes in the project. Add the flag to the command you use to start Kedro-Viz:

kedro viz --autoreload

With the flag set, the visualisation reflects changes to the project as they happen. For example, commenting out create_model_input_table_node in pipeline.py triggers a re-render of the pipeline.
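To make the idea of reacting to file changes concrete, here is a minimal sketch of polling-based change detection for Python and YAML files. It illustrates the general technique only: the text above does not describe how Kedro-Viz detects changes, and the names `snapshot`, `changed_files` and `WATCHED_SUFFIXES` are hypothetical.

```python
from pathlib import Path
from typing import Dict, Set

# File types assumed to matter, mirroring "a Python or YAML file" above.
WATCHED_SUFFIXES = {".py", ".yml", ".yaml"}


def snapshot(root: Path) -> Dict[Path, int]:
    """Record the last-modified time (in ns) of every watched file under `root`."""
    return {
        path: path.stat().st_mtime_ns
        for path in root.rglob("*")
        if path.is_file() and path.suffix in WATCHED_SUFFIXES
    }


def changed_files(before: Dict[Path, int], after: Dict[Path, int]) -> Set[Path]:
    """Return the files added, removed, or modified between two snapshots."""
    return {
        path
        for path in before.keys() | after.keys()
        if before.get(path) != after.get(path)
    }
```

A watcher built on this would take a snapshot at intervals and trigger a reload whenever `changed_files` returns a non-empty set.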

Visualise layers

By convention, the datasets in a pipeline can be grouped into layers according to how far the data has been processed. Agreeing on a common set of layers makes it easier to collaborate.

For example, the data engineering convention labels datasets according to the stage of the pipeline (e.g. whether the data has been cleaned).

You can add a layer attribute to the datasets in the Data Catalog, which is reflected in the Kedro-Viz visualisation.

Open conf/base/catalog.yml in the completed spaceflights tutorial project and replace its contents with the following:

companies:
  type: pandas.CSVDataSet
  filepath: data/01_raw/companies.csv
  layer: raw

reviews:
  type: pandas.CSVDataSet
  filepath: data/01_raw/reviews.csv
  layer: raw

shuttles:
  type: pandas.ExcelDataSet
  filepath: data/01_raw/shuttles.xlsx
  layer: raw

preprocessed_companies:
  type: pandas.ParquetDataSet
  filepath: data/02_intermediate/preprocessed_companies.pq
  layer: intermediate

preprocessed_shuttles:
  type: pandas.ParquetDataSet
  filepath: data/02_intermediate/preprocessed_shuttles.pq
  layer: intermediate

model_input_table:
  type: pandas.ParquetDataSet
  filepath: data/03_primary/model_input_table.pq
  layer: primary

regressor:
  type: pickle.PickleDataSet
  filepath: data/06_models/regressor.pickle
  versioned: true
  layer: models

The visualisation now includes the layers.
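To check which datasets have been assigned to which layer before opening the visualisation, you can group the catalog entries by their layer attribute. This is a minimal sketch that assumes the catalog has already been parsed into a dictionary (for example with PyYAML's `yaml.safe_load`). The function name `group_by_layer` and the "unassigned" label are hypothetical and not part of Kedro.

```python
from collections import defaultdict
from typing import Any, Dict, List


def group_by_layer(catalog: Dict[str, Dict[str, Any]]) -> Dict[str, List[str]]:
    """Map each layer name to the dataset names that declare it."""
    layers: Dict[str, List[str]] = defaultdict(list)
    for name, config in catalog.items():
        # Entries without a `layer` key are collected under "unassigned".
        layers[config.get("layer", "unassigned")].append(name)
    return dict(layers)


# A subset of the catalog entries above, as they would look once parsed.
catalog = {
    "companies": {"type": "pandas.CSVDataSet", "layer": "raw"},
    "shuttles": {"type": "pandas.ExcelDataSet", "layer": "raw"},
    "preprocessed_companies": {"type": "pandas.ParquetDataSet", "layer": "intermediate"},
    "model_input_table": {"type": "pandas.ParquetDataSet", "layer": "primary"},
    "regressor": {"type": "pickle.PickleDataSet", "layer": "models"},
}

for layer, names in group_by_layer(catalog).items():
    print(f"{layer}: {', '.join(names)}")
```

An entry that ends up under "unassigned" is one whose layer attribute is missing from the catalog.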

Share a pipeline visualisation

You can share a Kedro-Viz visualisation as a JSON file from the terminal:

kedro viz --save-file my_shareable_pipeline.json

This command will save a visualisation of the __default__ pipeline as a JSON file called my_shareable_pipeline.json.

To visualise a saved JSON file, load it from the terminal:

kedro viz --load-file my_shareable_pipeline.json
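Before sharing a saved file, it can be useful to take a quick look at what it contains. The sketch below assumes that the file is a single JSON document with an object at the top level, as the description above suggests. It makes no assumption about which keys Kedro-Viz writes, and the function name `summarise_json` is hypothetical.

```python
import json
from pathlib import Path
from typing import Dict


def summarise_json(path: str) -> Dict[str, str]:
    """Describe each top-level key of a JSON object file by type and size."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    summary: Dict[str, str] = {}
    for key, value in data.items():
        # Report the number of items for containers, and the type alone otherwise.
        size = f", {len(value)} items" if isinstance(value, (list, dict)) else ""
        summary[key] = f"{type(value).__name__}{size}"
    return summary
```

For example, `summarise_json("my_shareable_pipeline.json")` returns one entry per top-level key in the file saved earlier.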