
Spaceflights tutorial frequently asked questions

Note

If you can't find the answer you need here, ask the Kedro community for help!

How to resolve these common errors

Dataset errors

DatasetError: Failed while loading data from dataset

You're testing whether Kedro can load the raw test data and see the following:

DatasetError: Failed while loading data from dataset
CSVDataset(filepath=...).
[Errno 2] No such file or directory: '.../companies.csv'

or a similar error for the shuttles or reviews data.

Are the three sample data files stored in the data/01_raw folder of your project?
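One quick way to answer that question is to check for the files from the project root. The following is a minimal sketch, not part of Kedro: the helper name `missing_raw_files` is hypothetical, and the file names and the `data/01_raw` location are assumptions based on the spaceflights starter's default layout.

```python
from pathlib import Path

# Assumed file names from the spaceflights starter; adjust if yours differ.
EXPECTED_RAW_FILES = ("companies.csv", "reviews.csv", "shuttles.xlsx")


def missing_raw_files(raw_dir, expected=EXPECTED_RAW_FILES):
    """Return the expected file names that are not present in raw_dir."""
    raw_path = Path(raw_dir)
    return [name for name in expected if not (raw_path / name).is_file()]


if __name__ == "__main__":
    # Run from the project root; "data/01_raw" is the assumed raw-data folder.
    missing = missing_raw_files("data/01_raw")
    print("Missing files:", missing or "none")
```

An empty result means all three files are where the catalog expects them, so the cause is more likely a wrong filepath in catalog.yml.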

DatasetNotFoundError: Dataset not found in the catalog

You see an error such as the following:

DatasetNotFoundError: Dataset 'companies' not found in the catalog

Has something changed in your catalog.yml from the version generated by the spaceflights starter? Take a look at the data specification to ensure it is valid.

Call exit() within the IPython session and restart kedro ipython (or type %reload_kedro into the IPython console to reload Kedro into the session without restarting). Then try again.
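If the error persists after reloading, it can help to list the entry names that catalog.yml actually defines and compare them with the name in the error message. The sketch below is a rough line scan, not a YAML parser and not a Kedro API: the helper `top_level_keys` is hypothetical, and `conf/base/catalog.yml` is assumed to be the catalog location.

```python
import re
from pathlib import Path

# Matches an unindented "name:" line, i.e. a top-level YAML key.
TOP_LEVEL_KEY = re.compile(r"^([^\s#][^:#]*):(\s|$)")


def top_level_keys(yaml_text):
    """Return top-level keys found in yaml_text (a rough scan, not a YAML parser)."""
    keys = []
    for line in yaml_text.splitlines():
        match = TOP_LEVEL_KEY.match(line)
        if match:
            keys.append(match.group(1).strip().strip("\"'"))
    return keys


if __name__ == "__main__":
    catalog_file = Path("conf/base/catalog.yml")  # assumed default location
    if catalog_file.is_file():
        names = top_level_keys(catalog_file.read_text(encoding="utf-8"))
        print("'companies' defined:", "companies" in names)
        print("All entries:", names)
    else:
        print(f"{catalog_file} not found; run this from the project root.")
```

A misspelled or accidentally indented entry shows up here as a missing name, which matches what the DatasetNotFoundError reports.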

DatasetError: An exception occurred when parsing config for dataset

Are you seeing a message saying that an exception occurred?

DatasetError: An exception occurred when parsing config for Dataset
'data_processing.preprocessed_companies':
Object 'ParquetDataset' cannot be loaded from 'kedro_datasets.pandas'. Please see the
documentation on how to install relevant dependencies for kedro_datasets.pandas.ParquetDataset:
https://docs.kedro.org/en/stable/develop/dependencies/

The Kedro Data Catalog is missing dependencies needed to parse the data. Check that you have added all the project dependencies to requirements.txt, then call pip install -r requirements.txt to install them.
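To see which packages are missing from the active environment before reinstalling, you can probe for them with the standard library. This is a minimal sketch: the helper `missing_modules` is hypothetical, and the module list is an assumption (pandas plus a Parquet engine such as pyarrow is what typically backs a pandas Parquet dataset).

```python
import importlib.util


def missing_modules(module_names):
    """Return the names from module_names that cannot be found for import."""
    missing = []
    for name in module_names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # A dotted name whose parent package is absent raises instead of returning None.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Assumed module list; adapt it to the dataset types used in your catalog.
    print("Missing modules:", missing_modules(["kedro_datasets", "pandas", "pyarrow"]))
```

Run it with the same Python interpreter that runs Kedro. If a module is listed as missing even though it is in requirements.txt, you may have installed the dependencies into a different virtual environment.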

Pipeline run

To run the pipeline, ensure all required input datasets exist; otherwise you may see an error such as this:

kedro run --pipeline=data_science

2019-10-04 12:36:12,158 - kedro.io.data_catalog - INFO - Loading data from `model_input_table` (CSVDataset)...
2019-10-04 12:36:12,158 - kedro.runner.sequential_runner - WARNING - There are 3 nodes that have not run.
You can resume the pipeline run with the following command:
kedro run
Traceback (most recent call last):
  ...
  File "pandas/_libs/parsers.pyx", line 382, in pandas._libs.parsers.TextReader.__cinit__
  File "pandas/_libs/parsers.pyx", line 689, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: [Errno 2] File b'data/03_primary/model_input_table.csv' does not exist: b'data/03_primary/model_input_table.csv'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  ...
    raise DatasetError(message) from exc
kedro.io.core.DatasetError: Failed while loading data from dataset CSVDataset(filepath=data/03_primary/model_input_table.csv, save_args={'index': False}).
[Errno 2] File b'data/03_primary/model_input_table.csv' does not exist: b'data/03_primary/model_input_table.csv'
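The traceback above reports that data/03_primary/model_input_table.csv does not exist when only the data_science pipeline runs. The following sketch decides which pipelines to run based on whether that file is present. It is illustrative only: the helper `pipelines_to_run` is hypothetical, and it assumes, as in the spaceflights starter, that the data_processing pipeline is the one that saves model_input_table.

```python
from pathlib import Path


def pipelines_to_run(model_input_path):
    """Return pipeline names to run, adding data_processing if its output is missing."""
    if Path(model_input_path).is_file():
        return ["data_science"]
    # Assumption: data_processing is the pipeline that saves model_input_table.
    return ["data_processing", "data_science"]


if __name__ == "__main__":
    # Print the commands rather than executing them, so nothing runs by surprise.
    for name in pipelines_to_run("data/03_primary/model_input_table.csv"):
        print(f"kedro run --pipeline={name}")
```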