
Learn about Kedro

  • Introduction to Kedro
    • Learn how to use Kedro
      • Assumptions
  • First steps
    • Set up Kedro
      • Installation prerequisites
      • Create a virtual environment for your Kedro project
        • How to create a new virtual environment using conda
        • How to create a new virtual environment without using conda
      • How to install Kedro using pip
      • How to verify your Kedro installation
      • How to upgrade Kedro
      • How to install a development version of Kedro
      • Summary
    • Create a new Kedro project
      • Summary
      • Create a new empty project
      • Create a new project from a configuration file
      • Create a new project containing example code
      • Run the project
      • Visualise a Kedro project
      • Where next?
      • More information about the pandas-iris example project
        • Background information
        • Iris example: visualisation
    • Kedro concepts
      • Summary
      • Node
      • Pipeline
      • Data Catalog
      • Kedro project directory structure
        • conf
        • data
        • src

Tutorial and basic Kedro usage

  • Next steps: Tutorial
    • Set up the spaceflights project
      • Create a new project
      • Install project dependencies
        • Install the dependencies
      • Optional: logging and configuration
        • Configuration best practice to avoid leaking confidential data
    • Set up the data
      • Project datasets
      • Dataset registration
        • Test that Kedro can load the data
      • Further information
        • Custom data
        • Supported data locations
    • Create a data processing pipeline
      • Introduction
      • Data preprocessing node functions
      • The data processing pipeline
      • Test the example
      • Preprocessed data registration
      • Create a table for model input
      • Model input table registration
      • Test the example again
      • Visualise the project
      • Checkpoint
    • Create a data science pipeline
      • Data science nodes
      • Input parameter configuration
      • Model registration
      • Data science pipeline
      • Test the pipelines
        • Slice a pipeline
      • Modular pipelines
        • Optional: Extend the project with namespacing and a modular pipeline
        • How it works: the modular pipeline() wrapper
      • Optional: Kedro runners
    • Package an entire Kedro project
      • Add documentation to a Kedro project
        • Set up the Sphinx project files
        • Build HTML documentation
        • Documentation from docstrings
      • Package a Kedro project
        • Run a packaged project
        • Docker, Airflow and other deployment targets
    • Spaceflights tutorial FAQs
      • How do I resolve these common errors?
        • Dataset errors
        • Pipeline run
    • Get help
    • Terminology
      • Project root directory
      • Dependencies
      • Standard development workflow
  • Visualisation with Kedro-Viz
    • Visualise the spaceflights project
      • Automatic visualisation updates
      • Visualise layers
      • Share a pipeline visualisation
    • Preview data in Kedro-Viz
      • Configure the Data Catalog
      • Previewing data on Kedro-Viz
    • Visualise charts in Kedro-Viz
      • Visualisation with Plotly
        • Update the dependencies
        • Configure the Data Catalog
        • Create the template reporting pipeline
        • Add the Plotly reporting nodes
        • Update the reporting pipeline code
        • Run the pipeline
      • Visualisation with Matplotlib
        • Update the dependencies
        • Configure the Data Catalog
        • Add another node
        • Update the pipeline
        • Run the pipeline
  • Experiment tracking in Kedro-Viz
    • Kedro versions supporting experiment tracking
    • When should I use experiment tracking in Kedro?
    • Set up a project
      • Install Kedro and Kedro-Viz
      • Install the dependencies for the project
    • Set up the session store
      • Local storage
    • Collaborative experiment tracking
    • Set up experiment tracking datasets
    • Modify your nodes and pipelines to log metrics
    • Generate the run data
    • Access run data and compare runs
    • View and compare plots
      • Update the dependencies
      • Add a plotting node
    • View and compare metrics data
  • Kedro for notebook users
    • Kedro and Jupyter Notebooks
      • A custom Kedro kernel
      • Iris dataset example
        • catalog
        • context
        • pipelines
        • session
      • %reload_kedro line magic
      • %run_viz line magic
      • Convert functions from Jupyter Notebooks into Kedro nodes
      • Useful to know…
        • Managed services
        • IPython, JupyterLab and other Jupyter clients
      • Find out more
    • Kedro as a data registry
      • Usage
  • FAQs and resources
    • FAQs
      • Visualisation
      • Working with Jupyter
      • Kedro project development
      • Configuration
        • Advanced topics
      • Nodes and pipelines
      • What is the data engineering convention?
    • Kedro glossary
      • Data Catalog
      • Data engineering vs Data science
      • Kedro
      • KedroContext
      • KedroSession
      • Kedro-Viz
      • Layers (data engineering convention)
      • Modular pipeline
      • Node
      • Node execution order
      • Pipeline
      • Pipeline slicing
      • Runner
      • Starters
      • Tags
      • Workflow dependencies

Kedro projects

  • Kedro project setup
    • Kedro starters
      • How to use Kedro starters
      • Starter aliases
      • List of official starters
      • Starter versioning
      • Use a starter with a configuration file
      • How to create a Kedro starter
        • Configuration variables
        • Example Kedro starter
    • Dependencies
      • Project-specific dependencies
      • Install project-specific dependencies
      • Workflow dependencies
        • Install dependencies related to the Data Catalog
    • Lifecycle management with KedroSession
      • Overview
      • Create a session
    • Project settings
      • Application settings
      • Project metadata
  • Configuration
    • Configuration
      • Configuration source
      • Configuration environments
        • Base
        • Local
      • Configuration loading
        • Configuration file names
        • Configuration patterns
      • How to use Kedro configuration
        • How to change the setting for a configuration source folder
        • How to change the configuration source folder at runtime
        • How to read configuration from a compressed file
        • How to access configuration in code
        • How to specify additional configuration environments
        • How to change the default overriding environment
        • How to use only one configuration environment
    • Credentials
      • How to load credentials in code
      • How to work with AWS credentials
    • Parameters
      • How to use parameters
      • How to load parameters in code
      • How to specify parameters at runtime
    • Migration guide for config loaders
      • ConfigLoader to OmegaConfigLoader
        • 1. Install the required library
        • 2. Use the OmegaConfigLoader
        • 3. Import statements
        • 4. File format support
        • 5. Load configuration
        • 6. Exception handling
      • TemplatedConfigLoader to OmegaConfigLoader
        • 1. Install the required library
        • 2. Use the OmegaConfigLoader
        • 3. Import statements
        • 4. File format support
        • 5. Load configuration
        • 6. Templating of values
        • 7. Globals
        • 8. Deprecation of Jinja2
        • 9. Exception handling
    • Advanced configuration
      • TemplatedConfigLoader
        • Provide template values through globals
      • OmegaConfigLoader
      • Advanced Kedro configuration
        • How to change which configuration files are loaded
        • How to ensure non-default configuration files get loaded
        • How to bypass the configuration loading rules
        • How to use Jinja2 syntax in configuration
        • How to do templating with the OmegaConfigLoader
        • How to use global variables with the OmegaConfigLoader
        • How to use resolvers in the OmegaConfigLoader
        • How to load credentials through environment variables
  • The Kedro Data Catalog
    • Introduction to the Data Catalog
      • The basics of catalog.yml
        • Dataset type
        • Dataset filepath
      • Additional settings in catalog.yml
        • Load and save arguments
        • Dataset access credentials
        • Dataset versioning
      • Use the Data Catalog within Kedro configuration
    • Data Catalog YAML examples
      • Load data from a local binary file using utf-8 encoding
      • Save data to a CSV file without row names (index) using utf-8 encoding
      • Load/save a CSV file from/to a local file system
      • Load/save a CSV on a local file system, using specified load/save arguments
      • Load/save a compressed CSV on a local file system
      • Load a CSV file from a specific S3 bucket, using credentials and load arguments
      • Load/save a pickle file from/to a local file system
      • Load an Excel file from Google Cloud Storage
      • Load a multi-sheet Excel file from a local file system
      • Save an image created with Matplotlib on Google Cloud Storage
      • Load/save an HDF file on local file system storage, using specified load/save arguments
      • Load/save a parquet file on local file system storage, using specified load/save arguments
      • Load/save a Spark table on S3, using specified load/save arguments
      • Load/save a SQL table using credentials, a database connection, and specified load/save arguments
      • Load a SQL table with credentials and a database connection, and apply a SQL query to the table
      • Load data from an API endpoint
      • Load data from MinIO (S3 API-compatible storage)
      • Load a model saved as a pickle from Azure Blob Storage
      • Load a CSV file stored in a remote location through SSH
      • Load multiple datasets with similar configuration using YAML anchors
      • Read the same file using two different datasets
      • Create a Data Catalog YAML configuration file via the CLI
    • Kedro dataset factories
      • How to generalise datasets with similar names and types
      • How to generalise datasets of the same type
      • How to generalise datasets using namespaces
      • How to generalise datasets of the same type in different layers
      • How to generalise datasets using multiple dataset factories
      • How to override the default dataset creation with dataset factories
      • CLI commands for dataset factories
        • How to use kedro catalog rank
        • How to use kedro catalog resolve
    • Advanced: Access the Data Catalog in code
      • How to configure the Data Catalog
      • How to view the available data sources
      • How to load datasets programmatically
      • How to save data programmatically
        • How to save data to memory
        • How to save data to a SQL database for querying
        • How to save data in Parquet
      • How to access a dataset with credentials
      • How to version a dataset using the Code API
    • Advanced: Partitioned and incremental datasets
      • Partitioned datasets
        • How to use PartitionedDataset
        • Dataset definition
        • Partitioned dataset credentials
        • Partitioned dataset load
        • Partitioned dataset save
        • Partitioned dataset lazy saving
      • Incremental datasets
        • Incremental dataset loads
        • Incremental dataset save
        • Incremental dataset confirm
        • Checkpoint configuration
        • Special checkpoint config keys
    • Advanced: Tutorial to create a custom dataset
      • AbstractDataset
      • Scenario
      • Project setup
      • The anatomy of a dataset
      • Implement the _load method with fsspec
      • Implement the _save method with fsspec
      • Implement the _describe method
      • The complete example
      • Integration with PartitionedDataset
      • Versioning
        • How to implement versioning in your dataset
      • Thread-safety
      • How to handle credentials and different filesystems
      • How to contribute a custom dataset implementation
  • Nodes and pipelines
    • Nodes
      • How to create a node
        • Node definition syntax
        • Syntax for input variables
        • Syntax for output variables
      • **kwargs-only node functions
      • How to tag a node
      • How to run a node
      • How to use generator functions in a node
        • Set up the project
        • Loading data with generators
        • Saving data with generators
    • Pipelines
      • How to build a pipeline
        • How to tag a pipeline
        • How to merge multiple pipelines
        • Information about the nodes in a pipeline
        • Information about pipeline inputs and outputs
      • Bad pipelines
        • Pipeline with bad nodes
        • Pipeline with circular dependencies
    • Modular pipelines
      • What are modular pipelines?
        • Key concepts
      • How do I create a modular pipeline?
        • What does kedro pipeline create do?
        • Ensuring portability
        • Providing modular pipeline-specific dependencies
      • Using the modular pipeline() wrapper to provide overrides
      • Combining disconnected pipelines
      • Using a modular pipeline multiple times
      • How to use a modular pipeline with different parameters
    • The pipeline registry
      • Pipeline autodiscovery
    • Micro-packaging
      • Package a micro-package
      • Package multiple micro-packages
      • Pull a micro-package
        • Providing fsspec arguments
      • Pull multiple micro-packages
    • Run a pipeline
      • Runners
        • SequentialRunner
        • ParallelRunner
      • Custom runners
      • Load and save asynchronously
      • Run a pipeline by name
      • Run pipelines with IO
      • Output to a file
      • Configure kedro run arguments
    • Slice a pipeline
      • Slice a pipeline by providing inputs
      • Slice a pipeline by specifying nodes
      • Slice a pipeline by specifying final nodes
      • Slice a pipeline with tagged nodes
      • Slice a pipeline by running specified nodes
      • How to recreate missing outputs

Advanced usage

  • Extend Kedro
    • Common use cases
      • Use Case 1: How to add extra behaviour to Kedro’s execution timeline
      • Use Case 2: How to integrate Kedro with additional data sources
      • Use Case 3: How to add or modify CLI commands
      • Use Case 4: How to customise the initial boilerplate of your project
    • Kedro plugins
      • Overview
      • Example of a simple plugin
      • Extend starter aliases
      • Working with click
      • Project context
      • Initialisation
      • global and project commands
      • Suggested command convention
      • Hooks
      • CLI Hooks
      • Contributing process
      • Supported Kedro plugins
      • Community-developed plugins
    • Kedro architecture overview
      • Kedro project
      • Kedro framework
      • Kedro starter
      • Kedro library
      • Kedro extension
  • Hooks
    • Hooks
      • Concepts
        • Hook specification
        • Hook implementation
      • Under the hood
    • Common use cases
      • Use Hooks to extend a node’s behaviour
      • Use Hooks to customise the dataset load and save methods
      • Use Hooks to load external credentials
    • Hooks examples
      • Add memory consumption tracking
      • Add data validation
        • V2 API
        • V3 API
      • Add observability to your pipeline
      • Add metrics tracking to your model
      • Modify node inputs using before_node_run hook
  • Logging
    • Default framework-side logging configuration
      • Project-side logging configuration
        • Using KEDRO_LOGGING_CONFIG environment variable
        • Disable file-based logging
        • Customise the rich Handler
        • Use plain console logging
        • Rich logging in a dumb terminal
        • Rich logging in Jupyter
      • Perform logging in your project
  • PySpark integration
    • Centralise Spark configuration in conf/base/spark.yml
    • Initialise a SparkSession using a hook
    • Use Kedro’s built-in Spark datasets to load and save raw data
    • Spark and Delta Lake interaction
    • Use MemoryDataset for intermediary DataFrame
    • Use MemoryDataset with copy_mode="assign" for non-DataFrame Spark objects
    • Tips for maximising concurrency using ThreadRunner
  • Development
    • Set up Visual Studio Code
      • Advanced: For those using venv / virtualenv
      • Setting up tasks
      • Debugging
        • Advanced: Remote Interpreter / Debugging
      • Configuring the Kedro catalog validation schema
    • Set up PyCharm
      • Set up Run configurations
      • Debugging
      • Advanced: Remote SSH interpreter
      • Advanced: Docker interpreter
      • Configure Python Console
      • Configuring the Kedro catalog validation schema
    • Kedro’s command line interface
      • Autocompletion (optional)
      • Invoke Kedro CLI from Python (optional)
      • Kedro commands
      • Global Kedro commands
        • Get help on Kedro commands
        • Confirm the Kedro version
        • Confirm Kedro information
        • Create a new Kedro project
        • Open the Kedro documentation in your browser
      • Customise or override project-specific Kedro commands
        • Project setup
        • Run the project
        • Deploy the project
        • Pull a micro-package
        • Project quality
        • Project development
    • Debugging
      • Introduction
      • Debugging a node
      • Debugging a pipeline
    • Automated testing
      • Introduction
      • Set up automated testing with pytest
        • Install pytest
        • Create a /tests directory
        • Test directory structure
        • Create an example test
        • Run your tests
      • Add test coverage reports with pytest-cov
        • Install pytest-cov
        • Configure pytest to use pytest-cov
        • Run pytest with pytest-cov
    • Code formatting and linting
      • Introduction
      • Set up Python tools
        • Install the tools
        • Run the tools
      • Automated formatting and linting with pre-commit hooks
        • Install pre-commit
        • Add pre-commit configuration file
        • Install git hook scripts
  • Deployment
    • Single-machine deployment
      • Container-based
        • How to use a container registry
      • Package-based
      • CLI-based
        • Use GitHub workflow to copy your project
        • Install and run the Kedro project
    • Distributed deployment
      • 1. Containerise the pipeline
      • 2. Convert your Kedro pipeline into targeted platform primitives
      • 3. Parameterise the runs
      • 4. (Optional) Create starters
    • Apache Airflow
      • How to run a Kedro pipeline on Apache Airflow using a Kubernetes cluster
      • How to run a Kedro pipeline on Apache Airflow with Astronomer
        • Strategy
        • Prerequisites
        • Tutorial project setup
        • Deployment process
    • Amazon SageMaker
      • The kedro-sagemaker plugin
    • AWS Step Functions
      • Why would you run a Kedro pipeline with AWS Step Functions?
      • Strategy
      • Prerequisites
      • Deployment process
        • Step 1. Create a new configuration environment to prepare a compatible DataCatalog
        • Step 2. Package the Kedro pipeline as an AWS Lambda-compliant Docker image
        • Step 3. Write the deployment script
        • Step 4. Deploy the pipeline
      • Limitations
    • Azure ML pipelines
      • kedro-azureml plugin
    • Dask
      • Why would you use Dask?
      • Prerequisites
      • How to distribute your Kedro pipeline using Dask
        • Create a custom runner
        • Update CLI implementation
        • Deploy
    • Databricks
      • Use a Databricks workspace to develop a Kedro project
        • What this page covers
        • Prerequisites
        • Set up your project
        • Modify your project and test the changes
        • Summary
      • Use an IDE, dbx and Databricks Repos to develop a Kedro project
        • What this page covers
        • Prerequisites
        • Set up your project
        • Modify your project and test the changes
        • Summary
      • Use a Databricks job to deploy a Kedro project
        • What are the advantages of packaging a Kedro project to run on Databricks?
        • What this page covers
        • Prerequisites
        • Set up your project for deployment to Databricks
        • Deploy and run your Kedro project using the workspace UI
        • Resources for automatically deploying to Databricks
        • Summary
      • Visualise a Kedro project in Databricks notebooks
    • Kubeflow Pipelines
      • Why would you use Kubeflow Pipelines?
      • The kedro-kubeflow plugin
    • Prefect
      • Prerequisites
      • Setup
      • How to run your Kedro pipeline using Prefect 2.0
        • Convert your Kedro pipeline to a Prefect 2.0 flow
        • Run Prefect flow
    • VertexAI
      • The kedro-vertexai plugin
    • Argo Workflows (outdated documentation that needs review)
      • Why would you use Argo Workflows?
      • Prerequisites
      • How to run your Kedro pipeline using Argo Workflows
        • Containerise your Kedro project
        • Create Argo Workflows spec
        • Submit Argo Workflows spec to Kubernetes
        • Kedro-Argo plugin
    • AWS Batch (outdated documentation that needs review)
      • Why would you use AWS Batch?
      • Prerequisites
      • How to run a Kedro pipeline using AWS Batch
        • Containerise your Kedro project
        • Provision resources
        • Configure the credentials
        • Submit AWS Batch jobs
        • Deploy

Contribute to Kedro

  • Contribute to Kedro
    • Kedro’s Technical Steering Committee
      • Responsibilities of a maintainer
        • Product development
        • Community management
      • Requirements to become a maintainer
      • Kedro maintainers
      • Application process
      • Voting process
        • Other issues or proposals
        • Adding or removing maintainers

API documentation

  • kedro
    • kedro.KedroPythonVersionWarning
      • KedroPythonVersionWarning
        • KedroPythonVersionWarning.args
        • KedroPythonVersionWarning.with_traceback()
    • kedro.config
      • kedro.config.ConfigLoader
        • ConfigLoader
      • kedro.config.TemplatedConfigLoader
        • TemplatedConfigLoader
      • kedro.config.OmegaConfigLoader
        • OmegaConfigLoader
      • kedro.config.MissingConfigException
        • MissingConfigException
    • kedro.extras
      • kedro.extras.extensions
        • kedro.extras.extensions.ipython
      • kedro.extras.logging
        • kedro.extras.logging.color_logger
      • kedro.extras.datasets
        • kedro.extras.datasets.api.APIDataSet
        • kedro.extras.datasets.biosequence.BioSequenceDataSet
        • kedro.extras.datasets.dask.ParquetDataSet
        • kedro.extras.datasets.email.EmailMessageDataSet
        • kedro.extras.datasets.geopandas.GeoJSONDataSet
        • kedro.extras.datasets.holoviews.HoloviewsWriter
        • kedro.extras.datasets.json.JSONDataSet
        • kedro.extras.datasets.matplotlib.MatplotlibWriter
        • kedro.extras.datasets.networkx.GMLDataSet
        • kedro.extras.datasets.networkx.GraphMLDataSet
        • kedro.extras.datasets.networkx.JSONDataSet
        • kedro.extras.datasets.pandas.CSVDataSet
        • kedro.extras.datasets.pandas.ExcelDataSet
        • kedro.extras.datasets.pandas.FeatherDataSet
        • kedro.extras.datasets.pandas.GBQQueryDataSet
        • kedro.extras.datasets.pandas.GBQTableDataSet
        • kedro.extras.datasets.pandas.GenericDataSet
        • kedro.extras.datasets.pandas.HDFDataSet
        • kedro.extras.datasets.pandas.JSONDataSet
        • kedro.extras.datasets.pandas.ParquetDataSet
        • kedro.extras.datasets.pandas.SQLQueryDataSet
        • kedro.extras.datasets.pandas.SQLTableDataSet
        • kedro.extras.datasets.pandas.XMLDataSet
        • kedro.extras.datasets.pickle.PickleDataSet
        • kedro.extras.datasets.pillow.ImageDataSet
        • kedro.extras.datasets.plotly.JSONDataSet
        • kedro.extras.datasets.plotly.PlotlyDataSet
        • kedro.extras.datasets.redis.PickleDataSet
        • kedro.extras.datasets.spark.DeltaTableDataSet
        • kedro.extras.datasets.spark.SparkDataSet
        • kedro.extras.datasets.spark.SparkHiveDataSet
        • kedro.extras.datasets.spark.SparkJDBCDataSet
        • kedro.extras.datasets.svmlight.SVMLightDataSet
        • kedro.extras.datasets.tensorflow.TensorFlowModelDataset
        • kedro.extras.datasets.text.TextDataSet
        • kedro.extras.datasets.tracking.JSONDataSet
        • kedro.extras.datasets.tracking.MetricsDataSet
        • kedro.extras.datasets.yaml.YAMLDataSet
    • kedro.framework
      • kedro.framework.cli
        • kedro.framework.cli.catalog
        • kedro.framework.cli.cli
        • kedro.framework.cli.hooks
        • kedro.framework.cli.jupyter
        • kedro.framework.cli.micropkg
        • kedro.framework.cli.pipeline
        • kedro.framework.cli.project
        • kedro.framework.cli.registry
        • kedro.framework.cli.starters
        • kedro.framework.cli.utils
      • kedro.framework.context
        • kedro.framework.context.KedroContext
        • kedro.framework.context.KedroContextError
      • kedro.framework.hooks
        • kedro.framework.hooks.manager
        • kedro.framework.hooks.markers
        • kedro.framework.hooks.specs
      • kedro.framework.project
        • kedro.framework.project.configure_logging
        • kedro.framework.project.configure_project
        • kedro.framework.project.find_pipelines
        • kedro.framework.project.validate_settings
      • kedro.framework.session
        • kedro.framework.session.session
        • kedro.framework.session.shelvestore
        • kedro.framework.session.store
      • kedro.framework.startup
        • kedro.framework.startup.bootstrap_project
        • kedro.framework.startup.ProjectMetadata
    • kedro.io
      • kedro.io.AbstractDataset
        • AbstractDataset
      • kedro.io.AbstractVersionedDataset
        • AbstractVersionedDataset
      • kedro.io.CachedDataSet
        • CachedDataSet
      • kedro.io.CachedDataset
        • CachedDataset
      • kedro.io.DataCatalog
        • DataCatalog
      • kedro.io.IncrementalDataSet
        • IncrementalDataSet
      • kedro.io.IncrementalDataset
        • IncrementalDataset
      • kedro.io.LambdaDataSet
        • LambdaDataSet
      • kedro.io.LambdaDataset
        • LambdaDataset
      • kedro.io.MemoryDataSet
        • MemoryDataSet
      • kedro.io.MemoryDataset
        • MemoryDataset
      • kedro.io.PartitionedDataSet
        • PartitionedDataSet
      • kedro.io.PartitionedDataset
        • PartitionedDataset
      • kedro.io.Version
        • Version
      • kedro.io.DataSetAlreadyExistsError
        • DataSetAlreadyExistsError
      • kedro.io.DataSetError
        • DataSetError
      • kedro.io.DataSetNotFoundError
        • DataSetNotFoundError
      • kedro.io.DatasetAlreadyExistsError
        • DatasetAlreadyExistsError
      • kedro.io.DatasetError
        • DatasetError
      • kedro.io.DatasetNotFoundError
        • DatasetNotFoundError
    • kedro.ipython
      • kedro.ipython.load_ipython_extension
        • load_ipython_extension()
      • kedro.ipython.magic_reload_kedro
        • magic_reload_kedro()
      • kedro.ipython.reload_kedro
        • reload_kedro()
    • kedro.logging
      • kedro.logging.RichHandler
        • RichHandler
    • kedro.pipeline
      • kedro.pipeline.node
        • node()
      • kedro.pipeline.modular_pipeline.pipeline
        • pipeline()
      • kedro.pipeline.Pipeline
        • Pipeline
      • kedro.pipeline.node.Node
        • Node
      • kedro.pipeline.modular_pipeline.ModularPipelineError
        • ModularPipelineError
    • kedro.runner
      • kedro.runner.run_node
        • run_node()
      • kedro.runner.AbstractRunner
        • AbstractRunner
      • kedro.runner.ParallelRunner
        • ParallelRunner
      • kedro.runner.SequentialRunner
        • SequentialRunner
      • kedro.runner.ThreadRunner
        • ThreadRunner
    • kedro.utils
      • kedro.utils.load_obj
        • load_obj()
  • kedro_datasets
    • kedro_datasets.api.APIDataSet
      • APIDataSet
        • APIDataSet.DEFAULT_SAVE_ARGS
        • APIDataSet.__init__()
        • APIDataSet.exists()
        • APIDataSet.from_config()
        • APIDataSet.load()
        • APIDataSet.release()
        • APIDataSet.save()
    • kedro_datasets.biosequence.BioSequenceDataSet
      • BioSequenceDataSet
        • BioSequenceDataSet.DEFAULT_LOAD_ARGS
        • BioSequenceDataSet.DEFAULT_SAVE_ARGS
        • BioSequenceDataSet.__init__()
        • BioSequenceDataSet.exists()
        • BioSequenceDataSet.from_config()
        • BioSequenceDataSet.invalidate_cache()
        • BioSequenceDataSet.load()
        • BioSequenceDataSet.release()
        • BioSequenceDataSet.save()
    • kedro_datasets.dask.ParquetDataSet
      • ParquetDataSet
        • ParquetDataSet.DEFAULT_LOAD_ARGS
        • ParquetDataSet.DEFAULT_SAVE_ARGS
        • ParquetDataSet.__init__()
        • ParquetDataSet.exists()
        • ParquetDataSet.from_config()
        • ParquetDataSet.fs_args
        • ParquetDataSet.load()
        • ParquetDataSet.release()
        • ParquetDataSet.save()
    • kedro_datasets.databricks.ManagedTableDataSet
      • ManagedTableDataSet
        • ManagedTableDataSet.__init__()
        • ManagedTableDataSet.exists()
        • ManagedTableDataSet.from_config()
        • ManagedTableDataSet.load()
        • ManagedTableDataSet.release()
        • ManagedTableDataSet.resolve_load_version()
        • ManagedTableDataSet.resolve_save_version()
        • ManagedTableDataSet.save()
    • kedro_datasets.email.EmailMessageDataSet
      • EmailMessageDataSet
        • EmailMessageDataSet.DEFAULT_LOAD_ARGS
        • EmailMessageDataSet.DEFAULT_SAVE_ARGS
        • EmailMessageDataSet.__init__()
        • EmailMessageDataSet.exists()
        • EmailMessageDataSet.from_config()
        • EmailMessageDataSet.load()
        • EmailMessageDataSet.release()
        • EmailMessageDataSet.resolve_load_version()
        • EmailMessageDataSet.resolve_save_version()
        • EmailMessageDataSet.save()
    • kedro_datasets.geopandas.GeoJSONDataSet
      • GeoJSONDataSet
        • GeoJSONDataSet.DEFAULT_LOAD_ARGS
        • GeoJSONDataSet.DEFAULT_SAVE_ARGS
        • GeoJSONDataSet.__init__()
        • GeoJSONDataSet.exists()
        • GeoJSONDataSet.from_config()
        • GeoJSONDataSet.invalidate_cache()
        • GeoJSONDataSet.load()
        • GeoJSONDataSet.release()
        • GeoJSONDataSet.resolve_load_version()
        • GeoJSONDataSet.resolve_save_version()
        • GeoJSONDataSet.save()
    • kedro_datasets.holoviews.HoloviewsWriter
      • HoloviewsWriter
        • HoloviewsWriter.DEFAULT_SAVE_ARGS
        • HoloviewsWriter.__init__()
        • HoloviewsWriter.exists()
        • HoloviewsWriter.from_config()
        • HoloviewsWriter.load()
        • HoloviewsWriter.release()
        • HoloviewsWriter.resolve_load_version()
        • HoloviewsWriter.resolve_save_version()
        • HoloviewsWriter.save()
    • kedro_datasets.json.JSONDataSet
      • JSONDataSet
        • JSONDataSet.DEFAULT_SAVE_ARGS
        • JSONDataSet.__init__()
        • JSONDataSet.exists()
        • JSONDataSet.from_config()
        • JSONDataSet.load()
        • JSONDataSet.release()
        • JSONDataSet.resolve_load_version()
        • JSONDataSet.resolve_save_version()
        • JSONDataSet.save()
    • kedro_datasets.matplotlib.MatplotlibWriter
      • MatplotlibWriter
        • MatplotlibWriter.DEFAULT_SAVE_ARGS
        • MatplotlibWriter.__init__()
        • MatplotlibWriter.exists()
        • MatplotlibWriter.from_config()
        • MatplotlibWriter.load()
        • MatplotlibWriter.release()
        • MatplotlibWriter.resolve_load_version()
        • MatplotlibWriter.resolve_save_version()
        • MatplotlibWriter.save()
    • kedro_datasets.networkx.GMLDataSet
      • GMLDataSet
        • GMLDataSet.DEFAULT_LOAD_ARGS
        • GMLDataSet.DEFAULT_SAVE_ARGS
        • GMLDataSet.__init__()
        • GMLDataSet.exists()
        • GMLDataSet.from_config()
        • GMLDataSet.load()
        • GMLDataSet.release()
        • GMLDataSet.resolve_load_version()
        • GMLDataSet.resolve_save_version()
        • GMLDataSet.save()
    • kedro_datasets.networkx.GraphMLDataSet
      • GraphMLDataSet
        • GraphMLDataSet.DEFAULT_LOAD_ARGS
        • GraphMLDataSet.DEFAULT_SAVE_ARGS
        • GraphMLDataSet.__init__()
        • GraphMLDataSet.exists()
        • GraphMLDataSet.from_config()
        • GraphMLDataSet.load()
        • GraphMLDataSet.release()
        • GraphMLDataSet.resolve_load_version()
        • GraphMLDataSet.resolve_save_version()
        • GraphMLDataSet.save()
    • kedro_datasets.networkx.JSONDataSet
      • JSONDataSet
        • JSONDataSet.DEFAULT_LOAD_ARGS
        • JSONDataSet.DEFAULT_SAVE_ARGS
        • JSONDataSet.__init__()
        • JSONDataSet.exists()
        • JSONDataSet.from_config()
        • JSONDataSet.load()
        • JSONDataSet.release()
        • JSONDataSet.resolve_load_version()
        • JSONDataSet.resolve_save_version()
        • JSONDataSet.save()
    • kedro_datasets.pandas.CSVDataSet
      • CSVDataSet
        • CSVDataSet.DEFAULT_LOAD_ARGS
        • CSVDataSet.DEFAULT_SAVE_ARGS
        • CSVDataSet.__init__()
        • CSVDataSet.exists()
        • CSVDataSet.from_config()
        • CSVDataSet.load()
        • CSVDataSet.release()
        • CSVDataSet.resolve_load_version()
        • CSVDataSet.resolve_save_version()
        • CSVDataSet.save()
    • kedro_datasets.pandas.DeltaTableDataSet
      • DeltaTableDataSet
        • DeltaTableDataSet.ACCEPTED_WRITE_MODES
        • DeltaTableDataSet.DEFAULT_LOAD_ARGS
        • DeltaTableDataSet.DEFAULT_SAVE_ARGS
        • DeltaTableDataSet.DEFAULT_WRITE_MODE
        • DeltaTableDataSet.__init__()
        • DeltaTableDataSet.exists()
        • DeltaTableDataSet.from_config()
        • DeltaTableDataSet.fs_args
        • DeltaTableDataSet.get_loaded_version()
        • DeltaTableDataSet.history
        • DeltaTableDataSet.load()
        • DeltaTableDataSet.metadata
        • DeltaTableDataSet.release()
        • DeltaTableDataSet.save()
        • DeltaTableDataSet.schema
    • kedro_datasets.pandas.ExcelDataSet
      • ExcelDataSet
        • ExcelDataSet.DEFAULT_LOAD_ARGS
        • ExcelDataSet.DEFAULT_SAVE_ARGS
        • ExcelDataSet.__init__()
        • ExcelDataSet.exists()
        • ExcelDataSet.from_config()
        • ExcelDataSet.load()
        • ExcelDataSet.release()
        • ExcelDataSet.resolve_load_version()
        • ExcelDataSet.resolve_save_version()
        • ExcelDataSet.save()
    • kedro_datasets.pandas.FeatherDataSet
      • FeatherDataSet
        • FeatherDataSet.DEFAULT_LOAD_ARGS
        • FeatherDataSet.DEFAULT_SAVE_ARGS
        • FeatherDataSet.__init__()
        • FeatherDataSet.exists()
        • FeatherDataSet.from_config()
        • FeatherDataSet.load()
        • FeatherDataSet.release()
        • FeatherDataSet.resolve_load_version()
        • FeatherDataSet.resolve_save_version()
        • FeatherDataSet.save()
    • kedro_datasets.pandas.GBQQueryDataSet
      • GBQQueryDataSet
        • GBQQueryDataSet.DEFAULT_LOAD_ARGS
        • GBQQueryDataSet.__init__()
        • GBQQueryDataSet.exists()
        • GBQQueryDataSet.from_config()
        • GBQQueryDataSet.load()
        • GBQQueryDataSet.release()
        • GBQQueryDataSet.save()
    • kedro_datasets.pandas.GBQTableDataSet
      • GBQTableDataSet
        • GBQTableDataSet.DEFAULT_LOAD_ARGS
        • GBQTableDataSet.DEFAULT_SAVE_ARGS
        • GBQTableDataSet.__init__()
        • GBQTableDataSet.exists()
        • GBQTableDataSet.from_config()
        • GBQTableDataSet.load()
        • GBQTableDataSet.release()
        • GBQTableDataSet.save()
    • kedro_datasets.pandas.GenericDataSet
      • GenericDataSet
        • GenericDataSet.DEFAULT_LOAD_ARGS
        • GenericDataSet.DEFAULT_SAVE_ARGS
        • GenericDataSet.__init__()
        • GenericDataSet.exists()
        • GenericDataSet.from_config()
        • GenericDataSet.load()
        • GenericDataSet.release()
        • GenericDataSet.resolve_load_version()
        • GenericDataSet.resolve_save_version()
        • GenericDataSet.save()
    • kedro_datasets.pandas.HDFDataSet
      • HDFDataSet
        • HDFDataSet.DEFAULT_LOAD_ARGS
        • HDFDataSet.DEFAULT_SAVE_ARGS
        • HDFDataSet.__init__()
        • HDFDataSet.exists()
        • HDFDataSet.from_config()
        • HDFDataSet.load()
        • HDFDataSet.release()
        • HDFDataSet.resolve_load_version()
        • HDFDataSet.resolve_save_version()
        • HDFDataSet.save()
    • kedro_datasets.pandas.JSONDataSet
      • JSONDataSet
        • JSONDataSet.DEFAULT_LOAD_ARGS
        • JSONDataSet.DEFAULT_SAVE_ARGS
        • JSONDataSet.__init__()
        • JSONDataSet.exists()
        • JSONDataSet.from_config()
        • JSONDataSet.load()
        • JSONDataSet.release()
        • JSONDataSet.resolve_load_version()
        • JSONDataSet.resolve_save_version()
        • JSONDataSet.save()
    • kedro_datasets.pandas.ParquetDataSet
      • ParquetDataSet
        • ParquetDataSet.DEFAULT_LOAD_ARGS
        • ParquetDataSet.DEFAULT_SAVE_ARGS
        • ParquetDataSet.__init__()
        • ParquetDataSet.exists()
        • ParquetDataSet.from_config()
        • ParquetDataSet.load()
        • ParquetDataSet.release()
        • ParquetDataSet.resolve_load_version()
        • ParquetDataSet.resolve_save_version()
        • ParquetDataSet.save()
    • kedro_datasets.pandas.SQLQueryDataSet
      • SQLQueryDataSet
        • SQLQueryDataSet.__init__()
        • SQLQueryDataSet.adapt_mssql_date_params()
        • SQLQueryDataSet.create_connection()
        • SQLQueryDataSet.engines
        • SQLQueryDataSet.exists()
        • SQLQueryDataSet.from_config()
        • SQLQueryDataSet.load()
        • SQLQueryDataSet.release()
        • SQLQueryDataSet.save()
    • kedro_datasets.pandas.SQLTableDataSet
      • SQLTableDataSet
        • SQLTableDataSet.DEFAULT_LOAD_ARGS
        • SQLTableDataSet.DEFAULT_SAVE_ARGS
        • SQLTableDataSet.__init__()
        • SQLTableDataSet.create_connection()
        • SQLTableDataSet.engines
        • SQLTableDataSet.exists()
        • SQLTableDataSet.from_config()
        • SQLTableDataSet.load()
        • SQLTableDataSet.release()
        • SQLTableDataSet.save()
    • kedro_datasets.pandas.XMLDataSet
      • XMLDataSet
        • XMLDataSet.DEFAULT_LOAD_ARGS
        • XMLDataSet.DEFAULT_SAVE_ARGS
        • XMLDataSet.__init__()
        • XMLDataSet.exists()
        • XMLDataSet.from_config()
        • XMLDataSet.load()
        • XMLDataSet.release()
        • XMLDataSet.resolve_load_version()
        • XMLDataSet.resolve_save_version()
        • XMLDataSet.save()
    • kedro_datasets.pickle.PickleDataSet
      • PickleDataSet
        • PickleDataSet.DEFAULT_LOAD_ARGS
        • PickleDataSet.DEFAULT_SAVE_ARGS
        • PickleDataSet.__init__()
        • PickleDataSet.exists()
        • PickleDataSet.from_config()
        • PickleDataSet.load()
        • PickleDataSet.release()
        • PickleDataSet.resolve_load_version()
        • PickleDataSet.resolve_save_version()
        • PickleDataSet.save()
    • kedro_datasets.pillow.ImageDataSet
      • ImageDataSet
        • ImageDataSet.DEFAULT_SAVE_ARGS
        • ImageDataSet.__init__()
        • ImageDataSet.exists()
        • ImageDataSet.from_config()
        • ImageDataSet.load()
        • ImageDataSet.release()
        • ImageDataSet.resolve_load_version()
        • ImageDataSet.resolve_save_version()
        • ImageDataSet.save()
    • kedro_datasets.plotly.JSONDataSet
      • JSONDataSet
        • JSONDataSet.DEFAULT_LOAD_ARGS
        • JSONDataSet.DEFAULT_SAVE_ARGS
        • JSONDataSet.__init__()
        • JSONDataSet.exists()
        • JSONDataSet.from_config()
        • JSONDataSet.load()
        • JSONDataSet.release()
        • JSONDataSet.resolve_load_version()
        • JSONDataSet.resolve_save_version()
        • JSONDataSet.save()
    • kedro_datasets.plotly.PlotlyDataSet
      • PlotlyDataSet
        • PlotlyDataSet.DEFAULT_LOAD_ARGS
        • PlotlyDataSet.DEFAULT_SAVE_ARGS
        • PlotlyDataSet.__init__()
        • PlotlyDataSet.exists()
        • PlotlyDataSet.from_config()
        • PlotlyDataSet.load()
        • PlotlyDataSet.release()
        • PlotlyDataSet.resolve_load_version()
        • PlotlyDataSet.resolve_save_version()
        • PlotlyDataSet.save()
    • kedro_datasets.polars.CSVDataSet
      • CSVDataSet
        • CSVDataSet.DEFAULT_LOAD_ARGS
        • CSVDataSet.DEFAULT_SAVE_ARGS
        • CSVDataSet.__init__()
        • CSVDataSet.exists()
        • CSVDataSet.from_config()
        • CSVDataSet.load()
        • CSVDataSet.release()
        • CSVDataSet.resolve_load_version()
        • CSVDataSet.resolve_save_version()
        • CSVDataSet.save()
    • kedro_datasets.redis.PickleDataSet
      • PickleDataSet
        • PickleDataSet.DEFAULT_LOAD_ARGS
        • PickleDataSet.DEFAULT_REDIS_URL
        • PickleDataSet.DEFAULT_SAVE_ARGS
        • PickleDataSet.__init__()
        • PickleDataSet.exists()
        • PickleDataSet.from_config()
        • PickleDataSet.load()
        • PickleDataSet.release()
        • PickleDataSet.save()
    • kedro_datasets.snowflake.SnowparkTableDataSet
      • SnowparkTableDataSet
        • SnowparkTableDataSet.DEFAULT_LOAD_ARGS
        • SnowparkTableDataSet.DEFAULT_SAVE_ARGS
        • SnowparkTableDataSet.__init__()
        • SnowparkTableDataSet.exists()
        • SnowparkTableDataSet.from_config()
        • SnowparkTableDataSet.load()
        • SnowparkTableDataSet.release()
        • SnowparkTableDataSet.save()
    • kedro_datasets.spark.DeltaTableDataSet
      • DeltaTableDataSet
        • DeltaTableDataSet.__init__()
        • DeltaTableDataSet.exists()
        • DeltaTableDataSet.from_config()
        • DeltaTableDataSet.load()
        • DeltaTableDataSet.release()
        • DeltaTableDataSet.save()
    • kedro_datasets.spark.SparkDataSet
      • SparkDataSet
        • SparkDataSet.DEFAULT_LOAD_ARGS
        • SparkDataSet.DEFAULT_SAVE_ARGS
        • SparkDataSet.__init__()
        • SparkDataSet.exists()
        • SparkDataSet.from_config()
        • SparkDataSet.load()
        • SparkDataSet.release()
        • SparkDataSet.resolve_load_version()
        • SparkDataSet.resolve_save_version()
        • SparkDataSet.save()
    • kedro_datasets.spark.SparkHiveDataSet
      • SparkHiveDataSet
        • SparkHiveDataSet.DEFAULT_SAVE_ARGS
        • SparkHiveDataSet.__init__()
        • SparkHiveDataSet.exists()
        • SparkHiveDataSet.from_config()
        • SparkHiveDataSet.load()
        • SparkHiveDataSet.release()
        • SparkHiveDataSet.save()
    • kedro_datasets.spark.SparkJDBCDataSet
      • SparkJDBCDataSet
        • SparkJDBCDataSet.DEFAULT_LOAD_ARGS
        • SparkJDBCDataSet.DEFAULT_SAVE_ARGS
        • SparkJDBCDataSet.__init__()
        • SparkJDBCDataSet.exists()
        • SparkJDBCDataSet.from_config()
        • SparkJDBCDataSet.load()
        • SparkJDBCDataSet.release()
        • SparkJDBCDataSet.save()
    • kedro_datasets.spark.SparkStreamingDataSet
      • SparkStreamingDataSet
        • SparkStreamingDataSet.DEFAULT_LOAD_ARGS
        • SparkStreamingDataSet.DEFAULT_SAVE_ARGS
        • SparkStreamingDataSet.__init__()
        • SparkStreamingDataSet.exists()
        • SparkStreamingDataSet.from_config()
        • SparkStreamingDataSet.load()
        • SparkStreamingDataSet.release()
        • SparkStreamingDataSet.save()
    • kedro_datasets.svmlight.SVMLightDataSet
      • SVMLightDataSet
        • SVMLightDataSet.DEFAULT_LOAD_ARGS
        • SVMLightDataSet.DEFAULT_SAVE_ARGS
        • SVMLightDataSet.__init__()
        • SVMLightDataSet.exists()
        • SVMLightDataSet.from_config()
        • SVMLightDataSet.load()
        • SVMLightDataSet.release()
        • SVMLightDataSet.resolve_load_version()
        • SVMLightDataSet.resolve_save_version()
        • SVMLightDataSet.save()
    • kedro_datasets.tensorflow.TensorFlowModelDataSet
      • TensorFlowModelDataSet
        • TensorFlowModelDataSet.DEFAULT_LOAD_ARGS
        • TensorFlowModelDataSet.DEFAULT_SAVE_ARGS
        • TensorFlowModelDataSet.__init__()
        • TensorFlowModelDataSet.exists()
        • TensorFlowModelDataSet.from_config()
        • TensorFlowModelDataSet.load()
        • TensorFlowModelDataSet.release()
        • TensorFlowModelDataSet.resolve_load_version()
        • TensorFlowModelDataSet.resolve_save_version()
        • TensorFlowModelDataSet.save()
    • kedro_datasets.text.TextDataSet
      • TextDataSet
        • TextDataSet.__init__()
        • TextDataSet.exists()
        • TextDataSet.from_config()
        • TextDataSet.load()
        • TextDataSet.release()
        • TextDataSet.resolve_load_version()
        • TextDataSet.resolve_save_version()
        • TextDataSet.save()
    • kedro_datasets.tracking.JSONDataSet
      • JSONDataSet
        • JSONDataSet.DEFAULT_SAVE_ARGS
        • JSONDataSet.__init__()
        • JSONDataSet.exists()
        • JSONDataSet.from_config()
        • JSONDataSet.load()
        • JSONDataSet.release()
        • JSONDataSet.resolve_load_version()
        • JSONDataSet.resolve_save_version()
        • JSONDataSet.save()
        • JSONDataSet.versioned
    • kedro_datasets.tracking.MetricsDataSet
      • MetricsDataSet
        • MetricsDataSet.DEFAULT_SAVE_ARGS
        • MetricsDataSet.__init__()
        • MetricsDataSet.exists()
        • MetricsDataSet.from_config()
        • MetricsDataSet.load()
        • MetricsDataSet.release()
        • MetricsDataSet.resolve_load_version()
        • MetricsDataSet.resolve_save_version()
        • MetricsDataSet.save()
        • MetricsDataSet.versioned
    • kedro_datasets.video.VideoDataSet
      • VideoDataSet
        • VideoDataSet.__init__()
        • VideoDataSet.exists()
        • VideoDataSet.from_config()
        • VideoDataSet.load()
        • VideoDataSet.release()
        • VideoDataSet.save()
    • kedro_datasets.yaml.YAMLDataSet
      • YAMLDataSet
        • YAMLDataSet.DEFAULT_SAVE_ARGS
        • YAMLDataSet.__init__()
        • YAMLDataSet.exists()
        • YAMLDataSet.from_config()
        • YAMLDataSet.load()
        • YAMLDataSet.release()
        • YAMLDataSet.resolve_load_version()
        • YAMLDataSet.resolve_save_version()
        • YAMLDataSet.save()
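Every class in the index above exposes the same core contract: construct from a `filepath` (or build via `from_config()` from a catalog entry), then `load()`, `save()`, `exists()`, and `release()`. The following is a minimal stdlib-only sketch of that contract, not Kedro's actual implementation — the class name `SketchJSONDataSet` and its internals are simplified assumptions for illustration.

```python
import json
import os
import tempfile
from pathlib import Path


class SketchJSONDataSet:
    """Toy dataset mimicking the load/save/exists/release/from_config contract."""

    def __init__(self, filepath: str):
        self._filepath = Path(filepath)

    @classmethod
    def from_config(cls, name: str, config: dict) -> "SketchJSONDataSet":
        # Kedro builds real datasets from catalog.yml entries of roughly this
        # shape (simplified here):
        #   my_data:
        #     type: json.JSONDataSet
        #     filepath: data/my_data.json
        return cls(filepath=config["filepath"])

    def exists(self) -> bool:
        # Check whether the underlying file has been written yet.
        return self._filepath.exists()

    def load(self) -> dict:
        # Deserialise the stored JSON back into Python objects.
        return json.loads(self._filepath.read_text())

    def save(self, data: dict) -> None:
        # Create parent directories on demand, then serialise to disk.
        self._filepath.parent.mkdir(parents=True, exist_ok=True)
        self._filepath.write_text(json.dumps(data))

    def release(self) -> None:
        # Real datasets invalidate any in-memory cache here; this toy caches nothing.
        pass


# Round-trip demo in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "demo.json")
ds = SketchJSONDataSet.from_config("demo", {"filepath": path})
ds.save({"answer": 42})
print(ds.exists(), ds.load())
```

The versioned datasets in the index add `resolve_load_version()` / `resolve_save_version()` on top of this contract, and writer-only classes (e.g. `MatplotlibWriter`, `HoloviewsWriter`) implement `save()` but raise on `load()`.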

Development

  • Set up Visual Studio Code
  • Set up PyCharm
  • Kedro’s command line interface
  • Debugging
  • Automated Testing
  • Code formatting and linting

Revision 0293dc15.

Built with Sphinx using a theme provided by Read the Docs.