
Welcome to Kedro’s documentation!
Learn about Kedro
Tutorial and basic Kedro usage
- Experiment tracking in Kedro-Viz
- Kedro versions supporting experiment tracking
- When should I use experiment tracking in Kedro?
- Set up a project
- Set up the session store
- Collaborative experiment tracking
- Set up experiment tracking datasets
- Modify your nodes and pipelines to log metrics
- Generate the run data
- Access run data and compare runs
- View and compare plots
- View and compare metrics data
Kedro projects
Advanced usage
- PySpark integration
- Centralise Spark configuration in conf/base/spark.yml
- Initialise a SparkSession using a hook
- Use Kedro’s built-in Spark datasets to load and save raw data
- Spark and Delta Lake interaction
- Use MemoryDataset for intermediary DataFrame
- Use MemoryDataset with copy_mode="assign" for non-DataFrame Spark objects
- Tips for maximising concurrency using ThreadRunner
Contribute to Kedro
API documentation
Kedro is a framework that makes it easy to build robust and scalable data pipelines by providing uniform project templates, data abstraction, configuration and pipeline assembly.