Learn Kedro with hands-on video

If you like to learn from video, you can follow our hands-on course “Introduction to Kedro: Building Maintainable Data Pipelines” on YouTube.

The course is divided into sections, each made up of short videos that cover a specific Kedro topic. You’ll walk through the spaceflights tutorial and work hands-on with the example. Along the way, you’ll learn key Kedro concepts like datasets and the Kedro Data Catalog, nodes and pipelines, and configuration management.

Who is this course for?

This course is for data scientists, data engineers and machine learning engineers. You can be junior, mid-level or senior in your field of work. You’re likely to be hands-on with projects, or a decision-maker who regularly makes design and implementation choices about Python data products.

We assume you know these concepts:

  • Python basics (coding in Jupyter and other notebook interfaces)

  • Manipulating data with pandas

  • Visualising insights

  • Command line basics

We don’t assume knowledge of software engineering in Python, so the course contains information about reusability principles, how to create a Python package, and how to use version control.

We do expect you to have Git installed, because it is a prerequisite for the kedro new flow used to create a new project.
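If you are not sure whether Git is available on your machine, a quick check from Python can confirm it before you start. The following is a minimal sketch, not part of the course material: the helper name git_version is hypothetical, and it only assumes that a git executable on your PATH responds to the standard --version flag.

```python
import shutil
import subprocess
from typing import Optional


def git_version() -> Optional[str]:
    """Return the output of `git --version`, or None if Git is not on PATH.

    `git_version` is an illustrative helper name, not a Kedro function.
    """
    git_path = shutil.which("git")
    if git_path is None:
        # No executable called "git" was found on PATH.
        return None
    result = subprocess.run(
        [git_path, "--version"],
        capture_output=True,
        text=True,
        check=False,
    )
    # An empty string means Git ran but printed nothing; treat it as missing.
    return result.stdout.strip() or None


if __name__ == "__main__":
    version = git_version()
    if version is None:
        print("Git was not found on PATH; install it before running kedro new.")
    else:
        print(f"Found {version}")
```

Running git --version directly in a terminal gives you the same information; the Python version is only convenient if you are already working in a notebook.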

What you’ll learn

In short, you’ll learn answers to the following:

  • Introduction to Kedro

  • What is Kedro? How does it help you create maintainable, reusable data science code?

  • How does Kedro fit into the data science ecosystem?

  • What do you need to do to create a Kedro project?

  • How can you refactor a Jupyter notebook to a Kedro project?

  • How do you package Python code as a library?

  • How do you work with Kedro projects in VS Code?

  • What are namespaces and dataset factories?

  • What is needed to deploy a Kedro project using container solutions like Docker and open source orchestrators like Airflow?

  • What are Kedro plugins?

  • How can you contribute to Kedro?

You don’t need to register for the course. You can skip between sections to find help on a particular area as you pick up the skills you need to build your own Kedro projects.

Index of videos

“Introduction to Kedro: Building Maintainable Data Pipelines” is split into the following videos:

Part 0: Introduction

  1. Data science in production: the good, the bad and the ugly

  2. What is Kedro?

  3. Kedro and data orchestrators

  4. How does Kedro fit into the data science ecosystem?

Part 1: Get started with Kedro

  1. Create a Kedro project from scratch

  2. The spaceflights starter

  3. Use Kedro from Jupyter notebook

  4. Set up the Kedro Data Catalog

  5. Explore the spaceflights data

  6. Refactor your data processing code into functions

  7. Create your first data pipeline with Kedro

  8. Assemble your nodes into a Kedro pipeline

  9. Run your Kedro pipeline

  10. Visualise your data pipeline with Kedro-Viz

Part 2: Make complex Kedro pipelines

  1. Merge different dataframes in Kedro

  2. Predict prices using machine learning

  3. Refactor your data science code into functions

  4. How to work with parameters in Kedro

  5. Create a Kedro pipeline with parameters

  6. Reuse your Kedro pipeline using namespaces

  7. Kedro pipeline runners

  8. Create Kedro datasets dynamically using factories

Part 3: Ship your Kedro project to production

  1. Define your own Kedro environments

  2. Use S3 and MinIO cloud storage with Kedro

  3. Package your Kedro project into a Python wheel

  4. Turn your Kedro project into a Docker container

  5. Deploy your Kedro project to Apache Airflow

Part 4: Where next?

Continue your Kedro journey