Common use cases

Use Hooks to extend a node’s behaviour

You can use the before_node_run and after_node_run Hooks to add extra behaviour before and after a node’s execution. You can apply this behaviour to an individual node, to a subset of nodes selected by their tags or namespaces, or to every node in a Kedro pipeline. For example, suppose we want to add the following extra behaviour to a node:

from kedro.pipeline.node import Node


def say_hello(node: Node):
    """An extra behaviour for a node to say hello before running."""
    print(f"Hello from {node.name}")

Then you can either add it to a single node based on the node’s name:

# src/<package_name>/hooks.py

from kedro.framework.hooks import hook_impl
from kedro.pipeline.node import Node


class ProjectHooks:
    @hook_impl
    def before_node_run(self, node: Node):
        # adding extra behaviour to a single node
        if node.name == "hello":
            say_hello(node)

Or add it to a group of nodes based on their tags:

# src/<package_name>/hooks.py

from kedro.framework.hooks import hook_impl
from kedro.pipeline.node import Node


class ProjectHooks:
    @hook_impl
    def before_node_run(self, node: Node):
        if "hello" in node.tags:
            say_hello(node)

Or add it to all nodes in the entire pipeline:

# src/<package_name>/hooks.py

from kedro.framework.hooks import hook_impl
from kedro.pipeline.node import Node


class ProjectHooks:
    @hook_impl
    def before_node_run(self, node: Node):
        # adding extra behaviour to all nodes in the pipeline
        say_hello(node)
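
The introduction also mentions selecting nodes by namespace. A minimal sketch of that variant follows, assuming the node exposes its namespace as node.namespace (a string or None) and that nested namespaces are dot-separated. The in_namespace helper, the NamespaceHooks class, and the data_science namespace are hypothetical names chosen for illustration, and the try/except around the import is only there so the sketch also runs where Kedro is not installed.

```python
from types import SimpleNamespace
from typing import Optional

try:
    from kedro.framework.hooks import hook_impl
except ImportError:  # fallback so the sketch runs without Kedro installed

    def hook_impl(func):
        return func


def in_namespace(node_namespace: Optional[str], target: str) -> bool:
    """Return True if node_namespace is target or is nested below it."""
    if node_namespace is None:
        return False
    return node_namespace == target or node_namespace.startswith(f"{target}.")


def say_hello(node) -> None:
    """An extra behaviour for a node to say hello before running."""
    print(f"Hello from {node.name}")


class NamespaceHooks:
    @hook_impl
    def before_node_run(self, node):
        # adding extra behaviour to every node in the (hypothetical)
        # data_science namespace, including nested namespaces
        if in_namespace(node.namespace, "data_science"):
            say_hello(node)


# Stand-in objects carrying only the two attributes used above,
# in place of real Kedro nodes.
inside = SimpleNamespace(name="train_model", namespace="data_science.training")
outside = SimpleNamespace(name="clean_data", namespace=None)

hooks = NamespaceHooks()
hooks.before_node_run(node=inside)   # prints "Hello from train_model"
hooks.before_node_run(node=outside)  # prints nothing
```

Matching on the exact name or on the name followed by a dot avoids accidentally selecting a sibling namespace such as data_science_v2.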

If your use case takes advantage of a decorator, for example to retry a node’s execution using a library such as tenacity, you can still decorate the node’s function directly:

from tenacity import retry


@retry
def my_flaky_node_function():
    ...

Or apply it in the before_node_run Hook as follows:

# src/<package_name>/hooks.py
from tenacity import retry

from kedro.framework.hooks import hook_impl
from kedro.pipeline.node import Node


class ProjectHooks:
    @hook_impl
    def before_node_run(self, node: Node):
        # adding retrying behaviour to nodes tagged as flaky
        if "flaky" in node.tags:
            node.func = retry(node.func)
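
Both forms apply the same wrapper: placing a decorator above a function definition is equivalent to reassigning the function to the decorator's return value, which is what the Hook does to node.func. The sketch below illustrates that equivalence with a hand-written, bounded retry decorator built only on the standard library. It is not tenacity, and retry_up_to, make_flaky, and StubNode are hypothetical names that belong to neither tenacity nor Kedro.

```python
import functools
from typing import Any, Callable


def retry_up_to(attempts: int) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Build a decorator that calls the function up to `attempts` times."""

    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(func)  # keep the original __name__ and docstring
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    if attempt == attempts:
                        raise  # out of attempts: surface the last error

        return wrapper

    return decorator


def make_flaky(failures: int) -> Callable[[], str]:
    """Return a function that raises `failures` times and then succeeds."""
    state = {"calls": 0}

    def flaky() -> str:
        state["calls"] += 1
        if state["calls"] <= failures:
            raise RuntimeError("transient failure")
        return f"succeeded on call {state['calls']}"

    return flaky


class StubNode:
    """Stand-in with a writable `func` attribute, in place of a Kedro node."""

    def __init__(self, func: Callable[..., Any], tags: set) -> None:
        self.func = func
        self.tags = set(tags)


# Form 1: wrap the function itself (same effect as @retry_up_to(3) above a def).
decorated = retry_up_to(3)(make_flaky(failures=2))

# Form 2: wrap the function held by the node, as the Hook above does.
node = StubNode(make_flaky(failures=2), tags={"flaky"})
if "flaky" in node.tags:
    node.func = retry_up_to(3)(node.func)

print(decorated())  # succeeded on call 3
print(node.func())  # succeeded on call 3
```

A bound like this keeps a node that can never succeed from looping forever. tenacity's bare @retry retries indefinitely unless a stop condition is configured, so in practice you may want to pass one when using it in either form.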

Use Hooks to customise the dataset load and save methods

We recommend using the before_dataset_loaded/after_dataset_loaded and before_dataset_saved/after_dataset_saved Hooks to customise the dataset load and save methods where appropriate.

For example, you can log the time at which each dataset load starts and ends as follows:

# src/<package_name>/hooks.py
import logging
import time
from typing import Any

from kedro.framework.hooks import hook_impl


class ProjectHooks:
    """Log the time at which each dataset load starts and ends."""

    @property
    def _logger(self):
        return logging.getLogger(self.__class__.__name__)

    @hook_impl
    def before_dataset_loaded(self, dataset_name: str) -> None:
        start = time.time()
        self._logger.info("Loading dataset %s started at %0.3f", dataset_name, start)

    @hook_impl
    def after_dataset_loaded(self, dataset_name: str, data: Any) -> None:
        end = time.time()
        self._logger.info("Loading dataset %s ended at %0.3f", dataset_name, end)
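
The snippet above logs two timestamps and leaves the subtraction to the reader. A possible variant that reports the elapsed time directly, and covers the save Hooks as well, is sketched below. It assumes the before and after calls for a given dataset arrive in pairs on the same Hook instance. DatasetTimingHooks, its durations attribute, and the _start and _stop helpers are hypothetical names, and the try/except around the import is only there so the sketch also runs where Kedro is not installed.

```python
import logging
import time
from typing import Any, Dict, Tuple

try:
    from kedro.framework.hooks import hook_impl
except ImportError:  # fallback so the sketch runs without Kedro installed

    def hook_impl(func):
        return func


class DatasetTimingHooks:
    """Log how long each dataset takes to load and to save."""

    def __init__(self) -> None:
        self._starts: Dict[Tuple[str, str], float] = {}
        # Most recent duration in seconds, keyed by (operation, dataset_name).
        self.durations: Dict[Tuple[str, str], float] = {}

    @property
    def _logger(self) -> logging.Logger:
        return logging.getLogger(self.__class__.__name__)

    def _start(self, operation: str, dataset_name: str) -> None:
        # perf_counter is monotonic, so it suits measuring intervals.
        self._starts[(operation, dataset_name)] = time.perf_counter()

    def _stop(self, operation: str, dataset_name: str) -> None:
        key = (operation, dataset_name)
        elapsed = time.perf_counter() - self._starts.pop(key)
        self.durations[key] = elapsed
        self._logger.info(
            "%s of dataset %s took %0.3f seconds", operation, dataset_name, elapsed
        )

    @hook_impl
    def before_dataset_loaded(self, dataset_name: str) -> None:
        self._start("load", dataset_name)

    @hook_impl
    def after_dataset_loaded(self, dataset_name: str, data: Any) -> None:
        self._stop("load", dataset_name)

    @hook_impl
    def before_dataset_saved(self, dataset_name: str, data: Any) -> None:
        self._start("save", dataset_name)

    @hook_impl
    def after_dataset_saved(self, dataset_name: str, data: Any) -> None:
        self._stop("save", dataset_name)
```

Because the start times are keyed by dataset name, two overlapping loads of the same dataset (as could happen under concurrent execution) would overwrite each other's start time, so treat this as a starting point for sequential runs.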