Project

kedro.framework.project

The kedro.framework.project module provides utilities to configure a Kedro project and access its settings.

Function            Description
configure_logging   Configure logging according to the logging_config dictionary.
configure_project   Configure a Kedro project by populating its settings with values defined in settings.py and pipeline_registry.py.
find_pipelines      Automatically find modular pipelines having a create_pipeline function.
validate_settings   Eagerly validate that the settings module is importable if it exists.

kedro.framework.project.configure_logging

configure_logging(logging_config)

Configure logging according to the logging_config dictionary.

Source code in kedro/framework/project/__init__.py
def configure_logging(logging_config: dict[str, Any]) -> None:
    """Configure logging according to ``logging_config`` dictionary."""
    LOGGING.configure(logging_config)
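As a rough illustration of what a logging_config dictionary might look like, the sketch below assumes it follows the standard library's logging.config.dictConfig schema. The logger name "demo_project" and the formatter are hypothetical, and logging.config.dictConfig is used as a stand-in for configure_logging so that the snippet runs without Kedro installed.

```python
import logging
import logging.config

# Hypothetical configuration; "demo_project" and "simple" are illustrative names.
logging_config = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "simple": {"format": "%(asctime)s - %(name)s - %(levelname)s - %(message)s"}
    },
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            "level": "INFO",
            "formatter": "simple",
        }
    },
    "loggers": {"demo_project": {"level": "INFO", "handlers": ["console"]}},
}

# Stand-in for configure_logging(logging_config) (an assumption made so that
# the sketch does not depend on Kedro being importable).
logging.config.dictConfig(logging_config)

demo_logger = logging.getLogger("demo_project")
demo_logger.info("Logging configured")
```

In a Kedro project, the same dictionary would presumably be passed to configure_logging instead of calling dictConfig directly.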

kedro.framework.project.configure_project

configure_project(package_name, preserve_logging=False)

Configure a Kedro project by populating its settings with values defined in the user's settings.py and pipeline_registry.py.

Parameters:

  • package_name (str) –

    The name of the project package.

  • preserve_logging (bool, default: False ) –

    If True, skip re-applying the logging configuration when setting up the project logger. Useful in long-running processes (e.g. FastAPI apps) where custom handlers are added at runtime and must not be overwritten on repeated calls to configure_project().

Source code in kedro/framework/project/__init__.py
def configure_project(package_name: str, preserve_logging: bool = False) -> None:
    """Configure a Kedro project by populating its settings with values
    defined in user's settings.py and pipeline_registry.py.

    Args:
        package_name: The name of the project package.
        preserve_logging: If True, skip re-applying the logging configuration when
            setting up the project logger. Useful in long-running processes (e.g.
            FastAPI apps) where custom handlers are added at runtime and must not
            be overwritten on repeated calls to ``configure_project()``.
    """
    settings_module = f"{package_name}.settings"
    settings.configure(settings_module)

    pipelines_module = f"{package_name}.pipeline_registry"
    pipelines.configure(pipelines_module)

    # Once the project is successfully configured once, store PACKAGE_NAME as a
    # global variable to make it easily accessible. This is used by validate_settings()
    # below, and also by ParallelRunner on Windows, as package_name is required every
    # time a new subprocess is spawned.
    global PACKAGE_NAME  # noqa: PLW0603
    PACKAGE_NAME = package_name

    if PACKAGE_NAME:
        LOGGING.set_project_logging(PACKAGE_NAME, preserve_logging=preserve_logging)
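The motivation for preserve_logging can be sketched with the standard library alone. The example below is not Kedro's implementation: apply_project_logging is a hypothetical stand-in, and it assumes that re-applying the configuration behaves like logging.config.dictConfig, which replaces the handlers of every logger named in the configuration.

```python
import logging
import logging.config

# Hypothetical logging configuration for a project logger named "demo_app".
PROJECT_LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {"console": {"class": "logging.StreamHandler"}},
    "loggers": {"demo_app": {"level": "INFO", "handlers": ["console"]}},
}


def apply_project_logging(preserve_logging: bool = False) -> None:
    """Hypothetical stand-in: re-apply the config unless asked to preserve it."""
    if preserve_logging:
        return
    logging.config.dictConfig(PROJECT_LOGGING)


apply_project_logging()  # initial setup
app_logger = logging.getLogger("demo_app")

# A handler added at runtime, e.g. by a long-running web application.
runtime_handler = logging.NullHandler()
app_logger.addHandler(runtime_handler)

# Skipping re-application leaves the runtime handler in place.
apply_project_logging(preserve_logging=True)
kept = runtime_handler in app_logger.handlers

# Re-applying the configuration replaces the logger's handlers.
apply_project_logging(preserve_logging=False)
dropped = runtime_handler not in app_logger.handlers
```

Under these assumptions, a process that calls configure_project() repeatedly would pass preserve_logging=True on later calls to keep handlers it attached itself.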

kedro.framework.project.find_pipelines

find_pipelines(raise_errors=False, pipelines_to_find=None)

Automatically find modular pipelines having a create_pipeline function. By default, projects created using Kedro 0.18.3 and higher call this function to autoregister pipelines upon creation/addition.

Projects that require more fine-grained control can still define the pipeline registry without calling this function. Alternatively, they can modify the mapping generated by the find_pipelines function.

For more information on the pipeline registry and autodiscovery, see https://docs.kedro.org/en/stable/build/pipeline_registry/
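A minimal sketch of a pipeline_registry.py that modifies the generated mapping might look as follows. It assumes that Pipeline objects can be combined with sum(), as in the default project template; the import is placed inside the function only so the snippet can be loaded without Kedro installed.

```python
from typing import Any


def register_pipelines() -> dict[str, Any]:
    """Return a mapping from pipeline names to Pipeline objects."""
    # Imported inside the function only so this sketch loads without Kedro;
    # a real pipeline_registry.py would normally import at module level.
    from kedro.framework.project import find_pipelines

    pipelines = find_pipelines()
    # Adjust the generated mapping: make "__default__" run every discovered
    # pipeline (assumes Pipeline supports addition via sum()).
    pipelines["__default__"] = sum(pipelines.values())
    return pipelines
```

Projects that need finer control could instead build the dictionary by hand and skip find_pipelines entirely.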

Parameters:

  • raise_errors (bool, default: False ) –

    If True, raise an error upon failed discovery.

  • pipelines_to_find (list[str] | None, default: None ) –

    Optional list of pipeline names to load selectively. If it is None or contains "__default__", all pipelines are loaded.

Returns:

  • dict[str, Pipeline]

    A generated mapping from pipeline names to Pipeline objects.

Raises:

  • RuntimeError

    When the project has not been configured (i.e. PACKAGE_NAME is None).

  • ImportError

    When a module does not expose a create_pipeline function, the create_pipeline function does not return a Pipeline object, or if the module import fails up front. If raise_errors is False, see Warns section instead.

Warns:

  • UserWarning

    When a module does not expose a create_pipeline function, the create_pipeline function does not return a Pipeline object, or if the module import fails up front. If raise_errors is True, see Raises section instead.
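The selection rule for pipelines_to_find described under Parameters can be sketched in isolation. The helper name resolve_requested and the pipeline names are hypothetical; only the rule itself (None or a list containing "__default__" means "load everything") comes from the documentation above.

```python
from typing import Optional


def resolve_requested(pipelines_to_find: Optional[list[str]]) -> Optional[set[str]]:
    """Return None when everything should load, else the set of requested names."""
    load_all = pipelines_to_find is None or "__default__" in pipelines_to_find
    return None if load_all else set(pipelines_to_find)


# No filter given: load all pipelines.
all_by_default = resolve_requested(None)
# "__default__" present: the other names are ignored and all pipelines load.
all_by_marker = resolve_requested(["__default__", "reporting"])
# Only the named pipelines are loaded.
only_some = resolve_requested(["reporting", "data_science"])
```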

Source code in kedro/framework/project/__init__.py
def find_pipelines(  # noqa: PLR0912, PLR0915
    raise_errors: bool = False, pipelines_to_find: list[str] | None = None
) -> dict[str, Pipeline]:
    """Automatically find modular pipelines having a ``create_pipeline``
    function. By default, projects created using Kedro 0.18.3 and higher
    call this function to autoregister pipelines upon creation/addition.

    Projects that require more fine-grained control can still define the
    pipeline registry without calling this function. Alternatively, they
    can modify the mapping generated by the ``find_pipelines`` function.

    For more information on the pipeline registry and autodiscovery, see
    https://docs.kedro.org/en/stable/build/pipeline_registry/

    Args:
        raise_errors: If ``True``, raise an error upon failed discovery.
        pipelines_to_find: Optional list of pipeline names to load selectively.
            If ``None`` or contains ``"__default__"``, all pipelines are loaded.

    Returns:
        A generated mapping from pipeline names to ``Pipeline`` objects.

    Raises:
        RuntimeError: When the project has not been configured (i.e.
            ``PACKAGE_NAME`` is ``None``).
        ImportError: When a module does not expose a ``create_pipeline``
            function, the ``create_pipeline`` function does not return a
            ``Pipeline`` object, or if the module import fails up front.
            If ``raise_errors`` is ``False``, see Warns section instead.

    Warns:
        UserWarning: When a module does not expose a ``create_pipeline``
            function, the ``create_pipeline`` function does not return a
            ``Pipeline`` object, or if the module import fails up front.
            If ``raise_errors`` is ``True``, see Raises section instead.
    """
    if PACKAGE_NAME is None:
        raise RuntimeError(
            "'find_pipelines' cannot be called before the project is configured. "
            "Call 'configure_project' first."
        )

    # Determine if specific pipelines were requested
    load_all = pipelines_to_find is None or "__default__" in pipelines_to_find
    requested_pipelines: set[str] | None = None if load_all else set(pipelines_to_find)  # type: ignore[arg-type]

    pipelines_dict: dict[str, Pipeline] = {}

    if load_all:
        # Handle the simplified project structure found in several starters.
        pipeline_obj = None
        pipeline_module_name = f"{PACKAGE_NAME}.pipeline"
        try:
            pipeline_module = importlib.import_module(pipeline_module_name)
        except Exception as exc:
            if str(exc) != f"No module named '{pipeline_module_name}'":
                if raise_errors:
                    raise ImportError(
                        f"An error occurred while importing the "
                        f"'{pipeline_module_name}' module."
                    ) from exc

                warnings.warn(
                    IMPORT_ERROR_MESSAGE.format(
                        module=pipeline_module_name, tb_exc=traceback.format_exc()
                    )
                )
        else:
            pipeline_obj = _create_pipeline(pipeline_module)

        pipelines_dict["__default__"] = pipeline_obj or pipeline([])

    # Handle the case that a project doesn't have a pipelines directory.
    try:
        pipelines_package = importlib.resources.files(f"{PACKAGE_NAME}.pipelines")
    except ModuleNotFoundError as exc:
        if str(exc) == f"No module named '{PACKAGE_NAME}.pipelines'":
            if requested_pipelines is not None:
                missing_str = ", ".join(sorted(requested_pipelines))
                error_msg = f"Pipeline(s) not found: {missing_str}"
                if raise_errors:
                    raise KeyError(error_msg) from exc
                warnings.warn(error_msg)
                return {}
            return pipelines_dict

    seen: set[str] = set()
    for pipeline_dir in pipelines_package.iterdir():
        if not pipeline_dir.is_dir():
            continue

        pipeline_name = pipeline_dir.name
        if pipeline_name == "__pycache__":
            continue
        # Prevent imports of hidden directories/files
        if pipeline_name.startswith("."):
            continue

        if requested_pipelines is not None and pipeline_name not in requested_pipelines:
            continue

        seen.add(pipeline_name)
        pipeline_module_name = f"{PACKAGE_NAME}.pipelines.{pipeline_name}"
        try:
            pipeline_module = importlib.import_module(pipeline_module_name)
        except Exception as exc:
            if raise_errors:
                raise ImportError(
                    f"An error occurred while importing the "
                    f"'{pipeline_module_name}' module."
                ) from exc

            warnings.warn(
                IMPORT_ERROR_MESSAGE.format(
                    module=pipeline_module_name, tb_exc=traceback.format_exc()
                )
            )
            continue

        pipeline_obj = _create_pipeline(pipeline_module)
        if pipeline_obj is not None:
            pipelines_dict[pipeline_name] = pipeline_obj
        elif raise_errors:
            raise KeyError(f"Pipeline '{pipeline_name}' not found")

    if requested_pipelines is not None:
        for pipeline_name in requested_pipelines - seen:
            pipeline_module_name = f"{PACKAGE_NAME}.pipelines.{pipeline_name}"
            error_msg = f"An error occurred while importing the '{pipeline_module_name}' module."
            if raise_errors:
                raise ImportError(error_msg)
            warnings.warn(
                f"{error_msg} Nothing defined therein will be returned by 'find_pipelines'."
            )

    return pipelines_dict
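The directory filtering performed by the discovery loop above can be tried out without a Kedro project. The sketch below uses a temporary directory in place of the pipelines package; candidate_pipeline_names is a hypothetical helper, and the results are sorted only to make the output deterministic (the source does not sort).

```python
import tempfile
from pathlib import Path


def candidate_pipeline_names(pipelines_dir: Path) -> list[str]:
    """List sub-directory names that would be considered for import."""
    names = []
    for entry in pipelines_dir.iterdir():
        if not entry.is_dir():
            continue  # plain files such as __init__.py are ignored
        if entry.name == "__pycache__" or entry.name.startswith("."):
            continue  # skip bytecode caches and hidden directories
        names.append(entry.name)
    return sorted(names)  # sorted for a stable result in this sketch


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("data_processing", "reporting", "__pycache__", ".ipynb_checkpoints"):
        (root / name).mkdir()
    (root / "__init__.py").touch()
    found = candidate_pipeline_names(root)
```

Each surviving name would then be imported as a module under the pipelines package and passed through the create_pipeline check.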

kedro.framework.project.validate_settings

validate_settings()

Eagerly validate that the settings module is importable if it exists. This is desirable to surface any syntax or import errors early. In particular, without eagerly importing the settings module, dynaconf would silence any import error (e.g. a missing dependency or a missing/mislabelled pipeline), and users would instead get the cryptic error message "Expected an instance of `ConfigLoader`, got `NoneType` instead". More info on the dynaconf issue: https://github.com/dynaconf/dynaconf/issues/460

Source code in kedro/framework/project/__init__.py
def validate_settings() -> None:
    """Eagerly validate that the settings module is importable if it exists. This is desirable to
    surface any syntax or import errors early. In particular, without eagerly importing
    the settings module, dynaconf would silence any import error (e.g. missing
    dependency, missing/mislabelled pipeline), and users would instead get a cryptic
    error message ``Expected an instance of `ConfigLoader`, got `NoneType` instead``.
    More info on the dynaconf issue: https://github.com/dynaconf/dynaconf/issues/460
    """
    if PACKAGE_NAME is None:
        raise ValueError(
            "Package name not found. Make sure you have configured the project using "
            "'bootstrap_project'. This should happen automatically if you are using "
            "Kedro command line interface."
        )
    # Check if file exists, if it does, validate it.
    if importlib.util.find_spec(f"{PACKAGE_NAME}.settings") is not None:
        importlib.import_module(f"{PACKAGE_NAME}.settings")
    else:
        logger = logging.getLogger(__name__)
        logger.warning("No 'settings.py' found, defaults will be used.")
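The "locate, then import eagerly" pattern used above can be sketched for an arbitrary module name. validate_module is a hypothetical helper, and the module names are chosen only so that the snippet runs anywhere; it is an illustration of the pattern under those assumptions, not a replacement for validate_settings.

```python
import importlib
import importlib.util
import logging

logger = logging.getLogger(__name__)


def validate_module(module_name: str) -> bool:
    """Import the module if it can be located; return whether it was found."""
    if importlib.util.find_spec(module_name) is None:
        logger.warning("No module %r found, defaults will be used.", module_name)
        return False
    # Importing eagerly surfaces syntax or import errors here, not later.
    importlib.import_module(module_name)
    return True


found_json = validate_module("json")
found_missing = validate_module("module_that_should_not_exist_xyz")
```

A module that exists but fails to import would raise from import_module, which is the early failure the eager check is meant to expose.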