Project

kedro.framework.project

The kedro.framework.project module provides utilities to configure a Kedro project and access its settings.

Function | Description
configure_logging | Configure logging according to the logging_config dictionary.
configure_project | Configure a Kedro project by populating its settings with values defined in settings.py and pipeline_registry.py.
find_pipelines | Automatically find modular pipelines having a create_pipeline function.
validate_settings | Eagerly validate that the settings module is importable if it exists.

kedro.framework.project.configure_logging

configure_logging(logging_config)

Configure logging according to the logging_config dictionary.

Source code in kedro/framework/project/__init__.py
def configure_logging(logging_config: dict[str, Any]) -> None:
    """Configure logging according to ``logging_config`` dictionary."""
    LOGGING.configure(logging_config)
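
As a rough illustration of what a logging_config dictionary can look like, the sketch below builds one in the standard library's dictConfig schema and applies it with logging.config.dictConfig. It assumes configure_logging accepts that same schema, and the logger name my_project is hypothetical. Inside a Kedro project the final call would be configure_logging(logging_config) rather than the stand-in used here.

```python
import logging
import logging.config

# Hypothetical project logger name, used for illustration only.
PROJECT_LOGGER = "my_project"

logging_config = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "simple": {"format": "%(asctime)s %(name)s %(levelname)s %(message)s"},
    },
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            "formatter": "simple",
            "level": "INFO",
        },
    },
    "loggers": {
        PROJECT_LOGGER: {
            "level": "DEBUG",
            "handlers": ["console"],
            "propagate": False,
        },
    },
}

# Stand-in for configure_logging(logging_config), so the sketch runs without Kedro.
logging.config.dictConfig(logging_config)

project_logger = logging.getLogger(PROJECT_LOGGER)
project_logger.info("logging configured")
```

The logger accepts DEBUG records, but the console handler only emits INFO and above, so the two level settings act as separate filters.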

kedro.framework.project.configure_project

configure_project(package_name, preserve_logging=False)

Configure a Kedro project by populating its settings with values defined in the user's settings.py and pipeline_registry.py.

Parameters:

  • package_name (str) –

    The name of the project package.

  • preserve_logging (bool, default: False ) –

    If True, skip re-applying the logging configuration when setting up the project logger. Useful in long-running processes (e.g. FastAPI apps) where custom handlers are added at runtime and must not be overwritten on repeated calls to configure_project().
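
To see why preserve_logging can matter, the following standard-library sketch shows a handler added at runtime disappearing when a dictionary-based logging configuration is applied a second time. It assumes the configuration is applied through logging.config.dictConfig and is not Kedro's implementation. The logger name my_service and the function name reapply_drops_runtime_handler are hypothetical.

```python
import logging
import logging.config

# Hypothetical logger name standing in for a project package.
SERVICE_LOGGER = "my_service"

BASE_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    "loggers": {SERVICE_LOGGER: {"level": "INFO"}},
}


def reapply_drops_runtime_handler() -> tuple[bool, bool]:
    """Report whether a runtime handler is attached before and after
    the same configuration is applied a second time."""
    logging.config.dictConfig(BASE_CONFIG)
    logger = logging.getLogger(SERVICE_LOGGER)

    # A handler attached at runtime, e.g. by a long-running web application.
    runtime_handler = logging.NullHandler()
    logger.addHandler(runtime_handler)
    attached_before = runtime_handler in logger.handlers

    # Re-applying a non-incremental config resets the handlers of every
    # logger it names, which removes the runtime handler.
    logging.config.dictConfig(BASE_CONFIG)
    attached_after = runtime_handler in logger.handlers
    return attached_before, attached_after


before, after = reapply_drops_runtime_handler()
```

Skipping the second application, which is what preserve_logging=True is described as doing, would leave the runtime handler in place.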

Source code in kedro/framework/project/__init__.py
def configure_project(package_name: str, preserve_logging: bool = False) -> None:
    """Configure a Kedro project by populating its settings with values
    defined in user's settings.py and pipeline_registry.py.

    Args:
        package_name: The name of the project package.
        preserve_logging: If True, skip re-applying the logging configuration when
            setting up the project logger. Useful in long-running processes (e.g.
            FastAPI apps) where custom handlers are added at runtime and must not
            be overwritten on repeated calls to ``configure_project()``.
    """
    settings_module = f"{package_name}.settings"
    settings.configure(settings_module)

    pipelines_module = f"{package_name}.pipeline_registry"
    pipelines.configure(pipelines_module)

    # After the project has been configured successfully, store PACKAGE_NAME as a
    # global variable to make it easily accessible. This is used by validate_settings()
    # below, and also by ParallelRunner on Windows, as package_name is required every
    # time a new subprocess is spawned.
    global PACKAGE_NAME  # noqa: PLW0603
    PACKAGE_NAME = package_name

    if PACKAGE_NAME:
        LOGGING.set_project_logging(PACKAGE_NAME, preserve_logging=preserve_logging)

kedro.framework.project.find_pipelines

find_pipelines(raise_errors=False, pipelines_to_find=None)

Automatically find modular pipelines having a create_pipeline function. By default, projects created using Kedro 0.18.3 and higher call this function to autoregister pipelines upon creation/addition.

Projects that require more fine-grained control can still define the pipeline registry without calling this function. Alternatively, they can modify the mapping generated by the find_pipelines function.

For more information on the pipeline registry and autodiscovery, see https://docs.kedro.org/en/stable/build/pipeline_registry/
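
One way to modify the generated mapping is to rebuild its "__default__" entry from a subset of the discovered pipelines. The sketch below is only an illustration: rebuild_default is a hypothetical helper, and plain lists stand in for Pipeline objects so that it runs without Kedro. In a real pipeline_registry.py the input would come from find_pipelines() and the values would be Pipeline objects, which can also be combined with +.

```python
from functools import reduce
from operator import add
from typing import Iterable, TypeVar

T = TypeVar("T")


def rebuild_default(found: dict[str, T], exclude: Iterable[str] = ()) -> dict[str, T]:
    """Return a copy of ``found`` whose "__default__" entry combines every
    named pipeline except those listed in ``exclude``."""
    skipped = set(exclude) | {"__default__"}
    selected = [obj for name, obj in found.items() if name not in skipped]
    result = dict(found)
    if selected:
        # Combine the selected entries pairwise with ``+``.
        result["__default__"] = reduce(add, selected)
    return result


# Stand-in for the mapping returned by find_pipelines(); lists replace Pipeline objects.
found = {
    "__default__": [],
    "data_processing": ["clean", "join"],
    "data_science": ["train", "evaluate"],
    "reporting": ["plot"],
}
registry = rebuild_default(found, exclude=["reporting"])
```

The excluded pipeline stays in the registry under its own name; it is only left out of the default run.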

Parameters:

  • raise_errors (bool, default: False ) –

    If True, raise an error upon failed discovery.

  • pipelines_to_find (list[str] | None, default: None ) –

    Optional list of pipeline names to load selectively. If None or contains "__default__", all pipelines are loaded.
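
The selection rule for pipelines_to_find can be sketched as a small function that mirrors the first lines of the source listing. The name resolve_selection is hypothetical and exists only for this illustration.

```python
from typing import Optional


def resolve_selection(pipelines_to_find: Optional[list[str]]) -> Optional[set[str]]:
    """Return None when every pipeline should be loaded, otherwise the
    set of requested pipeline names."""
    load_all = pipelines_to_find is None or "__default__" in pipelines_to_find
    return None if load_all else set(pipelines_to_find)


everything = resolve_selection(None)
with_default = resolve_selection(["reporting", "__default__"])
subset = resolve_selection(["reporting", "data_science"])
nothing = resolve_selection([])
```

Note that an empty list is not treated as "load everything": it resolves to an empty set, so no pipeline matches.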

Returns:

  • dict[str, Pipeline]

    A generated mapping from pipeline names to Pipeline objects.

Raises:

  • RuntimeError

    When the project has not been configured (i.e. PACKAGE_NAME is None).

  • ImportError

    When a module does not expose a create_pipeline function, the create_pipeline function does not return a Pipeline object, or if the module import fails up front. If raise_errors is False, see Warns section instead.

  • KeyError

    In the source below, also raised when raise_errors is True and either specific pipelines were requested but the project has no pipelines package, or _create_pipeline returns None for a discovered module.

Warns:

  • UserWarning

    When a module does not expose a create_pipeline function, the create_pipeline function does not return a Pipeline object, or if the module import fails up front. If raise_errors is True, see Raises section instead.

Source code in kedro/framework/project/__init__.py
def find_pipelines(  # noqa: PLR0912, PLR0915
    raise_errors: bool = False, pipelines_to_find: list[str] | None = None
) -> dict[str, Pipeline]:
    """Automatically find modular pipelines having a ``create_pipeline``
    function. By default, projects created using Kedro 0.18.3 and higher
    call this function to autoregister pipelines upon creation/addition.

    Projects that require more fine-grained control can still define the
    pipeline registry without calling this function. Alternatively, they
    can modify the mapping generated by the ``find_pipelines`` function.

    For more information on the pipeline registry and autodiscovery, see
    https://docs.kedro.org/en/stable/build/pipeline_registry/

    Args:
        raise_errors: If ``True``, raise an error upon failed discovery.
        pipelines_to_find: Optional list of pipeline names to load selectively.
            If ``None`` or contains ``"__default__"``, all pipelines are loaded.

    Returns:
        A generated mapping from pipeline names to ``Pipeline`` objects.

    Raises:
        RuntimeError: When the project has not been configured (i.e.
            ``PACKAGE_NAME`` is ``None``).
        ImportError: When a module does not expose a ``create_pipeline``
            function, the ``create_pipeline`` function does not return a
            ``Pipeline`` object, or if the module import fails up front.
            If ``raise_errors`` is ``False``, see Warns section instead.

    Warns:
        UserWarning: When a module does not expose a ``create_pipeline``
            function, the ``create_pipeline`` function does not return a
            ``Pipeline`` object, or if the module import fails up front.
            If ``raise_errors`` is ``True``, see Raises section instead.
    """
    if PACKAGE_NAME is None:
        raise RuntimeError(
            "'find_pipelines' cannot be called before the project is configured. "
            "Call 'configure_project' first."
        )

    # Determine if specific pipelines were requested
    load_all = pipelines_to_find is None or "__default__" in pipelines_to_find
    requested_pipelines: set[str] | None = None if load_all else set(pipelines_to_find)  # type: ignore[arg-type]

    pipelines_dict: dict[str, Pipeline] = {}

    if load_all:
        # Handle the simplified project structure found in several starters.
        pipeline_obj = None
        pipeline_module_name = f"{PACKAGE_NAME}.pipeline"
        try:
            pipeline_module = importlib.import_module(pipeline_module_name)
        except Exception as exc:
            if str(exc) != f"No module named '{pipeline_module_name}'":
                if raise_errors:
                    raise ImportError(
                        f"An error occurred while importing the "
                        f"'{pipeline_module_name}' module."
                    ) from exc

                warnings.warn(
                    IMPORT_ERROR_MESSAGE.format(
                        module=pipeline_module_name, tb_exc=traceback.format_exc()
                    )
                )
        else:
            pipeline_obj = _create_pipeline(pipeline_module)

        pipelines_dict["__default__"] = pipeline_obj or pipeline([])

    # Handle the case that a project doesn't have a pipelines directory.
    try:
        pipelines_package = importlib.resources.files(f"{PACKAGE_NAME}.pipelines")
    except ModuleNotFoundError as exc:
        if str(exc) == f"No module named '{PACKAGE_NAME}.pipelines'":
            if requested_pipelines is not None:
                missing_str = ", ".join(sorted(requested_pipelines))
                error_msg = f"Pipeline(s) not found: {missing_str}"
                if raise_errors:
                    raise KeyError(error_msg) from exc
                warnings.warn(error_msg)
                return {}
            return pipelines_dict

    seen: set[str] = set()
    for pipeline_dir in pipelines_package.iterdir():
        if not pipeline_dir.is_dir():
            continue

        pipeline_name = pipeline_dir.name
        if pipeline_name == "__pycache__":
            continue
        # Prevent imports of hidden directories/files
        if pipeline_name.startswith("."):
            continue

        if requested_pipelines is not None and pipeline_name not in requested_pipelines:
            continue

        seen.add(pipeline_name)
        pipeline_module_name = f"{PACKAGE_NAME}.pipelines.{pipeline_name}"
        try:
            pipeline_module = importlib.import_module(pipeline_module_name)
        except Exception as exc:
            if raise_errors:
                raise ImportError(
                    f"An error occurred while importing the "
                    f"'{pipeline_module_name}' module."
                ) from exc

            warnings.warn(
                IMPORT_ERROR_MESSAGE.format(
                    module=pipeline_module_name, tb_exc=traceback.format_exc()
                )
            )
            continue

        pipeline_obj = _create_pipeline(pipeline_module)
        if pipeline_obj is not None:
            pipelines_dict[pipeline_name] = pipeline_obj
        elif raise_errors:
            raise KeyError(f"Pipeline '{pipeline_name}' not found")

    if requested_pipelines is not None:
        for pipeline_name in requested_pipelines - seen:
            pipeline_module_name = f"{PACKAGE_NAME}.pipelines.{pipeline_name}"
            error_msg = f"An error occurred while importing the '{pipeline_module_name}' module."
            if raise_errors:
                raise ImportError(error_msg)
            warnings.warn(
                f"{error_msg} Nothing defined therein will be returned by 'find_pipelines'."
            )

    return pipelines_dict
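
The directory-scanning part of the listing above can be tried in isolation with a simplified, standard-library-only sketch. It is not Kedro's implementation: discover and build_demo_package are hypothetical helpers, demo_pkg is a throwaway package written to a temporary directory, and lists stand in for Pipeline objects.

```python
import importlib
import importlib.resources
import sys
import tempfile
import warnings
from pathlib import Path


def discover(package_name: str, raise_errors: bool = False) -> dict[str, object]:
    """Collect the result of create_pipeline() from each pipelines subpackage."""
    found: dict[str, object] = {}
    pipelines_root = importlib.resources.files(f"{package_name}.pipelines")
    for entry in pipelines_root.iterdir():
        name = entry.name
        # Skip plain files, bytecode caches and hidden directories.
        if not entry.is_dir() or name == "__pycache__" or name.startswith("."):
            continue
        module_name = f"{package_name}.pipelines.{name}"
        try:
            module = importlib.import_module(module_name)
        except Exception as exc:
            if raise_errors:
                raise ImportError(f"Could not import '{module_name}'.") from exc
            warnings.warn(f"Skipping '{module_name}': {exc}")
            continue
        factory = getattr(module, "create_pipeline", None)
        if callable(factory):
            found[name] = factory()
    return found


def build_demo_package(root: Path) -> str:
    """Write a throwaway package with one working and one broken pipeline."""
    pipelines = root / "demo_pkg" / "pipelines"
    sources = {
        "good": "def create_pipeline():\n    return ['node_a', 'node_b']\n",
        "broken": "raise RuntimeError('boom')\n",
    }
    for name, body in sources.items():
        (pipelines / name).mkdir(parents=True)
        (pipelines / name / "__init__.py").write_text(body)
    (root / "demo_pkg" / "__init__.py").write_text("")
    (pipelines / "__init__.py").write_text("")
    return "demo_pkg"


with tempfile.TemporaryDirectory() as tmp:
    package = build_demo_package(Path(tmp))
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        with warnings.catch_warnings(record=True) as caught:
            warnings.simplefilter("always")
            result = discover(package)
    finally:
        sys.path.remove(tmp)
```

With raise_errors left at False, the broken subpackage produces a warning and is skipped, while the working one is still returned. This is the same trade-off the raise_errors parameter of find_pipelines is documented to control.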

kedro.framework.project.validate_settings

validate_settings()

Eagerly validate that the settings module is importable if it exists. This is desirable to surface any syntax or import errors early. In particular, without eagerly importing the settings module, dynaconf would silence any import error (e.g. missing dependency, missing/mislabelled pipeline), and users would instead get a cryptic error message Expected an instance of `ConfigLoader`, got `NoneType` instead. More info on the dynaconf issue: https://github.com/dynaconf/dynaconf/issues/460

Source code in kedro/framework/project/__init__.py
def validate_settings() -> None:
    """Eagerly validate that the settings module is importable if it exists. This is desirable to
    surface any syntax or import errors early. In particular, without eagerly importing
    the settings module, dynaconf would silence any import error (e.g. missing
    dependency, missing/mislabelled pipeline), and users would instead get a cryptic
    error message ``Expected an instance of `ConfigLoader`, got `NoneType` instead``.
    More info on the dynaconf issue: https://github.com/dynaconf/dynaconf/issues/460
    """
    if PACKAGE_NAME is None:
        raise ValueError(
            "Package name not found. Make sure you have configured the project using "
            "'bootstrap_project'. This should happen automatically if you are using "
            "Kedro command line interface."
        )
    # Check if file exists, if it does, validate it.
    if importlib.util.find_spec(f"{PACKAGE_NAME}.settings") is not None:
        importlib.import_module(f"{PACKAGE_NAME}.settings")
    else:
        logger = logging.getLogger(__name__)
        logger.warning("No 'settings.py' found, defaults will be used.")
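
The find-then-import pattern used above can be sketched on its own with the standard library. The helper validate_module is hypothetical and takes an arbitrary module name instead of deriving one from PACKAGE_NAME. Note that importlib.util.find_spec imports parent packages when given a dotted name, so this sketch is exercised with top-level names only.

```python
import importlib
import importlib.util
import logging


def validate_module(module_name: str) -> bool:
    """Import ``module_name`` eagerly if it can be found.

    Returns True when the module was found and imported, False when it
    does not exist. Errors raised while importing are not caught.
    """
    if importlib.util.find_spec(module_name) is None:
        logging.getLogger(__name__).warning(
            "No module named %r found, defaults will be used.", module_name
        )
        return False
    # Any syntax or import error inside the module surfaces here, at a known point.
    importlib.import_module(module_name)
    return True


found_existing = validate_module("json")
found_missing = validate_module("no_such_settings_module_xyz")
```

Letting the import error propagate is the point of the eager check: the failure is reported where it happens rather than later as an unrelated-looking message.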