kedro_datasets_experimental

Dataset Classes

Name Description
chromadb.ChromaDBDataset ChromaDBDataset loads and saves data to ChromaDB vector database collections.
databricks.ExternalTableDataset ExternalTableDataset accesses external tables in Databricks.
langchain.PromptDataset PromptDataset loads a langchain prompt template.
langfuse.EvaluationDataset EvaluationDataset manages Langfuse evaluation datasets for LLM experiment workflows, supporting local file syncing and remote dataset versioning.
langfuse.PromptDataset PromptDataset provides a seamless integration between local prompt files (JSON/YAML) and Langfuse prompt management, supporting version control, labeling, and different synchronization policies.
langfuse.TraceDataset TraceDataset provides Langfuse tracing clients for LLM observability and monitoring.
mlrun.MLRunAbstractDataset MLRunAbstractDataset is the base class for MLRun datasets; it can also be used directly for generic artifacts.
mlrun.MLRunModel MLRunModel saves and loads ML models via MLRun with framework metadata and configurable file format.
mlrun.MLRunDataframeDataset MLRunDataframeDataset saves and loads pandas DataFrames as MLRun artifacts.
mlrun.MLRunResult MLRunResult logs scalar results and metrics to MLRun with optional nested dict flattening.
netcdf.NetCDFDataset NetCDFDataset loads/saves data from/to a NetCDF file using an underlying filesystem (e.g., local, S3, GCS). It uses xarray to handle the NetCDF file.
opik.EvaluationDataset EvaluationDataset manages Opik evaluation datasets for LLM experiment workflows.
opik.PromptDataset PromptDataset manages prompts with Opik versioning support, returning either raw SDK objects or LangChain templates.
opik.TraceDataset TraceDataset provides Opik tracing clients for observability and monitoring.
optuna.StudyDataset StudyDataset loads/saves an Optuna study, enabling distributed hyperparameter tuning.
pypdf.PDFDataset PDFDataset loads data from PDF files using pypdf to extract text from pages. Read-only dataset.
polars.PolarsDatabaseDataset PolarsDatabaseDataset accesses databases as Polars DataFrames. It supports reading from a SQL query and writing to a database table.
prophet.ProphetModelDataset ProphetModelDataset loads/saves Facebook Prophet models to a JSON file using an underlying filesystem (e.g., local, S3, GCS). It uses Prophet's built-in serialisation to handle the JSON file.
pytorch.PyTorchDataset PyTorchDataset loads and saves PyTorch models' state_dict using PyTorch's recommended zipfile serialization protocol, to avoid security issues with pickle.
rioxarray.GeoTIFFDataset Loads and saves raster data files as xarray DataArrays. Supports single and multiband GeoTIFFs with CRS validation.
safetensors.SafetensorsDataset Loads and saves data using the SafeTensors library with support for multiple backends like numpy and torch.
video.VideoDataset Loads and saves video data as a sequence of images using OpenCV, supporting various codecs and formats.
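
The names in the table read as `submodule.ClassName` relative to the `kedro_datasets_experimental` package. As a minimal sketch, assuming that convention holds for every row, the helpers below build the fully qualified name (the form typically used for a `type:` entry in a Kedro catalog) and resolve the class lazily. `qualified_name` and `resolve_dataset` are hypothetical helper names introduced here for illustration; they are not part of the package.

```python
from importlib import import_module

# Assumption: every name in the table lives under this top-level package.
PACKAGE = "kedro_datasets_experimental"


def qualified_name(short_name: str, package: str = PACKAGE) -> str:
    """Turn a table entry such as 'netcdf.NetCDFDataset' into a dotted path."""
    return f"{package}.{short_name}"


def resolve_dataset(short_name: str, package: str = PACKAGE):
    """Import the submodule and return the class named by the table entry.

    Raises ImportError if the package (or the optional dependencies of the
    chosen dataset) is not installed, and AttributeError if the class name
    does not exist in the submodule.
    """
    module_path, _, class_name = qualified_name(short_name, package).rpartition(".")
    module = import_module(module_path)
    return getattr(module, class_name)


if __name__ == "__main__":
    # Building the name needs no installation; resolving the class does.
    print(qualified_name("netcdf.NetCDFDataset"))
```

Resolving a class this way only imports the one submodule requested, which matters here because each dataset pulls in its own optional third-party dependency (xarray, Optuna, OpenCV, and so on).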