kedro_datasets_experimental

Dataset Classes

| Name | Description |
| --- | --- |
| chromadb.ChromaDBDataset | ChromaDBDataset loads and saves data to ChromaDB vector database collections. |
| databricks.ExternalTableDataset | ExternalTableDataset implementation to access external tables in Databricks. |
| langchain.LangChainPromptDataset | LangChainPromptDataset loads a LangChain prompt template. |
| langfuse.LangfusePromptDataset | LangfusePromptDataset provides a seamless integration between local prompt files (JSON/YAML) and Langfuse prompt management, supporting version control, labeling, and different synchronization policies. |
| langfuse.LangfuseTraceDataset | LangfuseTraceDataset provides Langfuse tracing clients for LLM observability and monitoring. |
| netcdf.NetCDFDataset | NetCDFDataset loads/saves data from/to a NetCDF file using an underlying filesystem (e.g., local, S3, GCS). It uses xarray to handle the NetCDF file. |
| opik.OpikPromptDataset | OpikPromptDataset manages prompts with Opik versioning support, returning either raw SDK objects or LangChain templates. |
| opik.OpikTraceDataset | OpikTraceDataset provides Opik tracing clients for observability and monitoring. |
| optuna.StudyDataset | StudyDataset loads/saves an Optuna study, enabling distributed hyperparameter tuning. |
| pypdf.PDFDataset | PDFDataset loads data from PDF files using pypdf to extract text from pages. This dataset is read-only. |
| polars.PolarsDatabaseDataset | PolarsDatabaseDataset implementation to access databases as Polars DataFrames. It supports reading from a SQL query and writing to a database table. |
| prophet.ProphetModelDataset | ProphetModelDataset loads/saves Facebook Prophet models to a JSON file using an underlying filesystem (e.g., local, S3, GCS). It uses Prophet's built-in serialisation to handle the JSON file. |
| pytorch.PyTorchDataset | PyTorchDataset loads and saves PyTorch models' state_dict using PyTorch's recommended zipfile serialization protocol, to avoid security issues with Pickle. |
| rioxarray.GeoTIFFDataset | Loads and saves raster data files as xarray DataArrays. Supports single and multiband GeoTIFFs with CRS validation. |
| safetensors.SafetensorsDataset | Loads and saves data using the SafeTensors library with support for multiple backends like numpy and torch. |
| video.VideoDataset | Loads and saves video data as a sequence of images using OpenCV, supporting various codecs and formats. |
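
Each name in the table has the form `<submodule>.<ClassName>`. As a minimal sketch, assuming the usual Kedro convention that a catalog entry's `type` is the fully qualified class path, the snippet below turns a table name into that path and builds a catalog-style dictionary. The helpers `catalog_type` and `catalog_entry` and the file path are hypothetical and exist only for illustration; they are not part of `kedro_datasets_experimental`, and the options each dataset accepts should be checked on its own page.

```python
# Package that hosts the dataset classes listed in the table.
PACKAGE = "kedro_datasets_experimental"


def catalog_type(name: str) -> str:
    """Return the fully qualified class path for a '<submodule>.<ClassName>' name."""
    submodule, _, class_name = name.partition(".")
    if not submodule or not class_name:
        raise ValueError(f"expected '<submodule>.<ClassName>', got {name!r}")
    return f"{PACKAGE}.{submodule}.{class_name}"


def catalog_entry(name: str, **options) -> dict:
    """Build a catalog-style dict; `options` are passed through unchanged."""
    return {"type": catalog_type(name), **options}


# Hypothetical entry: the file path is illustrative, not taken from the docs.
example = catalog_entry("netcdf.NetCDFDataset", filepath="data/01_raw/example.nc")
print(example["type"])  # kedro_datasets_experimental.netcdf.NetCDFDataset
```

The resulting dictionary mirrors the shape of a single entry in a Kedro `catalog.yml`, where the `type` key selects the dataset class and the remaining keys are passed to its constructor.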