| chromadb.ChromaDBDataset | ChromaDBDataset loads and saves data to ChromaDB vector database collections. |
| databricks.ExternalTableDataset | ExternalTableDataset accesses external tables in Databricks. |
| langchain.LangChainPromptDataset | LangChainPromptDataset loads a LangChain prompt template. |
| langfuse.LangfusePromptDataset | LangfusePromptDataset provides a seamless integration between local prompt files (JSON/YAML) and Langfuse prompt management, supporting version control, labeling, and different synchronization policies. |
| langfuse.LangfuseTraceDataset | LangfuseTraceDataset provides Langfuse tracing clients for LLM observability and monitoring. |
| netcdf.NetCDFDataset | NetCDFDataset loads/saves data from/to a NetCDF file using an underlying filesystem (e.g., local, S3, GCS). It uses xarray to handle the NetCDF file. |
| opik.OpikPromptDataset | OpikPromptDataset manages prompts with Opik versioning support, returning either raw SDK objects or LangChain templates. |
| opik.OpikTraceDataset | OpikTraceDataset provides Opik tracing clients for observability and monitoring. |
| optuna.StudyDataset | StudyDataset loads/saves an Optuna study, enabling distributed hyperparameter tuning. |
| pypdf.PDFDataset | PDFDataset loads data from PDF files using pypdf to extract text from pages. Read-only dataset. |
| polars.PolarsDatabaseDataset | PolarsDatabaseDataset accesses databases as Polars DataFrames. It supports reading from a SQL query and writing to a database table. |
| prophet.ProphetModelDataset | ProphetModelDataset loads/saves Facebook Prophet models to a JSON file using an underlying filesystem (e.g., local, S3, GCS). It uses Prophet's built-in serialisation to handle the JSON file. |
| pytorch.PyTorchDataset | PyTorchDataset loads and saves PyTorch models' state_dict using PyTorch's recommended zipfile serialization protocol, to avoid security issues with pickle. |
| rioxarray.GeoTIFFDataset | GeoTIFFDataset loads and saves raster data files as xarray DataArrays. Supports single and multiband GeoTIFFs with CRS validation. |
| safetensors.SafetensorsDataset | SafetensorsDataset loads and saves data using the SafeTensors library with support for multiple backends like numpy and torch. |
| video.VideoDataset | VideoDataset loads and saves video data as a sequence of images using OpenCV, supporting various codecs and formats. |
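
Each name in the table has the form `module.ClassName`. As a rough illustration of how such a name might be turned into a catalog-style entry, the sketch below expands the short name into a fully qualified type string. The package prefix `kedro_datasets_experimental`, the helper `catalog_entry`, and the `filepath` option are assumptions made for this example; none of them is stated in the table, so check the installed package before relying on them.

```python
# Assumption: the table does not name the package these datasets live in.
# This prefix is a guess used for illustration only.
PACKAGE_PREFIX = "kedro_datasets_experimental"


def catalog_entry(short_name: str, **options) -> dict:
    """Build a catalog-style entry from a 'module.ClassName' short name.

    Hypothetical helper: it only assembles a dictionary and does not
    import or instantiate any dataset class.
    """
    module, _, class_name = short_name.partition(".")
    if not module or not class_name:
        raise ValueError(f"expected 'module.ClassName', got {short_name!r}")
    return {"type": f"{PACKAGE_PREFIX}.{module}.{class_name}", **options}


# 'filepath' is an illustrative option; each dataset defines its own arguments.
entry = catalog_entry("netcdf.NetCDFDataset", filepath="data/01_raw/example.nc")
print(entry)
```

The helper keeps the short name from the table as the single source of truth, so a typo such as a missing module part fails early with a `ValueError` instead of producing a malformed type string.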