| chromadb.ChromaDBDataset |
ChromaDBDataset loads and saves data to ChromaDB vector database collections. |
| databricks.ExternalTableDataset |
ExternalTableDataset provides access to external tables in Databricks. |
| langchain.PromptDataset |
PromptDataset loads a LangChain prompt template. |
| langfuse.EvaluationDataset |
EvaluationDataset manages Langfuse evaluation datasets for LLM experiment workflows, supporting local file syncing and remote dataset versioning. |
| langfuse.PromptDataset |
PromptDataset provides a seamless integration between local prompt files (JSON/YAML) and Langfuse prompt management, supporting version control, labeling, and different synchronization policies. |
| langfuse.TraceDataset |
TraceDataset provides Langfuse tracing clients for LLM observability and monitoring. |
| mlrun.MLRunAbstractDataset |
MLRunAbstractDataset is the base class for MLRun datasets; it can also be used directly for generic artifacts. |
| mlrun.MLRunModel |
MLRunModel saves and loads ML models via MLRun with framework metadata and configurable file format. |
| mlrun.MLRunDataframeDataset |
MLRunDataframeDataset saves and loads pandas DataFrames as MLRun artifacts. |
| mlrun.MLRunResult |
MLRunResult logs scalar results and metrics to MLRun with optional nested dict flattening. |
| netcdf.NetCDFDataset |
NetCDFDataset loads/saves data from/to a NetCDF file using an underlying filesystem (e.g., local, S3, GCS). It uses xarray to handle the NetCDF file. |
| opik.EvaluationDataset |
EvaluationDataset manages Opik evaluation datasets for LLM experiment workflows. |
| opik.PromptDataset |
PromptDataset manages prompts with Opik versioning support, returning either raw SDK objects or LangChain templates. |
| opik.TraceDataset |
TraceDataset provides Opik tracing clients for observability and monitoring. |
| optuna.StudyDataset |
StudyDataset loads/saves an Optuna study, enabling distributed hyperparameter tuning. |
| pypdf.PDFDataset |
PDFDataset loads data from PDF files, using pypdf to extract text from pages. It is a read-only dataset. |
| polars.PolarsDatabaseDataset |
PolarsDatabaseDataset accesses databases as Polars DataFrames. It supports reading from a SQL query and writing to a database table. |
| prophet.ProphetModelDataset |
ProphetModelDataset loads/saves Facebook Prophet models to a JSON file using an underlying filesystem (e.g., local, S3, GCS). It uses Prophet's built-in serialisation to handle the JSON file. |
| pytorch.PyTorchDataset |
PyTorchDataset loads and saves a PyTorch model's state_dict using PyTorch's recommended zipfile serialization protocol, to avoid security issues with pickle. |
| rioxarray.GeoTIFFDataset |
GeoTIFFDataset loads and saves raster data files as xarray DataArrays. It supports single-band and multiband GeoTIFFs with CRS validation. |
| safetensors.SafetensorsDataset |
SafetensorsDataset loads and saves data using the SafeTensors library, with support for multiple backends such as numpy and torch. |
| video.VideoDataset |
VideoDataset loads and saves video data as a sequence of images using OpenCV, supporting various codecs and formats. |
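
The `mlrun.MLRunResult` entry above mentions optional nested dict flattening. As a rough illustration of what flattening a nested metrics dict usually means, here is a minimal, standard-library sketch. The function name `flatten_dict`, the `"."` separator, and the sample metrics are assumptions made for illustration; this is not MLRun's or the dataset's actual implementation, which may use a different separator or key scheme.

```python
from typing import Any


def flatten_dict(
    nested: dict[str, Any], sep: str = ".", prefix: str = ""
) -> dict[str, Any]:
    """Collapse nested dicts into a single level, joining keys with `sep`.

    Hypothetical helper for illustration only; not part of any library.
    """
    flat: dict[str, Any] = {}
    for key, value in nested.items():
        # Build the compound key, e.g. "loss" + "." + "train".
        full_key = f"{prefix}{sep}{key}" if prefix else str(key)
        if isinstance(value, dict):
            # Recurse into sub-dicts so only leaf values remain.
            flat.update(flatten_dict(value, sep=sep, prefix=full_key))
        else:
            flat[full_key] = value
    return flat


# Made-up sample metrics, used only to show the shape of the result.
metrics = {"accuracy": 0.93, "loss": {"train": 0.21, "val": 0.27}}
flat_metrics = flatten_dict(metrics)
```

With this sketch, each leaf value ends up under a single compound key (for example `loss.train`), which is the shape a scalar-metrics logger generally expects.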