langchain.OpenAIEmbeddingsDataset
kedro_datasets.langchain.OpenAIEmbeddingsDataset
OpenAIEmbeddingsDataset(credentials={}, kwargs={})
Bases: AbstractDataset[None, OpenAIEmbeddings]
OpenAIEmbeddingsDataset loads a LangChain OpenAIEmbeddings model.
Example usage for the YAML API
catalog.yml
text_embedding_ada_002:
  type: langchain.OpenAIEmbeddingsDataset
  kwargs:
    model: "text-embedding-ada-002"
  credentials: openai  # Optional, can use environment variables instead
credentials.yml (optional if using environment variables)
If credentials are passed through credentials.yml, they take precedence over environment variables.
openai:
  base_url: <openai-api-base>  # Optional, defaults to OpenAI default
  api_key: <openai-api-key>  # Optional if OPENAI_API_KEY is set
Or use environment variables:
export OPENAI_API_KEY=<your-api-key>
export OPENAI_API_BASE=<openai-api-base> # Optional
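The precedence rule described above (values from credentials.yml win over environment variables) can be sketched as a small lookup. This is only an illustration under stated assumptions: resolve_openai_settings is a hypothetical helper written for this example, not a function in kedro_datasets or LangChain, and the real resolution happens inside the dataset and the underlying client.

```python
import os
from typing import Any, Mapping, Optional


def resolve_openai_settings(
    credentials: Optional[Mapping[str, Any]] = None,
    environ: Optional[Mapping[str, str]] = None,
) -> dict:
    """Hypothetical helper: pick each setting from credentials first, then the environment."""
    env = os.environ if environ is None else environ
    creds = credentials or {}
    return {
        # An explicit credential takes precedence; otherwise fall back to the variable.
        "api_key": creds.get("api_key") or env.get("OPENAI_API_KEY"),
        "base_url": creds.get("base_url") or env.get("OPENAI_API_BASE"),
    }


# Illustrative values only; no real key or endpoint is used here.
example_env = {"OPENAI_API_KEY": "env-key", "OPENAI_API_BASE": "https://env.example"}
settings = resolve_openai_settings({"api_key": "file-key"}, example_env)
```

Here api_key comes from the credentials entry while base_url falls back to the environment, because the credentials entry does not define it.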
Example usage for the Python API
from kedro_datasets.langchain import OpenAIEmbeddingsDataset
# With explicit credentials
embeddings = OpenAIEmbeddingsDataset(
    credentials={
        "base_url": "<openai-api-base>",
        "api_key": "<openai-api-key>",
    },
    kwargs={
        "model": "text-embedding-ada-002",
    },
).load()
# Or without credentials (using environment variables)
embeddings = OpenAIEmbeddingsDataset(
    kwargs={
        "model": "text-embedding-ada-002",
    },
).load()
# See: https://python.langchain.com/docs/integrations/text_embedding/openai
embeddings.embed_query("Hello world!")
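In a pipeline, the loaded model would usually be handed to a node function as an input. The sketch below shows one possible shape for such a node. It assumes only that the object exposes embed_documents, which is part of LangChain's Embeddings interface. The names embed_texts and FakeEmbeddings are hypothetical and exist only so the example runs offline without an API key.

```python
from __future__ import annotations

from typing import Protocol


class SupportsEmbedDocuments(Protocol):
    """Minimal interface the node relies on (matches LangChain's Embeddings.embed_documents)."""

    def embed_documents(self, texts: list[str]) -> list[list[float]]: ...


def embed_texts(
    embeddings: SupportsEmbedDocuments, texts: list[str]
) -> dict[str, list[float]]:
    """Hypothetical node function: map each input text to its embedding vector."""
    vectors = embeddings.embed_documents(texts)
    return dict(zip(texts, vectors))


class FakeEmbeddings:
    """Offline stand-in for the loaded model; NOT a real embedding."""

    def embed_documents(self, texts: list[str]) -> list[list[float]]:
        # Toy two-dimensional "vector": character count and number of spaces.
        return [[float(len(t)), float(t.count(" "))] for t in texts]


result = embed_texts(FakeEmbeddings(), ["Hello world!", "Kedro"])
```

In a real project the first argument would be the object returned by load(), supplied through the catalog entry rather than constructed inside the node.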
Parameters:
- credentials (Optional, default: {}) – contains api_key and base_url. If not provided, will use environment variables OPENAI_API_KEY and OPENAI_API_BASE.
- kwargs (dict[str, Any], default: {}) – keyword arguments passed to the underlying constructor.
Source code in kedro_datasets/langchain/openai_embeddings_dataset.py
_describe
_describe()
Returns a description of the dataset.
Returns:
- dict[str, Any] – Dictionary containing the kwargs passed to the OpenAI constructor.
Source code in kedro_datasets/langchain/openai_embeddings_dataset.py
load
load()
Load and return an OpenAI model instance.
Constructs an OpenAI instance using the provided kwargs and optional credentials. If credentials are not provided, the OpenAI instance will automatically use environment variables OPENAI_API_KEY and OPENAI_API_BASE for authentication.
Returns:
- OPENAI_TYPE – A configured OpenAI model instance.
Source code in kedro_datasets/langchain/openai_embeddings_dataset.py
save
save(data)
Save operation is not supported for OpenAI datasets.
Raises:
- DatasetError – Always raised as this dataset is read-only.
Source code in kedro_datasets/langchain/openai_embeddings_dataset.py
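The read-only contract described for _describe and save can be illustrated with a self-contained sketch. It is not the code in openai_embeddings_dataset.py: ReadOnlyModelDataset is a hypothetical class, the error message is invented for illustration, and DatasetError is redefined locally as a stand-in for kedro.io.DatasetError so that the example runs without Kedro installed.

```python
from typing import Any, Dict, Optional


class DatasetError(Exception):
    """Local stand-in for kedro.io.DatasetError (assumption for this sketch)."""


class ReadOnlyModelDataset:
    """Hypothetical dataset that can describe itself but refuses to save."""

    def __init__(self, kwargs: Optional[Dict[str, Any]] = None):
        self.kwargs = dict(kwargs or {})

    def describe(self) -> Dict[str, Any]:
        # Return a copy of the constructor kwargs so callers cannot mutate them.
        return dict(self.kwargs)

    def save(self, data: Any) -> None:
        # Every call fails: there is nothing meaningful to persist for a model handle.
        raise DatasetError(f"{type(self).__name__} is a read-only dataset type")


dataset = ReadOnlyModelDataset(kwargs={"model": "text-embedding-ada-002"})
description = dataset.describe()

try:
    dataset.save(None)
    save_failed = False
except DatasetError:
    save_failed = True
```

Code that writes to catalog entries generically may want to catch DatasetError around save calls, since an entry of this type always rejects them.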