langchain.ChatOpenAIDataset

kedro_datasets.langchain.ChatOpenAIDataset

ChatOpenAIDataset(credentials={}, kwargs={})

Bases: AbstractDataset[None, ChatOpenAI]

ChatOpenAIDataset loads a LangChain ChatOpenAI model. The dataset is read-only: calling save raises a DatasetError.

Example usage for the YAML API

catalog.yml

gpt_3_5_turbo:
    type: langchain.ChatOpenAIDataset
    kwargs:
        model: "gpt-3.5-turbo"
        temperature: 0.0
    credentials: openai  # Optional, can use environment variables instead

credentials.yml (optional if using environment variables)

Credentials passed through credentials.yml take precedence over environment variables.

openai:
    base_url: <openai-api-base>  # Optional, defaults to OpenAI default
    api_key: <openai-api-key>   # Optional if OPENAI_API_KEY is set

Or use environment variables:

export OPENAI_API_KEY=<your-api-key>
export OPENAI_API_BASE=<openai-api-base>  # Optional

Example usage for the Python API
from kedro_datasets.langchain import ChatOpenAIDataset

# With explicit credentials
llm = ChatOpenAIDataset(
    credentials={
        "base_url": "<openai-api-base>",
        "api_key": "<openai-api-key>",
    },
    kwargs={
        "model": "gpt-3.5-turbo",
        "temperature": 0.0,
    },
).load()

# Or without credentials (using environment variables)
llm = ChatOpenAIDataset(
    kwargs={
        "model": "gpt-3.5-turbo",
        "temperature": 0.0,
    },
).load()

# See: https://python.langchain.com/docs/integrations/chat/openai
llm.invoke("Hello world!")
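
When the dataset is declared in the catalog, a pipeline node would typically receive the object returned by load() as one of its inputs. The following is a minimal sketch of such a node function, not part of the library: greet, FakeChatModel, and FakeMessage are hypothetical names, and the fake model stands in for the loaded ChatOpenAI instance so that the example runs without an API key.

```python
from dataclasses import dataclass


@dataclass
class FakeMessage:
    """Hypothetical stand-in for the message object a chat model returns."""

    content: str


class FakeChatModel:
    """Hypothetical stand-in for the loaded ChatOpenAI instance.

    It lets the sketch run without an API key or network access.
    """

    def invoke(self, prompt: str) -> FakeMessage:
        return FakeMessage(content=f"echo: {prompt}")


def greet(llm, name: str) -> str:
    """Node-style function that receives the loaded model as an input."""
    response = llm.invoke(f"Say hello to {name}.")
    # Chat models return a message object; the text lives in `.content`.
    return response.content


reply = greet(FakeChatModel(), "Kedro")
print(reply)
```

With the real dataset, the first argument would be the ChatOpenAI instance loaded from the gpt_3_5_turbo catalog entry instead of the fake model.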

Parameters:

  • credentials (dict[str, str], optional, default: {} ) –

    Contains api_key and base_url. If not provided, the environment variables OPENAI_API_KEY and OPENAI_API_BASE are used.

  • kwargs (dict[str, Any], default: {} ) –

    Keyword arguments passed to the underlying ChatOpenAI constructor.

Source code in kedro_datasets/langchain/chat_openai_dataset.py
def __init__(self, credentials: dict[str, str] = {}, kwargs: dict[str, Any] = {}):
    """Constructor.

    Args:
        credentials (Optional): contains `api_key` and `base_url`.
            If not provided, will use environment variables OPENAI_API_KEY and OPENAI_API_BASE.
        kwargs: keyword arguments passed to the underlying constructor.
    """
    self.credentials = credentials or {}
    self.kwargs = kwargs or {}
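
The `or {}` fallback in the constructor appears to serve two purposes: it turns a None argument into an empty dict, and it keeps the shared mutable default object from being stored on the instance. A small sketch of that pattern follows, using a hypothetical Holder class that mirrors only the constructor logic and is not part of the library.

```python
from typing import Any, Optional


class Holder:
    """Hypothetical class mirroring only the constructor pattern above."""

    def __init__(self, kwargs: Optional[dict[str, Any]] = {}):
        # An empty dict or None is falsy, so a fresh dict is created here
        # and the shared default object is never stored on the instance.
        self.kwargs = kwargs or {}


first = Holder()
second = Holder()
first.kwargs["model"] = "gpt-3.5-turbo"  # mutate only the first instance

none_passed = Holder(kwargs=None)

print(second.kwargs)       # unaffected by the mutation above
print(none_passed.kwargs)  # None was replaced by an empty dict
```

Note that a non-empty dict passed by the caller is stored by reference, so later changes to that dict remain visible through the instance.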

credentials instance-attribute

credentials = credentials or {}

kwargs instance-attribute

kwargs = kwargs or {}

_describe

_describe()

Returns a description of the dataset.

Returns:

  • dict[str, Any]

    Dictionary containing the kwargs passed to the ChatOpenAI constructor, plus each credential key with its value masked as "***".

Source code in kedro_datasets/langchain/chat_openai_dataset.py
def _describe(self) -> dict[str, Any]:
    """Returns a description of the dataset.

    Returns:
        dict[str, Any]: Dictionary containing the kwargs passed to the OpenAI constructor.
    """
    credentials = (
        {k: "***" for k in self.credentials.keys()} if self.credentials else {}
    )
    return {**credentials, **self.kwargs}
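
To make the masking concrete, here is a standalone sketch of the same logic. The describe function is a hypothetical copy written for illustration, and the placeholder values are the ones used in the examples earlier on this page.

```python
from typing import Any


def describe(credentials: dict[str, str], kwargs: dict[str, Any]) -> dict[str, Any]:
    """Standalone copy of the masking logic, for illustration only."""
    masked = {key: "***" for key in credentials} if credentials else {}
    # kwargs are unpacked last, so a kwarg sharing a name with a
    # credential key would replace the masked entry.
    return {**masked, **kwargs}


description = describe(
    credentials={"base_url": "<openai-api-base>", "api_key": "<openai-api-key>"},
    kwargs={"model": "gpt-3.5-turbo", "temperature": 0.0},
)
print(description)
# {'base_url': '***', 'api_key': '***', 'model': 'gpt-3.5-turbo', 'temperature': 0.0}
```

Only the credential values are hidden; the key names and all kwargs appear in the description unchanged.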

load

load()

Load and return a ChatOpenAI model instance.

Constructs a ChatOpenAI instance using the provided kwargs and optional credentials. If credentials are not provided, the ChatOpenAI instance automatically uses the environment variables OPENAI_API_KEY and OPENAI_API_BASE for authentication.

Returns:

  • ChatOpenAI

    A configured ChatOpenAI model instance.

Source code in kedro_datasets/langchain/chat_openai_dataset.py
def load(self) -> ChatOpenAI:
    """Load and return an OpenAI model instance.

    Constructs an OpenAI instance using the provided kwargs and optional
    credentials. If credentials are not provided, the OpenAI instance
    will automatically use environment variables OPENAI_API_KEY and
    OPENAI_API_BASE for authentication.

    Returns:
        OPENAI_TYPE: A configured OpenAI model instance.
    """
    return ChatOpenAI(**self.credentials, **self.kwargs)  # type: ignore[arg-type]
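
Because load() unpacks both dictionaries into a single call, the credentials and kwargs end up as ordinary keyword arguments of the constructor. The sketch below illustrates this with a hypothetical build function standing in for ChatOpenAI, so it runs without LangChain installed. It also shows a consequence of this call style that follows from Python itself: a key present in both dictionaries raises a TypeError.

```python
from typing import Any


def build(**params: Any) -> dict[str, Any]:
    """Hypothetical stand-in for the ChatOpenAI constructor.

    It only records the keyword arguments it receives.
    """
    return params


credentials = {"api_key": "<openai-api-key>"}
kwargs = {"model": "gpt-3.5-turbo", "temperature": 0.0}

# Both dicts are unpacked into one call, as in load().
merged = build(**credentials, **kwargs)
print(merged)

# A key that appears in both dicts is rejected by Python at call time.
try:
    build(**credentials, **{"api_key": "other"})
    duplicate_error = None
except TypeError as exc:
    duplicate_error = exc
print(type(duplicate_error).__name__)
```

In practice this suggests keeping api_key and base_url in credentials only, rather than repeating them under kwargs.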

save

save(data)

Saving is not supported for ChatOpenAI datasets.

Raises:

  • DatasetError

    Always raised as this dataset is read-only.

Source code in kedro_datasets/langchain/chat_openai_dataset.py
def save(self, data: None) -> NoReturn:
    """Save operation is not supported for OpenAI datasets.

    Raises:
        DatasetError: Always raised as this dataset is read-only.
    """
    raise DatasetError(f"{self.__class__.__name__} is a read only dataset type")
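
Code that might call save on this dataset can expect the error and handle it explicitly. The sketch below reproduces the behavior with stand-ins so it has no dependencies: the local DatasetError class substitutes for Kedro's exception of the same name, and ReadOnlyDataset is a hypothetical class that mirrors only the save method shown above.

```python
from typing import NoReturn


class DatasetError(Exception):
    """Local stand-in for Kedro's DatasetError (assumption for this sketch)."""


class ReadOnlyDataset:
    """Hypothetical dataset mirroring only the save() behavior above."""

    def save(self, data: None) -> NoReturn:
        raise DatasetError(f"{self.__class__.__name__} is a read only dataset type")


try:
    ReadOnlyDataset().save(None)
    message = ""
except DatasetError as exc:
    # The message names the concrete class that rejected the save.
    message = str(exc)

print(message)
# ReadOnlyDataset is a read only dataset type
```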