pyseekdb.utils.embedding_functions.OpenAIBaseEmbeddingFunction

class pyseekdb.utils.embedding_functions.OpenAIBaseEmbeddingFunction(model_name: str, api_key_env: str | None = None, api_base: str | None = None, dimensions: int | None = None, **kwargs: Any)

Bases: EmbeddingFunction[str | list[str]]

Base embedding function for OpenAI-compatible embedding APIs.

This class provides a common implementation for embedding functions that use OpenAI-compatible APIs. It uses the openai package to make API calls.

Subclasses should override:

  • _get_default_api_base(): Return the default API base URL.

  • _get_default_api_key_env(): Return the default API key environment variable name.

  • _get_model_dimensions(): Return a dict mapping model names to their default dimensions.

Subclasses may also override __init__ to set model-specific defaults.

Example:

.. code-block:: python

    import pyseekdb
    from pyseekdb.utils.embedding_functions import OpenAIBaseEmbeddingFunction

    class MyEmbeddingFunction(OpenAIBaseEmbeddingFunction):
        def _get_default_api_base(self):
            # Default base URL of the OpenAI-compatible endpoint
            return "https://api.example.com/v1"

        def _get_default_api_key_env(self):
            # Name of the environment variable that holds the API key
            return "MY_API_KEY"

        def _get_model_dimensions(self):
            # Known models and their default embedding dimensions
            return {"model-v1": 1536, "model-v2": 1024}

__init__(model_name: str, api_key_env: str | None = None, api_base: str | None = None, dimensions: int | None = None, **kwargs: Any)

Initialize OpenAIBaseEmbeddingFunction.

Parameters:
  • model_name (str) – Name of the embedding model.

  • api_key_env (str, optional) – Name of the environment variable containing the API key. Defaults to the value returned by _get_default_api_key_env().

  • api_base (str, optional) – Base URL for the API endpoint. Defaults to the value returned by _get_default_api_base().

  • dimensions (int, optional) – The number of dimensions the resulting embeddings should have. Can reduce dimensions from default for models that support it.

  • **kwargs – Additional arguments to pass to the OpenAI client. Common options include:

      - timeout: Request timeout in seconds
      - max_retries: Maximum number of retries

    See https://github.com/openai/openai-python for more options.
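The way these arguments fall back to subclass defaults can be sketched roughly as follows. This is an illustration under assumptions, not the pyseekdb implementation: the class `SettingsSketch`, the method `client_kwargs`, and the decision to raise when the variable is unset are all hypothetical. Only the hook names `_get_default_api_base()` and `_get_default_api_key_env()` and the `timeout` and `max_retries` options come from the text above.

```python
import os
from typing import Any, Optional


class SettingsSketch:
    """Hypothetical stand-in; not the pyseekdb implementation."""

    def _get_default_api_base(self) -> str:
        return "https://api.example.com/v1"

    def _get_default_api_key_env(self) -> str:
        return "MY_API_KEY"

    def client_kwargs(
        self,
        api_key_env: Optional[str] = None,
        api_base: Optional[str] = None,
        **kwargs: Any,
    ) -> dict[str, Any]:
        # Explicit arguments take precedence over the subclass defaults.
        env_name = api_key_env or self._get_default_api_key_env()
        base_url = api_base or self._get_default_api_base()
        api_key = os.environ.get(env_name)
        if api_key is None:
            # Assumption: fail early instead of sending an unauthenticated request.
            raise ValueError(f"environment variable {env_name} is not set")
        # Extra options such as timeout and max_retries pass through unchanged.
        return {"api_key": api_key, "base_url": base_url, **kwargs}


os.environ["MY_API_KEY"] = "sk-placeholder"  # placeholder value for the demo
settings = SettingsSketch().client_kwargs(timeout=30, max_retries=2)
```

The resulting dictionary has the shape accepted by the `openai.OpenAI(...)` client constructor, which takes `api_key`, `base_url`, `timeout`, and `max_retries` as keyword arguments.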

Methods

  • __init__(model_name[, api_key_env, ...]) – Initialize OpenAIBaseEmbeddingFunction.

  • get_config() – Get the configuration dictionary for the OpenAIBaseEmbeddingFunction.

  • support_persistence(embedding_function) – Check if the embedding function supports persistence.

Attributes

  • dimension – Get the dimension of embeddings produced by this function.

property dimension: int

Get the dimension of embeddings produced by this function.

For models with a known dimension, the value is returned without making an API call. If the dimensions parameter was specified, that value is returned. Otherwise, the default dimension for the model is returned.

If the model is not in the known dimensions list, falls back to calling the parent’s dimension detection (which may make an API call).

Returns:

The dimension of embeddings for this model.

Return type:

int
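The lookup order described for `dimension` can be sketched as a standalone function. This is an illustration only, not the library's code: `resolve_dimension`, `known_dimensions`, and `detect` are hypothetical names, and `detect` stands in for the parent's detection step, which may make an API call.

```python
from typing import Callable, Optional


def resolve_dimension(
    model_name: str,
    dimensions: Optional[int],
    known_dimensions: dict[str, int],
    detect: Callable[[], int],
) -> int:
    """Hypothetical helper mirroring the documented lookup order."""
    # 1. An explicit `dimensions` argument always wins.
    if dimensions is not None:
        return dimensions
    # 2. Otherwise use the model's known default, if there is one.
    if model_name in known_dimensions:
        return known_dimensions[model_name]
    # 3. Unknown model: fall back to detection (may cost an API call).
    return detect()


known = {"model-v1": 1536, "model-v2": 1024}
explicit = resolve_dimension("model-v1", 256, known, detect=lambda: 0)
default = resolve_dimension("model-v2", None, known, detect=lambda: 0)
detected = resolve_dimension("other-model", None, known, detect=lambda: 768)
```

Here `explicit` takes the requested 256, `default` takes the table entry for `model-v2`, and `detected` comes from the fallback callable because `other-model` is not in the table.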

get_config() → dict[str, Any]

Get the configuration dictionary for the OpenAIBaseEmbeddingFunction.

Subclasses should override the name() method to provide the correct name for routing.

Returns:

Dictionary containing configuration needed to restore this embedding function
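As an illustration of what "configuration needed to restore" could look like, the sketch below builds a plain dictionary and round-trips it through JSON. The function `build_config` and the key names are assumptions, not the library's actual schema. The sketch stores the name of the environment variable rather than the key itself, which is also an assumption about the design.

```python
import json
from typing import Any, Optional


def build_config(
    model_name: str,
    api_key_env: str,
    api_base: str,
    dimensions: Optional[int] = None,
) -> dict[str, Any]:
    """Hypothetical config builder; key names are illustrative."""
    return {
        "model_name": model_name,
        "api_key_env": api_key_env,  # name of the variable, never the secret
        "api_base": api_base,
        "dimensions": dimensions,
    }


config = build_config("model-v1", "MY_API_KEY", "https://api.example.com/v1", 512)
restored = json.loads(json.dumps(config))  # JSON-safe values survive the round trip
```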