pyseekdb.utils.embedding_functions.OnnxEmbeddingFunction

class pyseekdb.utils.embedding_functions.OnnxEmbeddingFunction(model_name: str, hf_model_id: str, dimension: int, download_path: Path | None = None, preferred_providers: list[str] | None = None)

Bases: object

Generic ONNX runtime embedding function.

This class handles model download, tokenizer/model loading, and embedding generation using onnxruntime.

__init__(model_name: str, hf_model_id: str, dimension: int, download_path: Path | None = None, preferred_providers: list[str] | None = None)

Initialize an ONNX embedding function.

Parameters:
  • model_name – Name of the model (used for cache directory naming).

  • hf_model_id – Hugging Face model ID.

  • dimension – Output embedding dimension.

  • download_path – Optional cache path override.

  • preferred_providers – Preferred ONNX Runtime execution providers (for example, CPUExecutionProvider).
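
As a rough illustration of the parameters above, the sketch below assembles the constructor arguments and defers construction to a function, since creating an instance may download model files. The model name, Hugging Face ID, dimension value, cache directory, and the helper name build_embedding_function are illustrative assumptions, not defaults documented here. Whether that particular Hugging Face repository ships ONNX files in the layout this class expects is also an assumption.

```python
from pathlib import Path

# Illustrative values only: the model name, Hugging Face ID, and dimension
# are assumptions and must describe a model that actually provides ONNX weights.
ONNX_KWARGS = {
    "model_name": "all-MiniLM-L6-v2",
    "hf_model_id": "sentence-transformers/all-MiniLM-L6-v2",
    "dimension": 384,
    "download_path": Path.home() / ".cache" / "onnx_models",  # hypothetical cache dir
    "preferred_providers": ["CPUExecutionProvider"],
}


def build_embedding_function(kwargs=ONNX_KWARGS):
    """Construct the embedding function; this may trigger a model download."""
    # Imported lazily so this module can be loaded without pyseekdb installed.
    from pyseekdb.utils.embedding_functions import OnnxEmbeddingFunction

    return OnnxEmbeddingFunction(**kwargs)
```

Calling build_embedding_function() should return an instance whose dimension property and max_tokens() method can then be inspected.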

Methods

  • __init__(model_name, hf_model_id, dimension) – Initialize an ONNX embedding function.

  • max_tokens() – Get the maximum number of tokens supported by the model.

Attributes

  • ARCHIVE_FILENAME

  • EXTRACTED_FOLDER_NAME

  • dimension – Get the dimension of embeddings produced by this function.

  • model – Get the model.

  • tokenizer – Get the tokenizer for the model.

property dimension: int

Get the dimension of embeddings produced by this function.
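
One possible use of this property is to validate vectors before they are stored in a fixed-width vector column. The helper check_dimension below is hypothetical and not part of pyseekdb; in practice, expected_dimension would be read from the dimension property of an instance.

```python
def check_dimension(vectors, expected_dimension):
    """Raise ValueError if any vector's length differs from expected_dimension."""
    for index, vector in enumerate(vectors):
        if len(vector) != expected_dimension:
            raise ValueError(
                f"vector {index} has length {len(vector)}, "
                f"expected {expected_dimension}"
            )
    return vectors
```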

max_tokens() → int

Get the maximum number of tokens supported by the model.
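
This page does not say whether over-long input is truncated automatically, so callers may want to split long token sequences themselves. The sketch below shows one way to do that with a hypothetical helper, split_into_windows, which is not part of pyseekdb. It ignores any special tokens a tokenizer might add, which could also count toward the limit.

```python
def split_into_windows(token_ids, max_tokens):
    """Split a token-ID sequence into consecutive windows of at most max_tokens."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    return [
        token_ids[start:start + max_tokens]
        for start in range(0, len(token_ids), max_tokens)
    ]


# In practice the limit would come from embedding_function.max_tokens();
# 4 is used here only to keep the example small.
windows = split_into_windows(list(range(10)), 4)
print(windows)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```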

property model: Any

Get the model.

Returns:

The model.

property tokenizer: Any

Get the tokenizer for the model.

Returns:

The tokenizer for the model.