Text Embedding Model Configuration

Pre-trained Text Embedding Model Configuration

This configuration supports most of the pre-trained text embedding models available through SentenceTransformer. Example configuration files are shown below:

all-mpnet-base-v2

gfmrag/workflow/config/text_emb_model/mpnet.yaml
_target_: gfmrag.text_emb_models.BaseTextEmbModel
text_emb_model_name: sentence-transformers/all-mpnet-base-v2
normalize: False
batch_size: 32
query_instruct: null
passage_instruct: null
model_kwargs: null

BAAI/bge-large-en

gfmrag/workflow/config/text_emb_model/bge_large_en.yaml
_target_: gfmrag.text_emb_models.BaseTextEmbModel
text_emb_model_name: BAAI/bge-large-en
normalize: True
batch_size: 32
query_instruct: "Represent this sentence for searching relevant passages: "
passage_instruct: null
model_kwargs: null

| Parameter | Options | Note |
| --- | --- | --- |
| _target_ | gfmrag.text_emb_models.BaseTextEmbModel | The class name of the text embedding model. |
| text_emb_model_name | None | The name of the pre-trained text embedding model. |
| normalize | True, False | Whether to normalize the embeddings. |
| batch_size | Positive integer (32 in the examples above) | The batch size used when encoding text. |
| query_instruct | None | The instruction for the query. |
| passage_instruct | None | The instruction for the passage. |
| model_kwargs | {} | The additional model arguments. |
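
To make the roles of `query_instruct`, `passage_instruct`, `batch_size`, and `normalize` concrete, here is a minimal sketch in plain Python. It is not the gfmrag implementation: `toy_embed`, `add_instruction`, `batched`, `l2_normalize`, and `encode` are hypothetical names introduced only for illustration, and a real embedding model may apply the instruction differently (for example, as a prompt handled inside the model). The sketch assumes the instruction is simply prepended to each text and that normalization means scaling each vector to unit L2 length.

```python
import math
from typing import Callable, Iterator, List, Optional


def add_instruction(texts: List[str], instruct: Optional[str]) -> List[str]:
    """Prefix every text with the instruction; a null instruction changes nothing."""
    if not instruct:
        return list(texts)
    return [instruct + text for text in texts]


def batched(items: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def l2_normalize(vec: List[float]) -> List[float]:
    """Scale a vector to unit L2 norm (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def toy_embed(text: str) -> List[float]:
    """Hypothetical stand-in for a real model: [length, vowel count]."""
    vowels = sum(ch in "aeiou" for ch in text.lower())
    return [float(len(text)), float(vowels)]


def encode(
    texts: List[str],
    instruct: Optional[str] = None,
    batch_size: int = 32,
    normalize: bool = False,
    embed: Callable[[str], List[float]] = toy_embed,
) -> List[List[float]]:
    """Apply the instruction, process texts batch by batch, optionally normalize."""
    vectors: List[List[float]] = []
    for batch in batched(add_instruction(texts, instruct), batch_size):
        for text in batch:
            vec = embed(text)
            vectors.append(l2_normalize(vec) if normalize else vec)
    return vectors


# Query side: mirrors the BAAI/bge-large-en example (instruction set, normalize on).
query_instruct = "Represent this sentence for searching relevant passages: "
query_vectors = encode(["who wrote Hamlet?"], instruct=query_instruct, normalize=True)

# Passage side: passage_instruct is null, so the text is embedded as-is.
passage_vectors = encode(["Hamlet is a tragedy."], instruct=None, normalize=True)
```

With `normalize` enabled, every vector has unit length, so a dot product between a query vector and a passage vector equals their cosine similarity. That is one common reason to turn the option on for retrieval-oriented models.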

Nvidia Embedding Model Configuration

This configuration supports the Nvidia embedding models. An example of an Nvidia embedding model configuration file is shown below:

nvidia/NV-Embed-v2

gfmrag/workflow/config/text_emb_model/nv_embed_v2.yaml
_target_: gfmrag.text_emb_models.NVEmbedV2
text_emb_model_name: nvidia/NV-Embed-v2
normalize: True
batch_size: 32
query_instruct: "Instruct: Given a question, retrieve entities that can help answer the question\nQuery: "
passage_instruct: null
model_kwargs:
  torch_dtype: bfloat16

| Parameter | Options | Note |
| --- | --- | --- |
| _target_ | gfmrag.text_emb_models.NVEmbedV2 | The class name of the Nvidia embedding model. |
| text_emb_model_name | nvidia/NV-Embed-v2 | The name of the Nvidia embedding model. |
| normalize | True, False | Whether to normalize the embeddings. |
| batch_size | Positive integer (32 in the example above) | The batch size used when encoding text. |
| query_instruct | Instruct: Given a question, retrieve entities that can help answer the question\nQuery: | The instruction for the query. |
| passage_instruct | None | The instruction for the passage. |
| model_kwargs | {} | The additional model arguments. |
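
The `_target_` key in these files looks like the Hydra convention, in which a dotted path names the class to construct and the remaining keys are passed as keyword arguments. The sketch below imitates that mechanism with the standard library only, to show how a configuration maps onto a constructor call. It rests on assumptions: `resolve_target` and `instantiate` are hypothetical helpers written for this example, and `types.SimpleNamespace` stands in for `gfmrag.text_emb_models.NVEmbedV2` so the snippet runs without gfmrag or a GPU. In a real Hydra project, `hydra.utils.instantiate` would normally perform this step.

```python
import importlib
from typing import Any, Dict


def resolve_target(path: str) -> Any:
    """Import the module part of a dotted path and return the named attribute."""
    module_name, _, attr = path.rpartition(".")
    if not module_name:
        raise ValueError(f"Expected a dotted path such as 'pkg.Class', got {path!r}")
    return getattr(importlib.import_module(module_name), attr)


def instantiate(config: Dict[str, Any]) -> Any:
    """Call the class named by _target_ with all other keys as keyword arguments."""
    kwargs = {key: value for key, value in config.items() if key != "_target_"}
    return resolve_target(config["_target_"])(**kwargs)


# The same keys as nv_embed_v2.yaml, written as a Python dict.
config = {
    # Stand-in target (assumption); the real file uses gfmrag.text_emb_models.NVEmbedV2.
    "_target_": "types.SimpleNamespace",
    "text_emb_model_name": "nvidia/NV-Embed-v2",
    "normalize": True,
    "batch_size": 32,
    "query_instruct": (
        "Instruct: Given a question, retrieve entities that can help "
        "answer the question\nQuery: "
    ),
    "passage_instruct": None,
    # Kept as a plain string here; a real loader would have to map it to a dtype.
    "model_kwargs": {"torch_dtype": "bfloat16"},
}

model = instantiate(config)
print(model.text_emb_model_name)
```

Because every key other than `_target_` becomes a constructor argument, a misspelled parameter name in the YAML file would typically surface as an unexpected-keyword error when the class is constructed, not as a silently ignored setting.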