Text Embedding Model Configuration
Pre-trained Text Embedding Model Configuration
This configuration supports most of the pre-trained text embedding models available through SentenceTransformer. Example text embedding model configuration files are shown below:
all-mpnet-base-v2
BAAI/bge-large-en
| Parameter | Options | Note |
|---|---|---|
| `_target_` | `gfmrag.text_emb_models.BaseTextEmbModel` | The class name of the text embedding model. |
| `text_emb_model_name` | None | The name of the pre-trained text embedding model. |
| `normalize` | `True`, `False` | Whether to normalize the embeddings. |
| `query_instruct` | None | The instruction for the query. |
| `passage_instruct` | None | The instruction for the passage. |
| `model_kwargs` | `{}` | Additional model arguments. |
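The `_target_` key follows the Hydra convention in which a dotted path names the class to construct and the remaining keys are passed as constructor arguments. The sketch below imitates that behaviour with the standard library only, as an illustration rather than the library's actual loading code. `instantiate_from_config` is a hypothetical helper, and `types.SimpleNamespace` is a stand-in target so the example runs without gfmrag installed.

```python
import importlib
from typing import Any


def instantiate_from_config(cfg: dict[str, Any]) -> Any:
    """Build the object named by cfg['_target_'], passing the other keys as kwargs.

    Hypothetical helper that mimics Hydra-style instantiation; not part of gfmrag.
    """
    params = dict(cfg)  # copy so the caller's config is not mutated
    module_path, _, class_name = params.pop("_target_").rpartition(".")
    cls = getattr(importlib.import_module(module_path), class_name)
    return cls(**params)


# Assumption: SimpleNamespace stands in for the real embedding class.
# With gfmrag installed, `_target_` would instead be
# "gfmrag.text_emb_models.BaseTextEmbModel".
demo_cfg = {
    "_target_": "types.SimpleNamespace",
    "text_emb_model_name": "BAAI/bge-large-en",
    "normalize": True,
    "query_instruct": None,
    "passage_instruct": None,
    "model_kwargs": {},
}
demo_model = instantiate_from_config(demo_cfg)
print(demo_model.text_emb_model_name, demo_model.normalize)
```

In a Hydra-based workflow the same step would normally be handled by `hydra.utils.instantiate`, so a helper like this is only useful for understanding how the table's parameters reach the class.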
Nvidia Embedding Model Configuration
This configuration supports the Nvidia embedding models. An example of an Nvidia embedding model configuration file is shown below:
nvidia/NV-Embed-v2
gfmrag/workflow/config/text_emb_model/nv_embed_v2.yaml
_target_: gfmrag.text_emb_models.NVEmbedV2
text_emb_model_name: nvidia/NV-Embed-v2
normalize: True
batch_size: 32
query_instruct: "Instruct: Given a question, retrieve entities that can help answer the question\nQuery: "
passage_instruct: null
model_kwargs:
  torch_dtype: bfloat16
| Parameter | Options | Note |
|---|---|---|
| `_target_` | `gfmrag.text_emb_models.NVEmbedV2` | The class name of the Nvidia embedding model. |
| `text_emb_model_name` | `nvidia/NV-Embed-v2` | The name of the Nvidia embedding model. |
| `normalize` | `True`, `False` | Whether to normalize the embeddings. |
| `query_instruct` | `Instruct: Given an entity, retrieve entities that are semantically equivalent to the given entity\nQuery: ` | The instruction for the query. |
| `passage_instruct` | None | The instruction for the passage. |
| `model_kwargs` | `{}` | Additional model arguments. |
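As a rough illustration of what `query_instruct`, `passage_instruct`, and `normalize` control, the sketch below assumes that an instruction is prepended to the text before encoding and that normalization rescales each embedding to unit L2 norm, so that a dot product between two embeddings equals their cosine similarity. `apply_instruction` and `l2_normalize` are hypothetical helpers, not gfmrag functions, and the two-dimensional vector is a toy value rather than real model output.

```python
import math
from typing import Optional


def apply_instruction(text: str, instruct: Optional[str]) -> str:
    """Prepend the instruction when one is configured; None leaves the text unchanged."""
    return f"{instruct}{text}" if instruct else text


def l2_normalize(vec: list[float]) -> list[float]:
    """Scale a vector to unit L2 norm (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


# Instruction string taken from the example configuration file above.
query_instruct = (
    "Instruct: Given a question, retrieve entities that can help "
    "answer the question\nQuery: "
)
query_text = apply_instruction("Who wrote Hamlet?", query_instruct)
passage_text = apply_instruction("William Shakespeare", None)  # passage_instruct: null

# Toy embedding standing in for model output (assumption, not real data).
toy_embedding = [3.0, 4.0]
unit_embedding = l2_normalize(toy_embedding)
print(repr(query_text))
print(unit_embedding)
```

With `passage_instruct` left as `null`, passages are encoded as-is while queries carry the task description, which is the asymmetric setup the example configuration uses.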