KG-index Configuration¶
An example of a KG-index configuration file is shown below:
Example
gfmrag/workflow/config/stage1_index_dataset.yaml
hydra:
run:
dir: outputs/kg_construction/${now:%Y-%m-%d}/${now:%H-%M-%S} # Output directory
defaults:
- _self_
- ner_model: llm_ner_model # The NER model to use
- openie_model: llm_openie_model # The OpenIE model to use
- el_model: colbert_el_model # The EL model to use
dataset:
root: ./data # data root directory
data_name: hotpotqa # data name
kg_constructor:
_target_: gfmrag.kg_construction.KGConstructor # The KGConstructor class
open_ie_model: ${openie_model}
ner_model: ${ner_model}
el_model: ${el_model}
root: tmp/kg_construction # Temporary directory for storing intermediate files during KG construction
num_processes: 10 # Number of processes to use
cosine_sim_edges: True # Whether to conduct entities resolution using cosine similarity
threshold: 0.8 # Threshold for cosine similarity
max_sim_neighbors: 100 # Maximum number of similar neighbors to add
add_title: True # Whether to add the title to the content of the document during OpenIE
force: False # Whether to force recompute the KG
qa_constructor:
_target_: gfmrag.kg_construction.QAConstructor # The QAConstructor class
root: tmp/qa_construction # Temporary directory for storing intermediate files during QA construction
ner_model: ${ner_model}
el_model: ${el_model}
num_processes: 10 # Number of processes to use
force: False # Whether to force recompute the QA data
General Configuration¶
Parameter | Options | Note |
---|---|---|
run.dir |
None | The output directory of the log |
Defaults¶
Parameter | Options | Note |
---|---|---|
ner_model |
None | The config of the ner_model |
openie_model |
None | The config of the openie_model |
el_model |
None | The config of the el_model |
Dataset¶
Parameter | Options | Note |
---|---|---|
root |
None | The data root directory |
data_name |
None | The data name |
KG Constructor¶
Parameter | Options | Note |
---|---|---|
_target_ |
None | The class of KGConstructor |
open_ie_model |
None | The config of the openie_model |
ner_model |
None | The config of the ner_model |
el_model |
None | The config of the el_model |
root |
None | The temporary directory for storing intermediate files during KG construction |
num_processes |
None | The number of processes to use |
cosine_sim_edges |
None | Whether to conduct entities resolution using cosine similarity |
threshold |
None | Threshold for cosine similarity |
max_sim_neighbors |
None | Maximum number of similar neighbors to add |
add_title |
None | Whether to add the title to the content of the document during OpenIE |
force |
None | Whether to force recompute the KG |
Please refer to KG Constructor for details of parameters.
QA Constructor¶
Parameter | Options | Note |
---|---|---|
_target_ |
None | The class of QAConstructor |
root |
None | The temporary directory for storing intermediate files during QA construction |
ner_model |
None | The config of the ner_model |
el_model |
None | The config of the el_model |
num_processes |
None | The number of processes to use |
force |
None | Whether to force recompute the QA data |
Please refer to QAConstructor for details of parameters.