KG-index Configuration
An example of a KG-index configuration file is shown below:
Example
gfmrag/workflow/config/stage1_index_dataset.yaml
hydra:
  run:
    dir: outputs/kg_construction/${now:%Y-%m-%d}/${now:%H-%M-%S} # Output directory
defaults:
  - _self_
  - ner_model: llm_ner_model # The NER model to use
  - openie_model: llm_openie_model # The OpenIE model to use
  - el_model: colbert_el_model # The EL model to use
dataset:
  root: ./data # data root directory
  data_name: hotpotqa # data name
kg_constructor:
  _target_: gfmrag.kg_construction.KGConstructor # The KGConstructor class
  open_ie_model: ${openie_model}
  ner_model: ${ner_model}
  el_model: ${el_model}
  root: tmp/kg_construction # Temporary directory for storing intermediate files during KG construction
  num_processes: 10 # Number of processes to use
  cosine_sim_edges: True # Whether to perform entity resolution using cosine similarity
  threshold: 0.8 # Threshold for cosine similarity
  max_sim_neighbors: 100 # Maximum number of similar neighbors to add
  add_title: True # Whether to add the title to the content of the document during OpenIE
  force: False # Whether to force recomputation of the KG
qa_constructor:
  _target_: gfmrag.kg_construction.QAConstructor # The QAConstructor class
  root: tmp/qa_construction # Temporary directory for storing intermediate files during QA construction
  ner_model: ${ner_model}
  el_model: ${el_model}
  num_processes: 10 # Number of processes to use
  force: False # Whether to force recomputation of the QA data
General Configuration
| Parameter | Options | Note |
|---|---|---|
| `hydra.run.dir` | None | The output directory for the logs |
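The `${now:...}` placeholders in the run directory are expanded by Hydra's `now` resolver, which formats the current time with `strftime`. The sketch below reproduces that expansion using only the standard library, to show which directory the example config would produce. The helper name `expand_run_dir` is hypothetical and is not part of Hydra or gfmrag.

```python
from datetime import datetime


def expand_run_dir(now: datetime) -> str:
    """Mimic how the example run.dir pattern expands for a given timestamp."""
    date_part = now.strftime("%Y-%m-%d")  # corresponds to ${now:%Y-%m-%d}
    time_part = now.strftime("%H-%M-%S")  # corresponds to ${now:%H-%M-%S}
    return f"outputs/kg_construction/{date_part}/{time_part}"


if __name__ == "__main__":
    # A fixed timestamp keeps the example reproducible.
    print(expand_run_dir(datetime(2025, 1, 2, 3, 4, 5)))
    # prints: outputs/kg_construction/2025-01-02/03-04-05
```

Because the timestamp is part of the path, each run writes to a fresh directory and earlier logs are not overwritten.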
Defaults
| Parameter | Options | Note |
|---|---|---|
| `ner_model` | None | The config of the ner_model |
| `openie_model` | None | The config of the openie_model |
| `el_model` | None | The config of the el_model |
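Each entry in `defaults` selects a config group that ends up under the same top-level key, and references such as `${ner_model}` elsewhere in the file point at those keys. The sketch below imitates that lookup with plain dictionaries. It is a simplified stand-in for OmegaConf's interpolation, not the real mechanism: the `resolve` helper and the `name` fields are hypothetical placeholders for whatever the selected config files actually contain.

```python
import re

# Placeholder contents: the real values come from the selected config group files.
config = {
    "ner_model": {"name": "llm_ner_model"},
    "el_model": {"name": "colbert_el_model"},
    "qa_constructor": {
        "ner_model": "${ner_model}",
        "el_model": "${el_model}",
        "num_processes": 10,
    },
}

# Matches a value that consists solely of one ${dotted.key} reference.
_REF = re.compile(r"^\$\{([\w.]+)\}$")


def resolve(node, root):
    """Replace whole-value ${key} references with the node found in root."""
    if isinstance(node, dict):
        return {key: resolve(value, root) for key, value in node.items()}
    if isinstance(node, str):
        match = _REF.match(node)
        if match:
            target = root
            for part in match.group(1).split("."):
                target = target[part]
            return resolve(target, root)
    return node


resolved = resolve(config, config)
print(resolved["qa_constructor"]["ner_model"])  # {'name': 'llm_ner_model'}
```

The practical consequence is that switching the NER model in `defaults` changes it for every component that references `${ner_model}`.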
Dataset
| Parameter | Options | Note |
|---|---|---|
| `root` | None | The data root directory |
| `data_name` | None | The data name |
KG Constructor
| Parameter | Options | Note |
|---|---|---|
| `_target_` | None | The class of KGConstructor |
| `open_ie_model` | None | The config of the openie_model |
| `ner_model` | None | The config of the ner_model |
| `el_model` | None | The config of the el_model |
| `root` | None | The temporary directory for storing intermediate files during KG construction |
| `num_processes` | None | The number of processes to use |
| `cosine_sim_edges` | None | Whether to perform entity resolution using cosine similarity |
| `threshold` | None | Threshold for cosine similarity |
| `max_sim_neighbors` | None | Maximum number of similar neighbors to add |
| `add_title` | None | Whether to add the title to the content of the document during OpenIE |
| `force` | None | Whether to force recomputation of the KG |
Please refer to KG Constructor for details of parameters.
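To make the interplay of `cosine_sim_edges`, `threshold`, and `max_sim_neighbors` more concrete, the following sketch links entities whose embedding cosine similarity reaches the threshold and keeps at most a fixed number of neighbors per entity. It is an illustration under assumptions, not gfmrag's implementation: the function names, the toy two-dimensional embeddings, and the use of `>=` for the comparison are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def similar_edges(embeddings, threshold=0.8, max_sim_neighbors=100):
    """For each entity, list its most similar entities at or above the threshold."""
    edges = {}
    for name, vec in embeddings.items():
        scored = [
            (cosine(vec, other_vec), other)
            for other, other_vec in embeddings.items()
            if other != name
        ]
        # Keep only sufficiently similar candidates, best first.
        kept = sorted(
            (pair for pair in scored if pair[0] >= threshold),
            key=lambda pair: pair[0],
            reverse=True,
        )
        edges[name] = [other for _, other in kept[:max_sim_neighbors]]
    return edges


# Toy embeddings (assumed values, for illustration only).
toy = {
    "Barack Obama": [1.0, 0.0],
    "Obama": [0.9, 0.1],
    "Paris": [0.0, 1.0],
}
edges = similar_edges(toy, threshold=0.8, max_sim_neighbors=100)
print(edges)
```

In this toy case the two Obama mentions are linked to each other while `Paris` gets no similarity edges. Raising `threshold` makes such links rarer, and lowering `max_sim_neighbors` caps how many are added per entity.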
QA Constructor
| Parameter | Options | Note |
|---|---|---|
| `_target_` | None | The class of QAConstructor |
| `root` | None | The temporary directory for storing intermediate files during QA construction |
| `ner_model` | None | The config of the ner_model |
| `el_model` | None | The config of the el_model |
| `num_processes` | None | The number of processes to use |
| `force` | None | Whether to force recomputation of the QA data |
Please refer to QAConstructor for details of parameters.
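The `force` flag suggests that intermediate results stored under `root` are reused unless recomputation is requested. A common way to implement such a switch is sketched below. This is a generic pattern, not gfmrag's code: the function name `build_or_load`, the file name `result.json`, and the JSON format are assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path


def build_or_load(root: Path, compute, force: bool = False):
    """Reuse a cached result under root unless force is set."""
    cache_file = root / "result.json"  # hypothetical file name
    if cache_file.exists() and not force:
        return json.loads(cache_file.read_text())
    result = compute()
    root.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps(result))
    return result


calls = []


def expensive():
    """Stand-in for a costly construction step; records each invocation."""
    calls.append(1)
    return {"run": len(calls)}


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "qa_construction"
    first = build_or_load(root, expensive)              # computed and cached
    second = build_or_load(root, expensive)             # served from the cache
    third = build_or_load(root, expensive, force=True)  # recomputed

print(first, second, third, len(calls))
```

Under this pattern, leaving `force` at `False` makes repeated runs cheap, while setting it to `True` is the way to pick up changes in the input data or models.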