Skip to content

Graph Constructor Configuration

KG Constructor Configuration

This configuration controls how raw documents are converted into the stage1 graph files (nodes.csv, relations.csv, edges.csv). An example KG constructor configuration file is shown below:

kg_constructor

gfmrag/workflow/config/graph_constructor/kg_constructor.yaml
_target_: gfmrag.graph_index_construction.graph_constructors.KGConstructor # The KGConstructor class
open_ie_model: ${openie_model}
el_model: ${el_model}
root: tmp/kg_construction # Temporary directory for storing intermediate files during KG construction
num_processes: 10 # Number of processes to use
cosine_sim_edges: True # Whether to conduct entities resolution using cosine similarity
threshold: 0.8 # Threshold for cosine similarity
max_sim_neighbors: 100 # Maximum number of similar neighbors to add
add_title: True # Whether to add the title to the content of the document during OpenIE
force: False # Whether to force recompute the KG
Parameter Options Note
_target_ gfmrag.graph_index_construction.graph_constructors.KGConstructor The class name of KGConstructor.
open_ie_model ${openie_model} OpenIE config used to extract triples from documents.
el_model ${el_model} Entity-linking config used during graph construction.
root tmp/kg_construction Temporary working directory for intermediate files.
num_processes Positive integer Number of worker processes.
cosine_sim_edges True, False Whether to add similarity-based edges between entities.
threshold Float in [0, 1] Cosine similarity threshold for edge creation.
max_sim_neighbors Positive integer Maximum number of similar neighbors to keep.
add_title True, False Whether to prepend the document title during OpenIE extraction.
force True, False Whether to rebuild intermediate graph-construction artifacts.