GFM-RAG Pre-training Configuration

An example configuration file for GFM pre-training is shown below:


hydra:
  run:
    dir: outputs/kg_pretrain/${now:%Y-%m-%d}/${now:%H-%M-%S} # Output directory

defaults:
  - _self_
  - text_emb_model: mpnet # The text embedding model to use

seed: 1024

datasets:
  _target_: gfmrag.datasets.KGDataset # The KG dataset class
  cfgs:
    root: ./data # Data root directory
    force_rebuild: False # Whether to force rebuild the dataset
    text_emb_model_cfgs: ${text_emb_model} # The text embedding model configuration
  train_names: # List of training dataset names
    - hotpotqa_train_example
  valid_names: []

# GFM model configuration
model:
  _target_: gfmrag.models.QueryGNN
  entity_model:
    _target_: gfmrag.ultra.models.EntityNBFNet
    input_dim: 512
    hidden_dims: [512, 512, 512, 512, 512, 512]
    message_func: distmult
    aggregate_func: sum
    short_cut: yes
    layer_norm: yes

# Loss configuration
task:
  num_negative: 256
  strict_negative: yes
  adversarial_temperature: 1
  metric: [mr, mrr, hits@1, hits@3, hits@10]

# Optimizer configuration
optimizer:
  _target_: torch.optim.AdamW
  lr: 5.0e-4

# Training configuration
train:
  batch_size: 8
  num_epoch: 10
  log_interval: 100
  fast_test: 500
  save_best_only: no
  save_pretrained: no # Save the model for QA inference
  batch_per_epoch: null
  timeout: 60 # Timeout in minutes for multi-GPU training
  # Checkpoint configuration
  checkpoint: null
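
The output directory uses Hydra's `now` resolver, whose argument is a `strftime` pattern. As a rough illustration of the path this produces, the sketch below expands the same pattern with Python's standard library. The helper name `expand_now` is hypothetical and only mimics the resolver; Hydra performs the substitution itself at launch.

```python
import re
from datetime import datetime


# Hypothetical helper: mimics how ${now:<strftime pattern>} is expanded.
# Hydra does this internally; this is only an illustration.
def expand_now(template: str, when: datetime) -> str:
    return re.sub(
        r"\$\{now:([^}]+)\}",
        lambda m: when.strftime(m.group(1)),
        template,
    )


template = "outputs/kg_pretrain/${now:%Y-%m-%d}/${now:%H-%M-%S}"
print(expand_now(template, datetime(2025, 1, 31, 9, 5, 7)))
# outputs/kg_pretrain/2025-01-31/09-05-07
```

Because the timestamp has one-second resolution, each run normally gets its own directory.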

General Configuration

| Parameter | Options | Note |
| --- | --- | --- |
| `run.dir` | None | Output directory for logs |


| Parameter | Options | Note |
| --- | --- | --- |
| `text_emb_model` | None | The text embedding model to use |

Training datasets

| Parameter | Options | Note |
| --- | --- | --- |
| `_target_` | None | The KG dataset class (`KGDataset`) |
| `cfgs.root` | None | Root directory where the datasets are saved |
| `cfgs.force_rebuild` | None | Whether to force a rebuild of the dataset |
| `cfgs.text_emb_model_cfgs` | None | Text embedding model configuration |
| `train_names` | [] | List of training dataset names |
| `valid_names` | [] | List of validation dataset names |

GFM model configuration

| Parameter | Options | Note |
| --- | --- | --- |
| `_target_` | None | The `QueryGNN` model class |
| `entity_model` | None | The `EntityNBFNet` model |
| `input_dim` | None | Input dimension of the model |
| `hidden_dims` | [] | Hidden dimensions of the model, one entry per layer |
| `message_func` | `transe`, `rotate`, `distmult` | Message function of the model |
| `aggregate_func` | `pna`, `min`, `max`, `mean`, `sum` | Aggregate function of the model |
| `short_cut` | True, False | Whether to use shortcut connections |
| `layer_norm` | True, False | Whether to use layer normalization |
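
To give a feel for what `message_func` and `aggregate_func` select, here is a minimal pure-Python sketch of one propagation step at a single node. It uses the usual TransE (addition) and DistMult (element-wise product) definitions and is not taken from the GFM-RAG code. `rotate` and `pna` are left out, and the names `MESSAGE_FUNCS`, `AGGREGATE_FUNCS`, and `propagate` are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Vector = List[float]

# Message functions combine a neighbour state h with a relation embedding r.
MESSAGE_FUNCS: Dict[str, Callable[[Vector, Vector], Vector]] = {
    "transe": lambda h, r: [a + b for a, b in zip(h, r)],    # translation
    "distmult": lambda h, r: [a * b for a, b in zip(h, r)],  # element-wise product
}

# Aggregate functions reduce all incoming messages, one dimension at a time.
AGGREGATE_FUNCS: Dict[str, Callable[[List[float]], float]] = {
    "sum": sum,
    "mean": lambda xs: sum(xs) / len(xs),
    "min": min,
    "max": max,
}


def propagate(
    neighbours: List[Tuple[Vector, Vector]],
    message_func: str,
    aggregate_func: str,
) -> Vector:
    """neighbours: (h, r) pairs arriving at one node."""
    msg = MESSAGE_FUNCS[message_func]
    agg = AGGREGATE_FUNCS[aggregate_func]
    messages = [msg(h, r) for h, r in neighbours]
    # zip(*messages) groups the values dimension by dimension.
    return [agg(list(dim)) for dim in zip(*messages)]


neighbours = [([1.0, 2.0], [0.5, 0.5]), ([3.0, 4.0], [2.0, 1.0])]
print(propagate(neighbours, "distmult", "sum"))  # [6.5, 5.0]
```

The real layers operate on batched tensors and include further terms, so treat this only as a reading aid for the two options.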

Loss configuration

| Parameter | Options | Note |
| --- | --- | --- |
| `num_negative` | None | Number of negative samples for each query |
| `strict_negative` | None | Whether to use strict negative sampling |
| `adversarial_temperature` | None | Adversarial temperature for negative sampling |
| `metric` | [] | Evaluation metrics for the model |
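
The metric names in the example (`mr`, `mrr`, `hits@k`) are standard ranking metrics. The sketch below shows how they are conventionally computed from the 1-based rank of the correct entity for each query. It also shows how a temperature is typically applied in self-adversarial negative sampling, as a softmax over negative scores. Both functions are illustrative assumptions with hypothetical names, not GFM-RAG's own implementation.

```python
import math
from typing import Dict, List, Sequence


def ranking_metrics(ranks: List[int], ks: Sequence[int] = (1, 3, 10)) -> Dict[str, float]:
    """ranks: 1-based rank of the correct entity for each query."""
    n = len(ranks)
    out = {
        "mr": sum(ranks) / n,                    # mean rank (lower is better)
        "mrr": sum(1.0 / r for r in ranks) / n,  # mean reciprocal rank
    }
    for k in ks:
        # Fraction of queries whose correct entity is ranked in the top k.
        out[f"hits@{k}"] = sum(r <= k for r in ranks) / n
    return out


def adversarial_weights(neg_scores: List[float], temperature: float) -> List[float]:
    """Softmax of negative scores divided by the temperature.

    Assumption: higher-scoring (harder) negatives receive larger weights.
    """
    scaled = [s / temperature for s in neg_scores]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


metrics = ranking_metrics([1, 2, 4, 20])
print(metrics["mr"], metrics["hits@1"], metrics["hits@3"], metrics["hits@10"])
# 6.75 0.25 0.5 0.75
```

Under this scheme, a larger temperature flattens the weights towards uniform, while a smaller one concentrates them on the highest-scoring negatives.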

Optimizer configuration

| Parameter | Options | Note |
| --- | --- | --- |
| `optimizer._target_` | None | Torch optimizer for the model |
| `optimizer.lr` | None | Learning rate for the optimizer |
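
The `_target_` key follows Hydra's instantiation convention: it names a dotted import path, and the remaining keys in the same block are passed as keyword arguments. The sketch below is a simplified, hypothetical stand-in for `hydra.utils.instantiate`. It uses a standard-library target so that it runs without `torch` installed.

```python
import importlib
from typing import Any, Dict


# Hypothetical, simplified stand-in for hydra.utils.instantiate:
# import the dotted path in `_target_`, then call it with the other keys.
def instantiate_sketch(cfg: Dict[str, Any], **extra: Any) -> Any:
    kwargs = {k: v for k, v in cfg.items() if k != "_target_"}
    module_path, _, attr = cfg["_target_"].rpartition(".")
    target = getattr(importlib.import_module(module_path), attr)
    return target(**kwargs, **extra)


# Stand-in config; the real one would name torch.optim.AdamW with `lr`.
cfg = {"_target_": "datetime.timedelta", "minutes": 60}
print(instantiate_sketch(cfg))  # 1:00:00
```

With the real configuration, the model parameters would still have to be supplied at call time because `torch.optim.AdamW` requires them, for example as an extra keyword argument to `hydra.utils.instantiate`. How GFM-RAG passes them is not shown in this document.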

Training configuration

| Parameter | Options | Note |
| --- | --- | --- |
| `batch_size` | None | Batch size for training |
| `num_epoch` | None | Number of training epochs |
| `log_interval` | None | Logging interval during training |
| `fast_test` | None | Number of samples used for fast testing |
| `save_best_only` | None | Whether to save only the best model according to the metric |
| `save_pretrained` | None | Whether to save the model for QA inference |
| `batch_per_epoch` | None | Number of batches per training epoch |
| `timeout` | None | Timeout in minutes for multi-GPU training |
| `checkpoint` | None | Path of the checkpoint to load for training |
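
As a worked example of how these values interact, the sketch below derives the number of optimizer steps and the timeout from the sample configuration. It assumes that a `null` `batch_per_epoch` means one full pass over the data loader per epoch, which this document does not state. The loader length of 1250 and the helper `steps_per_epoch` are hypothetical.

```python
from datetime import timedelta
from typing import Optional


def steps_per_epoch(num_batches: int, batch_per_epoch: Optional[int]) -> int:
    # Assumption: null (None) means one full pass over the data loader.
    return num_batches if batch_per_epoch is None else batch_per_epoch


# Values copied from the example configuration above.
train_cfg = {"batch_size": 8, "num_epoch": 10, "batch_per_epoch": None, "timeout": 60}

num_batches = 1250  # hypothetical data loader length
per_epoch = steps_per_epoch(num_batches, train_cfg["batch_per_epoch"])
total_steps = train_cfg["num_epoch"] * per_epoch
timeout = timedelta(minutes=train_cfg["timeout"])  # `timeout` is given in minutes

print(per_epoch, total_steps, timeout)  # 1250 12500 1:00:00
```

Setting `batch_per_epoch` to a small integer shortens each epoch, which can be handy for a quick smoke test of the pipeline.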