Here is my detailed explanation based on the code provided:
The configuration files for this codebase can be found under:
- /dolly/config/v100_config.json
- /dolly/config/a10_config.json
- /dolly/config/a100_config.json
These JSON files contain various hyperparameters and configuration options for training the model, such as:
- Whether to use fp16 or bf16 precision
- The optimizer type and parameters
- Scheduler type and parameters
- ZeRO optimization settings
- Gradient accumulation steps
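To make the list above concrete, here is a minimal sketch of reading such a file and pulling out those settings. It assumes the files follow a DeepSpeed-style JSON layout; the key names and values in EXAMPLE_CONFIG are illustrative assumptions, not copied from the actual files, and summarize_config is a hypothetical helper that is not part of the codebase.

```python
import json

# Hypothetical config text for illustration only; the real files under
# /dolly/config/ may use different keys and values.
EXAMPLE_CONFIG = """
{
  "bf16": {"enabled": true},
  "fp16": {"enabled": false},
  "optimizer": {"type": "AdamW", "params": {"lr": 1e-5}},
  "scheduler": {"type": "WarmupLR", "params": {"warmup_num_steps": 100}},
  "zero_optimization": {"stage": 3},
  "gradient_accumulation_steps": 1
}
"""


def summarize_config(config):
    """Collect the settings listed above, tolerating missing keys."""
    # Report the first precision mode that is explicitly enabled.
    precision = "fp32"
    for name in ("bf16", "fp16"):
        if config.get(name, {}).get("enabled") is True:
            precision = name
            break
    return {
        "precision": precision,
        "optimizer": config.get("optimizer", {}).get("type"),
        "scheduler": config.get("scheduler", {}).get("type"),
        "zero_stage": config.get("zero_optimization", {}).get("stage"),
        "gradient_accumulation_steps": config.get("gradient_accumulation_steps"),
    }


summary = summarize_config(json.loads(EXAMPLE_CONFIG))
print(summary)
```

To inspect a real file, you could replace json.loads(EXAMPLE_CONFIG) with json.load() on an open file handle for one of the paths listed earlier.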
The training code itself is located in trainer.py under /dolly/training/. This file contains the main train() function, which runs model training, along with helper functions that load the dataset, tokenizer, and model and preprocess the data.
The hyperparameters for training a particular run can be specified when calling the train() function. For example:
train(input_model="gpt2", local_output_dir="output_dir", epochs=10, per_device_train_batch_size=8, lr=0.001)
So in summary, the configuration files under /dolly/config/ contain default hyperparameters that can be overridden when calling train(), and the actual training code lives in trainer.py under /dolly/training/.
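The override behaviour described here can be sketched as a simple merge of defaults with keyword arguments. This is only an illustration of the pattern, not the actual logic in trainer.py: DEFAULTS and resolve_hyperparameters are hypothetical names, and the default values are made up.

```python
from typing import Any, Dict

# Hypothetical defaults standing in for values that would come from a config file.
DEFAULTS: Dict[str, Any] = {
    "epochs": 3,
    "per_device_train_batch_size": 8,
    "lr": 1e-5,
}


def resolve_hyperparameters(defaults: Dict[str, Any], **overrides: Any) -> Dict[str, Any]:
    """Return a new dict in which call-time overrides win over defaults."""
    unknown = set(overrides) - set(defaults)
    if unknown:
        # Fail early on misspelled names instead of silently ignoring them.
        raise ValueError(f"Unknown hyperparameters: {sorted(unknown)}")
    # Later entries win in a dict merge, so overrides replace defaults.
    return {**defaults, **overrides}


resolved = resolve_hyperparameters(DEFAULTS, epochs=10, lr=0.001)
print(resolved)
```

Returning a new dict leaves the defaults untouched, so several runs can start from the same baseline with different overrides.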
Hope this helps! Let me know if you have any other questions.