Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM
References
Narayanan et al. (2021),
Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM