ZeRO: Memory Optimizations Toward Training Trillion Parameter Models
References
Rajbhandari, Rasley et al. (2020),
ZeRO: Memory Optimizations Toward Training Trillion Parameter Models