Back
Notes on FSDP
References
Meta AI,
Fully Sharded Data Parallel: faster AI training with fewer GPUs
NVIDIA,
Collective Operations