Don't Compress Gradients in Random Reshuffling: Compress Gradient Differences
This work introduces new methods that combine random reshuffling with communication compression for distributed and federated learning. Instead of compressing the gradients themselves, the proposed methods compress gradient differences, and the accompanying theoretical analysis and experiments show improvements over existing compressed-gradient approaches.
Jun 14, 2022
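
Below is a minimal, single-worker sketch of the general idea the title refers to: inside a reshuffled epoch, compress the difference between the current gradient and a learned shift (in the spirit of DIANA-style difference compression) rather than the gradient itself. The function names, the rand-k compressor, and the step sizes are illustrative assumptions, not the paper's exact algorithms.

```python
import numpy as np

def rand_k_compress(v, k, rng):
    """Rand-k sparsification: keep k random coordinates, rescaled to stay unbiased."""
    d = v.size
    idx = rng.choice(d, size=k, replace=False)
    out = np.zeros_like(v)
    out[idx] = v[idx] * (d / k)
    return out

def reshuffled_epoch_with_difference_compression(x, grads, h, lr, k, rng):
    """One epoch over n local functions, visited in a random order.

    Instead of compressing the gradient g_i directly, compress the difference
    g_i - h_i and update the shift h_i, so the compressed message shrinks as
    the iterates converge (illustrative sketch only).
    """
    n = len(grads)
    for i in rng.permutation(n):                      # random reshuffling: each index once per epoch
        g = grads[i](x)                               # gradient of the i-th local function
        delta = rand_k_compress(g - h[i], k, rng)     # compress the difference, not g itself
        x = x - lr * (h[i] + delta)                   # step with shift plus compressed correction
        h[i] = h[i] + delta                           # update the learned shift
    return x, h

# Toy usage: quadratics f_i(x) = 0.5 * ||x - a_i||^2, so grad f_i(x) = x - a_i.
rng = np.random.default_rng(0)
d, n = 10, 5
targets = [rng.normal(size=d) for _ in range(n)]
grads = [lambda x, a=a: x - a for a in targets]
x = np.zeros(d)
h = [np.zeros(d) for _ in range(n)]
for _ in range(50):
    x, h = reshuffled_epoch_with_difference_compression(x, grads, h, lr=0.1, k=3, rng=rng)
print("distance to mean target:", np.linalg.norm(x - np.mean(targets, axis=0)))
```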