Don't Compress Gradients in Random Reshuffling: Compress Gradient Differences
This work introduces new methods that combine random reshuffling with communication compression for distributed and federated learning. Instead of compressing the gradients themselves, the proposed methods compress gradient differences, and the accompanying theoretical analysis and experiments show improvements over existing compressed-gradient approaches.
Jun 14, 2022
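
Below is a minimal, single-worker sketch of the general idea the title refers to: inside a reshuffled epoch, compress the difference between the current gradient and a learned shift (in the spirit of DIANA-style difference compression) rather than the gradient itself. The function names, the rand-k compressor, and the step sizes are illustrative assumptions, not the paper's exact algorithms.

```python
import numpy as np

def rand_k_compress(v, k, rng):
    """Rand-k sparsification: keep k random coordinates, rescaled to stay unbiased."""
    d = v.size
    idx = rng.choice(d, size=k, replace=False)
    out = np.zeros_like(v)
    out[idx] = v[idx] * (d / k)
    return out

def reshuffled_epoch_with_difference_compression(x, grads, h, lr, k, rng):
    """One epoch over n local functions, visited in a random order.

    Instead of compressing the gradient g_i directly, compress the difference
    g_i - h_i and update the shift h_i, so the compressed message shrinks as
    the iterates converge (illustrative sketch only).
    """
    n = len(grads)
    for i in rng.permutation(n):                      # random reshuffling: each index once per epoch
        g = grads[i](x)                               # gradient of the i-th local function
        delta = rand_k_compress(g - h[i], k, rng)     # compress the difference, not g itself
        x = x - lr * (h[i] + delta)                   # step with shift plus compressed correction
        h[i] = h[i] + delta                           # update the learned shift
    return x, h

# Toy usage: quadratics f_i(x) = 0.5 * ||x - a_i||^2, so grad f_i(x) = x - a_i.
rng = np.random.default_rng(0)
d, n = 10, 5
targets = [rng.normal(size=d) for _ in range(n)]
grads = [lambda x, a=a: x - a for a in targets]
x = np.zeros(d)
h = [np.zeros(d) for _ in range(n)]
for _ in range(50):
    x, h = reshuffled_epoch_with_difference_compression(x, grads, h, lr=0.1, k=3, rng=rng)
print("distance to mean target:", np.linalg.norm(x - np.mean(targets, axis=0)))
```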