Conference paper, 2020

Decentralized gradient methods: does topology matter?

Abstract

Consensus-based distributed optimization methods have recently been advocated as alternatives to the parameter server and ring all-reduce paradigms for large-scale training of machine learning models. With these methods, each worker maintains a local estimate of the optimal parameter vector and iteratively updates it by averaging the estimates obtained from its neighbors and applying a correction based on its local dataset. While theoretical results suggest that the worker communication topology should have a strong impact on the number of epochs needed to converge, previous experiments have reached the opposite conclusion. This paper sheds light on this apparent contradiction and shows how sparse topologies can lead to faster convergence even in the absence of communication delays.
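For intuition, below is a minimal NumPy sketch (not taken from the paper) of the consensus-based update the abstract describes: each worker averages its neighbors' estimates through a doubly stochastic mixing matrix encoding the topology, then applies a local gradient correction. The ring topology, the synthetic least-squares objectives, and names such as decentralized_gd and ring_mixing_matrix are illustrative assumptions, not the authors' code.

import numpy as np

def ring_mixing_matrix(n):
    # Doubly stochastic mixing matrix for a ring of n workers:
    # each worker averages itself and its two neighbors with weight 1/3.
    W = np.zeros((n, n))
    for i in range(n):
        W[i, i] = 1.0 / 3
        W[i, (i - 1) % n] = 1.0 / 3
        W[i, (i + 1) % n] = 1.0 / 3
    return W

def decentralized_gd(local_grads, W, x0, lr=0.01, steps=500):
    # local_grads[i](x) returns worker i's gradient at x (illustrative interface).
    n = len(local_grads)
    X = np.tile(x0, (n, 1))              # one parameter row per worker
    for _ in range(steps):
        X = W @ X                        # consensus step: average neighbors' estimates
        G = np.stack([g(X[i]) for i, g in enumerate(local_grads)])
        X = X - lr * G                   # correction from each worker's local data
    return X.mean(axis=0)                # average of the local estimates

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d = 8, 5
    # Synthetic local objectives f_i(x) = 0.5 * ||A_i x - b_i||^2
    A = rng.normal(size=(n, 20, d))
    b = rng.normal(size=(n, 20))
    grads = [lambda x, Ai=A[i], bi=b[i]: Ai.T @ (Ai @ x - bi) for i in range(n)]
    x_hat = decentralized_gd(grads, ring_mixing_matrix(n), np.zeros(d))
    print("consensus estimate:", x_hat)

In this sketch the topology enters only through the mixing matrix W; swapping the ring for a denser or sparser graph changes how quickly the workers' estimates agree, which is the effect the paper analyzes.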
Main file

AISTATS2020.pdf (5.84 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-02430485, version 1 (07-01-2020)

Identifiers

  • HAL Id: hal-02430485, version 1

Cite

Giovanni Neglia, Chuan Xu, Don Towsley, Gianmarco Calbi. Decentralized gradient methods: does topology matter?. AISTATS 2020 - 23rd International Conference on Artificial Intelligence and Statistics, Aug 2020, Palermo / Online, Italy. ⟨hal-02430485⟩
169 views
254 downloads
