Preprint / Working paper, year: 2020

Global Convergence of Frank Wolfe on One Hidden Layer Networks

Abstract

We derive global convergence bounds for the Frank Wolfe algorithm when training one-hidden-layer neural networks. When using the ReLU activation function, and under tractable preconditioning assumptions on the sample data set, the linear minimization oracle used to incrementally form the solution can be solved explicitly as a second-order cone program. The classical Frank Wolfe algorithm then converges with rate $O(1/T)$, where $T$ is both the number of neurons and the number of calls to the oracle.
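The abstract describes the classical Frank Wolfe loop: each iteration makes one call to a linear minimization oracle (LMO) and updates the iterate as a convex combination, so after $T$ iterations the solution is built from at most $T$ oracle outputs, which is why $T$ counts both neurons and oracle calls in the paper's setting. The Python sketch below illustrates only this generic loop on a toy problem; the quadratic objective and the l1-ball oracle are illustrative assumptions, not the paper's second-order cone oracle for ReLU networks.

import numpy as np

def lmo_l1_ball(grad, radius=1.0):
    # Illustrative linear minimization oracle over an l1 ball:
    # argmin_{||s||_1 <= radius} <grad, s>. A stand-in for the paper's
    # second-order cone oracle, which is not implemented here.
    s = np.zeros_like(grad)
    i = np.argmax(np.abs(grad))
    s[i] = -radius * np.sign(grad[i])
    return s

def frank_wolfe(grad_f, x0, lmo, T=100):
    # Classical Frank Wolfe with the open-loop step size 2/(t+2),
    # which gives the standard O(1/T) convergence rate.
    x = x0.copy()
    for t in range(T):
        g = grad_f(x)
        s = lmo(g)                # one oracle call per iteration
        gamma = 2.0 / (t + 2.0)   # standard step size
        x = (1 - gamma) * x + gamma * s
    return x

# Toy usage (hypothetical data): minimize 0.5 * ||A x - b||^2 over the unit l1 ball.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 10))
b = rng.standard_normal(20)
grad_f = lambda x: A.T @ (A @ x - b)
x_hat = frank_wolfe(grad_f, np.zeros(10), lmo_l1_ball, T=200)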

Dates and versions

hal-02983259, version 1 (29-10-2020)

Identifiers

Cite

Alexandre d'Aspremont, Mert Pilanci. Global Convergence of Frank Wolfe on One Hidden Layer Networks. 2020. ⟨hal-02983259⟩