Preprint / Working paper, year: 2020

Global Convergence of Frank Wolfe on One Hidden Layer Networks

Abstract

We derive global convergence bounds for the Frank Wolfe algorithm when training one-hidden-layer neural networks. When using the ReLU activation function, and under tractable preconditioning assumptions on the sample data set, the linear minimization oracle used to incrementally form the solution can be solved explicitly as a second-order cone program. The classical Frank Wolfe algorithm then converges with rate $O(1/T)$, where $T$ is both the number of neurons and the number of calls to the oracle.
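The abstract describes the classical Frank Wolfe loop: each iteration makes one call to a linear minimization oracle (LMO) and updates the iterate as a convex combination, so after $T$ iterations the solution is built from at most $T$ oracle outputs, which is why $T$ counts both neurons and oracle calls in the paper's setting. The Python sketch below illustrates only this generic loop on a toy problem; the quadratic objective and the l1-ball oracle are illustrative assumptions, not the paper's second-order cone oracle for ReLU networks.

import numpy as np

def lmo_l1_ball(grad, radius=1.0):
    # Illustrative linear minimization oracle over an l1 ball:
    # argmin_{||s||_1 <= radius} <grad, s>. A stand-in for the paper's
    # second-order cone oracle, which is not implemented here.
    s = np.zeros_like(grad)
    i = np.argmax(np.abs(grad))
    s[i] = -radius * np.sign(grad[i])
    return s

def frank_wolfe(grad_f, x0, lmo, T=100):
    # Classical Frank Wolfe with the open-loop step size 2/(t+2),
    # which gives the standard O(1/T) convergence rate.
    x = x0.copy()
    for t in range(T):
        g = grad_f(x)
        s = lmo(g)                # one oracle call per iteration
        gamma = 2.0 / (t + 2.0)   # standard step size
        x = (1 - gamma) * x + gamma * s
    return x

# Toy usage (hypothetical data): minimize 0.5 * ||A x - b||^2 over the unit l1 ball.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 10))
b = rng.standard_normal(20)
grad_f = lambda x: A.T @ (A @ x - b)
x_hat = frank_wolfe(grad_f, np.zeros(10), lmo_l1_ball, T=200)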

Dates and versions

hal-02983259, version 1 (29-10-2020)

Identifiers

Cite

Alexandre d'Aspremont, Mert Pilanci. Global Convergence of Frank Wolfe on One Hidden Layer Networks. 2020. ⟨hal-02983259⟩