An Inertial Newton Algorithm for Deep Learning - Université Toulouse III - Paul Sabatier - Toulouse INP
Preprint / Working paper. Year: 2019

An Inertial Newton Algorithm for Deep Learning

Camille Castera
Jérôme Bolte
Cédric Févotte
Edouard Pauwels

Abstract

We introduce a new second-order inertial method for machine learning called INDIAN, which exploits the geometry of the loss function while requiring only stochastic approximations of the function values and the generalized gradients. This makes the method fully implementable and suited to large-scale optimization problems such as the training of deep neural networks. The algorithm combines both gradient-descent and Newton-like behaviors as well as inertia. We prove the convergence of INDIAN to critical points for most deep learning problems. To do so, we provide a well-suited framework to analyze deep learning losses involving tame optimization, in which we study the continuous dynamical system together with its discrete stochastic approximations. On the theoretical side, we also prove a sublinear convergence rate for the continuous-time differential inclusion which underlies the algorithm. From an empirical point of view, the algorithm shows promising results on popular DNN training benchmark problems.
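To make the abstract's description concrete, the following is a rough, self-contained sketch of what a Hessian-free inertial Newton-like update driven by stochastic gradient estimates can look like on a toy problem. The function name, the initialization of the auxiliary variable `psi`, the step-size and noise choices, and the hyperparameter values are all our assumptions for illustration; this is not the paper's pseudocode.

```python
import numpy as np

def indian_sketch(grad, theta0, alpha=0.5, beta=0.1, gamma=0.1,
                  n_iters=2000, noise=0.01, seed=0):
    """Illustrative two-variable inertial Newton-like update (a sketch).

    Discretizes a Hessian-free first-order reformulation of an inertial
    Newton dynamic,
        theta' = (1/beta - alpha) * theta - (1/beta) * psi - beta * grad(theta)
        psi'   = (1/beta - alpha) * theta - (1/beta) * psi,
    with constant step size gamma, feeding it noisy (stochastic)
    gradient estimates as the abstract describes.
    """
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float).copy()
    psi = theta.copy()  # simple choice of initial auxiliary variable (assumption)
    for _ in range(n_iters):
        # stochastic approximation of the gradient at the current iterate
        g = grad(theta) + noise * rng.standard_normal(theta.shape)
        # shared coupling term between the two phase-space variables
        common = (1.0 / beta - alpha) * theta - (1.0 / beta) * psi
        # simultaneous explicit Euler step on both variables
        theta, psi = theta + gamma * (common - beta * g), psi + gamma * common
    return theta

# Minimizing the quadratic f(x) = 0.5 * ||x||^2, whose gradient is x:
theta_star = indian_sketch(lambda x: x, np.array([1.0, 1.0]))
```

Note that no Hessian-vector product appears anywhere: the Newton-like (curvature-driven) behavior comes entirely from the coupling between `theta` and `psi`, which is what makes this family of schemes implementable at deep-learning scale.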
Main file: arXiv.pdf (1.34 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-02140748 , version 1 (27-05-2019)
hal-02140748 , version 2 (06-06-2019)
hal-02140748 , version 3 (12-12-2019)
hal-02140748 , version 4 (12-10-2020)
hal-02140748 , version 5 (02-07-2021)
hal-02140748 , version 6 (20-08-2021)

Identifiers

Cite

Camille Castera, Jérôme Bolte, Cédric Févotte, Edouard Pauwels. An Inertial Newton Algorithm for Deep Learning. 2019. ⟨hal-02140748v3⟩
702 views
346 downloads
