An Inertial Newton Algorithm for Deep Learning - Université Toulouse III - Paul Sabatier - Toulouse INP
Preprint / Working paper. Year: 2019

An Inertial Newton Algorithm for Deep Learning

Camille Castera
Jérôme Bolte
Cédric Févotte
Edouard Pauwels

Abstract

We introduce a new second-order inertial method for machine learning called INDIAN, which exploits the geometry of the loss function while requiring only stochastic approximations of the function values and the generalized gradients. This makes the method fully implementable and suited to large-scale optimization problems such as the training of deep neural networks. The algorithm combines both gradient-descent and Newton-like behaviors as well as inertia. We prove the convergence of INDIAN to critical points for most deep learning problems. To do so, we provide a well-suited framework to analyze deep learning losses involving tame optimization, in which we study the continuous dynamical system together with its discrete stochastic approximations. On the theoretical side, we also prove a sublinear convergence rate for the continuous-time differential inclusion which underlies the algorithm. From an empirical point of view, the algorithm shows promising results on popular DNN training benchmark problems.
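To make the abstract's description concrete, the following is a rough, self-contained sketch of what a Hessian-free inertial Newton-like update driven by stochastic gradient estimates can look like on a toy problem. The function name, the initialization of the auxiliary variable `psi`, the step-size and noise choices, and the hyperparameter values are all our assumptions for illustration; this is not the paper's pseudocode.

```python
import numpy as np

def indian_sketch(grad, theta0, alpha=0.5, beta=0.1, gamma=0.1,
                  n_iters=2000, noise=0.01, seed=0):
    """Illustrative two-variable inertial Newton-like update (a sketch).

    Discretizes a Hessian-free first-order reformulation of an inertial
    Newton dynamic,
        theta' = (1/beta - alpha) * theta - (1/beta) * psi - beta * grad(theta)
        psi'   = (1/beta - alpha) * theta - (1/beta) * psi,
    with constant step size gamma, feeding it noisy (stochastic)
    gradient estimates as the abstract describes.
    """
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float).copy()
    psi = theta.copy()  # simple choice of initial auxiliary variable (assumption)
    for _ in range(n_iters):
        # stochastic approximation of the gradient at the current iterate
        g = grad(theta) + noise * rng.standard_normal(theta.shape)
        # shared coupling term between the two phase-space variables
        common = (1.0 / beta - alpha) * theta - (1.0 / beta) * psi
        # simultaneous explicit Euler step on both variables
        theta, psi = theta + gamma * (common - beta * g), psi + gamma * common
    return theta

# Minimizing the quadratic f(x) = 0.5 * ||x||^2, whose gradient is x:
theta_star = indian_sketch(lambda x: x, np.array([1.0, 1.0]))
```

Note that no Hessian-vector product appears anywhere: the Newton-like (curvature-driven) behavior comes entirely from the coupling between `theta` and `psi`, which is what makes this family of schemes implementable at deep-learning scale.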
Main file: arXiv.pdf (1.34 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-02140748 , version 1 (27-05-2019)
hal-02140748 , version 2 (06-06-2019)
hal-02140748 , version 3 (12-12-2019)
hal-02140748 , version 4 (12-10-2020)
hal-02140748 , version 5 (02-07-2021)
hal-02140748 , version 6 (20-08-2021)

Identifiers

Cite

Camille Castera, Jérôme Bolte, Cédric Févotte, Edouard Pauwels. An Inertial Newton Algorithm for Deep Learning. 2019. ⟨hal-02140748v3⟩
702 views
346 downloads
