Batch Normalization Orthogonalizes Representations in Deep Random Networks - INRIA - Institut National de Recherche en Informatique et en Automatique
Conference paper, Year: 2021

Batch Normalization Orthogonalizes Representations in Deep Random Networks

Abstract

This paper highlights a subtle property of batch normalization (BN): successive batch normalizations interleaved with random linear transformations make hidden representations increasingly orthogonal across the layers of a deep neural network. We establish a non-asymptotic characterization of the interplay between depth, width, and the orthogonality of deep representations. More precisely, under a mild assumption, we prove that the deviation of the representations from orthogonality decays rapidly with depth, up to a term inversely proportional to the network width. This result has two main implications: 1) Theoretically, as the depth grows, the distribution of the representation (after the linear layers) contracts to a Wasserstein-2 ball around an isotropic Gaussian distribution. Furthermore, the radius of this Wasserstein ball shrinks with the width of the network. 2) Practically, the orthogonality of the representations directly influences the performance of stochastic gradient descent (SGD). When representations are initially aligned, we observe that SGD wastes many iterations orthogonalizing the representations before classification. Nevertheless, we show experimentally that starting optimization from orthogonal representations is sufficient to accelerate SGD, with no need for BN.
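As a rough illustration of the orthogonalization claim, the following sketch (not the authors' code) passes a batch of nearly aligned inputs through a stack of random linear layers, each followed by BN, and tracks how far the representations are from orthogonal. It rests on several assumptions that the abstract does not spell out: BN is modelled as row-wise normalization of the pre-activations without mean subtraction, each layer computes H_{l+1} = BN(W_l H_l) / sqrt(d) with Gaussian W_l, and the deviation from orthogonality is measured as the Frobenius distance between the normalized Gram matrix H^T H / ||H||_F^2 and I_n / n. The function names, width, batch size, depth, and noise level are all illustrative choices, and only the standard library is used so that the sketch runs without extra dependencies.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    b_columns = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in b_columns] for row in a]


def batch_norm(m):
    """Scale each row (one feature across the batch) to unit Euclidean norm.

    Assumption of this sketch: no mean subtraction, no learnable scale or shift.
    """
    normalized = []
    for row in m:
        norm = math.sqrt(sum(x * x for x in row))
        normalized.append([x / norm for x in row])
    return normalized


def orthogonality_gap(h):
    """Frobenius distance between H^T H / ||H||_F^2 and I_n / n."""
    columns = list(zip(*h))  # one column per sample in the batch
    n = len(columns)
    gram = [[sum(x * y for x, y in zip(ci, cj)) for cj in columns] for ci in columns]
    trace = sum(gram[i][i] for i in range(n))  # equals ||H||_F^2
    squared = 0.0
    for i in range(n):
        for j in range(n):
            target = 1.0 / n if i == j else 0.0
            squared += (gram[i][j] / trace - target) ** 2
    return math.sqrt(squared)


def simulate(width=64, batch=4, depth=30, noise=0.1, seed=0):
    """Return the orthogonality gap at the input and after each layer."""
    rng = random.Random(seed)
    # Nearly aligned inputs: every sample is the same vector plus small noise.
    base = [rng.gauss(0.0, 1.0) for _ in range(width)]
    h = [[base[k] + noise * rng.gauss(0.0, 1.0) for _ in range(batch)]
         for k in range(width)]
    gaps = [orthogonality_gap(h)]
    for _ in range(depth):
        # Fresh random Gaussian weights for every layer.
        w = [[rng.gauss(0.0, 1.0) for _ in range(width)] for _ in range(width)]
        h = batch_norm(matmul(w, h))
        h = [[x / math.sqrt(width) for x in row] for row in h]
        gaps.append(orthogonality_gap(h))
    return gaps


if __name__ == "__main__":
    gaps = simulate()
    for layer in (0, 1, 2, 5, 10, 20, 30):
        print(f"layer {layer:2d}: orthogonality gap = {gaps[layer]:.3f}")
```

Under these assumptions one would expect the gap to fall sharply over the first few layers and then level off at a floor that gets smaller as `width` is increased, in line with the width-dependent term mentioned in the abstract. The small noise on the inputs matters: exactly identical samples give a rank-one batch, which this layer map cannot make orthogonal.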
Main file
Origin: Files produced by the author(s)

Dates and versions

hal-03454243, version 1 (29-11-2021)

Identifiers

  • HAL Id: hal-03454243, version 1

Cite

Hadi Daneshmand, Amir Joudaki, Francis Bach. Batch Normalization Orthogonalizes Representations in Deep Random Networks. NeurIPS 2021 - 35th Conference on Neural Information Processing Systems, Dec 2021, Virtual, France. ⟨hal-03454243⟩