Whitelam S, Selin V, Park SW, Tamblyn I. Correspondence between neuroevolution and gradient descent. Nat Commun 2021; 12:6317. [PMID: 34728632 PMCID: PMC8563972 DOI: 10.1038/s41467-021-26568-2]
[Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/26/2021] [Accepted: 10/04/2021] [Indexed: 11/10/2022]
Abstract
We show analytically that training a neural network by conditioned stochastic mutation, or neuroevolution, of its weights is equivalent, in the limit of small mutations, to gradient descent on the loss function in the presence of Gaussian white noise. Averaged over independent realizations of the learning process, neuroevolution is equivalent to gradient descent on the loss function. We use numerical simulation to show that this correspondence can be observed for finite mutations, for shallow and deep neural networks. Our results provide a connection between two families of neural-network training methods that are usually considered to be fundamentally different.
Gradient-based and non-gradient-based methods for training neural networks are usually considered to be fundamentally different. The authors derive, and illustrate numerically, an analytic equivalence between the dynamics of neural-network training under conditioned stochastic mutations and under gradient descent.
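The correspondence described in the abstract can be illustrated with a minimal numerical sketch. Everything specific here is an assumption made for illustration and is not taken from the record above: the toy quadratic loss, the Metropolis-style acceptance rule used as the "conditioning", the parameter names `beta` and `sigma`, and the effective learning rate `beta * sigma**2 / 2`, which follows from a small-step expansion of that acceptance rule. The sketch averages many independent mutation-based runs and compares the result with plain gradient descent.

```python
import numpy as np

def loss(x):
    # Toy quadratic loss U(x) = 0.5 * |x|^2, evaluated row-wise (an assumption).
    return 0.5 * np.sum(x**2, axis=-1)

def grad(x):
    # Analytic gradient of the toy loss.
    return x

def neuroevolution_mean(x0, beta, sigma, n_steps, n_runs, rng):
    """Average final weights over n_runs independent mutation-based runs."""
    x = np.tile(x0, (n_runs, 1))
    for _ in range(n_steps):
        # Propose a Gaussian mutation of every weight.
        proposal = x + sigma * rng.standard_normal(x.shape)
        delta = loss(proposal) - loss(x)
        # Assumed conditioning: accept with probability min(1, exp(-beta * delta)).
        p_accept = np.exp(-beta * np.maximum(delta, 0.0))
        accept = rng.random(n_runs) < p_accept
        x[accept] = proposal[accept]
    return x.mean(axis=0)

def gradient_descent(x0, lr, n_steps):
    """Plain noise-free gradient descent on the same loss."""
    x = x0.copy()
    for _ in range(n_steps):
        x = x - lr * grad(x)
    return x

rng = np.random.default_rng(0)
x0 = np.array([1.0, -0.5])
beta, sigma, n_steps = 4.0, 0.01, 2000

x_ne = neuroevolution_mean(x0, beta, sigma, n_steps, n_runs=4000, rng=rng)
# Effective learning rate from the small-mutation expansion (assumed form).
x_gd = gradient_descent(x0, lr=beta * sigma**2 / 2, n_steps=n_steps)

print("averaged neuroevolution:", x_ne)
print("gradient descent:       ", x_gd)
```

A single mutation-based run is noisy; only the average over runs is expected to track the gradient-descent trajectory, and the agreement should degrade as `beta * sigma` grows and the small-mutation assumption breaks down.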