Lessmann M, Würtz RP. Learning invariant object recognition from temporal correlation in a hierarchical network. Neural Netw 2014;54:70-84. [PMID: 24657573 DOI: 10.1016/j.neunet.2014.02.011]
[Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 07/19/2013] [Revised: 02/21/2014] [Accepted: 02/23/2014] [Indexed: 11/29/2022]
Abstract
Invariant object recognition, the recognition of object categories independently of conditions such as viewing angle, scale and illumination, is a task of great interest that humans perform much better than artificial systems. In recent years, several basic principles have been derived from neurophysiological observations and careful consideration: (1) developing invariance to possible transformations of the object by learning temporal sequences of visual features that occur during the respective alterations; (2) learning in a hierarchical structure, so that basic-level (visual) knowledge can be reused for different kinds of objects; (3) using feedback to compare predicted input with the current input in order to choose an interpretation when signals are ambiguous. In this paper we propose a network that implements all of these concepts in a computationally efficient manner and gives very good results on standard object datasets. Dynamically switching off weakly active neurons and pruning weights speeds up computation, which makes it possible to handle large databases with several thousand images and a number of categories of similar order. The involved parameters allow flexible adaptation to the information content of the training data and make tuning to different databases relatively easy. A precondition for successful learning is that training images are presented in an order that ensures that images of the same object under similar viewing conditions follow each other. Because it is implemented with sparse data structures, the system has moderate memory demands and still yields very good recognition rates.
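The abstract gives no implementation detail, so the following is only a minimal sketch of the general idea of switching off weakly active neurons, pruning small weights, and propagating activity through sparse structures. It is not the authors' method: the dictionary-based representation, the function names, and the parameters `threshold` and `min_abs` are all assumptions made for illustration.

```python
# Illustrative sketch only; names, data layout and thresholds are hypothetical
# and are not taken from the paper.

def switch_off_weak(activations, threshold):
    """Keep only neurons whose activation reaches `threshold`.

    `activations` maps neuron id -> activation; the result is a sparse dict
    containing just the neurons that stay switched on.
    """
    return {n: a for n, a in activations.items() if a >= threshold}


def prune_weights(weights, min_abs):
    """Drop connections whose absolute weight is below `min_abs`.

    `weights` maps presynaptic id -> {postsynaptic id -> weight}.
    Rows that end up with no connections are removed entirely.
    """
    pruned = {}
    for pre, row in weights.items():
        kept = {post: w for post, w in row.items() if abs(w) >= min_abs}
        if kept:
            pruned[pre] = kept
    return pruned


def propagate(active, weights):
    """Sum weighted input to the next layer, visiting only active neurons
    and only the connections that survived pruning."""
    out = {}
    for pre, a in active.items():
        for post, w in weights.get(pre, {}).items():
            out[post] = out.get(post, 0.0) + a * w
    return out


if __name__ == "__main__":
    activations = {0: 0.9, 1: 0.05, 2: 0.4}
    weights = {0: {10: 0.5, 11: 0.001}, 1: {10: 1.0}, 2: {11: 0.25}}
    active = switch_off_weak(activations, threshold=0.1)
    sparse_w = prune_weights(weights, min_abs=0.01)
    print(propagate(active, sparse_w))
```

In this sketch the work per layer scales with the number of active neurons and surviving connections rather than with the full layer size, which is the kind of saving the abstract attributes to sparsity.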