1
Caves EM, Davis AL, Nowicki S, Johnsen S. Backgrounds and the evolution of visual signals. Trends Ecol Evol 2024; 39:188-198. PMID: 37802667. DOI: 10.1016/j.tree.2023.09.006.
Abstract
Color signals, which mediate behavioral interactions across taxa and contexts, are often thought of as color 'patches': parts of an animal that appear colorful compared with other parts of that animal. Color patches, however, cannot be considered in isolation, because how a color is perceived depends on its visual background. This is of special relevance to the function and evolution of signals, because backgrounds give rise to a fundamental tradeoff between color signal detectability and discriminability: as its contrast with the background increases, a color patch becomes more detectable, but discriminating variation in that color becomes more difficult. Thus, the signal function of color patches can be fully understood only by considering patch and background together as an integrated whole.
Affiliation(s)
- Eleanor M Caves
- Department of Ecology, Evolution, and Marine Biology, University of California Santa Barbara, Santa Barbara, CA 93106, USA.
- Stephen Nowicki
- Department of Biology, Duke University, Durham, NC 27708, USA
- Sönke Johnsen
- Department of Biology, Duke University, Durham, NC 27708, USA
2
Luna R, Zabaleta I, Bertalmío M. State-of-the-art image and video quality assessment with a metric based on an intrinsically non-linear neural summation model. Front Neurosci 2023; 17:1222815. PMID: 37559700. PMCID: PMC10408451. DOI: 10.3389/fnins.2023.1222815.
Abstract
The development of automatic methods for image and video quality assessment that correlate well with the perception of human observers is a challenging open problem in vision science, with numerous practical applications in disciplines such as image processing and computer vision, as well as in the media industry. Over the past two decades, the goal of image quality research has been to improve upon classical metrics by developing models that emulate aspects of the visual system. While progress has been considerable, state-of-the-art quality assessment methods still share a number of shortcomings: their performance drops considerably when they are tested on a database quite different from the one used to train them, and they have significant limitations in predicting observer scores for high-framerate videos. In this work we propose a novel objective method for image and video quality assessment based on the recently introduced Intrinsically Non-linear Receptive Field (INRF) formulation, a neural summation model that has been shown to predict neural activity and visual perception phenomena better than the classical linear receptive field. We start by optimizing, on a classic image quality database, the four parameters of a very simple INRF-based metric, and then test this metric on three other databases, showing that its performance equals or surpasses that of state-of-the-art methods, some of which have millions of parameters. Next, we extend this INRF image quality metric to the temporal domain and test it on several popular video quality datasets; again, the proposed INRF-based video quality metric proves very competitive.
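The abstract describes a four-parameter INRF-based metric without giving its form. As a rough illustration of the general idea only, a full-reference score can be built by comparing nonlinear summation responses of the reference and distorted images. Everything below (the box-shaped surround, the signed-power nonlinearity, the parameter values) is an assumption for the sketch, not the paper's actual formulation:

```python
import numpy as np

def inrf_like_response(img, lam=0.5, gamma=0.7):
    """Toy nonlinear-summation response: a center-minus-surround term with a
    nonlinearity applied inside the summation. A simplified stand-in for the
    INRF idea, not the published model."""
    h, w = img.shape
    padded = np.pad(img, 1, mode="edge")
    # 3x3 box surround (an assumption; the paper optimizes its parameters)
    surround = sum(padded[i:i + h, j:j + w]
                   for i in range(3) for j in range(3)) / 9.0
    diff = img - surround
    # signed power nonlinearity inside the summation makes the response
    # intrinsically non-linear (not expressible as a single linear filter)
    return diff - lam * np.sign(diff) * np.abs(diff) ** gamma

def inrf_quality(ref, dist):
    """Full-reference score: mean absolute difference between the responses
    (0 = identical; larger = more perceptually different under the model)."""
    return float(np.mean(np.abs(inrf_like_response(ref)
                                - inrf_like_response(dist))))
```

A metric of this kind would be fit by tuning its few parameters (here `lam` and `gamma`) against observer scores on one database and then tested on others.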
Affiliation(s)
- Raúl Luna
- Institute of Optics, Spanish National Research Council (CSIC), Madrid, Spain
- Itziar Zabaleta
- Department of Information and Communication Technologies, Universitat Pompeu Fabra, Barcelona, Spain
- Marcelo Bertalmío
- Institute of Optics, Spanish National Research Council (CSIC), Madrid, Spain
3
Oganian Y, Bhaya-Grossman I, Johnson K, Chang EF. Vowel and formant representation in the human auditory speech cortex. Neuron 2023; 111:2105-2118.e4. PMID: 37105171. PMCID: PMC10330593. DOI: 10.1016/j.neuron.2023.04.004.
Abstract
Vowels, a fundamental component of human speech across all languages, are cued acoustically by formants, the resonance frequencies determined by the shape of the vocal tract during speaking. An outstanding question in neurolinguistics is how formants are processed neurally during speech perception. To address this, we collected high-density intracranial recordings from the human speech cortex on the superior temporal gyrus (STG) while participants listened to continuous speech. We found that two-dimensional receptive fields based on the first two formants provided the best characterization of vowel sound representation. Neural activity at single sites was highly selective for zones in this formant space. Furthermore, formant tuning was adjusted dynamically for speaker-specific spectral context. However, the entire population of formant-encoding sites was required to accurately decode single vowels. Overall, our results reveal that complex acoustic tuning in the two-dimensional formant space underlies local vowel representations in the STG. As a population code, this gives rise to phonological vowel perception.
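The core idea, sites tuned to zones of the two-dimensional (F1, F2) formant space, read out as a population to decode vowels, can be sketched as follows. The Gaussian tuning shape, bandwidths, and the (F1, F2) values (rough textbook means for three vowels) are illustrative assumptions, not the paper's measurements:

```python
import numpy as np

# Illustrative (F1, F2) means in Hz for three vowels; rough textbook
# values for American English, not measurements from the study.
VOWELS = {"i": (270, 2290), "a": (730, 1090), "u": (300, 870)}

def rf_response(f1, f2, center, bw=(150.0, 300.0)):
    """2D Gaussian receptive field over the first two formants:
    response falls off with distance from the site's preferred (F1, F2)."""
    (c1, c2), (b1, b2) = center, bw
    return np.exp(-0.5 * (((f1 - c1) / b1) ** 2 + ((f2 - c2) / b2) ** 2))

def decode_vowel(f1, f2):
    """Population readout: the vowel whose formant-space RF responds most
    to the presented (F1, F2) pair."""
    return max(VOWELS, key=lambda v: rf_response(f1, f2, VOWELS[v]))
```

A speaker-normalization step (shifting each site's tuning with the talker's spectral context, as the abstract describes) could be added by rescaling `f1`/`f2` before the readout.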
Affiliation(s)
- Yulia Oganian
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA
- Ilina Bhaya-Grossman
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA; University of California, Berkeley-University of California, San Francisco Graduate Program in Bioengineering, Berkeley, CA 94720, USA
- Keith Johnson
- Department of Linguistics, University of California, Berkeley, Berkeley, CA, USA
- Edward F Chang
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA
4
Doerig A, Sommers RP, Seeliger K, Richards B, Ismael J, Lindsay GW, Kording KP, Konkle T, van Gerven MAJ, Kriegeskorte N, Kietzmann TC. The neuroconnectionist research programme. Nat Rev Neurosci 2023. PMID: 37253949. DOI: 10.1038/s41583-023-00705-w.
Abstract
Artificial neural networks (ANNs) inspired by biology are beginning to be widely used to model behavioural and neural data, an approach we call 'neuroconnectionism'. ANNs have been not only lauded as the current best models of information processing in the brain but also criticized for failing to account for basic cognitive functions. In this Perspective article, we propose that arguing about the successes and failures of a restricted set of current ANNs is the wrong approach to assess the promise of neuroconnectionism for brain science. Instead, we take inspiration from the philosophy of science, and in particular from Lakatos, who showed that the core of a scientific research programme is often not directly falsifiable but should be assessed by its capacity to generate novel insights. Following this view, we present neuroconnectionism as a general research programme centred around ANNs as a computational language for expressing falsifiable theories about brain computation. We describe the core of the programme, the underlying computational framework and its tools for testing specific neuroscientific hypotheses and deriving novel understanding. Taking a longitudinal view, we review past and present neuroconnectionist projects and their responses to challenges and argue that the research programme is highly progressive, generating new and otherwise unreachable insights into the workings of the brain.
Affiliation(s)
- Adrien Doerig
- Institute of Cognitive Science, University of Osnabrück, Osnabrück, Germany.
- Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands.
- Rowan P Sommers
- Department of Neurobiology of Language, Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Katja Seeliger
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Blake Richards
- Department of Neurology and Neurosurgery, McGill University, Montréal, QC, Canada
- School of Computer Science, McGill University, Montréal, QC, Canada
- Mila, Montréal, QC, Canada
- Montréal Neurological Institute, Montréal, QC, Canada
- Learning in Machines and Brains Program, CIFAR, Toronto, ON, Canada
- Konrad P Kording
- Learning in Machines and Brains Program, CIFAR, Toronto, ON, Canada
- Bioengineering, Neuroscience, University of Pennsylvania, Philadelphia, PA, USA
- Tim C Kietzmann
- Institute of Cognitive Science, University of Osnabrück, Osnabrück, Germany
5
Parker D. Neurobiological reduction: From cellular explanations of behavior to interventions. Front Psychol 2022; 13:987101. PMID: 36619115. PMCID: PMC9815460. DOI: 10.3389/fpsyg.2022.987101.
Abstract
Scientific reductionism, the view that higher-level functions can be explained by properties at some lower level or levels, has been an assumption of nervous system analyses since the acceptance of the neuron doctrine in the late 19th century, and became a dominant experimental approach with the development of intracellular recording techniques in the mid-20th century. Subsequent refinements of electrophysiological approaches and the continual development of molecular and genetic techniques have promoted a focus on molecular and cellular mechanisms in experimental analyses and explanations of sensory, motor, and cognitive functions. Reductionist assumptions have also influenced our views of the etiology and treatment of psychopathologies, and have more recently led to claims that we can, or even should, pharmacologically enhance the normal brain. Reductionism remains an area of active debate in the philosophy of science. In neuroscience and psychology, the debate typically focuses on the mind-brain question and the mechanisms of cognition, and how, or whether, they can be explained in neurobiological terms. However, these debates are affected by the complexity of the phenomena being considered and the difficulty of obtaining the necessary neurobiological detail. We can instead ask whether features identified in neurobiological analyses of simpler aspects of behavior in simpler nervous systems support current molecular and cellular approaches to explaining systems or behaviors. While my view is that they do not, this does not license the opposing view, prevalent in dichotomous thinking, that molecular and cellular detail is irrelevant and that we should focus instead on computations or representations. We need to consider how to address the long-standing dilemma of how a nervous system that ostensibly functions through discrete cell-to-cell communication can generate population effects across multiple spatial and temporal scales to produce behavior.
6
Sugito Y, Vazquez-Corral J, Canham T, Bertalmio M. Image Quality Evaluation in Professional HDR/WCG Production Questions the Need for HDR Metrics. IEEE Trans Image Process 2022; 31:5163-5177. PMID: 35853056. DOI: 10.1109/tip.2022.3190706.
Abstract
In the quality evaluation of high dynamic range and wide color gamut (HDR/WCG) images, a number of works have concluded that native HDR metrics, such as the HDR visual difference predictor (HDR-VDP), the HDR video quality metric (HDR-VQM), or convolutional neural network (CNN)-based visibility metrics for HDR content, provide the best results. These metrics consider only the luminance component, but several color difference metrics have been specifically developed for, and validated with, HDR/WCG images. In this paper, we perform subjective evaluation experiments in a professional HDR/WCG production setting, under a real use-case scenario. The results are quite relevant: they show, first, that the performance of HDR metrics is worse than that of a classic, simple standard dynamic range (SDR) metric applied directly to the HDR content; and second, that the chrominance metrics specifically developed for HDR/WCG imaging correlate poorly with observer scores and are also outperformed by an SDR metric. Based on these findings, we show how a very simple framework for creating color HDR metrics that uses only luminance SDR metrics, transfer functions, and classic color spaces is able to consistently outperform, by a considerable margin, state-of-the-art HDR metrics on a varied set of HDR content, for both perceptual quantization (PQ) and Hybrid Log-Gamma (HLG) encoding, for luminance and chroma distortions, and on different color spaces in common use.
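The framework described, applying a classic SDR metric after a transfer function, can be sketched with the SMPTE ST 2084 (PQ) inverse EOTF followed by PSNR on the encoded luminance. The choice of PSNR and the luminance-only handling are assumptions for illustration; the paper evaluates several SDR metrics, encodings, and color spaces:

```python
import numpy as np

# SMPTE ST 2084 (PQ) inverse-EOTF constants
M1, M2 = 2610 / 16384, 2523 / 4096 * 128
C1, C2, C3 = 3424 / 4096, 2413 / 4096 * 32, 2392 / 4096 * 32

def pq_encode(lum_cd_m2):
    """Map absolute luminance (cd/m^2, up to 10000) to a [0, 1] PQ signal,
    i.e. a perceptually motivated transfer function."""
    y = np.clip(np.asarray(lum_cd_m2, dtype=float) / 10000.0, 0.0, 1.0) ** M1
    return ((C1 + C2 * y) / (1.0 + C3 * y)) ** M2

def hdr_psnr(ref_lum, dist_lum):
    """Apply a simple SDR metric (PSNR, an assumption for this sketch)
    to PQ-encoded luminance, in the spirit of the framework above."""
    mse = np.mean((pq_encode(ref_lum) - pq_encode(dist_lum)) ** 2)
    return np.inf if mse == 0 else 10.0 * np.log10(1.0 / mse)
```

Swapping `pq_encode` for an HLG transfer, or PSNR for SSIM, stays within the same recipe: transfer function first, classic SDR metric second.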
7
Gomez-Villa A, Martín A, Vazquez-Corral J, Bertalmío M, Malo J. On the synthesis of visual illusions using deep generative models. J Vis 2022; 22:2. PMID: 35833884. PMCID: PMC9290318. DOI: 10.1167/jov.22.8.2.
Abstract
Visual illusions expand our understanding of the visual system by imposing constraints on models in two different ways: i) visual illusions for humans should induce equivalent illusions in the model, and ii) illusions synthesized from the model should be compelling for human viewers too. These constraints offer alternative strategies for finding good vision models. Following the first strategy, recent studies have shown that artificial neural network architectures also have human-like illusory percepts when stimulated with classical hand-crafted stimuli designed to fool humans. In this work we focus on the second (less explored) strategy: we propose a framework to synthesize new visual illusions using the optimization abilities of current automatic differentiation techniques. The proposed framework can be used with classical vision models as well as with more recent artificial neural network architectures. This framework, validated by psychophysical experiments, can be used to study the differences between a vision model and actual human perception, and to optimize the vision model to decrease those differences.
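The synthesis strategy, optimizing a stimulus so that a model's responses diverge where the physical input does not, can be sketched with a toy model. Here finite differences stand in for automatic differentiation, and the lateral-inhibition model, image size, and step size are all illustrative assumptions; the optimum is a simultaneous-contrast display (equal grays on dark vs. bright surrounds):

```python
import numpy as np

def model_response(img):
    """Toy vision model: pixel value minus its 3x3 local mean
    (a stand-in for a differentiable model with lateral inhibition)."""
    h, w = img.shape
    padded = np.pad(img, 1, mode="edge")
    local_mean = sum(padded[i:i + h, j:j + w]
                     for i in range(3) for j in range(3)) / 9.0
    return img - local_mean

def objective(img, p1, p2):
    """Squared response difference at two physically identical pixels."""
    r = model_response(img)
    return (r[p1] - r[p2]) ** 2

def synthesize_illusion(shape=(5, 9), steps=60, lr=0.5, eps=1e-4, seed=0):
    """Gradient ascent on the surround (finite differences stand in for
    autodiff) so the model 'sees' two equal gray targets as different."""
    rng = np.random.default_rng(seed)
    img = rng.random(shape)
    p1, p2 = (2, 2), (2, 6)
    img[p1] = img[p2] = 0.5          # the two identical target pixels
    for _ in range(steps):
        base = objective(img, p1, p2)
        grad = np.zeros_like(img)
        for idx in np.ndindex(shape):
            if idx in (p1, p2):
                continue             # targets stay fixed and identical
            img[idx] += eps
            grad[idx] = (objective(img, p1, p2) - base) / eps
            img[idx] -= eps
        img = np.clip(img + lr * grad, 0.0, 1.0)
        img[p1] = img[p2] = 0.5
    return img, p1, p2
```

With a genuinely differentiable model, the finite-difference loop would be replaced by one autodiff backward pass per step, which is what makes the approach scale to deep architectures.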
Affiliation(s)
- Alex Gomez-Villa
- Computer Vision Center, Universitat Autònoma de Barcelona, Barcelona, Spain
- Adrián Martín
- Department of Information and Communications Technologies, Universitat Pompeu Fabra, Barcelona, Spain
- Javier Vazquez-Corral
- Computer Science Department, Universitat Autònoma de Barcelona and Computer Vision Center, Barcelona, Spain
- Jesús Malo
- Image Processing Lab, Faculty of Physics, Universitat de València, Spain
8
Validation of a Saliency Map for Assessing Image Quality in Nuclear Medicine: Experimental Study Outcomes. Radiation 2022. DOI: 10.3390/radiation2030018.
Abstract
Recently, the use of saliency maps to evaluate the quality of nuclear medicine images has been reported. However, that study only compared qualitative visual evaluations and did not perform a quantitative assessment. The aim of the present study was to demonstrate that saliency maps (calculated from intensity and flicker) can be used to assess nuclear medicine image quality, by comparison with evaluators' gaze data obtained from an eye-tracking device. We created 972 positron emission tomography images by varying the position of the hot sphere, the imaging time, and the number of iterations in the iterative reconstruction. Pearson's correlation coefficient between the saliency map calculated from each image and the evaluator's gaze data during image presentation was then computed. A strong correlation (r ≥ 0.94) was observed between the intensity-based saliency map and the evaluator's gaze data. This trend was also observed in images obtained from a clinical device. For short acquisition times, gaze at the hot-sphere position was higher for images with fewer iterations in the iterative reconstruction; however, no differences across iterations were found when the acquisition time increased. Saliency by flicker could be applied to clinical images without preprocessing, although, compared with the gaze image, it increased slowly.
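The quantitative comparison reported here is Pearson's correlation between a saliency map and gaze data. A minimal sketch, assuming both are same-shape 2D maps that can be flattened (saliency computation itself is not reproduced), is:

```python
import numpy as np

def map_correlation(saliency, gaze_density):
    """Pearson's r between a saliency map and an eye-tracking gaze-density
    map of the same shape; both maps are flattened before correlating."""
    s = np.ravel(saliency).astype(float)
    g = np.ravel(gaze_density).astype(float)
    s -= s.mean()
    g -= g.mean()
    return float(s @ g / np.sqrt((s @ s) * (g @ g)))
```

In a study of this kind, the gaze-density map would typically be a fixation heatmap accumulated over the presentation period before this comparison is made.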
9
Abstract
Three decades ago, Atick et al. suggested that human frequency sensitivity may emerge from the enhancement required for a more efficient analysis of retinal images. Here we reassess the relevance of low-level vision tasks in the explanation of the contrast sensitivity functions (CSFs) in light of 1) the current trend of using artificial neural networks for studying vision, and 2) the current knowledge of retinal image representations. As a first contribution, we show that a very popular type of convolutional neural network (CNN), the autoencoder, may develop human-like CSFs in the spatiotemporal and chromatic dimensions when trained to perform some basic low-level vision tasks (like retinal noise and optical blur removal), but not others (like chromatic adaptation or pure reconstruction after simple bottlenecks). As an illustrative example, the best CNN (in the considered set of simple architectures for enhancement of the retinal signal) reproduces the CSFs with a root mean square error of 11% of the maximum sensitivity. As a second contribution, we provide experimental evidence that, for some functional goals (at a low abstraction level), deeper CNNs that are better at reaching the quantitative goal are actually worse at replicating human-like phenomena (such as the CSFs). This low-level result (for the explored networks) is not necessarily in contradiction with other works that report advantages of deeper nets in modeling higher-level vision goals. However, in line with a growing body of literature, our results suggest another word of caution about CNNs in vision science, because the use of simplified units or unrealistic architectures in goal optimization may be a limitation for the modeling and understanding of human vision.
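The general procedure of reading a CSF off a model, probing it with sinusoidal gratings across frequencies and taking the response amplitude as a sensitivity estimate, can be sketched with a toy bandpass "retinal" filter. The difference-of-boxcars filter and all sizes below are assumptions; the paper measures CSFs of trained autoencoders, not a hand-built filter:

```python
import numpy as np

def grating(freq_cycles, n=128, contrast=1.0):
    """1D sinusoidal grating (cycles per image) around mean luminance 0.5."""
    x = np.arange(n) / n
    return 0.5 + 0.5 * contrast * np.sin(2 * np.pi * freq_cycles * x)

def box_blur(signal, width):
    """Circular moving average."""
    padded = np.pad(signal, width // 2, mode="wrap")
    return np.convolve(padded, np.ones(width) / width, mode="valid")

def response_amplitude(signal, center_w=3, surround_w=9):
    """Difference-of-boxcars 'retinal' stand-in (center blur minus surround
    blur): a bandpass filter whose response amplitude to a unit-contrast
    grating serves as a crude sensitivity estimate at that frequency."""
    response = box_blur(signal, center_w) - box_blur(signal, surround_w)
    return float(np.ptp(response)) / 2.0

def model_csf(freqs, n=128):
    """Probe the model with gratings across spatial frequencies."""
    return [response_amplitude(grating(f, n)) for f in freqs]
```

The same probing loop applies unchanged to a trained autoencoder: feed it gratings, measure the modulation of its output, and plot amplitude against frequency to compare with the human CSF.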
Affiliation(s)
- Qiang Li
- Image Processing Lab, Parc Científic, Universitat de València, Spain
- Alex Gomez-Villa
- Computer Vision Center, Universitat Autònoma de Barcelona, Spain
- Marcelo Bertalmío
- Instituto de Óptica, Spanish National Research Council (CSIC), Spain
- Jesús Malo
- Image Processing Lab, Parc Científic, Universitat de València, Spain (http://isp.uv.es)