1
Gil Rodríguez R, Hedjar L, Toscani M, Guarnera D, Guarnera GC, Gegenfurtner KR. Color constancy mechanisms in virtual reality environments. J Vis 2024; 24(5):6. [PMID: 38727688 PMCID: PMC11098049 DOI: 10.1167/jov.24.5.6]
Abstract
Prior research has demonstrated high levels of color constancy in real-world scenarios featuring single light sources, extensive fields of view, and prolonged adaptation periods. However, exploring the specific cues humans rely on becomes challenging, if not infeasible, with actual objects and lighting conditions. To circumvent these obstacles, we employed virtual reality technology to craft immersive, realistic settings that can be manipulated in real time. We designed forest and office scenes illuminated by lights of five different colors. Participants selected the test object most resembling a previously shown achromatic reference. To study color constancy mechanisms, we modified the scenes to neutralize three contributors: local surround (placing a uniformly colored leaf under the test objects), maximum flux (keeping the brightest object constant), and spatial mean (maintaining a neutral average light reflectance), employing two methods for the latter: changing object reflectances or introducing new elements. We found that color constancy was high when all cues were present, in line with past research. However, removing individual cues had varied impacts on constancy. Neutralizing the local surround significantly reduced performance, especially under green illumination, revealing a strong interaction between greenish light and rose-colored contexts. In contrast, the maximum flux mechanism barely affected performance, challenging assumptions used in white-balancing algorithms. The spatial mean experiment showed disparate effects: adding objects slightly impacted performance, while changing reflectances nearly eliminated constancy, suggesting that human color constancy relies more on scene interpretation than on pixel-based calculations.
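The three cues manipulated here correspond to classical illuminant estimates from computational color constancy. As a rough illustration only (not the study's code, and with annulus radii that are purely hypothetical), a minimal sketch of how each cue can be computed from a linear RGB image:

```python
# Minimal sketch of the three classical illuminant cues the study manipulates:
# spatial mean (grey world), maximum flux (white patch), and local surround.
# Not the study's code; the annulus radii are illustrative assumptions.
import numpy as np

def grey_world(img):
    """Spatial-mean cue: assume the scene's average color reflects the illuminant."""
    return img.reshape(-1, 3).mean(axis=0)

def white_patch(img):
    """Maximum-flux cue: assume the brightest values reflect the illuminant."""
    return img.reshape(-1, 3).max(axis=0)

def local_surround(img, cy, cx, inner=5, outer=15):
    """Local-surround cue: mean color in an annulus around a test location."""
    h, w, _ = img.shape
    y, x = np.mgrid[0:h, 0:w]
    r = np.hypot(y - cy, x - cx)
    return img[(r > inner) & (r <= outer)].mean(axis=0)

# img is assumed to be a linear RGB array of shape (H, W, 3) in [0, 1].
```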
Affiliation(s)
- Laysa Hedjar
- Psychology Department, Justus-Liebig University, Giessen, Germany
- Matteo Toscani
- Psychology Department, Bournemouth University, Poole, UK
- Dar'ya Guarnera
- School of Arts and Creative Technologies, University of York, York, UK
2
Yoon MS, Kwon G, Oh J, Ryu J, Lim J, Kang BK, Lee J, Han DK. Effect of Contrast Level and Image Format on a Deep Learning Algorithm for the Detection of Pneumothorax with Chest Radiography. J Digit Imaging 2023; 36:1237-1247. [PMID: 36698035 PMCID: PMC10287877 DOI: 10.1007/s10278-022-00772-y]
Abstract
Because of the black-box nature of deep learning models, it is unclear how changes in contrast level and image format affect their performance. We aimed to investigate the effect of contrast level and image format on the performance of deep learning for diagnosing pneumothorax on chest radiographs. We collected 3,316 images (1,016 pneumothorax and 2,300 normal images); all images were set to the standard contrast level (100%) and stored in Digital Imaging and Communications in Medicine (DICOM) and Joint Photographic Experts Group (JPEG) formats. Data were randomly split into training (80%) and test (20%) sets, and the contrast of images in the test set was adjusted to five levels (50%, 75%, 100%, 125%, and 150%). We trained a ResNet-50 model to detect pneumothorax using 100%-level images and tested it with the five contrast levels in both formats. When comparing overall performance across contrast levels within each format, the area under the receiver operating characteristic curve (AUC) differed significantly (all p < 0.001), except between the 125% and 150% levels in JPEG format (p = 0.382). When comparing the two formats at the same contrast level, the AUC differed significantly (all p < 0.001), except at the 50% and 100% levels (p = 0.079 and p = 0.082, respectively). The contrast level and format of medical images can thus influence the performance of a deep learning model. Training with images at various contrast levels and in various formats, along with further image processing, is required to improve and maintain performance.
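The test-time protocol lends itself to a compact sketch: shift the contrast of each test image, score it with the trained model, and compute the AUC per level. The code below is an assumption-laden illustration, not the authors' pipeline; `model`, `test_images`, and `test_labels` are hypothetical names, and the model is assumed to emit a single pneumothorax logit.

```python
# Hedged sketch of the test-time protocol described above (not the authors' code).
import torch
from PIL import Image, ImageEnhance
from torchvision import transforms
from sklearn.metrics import roc_auc_score

to_tensor = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),  # normalization omitted for brevity
])

def set_contrast(img: Image.Image, level: float) -> Image.Image:
    """Rescale contrast; level=1.0 is the unchanged 100% baseline."""
    return ImageEnhance.Contrast(img).enhance(level)

@torch.no_grad()
def auc_at_contrast(model, images, labels, level):
    """AUC of `model` on `images` after shifting their contrast to `level`."""
    model.eval()
    scores = [torch.sigmoid(model(to_tensor(set_contrast(im, level)).unsqueeze(0))).item()
              for im in images]
    return roc_auc_score(labels, scores)

# The five levels from the study: 50%, 75%, 100%, 125%, 150%.
# for level in (0.5, 0.75, 1.0, 1.25, 1.5):
#     print(level, auc_at_contrast(model, test_images, test_labels, level))
```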
Affiliation(s)
- Myeong Seong Yoon
- Department of Emergency Medicine, College of Medicine, Hanyang University, 222 Wangsimni-Ro, Seongdong-Gu, Seoul, 04763, Republic of Korea
- Machine Learning Research Center for Medical Data, Hanyang University, 222 Wangsimni-Ro, Seongdong-Gu, Seoul, 04763, Republic of Korea
- Department of Radiological Science, Eulji University, 553 Sanseong-daero, Seongnam-si, Gyeonggi Do, 13135, Republic of Korea
- Gitaek Kwon
- Department of Computer Science, Hanyang University, 222 Wangsimni-Ro, Seongdong-Gu, Seoul, 04763, Republic of Korea
- VUNO, Inc, 479 Gangnam-daero, Seocho-gu, Seoul, 06541, Republic of Korea
- Jaehoon Oh
- Department of Emergency Medicine, College of Medicine, Hanyang University, 222 Wangsimni-Ro, Seongdong-Gu, Seoul, 04763, Republic of Korea.
- Machine Learning Research Center for Medical Data, Hanyang University, 222 Wangsimni-Ro, Seongdong-Gu, Seoul, 04763, Republic of Korea.
- Jongbin Ryu
- Department of Software and Computer Engineering, Ajou University, 206 World cup-ro, Suwon-si, Gyeonggi Do, 16499, Republic of Korea.
- Jongwoo Lim
- Department of Computer Science, Hanyang University, 222 Wangsimni-Ro, Seongdong-Gu, Seoul, 04763, Republic of Korea
- Machine Learning Research Center for Medical Data, Hanyang University, 222 Wangsimni-Ro, Seongdong-Gu, Seoul, 04763, Republic of Korea
- Bo-Kyeong Kang
- Machine Learning Research Center for Medical Data, Hanyang University, 222 Wangsimni-Ro, Seongdong-Gu, Seoul, 04763, Republic of Korea
- Department of Radiology, College of Medicine, Hanyang University, 222 Wangsimni-Ro, Seongdong-Gu, Seoul, 04763, Republic of Korea
- Juncheol Lee
- Department of Emergency Medicine, College of Medicine, Hanyang University, 222 Wangsimni-Ro, Seongdong-Gu, Seoul, 04763, Republic of Korea
- Dong-Kyoon Han
- Department of Radiological Science, Eulji University, 553 Sanseong-daero, Seongnam-si, Gyeonggi Do, 13135, Republic of Korea
3
Akbarinia A, Morgenstern Y, Gegenfurtner KR. Contrast sensitivity function in deep networks. Neural Netw 2023; 164:228-244. [PMID: 37156217 DOI: 10.1016/j.neunet.2023.04.032]
Abstract
The contrast sensitivity function (CSF) is a fundamental signature of the visual system that has been measured extensively in several species. It is defined by the visibility threshold for sinusoidal gratings across all spatial frequencies. Here, we investigated the CSF in deep neural networks using the same two-alternative forced-choice (2AFC) contrast detection paradigm as in human psychophysics. We examined 240 networks pretrained on several tasks. To obtain their corresponding CSFs, we trained a linear classifier on top of the features extracted from the frozen pretrained networks. The linear classifier is trained exclusively on a contrast discrimination task with natural images: it has to decide which of two input images has higher contrast. The network's CSF is then measured by detecting which of two images contains a sinusoidal grating of varying orientation and spatial frequency. Our results demonstrate that characteristics of the human CSF are manifested in deep networks both in the luminance channel (a band-limited, inverted-U-shaped function) and in the chromatic channels (two low-pass functions with similar properties). The exact shape of the networks' CSF appears to be task dependent. The human CSF is better captured by networks trained on low-level visual tasks such as image denoising or autoencoding. However, a human-like CSF also emerges in mid- and high-level tasks such as edge detection and object recognition. Our analysis shows that a human-like CSF appears in all architectures but at different depths of processing, in some cases at early layers and in others at intermediate and final layers. Overall, these results suggest that (i) deep networks model the human CSF faithfully, making them suitable candidates for applications in image quality and compression; (ii) efficient, purposeful processing of the natural world drives the CSF shape; and (iii) visual representations from all levels of the visual hierarchy contribute to the tuning curve of the CSF, implying that a function we intuitively think of as modulated by low-level visual features may arise as a consequence of pooling from a larger set of neurons at all levels of the visual system.
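The probing setup is simple to sketch: frozen backbone features feed a linear read-out that compares two images. The code below is a hedged illustration of that idea, not the authors' implementation; the ResNet-50 backbone, the image size, and the probe are assumptions, and probe training on natural-image contrast discrimination is not shown.

```python
# Hedged sketch of a linear probe on frozen features for 2AFC contrast judgments.
import numpy as np
import torch
import torch.nn as nn
from torchvision import models

def grating(size, cycles_per_image, orientation_deg, contrast, phase=0.0):
    """Sinusoidal luminance grating in [0, 1] around a 0.5 mean."""
    y, x = np.mgrid[0:size, 0:size] / size
    theta = np.deg2rad(orientation_deg)
    u = x * np.cos(theta) + y * np.sin(theta)
    return 0.5 + 0.5 * contrast * np.sin(2 * np.pi * cycles_per_image * u + phase)

backbone = models.resnet50(weights="IMAGENET1K_V2")
backbone.fc = nn.Identity()               # expose the 2048-d features
for p in backbone.parameters():
    p.requires_grad = False               # frozen pretrained network

probe = nn.Linear(2048, 1)                # would be trained on contrast discrimination

@torch.no_grad()
def prefers_first(img_a, img_b):
    """2AFC read-out: True if the probe scores img_a as higher contrast."""
    def feat(img):
        x = torch.tensor(img, dtype=torch.float32).expand(3, -1, -1).unsqueeze(0)
        return backbone(x)
    return probe(feat(img_a)).item() > probe(feat(img_b)).item()

# e.g., grating vs. uniform gray at a near-threshold contrast:
# prefers_first(grating(224, 8, 45, 0.02), np.full((224, 224), 0.5))
```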
Affiliation(s)
- Arash Akbarinia
- Department of Experimental Psychology, University of Giessen, Germany.
- Yaniv Morgenstern
- Department of Experimental Psychology, University of Giessen, Germany; Faculty of Psychology and Educational Sciences, KU Leuven, Belgium
4
Heinke D, Leonardis A, Leek EC. What do deep neural networks tell us about biological vision? Vision Res 2022; 198:108069. [PMID: 35561463 DOI: 10.1016/j.visres.2022.108069]
Affiliation(s)
- Dietmar Heinke
- School of Psychology, University of Birmingham, United Kingdom.
- Ales Leonardis
- School of Computer Science, University of Birmingham, United Kingdom
- E Charles Leek
- Department of Psychology, University of Liverpool, United Kingdom
5
Flachot A, Akbarinia A, Schütt HH, Fleming RW, Wichmann FA, Gegenfurtner KR. Deep neural models for color classification and color constancy. J Vis 2022; 22(4):17. [PMID: 35353153 PMCID: PMC8976922 DOI: 10.1167/jov.22.4.17]
Abstract
Color constancy is our ability to perceive constant colors across varying illuminations. Here, we trained deep neural networks to be color constant and evaluated their performance with varying cues. Inputs to the networks consisted of two-dimensional images of simulated cone excitations derived from three-dimensional (3D) rendered scenes of 2,115 different 3D shapes, with spectral reflectances of 1,600 different Munsell chips, illuminated under 278 different natural illuminations. The models were trained to classify the reflectance of the objects. Testing was done with four new illuminations with equally spaced CIE L*a*b* chromaticities, two along the daylight locus and two orthogonal to it. High levels of color constancy were achieved with different deep neural networks, and constancy was higher along the daylight locus. When gradually removing cues from the scene, constancy decreased. Both ResNets and classical ConvNets of varying degrees of complexity performed well. However, DeepCC, our simplest sequential convolutional network, represented colors along the three color dimensions of human color vision, while ResNets showed a more complex representation.
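The training objective, classifying object reflectance regardless of illumination, is the core of the setup and is easy to illustrate. The toy network below is only a stand-in sketch for that objective (not the authors' DeepCC architecture); the layer sizes and the three-channel cone-excitation input format are assumptions.

```python
# Hedged sketch: reflectance classification across illuminants forces the
# network to discount illumination, i.e., to become color constant.
import torch
import torch.nn as nn

N_MUNSELL = 1600  # reflectance classes, per the abstract

class TinyCC(nn.Module):
    def __init__(self, n_classes=N_MUNSELL):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2, padding=2), nn.ReLU(),  # 3 cone channels in
            nn.Conv2d(32, 64, 5, stride=2, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4),
        )
        self.classifier = nn.Linear(64 * 4 * 4, n_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

model = TinyCC()
loss_fn = nn.CrossEntropyLoss()  # target: the chip's reflectance class,
                                 # identical under every illuminant
```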
Affiliation(s)
- Alban Flachot
- Abteilung Allgemeine Psychologie, Justus Liebig University, Giessen, Germany
- Arash Akbarinia
- Abteilung Allgemeine Psychologie, Justus Liebig University, Giessen, Germany
- Heiko H Schütt
- Center for Neural Science, New York University, New York, NY, USA
- Roland W Fleming
- Experimental Psychology, Justus Liebig University, Giessen, Germany
- Felix A Wichmann
- Neural Information Processing Group, University of Tübingen, Germany
- Karl R Gegenfurtner
- Abteilung Allgemeine Psychologie, Justus Liebig University, Giessen, Germany
6
Biological convolutions improve DNN robustness to noise and generalisation. Neural Netw 2021; 148:96-110. [PMID: 35114495 DOI: 10.1016/j.neunet.2021.12.005]
Abstract
Deep convolutional neural networks (DNNs) have achieved superhuman accuracy on standard image classification benchmarks. Their success has reignited significant interest in their use as models of the primate visual system, bolstered by claims of architectural and representational similarities. However, closer scrutiny of these models suggests that they rely on various forms of shortcut learning to achieve their impressive performance, such as using texture rather than shape information. Such superficial solutions to image recognition have been shown to make DNNs brittle in the face of more challenging tests, such as noise-perturbed or out-of-distribution images, casting doubt on their similarity to their biological counterparts. In the present work, we demonstrate that adding fixed biological filter banks, in particular banks of Gabor filters, helps to constrain the networks to avoid reliance on shortcuts, leading them to develop more structured internal representations and greater tolerance to noise. Importantly, they also gained around 20-35% in accuracy over standard end-to-end trained architectures when generalising to our novel out-of-distribution test image sets. We take these findings to suggest that these properties of the primate visual system should be incorporated into DNNs to make them better able to cope with real-world vision and better capture some of the more impressive aspects of human visual perception, such as generalisation.
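The central manipulation, replacing a learned first convolution with a fixed Gabor bank, can be sketched directly. The code below is a hedged illustration under assumed filter parameters, not the paper's implementation:

```python
# Hedged sketch: freeze the first convolution of a CNN to a fixed Gabor bank.
import numpy as np
import torch
import torch.nn as nn

def gabor_kernel(size, wavelength, orientation, sigma, phase=0.0):
    """2D Gabor: a sinusoidal carrier windowed by a Gaussian envelope."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    u = x * np.cos(orientation) + y * np.sin(orientation)
    v = -x * np.sin(orientation) + y * np.cos(orientation)
    envelope = np.exp(-(u**2 + v**2) / (2 * sigma**2))
    return (envelope * np.cos(2 * np.pi * u / wavelength + phase)).astype(np.float32)

def gabor_bank(n_orientations=8, wavelengths=(4, 8), size=15, sigma=3.0):
    """Stack of fixed Gabor kernels spanning orientations and scales."""
    kernels = [gabor_kernel(size, w, o, sigma)
               for w in wavelengths
               for o in np.linspace(0, np.pi, n_orientations, endpoint=False)]
    return torch.from_numpy(np.stack(kernels)).unsqueeze(1)  # (N, 1, k, k)

bank = gabor_bank()
conv1 = nn.Conv2d(1, bank.shape[0], bank.shape[-1],
                  padding=bank.shape[-1] // 2, bias=False)
conv1.weight.data = bank
conv1.weight.requires_grad = False  # fixed, biologically inspired front end
```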
7
Metzger A, Toscani M, Akbarinia A, Valsecchi M, Drewing K. Deep neural network model of haptic saliency. Sci Rep 2021; 11:1395. [PMID: 33446756 PMCID: PMC7809404 DOI: 10.1038/s41598-020-80675-6]
Abstract
Haptic exploration usually involves stereotypical, systematic movements that are adapted to the task. Here we tested whether exploration movements are also driven by physical stimulus features. We designed haptic stimuli whose surface relief varied locally in spatial frequency, height, orientation, and anisotropy. In Experiment 1, participants explored two stimuli in succession in order to decide whether they were the same or different. We trained a variational autoencoder to predict the spatial distribution of touch duration from the surface relief of the haptic stimuli. The model successfully predicted where participants touched the stimuli. It could also predict participants' touch distributions from the stimulus surface relief when tested with two new groups of participants, who performed a different task (Experiment 2) or explored different stimuli (Experiment 3). We further generated a large number of virtual surface reliefs (each uniformly expressing a certain combination of features) and correlated the model's responses with stimulus properties to infer which stimulus features were preferentially touched by participants. Our results indicate that haptic exploratory behavior is to some extent driven by the physical features of the stimuli, with, for example, edge-like structures, vertical and horizontal patterns, and rough regions being explored in more detail.
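The modelling idea, mapping a surface-relief height map to a spatial distribution of touch duration, can be sketched compactly. Note the hedge: the paper trains a variational autoencoder, whereas the toy network below substitutes a plain convolutional encoder-decoder with assumed layer sizes, purely to illustrate the input-output mapping.

```python
# Hedged sketch: height map in, normalized touch-duration map out.
import torch
import torch.nn as nn

class ReliefToTouchMap(nn.Module):
    def __init__(self):
        super().__init__()
        self.encode = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decode = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1),
        )

    def forward(self, relief):
        logits = self.decode(self.encode(relief))
        b, _, h, w = logits.shape
        # normalize to a spatial probability map of touch duration
        return torch.softmax(logits.flatten(1), dim=1).view(b, 1, h, w)

model = ReliefToTouchMap()
relief = torch.randn(1, 1, 64, 64)  # toy height map
touch_map = model(relief)           # nonnegative, sums to 1 over space
```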
Affiliation(s)
- Anna Metzger
- Justus-Liebig University Giessen, 35394, Giessen, Germany.
- Matteo Toscani
- Justus-Liebig University Giessen, 35394, Giessen, Germany
- Knut Drewing
- Justus-Liebig University Giessen, 35394, Giessen, Germany