1
Wang W, Luo R, Yang W, Liu J. Unsupervised Illumination Adaptation for Low-Light Vision. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2024; 46:5951-5966. [PMID: 38536689] [DOI: 10.1109/tpami.2024.3382108]
Abstract
Insufficient lighting poses challenges to both human and machine visual analytics. While existing low-light enhancement methods prioritize human visual perception, they often neglect machine vision and high-level semantics. In this paper, we make pioneering efforts to build an illumination enhancement model for high-level vision. Drawing inspiration from camera response functions, our model could enhance images from the machine vision perspective despite being lightweight in architecture and simple in formulation. We also introduce two approaches that leverage knowledge from base enhancement curves and self-supervised pretext tasks to train for different downstream normal-to-low-light adaptation scenarios. Our proposed framework overcomes the limitations of existing algorithms without requiring access to labeled data in low-light conditions. It facilitates more effective illumination restoration and feature alignment, significantly improving the performance of downstream tasks in a plug-and-play manner. This research advances the field of low-light machine analytics and broadly applies to various high-level vision tasks, including classification, face detection, optical flow estimation, and video action recognition.
2
Peng Z, Li L, Liu D, Zhou S, Liu Z. A Comprehensive Survey on Visual Perception Methods for Intelligent Inspection of High Dam Hubs. SENSORS (BASEL, SWITZERLAND) 2024; 24:5246. [PMID: 39204940] [PMCID: PMC11359354] [DOI: 10.3390/s24165246]
Abstract
There are many high dam hubs in the world, and the regular inspection of high dams is a critical task for ensuring their safe operation. Traditional manual inspection methods pose challenges related to the complexity of the on-site environment, the heavy inspection workload, and the difficulty in manually observing inspection points, which often result in low efficiency and errors caused by subjective factors. Therefore, the introduction of intelligent inspection technology in this context is urgently necessary. With the development of UAVs, computer vision, artificial intelligence, and other technologies, the intelligent inspection of high dams based on visual perception has become possible, and related research has received extensive attention. This article summarizes the contents of high dam safety inspections and reviews recent studies on visual perception techniques in the context of intelligent inspections. First, this article categorizes image enhancement methods into those based on histogram equalization, Retinex, and deep learning. Representative methods and their characteristics are elaborated for each category, and the associated development trends are analyzed. Second, this article systematically enumerates the principal achievements of defect and obstacle perception methods, focusing on those based on traditional image processing and machine learning approaches, and outlines the main techniques and characteristics. Additionally, this article analyzes the principal methods for damage quantification based on visual perception. Finally, the major issues in applying visual perception techniques to the intelligent safety inspection of high dams are summarized and future research directions are proposed.
Affiliation(s)
- Zhangjun Peng
- School of Information Engineering, Southwest University of Science and Technology, Mianyang 621010, China
- School of Computer Science and Technology, Southwest University of Science and Technology, Mianyang 621010, China
- Li Li
- School of Computer Science and Technology, Southwest University of Science and Technology, Mianyang 621010, China
- Sichuan Engineering Technology Research Center of Industrial Self-Supporting and Artificial Intelligence, Mianyang 621010, China
- Daoguang Liu
- School of Information Engineering, Southwest University of Science and Technology, Mianyang 621010, China
- School of Computer Science and Technology, Southwest University of Science and Technology, Mianyang 621010, China
- Shuai Zhou
- School of Information Engineering, Southwest University of Science and Technology, Mianyang 621010, China
- Zhigui Liu
- School of Information Engineering, Southwest University of Science and Technology, Mianyang 621010, China
- Sichuan Engineering Technology Research Center of Industrial Self-Supporting and Artificial Intelligence, Mianyang 621010, China
3
Goyal B, Dogra A, Jalamneh A, Chyophel Lepcha D, Alkhayyat A, Singh R, Jyoti Saikia M. Detailed-based dictionary learning for low-light image enhancement using camera response model for industrial applications. Sci Rep 2024; 14:17122. [PMID: 39054308] [PMCID: PMC11272774] [DOI: 10.1038/s41598-024-64421-w]
Abstract
Images captured in low-light environments are severely degraded due to insufficient light, which causes the performance of both commercial and consumer devices to decline. One of the major challenges lies in how to balance light intensity, detail presentation, and colour integrity in low-light enhancement tasks. This study presents a novel image enhancement framework using detail-based dictionary learning and a camera response model (CRM). It combines dictionary learning with edge-aware filter-based detail enhancement. It assumes each small detail patch can be sparsely characterised in an over-complete detail dictionary learned from many training detail patches via iterative ℓ1-norm minimization. Dictionary learning can effectively address several concerns in the progression of detail enhancement if the visibility limit of the training detail patches is removed from the enhanced detail patches. We apply illumination estimation schemes to the selected CRM and the resulting exposure ratio maps, which recover an enhanced detail layer and generate a high-quality output with detailed visibility when a training set of higher-quality images is available. We estimate the exposure ratio of each pixel using illumination estimation techniques, and the selected camera response model adjusts each pixel to the desired exposure based on the computed exposure ratio map. Extensive experimental analysis shows that the proposed method obtains enhanced results with acceptable distortion. The proposed approach can be generalised to numerous similar problems, such as image enhancement for remote sensing, underwater applications, medical imaging, and foggy or dusty conditions.
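As a concrete illustration of the exposure-ratio idea, the sketch below applies a widely used closed-form CRM (the beta-gamma parameterization of Ying et al., with constants a = -0.3293, b = 1.1258); the paper's selected CRM and illumination estimator may differ, and the max-RGB illumination prior in `enhance` is an assumption for illustration only.

```python
import numpy as np

# Beta-gamma CRM constants from Ying et al.; the paper's chosen CRM may differ.
A, B = -0.3293, 1.1258

def apply_crm(img, k):
    """Re-expose `img` (float RGB in [0, 1]) by exposure ratio `k` (k > 1 brightens)."""
    gamma = k ** A
    beta = np.exp(B * (1.0 - gamma))
    return np.clip(beta * img ** gamma, 0.0, 1.0)

def enhance(img, eps=1e-3):
    """Per-pixel exposure correction from a crude max-RGB illumination estimate."""
    illum = np.clip(img.max(axis=2), eps, 1.0)   # illustrative illumination prior
    k = 1.0 / illum                              # exposure ratio map: darker -> larger k
    return apply_crm(img, k[:, :, None])         # broadcast the ratio over channels
```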
Affiliation(s)
- Bhawna Goyal
- Department of UCRD and ECE, Chandigarh University, Mohali, Punjab, 140413, India
- Ayush Dogra
- Chitkara University Institute of Engineering and Technology, Chitkara University, Rajpura, Punjab, India
- Ammar Jalamneh
- College of Arts & Science, Applied Science University, Manama, Kingdom of Bahrain
- Dawa Chyophel Lepcha
- Department of UCRD and ECE, Chandigarh University, Mohali, Punjab, 140413, India
- Ahmed Alkhayyat
- College of Technical Engineering, The Islamic University, Najaf, Iraq
- Rajesh Singh
- Department of ECE, Uttaranchal Institute of Technology, Uttaranchal University, Dehradun, 248007, India
- Manob Jyoti Saikia
- Department of Electrical Engineering, University of North Florida, Jacksonville, FL, 32224, USA
4
Jia Y, Yu W, Chen G, Zhao L. Nighttime road scene image enhancement based on cycle-consistent generative adversarial network. Sci Rep 2024; 14:14375. [PMID: 38909068] [PMCID: PMC11193765] [DOI: 10.1038/s41598-024-65270-3]
Abstract
During nighttime road scenes, images are often affected by contrast distortion, loss of detailed information, and a significant amount of noise. These factors can negatively impact the accuracy of segmentation and object detection in nighttime road scenes. A cycle-consistent generative adversarial network is proposed to improve the quality of nighttime road scene images. The network includes two generative networks with identical structures and two adversarial networks with identical structures. The generative network comprises an encoder network and a corresponding decoder network. A context feature extraction module is designed as the foundational element of the encoder-decoder network to capture more contextual semantic information with different receptive fields, and a receptive field residual module is designed to increase the receptive field in the encoder network. An illumination attention module is inserted between the encoder and decoder to transfer critical features extracted by the encoder to the decoder. The network also includes a multiscale discriminative network to better discriminate whether an image is a real high-quality image or a generated one. Additionally, an improved loss function is proposed to enhance the efficacy of image enhancement. Compared with state-of-the-art methods, the proposed approach achieves the highest performance in enhancing nighttime images, making them clearer and more natural.
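For reference, the cycle-consistency constraint that underpins this family of models can be written as follows (this is the standard CycleGAN formulation over domains X and Y, not necessarily the paper's exact improved loss):

```latex
\mathcal{L}_{\mathrm{cyc}}(G,F)
  = \mathbb{E}_{x \sim p_X}\big[\lVert F(G(x)) - x \rVert_1\big]
  + \mathbb{E}_{y \sim p_Y}\big[\lVert G(F(y)) - y \rVert_1\big],
\qquad
\mathcal{L} = \mathcal{L}_{\mathrm{GAN}}(G, D_Y) + \mathcal{L}_{\mathrm{GAN}}(F, D_X)
  + \lambda\, \mathcal{L}_{\mathrm{cyc}}(G,F),
```

where G maps low-quality nighttime images toward the high-quality domain, F maps back, and the cycle term forces each generator to preserve scene content.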
Affiliation(s)
- Yanfei Jia
- College of Electrical and Information Engineering, Beihua University, Jilin, 132013, China
- Wenshuo Yu
- College of Electrical Engineering, Northeast Electric Power University, Jilin, 132012, China
- Guangda Chen
- College of Electrical and Information Engineering, Beihua University, Jilin, 132013, China
- Liquan Zhao
- College of Electrical Engineering, Northeast Electric Power University, Jilin, 132012, China
5
Likassa HT, Chen DG, Chen K, Wang Y, Zhu W. Robust PCA with Lw,∗ and L2,1 Norms: A Novel Method for Low-Quality Retinal Image Enhancement. J Imaging 2024; 10:151. [PMID: 39057722] [PMCID: PMC11277667] [DOI: 10.3390/jimaging10070151]
Abstract
Nonmydriatic retinal fundus images often suffer from quality issues and artifacts due to ocular or systemic comorbidities, leading to potential inaccuracies in clinical diagnoses. In recent times, deep learning methods have been widely employed to improve retinal image quality, but they often require large datasets and lack robustness in clinical settings. Conversely, the inherent stability and adaptability of traditional unsupervised learning methods, coupled with their reduced reliance on extensive data, render them more suitable for real-world clinical applications, particularly in limited-data contexts with high noise levels or a significant presence of artifacts. However, existing unsupervised learning methods encounter challenges such as sensitivity to noise and outliers, reliance on assumptions like cluster shapes, and difficulties with scalability and interpretability, particularly when utilized for retinal image enhancement. To tackle these challenges, we propose a novel robust PCA (RPCA) method with low-rank sparse decomposition that also integrates affine transformations τi, the weighted nuclear norm, and the L2,1 norm, aiming to overcome the limitations of existing methods and achieve image quality improvements beyond them. We employ the weighted nuclear norm (Lw,∗) to assign weights to the singular values of each retinal image and utilize the L2,1 norm to eliminate correlated samples and outliers in the retinal images. Moreover, τi is employed to improve retinal image alignment, making the new method more robust to variations, outliers, noise, and image blurring. The Alternating Direction Method of Multipliers (ADMM) is used to optimally determine the parameters, including τi, by solving an optimization problem; each parameter is addressed separately, harnessing the benefits of ADMM. Our method introduces a novel parameter update approach and significantly improves retinal image quality and the detection of cataracts and diabetic retinopathy. Simulation results confirm our method's superiority over existing state-of-the-art methods across various datasets.
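Read from the abstract, the optimization has the standard transformed-RPCA form below (a reconstruction under our notation, not the authors' exact objective; λ and the weights w_i are model parameters):

```latex
\min_{L,\,S,\,\tau}\ \lVert L \rVert_{w,*} + \lambda \lVert S \rVert_{2,1}
\quad \text{s.t.}\quad D \circ \tau = L + S,
\qquad
\lVert L \rVert_{w,*} = \sum_i w_i\, \sigma_i(L),
\quad
\lVert S \rVert_{2,1} = \sum_j \Big(\sum_i S_{ij}^2\Big)^{1/2},
```

where D stacks the vectorized retinal images, τ collects the per-image affine transformations τi, L is the low-rank (aligned, clean) component, S absorbs outliers and corruptions column-wise, and ADMM alternates updates of L, S, and τ.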
Affiliation(s)
- Habte Tadesse Likassa
- Department of Biostatistics, College of Health Solutions, Arizona State University, Phoenix, AZ 85004, USA
- Ding-Geng Chen
- Department of Biostatistics, College of Health Solutions, Arizona State University, Phoenix, AZ 85004, USA
- Department of Statistics, University of Pretoria, Pretoria 0028, South Africa
- Kewei Chen
- Department of Biostatistics, College of Health Solutions, Arizona State University, Phoenix, AZ 85004, USA
- Yalin Wang
- Computer Science and Engineering, School of Computing and Augmented Intelligence, Arizona State University, Phoenix, AZ 85287-8809, USA
- Wenhui Zhu
- Computer Science and Engineering, School of Computing and Augmented Intelligence, Arizona State University, Phoenix, AZ 85287-8809, USA
6
Liu Z, Li T, Ren T, Chen D, Li W, Qiu W. Day-to-Night Street View Image Generation for 24-Hour Urban Scene Auditing Using Generative AI. J Imaging 2024; 10:112. [PMID: 38786566] [PMCID: PMC11121941] [DOI: 10.3390/jimaging10050112]
Abstract
A smarter city should be a safer city. Nighttime safety in metropolitan areas has long been a global concern, particularly for large cities with diverse demographics and intricate urban forms, whose citizens are often threatened by higher street-level crime rates. However, due to the lack of nighttime urban appearance data, prior studies based on street view imagery (SVI) have rarely addressed perceived nighttime safety, which can generate important implications for crime prevention. This study hypothesizes that nighttime SVI can be effectively generated from widely available daytime SVI using generative AI (GenAI). To test the hypothesis, this study first collects pairwise day-and-night SVIs across four cities with divergent urban landscapes to construct a comprehensive day-and-night SVI dataset. It then trains and validates a day-to-night (D2N) model with fine-tuned brightness adjustment, effectively transforming daytime SVIs into nighttime ones for distinct urban forms tailored for urban scene perception studies. Our findings indicate that: (1) the performance of the D2N transformation varies significantly with urban-scape variations related to urban density; (2) the proportions of building and sky views are important determinants of transformation accuracy; (3) among prevailing models, CycleGAN maintains the consistency of D2N scene conversion but requires abundant data; Pix2Pix achieves considerable accuracy when pairwise day-and-night SVIs are available but is sensitive to data quality; and StableDiffusion yields high-quality images at considerable training cost. CycleGAN is therefore most effective in balancing accuracy, data requirements, and cost. This study contributes to urban scene studies by constructing a first-of-its-kind D2N dataset consisting of pairwise day-and-night SVIs across various urban forms. The D2N generator will provide a cornerstone for future urban studies that heavily utilize SVIs to audit urban environments.
Affiliation(s)
- Zhiyi Liu
- School of Architecture and Urban Planning, Beijing University of Civil Engineering and Architecture, Beijing 100044, China
- Tingting Li
- School of Architecture, South Minzu University, Chengdu 610225, China
- Tianyi Ren
- Department of Product Research and Development, Smart Gwei Tech, Shanghai 200940, China
- Da Chen
- Department of Computer Science, University of Bath, Bath BA2 7AY, UK
- Wenjing Li
- Center for Spatial Information Science, The University of Tokyo, Kashiwa-shi 277-0882, Chiba-ken, Japan
- Waishan Qiu
- Department of Urban Planning and Design, The University of Hong Kong, Pokfulam Road, Hong Kong SAR, China
7
Tang L, Ma J, Zhang H, Guo X. DRLIE: Flexible Low-Light Image Enhancement via Disentangled Representations. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:2694-2707. [PMID: 35853059] [DOI: 10.1109/tnnls.2022.3190880]
Abstract
Low-light image enhancement (LIME) aims to convert images with unsatisfactory lighting into desired ones. Different from existing methods that manipulate illumination in uncontrollable manners, we propose a flexible framework that takes user-specified guide images as references to improve practicability. To achieve this goal, this article models an image as the combination of two components, content and exposure attribute, from an information-decoupling perspective. Specifically, we first adopt a content encoder and an attribute encoder to disentangle the two components. Then, we combine the scene content information of the low-light image with the exposure attribute of the guide image to reconstruct the enhanced image through a generator. Extensive experiments on public datasets demonstrate the superiority of our approach over state-of-the-art alternatives. In particular, the proposed method allows users to enhance images according to their preferences by providing specific guide images. Our source code and the pretrained model are available at https://github.com/Linfeng-Tang/DRLIE.
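The decoupling described above amounts to the following recombination (notation ours, not the paper's):

```latex
\hat{I} = G\big(E_c(I_{\mathrm{low}}),\; E_a(I_{\mathrm{guide}})\big),
```

where E_c extracts scene content from the low-light input, E_a extracts the exposure attribute from the user-specified guide image, and G reconstructs the enhanced image from the two latent codes.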
8
Lin F, Zhang H, Wang J, Wang J. Unsupervised image enhancement under non-uniform illumination based on paired CNNs. Neural Netw 2024; 170:202-214. [PMID: 37989041] [DOI: 10.1016/j.neunet.2023.11.014]
Abstract
This paper presents two CNN-based systems for unsupervised image enhancement under non-uniform illumination. The core of the systems is the difference of a pair of CNNs, each composed of two convolutional layers of neurons with exponential and logarithmic activation functions. A weighted sum of non-reference loss functions is used to train the paired CNNs. It includes an entropy enhancement function and a Bézier loss function to ensure complementary global and local enhancement, a white balance loss function to remove color cast in raw images, a gradient improvement loss function to compensate for high-frequency degradation, and an SSIM (structural similarity index) loss function to ensure image fidelity. Beyond the basic system, CNNOD, an augmented version called CNNOD+ is developed, which features an information fusion/combination module with a power-law network for gamma correction. Experimental results on two benchmark datasets demonstrate that the proposed systems outperform state-of-the-art methods in terms of enhancement quality, model complexity, and convergence efficiency.
Affiliation(s)
- Feng Lin
- College of Control Science and Engineering, China University of Petroleum, Qingdao, 266580, Shandong, China
- Huaqing Zhang
- College of Control Science and Engineering, China University of Petroleum, Qingdao, 266580, Shandong, China; College of Science, China University of Petroleum, Qingdao, 266580, Shandong, China
- Jian Wang
- College of Science, China University of Petroleum, Qingdao, 266580, Shandong, China
- Jun Wang
- Computer Science and the School of Data Science, City University of Hong Kong, 999077, Hong Kong, China
9
Yang Z, Yang S. Multimedia image evaluation based on blockchain, visual communication design and color balance optimization. Heliyon 2023; 9:e23241. [PMID: 38144270] [PMCID: PMC10746472] [DOI: 10.1016/j.heliyon.2023.e23241]
Abstract
Image processing and image analysis are inseparable. With the increasing demand for multimedia visual images, the required quality of image analysis is also increasing. However, in image processing and computer vision tasks, protecting users' privacy and preventing data leakage and abuse are not handled well. Image enhancement and a nonlinear image color balance algorithm are applied to improve the visual quality of multimedia visual images and make them clearer and fuller. The article utilized image enhancement and a nonlinear image color balance algorithm to improve the processing effect before visual image analysis. It also utilized the encryption mechanism of blockchain technology to detect the similarity of multimedia visual images: by comparing the feature points of images, similar images were matched to address image copyright issues. After experimental testing, the effect of image enhancement is significant, and the histogram of the equalized image is significantly better than that of the original image. In the image analysis experiment, the computer accurately classified visual images with different attributes. Finally, in the blockchain-based similarity detection experiment, the test results showed that when the number of image transactions reaches 500, the difference hash algorithm takes 1.13 s and 0.78 s to compute the similarity comparison between the original and secondary images. The blockchain-based difference hash algorithm is significantly superior to the Message-Digest Algorithm (MD5) in terms of computational speed and resource consumption, provides better image similarity detection performance, and can also support better image copyright protection mechanisms.
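A difference hash of the kind benchmarked here compares neighbouring pixels of a shrunken grayscale thumbnail; the sketch below is a generic dHash (function names and the duplicate threshold are illustrative, and the paper's blockchain integration is not shown):

```python
import numpy as np
from PIL import Image

def dhash(path, hash_size=8):
    """Difference hash: one bit per horizontally adjacent pixel pair."""
    img = Image.open(path).convert("L").resize((hash_size + 1, hash_size))
    px = np.asarray(img, dtype=np.int16)
    bits = (px[:, 1:] > px[:, :-1]).flatten()   # True where brightness increases
    return sum(1 << i for i, b in enumerate(bits) if b)

def hamming(h1, h2):
    """Number of differing bits; small distances indicate near-duplicate images."""
    return bin(h1 ^ h2).count("1")

# e.g. hamming(dhash("original.png"), dhash("copy.png")) <= 5 suggests a duplicate.
```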
Affiliation(s)
- Zitong Yang
- Harbin University, Harbin 150086, Heilongjiang, China
- Shuo Yang
- Harbin University, Harbin 150086, Heilongjiang, China
10
Xiao J, Fu X, Liu A, Wu F, Zha ZJ. Image De-Raining Transformer. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2023; 45:12978-12995. [PMID: 35709118] [DOI: 10.1109/tpami.2022.3183612]
Abstract
Existing deep learning based de-raining approaches have resorted to convolutional architectures. However, the intrinsic limitations of convolution, including local receptive fields and independence of input content, hinder the model's ability to capture long-range and complicated rainy artifacts. To overcome these limitations, we propose an effective and efficient transformer-based architecture for image de-raining. First, we introduce general priors of vision tasks, i.e., locality and hierarchy, into the network architecture so that our model can achieve excellent de-raining performance without costly pre-training. Second, since the geometric appearance of rainy artifacts is complicated and varies significantly in space, it is essential for de-raining models to extract both local and non-local features. Therefore, we design complementary window-based and spatial transformers to enhance locality while capturing long-range dependencies. Besides, to compensate for the positional blindness of self-attention, we establish a separate representative space for modeling positional relationships and design a new relative-position-enhanced multi-head self-attention. In this way, our model enjoys powerful abilities to capture dependencies from both content and position, achieving better image content recovery while removing rainy artifacts. Experiments substantiate that our approach attains more appealing results than state-of-the-art methods both quantitatively and qualitatively.
11
Wen C, Nie T, Li M, Wang X, Huang L. Image Restoration via Low-Illumination to Normal-Illumination Networks Based on Retinex Theory. SENSORS (BASEL, SWITZERLAND) 2023; 23:8442. [PMID: 37896535] [PMCID: PMC10611181] [DOI: 10.3390/s23208442]
Abstract
Under low-illumination conditions, the quality of the images collected by the sensor is significantly impacted, and the images exhibit visual problems such as noise, artifacts, and reduced brightness. Therefore, this paper proposes an effective Retinex-based network for low-illumination image enhancement. Inspired by Retinex theory, images are split into two parts by the decomposition network and sent to sub-networks for processing. The reconstruction network constructs global and local residual convolution blocks to denoise the reflectance component. The enhancement network uses frequency information, combined with an attention mechanism and a residual dense network, to enhance contrast and improve the details of the illumination component. Extensive experiments on public datasets show that our method is superior to existing methods in both quantitative and visual terms.
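The Retinex model underlying this decomposition writes an observed image as a pixel-wise product of reflectance and illumination; enhancement then operates on the two components separately:

```latex
S(x,y) = R(x,y)\cdot L(x,y),
\qquad
\hat{S}(x,y) = \hat{R}(x,y)\cdot f\!\big(\hat{L}(x,y)\big),
```

where R is the (denoised) reflectance, L the illumination, and f the learned illumination-enhancement mapping.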
Affiliation(s)
- Chaoran Wen
- Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Ting Nie
- Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China
- Mingxuan Li
- Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China
- Xiaofeng Wang
- Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Liang Huang
- Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China
12
Tian Z, Qu P, Li J, Sun Y, Li G, Liang Z, Zhang W. A Survey of Deep Learning-Based Low-Light Image Enhancement. SENSORS (BASEL, SWITZERLAND) 2023; 23:7763. [PMID: 37765817] [PMCID: PMC10535564] [DOI: 10.3390/s23187763]
Abstract
Images captured under poor lighting conditions often suffer from low brightness, low contrast, color distortion, and noise. The purpose of low-light image enhancement is to improve the visual effect of such images for subsequent processing. With the development of artificial intelligence technology, deep learning has been used increasingly widely in image processing, and we provide a comprehensive review of the field of low-light image enhancement in terms of network structure, training data, and evaluation metrics. In this paper, we systematically introduce deep learning-based low-light image enhancement in four aspects. First, we introduce the related methods of low-light image enhancement based on deep learning. We then describe low-light image quality evaluation methods, organize the low-light image datasets, and finally compare and analyze the advantages and disadvantages of the related methods and give an outlook on future development directions.
Affiliation(s)
- Zhen Tian
- School of Information Engineering, Henan Institute of Science and Technology, Xinxiang 453003, China
- Institute of Computer Applications, Henan Institute of Science and Technology, Xinxiang 453003, China
- Peixin Qu
- School of Information Engineering, Henan Institute of Science and Technology, Xinxiang 453003, China
- Institute of Computer Applications, Henan Institute of Science and Technology, Xinxiang 453003, China
- Jielin Li
- School of Information Engineering, Henan Institute of Science and Technology, Xinxiang 453003, China
- Institute of Computer Applications, Henan Institute of Science and Technology, Xinxiang 453003, China
- Yukun Sun
- School of Information Engineering, Henan Institute of Science and Technology, Xinxiang 453003, China
- Institute of Computer Applications, Henan Institute of Science and Technology, Xinxiang 453003, China
- Guohou Li
- School of Information Engineering, Henan Institute of Science and Technology, Xinxiang 453003, China
- Institute of Computer Applications, Henan Institute of Science and Technology, Xinxiang 453003, China
- Zheng Liang
- School of Internet, Anhui University, Hefei 230039, China
- Weidong Zhang
- School of Information Engineering, Henan Institute of Science and Technology, Xinxiang 453003, China
- Institute of Computer Applications, Henan Institute of Science and Technology, Xinxiang 453003, China
13
Zhang J, Chen X, Tang W, Yu H, Bai L, Han J. Single image relighting based on illumination field reconstruction. OPTICS EXPRESS 2023; 31:29676-29694. [PMID: 37710763] [DOI: 10.1364/oe.495858]
Abstract
Relighting a single low-light image is a crucial and challenging task. Previous works primarily focused on brightness enhancement but neglected the differences in light and shadow variations, which leads to unsatisfactory results. Herein, an illumination field reconstruction (IFR) algorithm is proposed to address this issue by leveraging physical mechanism guidance, physics-based supervision, and data-based modeling. First, we derive the illumination field modulation equation as a physical prior to guide the network design. Next, we construct a physics-based dataset consisting of image sequences with diverse illumination levels as supervision. Finally, we propose the IFR neural network (IFRNet) to model the relighting process and reconstruct photorealistic images. Extensive experiments demonstrate the effectiveness of our method on both simulated and real-world datasets, showing its generalization ability in real-world scenarios even when trained solely on simulated data.
14
Liu R, Ma L, Ma T, Fan X, Luo Z. Learning With Nested Scene Modeling and Cooperative Architecture Search for Low-Light Vision. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2023; 45:5953-5969. [PMID: 36215366] [DOI: 10.1109/tpami.2022.3212995]
Abstract
Images captured in low-light scenes often suffer from severe degradations, including low visibility, color casts, intensive noise, etc. These factors not only degrade image quality but also affect the performance of downstream Low-Light Vision (LLV) applications. A variety of deep networks have been proposed to enhance the visual quality of low-light images; however, they mostly rely on significant architecture engineering and often incur a high computational burden. More importantly, an efficient paradigm to uniformly handle the various tasks in LLV scenarios is still lacking. To partially address these issues, we establish Retinex-inspired Unrolling with Architecture Search (RUAS), a general learning framework that can address the low-light enhancement task and has the flexibility to handle other challenging downstream vision tasks. Specifically, we first establish a nested optimization formulation, together with an unrolling strategy, to explore the underlying principles of a series of LLV tasks. Furthermore, we design a differentiable strategy to cooperatively search scene- and task-specific architectures for RUAS. Last but not least, we demonstrate how to apply RUAS to both low- and high-level LLV applications (e.g., enhancement, detection, and segmentation). Extensive experiments verify the flexibility, effectiveness, and efficiency of RUAS.
15
Guo J, Ma J, García-Fernández ÁF, Zhang Y, Liang H. A survey on image enhancement for Low-light images. Heliyon 2023; 9:e14558. [PMID: 37025779] [PMCID: PMC10070385] [DOI: 10.1016/j.heliyon.2023.e14558]
Abstract
In real scenes, due to low light and unsuitable viewpoints, images often exhibit a variety of degradations, such as low contrast, color distortion, and noise. These degradations affect not only visual effects but also computer vision tasks. This paper focuses on the combination of traditional algorithms and machine learning algorithms in the field of image enhancement. The traditional methods, including their principles and improvements, are introduced in three categories: gray-level transformation, histogram equalization, and Retinex methods. Machine learning based algorithms are divided into end-to-end learning and unpaired learning, and are further grouped into decomposition-based and fusion-based learning according to the image processing strategies they apply. Finally, the methods involved are comprehensively compared using multiple image quality assessment measures, including mean square error, the natural image quality evaluator, structural similarity, and peak signal-to-noise ratio.
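Two of the full-reference metrics used in such comparisons are straightforward to compute; a minimal numpy version follows (SSIM needs windowed statistics and is usually taken from a library such as scikit-image):

```python
import numpy as np

def mse(x, y):
    """Mean squared error between two images with values in [0, 255]."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    return np.mean((x - y) ** 2)

def psnr(x, y, peak=255.0):
    """Peak signal-to-noise ratio in dB; higher means closer to the reference."""
    m = mse(x, y)
    return float("inf") if m == 0 else 10.0 * np.log10(peak**2 / m)
```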
Affiliation(s)
- Jiawei Guo
- Department of Computer Science, University of Liverpool, Liverpool, UK
- School of Advanced Technology, Xi'an Jiaotong-Liverpool University (XJTLU), Suzhou, China
- Jieming Ma (corresponding author)
- School of Advanced Technology, Xi'an Jiaotong-Liverpool University (XJTLU), Suzhou, China
- Ángel F. García-Fernández
- Department of Electrical Engineering and Electronics, University of Liverpool, Liverpool, UK
- ARIES research center, Universidad Antonio de Nebrija, Madrid, Spain
- Yungang Zhang
- School of Information Science, Yunnan Normal University, Kunming, China
- Haining Liang
- School of Advanced Technology, Xi'an Jiaotong-Liverpool University (XJTLU), Suzhou, China
16
Zamir SW, Arora A, Khan S, Hayat M, Khan FS, Yang MH, Shao L. Learning Enriched Features for Fast Image Restoration and Enhancement. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2023; 45:1934-1948. [PMID: 35417348] [DOI: 10.1109/tpami.2022.3167175]
Abstract
Given a degraded input image, image restoration aims to recover the missing high-quality image content. Numerous applications demand effective image restoration, e.g., computational photography, surveillance, autonomous vehicles, and remote sensing. Significant advances in image restoration have been made in recent years, dominated by convolutional neural networks (CNNs). The widely used CNN-based methods typically operate either on full-resolution or on progressively low-resolution representations. In the former case, spatial details are preserved but the contextual information cannot be precisely encoded. In the latter case, generated outputs are semantically reliable but spatially less accurate. This paper presents a new architecture with the holistic goal of maintaining spatially precise high-resolution representations through the entire network while receiving complementary contextual information from the low-resolution representations. The core of our approach is a multi-scale residual block containing the following key elements: (a) parallel multi-resolution convolution streams for extracting multi-scale features, (b) information exchange across the multi-resolution streams, (c) a non-local attention mechanism for capturing contextual information, and (d) attention-based multi-scale feature aggregation. Our approach learns an enriched set of features that combines contextual information from multiple scales while simultaneously preserving the high-resolution spatial details. Extensive experiments on six real image benchmark datasets demonstrate that our method, named MIRNet-v2, achieves state-of-the-art results for a variety of image processing tasks, including defocus deblurring, image denoising, super-resolution, and image enhancement. The source code and pre-trained models are available at https://github.com/swz30/MIRNetv2.
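A minimal PyTorch sketch of the parallel multi-resolution idea follows: two streams with bilinear information exchange and a residual output. This is a toy block in the spirit of the design; the published MIRNet-v2 block additionally uses the attention mechanisms listed above, which are omitted here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoStreamBlock(nn.Module):
    """Toy two-resolution residual block with cross-stream fusion."""
    def __init__(self, ch):
        super().__init__()
        self.hi = nn.Conv2d(ch, ch, 3, padding=1)  # full-resolution stream
        self.lo = nn.Conv2d(ch, ch, 3, padding=1)  # half-resolution stream
        self.fuse = nn.Conv2d(2 * ch, ch, 1)       # plain fusion (no attention)

    def forward(self, x):
        hi = F.relu(self.hi(x))
        lo = F.relu(self.lo(F.avg_pool2d(x, 2)))
        lo = F.interpolate(lo, size=hi.shape[-2:], mode="bilinear", align_corners=False)
        return x + self.fuse(torch.cat([hi, lo], dim=1))  # residual output

y = TwoStreamBlock(16)(torch.randn(1, 16, 64, 64))  # same shape as the input
```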
17
Tao H, Guo W, Han R, Yang Q, Zhao J. RDASNet: Image Denoising via a Residual Dense Attention Similarity Network. SENSORS (BASEL, SWITZERLAND) 2023; 23:1486. [PMID: 36772535] [PMCID: PMC9921182] [DOI: 10.3390/s23031486]
Abstract
In recent years, thanks to their performance advantages, convolutional neural networks (CNNs) have been widely used in image denoising. However, most CNN-based image-denoising models cannot make full use of the redundancy of image data, which limits their expressiveness. We propose a new image-denoising model that extracts local image features through a CNN and focuses on global information, especially globally similar details, through an attention similarity module (ASM). Furthermore, dilated convolution is used to enlarge the receptive field to better capture global features, and average pooling is used in the ASM to smooth and suppress noise, further improving model performance. In addition, global residual learning enhances the effect from shallow to deep layers. Extensive experiments show that our proposed model achieves better denoising results, both quantitatively and visually, and is better suited to complex blind noise and real images.
Affiliation(s)
- Haowu Tao
- School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an 710049, China
- Wenhua Guo
- State Key Laboratory for Manufacturing Systems Engineering, Xi’an Jiaotong University, Xi’an 710049, China
- Rui Han
- State Key Laboratory for Manufacturing Systems Engineering, Xi’an Jiaotong University, Xi’an 710049, China
- Qi Yang
- School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an 710049, China
- Jiyuan Zhao
- State Key Laboratory for Manufacturing Systems Engineering, Xi’an Jiaotong University, Xi’an 710049, China
18
Effects of Image Quality on the Accuracy of Human Pose Estimation and Detection of Eye Lid Opening/Closing Using Openpose and DLib. J Imaging 2022; 8:330. [PMID: 36547495] [PMCID: PMC9783075] [DOI: 10.3390/jimaging8120330]
Abstract
OBJECTIVE The application of computer models in continuous patient activity monitoring using video cameras is complicated by the capture of images of varying qualities due to poor lighting conditions and lower image resolutions. Insufficient literature has assessed the effects of image resolution, color depth, noise level, and low light on the inference of eye opening and closing and body landmarks from digital images. METHOD This study systematically assessed the effects of varying image resolutions (from 100 × 100 pixels to 20 × 20 pixels at an interval of 10 pixels), lighting conditions (from 42 to 2 lux with an interval of 2 lux), color-depths (from 16.7 M colors to 8 M, 1 M, 512 K, 216 K, 64 K, 8 K, 1 K, 729, 512, 343, 216, 125, 64, 27, and 8 colors), and noise levels on the accuracy and model performance in eye dimension estimation and body keypoint localization using the Dlib library and OpenPose with images from the Closed Eyes in the Wild and the COCO datasets, as well as photographs of the face captured at different light intensities. RESULTS The model accuracy and rate of model failure remained acceptable at an image resolution of 60 × 60 pixels, a color depth of 343 colors, a light intensity of 14 lux, and a Gaussian noise level of 4% (i.e., 4% of pixels replaced by Gaussian noise). CONCLUSIONS The Dlib and OpenPose models failed to detect eye dimensions and body keypoints only at low image resolutions, lighting conditions, and color depths. CLINICAL IMPACT Our established baseline threshold values will be useful for future work in the application of computer vision in continuous patient monitoring.
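The degradations studied here are easy to reproduce; a numpy sketch follows (the study's exact resampling and noise-injection protocol is not specified in the abstract, so the details below are one plausible reading):

```python
import numpy as np

rng = np.random.default_rng(0)

def reduce_resolution(img, factor):
    """Nearest-neighbour downsample then upsample back (assumes divisible sizes)."""
    small = img[::factor, ::factor]
    return np.repeat(np.repeat(small, factor, axis=0), factor, axis=1)

def reduce_color_depth(img, levels_per_channel):
    """Quantize each RGB channel; 7 levels/channel gives 343 total colors."""
    step = 256 // levels_per_channel
    return ((img // step) * step + step // 2).astype(np.uint8)

def gaussian_pixel_noise(img, fraction=0.04, sigma=30.0):
    """Perturb a random fraction of pixels (e.g., 4%) with Gaussian noise."""
    out = img.astype(np.float64)
    mask = rng.random(img.shape[:2]) < fraction
    out[mask] += rng.normal(0.0, sigma, size=out[mask].shape)
    return np.clip(out, 0, 255).astype(np.uint8)
```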
19
Li C, Guo C, Han L, Jiang J, Cheng MM, Gu J, Loy CC. Low-Light Image and Video Enhancement Using Deep Learning: A Survey. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2022; 44:9396-9416. [PMID: 34752382] [DOI: 10.1109/tpami.2021.3126387]
Abstract
Low-light image enhancement (LLIE) aims at improving the perception or interpretability of an image captured in an environment with poor illumination. Recent advances in this area are dominated by deep learning-based solutions, where many learning strategies, network structures, loss functions, training data, etc. have been employed. In this paper, we provide a comprehensive survey to cover various aspects ranging from algorithm taxonomy to unsolved open issues. To examine the generalization of existing methods, we propose a low-light image and video dataset, in which the images and videos are taken by different mobile phones' cameras under diverse illumination conditions. Besides, for the first time, we provide a unified online platform that covers many popular LLIE methods, of which the results can be produced through a user-friendly web interface. In addition to qualitative and quantitative evaluation of existing methods on publicly available and our proposed datasets, we also validate their performance in face detection in the dark. This survey together with the proposed dataset and online platform could serve as a reference source for future study and promote the development of this research field. The proposed platform and dataset as well as the collected methods, datasets, and evaluation metrics are publicly available and will be regularly updated. Project page: https://www.mmlab-ntu.com/project/lliv_survey/index.html.
20
Bi X, Wang P, Wu T, Zha F, Xu P. Non-uniform illumination underwater image enhancement via events and frame fusion. APPLIED OPTICS 2022; 61:8826-8832. [PMID: 36256018] [DOI: 10.1364/ao.463099]
Abstract
Absorption and scattering by aqueous media attenuate light and make underwater optical imaging difficult. Artificial light sources are usually used to aid deep-sea imaging, but due to the limited dynamic range of standard cameras, they often cause underwater images to be underexposed or overexposed. By contrast, event cameras have a high dynamic range and high temporal resolution but cannot provide frames with rich color characteristics. In this paper, we exploit the complementarity of the two types of cameras and propose an efficient yet simple image enhancement method for unevenly illuminated underwater scenes, which generates enhanced images containing better scene details and colors similar to standard frames. Additionally, we create a dataset recorded by the Dynamic and Active-pixel Vision Sensor that includes both event streams and frames, enabling testing of the proposed method and frame-based image enhancement methods. Experimental results on our dataset, with qualitative and quantitative measures, demonstrate that the proposed method outperforms the compared enhancement algorithms.
21
Ma L, Liu R, Zhang J, Fan X, Luo Z. Learning Deep Context-Sensitive Decomposition for Low-Light Image Enhancement. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022; 33:5666-5680. [PMID: 33929967] [DOI: 10.1109/tnnls.2021.3071245]
Abstract
Enhancing the quality of low-light images plays a very important role in many image processing and multimedia applications. In recent years, a variety of deep learning techniques have been developed to address this challenging task. A typical framework simultaneously estimates the illumination and reflectance but disregards the scene-level contextual information encapsulated in feature spaces, causing many unfavorable outcomes, e.g., loss of details, unsaturated colors, and artifacts. To address these issues, we develop a new context-sensitive decomposition network (CSDNet) architecture to exploit scene-level contextual dependencies across spatial scales. More concretely, we build a two-stream estimation mechanism comprising reflectance and illumination estimation networks, and design a novel context-sensitive decomposition connection to bridge the two streams by incorporating the physical principle. Spatially varying illumination guidance is further constructed to achieve the edge-aware smoothness property of the illumination component. According to different training patterns, we construct CSDNet (paired supervision) and a context-sensitive decomposition generative adversarial network, CSDGAN (unpaired supervision), to fully evaluate our designed architecture. We test our method on seven testing benchmarks (including MIT-Adobe FiveK, LOL, ExDark, and NPE) and conduct plentiful analytical and evaluation experiments. Thanks to our designed context-sensitive decomposition connection, we realize excellent enhanced results (with sufficient details, vivid colors, and little noise), which fully indicates our superiority over existing state-of-the-art approaches. Finally, considering the practical need for high efficiency, we develop a lightweight CSDNet (named LiteCSDNet) by reducing the number of channels. Furthermore, by sharing an encoder for the two components, we obtain an even more lightweight version (SLiteCSDNet for short). SLiteCSDNet contains just 0.0301M parameters but achieves almost the same performance as CSDNet. Code is available at https://github.com/KarelZhang/CSDNet-CSDGAN.
22
Rasheed MT, Shi D. LSR: Lightening super-resolution deep network for low-light image enhancement. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.07.058]
23
Low-light image enhancement with geometrical sparse representation. APPL INTELL 2022. [DOI: 10.1007/s10489-022-04013-1]
24
Zhao R, Han Y, Zhao J. End-to-End Retinex-Based Illumination Attention Low-Light Enhancement Network for Autonomous Driving at Night. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2022; 2022:4942420. [PMID: 36039345] [PMCID: PMC9420063] [DOI: 10.1155/2022/4942420]
Abstract
Low-light image enhancement is a preprocessing step for many recognition and tracking tasks in autonomous driving at night. It must handle various factors simultaneously, including uneven lighting, low contrast, and artifacts. We propose a novel end-to-end Retinex-based illumination attention low-light enhancement network. Specifically, our proposed method adopts a multibranch architecture to extract rich features at different depth levels and considers features from different scales in a built-in illumination attention module. In each submodule, we encode reflectance and illumination features into a latent space based on Retinex, which suits this highly ill-posed image decomposition task and enhances the desired illumination features under different receptive fields. Subsequently, we propose a memory gate mechanism that adaptively learns long-term and short-term memory, whose weights control how many high-level and low-level features should be retained. This method improves image quality across both feature scales and feature levels. Comprehensive experiments on the BDD10K and Cityscapes datasets demonstrate that our proposed method outperforms various types of methods in terms of visual quality and quantitative metrics. We also show that our proposed method has a degree of noise robustness and generalizes well without fine-tuning when dealing with unseen images, while its restoration performance is comparable to that of advanced computationally intensive models.
Affiliation(s)
- Ruini Zhao
- School of Automobile, Chang'an University, Xi'an, Shaanxi 710064, China
- Yi Han
- School of Automobile, Chang'an University, Xi'an, Shaanxi 710064, China
- Jian Zhao
- School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan 611731, China
25
Li C, Guo C, Loy CC. Learning to Enhance Low-Light Image via Zero-Reference Deep Curve Estimation. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2022; 44:4225-4238. [PMID: 33656989] [DOI: 10.1109/tpami.2021.3063604]
Abstract
This paper presents a novel method, Zero-Reference Deep Curve Estimation (Zero-DCE), which formulates light enhancement as a task of image-specific curve estimation with a deep network. Our method trains a lightweight deep network, DCE-Net, to estimate pixel-wise and high-order curves for dynamic range adjustment of a given image. The curve estimation is specially designed, considering pixel value range, monotonicity, and differentiability. Zero-DCE is appealing in its relaxed assumption on reference images, i.e., it does not require any paired or even unpaired data during training. This is achieved through a set of carefully formulated non-reference loss functions, which implicitly measure the enhancement quality and drive the learning of the network. Despite its simplicity, we show that it generalizes well to diverse lighting conditions. Our method is efficient as image enhancement can be achieved by an intuitive and simple nonlinear curve mapping. We further present an accelerated and light version of Zero-DCE, called Zero-DCE++, that takes advantage of a tiny network with just 10K parameters. Zero-DCE++ has a fast inference speed (1000/11 FPS on a single GPU/CPU for an image of size 1200×900×3) while keeping the enhancement performance of Zero-DCE. Extensive experiments on various benchmarks demonstrate the advantages of our method over state-of-the-art methods qualitatively and quantitatively. Furthermore, the potential benefits of our method to face detection in the dark are discussed. The source code is made publicly available at https://li-chongyi.github.io/Proj_Zero-DCE++.html.
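The quadratic curve at the heart of Zero-DCE is LE(x) = x + A·x·(1 − x), applied iteratively with per-pixel maps A_n predicted by DCE-Net. The sketch below applies the curve given such maps; since the network itself is not reproduced, the maps here are supplied by hand for illustration.

```python
import numpy as np

def apply_curves(img, curve_maps):
    """Iterated Zero-DCE curve; each map A in [-1, 1] keeps outputs inside [0, 1]."""
    x = img
    for A in curve_maps:
        x = x + A * x * (1.0 - x)
    return x

img = np.random.rand(4, 4, 3)                          # stand-in low-light image in [0, 1]
out = apply_curves(img, [np.full_like(img, 0.6)] * 8)  # 8 iterations, uniform brightening
```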
26
Li H, Liu H, Hu Y, Fu H, Zhao Y, Miao H, Liu J. An Annotation-Free Restoration Network for Cataractous Fundus Images. IEEE TRANSACTIONS ON MEDICAL IMAGING 2022; 41:1699-1710. [PMID: 35100108] [DOI: 10.1109/tmi.2022.3147854]
Abstract
Cataracts are the leading cause of vision loss worldwide. Restoration algorithms have been developed to improve the readability of cataract fundus images in order to increase the certainty of diagnosis and treatment for cataract patients. Unfortunately, the requirement for annotation limits the application of these algorithms in clinics. This paper proposes an annotation-free network for restoring cataractous fundus images (ArcNet) so as to boost the clinical practicability of restoration. Annotations are unnecessary in ArcNet, where the high-frequency component is extracted from fundus images to replace segmentation in the preservation of retinal structures. The restoration model is learned from synthesized images and adapted to real cataract images. Extensive experiments are implemented to verify the performance and effectiveness of ArcNet. Favorable performance is achieved by ArcNet against state-of-the-art algorithms, and the diagnosis of ocular fundus diseases in cataract patients is facilitated by ArcNet. The capability of properly restoring cataractous images in the absence of annotated data promises the proposed algorithm outstanding clinical practicability.
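One common way to isolate such a high-frequency component is to subtract a Gaussian-blurred copy of the image (an unsharp-mask-style high-pass). The abstract does not specify ArcNet's exact extraction operator, so the sketch below is an assumption:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def high_frequency(img, sigma=5.0):
    """High-pass residual: original minus a spatially blurred copy (HxWx3 input)."""
    img = np.asarray(img, dtype=np.float32)
    low = gaussian_filter(img, sigma=(sigma, sigma, 0))  # blur H and W, not channels
    return img - low   # keeps edges and vessel-like structures, drops smooth shading
```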
27
Ahn S, Shin J, Lim H, Lee J, Paik J. CODEN: combined optimization-based decomposition and learning-based enhancement network for Retinex-based brightness and contrast enhancement. OPTICS EXPRESS 2022; 30:23608-23621. [PMID: 36225037] [DOI: 10.1364/oe.459063]
Abstract
In this paper, we present a novel low-light image enhancement method that combines optimization-based decomposition with a learning-based enhancement network for simultaneously enhancing brightness and contrast. The proposed method works in two steps, Retinex decomposition and illumination enhancement, and can be trained in an end-to-end manner. The first step separates the low-light image into illumination and reflectance components based on the Retinex model; specifically, it performs model-based optimization followed by learning for edge-preserving illumination smoothing and detail-preserving reflectance denoising. In the second step, the illumination output from the first step, together with its gamma-corrected and histogram-equalized versions, serves as input to an illumination enhancement network (IEN) built from residual squeeze and excitation blocks (RSEBs). Extensive experiments show that our method outperforms state-of-the-art low-light enhancement methods on both objective and subjective measures.
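The gamma-corrected and histogram-equalized companions of the illumination map are standard operations; a numpy sketch of how such a three-view input could be assembled follows (variable names such as `ien_input` are ours, not the paper's):

```python
import numpy as np

def gamma_correct(L, gamma=2.2):
    """Power-law brightening of an illumination map L with values in [0, 1]."""
    return L ** (1.0 / gamma)

def hist_equalize(L, bins=256):
    """Classic histogram equalization via the empirical CDF."""
    q = np.clip((L * (bins - 1)).astype(int), 0, bins - 1)
    cdf = np.cumsum(np.bincount(q.ravel(), minlength=bins)).astype(np.float64)
    cdf /= cdf[-1]
    return cdf[q]

L = np.random.rand(8, 8)                                        # stand-in illumination map
ien_input = np.stack([L, gamma_correct(L), hist_equalize(L)], axis=-1)
```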
Collapse
|
28
|
Hybrid neural networks for noise reductions of integrated navigation complexes. ARTIF INTELL 2022. [DOI: 10.15407/jai2022.01.288] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
Abstract
The necessity of constructing integrated navigation complexes (INC) is substantiated. It is proposed to include three navigation systems in the complex: inertial, satellite, and visual. This helps to increase the accuracy of determining the coordinates of unmanned aerial vehicles. It is shown that performance degrades in unfavorable cases: suppression of the satellite navigation system by external noise, growth of inertial navigation system (INS) errors (in particular when the accelerometers and gyroscopes are manufactured using MEMS technology), and bad weather conditions, which complicate the work of the visual navigation system. To ensure the operation of the navigation complex, interference (noise) must be suppressed. To improve the accuracy of the INS within the INC, it is proposed to extract the noise from the raw INS signal, predict it using neural networks, and suppress it. Two approaches are proposed to solve this problem: the first is based on a multi-row GMDH algorithm and single-layer networks with sigm_piecewise neurons, and the second on hybrid recurrent neural networks combining long short-term memory (LSTM) and gated recurrent unit (GRU) cells. Various types of noise inherent in video images in visual navigation systems are considered: Gaussian noise, salt-and-pepper noise, Poisson noise, fractional noise, and blind noise, with particular attention paid to blind noise. To improve the accuracy of the visual navigation system, hybrid convolutional neural networks are proposed.
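A minimal sketch of the recurrent noise-prediction idea is given below: an LSTM (a GRU would be a drop-in replacement) predicts the next INS noise sample from a window of past residuals, so the prediction can be subtracted from the raw signal. The layer sizes and window length are illustrative assumptions, not the paper's configuration.

    import torch
    import torch.nn as nn

    class NoisePredictor(nn.Module):
        def __init__(self, hidden=32):
            super().__init__()
            # Single-feature sequence in, next-step noise estimate out.
            self.rnn = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
            self.head = nn.Linear(hidden, 1)

        def forward(self, noise_window):            # (batch, steps, 1)
            out, _ = self.rnn(noise_window)
            return self.head(out[:, -1])            # (batch, 1)

    pred = NoisePredictor()(torch.randn(8, 50, 1))  # 50-sample noise window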
Collapse
|
29
|
Chen Z, Jiang Y, Liu D, Wang Z. CERL: A Unified Optimization Framework for Light Enhancement With Realistic Noise. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2022; 31:4162-4172. [PMID: 35700251 DOI: 10.1109/tip.2022.3180213] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/15/2023]
Abstract
Low-light images captured in the real world are inevitably corrupted by sensor noise. Such noise is spatially variant and highly dependent on the underlying pixel intensity, deviating from the oversimplified assumptions of conventional denoising. Existing light enhancement methods either overlook the important impact of real-world noise during enhancement or treat noise removal as a separate pre- or post-processing step. We present Coordinated Enhancement for Real-world Low-light Noisy Images (CERL), which seamlessly integrates light enhancement and noise suppression into a unified and physics-grounded optimization framework. For the real low-light noise removal part, we customize a self-supervised denoising model that can easily be adapted without reference to clean ground-truth images. For the light enhancement part, we also improve the design of a state-of-the-art backbone. The two parts are then jointly formulated into one principled plug-and-play optimization. Our approach is compared against state-of-the-art low-light enhancement methods both qualitatively and quantitatively. Beyond standard benchmarks, we further collect and test on a new realistic low-light mobile photography dataset (RLMP), whose mobile-captured photos exhibit heavier realistic noise than those taken by high-quality cameras. CERL consistently produces the most visually pleasing and artifact-free results across all experiments. Our RLMP dataset and code are available at: https://github.com/VITA-Group/CERL.
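Schematically, the coordinated formulation can be read as an alternating plug-and-play loop in which each module sees the other's latest output, as in the sketch below; enhance_step and denoise_step are hypothetical stand-ins for the learned light enhancement backbone and the self-supervised denoiser.

    def plug_and_play_enhance(y, enhance_step, denoise_step, iters=3):
        # Alternate the two learned modules so each sees the other's
        # latest output, mimicking a variable-splitting optimization.
        x = y
        for _ in range(iters):
            x = enhance_step(x)   # brighten; may amplify realistic noise
            x = denoise_step(x)   # suppress the amplified noise
        return x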
Collapse
|
30
|
Low-Light Image Enhancement Method Based on Retinex Theory by Improving Illumination Map. APPLIED SCIENCES-BASEL 2022. [DOI: 10.3390/app12105257] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/30/2022]
Abstract
Recently, low-light image enhancement has attracted much attention. However, some problems remain: dark regions are sometimes not fully improved, while bright regions near a light source or auxiliary light source are overexposed. To address these problems, a Retinex-based method that strengthens the illumination map is proposed. It uses a brightness enhancement function (BEF), a weighted sum of a Sigmoid function cascaded with gamma correction (GC) and a Sine function, together with an improved adaptive contrast enhancement (IACE), to enhance the estimated illumination map through multi-scale fusion. Specifically, the illumination map is first obtained according to Retinex theory via a weighted-sum method that considers neighborhood information. Then, a Gaussian-Laplacian pyramid is used to fuse the two inputs derived from BEF and IACE, improving the brightness and contrast of the illumination component obtained above. Finally, the adjusted illumination map is multiplied by the reflectance map to obtain the enhanced image according to Retinex theory. Extensive experiments show that our method yields better results in subjective visual quality and quantitative evaluation than other state-of-the-art methods.
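A hedged reading of the BEF is sketched below: a weighted sum of a sigmoid applied to the gamma-corrected illumination and a sine-based stretch. The abstract does not give the constants, so the weight, gamma, and sigmoid scaling here are assumptions.

    import numpy as np

    def brightness_enhancement_function(illum, w=0.5, gamma=0.6):
        # illum: illumination map as float array in [0, 1].
        g = illum ** gamma                                 # gamma correction
        sig = 1.0 / (1.0 + np.exp(-8.0 * (g - 0.5)))       # sigmoid (assumed scale 8)
        sine = np.sin(0.5 * np.pi * illum)                 # sine-based stretch
        return np.clip(w * sig + (1.0 - w) * sine, 0.0, 1.0)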
Collapse
|
31
|
Xia W, Chen E, Pautler S, Peters T. Laparoscopic image enhancement based on distributed retinex optimization with refined information fusion. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2021.08.142] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]
|
32
|
Singh LK, Garg H, Khanna M. Performance evaluation of various deep learning based models for effective glaucoma evaluation using optical coherence tomography images. MULTIMEDIA TOOLS AND APPLICATIONS 2022; 81:27737-27781. [PMID: 35368855 PMCID: PMC8962290 DOI: 10.1007/s11042-022-12826-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 08/20/2020] [Revised: 02/20/2022] [Accepted: 03/09/2022] [Indexed: 06/14/2023]
Abstract
Glaucoma is the dominant cause of irreversible blindness worldwide, and its best remedy is early and timely detection. Optical coherence tomography has become the most commonly used imaging modality for detecting glaucomatous damage in recent years, and deep learning on this modality helps predict glaucoma more accurately and less tediously. This experimental study performs glaucoma prediction using eight different ImageNet models on optical coherence tomography images. A thorough investigation evaluates these models on various efficiency metrics to discover the best-performing one. Every network is tested with three different optimizers, namely Adam, Root Mean Squared Propagation, and Stochastic Gradient Descent, to find the most relevant results. An attempt is made to improve the performance of the models using transfer learning and fine-tuning. The work presented in this study was initially trained and tested on a private database of 4220 images (2110 normal optical coherence tomography and 2110 glaucoma optical coherence tomography). Based on the results, the four best-performing models are shortlisted and then tested on the well-recognized public Mendeley dataset. Experimental results illustrate that VGG16 with the Root Mean Squared Propagation optimizer attains promising performance with 95.68% accuracy. The study concludes that these ImageNet models are a good alternative for a computer-based automatic glaucoma screening system. Such a fully automated system has considerable potential to distinguish normal from glaucomatous optical coherence tomography automatically, helping to detect this retinal condition efficiently in suspected patients for better diagnosis, avoiding vision loss, and reducing the time and involvement required from senior ophthalmologists (experts).
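The shortlisted configuration can be reproduced in outline with torchvision, as sketched below: an ImageNet-pretrained VGG16 whose classifier head is replaced for binary glaucoma/normal classification and trained with RMSProp. The learning rate and the frozen convolutional base are assumptions about the fine-tuning recipe.

    import torch.nn as nn
    import torch.optim as optim
    from torchvision import models

    model = models.vgg16(weights=models.VGG16_Weights.DEFAULT)
    for p in model.features.parameters():
        p.requires_grad = False                    # freeze the convolutional base
    model.classifier[6] = nn.Linear(4096, 2)       # glaucoma vs. normal OCT
    optimizer = optim.RMSprop(
        (p for p in model.parameters() if p.requires_grad), lr=1e-4)
    criterion = nn.CrossEntropyLoss()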
Collapse
Affiliation(s)
- Law Kumar Singh
- Department of Computer Science and Engineering, Sharda University, Greater Noida, India
- Department of Computer Science and Engineering, Hindustan College of Science and Technology, Mathura, India
- Hitendra Garg
- Department of Computer Engineering and Applications, GLA University, Mathura, India
- Munish Khanna
- Department of Computer Science and Engineering, Hindustan College of Science and Technology, Mathura, India
Collapse
|
33
|
Lu Y, Jung SW. Progressive Joint Low-Light Enhancement and Noise Removal for Raw Images. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2022; 31:2390-2404. [PMID: 35259104 DOI: 10.1109/tip.2022.3155948] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]
Abstract
Low-light imaging on mobile devices is typically challenging due to insufficient incident light coming through the relatively small aperture, resulting in low image quality. Most previous works on low-light imaging focus either on a single task, such as illumination adjustment, color enhancement, or noise removal, or on a joint illumination adjustment and denoising task that heavily relies on short-long exposure image pairs from specific camera models. These approaches are less practical and generalizable in real-world settings where camera-specific joint enhancement and restoration are required. In this paper, we propose a low-light imaging framework that performs joint illumination adjustment, color enhancement, and denoising to tackle this problem. Considering the difficulty of model-specific data collection and the ultra-high definition of the captured images, we design two branches: a coefficient estimation branch and a joint operation branch. The coefficient estimation branch works in a low-resolution space and predicts the coefficients for enhancement via bilateral learning, whereas the joint operation branch works in a full-resolution space and progressively performs joint enhancement and denoising. In contrast to existing methods, our framework does not need to recollect massive data when adapted to another camera model, which significantly reduces the effort required to fine-tune our approach for practical usage. Through extensive experiments, we demonstrate its great potential in real-world low-light imaging applications.
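The two-branch split can be illustrated as below: enhancement coefficients are predicted on a heavily downsampled copy and upsampled back to full resolution, so the expensive network never touches the full-size frame. coeff_net is a hypothetical stand-in for the coefficient estimation branch, reduced here to per-pixel gain/offset pairs rather than the paper's bilateral-learning formulation.

    import torch
    import torch.nn.functional as F

    def apply_lowres_coefficients(full_res, coeff_net, scale=8):
        # Predict per-pixel (gain, offset) on a small copy, then upsample
        # the coefficients instead of running the network at full size.
        small = F.interpolate(full_res, scale_factor=1.0 / scale,
                              mode='bilinear', align_corners=False)
        gain, offset = coeff_net(small)            # each (B, 3, h, w)
        size = full_res.shape[-2:]
        gain = F.interpolate(gain, size=size, mode='bilinear', align_corners=False)
        offset = F.interpolate(offset, size=size, mode='bilinear', align_corners=False)
        return torch.clamp(gain * full_res + offset, 0.0, 1.0)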
Collapse
|
34
|
Pan Z, Yuan F, Lei J, Fang Y, Shao X, Kwong S. VCRNet: Visual Compensation Restoration Network for No-Reference Image Quality Assessment. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2022; 31:1613-1627. [PMID: 35081029 DOI: 10.1109/tip.2022.3144892] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]
Abstract
Guided by the free-energy principle, generative adversarial network (GAN)-based no-reference image quality assessment (NR-IQA) methods have improved image quality prediction accuracy. However, the GAN cannot handle the restoration task well for free-energy-principle-guided NR-IQA methods, especially for severely destroyed images, so the quality reconstruction relationship between a distorted image and its restored image cannot be accurately built. To address this problem, a visual compensation restoration network (VCRNet)-based NR-IQA method is proposed, which uses a non-adversarial model to efficiently handle the distorted-image restoration task. The proposed VCRNet consists of a visual restoration network and a quality estimation network. To accurately build the quality reconstruction relationship between the distorted image and its restored image, a visual compensation module, an optimized asymmetric residual block, and an error-map-based mixed loss function are proposed to increase the restoration capability of the visual restoration network. To further address the NR-IQA problem for severely destroyed images, the multi-level restoration features obtained from the visual restoration network are used for image quality estimation. To prove the effectiveness of the proposed VCRNet, seven representative IQA databases are used, and experimental results show that the proposed VCRNet achieves state-of-the-art image quality prediction accuracy. The implementation of the proposed VCRNet has been released at https://github.com/NUIST-Videocoding/VCRNet.
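The second stage can be pictured as below: multi-level features taken from the restoration network are pooled and regressed to a single quality score. The channel sizes are assumptions; the sketch only illustrates multi-level feature reuse, not the full VCRNet design.

    import torch
    import torch.nn as nn

    class QualityHead(nn.Module):
        def __init__(self, chans=(64, 128, 256)):
            super().__init__()
            self.pool = nn.AdaptiveAvgPool2d(1)     # global average pooling
            self.fc = nn.Linear(sum(chans), 1)      # quality regression

        def forward(self, feats):                   # list of (B, C_i, H_i, W_i)
            v = torch.cat([self.pool(f).flatten(1) for f in feats], dim=1)
            return self.fc(v)                       # predicted quality score

    score = QualityHead()([torch.randn(2, c, 16, 16) for c in (64, 128, 256)])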
Collapse
|
35
|
Qi B, Chen W, Dun X, Hao X, Wang R, Liu X, Li H, Peng Y. All-day thin-lens computational imaging with scene-specific learning recovery. APPLIED OPTICS 2022; 61:1097-1105. [PMID: 35201084 DOI: 10.1364/ao.448155] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/19/2021] [Accepted: 01/03/2022] [Indexed: 06/14/2023]
Abstract
Modern imaging optics ensures high-quality photography at the cost of a complex optical form factor that sacrifices portability. The rapid development of image processing algorithms, especially advanced neural networks, shows great promise for thin optics but still faces the challenges of residual artifacts and chromatic aberration. In this work, we investigate photorealistic thin-lens imaging that paves the way to practical applications by exploring several refinements. Notably, to meet all-day photography demands, we develop a scene-specific generative-adversarial-network-based learning strategy and an integrated automatic acquisition and processing pipeline. Color fringe artifacts are reduced by implementing a chromatic aberration pre-correction step. Our method outperforms existing thin-lens imaging work with better visual perception and excels in both normal-light and low-light scenarios.
Collapse
|
36
|
Chen H, He X, Yang H, Qing L, Teng Q. A Feature-Enriched Deep Convolutional Neural Network for JPEG Image Compression Artifacts Reduction and its Applications. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022; 33:430-444. [PMID: 34793307 DOI: 10.1109/tnnls.2021.3124370] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]
Abstract
The amount of multimedia data, such as images and videos, has been increasing rapidly with the development of various imaging devices and the Internet, bringing more stress and challenges to information storage and transmission. The redundancy in images can be reduced to decrease data size via lossy compression, such as the most widely used standard, Joint Photographic Experts Group (JPEG). However, decompressed images generally suffer from various artifacts (e.g., blocking, banding, ringing, and blurring) due to the loss of information, especially at high compression ratios. This article presents a feature-enriched deep convolutional neural network for compression artifacts reduction (FeCarNet, for short). Taking the dense network as the backbone, FeCarNet enriches features to gain valuable information by introducing multi-scale dilated convolutions, along with efficient 1×1 convolutions for lowering both parameter complexity and computation cost. Meanwhile, to make full use of different levels of features in FeCarNet, a fusion block consisting of attention-based channel recalibration and dimension reduction is developed for local and global feature fusion. Furthermore, short and long residual connections in both the feature and pixel domains are combined to build a multi-level residual structure, benefiting network training and performance. In addition, to further reduce computational complexity, pixel-shuffle-based image downsampling and upsampling layers are arranged at the head and tail of FeCarNet, respectively, which also enlarges the receptive field of the whole network. Experimental results show the superiority of FeCarNet over state-of-the-art compression artifacts reduction approaches in terms of both restoration capacity and model complexity. Applications of FeCarNet to several computer vision tasks, including image deblurring, edge detection, image segmentation, and object detection, further demonstrate its effectiveness.
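The feature-enrichment idea reads naturally as a block of parallel dilated convolutions fused by a 1×1 convolution, as in the PyTorch sketch below; channel counts and dilation rates are illustrative, and the attention-based fusion block is omitted.

    import torch
    import torch.nn as nn

    class MultiScaleDilatedBlock(nn.Module):
        def __init__(self, ch=64, dilations=(1, 2, 4)):
            super().__init__()
            # Parallel dilated 3x3 branches enlarge the receptive field
            # at several scales while keeping the spatial size.
            self.branches = nn.ModuleList(
                nn.Conv2d(ch, ch, 3, padding=d, dilation=d) for d in dilations)
            self.fuse = nn.Conv2d(ch * len(dilations), ch, 1)  # cheap 1x1 fusion
            self.act = nn.ReLU(inplace=True)

        def forward(self, x):
            feats = torch.cat([self.act(b(x)) for b in self.branches], dim=1)
            return x + self.fuse(feats)            # short residual connection

    y = MultiScaleDilatedBlock()(torch.randn(1, 64, 32, 32))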
Collapse
|
37
|
Low Light Video Enhancement Based on Temporal-Spatial Complementary Feature. ARTIF INTELL 2022. [DOI: 10.1007/978-3-031-20497-5_30] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/23/2022]
|
38
|
Guo L, Jia Z, Yang J, Kasabov NK. Detail Preserving Low Illumination Image and Video Enhancement Algorithm Based on Dark Channel Prior. SENSORS (BASEL, SWITZERLAND) 2021; 22:85. [PMID: 35009629 PMCID: PMC8747644 DOI: 10.3390/s22010085] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 11/12/2021] [Revised: 12/20/2021] [Accepted: 12/20/2021] [Indexed: 06/14/2023]
Abstract
In low-illumination situations, insufficient light at the monitoring device results in poor visibility of useful information, which fails to meet the needs of practical applications. To overcome these problems, a detail-preserving low-illumination video image enhancement algorithm based on the dark channel prior is proposed in this paper. First, a dark channel refinement method is proposed, in which a structure prior is imposed on the initial dark channel to improve image brightness. Second, an anisotropic guided filter (AnisGF) is used to refine the transmission, which preserves the edges of the image. Finally, a detail enhancement algorithm is proposed to avoid insufficient detail in the initially enhanced image. To avoid video flicker, subsequent video frames are enhanced based on the brightness of the first enhanced frame. Qualitative and quantitative analysis shows that the proposed algorithm is superior to the comparison algorithms, ranking first in average gradient, edge intensity, contrast, and the patch-based contrast quality index. It can be effectively applied to the enhancement of surveillance video images and to wider computer vision applications.
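For reference, the standard dark channel computation underlying the method is shown below: a per-pixel minimum over color channels followed by a local minimum filter. The paper's structure-prior refinement and anisotropic guided filtering are not reproduced here; the patch size is an assumed value.

    import numpy as np
    from scipy.ndimage import minimum_filter

    def dark_channel(img, patch=15):
        # Per-pixel minimum over the three color channels, then a local
        # minimum over a patch, as in the classic dark channel prior.
        min_rgb = img.min(axis=2)
        return minimum_filter(min_rgb, size=patch)

    dc = dark_channel(np.random.rand(128, 128, 3))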
Collapse
Affiliation(s)
- Lingli Guo
- College of Information Science and Engineering, Xinjiang University, Urumqi 830046, China
- Zhenhong Jia
- College of Information Science and Engineering, Xinjiang University, Urumqi 830046, China
- Jie Yang
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai 200400, China
- Nikola K. Kasabov
- Knowledge Engineering and Discovery Research Institute, Auckland University of Technology, Auckland 1020, New Zealand
- Intelligent Systems Research Center, Ulster University Magee Campus, Derry BT48 7JL, UK
Collapse
|
39
|
Hu J, Guo X, Chen J, Liang G, Deng F, Lam TL. A Two-Stage Unsupervised Approach for Low Light Image Enhancement. IEEE Robot Autom Lett 2021. [DOI: 10.1109/lra.2020.3048667] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
|
41
|
Kim G, Park SW, Kwon J. Pixel-Wise Wasserstein Autoencoder for Highly Generative Dehazing. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2021; 30:5452-5462. [PMID: 34086571 DOI: 10.1109/tip.2021.3084743] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
We propose a highly generative dehazing method based on pixel-wise Wasserstein autoencoders. In contrast to existing dehazing methods based on generative adversarial networks, our method can produce a variety of dehazed images with different styles. It significantly improves the dehazing accuracy via pixel-wise matching from hazy to dehazed images through 2-dimensional latent tensors of the Wasserstein autoencoder. In addition, we present an advanced feature fusion technique to deliver rich information to the latent space. For style transfer, we introduce a mapping function that transforms existing latent spaces to new ones. Thus, our method can produce highly generative haze-free images with various tones, illuminations, and moods, which induces several interesting applications, including low-light enhancement, daytime dehazing, nighttime dehazing, and underwater image enhancement. Experimental results demonstrate that our method quantitatively outperforms existing state-of-the-art methods for synthetic and real-world datasets, and simultaneously generates highly generative haze-free images, which are qualitatively diverse.
Collapse
|
42
|
Li C, Anwar S, Hou J, Cong R, Guo C, Ren W. Underwater Image Enhancement via Medium Transmission-Guided Multi-Color Space Embedding. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2021; 30:4985-5000. [PMID: 33961554 DOI: 10.1109/tip.2021.3076367] [Citation(s) in RCA: 37] [Impact Index Per Article: 12.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
Underwater images suffer from color casts and low contrast due to wavelength- and distance-dependent attenuation and scattering. To solve these two degradation issues, we present an underwater image enhancement network via medium transmission-guided multi-color space embedding, called Ucolor. Concretely, we first propose a multi-color space encoder network, which enriches the diversity of feature representations by incorporating the characteristics of different color spaces into a unified structure. Coupled with an attention mechanism, the most discriminative features extracted from multiple color spaces are adaptively integrated and highlighted. Inspired by underwater imaging physical models, we design a medium transmission (indicating the percentage of the scene radiance reaching the camera)-guided decoder network to enhance the response of the network towards quality-degraded regions. As a result, our network can effectively improve the visual quality of underwater images by exploiting multi-color space embedding and the advantages of both physical model-based and learning-based methods. Extensive experiments demonstrate that our Ucolor achieves superior performance against state-of-the-art methods in terms of both visual quality and quantitative metrics. The code is publicly available at: https://li-chongyi.github.io/Proj_Ucolor.html.
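The multi-color-space embedding can be illustrated minimally as below: the same frame represented in RGB, HSV, and Lab and stacked channel-wise. Ucolor itself routes these representations through separate encoder paths with attention rather than a naive concatenation, and the Lab scaling used here is a rough assumption.

    import numpy as np
    from skimage import color

    def multi_color_space_stack(rgb):
        # rgb: float array in [0, 1], shape (H, W, 3).
        hsv = color.rgb2hsv(rgb)
        lab = color.rgb2lab(rgb) / np.array([100.0, 128.0, 128.0])  # rough scaling
        return np.concatenate([rgb, hsv, lab], axis=2)              # (H, W, 9)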
Collapse
|
43
|
Attention Guided Low-Light Image Enhancement with a Large Scale Low-Light Simulation Dataset. Int J Comput Vis 2021. [DOI: 10.1007/s11263-021-01466-8] [Citation(s) in RCA: 39] [Impact Index Per Article: 13.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
|
44
|
Abstract
In unmanned aerial vehicle (UAV)-based urban observation and monitoring, the performance of computer vision algorithms is inevitably limited by degradation caused by low illumination and light pollution, so image enhancement is an important prerequisite for subsequent image processing algorithms. We therefore propose a deep-learning, generative-adversarial-network-based model for UAV low-illumination image enhancement, named LighterGAN. The design of LighterGAN builds on the CycleGAN model, with two improvements to the original structure: an attention mechanism and a semantic consistency loss. An unpaired dataset captured by urban UAV aerial photography was used to train this unsupervised learning model. To explore the advantages of these improvements, both the illumination enhancement performance and the improved generalization ability of LighterGAN were demonstrated in comparative experiments combining subjective and objective evaluations. In experiments against five cutting-edge image enhancement algorithms on the test set, LighterGAN achieved the best results in both visual perception and the PIQE (perception-based image quality evaluator, a built-in MATLAB function; the lower the score, the higher the image quality) score of enhanced images, with scores of 4.91 and 11.75 respectively, better than the state-of-the-art EnlightenGAN. In enhancing the low-illumination sub-dataset Y (containing 2000 images), LighterGAN also achieved the lowest PIQE score of 12.37, 2.85 points lower than second place. Moreover, improved generalization over CycleGAN was demonstrated: on images generated from the test set, LighterGAN scored 6.66 percent higher than CycleGAN in subjective authenticity assessment and 3.84 points lower in PIQE; on images generated from the whole dataset, the PIQE score of LighterGAN was 11.67, 4.86 points lower than CycleGAN's.
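One of the two improvements, the semantic consistency loss, can be sketched as a feature-space distance between an input and its translation under a frozen pretrained encoder, as below. The choice of VGG16 features and the layer cut are assumptions; the abstract does not specify how the loss is computed.

    import torch.nn.functional as F
    from torchvision import models

    # Frozen encoder; inputs assumed to be ImageNet-normalized (B, 3, H, W).
    vgg = models.vgg16(weights=models.VGG16_Weights.DEFAULT).features[:16].eval()
    for p in vgg.parameters():
        p.requires_grad = False

    def semantic_consistency_loss(x, g_x):
        # Push features of the input and its translation together so
        # enhancement does not alter scene semantics.
        return F.l1_loss(vgg(g_x), vgg(x))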
Collapse
|
45
|
Yang W, Wang S, Fang Y, Wang Y, Liu J. Band Representation-Based Semi-Supervised Low-Light Image Enhancement: Bridging the Gap Between Signal Fidelity and Perceptual Quality. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2021; 30:3461-3473. [PMID: 33656992 DOI: 10.1109/tip.2021.3062184] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
It is widely acknowledged that under-exposure causes a variety of visual quality degradations, including intensive noise, decreased visibility, and biased color. To alleviate these issues, a novel semi-supervised learning approach is proposed in this paper for low-light image enhancement. More specifically, we propose a deep recursive band network (DRBN) to recover a linear band representation of an enhanced normal-light image under the guidance of paired low/normal-light images. This design enables the network to generate a quality-improved image by reconstructing the given bands through another learnable linear transformation, which is perceptually driven by an image quality assessment neural network. On one hand, the network is carefully designed to obtain a variety of coarse-to-fine band representations, whose estimations mutually benefit each other in a recursive process. On the other hand, the band representation of the enhanced image extracted in the recursive band learning stage of DRBN bridges the gap between the restoration knowledge of paired data and the perceptual quality preference for high-quality images. Subsequently, the band recomposition stage learns to recompose the band representation towards fitting the perceptual regularization of high-quality images under perceptual guidance. The proposed architecture can be flexibly trained with both paired and unpaired data. Extensive experiments demonstrate that our method produces better enhanced results with visually pleasing contrast and color distributions, as well as well-restored structural details.
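The band representation can be mimicked with fixed filters, as in the sketch below: differences of progressively smoothed copies give coarse-to-fine bands, and a weighted recomposition maps them back to an image. In DRBN both the bands and the recomposition are learned and perceptually guided; plain Gaussian bands are used here only for illustration.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def band_decompose(img, sigmas=(1, 4, 16)):
        # Differences of progressively smoothed copies give coarse-to-fine
        # bands; the last entry is the residual base band.
        levels = [img] + [gaussian_filter(img, (s, s, 0)) for s in sigmas]
        bands = [levels[i] - levels[i + 1] for i in range(len(sigmas))]
        bands.append(levels[-1])
        return bands

    def recompose(bands, weights):
        # One weight per band; DRBN learns this mapping instead.
        return np.clip(sum(w * b for w, b in zip(weights, bands)), 0.0, 1.0)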
Collapse
|
46
|
Shen Z, Fu H, Shen J, Shao L. Modeling and Enhancing Low-Quality Retinal Fundus Images. IEEE TRANSACTIONS ON MEDICAL IMAGING 2021; 40:996-1006. [PMID: 33296301 DOI: 10.1109/tmi.2020.3043495] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
Retinal fundus images are widely used for the clinical screening and diagnosis of eye diseases. However, fundus images captured by operators with various levels of experience vary greatly in quality. Low-quality fundus images increase uncertainty in clinical observation and raise the risk of misdiagnosis. Moreover, due to the special optical beam of fundus imaging and the structure of the retina, natural image enhancement methods cannot be applied directly to address this. In this article, we first analyze the ophthalmoscope imaging system and simulate a reliable degradation model covering the major quality-degrading factors, including uneven illumination, image blurring, and artifacts. Then, based on the degradation model, a clinically oriented fundus enhancement network (cofe-Net) is proposed to suppress global degradation factors while preserving anatomical retinal structures and pathological characteristics for clinical observation and analysis. Experiments on both synthetic and real images demonstrate that our algorithm effectively corrects low-quality fundus images without losing retinal details. Moreover, we show that the fundus correction method can benefit medical image analysis applications, e.g., retinal vessel segmentation and optic disc/cup detection.
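A toy version of the degradation simulation is sketched below, covering the three factors named above: an uneven illumination field, defocus blur, and a faint additive artifact. All parameter ranges are assumptions; the paper derives its degradation model from an analysis of the ophthalmoscope optics.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def degrade_fundus(img, rng=np.random.default_rng(0)):
        # img: clean fundus image, float in [0, 1], shape (H, W, 3).
        h, w = img.shape[:2]
        yy, xx = np.mgrid[0:h, 0:w]
        cy, cx = rng.uniform(0.3, 0.7) * h, rng.uniform(0.3, 0.7) * w
        illum = np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2) / (2 * (0.6 * w) ** 2))
        degraded = img * illum[..., None]                      # uneven illumination
        degraded = gaussian_filter(degraded, sigma=(2, 2, 0))  # defocus blur
        degraded = degraded + 0.05 * rng.random((h, w, 1))     # faint artifact/noise
        return np.clip(degraded, 0.0, 1.0)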
Collapse
|
47
|
Rahim A, Maqbool A, Rana T. Monitoring social distancing under various low light conditions with deep learning and a single motionless time of flight camera. PLoS One 2021; 16:e0247440. [PMID: 33630951 PMCID: PMC7906321 DOI: 10.1371/journal.pone.0247440] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/12/2020] [Accepted: 02/06/2021] [Indexed: 11/19/2022] Open
Abstract
The purpose of this work is to provide an effective social distance monitoring solution for low-light environments in a pandemic situation. The raging coronavirus disease 2019 (COVID-19), caused by the SARS-CoV-2 virus, has brought a global crisis with its deadly spread all over the world. In the absence of an effective treatment and vaccine, efforts to control this pandemic rely strictly on personal preventive actions such as handwashing, face mask usage, environmental cleaning, and, most importantly, social distancing, which is the only expedient approach to cope with this situation. Low-light environments can become a problem in the spread of disease because of nighttime gatherings, especially in summer when temperatures peak and, in cities where homes are congested and lack proper cross-ventilation, people go outside with their families at night for fresh air. In such situations, it is necessary to monitor safety-distance criteria effectively to avoid more positive cases and to control the death toll. In this paper, a deep-learning-based solution is proposed for this problem. The proposed framework utilizes the you only look once v4 (YOLO v4) model for real-time object detection, and a social distance measuring approach is introduced with a single motionless time-of-flight (ToF) camera. The risk factor is indicated based on the calculated distance, and safety-distance violations are highlighted. Experimental results show that the proposed model exhibits good performance, with a 97.84% mean average precision (mAP) score, and the observed mean absolute error (MAE) between actual and measured social distances is 1.01 cm.
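The distance check itself is straightforward once each detected person has a ground-plane position, which the ToF camera supplies via depth; the sketch below flags every pair closer than an assumed 2 m threshold. Detection (YOLO v4) and the pixel-to-world mapping are outside this snippet.

    import numpy as np
    from itertools import combinations

    def violations(positions_cm, safe_cm=200.0):
        # positions_cm: (x, y) ground-plane position of each detected person.
        flags = []
        for (i, p), (j, q) in combinations(enumerate(positions_cm), 2):
            d = float(np.linalg.norm(np.asarray(p) - np.asarray(q)))
            if d < safe_cm:
                flags.append((i, j, d))   # pair indices and distance in cm
        return flags

    print(violations([(0, 0), (150, 0), (500, 40)]))  # flags the first pair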
Collapse
Affiliation(s)
- Adina Rahim
- Department of Computer Software Engineering, NUST, Islamabad, Pakistan
- Ayesha Maqbool
- Department of Computer Software Engineering, NUST, Islamabad, Pakistan
- Tauseef Rana
- Department of Computer Software Engineering, NUST, Islamabad, Pakistan
Collapse
|
48
|
Daydriex: Translating Nighttime Scenes towards Daytime Driving Experience at Night. APPLIED SCIENCES-BASEL 2021. [DOI: 10.3390/app11052013] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
What if the windows of our cars were magic windows that transform dark views outside at night into bright ones, as seen in the daytime? To realize such a window, one important requirement is that the stream of transformed images displayed on the window be of high quality, so that users perceive it as a real daytime scene. Although image-to-image translation techniques based on Generative Adversarial Networks (GANs) have been widely studied, night-to-day image translation is still a challenging task. In this paper, we propose Daydriex, a processing pipeline that generates enhanced daytime translations focusing on road views. Our key idea is to supplement the missing information in dark areas of the input image frames using existing daytime images of the corresponding locations from street view services. We present a detailed processing flow and address several issues in realizing our idea. Our evaluation shows that the results of Daydriex achieve lower Fréchet Inception Distance (FID) scores and higher user perception scores compared to those produced by CycleGAN alone.
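The key idea reduces, in caricature, to mask-and-fill: pixels too dark to carry information in the night frame are filled from a registered daytime street-view reference, as in the sketch below. The real pipeline aligns, translates, and blends far more carefully; the threshold and the assumption of a pre-registered reference are simplifications.

    import numpy as np

    def daydriex_blend(night, day_ref, dark_thresh=0.15):
        # night, day_ref: registered float frames in [0, 1], shape (H, W, 3).
        mask = night.mean(axis=2, keepdims=True) < dark_thresh
        return np.where(mask, day_ref, night)   # fill dark pixels from daytime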
Collapse
|