1. Zhang K, Li D, Luo W, Liu J, Deng J, Liu W, Zafeiriou S. EDFace-Celeb-1M: Benchmarking Face Hallucination With a Million-Scale Dataset. IEEE Transactions on Pattern Analysis and Machine Intelligence 2023; 45:3968-3978. PMID: 35687621. DOI: 10.1109/tpami.2022.3181579.
Abstract
Recent deep face hallucination methods show stunning performance in super-resolving severely degraded facial images, even surpassing human ability. However, these algorithms are mainly evaluated on non-public synthetic datasets, so it is unclear how they perform on public face hallucination datasets. Meanwhile, most existing datasets do not adequately consider the distribution of races, which makes face hallucination methods trained on them biased toward specific races. To address these two problems, in this paper we build a public Ethnically Diverse Face dataset, EDFace-Celeb-1M, and design a benchmark task for face hallucination. Our dataset includes 1.7 million photos covering different countries, with a relatively balanced race composition. To the best of our knowledge, it is the largest publicly available in-the-wild face hallucination dataset. Associated with this dataset, the paper also contributes various evaluation protocols and provides a comprehensive analysis to benchmark existing state-of-the-art methods. The benchmark evaluations demonstrate the performance and limitations of state-of-the-art algorithms. https://github.com/HDCVLab/EDFace-Celeb-1M.
2. Wan R, Shi B, Li H, Hong Y, Duan LY, Kot AC. Benchmarking Single-Image Reflection Removal Algorithms. IEEE Transactions on Pattern Analysis and Machine Intelligence 2023; 45:1424-1441. PMID: 35439129. DOI: 10.1109/tpami.2022.3168560.
Abstract
Reflection removal has been studied for decades. This paper aims to provide an analysis of the different reflection properties and factors that influence image formation, an up-to-date taxonomy of existing methods, a benchmark dataset, and unified benchmarking evaluations for state-of-the-art (especially learning-based) methods. Specifically, this paper presents a SIngle-image Reflection Removal Plus dataset, "SIR2+", with new consideration of in-the-wild scenarios and glass with diverse colors and non-planar shapes. We further perform quantitative and visual quality comparisons for state-of-the-art single-image reflection removal algorithms. Open problems for improving reflection removal algorithms are discussed at the end. Our dataset and follow-up updates can be found at https://reflectionremoval.github.io/sir2data/.
3. Hu X, Ren W, Yang J, Cao X, Wipf D, Menze B, Tong X, Zha H. Face Restoration via Plug-and-Play 3D Facial Priors. IEEE Transactions on Pattern Analysis and Machine Intelligence 2022; 44:8910-8926. PMID: 34705635. DOI: 10.1109/tpami.2021.3123085.
Abstract
State-of-the-art face restoration methods employ deep convolutional neural networks (CNNs) to learn a mapping between degraded and sharp facial patterns by exploring local appearance knowledge. However, most of these methods do not well exploit facial structures and identity information, and only deal with task-specific face restoration (e.g., face super-resolution or deblurring). In this paper, we propose cross-task and cross-model plug-and-play 3D facial priors to explicitly embed sharp facial structures into the network for general face restoration tasks. Our 3D priors are the first to explore 3D morphable knowledge based on the fusion of parametric descriptions of face attributes (e.g., identity, facial expression, texture, illumination, and face pose). Furthermore, the priors can easily be incorporated into any network and are effective in improving performance and accelerating convergence. First, a 3D face rendering branch is set up to obtain 3D priors of salient facial structures and identity knowledge. Second, to better exploit this hierarchical information (i.e., intensity similarity, 3D facial structure, and identity content), a spatial attention module is designed for the image restoration problems. Extensive face restoration experiments, including face super-resolution and deblurring, demonstrate that the proposed 3D priors achieve superior face restoration results over state-of-the-art algorithms.
4. Cheikh Sidiya A, Li X. Toward extreme face super-resolution in the wild: A self-supervised learning approach. Frontiers in Computer Science 2022. DOI: 10.3389/fcomp.2022.1037435.
Abstract
Extreme face super-resolution (FSR), that is, improving the resolution of face images by an extreme scaling factor (often greater than ×8), has remained underexplored in the literature of low-level vision. Extreme FSR in the wild must address the challenges of both unpaired training data and unknown degradation factors. Inspired by the latest advances in image super-resolution (SR) and self-supervised learning (SSL), we propose a novel two-step approach to FSR by introducing a mid-resolution (MR) image as the stepping stone. In the first step, we leverage ideas from SSL-based SR reconstruction of medical images (e.g., MRI and ultrasound) to model the realistic degradation process of face images in the real world; in the second step, we extract the latent codes from MR images and interpolate them in a self-supervised manner to facilitate artifact-suppressed image reconstruction. Our two-step extreme FSR can be interpreted as the combination of existing self-supervised CycleGAN (step 1) and StyleGAN (step 2) that overcomes the barrier of critical resolution in face recognition. Extensive experimental results show that our two-step approach can significantly outperform existing state-of-the-art FSR techniques, including FSRGAN, Bulat's method, and PULSE, especially for large scaling factors such as ×64.
5. Pei J, Chen Y, Zhao Y, Wang C, Yang X. Self-adjusting multilayer nonlinear coupled mapping for low-resolution face recognition. Applied Soft Computing 2022. DOI: 10.1016/j.asoc.2022.109566.
6. Zhang Y, Tsang IW, Luo Y, Hu C, Lu X, Yu X. Recursive Copy and Paste GAN: Face Hallucination From Shaded Thumbnails. IEEE Transactions on Pattern Analysis and Machine Intelligence 2022; 44:4321-4338. PMID: 33621168. DOI: 10.1109/tpami.2021.3061312.
Abstract
Existing face hallucination methods based on convolutional neural networks (CNNs) have achieved impressive performance on low-resolution (LR) faces under normal illumination conditions. However, their performance degrades dramatically when LR faces are captured under non-uniform illumination. This paper proposes a Recursive Copy and Paste Generative Adversarial Network (Re-CPGAN) to recover authentic high-resolution (HR) face images while compensating for non-uniform illumination. To this end, we develop two key components in Re-CPGAN: internal and recursive external Copy and Paste networks (CPnets). Our internal CPnet exploits facial self-similarity information residing in the input image to enhance facial details, while our recursive external CPnet leverages an external guided face for illumination compensation. Specifically, our recursive external CPnet stacks multiple external Copy and Paste (EX-CP) units in a compact model to learn normal illumination and enhance facial details recursively. By doing so, our method offsets illumination and upsamples facial details progressively in a coarse-to-fine fashion, alleviating the ambiguity of correspondences between LR inputs and external guided inputs. Furthermore, a new illumination compensation loss is developed to effectively capture illumination from the external guided face image. Extensive experiments demonstrate that our method achieves authentic HR face images under uniform illumination with a 16× magnification factor and outperforms state-of-the-art methods qualitatively and quantitatively.
7. Zhang Y, Yu X, Lu X, Liu P. Pro-UIGAN: Progressive Face Hallucination From Occluded Thumbnails. IEEE Transactions on Image Processing 2022; 31:3236-3250. PMID: 35439132. DOI: 10.1109/tip.2022.3167280.
Abstract
In this paper, we study the task of hallucinating an authentic high-resolution (HR) face from an occluded thumbnail. We propose a multi-stage Progressive Upsampling and Inpainting Generative Adversarial Network, dubbed Pro-UIGAN, which exploits facial geometry priors to replenish and upsample (8×) occluded and tiny faces (16×16 pixels). Pro-UIGAN iteratively (1) estimates facial geometry priors for low-resolution (LR) faces and (2) acquires non-occluded HR face images under the guidance of the estimated priors. Our multi-stage hallucination network upsamples and inpaints occluded LR faces in a coarse-to-fine fashion, significantly reducing undesirable artifacts and blurriness. Specifically, we design a novel cross-modal attention module for facial prior estimation, in which an input face and its landmark features are formulated as queries and keys, respectively. Such a design encourages joint feature learning across the input facial and landmark features, and deep feature correspondences are discovered by attention. Thus, facial appearance features and facial geometry priors are learned in a mutually beneficial manner. Extensive experiments show that our Pro-UIGAN attains visually pleasing completed HR faces, thus facilitating downstream tasks, i.e., face alignment, face parsing, face recognition, and expression classification.
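As a loose illustration of the cross-modal attention idea described above (face features as queries, landmark features as keys and values), here is a minimal PyTorch sketch. The dimensions, the use of nn.MultiheadAttention, and the residual fusion are assumptions made for brevity; this is not the authors' Pro-UIGAN implementation.

    import torch
    import torch.nn as nn

    class CrossModalAttention(nn.Module):
        """Face-feature tokens attend to landmark-feature tokens, so appearance
        and geometry features can inform each other."""
        def __init__(self, dim: int = 64, heads: int = 4):
            super().__init__()
            self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

        def forward(self, face_feat: torch.Tensor, landmark_feat: torch.Tensor) -> torch.Tensor:
            # Flatten spatial maps (B, C, H, W) into token sequences (B, H*W, C).
            b, c, h, w = face_feat.shape
            q = face_feat.flatten(2).transpose(1, 2)
            kv = landmark_feat.flatten(2).transpose(1, 2)
            out, _ = self.attn(q, kv, kv)          # queries = face, keys/values = landmarks
            return out.transpose(1, 2).reshape(b, c, h, w) + face_feat  # residual fusion

    if __name__ == "__main__":
        m = CrossModalAttention()
        print(m(torch.randn(2, 64, 16, 16), torch.randn(2, 64, 16, 16)).shape)  # (2, 64, 16, 16)

The residual connection keeps the original appearance features intact while the attention output injects geometry-aware corrections.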
8. Jiang K, Wang Z, Yi P, Lu T, Jiang J, Xiong Z. Dual-Path Deep Fusion Network for Face Image Hallucination. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:378-391. PMID: 33074829. DOI: 10.1109/tnnls.2020.3027849.
Abstract
Along with the performance improvement of deep-learning-based face hallucination methods, various face priors (facial shape, facial landmark heatmaps, or parsing maps) have been used to describe holistic and partial facial features, making the cost of generating super-resolved face images expensive and laborious. To deal with this problem, we present a simple yet effective dual-path deep fusion network (DPDFN) for face image super-resolution (SR) that requires no additional face prior and learns the global facial shape and local facial components through two individual branches. The proposed DPDFN is composed of three components: a global memory subnetwork (GMN), a local reinforcement subnetwork (LRN), and a fusion and reconstruction module (FRM). In particular, GMN characterizes the holistic facial shape by employing recurrent dense residual learning to excavate wide-range context across spatial series. Meanwhile, LRN is committed to learning local facial components, focusing on the patch-wise mapping relations between low-resolution (LR) and high-resolution (HR) space in local regions rather than the entire image. Furthermore, by aggregating the global and local facial information from the preceding dual-path subnetworks, FRM can generate the corresponding high-quality face image. Experimental results on face hallucination on public face data sets and face recognition on real-world data sets (VGGface and SCFace) show its superiority over previous state-of-the-art methods in both visual quality and objective metrics.
9. Chen L, Pan J, Jiang J, Zhang J, Han Z, Bao L. Multi-Stage Degradation Homogenization for Super-Resolution of Face Images With Extreme Degradations. IEEE Transactions on Image Processing 2021; 30:5600-5612. PMID: 34110993. DOI: 10.1109/tip.2021.3086595.
Abstract
Face Super-Resolution (FSR) aims to infer High-Resolution (HR) face images from a captured Low-Resolution (LR) face image with the assistance of external information. Existing FSR methods are less effective for severely degraded LR face images, since they do not account for the large imaging/degradation gap between the different imaging scenarios (i.e., the complex practical imaging process that produces the test LR images versus the simple synthetic degradation that produces the training LR images). In this paper, we propose an image homogenization strategy via re-expression to solve this problem. In contrast to existing methods, we propose a homogenization projection in LR space and HR space as compensation for the classical LR/HR projection, formulating FSR in a multi-stage framework. We then develop a re-expression process to bridge the gap between the complex degradation and the simple degradation, which can remove heterogeneous factors such as serious noise and blur. To further improve the accuracy of the homogenization, we extract an image patch set that is invariant to degradation changes as Robust Neighbor Resources (RNR), with which the two homogenization projections re-express the input LR images and the initially inferred HR images successively. Both quantitative and qualitative results on public datasets demonstrate the effectiveness of the proposed algorithm against state-of-the-art methods.
10. Zhang Y, Tsang IW, Li J, Liu P, Lu X, Yu X. Face Hallucination With Finishing Touches. IEEE Transactions on Image Processing 2021; 30:1728-1743. PMID: 33417545. DOI: 10.1109/tip.2020.3046918.
Abstract
Obtaining a high-quality frontal face image from a low-resolution (LR) non-frontal face image is of primary importance for many facial analysis applications. However, mainstream methods either focus on super-resolving near-frontal LR faces or on frontalizing non-frontal high-resolution (HR) faces. It is desirable to perform both tasks seamlessly for daily-life unconstrained face images. In this paper, we present a novel Vivid Face Hallucination Generative Adversarial Network (VividGAN) for simultaneously super-resolving and frontalizing tiny non-frontal face images. VividGAN consists of coarse-level and fine-level Face Hallucination Networks (FHnets) and two discriminators, i.e., Coarse-D and Fine-D. The coarse-level FHnet generates a frontal coarse HR face, and the fine-level FHnet then makes use of the facial component appearance prior, i.e., fine-grained facial components, to attain a frontal HR face image with authentic details. In the fine-level FHnet, we also design a facial component-aware module that adopts facial geometry guidance as clues to accurately align and merge the frontal coarse HR face and the prior information. Meanwhile, the two-level discriminators are designed to capture both the global outline of a face image and detailed facial characteristics. The Coarse-D enforces the coarsely hallucinated faces to be upright and complete, while the Fine-D focuses on the finely hallucinated ones for sharper details. Extensive experiments demonstrate that our VividGAN achieves photo-realistic frontal HR faces and reaches superior performance in downstream tasks, i.e., face recognition and expression classification, compared with other state-of-the-art methods.
11. Chen C, Gong D, Wang H, Li Z, Wong KYK. Learning Spatial Attention for Face Super-Resolution. IEEE Transactions on Image Processing 2020; 30:1219-1231. PMID: 33315560. DOI: 10.1109/tip.2020.3043093.
Abstract
General image super-resolution techniques have difficulty recovering detailed face structures when applied to low-resolution face images. Recent deep learning based methods tailored for face images have achieved improved performance through joint training with additional tasks such as face parsing and landmark prediction. However, multi-task learning requires extra manually labeled data. Besides, most of the existing works can only generate relatively low-resolution face images (e.g., 128×128), and their applications are therefore limited. In this paper, we introduce a novel SPatial Attention Residual Network (SPARNet) built on our newly proposed Face Attention Units (FAUs) for face super-resolution. Specifically, we introduce a spatial attention mechanism to the vanilla residual blocks. This enables the convolutional layers to adaptively bootstrap features related to the key face structures and pay less attention to less feature-rich regions. This makes the training more effective and efficient, as the key face structures only account for a very small portion of the face image. Visualization of the attention maps shows that our spatial attention network can capture the key face structures well even for very low-resolution faces (e.g., 16×16). Quantitative comparisons on various kinds of metrics (including PSNR, SSIM, identity similarity, and landmark detection) demonstrate the superiority of our method over the current state of the art. We further extend SPARNet with multi-scale discriminators, named SPARNetHD, to produce high-resolution results (i.e., 512×512). We show that SPARNetHD trained with synthetic data can not only produce high-quality and high-resolution outputs for synthetically degraded face images, but also shows good generalization ability to real-world low-quality face images. Codes are available at https://github.com/chaofengc/Face-SPARNet.
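To make the mechanism concrete, the following PyTorch snippet sketches a residual block whose residual branch is reweighted by a learned per-pixel attention map, which is the core idea of the FAU described above. The channel widths and the shape of the small attention branch are placeholder assumptions, not the released SPARNet code.

    import torch
    import torch.nn as nn

    class SpatialAttentionResBlock(nn.Module):
        """Residual block whose update is scaled by a per-pixel attention map,
        so feature-rich face regions receive larger corrections."""
        def __init__(self, channels: int = 64):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(channels, channels, 3, padding=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels, channels, 3, padding=1),
            )
            self.attention = nn.Sequential(        # small branch producing a map in [0, 1]
                nn.Conv2d(channels, channels // 4, 1),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels // 4, 1, 3, padding=1),
                nn.Sigmoid(),
            )

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            res = self.features(x)
            return x + res * self.attention(res)   # attention gates the residual update

    if __name__ == "__main__":
        block = SpatialAttentionResBlock()
        print(block(torch.randn(2, 64, 16, 16)).shape)  # torch.Size([2, 64, 16, 16])

Because the attention map is broadcast over channels, regions with near-zero attention leave the identity path essentially untouched, which matches the intuition that flat skin regions need smaller updates than eyes, mouths, and facial contours.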
12. Yu X, Fernando B, Hartley R, Porikli F. Semantic Face Hallucination: Super-Resolving Very Low-Resolution Face Images with Supplementary Attributes. IEEE Transactions on Pattern Analysis and Machine Intelligence 2020; 42:2926-2943. PMID: 31095477. DOI: 10.1109/tpami.2019.2916881.
Abstract
Given a tiny face image, existing face hallucination methods aim at super-resolving its high-resolution (HR) counterpart by learning a mapping from an exemplary dataset. Since a low-resolution (LR) input patch may correspond to many HR candidate patches, this ambiguity may lead to distorted HR facial details and wrong attributes such as gender reversal and rejuvenation. An LR input contains low-frequency facial components of its HR version while its residual face image, defined as the difference between the HR ground-truth and interpolated LR images, contains the missing high-frequency facial details. We demonstrate that supplementing residual images or feature maps with additional facial attribute information can significantly reduce the ambiguity in face super-resolution. To explore this idea, we develop an attribute-embedded upsampling network, which consists of an upsampling network and a discriminative network. The upsampling network is composed of an autoencoder with skip-connections, which incorporates facial attribute vectors into the residual features of LR inputs at the bottleneck of the autoencoder, and deconvolutional layers used for upsampling. The discriminative network is designed to examine whether super-resolved faces contain the desired attributes or not and then its loss is used for updating the upsampling network. In this manner, we can super-resolve tiny (16×16 pixels) unaligned face images with a large upscaling factor of 8× while reducing the uncertainty of one-to-many mappings remarkably. By conducting extensive evaluations on a large-scale dataset, we demonstrate that our method achieves superior face hallucination results and outperforms the state-of-the-art.
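The key operation above, injecting a facial attribute vector into the features of a 16×16 input before deconvolutional upsampling, can be sketched as follows. The layer counts, channel widths, and the 18-element attribute vector are illustrative assumptions; the paper's network is an autoencoder with skip connections plus a discriminator, both omitted here for brevity.

    import torch
    import torch.nn as nn

    class AttributeEmbeddedUpsampler(nn.Module):
        """Toy sketch: a facial-attribute vector is tiled spatially, concatenated
        with the bottleneck features of a 16x16 LR face, then deconvolved 8x."""
        def __init__(self, n_attrs: int = 18, feat: int = 64):
            super().__init__()
            self.encoder = nn.Sequential(
                nn.Conv2d(3, feat, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(feat, feat, 3, padding=1), nn.ReLU(inplace=True),
            )
            self.decoder = nn.Sequential(
                nn.ConvTranspose2d(feat + n_attrs, feat, 4, stride=2, padding=1), nn.ReLU(inplace=True),  # 16 -> 32
                nn.ConvTranspose2d(feat, feat, 4, stride=2, padding=1), nn.ReLU(inplace=True),            # 32 -> 64
                nn.ConvTranspose2d(feat, feat, 4, stride=2, padding=1), nn.ReLU(inplace=True),            # 64 -> 128
                nn.Conv2d(feat, 3, 3, padding=1),
            )

        def forward(self, lr: torch.Tensor, attrs: torch.Tensor) -> torch.Tensor:
            feats = self.encoder(lr)                              # (B, feat, 16, 16)
            a = attrs[:, :, None, None].expand(-1, -1, *feats.shape[2:])
            return self.decoder(torch.cat([feats, a], dim=1))     # (B, 3, 128, 128)

    if __name__ == "__main__":
        net = AttributeEmbeddedUpsampler()
        out = net(torch.randn(2, 3, 16, 16), torch.rand(2, 18))
        print(out.shape)  # torch.Size([2, 3, 128, 128])

Conditioning the decoder on attributes is what lets the network resolve the one-to-many ambiguity the abstract describes, e.g., disambiguating gender or age that a 16×16 input alone cannot determine.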
13. Shi Y, Li G, Cao Q, Wang K, Lin L. Face Hallucination by Attentive Sequence Optimization with Reinforcement Learning. IEEE Transactions on Pattern Analysis and Machine Intelligence 2020; 42:2809-2824. PMID: 31071019. DOI: 10.1109/tpami.2019.2915301.
Abstract
Face hallucination is a domain-specific super-resolution problem that aims to generate a high-resolution (HR) face image from a low-resolution (LR) input. In contrast to existing patch-wise super-resolution models that divide a face image into regular patches and independently apply LR to HR mapping to each patch, we employ deep reinforcement learning and develop a novel attention-aware face hallucination (Attention-FH) framework, which recurrently learns to attend to a sequence of patches and performs facial part enhancement by fully exploiting the global interdependency of the image. Specifically, our proposed framework incorporates two components: a recurrent policy network for dynamically specifying a new attended region at each time step based on the status of the super-resolved image and the past attended region sequence, and a local enhancement network for selected patch hallucination and global state updating. The Attention-FH model jointly learns the recurrent policy network and the local enhancement network by maximizing a long-term reward that reflects the hallucination result with respect to the whole HR image. Extensive experiments demonstrate that Attention-FH significantly outperforms state-of-the-art methods on in-the-wild face images with large pose and illumination variations.
15. Yu X, Shiri F, Ghanem B, Porikli F. Can We See More? Joint Frontalization and Hallucination of Unaligned Tiny Faces. IEEE Transactions on Pattern Analysis and Machine Intelligence 2020; 42:2148-2164. PMID: 31056489. DOI: 10.1109/tpami.2019.2914039.
Abstract
In popular TV programs (such as CSI), a very low-resolution face image of a person, who in many cases is not even looking at the camera, is digitally super-resolved to a degree that suddenly makes the person's identity visible and recognizable. Of course, we suspect that this is merely a cinematographic special effect and that such a magical transformation of a single image is not technically possible. Or is it? In this paper, we push the boundaries of super-resolving (hallucinating, to be more accurate) a tiny, non-frontal face image to understand how much of this is possible by leveraging the availability of large datasets and deep networks. To this end, we introduce a novel Transformative Adversarial Neural Network (TANN) to jointly frontalize very low-resolution (i.e., 16×16 pixels) out-of-plane rotated face images (including profile views) and aggressively super-resolve them (8×), regardless of their original poses and without using any 3D information. TANN is composed of two components: a transformative upsampling network comprising encoding, spatial transformation, and deconvolutional layers, and a discriminative network that enforces the generated high-resolution frontal faces to lie on the same manifold as real frontal face images. We evaluate our method on a large set of synthesized non-frontal face images to assess its reconstruction performance. Extensive experiments demonstrate that TANN generates both qualitatively and quantitatively superior results, achieving over 4 dB improvement over the state of the art.
16. Peng C, Wang N, Li J, Gao X. Universal Face Photo-Sketch Style Transfer via Multiview Domain Translation. IEEE Transactions on Image Processing 2020; 29:8519-8534. PMID: 32813659. DOI: 10.1109/tip.2020.3016502.
Abstract
Face photo-sketch style transfer aims to convert a representation of a face from the photo (or sketch) domain to the sketch (respectively, photo) domain while preserving the character of the subject. It has wide-ranging applications in law enforcement, forensic investigation, and digital entertainment. However, conventional face photo-sketch synthesis methods usually require training images from both the source domain and the target domain, and are limited in that they cannot be applied to universal conditions where collecting training images in the source domain that match the style of the test image is impractical. This problem entails two major challenges: 1) designing an effective and robust domain translation model for the universal situation in which images of the source domain needed for training are unavailable, and 2) preserving the facial character while performing a transfer to the style of an entire image collection in the target domain. To this end, we present a novel universal face photo-sketch style transfer method that does not need any image from the source domain for training. The regression relationship between an input test image and the entire training image collection in the target domain is inferred via a deep domain translation framework, in which a domain-wise adaption term and a local consistency adaption term are developed. To improve the robustness of the style transfer process, we propose a multiview domain translation method that flexibly leverages a convolutional neural network representation together with hand-crafted features in an optimal way. Qualitative and quantitative comparisons are provided for universal unconstrained conditions where training images from the source domain are unavailable, demonstrating the effectiveness and superiority of our method for universal face photo-sketch style transfer.
17. Fitzpatrick M, Reis GM, Anderson J, Bobadilla L, Al Sabban W, Smith RN. Development of environmental niche models for use in underwater vehicle navigation. IET Cyber-Systems and Robotics 2020. DOI: 10.1049/iet-csr.2019.0042.
Affiliation(s)
- Jacob Anderson: Computer Science, Florida International University, Miami, FL, USA
- Wesam Al Sabban: Computer & Information Systems, Umm Al Qura University, Makkah, Saudi Arabia
- Ryan N. Smith: Physics & Engineering, Fort Lewis College, Durango, CO, USA
19. A New Integrated Approach Based on the Iterative Super-Resolution Algorithm and Expectation Maximization for Face Hallucination. Applied Sciences (Basel) 2020. DOI: 10.3390/app10020718.
Abstract
This paper proposes and verifies a new integrated approach to face hallucination, the process of converting a low-resolution face image to a high-resolution image, based on an iterative super-resolution algorithm and expectation maximization. The current sparse representation for super-resolving generic image patches is not suitable for global face images because of its lower accuracy and high time consumption. To address this, the new method trains a global face sparse representation to reconstruct images with misalignment variations after applying a local geometric co-occurrence matrix. In the testing phase, we propose a hybrid method that combines the sparse global representation with local linear regression using the Expectation Maximization (EM) algorithm, thereby recovering the high-resolution image corresponding to a given low-resolution input. Experimental validation suggests that the proposed method improves overall accuracy and rapidly identifies high-resolution face images without misalignment.
20. Yu X, Porikli F, Fernando B, Hartley R. Hallucinating Unaligned Face Images by Multiscale Transformative Discriminative Networks. International Journal of Computer Vision 2019. DOI: 10.1007/s11263-019-01254-5.
21. Grm K, Scheirer WJ, Struc V. Face Hallucination Using Cascaded Super-Resolution and Identity Priors. IEEE Transactions on Image Processing 2019; 29:2150-2165. PMID: 31613762. DOI: 10.1109/tip.2019.2945835.
Abstract
In this paper, we address the problem of hallucinating high-resolution facial images from low-resolution inputs at high magnification factors. We approach this task with convolutional neural networks (CNNs) and propose a novel (deep) face hallucination model that incorporates identity priors into the learning procedure. The model consists of two main parts: i) a cascaded super-resolution network that upscales the low-resolution facial images, and ii) an ensemble of face recognition models that act as identity priors for the super-resolution network during training. Different from most competing super-resolution techniques that rely on a single model for upscaling (even with large magnification factors), our network uses a cascade of multiple SR models that progressively upscale the low-resolution images in steps of 2×. This characteristic allows us to apply supervision signals (target appearances) at different resolutions and to incorporate identity constraints at multiple scales. The proposed C-SRIP model (Cascaded Super Resolution with Identity Priors) is able to upscale (tiny) low-resolution images captured in unconstrained conditions and produce visually convincing results for diverse low-resolution inputs. We rigorously evaluate the proposed model on the Labeled Faces in the Wild (LFW), Helen, and CelebA datasets and report superior performance compared to the existing state of the art.
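A hedged sketch of how identity priors can supervise a 2×-step SR cascade at multiple scales is given below; the L1 pixel term, the loss weighting, and the stand-in embedder are illustrative assumptions, not the C-SRIP training recipe.

    import torch
    import torch.nn.functional as F

    def cascaded_identity_loss(sr_outputs, hr_targets, embedder, id_weight=0.1):
        """Multi-scale objective: a pixel loss plus an identity-embedding loss
        applied at every stage of a 2x-step SR cascade. `sr_outputs` and
        `hr_targets` are lists of tensors at increasing resolutions; `embedder`
        is any frozen face-recognition network (an assumption here)."""
        loss = 0.0
        for sr, hr in zip(sr_outputs, hr_targets):
            loss = loss + F.l1_loss(sr, hr)                       # pixel supervision per scale
            with torch.no_grad():
                target_id = embedder(hr)                          # identity of the ground truth
            loss = loss + id_weight * F.mse_loss(embedder(sr), target_id)
        return loss

    if __name__ == "__main__":
        # Stand-in embedder so the sketch runs; in practice a pretrained face CNN would be used.
        dummy_embedder = lambda x: F.adaptive_avg_pool2d(x, 1).flatten(1)
        srs = [torch.rand(1, 3, s, s, requires_grad=True) for s in (32, 64, 128)]
        hrs = [torch.rand(1, 3, s, s) for s in (32, 64, 128)]
        print(cascaded_identity_loss(srs, hrs, dummy_embedder).item())

Applying the identity term at every cascade stage, rather than only at the final output, is what allows the identity constraint to act at multiple scales as the abstract describes.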
22. Shao WZ, Xu JJ, Chen L, Ge Q, Wang LQ, Bao BK, Li HB. On potentials of regularized Wasserstein generative adversarial networks for realistic hallucination of tiny faces. Neurocomputing 2019. DOI: 10.1016/j.neucom.2019.07.046.
24. Face hallucination through differential evolution parameter map learning with facial structure prior. Information Sciences 2019. DOI: 10.1016/j.ins.2018.12.064.
25. Sadr J, Krowicki L. Face perception loves a challenge: Less information sparks more attraction. Vision Research 2019; 157:61-83. DOI: 10.1016/j.visres.2019.01.009.
28. Yu X, Porikli F. Imagining the Unimaginable Faces by Deconvolutional Networks. IEEE Transactions on Image Processing 2018; 27:2747-2761. PMID: 29553927. DOI: 10.1109/tip.2018.2808840.
Abstract
We tackle the challenge of constructing 64 pixels for each individual pixel of a thumbnail face image. We show that such an aggressive super-resolution objective can be attained by taking advantage of the global context and making the best use of the prior information portrayed by the image class. Our input image is so small that it can be considered as a patch of itself. Thus, conventional patch-matching-based super-resolution solutions are unsuitable. In order to enhance the resolution while enforcing the global context, we incorporate a pixel-wise appearance similarity objective into a deconvolutional neural network, which allows efficient learning of mappings between low-resolution input images and their high-resolution counterparts in the training data set. Furthermore, the deconvolutional network blends the learned high-resolution constituent parts in an authentic manner, where the face structure is naturally imposed and the global context is preserved. To account for possible artifacts in upsampled feature maps, we employ a sub-network composed of additional convolutional layers. During training, we use only roughly aligned images (eye locations only), yet demonstrate that our network has the capacity to super-resolve face images regardless of pose and facial expression variations. This significantly reduces the requirement of precise face alignment in the data set. Owing to the network topology we apply, our method is robust to translational misalignments. In addition, our method is able to upsample rotationally unaligned faces with data augmentation. Our extensive experimental analysis shows that our method achieves more appealing and superior results than the state of the art.
29. Shi J, Liu X, Zong Y, Qi C, Zhao G. Hallucinating Face Image by Regularization Models in High-Resolution Feature Space. IEEE Transactions on Image Processing 2018; 27:2980-2995. PMID: 29994064. DOI: 10.1109/tip.2018.2813163.
Abstract
In this paper, we propose two novel regularization models, one patch-wise and one pixel-wise, which efficiently reconstruct a high-resolution (HR) face image from a low-resolution (LR) input. Unlike conventional patch-based models, which depend on the assumption of local geometry consistency in the LR and HR spaces, the proposed method directly regularizes the relationship between the target patch and the corresponding training set in the HR space. It thereby avoids the difficult problem of preserving local geometry across resolutions. Taking advantage of kernel functions in efficiently describing intrinsic features, we further conduct the patch-based reconstruction model in a high-dimensional kernel space to capture nonlinear characteristics. Meanwhile, a pixel-based model is proposed to regularize the relationship of pixels in the local neighborhood, which can be employed to enhance fuzzy details in the target HR face image. It privileges the reconstruction of pixels along the dominant orientation of structure, which is useful for preserving high-frequency information on complex edges. Finally, we combine the two reconstruction models into a unified framework. The output HR face image is then optimized by an iterative procedure. Experimental results demonstrate that the proposed face hallucination method outperforms state-of-the-art methods.
30. Zeng X, Huang H, Qi C. Expanding Training Data for Facial Image Super-Resolution. IEEE Transactions on Cybernetics 2018; 48:716-729. PMID: 28166514. DOI: 10.1109/tcyb.2017.2655027.
Abstract
The quality of training data is very important for learning-based facial image super-resolution (SR). The more similar the training data are to the testing input, the better the SR results. To generate a better training set of low/high-resolution facial images for a particular testing input, this paper is the first work that proposes expanding the training data for improving facial image SR. To this end, observing that facial images are highly structured, we propose three constraints, i.e., the local structure constraint, the correspondence constraint, and the similarity constraint, to generate new training data, where local patches are expanded with different expansion parameters. The expanded training data can be used for both patch-based and global facial SR methods. Extensive tests on benchmark databases and real-world images validate the effectiveness of training data expansion in improving SR quality.
32. Romano Y, Elad M. Example-Based Image Synthesis via Randomized Patch-Matching. IEEE Transactions on Image Processing 2018; 27:220-235. PMID: 28910768. DOI: 10.1109/tip.2017.2750419.
Abstract
Image and texture synthesis is a challenging task that has long drawn attention in the fields of image processing, graphics, and machine learning. This problem consists of modeling the desired type of images, either through training examples or via parametric modeling, and then generating images that belong to the same statistical origin. This paper addresses the image synthesis task, focusing on two specific families of images: handwritten digits and face images. It offers two main contributions. First, we suggest a simple and intuitive algorithm capable of generating such images in a unified way. The proposed approach is pyramidal, consisting of upscaling and refining the estimated image several times. For each upscaling stage, the algorithm randomly draws small patches from a patch database and merges them to form a coherent and novel image with high visual quality. The second contribution is a general framework for evaluating generation performance, which combines three aspects: the likelihood, the originality, and the spread of the synthesized images. We assess the proposed synthesis scheme and show that the results are similar in nature to, yet different from, the ones found in the training set, suggesting that a true synthesis effect has been obtained.
34. Yang W, Feng J, Yang J, Zhao F, Liu J, Guo Z, Yan S. Deep Edge Guided Recurrent Residual Learning for Image Super-Resolution. IEEE Transactions on Image Processing 2017; 26:5895-5907. PMID: 28910762. DOI: 10.1109/tip.2017.2750403.
Abstract
In this paper, we consider the image super-resolution (SR) problem. The main challenge of image SR is to recover the high-frequency details of a low-resolution (LR) image that are important for human perception. To address this essentially ill-posed problem, we introduce a Deep Edge Guided REcurrent rEsidual (DEGREE) network to progressively recover the high-frequency details. Different from most existing methods that aim at predicting high-resolution (HR) images directly, DEGREE investigates an alternative route: recovering the difference between a pair of LR and HR images by recurrent residual learning. DEGREE further augments the SR process with edge-preserving capability, namely, the LR image and its edge map jointly infer the sharp edge details of the HR image during the recurrent recovery process. To speed up training convergence, by-pass connections across the multiple layers of DEGREE are constructed. In addition, we offer an understanding of DEGREE from the viewpoint of sub-band frequency decomposition of the image signal and experimentally demonstrate how DEGREE can recover different frequency bands separately. Extensive experiments on three benchmark data sets clearly demonstrate the superiority of DEGREE over well-established baselines, and DEGREE also sets new state-of-the-art results on these data sets. We also present additional experiments on JPEG artifact reduction to demonstrate the generality and flexibility of the proposed DEGREE network in handling other image processing tasks.
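The recurrent residual idea, predicting only the missing high-frequency content on top of a bicubic upsample while feeding the edge map alongside the LR image, might look roughly like the PyTorch sketch below; the unroll depth, channel width, and single weight-shared block are assumptions rather than the published DEGREE architecture.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class EdgeGuidedResidualSR(nn.Module):
        """Loose sketch of edge-guided residual SR: the LR image and its edge map
        are processed jointly, a weight-shared block is unrolled a few times, and
        the network predicts only the residual added to a bicubic upsample."""
        def __init__(self, feat: int = 64, steps: int = 4, scale: int = 2):
            super().__init__()
            self.scale = scale
            self.steps = steps
            self.head = nn.Conv2d(4, feat, 3, padding=1)   # RGB + 1-channel edge map
            self.recurrent = nn.Sequential(
                nn.Conv2d(feat, feat, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(feat, feat, 3, padding=1),
            )
            self.tail = nn.Conv2d(feat, 3, 3, padding=1)

        def forward(self, lr: torch.Tensor, edges: torch.Tensor) -> torch.Tensor:
            base = F.interpolate(lr, scale_factor=self.scale, mode="bicubic", align_corners=False)
            x = self.head(F.interpolate(torch.cat([lr, edges], dim=1),
                                        scale_factor=self.scale, mode="bicubic", align_corners=False))
            for _ in range(self.steps):                    # shared weights across recurrent steps
                x = x + self.recurrent(x)
            return base + self.tail(x)                     # residual learning on top of bicubic

    if __name__ == "__main__":
        net = EdgeGuidedResidualSR()
        print(net(torch.randn(1, 3, 32, 32), torch.randn(1, 1, 32, 32)).shape)  # (1, 3, 64, 64)

Predicting the residual rather than the full HR image keeps the network focused on the high-frequency bands that bicubic interpolation cannot recover, which is the point the sub-band analysis in the abstract makes.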
36. Cao F, Cai M, Tan Y, Zhao J. Image Super-Resolution via Adaptive Regularization and Sparse Representation. IEEE Transactions on Neural Networks and Learning Systems 2016; 27:1550-1561. PMID: 26766382. DOI: 10.1109/tnnls.2015.2512563.
Abstract
Previous studies have shown that image patches can be well represented as a sparse linear combination of elements from an appropriately selected over-complete dictionary. Recently, single-image super-resolution (SISR) via sparse representation using blurred and downsampled low-resolution images has attracted increasing interest, where the aim is to obtain the coefficients for sparse representation by solving an l0 or l1 norm optimization problem. The l0 optimization is a nonconvex and NP-hard problem, while the l1 optimization usually requires many more measurements and presents new challenges even when the image is of the usual size, so we propose a new approach for SISR recovery based on regularized nonconvex optimization. The proposed approach is potentially a powerful method for recovering SISR via sparse representations, and it can yield a sparser solution than the l1 regularization method. We also consider the best choice of lp regularization for all p in (0, 1), for which we propose a scheme that adaptively selects the norm value for each image patch. In addition, we provide a method for adaptively estimating the best value of the regularization parameter λ, and we discuss an alternating iteration method for selecting p and λ. Our experiments demonstrate that the proposed regularized nonconvex optimization method can outperform the convex optimization method and generate higher-quality images.
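In standard coupled-dictionary sparse-coding notation (ours, since the abstract does not fix symbols), the per-patch problem considered here can be written as

    \hat{\alpha} = \arg\min_{\alpha} \; \| y - D_l \alpha \|_2^2 \;+\; \lambda \, \| \alpha \|_p^p , \qquad 0 < p < 1,

where y denotes the (features of the) low-resolution patch and D_l the low-resolution dictionary; in the usual coupled-dictionary formulation the HR patch is then recovered as D_h times the estimated coefficients. The adaptive scheme described above chooses p and λ per patch instead of fixing them globally, which is what distinguishes it from a plain l1 (p = 1) formulation.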
37. Dong C, Loy CC, He K, Tang X. Image Super-Resolution Using Deep Convolutional Networks. IEEE Transactions on Pattern Analysis and Machine Intelligence 2016; 38:295-307. PMID: 26761735. DOI: 10.1109/tpami.2015.2439281.
Abstract
We propose a deep learning method for single image super-resolution (SR). Our method directly learns an end-to-end mapping between the low/high-resolution images. The mapping is represented as a deep convolutional neural network (CNN) that takes the low-resolution image as the input and outputs the high-resolution one. We further show that traditional sparse-coding-based SR methods can also be viewed as a deep convolutional network. But unlike traditional methods that handle each component separately, our method jointly optimizes all layers. Our deep CNN has a lightweight structure, yet demonstrates state-of-the-art restoration quality, and achieves fast speed for practical on-line usage. We explore different network structures and parameter settings to achieve trade-offs between performance and speed. Moreover, we extend our network to cope with three color channels simultaneously, and show better overall reconstruction quality.
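The three-layer mapping described above is compact enough to sketch directly; the following PyTorch snippet uses the commonly cited 9-1-5 kernel sizes and 64/32 filter counts, which should be treated as assumptions about one published configuration rather than the authors' released code.

    import torch
    import torch.nn as nn

    class SRCNN(nn.Module):
        """Three-layer SRCNN-style network: patch extraction, non-linear mapping,
        and reconstruction, applied to a pre-upsampled (bicubic) image."""
        def __init__(self, channels: int = 1):
            super().__init__()
            self.body = nn.Sequential(
                nn.Conv2d(channels, 64, kernel_size=9, padding=4),  # feature extraction
                nn.ReLU(inplace=True),
                nn.Conv2d(64, 32, kernel_size=1),                   # non-linear mapping
                nn.ReLU(inplace=True),
                nn.Conv2d(32, channels, kernel_size=5, padding=2),  # reconstruction
            )

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return self.body(x)

    if __name__ == "__main__":
        # The LR image is bicubically upsampled to the target size before the network.
        lr_upsampled = torch.randn(1, 1, 33, 33)
        print(SRCNN()(lr_upsampled).shape)  # torch.Size([1, 1, 33, 33])

Because the low-resolution input is upsampled before being fed to the network, all three layers operate at the target resolution, which keeps the architecture lightweight at the cost of some redundant computation.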
38. Ultra-Resolving Face Images by Discriminative Generative Networks. Computer Vision – ECCV 2016, 2016. DOI: 10.1007/978-3-319-46454-1_20.
39. Zhu S, Liu S, Loy CC, Tang X. Deep Cascaded Bi-Network for Face Hallucination. Computer Vision – ECCV 2016, 2016. DOI: 10.1007/978-3-319-46454-1_37.
40. Zhang S, Gao X, Wang N, Li J. Robust Face Sketch Style Synthesis. IEEE Transactions on Image Processing 2016; 25:220-232. PMID: 26595919. DOI: 10.1109/tip.2015.2501755.
Abstract
Heterogeneous image conversion is a critical issue in many computer vision tasks, among which example-based face sketch style synthesis provides a convenient way to make artistic effects for photos. However, existing face sketch style synthesis methods generate stylistic sketches depending on many photo-sketch pairs. This requirement limits the generalization ability of these methods to produce arbitrarily stylistic sketches. To handle such a drawback, we propose a robust face sketch style synthesis method, which can convert photos to arbitrarily stylistic sketches based on only one corresponding template sketch. In the proposed method, a sparse representation-based greedy search strategy is first applied to estimate an initial sketch. Then, multi-scale features and Euclidean distance are employed to select candidate image patches from the initial estimated sketch and the template sketch. In order to further refine the obtained candidate image patches, a multi-feature-based optimization model is introduced. Finally, by assembling the refined candidate image patches, the completed face sketch is obtained. To further enhance the quality of synthesized sketches, a cascaded regression strategy is adopted. Compared with the state-of-the-art face sketch synthesis methods, experimental results on several commonly used face sketch databases and celebrity photos demonstrate the effectiveness of the proposed method.
41. Lekadir K, Lange M, Zimmer VA, Hoogendoorn C, Frangi AF. Statistically-driven 3D fiber reconstruction and denoising from multi-slice cardiac DTI using a Markov random field model. Medical Image Analysis 2016; 27:105-116. DOI: 10.1016/j.media.2015.03.006.
43. Bhatt HS, Singh R, Vatsa M, Ratha NK. Improving cross-resolution face matching using ensemble-based co-transfer learning. IEEE Transactions on Image Processing 2014; 23:5654-5669. PMID: 25314702. DOI: 10.1109/tip.2014.2362658.
Abstract
Face recognition algorithms are generally trained for matching high-resolution images and perform well for test data of similar resolution. However, the performance of such systems degrades when a low-resolution face image captured in unconstrained settings, such as video from a surveillance camera, is matched with high-resolution gallery images. The primary challenge here is to extract discriminating features from the limited biometric content in low-resolution images and match them to information-rich high-resolution face images. The problem of cross-resolution face matching is further exacerbated when there is limited labeled positive data for training face recognition algorithms. In this paper, the problem of cross-resolution face matching is addressed where low-resolution images are matched with a high-resolution gallery. A co-transfer learning framework is proposed, which is a cross-pollination of the transfer learning and co-training paradigms and is applied to cross-resolution face matching. The transfer learning component transfers the knowledge learnt while matching high-resolution face images during training to match low-resolution probe images with the high-resolution gallery during testing. On the other hand, the co-training component facilitates this transfer of knowledge by assigning pseudo-labels to unlabeled probe instances in the target domain. The amalgamation of these two paradigms in the proposed ensemble framework enhances the performance of cross-resolution face recognition. Experiments on multiple face databases show the efficacy of the proposed algorithm and compare it with existing algorithms and a commercial system. In addition, several high-profile real-world cases have been used to demonstrate the usefulness of the proposed approach in addressing tough challenges.
44. Fu S, He H, Hou ZG. Learning Race from Face: A Survey. IEEE Transactions on Pattern Analysis and Machine Intelligence 2014; 36:2483-2509. PMID: 26353153. DOI: 10.1109/tpami.2014.2321570.
Abstract
Faces convey a wealth of social signals, including race, expression, identity, age, and gender, all of which have attracted increasing attention from multi-disciplinary research in psychology, neuroscience, and computer science, to name a few. Gleaned from recent advances in computer vision, computer graphics, and machine learning, computational-intelligence-based racial face analysis has been particularly popular due to its significant potential and broad impact in extensive real-world applications, such as security and defense, surveillance, human computer interface (HCI), and biometric-based identification, among others. These studies raise an important question: how can an implicit, non-declarative racial category be conceptually modeled and quantitatively inferred from the face? Nevertheless, race classification is challenging due to its ambiguity and complexity, depending on context and criteria. To address this challenge, significant efforts have recently been reported toward race detection and categorization in the community. This survey provides a comprehensive and critical review of the state-of-the-art advances in face-race perception, principles, algorithms, and applications. We first discuss the race perception problem formulation and motivation, while highlighting the conceptual potential of racial face processing. Next, a taxonomy of feature representation models, algorithms, performance, and racial databases is presented with systematic discussion within a unified learning scenario. Finally, in order to stimulate future research in this field, we also highlight the major opportunities and challenges, as well as potentially important cross-cutting themes and research directions for the issue of learning race from face.
45. Kiechle M, Habigt T, Hawe S, Kleinsteuber M. A Bimodal Co-sparse Analysis Model for Image Processing. International Journal of Computer Vision 2014. DOI: 10.1007/s11263-014-0786-5.
47. Chen YW, Sasatani S, Han XH. Alignment-free and high-frequency compensation in face hallucination. The Scientific World Journal 2014; 2014:903160. PMID: 24693253. PMCID: PMC3944647. DOI: 10.1155/2014/903160.
Abstract
Face hallucination is a learning-based super-resolution technique focused on resolution enhancement of facial images. Although face hallucination is a powerful and useful technique, some detailed high-frequency components cannot be recovered, and it requires accurate alignment between training samples. In this paper, we propose a high-frequency compensation framework based on residual images for face hallucination, in order to improve reconstruction performance. The basic idea of the proposed framework is to reconstruct or estimate a residual image, which can be used to compensate for the high-frequency components of the reconstructed high-resolution image. Three approaches based on the proposed framework are presented. We also propose a patch-based, alignment-free face hallucination method. In the patch-based face hallucination, we first segment facial images into overlapping patches and construct training patch pairs. For an input low-resolution (LR) image, the overlapping patches are likewise used to obtain the corresponding high-resolution (HR) patches by face hallucination, and the whole HR image is then reconstructed by combining all of the HR patches. Experimental results show that the high-resolution images obtained with the proposed approaches improve upon those obtained by the conventional face hallucination method, even when the training data set is unaligned.
Affiliation(s)
- Yen-Wei Chen: College of Computer Science and Information Technology, Central South University of Forestry and Technology, Hunan 410004, China; College of Information Science and Engineering, Ritsumeikan University, Shiga 525-8577, Japan
- So Sasatani: College of Information Science and Engineering, Ritsumeikan University, Shiga 525-8577, Japan
- Xian-Hua Han: College of Information Science and Engineering, Ritsumeikan University, Shiga 525-8577, Japan
48. Biswas S, Aggarwal G, Flynn PJ, Bowyer KW. Pose-robust recognition of low-resolution face images. IEEE Transactions on Pattern Analysis and Machine Intelligence 2013; 35:3037-3049. PMID: 24136439. DOI: 10.1109/tpami.2013.68.
Abstract
Face images captured by surveillance cameras usually have poor resolution in addition to uncontrolled poses and illumination conditions, all of which adversely affect the performance of face matching algorithms. In this paper, we develop a completely automatic, novel approach for matching surveillance quality facial images to high-resolution images in frontal pose, which are often available during enrollment. The proposed approach uses multidimensional scaling to simultaneously transform the features from the poor quality probe images and the high-quality gallery images in such a manner that the distances between them approximate the distances had the probe images been captured in the same conditions as the gallery images. Tensor analysis is used for facial landmark localization in the low-resolution uncontrolled probe images for computing the features. Thorough evaluation on the Multi-PIE dataset and comparisons with state-of-the-art super-resolution and classifier-based approaches are performed to illustrate the usefulness of the proposed approach. Experiments on surveillance imagery further signify the applicability of the framework. We also show the usefulness of the proposed approach for the application of tracking and recognition in surveillance videos.
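One common way to write such a coupled-mapping multidimensional scaling objective (the notation is ours, not the paper's) is

    \min_{W_p,\, W_g} \; \sum_{i,j} \Big( \big\| W_p^{\top} x_i - W_g^{\top} y_j \big\|_2 \;-\; d\big(x_i^{\mathrm{HR}},\, y_j\big) \Big)^2 ,

where x_i are features of the low-resolution, uncontrolled probe images, y_j are features of the high-resolution frontal gallery images, W_p and W_g are the two learned projections, and d(., .) is the distance that would have been measured had the probe been captured under the same conditions as the gallery. Minimizing this stress term is what makes distances in the transformed space behave as if both images came from the enrollment setting.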
49. Wang N, Tao D, Gao X, Li X, Li J. Transductive face sketch-photo synthesis. IEEE Transactions on Neural Networks and Learning Systems 2013; 24:1364-1376. PMID: 24808574. DOI: 10.1109/tnnls.2013.2258174.
Abstract
Face sketch-photo synthesis plays a critical role in many applications, such as law enforcement and digital entertainment. Recently, many face sketch-photo synthesis methods have been proposed under the framework of inductive learning, and these have obtained promising performance. However, these inductive learning-based face sketch-photo synthesis methods may result in high losses for test samples, because inductive learning minimizes the empirical loss for training samples. This paper presents a novel transductive face sketch-photo synthesis method that incorporates the given test samples into the learning process and optimizes the performance on these test samples. In particular, it defines a probabilistic model to optimize both the reconstruction fidelity of the input photo (sketch) and the synthesis fidelity of the target output sketch (photo), and efficiently optimizes this probabilistic model by alternating optimization. The proposed transductive method significantly reduces the expected high loss and improves the synthesis performance for test samples. Experimental results on the Chinese University of Hong Kong face sketch data set demonstrate the effectiveness of the proposed method by comparing it with representative inductive learning-based face sketch-photo synthesis methods.