51
Jia K, Wang X, Tang X. Image transformation based on learning dictionaries across image spaces. IEEE Transactions on Pattern Analysis and Machine Intelligence 2013; 35:367-380. [PMID: 22529324] [DOI: 10.1109/tpami.2012.95]
Abstract
In this paper, we propose a framework for transforming images from a source image space to a target image space, based on learning coupled dictionaries from a training set of paired images. The framework can be used for applications such as image super-resolution and estimation of image intrinsic components (shading and albedo). It is based on a local parametric regression approach, using sparse feature representations over learned coupled dictionaries across the source and target image spaces. After coupled dictionary learning, the sparse coefficient vectors of training image patch pairs are partitioned into easily retrievable local clusters. For any test image patch, we can quickly index into its closest local cluster and perform a local parametric regression between the learned sparse feature spaces. The obtained sparse representation (together with the learned target-space dictionary) provides multiple constraints for each pixel of the target image to be estimated. The final target image is reconstructed from these constraints. The contributions of our proposed framework are threefold. 1) We propose a concept of coupled dictionary learning based on coupled sparse coding, which requires the sparse coefficient vectors of a pair of corresponding source and target image patches to have the same support, i.e., the same indices of nonzero elements. 2) We devise a space-partitioning scheme to divide the high-dimensional but sparse feature space into local clusters. The partitioning enables extremely fast retrieval of the closest local clusters for query patches. 3) Benefiting from sparse feature-based image transformation, our method is more robust to corrupted input data and can be considered a simultaneous image restoration and transformation process. Experiments on intrinsic image estimation and super-resolution demonstrate the effectiveness and efficiency of the proposed method.
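A toy sketch of contributions 1) and 2) above, using invented sparse codes rather than learned dictionaries: sparse coefficient vectors whose nonzero support is shared between source and target codes can be partitioned into clusters keyed by that support, so a query is routed to its local cluster in a single dictionary lookup. Hashing on the support pattern is an assumed simplification of the paper's space-partitioning scheme.

```python
# Toy illustration (invented codes, not the paper's learned dictionaries):
# cluster sparse coefficient vectors by their nonzero support, so that a
# query patch's code can be routed to its local cluster in O(1).

def support(code):
    """Indices of nonzero entries, used here as the cluster key."""
    return tuple(i for i, c in enumerate(code) if c != 0.0)

def build_clusters(codes):
    clusters = {}
    for code in codes:
        clusters.setdefault(support(code), []).append(code)
    return clusters

# Training sparse codes (source side); the coupled target codes would share
# exactly these supports by the paper's coupled sparse coding constraint.
train = [
    [0.9, 0.0, 0.4, 0.0],
    [1.1, 0.0, 0.2, 0.0],
    [0.0, 0.7, 0.0, 0.3],
]
clusters = build_clusters(train)

query = [0.8, 0.0, 0.5, 0.0]
neighbours = clusters[support(query)]   # one lookup finds the local cluster
```

Within the retrieved cluster, the framework would then fit the local parametric regression between the two sparse feature spaces.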
Affiliation(s)
- Kui Jia
- Advanced Digital Sciences Center, University of Illinois at Urbana-Champaign, IL, USA.
53
Biswas S, Bowyer KW, Flynn PJ. Multidimensional scaling for matching low-resolution face images. IEEE Transactions on Pattern Analysis and Machine Intelligence 2012; 34:2019-2030. [PMID: 22201067] [DOI: 10.1109/tpami.2011.278]
Abstract
Face recognition performance degrades considerably when the input images are of Low Resolution (LR), as is often the case for images taken by surveillance cameras or from a large distance. In this paper, we propose a novel approach for matching low-resolution probe images with higher resolution gallery images, which are often available during enrollment, using Multidimensional Scaling (MDS). The ideal scenario is when both the probe and gallery images are of high enough resolution to discriminate across different subjects. The proposed method simultaneously embeds the low-resolution probe images and the high-resolution gallery images in a common space such that the distance between them in the transformed space approximates the distance that would have been obtained had both images been of high resolution. The two mappings are learned simultaneously from high-resolution training images using an iterative majorization algorithm. Extensive evaluation of the proposed approach on the Multi-PIE data set with probe image resolution as low as 8 × 6 pixels illustrates the usefulness of the method. We show that the proposed approach improves the matching performance significantly as compared to performing matching in the low-resolution domain or using super-resolution techniques to obtain a higher resolution test image prior to recognition. Experiments on low-resolution surveillance images from the Surveillance Cameras Face Database further highlight the effectiveness of the approach.
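A drastically simplified illustration of the idea above, not the paper's algorithm: in place of the two MDS embeddings learned by iterative majorization, a single scale factor is fit by least squares so that distances computed after mapping the LR features approximate the distances the HR features would have had. All feature values are invented toy data.

```python
import itertools

def fit_scale(lr_feats, hr_feats):
    """Least-squares scale a minimizing sum (a*d_lr - d_hr)^2 over pairs."""
    num = den = 0.0
    for i, j in itertools.combinations(range(len(lr_feats)), 2):
        d_lr = abs(lr_feats[i] - lr_feats[j])
        d_hr = abs(hr_feats[i] - hr_feats[j])
        num += d_lr * d_hr
        den += d_lr * d_lr
    return num / den

# Toy training set: LR features are the HR features shrunk by a factor of 4.
hr_train = [0.0, 4.0, 8.0, 20.0]
lr_train = [0.0, 1.0, 2.0, 5.0]
a = fit_scale(lr_train, hr_train)

def match(lr_probe, hr_gallery):
    """Nearest HR gallery entry to the mapped LR probe."""
    mapped = a * lr_probe
    return min(range(len(hr_gallery)), key=lambda k: abs(hr_gallery[k] - mapped))

best = match(1.9, hr_train)   # probe maps near 8.0, so gallery index 2
```

The real method is far more flexible: it transforms both probe and gallery features into a common space, rather than rescaling one side.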
Affiliation(s)
- Soma Biswas
- Department of Computer Science and Engineering, University of Notre Dame, 384 Fitzpatrick Hall of Engineering, Notre Dame, IN 46556, USA.
54
Woo J, Murano EZ, Stone M, Prince JL. Reconstruction of high-resolution tongue volumes from MRI. IEEE Transactions on Biomedical Engineering 2012; 59:3511-3524. [PMID: 23033324] [DOI: 10.1109/tbme.2012.2218246]
Abstract
Magnetic resonance images of the tongue have been used in both clinical studies and scientific research to reveal tongue structure. In order to extract different features of the tongue and its relation to the vocal tract, it is beneficial to acquire three orthogonal image volumes, e.g., axial, sagittal, and coronal volumes. To maintain both low noise and high visual detail while minimizing blurring due to involuntary motion, each set of images is acquired with an in-plane resolution that is much better than the through-plane resolution. As a result, any one dataset, by itself, is not ideal for automatic volumetric analyses such as segmentation, registration, and atlas building, or even for visualization when oblique slices are required. This paper presents a method of super-resolution volume reconstruction of the tongue that generates an isotropic image volume from the three orthogonal image volumes. The method uses preprocessing steps, including registration and intensity matching, and a data-combination approach whose edge-preserving property is provided by Markov random field optimization. The performance of the proposed method was demonstrated on 15 clinical datasets; it preserved anatomical details and yielded results that were visually and quantitatively superior to those of competing reconstruction methods.
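A minimal sketch of the data-combination step only; the paper's full pipeline adds registration, intensity matching, and the edge-preserving MRF optimization, all omitted here. Three co-registered estimates of the same voxel are fused with a median, which resists the directional blur that a plain mean would smear across an edge. The 2-voxel "volumes" are invented data.

```python
import statistics

# Fuse three co-registered volume estimates voxel-wise. A median is used as
# a stand-in for the paper's MRF-based combination: it discards a single
# outlying (through-plane-blurred) sample instead of averaging it in.
def fuse(vol_axial, vol_sagittal, vol_coronal):
    fused = {}
    for voxel in vol_axial:
        samples = [vol_axial[voxel], vol_sagittal[voxel], vol_coronal[voxel]]
        fused[voxel] = statistics.median(samples)
    return fused

# One voxel sits on a sharp tissue boundary; the coronal acquisition has
# blurred it through-plane, pulling its value toward the other side.
axial    = {(0, 0, 0): 10.0, (0, 0, 1): 90.0}
sagittal = {(0, 0, 0): 12.0, (0, 0, 1): 88.0}
coronal  = {(0, 0, 0): 11.0, (0, 0, 1): 50.0}   # through-plane blur

vol = fuse(axial, sagittal, coronal)
```

Each orthogonal acquisition is sharp in its own plane, which is why a robust voxel-wise combination can recover detail no single volume contains.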
Affiliation(s)
- Jonghye Woo
- University of Maryland, Baltimore, MD 21201, USA.
55
Kong AWK. IrisCode decompression based on the dependence between its bit pairs. IEEE Transactions on Pattern Analysis and Machine Intelligence 2012; 34:506-520. [PMID: 21808085] [DOI: 10.1109/tpami.2011.159]
Abstract
IrisCode is an iris recognition algorithm developed in 1993 and continuously improved by Daugman. Understanding IrisCode's properties is extremely important because over 60 million people have been mathematically enrolled by the algorithm. In this paper, IrisCode is proved to be a compression algorithm, which is to say its templates are compressed iris images. In our experiments, the compression ratio of these images is 1:655. An algorithm is designed to perform the corresponding decompression by exploiting a graph composed of the bit pairs in IrisCode, prior knowledge from iris image databases, and the theoretical results. To remove artifacts, two postprocessing techniques that perform optimization in the Fourier domain are developed. Decompressed iris images obtained from two public iris image databases are evaluated by visual comparison, two objective image quality assessment metrics, and eight iris recognition methods. The experimental results show that the decompressed iris images retain iris texture, that their quality is roughly equivalent to a JPEG quality factor of 10, and that the iris recognition methods can match the original images with the decompressed images. This paper also discusses the impact of these theoretical and experimental findings on privacy and security.
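Background sketch, assumed from Daugman's IrisCode design rather than taken from this paper: each bit pair is the pair of signs of the real and imaginary responses of a complex Gabor-like filter, which is why the two bits of a pair are statistically dependent, the property the decompression exploits. The filter and signal below are invented toy stand-ins, not iris data.

```python
import math

# Quantize a 1-D signal's phase at one frequency into a two-bit pair: the
# signs of the real (cosine) and imaginary (sine) filter responses. This is
# the phase-quadrant idea behind IrisCode bit pairs, in toy form.
def bit_pair(signal, freq):
    re = sum(s * math.cos(freq * t) for t, s in enumerate(signal))
    im = sum(s * math.sin(freq * t) for t, s in enumerate(signal))
    return (1 if re > 0 else 0, 1 if im > 0 else 0)

# A toy "texture" signal; both quadrature responses come out positive here.
signal = [math.sin(0.5 * t) for t in range(16)]
pair = bit_pair(signal, 0.5)
```

Because both bits of a pair derive from the same underlying filter response, knowing one constrains the other, which is the dependence the paper's graph-based decompression builds on.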
56
Zou WWW, Yuen PC. Very low resolution face recognition problem. IEEE Transactions on Image Processing 2012; 21:327-340. [PMID: 21775262] [DOI: 10.1109/tip.2011.2162423]
Abstract
This paper addresses the very low resolution (VLR) problem in face recognition, in which the resolution of the face image to be recognized is lower than 16 × 16 pixels. With the increasing demand for surveillance-camera-based applications, the VLR problem arises in many face application systems. Existing face recognition algorithms are not able to give satisfactory performance on VLR face images. While face super-resolution (SR) methods can be employed to enhance the resolution of the images, existing learning-based face SR methods do not perform well on such VLR face images. To overcome this problem, this paper proposes a novel approach to learning the relationship between the high-resolution image space and the VLR image space for face SR. Based on this new approach, two constraints, namely a new data constraint and a discriminative constraint, are designed for good visual quality and for face recognition applications under the VLR problem, respectively. Experimental results show that the proposed SR algorithm based on relationship learning outperforms the existing algorithms on public face databases.
Affiliation(s)
- Wilman W W Zou
- Department of Computer Science, Hong Kong Baptist University, Kowloon, Hong Kong.
59
Zhang W, Cham WK. Hallucinating face in the DCT domain. IEEE Transactions on Image Processing 2011; 20:2769-2779. [PMID: 21486718] [DOI: 10.1109/tip.2011.2142001]
Abstract
In this paper, we propose a novel learning-based face hallucination framework built in the DCT domain, which can produce a high-resolution face image from a single low-resolution one. The problem is formulated as inferring the DCT coefficients in the frequency domain instead of estimating pixel intensities in the spatial domain. Our study shows that DC coefficients can be estimated fairly accurately by simple interpolation-based methods. AC coefficients, which carry the information of local features of the face image, cannot be estimated well using interpolation. A simple but effective learning-based inference model is therefore proposed to infer the AC coefficients. Experiments have been conducted to demonstrate the effectiveness of the proposed method in producing high-quality hallucinated face images.
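A small numerical check of the observation above, on invented data rather than the paper's model: the DC coefficient of a patch is proportional to its mean intensity, and average-downsampling preserves the mean, so the DC term survives resolution loss while AC detail does not, which is why DC interpolates well and AC must be learned.

```python
# DC term of the 2-D orthonormal DCT-II of an N x N patch: for the (0,0)
# basis function this reduces to (1/N) * sum of all pixels, i.e. mean * N.
def dct_dc(patch):
    n = len(patch)
    total = sum(sum(row) for row in patch)
    return total / n

def downsample2(patch):
    """2x2 average pooling."""
    n = len(patch)
    return [[(patch[i][j] + patch[i][j+1] + patch[i+1][j] + patch[i+1][j+1]) / 4
             for j in range(0, n, 2)] for i in range(0, n, 2)]

hr = [[10, 12, 30, 32],
      [14, 16, 34, 36],
      [50, 52, 70, 72],
      [54, 56, 74, 76]]
lr = downsample2(hr)

# Average pooling leaves the mean intensity untouched, so the HR patch's DC
# content is recoverable from the LR patch alone; local AC detail is not.
mean_hr = sum(sum(r) for r in hr) / 16
mean_lr = sum(sum(r) for r in lr) / 4
```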
Affiliation(s)
- Wei Zhang
- Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, CA 94720, USA.
60
Sun J, Sun J, Xu Z, Shum HY. Gradient profile prior and its applications in image super-resolution and enhancement. IEEE Transactions on Image Processing 2011; 20:1529-1542. [PMID: 21118774] [DOI: 10.1109/tip.2010.2095871]
Abstract
In this paper, we propose a novel generic image prior, the gradient profile prior, which captures prior knowledge of natural image gradients. In this prior, image gradients are represented by gradient profiles, which are 1-D profiles of gradient magnitudes perpendicular to image structures. We model the gradient profiles with a parametric gradient profile model whose parameters are learned from a large collection of natural images; this learned model constitutes the gradient profile prior. Based on this prior, we propose a gradient field transformation to constrain the gradient fields of the high-resolution image and the enhanced image when performing single-image super-resolution and sharpness enhancement. With this simple but very effective approach, we are able to produce state-of-the-art results. The reconstructed high-resolution images and the enhanced images are sharp and exhibit few ringing or jaggy artifacts.
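A minimal sketch of the transformation idea, assuming a generalized-exponential profile shape g(x) proportional to exp(-|x/sigma|^lam): each gradient sample along a 1-D profile is multiplied by the ratio of a target (sharper) profile to the observed one. The sigma and lam values and the sample profile are invented for illustration, not learned statistics.

```python
import math

def profile(x, sigma, lam):
    """Assumed parametric profile shape: exp(-|x/sigma|**lam)."""
    return math.exp(-abs(x / sigma) ** lam)

def transform_gradient(grads, sigma_obs, sigma_target, lam=1.6):
    """Rescale a 1-D gradient profile toward a sharper target profile."""
    out = []
    for x, g in grads:   # x: signed distance from the edge center
        ratio = profile(x, sigma_target, lam) / profile(x, sigma_obs, lam)
        out.append((x, g * ratio))
    return out

# A blurry edge: gradient magnitude spread over five samples.
blurry = [(-2, 0.2), (-1, 0.6), (0, 1.0), (1, 0.6), (2, 0.2)]
sharp = transform_gradient(blurry, sigma_obs=2.0, sigma_target=1.0)
```

The ratio is 1 at the edge center and shrinks toward the tails, concentrating gradient energy at the edge, which is exactly the sharpening effect the transformation is meant to impose.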
Affiliation(s)
- Jian Sun
- School of Science, Xi’an Jiaotong University, Xi’an 710049, China.
61
Yang J, Wright J, Huang TS, Ma Y. Image super-resolution via sparse representation. IEEE Transactions on Image Processing 2010; 19:2861-2873. [PMID: 20483687] [DOI: 10.1109/tip.2010.2050625]
Abstract
This paper presents a new approach to single-image super-resolution, based on sparse signal representation. Research on image statistics suggests that image patches can be well represented as a sparse linear combination of elements from an appropriately chosen over-complete dictionary. Inspired by this observation, we seek a sparse representation for each patch of the low-resolution input, and then use the coefficients of this representation to generate the high-resolution output. Theoretical results from compressed sensing suggest that under mild conditions, the sparse representation can be correctly recovered from the downsampled signals. By jointly training two dictionaries for the low- and high-resolution image patches, we can enforce the similarity of sparse representations between the low-resolution and high-resolution image patch pair with respect to their own dictionaries. Therefore, the sparse representation of a low-resolution image patch can be applied with the high-resolution image patch dictionary to generate a high-resolution image patch. The learned dictionary pair is a more compact representation of the patch pairs, compared to previous approaches that simply sample a large number of image patch pairs, which reduces the computational cost substantially. The effectiveness of such a sparsity prior is demonstrated for both general image super-resolution and the special case of face hallucination. In both cases, our algorithm generates high-resolution images that are competitive or even superior in quality to images produced by other similar SR methods. In addition, the local sparse modeling of our approach is naturally robust to noise, so the proposed algorithm can handle super-resolution with noisy inputs in a more unified framework.
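A toy sketch of the coupled-dictionary recipe above, with invented 4-D "patches" standing in for the learned dictionaries: greedily sparse-code an LR patch over the LR dictionary with matching pursuit, then apply the same coefficients to the HR dictionary to synthesize the HR patch.

```python
# Greedy sparse coding (matching pursuit, used here as a simple stand-in
# for the paper's sparse solver); returns {atom index: coefficient}.
def matching_pursuit(x, atoms, n_iter=2):
    residual = list(x)
    code = {}
    for _ in range(n_iter):
        # Pick the unit-norm atom most correlated with the residual.
        best = max(range(len(atoms)),
                   key=lambda i: abs(sum(r * a for r, a in zip(residual, atoms[i]))))
        c = sum(r * a for r, a in zip(residual, atoms[best]))
        code[best] = code.get(best, 0.0) + c
        residual = [r - c * a for r, a in zip(residual, atoms[best])]
    return code

def reconstruct(code, atoms):
    out = [0.0] * len(atoms[0])
    for i, c in code.items():
        for k in range(len(out)):
            out[k] += c * atoms[i][k]
    return out

# Unit-norm LR atoms and their coupled HR counterparts (hand-made toys).
lr_atoms = [[0.5, 0.5, 0.5, 0.5], [0.5, 0.5, -0.5, -0.5]]
hr_atoms = [[1.0, 1.0, 1.0, 1.0], [1.0, 1.0, -1.0, -1.0]]

code = matching_pursuit([3.0, 3.0, 1.0, 1.0], lr_atoms)
hr_patch = reconstruct(code, hr_atoms)   # LR code applied to the HR dictionary
```

The key property is that the code is computed entirely in the LR space but, because the dictionaries are coupled, it remains meaningful in the HR space.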
62
Nie F, Xu D, Tsang IWH, Zhang C. Flexible manifold embedding: a framework for semi-supervised and unsupervised dimension reduction. IEEE Transactions on Image Processing 2010; 19:1921-1932. [PMID: 20215078] [DOI: 10.1109/tip.2010.2044958]
Abstract
We propose a unified manifold learning framework for semi-supervised and unsupervised dimension reduction by employing a simple but effective linear regression function to map the new data points. For semi-supervised dimension reduction, we aim to find the optimal prediction labels F for all the training samples X, the linear regression function h(X), and the regression residue F0 = F - h(X) simultaneously. Our new objective function integrates two terms related to label fitness and manifold smoothness, as well as a flexible penalty term defined on the residue F0. Our semi-supervised learning framework, referred to as flexible manifold embedding (FME), can effectively utilize label information from labeled data as well as the manifold structure of both labeled and unlabeled data. By modeling the mismatch between h(X) and F, we show that FME relaxes the hard linear constraint F = h(X) of manifold regularization (MR), making it better able to cope with data sampled from a nonlinear manifold. In addition, we propose a simplified version (referred to as FME/U) for unsupervised dimension reduction. We also show that our proposed framework provides a unified view to explain and understand many semi-supervised, supervised, and unsupervised dimension reduction techniques. Comprehensive experiments on several benchmark databases demonstrate significant improvement over existing dimension reduction algorithms.
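A minimal 1-D sketch of the residue idea above, with toy data rather than FME's full objective: fit the linear part h(x) = w*x + b by least squares, keep the residue F0 = F - h(X) instead of forcing F = h(X), and map out-of-sample points with h alone while training points retain their nonlinear correction F0.

```python
def fit_linear(xs, fs):
    """Ordinary least-squares fit of h(x) = w*x + b."""
    n = len(xs)
    mx, mf = sum(xs) / n, sum(fs) / n
    w = sum((x - mx) * (f - mf) for x, f in zip(xs, fs)) / \
        sum((x - mx) ** 2 for x in xs)
    b = mf - w * mx
    return w, b

xs = [0.0, 1.0, 2.0, 3.0]
fs = [0.1, 1.0, 2.0, 2.9]           # nearly linear labels with a slight bend
w, b = fit_linear(xs, fs)
f0 = [f - (w * x + b) for x, f in zip(xs, fs)]   # residue F0 = F - h(X)

def embed_new(x):
    """Out-of-sample mapping uses the linear part h only."""
    return w * x + b
```

Because F0 is penalized but not forced to zero, the framework tolerates labels that a purely linear map cannot reproduce, which is the relaxation of F = h(X) described above.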
Affiliation(s)
- Feiping Nie
- School of Computer Engineering, Nanyang Technological University, 639798 Singapore
63
Park JS, Lee SW. An example-based face hallucination method for single-frame, low-resolution facial images. IEEE Transactions on Image Processing 2008; 17:1806-1816. [PMID: 18784029] [DOI: 10.1109/tip.2008.2001394]
Abstract
This paper proposes a face hallucination method for the reconstruction of high-resolution facial images from single-frame, low-resolution facial images. The proposed method is derived from example-based hallucination methods and morphable face models. First, we propose a recursive error back-projection method to compensate for residual errors and a region-based reconstruction method to preserve the characteristics of local facial regions. We then define an extended morphable face model, in which an extended face is composed of the interpolated high-resolution face from a given low-resolution face and its original high-resolution equivalent; the extended face is separated into an extended shape and an extended texture. We performed various hallucination experiments using the MPI, XM2VTS, and KF databases, compared reconstruction errors, structural similarity indices, and recognition rates, and examined the effects of face detection errors and shape estimation errors. The encouraging results demonstrate that the proposed methods can improve the performance of face recognition systems; in particular, the proposed method can enhance the resolution of single-frame, low-resolution facial images.
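A minimal sketch of recursive error back-projection on a 1-D "image"; the paper applies it to 2-D faces within a morphable-model pipeline, and everything below (signals, sampling operators, iteration count) is a toy stand-in. The loop repeatedly downsamples the current HR estimate, compares it with the observed LR signal, and adds the upsampled residual back.

```python
def downsample(hr):
    """2x average pooling: the assumed forward (degradation) model."""
    return [(hr[i] + hr[i + 1]) / 2 for i in range(0, len(hr), 2)]

def upsample(lr):
    """Nearest-neighbour 2x: the assumed back-projection operator."""
    out = []
    for v in lr:
        out += [v, v]
    return out

def back_project(lr, hr_init, n_iter=10):
    """Iteratively correct the HR estimate until it reproduces lr."""
    hr = list(hr_init)
    for _ in range(n_iter):
        err = [o - s for o, s in zip(lr, downsample(hr))]
        hr = [h + e for h, e in zip(hr, upsample(err))]
    return hr

lr_obs = [1.0, 3.0]
smooth_guess = [1.0, 1.5, 2.5, 3.0]   # e.g. from an interpolation step
hr_est = back_project(lr_obs, smooth_guess)
```

After convergence the estimate is consistent with the observation: downsampling it reproduces the LR input exactly, while the extra detail of the initial guess is retained where the observation permits.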
Affiliation(s)
- Jeong-Seon Park
- Department of Multimedia, Chonnam National University, Jeollanam-do, Korea.
64
Jia K, Gong S. Generalized face super-resolution. IEEE Transactions on Image Processing 2008; 17:873-886. [PMID: 18482883] [DOI: 10.1109/tip.2008.922421]
Abstract
Existing learning-based face super-resolution (hallucination) techniques generate high-resolution images of a single facial modality (i.e., at a fixed expression, pose, and illumination) given one or a set of low-resolution face images as probes. Here, we present a generalized approach based on a hierarchical tensor (multilinear) space representation for hallucinating high-resolution face images across multiple modalities, achieving generalization to variations in expression and pose. In particular, we formulate a unified tensor that can be decomposed into two parts: a global image-based tensor for modeling the mappings among different facial modalities, and a local patch-based multiresolution tensor for incorporating high-resolution image details. For realistic hallucination of unregistered low-resolution faces contained in raw images, we develop an automatic face alignment algorithm capable of pixel-wise alignment by iteratively warping the probe face to its projection in the space of training face images. Our experiments show not only performance superiority over existing benchmark face super-resolution techniques on single-modal face hallucination, but also the novelty of our approach in coping with multimodal hallucination and its robustness in automatic alignment under practical imaging conditions.
Affiliation(s)
- Kui Jia
- Shenzhen Institute of Advanced Integration Technology, Chinese Academy of Sciences/Chinese Academy of Hong Kong, Shenzhen, China.