201
Automated quantification of ultrasonic fatty liver texture based on curvelet transform and SVD. Biocybern Biomed Eng 2018. [DOI: 10.1016/j.bbe.2017.12.004]
202
203
Song TH, Sanchez V, ElDaly H, Rajpoot NM. Dual-Channel Active Contour Model for Megakaryocytic Cell Segmentation in Bone Marrow Trephine Histology Images. IEEE Trans Biomed Eng 2017; 64:2913-2923. [DOI: 10.1109/tbme.2017.2690863]
204
End-to-End Lifelong Learning: a Framework to Achieve Plasticities of both the Feature and Classifier Constructions. Cognit Comput 2017. [DOI: 10.1007/s12559-017-9514-0]
205
A novel deep learning algorithm for incomplete face recognition: Low-rank-recovery network. Neural Netw 2017; 94:115-124. [DOI: 10.1016/j.neunet.2017.06.013]
206
McWalter R, Dau T. Cascaded Amplitude Modulations in Sound Texture Perception. Front Neurosci 2017; 11:485. [PMID: 28955191] [PMCID: PMC5601004] [DOI: 10.3389/fnins.2017.00485]
Abstract
Sound textures, such as crackling fire or chirping crickets, represent a broad class of sounds defined by their homogeneous temporal structure. It has been suggested that the perception of texture is mediated by time-averaged summary statistics measured from early auditory representations. In this study, we investigated the perception of sound textures that contain rhythmic structure, specifically second-order amplitude modulations that arise from the interaction of different modulation rates, previously described as “beating” in the envelope-frequency domain. We developed an auditory texture model that utilizes a cascade of modulation filterbanks that capture the structure of simple rhythmic patterns. The model was examined in a series of psychophysical listening experiments using synthetic sound textures—stimuli generated using time-averaged statistics measured from real-world textures. In a texture identification task, our results indicated that second-order amplitude modulation sensitivity enhanced recognition. Next, we examined the contribution of the second-order modulation analysis in a preference task, where the proposed auditory texture model was preferred over a range of model deviants that lacked second-order modulation rate sensitivity. Lastly, textures that included second-order amplitude modulations appeared to be discriminated via a time-averaging process. Overall, our results demonstrate that the inclusion of second-order modulation analysis generates improvements in the perceived quality of synthetic textures compared to the first-order modulation analysis considered in previous approaches.
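The second-order ("beating") structure described in this abstract can be illustrated numerically. The sketch below is not the authors' texture model; it is a minimal demo with assumed parameters (a 500 Hz carrier amplitude-modulated at the arbitrary rates 20 and 24 Hz) in which a cascade of envelope extraction, modulation bandpass filtering, and a second envelope stage recovers the 4 Hz beat between the two modulation rates:

```python
import numpy as np

fs, dur = 2000, 4.0
t = np.arange(int(fs * dur)) / fs

def analytic_envelope(sig):
    # analytic-signal magnitude via FFT (same idea as scipy.signal.hilbert)
    spec = np.fft.fft(sig)
    h = np.zeros_like(spec)
    h[0] = 1
    h[1:len(sig) // 2] = 2
    h[len(sig) // 2] = 1
    return np.abs(np.fft.ifft(spec * h))

# carrier AM-modulated at two nearby rates -> the envelope "beats" at 4 Hz
env = 1 + 0.4 * np.cos(2 * np.pi * 20 * t) + 0.4 * np.cos(2 * np.pi * 24 * t)
x = env * np.cos(2 * np.pi * 500 * t)

e1 = analytic_envelope(x)                      # first-order envelope
band = np.fft.rfft(e1 - e1.mean())
freqs = np.fft.rfftfreq(len(e1), 1 / fs)
band[(freqs < 15) | (freqs > 30)] = 0          # modulation bandpass, 15-30 Hz
e1_band = np.fft.irfft(band, len(e1))
e2 = analytic_envelope(e1_band)                # second-order envelope
spec2 = np.abs(np.fft.rfft(e2 - e2.mean()))
peak = freqs[np.argmax(spec2)]
print(round(peak, 2))  # ≈ 4.0, the beat rate between the two AM rates
```

A full texture model would replace the single FFT-domain bandpass with a bank of modulation filters and compute time-averaged statistics of each output.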
Affiliation(s)
- Richard McWalter
- Hearing Systems Group, Technical University of Denmark, Kongens Lyngby, Denmark
- Torsten Dau
- Hearing Systems Group, Technical University of Denmark, Kongens Lyngby, Denmark
207
Lan R, Zhou Y. Medical Image Retrieval via Histogram of Compressed Scattering Coefficients. IEEE J Biomed Health Inform 2017; 21:1338-1346. [DOI: 10.1109/jbhi.2016.2623840]
208
Ithapu VK, Kondor R, Johnson SC, Singh V. The Incremental Multiresolution Matrix Factorization Algorithm. Proc IEEE Conf Comput Vis Pattern Recognit 2017; 2017:692-701. [PMID: 29416293] [PMCID: PMC5798492] [DOI: 10.1109/cvpr.2017.81]
Abstract
Multiresolution analysis and matrix factorization are foundational tools in computer vision. In this work, we study the interface between these two distinct topics and obtain techniques to uncover hierarchical block structure in symmetric matrices - an important aspect in the success of many vision problems. Our new algorithm, the incremental multiresolution matrix factorization, uncovers such structure one feature at a time, and hence scales well to large matrices. We describe how this multiscale analysis goes much farther than what a direct "global" factorization of the data can identify. We evaluate the efficacy of the resulting factorizations for relative leveraging within regression tasks using medical imaging data. We also use the factorization on representations learned by popular deep networks, providing evidence of their ability to infer semantic relationships even when they are not explicitly trained to do so. We show that this algorithm can be used as an exploratory tool to improve the network architecture, and within numerous other settings in vision.
209
Unconstrained Still/Video-Based Face Verification with Deep Convolutional Neural Networks. Int J Comput Vis 2017. [DOI: 10.1007/s11263-017-1029-3]
210
Chhatrala R, Jadhav D. Gait recognition based on curvelet transform and PCANet. Pattern Recognit Image Anal 2017. [DOI: 10.1134/s1054661817030075]
211
Bharath R, Rajalakshmi P. Deep scattering convolution network based features for ultrasonic fatty liver tissue characterization. Annu Int Conf IEEE Eng Med Biol Soc 2017; 2017:1982-1985. [PMID: 29060283] [DOI: 10.1109/embc.2017.8037239]
Abstract
Accumulation of excess fat in liver tissue is the leading cause of liver dysfunction, which can lead to diseases ranging from fibrosis to end-stage cirrhosis. Hence, early detection of fatty liver is crucial to prevent permanent liver failure. Depending on the fat concentration in the tissue, the liver is classified as Normal, Grade 1, Grade 2, or Grade 3. Because the texture of liver tissue in an ultrasound image is specific to the fat concentration, grading the fatty liver is formulated as a texture-discrimination problem. In this paper, we present an automated algorithm for grading fatty liver tissue based on features obtained from the invariant scattering convolution network (ISCN). The ISCN, a cascade of complex wavelet transforms, modulus operations, and averaging, produces scattering coefficients (SC) that give stable, invariant representations and map the texture of a fatty liver image to a discriminative manifold, yielding good features for classification. Because the SC are high-dimensional, a compact feature representation is obtained by summing the coefficients. The summed SC features, together with a cubic SVM classifier, gave an accuracy of 96.6% in automatically categorizing the fat content of liver tissue.
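As an illustration of the summed-scattering idea (not the paper's implementation: the authors use the invariant scattering network with Morlet wavelets, whereas the Gaussian Fourier-domain bumps, sizes, and parameters below are made-up stand-ins), a scattering-like cascade of filtering, modulus, and averaging can be sketched in a few lines:

```python
import numpy as np

def filter_bank(n, freqs=(0.1, 0.2), thetas=(0.0, np.pi / 2)):
    """A tiny bank of Fourier-domain band-pass filters. These Gaussian
    bumps are simplified stand-ins for the complex wavelets of an ISCN."""
    fy, fx = np.meshgrid(np.fft.fftfreq(n), np.fft.fftfreq(n), indexing="ij")
    bank = []
    for f0 in freqs:
        for th in thetas:
            cy, cx = f0 * np.sin(th), f0 * np.cos(th)
            bank.append(np.exp(-((fy - cy) ** 2 + (fx - cx) ** 2) / (2 * 0.03 ** 2)))
    return bank

def summed_scattering(img, depth=2):
    """Cascade of filter -> modulus, then average each map and keep one
    number per scattering path (the 'summed SC' compact feature)."""
    bank = filter_bank(img.shape[0])
    layers = [[img.astype(float)]]
    for _ in range(depth):
        prev = layers[-1]
        layers.append([np.abs(np.fft.ifft2(np.fft.fft2(u) * h))
                       for u in prev for h in bank])
    return np.array([m.mean() for layer in layers[1:] for m in layer])

rng = np.random.default_rng(0)
feat = summed_scattering(rng.random((32, 32)))
print(feat.shape)  # (20,): 4 first-order + 16 second-order paths
```

The resulting compact vector (one entry per scattering path) would then be fed to a classifier such as scikit-learn's SVC(kernel='poly', degree=3) to mimic the cubic SVM stage.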
212
Eickenberg M, Gramfort A, Varoquaux G, Thirion B. Seeing it all: Convolutional network layers map the function of the human visual system. Neuroimage 2017; 152:184-194. [DOI: 10.1016/j.neuroimage.2016.10.001]
213
214
Porzi L, Bulo SR, Penate-Sanchez A, Ricci E, Moreno-Noguer F. Learning Depth-Aware Deep Representations for Robotic Perception. IEEE Robot Autom Lett 2017. [DOI: 10.1109/lra.2016.2637444]
215
Kooi T, van Ginneken B, Karssemeijer N, den Heeten A. Discriminating solitary cysts from soft tissue lesions in mammography using a pretrained deep convolutional neural network. Med Phys 2017; 44:1017-1027. [DOI: 10.1002/mp.12110]
Affiliation(s)
- Thijs Kooi
- Department of Radiology and Nuclear Medicine, RadboudUMC, Geert Grooteplein Zuid 10, 6535 Nijmegen, The Netherlands
- Bram van Ginneken
- Department of Radiology and Nuclear Medicine, RadboudUMC, Geert Grooteplein Zuid 10, 6535 Nijmegen, The Netherlands
- Nico Karssemeijer
- Department of Radiology and Nuclear Medicine, RadboudUMC, Geert Grooteplein Zuid 10, 6535 Nijmegen, The Netherlands
- Ard den Heeten
- Department of Radiology, Academic Medical Center Amsterdam, P.O. Box 22660, 1100 DD Amsterdam, The Netherlands
216
3D scattering transforms for disease classification in neuroimaging. Neuroimage Clin 2017; 14:506-517. [PMID: 28289601] [PMCID: PMC5338908] [DOI: 10.1016/j.nicl.2017.02.004]
Abstract
Classifying neurodegenerative brain diseases in MRI aims at correctly assigning discrete labels to MRI scans. Such labels usually refer to a diagnostic decision a learner infers based on what it has learned from a training sample of MRI scans. Classification from MRI voxels separately typically does not provide independent evidence towards or against a class; the information relevant for classification is only present in the form of complicated multivariate patterns (or “features”). Deep learning solves this problem by learning a sequence of non-linear transformations that result in feature representations that are better suited to classification. Such learned features have been shown to drastically outperform hand-engineered features in computer vision and audio analysis domains. However, applying the deep learning approach to the task of MRI classification is extremely challenging, because it requires a very large amount of data which is currently not available. We propose to instead use a three-dimensional scattering transform, which resembles a deep convolutional neural network but has no learnable parameters. Furthermore, the scattering transform linearizes diffeomorphisms (due to e.g. residual anatomical variability in MRI scans), making the different disease states more easily separable using a linear classifier. In experiments on brain morphometry in Alzheimer's disease, and on white matter microstructural damage in HIV, scattering representations are shown to be highly effective for the task of disease classification. For instance, in semi-supervised learning of progressive versus stable MCI, we reach an accuracy of 82.7%. We also present a visualization method to highlight areas that provide evidence for or against a certain class, both on an individual and group level.
- We developed and implemented a feature extraction method based on the three-dimensional (3D) scattering transform.
- We tested its ability to discriminate diseased subjects from healthy subjects and subjects with mild cognitive impairment.
- The proposed methodology achieves higher accuracy than the best competing methods.
217
Zeng R, Wu J, Shao Z, Chen Y, Chen B, Senhadji L, Shu H. Color image classification via quaternion principal component analysis network. Neurocomputing 2016. [DOI: 10.1016/j.neucom.2016.08.006]
218
Abstract
Deep learning is a branch of machine learning that tries to model high-level abstractions of data using multiple layers of neurons with complex structures or non-linear transformations. As the amount of data and the power of computation increase, neural networks with more complex structures have attracted widespread attention and been applied to various fields. This paper provides an overview of deep learning in neural networks, including popular architecture models and training algorithms.
Affiliation(s)
- Xing Hao
- EECS, University of California, Irvine, CA 92617, USA
- Guigang Zhang
- EECS, University of California, Irvine, CA 92617, USA
- Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
- Shang Ma
- EECS, University of California, Irvine, CA 92617, USA
219
Assessing the utility of autofluorescence-based pulmonary optical endomicroscopy to predict the malignant potential of solitary pulmonary nodules in humans. Sci Rep 2016; 6:31372. [PMID: 27550539] [PMCID: PMC4993998] [DOI: 10.1038/srep31372]
Abstract
Solitary pulmonary nodules are common, often incidental findings on chest CT scans. The investigation of pulmonary nodules is time-consuming and often leads to protracted follow-up with ongoing radiological surveillance; however, clinical calculators that assess the risk of a nodule being malignant exist to help stratify patients. Furthermore, recent advances in interventional pulmonology include the ability both to navigate to nodules and to perform autofluorescence endomicroscopy. In this study we assessed the efficacy of incorporating additional information from label-free fibre-based optical endomicroscopy of the nodule into the assessment of malignancy risk. Using image analysis and machine learning approaches, we find that this information does not yield any gain in predictive performance in a cohort of patients. Further advances with pulmonary endomicroscopy will require the addition of molecular tracers to improve the information obtained from this procedure.
220
221
Joosten ERM, Shamma SA, Lorenzi C, Neri P. Dynamic Reweighting of Auditory Modulation Filters. PLoS Comput Biol 2016; 12:e1005019. [PMID: 27398600] [PMCID: PMC4939963] [DOI: 10.1371/journal.pcbi.1005019]
Abstract
Sound waveforms convey information largely via amplitude modulations (AM). A large body of experimental evidence has provided support for a modulation (bandpass) filterbank. Details of this model have varied over time, partly reflecting different experimental conditions and diverse datasets from distinct task strategies, contributing uncertainty to the bandwidth measurements and leaving important issues unresolved. We adopt here a purely data-driven measurement approach in which we first demonstrate how different models can be subsumed within a common 'cascade' framework, and then proceed to characterize the cascade via system identification analysis using a single stimulus/task specification, and hence stable task rules largely unconstrained by any model or parameters. Observers were required to detect a brief change in level superimposed onto random level changes that served as AM noise; the relationship between trial-by-trial noisy fluctuations and corresponding human responses enables targeted identification of distinct cascade elements. The resulting measurements exhibit a complex, dynamic picture in which human perception of auditory modulations appears adaptive in nature, evolving from an initial lowpass mode to bandpass modes (with broad tuning, Q∼1) following repeated stimulus exposure.
Affiliation(s)
- Eva R. M. Joosten
- Laboratoire Psychologie de la Perception (CNRS UMR 8242) and Université Paris Descartes, Sorbonne Paris Cité, Paris, France
- Shihab A. Shamma
- Laboratoire des Systèmes Perceptifs (CNRS UMR 8248) and Département d’études cognitives, Ecole Normale Supérieure, PSL Research University, Paris, France
- Department of Electrical and Computer Engineering, Institute for Systems Research, University of Maryland, College Park, Maryland, United States of America
- Christian Lorenzi
- Laboratoire des Systèmes Perceptifs (CNRS UMR 8248) and Département d’études cognitives, Ecole Normale Supérieure, PSL Research University, Paris, France
- Peter Neri
- Laboratoire des Systèmes Perceptifs (CNRS UMR 8248) and Département d’études cognitives, Ecole Normale Supérieure, PSL Research University, Paris, France
222
Mallat S. Understanding deep convolutional networks. Philos Trans A Math Phys Eng Sci 2016; 374:20150203. [PMID: 26953183] [PMCID: PMC4792410] [DOI: 10.1098/rsta.2015.0203]
Abstract
Deep convolutional networks provide state-of-the-art classification and regression results over many high-dimensional problems. We review their architecture, which scatters data with a cascade of linear filter weights and nonlinearities. A mathematical framework is introduced to analyse their properties. Computations of invariants involve multiscale contractions with wavelets, the linearization of hierarchical symmetries, and sparse separations. Applications are discussed.
Affiliation(s)
- Stéphane Mallat
- École Normale Supérieure, CNRS, PSL, 45 rue d'Ulm, Paris, France
223
Tygert M, Bruna J, Chintala S, LeCun Y, Piantino S, Szlam A. A Mathematical Motivation for Complex-Valued Convolutional Networks. Neural Comput 2016; 28:815-825. [PMID: 26890348] [DOI: 10.1162/neco_a_00824]
Abstract
A complex-valued convolutional network (convnet) implements the repeated application of the following composition of three operations, recursively applying the composition to an input vector of nonnegative real numbers: (1) convolution with complex-valued vectors, followed by (2) taking the absolute value of every entry of the resulting vectors, followed by (3) local averaging. For processing real-valued random vectors, complex-valued convnets can be viewed as data-driven multiscale windowed power spectra, data-driven multiscale windowed absolute spectra, data-driven multiwavelet absolute values, or (in their most general configuration) data-driven nonlinear multiwavelet packets. Indeed, complex-valued convnets can calculate multiscale windowed spectra when the convnet filters are windowed complex-valued exponentials. Standard real-valued convnets, using rectified linear units (ReLUs), sigmoidal (e.g., logistic or tanh) nonlinearities, or max pooling, for example, do not obviously exhibit the same exact correspondence with data-driven wavelets (whereas for complex-valued convnets, the correspondence is much more than just a vague analogy). Courtesy of the exact correspondence, the remarkably rich and rigorous body of mathematical analysis for wavelets applies directly to (complex-valued) convnets.
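The correspondence described in this abstract (convolve with a windowed complex exponential, take the entrywise absolute value, locally average, and the layer behaves like a windowed spectrum analyser) can be checked numerically. This is a hedged illustration, not code from the paper; the Hanning window, filter length, and pooling size are arbitrary choices:

```python
import numpy as np

def complex_filter(freq, width=32):
    """Windowed complex exponential: the filter for which a convnet layer
    computes a windowed spectrum (the window choice is an assumption)."""
    n = np.arange(width)
    return np.hanning(width) * np.exp(2j * np.pi * freq * n)

def convnet_layer(x, filt, pool=16):
    """(1) convolution, (2) entrywise absolute value, (3) local averaging."""
    y = np.abs(np.convolve(x, filt, mode="valid"))
    m = len(y) // pool
    return y[: m * pool].reshape(m, pool).mean(axis=1)

n = np.arange(1024)
x = 1.0 + np.cos(2 * np.pi * 0.1 * n)          # nonnegative real input vector
r_matched = convnet_layer(x, complex_filter(0.1)).mean()
r_off = convnet_layer(x, complex_filter(0.3)).mean()
print(r_matched > r_off)  # True: strong response only near the input's frequency
```

A learned complex-valued convnet would fit the filter taps to data rather than fixing them to windowed exponentials, which is exactly the "data-driven windowed spectra" view the abstract describes.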
224
Arun KS, Govindan VK. A context-aware semantic modeling framework for efficient image retrieval. Int J Mach Learn Cybern 2016. [DOI: 10.1007/s13042-016-0498-y]
225
Kolouri S, Park SR, Rohde GK. The Radon Cumulative Distribution Transform and Its Application to Image Classification. IEEE Trans Image Process 2016; 25:920-934. [PMID: 26685245] [PMCID: PMC4871726] [DOI: 10.1109/tip.2015.2509419]
Abstract
Invertible image representation methods (transforms) are routinely employed as low-level image processing operations based on which feature extraction and recognition algorithms are developed. Most transforms in current use (e.g., Fourier, wavelet, and so on) are linear transforms and, by themselves, are unable to substantially simplify the representation of image classes for classification. Here, we describe a nonlinear, invertible, low-level image processing transform based on combining the well-known Radon transform for image data, and the 1D cumulative distribution transform proposed earlier. We describe a few of the properties of this new transform, and with both theoretical and experimental results show that it can often render certain problems linearly separable in a transform space.
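The 1D cumulative distribution transform on which the Radon-CDT builds can be sketched directly (the Radon projection step is omitted; the grid sizes and Gaussian test densities are illustrative assumptions). The key property, that a translation of the density becomes a constant shift of the transform, and is therefore linearly separable, shows up numerically:

```python
import numpy as np

def cdt(density, x, levels):
    """1D cumulative distribution transform w.r.t. a uniform reference:
    evaluate the quantile function of `density` at the reference levels."""
    cdf = np.cumsum(density)
    cdf = cdf / cdf[-1]
    return np.interp(levels, cdf, x)

x = np.linspace(-10, 10, 2001)
levels = np.linspace(0.01, 0.99, 99)
gauss = lambda mu: np.exp(-0.5 * (x - mu) ** 2)

f0 = cdt(gauss(0.0), x, levels)
f2 = cdt(gauss(2.0), x, levels)
# translating the density by 2 shifts the transform by (approximately) 2
print(np.allclose(f2 - f0, 2.0, atol=0.05))  # True
```

The Radon-CDT applies this 1D transform to each angular projection of the image, so translations and certain other deformations of 2D densities likewise become simple shifts in transform space.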
Affiliation(s)
- Soheil Kolouri
- Department of Biomedical Engineering, Carnegie Mellon University, Pittsburgh, PA, 15213
- Se Rim Park
- Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA, 15213
- Gustavo K. Rohde
- Department of Biomedical Engineering, Carnegie Mellon University, Pittsburgh, PA, 15213
- Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA, 15213
- Lane Center for Computational Biology, Carnegie Mellon University, Pittsburgh, PA, 15213
226
Liu L, Fieguth P, Wang X, Pietikäinen M, Hu D. Evaluation of LBP and Deep Texture Descriptors with a New Robustness Benchmark. Computer Vision – ECCV 2016. [DOI: 10.1007/978-3-319-46487-9_5]
227
Yang F, Xia GS, Liu G, Zhang L, Huang X. Dynamic texture recognition by aggregating spatial and temporal features via ensemble SVMs. Neurocomputing 2016. [DOI: 10.1016/j.neucom.2015.09.004]
228
Chan TH, Jia K, Gao S, Lu J, Zeng Z, Ma Y. PCANet: A Simple Deep Learning Baseline for Image Classification? IEEE Trans Image Process 2015; 24:5017-5032. [PMID: 26340772] [DOI: 10.1109/tip.2015.2475625]
Abstract
In this paper, we propose a very simple deep learning network for image classification that is based on very basic data processing components: 1) cascaded principal component analysis (PCA); 2) binary hashing; and 3) blockwise histograms. In the proposed architecture, the PCA is employed to learn multistage filter banks. This is followed by simple binary hashing and block histograms for indexing and pooling. This architecture is thus called the PCA network (PCANet) and can be extremely easily and efficiently designed and learned. For comparison and to provide a better understanding, we also introduce and study two simple variations of PCANet: 1) RandNet and 2) LDANet. They share the same topology as PCANet, but their cascaded filters are either randomly selected or learned from linear discriminant analysis. We have extensively tested these basic networks on many benchmark visual data sets for different tasks, including Labeled Faces in the Wild (LFW) for face verification; the MultiPIE, Extended Yale B, AR, Facial Recognition Technology (FERET) data sets for face recognition; and MNIST for hand-written digit recognition. Surprisingly, for all tasks, such a seemingly naive PCANet model is on par with the state-of-the-art features either prefixed, highly hand-crafted, or carefully learned [by deep neural networks (DNNs)]. Even more surprisingly, the model sets new records for many classification tasks on the Extended Yale B, AR, and FERET data sets and on MNIST variations. Additional experiments on other public data sets also demonstrate the potential of PCANet to serve as a simple but highly competitive baseline for texture classification and object recognition.
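A single-stage, single-block sketch of the three ingredients named above (PCA filter learning, binary hashing, blockwise histograms) is given below. This is a toy reduction of PCANet with made-up sizes (7x7 patches, 4 filters, one histogram block); the paper cascades two PCA stages and uses many overlapping blocks:

```python
import numpy as np

def pca_filters(images, k=7, n_filters=4):
    """Learn convolution filters as the leading principal components of
    mean-removed k x k patches collected from the training images."""
    patches = []
    for im in images:
        for i in range(im.shape[0] - k + 1):
            for j in range(im.shape[1] - k + 1):
                p = im[i:i + k, j:j + k].ravel()
                patches.append(p - p.mean())
    X = np.array(patches)
    _, vecs = np.linalg.eigh(X.T @ X)          # eigh: ascending eigenvalues
    return vecs[:, -n_filters:].T.reshape(n_filters, k, k)

def conv_valid(im, f):
    k = f.shape[0]
    out = np.empty((im.shape[0] - k + 1, im.shape[1] - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(im[i:i + k, j:j + k] * f)
    return out

def pcanet_feature(im, filters):
    """Binary hashing (threshold each response at 0, pack bits into an
    integer code) followed by a histogram over the single block."""
    maps = [conv_valid(im, f) > 0 for f in filters]
    code = sum((m.astype(int) << b) for b, m in enumerate(maps))
    return np.bincount(code.ravel(), minlength=2 ** len(filters))

rng = np.random.default_rng(1)
train = [rng.random((20, 20)) for _ in range(5)]
filters = pca_filters(train)
feat = pcanet_feature(train[0], filters)
print(feat.shape, feat.sum())  # (16,) 196: a 16-bin histogram over 14x14 codes
```

With 4 filters the hash codes span 0..15, so each block contributes a 16-bin histogram; the full PCANet concatenates such histograms over many blocks and both stages.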
229
Kim WH, Ravi SN, Johnson SC, Okonkwo OC, Singh V. On Statistical Analysis of Neuroimages with Imperfect Registration. Proc IEEE Int Conf Comput Vis 2015; 2015:666-674. [PMID: 27042168] [PMCID: PMC4816646] [DOI: 10.1109/iccv.2015.83]
Abstract
A variety of studies in neuroscience/neuroimaging seek to perform statistical inference on acquired brain image scans for diagnosis as well as for understanding the pathological manifestation of diseases. To do so, an important first step is to register (or co-register) all of the image data into a common coordinate system. This permits meaningful comparison of the intensities at each voxel across groups (e.g., diseased versus healthy) to evaluate the effects of the disease and/or to use machine learning algorithms in a subsequent step. But errors in the underlying registration make this problematic: they either decrease the statistical power or make the follow-up inference tasks less effective or accurate. In this paper, we derive a novel algorithm that offers immunity to local errors in the underlying deformation field obtained from registration procedures. By deriving a deformation-invariant representation of the image, the downstream analysis can be made more robust, as if one had access to a (hypothetical) far superior registration procedure. Our algorithm is based on recent work on the scattering transform. Using this as a starting point, we show how results from harmonic analysis (especially non-Euclidean wavelets) yield strategies for designing deformation- and additive-noise-invariant representations of large 3-D brain image volumes. We present a set of results on synthetic and real brain images where we achieve robust statistical analysis even in the presence of substantial deformation errors; here, standard analysis procedures significantly under-perform and fail to identify the true signal.
Affiliation(s)
- Won Hwa Kim
- Dept. of Computer Sciences, University of Wisconsin, Madison, WI; Wisconsin Alzheimer's Disease Research Center, University of Wisconsin, Madison, WI
- Sathya N Ravi
- Dept. of Industrial and Systems Engineering, University of Wisconsin, Madison, WI
- Sterling C Johnson
- Wisconsin Alzheimer's Disease Research Center, University of Wisconsin, Madison, WI; GRECC, William S. Middleton VA Hospital, Madison, WI
- Ozioma C Okonkwo
- Wisconsin Alzheimer's Disease Research Center, University of Wisconsin, Madison, WI; GRECC, William S. Middleton VA Hospital, Madison, WI
- Vikas Singh
- Dept. of Computer Sciences, University of Wisconsin, Madison, WI; Dept. of Biostatistics & Med. Informatics, University of Wisconsin, Madison, WI; Wisconsin Alzheimer's Disease Research Center, University of Wisconsin, Madison, WI
230
Abstract
Quantitative characterization and comparison of tongue motion during speech and swallowing present fundamental challenges because of striking variations in tongue structure and motion across subjects. A reliable and objective description of the dynamics of tongue motion requires the consistent integration of inter-subject variability to detect subtle changes in populations. To this end, we present an approach to constructing an unbiased spatio-temporal atlas of the tongue during speech for the first time, based on cine-MRI from twenty-two normal subjects. First, we create a common spatial space using images from the reference time frame, a neutral position, in which the unbiased spatio-temporal atlas can be created. Second, we transport images from all time frames of all subjects into this common space via a single transformation. Third, we construct atlases for each time frame via groupwise diffeomorphic registration, which serve as the initial spatio-temporal atlas. Fourth, we update the spatio-temporal atlas by realigning each time sequence based on the Lipschitz norm on diffeomorphisms between each subject and the initial atlas. We evaluate and compare different configurations, such as similarity measures, for building the atlas. Our proposed method permits accurate and objective characterization of the main pattern of tongue surface motion.
231
Antonakos E, Alabort-i-Medina J, Tzimiropoulos G, Zafeiriou SP. Feature-based Lucas-Kanade and active appearance models. IEEE Trans Image Process 2015; 24:2617-2632. [PMID: 25966479] [DOI: 10.1109/tip.2015.2431445]
Abstract
Lucas-Kanade and active appearance models are among the most commonly used methods for image alignment and facial fitting, respectively. They both utilize nonlinear gradient descent, which is usually applied on intensity values. In this paper, we propose the employment of highly descriptive, densely sampled image features for both problems. We show that the strategy of warping the multichannel dense feature image at each iteration is more beneficial than extracting features after warping the intensity image at each iteration. Motivated by this observation, we demonstrate robust and accurate alignment and fitting performance using a variety of powerful feature descriptors. Especially with the employment of histograms of oriented gradient and scale-invariant feature transform features, our method significantly outperforms the current state-of-the-art results on in-the-wild databases.
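The "warp the multichannel dense feature image" strategy reduces, at each iteration, to applying the same geometric warp to every feature channel. A minimal bilinear warp over an H x W x C feature array is sketched below (illustrative only: the array sizes and the constant displacement field are assumptions, and the paper's fitting loop, Jacobians, and feature extraction are omitted):

```python
import numpy as np

def bilinear_warp(feat, flow):
    """Warp a multichannel feature image (H x W x C) by a dense
    displacement field (H x W x 2), interpolating each channel alike."""
    H, W, C = feat.shape
    gy, gx = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    ys = np.clip(gy + flow[..., 0], 0, H - 1.001)
    xs = np.clip(gx + flow[..., 1], 0, W - 1.001)
    y0, x0 = ys.astype(int), xs.astype(int)
    wy, wx = ys - y0, xs - x0
    return ((1 - wy)[..., None] * (1 - wx)[..., None] * feat[y0, x0]
            + (1 - wy)[..., None] * wx[..., None] * feat[y0, x0 + 1]
            + wy[..., None] * (1 - wx)[..., None] * feat[y0 + 1, x0]
            + wy[..., None] * wx[..., None] * feat[y0 + 1, x0 + 1])

rng = np.random.default_rng(0)
F = rng.random((8, 8, 3))          # e.g. a dense 3-channel feature image
shift = np.full((8, 8, 2), 1.0)    # displace every pixel by (1, 1)
Fw = bilinear_warp(F, shift)
print(np.allclose(Fw[:6, :6], F[1:7, 1:7]))  # True: integer shift crops
```

Warping the feature image (rather than re-extracting features from the warped intensity image) means the descriptors are computed once, which is the efficiency argument the abstract makes.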
|
232
|
On size invariance texture image retrieval by fuzzy logic classifier and scattering statistical features. Pattern Anal Appl 2015. [DOI: 10.1007/s10044-015-0509-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
|
233
|
Weng D, Wang Y, Gong M, Tao D, Wei H, Huang D. DERF: distinctive efficient robust features from the biological modeling of the P ganglion cells. IEEE Trans Image Process 2015; 24:2287-2302. [PMID: 25769164 DOI: 10.1109/tip.2015.2409739] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/04/2023]
Abstract
Studies in neuroscience and biological vision have shown that the human retina has strong computational power, and its information representation supports vision tasks on both the ventral and dorsal pathways. In this paper, a new local image descriptor, termed distinctive efficient robust features (DERF), is derived by modeling the response and distribution properties of the parvocellular-projecting ganglion cells in the primate retina. DERF features an exponential scale distribution, an exponential grid structure, and a circularly symmetric difference-of-Gaussians (DoG) function used as the convolution kernel, all of which are consistent with the characteristics of the ganglion-cell array found in neurophysiology, anatomy, and biophysics. In addition, a new explanation for local descriptor design is presented from the perspective of wavelet tight frames. The DoG is naturally a wavelet, and the structure of the grid-point array in our descriptor is closely related to the spatial sampling of wavelets. The DoG wavelet itself forms a frame, and when we modulate the parameters of our descriptor to make the frame tighter, the performance of the DERF descriptor improves accordingly. This is verified by designing a tight-frame DoG, which leads to much better performance. Extensive experiments on the image matching task on the multiview stereo correspondence data set demonstrate that DERF outperforms state-of-the-art methods for both hand-crafted and learned descriptors, while remaining robust and being much faster to compute.
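As a rough illustration of the ingredients named above, the following sketch builds a zero-mean circularly symmetric DoG kernel and a ring grid whose radii grow exponentially; the function names, kernel size, and growth factor are illustrative assumptions, not the paper's parameters.

```python
import numpy as np

def dog_kernel(size, sigma, k=1.6):
    """Zero-mean, circularly symmetric difference-of-Gaussians kernel."""
    ax = np.arange(size) - size // 2
    r2 = ax[None, :] ** 2 + ax[:, None] ** 2
    g = lambda s: np.exp(-r2 / (2 * s * s)) / (2 * np.pi * s * s)
    d = g(sigma) - g(k * sigma)
    return d - d.mean()                        # remove any residual DC response

def exponential_grid(n_rings=3, n_angles=8, r0=2.0, growth=1.8):
    """Sample points on rings whose radii grow exponentially, echoing the
    density fall-off of the retinal ganglion-cell array away from the fovea."""
    pts = [(0.0, 0.0)]                         # central sample
    for i in range(n_rings):
        r = r0 * growth ** i
        for a in np.linspace(0.0, 2 * np.pi, n_angles, endpoint=False):
            pts.append((r * np.cos(a), r * np.sin(a)))
    return np.array(pts)
```

A DERF-style descriptor would convolve the image with `dog_kernel` at a scale matched to each ring and read off the responses at the grid points; widening the outer kernels with the radius is what makes the scale distribution exponential as well.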
|
234
|
Chang KY, Chen CS. A learning framework for age rank estimation based on face images with scattering transform. IEEE Trans Image Process 2015; 24:785-798. [PMID: 25576566 DOI: 10.1109/tip.2014.2387379] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/04/2023]
Abstract
This paper presents a cost-sensitive ordinal hyperplanes ranking algorithm for human age estimation from face images. The proposed approach exploits relative-order information among the age labels for rank prediction. In our approach, the age rank is obtained by aggregating a series of binary classification results, where cost sensitivities among the labels are introduced to improve the aggregation performance. In addition, we give a theoretical analysis of how to design the cost of each binary classifier so that the misranking cost is bounded by the total misclassification cost. An efficient descriptor, the scattering transform, which scatters Gabor coefficients and pools them with Gaussian smoothing over multiple layers, is evaluated for facial feature extraction. We show that this descriptor generalizes conventional bio-inspired features and is more effective for face-based age inference. Experimental results demonstrate that our method outperforms state-of-the-art age estimation approaches.
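The scatter-then-pool recipe mentioned above (band-pass filter, take the modulus, smooth with a Gaussian, repeat over layers) can be sketched in one dimension. This is an assumption-laden toy, not the paper's descriptor: it uses Gaussian band-pass filters as a stand-in for a true Gabor/wavelet frame, and every name and parameter is made up for illustration.

```python
import numpy as np

def band_bank(n, n_scales=3):
    """One-sided Gaussian band-pass filters (frequency domain) whose centre
    frequency halves at each scale; a stand-in for a Gabor filter bank."""
    freqs = np.fft.fftfreq(n)
    return [np.exp(-((freqs - 0.25 / 2 ** j) ** 2) /
                   (2 * (0.125 / 2 ** j) ** 2)) for j in range(n_scales)]

def gaussian_lowpass(n, sigma=8.0):
    freqs = np.fft.fftfreq(n)
    return np.exp(-2.0 * (np.pi * sigma * freqs) ** 2)

def scattering_1d(x, n_scales=3):
    """Two-layer scattering: band-pass filter, take the modulus (the
    'scatter' step), then pool every envelope with Gaussian smoothing."""
    n = len(x)
    bank, phi = band_bank(n, n_scales), gaussian_lowpass(n)
    pool = lambda u: np.fft.ifft(np.fft.fft(u) * phi).real
    feats, envelopes = [pool(x)], []           # zeroth-order coefficient
    for g in bank:
        u1 = np.abs(np.fft.ifft(np.fft.fft(x) * g))
        envelopes.append(u1)
        feats.append(pool(u1))                 # first-order coefficients
    for i, u1 in enumerate(envelopes):         # second order: re-scatter each
        for g in bank[i + 1:]:                 # envelope at coarser scales only
            feats.append(pool(np.abs(np.fft.ifft(np.fft.fft(u1) * g))))
    return np.stack(feats)
```

The modulus discards phase (hence local translation) while the Gaussian pooling trades spatial resolution for stability, which is why such features behave like a generalization of bio-inspired features.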
|
235
|
Bruna J, Mallat S, Bacry E, Muzy JF. Intermittent process analysis with scattering moments. Ann Stat 2015. [DOI: 10.1214/14-aos1276] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/19/2022]
|
236
|
Scattering features for lung cancer detection in fibered confocal fluorescence microscopy images. Artif Intell Med 2014; 61:105-18. [DOI: 10.1016/j.artmed.2014.05.003] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/08/2013] [Revised: 05/14/2014] [Accepted: 05/16/2014] [Indexed: 11/20/2022]
|