1
Tan JM, Liao H, Liu W, Fan C, Huang J, Liu Z, Yan J. Hyperparameter optimization: Classics, acceleration, online, multi-objective, and tools. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2024; 21:6289-6335. [PMID: 39176427 DOI: 10.3934/mbe.2024275] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/24/2024]
Abstract
Hyperparameter optimization (HPO) has evolved into a well-established research topic over the decades. With the success and wide application of deep learning, HPO has garnered increased attention, particularly within the realm of machine learning model training and inference. The primary objective is to mitigate the challenges associated with manual hyperparameter tuning, which can be ad hoc and reliant on human expertise, and which consequently hinders reproducibility while inflating deployment costs. Recognizing the growing significance of HPO, this paper surveyed classical HPO methods, approaches for accelerating the optimization process, HPO in an online setting (dynamic algorithm configuration, DAC), and HPO with more than one objective to optimize (multi-objective HPO). Acceleration strategies were categorized into multi-fidelity, bandit-based, and early stopping; DAC algorithms encompassed gradient-based, population-based, and reinforcement learning-based methods; multi-objective HPO can be approached via scalarization, metaheuristics, and model-based algorithms tailored for the multi-objective setting. A tabulated overview of popular frameworks and tools for HPO was provided for practitioners.
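As an illustration of the scalarization approach to multi-objective HPO mentioned in the abstract, here is a minimal sketch of weighted-sum scalarization. It is not taken from the survey: the candidate configurations, the two objectives (validation error and latency, both minimized), and the function names are assumptions made for this example.

```python
from typing import Dict, List, Sequence, Tuple

# A candidate is (hyperparameters, objective values); all objectives are minimized.
Candidate = Tuple[Dict[str, float], Tuple[float, ...]]


def weighted_sum(objectives: Sequence[float], weights: Sequence[float]) -> float:
    """Collapse several objective values into one scalar via a weighted sum."""
    if len(objectives) != len(weights):
        raise ValueError("objectives and weights must have the same length")
    return sum(w * f for w, f in zip(weights, objectives))


def best_config(candidates: List[Candidate], weights: Sequence[float]) -> Candidate:
    """Return the candidate with the smallest scalarized objective."""
    return min(candidates, key=lambda c: weighted_sum(c[1], weights))


# Hypothetical evaluations: (validation error, latency in seconds).
candidates: List[Candidate] = [
    ({"lr": 0.1}, (0.20, 0.10)),
    ({"lr": 0.01}, (0.10, 0.50)),
    ({"lr": 0.001}, (0.15, 0.20)),
]

accuracy_only = best_config(candidates, (1.0, 0.0))
balanced = best_config(candidates, (0.5, 0.5))
```

Changing the weight vector changes which trade-off is selected. A plain weighted sum can only reach solutions on the convex part of the Pareto front, which is one reason other scalarizations and model-based methods exist.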
Affiliation(s)
- Jia Mian Tan
  - Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
- Haoran Liao
  - Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
- Wei Liu
  - Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
- Changjun Fan
  - College of Systems Engineering, National University of Defense Technology, Changsha, China
- Jincai Huang
  - College of Systems Engineering, National University of Defense Technology, Changsha, China
- Zhong Liu
  - College of Systems Engineering, National University of Defense Technology, Changsha, China
- Junchi Yan
  - Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
2
Yadav R, Dupé FX, Takerkart S, Auzias G. Population-wise labeling of sulcal graphs using multi-graph matching. PLoS One 2023; 18:e0293886. [PMID: 37943809 PMCID: PMC10635518 DOI: 10.1371/journal.pone.0293886] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2023] [Accepted: 10/23/2023] [Indexed: 11/12/2023] Open
Abstract
Population-wise matching of the cortical folds is necessary to compute statistics, a required step for, e.g., identifying biomarkers of neurological or psychiatric disorders. The difficulty arises from the massive inter-individual variations in the morphology and spatial organization of the folds. The task is challenging both methodologically and conceptually. In the widely used registration-based techniques, these variations are considered as noise and the matching of folds is only implicit. Alternative approaches are based on the extraction and explicit identification of the cortical folds. In particular, representing cortical folding patterns as graphs of sulcal basins, termed sulcal graphs, makes it possible to formalize the task as a graph-matching problem. In this paper, we propose to address the problem of sulcal graph matching directly at the population level using multi-graph matching techniques. First, we motivate the relevance of the multi-graph matching framework in this context. We then present a procedure for generating populations of artificial sulcal graphs, which allows us to benchmark several state-of-the-art multi-graph matching methods. Our results on both artificial and real data demonstrate the effectiveness of multi-graph matching techniques in obtaining a population-wise consistent labeling of cortical folds at the sulcal basin level.
Affiliation(s)
- Rohit Yadav
  - Institut de Neurosciences de la Timone UMR 7289, CNRS, Aix-Marseille Université, Marseille, France
  - Institut Marseille Imaging, Aix Marseille Université, Marseille, France
  - Laboratoire d’Informatique et Systèmes UMR 7020, CNRS, Aix-Marseille Université, Marseille, France
- François-Xavier Dupé
  - Laboratoire d’Informatique et Systèmes UMR 7020, CNRS, Aix-Marseille Université, Marseille, France
- Sylvain Takerkart
  - Institut de Neurosciences de la Timone UMR 7289, CNRS, Aix-Marseille Université, Marseille, France
- Guillaume Auzias
  - Institut de Neurosciences de la Timone UMR 7289, CNRS, Aix-Marseille Université, Marseille, France
3
Ma Z, Lu X, Xie J, Yang Z, Xue JH, Tan ZH, Xiao B, Guo J. On the Comparisons of Decorrelation Approaches for Non-Gaussian Neutral Vector Variables. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2023; 34:1823-1837. [PMID: 32248126 DOI: 10.1109/tnnls.2020.2978858] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 06/11/2023]
Abstract
As a typical non-Gaussian vector variable, a neutral vector variable contains nonnegative elements only, and its l1-norm equals one. In addition, its neutral properties make it significantly different from the commonly studied vector variables (e.g., the Gaussian vector variables). Due to the aforementioned properties, the conventionally applied linear transformation approaches [e.g., principal component analysis (PCA) and independent component analysis (ICA)] are not suitable for neutral vector variables, as PCA cannot transform a neutral vector variable, which is highly negatively correlated, into a set of mutually independent scalar variables and ICA cannot preserve the bounded property after transformation. In recent work, we proposed an efficient nonlinear transformation approach, i.e., the parallel nonlinear transformation (PNT), for decorrelating neutral vector variables. In this article, we extensively compare PNT with PCA and ICA through both theoretical analysis and experimental evaluations. The results of our investigations demonstrate the superiority of PNT for decorrelating the neutral vector variables.
4
Kernel Embedding Transformation Learning for Graph Matching. Pattern Recognit Lett 2022. [DOI: 10.1016/j.patrec.2022.09.016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
5
Wang R, Yan J, Yang X. Neural Graph Matching Network: Learning Lawler's Quadratic Assignment Problem With Extension to Hypergraph and Multiple-Graph Matching. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2022; 44:5261-5279. [PMID: 33961550 DOI: 10.1109/tpami.2021.3078053] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/12/2023]
Abstract
Graph matching involves combinatorial optimization based on an edge-to-edge affinity matrix, which can be generally formulated as Lawler's quadratic assignment problem (QAP). This paper presents a QAP network that learns directly with the affinity matrix (equivalently the association graph), whereby the matching problem is translated into a constrained vertex classification task. The association graph is learned by an embedding network for vertex classification, followed by Sinkhorn normalization and a cross-entropy loss for end-to-end learning. We further improve the embedding model on the association graph by introducing a Sinkhorn-based matching-aware constraint, as well as dummy nodes to deal with unequal sizes of graphs. To the best of our knowledge, this is one of the first networks to learn directly with the general Lawler's QAP. In contrast, recent deep matching methods focus on the learning of node/edge features in two graphs respectively. We also show how to extend our network to hypergraph matching and to the matching of multiple graphs. Experimental results on both synthetic graphs and real-world images show its effectiveness. For pure QAP tasks on synthetic data and the QAPLIB benchmark, our method can perform competitively with, and even surpass, state-of-the-art graph matching and QAP solvers with notably less time cost. We provide a project homepage at http://thinklab.sjtu.edu.cn/project/NGM/index.html.
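The Sinkhorn normalization step named in the abstract can be sketched generically as alternating row and column normalization of a positive score matrix. This is a minimal illustration of the standard technique, not the authors' implementation; the function name, the fixed iteration count, and the assumption of a square matrix with strictly positive entries are choices made for this example.

```python
from typing import List


def sinkhorn(scores: List[List[float]], n_iters: int = 50) -> List[List[float]]:
    """Alternately normalize rows and columns of a strictly positive square matrix.

    The result approaches a doubly stochastic matrix (rows and columns sum to 1),
    which can be read as a soft assignment between two node sets.
    """
    m = [row[:] for row in scores]  # work on a copy
    for _ in range(n_iters):
        # Row step: make every row sum to 1.
        m = [[v / sum(row) for v in row] for row in m]
        # Column step: make every column sum to 1.
        col_sums = [sum(col) for col in zip(*m)]
        m = [[v / col_sums[j] for j, v in enumerate(row)] for row in m]
    return m


soft_assignment = sinkhorn([[1.0, 2.0], [3.0, 4.0]])
```

In learning pipelines the raw scores are usually exponentiated first so that all entries are positive. The operation is differentiable, which is what lets it sit between an embedding network and a cross-entropy loss.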
6
7
Feature Matching via Motion-Consistency Driven Probabilistic Graphical Model. Int J Comput Vis 2022. [DOI: 10.1007/s11263-022-01644-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/05/2022]
8
Hu B, Liu Y, Chu P, Tong M, Kong Q. Small Object Detection via Pixel Level Balancing With Applications to Blood Cell Detection. Front Physiol 2022; 13:911297. [PMID: 35784879 PMCID: PMC9249342 DOI: 10.3389/fphys.2022.911297] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/02/2022] [Accepted: 05/24/2022] [Indexed: 11/16/2022] Open
Abstract
Object detection technology has been widely used in the medical field, for example to detect blood cells in images and to count their changes and distribution in order to assist the diagnosis of diseases. However, detecting small objects is one of the most challenging and important problems, especially in medical scenarios. Most of the objects in medical images are very small but influential, so improving the detection performance for small objects is a very meaningful topic for medical detection. Current research mainly focuses on the extraction of small object features and on data augmentation for small object samples; all of this research aims to better extract the feature space of small objects. However, in the training process of a detection model, objects of different sizes are mixed together, which may interfere with each other and affect the performance of small object detection. In this paper, we propose a method called pixel level balancing (PLB), which uses the number of pixels contained in the detection box as an impact factor to characterize the size of the inspected objects. The training loss of each object is dynamically adjusted by a weight according to its size, so as to improve the accuracy of small object detection. Finally, through experiments, we demonstrate that objects of different sizes interfere with each other in object detection, and that the accuracy of small object detection can therefore be improved through the PLB operation. This method performs well on blood cell detection in our experiments.
Affiliation(s)
- Bin Hu
  - Department of Computer Science and Engineering, School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, China
- Yang Liu
  - Department of Dermatology, Shanghai Ninth People’s Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, China
  - Department of Laser and Aesthetic Medicine, Shanghai Ninth People’s Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, China
  - *Correspondence: Yang Liu; Minglei Tong
- Pengzhi Chu
  - Department of Computer Science and Engineering, School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, China
- Minglei Tong
  - College of Electronics and Information Engineering, Shanghai University of Electric Power, Shanghai, China
  - *Correspondence: Yang Liu; Minglei Tong
- Qingjie Kong
  - Riseye Research, Riseye Intelligent Technology (Shanghai) Co., Ltd., Shanghai, China
9
Cao H, Wang H, Zhang N, Yang Y, Zhou Z. Robust probability model based on variational Bayes for point set registration. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.108182] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/29/2022]
10
Affiliation(s)
- Chao Gao
- Department of Statistics, University of Chicago
11
Pan X, Chen L, Liu M, Niu Z, Huang T, Cai YD. Identifying Protein Subcellular Locations With Embeddings-Based node2loc. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022; 19:666-675. [PMID: 33989156 DOI: 10.1109/tcbb.2021.3080386] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Indexed: 06/12/2023]
Abstract
Identifying protein subcellular locations is an important topic in protein function prediction. Interacting proteins may share similar locations. Thus, it is imperative to infer protein subcellular locations by taking protein-protein interactions (PPIs) into account. In this study, we present a network embedding-based method, node2loc, to identify protein subcellular locations. node2loc first learns distributed embeddings of proteins in a protein-protein interaction (PPI) network using node2vec. Then the learned embeddings are fed into a recurrent neural network (RNN). To resolve the severe class imbalance of different subcellular locations, the Synthetic Minority Over-sampling Technique (SMOTE) is applied to artificially synthesize proteins for minority classes. node2loc is evaluated on our constructed human benchmark dataset with 16 subcellular locations and yields a Matthews correlation coefficient (MCC) value of 0.800, which is superior to baseline methods. In addition, node2loc yields a better performance on a yeast benchmark dataset with 17 locations. The results demonstrate that the learned representations from a PPI network have a certain discriminative ability for classifying protein subcellular locations. However, node2loc is a transductive method: it only works for proteins connected in a PPI network, and it needs to be retrained for new proteins. In addition, the PPI network needs to be annotated to some extent with location information. node2loc is freely available at https://github.com/xypan1232/node2loc.
12
Han R, Wang Y, Yan H, Feng W, Wang S. Multi-View Multi-Human Association With Deep Assignment Network. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2022; 31:1830-1840. [PMID: 35081024 DOI: 10.1109/tip.2021.3139178] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Identifying the same persons across different views plays an important role in many vision applications. In this paper, we study this important problem, denoted as Multi-view Multi-Human Association (MvMHA), on multi-view images that are taken by different cameras at the same time. Unlike previous works on human association across two views, this paper focuses on the more general and challenging scenario of more than two views, none of which are fixed or known a priori. In addition, each involved person may be present in all the views or only in a subset of views, which is also not known a priori. We develop a new end-to-end deep-network based framework to address this problem. First, we use an appearance-based deep network to extract the features of each detected subject in each image. We then compute pairwise similarity scores between all the detected subjects and construct a comprehensive affinity matrix. Finally, we propose a Deep Assignment Network (DAN) to transform the affinity matrix into an assignment matrix, which provides a binary assignment result for MvMHA. We build both a synthetic dataset and a real image dataset to verify the effectiveness of the proposed method. We also test the trained network on three other public datasets, resulting in very good cross-domain performance.
13
You Z, Li J, Zhang H, Yang B, Le X. An accurate star identification approach based on spectral graph matching for attitude measurement of spacecraft. COMPLEX INTELL SYST 2022. [DOI: 10.1007/s40747-021-00619-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
Abstract
Star identification is the foundation of star trackers, which are used to precisely determine the attitude of spacecraft. In this paper, we propose a novel star identification approach based on spectral graph matching. In the proposed approach, we construct a feature called the neighbor graph for each main star, transforming star identification into the problem of finding the most similar neighbor graph. Rough search and graph matching are then combined to form a dynamic search framework to solve the problem. In the rough search stage, the total edge weight of the minimum spanning tree of the neighbor graph is selected as an indicator, and the k-vector range search is then applied to reduce the search scale. Spectral graph matching is utilized to achieve global matching, identifying all stars in the neighbor circle with good noise tolerance. Extensive simulation experiments under position noise, lost-star noise, and fake-star noise show that our approach achieves higher accuracy (mostly over 99%) and better robustness than other baseline algorithms in most cases.
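The rough-search indicator described above, the total edge weight of a minimum spanning tree, can be computed with Prim's algorithm. The following is a minimal sketch, not the authors' code: it assumes the neighbor graph is given as a symmetric matrix of edge weights for a connected graph, and the function name and example weights are hypothetical.

```python
import math
from typing import List


def mst_total_weight(weights: List[List[float]]) -> float:
    """Total edge weight of a minimum spanning tree (Prim's algorithm, O(n^2)).

    `weights[i][j]` is the symmetric edge weight between nodes i and j.
    Assumes the graph is connected; use math.inf for missing edges.
    """
    n = len(weights)
    if n == 0:
        return 0.0
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known edge linking each node to the tree
    best[0] = 0.0          # start growing the tree from node 0
    total = 0.0
    for _ in range(n):
        # Pick the cheapest node that is not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        total += best[u]
        # Update the cheapest connecting edge for the remaining nodes.
        for v in range(n):
            if not in_tree[v] and weights[u][v] < best[v]:
                best[v] = weights[u][v]
    return total


# Hypothetical 3-star neighbor graph with pairwise angular distances as weights.
indicator = mst_total_weight([
    [0.0, 1.0, 3.0],
    [1.0, 0.0, 2.0],
    [3.0, 2.0, 0.0],
])
```

Because the indicator is a single scalar per neighbor graph, candidate graphs can be sorted by it and narrowed with a range search before the more expensive graph matching stage.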
14
Zhou H, Jayender J. EMDQ: Removal of Image Feature Mismatches in Real-Time. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2021; 31:706-720. [PMID: 34914589 PMCID: PMC8777235 DOI: 10.1109/tip.2021.3134456] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
This paper proposes a novel method for removing image feature mismatches in real-time that can handle both rigid and smooth deforming environments. Image distortion, parallax and object deformation may cause the pixel coordinates of feature matches to have non-rigid deformations, which cannot be represented using a single analytical rigid transformation. To solve this problem, we propose an algorithm based on the re-weighting and 1-point RANSAC strategy (R1P-RNSC), which operates under the assumption that a non-rigid deformation can be approximately represented by multiple rigid transformations. R1P-RNSC is fast but suffers from the drawback that local smoothing information cannot be considered, thus limiting its accuracy. To solve this problem, we propose a non-parametric algorithm based on the expectation-maximization algorithm and the dual quaternion-based representation (EMDQ). EMDQ generates dense and smooth deformation fields by interpolating among the feature matches, simultaneously removing mismatches that are inconsistent with the deformation field. It relies on the rigid transformations obtained by R1P-RNSC to improve its accuracy. The experimental results demonstrate that EMDQ has superior accuracy compared to other state-of-the-art mismatch removal methods. The ability to build correspondences for all image pixels using the dense deformation field is another contribution of this paper.
15
Jiang Z, Wang T, Yan J. Unifying Offline and Online Multi-Graph Matching via Finding Shortest Paths on Supergraph. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2021; 43:3648-3663. [PMID: 32340936 DOI: 10.1109/tpami.2020.2989928] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 06/11/2023]
Abstract
This paper addresses the problem of multiple graph matching (MGM) by considering both the offline batch mode and the online setting. We explore the concept of cycle-consistency over pairwise matchings and formulate the problem as finding the optimal composition path on the supergraph, whose vertices refer to graphs and whose edge weights denote a score function regarding consistency and affinity. Through our theoretical study, we show that offline and online MGM on the supergraph can be converted to finding all-pairs shortest paths and single-source shortest paths, respectively. We adopt the Floyd algorithm [1] and the shortest path faster algorithm (SPFA) [2], [3] to effectively find the optimal path. Extensive experimental results show that our methods surpass state-of-the-art MGM methods, including CAO [4], MISM [5], IMGM [6], and many other recent methods in offline and online settings. Source code will be made publicly available.
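For readers unfamiliar with the all-pairs shortest path step, here is a minimal sketch of the textbook Floyd (Floyd-Warshall) algorithm on a small supergraph. It uses plain numeric edge costs as an assumption; the paper's actual score function over consistency and affinity is not reproduced here, and the example costs are hypothetical.

```python
import math
from typing import List


def floyd_warshall(cost: List[List[float]]) -> List[List[float]]:
    """All-pairs shortest path costs for a graph given as a cost matrix.

    `cost[i][j]` is the direct cost from vertex i to vertex j
    (math.inf if there is no direct edge, 0 on the diagonal).
    """
    n = len(cost)
    d = [row[:] for row in cost]  # do not modify the input
    for k in range(n):            # allow vertex k as an intermediate stop
        for i in range(n):
            for j in range(n):
                via_k = d[i][k] + d[k][j]
                if via_k < d[i][j]:
                    d[i][j] = via_k
    return d


# Hypothetical supergraph of three graphs G0, G1, G2 with pairwise matching costs.
pairwise_cost = [
    [0.0, 4.0, 1.0],
    [4.0, 0.0, 2.0],
    [1.0, 2.0, 0.0],
]
shortest = floyd_warshall(pairwise_cost)
```

In this toy example the direct G0-G1 cost of 4 is improved to 3 by composing through G2, which mirrors the idea of replacing a poor pairwise matching with a better composition path.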
16
Chen L, Li Z, Zeng T, Zhang YH, Li H, Huang T, Cai YD. Predicting gene phenotype by multi-label multi-class model based on essential functional features. Mol Genet Genomics 2021; 296:905-918. [PMID: 33914130 DOI: 10.1007/s00438-021-01789-8] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 01/26/2021] [Accepted: 04/13/2021] [Indexed: 12/19/2022]
Abstract
Phenotype is one of the most significant concepts in genetics, and is used to describe all the observable characteristics of a research object. Considering that phenotype reflects the integrated features of genotype and environmental factors, it is hard to define phenotype characteristics, and even more difficult to predict unknown phenotypes. Restricted by current biological techniques, it is still quite expensive and time-consuming to obtain sufficient structural information on large-scale phenotype-associated genes/proteins. Various bioinformatics methods have been presented to solve this problem, and researchers have confirmed the efficacy and prediction accuracy of functional network-based prediction. However, general functional descriptions have highly complicated inner structures for phenotype prediction. To further address this issue and improve the efficacy of phenotype prediction on more than ten kinds of phenotypes, we first extract functional enrichment features from GO and KEGG, and then use node2vec to learn functional embedding features of genes from a gene-gene network. All these features are analyzed by feature selection methods (Boruta, minimum redundancy maximum relevance) to generate a feature list. This list is fed into incremental feature selection, incorporating multi-label classifiers built by RAkEL and classic base classifiers, to build an optimum multi-label multi-class classification model for phenotype prediction. According to recent research, our method has indeed identified many literature-supported genes/proteins and their associated phenotypes, and even some candidate genes with re-assigned new phenotypes, providing a new computational tool for accurate and effective phenotypic prediction.
Affiliation(s)
- Lei Chen
  - School of Life Sciences, Shanghai University, Shanghai, 200444, People's Republic of China
  - College of Information Engineering, Shanghai Maritime University, Shanghai, 201306, People's Republic of China
- Zhandong Li
  - College of Food Engineering, Jilin Engineering Normal University, Changchun, 130052, People's Republic of China
- Tao Zeng
  - CAS Key Laboratory of Computational Biology, Bio-Med Big Data Center, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, 200031, People's Republic of China
- Yu-Hang Zhang
  - Channing Division of Network Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Hao Li
  - College of Food Engineering, Jilin Engineering Normal University, Changchun, 130052, People's Republic of China
- Tao Huang
  - Key Laboratory of Tissue Microenvironment and Tumor, Shanghai Institute of Nutrition and Health, Chinese Academy of Sciences, Shanghai, 200031, People's Republic of China
- Yu-Dong Cai
  - School of Life Sciences, Shanghai University, Shanghai, 200444, People's Republic of China
17
Yu YF, Xu G, Jiang M, Zhu H, Dai DQ, Yan H. Joint Transformation Learning via the L2,1-Norm Metric for Robust Graph Matching. IEEE TRANSACTIONS ON CYBERNETICS 2021; 51:521-533. [PMID: 31059466 DOI: 10.1109/tcyb.2019.2912718] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 06/09/2023]
Abstract
Establishing correspondence between two given geometrical graph structures is an important problem in computer vision and pattern recognition. In this paper, we propose a robust graph matching (RGM) model to improve effectiveness and robustness when matching graphs with deformations, rotations, outliers, and noise. First, we embed the joint geometric transformation into the graph matching model, which performs unary matching over graph nodes and local structure matching over graph edges simultaneously. Then, the L2,1-norm is used as the similarity metric in the presented RGM to enhance the robustness. Finally, we derive an objective function which can be solved by an effective optimization algorithm, and theoretically prove the convergence of the proposed algorithm. Extensive experiments on various graph matching tasks, such as outliers, rotations, and deformations, show that the proposed RGM model achieves competitive performance compared to the existing methods.
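To make the L2,1-norm concrete, the sketch below computes it as the sum of the Euclidean norms of a matrix's rows. Note that the row-wise convention is an assumption made here (some papers sum over columns instead), and the function name and example residual matrix are illustrative only.

```python
import math
from typing import List


def l21_norm(matrix: List[List[float]]) -> float:
    """Sum of the Euclidean (L2) norms of the rows of `matrix`.

    Row-wise convention assumed; transpose the input for the column-wise variant.
    """
    return sum(math.sqrt(sum(v * v for v in row)) for row in matrix)


# Hypothetical residual matrix: one row per matched point, one column per coordinate.
residuals = [
    [3.0, 4.0],   # row norm 5
    [0.0, 0.0],   # row norm 0
    [5.0, 12.0],  # row norm 13
]
value = l21_norm(residuals)
```

Each row contributes its norm rather than its squared norm, so a single large residual (an outlier) grows the total linearly rather than quadratically. This is the usual argument for the robustness of this metric compared with the squared Frobenius norm.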
18
Xia CQ, Pan X, Yang Y, Huang Y, Shen HB. Recent Progresses of Computational Analysis of RNA-Protein Interactions. SYSTEMS MEDICINE 2021. [DOI: 10.1016/b978-0-12-801238-3.11315-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/26/2022] Open
19
Fathian K, Khosoussi K, Tian Y, Lusk P, How JP. CLEAR: A Consistent Lifting, Embedding, and Alignment Rectification Algorithm for Multiview Data Association. IEEE T ROBOT 2020. [DOI: 10.1109/tro.2020.3002432] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 11/07/2022]
20
Ma J, Jiang X, Fan A, Jiang J, Yan J. Image Matching from Handcrafted to Deep Features: A Survey. Int J Comput Vis 2020. [DOI: 10.1007/s11263-020-01359-2] [Citation(s) in RCA: 230] [Impact Index Per Article: 57.5] [Indexed: 11/27/2022]
Abstract
As a fundamental and critical task in various visual applications, image matching identifies and then puts into correspondence the same or similar structure/content from two or more images. Over the past decades, a growing number and diversity of methods have been proposed for image matching, particularly with the development of deep learning techniques in recent years. However, this leaves several open questions about which method would be a suitable choice for specific applications with respect to different scenarios and task requirements, and how to design better image matching methods with superior performance in accuracy, robustness and efficiency. This encourages us to conduct a comprehensive and systematic review and analysis of classical and recent techniques. Following the feature-based image matching pipeline, we first introduce feature detection, description, and matching techniques from handcrafted methods to trainable ones and provide an analysis of the development of these methods in theory and practice. Secondly, we briefly introduce several typical image matching-based applications for a comprehensive understanding of the significance of image matching. In addition, we provide a comprehensive and objective comparison of these classical and latest techniques through extensive experiments on representative datasets. Finally, we conclude with the current status of image matching technologies and deliver insightful discussions and prospects for future work. This survey can serve as a reference for (but not limited to) researchers and engineers in image matching and related fields.
21
Yang X, Liu ZY, Qiao H. A Continuation Method for Graph Matching Based Feature Correspondence. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2020; 42:1809-1822. [PMID: 30843819 DOI: 10.1109/tpami.2019.2903483] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 06/09/2023]
Abstract
Feature correspondence lays the foundation for many computer vision and image processing tasks, which can be well formulated and solved by graph matching. Because of the high complexity, approximate methods are necessary for graph matching, and continuous relaxation provides an efficient approximation scheme. However, there are still many problems to be settled, such as the highly nonconvex objective function, the neglect of the combinatorial nature of graph matching in the optimization process, and the little attention paid to the outlier problem. Focusing on these problems, this paper introduces a continuation method directly targeting the combinatorial optimization problem associated with graph matching. Specifically, a regularization function incorporating the original objective function and the discrete constraints is first proposed. Then a continuation method based on Gaussian smoothing is applied to it, in which the closed forms of relevant functions with respect to the outlier distribution are deduced. Experiments on both synthetic data and real-world images validate the effectiveness of the proposed method.
22
Pan X, Lu L, Cai YD. Predicting protein subcellular location with network embedding and enrichment features. BIOCHIMICA ET BIOPHYSICA ACTA-PROTEINS AND PROTEOMICS 2020; 1868:140477. [PMID: 32593761 DOI: 10.1016/j.bbapap.2020.140477] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 03/31/2020] [Revised: 06/17/2020] [Accepted: 06/22/2020] [Indexed: 02/06/2023]
Abstract
The subcellular location of a protein is highly related to its function. Identifying the location of a given protein is an essential step for investigating its related problems. Traditional experimental methods can produce solid determination. However, their limitations, such as high cost and low efficiency, are evident. Computational methods provide an alternative means to address these problems. Most previous methods constantly extract features from protein sequences or structures for building prediction models. In this study, we use two types of features and combine them to construct the model. The first feature type is extracted from a protein-protein interaction network to abstract the relationship between the encoded protein and other proteins. The second type is obtained from gene ontology and biological pathways to indicate the existing functions of the encoded protein. These features are analyzed using some feature selection methods. The final optimum features are adopted to build the model with a recurrent neural network as the classification algorithm. This model yields good performance, with a Matthews correlation coefficient of 0.844. A decision tree is used as a rule learning classifier to extract decision rules. Although the performance of decision rules is poor, they are valuable in revealing the molecular mechanism of proteins with different subcellular locations. The final analysis confirms the reliability of the extracted rules. The source code of the proposed method is freely available at https://github.com/xypan1232/rnnloc.
Affiliation(s)
- Xiaoyong Pan
- School of Life Sciences, Shanghai University, Shanghai 200444, People's Republic of China; Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Key Laboratory of System Control and Information Processing, Ministry of Education of China, 200240 Shanghai, China
| | - Lin Lu
- Department of Radiology, Columbia University Medical Center, New York, NY, 10032, USA.
| | - Yu-Dong Cai
- School of Life Sciences, Shanghai University, Shanghai 200444, People's Republic of China.
| |
|
23
|
Yao S, Yan J, Wu M, Yang X, Zhang W, Lu H, Qian B. Texture Synthesis Based Thyroid Nodule Detection From Medical Ultrasound Images: Interpreting and Suppressing the Adversarial Effect of In-place Manual Annotation. Front Bioeng Biotechnol 2020; 8:599. [PMID: 32626697 PMCID: PMC7311795 DOI: 10.3389/fbioe.2020.00599] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 03/12/2020] [Accepted: 05/15/2020] [Indexed: 11/26/2022] Open
Abstract
Deep learning methods have been offering promising solutions for medical image processing, but failing to understand what features in the input image are captured, and whether certain artifacts are mistakenly included in the model, creates crucial problems for the generalizability of the model. We targeted a common issue of this kind caused by manual annotations that appear in medical images. These annotations are usually made by doctors at the spot of medical interest and have an adversarial effect on many computer vision AI tasks. We developed an inpainting algorithm to remove the annotations and recover the original images. In addition, we applied the variational information bottleneck method to filter out unwanted features and enhance the robustness of the model. Our inpainting algorithm is extensively tested on object detection in thyroid ultrasound image data. The mAP (mean average precision, with IoU = 0.3) is 27% without annotation removal. The mAP is 83% if the annotations are manually removed using Photoshop and is enhanced to 90% using our inpainting algorithm. Our work can be utilized in future development and evaluation of artificial intelligence models based on medical images with defects.
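The mAP figures above depend on an intersection-over-union (IoU) threshold of 0.3. A minimal sketch of that overlap test follows, assuming axis-aligned boxes given as (x_min, y_min, x_max, y_max); the box format and the names `iou` and `is_match` are illustrative assumptions, not taken from the paper.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes (x_min, y_min, x_max, y_max)."""
    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_box, gt_box, threshold=0.3):
    """A detection counts as correct when its IoU reaches the threshold."""
    return iou(pred_box, gt_box) >= threshold
```

A threshold of 0.3 is more lenient than the 0.5 often used in general object detection, so a prediction needs less overlap with the ground-truth box to count as a hit.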
Affiliation(s)
- Siqiong Yao
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
| | - Junchi Yan
- School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, China
| | - Mingyu Wu
- School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, China
| | - Xue Yang
- School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, China
| | - Weituo Zhang
- Hongqiao International Institute of Medicine, Shanghai Tong Ren Hospital and Clinical Research Institute, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - Hui Lu
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
| | - Biyun Qian
- Hongqiao International Institute of Medicine, Shanghai Tong Ren Hospital and Clinical Research Institute, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| |
|
24
|
Wang C, Fu H, Yang L, Cao X. Text Co-detection in Multi-view Scene. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2020; 29:4627-4642. [PMID: 32092000 DOI: 10.1109/tip.2020.2973511] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/10/2023]
Abstract
Multi-view scene analysis has been widely explored in computer vision, including numerous practical applications. The texts in multi-view scenes are often detected by applying an existing single-image text detection method, which however ignores the multi-view correspondence constraint. The multi-view correspondences may contain structure and location information and help with difficulties induced by factors like occlusion and perspective distortion, information that is deficient in the single-image scene. In this paper, we address the corresponding text detection task and propose a novel text co-detection method to identify the co-occurring texts among multi-view scene images with compositions of detection and correspondence under large environmental variations. In our text co-detection method, the visual and geometrical correspondences are designed to explore texts holding high pairwise representation similarity and guide the exploitation of texts with geometrical correspondences, simultaneously. To guarantee the pairwise consistency among multiple images, we additionally incorporate the cycle consistency constraint, which guarantees alignments of text correspondences in the image set. Finally, text correspondence is represented by a permutation matrix and solved via positive semidefinite and low-rank constraints. Moreover, we also collect a new text co-detection dataset consisting of multi-view image groups obtained from the same scene with different photographing conditions. The experiments show that our text co-detection obtains satisfactory performance and outperforms the related state-of-the-art text detection methods.
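The cycle consistency constraint mentioned in this abstract can be illustrated with permutation matrices: matching image i to j and then j to k should agree with matching i to k directly. The sketch below assumes every image contains the same number of texts and uses hypothetical helper names; it only checks the constraint and does not reproduce the paper's positive semidefinite, low-rank solver.

```python
def perm_matrix(mapping):
    """Permutation matrix P with P[r][c] = 1 when item r maps to item c."""
    n = len(mapping)
    return [[1 if mapping[r] == c else 0 for c in range(n)] for r in range(n)]


def matmul(a, b):
    """Plain-Python matrix product of two list-of-lists matrices."""
    return [
        [sum(a[r][k] * b[k][c] for k in range(len(b))) for c in range(len(b[0]))]
        for r in range(len(a))
    ]


def is_cycle_consistent(p_ij, p_jk, p_ik):
    """True when composing i->j and j->k reproduces the direct i->k matching."""
    return matmul(p_ij, p_jk) == p_ik
```

For example, with `perm_matrix([1, 2, 0])` for i to j and `perm_matrix([2, 0, 1])` for j to k, the only consistent i to k matching is the identity permutation.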
|
25
|
Li Y, Li Q, Liu Y, Xie W. A spatial-spectral SIFT for hyperspectral image matching and classification. Pattern Recognit Lett 2019. [DOI: 10.1016/j.patrec.2018.08.032] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Indexed: 11/25/2022]
|
26
|
Gao X, Shen S, Hu Z, Wang Z. Ground and aerial meta-data integration for localization and reconstruction: A review. Pattern Recognit Lett 2019. [DOI: 10.1016/j.patrec.2018.07.036] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Indexed: 11/28/2022]
|
27
|
|
28
|
|
29
|
|
30
|
Dual L1-Normalized Context Aware Tensor Power Iteration and Its Applications to Multi-object Tracking and Multi-graph Matching. Int J Comput Vis 2019. [DOI: 10.1007/s11263-019-01231-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 10/25/2022]
Abstract
The multi-dimensional assignment problem is universal for data association analysis such as data association-based visual multi-object tracking and multi-graph matching. In this paper, multi-dimensional assignment is formulated as a rank-1 tensor approximation problem. A dual L1-normalized context/hyper-context aware tensor power iteration optimization method is proposed. The method is applied to multi-object tracking and multi-graph matching. In the optimization method, tensor power iteration with the dual unit norm enables the capture of information across multiple sample sets. Interactions between sample associations are modeled as contexts or hyper-contexts which are combined with the global affinity into a unified optimization. The optimization is flexible for accommodating various types of contextual models. In multi-object tracking, the global affinity is defined according to the appearance similarity between objects detected in different frames. Interactions between objects are modeled as motion contexts which are encoded into the global association optimization. The tracking method integrates high order motion information and high order appearance variation. The multi-graph matching method carries out matching over graph vertices and structure matching over graph edges simultaneously. The matching consistency across multi-graphs is based on the high-order tensor optimization. Various types of vertex affinities and edge/hyper-edge affinities are flexibly integrated. Experiments on several public datasets, such as the MOT16 challenge benchmark, validate the effectiveness of the proposed methods.
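As a rough illustration of the rank-1 tensor approximation idea, the sketch below runs a generic alternating power iteration on a third-order tensor and rescales each factor to unit L1 norm after every update. It assumes a non-negative tensor stored as nested lists and omits the context and hyper-context terms entirely, so it is not the paper's dual L1-normalized method; the function names are hypothetical.

```python
def l1_normalize(vec):
    """Scale a vector so its absolute values sum to one."""
    total = sum(abs(x) for x in vec)
    return [x / total for x in vec] if total > 0 else list(vec)


def rank1_power_iteration(tensor, iters=50):
    """Alternating power iteration for a rank-1 fit of a 3-way tensor.

    tensor[i][j][k] is assumed non-negative. Each factor is updated by
    contracting the tensor with the other two factors, then L1-normalized.
    """
    n_i, n_j, n_k = len(tensor), len(tensor[0]), len(tensor[0][0])
    u = [1.0 / n_i] * n_i
    v = [1.0 / n_j] * n_j
    w = [1.0 / n_k] * n_k
    for _ in range(iters):
        u = l1_normalize([
            sum(tensor[i][j][k] * v[j] * w[k]
                for j in range(n_j) for k in range(n_k))
            for i in range(n_i)
        ])
        v = l1_normalize([
            sum(tensor[i][j][k] * u[i] * w[k]
                for i in range(n_i) for k in range(n_k))
            for j in range(n_j)
        ])
        w = l1_normalize([
            sum(tensor[i][j][k] * u[i] * v[j]
                for i in range(n_i) for j in range(n_j))
            for k in range(n_k)
        ])
    return u, v, w
```

When the input is exactly an outer product of three non-negative vectors, the returned factors are those vectors rescaled to sum to one.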
|
31
|
Ma J, Jiang X, Jiang J, Guo X. Robust Feature Matching Using Spatial Clustering with Heavy Outliers. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2019; 29:736-746. [PMID: 31449018 DOI: 10.1109/tip.2019.2934572] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Indexed: 06/10/2023]
Abstract
This paper focuses on removing mismatches from given putative feature matches created typically based on descriptor similarity. To achieve this goal, existing attempts usually involve estimating the image transformation under a geometrical constraint, where a pre-defined transformation model is demanded. This severely limits the applicability, as the transformation could vary with different data and is complex and hard to model in many real-world tasks. From a novel perspective, this paper casts the feature matching into a spatial clustering problem with outliers. The main idea is to adaptively cluster the putative matches into several motion consistent clusters together with an outlier/mismatch cluster. To implement the spatial clustering, we customize the classic density based spatial clustering method of applications with noise (DBSCAN) in the context of feature matching, which enables our approach to achieve quasi-linear time complexity. We also design an iterative clustering strategy to promote the matching performance in case of severely degraded data. Extensive experiments on several datasets involving different types of image transformations demonstrate the superiority of our approach over state-of-the-art alternatives. Our approach is also applied to near-duplicate image retrieval and co-segmentation and achieves promising performance.
|
32
|
Ma J, Jiang X, Jiang J, Zhao J, Guo X. LMR: Learning a Two-Class Classifier for Mismatch Removal. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2019; 28:4045-4059. [PMID: 30908218 DOI: 10.1109/tip.2019.2906490] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Indexed: 06/09/2023]
Abstract
Feature matching, which refers to establishing reliable correspondence between two sets of features, is a critical prerequisite in a wide spectrum of vision-based tasks. Existing attempts typically involve the mismatch removal from a set of putative matches based on estimating the underlying image transformation. However, the transformation could vary with different data. Thus, a pre-defined transformation model is often demanded, which severely limits the applicability. From a novel perspective, this paper casts the mismatch removal into a two-class classification problem, learning a general classifier to determine the correctness of an arbitrary putative match, termed as Learning for Mismatch Removal (LMR). The classifier is trained based on a general match representation associated with each putative match through exploiting the consensus of local neighborhood structures based on a multiple K -nearest neighbors strategy. With only ten training image pairs involving about 8000 putative matches, the learned classifier can generate promising matching results in linearithmic time complexity on arbitrary testing data. The generality and robustness of our approach are verified under several representative supervised learning techniques as well as on different training and testing data. Extensive experiments on feature matching, visual homing, and near-duplicate image retrieval are conducted to reveal the superiority of our LMR over the state-of-the-art competitors.
|
33
|
Zhang XY, Shi H, Zhu X, Li P. Active semi-supervised learning based on self-expressive correlation with generative adversarial networks. Neurocomputing 2019. [DOI: 10.1016/j.neucom.2019.01.083] [Citation(s) in RCA: 34] [Impact Index Per Article: 6.8] [Indexed: 11/30/2022]
|
34
|
A Review of Point Set Registration: From Pairwise Registration to Groupwise Registration. SENSORS 2019; 19:s19051191. [PMID: 30857205 PMCID: PMC6427196 DOI: 10.3390/s19051191] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 12/07/2018] [Revised: 02/25/2019] [Accepted: 03/05/2019] [Indexed: 01/08/2023]
Abstract
This paper presents a comprehensive literature review on point set registration. The state-of-the-art modeling methods and algorithms for point set registration are discussed and summarized. Special attention is paid to methods for pairwise registration and groupwise registration. Some of the most prominent representative methods are selected to conduct qualitative and quantitative experiments. From the experiments we have conducted on 2D and 3D data, CPD-GL pairwise registration algorithm and JRMPC groupwise registration algorithm seem to outperform their rivals both in accuracy and computational complexity. Furthermore, future research directions and avenues in the area are identified.
|
35
|
Ye M, Li J, Ma AJ, Zheng L, Yuen PC. Dynamic Graph Co-Matching for Unsupervised Video-based Person Re-Identification. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2019; 28:2976-2990. [PMID: 30640612 DOI: 10.1109/tip.2019.2893066] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Indexed: 06/09/2023]
Abstract
Cross-camera label estimation from a set of unlabelled training data is an extremely important component in unsupervised person re-identification (re-ID) systems. With the estimated labels, existing advanced supervised learning methods can be leveraged to learn discriminative re-ID models. In this paper, we utilize the graph matching technique for accurate label estimation due to its advantages in optimal global matching and intra-camera relationship mining. However, the graph structure constructed with non-learnt similarity measurement cannot handle the large cross-camera variations, which leads to noisy and inaccurate label outputs. This paper designs a Dynamic Graph Matching (DGM) framework, which improves the label estimation process by iteratively refining the graph structure with better similarity measurement learnt from intermediate estimated labels. In addition, we design a positive re-weighting strategy to refine the intermediate labels, which enhances the robustness against inaccurate matching output and noisy initial training data. To fully utilize the abundant video information and reduce false matchings, a co-matching strategy is further incorporated into the framework. Comprehensive experiments conducted on three video benchmarks demonstrate that DGM outperforms state-of-the-art unsupervised re-ID methods and yields competitive performance to fully supervised upper bounds.
|
36
|
Du Q, Xu H, Ma Y, Huang J, Fan F. Fusing Infrared and Visible Images of Different Resolutions via Total Variation Model. SENSORS 2018; 18:s18113827. [PMID: 30413066 PMCID: PMC6263655 DOI: 10.3390/s18113827] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Received: 09/09/2018] [Revised: 11/03/2018] [Accepted: 11/05/2018] [Indexed: 11/23/2022]
Abstract
In infrared and visible image fusion, existing methods typically have a prerequisite that the source images share the same resolution. However, due to limitations of hardware devices and application environments, infrared images constantly suffer from markedly lower resolution compared with the corresponding visible images. In this case, current fusion methods inevitably cause texture information loss in visible images or blur thermal radiation information in infrared images. Moreover, the principle of existing fusion rules typically focuses on preserving texture details in source images, which may be inappropriate for fusing infrared thermal radiation information because it is characterized by pixel intensities, possibly neglecting the prominence of targets in fused images. Faced with such difficulties and challenges, we propose a novel method to fuse infrared and visible images of different resolutions and generate high-resolution resulting images to obtain clear and accurate fused images. Specifically, the fusion problem is formulated as a total variation (TV) minimization problem. The data fidelity term constrains the pixel intensity similarity of the downsampled fused image with respect to the infrared image, and the regularization term compels the gradient similarity of the fused image with respect to the visible image. The fast iterative shrinkage-thresholding algorithm (FISTA) framework is applied to improve the convergence rate. Our resulting fused images are similar to super-resolved infrared images, which are sharpened by the texture information from visible images. Advantages and innovations of our method are demonstrated by the qualitative and quantitative comparisons with six state-of-the-art methods on publicly available datasets.
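One way to write down the objective described in this abstract is sketched below. The symbols are assumptions introduced here: x is the high-resolution fused image, u the low-resolution infrared image, v the visible image, \psi a downsampling operator, \nabla the image gradient, and \lambda > 0 a trade-off weight. The abstract names the two terms but not their norms, so the squared L2 fidelity term and the L1 gradient term are illustrative choices only.

```latex
\min_{x} \; \left\| \psi(x) - u \right\|_2^2 \; + \; \lambda \left\| \nabla x - \nabla v \right\|_1
```

The first term keeps the downsampled fused image close to the infrared intensities, and the second pulls the gradients of the fused image toward those of the visible image, which matches the roles of the data fidelity and regularization terms in the abstract.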
Affiliation(s)
- Qinglei Du
- Electronic Information School, Wuhan University, Wuhan 430072, China.
- Air Force Early Warning Academy, Wuhan 430019, China.
| | - Han Xu
- Electronic Information School, Wuhan University, Wuhan 430072, China.
| | - Yong Ma
- Electronic Information School, Wuhan University, Wuhan 430072, China.
| | - Jun Huang
- Electronic Information School, Wuhan University, Wuhan 430072, China.
| | - Fan Fan
- Electronic Information School, Wuhan University, Wuhan 430072, China.
| |
|
37
|
Yan J, Li C, Li Y, Cao G. Adaptive Discrete Hypergraph Matching. IEEE TRANSACTIONS ON CYBERNETICS 2018; 48:765-779. [PMID: 28222006 DOI: 10.1109/tcyb.2017.2655538] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Indexed: 06/06/2023]
Abstract
This paper addresses the problem of hypergraph matching using higher-order affinity information. We propose a solver that iteratively updates the solution in the discrete domain by linear assignment approximation. The proposed method is guaranteed to converge to a stationary discrete solution and avoids the annealing procedure and ad-hoc post binarization step that are required in several previous methods. Specifically, we start with a simple iterative discrete gradient assignment solver. This solver can be trapped in an -circle sequence under moderate conditions, where is the order of the graph matching problem. We then devise an adaptive relaxation mechanism to jump out this degenerating case and show that the resulting new path will converge to a fixed solution in the discrete domain. The proposed method is tested on both synthetic and real-world benchmarks. The experimental results corroborate the efficacy of our method.
|
38
|
|
39
|
Yang C, Yin XC, Pei WY, Tian S, Zuo ZY, Zhu C, Yan J. Tracking Based Multi-Orientation Scene Text Detection: A Unified Framework With Dynamic Programming. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2017; 26:3235-3248. [PMID: 28436864 DOI: 10.1109/tip.2017.2695104] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 06/07/2023]
Abstract
There are a variety of grand challenges for multi-orientation text detection in scene videos, where the typical issues include skew distortion, low contrast, and arbitrary motion. Most conventional video text detection methods using individual frames have limited performance. In this paper, we propose a novel tracking based multi-orientation scene text detection method using multiple frames within a unified framework via dynamic programming. First, a multi-information fusion-based multi-orientation text detection method in each frame is proposed to extensively locate possible character candidates and extract text regions with multiple channels and scales. Second, an optimal tracking trajectory is learned and linked globally over consecutive frames by dynamic programming to finally refine the detection results with all detection, recognition, and prediction information. Moreover, the effectiveness of our proposed system is evaluated with the state-of-the-art performances on several public data sets of multi-orientation scene text images and videos, including MSRA-TD500, USTB-SV1K, and ICDAR 2015 Scene Videos.
|