1. Cai L, Wang Z, Kulathinal R, Kumar S, Ji S. Deep Low-Shot Learning for Biological Image Classification and Visualization From Limited Training Samples. IEEE Trans Neural Netw Learn Syst 2023;34:2528-2538. [PMID: 34487501] [DOI: 10.1109/tnnls.2021.3106831]
Abstract
Predictive modeling is useful but very challenging in biological image analysis due to the high cost of obtaining and labeling training data. For example, in the study of gene interaction and regulation in Drosophila embryogenesis, the analysis is most biologically meaningful when in situ hybridization (ISH) gene expression pattern images from the same developmental stage are compared. However, labeling training data with precise stages is very time-consuming even for developmental biologists. Thus, a critical challenge is how to build accurate computational models for precise developmental stage classification from limited training samples. In addition, identification and visualization of developmental landmarks are required to enable biologists to interpret prediction results and calibrate models. To address these challenges, we propose a deep two-step low-shot learning framework to accurately classify ISH images using limited training images. Specifically, to enable accurate model training on limited training samples, we formulate the task as a deep low-shot learning problem and develop a novel two-step learning approach, including data-level learning and feature-level learning. We use a deep residual network as our base model and achieve improved performance in the precise stage prediction task of ISH images. Furthermore, the deep model can be interpreted by computing saliency maps, which consist of pixel-wise contributions of an image to its prediction result. In our task, saliency maps are used to assist the identification and visualization of developmental landmarks. Our experimental results show that the proposed model can not only make accurate predictions but also yield biologically meaningful interpretations. We anticipate that our methods will be easily generalizable to other biological image classification tasks with small training datasets. Our open-source code is available at https://github.com/divelab/lsl-fly.
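For readers who want a concrete picture of the saliency-map interpretation step described above, a minimal sketch follows. It is not the authors' released code (see their repository for that); it assumes an ImageNet-pretrained torchvision ResNet and a hypothetical preprocessed ISH image tensor, and computes pixel-wise contributions as input gradients of the top-class score.

```python
# Sketch only: input-gradient saliency for a CNN classifier.
import torch
from torchvision.models import resnet50

model = resnet50(weights="IMAGENET1K_V1").eval()

def saliency_map(model, image):
    """Pixel-wise contribution of `image` to its predicted class (input gradients)."""
    x = image.unsqueeze(0).requires_grad_(True)     # (1, 3, H, W)
    scores = model(x)                               # (1, num_classes)
    top = scores.argmax(dim=1).item()
    scores[0, top].backward()                       # d(top-class score) / d(pixels)
    return x.grad.abs().max(dim=1)[0].squeeze(0)    # (H, W) saliency

image = torch.rand(3, 224, 224)                     # placeholder for a preprocessed ISH image
smap = saliency_map(model, image)
```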
2. Long W, Li T, Yang Y, Shen HB. FlyIT: Drosophila Embryogenesis Image Annotation Based on Image Tiling and Convolutional Neural Networks. IEEE/ACM Trans Comput Biol Bioinform 2021;18:194-204. [PMID: 31425122] [DOI: 10.1109/tcbb.2019.2935723]
Abstract
With the rise of image-based transcriptomics, spatial gene expression data has become increasingly important for understanding gene regulations from the tissue level down to the cell level. Especially, the gene expression images of Drosophila embryos provide a new data source in the study of Drosophila embryogenesis. It is imperative to develop automatic annotation tools since manual annotation is labor-intensive and requires professional knowledge. Although a lot of image annotation methods have been proposed in the computer vision field, they may not work well for gene expression images, due to the great difference between these two annotation tasks. Besides the apparent difference on images, the annotation is performed at the gene level rather than the image level, where the expression patterns of a gene are recorded in multiple images. Moreover, the annotation terms often correspond to local expression patterns of images, yet they are assigned collectively to groups of images and the relations between the terms and single images are unknown. In order to learn the spatial expression patterns comprehensively for genes, we propose a new method, called FlyIT (image annotation based on Image Tiling and convolutional neural networks for fruit Fly). We implement two versions of FlyIT, learning at image-level and gene-level, respectively. The gene-level version employs an image tiling strategy to get a combined image feature representation for each gene. FlyIT uses a pre-trained ResNet model to obtain feature representation and a new loss function to deal with the class imbalance problem. As the annotation of Drosophila images is a multi-label classification problem, the new loss function considers the difficulty levels for recognizing different labels of the same sample and adjusts the sample weights accordingly. The experimental results on the FlyExpress database show that both the image tiling strategy and the deep architecture lead to the great enhancement of the annotation performance. FlyIT outperforms the existing annotators by a large margin (over 9 percent on AUC and 12 percent on macro F1 for predicting the top 10 terms). It also shows advantages over other deep learning models, including both single-instance and multi-instance learning frameworks.
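FlyIT's exact loss is not reproduced here; the sketch below shows one common way to weight a multi-label loss by per-label difficulty, in the spirit of the description above (a focal-style weighting). The tensor shapes and the gamma parameter are illustrative assumptions.

```python
# Sketch only: difficulty-aware multi-label loss (focal-style weighting).
import torch
import torch.nn.functional as F

def difficulty_weighted_bce(logits, targets, gamma=2.0):
    """Multi-label BCE where each label's weight shrinks as the model finds it easy."""
    p = torch.sigmoid(logits)
    pt = torch.where(targets == 1, p, 1 - p)     # probability assigned to the true label value
    weight = (1 - pt) ** gamma                   # hard labels receive larger weight
    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    return (weight * bce).mean()

logits = torch.randn(4, 10)                      # 4 samples, 10 annotation terms (placeholders)
targets = torch.randint(0, 2, (4, 10)).float()
loss = difficulty_weighted_bce(logits, targets)
```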
3. Zhang W, Li R, Zeng T, Sun Q, Kumar S, Ye J, Ji S. Deep Model Based Transfer and Multi-Task Learning for Biological Image Analysis. IEEE Trans Big Data 2020;6:322-333. [PMID: 36846743] [PMCID: PMC9957557] [DOI: 10.1109/tbdata.2016.2573280]
Abstract
A central theme in learning from image data is to develop appropriate representations for the specific task at hand. Thus, a practical challenge is to determine what features are appropriate for specific tasks. For example, in the study of gene expression patterns in Drosophila, texture features were particularly effective for determining the developmental stages from in situ hybridization images. Such an image representation is, however, not suitable for controlled vocabulary term annotation. Here, we developed feature extraction methods to generate hierarchical representations for ISH images. Our approach is based on deep convolutional neural networks that act on image pixels directly. To make the extracted features generic, the models were trained using a natural image set with millions of labeled examples. These models were transferred to the ISH image domain. To account for the differences between the source and target domains, we proposed a partial transfer learning scheme in which only part of the source model is transferred. We employed a multi-task learning method to fine-tune the pre-trained models with labeled ISH images. Results showed that feature representations computed by deep models based on transfer and multi-task learning significantly outperformed other methods for annotating gene expression patterns at different stage ranges.
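A hedged sketch of the general recipe described above (transfer part of an ImageNet-pretrained model, then fine-tune with task-specific heads) is shown below. It is not the authors' implementation: the choice of backbone, which layers are frozen, and the stage-range task names are all assumptions for illustration.

```python
# Sketch only: partial transfer of a pretrained CNN plus multi-task heads.
import torch.nn as nn
from torchvision.models import resnet18

backbone = resnet18(weights="IMAGENET1K_V1")
feature_dim = backbone.fc.in_features
backbone.fc = nn.Identity()                      # keep the convolutional trunk only

# Freeze the earliest, most generic layers; later layers adapt to ISH images.
for name, p in backbone.named_parameters():
    if name.startswith(("conv1", "bn1", "layer1")):
        p.requires_grad = False

# Hypothetical stage-range tasks, each with its own set of vocabulary terms.
terms_per_task = {"stage4-6": 20, "stage7-8": 20, "stage9-10": 20}
heads = nn.ModuleDict({t: nn.Linear(feature_dim, k) for t, k in terms_per_task.items()})

def forward(images, task):
    """Shared trunk, task-specific head: the multi-task fine-tuning pattern."""
    return heads[task](backbone(images))
```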
Affiliation(s)
- Wenlu Zhang: Department of Computer Science, Old Dominion University, Norfolk, VA 23529
- Rongjian Li: Department of Computer Science, Old Dominion University, Norfolk, VA 23529
- Tao Zeng: School of Electrical Engineering and Computer Science, Washington State University, Pullman, WA 99163
- Qian Sun: Department of Computer Science and Engineering, Arizona State University, Tempe, AZ 85287
- Sudhir Kumar: Institute for Genomics and Evolutionary Medicine and the Department of Biology, Temple University, Philadelphia, PA 19122
- Jieping Ye: Department of Electrical Engineering and Computer Science and the Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109
- Shuiwang Ji: School of Electrical Engineering and Computer Science, Washington State University, Pullman, WA 99163

4. Zeng T, Li R, Mukkamala R, Ye J, Ji S. Deep convolutional neural networks for annotating gene expression patterns in the mouse brain. BMC Bioinformatics 2015;16:147. [PMID: 25948335] [PMCID: PMC4432953] [DOI: 10.1186/s12859-015-0553-9]
Abstract
BACKGROUND Profiling gene expression in brain structures at various spatial and temporal scales is essential to understanding how genes regulate the development of brain structures. The Allen Developing Mouse Brain Atlas provides high-resolution 3-D in situ hybridization (ISH) gene expression patterns in multiple developing stages of the mouse brain. Currently, the ISH images are annotated with anatomical terms manually. In this paper, we propose a computational approach to annotate gene expression pattern images in the mouse brain at various structural levels over the course of development. RESULTS We applied a deep convolutional neural network that was trained on a large set of natural images to extract features from the ISH images of the developing mouse brain. As a baseline representation, we applied invariant image feature descriptors to capture local statistics from ISH images and used the bag-of-words approach to build image-level representations. Both types of features from multiple ISH image sections of the entire brain were then combined to build 3-D, brain-wide gene expression representations. We employed regularized learning methods for discriminating gene expression patterns in different brain structures. Results show that our approach of using convolutional models as feature extractors achieved superior performance in annotating gene expression patterns at multiple levels of brain structures throughout four developing ages. Overall, we achieved an average AUC of 0.894 ± 0.014, as compared with 0.820 ± 0.046 yielded by the bag-of-words approach. CONCLUSIONS A deep convolutional neural network model trained on natural image sets and applied to gene expression pattern annotation tasks yielded superior performance, demonstrating that its transfer learning property is applicable to such biological image sets.
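The bag-of-words baseline mentioned above can be sketched as follows, assuming local invariant descriptors (e.g. SIFT) have already been extracted for each ISH section; the vocabulary size and classifier are illustrative choices, not the paper's exact settings.

```python
# Sketch only: bag-of-visual-words features via vector quantization.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

def bag_of_words_features(descriptors_per_image, n_words=500, seed=0):
    """Quantize local descriptors into a visual vocabulary and build per-image histograms."""
    vocab = KMeans(n_clusters=n_words, random_state=seed, n_init=10)
    vocab.fit(np.vstack(descriptors_per_image))
    feats = []
    for desc in descriptors_per_image:
        words = vocab.predict(desc)
        hist, _ = np.histogram(words, bins=np.arange(n_words + 1))
        feats.append(hist / max(hist.sum(), 1))   # normalized word histogram
    return np.array(feats)

# Regularized learning on the image-level representations (labels are hypothetical):
# X = bag_of_words_features(descriptors_per_image)
# clf = LogisticRegression(C=1.0, max_iter=1000).fit(X, structure_labels)
```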
Affiliation(s)
- Tao Zeng: Department of Computer Science, Old Dominion University, Norfolk, VA 23529, USA
- Rongjian Li: Department of Computer Science, Old Dominion University, Norfolk, VA 23529, USA
- Ravi Mukkamala: Department of Computer Science, Old Dominion University, Norfolk, VA 23529, USA
- Jieping Ye: Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA; Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI 48109, USA
- Shuiwang Ji: Department of Computer Science, Old Dominion University, Norfolk, VA 23529, USA

5. Li R, Zhang W, Ji S. Automated identification of cell-type-specific genes in the mouse brain by image computing of expression patterns. BMC Bioinformatics 2014;15:209. [PMID: 24947138] [PMCID: PMC4078975] [DOI: 10.1186/1471-2105-15-209]
Abstract
Background Differential gene expression patterns in cells of the mammalian brain result in the morphological, connectional, and functional diversity of cells. A wide variety of studies have shown that certain genes are expressed only in specific cell-types. Analysis of cell-type-specific gene expression patterns can provide insights into the relationship between genes, connectivity, brain regions, and cell-types. However, automated methods for identifying cell-type-specific genes are lacking to date. Results Here, we describe a set of computational methods for identifying cell-type-specific genes in the mouse brain by automated image computing of in situ hybridization (ISH) expression patterns. We applied invariant image feature descriptors to capture local gene expression information from cellular-resolution ISH images. We then built image-level representations by applying vector quantization on the image descriptors. We employed regularized learning methods for classifying genes specifically expressed in different brain cell-types. These methods can also rank image features based on their discriminative power. We used a data set of 2,872 genes from the Allen Brain Atlas in the experiments. Results showed that our methods are predictive of cell-type-specificity of genes. Our classifiers achieved AUC values of approximately 87% when the enrichment level is set to 20. In addition, we showed that the highly-ranked image features captured the relationship between cell-types. Conclusions Overall, our results showed that automated image computing methods could potentially be used to identify cell-type-specific genes in the mouse brain.
Affiliation(s)
- Shuiwang Ji: Department of Computer Science, Old Dominion University, Norfolk, VA 23529, USA

6. Jug F, Pietzsch T, Preibisch S, Tomancak P. Bioimage Informatics in the context of Drosophila research. Methods 2014;68:60-73. [PMID: 24732429] [DOI: 10.1016/j.ymeth.2014.04.004]
Abstract
Modern biological research relies heavily on microscopic imaging. The advanced genetic toolkit of Drosophila makes it possible to label molecular and cellular components with unprecedented level of specificity necessitating the application of the most sophisticated imaging technologies. Imaging in Drosophila spans all scales from single molecules to the entire populations of adult organisms, from electron microscopy to live imaging of developmental processes. As the imaging approaches become more complex and ambitious, there is an increasing need for quantitative, computer-mediated image processing and analysis to make sense of the imagery. Bioimage Informatics is an emerging research field that covers all aspects of biological image analysis from data handling, through processing, to quantitative measurements, analysis and data presentation. Some of the most advanced, large scale projects, combining cutting edge imaging with complex bioimage informatics pipelines, are realized in the Drosophila research community. In this review, we discuss the current research in biological image analysis specifically relevant to the type of systems level image datasets that are uniquely available for the Drosophila model system. We focus on how state-of-the-art computer vision algorithms are impacting the ability of Drosophila researchers to analyze biological systems in space and time. We pay particular attention to how these algorithmic advances from computer science are made usable to practicing biologists through open source platforms and how biologists can themselves participate in their further development.
Affiliation(s)
- Florian Jug: Max Planck Institute of Molecular Cell Biology and Genetics, 01307 Dresden, Germany
- Tobias Pietzsch: Max Planck Institute of Molecular Cell Biology and Genetics, 01307 Dresden, Germany
- Stephan Preibisch: Janelia Farm Research Campus, Howard Hughes Medical Institute, Ashburn, VA 20147, USA; Department of Anatomy and Structural Biology, Gruss Lipper Biophotonics Center, Albert Einstein College of Medicine, Bronx, NY 10461, USA
- Pavel Tomancak: Max Planck Institute of Molecular Cell Biology and Genetics, 01307 Dresden, Germany

7. Augmenting multi-instance multilabel learning with sparse Bayesian models for skin biopsy image analysis. Biomed Res Int 2014;2014:305629. [PMID: 24860817] [PMCID: PMC3997873] [DOI: 10.1155/2014/305629]
Abstract
Skin biopsy images can reveal the causes and severity of many skin diseases, a significant complement to skin surface inspection. Automatic annotation of skin biopsy images is an important problem for increasing efficiency and reducing subjectivity in diagnosis. However, it is particularly challenging when there is an indirect relationship between annotation terms and local regions of a biopsy image, as well as local structures with different textures. In this paper, a novel method based on a recently proposed machine learning model, named multi-instance multilabel (MIML), is proposed to model the potential knowledge and experience of doctors in skin biopsy image annotation. We first show that the problem of skin biopsy image annotation can naturally be expressed as an MIML problem and then propose an image representation method that can capture both region structure and texture features, and a sparse Bayesian MIML algorithm which can produce probabilities indicating the confidence of annotation. The proposed algorithm framework is evaluated on a real clinical dataset containing 12,700 skin biopsy images. The results show that it is effective and prominent.
8. Zhang W, Feng D, Li R, Chernikov A, Chrisochoides N, Osgood C, Konikoff C, Newfeld S, Kumar S, Ji S. A mesh generation and machine learning framework for Drosophila gene expression pattern image analysis. BMC Bioinformatics 2013;14:372. [PMID: 24373308] [PMCID: PMC3879658] [DOI: 10.1186/1471-2105-14-372]
Abstract
BACKGROUND Multicellular organisms consist of cells of many different types that are established during development. Each type of cell is characterized by the unique combination of expressed gene products as a result of spatiotemporal gene regulation. Currently, a fundamental challenge in regulatory biology is to elucidate the gene expression controls that generate the complex body plans during development. Recent advances in high-throughput biotechnologies have generated spatiotemporal expression patterns for thousands of genes in the model organism fruit fly Drosophila melanogaster. Existing qualitative methods enhanced by a quantitative analysis based on computational tools we present in this paper would provide promising ways for addressing key scientific questions. RESULTS We develop a set of computational methods and open source tools for identifying co-expressed embryonic domains and the associated genes simultaneously. To map the expression patterns of many genes into the same coordinate space and account for the embryonic shape variations, we develop a mesh generation method to deform a meshed generic ellipse to each individual embryo. We then develop a co-clustering formulation to cluster the genes and the mesh elements, thereby identifying co-expressed embryonic domains and the associated genes simultaneously. Experimental results indicate that the gene and mesh co-clusters can be correlated to key developmental events during the stages of embryogenesis we study. The open source software tool has been made available at http://compbio.cs.odu.edu/fly/. CONCLUSIONS Our mesh generation and machine learning methods and tools improve upon the flexibility, ease-of-use and accuracy of existing methods.
Affiliation(s)
- Shuiwang Ji: Department of Computer Science, Old Dominion University, Norfolk, VA 23529, USA

9. Pruteanu-Malinici I, Majoros WH, Ohler U. Automated annotation of gene expression image sequences via non-parametric factor analysis and conditional random fields. Bioinformatics 2013;29:i27-i35. [PMID: 23812993] [PMCID: PMC3694682] [DOI: 10.1093/bioinformatics/btt206]
Abstract
Motivation: Computational approaches for the annotation of phenotypes from image data have shown promising results across many applications, and provide rich and valuable information for studying gene function and interactions. While data are often available both at high spatial resolution and across multiple time points, phenotypes are frequently annotated independently, for individual time points only. In particular, for the analysis of developmental gene expression patterns, it is biologically sensible when images across multiple time points are jointly accounted for, such that spatial and temporal dependencies are captured simultaneously. Methods: We describe a discriminative undirected graphical model to label gene-expression time-series image data, with an efficient training and decoding method based on the junction tree algorithm. The approach is based on an effective feature selection technique, consisting of a non-parametric sparse Bayesian factor analysis model. The result is a flexible framework, which can handle large-scale data with noisy incomplete samples, i.e. it can tolerate data missing from individual time points. Results: Using the annotation of gene expression patterns across stages of Drosophila embryonic development as an example, we demonstrate that our method achieves superior accuracy, gained by jointly annotating phenotype sequences, when compared with previous models that annotate each stage in isolation. The experimental results on missing data indicate that our joint learning method successfully annotates genes for which no expression data are available for one or more stages. Contact: uwe.ohler@duke.edu
10. Sun Q, Muckatira S, Yuan L, Ji S, Newfeld S, Kumar S, Ye J. Image-level and group-level models for Drosophila gene expression pattern annotation. BMC Bioinformatics 2013;14:350. [PMID: 24299119] [PMCID: PMC3924186] [DOI: 10.1186/1471-2105-14-350]
Abstract
Background Drosophila melanogaster has been established as a model organism for investigating the developmental gene interactions. The spatio-temporal gene expression patterns of Drosophila melanogaster can be visualized by in situ hybridization and documented as digital images. Automated and efficient tools for analyzing these expression images will provide biological insights into the gene functions, interactions, and networks. To facilitate pattern recognition and comparison, many web-based resources have been created to conduct comparative analysis based on the body part keywords and the associated images. With the fast accumulation of images from high-throughput techniques, manual inspection of images will impose a serious impediment on the pace of biological discovery. It is thus imperative to design an automated system for efficient image annotation and comparison. Results We present a computational framework to perform anatomical keywords annotation for Drosophila gene expression images. The spatial sparse coding approach is used to represent local patches of images in comparison with the well-known bag-of-words (BoW) method. Three pooling functions including max pooling, average pooling and Sqrt (square root of mean squared statistics) pooling are employed to transform the sparse codes to image features. Based on the constructed features, we develop both an image-level scheme and a group-level scheme to tackle the key challenges in annotating Drosophila gene expression pattern images automatically. To deal with the imbalanced data distribution inherent in image annotation tasks, the undersampling method is applied together with majority vote. Results on Drosophila embryonic expression pattern images verify the efficacy of our approach. Conclusion In our experiment, the three pooling functions perform comparably well in feature dimension reduction. The undersampling with majority vote is shown to be effective in tackling the problem of imbalanced data. Moreover, combining sparse coding and image-level scheme leads to consistent performance improvement in keywords annotation.
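The three pooling functions named above are simple to state; the following sketch shows them applied to a matrix of sparse codes for one image (the array sizes are placeholders, not the paper's settings).

```python
# Sketch only: max, average, and sqrt pooling over per-patch sparse codes.
import numpy as np

def pool(sparse_codes, method="max"):
    """Collapse a (n_patches, n_codewords) matrix into one image-level feature vector."""
    if method == "max":
        return sparse_codes.max(axis=0)
    if method == "average":
        return sparse_codes.mean(axis=0)
    if method == "sqrt":                          # square root of mean squared statistics
        return np.sqrt((sparse_codes ** 2).mean(axis=0))
    raise ValueError(f"unknown pooling method: {method}")

codes = np.abs(np.random.randn(200, 1024)) * (np.random.rand(200, 1024) < 0.05)
feature = pool(codes, "sqrt")
```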
Affiliation(s)
- Jieping Ye: Center for Evolutionary Medicine and Informatics, The Biodesign Institute, Arizona State University, Tempe, AZ 85287, USA

11. Zhang G, Yin J, Li Z, Su X, Li G, Zhang H. Automated skin biopsy histopathological image annotation using multi-instance representation and learning. BMC Med Genomics 2013;6 Suppl 3:S10. [PMID: 24565115] [PMCID: PMC3980401] [DOI: 10.1186/1755-8794-6-s3-s10]
Abstract
With digitisation and the development of computer-aided diagnosis, histopathological image analysis has attracted considerable interest in recent years. In this article, we address the problem of the automated annotation of skin biopsy images, a special type of histopathological image analysis. In contrast to previous well-studied methods in histopathology, we propose a novel annotation method based on a multi-instance learning framework. The proposed framework first represents each skin biopsy image as a multi-instance sample using a graph cutting method, decomposing the image to a set of visually disjoint regions. Then, we construct two classification models using multi-instance learning algorithms, among which one provides determinate results and the other calculates a posterior probability. We evaluate the proposed annotation framework using a real dataset containing 6691 skin biopsy images, with 15 properties as target annotation terms. The results indicate that the proposed method is effective and medically acceptable.
Affiliation(s)
- Gang Zhang: School of Information Science and Technology, Sun Yat-sen University, Guangzhou, 510275, China; School of Automation, Guangdong University of Technology, Guangzhou, 510006, China
- Jian Yin: School of Information Science and Technology, Sun Yat-sen University, Guangzhou, 510275, China
- Ziping Li: The Second Affiliated Hospital of Guangzhou University of Chinese Medicine, 510120, China
- Xiangyang Su: The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, 510630, China
- Guozheng Li: The Second Affiliated Hospital of Guangzhou University of Chinese Medicine, 510120, China; Department of Control Science and Engineering, Tongji University, Shanghai, 201804, China
- Honglai Zhang: Guangzhou University of Chinese Medicine, Guangzhou, 510120, China
12. A two-layer framework for appearance based recognition using spatial and discriminant influences. Neurocomputing 2013. [DOI: 10.1016/j.neucom.2013.03.015]
13. Cai X, Wang H, Huang H, Ding C. Joint stage recognition and anatomical annotation of Drosophila gene expression patterns. Bioinformatics 2012;28:i16-i24. [PMID: 22689756] [PMCID: PMC3371852] [DOI: 10.1093/bioinformatics/bts220]
Abstract
Motivation: Staining the mRNA of a gene via in situ hybridization (ISH) during the development of a Drosophila melanogaster embryo delivers the detailed spatio-temporal patterns of gene expression. Many related biological problems, such as the detection of co-expressed genes, co-regulated genes and transcription factor binding motifs, rely heavily on the analysis of these image patterns. To provide text-based pattern searching that facilitates related biological studies, the images in the Berkeley Drosophila Genome Project (BDGP) study are annotated manually by domain experts with a developmental stage term and anatomical ontology terms. Due to the rapid increase in the number of such images and the inevitable annotation bias of human curators, it is necessary to develop an automatic method to recognize the developmental stage and annotate anatomical terms. Results: In this article, we propose a novel computational model for joint stage classification and anatomical term annotation of Drosophila gene expression patterns. We propose a novel Tri-Relational Graph (TG) model that comprises the data graph, the anatomical term graph and the developmental stage term graph, and connects them by two additional graphs induced from stage or annotation label assignments. On top of the TG model, we introduce a Preferential Random Walk (PRW) method to jointly recognize developmental stage and annotate anatomical terms by utilizing the interrelations between the two tasks. The experimental results on two refined BDGP datasets demonstrate that our joint learning method can achieve superior prediction results on both tasks compared with state-of-the-art methods. Availability: http://ranger.uta.edu/%7eheng/Drosophila/ Contact: heng@uta.edu
Affiliation(s)
- Xiao Cai: Department of Computer Science and Engineering, University of Texas at Arlington, Arlington, TX 76019, USA

14. Mwangi B, Ebmeier KP, Matthews K, Steele JD. Multi-centre diagnostic classification of individual structural neuroimaging scans from patients with major depressive disorder. Brain 2012;135:1508-1521. [PMID: 22544901] [DOI: 10.1093/brain/aws084]
Abstract
Quantitative abnormalities of brain structure in patients with major depressive disorder have been reported at a group level for decades. However, these structural differences appear subtle in comparison with conventional radiologically defined abnormalities, with considerable inter-subject variability. Consequently, it has not been possible to readily identify scans from patients with major depressive disorder at an individual level. Recently, machine learning techniques such as relevance vector machines and support vector machines have been applied to predictive classification of individual scans with variable success. Here we describe a novel hybrid method, which combines machine learning with feature selection and characterization, with the latter aimed at maximizing the accuracy of machine learning prediction. The method was tested using a multi-centre dataset of T(1)-weighted 'structural' scans. A total of 62 patients with major depressive disorder and matched controls were recruited from referred secondary care clinical populations in Aberdeen and Edinburgh, UK. The generalization ability and predictive accuracy of the classifiers was tested using data left out of the training process. High prediction accuracy was achieved (~90%). While feature selection was important for maximizing high predictive accuracy with machine learning, feature characterization contributed only a modest improvement to relevance vector machine-based prediction (~5%). Notably, while the only information provided for training the classifiers was T(1)-weighted scans plus a categorical label (major depressive disorder versus controls), both relevance vector machine and support vector machine 'weighting factors' (used for making predictions) correlated strongly with subjective ratings of illness severity. These results indicate that machine learning techniques have the potential to inform clinical practice and research, as they can make accurate predictions about brain scan data from individual subjects. Furthermore, machine learning weighting factors may reflect an objective biomarker of major depressive disorder illness severity, based on abnormalities of brain structure.
Affiliation(s)
- Benson Mwangi: Division of Neuroscience, Medical Research Institute, University of Dundee, Ninewells Hospital and Medical School, Dundee, UK

15. Yuan L, Woodard A, Ji S, Jiang Y, Zhou ZH, Kumar S, Ye J. Learning sparse representations for fruit-fly gene expression pattern image annotation and retrieval. BMC Bioinformatics 2012;13:107. [PMID: 22621237] [PMCID: PMC3434040] [DOI: 10.1186/1471-2105-13-107]
Abstract
Background Fruit fly embryogenesis is one of the best understood animal development systems, and the spatiotemporal gene expression dynamics in this process are captured by digital images. Analysis of these high-throughput images will provide novel insights into the functions, interactions, and networks of animal genes governing development. To facilitate comparative analysis, web-based interfaces have been developed to conduct image retrieval based on body part keywords and images. Currently, the keyword annotation of spatiotemporal gene expression patterns is conducted manually. However, this manual practice does not scale with the continuously expanding collection of images. In addition, existing image retrieval systems based on the expression patterns may be made more accurate using keywords. Results In this article, we adapt advanced data mining and computer vision techniques to address the key challenges in annotating and retrieving fruit fly gene expression pattern images. To boost the performance of image annotation and retrieval, we propose representations integrating spatial information and sparse features, overcoming the limitations of prior schemes. Conclusions We perform systematic experimental studies to evaluate the proposed schemes in comparison with current methods. Experimental results indicate that the integration of spatial information and sparse features lead to consistent performance improvement in image annotation, while for the task of retrieval, sparse features alone yields better results.
Affiliation(s)
- Lei Yuan: Center for Evolutionary Medicine and Informatics, The Biodesign Institute, Arizona State University, Tempe, AZ 85287, USA

16. Li YX, Ji S, Kumar S, Ye J, Zhou ZH. Drosophila gene expression pattern annotation through multi-instance multi-label learning. IEEE/ACM Trans Comput Biol Bioinform 2012;9:98-112. [PMID: 21519115] [DOI: 10.1109/tcbb.2011.73]
Abstract
In the studies of Drosophila embryogenesis, a large number of two-dimensional digital images of gene expression patterns have been produced to build an atlas of spatio-temporal gene expression dynamics across developmental time. Gene expressions captured in these images have been manually annotated with anatomical and developmental ontology terms using a controlled vocabulary (CV), which are useful in research aimed at understanding gene functions, interactions, and networks. With the rapid accumulation of images, the process of manual annotation has become increasingly cumbersome, and computational methods to automate this task are urgently needed. However, the automated annotation of embryo images is challenging. This is because the annotation terms spatially correspond to local expression patterns of images, yet they are assigned collectively to groups of images and it is unknown which term corresponds to which region of which image in the group. In this paper, we address this problem using a new machine learning framework, Multi-Instance Multi-Label (MIML) learning. We first show that the underlying nature of the annotation task is a typical MIML learning problem. Then, we propose two support vector machine algorithms under the MIML framework for the task. Experimental results on the FlyExpress database (a digital library of standardized Drosophila gene expression pattern images) reveal that the exploitation of MIML framework leads to significant performance improvement over state-of-the-art approaches.
17. Li Q, Kambhamettu C. Contour extraction of Drosophila embryos. IEEE/ACM Trans Comput Biol Bioinform 2011;8:1509-1521. [PMID: 21339537] [DOI: 10.1109/tcbb.2011.37]
Abstract
Contour extraction of Drosophila (fruit fly) embryos is an important step in building a computational system for matching expression patterns of embryonic images to assist the discovery of the nature of genes. Automatic contour extraction of embryos is challenging due to severe image variations, including 1) the size, orientation, shape, and appearance of an embryo of interest; 2) the neighboring context of an embryo of interest (such as nontouching and touching neighboring embryos); and 3) illumination conditions. In this paper, we propose an automatic framework for contour extraction of the embryo of interest in an embryonic image. The proposed framework contains three components. Its first component applies a mixture model of quadratic curves, with statistical features, to initialize the contour of the embryo of interest. An efficient method based on imbalanced image points is proposed to compute model parameters. The second component applies an active contour model to refine embryo contours. The third component applies eigen-shape modeling to smooth jaggy contours caused by blurred embryo boundaries. We test the proposed framework on a data set of 8,000 embryonic images and achieve promising accuracy (88 percent), which is substantially higher than state-of-the-art results.
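As an illustration of the contour-refinement step (the second component above), the sketch below uses scikit-image's active-contour implementation with a hand-placed elliptical initialization standing in for the quadratic-curve mixture model; it is not the authors' framework, and the sample image and ellipse parameters are placeholders.

```python
# Sketch only: snake-based contour refinement with scikit-image.
import numpy as np
from skimage import data, filters
from skimage.segmentation import active_contour

image = data.coins()[:200, :300].astype(float)   # placeholder for an embryo image
smooth = filters.gaussian(image, sigma=3)

# Elliptical initialization in (row, col) coordinates.
t = np.linspace(0, 2 * np.pi, 200)
init = np.column_stack([100 + 80 * np.sin(t), 150 + 120 * np.cos(t)])

contour = active_contour(smooth, init, alpha=0.015, beta=10, gamma=0.001)
```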
Affiliation(s)
- Qi Li: Department of Mathematics and Computer Science, Western Kentucky University, 1906 College Heights Blvd., Bowling Green, KY 42101, USA

18. Pruteanu-Malinici I, Mace DL, Ohler U. Automatic annotation of spatial expression patterns via sparse Bayesian factor models. PLoS Comput Biol 2011;7:e1002098. [PMID: 21814502] [PMCID: PMC3140966] [DOI: 10.1371/journal.pcbi.1002098]
Abstract
Advances in reporters for gene expression have made it possible to document and quantify expression patterns in 2D–4D. In contrast to microarrays, which provide data for many genes but averaged and/or at low resolution, images reveal the high spatial dynamics of gene expression. Developing computational methods to compare, annotate, and model gene expression based on images is imperative, considering that available data are rapidly increasing. We have developed a sparse Bayesian factor analysis model in which the observed expression diversity of among a large set of high-dimensional images is modeled by a small number of hidden common factors. We apply this approach on embryonic expression patterns from a Drosophila RNA in situ image database, and show that the automatically inferred factors provide for a meaningful decomposition and represent common co-regulation or biological functions. The low-dimensional set of factor mixing weights is further used as features by a classifier to annotate expression patterns with functional categories. On human-curated annotations, our sparse approach reaches similar or better classification of expression patterns at different developmental stages, when compared to other automatic image annotation methods using thousands of hard-to-interpret features. Our study therefore outlines a general framework for large microscopy data sets, in which both the generative model itself, as well as its application for analysis tasks such as automated annotation, can provide insight into biological questions. High throughput image acquisition is a quickly increasing new source of data for problems in computational biology, such as phenotypic screens. Given the very diverse nature of imaging technology, samples, and biological questions, approaches are oftentimes very tailored and ad hoc to a specific data set. In particular, the image-based genome scale profiling of gene expression patterns via approaches like in situ hybridization requires the development of accurate and automatic image analysis systems for understanding regulatory networks and development of multicellular organisms. Here, we present a computational method for automated annotation of Drosophila gene expression images. This framework allows us to extract, identify and compare spatial expression patterns, of essence for higher organisms. Based on a sparse feature extraction technique, we successfully cluster and annotate expression patterns with high reliability, and show that the model represents a “vocabulary” of basic patterns reflecting common function or regulation.
Affiliation(s)
- Iulian Pruteanu-Malinici: Institute for Genome Sciences and Policy, Duke University, Durham, North Carolina, United States of America

19. Frise E, Hammonds AS, Celniker SE. Systematic image-driven analysis of the spatial Drosophila embryonic expression landscape. Mol Syst Biol 2010;6:345. [PMID: 20087342] [PMCID: PMC2824522] [DOI: 10.1038/msb.2009.102]
Abstract
Discovery of temporal and spatial patterns of gene expression is essential for understanding the regulatory networks and development in multicellular organisms. We analyzed the images from our large-scale spatial expression data set of early Drosophila embryonic development and present a comprehensive computational image analysis of the expression landscape. For this study, we created an innovative virtual representation of embryonic expression patterns using an elliptically shaped mesh grid that allows us to make quantitative comparisons of gene expression using a common frame of reference. Demonstrating the power of our approach, we used gene co-expression to identify distinct expression domains in the early embryo; the result is surprisingly similar to the fate map determined using laser ablation. We also used a clustering strategy to find genes with similar patterns and developed new analysis tools to detect variation within consensus patterns, adjacent non-overlapping patterns, and anti-correlated patterns. Of the 1800 genes investigated, only half had previously assigned functions. The known genes suggest developmental roles for the clusters, and identification of related patterns predicts requirements for co-occurring biological functions.
Affiliation(s)
- Erwin Frise: Department of Genome Dynamics, Berkeley Drosophila Genome Project, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA

20. Ji S, Yuan L, Li YX, Zhou ZH, Kumar S, Ye J. Drosophila Gene Expression Pattern Annotation Using Sparse Features and Term-Term Interactions. KDD: Proceedings. International Conference on Knowledge Discovery & Data Mining 2009;2009:407-415. [PMID: 21614142] [DOI: 10.1145/1557019.1557068]
Abstract
The Drosophila gene expression pattern images document the spatial and temporal dynamics of gene expression and they are valuable tools for explicating the gene functions, interaction, and networks during Drosophila embryogenesis. To provide text-based pattern searching, the images in the Berkeley Drosophila Genome Project (BDGP) study are annotated with ontology terms manually by human curators. We present a systematic approach for automating this task, because the number of images needing text descriptions is now rapidly increasing. We consider both improved feature representation and novel learning formulation to boost the annotation performance. For feature representation, we adapt the bag-of-words scheme commonly used in visual recognition problems so that the image group information in the BDGP study is retained. Moreover, images from multiple views can be integrated naturally in this representation. To reduce the quantization error caused by the bag-of-words representation, we propose an improved feature representation scheme based on the sparse learning technique. In the design of learning formulation, we propose a local regularization framework that can incorporate the correlations among terms explicitly. We further show that the resulting optimization problem admits an analytical solution. Experimental results show that the representation based on sparse learning outperforms the bag-of-words representation significantly. Results also show that incorporation of the term-term correlations improves the annotation performance consistently.
Affiliation(s)
- Shuiwang Ji: Center for Evolutionary Functional Genomics, The Biodesign Institute, Arizona State University, Tempe, AZ 85287