1
Long W, Li T, Yang Y, Shen HB. FlyIT: Drosophila Embryogenesis Image Annotation based on Image Tiling and Convolutional Neural Networks. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2021; 18:194-204. PMID: 31425122. DOI: 10.1109/tcbb.2019.2935723.
Abstract
With the rise of image-based transcriptomics, spatial gene expression data have become increasingly important for understanding gene regulation from the tissue level down to the cell level. In particular, the gene expression images of Drosophila embryos provide a new data source for the study of Drosophila embryogenesis. It is imperative to develop automatic annotation tools, since manual annotation is labor-intensive and requires professional knowledge. Although many image annotation methods have been proposed in the computer vision field, they may not work well for gene expression images because of the great differences between the two annotation tasks. Beyond the apparent differences in the images themselves, the annotation is performed at the gene level rather than the image level, where the expression patterns of a gene are recorded in multiple images. Moreover, the annotation terms often correspond to local expression patterns of images, yet they are assigned collectively to groups of images, and the relations between the terms and single images are unknown. In order to learn the spatial expression patterns of genes comprehensively, we propose a new method, called FlyIT (image annotation based on Image Tiling and convolutional neural networks for fruit Fly). We implement two versions of FlyIT, learning at the image level and the gene level, respectively. The gene-level version employs an image tiling strategy to obtain a combined image feature representation for each gene. FlyIT uses a pre-trained ResNet model to obtain feature representations and a new loss function to deal with the class imbalance problem. As the annotation of Drosophila images is a multi-label classification problem, the new loss function considers the difficulty levels of recognizing different labels of the same sample and adjusts the sample weights accordingly.
The experimental results on the FlyExpress database show that both the image tiling strategy and the deep architecture greatly enhance the annotation performance. FlyIT outperforms the existing annotators by a large margin (over 9 percent on AUC and 12 percent on macro F1 for predicting the top 10 terms). It also shows advantages over other deep learning models, including both single-instance and multi-instance learning frameworks.
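The difficulty-aware weighting described in the abstract can be illustrated with a short sketch. The exact FlyIT loss is not reproduced here, so this is only a plausible reconstruction: a focal-style per-label weight on binary cross-entropy. The function name `weighted_multilabel_loss` and the `gamma` exponent are illustrative assumptions, not the paper's notation.

```python
import numpy as np

def weighted_multilabel_loss(y_true, y_prob, gamma=2.0):
    """Difficulty-aware multi-label loss (sketch, not the paper's exact form).

    Labels the model gets badly wrong (large |y_true - y_prob|) receive
    larger weights, so hard labels dominate the average.
    """
    eps = 1e-7
    y_prob = np.clip(y_prob, eps, 1 - eps)
    # per-label binary cross-entropy
    bce = -(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))
    # focal-style difficulty weight per label (assumption)
    difficulty = np.abs(y_true - y_prob) ** gamma
    return float(np.mean(difficulty * bce))
```

Under this weighting, confidently correct labels contribute almost nothing, while labels predicted near chance dominate the loss.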
2
Zhang W, Li R, Zeng T, Sun Q, Kumar S, Ye J, Ji S. Deep Model Based Transfer and Multi-Task Learning for Biological Image Analysis. IEEE Transactions on Big Data 2020; 6:322-333. PMID: 36846743. PMCID: PMC9957557. DOI: 10.1109/tbdata.2016.2573280.
Abstract
A central theme in learning from image data is to develop appropriate representations for the specific task at hand. Thus, a practical challenge is to determine what features are appropriate for specific tasks. For example, in the study of gene expression patterns in Drosophila, texture features were particularly effective for determining the developmental stages from in situ hybridization (ISH) images. Such an image representation, however, is not suitable for controlled vocabulary term annotation. Here, we developed feature extraction methods to generate hierarchical representations for ISH images. Our approach is based on deep convolutional neural networks that can act on image pixels directly. To make the extracted features generic, the models were trained on a natural image set with millions of labeled examples. These models were then transferred to the ISH image domain. To account for the differences between the source and target domains, we proposed a partial transfer learning scheme in which only part of the source model is transferred. We employed a multi-task learning method to fine-tune the pre-trained models with labeled ISH images. Results showed that feature representations computed by deep models based on transfer and multi-task learning significantly outperformed other methods for annotating gene expression patterns at different stage ranges.
Affiliation(s)
- Wenlu Zhang
- Department of Computer Science, Old Dominion University, Norfolk, VA 23529
- Rongjian Li
- Department of Computer Science, Old Dominion University, Norfolk, VA 23529
- Tao Zeng
- School of Electrical Engineering and Computer Science, Washington State University, Pullman, WA 99163
- Qian Sun
- Department of Computer Science and Engineering, Arizona State University, Tempe, AZ 85287
- Sudhir Kumar
- Institute for Genomics and Evolutionary Medicine and the Department of Biology, Temple University, Philadelphia, PA 19122
- Jieping Ye
- Department of Electrical Engineering and Computer Science and the Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109
- Shuiwang Ji
- School of Electrical Engineering and Computer Science, Washington State University, Pullman, WA 99163
3
Yang Y, Fang Q, Shen HB. Predicting gene regulatory interactions based on spatial gene expression data and deep learning. PLoS Comput Biol 2019; 15:e1007324. PMID: 31527870. PMCID: PMC6764701. DOI: 10.1371/journal.pcbi.1007324.
Abstract
Reverse engineering of gene regulatory networks (GRNs) is a central task in systems biology. Most existing methods for GRN inference rely on gene co-expression analysis or TF-target binding information, where the determination of co-expression is often unreliable when based merely on gene expression levels, and the TF-target binding data from high-throughput experiments may be noisy, leading to a high ratio of false and missed links, especially for large-scale networks. In recent years, microscopy images recording spatial gene expression have become a new resource for GRN reconstruction, as the spatial and temporal expression patterns contain abundant gene interaction information. To date, the spatial expression resources have been largely underexploited, and only a few traditional image processing methods have been employed in image-based GRN reconstruction. Moreover, co-expression analysis using conventional measurements based on image similarity may be inaccurate, because it is local-pattern consistency rather than global image similarity that determines gene-gene interactions. Here we present GripDL (Gene regulatory interaction prediction via Deep Learning), which incorporates high-confidence TF-gene regulation knowledge from previous studies and constructs GRNs for Drosophila eye development based on Drosophila embryonic gene expression images. Benefiting from the powerful representation ability of deep neural networks and the supervision information of known interactions, the new method outperforms traditional methods by a large margin and reveals intriguing new knowledge about Drosophila eye development.
Affiliation(s)
- Yang Yang
- Center for Brain-Like Computing and Machine Intelligence, Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
- Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, Shanghai, China
- Qingwei Fang
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
- Hong-Bin Shen
- Institute of Image Processing and Pattern Recognition, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai Jiao Tong University, Shanghai, China
4
Yang Y, Zhou M, Fang Q, Shen HB. AnnoFly: annotating Drosophila embryonic images based on an attention-enhanced RNN model. Bioinformatics 2019; 35:2834-2842. DOI: 10.1093/bioinformatics/bty1064.
Abstract
Motivation
In the post-genomic era, image-based transcriptomics has received considerable attention, because the visualization of gene expression distributions can reveal spatial and temporal expression patterns, which is critically important for understanding biological mechanisms. The Berkeley Drosophila Genome Project has collected a large-scale spatial gene expression database for studying Drosophila embryogenesis. Given the expression images, annotating them for the study of Drosophila embryonic development is the next urgent task. In order to speed up the labor-intensive labeling work, automatic tools are highly desired. However, conventional image annotation tools are not applicable here, because the labeling is at the gene level rather than the image level, where each gene is represented by a bag of multiple related images, showing a multi-instance phenomenon, and the image quality varies with image orientation and experiment batch. Moreover, different local regions of an image correspond to different controlled vocabulary (CV) annotation terms, i.e. an image has multiple labels. Designing an accurate annotation tool in such a multi-instance multi-label scenario is a very challenging task.
Results
To address these challenges, we develop a new annotator for the fruit fly embryonic images, called AnnoFly. Driven by an attention-enhanced RNN model, it can weight images of different qualities, so as to focus on the most informative image patterns. We assess the new model on three standard datasets. The experimental results reveal that the attention-based model provides a transparent approach for identifying the important images for labeling, and it substantially enhances the accuracy compared with the existing annotation methods, including both single-instance and multi-instance learning methods.
Availability and implementation
http://www.csbio.sjtu.edu.cn/bioinf/annofly/
Supplementary information
Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yang Yang
- Center for Brain-Like Computing and Machine Intelligence, Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
- Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, Shanghai, China
- Mingyu Zhou
- Center for Brain-Like Computing and Machine Intelligence, Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
- Qingwei Fang
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
- Hong-Bin Shen
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai, China
- Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, China
5
Jug F, Pietzsch T, Preibisch S, Tomancak P. Bioimage Informatics in the context of Drosophila research. Methods 2014; 68:60-73. PMID: 24732429. DOI: 10.1016/j.ymeth.2014.04.004.
Abstract
Modern biological research relies heavily on microscopic imaging. The advanced genetic toolkit of Drosophila makes it possible to label molecular and cellular components with an unprecedented level of specificity, necessitating the application of the most sophisticated imaging technologies. Imaging in Drosophila spans all scales, from single molecules to entire populations of adult organisms, from electron microscopy to live imaging of developmental processes. As imaging approaches become more complex and ambitious, there is an increasing need for quantitative, computer-mediated image processing and analysis to make sense of the imagery. Bioimage Informatics is an emerging research field that covers all aspects of biological image analysis, from data handling, through processing, to quantitative measurements, analysis, and data presentation. Some of the most advanced, large-scale projects, combining cutting-edge imaging with complex bioimage informatics pipelines, are realized in the Drosophila research community. In this review, we discuss the current research in biological image analysis specifically relevant to the type of systems-level image datasets that are uniquely available for the Drosophila model system. We focus on how state-of-the-art computer vision algorithms are impacting the ability of Drosophila researchers to analyze biological systems in space and time. We pay particular attention to how these algorithmic advances from computer science are made usable to practicing biologists through open source platforms, and how biologists can themselves participate in their further development.
Affiliation(s)
- Florian Jug
- Max Planck Institute of Molecular Cell Biology and Genetics, 01307 Dresden, Germany
- Tobias Pietzsch
- Max Planck Institute of Molecular Cell Biology and Genetics, 01307 Dresden, Germany
- Stephan Preibisch
- Janelia Farm Research Campus, Howard Hughes Medical Institute, Ashburn, VA 20147, USA; Department of Anatomy and Structural Biology, Gruss Lipper Biophotonics Center, Albert Einstein College of Medicine, Bronx, NY 10461, USA
- Pavel Tomancak
- Max Planck Institute of Molecular Cell Biology and Genetics, 01307 Dresden, Germany
6
Zhang W, Feng D, Li R, Chernikov A, Chrisochoides N, Osgood C, Konikoff C, Newfeld S, Kumar S, Ji S. A mesh generation and machine learning framework for Drosophila gene expression pattern image analysis. BMC Bioinformatics 2013; 14:372. PMID: 24373308. PMCID: PMC3879658. DOI: 10.1186/1471-2105-14-372.
Abstract
BACKGROUND Multicellular organisms consist of cells of many different types that are established during development. Each type of cell is characterized by a unique combination of expressed gene products resulting from spatiotemporal gene regulation. Currently, a fundamental challenge in regulatory biology is to elucidate the gene expression controls that generate the complex body plans during development. Recent advances in high-throughput biotechnologies have generated spatiotemporal expression patterns for thousands of genes in the model organism fruit fly Drosophila melanogaster. Complementing existing qualitative methods with the quantitative computational tools we present in this paper offers a promising way to address key scientific questions. RESULTS We develop a set of computational methods and open source tools for identifying co-expressed embryonic domains and the associated genes simultaneously. To map the expression patterns of many genes into the same coordinate space and account for embryonic shape variations, we develop a mesh generation method that deforms a meshed generic ellipse to each individual embryo. We then develop a co-clustering formulation to cluster the genes and the mesh elements, thereby identifying co-expressed embryonic domains and the associated genes simultaneously. Experimental results indicate that the gene and mesh co-clusters can be correlated to key developmental events during the stages of embryogenesis we study. The open source software tool has been made available at http://compbio.cs.odu.edu/fly/. CONCLUSIONS Our mesh generation and machine learning methods and tools improve upon the flexibility, ease of use, and accuracy of existing methods.
Affiliation(s)
- Shuiwang Ji
- Department of Computer Science, Old Dominion University, Norfolk, VA 23529, USA
7
Sun Q, Muckatira S, Yuan L, Ji S, Newfeld S, Kumar S, Ye J. Image-level and group-level models for Drosophila gene expression pattern annotation. BMC Bioinformatics 2013; 14:350. PMID: 24299119. PMCID: PMC3924186. DOI: 10.1186/1471-2105-14-350.
Abstract
Background Drosophila melanogaster has been established as a model organism for investigating developmental gene interactions. The spatio-temporal gene expression patterns of Drosophila melanogaster can be visualized by in situ hybridization and documented as digital images. Automated and efficient tools for analyzing these expression images will provide biological insights into gene functions, interactions, and networks. To facilitate pattern recognition and comparison, many web-based resources have been created to conduct comparative analysis based on body part keywords and the associated images. With the fast accumulation of images from high-throughput techniques, manual inspection of images will impose a serious impediment on the pace of biological discovery. It is thus imperative to design an automated system for efficient image annotation and comparison. Results We present a computational framework to perform anatomical keyword annotation for Drosophila gene expression images. The spatial sparse coding approach is used to represent local patches of images, in comparison with the well-known bag-of-words (BoW) method. Three pooling functions, including max pooling, average pooling, and Sqrt (square root of mean squared statistics) pooling, are employed to transform the sparse codes into image features. Based on the constructed features, we develop both an image-level scheme and a group-level scheme to tackle the key challenges in annotating Drosophila gene expression pattern images automatically. To deal with the imbalanced data distribution inherent in image annotation tasks, the undersampling method is applied together with majority vote. Results on Drosophila embryonic expression pattern images verify the efficacy of our approach. Conclusion In our experiment, the three pooling functions perform comparably well in feature dimension reduction. Undersampling with majority vote is shown to be effective in tackling the problem of imbalanced data. Moreover, combining sparse coding with the image-level scheme leads to consistent performance improvement in keyword annotation.
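The three pooling functions named in the abstract are standard operations, and a minimal sketch shows how each turns a matrix of per-patch sparse codes into a single image feature vector. The `pool_codes` helper is illustrative (not from the paper), and taking absolute values inside max pooling is an assumption, since sparse codes may be signed.

```python
import numpy as np

def pool_codes(codes, method="max"):
    """Pool per-patch sparse codes into one feature vector.

    codes: array of shape (n_patches, n_atoms), one sparse code per patch.
    Returns a vector of length n_atoms.
    """
    if method == "max":
        # strongest activation per dictionary atom (abs is an assumption)
        return np.max(np.abs(codes), axis=0)
    if method == "average":
        # mean activation per atom
        return np.mean(codes, axis=0)
    if method == "sqrt":
        # square root of mean squared statistics per atom
        return np.sqrt(np.mean(codes ** 2, axis=0))
    raise ValueError(f"unknown pooling method: {method}")
```

All three reduce an (n_patches, n_atoms) matrix to an n_atoms vector, which is why the abstract reports them performing comparably for feature dimension reduction.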
Affiliation(s)
- Jieping Ye
- Center for Evolutionary Medicine and Informatics, The Biodesign Institute, Arizona State University, Tempe, AZ 85287, USA
8
Puniyani K, Xing EP. GINI: from ISH images to gene interaction networks. PLoS Comput Biol 2013; 9:e1003227. PMID: 24130465. PMCID: PMC3794902. DOI: 10.1371/journal.pcbi.1003227.
Abstract
Accurate inference of molecular and functional interactions among genes, especially in multicellular organisms such as Drosophila, often requires statistical analysis of correlations not only between the magnitudes of gene expressions, but also between their temporal-spatial patterns. The ISH (in situ hybridization)-based gene expression micro-imaging technology offers an effective approach to perform large-scale spatial-temporal profiling of whole-body mRNA abundance. However, analytical tools for discovering gene interactions from such data remain an open challenge for various reasons, including difficulties in extracting canonical representations of gene activities from images, and in inferring statistically meaningful networks from such representations. In this paper, we present GINI, a machine learning system for inferring gene interaction networks from Drosophila embryonic ISH images. GINI builds on a computer-vision-inspired vector-space representation of the spatial pattern of gene expression in ISH images, enabled by our recently developed system, and a new multi-instance-kernel algorithm that learns a sparse Markov network model in which every gene (i.e., node) in the network is represented by a vector-valued spatial pattern rather than a scalar-valued gene intensity, as in conventional approaches such as a Gaussian graphical model. By capturing the notion of spatial similarity of gene expression, and at the same time properly taking into account the presence of multiple images per gene via multi-instance kernels, GINI is well positioned to infer statistically sound and biologically meaningful gene interaction networks from image data. Using both synthetic data and a small manually curated data set, we demonstrate the effectiveness of our approach in network building.
Furthermore, we report results on a large publicly available collection of Drosophila embryonic ISH images from the Berkeley Drosophila Genome Project, where GINI makes novel and interesting predictions of gene interactions. Software for GINI is available at http://sailing.cs.cmu.edu/Drosophila_ISH_images/. As high-throughput technologies for molecular abundance profiling are becoming more inexpensive and accessible, computational inference of gene interaction networks from such data based on well-founded statistical principles is imperative to advance the understanding of regulatory mechanisms in various biological systems. Reverse engineering of gene networks has traditionally relied on analysis of whole-genome microarray data; here we present a new method, GINI, to infer gene networks from ISH images, thereby enabling exploration of spatial characteristics of gene expression for network inference. Our method generates a Markov network, which encapsulates globally meaningful statistical dependencies from vector-valued gene spatial patterns. In other words, we advance the state of the art both in the usage of richer forms of expression data and in the employment of principled statistical methodology for sound network inference on such a new form of data. Our results show that analyzing the spatial distribution of gene expression enables us to capture information not available from microarray data. Such an analysis is especially important for genes involved in the embryonic development of Drosophila, revealing the specific spatial patterning that determines the development of the 14 segments of the adult fly.
Affiliation(s)
- Kriti Puniyani
- School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
- Eric P. Xing
- School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
9
Ye J, Liu J. Sparse Methods for Biomedical Data. SIGKDD Explorations 2012; 14:4-15. PMID: 24076585. PMCID: PMC3783968. DOI: 10.1145/2408736.2408739.
Abstract
Following recent technological revolutions, the investigation of massive biomedical data of growing scale, diversity, and complexity has taken center stage in modern data analysis. Although complex, the underlying representations of many biomedical data are often sparse. For example, for a certain disease such as leukemia, even though humans have tens of thousands of genes, only a few genes are relevant to the disease; a gene network is sparse, since a regulatory pathway involves only a small number of genes; and many biomedical signals are sparse or compressible in the sense that they have concise representations when expressed in a proper basis. Therefore, finding sparse representations is fundamentally important for scientific discovery. Sparse methods based on the ℓ1 norm have attracted a great amount of research effort in the past decade due to their sparsity-inducing property, convenient convexity, and strong theoretical guarantees. They have achieved great success in various applications such as biomarker selection, biological network construction, and magnetic resonance imaging. In this paper, we review state-of-the-art sparse methods and their applications to biomedical data.
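As a concrete instance of an ℓ1-norm sparse method, the lasso can be solved by iterative soft-thresholding (ISTA). The sketch below is a generic textbook implementation, not code from the reviewed paper; the function names are illustrative.

```python
import numpy as np

def soft_threshold(z, t):
    # proximal operator of t * ||.||_1: shrinks each entry toward zero
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def ista_lasso(X, y, lam, n_iter=500):
    """Minimize 0.5 * ||Xw - y||^2 + lam * ||w||_1 via ISTA."""
    # Lipschitz constant of the gradient: squared spectral norm of X
    L = np.linalg.norm(X, 2) ** 2
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y)          # gradient of the smooth part
        w = soft_threshold(w - grad / L, lam / L)  # gradient step + shrinkage
    return w
```

Because the soft-threshold step sets small coefficients exactly to zero, the recovered weight vector is sparse, mirroring the biomarker-selection use case described in the abstract.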
Affiliation(s)
- Jieping Ye
- Arizona State University, Tempe, AZ 85287
- Jun Liu
- Siemens Corporate Research, Princeton, NJ 08540