Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Xu M, Li W, James GM, Mehan MR, Zhou XJ. Automated multidimensional phenotypic profiling using large public microarray repositories. Proc Natl Acad Sci U S A 2009;106:12323-8. [PMID: 19590007 DOI: 10.1073/pnas.0900883106] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Indexed: 11/18/2022]

Cited by Other Article(s)

Najafzadeh L, Mahmoudi M, Ebadi M, Dehghan Shasaltaneh M. Co-expression Network Analysis Reveals Key Genes Related to Ankylosing spondylitis Arthritis Disease: Computational and Experimental Validation. Iran J Biotechnol 2021;19:e2630. [PMID: 34179194 PMCID: PMC8217537 DOI: 10.30498/ijb.2021.2630] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 12/28/2022]

Abstract

BACKGROUND

Ankylosing spondylitis (AS) is a type of arthritis which can cause inflammation in the vertebrae and joints between the spine and pelvis. However, our understanding of the exact genetic mechanisms of AS is still far from being clear.

OBJECTIVE

To study and find the mechanisms and possible biomarkers related to AS by surveying inter-gene correlations of networks.

MATERIALS AND METHODS

A weighted gene co-expression network was constructed among genes identified by microarray analysis, gene co-expression network analysis, and network clustering. Receiver operating characteristic (ROC) curves were then used to identify a significant module containing genes implicated in AS pathogenesis. Real-time PCR was performed to validate the results of the microarray analysis.

RESULTS

In the significant module obtained from the network analysis there were eight AS-related genes (LSM3, MRPS11, NSMCE2, PSMA4, UBL5, RPL17, MRPL22 and RPS17) which have been reported in previous studies as hub genes. Further, in this module, eight significantly enriched pathways were found with adjusted p-values < 0.001: oxidative phosphorylation, ribosome, nonalcoholic fatty liver disease, Alzheimer's, Huntington's, and Parkinson's diseases, spliceosome, and cardiac muscle contraction, all of which have been linked to AS. Furthermore, we identified nine AS-related genes (UQCRB, UQCRH, UQCRHL, UQCRQ, COX7B, COX5B, COX6C, COX6A1 and COX7C) in these pathways which can play essential roles in controlling mitochondrial activity and the pathogenesis of autoimmune diseases. Real-time PCR results showed that three genes (UQCRH, MRPS11, and NSMCE2) were significantly differentially expressed in AS patients compared with normal controls.

CONCLUSIONS

The results of the present study may contribute to the understanding of the molecular pathogenesis of AS, thereby aiding early prognosis, diagnosis, and the development of effective therapies for the disease.

DeepPheno: Predicting single gene loss-of-function phenotypes using an ontology-aware hierarchical classifier. PLoS Comput Biol 2020;16:e1008453. [PMID: 33206638 PMCID: PMC7710064 DOI: 10.1371/journal.pcbi.1008453] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 07/16/2020] [Revised: 12/02/2020] [Accepted: 10/20/2020] [Indexed: 12/21/2022]

Abstract

Predicting the phenotypes resulting from molecular perturbations is one of the key challenges in genetics. Both forward and reverse genetic screens are employed to identify the molecular mechanisms underlying phenotypes and disease, and these have resulted in a large number of genotype–phenotype associations being available for humans and model organisms. Combined with recent advances in machine learning, it may now be possible to predict human phenotypes resulting from particular molecular aberrations. We developed DeepPheno, a neural network based hierarchical multi-class multi-label classification method for predicting the phenotypes resulting from loss-of-function in single genes. DeepPheno uses the functional annotations of gene products to predict the phenotypes resulting from a loss-of-function; additionally, we employ a two-step procedure in which we predict these functions first and then predict phenotypes. Prediction of phenotypes is ontology-based and we propose a novel ontology-based classifier suitable for very large hierarchical classification tasks. These methods allow us to predict phenotypes associated with any known protein-coding gene. We evaluate our approach using evaluation metrics established by the CAFA challenge and compare with top performing CAFA2 methods as well as several state-of-the-art phenotype prediction approaches, demonstrating the improvement of DeepPheno over established methods. Furthermore, we show that predictions generated by DeepPheno are applicable to predicting gene–disease associations based on comparing phenotypes, and that a large number of new predictions made by DeepPheno have recently been added to phenotype databases.

Gene–phenotype associations can help to understand the underlying mechanisms of many genetic diseases. However, experimental identification, often involving animal models, is time consuming and expensive. Computational methods that predict gene–phenotype associations can be used instead. We developed DeepPheno, a novel approach for predicting the phenotypes resulting from a loss of function of a single gene. We use gene functions and gene expression as information to predict phenotypes. Our method uses a neural network classifier that is able to account for hierarchical dependencies between phenotypes. We extensively evaluate our method and compare it with related approaches, and we show that DeepPheno results in better performance in several evaluations. Furthermore, we found that many of the new predictions made by our method have been added to phenotype association databases released one year later. Overall, DeepPheno simulates some aspects of human physiology and how molecular and physiological alterations lead to abnormal phenotypes.

Hu B, Ruan Y, Wei F, Qin G, Mo X, Wang X, Zou D. Identification of three glioblastoma subtypes and a six-gene prognostic risk index based on the expression of growth factors and cytokines. Am J Transl Res 2020;12:4669-4682. [PMID: 32913540 PMCID: PMC7476164] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/23/2020] [Accepted: 07/22/2020] [Indexed: 06/11/2023]

Baez-Ortega A, Gori K. Computational approaches for discovery of mutational signatures in cancer. Brief Bioinform 2019;20:77-88. [PMID: 28968631 DOI: 10.1093/bib/bbx082] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 04/27/2017] [Indexed: 01/07/2023]

Che C, Lin R, Zeng X, Elmaaroufi K, Galeotti J, Xu M. Improved deep learning-based macromolecules structure classification from electron cryo-tomograms. Mach Vis Appl 2018;29:1227-1236. [PMID: 31511756 PMCID: PMC6738941 DOI: 10.1007/s00138-018-0949-4] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 07/14/2017] [Revised: 01/16/2018] [Accepted: 05/18/2018] [Indexed: 05/30/2023]

Kim M, Tagkopoulos I. Data integration and predictive modeling methods for multi-omics datasets. Mol Omics 2018;14:8-25. [DOI: 10.1039/c7mo00051k] [Citation(s) in RCA: 56] [Impact Index Per Article: 9.3] [Indexed: 12/24/2022]

Li YE, Xiao M, Shi B, Yang YCT, Wang D, Wang F, Marcia M, Lu ZJ. Identification of high-confidence RNA regulatory elements by combinatorial classification of RNA-protein binding sites. Genome Biol 2017;18:169. [PMID: 28886744 PMCID: PMC5591525 DOI: 10.1186/s13059-017-1298-8] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.6] [Received: 06/16/2017] [Accepted: 08/14/2017] [Indexed: 12/20/2022]

Gandy LM, Gumm J, Fertig B, Thessen A, Kennish MJ, Chavan S, Marchionni L, Xia X, Shankrit S, Fertig EJ. Synthesizer: Expediting synthesis studies from context-free data with information retrieval techniques. PLoS One 2017;12:e0175860. [PMID: 28437440 PMCID: PMC5402950 DOI: 10.1371/journal.pone.0175860] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2016] [Accepted: 03/31/2017] [Indexed: 11/18/2022]

Ji Z, Vokes SA, Dang CV, Ji H. Turning publicly available gene expression data into discoveries using gene set context analysis. Nucleic Acids Res 2015;44:e8. [PMID: 26350211 PMCID: PMC4705686 DOI: 10.1093/nar/gkv873] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 02/23/2015] [Accepted: 08/20/2015] [Indexed: 12/17/2022]

Kim M, Zorraquino V, Tagkopoulos I. Microbial forensics: predicting phenotypic characteristics and environmental conditions from large-scale gene expression profiles. PLoS Comput Biol 2015;11:e1004127. [PMID: 25774498 PMCID: PMC4361189 DOI: 10.1371/journal.pcbi.1004127] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 05/26/2014] [Accepted: 01/14/2015] [Indexed: 01/13/2023]

Abstract

A tantalizing question in cellular physiology is whether the cellular state and environmental conditions can be inferred by the expression signature of an organism. To investigate this relationship, we created an extensive normalized gene expression compendium for the bacterium Escherichia coli that was further enriched with meta-information through an iterative learning procedure. We then constructed an ensemble method to predict environmental and cellular state, including strain, growth phase, medium, oxygen level, antibiotic and carbon source presence. Results show that gene expression is an excellent predictor of environmental structure, with multi-class ensemble models achieving balanced accuracy between 70.0% (±3.5%) and 98.3% (±2.3%) for the various characteristics. Interestingly, this performance can be significantly boosted when environmental and strain characteristics are simultaneously considered, as a composite classifier that captures the inter-dependencies of three characteristics (medium, phase and strain) achieved 10.6% (±1.0%) higher performance than any individual model. Contrary to expectations, only 59% of the top informative genes were also identified as differentially expressed under the respective conditions. Functional analysis of the respective genetic signatures implicates a wide spectrum of Gene Ontology terms and KEGG pathways with condition-specific information content, including iron transport, transferases, and enterobactin synthesis. Further experimental phenotypic-to-genotypic mapping that we conducted for knock-out mutants argues for the information content of top-ranked genes. This work demonstrates the degree to which genome-scale transcriptional information can be predictive of latent, heterogeneous and seemingly disparate phenotypic and environmental characteristics, with far-reaching applications.

The transcriptional profile of an organism contains clues about the environmental context in which it has evolved and currently lives, its behavior and cellular state. It is yet unclear, however, how much information can be efficiently extracted and how it can be used to classify new samples with respect to their environmental and genetic characteristics. Here, we have constructed an extensive transcriptome compendium of Escherichia coli that we have further enriched via an iterative learning approach. We then apply an ensemble of various machine learning algorithms to infer environmental and cellular information such as strain, growth phase, medium, oxygen level, antibiotic and carbon source. Functional analysis of the most informative genes provides mechanistic insights and palpable hypotheses regarding their role in each environmental or genetic context. Our work argues that genome-scale gene expression can be a multi-purpose marker for identifying latent, heterogeneous cellular and environmental states and that optimal classification can be achieved with a feature set of a couple hundred genes that might not necessarily have the most pronounced differential expression in the respective conditions.
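The abstract above reports classifier performance as balanced accuracy. As a minimal sketch, assuming the common multi-class definition (the unweighted mean of per-class recall), the metric can be computed as follows. The function name and the growth-phase labels are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recall over the classes present in y_true."""
    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)


# Hypothetical growth-phase labels, for illustration only.
y_true = ["log", "log", "log", "log", "stationary", "stationary"]
y_pred = ["log", "log", "log", "stationary", "stationary", "stationary"]
score = balanced_accuracy(y_true, y_pred)
```

Unlike plain accuracy, this measure gives each class equal weight, which matters when some strains or media are far more common than others in a compendium.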

Johnson MD, Bell J, Clarke K, Chandler R, Pathak P, Xia Y, Marshall RL, Weinstock GM, Loman NJ, Winn PJ, Lund PA. Characterization of mutations in the PAS domain of the EvgS sensor kinase selected by laboratory evolution for acid resistance in Escherichia coli. Mol Microbiol 2014;93:911-27. [PMID: 24995530 PMCID: PMC4283999 DOI: 10.1111/mmi.12704] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.6] [Accepted: 07/02/2014] [Indexed: 01/25/2023]

Giannopoulou EG, Elemento O. Inferring chromatin-bound protein complexes from genome-wide binding assays. Genome Res 2013;23:1295-306. [PMID: 23554462 PMCID: PMC3730103 DOI: 10.1101/gr.149419.112] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.7] [Indexed: 01/17/2023]

Oldham MC, Langfelder P, Horvath S. Network methods for describing sample relationships in genomic datasets: application to Huntington's disease. BMC Syst Biol 2012;6:63. [PMID: 22691535 PMCID: PMC3441531 DOI: 10.1186/1752-0509-6-63] [Citation(s) in RCA: 99] [Impact Index Per Article: 8.3] [Received: 03/01/2012] [Accepted: 05/03/2012] [Indexed: 01/08/2023]

Abstract

BACKGROUND

Genomic datasets generated by new technologies are increasingly prevalent in disparate areas of biological research. While many studies have sought to characterize relationships among genomic features, commensurate efforts to characterize relationships among biological samples have been less common. Consequently, the full extent of sample variation in genomic studies is often under-appreciated, complicating downstream analytical tasks such as gene co-expression network analysis.

RESULTS

Here we demonstrate the use of network methods for characterizing sample relationships in microarray data generated from human brain tissue. We describe an approach for identifying outlying samples that does not depend on the choice or use of clustering algorithms. We introduce a battery of measures for quantifying the consistency and integrity of sample relationships, which can be compared across disparate studies, technology platforms, and biological systems. Among these measures, we provide evidence that the correlation between the connectivity and the clustering coefficient (two important network concepts) is a sensitive indicator of homogeneity among biological samples. We also show that this measure, which we refer to as cor(K,C), can distinguish biologically meaningful relationships among subgroups of samples. Specifically, we find that cor(K,C) reveals the profound effect of Huntington's disease on samples from the caudate nucleus relative to other brain regions. Furthermore, we find that this effect is concentrated in specific modules of genes that are naturally co-expressed in human caudate nucleus, highlighting a new strategy for exploring the effects of disease on sets of genes.

CONCLUSIONS

These results underscore the importance of systematically exploring sample relationships in large genomic datasets before seeking to analyze genomic feature activity. We introduce a standardized platform for this purpose using freely available R software that has been designed to enable iterative and interactive exploration of sample networks.
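The cor(K,C) measure described in the abstract above can be illustrated with a short sketch. It assumes a weighted sample network with a signed adjacency of the form ((1 + cor) / 2) ** beta and the weighted clustering coefficient commonly used for such networks; the function names, the choice beta = 2, and the random expression matrix are illustrative assumptions, not the authors' R implementation.

```python
import numpy as np


def connectivity_and_clustering(adj):
    """Return connectivity K and weighted clustering coefficient C per node."""
    a = np.array(adj, dtype=float)  # copy, so the caller's matrix is untouched
    np.fill_diagonal(a, 0.0)        # ignore self-connections
    k = a.sum(axis=1)               # K_i = sum_j a_ij
    # Numerator: sum over j, k of a_ij * a_jk * a_ki, i.e. diag(a @ a @ a).
    num = np.einsum("ij,jk,ki->i", a, a, a)
    # Denominator: (sum_j a_ij)^2 - sum_j a_ij^2.
    den = k ** 2 - (a ** 2).sum(axis=1)
    c = np.divide(num, den, out=np.zeros_like(num), where=den > 0)
    return k, c


def cor_k_c(adj):
    """Pearson correlation between connectivity and clustering coefficient."""
    k, c = connectivity_and_clustering(adj)
    return float(np.corrcoef(k, c)[0, 1])


# Hypothetical data: 6 samples x 50 genes of random expression values.
rng = np.random.default_rng(0)
expr = rng.normal(size=(6, 50))
sample_cor = np.corrcoef(expr)             # sample-by-sample correlation
adjacency = ((1.0 + sample_cor) / 2) ** 2  # assumed signed adjacency, beta = 2
value = cor_k_c(adjacency)
```

On real data the input would be a samples-by-genes expression matrix from one tissue or study, and the resulting value would be compared across subgroups of samples rather than interpreted in isolation.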

Drummond RSM, Sheehan H, Simons JL, Martínez-Sánchez NM, Turner RM, Putterill J, Snowden KC. The Expression of Petunia Strigolactone Pathway Genes is Altered as Part of the Endogenous Developmental Program. Front Plant Sci 2012;2:115. [PMID: 22645562 PMCID: PMC3355783 DOI: 10.3389/fpls.2011.00115] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Received: 07/08/2011] [Accepted: 12/26/2011] [Indexed: 05/18/2023]

Jorgensen RA, Dorantes-Acosta AE. Conserved Peptide Upstream Open Reading Frames are Associated with Regulatory Genes in Angiosperms. Front Plant Sci 2012;3:191. [PMID: 22936940 PMCID: PMC3426882 DOI: 10.3389/fpls.2012.00191] [Citation(s) in RCA: 50] [Impact Index Per Article: 4.2] [Received: 06/09/2012] [Accepted: 08/04/2012] [Indexed: 05/20/2023]

Li W, Liu CC, Zhang T, Li H, Waterman MS, Zhou XJ. Integrative analysis of many weighted co-expression networks using tensor computation. PLoS Comput Biol 2011;7:e1001106. [PMID: 21698123 PMCID: PMC3116899 DOI: 10.1371/journal.pcbi.1001106] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.0] [Received: 07/16/2010] [Accepted: 02/08/2011] [Indexed: 11/18/2022]

Abstract

The rapid accumulation of biological networks poses new challenges and calls for powerful integrative analysis tools. Most existing methods capable of simultaneously analyzing a large number of networks were primarily designed for unweighted networks, and cannot easily be extended to weighted networks. However, it is known that transforming weighted into unweighted networks by dichotomizing the edges of weighted networks with a threshold generally leads to information loss. We have developed a novel, tensor-based computational framework for mining recurrent heavy subgraphs in a large set of massive weighted networks. Specifically, we formulate the recurrent heavy subgraph identification problem as a heavy 3D subtensor discovery problem with sparse constraints. We describe an effective approach to solving this problem by designing a multi-stage, convex relaxation protocol, and a non-uniform edge sampling technique. We applied our method to 130 co-expression networks, and identified 11,394 recurrent heavy subgraphs, grouped into 2,810 families. We demonstrated that the identified subgraphs represent meaningful biological modules by validating against a large set of compiled biological knowledge bases. We also showed that the likelihood for a heavy subgraph to be meaningful increases significantly with its recurrence in multiple networks, highlighting the importance of the integrative approach to biological network analysis. Moreover, our approach based on weighted graphs detects many patterns that would be overlooked using unweighted graphs. In addition, we identified a large number of modules that occur predominately under specific phenotypes. This analysis resulted in a genome-wide mapping of gene network modules onto the phenome. Finally, by comparing module activities across many datasets, we discovered high-order dynamic cooperativeness in protein complex networks and transcriptional regulatory networks.

Congdon E, Poldrack RA, Freimer NB. Neurocognitive phenotypes and genetic dissection of disorders of brain and behavior. Neuron 2010;68:218-30. [PMID: 20955930 DOI: 10.1016/j.neuron.2010.10.007] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Accepted: 10/05/2010] [Indexed: 01/10/2023]

Gaujoux R, Seoighe C. A flexible R package for nonnegative matrix factorization. BMC Bioinformatics 2010;11:367. [PMID: 20598126 PMCID: PMC2912887 DOI: 10.1186/1471-2105-11-367] [Citation(s) in RCA: 824] [Impact Index Per Article: 58.9] [Received: 11/16/2009] [Accepted: 07/02/2010] [Indexed: 11/23/2022]

Eisenstein M. Reading between the lines. Nat Methods 2009. [DOI: 10.1038/nmeth0909-632] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]