1
|
Alghamdi SM, Schofield PN, Hoehndorf R. How much do model organism phenotypes contribute to the computational identification of human disease genes? Dis Model Mech 2022; 15:275986. [PMID: 35758016 PMCID: PMC9366895 DOI: 10.1242/dmm.049441] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/24/2021] [Accepted: 06/13/2022] [Indexed: 12/04/2022]
Abstract
Computing phenotypic similarity helps identify new disease genes and diagnose rare diseases. Genotype–phenotype data from orthologous genes in model organisms can compensate for lack of human data and increase genome coverage. In the past decade, cross-species phenotype comparisons have proven valuable, and several ontologies have been developed for this purpose. The relative contribution of different model organisms to computational identification of disease-associated genes has not been fully explored. We used phenotype ontologies to semantically relate phenotypes resulting from loss-of-function mutations in model organisms to disease-associated phenotypes in humans. Semantic machine learning methods were used to measure the contribution of different model organisms to the identification of known human gene–disease associations. We found that mouse genotype–phenotype data provided the most important dataset in the identification of human disease genes by semantic similarity and machine learning over phenotype ontologies. Other model organisms' data did not improve identification over that obtained using the mouse alone, and therefore did not contribute significantly to this task. Our work has implications for the development of integrated phenotype ontologies, as well as for the use of model organism phenotypes in human genetic variant interpretation. This article has an associated First Person interview with the first author of the paper. Editor's choice: We investigated the use of model organism phenotypes in the computational identification of disease genes, identifying several data biases and concluding that mouse model phenotypes contribute most to computational disease gene identification.
Affiliation(s)
- Sarah M Alghamdi
  - Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology, 4700 KAUST, 23955 Thuwal, Saudi Arabia
- Paul N Schofield
  - Department of Physiology, Development & Neuroscience, University of Cambridge, Downing Street, CB2 3EG, Cambridge, UK
- Robert Hoehndorf
  - Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology, 4700 KAUST, 23955 Thuwal, Saudi Arabia
|
2
|
Canzoneri R, Naipauer J, Stedile M, Rodriguez Peña A, Lacunza E, Gandini NA, Curino AC, Facchinetti MM, Coso OA, Kordon E, Abba MC. Identification of an AP1-ZFP36 Regulatory Network Associated with Breast Cancer Prognosis. J Mammary Gland Biol Neoplasia 2020; 25:163-172. [PMID: 32248342 DOI: 10.1007/s10911-020-09448-1] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 11/08/2019] [Accepted: 03/24/2020] [Indexed: 01/05/2023]
Abstract
It has been established that ZFP36 (also known as Tristetraprolin or TTP) promotes mRNA degradation of proteins involved in inflammation, proliferation and tumor invasiveness. In mammary epithelial cells, ZFP36 expression is induced by STAT5 activation during lactogenesis, while in breast cancer ZFP36 expression is associated with lower grade and better prognosis. Here, we show that the AP-1 transcription factor components, i.e. JUN, JUNB, FOS, FOSB, in addition to DUSP1, EGR1, NR4A1, IER2 and BTG2, behave as a conserved co-regulated group of genes whose expression is associated with ZFP36 in cancer cells. In fact, a significant down-modulation of this gene network is observed in breast, liver, lung, kidney, and thyroid carcinomas compared to their normal counterparts. In breast cancer, the normal-like and Luminal A subtypes show the highest expression of the ZFP36 gene network among the intrinsic subtypes, and patients with low expression of these genes display poor prognosis. It is also proposed that AP-1 regulates ZFP36 expression through responsive elements detected in the promoter region of this gene. Culture assays show that AP-1 activity induces ZFP36 expression in mammary cells in response to prolactin (PRL) treatment through ERK1/2 activation. These results suggest that JUN, JUNB, FOS and FOSB are not only co-expressed, but would also play a relevant role in regulating ZFP36 expression in mammary epithelial cells.
Affiliation(s)
- R Canzoneri
  - Centro de Investigaciones Inmunológicas Básicas y Aplicadas, CINIBA, Facultad de Ciencias Médicas, Universidad Nacional de La Plata, La Plata, Argentina
- J Naipauer
  - Laboratorio de Expresión Génica en Mama y Apoptosis, LEGMA, IFIBYNE-CONICET, Departamento de Química Biológica, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Buenos Aires, Argentina
- M Stedile
  - Laboratorio de Expresión Génica en Mama y Apoptosis, LEGMA, IFIBYNE-CONICET, Departamento de Química Biológica, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Buenos Aires, Argentina
- A Rodriguez Peña
  - Centro de Investigaciones Inmunológicas Básicas y Aplicadas, CINIBA, Facultad de Ciencias Médicas, Universidad Nacional de La Plata, La Plata, Argentina
- E Lacunza
  - Centro de Investigaciones Inmunológicas Básicas y Aplicadas, CINIBA, Facultad de Ciencias Médicas, Universidad Nacional de La Plata, La Plata, Argentina
- N A Gandini
  - Laboratorio de Biología del Cáncer, INIBIBB, Universidad Nacional del Sur - CONICET, Bahía Blanca, Argentina
- A C Curino
  - Laboratorio de Biología del Cáncer, INIBIBB, Universidad Nacional del Sur - CONICET, Bahía Blanca, Argentina
- M M Facchinetti
  - Laboratorio de Biología del Cáncer, INIBIBB, Universidad Nacional del Sur - CONICET, Bahía Blanca, Argentina
- O A Coso
  - Laboratorio de Expresión Génica en Mama y Apoptosis, LEGMA, IFIBYNE-CONICET, Departamento de Química Biológica, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Buenos Aires, Argentina
- E Kordon
  - Laboratorio de Expresión Génica en Mama y Apoptosis, LEGMA, IFIBYNE-CONICET, Departamento de Química Biológica, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Buenos Aires, Argentina
- M C Abba
  - Centro de Investigaciones Inmunológicas Básicas y Aplicadas, CINIBA, Facultad de Ciencias Médicas, Universidad Nacional de La Plata, La Plata, Argentina
|
3
|
Abstract
Background: Since the invention of next-generation RNA sequencing (RNA-seq) technologies, they have become a powerful tool to study the presence and quantity of RNA molecules in biological samples and have revolutionized transcriptomic studies. The analysis of RNA-seq data at four different levels (samples, genes, transcripts, and exons) involves multiple statistical and computational questions, some of which remain challenging to date. Results: We review RNA-seq analysis tools at the sample, gene, transcript, and exon levels from a statistical perspective. We also highlight the biological and statistical questions of greatest practical relevance. Conclusions: The development of statistical and computational methods for analyzing RNA-seq data has made significant advances in the past decade. However, methods developed to answer the same biological question often rely on diverse statistical models and exhibit different performance under different scenarios. This review discusses and compares multiple commonly used statistical models regarding their assumptions, in the hope of helping users select appropriate methods as needed, as well as assisting developers in future method development.
|
4
|
|
5
|
Obayashi T, Aoki Y, Tadaka S, Kagaya Y, Kinoshita K. ATTED-II in 2018: A Plant Coexpression Database Based on Investigation of the Statistical Property of the Mutual Rank Index. Plant Cell Physiol 2018; 59:e3. [PMID: 29216398 PMCID: PMC5914358 DOI: 10.1093/pcp/pcx191] [Citation(s) in RCA: 135] [Impact Index Per Article: 22.5] [Received: 09/12/2017] [Accepted: 11/25/2017] [Indexed: 05/17/2023]
Abstract
ATTED-II (http://atted.jp) is a coexpression database for plant species to aid in the discovery of relationships of unknown genes within a species. As an advanced coexpression analysis method, multispecies comparisons have the potential to detect alterations in gene relationships within an evolutionary context. However, determining the validity of comparative coexpression studies is difficult without quantitative assessments of the quality of coexpression data. ATTED-II (version 9) provides 16 coexpression platforms for nine plant species, including seven species supported by both microarray- and RNA sequencing (RNAseq)-based coexpression data. Two independent sources of coexpression data enable the assessment of the reproducibility of coexpression. The latest coexpression data for Arabidopsis (Ath-m.c7-1 and Ath-r.c3-0) showed the highest reproducibility (Jaccard coefficient = 0.13) among previous coexpression data in ATTED-II. We also investigated the statistical basis of the mutual rank (MR) index as a coexpression measure by bootstrap sampling of experimental units. We found that the error distribution of the logit-transformed MR index showed normality with equal variances for each coexpression platform. Because the MR error was strongly correlated with the number of samples for the coexpression data, typical confidence intervals for the MR index can be estimated for any coexpression platform. These new, high-quality coexpression data can be analyzed with any tool in ATTED-II and combined with external resources to obtain insight into plant biology.
Affiliation(s)
- Takeshi Obayashi
  - Graduate School of Information Sciences, Tohoku University, 6-3-09, Aramaki-Aza-Aoba, Aoba-ku, Sendai, 980-8679 Japan
  - Corresponding author: E-mail, ; Fax, +81-22-795-7179
- Yuichi Aoki
  - Graduate School of Information Sciences, Tohoku University, 6-3-09, Aramaki-Aza-Aoba, Aoba-ku, Sendai, 980-8679 Japan
  - Tohoku Medical Megabank Organization, Tohoku University, Sendai, 980-8573 Japan
  - Graduate School of Medicine, Tohoku University, Sendai, 980-8573 Japan
- Shu Tadaka
  - Tohoku Medical Megabank Organization, Tohoku University, Sendai, 980-8573 Japan
  - Graduate School of Medicine, Tohoku University, Sendai, 980-8573 Japan
- Yuki Kagaya
  - Graduate School of Information Sciences, Tohoku University, 6-3-09, Aramaki-Aza-Aoba, Aoba-ku, Sendai, 980-8679 Japan
- Kengo Kinoshita
  - Graduate School of Information Sciences, Tohoku University, 6-3-09, Aramaki-Aza-Aoba, Aoba-ku, Sendai, 980-8679 Japan
  - Tohoku Medical Megabank Organization, Tohoku University, Sendai, 980-8573 Japan
  - Institute of Development, Aging, and Cancer, Tohoku University, Sendai, 980-8575 Japan
|
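As a rough illustration of the two quantities named in the ATTED-II 2018 abstract, the sketch below computes a mutual rank (MR) value, taken here as the geometric mean of the two directional correlation ranks (the usual definition of the index), and a Jaccard coefficient between two coexpressed-gene lists. The toy expression values, the gene names, and the helper functions are hypothetical and are not taken from ATTED-II.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def rank_of(target, query, expr):
    """Rank of `target` (1 = most correlated) among all genes except `query`."""
    others = [g for g in expr if g != query]
    ordered = sorted(others, key=lambda g: pearson(expr[query], expr[g]), reverse=True)
    return ordered.index(target) + 1


def mutual_rank(a, b, expr):
    """Geometric mean of the rank of b seen from a and the rank of a seen from b."""
    return math.sqrt(rank_of(b, a, expr) * rank_of(a, b, expr))


def jaccard(list_a, list_b):
    """Jaccard coefficient: shared items divided by all distinct items."""
    sa, sb = set(list_a), set(list_b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


# Hypothetical toy expression profiles (four samples per gene).
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [1.1, 2.1, 2.9, 4.2],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 3.0, 2.0, 4.0],
}

if __name__ == "__main__":
    print("MR(geneA, geneB) =", mutual_rank("geneA", "geneB", expr))
    print("MR(geneA, geneC) =", mutual_rank("geneA", "geneC", expr))
    # Reproducibility between two platforms, sketched as list overlap.
    print("Jaccard =", jaccard(["g1", "g2", "g3"], ["g2", "g3", "g4"]))
```

A lower MR means a stronger mutual coexpression relationship. Comparing the top-ranked gene lists from two independent platforms with a Jaccard coefficient is one simple way to express the reproducibility the abstract reports.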
6
|
Aoki Y, Okamura Y, Tadaka S, Kinoshita K, Obayashi T. ATTED-II in 2016: A Plant Coexpression Database Towards Lineage-Specific Coexpression. Plant Cell Physiol 2016; 57:e5. [PMID: 26546318 PMCID: PMC4722172 DOI: 10.1093/pcp/pcv165] [Citation(s) in RCA: 68] [Impact Index Per Article: 8.5] [Received: 09/04/2015] [Accepted: 10/20/2015] [Indexed: 05/17/2023]
Abstract
ATTED-II (http://atted.jp) is a coexpression database for plant species with parallel views of multiple coexpression data sets and network analysis tools. The user can efficiently find functional gene relationships and design experiments to identify gene functions by reverse genetics and general molecular biology techniques. Here, we report updates to ATTED-II (version 8.0), including new and updated coexpression data and analysis tools. ATTED-II now includes eight microarray- and six RNA sequencing-based coexpression data sets for seven dicot species (Arabidopsis, field mustard, soybean, barrel medick, poplar, tomato and grape) and two monocot species (rice and maize). Stand-alone coexpression analyses tend to have low reliability. Therefore, examining evolutionarily conserved coexpression is a more effective approach from the viewpoints of reliability and evolutionary importance. In contrast, the reliability of species-specific coexpression data remains poor. Our assessment scores for individual coexpression data sets indicated that the quality of the new coexpression data sets in ATTED-II is higher than for any previous coexpression data set. In addition, five species (Arabidopsis, soybean, tomato, rice and maize) in ATTED-II are now supported by both microarray- and RNA sequencing-based coexpression data, which has increased the reliability. Consequently, ATTED-II can now provide lineage-specific coexpression information. As an example of the use of ATTED-II to explore lineage-specific coexpression, we demonstrate monocot- and dicot-specific coexpression of cell wall genes. With the expanded coexpression data for multilevel evaluation, ATTED-II provides new opportunities to investigate lineage-specific evolution in plants.
Affiliation(s)
- Yuichi Aoki
  - Graduate School of Information Sciences, Tohoku University, 6-3-09, Aramaki-Aza-Aoba, Aoba-ku, Sendai, 980-8679 Japan
  - Core Research for Evolutional Science and Technology (CREST), Japan Science and Technology Agency, Kawaguchi, Saitama, Japan
- Yasunobu Okamura
  - Graduate School of Information Sciences, Tohoku University, 6-3-09, Aramaki-Aza-Aoba, Aoba-ku, Sendai, 980-8679 Japan
- Shu Tadaka
  - Graduate School of Information Sciences, Tohoku University, 6-3-09, Aramaki-Aza-Aoba, Aoba-ku, Sendai, 980-8679 Japan
- Kengo Kinoshita
  - Graduate School of Information Sciences, Tohoku University, 6-3-09, Aramaki-Aza-Aoba, Aoba-ku, Sendai, 980-8679 Japan
  - Institute of Development, Aging, and Cancer, Tohoku University, Sendai, 980-8575 Japan
  - Tohoku Medical Megabank Organization, Tohoku University, Sendai, 980-8573 Japan
- Takeshi Obayashi
  - Graduate School of Information Sciences, Tohoku University, 6-3-09, Aramaki-Aza-Aoba, Aoba-ku, Sendai, 980-8679 Japan
  - Core Research for Evolutional Science and Technology (CREST), Japan Science and Technology Agency, Kawaguchi, Saitama, Japan
|
7
|
Aldaz CM, Ferguson BW, Abba MC. WWOX at the crossroads of cancer, metabolic syndrome related traits and CNS pathologies. Biochim Biophys Acta Rev Cancer 2014; 1846:188-200. [PMID: 24932569 DOI: 10.1016/j.bbcan.2014.06.001] [Citation(s) in RCA: 65] [Impact Index Per Article: 6.5] [Received: 05/06/2014] [Revised: 06/04/2014] [Accepted: 06/05/2014] [Indexed: 01/17/2023]
Abstract
WWOX was cloned as a putative tumor suppressor gene mapping to chromosomal fragile site FRA16D. Deletions affecting WWOX accompanied by loss of expression are frequent in various epithelial cancers. Translocations and deletions affecting WWOX are also common in multiple myeloma and are associated with worse prognosis. Meta-analysis of gene expression datasets demonstrates that low WWOX expression is significantly associated with shorter relapse-free survival in ovarian and breast cancer patients. Although somatic mutations affecting WWOX are not frequent, analysis of TCGA tumor datasets led to identifying 44 novel mutations in various tumor types. The highest frequencies of mutations were found in head and neck cancers and uterine and gastric adenocarcinomas. Mouse models of gene ablation led us to conclude that Wwox does not behave as a highly penetrant, classical tumor suppressor gene, since its deletion is not tumorigenic in most models and its role is more likely to be of relevance in tumor progression than in initiation. Analysis of signaling pathways associated with WWOX expression confirmed previous in vivo and in vitro observations linking WWOX function with the TGFβ/SMAD and WNT signaling pathways and with specific metabolic processes. Supporting these conclusions, we recently demonstrated that WWOX behaves as a modulator of TGFβ/SMAD signaling by binding and sequestering SMAD3 in the cytoplasmic compartment. As a consequence, progressive loss of WWOX expression in advanced breast cancer would contribute to the pro-metastatic effects resulting from TGFβ/SMAD3 hyperactive signaling. Recently, GWAS and resequencing studies have linked the WWOX locus with familial dyslipidemias and metabolic syndrome related traits. Indeed, gene expression studies in liver conditional KO mice confirmed an association between WWOX expression and lipid metabolism. Finally, very recently the first human pedigrees with probands carrying homozygous germline loss-of-function WWOX mutations have been identified. These patients are characterized by severe CNS related pathology that includes epilepsy, ataxia and mental retardation. In summary, WWOX is a highly conserved and tightly regulated gene throughout evolution, and when it is defective or deregulated the consequences are important and deleterious, as demonstrated by its association not only with poor prognosis in cancer but also with other important human pathologies such as metabolic syndrome and CNS related pathologic conditions.
Affiliation(s)
- C Marcelo Aldaz
  - Department of Molecular Carcinogenesis, Science Park, The University of Texas M.D. Anderson Cancer Center, Smithville, TX 78957, USA
- Brent W Ferguson
  - Department of Molecular Carcinogenesis, Science Park, The University of Texas M.D. Anderson Cancer Center, Smithville, TX 78957, USA
- Martin C Abba
  - CINIBA, Facultad de Medicina, Universidad Nacional de La Plata, La Plata, Argentina
|
8
|
Obayashi T, Okamura Y, Ito S, Tadaka S, Aoki Y, Shirota M, Kinoshita K. ATTED-II in 2014: evaluation of gene coexpression in agriculturally important plants. Plant Cell Physiol 2014; 55:e6. [PMID: 24334350 PMCID: PMC3894708 DOI: 10.1093/pcp/pct178] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.0] [Indexed: 05/17/2023]
Abstract
ATTED-II (http://atted.jp) is a database of coexpressed genes that was originally developed to identify functionally related genes in Arabidopsis and rice. Herein, we describe an updated version of ATTED-II, which expands this resource to include additional agriculturally important plants. To improve the quality of the coexpression data for Arabidopsis and rice, we included more gene expression data from microarray and RNA sequencing studies. The RNA sequencing-based coexpression data now cover 94% of the Arabidopsis protein-encoding genes, representing a substantial increase from previously available microarray-based coexpression data (76% coverage). We also generated coexpression data for four dicots (soybean, poplar, grape and alfalfa) and one monocot (maize). As both the quantity and quality of expression data for the non-model species are generally poorer than for the model species, we verified coexpression data associated with these new species using multiple methods. First, the overall performance of the coexpression data was evaluated using gene ontology annotations and the coincidence of a genomic feature. Secondly, the reliability of each guide gene was determined by comparing coexpressed gene lists between platforms. With the expanded and newly evaluated coexpression data, ATTED-II represents an important resource for identifying functionally related genes in agriculturally important plants.
Affiliation(s)
- Takeshi Obayashi
  - Graduate School of Information Sciences, Tohoku University, 6-3-09, Aramaki-Aza-Aoba, Aoba-ku, Sendai, 980-8679 Japan
  - Core Research for Evolutional Science and Technology (CREST), Japan Science and Technology Agency, Kawaguchi, Saitama, Japan
  - *Corresponding author: E-mail, ; Fax, +81-22-795-7179
- Yasunobu Okamura
  - Graduate School of Information Sciences, Tohoku University, 6-3-09, Aramaki-Aza-Aoba, Aoba-ku, Sendai, 980-8679 Japan
- Satoshi Ito
  - Graduate School of Information Sciences, Tohoku University, 6-3-09, Aramaki-Aza-Aoba, Aoba-ku, Sendai, 980-8679 Japan
- Shu Tadaka
  - Graduate School of Information Sciences, Tohoku University, 6-3-09, Aramaki-Aza-Aoba, Aoba-ku, Sendai, 980-8679 Japan
- Yuichi Aoki
  - Graduate School of Engineering, Tohoku University, 6-6-04, Aramaki-Aza-Aoba, Aoba-ku, Sendai, 980-8579 Japan
- Matsuyuki Shirota
  - Graduate School of Information Sciences, Tohoku University, 6-3-09, Aramaki-Aza-Aoba, Aoba-ku, Sendai, 980-8679 Japan
- Kengo Kinoshita
  - Graduate School of Information Sciences, Tohoku University, 6-3-09, Aramaki-Aza-Aoba, Aoba-ku, Sendai, 980-8679 Japan
  - Institute of Development, Aging, and Cancer, Tohoku University, Sendai, 980-8575 Japan
  - Tohoku Medical Megabank Organization, Tohoku University, Sendai, 980-8573 Japan
|
9
|
Molineris I, Ala U, Provero P, Di Cunto F. Drug repositioning for orphan genetic diseases through Conserved Anticoexpressed Gene Clusters (CAGCs). BMC Bioinformatics 2013; 14:288. [PMID: 24088245 PMCID: PMC3851137 DOI: 10.1186/1471-2105-14-288] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 04/15/2013] [Accepted: 09/24/2013] [Indexed: 12/12/2022]
Abstract
Background: The development of new therapies for orphan genetic diseases represents an extremely important medical and social challenge. Drug repositioning, i.e. finding new indications for approved drugs, could be one of the most cost- and time-effective strategies to cope with this problem, at least in a subset of cases. Therefore, many computational approaches based on the analysis of high-throughput gene expression data have so far been proposed to reposition available drugs. However, most of these methods require gene expression profiles directly relevant to the pathologic conditions under study, such as those obtained from patient cells and/or from suitable experimental models. In this work we have developed a new approach for drug repositioning, based on identifying known drug targets showing conserved anti-correlated expression profiles with human disease genes, which is completely independent of the availability of ‘ad hoc’ gene expression datasets. Results: By analyzing available data, we provide evidence that the genes displaying conserved anti-correlation with drug targets are antagonistically modulated in their expression by treatment with the relevant drugs. We then identified clusters of genes associated with similar phenotypes and showing conserved anti-correlation with drug targets. On this basis, we generated a list of potential candidate drug-disease associations. Importantly, we show that some of the proposed associations are already supported by independent experimental evidence. Conclusions: Our results support the hypothesis that the identification of gene clusters showing conserved anti-correlation with drug targets can be an effective method for drug repositioning and provide a wide list of new potential drug-disease associations for experimental validation.
Affiliation(s)
- Ivan Molineris
  - Molecular Biotechnology Centre, Department of Molecular Biotechnology and Health Sciences, University of Torino, Via Nizza 52, 10126, Torino, Italy
|
10
|
Hitzemann R, Bottomly D, Darakjian P, Walter N, Iancu O, Searles R, Wilmot B, McWeeney S. Genes, behavior and next-generation RNA sequencing. Genes Brain Behav 2013; 12:1-12. [PMID: 23194347 PMCID: PMC6050050 DOI: 10.1111/gbb.12007] [Citation(s) in RCA: 69] [Impact Index Per Article: 6.3] [Received: 04/19/2012] [Revised: 10/31/2012] [Accepted: 11/21/2012] [Indexed: 12/30/2022]
Abstract
Advances in next-generation sequencing suggest that RNA-Seq is poised to supplant microarray-based approaches for transcriptome analysis. This article briefly reviews the use of microarrays in the brain-behavior context and then illustrates why RNA-Seq is a superior strategy. Compared with microarrays, RNA-Seq has a greater dynamic range, detects both coding and noncoding RNAs, is superior for gene network construction, detects alternatively spliced transcripts, detects allele-specific expression and can be used to extract genotype information, e.g. nonsynonymous coding single nucleotide polymorphisms. Examples of where RNA-Seq has been used to assess brain gene expression are provided. Despite the advantages of RNA-Seq, some disadvantages remain. These include the high cost of RNA-Seq and the computational complexities associated with data analysis. RNA-Seq embraces the complexity of the transcriptome and provides a mechanism to understand the underlying regulatory code; the potential to inform the brain-behavior relationship is substantial.
Affiliation(s)
- R Hitzemann
  - Department of Behavioral Neuroscience, Oregon Health & Science University, Portland, OR 97239-3098, USA
|
11
|
Obayashi T, Okamura Y, Ito S, Tadaka S, Motoike IN, Kinoshita K. COXPRESdb: a database of comparative gene coexpression networks of eleven species for mammals. Nucleic Acids Res 2012. [PMID: 23203868 PMCID: PMC3531062 DOI: 10.1093/nar/gks1014] [Citation(s) in RCA: 66] [Impact Index Per Article: 5.5] [Indexed: 12/31/2022]
Abstract
Coexpressed gene databases are valuable resources for identifying new gene functions or functional modules in metabolic pathways and signaling pathways. Although coexpressed gene databases are a fundamental platform in the field of plant biology, their use in animal studies is relatively limited. The COXPRESdb (http://coxpresdb.jp) provides coexpression relationships for multiple animal species, as comparisons of coexpressed gene lists can enhance the reliability of gene coexpression determinations. Here, we report the updates of the database, mainly focusing on the following two points. First, we updated our coexpression data by including recent microarray data for the previous seven species (human, mouse, rat, chicken, fly, zebrafish and nematode) and adding four new species (monkey, dog, budding yeast and fission yeast), along with a new human microarray platform. A reliability scoring function was also implemented, based on coexpression conservation to filter out coexpression with low reliability. Second, the network drawing function was updated, to implement automatic cluster analyses with enrichment analyses in Gene Ontology and in cis elements, along with interactive network analyses with Cytoscape Web. With these updates, COXPRESdb will become a more powerful tool for analyses of functional and regulatory networks of genes in a variety of animal species.
Affiliation(s)
- Takeshi Obayashi
  - Graduate School of Information Sciences, Tohoku University, Sendai 980-8679, Japan
|
12
|
Iancu OD, Darakjian P, Malmanger B, Walter NAR, McWeeney S, Hitzemann R. Gene networks and haloperidol-induced catalepsy. Genes Brain Behav 2011; 11:29-37. [PMID: 21967164 DOI: 10.1111/j.1601-183x.2011.00736.x] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.9] [Indexed: 11/29/2022]
Abstract
The current study examined the changes in striatal gene network structure induced by short-term selective breeding from a heterogeneous stock for haloperidol response. Brain (striatum) gene expression data were obtained using the Illumina WG 8.2 array, and the datasets from responding and non-responding selected lines were independently interrogated using a weighted gene coexpression network analysis (WGCNA). We detected several gene modules (groups of coexpressed genes) in each dataset; the membership of the modules was found to be largely concordant, and a consensus network was constructed. Further validation of the network topology showed that using approximately 35 samples is sufficient to reliably infer the transcriptome network. An in-depth analysis showed significant changes in network structure and gene connectivity associated with the selected lines; these changes were validated using a bootstrapping procedure. The most dramatic changes were associated with a gene module richly annotated with neurobehavioral traits. The changes in network connectivity were concentrated in the links between this module and the rest of the network, in addition to changes within the module; this observation is consistent with recent results in protein and metabolic networks. These results suggest that a network-based strategy will help identify the genetic factors associated with haloperidol response.
Affiliation(s)
- O D Iancu
  - Department of Behavioral Neuroscience, Oregon Health and Science University, Portland, OR, USA
|
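For readers unfamiliar with the "gene connectivity" this abstract refers to, the sketch below shows the soft-threshold adjacency used in the standard unsigned WGCNA formulation, where the adjacency between two genes is the absolute Pearson correlation raised to a power β and a gene's connectivity is the sum of its adjacencies. The value β = 6, the toy data, and the function names are assumptions for illustration; the abstract does not state the parameters used in the study.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def soft_threshold_adjacency(expr, beta=6):
    """Unsigned adjacency a_ij = |cor(x_i, x_j)| ** beta for every gene pair."""
    genes = list(expr)
    adj = {g: {} for g in genes}
    for g in genes:
        for h in genes:
            if g != h:
                adj[g][h] = abs(pearson(expr[g], expr[h])) ** beta
    return adj


def connectivity(adj):
    """Connectivity k_i: the sum of a gene's adjacencies to all other genes."""
    return {g: sum(neighbours.values()) for g, neighbours in adj.items()}


# Hypothetical toy expression profiles (four samples per gene).
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 3.0, 2.0, 4.0],
}

adj = soft_threshold_adjacency(expr)
k = connectivity(adj)

if __name__ == "__main__":
    for gene, value in k.items():
        print(gene, round(value, 4))
```

Raising the correlation to a power suppresses weak correlations while keeping strong ones, so the network stays weighted rather than being cut off at a hard threshold. Comparing connectivity values between two such networks (for example, one per selected line) is one simple way to look for the kind of connectivity change the abstract describes.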
13
|
Differential expression pattern-based prioritization of candidate genes through integrating disease-specific expression data. Genomics 2011; 98:64-71. [DOI: 10.1016/j.ygeno.2011.04.001] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 08/17/2010] [Revised: 03/11/2011] [Accepted: 04/01/2011] [Indexed: 01/30/2023]
|
14
|
An atlas of tissue-specific conserved coexpression for functional annotation and disease gene prediction. Eur J Hum Genet 2011; 19:1173-80. [PMID: 21654723 DOI: 10.1038/ejhg.2011.96] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.2] [Indexed: 11/09/2022]
Abstract
Gene coexpression relationships that are phylogenetically conserved between human and mouse have been shown to provide important clues about gene function that can be efficiently used to identify promising candidate genes for human hereditary disorders. In the past, such approaches have considered mostly generic gene expression profiles that cover multiple tissues and organs. The individual genes of multicellular organisms, however, can participate in different transcriptional programs, operating at scales as different as single-cell types, tissues, organs, body regions or the entire organism. Therefore, systematic analysis of tissue-specific coexpression could be, in principle, a very powerful strategy to dissect those functional relationships among genes that emerge only in particular tissues or organs. In this report, we show that, in fact, conserved coexpression as determined from tissue-specific and condition-specific data sets can predict many functional relationships that are not detected by analyzing heterogeneous microarray data sets. More importantly, we find that, when combined with disease networks, the simultaneous use of both generic (multi-tissue) and tissue-specific conserved coexpression allows a more efficient prediction of human disease genes than the use of generic conserved coexpression alone. Using this strategy, we were able to identify high-probability candidates for 238 orphan disease loci. We provide proof of concept that this combined use of generic and tissue-specific conserved coexpression can be very useful to prioritize the mutational candidates obtained from deep-sequencing projects, even in the case of genetic disorders as heterogeneous as XLMR.
|
15
|
Obayashi T, Nishida K, Kasahara K, Kinoshita K. ATTED-II updates: condition-specific gene coexpression to extend coexpression analyses and applications to a broad range of flowering plants. Plant Cell Physiol 2011; 52:213-9. [PMID: 21217125 PMCID: PMC3037081 DOI: 10.1093/pcp/pcq203] [Citation(s) in RCA: 87] [Impact Index Per Article: 6.7] [Received: 10/31/2010] [Accepted: 12/17/2010] [Indexed: 05/18/2023]
Abstract
ATTED-II (http://atted.jp) is a gene coexpression database for a wide variety of experimental designs, such as prioritizations of genes for functional identification and analyses of the regulatory relationships among genes. Here, we report updates of ATTED-II focusing on two new features: condition-specific coexpression and homologous coexpression with rice. To analyze a broad range of biological phenomena, it is important to collect data under many diverse experimental conditions, but the meaning of coexpression can become ambiguous under these conditions. One approach to overcome this difficulty is to calculate the coexpression for each set of conditions with a clear biological meaning. With this viewpoint, we prepared five sets of experimental conditions (tissue, abiotic stress, biotic stress, hormones and light conditions), and users can evaluate the coexpression by employing comparative gene lists and switchable gene networks. We also developed an interactive visualization system, using the Cytoscape web system, to improve the network representation. As the second update, rice coexpression is now available. The previous version of ATTED-II was specifically developed for Arabidopsis, and thus coexpression analyses for other useful plants have been difficult. To solve this problem, we extended ATTED-II by including comparison tables between Arabidopsis and rice. This representation will make it possible to analyze the conservation of coexpression among flowering plants. With the ability to investigate condition-specific coexpression and species conservation, ATTED-II can help researchers to clarify the functional and regulatory networks of genes in a broad array of plant species.
Affiliation(s)
- Takeshi Obayashi
- Graduate School of Information Science, Tohoku University, 6-3-09, Aramaki-Aza-Aoba, Aoba-ku, Sendai, 980-8679 Japan.
|
16
|
Abstract
Despite increasing sequencing capacity, genetic disease investigation still frequently results in the identification of loci containing multiple candidate disease genes that need to be tested for involvement in the disease. This process can be expedited by prioritizing the candidates prior to testing. Over the last decade, a large number of computational methods and tools have been developed to assist the clinical geneticist in prioritizing candidate disease genes. In this chapter, we give an overview of computational tools that can be used for this purpose, all of which are freely available over the web.
Affiliation(s)
- Martin Oti
- Structural and Computational Biology Division, Victor Chang Cardiac Research Institute, 2010, Darlinghurst, NSW, Australia.
|
17
|
Obayashi T, Kinoshita K. COXPRESdb: a database to compare gene coexpression in seven model animals. Nucleic Acids Res 2010; 39:D1016-22. [PMID: 21081562 PMCID: PMC3013720 DOI: 10.1093/nar/gkq1147] [Citation(s) in RCA: 73] [Impact Index Per Article: 5.2] [Indexed: 11/13/2022] Open
Abstract
Publicly available databases of coexpressed gene sets are a valuable resource for a wide variety of experimental studies, including gene targeting for functional identification, and for investigations of regulatory mechanisms or protein-protein interaction networks. Although coexpressed gene databases are becoming more and more popular in the field of plant biology, those with animal data are rather limited, possibly due to the lower reliability of the coexpression data. The original COXPRESdb (coexpressed gene database) (http://coxpresdb.jp) represented the coexpression relationship for human and mouse. Here, we report updates of this database that especially focus on the enhancement of the reliability of gene coexpression data in animals. For this purpose, we implemented a new comparable coexpression measure, Mutual Rank; included five other animal species (rat, chicken, zebrafish, fly and nematode) to assess the conservation of coexpression; and added different layers of omics data into the integrated network of genes. Comparison of coexpression is a key concept to enhance the reliability of gene coexpression, and the integration of different information can reduce the noise inherent in the information. With the functions for gene network representation, COXPRESdb can help researchers to clarify the functional and regulatory networks of genes in a broad array of animal species.
Affiliation(s)
- Takeshi Obayashi
- Graduate School of Information Science, Tohoku University, 6-3-09 Aramaki-Aza-Aoba, Aoba-ku, Sendai 980-8679, Japan
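The Mutual Rank measure named in the abstract above can be illustrated with a small sketch. It assumes the commonly described definition, the geometric mean of the rank of gene B in gene A's correlation-ordered partner list and the rank of A in B's list; the use of Pearson correlation, the function names and the toy expression values are assumptions made for illustration and are not taken from COXPRESdb itself.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def mutual_rank(expr):
    """expr maps gene -> list of expression values; returns {(a, b): MR} for a < b."""
    genes = sorted(expr)
    cor = {a: {b: pearson(expr[a], expr[b]) for b in genes if b != a} for a in genes}
    # rank[a][b] is the position of b among a's partners, 1 = most correlated.
    rank = {}
    for a in genes:
        ordered = sorted(cor[a], key=lambda b: -cor[a][b])
        rank[a] = {b: i + 1 for i, b in enumerate(ordered)}
    # Geometric mean of the two directional ranks; lower means stronger coexpression.
    return {
        (a, b): sqrt(rank[a][b] * rank[b][a])
        for i, a in enumerate(genes)
        for b in genes[i + 1:]
    }


# Hypothetical toy profiles (four samples per gene).
toy = {
    "g1": [1, 2, 3, 4],
    "g2": [2, 4, 6, 8.5],
    "g3": [4, 3, 2, 1],
    "g4": [1, 3, 2, 4],
}
for pair, mr in sorted(mutual_rank(toy).items(), key=lambda kv: kv[1]):
    print(pair, round(mr, 3))
```

Because the score is built from ranks rather than raw correlation values, it is less sensitive to differences in the correlation distributions of individual data sets, which is one plausible reading of why the abstract calls the measure "comparable".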
|
18
|
Iancu OD, Darakjian P, Walter NAR, Malmanger B, Oberbeck D, Belknap J, McWeeney S, Hitzemann R. Genetic diversity and striatal gene networks: focus on the heterogeneous stock-collaborative cross (HS-CC) mouse. BMC Genomics 2010; 11:585. [PMID: 20959017 PMCID: PMC3091732 DOI: 10.1186/1471-2164-11-585] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.0] [Received: 04/16/2010] [Accepted: 10/19/2010] [Indexed: 01/10/2023] Open
Abstract
BACKGROUND: The current study focused on the extent to which genetic diversity within a species (Mus musculus) affects gene co-expression network structure. To examine this issue, we have created a new mouse resource, a heterogeneous stock (HS) formed from the same eight inbred strains that have been used to create the collaborative cross (CC). The eight inbred strains capture > 90% of the genetic diversity available within the species. For contrast with the HS-CC, a C57BL/6J (B6) × DBA/2J (D2) F2 intercross and the HS4, derived from crossing the B6, D2, BALB/cJ and LP/J strains, were used. Brain (striatum) gene expression data were obtained using the Illumina Mouse WG 6.1 array, and the data sets were interrogated using a weighted gene co-expression network analysis (WGCNA).
RESULTS: Genes reliably detected as expressed were similar in all three data sets, as was the variability of expression. As measured by the WGCNA, the modular structure of the transcriptome networks was also preserved, both on the basis of module assignment and from the perspective of the topological overlap maps. Details of the HS-CC gene modules are provided; essentially identical results were obtained for the HS4 and F2 modules. Gene ontology annotation of the modules revealed a significant overrepresentation in some modules for neuronal processes, e.g., central nervous system development. Integration with known protein-protein interaction data indicated significant enrichment among co-expressed genes. We also noted significant overlap with markers of central nervous system cell types (neurons, oligodendrocytes and astrocytes). Using the Allen Brain Atlas, we found evidence of spatial co-localization within the striatum for several modules. Finally, for some modules it was possible to detect an enrichment of transcription factor binding sites. The binding site for Wt1, which is associated with neurodegeneration, was the most significantly overrepresented.
CONCLUSIONS: Despite the marked differences in genetic diversity, the transcriptome structure was remarkably similar for the F2, HS4 and HS-CC. These data suggest that it should be possible to integrate network data from simple and complex crosses. A careful examination of the HS-CC transcriptome revealed the expected structure for striatal gene expression. Importantly, we demonstrate the integration of anatomical and network expression data.
Affiliation(s)
- Ovidiu D Iancu
- Department of Behavioral Neuroscience, Oregon Health & Science University, Portland, OR, USA.
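The topological overlap maps mentioned in the abstract above come from WGCNA. As a rough sketch of the underlying arithmetic, the code below builds an unsigned soft-threshold adjacency from a correlation matrix and computes the standard topological overlap measure, TOM_ij = (sum_u a_iu a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij). The function names, the power beta = 6 and the toy matrix are assumptions for illustration; the study itself used the WGCNA software, not this code.

```python
def soft_threshold(cor, beta=6):
    """Unsigned adjacency a_ij = |cor_ij| ** beta, with a zero diagonal."""
    n = len(cor)
    return [
        [0.0 if i == j else abs(cor[i][j]) ** beta for j in range(n)]
        for i in range(n)
    ]


def topological_overlap(adj):
    """Topological overlap matrix for a symmetric adjacency with entries in [0, 1]."""
    n = len(adj)
    # Connectivity k_i: summed connection strength of node i to all other nodes.
    k = [sum(adj[i][u] for u in range(n) if u != i) for i in range(n)]
    tom = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Strength of the neighbourhood that i and j share.
            shared = sum(adj[i][u] * adj[u][j] for u in range(n) if u not in (i, j))
            value = (shared + adj[i][j]) / (min(k[i], k[j]) + 1 - adj[i][j])
            tom[i][j] = tom[j][i] = value
    return tom


# Hypothetical correlation matrix for three genes.
toy_cor = [
    [1.0, 0.9, 0.8],
    [0.9, 1.0, 0.7],
    [0.8, 0.7, 1.0],
]
for row in topological_overlap(soft_threshold(toy_cor)):
    print([round(v, 3) for v in row])
```

Two genes score highly when they are directly connected and also share strongly connected neighbours, which is why overlap maps tend to show modules more clearly than the raw adjacency does.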
|
19
|
Chen X, Yan GY, Liao XP. A novel candidate disease genes prioritization method based on module partition and rank fusion. OMICS 2010; 14:337-56. [DOI: 10.1089/omi.2009.0143] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.6] [Indexed: 12/22/2022]
Affiliation(s)
- Xing Chen
- Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, People's Republic of China
- Graduate University of Chinese Academy of Sciences, Beijing 100190, People's Republic of China
- Gui-Ying Yan
- Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, People's Republic of China
- Xiao-Ping Liao
- Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, People's Republic of China
- Graduate University of Chinese Academy of Sciences, Beijing 100190, People's Republic of China
|
20
|
Baye TM, Martin LJ, Khurana Hershey GK. Application of genetic/genomic approaches to allergic disorders. J Allergy Clin Immunol 2010; 126:425-36; quiz 437-8. [PMID: 20638111 DOI: 10.1016/j.jaci.2010.05.025] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.2] [Received: 03/11/2010] [Revised: 04/28/2010] [Accepted: 05/07/2010] [Indexed: 11/16/2022]
Abstract
Completion of the human genome project and rapid progress in genetics and bioinformatics have enabled the development of large public databases, which include genetic and genomic data linked to clinical health data. With the massive amount of information available, clinicians and researchers have the unique opportunity to complement and integrate their daily practice with the existing resources to clarify the underlying cause of complex phenotypes, such as allergic diseases. The genome itself is now often used as a starting point for many studies, and multiple innovative approaches have emerged applying genetic/genomic strategies to key questions in the field of allergy and immunology. There have been several successes that have uncovered new insights into the biologic underpinnings of allergic disorders. Herein we will provide an in-depth review of genomic approaches to identifying genes and biologic networks involved in allergic diseases. We will discuss genetic and phenotypic variation, statistical approaches for gene discovery, public databases, functional genomics, clinical implications, and the challenges that remain.
Affiliation(s)
- Tesfaye M Baye
- Division of Asthma Research, University of Cincinnati, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, USA
|
21
|
Miozzi L, Provero P, Accotto GP. ORTom: a multi-species approach based on conserved co-expression to identify putative functional relationships among genes in tomato. Plant Mol Biol 2010; 73:519-532. [PMID: 20411302 DOI: 10.1007/s11103-010-9638-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 08/25/2009] [Accepted: 04/11/2010] [Indexed: 05/29/2023]
Abstract
Co-expressed genes are often expected to be functionally related, and many bioinformatics approaches based on co-expression have been developed to infer their biological role. However, such annotations may be unreliable, whereas the evolutionary conservation of gene co-expression among species may form a basis for more confident predictions. The huge amount of expression data (microarrays, SAGE, ESTs) has already allowed functional studies based on conserved co-expression in animals. Up to now, the implementation of analogous tools for plants has been strongly limited, probably by the paucity and heterogeneity of data. Here we present ORTom, a tomato-centred EST data-mining approach based on conserved co-expression in the Solanaceae family. ORTom can be used to predict functional relationships among genes and to prioritize candidate genes for targeted studies. The method consists of ranking ESTs co-expressed with a gene of interest according to the level of expression pattern conservation in phylogenetically related plants (potato, tobacco and pepper) to obtain lists of putative functionally related genes. The lists are then analyzed for Gene Ontology keyword enrichment. The ORTom web server has been implemented to make the results publicly available and searchable. A few biological examples of how the tool can be used are presented.
Affiliation(s)
- Laura Miozzi
- Istituto di Virologia Vegetale, CNR, Strada delle Cacce 73, 10135 Turin, Italy.
|
22
|
Hua L, Li DG, Lin H, Li L, Li X, Liu ZC. The correlation of gene expression and co-regulated gene patterns in characteristic KEGG pathways. J Theor Biol 2010; 266:242-9. [PMID: 20599549 DOI: 10.1016/j.jtbi.2010.06.029] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 05/11/2010] [Accepted: 06/20/2010] [Indexed: 12/01/2022]
Abstract
There is great interest in chromosome- and pathway-based techniques for genomics data analysis as a means of understanding the mechanisms of disease. However, few studies address the ability of machine learning methods to incorporate pathway information when analyzing microarray data. In this paper, we identified characteristic pathways by combining the out-of-bag (OOB) classification error rates of random forests with pathway information. Within each characteristic pathway, the correlation of gene expression was studied, and the co-regulated gene patterns under different biological conditions were mined with the Mining Attribute Profile (MAP) algorithm. The discovered co-regulated gene patterns were clustered by average-linkage hierarchical clustering. The results showed that genes in the same characteristic pathway had similar expression. Furthermore, two characteristic pathways were found to present co-regulated gene patterns: one contained 108 patterns and the other contained one pattern. Cluster analysis showed that the smallest similarity coefficient among clusters was greater than 0.623, indicating that co-regulated patterns under different biological conditions were more similar within the same characteristic pathway. The methods discussed in this paper can provide additional insight into the study of microarray data.
Affiliation(s)
- Lin Hua
- Biomedical Engineering Institute of Capital Medical University, Beijing 100069, China
|
23
|
Ostlund G, Lindskog M, Sonnhammer ELL. Network-based identification of novel cancer genes. Mol Cell Proteomics 2009; 9:648-55. [PMID: 19959820 DOI: 10.1074/mcp.m900227-mcp200] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.9] [Indexed: 11/06/2022] Open
Abstract
Genes involved in cancer susceptibility and progression can serve as templates for searching protein networks for novel cancer genes. To this end, we introduce a general network searching method, MaxLink, and apply it to find and rank cancer gene candidates by their connectivity to known cancer genes. Using a comprehensive protein interaction network, we searched for genes connected to known cancer genes. First, we compiled a new set of 812 genes involved in cancer, more than twice the number in the Cancer Gene Census. Their network neighbors were then extracted. This candidate list was refined by selecting genes with unexpectedly high levels of connectivity to cancer genes and without previous association to cancer. This produced a list of 1891 new cancer candidates with up to 55 connections to known cancer genes. We validated our method by cross-validation, Gene Ontology term bias, and differential expression in cancer versus normal tissue. An example novel cancer gene candidate is presented with detailed analysis of the local network and neighbor annotation. Our study provides a ranked list of high priority targets for further studies in cancer research. Supplemental material is included.
Affiliation(s)
- Gabriel Ostlund
- Stockholm Bioinformatics Centre, Stockholm University, Stockholm, Sweden.
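The core idea in the abstract above, ranking candidates by how many direct network links they have to known cancer genes, can be sketched as follows. This is a simplified illustration only: the published MaxLink method also filters for unexpectedly high connectivity, which is omitted here, and the function name, gene labels and edge list are hypothetical.

```python
from collections import defaultdict


def rank_by_links(edges, known):
    """Rank genes outside `known` by their number of direct links to `known` genes."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-interactions
            neighbors[a].add(b)
            neighbors[b].add(a)
    known = set(known)
    scores = {}
    for gene, nbrs in neighbors.items():
        if gene in known:
            continue  # only genes without a previous association are candidates
        links = len(nbrs & known)
        if links:
            scores[gene] = links
    # Most links first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical interaction network: K1 and K2 stand in for known cancer genes.
toy_edges = [("K1", "X"), ("K2", "X"), ("K1", "Y"), ("Y", "Z"), ("K1", "K2")]
print(rank_by_links(toy_edges, known={"K1", "K2"}))
```

A raw link count favours highly connected hub proteins, which is presumably why the authors add a step that selects for connectivity beyond what would be expected by chance.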
|
24
|
Aid-Pavlidis T, Pavlidis P, Timmusk T. Meta-coexpression conservation analysis of microarray data: a "subset" approach provides insight into brain-derived neurotrophic factor regulation. BMC Genomics 2009; 10:420. [PMID: 19737418 PMCID: PMC2748098 DOI: 10.1186/1471-2164-10-420] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Received: 12/29/2008] [Accepted: 09/08/2009] [Indexed: 11/26/2022] Open
Abstract
Background: Alterations in brain-derived neurotrophic factor (BDNF) gene expression contribute to serious pathologies such as depression, epilepsy, cancer, and Alzheimer's, Huntington's and Parkinson's diseases. Therefore, exploring the mechanisms of BDNF regulation is of great clinical importance. Studying BDNF expression remains difficult due to its multiple neural activity-dependent and tissue-specific promoters. Thus, microarray data could provide insight into the regulation of this complex gene. Conventional microarray co-expression analysis is usually carried out by merging the datasets or by confirming the re-occurrence of significant correlations across datasets. However, co-expression patterns can differ under the various conditions that are represented by subsets in a dataset. Therefore, assessing co-expression by measuring the correlation coefficient across merged samples of a dataset or by merging datasets might not capture all correlation patterns.
Results: In our study, we performed meta-coexpression analysis of publicly available microarray data using BDNF as a "guide-gene" and introducing a "subset" approach. The key steps of the analysis included: dividing datasets into subsets with biologically meaningful sample content (e.g. tissue, gender or disease state subsets); analyzing co-expression with the BDNF gene in each subset separately; and confirming co-expression links across subsets. Finally, we analyzed conservation in co-expression with BDNF between human, mouse and rat, and searched for conserved over-represented TFBSs in BDNF and BDNF-correlated genes. Correlated genes discovered in this study regulate nervous system development, and are associated with various types of cancer and neurological disorders. Also, several transcription factors identified here have been reported to regulate BDNF expression in vitro and in vivo.
Conclusion: The study demonstrates the potential of the "subset" approach in co-expression conservation analysis for studying the regulation of single genes and proposes novel regulators of BDNF gene expression.
Affiliation(s)
- Tamara Aid-Pavlidis
- Department of Gene Technology, Tallinn University of Technology, Akadeemia tee 15, 19086 Tallinn, Estonia.
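The three key steps listed in the abstract above (split samples into meaningful subsets, correlate each gene with the guide gene within each subset, keep links confirmed across subsets) can be sketched in a few lines. The correlation threshold of 0.8, the requirement of two confirming subsets, the function names and the toy data are all assumptions for illustration; the abstract does not state the criteria the authors used.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def subset_coexpression(expr, subsets, guide, threshold=0.8, min_subsets=2):
    """Return {gene: [subset names]} for genes correlated with `guide` in enough subsets.

    expr maps gene -> {sample: value}; subsets maps subset name -> list of samples.
    """
    support = {}
    for name, samples in subsets.items():
        guide_values = [expr[guide][s] for s in samples]
        for gene, values in expr.items():
            if gene == guide:
                continue
            r = pearson(guide_values, [values[s] for s in samples])
            if abs(r) >= threshold:  # link found within this subset only
                support.setdefault(gene, []).append(name)
    # Keep links that recur in at least `min_subsets` subsets.
    return {g: names for g, names in support.items() if len(names) >= min_subsets}


# Hypothetical data: six samples split into two tissue subsets.
toy_expr = {
    "BDNF": {"s1": 1, "s2": 2, "s3": 3, "s4": 3, "s5": 2, "s6": 1},
    "geneA": {"s1": 2, "s2": 4, "s3": 6, "s4": 6, "s5": 4, "s6": 2},
    "geneB": {"s1": 1, "s2": 2, "s3": 3, "s4": 1, "s5": 5, "s6": 1},
}
toy_subsets = {"cortex": ["s1", "s2", "s3"], "hippocampus": ["s4", "s5", "s6"]}
print(subset_coexpression(toy_expr, toy_subsets, guide="BDNF"))
```

In this toy case geneB tracks the guide gene in one subset only, so it is dropped, whereas a correlation computed over the merged samples could either hide or dilute such condition-specific behaviour.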
|
25
|
Linghu B, Snitkin ES, Hu Z, Xia Y, Delisi C. Genome-wide prioritization of disease genes and identification of disease-disease associations from an integrated human functional linkage network. Genome Biol 2009; 10:R91. [PMID: 19728866 PMCID: PMC2768980 DOI: 10.1186/gb-2009-10-9-r91] [Citation(s) in RCA: 180] [Impact Index Per Article: 12.0] [Received: 05/02/2009] [Revised: 07/09/2009] [Accepted: 09/03/2009] [Indexed: 11/16/2022] Open
Abstract
An evidence-weighted functional-linkage network of human genes reveals associations among diseases that share no known disease genes and have dissimilar phenotypes
We integrate 16 genomic features to construct an evidence-weighted functional-linkage network comprising 21,657 human genes. The functional-linkage network is used to prioritize candidate genes for 110 diseases, and to reliably disclose hidden associations between disease pairs having dissimilar phenotypes, such as hypercholesterolemia and Alzheimer's disease. Many of these disease-disease associations are supported by epidemiology, but with no previous genetic basis. Such associations can drive novel hypotheses on molecular mechanisms of diseases and therapies.
Affiliation(s)
- Bolan Linghu
- Bioinformatics Program, Boston University, 24 Cummington Street, Boston, MA 02215, USA.
|