1
Soni P, Edwards H, Anupom T, Rahman M, Lesanpezeshki L, Blawzdziewicz J, Cope H, Gharahdaghi N, Scott D, Toh LS, Williams PM, Etheridge T, Szewczyk N, Willis CRG, Vanapalli SA. Spaceflight Induces Strength Decline in Caenorhabditis elegans. Cells 2023; 12:2470. [PMID: 37887314 PMCID: PMC10605753 DOI: 10.3390/cells12202470] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2023] [Revised: 10/14/2023] [Accepted: 10/15/2023] [Indexed: 10/28/2023]
Abstract
Background: Understanding and countering the well-established negative health consequences of spaceflight remains a primary challenge preventing safe deep space exploration. Targeted/personalized therapeutics are at the forefront of space medicine strategies, and cross-species molecular signatures now define the 'typical' spaceflight response. However, a lack of direct genotype-phenotype associations currently limits the robustness and, therefore, the therapeutic utility of putative mechanisms underpinning pathological changes in flight. Methods: We employed the worm Caenorhabditis elegans as a validated model of space biology, combined with 'NemaFlex-S' microfluidic devices for assessing animal strength production as one of the most reproducible physiological responses to spaceflight. Wild-type and dys-1 (BZ33) strains (a Duchenne muscular dystrophy (DMD) model for comparing predisposed muscle weak animals) were cultured on the International Space Station in chemically defined media before loading second-generation gravid adults into NemaFlex-S devices to assess individual animal strength. These same cultures were then frozen on orbit before returning to Earth for next-generation sequencing transcriptomic analysis. Results: Neuromuscular strength was lower in flight versus ground controls (16.6% decline, p < 0.05), with dys-1 significantly more (23% less strength, p < 0.01) affected than wild types. The transcriptional gene ontology signatures characterizing both strains of weaker animals in flight strongly corroborate previous results across species, enriched for upregulated stress response pathways and downregulated mitochondrial and cytoskeletal processes. Functional gene cluster analysis extended this to implicate decreased neuronal function, including abnormal calcium handling and acetylcholine signaling, in space-induced strength declines under the predicted control of UNC-89 and DAF-19 transcription factors. 
Finally, gene modules specifically altered in dys-1 animals in flight again cluster to neuronal/neuromuscular pathways, suggesting strength loss in DMD comprises a strong neuronal component that predisposes these animals to exacerbated strength loss in space. Conclusions: Highly reproducible gene signatures are strongly associated with space-induced neuromuscular strength loss across species and neuronal changes in calcium/acetylcholine signaling require further study. These results promote targeted medical efforts towards and provide an in vivo model for safely sending animals and people into deep space in the near future.
Affiliation(s)
- Purushottam Soni
- Department of Chemical Engineering, Texas Tech University, Lubbock, TX 79409, USA
- Hunter Edwards
- Department of Biological Sciences, Texas Tech University, Lubbock, TX 79409, USA
- Taslim Anupom
- Department of Electrical Engineering, Texas Tech University, Lubbock, TX 79409, USA
- Mizanur Rahman
- Department of Chemical Engineering, Texas Tech University, Lubbock, TX 79409, USA
- Leila Lesanpezeshki
- Department of Chemical Engineering, Texas Tech University, Lubbock, TX 79409, USA
- Jerzy Blawzdziewicz
- Department of Mechanical Engineering, Texas Tech University, Lubbock, TX 79409, USA
- Department of Physics and Astronomy, Texas Tech University, Lubbock, TX 79409, USA
- Henry Cope
- School of Medicine, University of Nottingham, Derby DE22 3DT, UK
- Nima Gharahdaghi
- School of Medicine, University of Nottingham, Derby DE22 3DT, UK
- Daniel Scott
- School of Life Sciences, University of Nottingham, Nottingham NG7 2UH, UK
- Li Shean Toh
- School of Pharmacy, University of Nottingham, Nottingham NG7 2RD, UK
- Philip M. Williams
- School of Pharmacy, University of Nottingham, Nottingham NG7 2RD, UK
- Timothy Etheridge
- Department of Sport and Health Sciences, College of Life and Environmental Sciences, University of Exeter, Exeter EX1 2LU, UK
- Nathaniel Szewczyk
- School of Medicine, University of Nottingham, Derby DE22 3DT, UK
- Ohio Musculoskeletal and Neurological Institute, Heritage College of Osteopathic Medicine, Ohio University, Athens, OH 45701, USA
- Craig R. G. Willis
- School of Chemistry and Biosciences, Faculty of Life Sciences, University of Bradford, Bradford BD7 1DP, UK
- Siva A. Vanapalli
- Department of Chemical Engineering, Texas Tech University, Lubbock, TX 79409, USA
2
The cognitive and speech genes are jointly shaped by both positive and relaxed selection in the human lineage. Genomics 2020; 112:2922-2927. [PMID: 32387504 DOI: 10.1016/j.ygeno.2020.05.006] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 06/27/2019] [Revised: 04/17/2020] [Accepted: 05/05/2020] [Indexed: 11/22/2022]
Abstract
The emergence of a coordinated network of cognitive and speech genes in the human lineage performing overlapping functions is a great evolutionary puzzle. Prior studies on the speech gene FOXP2 are inconclusive on the nature of selection operating on this gene in the human lineage. Here, I show that the evolution of FOXP2 is accelerated in the human lineage due to relaxation of purifying selection (relaxed selection). Five potential genes associated with human-specific intelligence and speech genes have evolved under the impact of positive selection, and three genes, including FOXP2, have undergone relaxation of purifying selection in the human lineage. Overall, three evolutionary processes, namely positive selection, relaxation of purifying selection and neutral evolution, have contributed to the genomic evolution of extraordinary cognitive ability and speech in the hominin lineage. The cognitive and speech genes subjected to natural selection in the human lineage have demonstrated a coevolutionary trend.
3
Lalremmawia H, Tiwary BK. Identification of molecular biomarkers for ovarian cancer using computational approaches. Carcinogenesis 2020; 40:742-748. [PMID: 30753333 DOI: 10.1093/carcin/bgz025] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 08/02/2018] [Revised: 01/26/2019] [Accepted: 02/01/2019] [Indexed: 12/31/2022]
Abstract
Ovarian cancer is one of the major causes of mortality among women. This is partly because of its highly asymptomatic nature, the lack of reliable screening techniques and the non-availability of effective biomarkers of ovarian cancer. The recent availability of high-throughput data and, consequently, the development of the network medicine approach may play a key role in deciphering the underlying global mechanism involved in a complex disease. This novel approach in medicine will pave the way for translating new molecular insights into effective drug therapy, applying better diagnostic, prognostic and predictive tests for a complex disease. In this study, we performed reconstruction of gene co-expression networks with a query-based method in healthy and different stages of ovarian cancer to identify new potential biomarkers from the reported biomarker genes. We proposed 17 genes as new potential biomarkers for ovarian cancer that can effectively classify a disease sample from a healthy sample. Most of the predicted genes are found to be differentially expressed between healthy and diseased states. Moreover, the survival analysis showed that these genes have a significantly higher effect on the overall survival rate of the patient than the established biomarkers. The comparative analyses of the co-expression networks across healthy and different stages of ovarian cancer have provided valuable insights into the dynamic nature of ovarian cancer.
Affiliation(s)
- H Lalremmawia
- Centre for Bioinformatics, School of Life Sciences, Pondicherry University, Pondicherry, India
- Basant K Tiwary
- Centre for Bioinformatics, School of Life Sciences, Pondicherry University, Pondicherry, India
4
Hurlock ME, Čavka I, Kursel LE, Haversat J, Wooten M, Nizami Z, Turniansky R, Hoess P, Ries J, Gall JG, Rog O, Köhler S, Kim Y. Identification of novel synaptonemal complex components in C. elegans. J Cell Biol 2020; 219:e201910043. [PMID: 32211899 PMCID: PMC7199856 DOI: 10.1083/jcb.201910043] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Received: 10/08/2019] [Revised: 02/04/2020] [Accepted: 02/26/2020] [Indexed: 11/22/2022]
Abstract
The synaptonemal complex (SC) is a tripartite protein scaffold that forms between homologous chromosomes during meiosis. Although the SC is essential for stable homologue pairing and crossover recombination in diverse eukaryotes, it is unknown how individual components assemble into the highly conserved SC structure. Here we report the biochemical identification of two new SC components, SYP-5 and SYP-6, in Caenorhabditis elegans. SYP-5 and SYP-6 are paralogous to each other and play redundant roles in synapsis, providing an explanation for why these genes have evaded previous genetic screens. Superresolution microscopy reveals that they localize between the chromosome axes and span the width of the SC in a head-to-head manner, similar to the orientation of other known transverse filament proteins. Using genetic redundancy and structure-function analyses to truncate C-terminal tails of SYP-5/6, we provide evidence supporting the role of SC in both limiting and promoting crossover formation.
Affiliation(s)
- Ivana Čavka
- The European Molecular Biology Laboratory, Heidelberg, Germany
- Lisa E. Kursel
- School of Biological Sciences, University of Utah, Salt Lake City, UT
- Matthew Wooten
- Department of Biology, Johns Hopkins University, Baltimore, MD
- Zehra Nizami
- Department of Embryology, Carnegie Institution for Science, Baltimore, MD
- Philipp Hoess
- The European Molecular Biology Laboratory, Heidelberg, Germany
- Collaboration for joint PhD degree between European Molecular Biology Laboratory and Faculty of Biosciences, Heidelberg University, Heidelberg, Germany
- Jonas Ries
- The European Molecular Biology Laboratory, Heidelberg, Germany
- Joseph G. Gall
- Department of Embryology, Carnegie Institution for Science, Baltimore, MD
- Ofer Rog
- School of Biological Sciences, University of Utah, Salt Lake City, UT
- Simone Köhler
- The European Molecular Biology Laboratory, Heidelberg, Germany
- Yumi Kim
- Department of Biology, Johns Hopkins University, Baltimore, MD
5
Nan N, Chen Q, Wang Y, Zhai X, Yang CC, Cao B, Chong T. Screening disrupted molecular functions and pathways associated with clear cell renal cell carcinoma using Gibbs sampling. Comput Biol Chem 2017; 70:15-20. [PMID: 28735111 DOI: 10.1016/j.compbiolchem.2017.07.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 02/17/2017] [Revised: 06/28/2017] [Accepted: 07/09/2017] [Indexed: 12/14/2022]
Abstract
OBJECTIVE To explore the disturbed molecular functions and pathways in clear cell renal cell carcinoma (ccRCC) using Gibbs sampling. METHODS Gene expression data of ccRCC samples and adjacent non-tumor renal tissues were retrieved from a publicly available database. Then, the molecular functions of genes with changed expression in ccRCC were classified according to the Gene Ontology (GO) project, and these molecular functions were converted into Markov chains. A Markov chain Monte Carlo (MCMC) algorithm was implemented to perform posterior inference and identify probability distributions of molecular functions in Gibbs sampling. Differentially expressed molecular functions were selected with a posterior value greater than 0.95, and genes appearing in ≥5 differentially expressed molecular functions were defined as pivotal genes. Functional analysis was employed to explore the pathways of pivotal genes and their strongly co-regulated genes. RESULTS In this work, we obtained 396 molecular functions, and 13 of them were differentially expressed. Oxidoreductase activity showed the highest posterior value. Gene composition analysis identified 79 pivotal genes, and survival analysis indicated that these pivotal genes could be used as a strong independent predictor of poor prognosis in patients with ccRCC. Pathway analysis identified one pivotal pathway: oxidative phosphorylation. CONCLUSIONS We identified the differentially expressed molecular functions and pivotal pathway in ccRCC using Gibbs sampling. The results could be considered as potential signatures for early detection and therapy of ccRCC.
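The two selection thresholds stated in this abstract (posterior value above 0.95, then genes occurring in at least 5 of the selected molecular functions) can be illustrated with a minimal Python sketch. It covers only the filtering step, not the Gibbs sampling itself. The function name `select_pivotal_genes`, the `MF*`/`GENE_*` labels and all numbers are hypothetical, made-up placeholders, not data from the paper.

```python
from collections import Counter


def select_pivotal_genes(posterior, members, cutoff=0.95, min_hits=5):
    """Apply the abstract's two thresholds to already-computed posteriors.

    posterior: dict mapping molecular function -> posterior value
    members:   dict mapping molecular function -> list of member genes
    """
    # Step 1: keep molecular functions whose posterior exceeds the cutoff.
    selected = sorted(mf for mf, p in posterior.items() if p > cutoff)
    # Step 2: count, per gene, how many selected functions contain it.
    hits = Counter(gene for mf in selected for gene in set(members[mf]))
    # Step 3: genes seen in at least `min_hits` selected functions are "pivotal".
    pivotal = sorted(g for g, n in hits.items() if n >= min_hits)
    return selected, pivotal


# Toy, made-up inputs (assumption: posteriors were estimated elsewhere).
posterior = {f"MF{i}": 0.99 for i in range(1, 6)}
posterior["MF6"] = 0.50
members = {f"MF{i}": ["GENE_A", f"GENE_{i}"] for i in range(1, 6)}
members["MF6"] = ["GENE_A", "GENE_B"]

selected, pivotal = select_pivotal_genes(posterior, members)
print(selected)
print(pivotal)
```

In this toy input only `GENE_A` belongs to five selected functions, so it is the only gene returned as pivotal.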
Affiliation(s)
- Ning Nan
- Department of Urinary Surgery, The Second Affiliated Hospital of Xi'an Jiaotong University, Xi'an, 710004, China
- Qi Chen
- Department of Urinary Surgery, The Second Affiliated Hospital of Xi'an Jiaotong University, Xi'an, 710004, China
- Yu Wang
- Department of Urinary Surgery, The Second Affiliated Hospital of Xi'an Jiaotong University, Xi'an, 710004, China
- Xu Zhai
- Department of Urinary Surgery, The Second Affiliated Hospital of Xi'an Jiaotong University, Xi'an, 710004, China
- Chuan-Ce Yang
- Department of Urinary Surgery, The Second Affiliated Hospital of Xi'an Jiaotong University, Xi'an, 710004, China
- Bin Cao
- Department of Urinary Surgery, The Second Affiliated Hospital of Xi'an Jiaotong University, Xi'an, 710004, China
- Tie Chong
- Department of Urinary Surgery, The Second Affiliated Hospital of Xi'an Jiaotong University, Xi'an, 710004, China
6
Li Y, Jourdain AA, Calvo SE, Liu JS, Mootha VK. CLIC, a tool for expanding biological pathways based on co-expression across thousands of datasets. PLoS Comput Biol 2017; 13:e1005653. [PMID: 28719601 PMCID: PMC5546725 DOI: 10.1371/journal.pcbi.1005653] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Received: 10/05/2016] [Revised: 08/07/2017] [Accepted: 06/21/2017] [Indexed: 12/31/2022]
Abstract
In recent years, there has been a huge rise in the number of publicly available transcriptional profiling datasets. These massive compendia comprise billions of measurements and provide a special opportunity to predict the function of unstudied genes based on co-expression to well-studied pathways. Such analyses can be very challenging, however, since biological pathways are modular and may exhibit co-expression only in specific contexts. To overcome these challenges we introduce CLIC, CLustering by Inferred Co-expression. CLIC accepts as input a pathway consisting of two or more genes. It then uses a Bayesian partition model to simultaneously partition the input gene set into coherent co-expressed modules (CEMs), while assigning the posterior probability for each dataset in support of each CEM. CLIC then expands each CEM by scanning the transcriptome for additional co-expressed genes, quantified by an integrated log-likelihood ratio (LLR) score weighted for each dataset. As a byproduct, CLIC automatically learns the conditions (datasets) within which a CEM is operative. We implemented CLIC using a compendium of 1774 mouse microarray datasets (28628 microarrays) or 1887 human microarray datasets (45158 microarrays). CLIC analysis reveals that of 910 canonical biological pathways, 30% consist of strongly co-expressed gene modules for which new members are predicted. For example, CLIC predicts a functional connection between protein C7orf55 (FMC1) and the mitochondrial ATP synthase complex that we have experimentally validated. CLIC is freely available at www.gene-clic.org. We anticipate that CLIC will be valuable both for revealing new components of biological pathways as well as the conditions in which they are active.
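The idea of scoring a candidate gene by its co-expression with a module, with each dataset weighted by how strongly it supports that module, can be sketched as follows. This is a deliberately simplified stand-in: it uses a weighted mean Pearson correlation rather than CLIC's Bayesian partition model or its integrated log-likelihood ratio, and the function names, gene labels, weights and expression values are all hypothetical.

```python
import math


def pearson(x, y):
    """Plain Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def weighted_coexpression(candidate, module, datasets, weights):
    """Sum over datasets of weight * mean correlation(candidate, module gene).

    NOTE: a simplified proxy, not CLIC's LLR score.
    datasets: list of dicts mapping gene -> expression vector
    weights:  per-dataset weights (e.g. a posterior supplied by the caller)
    """
    total = 0.0
    for data, w in zip(datasets, weights):
        corrs = [pearson(data[candidate], data[g]) for g in module]
        total += w * sum(corrs) / len(corrs)
    return total


# Two made-up datasets: the module (A, B) is coherent only in the first one.
datasets = [
    {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "X": [1, 2, 3, 4]},
    {"A": [1, 2, 3, 4], "B": [1, 3, 2, 4], "X": [4, 3, 2, 1]},
]
weights = [0.9, 0.1]  # hypothetical per-dataset support for the module

score = weighted_coexpression("X", ["A", "B"], datasets, weights)
print(round(score, 3))
```

Down-weighting the second dataset keeps the candidate's score high even though it is anti-correlated with the module there, which mirrors the abstract's point that pathways may be co-expressed only in specific contexts.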
Affiliation(s)
- Yang Li
- Howard Hughes Medical Institute and Department of Molecular Biology and the Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, United States of America and Department of Systems Biology, Harvard Medical School, Boston, MA, United States of America
- Department of Statistics, Harvard University, Cambridge, MA, United States of America
- Alexis A. Jourdain
- Howard Hughes Medical Institute and Department of Molecular Biology and the Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, United States of America and Department of Systems Biology, Harvard Medical School, Boston, MA, United States of America
- Broad Institute, Cambridge, MA, United States of America
- Sarah E. Calvo
- Howard Hughes Medical Institute and Department of Molecular Biology and the Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, United States of America and Department of Systems Biology, Harvard Medical School, Boston, MA, United States of America
- Broad Institute, Cambridge, MA, United States of America
- Jun S. Liu
- Department of Statistics, Harvard University, Cambridge, MA, United States of America
- Vamsi K. Mootha
- Howard Hughes Medical Institute and Department of Molecular Biology and the Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, United States of America and Department of Systems Biology, Harvard Medical School, Boston, MA, United States of America
- Broad Institute, Cambridge, MA, United States of America
7
Nelms BD, Waldron L, Barrera LA, Weflen AW, Goettel JA, Guo G, Montgomery RK, Neutra MR, Breault DT, Snapper SB, Orkin SH, Bulyk ML, Huttenhower C, Lencer WI. CellMapper: rapid and accurate inference of gene expression in difficult-to-isolate cell types. Genome Biol 2016; 17:201. [PMID: 27687735 PMCID: PMC5043525 DOI: 10.1186/s13059-016-1062-5] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 06/09/2016] [Accepted: 09/13/2016] [Indexed: 02/25/2023]
Abstract
We present a sensitive approach to predict genes expressed selectively in specific cell types, by searching publicly available expression data for genes with a similar expression profile to known cell-specific markers. Our method, CellMapper, strongly outperforms previous computational algorithms to predict cell type-specific expression, especially for rare and difficult-to-isolate cell types. Furthermore, CellMapper makes accurate predictions for human brain cell types that have never been isolated, and can be rapidly applied to diverse cell types from many tissues. We demonstrate a clinically relevant application to prioritize candidate genes in disease susceptibility loci identified by GWAS.
Affiliation(s)
- Bradlee D Nelms
- Division of Gastroenterology, Children's Hospital and Harvard Medical School, Boston, MA, 02115, USA
- Graduate Program in Biophysics, Harvard University, Cambridge, MA, 02138, USA
- Levi Waldron
- City University of New York School of Public Health, New York, NY, 10027, USA
- Luis A Barrera
- Graduate Program in Biophysics, Harvard University, Cambridge, MA, 02138, USA
- Division of Genetics, Department of Medicine and Department of Pathology, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, 02115, USA
- Andrew W Weflen
- Division of Gastroenterology, Children's Hospital and Harvard Medical School, Boston, MA, 02115, USA
- Jeremy A Goettel
- Division of Gastroenterology, Children's Hospital and Harvard Medical School, Boston, MA, 02115, USA
- Guoji Guo
- Center of Stem Cell and Regenerative Medicine, Zhejiang University School of Medicine, Zhejiang, 310058, People's Republic of China
- Robert K Montgomery
- Division of Gastroenterology, Children's Hospital and Harvard Medical School, Boston, MA, 02115, USA
- Marian R Neutra
- Division of Gastroenterology, Children's Hospital and Harvard Medical School, Boston, MA, 02115, USA
- Harvard Digestive Diseases Center, Harvard Medical School, Boston, MA, 02115, USA
- David T Breault
- Harvard Digestive Diseases Center, Harvard Medical School, Boston, MA, 02115, USA
- Division of Endocrinology, Children's Hospital and Harvard Medical School, Boston, MA, 02115, USA
- Scott B Snapper
- Division of Gastroenterology, Children's Hospital and Harvard Medical School, Boston, MA, 02115, USA
- Harvard Digestive Diseases Center, Harvard Medical School, Boston, MA, 02115, USA
- Department of Gastroenterology, Brigham and Women's Hospital, Boston, MA, 02115, USA
- Stuart H Orkin
- Division of Hematology/Oncology and Harvard Stem Cell Institute, Children's Hospital and Harvard Medical School, Boston, MA, 02115, USA
- Department of Pediatric Oncology, Dana-Farber Cancer Institute, Boston, MA, 02115, USA
- Martha L Bulyk
- Division of Genetics, Department of Medicine and Department of Pathology, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, 02115, USA
- Curtis Huttenhower
- Department of Biostatistics, Harvard School of Public Health, Boston, MA, 02115, USA
- Wayne I Lencer
- Division of Gastroenterology, Children's Hospital and Harvard Medical School, Boston, MA, 02115, USA
- Graduate Program in Biophysics, Harvard University, Cambridge, MA, 02138, USA
- Harvard Digestive Diseases Center, Harvard Medical School, Boston, MA, 02115, USA
8
Zhu Q, Wong AK, Krishnan A, Aure MR, Tadych A, Zhang R, Corney DC, Greene CS, Bongo LA, Kristensen VN, Charikar M, Li K, Troyanskaya OG. Targeted exploration and analysis of large cross-platform human transcriptomic compendia. Nat Methods 2015; 12:211-4, 3 p following 214. [PMID: 25581801 PMCID: PMC4768301 DOI: 10.1038/nmeth.3249] [Citation(s) in RCA: 107] [Impact Index Per Article: 11.9] [Received: 09/05/2014] [Accepted: 11/12/2014] [Indexed: 01/15/2023]
Abstract
We present SEEK (search-based exploration of expression compendia; http://seek.princeton.edu/), a query-based search engine for very large transcriptomic data collections, including thousands of human data sets from many different microarray and high-throughput sequencing platforms. SEEK uses a query-level cross-validation-based algorithm to automatically prioritize data sets relevant to the query and a robust search approach to identify genes, pathways and processes co-regulated with the query. SEEK provides multigene query searching with iterative metadata-based search refinement and extensive visualization-based analysis options.
Affiliation(s)
- Qian Zhu
- Department of Computer Science, Princeton University, Princeton, NJ, USA
- Lewis-Sigler Institute of Integrative Genomics, Princeton University, Princeton, NJ, USA
- Aaron K Wong
- Department of Computer Science, Princeton University, Princeton, NJ, USA
- Lewis-Sigler Institute of Integrative Genomics, Princeton University, Princeton, NJ, USA
- Arjun Krishnan
- Lewis-Sigler Institute of Integrative Genomics, Princeton University, Princeton, NJ, USA
- Miriam R Aure
- Department of Genetics, Institute for Cancer Research, Oslo University Hospital, The Norwegian Radiumhospital, Oslo, Norway
- Alicja Tadych
- Lewis-Sigler Institute of Integrative Genomics, Princeton University, Princeton, NJ, USA
- Ran Zhang
- Lewis-Sigler Institute of Integrative Genomics, Princeton University, Princeton, NJ, USA
- Department of Molecular Biology, Princeton University, Princeton, NJ, USA
- David C Corney
- Lewis-Sigler Institute of Integrative Genomics, Princeton University, Princeton, NJ, USA
- Department of Molecular Biology, Princeton University, Princeton, NJ, USA
- Casey S Greene
- Department of Genetics, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
- Institute for Quantitative Biomedical Sciences, Dartmouth College, Hanover, NH, USA
- Lars A Bongo
- Department of Computer Science, University of Tromsø, Tromsø, Norway
- Vessela N Kristensen
- Department of Genetics, Institute for Cancer Research, Oslo University Hospital, The Norwegian Radiumhospital, Oslo, Norway
- Institute for Clinical Medicine, Department of Clinical Molecular Biology (EpiGen), Faculty of Medicine, UiO and Division of Medicine, Akershus University Hospital, Akershus, Norway
- Moses Charikar
- Department of Computer Science, Princeton University, Princeton, NJ, USA
- Kai Li
- Department of Computer Science, Princeton University, Princeton, NJ, USA
- Olga G. Troyanskaya
- Department of Computer Science, Princeton University, Princeton, NJ, USA
- Lewis-Sigler Institute of Integrative Genomics, Princeton University, Princeton, NJ, USA
- Simons Center for Data Analysis, Simons Foundation, New York City, NY, USA
9
Deveci M, Küçüktunç O, Eren K, Bozdağ D, Kaya K, Çatalyürek ÜV. Querying Co-regulated Genes on Diverse Gene Expression Datasets Via Biclustering. Methods Mol Biol 2015; 1375:55-74. [PMID: 26626937 DOI: 10.1007/7651_2015_246] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/21/2023]
Abstract
Rapid development and increasing popularity of gene expression microarrays have resulted in a number of studies on the discovery of co-regulated genes. One important way of discovering such co-regulations is the query-based search since gene co-expressions may indicate a shared role in a biological process. Although there exist promising query-driven search methods adapting clustering, they fail to capture many genes that function in the same biological pathway because microarray datasets are fraught with spurious samples or samples of diverse origin, or the pathways might be regulated under only a subset of samples. On the other hand, a class of clustering algorithms known as biclustering algorithms which simultaneously cluster both the items and their features are useful while analyzing gene expression data, or any data in which items are related in only a subset of their samples. This means that genes need not be related in all samples to be clustered together. Because many genes only interact under specific circumstances, biclustering may recover the relationships that traditional clustering algorithms can easily miss. In this chapter, we briefly summarize the literature using biclustering for querying co-regulated genes. Then we present a novel biclustering approach and evaluate its performance by a thorough experimental analysis.
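To make concrete the statement that genes "need not be related in all samples to be clustered together", the sketch below computes the mean squared residue of a gene-by-sample submatrix, a classic bicluster coherence score due to Cheng and Church. It is offered as a generic illustration and is not the novel approach presented in the chapter; the expression matrix is made up.

```python
def mean_squared_residue(matrix, rows, cols):
    """Cheng-and-Church mean squared residue of the submatrix rows x cols.

    A value of 0 means the selected genes (rows) follow the same additive
    pattern across the selected samples (cols).
    """
    sub = [[matrix[i][j] for j in cols] for i in rows]
    n_r, n_c = len(rows), len(cols)
    row_mean = [sum(r) / n_c for r in sub]
    col_mean = [sum(sub[i][j] for i in range(n_r)) / n_r for j in range(n_c)]
    all_mean = sum(row_mean) / n_r
    residue = sum(
        (sub[i][j] - row_mean[i] - col_mean[j] + all_mean) ** 2
        for i in range(n_r)
        for j in range(n_c)
    )
    return residue / (n_r * n_c)


# Made-up expression matrix: genes 0 and 1 agree on samples 0-2 only.
expr = [
    [1.0, 2.0, 3.0, 9.0],
    [2.0, 3.0, 4.0, 0.0],
    [5.0, 1.0, 7.0, 2.0],
]

bicluster_score = mean_squared_residue(expr, rows=[0, 1], cols=[0, 1, 2])
full_score = mean_squared_residue(expr, rows=[0, 1, 2], cols=[0, 1, 2, 3])
print(bicluster_score, full_score)
```

Genes 0 and 1 score 0 on the first three samples but not when the fourth sample is included, which is the kind of relationship a clustering over all samples would miss.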
Affiliation(s)
- Mehmet Deveci
- Computer Science and Engineering, The Ohio State University, Columbus, OH, USA
- Onur Küçüktunç
- Computer Science and Engineering, The Ohio State University, Columbus, OH, USA
- Kemal Eren
- Computer Science and Engineering, The Ohio State University, Columbus, OH, USA
- Doruk Bozdağ
- Biomedical Informatics, The Ohio State University, Columbus, OH, USA
- Kamer Kaya
- Computer Science and Engineering, Sabancı University, Istanbul, Turkey
- Ümit V Çatalyürek
- Biomedical Informatics, Department of Electrical and Computer Engineering, The Ohio State University, Columbus, OH, USA
10
Tiwary BK. The coordinated expression, interaction and evolution of the neuroendocrine genes. Integr Biol (Camb) 2012; 4:1377-85. [PMID: 22990097 DOI: 10.1039/c2ib20081c] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022]
Abstract
The neuroendocrine system is a complex biological system controlled by various neuropeptides and hormones. The evolution and network properties of neuroendocrine genes are analyzed along with their expression profiles. The neuroendocrine genes show very similar expression profiles and local network properties across a wide range of tissues consistent with the physiological roles of their proteins. Moreover, the coordinated evolution of 10 neuroendocrine genes involved in mammalian reproduction and homeostasis is demonstrated using several methods, such as correlated evolution, relative-rate test, relative-ratio test and codon usage bias. The neuroendocrine genes seem to evolve predominantly under similar selective strengths and regimes of purifying selection, which is well reflected in their evolutionary fingerprints. This result demonstrates for the first time a key role of natural selection in creating and maintaining a well-designed neuroendocrine system at the genomic level. It also indicates that component properties of a complex system at a higher physiological scale may determine component properties at a lower genomic scale and/or vice versa.
Affiliation(s)
- Basant K Tiwary
- Centre for Bioinformatics, School of Life Sciences, Pondicherry University, Pondicherry-605 014, India
11
Dissecting the gene network of dietary restriction to identify evolutionarily conserved pathways and new functional genes. PLoS Genet 2012; 8:e1002834. [PMID: 22912585 PMCID: PMC3415404 DOI: 10.1371/journal.pgen.1002834] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.7] [Received: 02/22/2012] [Accepted: 06/04/2012] [Indexed: 01/19/2023]
Abstract
Dietary restriction (DR), limiting nutrient intake from diet without causing malnutrition, delays the aging process and extends lifespan in multiple organisms. The conserved life-extending effect of DR suggests the involvement of fundamental mechanisms, although these remain a subject of debate. To help decipher the life-extending mechanisms of DR, we first compiled a list of genes that if genetically altered disrupt or prevent the life-extending effects of DR. We called these DR–essential genes and identified more than 100 in model organisms such as yeast, worms, flies, and mice. In order for other researchers to benefit from this first curated list of genes essential for DR, we established an online database called GenDR (http://genomics.senescence.info/diet/). To dissect the interactions of DR–essential genes and discover the underlying lifespan-extending mechanisms, we then used a variety of network and systems biology approaches to analyze the gene network of DR. We show that DR–essential genes are more conserved at the molecular level and have more molecular interactions than expected by chance. Furthermore, we employed a guilt-by-association method to predict novel DR–essential genes. In budding yeast, we predicted nine genes related to vacuolar functions; we show experimentally that mutations deleting eight of those genes prevent the life-extending effects of DR. Three of these mutants (OPT2, FRE6, and RCR2) had extended lifespan under ad libitum, indicating that the lack of further longevity under DR is not caused by a general compromise of fitness. These results demonstrate how network analyses of DR using GenDR can be used to make phenotypically relevant predictions. Moreover, gene-regulatory circuits reveal that the DR–induced transcriptional signature in yeast involves nutrient-sensing, stress responses and meiotic transcription factors. 
Finally, comparing the influence of gene expression changes during DR on the interactomes of multiple organisms led us to suggest that DR commonly suppresses translation, while stimulating an ancient reproduction-related process. Dietary restriction has been shown to extend lifespan in diverse, evolutionarily distant species, yet its underlying mechanisms remain unknown. We first constructed a database of genes essential for the life-extending effects of dietary restriction in various model organisms and then studied their interactions using a variety of network and systems biology approaches. This enabled us to predict novel genes related to dietary restriction, which we validated experimentally in yeast. By comparing large-scale data compilations (interactomes and transcriptomes) from multiple organisms, we were able to condense this -omics information to the most conserved essential elements, eliminating species-specific adaptive responses. These results lead us to the rather surprising conclusion that lifespan extension by a restricted diet commonly may exploit an ancient rejuvenation process derived from gametogenesis.
|
12
|
Tiwary BK. The severity of mental disorders is linked to interaction among candidate genes. Integr Biol (Camb) 2012; 4:1096-101. [PMID: 22777684 DOI: 10.1039/c2ib20066j] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 12/16/2022]
Abstract
There is considerable overlap in the manifestation of symptoms in three mental disorders, namely unipolar disorder, bipolar disorder and schizophrenia. A gene coexpression network was developed based on a mutual information approach including four candidate genes (NRG1, DISC1, BDNF and COMT) along with other coexpressing genes in unipolar disorder, bipolar disorder and schizophrenia. There is a significant difference in the degree distribution of nodes between the normal and bipolar disorder networks and between the bipolar disorder and schizophrenia networks. Moreover, there is differential direct connectivity among candidate genes in the various mental disorders and between normal and mental disorders. All candidate genes are directly connected to each other in schizophrenia except one pair (NRG1-BDNF), indicating a strong role of inter-gene interactions in the manifestation of severe symptoms in this disease. DISC1 and NRG1 are key hub genes in the unipolar disorder network and the bipolar disorder network but have lost the role of hub genes in the schizophrenia network, despite their significant association with schizophrenia. This study indicates that the three psychiatric diseases may not be discrete classes but rather three phenotypic manifestations of the same continuous disease, differing in severity.
Affiliation(s)
- Basant K Tiwary
- Centre for Bioinformatics, School of Life Sciences, Pondicherry University, Puducherry-605 014, India.
|
13
|
De Smet R, Marchal K. An ensemble biclustering approach for querying gene expression compendia with experimental lists. Bioinformatics 2011; 27:1948-56. [PMID: 21593133 DOI: 10.1093/bioinformatics/btr307] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION: Query-based biclustering techniques allow interrogating a gene expression compendium with a given gene or gene list. They do so by searching for genes in the compendium that have a profile close to the average expression profile of the genes in this query-list. As it can often not be guaranteed that the genes in a long query-list will all be mutually coexpressed, it is advisable to use each gene separately as a query. This approach, however, leaves the user with a tedious post-processing of partially redundant biclustering results. The fact that for each query-gene multiple parameter settings need to be tested in order to detect the 'most optimal bicluster size' adds to the redundancy problem. RESULTS: To aid with this post-processing, we developed an ensemble approach to be used in combination with query-based biclustering. The method relies on a specifically designed consensus matrix in which the biclustering outcomes for multiple query-genes and for different possible parameter settings are merged in a statistically robust way. Clustering of this matrix results in distinct, non-redundant consensus biclusters that maximally reflect the information contained within the original query-based biclustering results. The usefulness of the developed approach is illustrated on a biological case study in Escherichia coli. AVAILABILITY AND IMPLEMENTATION: Compiled Matlab code is available from http://homes.esat.kuleuven.be/~kmarchal/Supplementary_Information_DeSmet_2011/.
Affiliation(s)
- Riet De Smet
- Department of Plant Systems Biology, VIB, Ghent University, Technologiepark 927, Ghent, Belgium
|
14
|
Zhao H, Cloots L, Van den Bulcke T, Wu Y, De Smet R, Storms V, Meysman P, Engelen K, Marchal K. Query-based biclustering of gene expression data using Probabilistic Relational Models. BMC Bioinformatics 2011; 12 Suppl 1:S37. [PMID: 21342568 PMCID: PMC3044293 DOI: 10.1186/1471-2105-12-s1-s37] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 11/10/2022] Open
Abstract
Background: With the availability of large scale expression compendia it is now possible to view one's own findings in the light of what is already available and retrieve genes with an expression profile similar to a set of genes of interest (i.e., a query or seed set) for a subset of conditions. To that end, a query-based strategy is needed that maximally exploits the coexpression behaviour of the seed genes to guide the biclustering, but that at the same time is robust against the presence of noisy genes in the seed set, as seed genes are often assumed, but not guaranteed, to be coexpressed in the queried compendium. Therefore, we developed ProBic, a query-based biclustering strategy based on Probabilistic Relational Models (PRMs) that exploits the use of prior distributions to extract the information contained within the seed set. Results: We applied ProBic on a large scale Escherichia coli compendium to extend partially described regulons with potentially novel members. We compared ProBic's performance with previously published query-based biclustering algorithms, namely ISA and QDB, from the perspective of bicluster expression quality, robustness of the outcome against noisy seed sets and biological relevance. This comparison shows that ProBic is able to retrieve biologically relevant, high quality biclusters that retain their seed genes and that it is particularly strong in handling noisy seeds. Conclusions: ProBic is a query-based biclustering algorithm developed in a flexible framework, designed to detect biologically relevant, high quality biclusters that retain relevant seed genes even in the presence of noise or when dealing with low quality seed sets.
Affiliation(s)
- Hui Zhao
- Microbial and Molecular Systems, KU Leuven, Leuven 3001, Belgium.
|
15
|
Kompass KS, Witte JS. Co-regulatory expression quantitative trait loci mapping: method and application to endometrial cancer. BMC Med Genomics 2011; 4:6. [PMID: 21226949 PMCID: PMC3032645 DOI: 10.1186/1755-8794-4-6] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Received: 07/23/2010] [Accepted: 01/12/2011] [Indexed: 01/16/2023] Open
Abstract
Background: Expression quantitative trait loci (eQTL) studies have helped identify the genetic determinants of gene expression. Understanding the potential interacting mechanisms underlying such findings, however, is challenging. Methods: We describe a method to identify the trans-acting drivers of multiple gene co-expression, which reflects the action of regulatory molecules. This method, termed co-regulatory expression quantitative trait locus (creQTL) mapping, allows for evaluation of a more focused set of phenotypes within a clear biological context than conventional eQTL mapping. Results: Applying this method to a study of endometrial cancer revealed regulatory mechanisms supported by the literature: a creQTL between a locus upstream of STARD13/DLC2 and a group of seven IFNβ-induced genes. This suggests that the Rho-GTPase encoded by STARD13 regulates IFNβ-induced genes and the DNA damage response. Conclusions: Because of the importance of IFNβ in cancer, our results suggest that creQTL may provide a finer picture of gene regulation and may reveal additional molecular targets for intervention. An open source R implementation of the method is available at http://sites.google.com/site/kenkompass/.
Affiliation(s)
- Kenneth S Kompass
- Department of Epidemiology and Biostatistics, Institute for Human Genetics, University of California, San Francisco, USA
|
16
|
Freudenberg JM, Sivaganesan S, Phatak M, Shinde K, Medvedovic M. Generalized random set framework for functional enrichment analysis using primary genomics datasets. Bioinformatics 2010; 27:70-7. [PMID: 20971985 DOI: 10.1093/bioinformatics/btq593] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Indexed: 01/01/2023]
Abstract
MOTIVATION: Functional enrichment analysis using primary genomics datasets is an emerging approach to complement established methods for functional enrichment based on predefined lists of functionally related genes. Currently used methods depend on creating lists of 'significant' and 'non-significant' genes based on ad hoc significance cutoffs. This can lead to loss of statistical power and can introduce biases affecting the interpretation of experimental results. RESULTS: We developed and validated a new statistical framework, generalized random set (GRS) analysis, for comparing the genomic signatures in two datasets without the need for gene categorization. In our tests, GRS produced correct measures of statistical significance, and it showed dramatic improvement in the statistical power over other methods currently used in this setting. We also developed a procedure for identifying genes driving the concordance of the genomics profiles and demonstrated a dramatic improvement in functional coherence of genes identified in such analysis. AVAILABILITY: GRS can be downloaded as part of the R package CLEAN from http://ClusterAnalysis.org/. An online implementation is available at http://GenomicsPortals.org/.
Affiliation(s)
- Johannes M Freudenberg
- Department of Environmental Health, University of Cincinnati College of Medicine, Cincinnati, OH 45267, USA
|
17
|
Le HS, Oltvai ZN, Bar-Joseph Z. Cross-species queries of large gene expression databases. Bioinformatics 2010; 26:2416-23. [PMID: 20702396 DOI: 10.1093/bioinformatics/btq451] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.4] [Indexed: 01/05/2023]
Abstract
MOTIVATION: Expression databases, including the Gene Expression Omnibus and ArrayExpress, have experienced significant growth over the past decade and now hold hundreds of thousands of arrays from multiple species. Since most drugs are initially tested on model organisms, the ability to compare expression experiments across species may help identify pathways that are activated in a similar way in humans and other organisms. However, while several methods exist for finding co-expressed genes in the same species as a query gene, looking at co-expression of homologs or arbitrary genes in other species is challenging. Unlike sequence, which is static, expression is dynamic and changes between tissues, conditions and time. Thus, to carry out cross-species analysis using these databases, we need methods that can match experiments in one species with experiments in another species. RESULTS: To facilitate queries in large databases, we developed a new method for comparing expression experiments from different species. We define a distance metric between the ranking of orthologous genes in the two species. We show how to solve an optimization problem for learning the parameters of this function using a training dataset of known similar expression experiment pairs. The function we learn outperforms previous methods and simpler rank comparison methods that have been used in the past for single species analysis. We used our method to compare millions of array pairs from mouse and human expression experiments. The resulting matches can be used to find functionally related genes, to hypothesize about biological response mechanisms and to highlight conditions and diseases that are activating similar pathways in both species. AVAILABILITY: Supporting methods, results and a Matlab implementation are available from http://sb.cs.cmu.edu/ExpQ/.
Affiliation(s)
- Hai-Son Le
- Machine Learning Department, Carnegie Mellon University, Pittsburgh, PA, USA
|
18
|
Liu Q, Cui J, Yang Q, Xu Y. In-silico prediction of blood-secretory human proteins using a ranking algorithm. BMC Bioinformatics 2010; 11:250. [PMID: 20465853 PMCID: PMC2877692 DOI: 10.1186/1471-2105-11-250] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Received: 02/04/2010] [Accepted: 05/14/2010] [Indexed: 01/19/2023] Open
Abstract
Background: Computational identification of blood-secretory proteins, especially proteins with differentially expressed genes in diseased tissues, can provide highly useful information in linking transcriptomic data to proteomic studies for targeted disease biomarker discovery in serum. Results: A new algorithm for prediction of blood-secretory proteins is presented using an information-retrieval technique, called manifold ranking. On a dataset containing 305 known blood-secretory human proteins and a large number of other proteins that are either not blood-secretory or unknown, the new method performs better than the previously published method, measured in terms of the area under the recall-precision curve (AUC). A key advantage of the presented method is that it does not explicitly require a negative training set, which could often be noisy or difficult to derive for most biological problems, hence making our method more applicable than classification-based data mining methods in general biological studies. Conclusion: We believe that our program will prove to be very useful to biomedical researchers who are interested in finding serum markers, especially when they have candidate proteins derived through transcriptomic or proteomic analyses of diseased tissues. A computer program is developed for prediction of blood-secretory proteins based on manifold ranking, which is accessible at our website http://csbl.bmb.uga.edu/publications/materials/qiliu/blood_secretory_protein.html.
Affiliation(s)
- Qi Liu
- Computational Systems Biology Laboratory, Department of Biochemistry and Molecular Biology, and Institute of Bioinformatics, University of Georgia, Athens, GA 30602, USA
|
19
|
Hutter H, Ng MP, Chen N. GExplore: a web server for integrated queries of protein domains, gene expression and mutant phenotypes. BMC Genomics 2009; 10:529. [PMID: 19917126 PMCID: PMC2779824 DOI: 10.1186/1471-2164-10-529] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Received: 05/14/2009] [Accepted: 11/16/2009] [Indexed: 01/17/2023] Open
Abstract
Background: The majority of the genes even in well-studied multi-cellular model organisms have not been functionally characterized yet. Mining the numerous genome wide data sets related to protein function to retrieve potential candidate genes for a particular biological process remains a challenge. Description: GExplore has been developed to provide a user-friendly database interface for data mining at the gene expression/protein function level to help in hypothesis development and experiment design. It supports combinatorial searches for proteins with certain domains, tissue- or developmental stage-specific expression patterns, and mutant phenotypes. GExplore operates on a stand-alone database and has fast response times, which is essential for exploratory searches. The interface is not only user-friendly, but also modular so that it accommodates additional data sets in the future. Conclusion: GExplore is an online database for quick mining of data related to gene and protein function, providing a multi-gene display of data sets related to the domain composition of proteins as well as expression and phenotype data. GExplore is publicly available at: http://genome.sfu.ca/gexplore/
Affiliation(s)
- Harald Hutter
- Department of Biological Sciences, Simon Fraser University, Burnaby, Canada.
|
20
|
Lee TH, Kim YK, Pham TTM, Song SI, Kim JK, Kang KY, An G, Jung KH, Galbraith DW, Kim M, Yoon UH, Nahm BH. RiceArrayNet: a database for correlating gene expression from transcriptome profiling, and its application to the analysis of coexpressed genes in rice. Plant Physiology 2009; 151:16-33. [PMID: 19605550 PMCID: PMC2735985 DOI: 10.1104/pp.109.139030] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.3] [Received: 03/27/2009] [Accepted: 07/06/2009] [Indexed: 05/18/2023]
Abstract
Microarray data can be used to derive understanding of the relationships between the genes involved in various biological systems of an organism, given the availability of databases of gene expression measurements from the complete spectrum of experimental conditions and materials. However, there have been no reports, to date, of such a database being constructed for rice (Oryza sativa). Here, we describe the construction of such a database, called RiceArrayNet (RAN; http://www.ggbio.com/arraynet/), which provides information on coexpression between genes in terms of correlation coefficients (r values). The average number of coexpressed genes is 214, with an SD of 440, at r ≥ 0.5. Given the correlation between genes in a gene pair, the degrees of closeness between genes can be visualized in a relational tree and a relational network. The distribution of correlated genes according to degree of stringency shows how each gene is related to other genes. As an application of RAN, the 16-member L7Ae ribosomal protein family was explored for coexpressed genes and gene expression values within and between rice and Arabidopsis (Arabidopsis thaliana), and common and unique features in coexpression partners and expression patterns were observed for these family members. We observed a correlation pattern between Os01g0968800, a drought-responsive element-binding transcription factor, Os02g0790500, a trehalose-6-phosphate synthase, and Os06g0219500, a small heat shock factor, reflecting the fact that genes responding to the same biological stresses are regulated together. The RAN database can be used as a tool to gain insight into a particular gene by examining its coexpression partners.
Affiliation(s)
- Tae-Ho Lee
- Division of Bioscience and Bioinformatics, Myong Ji University, Yongin, Kyonggido 449-728, Korea
|
21
|
Meng J, Gao SJ, Huang Y. Enrichment constrained time-dependent clustering analysis for finding meaningful temporal transcription modules. Bioinformatics 2009; 25:1521-7. [PMID: 19351618 DOI: 10.1093/bioinformatics/btp235] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Indexed: 12/14/2022]
Abstract
MOTIVATION: Clustering is a popular data exploration technique widely used in microarray data analysis. When dealing with time-series data, however, most conventional clustering algorithms either use one-way clustering methods, which fail to consider the heterogeneity of the temporal domain, or use two-way clustering methods that do not take into account the time dependency between samples, thus producing less informative results. Furthermore, enrichment analysis is often performed independently of and after clustering; such practice, though capable of revealing biologically significant clusters, cannot guide the clustering to produce biologically significant results. RESULT: We present a new enrichment constrained framework (ECF) coupled with a time-dependent iterative signature algorithm (TDISA), which, by applying a sliding time window to incorporate the time dependency of samples and imposing an enrichment constraint on the parameters of clustering, allows supervised identification of temporal transcription modules (TTMs) that are biologically meaningful. Rigorous mathematical definitions of TTM as well as the enrichment constraint framework are also provided that serve as objective functions for retrieving biologically significant modules. We applied the enrichment constrained time-dependent iterative signature algorithm (ECTDISA) to human gene expression time-series data of Kaposi's sarcoma-associated herpesvirus (KSHV) infection of human primary endothelial cells; the result not only confirms known biological facts, but also reveals new insight into the molecular mechanism of KSHV infection. AVAILABILITY: Data and Matlab code are available at http://engineering.utsa.edu/~yfhuang/ECTDISA.html. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jia Meng
- Department of ECE, University of Texas at San Antonio, Texas, USA
|
22
|
Hibbs MA, Myers CL, Huttenhower C, Hess DC, Li K, Caudy AA, Troyanskaya OG. Directing experimental biology: a case study in mitochondrial biogenesis. PLoS Comput Biol 2009; 5:e1000322. [PMID: 19300515 PMCID: PMC2654405 DOI: 10.1371/journal.pcbi.1000322] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.2] [Received: 03/03/2008] [Accepted: 02/06/2009] [Indexed: 11/25/2022] Open
Abstract
Computational approaches have promised to organize collections of functional genomics data into testable predictions of gene and protein involvement in biological processes and pathways. However, few such predictions have been experimentally validated on a large scale, leaving many bioinformatic methods unproven and underutilized in the biology community. Further, it remains unclear what biological concerns should be taken into account when using computational methods to drive real-world experimental efforts. To investigate these concerns and to establish the utility of computational predictions of gene function, we experimentally tested hundreds of predictions generated from an ensemble of three complementary methods for the process of mitochondrial organization and biogenesis in Saccharomyces cerevisiae. The biological data with respect to the mitochondria are presented in a companion manuscript published in PLoS Genetics (doi:10.1371/journal.pgen.1000407). Here we analyze and explore the results of this study that are broadly applicable for computationalists applying gene function prediction techniques, including a new experimental comparison with 48 genes representing the genomic background. Our study leads to several conclusions that are important to consider when driving laboratory investigations using computational prediction approaches. While most genes in yeast are already known to participate in at least one biological process, we confirm that genes with known functions can still be strong candidates for annotation of additional gene functions. We find that different analysis techniques and different underlying data can both greatly affect the types of functional predictions produced by computational methods. This diversity allows an ensemble of techniques to substantially broaden the biological scope and breadth of predictions. 
We also find that performing prediction and validation steps iteratively allows us to more completely characterize a biological area of interest. While this study focused on a specific functional area in yeast, many of these observations may be useful in the contexts of other processes and organisms. Genome sequencing has provided us with “parts lists” of genes for many organisms, but many of the biological roles of these genes are still unknown. While a great deal of functional genomic data exists, providing information about these genes and their roles, the rate at which these data are leveraged into concrete biological knowledge lags far behind the rate of data generation. Many computational approaches have been developed to generate accurate predictions of gene functions, with the goal of bridging this divide. However, as no large-scale experimental efforts have been based on such approaches, their validity and utility remain unproven. We have performed a study that experimentally evaluates predictions from a combination of three computational function prediction approaches, focusing on mitochondrion-related processes in brewer's yeast as a model system. By using computational predictions to guide our laboratory investigation, we have greatly accelerated the rate at which proteins can be assigned to biological processes. Further, our results demonstrate that in order to achieve the best results, it is important for computational biologists to consider both the underlying data and the algorithmic foundations of the methods used to predict function. Lastly, we demonstrate that iterating through phases of prediction and validation has quickly and extensively expanded our knowledge of mitochondrial biology.
Affiliation(s)
- Matthew A. Hibbs
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Carl Icahn Laboratory, Princeton, New Jersey, United States of America
- Department of Computer Science, Princeton University, Princeton, New Jersey, United States of America
- Chad L. Myers
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Carl Icahn Laboratory, Princeton, New Jersey, United States of America
- Department of Computer Science, Princeton University, Princeton, New Jersey, United States of America
- Department of Computer Science and Engineering, University of Minnesota, Minneapolis, Minnesota, United States of America
- Curtis Huttenhower
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Carl Icahn Laboratory, Princeton, New Jersey, United States of America
- Department of Computer Science, Princeton University, Princeton, New Jersey, United States of America
- David C. Hess
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Carl Icahn Laboratory, Princeton, New Jersey, United States of America
- Kai Li
- Department of Computer Science, Princeton University, Princeton, New Jersey, United States of America
- Amy A. Caudy
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Carl Icahn Laboratory, Princeton, New Jersey, United States of America
- Olga G. Troyanskaya
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Carl Icahn Laboratory, Princeton, New Jersey, United States of America
- Department of Computer Science, Princeton University, Princeton, New Jersey, United States of America
|
23
|
Hu M, Qin ZS. Query large scale microarray compendium datasets using a model-based bayesian approach with variable selection. PLoS One 2009; 4:e4495. [PMID: 19214232 PMCID: PMC2637418 DOI: 10.1371/journal.pone.0004495] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 05/01/2008] [Accepted: 12/06/2008] [Indexed: 11/19/2022] Open
Abstract
In microarray gene expression data analysis, it is often of interest to identify genes that share similar expression profiles with a particular gene such as a key regulatory protein. Multiple studies have been conducted using various correlation measures to identify co-expressed genes. While working well for small datasets, the heterogeneity introduced from increased sample size inevitably reduces the sensitivity and specificity of these approaches. This is because most co-expression relationships do not extend to all experimental conditions. With the rapid increase in the size of microarray datasets, identifying functionally related genes from large and diverse microarray gene expression datasets is a key challenge. We develop a model-based gene expression query algorithm built under the Bayesian model selection framework. It is capable of detecting co-expression profiles under a subset of samples/experimental conditions. In addition, it allows linearly transformed expression patterns to be recognized and is robust against sporadic outliers in the data. Both features are critically important for increasing the power of identifying co-expressed genes in large scale gene expression datasets. Our simulation studies suggest that this method outperforms existing correlation coefficients or mutual information-based query tools. When we apply this new method to the Escherichia coli microarray compendium data, it identifies a majority of known regulons as well as novel potential target genes of numerous key transcription factors.
Affiliation(s)
- Ming Hu
- Center for Statistical Genetics, Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, Michigan, United States of America
- Zhaohui S. Qin
- Center for Statistical Genetics, Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, Michigan, United States of America
|
24
|
Michaud J, Simpson KM, Escher R, Buchet-Poyau K, Beissbarth T, Carmichael C, Ritchie ME, Schütz F, Cannon P, Liu M, Shen X, Ito Y, Raskind WH, Horwitz MS, Osato M, Turner DR, Speed TP, Kavallaris M, Smyth GK, Scott HS. Integrative analysis of RUNX1 downstream pathways and target genes. BMC Genomics 2008; 9:363. [PMID: 18671852 PMCID: PMC2529319 DOI: 10.1186/1471-2164-9-363] [Citation(s) in RCA: 101] [Impact Index Per Article: 6.3] [Received: 09/24/2007] [Accepted: 07/31/2008] [Indexed: 01/19/2023] Open
Abstract
Background: The RUNX1 transcription factor gene is frequently mutated in sporadic myeloid and lymphoid leukemia through translocation, point mutation or amplification. It is also responsible for a familial platelet disorder with predisposition to acute myeloid leukemia (FPD-AML). The disruption of the largely unknown biological pathways controlled by RUNX1 is likely to be responsible for the development of leukemia. We have used multiple microarray platforms and bioinformatic techniques to help identify these biological pathways to aid in the understanding of why RUNX1 mutations lead to leukemia. Results: Here we report genes regulated either directly or indirectly by RUNX1 based on the study of gene expression profiles generated from 3 different human and mouse platforms. The platforms used were global gene expression profiling of: 1) cell lines with RUNX1 mutations from FPD-AML patients, 2) over-expression of RUNX1 and CBFβ, and 3) Runx1 knockout mouse embryos using either cDNA or Affymetrix microarrays. We observe that our datasets (lists of differentially expressed genes) significantly correlate with published microarray data from sporadic AML patients with mutations in either RUNX1 or its cofactor, CBFβ. A number of biological processes were identified among the differentially expressed genes and functional assays suggest that heterozygous RUNX1 point mutations in patients with FPD-AML impair cell proliferation, microtubule dynamics and possibly genetic stability. In addition, analysis of the regulatory regions of the differentially expressed genes has for the first time systematically identified numerous potential novel RUNX1 target genes. Conclusion: This work is the first large-scale study attempting to identify the genetic networks regulated by RUNX1, a master regulator in the development of the hematopoietic system and leukemia. 
The biological pathways and target genes controlled by RUNX1 will have considerable importance in disease progression in both familial and sporadic leukemia as well as therapeutic implications.
Affiliation(s)
- Joëlle Michaud
- Molecular Medicine Division, The Walter and Eliza Hall Institute of Medical Research, Parkville 3050, Victoria, Australia.
|
25
|
Gonzalez G, Uribe JC, Armstrong B, McDonough W, Berens ME. GeneRanker: An Online System for Predicting Gene-Disease Associations for Translational Research. Summit on Translational Bioinformatics 2008; 2008:26-30. [PMID: 21347122 PMCID: PMC3041521] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/31/2022]
Abstract
With the overwhelming volume of genomic and molecular information available on many databases nowadays, researchers need from bioinformaticians more than encouragement to refine their searches. We present here GeneRanker, an online system that allows researchers to obtain a ranked list of genes potentially related to a specific disease or biological process by combining gene-disease (or gene-biological process) associations with protein-protein interactions extracted from the literature, using computational analysis of the protein network topology to more accurately rank the predicted associations. GeneRanker was evaluated in the context of brain cancer research, and is freely available online at http://www.generanker.org.
|
26
|
Hu Z, Ng DM, Yamada T, Chen C, Kawashima S, Mellor J, Linghu B, Kanehisa M, Stuart JM, DeLisi C. VisANT 3.0: new modules for pathway visualization, editing, prediction and construction. Nucleic Acids Res 2007; 35:W625-32. [PMID: 17586824 PMCID: PMC1933155 DOI: 10.1093/nar/gkm295] [Citation(s) in RCA: 56] [Impact Index Per Article: 3.3] [Indexed: 11/15/2022] Open
Abstract
With the integration of the KEGG and Predictome databases as well as two search engines for coexpressed genes/proteins using data sets obtained from the Stanford Microarray Database (SMD) and Gene Expression Omnibus (GEO) database, VisANT 3.0 supports exploratory pathway analysis, which includes multi-scale visualization of multiple pathways, editing and annotating pathways using a KEGG compatible visual notation and visualization of expression data in the context of pathways. Expression levels are represented either by color intensity or by nodes with an embedded expression profile. Multiple experiments can be navigated or animated. Known KEGG pathways can be enriched by querying either coexpressed components of known pathway members or proteins with known physical interactions. Predicted pathways for genes/proteins with unknown functions can be inferred from coexpression or physical interaction data. Pathways produced in VisANT can be saved as computer-readable XML format (VisML), graphic images or high-resolution Scalable Vector Graphics (SVG). Pathways in the format of VisML can be securely shared within an interested group or published online using a simple Web link. VisANT is freely available at http://visant.bu.edu.
Affiliation(s)
- Zhenjun Hu
- Center for Advanced Genomic Technology, Boston University, Boston, MA 02215, USA, Bioinformatics Center, Institute for Chemical Research, Kyoto University, Japan, Department of Biomolecular Engineering, University of California, Santa Cruz, CA 95064, USA and Human Genome Center, Institute of Medical Science, University of Tokyo, Japan
| | - David M. Ng
- Center for Advanced Genomic Technology, Boston University, Boston, MA 02215, USA, Bioinformatics Center, Institute for Chemical Research, Kyoto University, Japan, Department of Biomolecular Engineering, University of California, Santa Cruz, CA 95064, USA and Human Genome Center, Institute of Medical Science, University of Tokyo, Japan
| | - Takuji Yamada
- Center for Advanced Genomic Technology, Boston University, Boston, MA 02215, USA, Bioinformatics Center, Institute for Chemical Research, Kyoto University, Japan, Department of Biomolecular Engineering, University of California, Santa Cruz, CA 95064, USA and Human Genome Center, Institute of Medical Science, University of Tokyo, Japan
| | - Chunnuan Chen
- Center for Advanced Genomic Technology, Boston University, Boston, MA 02215, USA, Bioinformatics Center, Institute for Chemical Research, Kyoto University, Japan, Department of Biomolecular Engineering, University of California, Santa Cruz, CA 95064, USA and Human Genome Center, Institute of Medical Science, University of Tokyo, Japan
| | - Shuichi Kawashima
- Center for Advanced Genomic Technology, Boston University, Boston, MA 02215, USA, Bioinformatics Center, Institute for Chemical Research, Kyoto University, Japan, Department of Biomolecular Engineering, University of California, Santa Cruz, CA 95064, USA and Human Genome Center, Institute of Medical Science, University of Tokyo, Japan
| | | | - Bolan Linghu
- Center for Advanced Genomic Technology, Boston University, Boston, MA 02215, USA, Bioinformatics Center, Institute for Chemical Research, Kyoto University, Japan, Department of Biomolecular Engineering, University of California, Santa Cruz, CA 95064, USA and Human Genome Center, Institute of Medical Science, University of Tokyo, Japan
| | - Minoru Kanehisa
- Center for Advanced Genomic Technology, Boston University, Boston, MA 02215, USA, Bioinformatics Center, Institute for Chemical Research, Kyoto University, Japan, Department of Biomolecular Engineering, University of California, Santa Cruz, CA 95064, USA and Human Genome Center, Institute of Medical Science, University of Tokyo, Japan
| | - Joshua M. Stuart
- Center for Advanced Genomic Technology, Boston University, Boston, MA 02215, USA, Bioinformatics Center, Institute for Chemical Research, Kyoto University, Japan, Department of Biomolecular Engineering, University of California, Santa Cruz, CA 95064, USA and Human Genome Center, Institute of Medical Science, University of Tokyo, Japan
| | - Charles DeLisi
- Center for Advanced Genomic Technology, Boston University, Boston, MA 02215, USA, Bioinformatics Center, Institute for Chemical Research, Kyoto University, Japan, Department of Biomolecular Engineering, University of California, Santa Cruz, CA 95064, USA and Human Genome Center, Institute of Medical Science, University of Tokyo, Japan
- *To whom correspondence should be addressed. +617 353 1122; +617 353 3333
|
27
|
Chen C, Weirauch MT, Powell CC, Zambon AC, Stuart JM. A search engine to identify pathway genes from expression data on multiple organisms. BMC Systems Biology 2007; 1:20. [PMID: 17477880 PMCID: PMC1878502 DOI: 10.1186/1752-0509-1-20] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 01/18/2007] [Accepted: 05/04/2007] [Indexed: 02/20/2023]
Abstract
Background: The completion of several genome projects showed that most genes have not yet been characterized, especially in multicellular organisms. Although most genes have unknown functions, a large collection of data is available describing their transcriptional activities under many different experimental conditions. In many cases, the coregulation of a set of genes across a set of conditions can be used to infer roles for genes of unknown function. Results: We developed a search engine, the Multiple-Species Gene Recommender (MSGR), which scans gene expression datasets from multiple organisms to identify genes that participate in a genetic pathway. The MSGR takes a query consisting of a list of genes that function together in a genetic pathway from one of six organisms: Homo sapiens, Drosophila melanogaster, Caenorhabditis elegans, Saccharomyces cerevisiae, Arabidopsis thaliana, and Helicobacter pylori. Using a probabilistic method to merge searches, the MSGR identifies genes that are significantly coregulated with the query genes in one or more of those organisms. The MSGR achieves its highest accuracy for many human pathways when searches are combined across species. We describe specific examples in which new genes were identified to be involved in a neuromuscular signaling pathway and a cell-adhesion pathway. Conclusion: The search engine can scan large collections of gene expression data for new genes that are significantly coregulated with a pathway of interest. By integrating searches across organisms, the MSGR can identify pathway members whose coregulation is either ancient or newly evolved.
Affiliation(s)
- Chunnuan Chen
- Department of Biomolecular Engineering, University of California, Santa Cruz, California, 95064, USA
| | - Matthew T Weirauch
- Department of Biomolecular Engineering, University of California, Santa Cruz, California, 95064, USA
| | - Corey C Powell
- Department of Biomolecular Engineering, University of California, Santa Cruz, California, 95064, USA
| | - Alexander C Zambon
- Department of Medicine, Gladstone Institute of Cardiovascular Disease, San Francisco, California 94158, USA
| | - Joshua M Stuart
- Department of Biomolecular Engineering, University of California, Santa Cruz, California, 95064, USA
|
28
|
Andersen SU, Algreen-Petersen RG, Hoedl M, Jurkiewicz A, Cvitanich C, Braunschweig U, Schauser L, Oh SA, Twell D, Jensen EØ. The conserved cysteine-rich domain of a tesmin/TSO1-like protein binds zinc in vitro and TSO1 is required for both male and female fertility in Arabidopsis thaliana. Journal of Experimental Botany 2007; 58:3657-3670. [PMID: 18057042 DOI: 10.1093/jxb/erm215] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.5] [Indexed: 05/25/2023]
Abstract
Development of reproductive tissue and control of cell division are common challenges to all sexually reproducing eukaryotes. The Arabidopsis thaliana TSO1 gene is involved in both these processes. Mild tso1 mutant alleles influence only ovule development, whereas strong alleles have an effect on all floral tissues and cause cell division defects. The tso1 mutants described so far carry point mutations in a conserved cysteine-rich domain, the CRC domain, but the reason for the range of phenotypes observed is poorly understood. In the present study, the tesmin/TSO1-like CXC (TCX) proteins are characterized at the biochemical, genomic, transcriptomic, and functional level to address this question. It is shown that the CRC domain binds zinc, offering an explanation for the severity of tso1 alleles where cysteine residues are affected. In addition, the phylogenetic and expression analysis of the TCX genes suggested an overlap in function between AtTSO1 and the related gene AtTCX2. Their expression ratios indicated that pollen, in addition to ovules, would be sensitive to loss of TSO1 function. This was confirmed by analysis of novel tso1 T-DNA insertion alleles where the development of both pollen and ovules was affected.
Affiliation(s)
- Stig Uggerhøj Andersen
- Laboratory of Gene Expression, Department of Molecular Biology, University of Aarhus, Gustav Wieds Vej 10, DK-8000 Aarhus C, Denmark.
|
29
|
Harrison MM, Ceol CJ, Lu X, Horvitz HR. Some C. elegans class B synthetic multivulva proteins encode a conserved LIN-35 Rb-containing complex distinct from a NuRD-like complex. Proc Natl Acad Sci U S A 2006; 103:16782-7. [PMID: 17075059 PMCID: PMC1636532 DOI: 10.1073/pnas.0608461103] [Citation(s) in RCA: 110] [Impact Index Per Article: 6.1] [Indexed: 01/22/2023] Open
Abstract
The Caenorhabditis elegans synthetic multivulva (synMuv) genes act redundantly to antagonize the specification of vulval cell fates, which are promoted by an RTK/Ras pathway. At least 26 synMuv genes have been genetically identified, several of which encode proteins with homologs that act in chromatin remodeling or transcriptional repression. Here we report the molecular characterization of two synMuv genes, lin-37 and lin-54. We show that lin-37 and lin-54 encode proteins in a complex with at least seven synMuv proteins, including LIN-35, the only C. elegans homolog of the mammalian tumor suppressor Rb. Biochemical analyses of mutants suggest that LIN-9, LIN-53, and LIN-54 are required for the stable formation of this complex. This complex is distinct from a second complex of synMuv proteins with a composition similar to that of the mammalian Nucleosome Remodeling and Deacetylase complex. The class B synMuv complex we identified is evolutionarily conserved and likely functions in transcriptional repression and developmental regulation.
Affiliation(s)
- Melissa M. Harrison
- Howard Hughes Medical Institute, Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139
| | - Craig J. Ceol
- Howard Hughes Medical Institute, Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139
| | - Xiaowei Lu
- Howard Hughes Medical Institute, Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139
| | - H. Robert Horvitz
- Howard Hughes Medical Institute, Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139
- To whom correspondence should be addressed. E-mail:
|
30
|
Manfield IW, Jen CH, Pinney JW, Michalopoulos I, Bradford JR, Gilmartin PM, Westhead DR. Arabidopsis Co-expression Tool (ACT): web server tools for microarray-based gene expression analysis. Nucleic Acids Res 2006; 34:W504-9. [PMID: 16845059 PMCID: PMC1538833 DOI: 10.1093/nar/gkl204] [Citation(s) in RCA: 134] [Impact Index Per Article: 7.4] [Indexed: 11/25/2022] Open
Abstract
The Arabidopsis Co-expression Tool, ACT, ranks the genes across a large microarray dataset according to how closely their expression follows the expression of a query gene. A database stores pre-calculated co-expression results for ∼21 800 genes based on data from over 300 arrays. These results can be corroborated by calculation of co-expression results for user-defined sub-sets of arrays or experiments from the NASC/GARNet array dataset. Clique Finder (CF) identifies groups of genes which are consistently co-expressed with each other across a user-defined co-expression list. The parameters can be altered easily to adjust cluster size and the output examined for optimal inclusion of genes with known biological roles. Alternatively, a Scatter Plot tool displays the correlation coefficients for all genes against two user-selected queries on a scatter plot which can be useful for visual identification of clusters of genes with similar r-values. User-input groups of genes can be highlighted on the scatter plots. Inclusion of genes with known biology in sets of genes identified using CF and Scatter Plot tools allows inferences to be made about the roles of the other genes in the set and both tools can therefore be used to generate short lists of genes for further characterization. ACT is freely available at .
Affiliation(s)
- Iain W. Manfield
- To whom correspondence should be addressed. Tel: +44 113 343 2901; Fax: +44 113 343 3144;
| | - Chih-Hung Jen
- Institute of Molecular and Cellular Biology, Faculty of Biological Sciences, University of Leeds, West Yorkshire, LS2 9JT, UK
| | - John W. Pinney
- Institute of Molecular and Cellular Biology, Faculty of Biological Sciences, University of Leeds, West Yorkshire, LS2 9JT, UK
| | - Ioannis Michalopoulos
- Institute of Molecular and Cellular Biology, Faculty of Biological Sciences, University of Leeds, West Yorkshire, LS2 9JT, UK
| | - James R. Bradford
- Institute of Molecular and Cellular Biology, Faculty of Biological Sciences, University of Leeds, West Yorkshire, LS2 9JT, UK
| | | | - David R. Westhead
- Institute of Molecular and Cellular Biology, Faculty of Biological Sciences, University of Leeds, West Yorkshire, LS2 9JT, UK
|
31
|
Giallourakis C, Cao Z, Green T, Wachtel H, Xie X, Lopez-Illasaca M, Daly M, Rioux J, Xavier R. A molecular-properties-based approach to understanding PDZ domain proteins and PDZ ligands. Genome Res 2006; 16:1056-72. [PMID: 16825666 PMCID: PMC1524865 DOI: 10.1101/gr.5285206] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.2] [Received: 03/06/2006] [Accepted: 05/08/2006] [Indexed: 11/24/2022]
Abstract
PDZ domain-containing proteins and their interaction partners are mutated in numerous human diseases and function in complexes regulating epithelial polarity, ion channels, cochlear hair cell development, vesicular sorting, and neuronal synaptic communication. Among several properties of a collection of documented PDZ domain-ligand interactions, we discovered embedded in a large-scale expression data set the existence of a significant level of co-regulation between PDZ domain-encoding genes and these ligands. From this observation, we show how integration of expression data, a comparative genomics catalog of 899 mammalian genes with conserved PDZ-binding motifs, phylogenetic analysis, and literature mining can be utilized to infer PDZ complexes. Using molecular studies we map novel interaction partners for the PDZ proteins DLG1 and CARD11. These results provide insight into the diverse roles of PDZ-ligand complexes in cellular signaling and provide a computational framework for the genome-wide evaluation of PDZ complexes.
Affiliation(s)
- Cosmas Giallourakis
- Massachusetts General Hospital, Gastrointestinal Unit, Harvard University Medical School, Boston, Massachusetts 02114, USA
- Broad Institute of MIT and Harvard University, Cambridge, Massachusetts 02139, USA
| | - Zhifang Cao
- Massachusetts General Hospital, Center for Computational and Integrative Biology, Harvard University Medical School, Boston, Massachusetts 02114, USA
- Massachusetts General Hospital, Gastrointestinal Unit, Harvard University Medical School, Boston, Massachusetts 02114, USA
| | - Todd Green
- Broad Institute of MIT and Harvard University, Cambridge, Massachusetts 02139, USA
| | - Heather Wachtel
- Massachusetts General Hospital, Center for Computational and Integrative Biology, Harvard University Medical School, Boston, Massachusetts 02114, USA
- Massachusetts General Hospital, Gastrointestinal Unit, Harvard University Medical School, Boston, Massachusetts 02114, USA
| | - Xiaohui Xie
- Broad Institute of MIT and Harvard University, Cambridge, Massachusetts 02139, USA
| | - Marco Lopez-Illasaca
- Cardiovascular Division, Department of Medicine, Brigham and Women’s Hospital, Harvard University Medical School, Boston, Massachusetts 02115, USA
| | - Mark Daly
- Broad Institute of MIT and Harvard University, Cambridge, Massachusetts 02139, USA
| | - John Rioux
- Broad Institute of MIT and Harvard University, Cambridge, Massachusetts 02139, USA
| | - Ramnik Xavier
- Massachusetts General Hospital, Center for Computational and Integrative Biology, Harvard University Medical School, Boston, Massachusetts 02114, USA
- Massachusetts General Hospital, Gastrointestinal Unit, Harvard University Medical School, Boston, Massachusetts 02114, USA
|
32
|
Jen CH, Manfield IW, Michalopoulos I, Pinney JW, Willats WGT, Gilmartin PM, Westhead DR. The Arabidopsis co-expression tool (ACT): a WWW-based tool and database for microarray-based gene expression analysis. The Plant Journal: for Cell and Molecular Biology 2006; 46:336-48. [PMID: 16623895 DOI: 10.1111/j.1365-313x.2006.02681.x] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.9] [Indexed: 05/08/2023]
Abstract
We present a new WWW-based tool for plant gene analysis, the Arabidopsis Co-Expression Tool (ACT), based on a large Arabidopsis thaliana microarray data set obtained from the Nottingham Arabidopsis Stock Centre. The co-expression analysis tool allows users to identify genes whose expression patterns are correlated across selected experiments or the complete data set. Results are accompanied by estimates of the statistical significance of the correlation relationships, expressed as probability (P) and expectation (E) values. Additionally, highly ranked genes on a correlation list can be examined using the novel clique finder tool to determine the sets of genes most likely to be regulated in a similar manner. In combination, these tools offer three levels of analysis: creation of correlation lists of co-expressed genes, refinement of these lists using two-dimensional scatter plots, and dissection into cliques of co-regulated genes. We illustrate the applications of the software by analysing genes encoding functionally related proteins, as well as pathways involved in plant responses to environmental stimuli. These analyses demonstrate novel biological relationships underlying the observed gene co-expression patterns. To demonstrate the ability of the software to develop testable hypotheses on gene function within a defined biological process we have used the example of cell wall biosynthesis genes. The resource is freely available at http://www.arabidopsis.leeds.ac.uk/ACT/
Affiliation(s)
- Chih-Hung Jen
- School of Biochemistry and Microbiology, University of Leeds, Leeds, West Yorkshire, LS2 9JT, UK
|
33
|
Module Identification from Heterogeneous Biological Data Using Multiobjective Evolutionary Algorithms. Parallel Problem Solving from Nature - PPSN IX 2006. [DOI: 10.1007/11844297_58] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Indexed: 01/11/2023]
|
34
|
Poulin G, Dong Y, Fraser AG, Hopper NA, Ahringer J. Chromatin regulation and sumoylation in the inhibition of Ras-induced vulval development in Caenorhabditis elegans. EMBO J 2005; 24:2613-23. [PMID: 15990876 PMCID: PMC1176455 DOI: 10.1038/sj.emboj.7600726] [Citation(s) in RCA: 105] [Impact Index Per Article: 5.5] [Received: 08/11/2004] [Accepted: 06/03/2005] [Indexed: 12/30/2022] Open
Abstract
In Caenorhabditis elegans, numerous 'synMuv' (synthetic multivulval) genes encode for chromatin-associated proteins involved in transcriptional repression, including an orthologue of Rb and components of the NuRD histone deacetylase complex. These genes antagonize Ras signalling to prevent erroneous adoption of vulval fate. To identify new components of this mechanism, we performed a genome-wide RNA interference (RNAi) screen. After RNAi of 16 757 genes, we found nine new synMuv genes. Based on predicted functions and genetic epistasis experiments, we propose that at least four post-translational modifications converge to inhibit Ras-stimulated vulval development: sumoylation, histone tail deacetylation, methylation, and acetylation. In addition, we demonstrate a novel role for sumoylation in inhibiting LIN-12/Notch signalling in the vulva. We further show that many of the synMuv genes are involved in gene regulation outside the vulva, negatively regulating the expression of the Delta homologue lag-2. As most of the genes identified in this screen are conserved in humans, we suggest that similar interactions may be relevant in mammals for control of Ras and Notch signalling, crosstalk between these pathways, and cell proliferation.
Affiliation(s)
- Gino Poulin
- Wellcome Trust/Cancer Research UK Gurdon Institute, University of Cambridge, Cambridge, UK
- Department of Genetics, University of Cambridge, Cambridge, UK
| | - Yan Dong
- Wellcome Trust/Cancer Research UK Gurdon Institute, University of Cambridge, Cambridge, UK
- Department of Genetics, University of Cambridge, Cambridge, UK
| | - Andrew G Fraser
- Wellcome Trust/Cancer Research UK Gurdon Institute, University of Cambridge, Cambridge, UK
| | - Neil A Hopper
- School of Biological Sciences, University of Southampton, Southampton, UK
| | - Julie Ahringer
- Wellcome Trust/Cancer Research UK Gurdon Institute, University of Cambridge, Cambridge, UK
- Department of Genetics, University of Cambridge, Cambridge, UK
- Wellcome Trust/Cancer Research UK Gurdon Institute, University of Cambridge, Tennis Court Road, Cambridge CB2 1QR, UK. Tel.: +44 1223 334088; Fax: +44 1223 334089; E-mail:
|
35
|
Lewis PW, Beall EL, Fleischer TC, Georlette D, Link AJ, Botchan MR. Identification of a Drosophila Myb-E2F2/RBF transcriptional repressor complex. Genes Dev 2004; 18:2929-40. [PMID: 15545624 PMCID: PMC534653 DOI: 10.1101/gad.1255204] [Citation(s) in RCA: 215] [Impact Index Per Article: 10.8] [Indexed: 02/04/2023]
Abstract
The Drosophila Myb complex has roles in both activating and repressing developmentally regulated DNA replication. To further understand biochemically the functions of the Myb complex, we fractionated Drosophila embryo extracts relying upon affinity chromatography. We found that E2F2, DP, RBF1, RBF2, and the Drosophila homolog of LIN-52, a class B synthetic multivulva (synMuv) protein, copurify with the Myb complex components to form the Myb-MuvB complex. In addition, we found that the transcriptional repressor protein, lethal (3) malignant brain tumor protein, L(3)MBT, and the histone deacetylase, Rpd3, associated with the Myb-MuvB complex. Members of the Myb-MuvB complex were localized to promoters and were shown to corepress transcription of developmentally regulated genes. These and other data now link together the Myb and E2F2 complexes in higher-order assembly to specific chromosomal sites for the regulation of transcription.
Affiliation(s)
- Peter W Lewis
- Department of Molecular and Cell Biology, University of California, Berkeley, CA 94720-3204, USA
|