1
Ilnitskiy IS, Zharikova AA, Mironov AA. OUP accepted manuscript. Nucleic Acids Res 2022; 50:W534-W540. [PMID: 35610035 PMCID: PMC9252792 DOI: 10.1093/nar/gkac385] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2022] [Revised: 04/19/2022] [Accepted: 04/29/2022] [Indexed: 11/27/2022] Open
Abstract
Extensive amounts of data from next-generation sequencing and omics studies have led to the accumulation of information that provides insight into the evolutionary landscape of related proteins. Here, we present OrthoQuantum, a web server that allows for time-efficient analysis and visualization of phylogenetic profiles of any set of eukaryotic proteins. It is a simple-to-use tool capable of searching large input sets of proteins. Using data from open source databases of orthologous sequences in a wide range of taxonomic groups, it enables users to assess coupled evolutionary patterns and helps define lineage-specific innovations. The web interface allows users to run queries with gene names and UniProt identifiers in different phylogenetic clades and to supplement the presence data with an additional BLAST search. The conservation patterns of proteins are coded as binary vectors, i.e., strings that encode the presence or absence of orthologous proteins in other genomes. These strings are used to calculate the top-scoring correlation pairs needed for finding co-inherited proteins, which are simultaneously present or simultaneously absent in specific lineages. Profiles are visualized in combination with phylogenetic trees in a JavaScript-based interface. The OrthoQuantum v1.0 web server is freely available at http://orthoq.bioinf.fbb.msu.ru along with documentation and a tutorial.
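As a rough illustration of the profile comparison described in this abstract, the following Python sketch encodes presence/absence profiles as 0/1 vectors and ranks protein pairs by correlation. It assumes Pearson correlation as the score (the abstract does not name the metric), and the protein names and toy profiles are hypothetical; it is not the OrthoQuantum implementation.

```python
from itertools import combinations
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length 0/1 profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A protein present (or absent) everywhere has no variance;
        # the correlation is undefined, so this sketch treats it as 0.
        return 0.0
    return cov / sqrt(var_a * var_b)


def top_pairs(profiles, k=3):
    """Return the k highest-scoring (score, protein, protein) triples."""
    scored = [
        (pearson(profiles[p], profiles[q]), p, q)
        for p, q in combinations(sorted(profiles), 2)
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]


# Hypothetical toy data: one column per genome, 1 = ortholog present.
profiles = {
    "protA": [1, 1, 0, 0, 1, 0],
    "protB": [1, 1, 0, 0, 1, 0],
    "protC": [0, 0, 1, 1, 0, 1],
    "protD": [1, 1, 1, 1, 1, 0],
}

best = top_pairs(profiles, k=1)
print(best)
```

In this toy set, protA and protB share an identical profile and therefore rank first, while protA and protC are perfectly anti-correlated. A real analysis would likely also need to account for the phylogenetic relatedness of the genomes, which this sketch ignores.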
Affiliation(s)
- Anastasia A Zharikova
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Lomonosovsky Prospect 27, Building 10, 119991 Moscow, Russia
- Kharkevich Institute of Information Transmission Problems, Russian Academy of Sciences, Big Karetny Lane 19, Building 1, 127051 Moscow, Russia
- Andrey A Mironov
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Lomonosovsky Prospect 27, Building 10, 119991 Moscow, Russia
- Kharkevich Institute of Information Transmission Problems, Russian Academy of Sciences, Big Karetny Lane 19, Building 1, 127051 Moscow, Russia
2
Parey E, Crombach A. Evolution of the Drosophila melanogaster Chromatin Landscape and Its Associated Proteins. Genome Biol Evol 2019; 11:660-677. [PMID: 30689829 PMCID: PMC6411481 DOI: 10.1093/gbe/evz019] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Accepted: 01/24/2019] [Indexed: 12/30/2022] Open
Abstract
In the nucleus of eukaryotic cells, genomic DNA associates with numerous protein complexes and RNAs, forming the chromatin landscape. Through a genome-wide study of chromatin-associated proteins in Drosophila cells, five major chromatin types were identified as a refinement of the traditional binary division into hetero- and euchromatin. These five types were given color names in reference to the Greek word chroma. They are defined by distinct but overlapping combinations of proteins and differ in biological and biochemical properties, including transcriptional activity, replication timing, and histone modifications. In this work, we assess the evolutionary relationships of chromatin-associated proteins and present an integrated view of the evolution and conservation of the fruit fly Drosophila melanogaster chromatin landscape. We combine homology prediction across a wide range of species with gene age inference methods to determine the origin of each chromatin-associated protein. This provides insight into the evolution of the different chromatin types. Our results indicate that for the euchromatic types, YELLOW and RED, young associated proteins are more specialized than old ones; and for genes found in either chromatin type, intron/exon structure is lineage-specific. Next, we provide evidence that a subset of GREEN-associated proteins is involved in a centromere drive in D. melanogaster. Our results on BLUE chromatin support the hypothesis that the emergence of Polycomb Group proteins is linked to eukaryotic multicellularity. In light of these results, we discuss how the regulatory complexification of chromatin links to the origins of eukaryotic multicellularity.
Affiliation(s)
- Elise Parey
- Center for Interdisciplinary Research in Biology (CIRB), Collège de France, CNRS, INSERM, PSL Université Paris, France
- Institut de Biologie de l'Ecole Normale Supérieure (IBENS), Ecole Normale Supérieure, CNRS, INSERM, PSL Université Paris, Paris, France
- Anton Crombach
- Center for Interdisciplinary Research in Biology (CIRB), Collège de France, CNRS, INSERM, PSL Université Paris, France
- Inria, Antenne Lyon La Doua, Villeurbanne, France
- Université de Lyon, INSA-Lyon, LIRIS, UMR 5205, Villeurbanne, France
3
Andrés-León E, Cases I, Arcas A, Rojas AM. DDRprot: a database of DNA damage response-related proteins. Database (Oxford) 2016; 2016:baw123. [PMID: 27577567 PMCID: PMC5004197 DOI: 10.1093/database/baw123] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 03/23/2016] [Accepted: 08/04/2016] [Indexed: 01/05/2023]
Abstract
The DNA Damage Response (DDR) signalling network is an essential system that protects the genome’s integrity. The DDRprot database presented here is a resource that integrates manually curated information on the human DDR network and its sub-pathways. For each DDR protein, we present detailed information about its function. Where DDR proteins are involved in post-translational modifications (PTMs) of one another, we depict the position of the modified residue(s) in the three-dimensional structures, when resolved structures are available for the proteins. All this information is linked to the original publication from which it was obtained. Phylogenetic information is also shown, including time of emergence and conservation across 47 selected species, family trees and sequence alignments of homologues. The DDRprot database can be queried by different criteria: pathways, species, evolutionary age or involvement in PTMs. Sequence searches using hidden Markov models can also be used. Database URL: http://ddr.cbbio.es.
Affiliation(s)
- Eduardo Andrés-León
- Computational Biology and Bioinformatics Group, Institute of Biomedicine of Seville, 41013 Sevilla, Spain
- Ildefonso Cases
- Computational Biology and Bioinformatics Group, Institute of Biomedicine of Seville, 41013 Sevilla, Spain
- Aida Arcas
- Computational Biology and Bioinformatics Group, Institute of Biomedicine of Seville, 41013 Sevilla, Spain
- Cell Movements in Development and Disease Lab, Instituto de Neurociencias (CSIC-UMH), 03550 Alicante, Spain
- Ana M Rojas
- Computational Biology and Bioinformatics Group, Institute of Biomedicine of Seville, 41013 Sevilla, Spain
4
Cromar GL, Zhao A, Xiong X, Swapna LS, Loughran N, Song H, Parkinson J. PhyloPro2.0: a database for the dynamic exploration of phylogenetically conserved proteins and their domain architectures across the Eukarya. Database (Oxford) 2016; 2016:baw013. [PMID: 26980519 PMCID: PMC4792532 DOI: 10.1093/database/baw013] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 10/27/2015] [Accepted: 01/29/2016] [Indexed: 11/13/2022]
Abstract
PhyloPro is a database and accompanying web-based application for the construction and exploration of phylogenetic profiles across the Eukarya. In this update article, we present six major new developments in PhyloPro: (i) integration of Pfam-A domain predictions for all proteins; (ii) new summary heatmaps and detailed level views of domain conservation; (iii) an interactive, network-based visualization tool for exploration of domain architectures and their conservation; (iv) ability to browse based on protein functional categories (GOSlim); (v) improvements to the web interface to enhance drill down capability from the heatmap view; and (vi) improved coverage including 164 eukaryotes and 12 reference species. In addition, we provide improved support for downloading data and images in a variety of formats. Among the existing tools available for phylogenetic profiles, PhyloPro provides several innovative domain-based features including a novel domain adjacency visualization tool. These are designed to allow the user to identify and compare proteins with similar domain architectures across species and thus develop hypotheses about the evolution of lineage-specific trajectories. Database URL: http://www.compsysbio.org/phylopro/.
Affiliation(s)
- Graham L Cromar
- Program in Molecular Structure and Function, Hospital for Sick Children, 21-9830 PGCRL, 686 Bay Street, Toronto, ON M5G 0A4, Canada
- Anthony Zhao
- Program in Molecular Structure and Function, Hospital for Sick Children, 21-9830 PGCRL, 686 Bay Street, Toronto, ON M5G 0A4, Canada
- Xuejian Xiong
- Program in Molecular Structure and Function, Hospital for Sick Children, 21-9830 PGCRL, 686 Bay Street, Toronto, ON M5G 0A4, Canada
- Lakshmipuram S Swapna
- Program in Molecular Structure and Function, Hospital for Sick Children, 21-9830 PGCRL, 686 Bay Street, Toronto, ON M5G 0A4, Canada
- Noeleen Loughran
- Program in Molecular Structure and Function, Hospital for Sick Children, 21-9830 PGCRL, 686 Bay Street, Toronto, ON M5G 0A4, Canada
- Hongyan Song
- Program in Molecular Structure and Function, Hospital for Sick Children, 21-9830 PGCRL, 686 Bay Street, Toronto, ON M5G 0A4, Canada
- John Parkinson
- Program in Molecular Structure and Function, Hospital for Sick Children, 21-9830 PGCRL, 686 Bay Street, Toronto, ON M5G 0A4, Canada
- Departments of Biochemistry, Computer Science and Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada
5
Morel G, Sterck L, Swennen D, Marcet-Houben M, Onesime D, Levasseur A, Jacques N, Mallet S, Couloux A, Labadie K, Amselem J, Beckerich JM, Henrissat B, Van de Peer Y, Wincker P, Souciet JL, Gabaldón T, Tinsley CR, Casaregola S. Differential gene retention as an evolutionary mechanism to generate biodiversity and adaptation in yeasts. Sci Rep 2015; 5:11571. [PMID: 26108467 PMCID: PMC4479816 DOI: 10.1038/srep11571] [Citation(s) in RCA: 43] [Impact Index Per Article: 4.8] [Received: 01/23/2015] [Accepted: 05/29/2015] [Indexed: 12/13/2022] Open
Abstract
The evolutionary history of the characters underlying the adaptation of microorganisms to food and biotechnological uses is poorly understood. We undertook comparative genomics to investigate evolutionary relationships of the dairy yeast Geotrichum candidum within Saccharomycotina. Surprisingly, a remarkable proportion of genes showed discordant phylogenies, clustering with the filamentous fungus subphylum (Pezizomycotina), rather than the yeast subphylum (Saccharomycotina), of the Ascomycota. These genes appear not to be the result of Horizontal Gene Transfer (HGT), but to have been specifically retained by G. candidum after the filamentous fungi-yeasts split concomitant with the yeasts' genome contraction. We refer to these genes as SRAGs (Specifically Retained Ancestral Genes), having been lost by all or nearly all other yeasts, and thus contributing to the phenotypic specificity of lineages. SRAG functions include lipases consistent with a role in cheese making and novel endoglucanases associated with degradation of plant material. Similar gene retention was observed in three other distantly related yeasts representative of this ecologically diverse subphylum. The phenomenon thus appears to be widespread in the Saccharomycotina and argues that, alongside neo-functionalization following gene duplication and HGT, specific gene retention must be recognized as an important mechanism for generation of biodiversity and adaptation in yeasts.
Affiliation(s)
- Guillaume Morel
- INRA UMR1319, Micalis Institute, CIRM-Levures, 78850 F-Thiverval-Grignon, France
- AgroParisTech UMR1319, Micalis Institute, 78850 F-Thiverval-Grignon, France
- Lieven Sterck
- Department of Plant Systems Biology VIB, Technologiepark 927, 9052 Gent, Belgium
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Technologiepark 927, 9052 Gent, Belgium
- Dominique Swennen
- INRA UMR1319, Micalis Institute, CIRM-Levures, 78850 F-Thiverval-Grignon, France
- AgroParisTech UMR1319, Micalis Institute, 78850 F-Thiverval-Grignon, France
- Marina Marcet-Houben
- Bioinformatics and Genomics Programme, Centre for Genomic Regulation, Dr. Aiguader 88, Barcelona 08003, Spain
- Universitat Pompeu Fabra (UPF), Barcelona 08003, Spain
- Djamila Onesime
- INRA UMR1319, Micalis Institute, CIRM-Levures, 78850 F-Thiverval-Grignon, France
- AgroParisTech UMR1319, Micalis Institute, 78850 F-Thiverval-Grignon, France
- Anthony Levasseur
- INRA UMR1163, Biotechnologie des Champignons Filamenteux, Aix-Marseille Université, Polytech Marseille, 163 avenue de Luminy, CP 925, 13288 Marseille Cedex 09, France
- Noémie Jacques
- INRA UMR1319, Micalis Institute, CIRM-Levures, 78850 F-Thiverval-Grignon, France
- AgroParisTech UMR1319, Micalis Institute, 78850 F-Thiverval-Grignon, France
- Sandrine Mallet
- INRA UMR1319, Micalis Institute, CIRM-Levures, 78850 F-Thiverval-Grignon, France
- AgroParisTech UMR1319, Micalis Institute, 78850 F-Thiverval-Grignon, France
- Arnaux Couloux
- CEA, Institut de Génomique, Genoscope, 2 Rue Gaston Crémieux, Évry F-91000, France
- Karine Labadie
- CEA, Institut de Génomique, Genoscope, 2 Rue Gaston Crémieux, Évry F-91000, France
- Joëlle Amselem
- INRA UR1164, Unité de Recherche Génomique – Info, 78000 Versailles, France
- Jean-Marie Beckerich
- INRA UMR1319, Micalis Institute, CIRM-Levures, 78850 F-Thiverval-Grignon, France
- AgroParisTech UMR1319, Micalis Institute, 78850 F-Thiverval-Grignon, France
- Yves Van de Peer
- Department of Plant Systems Biology VIB, Technologiepark 927, 9052 Gent, Belgium
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Technologiepark 927, 9052 Gent, Belgium
- Genomics Research Institute, University of Pretoria, Hatfield Campus, Pretoria 0028, South Africa
- Patrick Wincker
- CEA, Institut de Génomique, Genoscope, 2 Rue Gaston Crémieux, Évry F-91000, France
- CNRS UMR 8030, 2 Rue Gaston Crémieux, Évry, 91000, France
- Université d’Evry, Bd François Mitterand, Evry, 91025, France
- Jean-Luc Souciet
- Université de Strasbourg, CNRS UMR7156, Strasbourg, 67000, France
- Toni Gabaldón
- Bioinformatics and Genomics Programme, Centre for Genomic Regulation, Dr. Aiguader 88, Barcelona 08003, Spain
- Universitat Pompeu Fabra (UPF), Barcelona 08003, Spain
- Colin R. Tinsley
- INRA UMR1319, Micalis Institute, CIRM-Levures, 78850 F-Thiverval-Grignon, France
- AgroParisTech UMR1319, Micalis Institute, 78850 F-Thiverval-Grignon, France
- Serge Casaregola
- INRA UMR1319, Micalis Institute, CIRM-Levures, 78850 F-Thiverval-Grignon, France
- AgroParisTech UMR1319, Micalis Institute, 78850 F-Thiverval-Grignon, France
6
Panzeri I, Rossetti G, Abrignani S, Pagani M. Long Intergenic Non-Coding RNAs: Novel Drivers of Human Lymphocyte Differentiation. Front Immunol 2015; 6:175. [PMID: 25926836 PMCID: PMC4397839 DOI: 10.3389/fimmu.2015.00175] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 01/22/2015] [Accepted: 03/28/2015] [Indexed: 12/29/2022] Open
Abstract
Upon recognition of a foreign antigen, CD4(+) naïve T lymphocytes proliferate and differentiate into subsets with distinct functions. This process is fundamental for effective immune system function, as CD4(+) T cells orchestrate both the innate and adaptive immune response. Traditionally, this differentiation event has been regarded as the acquisition of an irreversible cell fate, so that memory and effector CD4(+) T subsets were considered terminally differentiated cells or lineages. Consequently, these lineages are conventionally defined by their prototypical set of cytokines and transcription factors. However, recent findings suggest that CD4(+) T lymphocytes possess a remarkable phenotypic plasticity, as they can often re-direct their functional program depending on the milieu they encounter. Therefore, compelling new questions arise, such as which molecular determinants underlie plasticity and stability, and how the balance between these two opposing forces drives cell fate. In some cases, the mere expression of cytokines and master regulators cannot fully explain lymphocyte plasticity. We should consider other layers of regulation, including epigenetic factors such as the modulation of chromatin state or the transcription of non-coding RNAs, whose high cell specificity hints at their involvement in cell fate determination. In this review, we will focus on the recent advances in understanding CD4(+) T lymphocyte subset specification from an epigenetic point of view. In particular, we will emphasize the emerging importance of non-coding RNAs as key players in these differentiation events. We will also present here new data from our laboratory highlighting the contribution of long non-coding RNAs in driving human CD4(+) T lymphocyte differentiation.
Affiliation(s)
- Ilaria Panzeri
- Integrative Biology Unit, Istituto Nazionale Genetica Molecolare "Romeo ed Enrica Invernizzi", IRCCS Ospedale Maggiore Policlinico, Milano, Italy
- Grazisa Rossetti
- Integrative Biology Unit, Istituto Nazionale Genetica Molecolare "Romeo ed Enrica Invernizzi", IRCCS Ospedale Maggiore Policlinico, Milano, Italy
- Sergio Abrignani
- Integrative Biology Unit, Istituto Nazionale Genetica Molecolare "Romeo ed Enrica Invernizzi", IRCCS Ospedale Maggiore Policlinico, Milano, Italy
- Massimiliano Pagani
- Integrative Biology Unit, Istituto Nazionale Genetica Molecolare "Romeo ed Enrica Invernizzi", IRCCS Ospedale Maggiore Policlinico, Milano, Italy
- Department of Medical Biotechnology and Translational Medicine, Università degli Studi di Milano, Milano, Italy
7
Human-Chromatin-Related Protein Interactions Identify a Demethylase Complex Required for Chromosome Segregation. Cell Rep 2014; 8:297-310. [DOI: 10.1016/j.celrep.2014.05.050] [Citation(s) in RCA: 52] [Impact Index Per Article: 5.2] [Received: 12/28/2013] [Revised: 04/24/2014] [Accepted: 05/27/2014] [Indexed: 01/14/2023] Open
8
Arcas A, Fernández-Capetillo O, Cases I, Rojas AM. Emergence and evolutionary analysis of the human DDR network: implications in comparative genomics and downstream analyses. Mol Biol Evol 2014; 31:940-61. [PMID: 24441036 PMCID: PMC3969565 DOI: 10.1093/molbev/msu046] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Indexed: 01/07/2023] Open
Abstract
The DNA damage response (DDR) is a crucial signaling network that preserves the integrity of the genome. This network is an ensemble of distinct but often overlapping subnetworks, where different components fulfill distinct functions in precise spatial and temporal scenarios. To understand how these elements have been assembled in humans, we performed comparative genomic analyses in 47 selected species to trace back their emergence using systematic phylogenetic analyses and estimated gene ages. The emergence of the contribution of posttranslational modifications to the complex regulation of DDR was also investigated. This is the first time a systematic analysis has focused on the evolution of DDR subnetworks as a whole. Our results indicate that a DDR core, mostly constructed around metabolic activities, appeared soon after the emergence of eukaryotes, and that additional regulatory capacities appeared later through complex evolutionary processes. Potential key posttranslational modifications were also in place then, with interacting pairs preferentially appearing at the same evolutionary time, although modifications often led to the subsequent acquisition of new targets. We also found extensive gene loss in essential modules of the regulatory network in fungi, plants, and arthropods, important for their validation as model organisms for DDR studies.
Affiliation(s)
- Aida Arcas
- Computational Cell Biology Group, Institute for Predictive and Personalized Medicine of Cancer, Badalona, Spain
9
Structure and function of long noncoding RNAs in epigenetic regulation. Nat Struct Mol Biol 2013; 20:300-7. [DOI: 10.1038/nsmb.2480] [Citation(s) in RCA: 1087] [Impact Index Per Article: 98.8] [Received: 09/17/2012] [Accepted: 11/20/2012] [Indexed: 12/21/2022]
10
Kono M, Herrmann S, Loughran NB, Cabrera A, Engelberg K, Lehmann C, Sinha D, Prinz B, Ruch U, Heussler V, Spielmann T, Parkinson J, Gilberger TW. Evolution and architecture of the inner membrane complex in asexual and sexual stages of the malaria parasite. Mol Biol Evol 2012; 29:2113-32. [PMID: 22389454 DOI: 10.1093/molbev/mss081] [Citation(s) in RCA: 90] [Impact Index Per Article: 7.5] [Indexed: 12/31/2022] Open
Abstract
The inner membrane complex (IMC) is a unifying morphological feature of all alveolate organisms. It consists of flattened vesicles underlying the plasma membrane and is interconnected with the cytoskeleton. Depending on the ecological niche of the organisms, the function of the IMC ranges from a fundamental role as reinforcement system to more specialized roles in motility and cytokinesis. In this article, we present a comprehensive evolutionary analysis of IMC components, which exemplifies the adaptive nature of the IMCs' protein composition. Focusing on eight structurally distinct proteins in the most prominent "genus" of the Alveolata, the malaria parasite Plasmodium, we demonstrate that the level of conservation is reflected in phenotypic characteristics, accentuated in differential spatial-temporal patterns of these proteins in the motile stages of the parasite's life cycle. Colocalization studies with the centromere and the spindle apparatus reveal their discriminative biogenesis. We also reveal that the IMC is an essential structural compartment for the development of the sexual stages of Plasmodium, as it seems to drive the morphological changes of the parasite during the long and multistaged process of sexual differentiation. We further found a Plasmodium-specific IMC membrane matrix protein that highlights transversal structures in gametocytes, which could represent a genus-specific structural innovation required by Plasmodium. We conclude that the IMC has an additional role during sexual development supporting morphogenesis of the cell, which in addition to its functions in the asexual stages highlights the multifunctional nature of the IMC in the Plasmodium life cycle.
Affiliation(s)
- Maya Kono
- Department of Molecular Parasitology, Bernhard Nocht Institute for Tropical Medicine, Hamburg, Germany
11
Cromar GL, Xiong X, Chautard E, Ricard-Blum S, Parkinson J. Toward a systems level view of the ECM and related proteins: a framework for the systematic definition and analysis of biological systems. Proteins 2012; 80:1522-44. [PMID: 22275077 DOI: 10.1002/prot.24036] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Received: 10/24/2011] [Revised: 12/19/2011] [Accepted: 12/29/2011] [Indexed: 12/20/2022]
Abstract
Advances in high throughput 'omic technologies are starting to provide unprecedented insights into how components of biological systems are organized and interact. Key to exploiting these datasets is the definition of the components that comprise the system of interest. Although a variety of knowledge bases exist that capture such information, a major challenge is determining how these resources may be best utilized. Here we present a systematic curation strategy to define a systems-level view of the human extracellular matrix (ECM), a three-dimensional meshwork of proteins and polysaccharides that imparts structure and mechanical stability to tissues. Employing our curation strategy, we define a set of 357 proteins that represent core components of the ECM, together with an additional 524 genes that mediate related functional roles, and construct a map of their physical interactions. Topological properties help identify modules of functionally related proteins, including those involved in cell adhesion, bone formation and blood clotting. Because of its major role in cell adhesion, proliferation and morphogenesis, defects in the ECM have been implicated in cancer, atherosclerosis, asthma, fibrosis, and arthritis. We use MeSH annotations to identify modules enriched for specific disease terms, which help to strengthen existing gene-disease associations and to predict novel ones. Mapping expression and conservation data onto the network reveals modules that evolved in parallel to convey tissue-specific functionality on otherwise broadly expressed units. In addition to demonstrating an effective workflow for defining biological systems, this study crystallizes our current knowledge surrounding the organization of the ECM.
Affiliation(s)
- Graham L Cromar
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 1A8, Canada
12
PhyloPro: a web-based tool for the generation and visualization of phylogenetic profiles across Eukarya. Bioinformatics 2011; 27:877-8. [DOI: 10.1093/bioinformatics/btr023] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Indexed: 11/15/2022] Open
13
Turinsky AL, Turner B, Borja RC, Gleeson JA, Heath M, Pu S, Switzer T, Dong D, Gong Y, On T, Xiong X, Emili A, Greenblatt J, Parkinson J, Zhang Z, Wodak SJ. DAnCER: disease-annotated chromatin epigenetics resource. Nucleic Acids Res 2011; 39:D889-94. [PMID: 20876685 PMCID: PMC3013761 DOI: 10.1093/nar/gkq857] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Received: 08/15/2010] [Accepted: 09/12/2010] [Indexed: 12/15/2022] Open
Abstract
Chromatin modification (CM) is a set of epigenetic processes that govern many aspects of DNA replication, transcription and repair. CM is carried out by groups of physically interacting proteins, and their disruption has been linked to a number of complex human diseases. CM remains largely unexplored, however, especially in higher eukaryotes such as human. Here we present the DAnCER resource, which integrates information on genes with CM function from five model organisms, including human. Currently integrated are gene functional annotations, Pfam domain architecture, protein interaction networks and associated human diseases. Additional supporting evidence includes orthology relationships across organisms, membership in protein complexes, and information on protein 3D structure. These data are available for 962 experimentally confirmed and manually curated CM genes and for over 5000 genes with predicted CM function on the basis of orthology and domain composition. DAnCER allows visual explorations of the integrated data and flexible query capabilities using a variety of data filters. In particular, disease information and functional annotations are mapped onto the protein interaction networks, enabling the user to formulate new hypotheses on the function and disease associations of a given gene based on those of its interaction partners. DAnCER is freely available at http://wodaklab.org/dancer/.
Affiliation(s)
- Andrei L. Turinsky
- Program in Molecular Structure and Function, Hospital for Sick Children, Banting and Best Department of Medical Research, University of Toronto, Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Department of Molecular Genetics, University of Toronto and Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada
- Brian Turner
- Program in Molecular Structure and Function, Hospital for Sick Children, Banting and Best Department of Medical Research, University of Toronto, Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Department of Molecular Genetics, University of Toronto and Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada
- Rosanne C. Borja
- Program in Molecular Structure and Function, Hospital for Sick Children, Banting and Best Department of Medical Research, University of Toronto, Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Department of Molecular Genetics, University of Toronto and Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada
- James A. Gleeson
- Program in Molecular Structure and Function, Hospital for Sick Children, Banting and Best Department of Medical Research, University of Toronto, Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Department of Molecular Genetics, University of Toronto and Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada
- Michael Heath
- Program in Molecular Structure and Function, Hospital for Sick Children, Banting and Best Department of Medical Research, University of Toronto, Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Department of Molecular Genetics, University of Toronto and Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada
- Shuye Pu
- Program in Molecular Structure and Function, Hospital for Sick Children, Banting and Best Department of Medical Research, University of Toronto, Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Department of Molecular Genetics, University of Toronto and Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada
- Thomas Switzer
- Program in Molecular Structure and Function, Hospital for Sick Children, Banting and Best Department of Medical Research, University of Toronto, Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Department of Molecular Genetics, University of Toronto and Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada
| | - Dong Dong
- Program in Molecular Structure and Function, Hospital for Sick Children, Banting and Best Department of Medical Research, University of Toronto, Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Department of Molecular Genetics, University of Toronto and Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada
| | - Yunchen Gong
- Program in Molecular Structure and Function, Hospital for Sick Children, Banting and Best Department of Medical Research, University of Toronto, Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Department of Molecular Genetics, University of Toronto and Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada
| | - Tuan On
- Program in Molecular Structure and Function, Hospital for Sick Children, Banting and Best Department of Medical Research, University of Toronto, Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Department of Molecular Genetics, University of Toronto and Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada
| | - Xuejian Xiong
- Program in Molecular Structure and Function, Hospital for Sick Children, Banting and Best Department of Medical Research, University of Toronto, Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Department of Molecular Genetics, University of Toronto and Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada
| | - Andrew Emili
- Program in Molecular Structure and Function, Hospital for Sick Children, Banting and Best Department of Medical Research, University of Toronto, Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Department of Molecular Genetics, University of Toronto and Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada
| | - Jack Greenblatt
- Program in Molecular Structure and Function, Hospital for Sick Children, Banting and Best Department of Medical Research, University of Toronto, Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Department of Molecular Genetics, University of Toronto and Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada
| | - John Parkinson
- Program in Molecular Structure and Function, Hospital for Sick Children, Banting and Best Department of Medical Research, University of Toronto, Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Department of Molecular Genetics, University of Toronto and Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada
| | - Zhaolei Zhang
- Program in Molecular Structure and Function, Hospital for Sick Children, Banting and Best Department of Medical Research, University of Toronto, Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Department of Molecular Genetics, University of Toronto and Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada
| | - Shoshana J. Wodak
- Program in Molecular Structure and Function, Hospital for Sick Children, Banting and Best Department of Medical Research, University of Toronto, Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Department of Molecular Genetics, University of Toronto and Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada
| |
|
14
|
Pu S, Turinsky AL, Vlasblom J, On T, Xiong X, Emili A, Zhang Z, Greenblatt J, Parkinson J, Wodak SJ. Expanding the landscape of chromatin modification (CM)-related functional domains and genes in human. PLoS One 2010; 5:e14122. [PMID: 21124763 PMCID: PMC2993927 DOI: 10.1371/journal.pone.0014122] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 05/04/2010] [Accepted: 10/26/2010] [Indexed: 01/06/2023]
Abstract
Chromatin modification (CM) plays a key role in regulating transcription, DNA replication, repair and recombination. However, our knowledge of these processes in humans remains very limited. Here we use computational approaches to study proteins and functional domains involved in CM in humans. We analyze the abundance and the pairwise domain-domain co-occurrences of 25 well-documented CM domains in 5 model organisms: yeast, worm, fly, mouse and human. Results show that domains involved in histone methylation, DNA methylation, and histone variants are remarkably expanded in metazoans, reflecting the increased demand for cell type-specific gene regulation. We find that CM domains tend to co-occur with a limited number of partner domains and are hence not promiscuous. This property is exploited to identify 47 potentially novel CM domains, including 24 DNA-binding domains, whose role in CM has received little attention so far. Lastly, we use a consensus machine learning approach to predict 379 novel CM genes (coding for 329 proteins) in humans based on domain compositions. Several of these predictions are supported by very recent experimental studies and others are slated for experimental verification. Identification of novel CM genes and domains in humans will aid our understanding of fundamental epigenetic processes that are important for stem cell differentiation and cancer biology. Information on all the candidate CM domains and genes reported here is publicly available.
Affiliation(s)
- Shuye Pu
- Program in Molecular Structure & Function, Hospital for Sick Children, Toronto, Canada
| | - Andrei L. Turinsky
- Program in Molecular Structure & Function, Hospital for Sick Children, Toronto, Canada
| | - James Vlasblom
- Program in Molecular Structure & Function, Hospital for Sick Children, Toronto, Canada
- Department of Biochemistry, University of Toronto, Toronto, Canada
| | - Tuan On
- Program in Molecular Structure & Function, Hospital for Sick Children, Toronto, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Canada
| | - Xuejian Xiong
- Program in Molecular Structure & Function, Hospital for Sick Children, Toronto, Canada
| | - Andrew Emili
- Department of Molecular Genetics, University of Toronto, Toronto, Canada
- Terrence Donnelly Centre for Cellular and Biomolecular Research, Toronto, Canada
- Banting and Best Department of Medical Research, Toronto, Canada
| | - Zhaolei Zhang
- Department of Molecular Genetics, University of Toronto, Toronto, Canada
- Terrence Donnelly Centre for Cellular and Biomolecular Research, Toronto, Canada
- Banting and Best Department of Medical Research, Toronto, Canada
| | - Jack Greenblatt
- Department of Molecular Genetics, University of Toronto, Toronto, Canada
- Terrence Donnelly Centre for Cellular and Biomolecular Research, Toronto, Canada
- Banting and Best Department of Medical Research, Toronto, Canada
| | - John Parkinson
- Program in Molecular Structure & Function, Hospital for Sick Children, Toronto, Canada
- Department of Biochemistry, University of Toronto, Toronto, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Canada
| | - Shoshana J. Wodak
- Program in Molecular Structure & Function, Hospital for Sick Children, Toronto, Canada
- Department of Biochemistry, University of Toronto, Toronto, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Canada
| |
|