201
|
Murakami Y, Matsumoto Y, Tsuru S, Ying BW, Yomo T. Global coordination in adaptation to gene rewiring. Nucleic Acids Res 2015; 43:1304-16. [PMID: 25564530 PMCID: PMC4333410 DOI: 10.1093/nar/gku1366] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Indexed: 11/25/2022]
Abstract
Gene rewiring is a common evolutionary phenomenon in nature that may lead to extinction for living organisms. Recent studies on synthetic biology demonstrate that cells can survive genetic rewiring. This survival (adaptation) is often linked to the stochastic expression of rewired genes with random transcriptional changes. However, the probability of adaptation and the underlying common principles are not clear. We performed a systematic survey of an assortment of gene-rewired Escherichia coli strains to address these questions. Three different cell fates, designated good survivors, poor survivors and failures, were observed when the strains starved. Large fluctuations in the expression of the rewired gene were commonly observed with increasing cell size, but these changes were insufficient for adaptation. Cooperative reorganizations in the corresponding operon and genome-wide gene expression largely contributed to the final success. Transcriptome reorganizations that generally showed high-dimensional dynamic changes were restricted within a one-dimensional trajectory for adaptation to gene rewiring, indicating a general path directed toward cellular plasticity for a successful cell fate. This finding of global coordination supports a mechanism of stochastic adaptation and provides novel insights into the design and application of complex genetic or metabolic networks.
Affiliation(s)
- Yoshie Murakami
- Graduate School of Information Science and Technology, Osaka University, 1-5 Yamadaoka, Suita, Osaka 565-0871, Japan
- Yuki Matsumoto
- Graduate School of Life and Environmental Sciences, University of Tsukuba, Tsukuba, Ibaraki 305-8572, Japan
- Saburo Tsuru
- Graduate School of Information Science and Technology, Osaka University, 1-5 Yamadaoka, Suita, Osaka 565-0871, Japan
- Bei-Wen Ying
- Graduate School of Life and Environmental Sciences, University of Tsukuba, Tsukuba, Ibaraki 305-8572, Japan
- Tetsuya Yomo
- Graduate School of Information Science and Technology, Osaka University, 1-5 Yamadaoka, Suita, Osaka 565-0871, Japan; Graduate School of Frontier Biosciences, Osaka University, Suita, Osaka 565-0871, Japan; Exploratory Research for Advanced Technology (ERATO), Japan Science and Technology Agency (JST), Suita, Osaka 565-0871, Japan; Earth-Life Science Institute, Tokyo Institute of Technology, Meguro, Tokyo 152-8550, Japan
|
202
|
The Little Known Universe of Short Proteins in Insects: A Machine Learning Approach. SHORT VIEWS ON INSECT GENOMICS AND PROTEOMICS 2015. [DOI: 10.1007/978-3-319-24235-4_8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/11/2022]
|
203
|
Zhao Y, Liu T, Luo J, Zhang Q, Xu S, Han C, Xu J, Chen M, Chen Y, Kong L. Integration of a Decrescent Transcriptome and Metabolomics Dataset of Peucedanum praeruptorum to Investigate the CYP450 and MDR Genes Involved in Coumarins Biosynthesis and Transport. FRONTIERS IN PLANT SCIENCE 2015; 6:996. [PMID: 26697023 PMCID: PMC4674560 DOI: 10.3389/fpls.2015.00996] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.1] [Received: 08/06/2015] [Accepted: 10/30/2015] [Indexed: 05/09/2023]
Abstract
Peucedanum praeruptorum Dunn is a well-known traditional Chinese medicine. However, little is known about the biosynthesis and transport mechanisms of its coumarin compounds at the molecular level. Although transcriptome sequencing is playing an increasingly significant role in gene discovery, it is not sufficient for predicting the specific function of a target gene. Furthermore, there is also a huge dataset to be analyzed. In this study, an RNA sequencing-assisted transcriptome dataset and a metabolomics dataset of P. praeruptorum, based on high-performance liquid chromatography (HPLC) coupled with electrospray-ionization quadrupole time-of-flight mass spectrometry (Q-TOF MS), were first constructed for gene discovery and compound identification. Subsequently, methyl jasmonate (MeJA)-induced gene expression analysis and metabolomics analysis were conducted to narrow down the dataset for selecting the candidate genes and the potential marker metabolites. Finally, the genes involved in coumarin biosynthesis and transport were predicted with parallel analysis of transcript and metabolic profiles. As a result, a total of 40,952 unigenes and 19 coumarin compounds were obtained. Based on the results of gene expression and metabolomics analysis, 7 cytochrome-P450 and 8 multidrug resistance transporter unigenes were selected as candidate genes and 8 marker compounds were selected as biomarkers, respectively. The parallel analysis of gene expression and metabolite accumulation indicated that the genes labeled 23,746, 228, and 30,922 were related to the formation of the coumarin core compounds, whereas 36,276 and 9533 participated in the prenylation, hydroxylation, cyclization or structural modification. Similarly, 1462, 20,815, and 15,318 participated in the transport of coumarin core compounds, while 124,029 and 324,293 participated in the transport of the modified compounds. This finding suggested that integration of a decrescent transcriptome and metabolomics dataset could largely narrow down the number of genes to be investigated and significantly improve the efficiency of functional gene prediction. In addition, the large amount of transcriptomic data produced from P. praeruptorum and the genes discovered in this study would provide useful information for investigating the biosynthesis and transport mechanisms of coumarins.
Affiliation(s)
- Yucheng Zhao
- State Key Laboratory of Natural Medicines, Department of Natural Medicinal Chemistry, China Pharmaceutical University, Nanjing, China
- Tingting Liu
- State Key Laboratory of Natural Medicines, Department of Natural Medicinal Chemistry, China Pharmaceutical University, Nanjing, China
- Jun Luo
- State Key Laboratory of Natural Medicines, Department of Natural Medicinal Chemistry, China Pharmaceutical University, Nanjing, China
- Qian Zhang
- State Key Laboratory of Natural Medicines, Department of Natural Medicinal Chemistry, China Pharmaceutical University, Nanjing, China
- Sheng Xu
- Institute of Botany, Jiangsu Province and Chinese Academy of Sciences, Nanjing, China
- Chao Han
- State Key Laboratory of Natural Medicines, Department of Natural Medicinal Chemistry, China Pharmaceutical University, Nanjing, China
- Jinfang Xu
- State Key Laboratory of Natural Medicines, Department of Natural Medicinal Chemistry, China Pharmaceutical University, Nanjing, China
- Menghan Chen
- State Key Laboratory of Natural Medicines, Department of Natural Medicinal Chemistry, China Pharmaceutical University, Nanjing, China
- Yijun Chen
- State Key Laboratory of Natural Medicines, Department of Natural Medicinal Chemistry, China Pharmaceutical University, Nanjing, China
- Lingyi Kong
- State Key Laboratory of Natural Medicines, Department of Natural Medicinal Chemistry, China Pharmaceutical University, Nanjing, China
- *Correspondence: Lingyi Kong
|
204
|
Abstract
The database of essential genes (DEG, available at http://www.essentialgene.org), constructed in 2003, has been timely updated to harbor essential-gene records of bacteria, archaea, and eukaryotes. DEG 10, the current release, includes not only essential protein-coding genes determined by genome-wide gene essentiality screens but also essential noncoding RNAs, promoters, regulatory sequences, and replication origins. Therefore, DEG 10 includes essential genomic elements under different conditions in three domains of life, with customizable BLAST tools. Based on the analysis of DEG 10, we show that the percentage of essential genes in bacterial genomes exhibits an exponential decay with increasing genome sizes. The functions, ATP binding (GO:0005524), GTP binding (GO:0005525), and DNA-directed RNA polymerase activity (GO:0003899), are likely required for organisms across life domains.
|
205
|
Abstract
The high complexity of the total cellular proteome underscores the need for a more targeted investigation of particular subcellular fractions as a means to detect the changes at the level of low abundance proteins. However, this approach requires the application of an enrichment strategy. In this chapter, we present the protocols, which have been used for the analysis of secretome from cell lines, targeting the investigation of protein expression changes.
|
206
|
Fang X, Chen W, Zhao Y, Ruan S, Zhang H, Yan C, Jin L, Cao L, Zhu J, Ma H, Cheng Z. Global analysis of lysine acetylation in strawberry leaves. FRONTIERS IN PLANT SCIENCE 2015; 6:739. [PMID: 26442052 PMCID: PMC4569977 DOI: 10.3389/fpls.2015.00739] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.4] [Received: 06/24/2015] [Accepted: 08/31/2015] [Indexed: 05/08/2023]
Abstract
Protein lysine acetylation is a reversible and dynamic post-translational modification. It plays an important role in regulating diverse cellular processes, including chromatin dynamics, metabolic pathways, and transcription, in both prokaryotes and eukaryotes. Although studies of the lysine acetylome in plants have been reported, the throughput was not high enough, hindering a deep understanding of lysine acetylation in plant physiology and pathology. In this study, taking advantage of anti-acetyllysine-based enrichment and highly sensitive mass spectrometry, we applied an integrated proteomic approach to comprehensively investigate the lysine acetylome in strawberry. In total, we identified 1392 acetylation sites in 684 proteins, representing the largest acetylome dataset in plants to date. To reveal the functional impacts of lysine acetylation in strawberry, intensive bioinformatic analysis was performed. The results significantly expanded our current understanding of the plant acetylome and demonstrated that lysine acetylation is involved in multiple metabolic and cellular processes. More interestingly, nearly 50% of all acetylated proteins identified in this work were localized in the chloroplast, and the vital role of lysine acetylation in photosynthesis was also revealed. Taken together, this study not only established the most extensive lysine acetylome in plants to date, but also systematically suggests the significant and unique roles of lysine acetylation in plants.
Affiliation(s)
- Xianping Fang
- Institute of Biology, Hangzhou Academy of Agricultural Sciences, Hangzhou, China
- Wenyue Chen
- Institute of Biology, Hangzhou Academy of Agricultural Sciences, Hangzhou, China
- Yun Zhao
- Experiment Center, Hangzhou Academy of Agricultural Sciences, Hangzhou, China
- Songlin Ruan
- Institute of Biology, Hangzhou Academy of Agricultural Sciences, Hangzhou, China
- Hengmu Zhang
- Institute of Virology and Biotechnology, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
- Chengqi Yan
- Institute of Virology and Biotechnology, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
- Liang Jin
- Research and Development Center of Flower, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
- Jun Zhu
- Jingjie PTM Biolabs, Hangzhou, China
- Huasheng Ma
- Institute of Biology, Hangzhou Academy of Agricultural Sciences, Hangzhou, China
- *Correspondence: Huasheng Ma, Hangzhou Academy of Agricultural Sciences, Institute of Biology, East Hangxin Road 1, Hangzhou 310024, China
- Zhongyi Cheng
- Institute for Advanced Study of Translational Medicine, Tongji University, Shanghai, China
- Zhongyi Cheng, Institute for Advanced Study of Translational Medicine, Tongji University, Siping Road 1239, Shanghai 200092, China
|
207
|
|
208
|
Diament A, Pinter RY, Tuller T. Three-dimensional eukaryotic genomic organization is strongly correlated with codon usage expression and function. Nat Commun 2014; 5:5876. [PMID: 25510862 DOI: 10.1038/ncomms6876] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Received: 06/28/2014] [Accepted: 11/17/2014] [Indexed: 01/08/2023]
Abstract
It has been shown that the distribution of genes in eukaryotic genomes is not random; however, formerly reported relations between gene function and genomic organization were relatively weak. Previous studies have demonstrated that codon usage bias is related to all stages of gene expression and to protein function. Here we apply a novel tool for assessing functional relatedness, codon usage frequency similarity (CUFS), which measures similarity between genes in terms of codon and amino acid usage. By analyzing chromosome conformation capture data, describing the three-dimensional (3D) conformation of the DNA, we show that the functional similarity between genes captured by CUFS is directly and very strongly correlated with their 3D distance in Saccharomyces cerevisiae, Schizosaccharomyces pombe, Arabidopsis thaliana, mouse and human. This emphasizes the importance of three-dimensional genomic localization in eukaryotes and indicates that codon usage is tightly linked to genome architecture.
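Illustrative note (not from the paper): the abstract above describes CUFS only as a similarity between genes in terms of codon and amino acid usage and gives no formula. The sketch below shows one plausible way to compare two genes by codon usage, assuming each gene is summarized as a codon frequency vector and the two vectors are compared with the Jensen-Shannon distance. The function names `codon_frequencies` and `js_distance` are hypothetical, and the choice of distance is an assumption rather than the authors' stated definition.

```python
from collections import Counter
from math import log2, sqrt


def codon_frequencies(cds: str) -> dict[str, float]:
    """Relative frequency of each codon in a coding sequence read in frame 0."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    if not codons:
        raise ValueError("sequence contains no complete codon")
    counts = Counter(codons)
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}


def js_distance(p: dict[str, float], q: dict[str, float]) -> float:
    """Jensen-Shannon distance (base 2): 0 for identical usage, 1 for disjoint usage."""
    divergence = 0.0
    for codon in set(p) | set(q):
        pc, qc = p.get(codon, 0.0), q.get(codon, 0.0)
        mc = 0.5 * (pc + qc)  # mixture distribution
        if pc > 0:
            divergence += 0.5 * pc * log2(pc / mc)
        if qc > 0:
            divergence += 0.5 * qc * log2(qc / mc)
    return sqrt(max(divergence, 0.0))  # guard against tiny negative rounding


# Hypothetical toy sequences, for illustration only.
gene_a = codon_frequencies("ATGGCTGCTAAA")
gene_b = codon_frequencies("ATGGCCGCCAAG")
distance = js_distance(gene_a, gene_b)
```

A smaller distance means more similar codon usage. Relating such a distance to 3D genomic distance, as the abstract reports, would additionally require chromosome conformation capture data, which this sketch does not model.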
Affiliation(s)
- Alon Diament
- Department of Biomedical Engineering, Tel Aviv University, Tel Aviv 6997801, Israel
- Ron Y Pinter
- Department of Computer Science, Technion-Israel Institute of Technology, Haifa 32000, Israel
- Tamir Tuller
- [1] Department of Biomedical Engineering, Tel Aviv University, Tel Aviv 6997801, Israel; [2] The Sagol School of Neuroscience, Tel Aviv University, Tel Aviv 6997801, Israel
|
209
|
Transcriptome sequencing reveals the virulence and environmental genetic programs of Vibrio vulnificus exposed to host and estuarine conditions. PLoS One 2014; 9:e114376. [PMID: 25489854 PMCID: PMC4260858 DOI: 10.1371/journal.pone.0114376] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.6] [Received: 09/02/2014] [Accepted: 11/09/2014] [Indexed: 12/31/2022]
Abstract
Vibrio vulnificus is a natural inhabitant of estuarine waters worldwide and is of medical relevance due to its ability to cause grievous wound infections and/or fatal septicemia. Genetic polymorphisms within the virulence-correlated gene (vcg) serve as a primary feature to distinguish clinical (C-) genotypes from environmental (E-) genotypes. C-genotypes demonstrate superior survival in human serum relative to E-genotypes, and genome comparisons have allowed for the identification of several putative virulence factors that could potentially aid C-genotypes in disease progression. We used RNA sequencing to analyze the transcriptome of C-genotypes exposed to human serum relative to seawater, which revealed two divergent genetic programs under these two conditions. In human serum, cells displayed a distinct "virulence profile" in which a number of putative virulence factors were upregulated, including genes involved in intracellular signaling, substrate binding and transport, toxin and exoenzyme production, and the heat shock response. Conversely, the "environmental profile" exhibited by cells in seawater revealed upregulation of transcription factors such as rpoS, rpoN, and iscR, as well as genes involved in intracellular signaling, chemotaxis, adherence, and biofilm formation. This dichotomous genetic switch appears to be largely governed by cyclic-di-GMP signaling, and remarkably resembles the dual life-style of V. cholerae as it transitions from host to environment. Furthermore, we found a "general stress response" module, known as the stressosome, to be upregulated in seawater. This signaling system has been well characterized in Gram-positive bacteria; however, its role in V. vulnificus is not clear. We examined temporal gene expression patterns of the stressosome and found it to be upregulated in natural estuarine waters, indicating that this system plays a role in sensing and responding to the environment. This study advances our understanding of gene regulation in V. vulnificus, and brings to the forefront a number of previously overlooked genetic networks.
|
210
|
Goldstone RJ, Popat R, Schuberth HJ, Sandra O, Sheldon IM, Smith DGE. Genomic characterisation of an endometrial pathogenic Escherichia coli strain reveals the acquisition of genetic elements associated with extra-intestinal pathogenicity. BMC Genomics 2014; 15:1075. [PMID: 25481482 PMCID: PMC4298941 DOI: 10.1186/1471-2164-15-1075] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 07/07/2014] [Accepted: 11/24/2014] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Strains of Escherichia coli cause a wide variety of intestinal and extra-intestinal diseases in both humans and animals, and are also often found in healthy individuals or the environment. Broadly, a strong phylogenetic relationship exists that distinguishes most E. coli causing intestinal disease from those that cause extra-intestinal disease; however, isolates within a recently described subclass of Extra-Intestinal Pathogenic E. coli (ExPEC), termed endometrial pathogenic E. coli, tend to be phylogenetically distant from the vast majority of characterised ExPECs, and more closely related to human intestinal pathogens. In this work, we investigate the genetic basis for ExPEC infection in the prototypic endometrial pathogenic E. coli strain MS499. RESULTS: By investigating the genome of MS499 in comparison with a range of other E. coli sequences, we have discovered that this bacterium has acquired substantial lengths of DNA which encode factors more usually associated with ExPECs and less frequently found in the phylogroup relatives of MS499. Many of these acquired factors, including several iron acquisition systems and a virulence plasmid similar to that found in several ExPECs such as APEC O1 and the neonatal meningitis E. coli S88, play characterised roles in a variety of typical ExPEC infections and appear to have been acquired recently by the evolutionary lineage leading to MS499. CONCLUSIONS: Taking advantage of the phylogenetic relationship we describe between MS499 and several other closely related E. coli isolates from across the globe, we propose a step-wise evolution of a novel clade of sequence type 453 ExPECs within phylogroup B1, involving the recruitment of ExPEC virulence factors into the genome of an ancestrally non-extraintestinal E. coli, which has repurposed this lineage with the capacity to cause extraintestinal disease. These data reveal the genetic components which may be involved in this phenotype switching, and argue that horizontal gene exchange may be a key factor in the emergence of novel lineages of ExPECs.
Affiliation(s)
- David G E Smith
- Institute for Infection, Immunity and Inflammation, College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow G12 8QQ, UK.
|
211
|
Penarete-Vargas DM, Boisson A, Urbach S, Chantelauze H, Peyrottes S, Fraisse L, Vial HJ. A chemical proteomics approach for the search of pharmacological targets of the antimalarial clinical candidate albitiazolium in Plasmodium falciparum using photocrosslinking and click chemistry. PLoS One 2014; 9:e113918. [PMID: 25470252 PMCID: PMC4254740 DOI: 10.1371/journal.pone.0113918] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 06/04/2014] [Accepted: 10/31/2014] [Indexed: 11/18/2022]
Abstract
Plasmodium falciparum is responsible for severe malaria which is one of the most prevalent and deadly infectious diseases in the world. The antimalarial therapeutic arsenal is hampered by the onset of resistance to all known pharmacological classes of compounds, so new drugs with novel mechanisms of action are critically needed. Albitiazolium is a clinical antimalarial candidate from a series of choline analogs designed to inhibit plasmodial phospholipid metabolism. Here we developed an original chemical proteomic approach to identify parasite proteins targeted by albitiazolium during their native interaction in living parasites. We designed a bifunctional albitiazolium-derived compound (photoactivable and clickable) to covalently crosslink drug-interacting parasite proteins in situ followed by their isolation via click chemistry reactions. Mass spectrometry analysis of drug-interacting proteins and subsequent clustering on gene ontology terms revealed parasite proteins involved in lipid metabolic activities and, interestingly, also in lipid binding, transport, and vesicular transport functions. In accordance with this, the albitiazolium-derivative was localized in the endoplasmic reticulum and trans-Golgi network of P. falciparum. Importantly, during competitive assays with albitiazolium, the binding of choline/ethanolamine phosphotransferase (the enzyme involved in the last step of phosphatidylcholine synthesis) was substantially displaced, thus confirming the efficiency of this strategy for searching albitiazolium targets.
Affiliation(s)
- Diana Marcela Penarete-Vargas
- Dynamique des Interactions Membranaires Normales et Pathologiques, CNRS UMR 5235, Université Montpellier II, cc107, Place Eugène Bataillon, 34095 Montpellier Cedex 05, France
- * E-mail: (DMPV); (HJV)
- Anaïs Boisson
- Dynamique des Interactions Membranaires Normales et Pathologiques, CNRS UMR 5235, Université Montpellier II, cc107, Place Eugène Bataillon, 34095 Montpellier Cedex 05, France
- Serge Urbach
- Institut de Génomique Fonctionnelle, CNRS UMR 5203, INSERM U661, Université Montpellier I, Université Montpellier II, F-34094 Montpellier, France
- Hervé Chantelauze
- Institut des Biomolécules Max Mousseron, CNRS UMR 5247, Université Montpellier II, cc1705, Place Eugène Bataillon, 34095 Montpellier Cedex 05, France
- Suzanne Peyrottes
- Institut des Biomolécules Max Mousseron, CNRS UMR 5247, Université Montpellier II, cc1705, Place Eugène Bataillon, 34095 Montpellier Cedex 05, France
- Laurent Fraisse
- Sanofi, Therapeutic Strategic Unit for Infectious Diseases, 195 route d’Espagne, BP 13669, 31036 Toulouse Cedex, France
- Henri J. Vial
- Dynamique des Interactions Membranaires Normales et Pathologiques, CNRS UMR 5235, Université Montpellier II, cc107, Place Eugène Bataillon, 34095 Montpellier Cedex 05, France
- * E-mail: (DMPV); (HJV)
|
212
|
Abstract
The Gene Ontology (GO; http://www.geneontology.org) is a community-based bioinformatics resource that supplies information about gene product function using ontologies to represent biological knowledge. Here we describe improvements and expansions to several branches of the ontology, as well as updates that have allowed us to more efficiently disseminate the GO and capture feedback from the research community. The Gene Ontology Consortium (GOC) has expanded areas of the ontology such as cilia-related terms, cell-cycle terms and multicellular organism processes. We have also implemented new tools for generating ontology terms based on a set of logical rules making use of templates, and we have made efforts to increase our use of logical definitions. The GOC has a new and improved web site summarizing new developments and documentation, serving as a portal to GO data. Users can perform GO enrichment analysis, and search the GO for terms, annotations to gene products, and associated metadata across multiple species using the all-new AmiGO 2 browser. We encourage and welcome the input of the research community in all biological areas in our continued effort to improve the Gene Ontology.
|
213
|
Bennett L, Kittas A, Liu S, Papageorgiou LG, Tsoka S. Community structure detection for overlapping modules through mathematical programming in protein interaction networks. PLoS One 2014; 9:e112821. [PMID: 25412367 PMCID: PMC4239042 DOI: 10.1371/journal.pone.0112821] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 02/17/2014] [Accepted: 10/15/2014] [Indexed: 12/05/2022]
Abstract
Community structure detection has proven to be important in revealing the underlying properties of complex networks. The standard problem, where a partition of disjoint communities is sought, has been continually adapted to offer more realistic models of interactions in these systems. Here, a two-step procedure is outlined for exploring the concept of overlapping communities. First, a hard partition is detected by employing existing methodologies. We then propose a novel mixed-integer nonlinear programming (MINLP) model, known as OverMod, which transforms disjoint communities into overlapping ones. The procedure is evaluated through its application to protein-protein interaction (PPI) networks of the rat, E. coli, yeast and human organisms. Connector nodes of hard partitions exhibit topological and functional properties indicative of their suitability as candidates for multiple module membership. OverMod identifies two types of connector nodes, inter- and intra-connector, each with its own particular characteristics pertaining to its topological and functional role in the organisation of the network. Inter-connector proteins are shown to be highly conserved proteins participating in pathways that control essential cellular processes, such as proliferation, differentiation and apoptosis, and their differences from intra-connectors are highlighted. Many of these proteins are shown to possess multiple roles of distinct nature through their participation in different network modules, setting them apart from proteins that are simply ‘hubs’, i.e. proteins with many interaction partners but with a more specific biochemical role.
Affiliation(s)
- Laura Bennett
- Centre for Process Systems Engineering, Department of Chemical Engineering, UCL (University College London), Torrington Place, WC1E 7JE, London, United Kingdom
- Aristotelis Kittas
- Department of Informatics, King's College London, Strand, WC2R 2LS, London, United Kingdom
- Songsong Liu
- Centre for Process Systems Engineering, Department of Chemical Engineering, UCL (University College London), Torrington Place, WC1E 7JE, London, United Kingdom
- Lazaros G. Papageorgiou
- Centre for Process Systems Engineering, Department of Chemical Engineering, UCL (University College London), Torrington Place, WC1E 7JE, London, United Kingdom
- Sophia Tsoka
- Department of Informatics, King's College London, Strand, WC2R 2LS, London, United Kingdom
|
214
|
Altenhoff AM, Škunca N, Glover N, Train CM, Sueki A, Piližota I, Gori K, Tomiczek B, Müller S, Redestig H, Gonnet GH, Dessimoz C. The OMA orthology database in 2015: function predictions, better plant support, synteny view and other improvements. Nucleic Acids Res 2014; 43:D240-9. [PMID: 25399418 PMCID: PMC4383958 DOI: 10.1093/nar/gku1158] [Citation(s) in RCA: 177] [Impact Index Per Article: 17.7] [Indexed: 11/13/2022]
Abstract
The Orthologous Matrix (OMA) project is a method and associated database inferring evolutionary relationships amongst currently 1706 complete proteomes (i.e. the protein sequences associated with every protein-coding gene in all genomes). In this update article, we present six major new developments in OMA: (i) a new web interface; (ii) Gene Ontology function predictions as part of the OMA pipeline; (iii) better support for plant genomes and in particular homeologs in the wheat genome; (iv) a new synteny viewer providing the genomic context of orthologs; (v) statically computed hierarchical orthologous group subsets downloadable in OrthoXML format; and (vi) the possibility to export parts of the all-against-all computations and to combine them with custom data for 'client-side' orthology prediction. OMA can be accessed through the OMA Browser and various programmatic interfaces at http://omabrowser.org.
Affiliation(s)
- Adrian M Altenhoff
- University College London, Gower Street, London WC1E 6BT, UK; Swiss Institute of Bioinformatics, Universitätstr. 6, 8092 Zurich, Switzerland; ETH Zurich, Computer Science, Universitätstr. 6, 8092 Zurich, Switzerland
- Nives Škunca
- University College London, Gower Street, London WC1E 6BT, UK; Swiss Institute of Bioinformatics, Universitätstr. 6, 8092 Zurich, Switzerland; ETH Zurich, Computer Science, Universitätstr. 6, 8092 Zurich, Switzerland
- Natasha Glover
- University College London, Gower Street, London WC1E 6BT, UK; Institut National de la Recherche Agronomique (INRA) UMR1095, Genetics, Diversity and Ecophysiology of Cereals, 5 Chemin de Beaulieu, 63039 Clermont-Ferrand, France; Bayer CropScience NV, Technologiepark 38, 9052 Gent, Belgium
- Anna Sueki
- University College London, Gower Street, London WC1E 6BT, UK
- Ivana Piližota
- University College London, Gower Street, London WC1E 6BT, UK
- Kevin Gori
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Steven Müller
- University College London, Gower Street, London WC1E 6BT, UK
- Gaston H Gonnet
- Swiss Institute of Bioinformatics, Universitätstr. 6, 8092 Zurich, Switzerland; ETH Zurich, Computer Science, Universitätstr. 6, 8092 Zurich, Switzerland
- Christophe Dessimoz
- University College London, Gower Street, London WC1E 6BT, UK; European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
|
215
|
Day-Williams AG, Sun C, Jelcic I, McLaughlin H, Harris T, Martin R, Carulli JP. Whole Genome Sequencing Reveals a Chromosome 9p Deletion Causing DOCK8 Deficiency in an Adult Diagnosed with Hyper IgE Syndrome Who Developed Progressive Multifocal Leukoencephalopathy. J Clin Immunol 2014; 35:92-6. [PMID: 25388448 PMCID: PMC4306731 DOI: 10.1007/s10875-014-0114-4] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Received: 06/19/2014] [Accepted: 10/23/2014] [Indexed: 11/27/2022]
Abstract
PURPOSE: A 30-year-old man with a history of recurrent skin infections as well as elevated serum IgE and eosinophils developed neurological symptoms and had T2-hyperintense lesions observed in cerebral MRI. The immune symptoms were attributed to Hyper IgE syndrome (HIES), and the neurological symptoms, together with the presence of JC virus in cerebrospinal fluid, were diagnosed as Progressive Multifocal Leukoencephalopathy (PML). The patient was negative for STAT3 mutations. To determine if other mutations explain HIES and/or PML in this subject, his DNA was analyzed by whole genome sequencing. METHODS: Whole genome sequencing was completed to 30X coverage, and whole genome SNP typing was used to complement these data. The methods revealed single nucleotide variants, structural variants, and copy number variants across the genome. Genome-wide data were analyzed for homozygous or compound heterozygous null mutations for all protein coding genes. Mutations were confirmed by PCR and/or Sanger sequencing. RESULTS: Whole genome analysis revealed deletions near the telomere of both copies of chromosome 9p. Several genes, including DOCK8, were impacted by the deletions, but it was unclear whether each chromosome had identical or distinct deletions. PCR across the impacted region combined with Sanger sequencing of selected fragments confirmed a homozygous deletion from position 10,211 to 586,751. CONCLUSION: While several genes are impacted by the deletion, DOCK8 deficiency is the most probable cause of HIES in this patient. DOCK8 deficiency may have also predisposed the patient to develop PML.
Affiliation(s)
| | - Chao Sun
- Translational Sciences, Biogen Idec, Cambridge, MA, USA
| | - Ilijas Jelcic
- Department of Neurology, University Hospital Zurich, Zurich, Switzerland
| | | | - Tim Harris
- Translational Sciences, Biogen Idec, Cambridge, MA, USA
| | - Roland Martin
- Department of Neurology, University Hospital Zurich, Zurich, Switzerland
| | - John P Carulli
- Translational Sciences, Biogen Idec, Cambridge, MA, USA.
|
216
|
Jung WY, Lee SS, Kim CW, Kim HS, Min SR, Moon JS, Kwon SY, Jeon JH, Cho HS. RNA-seq analysis and de novo transcriptome assembly of Jerusalem artichoke (Helianthus tuberosus Linne). PLoS One 2014; 9:e111982. [PMID: 25375764 PMCID: PMC4222968 DOI: 10.1371/journal.pone.0111982] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Received: 05/09/2014] [Accepted: 10/09/2014] [Indexed: 11/18/2022] Open
Abstract
Jerusalem artichoke (Helianthus tuberosus L.) has long been cultivated as a vegetable and as a source of fructans (inulin) for pharmaceutical applications in diabetes and obesity prevention. However, transcriptomic and genomic data for Jerusalem artichoke remain scarce. In this study, Illumina RNA sequencing (RNA-Seq) was performed on samples from Jerusalem artichoke leaves, roots, stems and two different tuber tissues (early and late tuber development). Data were used for de novo assembly and characterization of the transcriptome. In total 206,215,632 paired-end reads were generated. These were assembled into 66,322 loci with 272,548 transcripts. Loci were annotated by querying against the NCBI non-redundant, Phytozome and UniProt databases, and 40,215 loci were homologous to existing database sequences. Gene Ontology terms were assigned to 19,848 loci, 15,434 loci were matched to 25 Clusters of Eukaryotic Orthologous Groups classifications, and 11,844 loci were classified into 142 Kyoto Encyclopedia of Genes and Genomes pathways. The assembled loci also contained 10,778 potential simple sequence repeats. The newly assembled transcriptome was used to identify loci with tissue-specific differential expression patterns. In total, 670 loci exhibited tissue-specific expression, and a subset of these were confirmed using RT-PCR and qRT-PCR. Gene expression related to inulin biosynthesis in tuber tissue was also investigated. Existing genetic and genomic data for H. tuberosus are scarce. The sequence resources developed in this study will enable the analysis of thousands of transcripts and will thus accelerate marker-assisted breeding studies and studies of inulin biosynthesis in Jerusalem artichoke.
Affiliation(s)
- Won Yong Jung
- Plant Systems Engineering Research Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, Korea; Animal Material Engineering, Gyeongnam National University of Science and Technology, Jinju, Korea
| | - Sang Sook Lee
- Plant Systems Engineering Research Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, Korea
| | - Chul Wook Kim
- Animal Material Engineering, Gyeongnam National University of Science and Technology, Jinju, Korea
| | - Hyun-Soon Kim
- Plant Systems Engineering Research Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, Korea
| | - Sung Ran Min
- Plant Systems Engineering Research Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, Korea
| | - Jae Sun Moon
- Plant Systems Engineering Research Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, Korea
| | - Suk-Yoon Kwon
- Plant Systems Engineering Research Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, Korea
| | - Jae-Heung Jeon
- Plant Systems Engineering Research Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, Korea
| | - Hye Sun Cho
- Plant Systems Engineering Research Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, Korea
|
217
|
Huntley RP, Sawford T, Mutowo-Meullenet P, Shypitsyna A, Bonilla C, Martin MJ, O'Donovan C. The GOA database: Gene Ontology annotation updates for 2015. Nucleic Acids Res 2014; 43:D1057-63. [PMID: 25378336 PMCID: PMC4383930 DOI: 10.1093/nar/gku1113] [Citation(s) in RCA: 378] [Impact Index Per Article: 37.8] [Indexed: 01/01/2023] Open
Abstract
The Gene Ontology Annotation (GOA) resource (http://www.ebi.ac.uk/GOA) provides evidence-based Gene Ontology (GO) annotations to proteins in the UniProt Knowledgebase (UniProtKB). Manual annotations provided by UniProt curators are supplemented by manual and automatic annotations from model organism databases and specialist annotation groups. GOA currently supplies 368 million GO annotations to almost 54 million proteins in more than 480 000 taxonomic groups. The resource now provides annotations to five times the number of proteins it did 4 years ago. As a member of the GO Consortium, we adhere to the most up-to-date Consortium-agreed annotation guidelines via the use of quality control checks that ensure that the GOA resource supplies high-quality functional information to proteins from a wide range of species. Annotations from GOA are freely available and are accessible through a powerful web browser as well as a variety of annotation file formats.
Affiliation(s)
- Rachael P Huntley
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Tony Sawford
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Prudence Mutowo-Meullenet
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Aleksandra Shypitsyna
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Carlos Bonilla
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Maria J Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Claire O'Donovan
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
|
218
|
Lee T, Yang S, Kim E, Ko Y, Hwang S, Shin J, Shim JE, Shim H, Kim H, Kim C, Lee I. AraNet v2: an improved database of co-functional gene networks for the study of Arabidopsis thaliana and 27 other nonmodel plant species. Nucleic Acids Res 2014; 43:D996-1002. [PMID: 25355510 PMCID: PMC4383895 DOI: 10.1093/nar/gku1053] [Citation(s) in RCA: 104] [Impact Index Per Article: 10.4] [Indexed: 12/17/2022] Open
Abstract
Arabidopsis thaliana is a reference plant that has been studied intensively for several decades. Recent advances in high-throughput experimental technology have enabled the generation of an unprecedented amount of data from A. thaliana, which has facilitated data-driven approaches to unravel the genetic organization of plant phenotypes. We previously published a description of a genome-scale functional gene network for A. thaliana, AraNet, which was constructed by integrating multiple co-functional gene networks inferred from diverse data types, and we demonstrated the predictive power of this network for complex phenotypes. More recently, we have observed significant growth in the availability of omics data for A. thaliana as well as improvements in data analysis methods that we anticipate will further enhance the integrated database of co-functional networks. Here, we present an updated co-functional gene network for A. thaliana, AraNet v2 (available at http://www.inetbio.org/aranet), which covers approximately 84% of the coding genome. We demonstrate significant improvements in both genome coverage and accuracy. To enhance the usability of the network, we implemented an AraNet v2 web server, which generates functional predictions for A. thaliana and 27 nonmodel plant species using an orthology-based projection of nonmodel plant genes on the A. thaliana gene network.
Affiliation(s)
- Tak Lee
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
| | - Sunmo Yang
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
| | - Eiru Kim
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
| | - Younhee Ko
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
| | - Sohyun Hwang
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
- Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, TX 78712, USA
| | - Junha Shin
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
| | - Jung Eun Shim
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
| | - Hongseok Shim
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
| | - Hyojin Kim
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
| | - Chanyoung Kim
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
| | - Insuk Lee
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
|
219
|
Vorwerk S, Krieger V, Deiwick J, Hensel M, Hansmeier N. Proteomes of host cell membranes modified by intracellular activities of Salmonella enterica. Mol Cell Proteomics 2014; 14:81-92. [PMID: 25348832 DOI: 10.1074/mcp.m114.041145] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.0] [Indexed: 02/03/2023] Open
Abstract
Intracellular pathogens need to establish a growth-stimulating host niche for survival and replication. A unique feature of the gastrointestinal pathogen Salmonella enterica serovar Typhimurium is the creation of extensive membrane networks within its host. An understanding of the origin and function of these membranes is crucial for the development of new treatment strategies. However, the characterization of this compartment is very challenging, and only fragmentary knowledge of its composition and biogenesis exists. Here, we describe a new proteome-based approach to enrich and characterize Salmonella-modified membranes. Using a Salmonella mutant strain that does not form this unique membrane network as a reference, we identified a high-confidence set of host proteins associated with Salmonella-modified membranes. This comprehensive analysis allowed us to reconstruct the interactions of Salmonella with host membranes. For example, we noted that Salmonella redirects endoplasmic reticulum (ER) membrane trafficking to its intracellular niche, a finding that has not been described for Salmonella previously. Our system-wide approach therefore has the potential to rapidly close gaps in our knowledge of the infection process of intracellular pathogens and demonstrates a hitherto unrecognized complexity in the formation of Salmonella host niches.
Affiliation(s)
- Stephanie Vorwerk
- Division of Microbiology, School of Biology/Chemistry, University of Osnabrück, 49076 Osnabrück, Germany
| | - Viktoria Krieger
- Division of Microbiology, School of Biology/Chemistry, University of Osnabrück, 49076 Osnabrück, Germany
| | - Jörg Deiwick
- Division of Microbiology, School of Biology/Chemistry, University of Osnabrück, 49076 Osnabrück, Germany
| | - Michael Hensel
- Division of Microbiology, School of Biology/Chemistry, University of Osnabrück, 49076 Osnabrück, Germany
| | - Nicole Hansmeier
- Division of Microbiology, School of Biology/Chemistry, University of Osnabrück, 49076 Osnabrück, Germany
|
220
|
Abstract
UniProt is an important collection of protein sequences and their annotations, which has doubled in size to 80 million sequences during the past year. This growth in sequences has prompted an extension of UniProt accession number space from 6 to 10 characters. An increasing fraction of new sequences are identical to a sequence that already exists in the database with the majority of sequences coming from genome sequencing projects. We have created a new proteome identifier that uniquely identifies a particular assembly of a species and strain or subspecies to help users track the provenance of sequences. We present a new website that has been designed using a user-experience design process. We have introduced an annotation score for all entries in UniProt to represent the relative amount of knowledge known about each protein. These scores will be helpful in identifying which proteins are the best characterized and most informative for comparative analysis. All UniProt data is provided freely and is available on the web at http://www.uniprot.org/.
Affiliation(s)
- The UniProt Consortium
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, 1 rue Michel Servet, 1211 Geneva 4, Switzerland
- Protein Information Resource, Georgetown University Medical Center, 3300 Whitehaven Street North West, Suite 1200, Washington, DC 20007, USA
- Protein Information Resource, University of Delaware, 15 Innovation Way, Suite 205, Newark, DE 19711, USA
|
221
|
Torto-Alalibo T, Purwantini E, Lomax J, Setubal JC, Mukhopadhyay B, Tyler BM. Genetic resources for advanced biofuel production described with the Gene Ontology. Front Microbiol 2014; 5:528. [PMID: 25346727 PMCID: PMC4193338 DOI: 10.3389/fmicb.2014.00528] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Received: 07/28/2014] [Accepted: 09/22/2014] [Indexed: 12/12/2022] Open
Abstract
Dramatic increases in research in the area of microbial biofuel production coupled with high-throughput data generation on bioenergy-related microbes has led to a deluge of information in the scientific literature and in databases. Consolidating this information and making it easily accessible requires a unified vocabulary. The Gene Ontology (GO) fulfills that requirement, as it is a well-developed structured vocabulary that describes the activities and locations of gene products in a consistent manner across all kingdoms of life. The Microbial ENergy processes Gene Ontology () project is extending the GO to include new terms to describe microbial processes of interest to bioenergy production. Our effort has added over 600 bioenergy related terms to the Gene Ontology. These terms will aid in the comprehensive annotation of gene products from diverse energy-related microbial genomes. An area of microbial energy research that has received a lot of attention is microbial production of advanced biofuels. These include alcohols such as butanol, isopropanol, isobutanol, and fuels derived from fatty acids, isoprenoids, and polyhydroxyalkanoates. These fuels are superior to first generation biofuels (ethanol and biodiesel esterified from vegetable oil or animal fat), can be generated from non-food feedstock sources, can be used as supplements or substitutes for gasoline, diesel and jet fuels, and can be stored and distributed using existing infrastructure. Here we review the roles of genes associated with synthesis of advanced biofuels, and at the same time introduce the use of the GO to describe the functions of these genes in a standardized way.
Affiliation(s)
- Trudy Torto-Alalibo
- Department of Biochemistry, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA
- Virginia Bioinformatics Institute, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA
| | - Endang Purwantini
- Department of Biochemistry, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA
- Virginia Bioinformatics Institute, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA
| | - Jane Lomax
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Cambridge, UK
| | - João C. Setubal
- Virginia Bioinformatics Institute, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA
- Departamento de Bioquímica, Instituto de Química, Universidade de São Paulo, São Paulo, Brazil
| | - Biswarup Mukhopadhyay
- Department of Biochemistry, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA
- Virginia Bioinformatics Institute, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA
- Department of Biological Sciences, Oregon State University, Corvallis, OR, USA
| | - Brett M. Tyler
- Virginia Bioinformatics Institute, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA
- Center for Genome Research and Biocomputing, Oregon State University, Corvallis, OR, USA
|
222
|
Patwardhan A, Ashton A, Brandt R, Butcher S, Carzaniga R, Chiu W, Collinson L, Doux P, Duke E, Ellisman MH, Franken E, Grünewald K, Heriche JK, Koster A, Kühlbrandt W, Lagerstedt I, Larabell C, Lawson CL, Saibil HR, Sanz-García E, Subramaniam S, Verkade P, Swedlow JR, Kleywegt GJ. A 3D cellular context for the macromolecular world. Nat Struct Mol Biol 2014; 21:841-5. [PMID: 25289590 PMCID: PMC4346196 DOI: 10.1038/nsmb.2897] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.4] [Indexed: 12/12/2022]
Abstract
We report the outcomes of the discussion initiated at the workshop entitled A 3D Cellular Context for the Macromolecular World and propose how data from emerging three-dimensional (3D) cellular imaging techniques—such as electron tomography, 3D scanning electron microscopy and soft X-ray tomography—should be archived, curated, validated and disseminated, to enable their interpretation and reuse by the biomedical community.
Affiliation(s)
- Ardan Patwardhan
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
| | | | | | - Sarah Butcher
- Institute of Biotechnology, University of Helsinki, Helsinki, Finland
| | - Raffaella Carzaniga
- Electron Microscopy Unit, Cancer Research UK London Research Institute, London, UK
| | - Wah Chiu
- National Center for Macromolecular Imaging, Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas
| | - Lucy Collinson
- Electron Microscopy Unit, Cancer Research UK London Research Institute, London, UK
| | - Pascal Doux
- FEI Visualization Sciences Group, Mérignac, France
| | | | - Mark H Ellisman
- Center for Research in Biological Systems, National Center for Microscopy and Imaging Research (NCMIR), University of California, San Diego, San Diego, California, USA
| | - Erik Franken
- FEI Electron Optics B.V., Eindhoven, the Netherlands
| | - Kay Grünewald
- Division of Structural Biology, Wellcome Trust Centre for Human Genetics, Oxford, UK
| | - Jean-Karim Heriche
- Cell Biology and Biophysics Unit, European Molecular Biology Laboratory, Heidelberg, Germany
| | - Abraham Koster
- Department of Molecular Cell Biology, Leiden University Medical Center, Leiden, the Netherlands
| | - Werner Kühlbrandt
- Department of Structural Biology, Max Planck Institute for Biophysics, Frankfurt, Germany
| | - Ingvar Lagerstedt
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
| | - Carolyn Larabell
- Department of Anatomy, University of California, San Francisco, San Francisco, California, USA
| | - Catherine L Lawson
- Research Collaboratory for Structural Bioinformatics, Rutgers University, Piscataway, New Jersey, USA
| | - Helen R Saibil
- Institute of Structural and Molecular Biology, Department of Crystallography, Birkbeck College, London, UK
| | - Eduardo Sanz-García
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
| | - Sriram Subramaniam
- Center for Cancer Research, National Cancer Institute, Bethesda, Maryland, USA
| | - Paul Verkade
- Wolfson Bioimaging Facility, School of Biochemistry, University of Bristol, Bristol, UK
| | - Jason R Swedlow
- Centre for Gene Regulation and Expression, University of Dundee, Dundee, UK
| | - Gerard J Kleywegt
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
|
223
|
van Dam JCJ, Schaap PJ, Martins dos Santos VAP, Suárez-Diez M. Integration of heterogeneous molecular networks to unravel gene-regulation in Mycobacterium tuberculosis. BMC Syst Biol 2014; 8:111. [PMID: 25279447 PMCID: PMC4181829 DOI: 10.1186/s12918-014-0111-5] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Received: 07/15/2014] [Accepted: 09/05/2014] [Indexed: 12/23/2022]
Abstract
BACKGROUND Different methods have been developed to infer regulatory networks from heterogeneous omics datasets and to construct co-expression networks. Each algorithm produces different networks, and efforts have been devoted to automatically integrating them into consensus sets. However, each separate set has an intrinsic value that is diluted and partly lost when building a consensus network. Here we present a methodology to generate co-expression networks and, instead of a consensus network, we propose an integration framework where the different networks are kept and analysed with additional tools to efficiently combine the information extracted from each network. RESULTS We developed a workflow to efficiently analyse information generated by different inference and prediction methods. Our methodology relies on providing the user the means to simultaneously visualise and analyse the coexisting networks generated by different algorithms, heterogeneous datasets, and a suite of analysis tools. As a showcase, we have analysed the gene co-expression networks of Mycobacterium tuberculosis generated using over 600 expression experiments. Regarding DNA damage repair, we identified SigC as a key control element, 12 new targets for LexA, an updated LexA binding motif, and a potential mismatch repair system. We expanded the DevR regulon with 27 genes while identifying 9 targets wrongly assigned to this regulon. We discovered 10 new genes linked to zinc uptake and a new regulatory mechanism for ZuR. The use of co-expression networks to perform system-level analysis allows the development of custom-made methodologies. As showcases, we implemented a pipeline to integrate ChIP-seq data and another method to uncover multiple regulatory layers.
CONCLUSIONS Our workflow is based on representing the multiple types of information as network representations and presenting these networks in a synchronous framework that allows their simultaneous visualization while keeping specific associations from the different networks. By simultaneously exploring these networks and metadata, we gained insights into regulatory mechanisms in M. tuberculosis that could not be obtained through the separate analysis of each data type.
Affiliation(s)
- Jesse CJ van Dam
- Laboratory of Systems and Synthetic Biology, Wageningen University, Dreijenplein 10, 6703 HB Wageningen, The Netherlands
| | - Peter J Schaap
- Laboratory of Systems and Synthetic Biology, Wageningen University, Dreijenplein 10, 6703 HB Wageningen, The Netherlands
| | - Vitor AP Martins dos Santos
- Laboratory of Systems and Synthetic Biology, Wageningen University, Dreijenplein 10, 6703 HB Wageningen, The Netherlands
- LifeGlimmer GmbH, Markelstrasse 38, Berlin, Germany
| | - María Suárez-Diez
- Laboratory of Systems and Synthetic Biology, Wageningen University, Dreijenplein 10, 6703 HB Wageningen, The Netherlands
|
224
|
Peterson EJR, Reiss DJ, Turkarslan S, Minch KJ, Rustad T, Plaisier CL, Longabaugh WJR, Sherman DR, Baliga NS. A high-resolution network model for global gene regulation in Mycobacterium tuberculosis. Nucleic Acids Res 2014; 42:11291-303. [PMID: 25232098 PMCID: PMC4191388 DOI: 10.1093/nar/gku777] [Citation(s) in RCA: 55] [Impact Index Per Article: 5.5] [Indexed: 01/18/2023] Open
Abstract
The resilience of Mycobacterium tuberculosis (MTB) is largely due to its ability to effectively counteract and even take advantage of the hostile environments of a host. In order to accelerate the discovery and characterization of these adaptive mechanisms, we have mined a compendium of 2325 publicly available transcriptome profiles of MTB to decipher a predictive, systems-scale gene regulatory network model. The resulting modular organization of 98% of all MTB genes within this regulatory network was rigorously tested using two independently generated datasets: a genome-wide map of 7248 DNA-binding locations for 143 transcription factors (TFs) and global transcriptional consequences of overexpressing 206 TFs. This analysis has discovered specific TFs that mediate conditional co-regulation of genes within 240 modules across 14 distinct environmental contexts. In addition to recapitulating previously characterized regulons, we discovered 454 novel mechanisms for gene regulation during stress, cholesterol utilization and dormancy. Significantly, 183 of these mechanisms act uniquely under conditions experienced during the infection cycle to regulate diverse functions including 23 genes that are essential to host-pathogen interactions. These and other insights underscore the power of a rational, model-driven approach to unearth novel MTB biology that operates under some but not all phases of infection.
Affiliation(s)
| | - David J Reiss
- Institute for Systems Biology, 401 Terry Ave N, Seattle, WA 98109, USA
| | - Serdar Turkarslan
- Seattle Biomed Research Institute, 307 Westlake Avenue North, Suite 500, Seattle, WA 98109, USA
| | - Kyle J Minch
- Seattle Biomed Research Institute, 307 Westlake Avenue North, Suite 500, Seattle, WA 98109, USA
| | - Tige Rustad
- Seattle Biomed Research Institute, 307 Westlake Avenue North, Suite 500, Seattle, WA 98109, USA
| | | | | | - David R Sherman
- Seattle Biomed Research Institute, 307 Westlake Avenue North, Suite 500, Seattle, WA 98109, USA
| | - Nitin S Baliga
- Institute for Systems Biology, 401 Terry Ave N, Seattle, WA 98109, USA
|
225
|
Orfanoudaki G, Economou A. Proteome-wide subcellular topologies of E. coli polypeptides database (STEPdb). Mol Cell Proteomics 2014; 13:3674-87. [PMID: 25210196 DOI: 10.1074/mcp.o114.041137] [Citation(s) in RCA: 54] [Impact Index Per Article: 5.4] [Indexed: 12/27/2022] Open
Abstract
Cell compartmentalization serves both the isolation and the specialization of cell functions. After synthesis in the cytoplasm, over a third of all proteins are targeted to other subcellular compartments. Knowing how proteins are distributed within the cell and how they interact is a prerequisite for understanding it as a whole. Surface and secreted proteins are important pathogenicity determinants. Here we present the STEP database (STEPdb) that contains a comprehensive characterization of subcellular localization and topology of the complete proteome of Escherichia coli. Two widely used E. coli proteomes (K-12 and BL21) are presented organized into thirteen subcellular classes. STEPdb exploits the wealth of genetic, proteomic, biochemical, and functional information on protein localization, secretion, and targeting in E. coli, one of the best understood model organisms. Subcellular annotations were derived from a combination of bioinformatics prediction, proteomic, biochemical, functional, topological data and extensive literature re-examination that were refined through manual curation. Strong experimental support for the location of 1553 out of 4303 proteins was based on 426 articles and some experimental indications for another 526. Annotations were provided for another 320 proteins based on firm bioinformatic predictions. STEPdb is the first database that contains an extensive set of peripheral IM proteins (PIM proteins) and includes their graphical visualization into complexes, cellular functions, and interactions. It also summarizes all currently known protein export machineries of E. coli K-12 and pairs them, where available, with the secretory proteins that use them. It catalogs the Sec- and TAT-utilizing secretomes and summarizes their topological features such as signal peptides and transmembrane regions, transmembrane topologies and orientations. 
It also catalogs physicochemical and structural features that influence topology such as abundance, solubility, disorder, heat resistance, and structural domain families. Finally, STEPdb incorporates prediction tools for topology (TMHMM, SignalP, and Phobius) and disorder (IUPred) and implements the BLAST2STEP that performs protein homology searches against the STEPdb.
Affiliation(s)
- Georgia Orfanoudaki
- Institute of Molecular Biology and Biotechnology-FoRTH and Department of Biology-University of Crete, P.O. Box 1385, Iraklio, Crete, Greece
| | - Anastassios Economou
- Institute of Molecular Biology and Biotechnology-FoRTH and Department of Biology-University of Crete, P.O. Box 1385, Iraklio, Crete, Greece; Laboratory of Molecular Bacteriology; Rega Institute, Department of Microbiology and Immunology, KU Leuven, Herrestraat 49, B-3000 Leuven, Belgium
|
226
|
Engelke R, Riede J, Hegermann J, Wuerch A, Eimer S, Dengjel J, Mittler G. The Quantitative Nuclear Matrix Proteome as a Biochemical Snapshot of Nuclear Organization. J Proteome Res 2014; 13:3940-56. [DOI: 10.1021/pr500218f] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.1] [Indexed: 11/30/2022]
Affiliation(s)
- Rudolf Engelke
- Max Planck Institute of Immunobiology and Epigenetics, Stübeweg 51, 79108 Freiburg, Germany
| | - Julia Riede
- Freiburg Institute for Advanced Studies, School of Life Sciences − LifeNet, University of Freiburg, Albertstrasse 19, 79104 Freiburg, Germany
- Center for Biological Systems Analysis, University of Freiburg, Habsburgerstrasse 49, 79104 Freiburg, Germany
| | - Jan Hegermann
- European Neuroscience Institute and Center for Molecular Physiology of the Brain (CMPB), 37077 Göttingen, Germany
| | - Andreas Wuerch
- Max Planck Institute of Immunobiology and Epigenetics, Stübeweg 51, 79108 Freiburg, Germany
| | - Stefan Eimer
- European Neuroscience Institute and Center for Molecular Physiology of the Brain (CMPB), 37077 Göttingen, Germany
| | - Joern Dengjel
- Freiburg Institute for Advanced Studies, School of Life Sciences − LifeNet, University of Freiburg, Albertstrasse 19, 79104 Freiburg, Germany
- Center for Biological Systems Analysis, University of Freiburg, Habsburgerstrasse 49, 79104 Freiburg, Germany
| | - Gerhard Mittler
- Max Planck Institute of Immunobiology and Epigenetics, Stübeweg 51, 79108 Freiburg, Germany
- BIOSS, Center for Biological Signalling Studies, University of Freiburg, Schänzlestrasse 18, 79104 Freiburg, Germany
| |
|
227
|
Hu Q, Wang Z, Zhang Z. FSim: a novel functional similarity search algorithm and tool for discovering functionally related gene products. BIOMED RESEARCH INTERNATIONAL 2014; 2014:509149. [PMID: 25184141 PMCID: PMC4145548 DOI: 10.1155/2014/509149] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 04/13/2014] [Revised: 06/24/2014] [Accepted: 07/22/2014] [Indexed: 01/21/2023]
Abstract
BACKGROUND During the analysis of genomics data, it is often required to quantify the functional similarity of genes and their products based on the annotation information from gene ontology (GO) with hierarchical structure. A flexible and user-friendly way to estimate the functional similarity of genes utilizing GO annotation is therefore highly desired. RESULTS We proposed a novel algorithm using a level coefficient-weighted model to measure the functional similarity of gene products based on multiple ontologies of hierarchical GO annotations. The performance of our algorithm was evaluated and found to be superior to the other tested methods. We implemented the proposed algorithm in a software package, FSim, based on R statistical and computing environment. It can be used to discover functionally related genes for a given gene, group of genes, or set of function terms. CONCLUSIONS FSim is a flexible tool to analyze functional gene groups based on the GO annotation databases.
Affiliation(s)
- Qiang Hu
- Department of Biomedical Engineering, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences, School of Basic Medicine, Peking Union Medical College, Beijing 100005, China
| | - ZhiGang Wang
- Department of Biomedical Engineering, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences, School of Basic Medicine, Peking Union Medical College, Beijing 100005, China
| | - ZhengGuo Zhang
- Department of Biomedical Engineering, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences, School of Basic Medicine, Peking Union Medical College, Beijing 100005, China
| |
|
228
|
Peng C, Gao F. Protein localization analysis of essential genes in prokaryotes. Sci Rep 2014; 4:6001. [PMID: 25105358 PMCID: PMC4126397 DOI: 10.1038/srep06001] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Received: 05/29/2014] [Accepted: 07/22/2014] [Indexed: 01/27/2023]
Abstract
Essential genes, those critical for the survival of an organism under certain conditions, play a significant role in pharmaceutics and synthetic biology. Knowledge of protein localization is invaluable for understanding their function as well as the interaction of different proteins. However, a systematic examination of essential genes with respect to the localization of the proteins they encode has not been performed before. Here, a comprehensive protein localization analysis of essential genes in 27 prokaryotes including 24 bacteria, 2 mycoplasmas and 1 archaeon has been performed. Both statistical analysis of localization information in these genomes and GO (Gene Ontology) terms enriched in the essential genes show that proteins encoded by essential genes are enriched in internal location sites, but are present in the cell envelope at a lower proportion than those encoded by non-essential genes. Meanwhile, there are few essential proteins in the external subcellular location sites such as flagellum and fimbrium, and proteins encoded by non-essential genes tend to have diverse localizations. These results would provide further insights into the understanding of fundamental functions needed to support a cellular life and improve gene essentiality prediction by taking the protein localization and enriched GO terms into consideration.
Affiliation(s)
- Chong Peng
- Department of Physics, Tianjin University, Tianjin 300072, China
| | - Feng Gao
- Department of Physics, Tianjin University, Tianjin 300072, China
- Key Laboratory of Systems Bioengineering, Ministry of Education, Tianjin University, Tianjin 300072, China
- Collaborative Innovation Center of Chemical Science and Engineering, Tianjin 300072, China
| |
|
229
|
Gilbert TM, McDaniel SL, Byrum SD, Cades JA, Dancy BCR, Wade H, Tackett AJ, Strahl BD, Taverna SD. A PWWP domain-containing protein targets the NuA3 acetyltransferase complex via histone H3 lysine 36 trimethylation to coordinate transcriptional elongation at coding regions. Mol Cell Proteomics 2014; 13:2883-95. [PMID: 25104842 DOI: 10.1074/mcp.m114.038224] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.9] [Indexed: 02/04/2023]
Abstract
Post-translational modifications of histones, such as acetylation and methylation, are differentially positioned in chromatin with respect to gene organization. For example, although histone H3 is often trimethylated on lysine 4 (H3K4me3) and acetylated on lysine 14 (H3K14ac) at active promoter regions, histone H3 lysine 36 trimethylation (H3K36me3) occurs throughout the open reading frames of transcriptionally active genes. The conserved yeast histone acetyltransferase complex, NuA3, specifically binds H3K4me3 through a plant homeodomain (PHD) finger in the Yng1 subunit, and subsequently catalyzes the acetylation of H3K14 through the histone acetyltransferase domain of Sas3, leading to transcription initiation at a subset of genes. We previously found that Ylr455w (Pdp3), an uncharacterized proline-tryptophan-tryptophan-proline (PWWP) domain-containing protein, copurifies with stable members of NuA3. Here, we employ mass-spectrometric analysis of affinity purified Pdp3, biophysical binding assays, and genetic analyses to classify NuA3 into two functionally distinct forms: NuA3a and NuA3b. Although NuA3a uses the PHD finger of Yng1 to interact with H3K4me3 at the 5'-end of open reading frames, NuA3b contains the unique member, Pdp3, which regulates an interaction between NuA3b and H3K36me3 at the transcribed regions of genes through its PWWP domain. We find that deletion of PDP3 decreases NuA3-directed transcription and results in growth defects when combined with transcription elongation mutants, suggesting NuA3b acts as a positive elongation factor. Finally, we determine that NuA3a, but not NuA3b, is synthetically lethal in combination with a deletion of the histone acetyltransferase GCN5, indicating NuA3b has a specialized role at coding regions that is independent of Gcn5 activity. Collectively, these studies define a new form of the NuA3 complex that associates with H3K36me3 to effect transcriptional elongation. 
MS data are available via ProteomeXchange with identifier PXD001156.
Affiliation(s)
- Tonya M Gilbert
- Department of Pharmacology and Molecular Sciences, Johns Hopkins University School of Medicine, Baltimore, Maryland, 21205; Center for Epigenetics, Johns Hopkins University School of Medicine, Baltimore, Maryland, 21205
| | - Stephen L McDaniel
- Curriculum in Genetics and Molecular Biology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, 27599
| | - Stephanie D Byrum
- Department of Biochemistry and Molecular Biology, University of Arkansas for Medical Sciences, Little Rock, Arkansas, 72205
| | - Jessica A Cades
- Department of Pharmacology and Molecular Sciences, Johns Hopkins University School of Medicine, Baltimore, Maryland, 21205
| | - Blair C R Dancy
- Department of Pharmacology and Molecular Sciences, Johns Hopkins University School of Medicine, Baltimore, Maryland, 21205; Center for Epigenetics, Johns Hopkins University School of Medicine, Baltimore, Maryland, 21205
| | - Herschel Wade
- Department of Biophysics and Biological Chemistry, Johns Hopkins University School of Medicine, Baltimore, Maryland, 21205
| | - Alan J Tackett
- Department of Biochemistry and Molecular Biology, University of Arkansas for Medical Sciences, Little Rock, Arkansas, 72205
| | - Brian D Strahl
- Curriculum in Genetics and Molecular Biology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, 27599; Department of Biochemistry and Biophysics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, 27599
| | - Sean D Taverna
- Department of Pharmacology and Molecular Sciences, Johns Hopkins University School of Medicine, Baltimore, Maryland, 21205; Center for Epigenetics, Johns Hopkins University School of Medicine, Baltimore, Maryland, 21205
| |
|
230
|
Mazandu GK, Mulder NJ. The use of semantic similarity measures for optimally integrating heterogeneous Gene Ontology data from large scale annotation pipelines. Front Genet 2014; 5:264. [PMID: 25147557 PMCID: PMC4123725 DOI: 10.3389/fgene.2014.00264] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 05/30/2014] [Accepted: 07/18/2014] [Indexed: 11/14/2022]
Abstract
With the advancement of new high throughput sequencing technologies, there has been an increase in the number of genome sequencing projects worldwide, which has yielded complete genome sequences of human, animals and plants. Subsequently, several labs have focused on genome annotation, consisting of assigning functions to gene products, mostly using Gene Ontology (GO) terms. As a consequence, there is an increased heterogeneity in annotations across genomes due to different approaches used by different pipelines to infer these annotations and also due to the nature of the GO structure itself. This makes a curator's task difficult, even if they adhere to the established guidelines for assessing these protein annotations. Here we develop a genome-scale approach for integrating GO annotations from different pipelines using semantic similarity measures. We used this approach to identify inconsistencies and similarities in functional annotations between orthologs of human and Drosophila melanogaster, to assess the quality of GO annotations derived from InterPro2GO mappings compared to manually annotated GO annotations for the Drosophila melanogaster proteome from a FlyBase dataset and human, and to filter GO annotation data for these proteomes. Results obtained indicate that an efficient integration of GO annotations eliminates redundancy up to 27.08 and 22.32% in the Drosophila melanogaster and human GO annotation datasets, respectively. Furthermore, we identified lack of and missing annotations for some orthologs, and annotation mismatches between InterPro2GO and manual pipelines in these two proteomes, thus requiring further curation. This simplifies and facilitates tasks of curators in assessing protein annotations, reduces redundancy and eliminates inconsistencies in large annotation datasets for ease of comparative functional genomics.
Affiliation(s)
- Gaston K Mazandu
- Computational Biology Group, Department of Clinical Laboratory Sciences, Institute of Infectious Disease and Molecular Medicine, University of Cape Town, Cape Town, South Africa
| | - Nicola J Mulder
- Computational Biology Group, Department of Clinical Laboratory Sciences, Institute of Infectious Disease and Molecular Medicine, University of Cape Town, Cape Town, South Africa
| |
|
231
|
In vivo mRNA profiling of uropathogenic Escherichia coli from diverse phylogroups reveals common and group-specific gene expression profiles. mBio 2014; 5:e01075-14. [PMID: 25096872 PMCID: PMC4128348 DOI: 10.1128/mbio.01075-14] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.7] [Indexed: 11/30/2022]
Abstract
mRNA profiling of pathogens during the course of human infections gives detailed information on the expression levels of relevant genes that drive pathogenicity and adaptation and at the same time allows for the delineation of phylogenetic relatedness of pathogens that cause specific diseases. In this study, we used mRNA sequencing to acquire information on the expression of Escherichia coli pathogenicity genes during urinary tract infections (UTI) in humans and to assign the UTI-associated E. coli isolates to different phylogenetic groups. Whereas the in vivo gene expression profiles of the majority of genes were conserved among 21 E. coli strains in the urine of elderly patients suffering from an acute UTI, the specific gene expression profiles of the flexible genomes were diverse and reflected phylogenetic relationships. Furthermore, genes transcribed in vivo relative to laboratory media included well-described virulence factors, small regulatory RNAs, as well as genes not previously linked to bacterial virulence. Knowledge of relevant transcriptional responses that drive pathogenicity and adaptation of isolates to the human host might lead to the introduction of a virulence typing strategy into clinical microbiology, potentially facilitating management and prevention of the disease. Urinary tract infections (UTI) are very common; at least half of all women experience UTI, most of which are caused by pathogenic Escherichia coli strains. In this study, we applied massively parallel cDNA sequencing (RNA-seq) to provide unbiased, deep, and accurate insight into the nature and the dimension of the uropathogenic E. coli gene expression profile during an acute UTI within the human host. This work was undertaken to identify key players in physiological adaptation processes and, hence, potential targets for new infection prevention and therapy interventions specifically aimed at sabotaging bacterial adaptation to the human host.
|
232
|
Chibucos MC, Mungall CJ, Balakrishnan R, Christie KR, Huntley RP, White O, Blake JA, Lewis SE, Giglio M. Standardized description of scientific evidence using the Evidence Ontology (ECO). DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2014; 2014:bau075. [PMID: 25052702 PMCID: PMC4105709 DOI: 10.1093/database/bau075] [Citation(s) in RCA: 82] [Impact Index Per Article: 8.2] [Indexed: 11/13/2022]
Abstract
The Evidence Ontology (ECO) is a structured, controlled vocabulary for capturing evidence in biological research. ECO includes diverse terms for categorizing evidence that supports annotation assertions including experimental types, computational methods, author statements and curator inferences. Using ECO, annotation assertions can be distinguished according to the evidence they are based on such as those made by curators versus those automatically computed or those made via high-throughput data review versus single test experiments. Originally created for capturing evidence associated with Gene Ontology annotations, ECO is now used in other capacities by many additional annotation resources including UniProt, Mouse Genome Informatics, Saccharomyces Genome Database, PomBase, the Protein Information Resource and others. Information on the development and use of ECO can be found at http://evidenceontology.org. The ontology is freely available under Creative Commons license (CC BY-SA 3.0), and can be downloaded in both Open Biological Ontologies and Web Ontology Language formats at http://code.google.com/p/evidenceontology. Also at this site is a tracker for user submission of term requests and questions. ECO remains under active development in response to user-requested terms and in collaborations with other ontologies and database resources. Database URL: Evidence Ontology Web site: http://evidenceontology.org
Affiliation(s)
- Marcus C Chibucos
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA, Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD 21201, USA, Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA, Saccharomyces Genome Database, Department of Genetics, Stanford University, Stanford, CA 94305, USA, Computational Biology and Bioinformatics, The Jackson Laboratory, Bar Harbor, ME 04609, USA, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD UK, Department of Epidemiology, University of Maryland School of Medicine, Baltimore, MD 21201, USA and Department of Medicine, University of Maryland School of Medicine, Baltimore, MD 21201, USA
| | - Christopher J Mungall
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA, Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD 21201, USA, Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA, Saccharomyces Genome Database, Department of Genetics, Stanford University, Stanford, CA 94305, USA, Computational Biology and Bioinformatics, The Jackson Laboratory, Bar Harbor, ME 04609, USA, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD UK, Department of Epidemiology, University of Maryland School of Medicine, Baltimore, MD 21201, USA and Department of Medicine, University of Maryland School of Medicine, Baltimore, MD 21201, USA
| | - Rama Balakrishnan
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA, Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD 21201, USA, Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA, Saccharomyces Genome Database, Department of Genetics, Stanford University, Stanford, CA 94305, USA, Computational Biology and Bioinformatics, The Jackson Laboratory, Bar Harbor, ME 04609, USA, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD UK, Department of Epidemiology, University of Maryland School of Medicine, Baltimore, MD 21201, USA and Department of Medicine, University of Maryland School of Medicine, Baltimore, MD 21201, USA
| | - Karen R Christie
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA, Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD 21201, USA, Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA, Saccharomyces Genome Database, Department of Genetics, Stanford University, Stanford, CA 94305, USA, Computational Biology and Bioinformatics, The Jackson Laboratory, Bar Harbor, ME 04609, USA, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD UK, Department of Epidemiology, University of Maryland School of Medicine, Baltimore, MD 21201, USA and Department of Medicine, University of Maryland School of Medicine, Baltimore, MD 21201, USA
| | - Rachael P Huntley
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA, Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD 21201, USA, Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA, Saccharomyces Genome Database, Department of Genetics, Stanford University, Stanford, CA 94305, USA, Computational Biology and Bioinformatics, The Jackson Laboratory, Bar Harbor, ME 04609, USA, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD UK, Department of Epidemiology, University of Maryland School of Medicine, Baltimore, MD 21201, USA and Department of Medicine, University of Maryland School of Medicine, Baltimore, MD 21201, USA
| | - Owen White
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA, Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD 21201, USA, Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA, Saccharomyces Genome Database, Department of Genetics, Stanford University, Stanford, CA 94305, USA, Computational Biology and Bioinformatics, The Jackson Laboratory, Bar Harbor, ME 04609, USA, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD UK, Department of Epidemiology, University of Maryland School of Medicine, Baltimore, MD 21201, USA and Department of Medicine, University of Maryland School of Medicine, Baltimore, MD 21201, USA
| | - Judith A Blake
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA, Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD 21201, USA, Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA, Saccharomyces Genome Database, Department of Genetics, Stanford University, Stanford, CA 94305, USA, Computational Biology and Bioinformatics, The Jackson Laboratory, Bar Harbor, ME 04609, USA, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD UK, Department of Epidemiology, University of Maryland School of Medicine, Baltimore, MD 21201, USA and Department of Medicine, University of Maryland School of Medicine, Baltimore, MD 21201, USA
| | - Suzanna E Lewis
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA, Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD 21201, USA, Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA, Saccharomyces Genome Database, Department of Genetics, Stanford University, Stanford, CA 94305, USA, Computational Biology and Bioinformatics, The Jackson Laboratory, Bar Harbor, ME 04609, USA, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD UK, Department of Epidemiology, University of Maryland School of Medicine, Baltimore, MD 21201, USA and Department of Medicine, University of Maryland School of Medicine, Baltimore, MD 21201, USA
| | - Michelle Giglio
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA, Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD 21201, USA, Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA, Saccharomyces Genome Database, Department of Genetics, Stanford University, Stanford, CA 94305, USA, Computational Biology and Bioinformatics, The Jackson Laboratory, Bar Harbor, ME 04609, USA, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD UK, Department of Epidemiology, University of Maryland School of Medicine, Baltimore, MD 21201, USA and Department of Medicine, University of Maryland School of Medicine, Baltimore, MD 21201, USA
| |
|
233
|
Panek J, El Alaoui H, Mone A, Urbach S, Demettre E, Texier C, Brun C, Zanzoni A, Peyretaillade E, Parisot N, Lerat E, Peyret P, Delbac F, Biron DG. Hijacking of host cellular functions by an intracellular parasite, the microsporidian Anncaliia algerae. PLoS One 2014; 9:e100791. [PMID: 24967735 PMCID: PMC4072689 DOI: 10.1371/journal.pone.0100791] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Received: 04/07/2014] [Accepted: 05/29/2014] [Indexed: 11/18/2022]
Abstract
Intracellular pathogens including bacteria, viruses and protozoa hijack host cell functions to access nutrients and to bypass cellular defenses and immune responses. These strategies have been acquired through selective pressure and allowed pathogens to reach an appropriate cellular niche for their survival and growth. To get new insights on how parasites hijack host cellular functions, we developed a SILAC (Stable Isotope Labeling by Amino Acids in Cell culture) quantitative proteomics workflow. Our study focused on deciphering the cross-talk in a host-parasite association, involving human foreskin fibroblasts (HFF) and the microsporidia Anncaliia algerae, a fungus-related parasite with an obligate intracellular lifestyle and a strong host dependency. The host-parasite cross-talk was analyzed at five post-infection times: 1, 6, 12 and 24 hours post-infection (hpi) and 8 days post-infection (dpi). A significant up-regulation of four interferon-induced proteins with tetratricopeptide repeats IFIT1, IFIT2, IFIT3 and MX1 was observed at 8 dpi, suggesting a type 1 interferon (IFN) host response. Quantitative alteration of host proteins involved in biological functions such as signaling (STAT1, Ras) and reduction of the translation activity (EIF3) confirmed a host type 1 IFN response. Interestingly, the SILAC approach also allowed the detection of 148 A. algerae proteins during the kinetics of infection. Among these proteins many are involved in parasite proliferation, and an over-representation of putative secreted effector proteins was observed. Finally, our survey also suggests that A. algerae could use a transposable element as a lure strategy to escape the host innate immune system.
Affiliation(s)
- Johan Panek
- Clermont Université, Université Blaise Pascal, Laboratoire Microorganismes: Génome et Environnement, Clermont-Ferrand, France
- CNRS, UMR 6023, LMGE, Aubière, France
| | - Hicham El Alaoui
- Clermont Université, Université Blaise Pascal, Laboratoire Microorganismes: Génome et Environnement, Clermont-Ferrand, France
- CNRS, UMR 6023, LMGE, Aubière, France
- * E-mail: (HEA); (DGB)
| | - Anne Mone
- Clermont Université, Université Blaise Pascal, Laboratoire Microorganismes: Génome et Environnement, Clermont-Ferrand, France
- CNRS, UMR 6023, LMGE, Aubière, France
- Serge Urbach
- Functional Proteomics Platform. UMR CNRS 5203, Montpellier, France
- Edith Demettre
- Functional Proteomics Platform. UMS CNRS 3426, Montpellier, France
- Catherine Texier
- Clermont Université, Université Blaise Pascal, Laboratoire Microorganismes: Génome et Environnement, Clermont-Ferrand, France
- CNRS, UMR 6023, LMGE, Aubière, France
- Christine Brun
- INSERM, UMR1090 TAGC, Marseille, Marseille, France
- Aix-Marseille Université, UMR1090 TAGC, Marseille, France
- CNRS, Marseille, France
- Andreas Zanzoni
- INSERM, UMR1090 TAGC, Marseille, Marseille, France
- Aix-Marseille Université, UMR1090 TAGC, Marseille, France
- Eric Peyretaillade
- Clermont Université, Université d'Auvergne, I.U.T., UFR Pharmacie, Clermont-Ferrand, France
- Clermont Université, Université d'Auvergne, EA 4678, Conception, Ingénierie et Développement de l'Aliment et du Médicament, Clermont-Ferrand, France
- Nicolas Parisot
- Clermont Université, Université d'Auvergne, I.U.T., UFR Pharmacie, Clermont-Ferrand, France
- Clermont Université, Université d'Auvergne, EA 4678, Conception, Ingénierie et Développement de l'Aliment et du Médicament, Clermont-Ferrand, France
- Emmanuelle Lerat
- Université de Lyon, Université Lyon 1, CNRS, UMR5558, Laboratoire de Biométrie et Biologie Evolutive, Villeurbanne, France
- Pierre Peyret
- Clermont Université, Université d'Auvergne, I.U.T., UFR Pharmacie, Clermont-Ferrand, France
- Clermont Université, Université d'Auvergne, EA 4678, Conception, Ingénierie et Développement de l'Aliment et du Médicament, Clermont-Ferrand, France
- Frederic Delbac
- Clermont Université, Université Blaise Pascal, Laboratoire Microorganismes: Génome et Environnement, Clermont-Ferrand, France
- CNRS, UMR 6023, LMGE, Aubière, France
- David G. Biron
- Clermont Université, Université Blaise Pascal, Laboratoire Microorganismes: Génome et Environnement, Clermont-Ferrand, France
- CNRS, UMR 6023, LMGE, Aubière, France
- * E-mail: (HEA); (DGB)
234
Bragina EY, Tiys ES, Freidin MB, Koneva LA, Demenkov PS, Ivanisenko VA, Kolchanov NA, Puzyrev VP. Insights into pathophysiology of dystropy through the analysis of gene networks: an example of bronchial asthma and tuberculosis. Immunogenetics 2014; 66:457-65. [PMID: 24954693 DOI: 10.1007/s00251-014-0786-1] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Received: 10/02/2013] [Accepted: 06/12/2014] [Indexed: 01/18/2023]
Abstract
Co-existence of bronchial asthma (BA) and tuberculosis (TB) is extremely uncommon (dystropic). We assume that this is caused by the interplay between genes involved in specific pathophysiological pathways that arrest simultaneous manifestation of BA and TB. Identification of common and specific genes may be important to determine the molecular genetic mechanisms leading to rare co-occurrence of these diseases and may contribute to the identification of susceptibility genes for each of these dystropic diseases. To address the issue, we propose a new methodological strategy that is based on reconstruction of associative networks that represent molecular relationships between proteins/genes associated with BA and TB, thus facilitating a better understanding of the biological context of antagonistic relationships between the diseases. The results of our study revealed a number of proteins/genes important for the development of both BA and TB.
Affiliation(s)
- Elena Yu Bragina
- Laboratory of Population Genetics, Research Institute of Medical Genetics, Siberian Branch of Russian Academy of Medical Sciences, Nabereznaya Ushaiki str. 10, Tomsk, Russian Federation, 634050
235
Dikicioglu D, Wood V, Rutherford KM, McDowall MD, Oliver SG. Improving functional annotation for industrial microbes: a case study with Pichia pastoris. Trends Biotechnol 2014; 32:396-9. [PMID: 24929579 PMCID: PMC4111905 DOI: 10.1016/j.tibtech.2014.05.003] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Received: 03/06/2014] [Revised: 05/10/2014] [Accepted: 05/13/2014] [Indexed: 11/29/2022]
Abstract
The current status of the Pichia pastoris genome is shown to lack extensive functional annotation. GO annotation transfer and literature curation pipelines improve the functional annotation of genomes. Pipelines and tools that can improve the annotation status of the genomes of Pichia pastoris and many industrial microbes are considered. Well-annotated genome sequences will facilitate the utilization of these microbes in a broader range of synthetic biology applications.
The research communities studying microbial model organisms, such as Escherichia coli or Saccharomyces cerevisiae, are well served by model organism databases that have extensive functional annotation. However, this is not true of many industrial microbes that are used widely in biotechnology. In this Opinion piece, we use Pichia (Komagataella) pastoris to illustrate the limitations of the available annotation. We consider the resources that can be implemented in the short term both to improve Gene Ontology (GO) annotation coverage based on annotation transfer, and to establish curation pipelines for the literature corpus of this organism.
Affiliation(s)
- Duygu Dikicioglu
- Cambridge Systems Biology Centre & Department of Biochemistry, University of Cambridge, Sanger Building, 80 Tennis Court Road, Cambridge CB2 1GA, UK
- Valerie Wood
- Cambridge Systems Biology Centre & Department of Biochemistry, University of Cambridge, Sanger Building, 80 Tennis Court Road, Cambridge CB2 1GA, UK
- Kim M Rutherford
- Cambridge Systems Biology Centre & Department of Biochemistry, University of Cambridge, Sanger Building, 80 Tennis Court Road, Cambridge CB2 1GA, UK
- Mark D McDowall
- European Molecular Biology Laboratory European Bioinformatics Institute (EMBL-EBI) Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Stephen G Oliver
- Cambridge Systems Biology Centre & Department of Biochemistry, University of Cambridge, Sanger Building, 80 Tennis Court Road, Cambridge CB2 1GA, UK
236
Arighi CN, Wu CH, Cohen KB, Hirschman L, Krallinger M, Valencia A, Lu Z, Wilbur JW, Wiegers TC. BioCreative-IV virtual issue. Database: The Journal of Biological Databases and Curation 2014; 2014:bau039. [PMID: 24852177 PMCID: PMC4030502 DOI: 10.1093/database/bau039] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.8] [Indexed: 11/14/2022]
Affiliation(s)
- Cecilia N Arighi
- Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE, USA
- Cathy H Wu
- Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE, USA
- Kevin B Cohen
- Center for Computational Pharmacology, University of Colorado Denver School of Medicine, Aurora, CO, USA
- Martin Krallinger
- Structural and Computational Biology Group, Spanish National Cancer Research Centre, Madrid, Spain
- Alfonso Valencia
- Structural and Computational Biology Group, Spanish National Cancer Research Centre, Madrid, Spain
- Zhiyong Lu
- National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD, USA
- John W Wilbur
- National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD, USA
- Thomas C Wiegers
- Department of Biological Sciences, North Carolina State University, Raleigh, NC, USA
237
Goldberg T, Hecht M, Hamp T, Karl T, Yachdav G, Ahmed N, Altermann U, Angerer P, Ansorge S, Balasz K, Bernhofer M, Betz A, Cizmadija L, Do KT, Gerke J, Greil R, Joerdens V, Hastreiter M, Hembach K, Herzog M, Kalemanov M, Kluge M, Meier A, Nasir H, Neumaier U, Prade V, Reeb J, Sorokoumov A, Troshani I, Vorberg S, Waldraff S, Zierer J, Nielsen H, Rost B. LocTree3 prediction of localization. Nucleic Acids Res 2014; 42:W350-5. [PMID: 24848019 PMCID: PMC4086075 DOI: 10.1093/nar/gku396] [Citation(s) in RCA: 196] [Impact Index Per Article: 19.6] [Indexed: 01/06/2023] Open
Abstract
The prediction of protein sub-cellular localization is an important step toward elucidating protein function. For each query protein sequence, LocTree2 applies machine learning (profile kernel SVM) to predict the native sub-cellular localization in 18 classes for eukaryotes, in six for bacteria and in three for archaea. The method outputs a score that reflects the reliability of each prediction. LocTree2 has performed on par with or better than any other state-of-the-art method. Here, we report the availability of LocTree3 as a public web server. The server includes the machine learning-based LocTree2 and improves over it through the addition of homology-based inference. Assessed on sequence-unique data, LocTree3 reached an 18-state accuracy Q18 = 80 ± 3% for eukaryotes and a six-state accuracy Q6 = 89 ± 4% for bacteria. The server accepts submissions ranging from single protein sequences to entire proteomes. Response time of the unloaded server is about 90 s for a 300-residue eukaryotic protein and a few hours for an entire eukaryotic proteome not considering the generation of the alignments. For over 1000 entirely sequenced organisms, the predictions are directly available as downloads. The web server is available at http://www.rostlab.org/services/loctree3.
Affiliation(s)
- Tatyana Goldberg
- Department of Informatics, Bioinformatics-I12, TUM, 85748 Garching, Germany
- TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), 85748 Garching, Germany
- Maximilian Hecht
- Department of Informatics, Bioinformatics-I12, TUM, 85748 Garching, Germany
- Tobias Hamp
- Department of Informatics, Bioinformatics-I12, TUM, 85748 Garching, Germany
- Timothy Karl
- Department of Informatics, Bioinformatics-I12, TUM, 85748 Garching, Germany
- Guy Yachdav
- Department of Informatics, Bioinformatics-I12, TUM, 85748 Garching, Germany
- Biosof LLC, New York, NY 10001, USA
- Nadeem Ahmed
- Department of Informatics, Bioinformatics-I12, TUM, 85748 Garching, Germany
- Uwe Altermann
- Department of Informatics, Bioinformatics-I12, TUM, 85748 Garching, Germany
- Philipp Angerer
- Department of Informatics, Bioinformatics-I12, TUM, 85748 Garching, Germany
- Sonja Ansorge
- Department of Informatics, Bioinformatics-I12, TUM, 85748 Garching, Germany
- Kinga Balasz
- Department of Informatics, Bioinformatics-I12, TUM, 85748 Garching, Germany
- Michael Bernhofer
- Department of Informatics, Bioinformatics-I12, TUM, 85748 Garching, Germany
- Alexander Betz
- Department of Informatics, Bioinformatics-I12, TUM, 85748 Garching, Germany
- Laura Cizmadija
- Department of Informatics, Bioinformatics-I12, TUM, 85748 Garching, Germany
- Kieu Trinh Do
- Department of Informatics, Bioinformatics-I12, TUM, 85748 Garching, Germany
- Julia Gerke
- Department of Informatics, Bioinformatics-I12, TUM, 85748 Garching, Germany
- Robert Greil
- Department of Informatics, Bioinformatics-I12, TUM, 85748 Garching, Germany
- Vadim Joerdens
- Department of Informatics, Bioinformatics-I12, TUM, 85748 Garching, Germany
- Katharina Hembach
- Department of Informatics, Bioinformatics-I12, TUM, 85748 Garching, Germany
- Max Herzog
- Department of Informatics, Bioinformatics-I12, TUM, 85748 Garching, Germany
- Maria Kalemanov
- Department of Informatics, Bioinformatics-I12, TUM, 85748 Garching, Germany
- Michael Kluge
- Department of Informatics, Bioinformatics-I12, TUM, 85748 Garching, Germany
- Alice Meier
- Department of Informatics, Bioinformatics-I12, TUM, 85748 Garching, Germany
- Hassan Nasir
- Department of Informatics, Bioinformatics-I12, TUM, 85748 Garching, Germany
- Ulrich Neumaier
- Department of Informatics, Bioinformatics-I12, TUM, 85748 Garching, Germany
- Verena Prade
- Department of Informatics, Bioinformatics-I12, TUM, 85748 Garching, Germany
- Jonas Reeb
- Department of Informatics, Bioinformatics-I12, TUM, 85748 Garching, Germany
- Ilira Troshani
- Department of Informatics, Bioinformatics-I12, TUM, 85748 Garching, Germany
- Susann Vorberg
- Department of Informatics, Bioinformatics-I12, TUM, 85748 Garching, Germany
- Sonja Waldraff
- Department of Informatics, Bioinformatics-I12, TUM, 85748 Garching, Germany
- Jonas Zierer
- Department of Informatics, Bioinformatics-I12, TUM, 85748 Garching, Germany
- Henrik Nielsen
- Center for Biological Sequence Analysis, Department of Systems Biology, DTU, 2800 Lyngby, Denmark
- Burkhard Rost
- Department of Informatics, Bioinformatics-I12, TUM, 85748 Garching, Germany
- Biosof LLC, New York, NY 10001, USA
- Institute for Advanced Study (TUM-IAS), 85748 Garching, Germany
- New York Consortium on Membrane Protein Structure (NYCOMPS) & Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY 10032, USA
- Institute for Food and Plant Sciences WZW - Weihenstephan, 85350 Freising, Germany
238
Huntley RP, Harris MA, Alam-Faruque Y, Blake JA, Carbon S, Dietze H, Dimmer EC, Foulger RE, Hill DP, Khodiyar VK, Lock A, Lomax J, Lovering RC, Mutowo-Meullenet P, Sawford T, Van Auken K, Wood V, Mungall CJ. A method for increasing expressivity of Gene Ontology annotations using a compositional approach. BMC Bioinformatics 2014; 15:155. [PMID: 24885854 PMCID: PMC4039540 DOI: 10.1186/1471-2105-15-155] [Citation(s) in RCA: 61] [Impact Index Per Article: 6.1] [Received: 03/04/2014] [Accepted: 05/15/2014] [Indexed: 11/22/2022] Open
Abstract
Background: The Gene Ontology project integrates data about the function of gene products across a diverse range of organisms, allowing the transfer of knowledge from model organisms to humans, and enabling computational analyses for interpretation of high-throughput experimental and clinical data. The core data structure is the annotation, an association between a gene product and a term from one of the three ontologies comprising the GO. Historically, it has not been possible to provide additional information about the context of a GO term, such as the target gene or the location of a molecular function. This has limited the specificity of knowledge that can be expressed by GO annotations.
Results: The GO Consortium has introduced annotation extensions that enable manually curated GO annotations to capture additional contextual details. Extensions represent effector–target relationships such as localization dependencies, substrates of protein modifiers and regulation targets of signaling pathways and transcription factors as well as spatial and temporal aspects of processes such as cell or tissue type or developmental stage. We describe the content and structure of annotation extensions, provide examples, and summarize the current usage of annotation extensions.
Conclusions: The additional contextual information captured by annotation extensions improves the utility of functional annotation by representing dependencies between annotations to terms in the different ontologies of GO, external ontologies, or an organism’s gene products. These enhanced annotations can also support sophisticated queries and reasoning, and will provide curated, directional links between many gene products to support pathway and network reconstruction.
239
Mashiyama ST, Malabanan MM, Akiva E, Bhosle R, Branch MC, Hillerich B, Jagessar K, Kim J, Patskovsky Y, Seidel RD, Stead M, Toro R, Vetting MW, Almo SC, Armstrong RN, Babbitt PC. Large-scale determination of sequence, structure, and function relationships in cytosolic glutathione transferases across the biosphere. PLoS Biol 2014; 12:e1001843. [PMID: 24756107 PMCID: PMC3995644 DOI: 10.1371/journal.pbio.1001843] [Citation(s) in RCA: 69] [Impact Index Per Article: 6.9] [Received: 12/03/2013] [Accepted: 03/14/2014] [Indexed: 12/11/2022] Open
Abstract
Global networks of the cytosolic glutathione S-transferases illuminate sequence-structure-function relationships across more than 13,000 members of this superfamily, including experimental confirmation of enzymatic activity for 82 members and new crystal structures for 27.
The cytosolic glutathione transferase (cytGST) superfamily comprises more than 13,000 nonredundant sequences found throughout the biosphere. Their key roles in metabolism and defense against oxidative damage have led to thousands of studies over several decades. Despite this attention, little is known about the physiological reactions they catalyze and most of the substrates used to assay cytGSTs are synthetic compounds. A deeper understanding of relationships across the superfamily could provide new clues about their functions. To establish a foundation for expanded classification of cytGSTs, we generated similarity-based subgroupings for the entire superfamily. Using the resulting sequence similarity networks, we chose targets that broadly covered unknown functions and report here experimental results confirming GST-like activity for 82 of them, along with 37 new 3D structures determined for 27 targets. These new data, along with experimentally known GST reactions and structures reported in the literature, were painted onto the networks to generate a global view of their sequence-structure-function relationships. The results show how proteins of both known and unknown function relate to each other across the entire superfamily and reveal that the great majority of cytGSTs have not been experimentally characterized or annotated by canonical class. A mapping of taxonomic classes across the superfamily indicates that many taxa are represented in each subgroup and highlights challenges for classification of superfamily sequences into functionally relevant classes. Experimental determination of disulfide bond reductase activity in many diverse subgroups illustrates a theme common for many reaction types. Finally, sequence comparison between an enzyme that catalyzes a reductive dechlorination reaction relevant to bioremediation efforts with some of its closest homologs reveals differences among them likely to be associated with evolution of this unusual reaction. Interactive versions of the networks, associated with functional and other types of information, can be downloaded from the Structure-Function Linkage Database (SFLD; http://sfld.rbvi.ucsf.edu).
Cytosolic glutathione transferases (cytGSTs) are a large and diverse superfamily of enzymes that have important roles in metabolism and defense against oxidative damage. They have been studied for several decades but because of the synthetic nature of the chemicals used to test these proteins to determine if they have cytGST activity, little is known about the physiological reactions and roles of cytGSTs. In this large, collaborative study, we constructed networks where more than 13,000 cytGST sequences were grouped by sequence similarity and then used these networks to prioritize new targets for experimental characterization in relatively unexplored regions of the superfamily. We report here experimental results confirming GST-like activity for 82 of them, along with 37 new three-dimensional molecular structures determined for 27 targets. These new data, along with experimental data previously reported in the literature, were painted onto the networks to generate a global view of their sequence-structure-function relationships. The results show how proteins of both known and unknown function relate to each other across the entire superfamily and illuminate the complex ways in which their variations in sequence and structure affect our ability to predict unknown functional properties.
Affiliation(s)
- Susan T. Mashiyama
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, California, United States of America
- M. Merced Malabanan
- Department of Biochemistry, Vanderbilt University School of Medicine, Nashville, Tennessee, United States of America
- Eyal Akiva
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, California, United States of America
- Rahul Bhosle
- Department of Biochemistry, Albert Einstein College of Medicine, Bronx, New York, United States of America
- Megan C. Branch
- Department of Biochemistry, University of Wisconsin, Madison, Wisconsin, United States of America
- Brandan Hillerich
- Department of Biochemistry, Albert Einstein College of Medicine, Bronx, New York, United States of America
- Kevin Jagessar
- Department of Biochemistry, Vanderbilt University School of Medicine, Nashville, Tennessee, United States of America
- Jungwook Kim
- Department of Biochemistry, Albert Einstein College of Medicine, Bronx, New York, United States of America
- Yury Patskovsky
- Department of Biochemistry, Albert Einstein College of Medicine, Bronx, New York, United States of America
- Ronald D. Seidel
- Department of Biochemistry, Albert Einstein College of Medicine, Bronx, New York, United States of America
- Mark Stead
- Department of Biochemistry, Albert Einstein College of Medicine, Bronx, New York, United States of America
- Rafael Toro
- Department of Biochemistry, Albert Einstein College of Medicine, Bronx, New York, United States of America
- Matthew W. Vetting
- Department of Biochemistry, Albert Einstein College of Medicine, Bronx, New York, United States of America
- Steven C. Almo
- Department of Biochemistry, Albert Einstein College of Medicine, Bronx, New York, United States of America
- * E-mail: (SCA); (RNA); (PCB)
- Richard N. Armstrong
- Departments of Biochemistry and Chemistry, Vanderbilt University School of Medicine, Nashville, Tennessee, United States of America
- * E-mail: (SCA); (RNA); (PCB)
- Patricia C. Babbitt
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, California, United States of America
- Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, California, United States of America
- California Institute for Quantitative Biosciences, University of California, San Francisco, San Francisco, California, United States of America
- * E-mail: (SCA); (RNA); (PCB)
240
Murri M, Insenser M, Luque M, Tinahones FJ, Escobar-Morreale HF. Proteomic analysis of adipose tissue: informing diabetes research. Expert Rev Proteomics 2014; 11:491-502. [PMID: 24684164 DOI: 10.1586/14789450.2014.903158] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Indexed: 12/22/2022]
Abstract
Diabetes, one of the most common endocrine diseases worldwide, results from complex pathophysiological mechanisms that are not fully understood. Adipose tissue is considered a major endocrine organ and plays a central role in the development of diabetes. The identification of the adipose tissue-derived factors that contribute to the onset and progression of diabetes will hopefully lead to the development of preventive and therapeutic interventions. Proteomic techniques may be useful tools for this purpose. In the present review, we have summarized the studies conducting adipose tissue proteomics in subjects with diabetes and insulin resistance, and discussed the proteins identified in these studies as candidates to exert important roles in these disorders.
Affiliation(s)
- Mora Murri
- Department of Endocrinology and Nutrition, Diabetes, Obesity and Human Reproduction Research Group, Hospital Universitario Ramón y Cajal and Universidad de Alcalá and Instituto Ramón y Cajal de Investigación Sanitaria (IRYCIS) and Centro de Investigación Biomédica en Red Diabetes y Enfermedades Metabólicas Asociadas (CIBERDEM), E-28034 Madrid, Spain
241
Identification of microRNAs in the coral Stylophora pistillata. PLoS One 2014; 9:e91101. [PMID: 24658574 PMCID: PMC3962355 DOI: 10.1371/journal.pone.0091101] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.4] [Received: 11/06/2013] [Accepted: 02/06/2014] [Indexed: 12/22/2022] Open
Abstract
Coral reefs are major contributors to marine biodiversity. However, they are in rapid decline due to global environmental changes such as rising sea surface temperatures, ocean acidification, and pollution. Genomic and transcriptomic analyses have broadened our understanding of coral biology, but a study of the microRNA (miRNA) repertoire of corals is missing. miRNAs constitute a class of small non-coding RNAs of ∼22 nt in size that play crucial roles in development, metabolism, and stress response in plants and animals alike. In this study, we examined the coral Stylophora pistillata for the presence of miRNAs and the corresponding core protein machinery required for their processing and function. Based on small RNA sequencing, we present evidence for 31 bona fide microRNAs, 5 of which (miR-100, miR-2022, miR-2023, miR-2030, and miR-2036) are conserved in other metazoans. Homologues of Argonaute, Piwi, Dicer, Drosha, Pasha, and HEN1 were identified in the transcriptome of S. pistillata based on strong sequence conservation with known RNAi proteins, with additional support derived from phylogenetic trees. Examination of putative miRNA gene targets indicates potential roles in development, metabolism, immunity, and biomineralisation for several of the microRNAs. Here, we present the first evidence of a functional RNAi machinery and five conserved miRNAs in S. pistillata, implying that miRNAs play a role in the organismal biology of scleractinian corals. Analysis of predicted miRNA target genes in S. pistillata suggests potential roles of miRNAs in symbiosis and coral calcification. Given the importance of miRNAs in regulating gene expression in other metazoans, further expression analyses of small non-coding RNAs in transcriptional studies of corals should be informative about miRNA-affected processes and pathways.
242
Huntley RP, Sawford T, Martin MJ, O'Donovan C. Understanding how and why the Gene Ontology and its annotations evolve: the GO within UniProt. Gigascience 2014; 3:4. [PMID: 24641996 PMCID: PMC3995153 DOI: 10.1186/2047-217x-3-4] [Citation(s) in RCA: 53] [Impact Index Per Article: 5.3] [Received: 11/29/2013] [Accepted: 03/10/2014] [Indexed: 11/01/2022] Open
Abstract
The Gene Ontology Consortium (GOC) is a major bioinformatics project that provides structured controlled vocabularies to classify gene product function and location. GOC members create annotations to gene products using the Gene Ontology (GO) vocabularies, thus providing an extensive, publicly available resource. The GO and its annotations to gene products are now an integral part of functional analysis, and statistical tests using GO data are becoming routine for researchers to include when publishing functional information. While many helpful articles about the GOC are available, there are certain updates to the ontology and annotation sets that sometimes go unobserved. Here we describe some of the ways in which GO can change that should be carefully considered by all users of GO as they may have a significant impact on the resulting gene product annotations, and therefore the functional description of the gene product, or the interpretation of analyses performed on GO datasets. GO annotations for gene products change for many reasons, and while these changes generally improve the accuracy of the representation of the underlying biology, they do not necessarily imply that previous annotations were incorrect. We additionally describe the quality assurance mechanisms we employ to improve the accuracy of annotations, which necessarily changes the composition of the annotation sets we provide. We use the Universal Protein Resource (UniProt) for illustrative purposes of how the GO Consortium, as a whole, manages these changes.
Affiliation(s)
- Rachael P Huntley
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
243
Croset S, Overington JP, Rebholz-Schuhmann D. The functional therapeutic chemical classification system. Bioinformatics 2014; 30:876-83. [PMID: 24177719 PMCID: PMC3957075 DOI: 10.1093/bioinformatics/btt628] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 08/22/2013] [Revised: 10/15/2013] [Accepted: 10/27/2013] [Indexed: 01/27/2023] Open
Abstract
Motivation: Drug repositioning is the discovery of new indications for compounds that have already been approved and used in a clinical setting. Recently, some computational approaches have been suggested to unveil new opportunities in a systematic fashion, by taking into consideration gene expression signatures or chemical features, for instance. We present here a novel method based on knowledge integration using semantic technologies, to capture the functional role of approved chemical compounds.
Results: In order to computationally generate repositioning hypotheses, we used the Web Ontology Language to formally define the semantics of over 20 000 terms with axioms to correctly denote various modes of action (MoA). Based on an integration of public data, we have automatically assigned over a thousand approved drugs to these MoA categories. The resulting new resource is called the Functional Therapeutic Chemical Classification System and was further evaluated against the content of the traditional Anatomical Therapeutic Chemical Classification System. We illustrate how the new classification can be used to generate drug repurposing hypotheses, using Alzheimer's disease as a use-case.
Availability: https://www.ebi.ac.uk/chembl/ftc; https://github.com/loopasam/ftc.
Contact: croset@ebi.ac.uk
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Samuel Croset
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
244
Häuser R, Ceol A, Rajagopala SV, Mosca R, Siszler G, Wermke N, Sikorski P, Schwarz F, Schick M, Wuchty S, Aloy P, Uetz P. A second-generation protein-protein interaction network of Helicobacter pylori. Mol Cell Proteomics 2014; 13:1318-29. [PMID: 24627523 DOI: 10.1074/mcp.o113.033571] [Citation(s) in RCA: 50] [Impact Index Per Article: 5.0] [Indexed: 02/01/2023] Open
Abstract
Helicobacter pylori infections cause gastric ulcers and play a major role in the development of gastric cancer. In 2001, the first protein interactome was published for this species, revealing over 1500 binary protein interactions resulting from 261 yeast two-hybrid screens. Here we roughly double the number of previously published interactions using an ORFeome-based, proteome-wide yeast two-hybrid screening strategy. We identified a total of 1515 protein-protein interactions, of which 1461 are new. The integration of all the interactions reported in H. pylori results in 3004 unique interactions that connect about 70% of its proteome. Excluding interactions of promiscuous proteins we derived from our new data a core network consisting of 908 interactions. We compared our data set to several other bacterial interactomes and experimentally benchmarked the conservation of interactions using 365 protein pairs (interologs) of E. coli of which one third turned out to be conserved in both species.
Affiliation(s)
- Roman Häuser
- German Cancer Research Center (Deutsches Krebsforschungszentrum), Technologiepark 3, Im Neuenheimer Feld 580, 69120 Heidelberg, Germany
245
Poux S, Magrane M, Arighi CN, Bridge A, O'Donovan C, Laiho K. Expert curation in UniProtKB: a case study on dealing with conflicting and erroneous data. Database: The Journal of Biological Databases and Curation 2014; 2014:bau016. [PMID: 24622611 PMCID: PMC3950660 DOI: 10.1093/database/bau016] [Citation(s) in RCA: 69] [Impact Index Per Article: 6.9] [Indexed: 11/13/2022]
Abstract
UniProtKB/Swiss-Prot provides expert curation with information extracted from literature and curator-evaluated computational analysis. As knowledgebases continue to play an increasingly important role in scientific research, a number of studies have evaluated their accuracy and revealed various errors. While some are curation errors, others are the result of incorrect information published in the scientific literature. By taking the example of sirtuin-5, a complex annotation case, we will describe the curation procedure of UniProtKB/Swiss-Prot and detail how we report conflicting information in the database. We will demonstrate the importance of collaboration between resources to ensure curation consistency and the value of contributions from the user community in helping maintain error-free resources. Database URL: www.uniprot.org
Affiliation(s)
- Sylvain Poux
- SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, 1 rue Michel Servet, 1211 Geneva 4, Switzerland, European Molecular Biology Laboratory (EMBL), European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK, Protein Information Resource, University of Delaware, 15 Innovation Way, Suite 205, Newark, DE 19711, USA and Protein Information Resource, Georgetown University Medical Center, 3300 Whitehaven Street North West, Suite 1200, Washington, DC 20007, USA
|
246
|
Barbier M, Damron FH, Bielecki P, Suárez-Diez M, Puchałka J, Albertí S, dos Santos VM, Goldberg JB. From the environment to the host: re-wiring of the transcriptome of Pseudomonas aeruginosa from 22°C to 37°C. PLoS One 2014; 9:e89941. [PMID: 24587139 PMCID: PMC3933690 DOI: 10.1371/journal.pone.0089941] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Received: 10/26/2013] [Accepted: 01/25/2014] [Indexed: 11/18/2022] Open
Abstract
Pseudomonas aeruginosa is a highly versatile opportunistic pathogen capable of colonizing multiple ecological niches. This bacterium is responsible for a wide range of both acute and chronic infections in a variety of hosts. The success of this microorganism relies on its ability to adapt to environmental changes and re-program its regulatory and metabolic networks. The study of P. aeruginosa adaptation to temperature is crucial to understanding the pathogenesis upon infection of its mammalian host. We examined the effects of growth temperature on the transcriptome of P. aeruginosa PAO1. Microarray analysis of PAO1 grown in Lysogeny broth at mid-exponential phase at 22°C and 37°C revealed that temperature changes are responsible for the differential transcriptional regulation of 6.4% of the genome. Major alterations were observed in bacterial metabolism, replication, and nutrient acquisition. Quorum-sensing and exoproteins secreted by type I, II, and III secretion systems, involved in the adaptation of P. aeruginosa to the mammalian host during infection, were up-regulated at 37°C compared to 22°C. Genes encoding arginine degradation enzymes were highly up-regulated at 22°C, together with the genes involved in the synthesis of pyoverdine. However, genes involved in pyochelin biosynthesis were up-regulated at 37°C. We observed that the changes in expression of P. aeruginosa siderophores correlated with an overall increase in extracellular Fe²⁺ concentration at 37°C and a peak in extracellular Fe³⁺ concentration at 22°C. This suggests a distinct change in iron acquisition strategies when the bacterium switches from the external environment to the host. Our work identifies global changes in bacterial metabolism and nutrient acquisition induced by growth at different temperatures. Overall, this study identifies factors that are regulated in genome-wide adaptation processes and discusses how this life-threatening pathogen responds to temperature.
Affiliation(s)
- Mariette Barbier
- Department of Microbiology, Immunology, and Cancer Biology, University of Virginia, Charlottesville, Virginia, United States of America
| | - F. Heath Damron
- Department of Microbiology, Immunology, and Cancer Biology, University of Virginia, Charlottesville, Virginia, United States of America
| | - Piotr Bielecki
- Synthetic and Systems Biology Research Group, Helmholtz Centre for Infection Research, Braunschweig, Germany
| | - María Suárez-Diez
- Systems and Synthetic Biology, Wageningen University, Wageningen, Netherlands
| | - Jacek Puchałka
- Synthetic and Systems Biology Research Group, Helmholtz Centre for Infection Research, Braunschweig, Germany
| | - Sebastian Albertí
- IUNICS, University of the Balearic Islands, Palma de Mallorca, Spain
| | - Vitor Martins dos Santos
- Synthetic and Systems Biology Research Group, Helmholtz Centre for Infection Research, Braunschweig, Germany
- Systems and Synthetic Biology, Wageningen University, Wageningen, Netherlands
- LifeGlimmer GmbH, Berlin, Germany
| | - Joanna B. Goldberg
- Department of Microbiology, Immunology, and Cancer Biology, University of Virginia, Charlottesville, Virginia, United States of America
- Department of Pediatrics, and Center for Cystic Fibrosis Research, Emory University School of Medicine, Children’s Healthcare of Atlanta, Inc., Atlanta, Georgia, United States of America
|
247
|
Mangiola S, Young ND, Sternberg PW, Strube C, Korhonen PK, Mitreva M, Scheerlinck JP, Hofmann A, Jex AR, Gasser RB. Analysis of the transcriptome of adult Dictyocaulus filaria and comparison with Dictyocaulus viviparus, with a focus on molecules involved in host-parasite interactions. Int J Parasitol 2014; 44:251-61. [PMID: 24487001 DOI: 10.1016/j.ijpara.2013.12.003] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 10/01/2013] [Revised: 12/11/2013] [Accepted: 12/18/2013] [Indexed: 01/09/2023]
Abstract
Parasitic nematodes cause diseases of major economic importance in animals. Key representatives are species of Dictyocaulus (=lungworms), which cause bronchitis (=dictyocaulosis, commonly known as "husk") and have a major adverse impact on the health of livestock. In spite of their economic importance, very little is known about the immunomolecular biology of these parasites. Here, we conducted a comprehensive investigation of the adult transcriptome of Dictyocaulus filaria of small ruminants and compared it with that of Dictyocaulus viviparus of bovids. We then identified a subset of highly transcribed molecules inferred to be linked to host-parasite interactions, including cathepsin B peptidases, fatty-acid and/or retinol-binding proteins, β-galactoside-binding galectins, secreted protein 6 precursors, macrophage migration inhibitory factors, glutathione peroxidases, a transthyretin-like protein and a type 2-like cystatin. We then studied homologues of D. filaria type 2-like cystatin encoded in D. viviparus and 24 other nematodes representing seven distinct taxonomic orders, with a particular focus on their proposed role in immunomodulation and/or metabolism. Taken together, the present study provides new insights into nematode-host interactions. The findings lay the foundation for future experimental studies and could have implications for designing new interventions against lungworms and other parasitic nematodes. The future characterisation of the genomes of Dictyocaulus spp. should underpin these endeavours.
Affiliation(s)
- Stefano Mangiola
- Faculty of Veterinary Science, The University of Melbourne, Victoria, Australia
| | - Neil D Young
- Faculty of Veterinary Science, The University of Melbourne, Victoria, Australia
| | - Paul W Sternberg
- HHMI, Division of Biology, California Institute of Technology, Pasadena, CA, USA
| | - Christina Strube
- Institute for Parasitology, University of Veterinary Medicine Hannover, Hannover, Germany
| | - Pasi K Korhonen
- Faculty of Veterinary Science, The University of Melbourne, Victoria, Australia
| | - Makedonka Mitreva
- The Genome Institute, Washington University School of Medicine, St. Louis, MO, USA; Division of Infectious Diseases, Department of Medicine, Washington University School of Medicine, St. Louis, MO, USA
| | | | - Andreas Hofmann
- Faculty of Veterinary Science, The University of Melbourne, Victoria, Australia; Eskitis Institute for Cell & Molecular Therapies, Griffith University, Brisbane, Australia
| | - Aaron R Jex
- Faculty of Veterinary Science, The University of Melbourne, Victoria, Australia
| | - Robin B Gasser
- Faculty of Veterinary Science, The University of Melbourne, Victoria, Australia; Institute of Parasitology and Tropical Veterinary Medicine, Berlin, Germany
|
248
|
Lee TY, Chang CW, Lu CT, Cheng TH, Chang TH. Identification and characterization of lysine-methylated sites on histones and non-histone proteins. Comput Biol Chem 2014; 50:11-8. [PMID: 24560580 DOI: 10.1016/j.compbiolchem.2014.01.009] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Accepted: 12/23/2013] [Indexed: 01/17/2023]
Abstract
Protein methylation is a kind of post-translational modification (PTM) that typically takes place on lysine and arginine residues. Protein methylation is involved in many important biological processes, and most recent studies have focused on lysine methylation of histones due to its critical roles in regulating transcriptional repression and activation. Histones possess highly conserved sequences and are homologous in most species. However, there is much less sequence conservation among non-histone proteins. Therefore, mechanisms for identifying lysine-methylated sites may greatly differ between histones and non-histone proteins. Nevertheless, this point of view was not considered in previous studies. Here we constructed two support vector machine (SVM) models by using lysine-methylated data from histones and non-histone proteins for predictions of lysine-methylated sites. Numerous features, such as the amino acid composition (AAC) and accessible surface area (ASA), were used in the SVM models, and the predictive performance was evaluated using five-fold cross-validation. For histones, the predictive sensitivity was 85.62% and specificity was 80.32%. For non-histone proteins, the predictive sensitivity was 69.1% and specificity was 88.72%. Results showed that our model significantly improved the predictive accuracy for histones compared to previous approaches. In addition, features of the flanking region of lysine-methylated sites on histones and non-histone proteins were also characterized and are discussed. A gene ontology functional analysis of lysine-methylated proteins and correlations of lysine-methylated sites with other PTMs in histones were also analyzed in detail. Finally, a web server, MethK, was constructed to identify lysine-methylated sites. MethK is now available at http://csb.cse.yzu.edu.tw/MethK/.
Affiliation(s)
- Tzong-Yi Lee
- Department of Computer Science and Engineering, Yuan Ze University, Chung-Li, Taiwan
| | - Cheng-Wei Chang
- Department of Information Management, Hsing Wu University, New Taipei City, Taiwan
| | - Cheng-Tzung Lu
- Department of Computer Science and Engineering, Yuan Ze University, Chung-Li, Taiwan
| | - Tzu-Hsiu Cheng
- Department of Computer Science and Engineering, Yuan Ze University, Chung-Li, Taiwan
| | - Tzu-Hao Chang
- Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan
|
249
|
Predicting human protein subcellular locations by the ensemble of multiple predictors via protein-protein interaction network with edge clustering coefficients. PLoS One 2014; 9:e86879. [PMID: 24466278 PMCID: PMC3900678 DOI: 10.1371/journal.pone.0086879] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Received: 10/14/2013] [Accepted: 12/18/2013] [Indexed: 12/14/2022] Open
Abstract
One of the fundamental tasks in biology is to identify the functions of all proteins to reveal the primary machinery of a cell. Knowledge of the subcellular locations of proteins will provide key hints to reveal their functions and to understand the intricate pathways that regulate biological processes at the cellular level. Protein subcellular location prediction has been extensively studied in the past two decades. Many methods have been developed based on protein primary sequences as well as protein-protein interaction networks. In this paper, we propose to use the protein-protein interaction network as an infrastructure to integrate existing sequence-based predictors. When predicting the subcellular locations of a given protein, not only the protein itself but also all its interacting partners were considered. Unlike existing methods, our method requires neither comprehensive knowledge of the protein-protein interaction network nor experimentally annotated subcellular locations for most proteins in the network. In addition, our method can be used as a framework to integrate multiple predictors. Our method achieved an absolute-true rate of 56% on the human proteome, which is higher than that of state-of-the-art methods.
|
250
|
Moore CB, Wallace JR, Wolfe DJ, Frase AT, Pendergrass SA, Weiss KM, Ritchie MD. Low frequency variants, collapsed based on biological knowledge, uncover complexity of population stratification in 1000 genomes project data. PLoS Genet 2013; 9:e1003959. [PMID: 24385916 PMCID: PMC3873241 DOI: 10.1371/journal.pgen.1003959] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Received: 02/07/2013] [Accepted: 10/01/2013] [Indexed: 12/13/2022] Open
Abstract
Analyses investigating low frequency variants have the potential for explaining additional genetic heritability of many complex human traits. However, the natural frequencies of rare variation between human populations strongly confound genetic analyses. We have applied a novel collapsing method to identify biological features with low frequency variant burden differences in thirteen populations sequenced by the 1000 Genomes Project. Our flexible collapsing tool utilizes expert biological knowledge from multiple publicly available database sources to direct feature selection. Variants were collapsed according to genetically driven features, such as evolutionary conserved regions, regulatory regions, genes, and pathways. We have conducted an extensive comparison of low frequency variant burden differences (MAF<0.03) between populations from 1000 Genomes Project Phase I data. We found that on average 26.87% of gene bins, 35.47% of intergenic bins, 42.85% of pathway bins, 14.86% of ORegAnno regulatory bins, and 5.97% of evolutionary conserved regions show statistically significant differences in low frequency variant burden across populations from the 1000 Genomes Project. The proportion of bins with significant differences in low frequency burden depends on the ancestral similarity of the two populations compared and types of features tested. Even closely related populations had notable differences in low frequency burden, but fewer differences than populations from different continents. Furthermore, conserved or functionally relevant regions had fewer significant differences in low frequency burden than regions under less evolutionary constraint. This degree of low frequency variant differentiation across diverse populations and feature elements highlights the critical importance of considering population stratification in the new era of DNA sequencing and low frequency variant genomic analyses.
Low frequency variants are likely to play an important role in uncovering complex trait heritability; however, they are often continent or population specific. This specificity complicates genetic analyses investigating low frequency variants for two reasons: low frequency variant signals in an association test are often difficult to generalize beyond a single population or continental group, and there is an increase in false positive results in association analyses due to underlying population stratification. In order to reveal the magnitude of low frequency population stratification, we performed pairwise population comparisons using the 1000 Genomes Project Phase I data to investigate differences in low frequency variant burden across multiple biological features. We found that low frequency variant confounding is much more prevalent than one might expect, even within continental groups. The proportion of significant differences in low frequency variant burden was also dependent on the region of interest; for example, annotated regulatory regions showed fewer low frequency burden differences between populations than intergenic regions. Knowledge of population structure and the genomic landscape in a region of interest are important factors in determining the extent of confounding due to population stratification in a low frequency genomic analysis.
Affiliation(s)
- Carrie B. Moore
- Center for Human Genetic Research, Department of Molecular Physiology and Biophysics, Vanderbilt University, Nashville, Tennessee, United States of America
- Center for Systems Genomics, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, Eberly College of Science, The Huck Institutes of the Life Sciences, University Park, Pennsylvania, United States of America
| | - John R. Wallace
- Center for Systems Genomics, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, Eberly College of Science, The Huck Institutes of the Life Sciences, University Park, Pennsylvania, United States of America
| | - Daniel J. Wolfe
- Center for Systems Genomics, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, Eberly College of Science, The Huck Institutes of the Life Sciences, University Park, Pennsylvania, United States of America
| | - Alex T. Frase
- Center for Systems Genomics, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, Eberly College of Science, The Huck Institutes of the Life Sciences, University Park, Pennsylvania, United States of America
| | - Sarah A. Pendergrass
- Center for Systems Genomics, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, Eberly College of Science, The Huck Institutes of the Life Sciences, University Park, Pennsylvania, United States of America
| | - Kenneth M. Weiss
- Department of Anthropology, The Pennsylvania State University, University Park, Pennsylvania, United States of America
| | - Marylyn D. Ritchie
- Center for Systems Genomics, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, Eberly College of Science, The Huck Institutes of the Life Sciences, University Park, Pennsylvania, United States of America
|