1
Xing Y, Liu C, Zheng C, Li H, Yin H. Evolution and function analysis of auxin response factors reveal the molecular basis of the developed root system of Zygophyllum xanthoxylum. BMC Plant Biology 2024; 24:81. [PMID: 38302884 PMCID: PMC10835889 DOI: 10.1186/s12870-023-04717-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2023] [Accepted: 12/29/2023] [Indexed: 02/03/2024]
Abstract
BACKGROUND As a xerophytic shrub, Zygophyllum xanthoxylum forms a developed root system dominated by lateral roots, which is one of its effective strategies for adapting to desert habitats. However, the molecular mechanism of lateral root formation in Z. xanthoxylum is still unclear. Auxin response factors (ARFs) are a master family of transcription factors (TFs) in auxin-mediated biological processes, including root growth and development. RESULTS To determine the relationship between ARFs and root system formation in Z. xanthoxylum, a total of 30 potential ZxARF genes were first identified, and their classifications, evolutionary relationships, duplication events and conserved domains were characterized. A total of 107 ARF protein sequences from species ranging from algae to higher plants, including Z. xanthoxylum, were split into three clades (A, B, and C), consistent with previous studies. Comparative analysis of ARFs between xerophytes and mesophytes showed that the A-ARFs of xerophytes expanded considerably more than those of mesophytes. Furthermore, in this clade, ZxARF5b and ZxARF8b have lost the important B3 DNA-binding domain partly and completely, respectively, suggesting that both proteins may be more functional in activating transcription by dimerization with AUX/IAA repressors. qRT-PCR results showed that all A-ZxARFs are highly expressed in the roots of Z. xanthoxylum and were significantly induced by drought stress. Among these A-ZxARFs, over-expression assays showed that ZxARF7c and ZxARF7d play positive roles in lateral root formation. CONCLUSION This study provides the first comprehensive overview of ZxARFs and highlights the importance of A-ZxARFs in lateral root development.
Affiliation(s)
- Ying Xing
- State Key Laboratory of Herbage Improvement and Grassland Agro-Ecosystems; Key Laboratory of Grassland Livestock Industry Innovation, Ministry of Agriculture and Rural Affairs, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou, 730000, People's Republic of China
- Chunli Liu
- State Key Laboratory of Herbage Improvement and Grassland Agro-Ecosystems; Key Laboratory of Grassland Livestock Industry Innovation, Ministry of Agriculture and Rural Affairs, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou, 730000, People's Republic of China
- Chuan Zheng
- State Key Laboratory of Herbage Improvement and Grassland Agro-Ecosystems; Key Laboratory of Grassland Livestock Industry Innovation, Ministry of Agriculture and Rural Affairs, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou, 730000, People's Republic of China
- Hong Li
- State Key Laboratory of Herbage Improvement and Grassland Agro-Ecosystems; Key Laboratory of Grassland Livestock Industry Innovation, Ministry of Agriculture and Rural Affairs, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou, 730000, People's Republic of China
- Hongju Yin
- State Key Laboratory of Herbage Improvement and Grassland Agro-Ecosystems; Key Laboratory of Grassland Livestock Industry Innovation, Ministry of Agriculture and Rural Affairs, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou, 730000, People's Republic of China.
2
Canavati C, Sherill-Rofe D, Kamal L, Bloch I, Zahdeh F, Sharon E, Terespolsky B, Allan IA, Rabie G, Kawas M, Kassem H, Avraham KB, Renbaum P, Levy-Lahad E, Kanaan M, Tabach Y. Using multi-scale genomics to associate poorly annotated genes with rare diseases. Genome Med 2024; 16:4. [PMID: 38178268 PMCID: PMC10765705 DOI: 10.1186/s13073-023-01276-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2023] [Accepted: 12/15/2023] [Indexed: 01/06/2024]
Abstract
BACKGROUND Next-generation sequencing (NGS) has significantly transformed the landscape of identifying disease-causing genes associated with genetic disorders. However, a substantial portion of sequenced patients remains undiagnosed. This may be attributed not only to the challenges posed by harder-to-detect variants, such as non-coding and structural variations but also to the existence of variants in genes not previously associated with the patient's clinical phenotype. This study introduces EvORanker, an algorithm that integrates unbiased data from 1,028 eukaryotic genomes to link mutated genes to clinical phenotypes. METHODS EvORanker utilizes clinical data, multi-scale phylogenetic profiling, and other omics data to prioritize disease-associated genes. It was evaluated on solved exomes and simulated genomes, compared with existing methods, and applied to 6260 knockout genes with mouse phenotypes lacking human associations. Additionally, EvORanker was made accessible as a user-friendly web tool. RESULTS In the analyzed exomic cohort, EvORanker accurately identified the "true" disease gene as the top candidate in 69% of cases and within the top 5 candidates in 95% of cases, consistent with results from the simulated dataset. Notably, EvORanker outperformed existing methods, particularly for poorly annotated genes. In the case of the 6260 knockout genes with mouse phenotypes, EvORanker linked 41% of these genes to observed human disease phenotypes. Furthermore, in two unsolved cases, EvORanker successfully identified DLGAP2 and LPCAT3 as disease candidates for previously uncharacterized genetic syndromes. CONCLUSIONS We highlight clade-based phylogenetic profiling as a powerful systematic approach for prioritizing potential disease genes. Our study showcases the efficacy of EvORanker in associating poorly annotated genes to disease phenotypes observed in patients. The EvORanker server is freely available at https://ccanavati.shinyapps.io/EvORanker/ .
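The "top candidate" and "top 5" figures quoted in this abstract are top-k hit rates over ranked candidate lists. A minimal sketch of that metric follows, assuming each solved case provides a ranked list of candidate genes and one known causal gene. The function name, the gene labels (G1, G2, ...) and the toy cases are hypothetical and are not taken from EvORanker.

```python
def top_k_hit_rate(cases, k):
    """Fraction of cases whose known causal gene is among the first k ranked candidates.

    `cases` is a list of (ranked_candidates, causal_gene) pairs.
    """
    if not cases:
        raise ValueError("at least one case is required")
    hits = sum(1 for ranked, causal in cases if causal in ranked[:k])
    return hits / len(cases)


# Hypothetical toy cohort: four solved cases with made-up gene labels.
toy_cases = [
    (["G1", "G2", "G3"], "G1"),  # causal gene ranked 1st
    (["G4", "G5", "G6"], "G5"),  # causal gene ranked 2nd
    (["G7", "G8", "G9"], "G9"),  # causal gene ranked 3rd
    (["G2", "G1", "G3"], "G2"),  # causal gene ranked 1st
]

top1 = top_k_hit_rate(toy_cases, 1)
top2 = top_k_hit_rate(toy_cases, 2)
top3 = top_k_hit_rate(toy_cases, 3)
```

On a real cohort the same function would be evaluated at k = 1 and k = 5 to obtain figures comparable to those reported above.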
Affiliation(s)
- Christina Canavati
- Department of Developmental Biology and Cancer Research, Institute of Medical Research - Israel-Canada, The Hebrew University of Jerusalem, Jerusalem, 9112102, Israel
- Molecular Genetics Lab, Istishari Arab Hospital, Ramallah, Palestine
- Dana Sherill-Rofe
- Department of Developmental Biology and Cancer Research, Institute of Medical Research - Israel-Canada, The Hebrew University of Jerusalem, Jerusalem, 9112102, Israel
- Lara Kamal
- Molecular Genetics Lab, Istishari Arab Hospital, Ramallah, Palestine
- Department of Human Molecular Genetics and Biochemistry, Faculty of Medicine and Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, 6997801, Israel
- Idit Bloch
- Department of Developmental Biology and Cancer Research, Institute of Medical Research - Israel-Canada, The Hebrew University of Jerusalem, Jerusalem, 9112102, Israel
- Fouad Zahdeh
- Medical Genetics Institute, Shaare Zedek Medical Center, Jerusalem, 91031, Israel
- Elad Sharon
- Department of Developmental Biology and Cancer Research, Institute of Medical Research - Israel-Canada, The Hebrew University of Jerusalem, Jerusalem, 9112102, Israel
- Batel Terespolsky
- Department of Developmental Biology and Cancer Research, Institute of Medical Research - Israel-Canada, The Hebrew University of Jerusalem, Jerusalem, 9112102, Israel
- Medical Genetics Institute, Shaare Zedek Medical Center, Jerusalem, 91031, Israel
- Islam Abu Allan
- Molecular Genetics Lab, Istishari Arab Hospital, Ramallah, Palestine
- Grace Rabie
- Hereditary Research Laboratory and Department of Life Sciences, Bethlehem University, Bethlehem, 72372, Palestine
- Mariana Kawas
- Hereditary Research Laboratory and Department of Life Sciences, Bethlehem University, Bethlehem, 72372, Palestine
- Hanin Kassem
- Molecular Genetics Lab, Istishari Arab Hospital, Ramallah, Palestine
- Karen B Avraham
- Department of Human Molecular Genetics and Biochemistry, Faculty of Medicine and Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, 6997801, Israel
- Paul Renbaum
- Medical Genetics Institute, Shaare Zedek Medical Center, Jerusalem, 91031, Israel
- Ephrat Levy-Lahad
- Medical Genetics Institute, Shaare Zedek Medical Center, Jerusalem, 91031, Israel
- Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem, 9112102, Israel
- Moien Kanaan
- Molecular Genetics Lab, Istishari Arab Hospital, Ramallah, Palestine
- Hereditary Research Laboratory and Department of Life Sciences, Bethlehem University, Bethlehem, 72372, Palestine
- Yuval Tabach
- Department of Developmental Biology and Cancer Research, Institute of Medical Research - Israel-Canada, The Hebrew University of Jerusalem, Jerusalem, 9112102, Israel.
3
Xu L, Li J, Gonzalez Ramos VM, Lyra C, Wiebenga A, Grigoriev IV, de Vries RP, Mäkelä MR, Peng M. Genome-wide prediction and transcriptome analysis of sugar transporters in four ascomycete fungi. Bioresource Technology 2024; 391:130006. [PMID: 37952592 DOI: 10.1016/j.biortech.2023.130006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/14/2023] [Revised: 11/09/2023] [Accepted: 11/09/2023] [Indexed: 11/14/2023]
Abstract
The import of plant-derived small sugars by sugar transporters (STs) has received increasing interest due to its important biological role and great industrial potential. STs are important targets of genetic engineering to improve fungal plant biomass conversion. A comparative analysis of the genome-wide prevalence and transcriptomics of STs was performed in four filamentous fungi: Aspergillus niger, Aspergillus nidulans, Penicillium subrubescens and Trichoderma reesei. Using phylogenetic analysis and literature mining, their predicted STs were divided into ten subfamilies with putative sugar specificities assigned. In addition, transcriptome analysis revealed complex expression profiles among different ST subfamilies and fungal species, indicating sophisticated transcriptome regulation and functional diversity of fungal STs. Several STs showed strong co-expression with other genes involved in sugar utilization, encoding CAZymes and sugar catabolic enzymes. This study provides new insights into the diversity of STs at the genomic/transcriptomic level, facilitating their biochemical characterization and metabolic engineering.
Affiliation(s)
- Li Xu
- Fungal Physiology, Westerdijk Fungal Biodiversity Institute, & Fungal Molecular Physiology, Utrecht University, Uppsalalaan 8, 3584 CT Utrecht, The Netherlands.
- Jiajia Li
- Fungal Physiology, Westerdijk Fungal Biodiversity Institute, & Fungal Molecular Physiology, Utrecht University, Uppsalalaan 8, 3584 CT Utrecht, The Netherlands.
- Christina Lyra
- Department of Microbiology, University of Helsinki, Viikinkaari 9, 00014 Helsinki, Finland.
- Ad Wiebenga
- Fungal Physiology, Westerdijk Fungal Biodiversity Institute, & Fungal Molecular Physiology, Utrecht University, Uppsalalaan 8, 3584 CT Utrecht, The Netherlands
- Igor V Grigoriev
- USA Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Rd., Berkeley, CA 94720, USA; Department of Plant and Microbial Biology, University of California Berkeley, Berkeley, CA 94720, USA.
- Ronald P de Vries
- Fungal Physiology, Westerdijk Fungal Biodiversity Institute, & Fungal Molecular Physiology, Utrecht University, Uppsalalaan 8, 3584 CT Utrecht, The Netherlands.
- Miia R Mäkelä
- Department of Microbiology, University of Helsinki, Viikinkaari 9, 00014 Helsinki, Finland.
- Mao Peng
- Fungal Physiology, Westerdijk Fungal Biodiversity Institute, & Fungal Molecular Physiology, Utrecht University, Uppsalalaan 8, 3584 CT Utrecht, The Netherlands.
4
Thoben C, Pucker B. Automatic annotation of the bHLH gene family in plants. BMC Genomics 2023; 24:780. [PMID: 38102570 PMCID: PMC10722790 DOI: 10.1186/s12864-023-09877-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2023] [Accepted: 12/06/2023] [Indexed: 12/17/2023]
Abstract
BACKGROUND The bHLH transcription factor family is named after the basic helix-loop-helix (bHLH) domain that is a characteristic element of their members. Understanding the function and characteristics of this family is important for the examination of a wide range of functions. As the availability of genome sequences and transcriptome assemblies has increased significantly, the need for automated solutions that provide reliable functional annotations is emphasised. RESULTS A phylogenetic approach was adapted for the automatic identification and functional annotation of the bHLH transcription factor family. The bHLH_annotator, designed for the automated functional annotation of bHLHs, was implemented in Python3. Sequences of bHLHs described in literature were collected to represent the full diversity of bHLH sequences. Previously described orthologs form the basis for the functional annotation assignment to candidates which are also screened for bHLH-specific motifs. The pipeline was successfully deployed on the two Arabidopsis thaliana accessions Col-0 and Nd-1, the monocot species Dioscorea dumetorum, and a transcriptome assembly of Croton tiglium. Depending on the applied search parameters for the initial candidates in the pipeline, species-specific candidates or members of the bHLH family which experienced domain loss can be identified. CONCLUSIONS The bHLH_annotator allows a detailed and systematic investigation of the bHLH family in land plant species and classifies candidates based on bHLH-specific characteristics, which distinguishes the pipeline from other established functional annotation tools. This provides the basis for the functional annotation of the bHLH family in land plants and the systematic examination of a wide range of functions regulated by this transcription factor family.
Affiliation(s)
- Corinna Thoben
- Plant Biotechnology and Bioinformatics, Institute of Plant Biology & Braunschweig Integrated, Centre of Systems Biology (BRICS), TU Braunschweig, Braunschweig, Germany
- Boas Pucker
- Plant Biotechnology and Bioinformatics, Institute of Plant Biology & Braunschweig Integrated, Centre of Systems Biology (BRICS), TU Braunschweig, Braunschweig, Germany.
5
Spiers AJ, Dorfmueller HC, Jerdan R, McGregor J, Nicoll A, Steel K, Cameron S. Bioinformatics characterization of BcsA-like orphan proteins suggest they form a novel family of pseudomonad cyclic-β-glucan synthases. PLoS One 2023; 18:e0286540. [PMID: 37267309 PMCID: PMC10237404 DOI: 10.1371/journal.pone.0286540] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2022] [Accepted: 05/18/2023] [Indexed: 06/04/2023]
Abstract
Bacteria produce a variety of polysaccharides with functional roles in cell surface coating, surface and host interactions, and biofilms. We have identified an 'Orphan' bacterial cellulose synthase catalytic subunit (BcsA)-like protein found in four model pseudomonads, P. aeruginosa PA01, P. fluorescens SBW25, P. putida KT2440 and P. syringae pv. tomato DC3000. Pairwise alignments indicated that the Orphan and BcsA proteins shared less than 41% sequence identity, suggesting they may not have the same structural folds or function. We identified 112 Orphans among soil and plant-associated pseudomonads as well as in phytopathogenic and human opportunistic pathogenic strains. The wide distribution of these highly conserved proteins suggests they form a novel family of synthases producing a different polysaccharide. In silico analysis, including sequence comparisons, secondary structure and topology predictions, and protein structural modelling, revealed a two-domain transmembrane ovoid-like structure for the Orphan protein, with a periplasmic glycosyl hydrolase family GH17 domain linked via a transmembrane region to a cytoplasmic glycosyltransferase family GT2 domain. We suggest the GT2 domain synthesises β-(1,3)-glucan that is transferred to the GH17 domain, where it is cleaved and cyclised to produce cyclic-β-(1,3)-glucan (CβG). Our structural models are consistent with enzymatic characterisation and recent molecular simulations of the PaPA01 and PpKT2440 GH17 domains. They also provide a functional explanation linking PaPAK and PaPA14 Orphan (also known as NdvB) transposon mutants with CβG production and biofilm-associated antibiotic resistance. Importantly, cyclic glucans are also involved in osmoregulation, plant infection and induced systemic suppression, and our findings suggest this novel family of CβG synthases may provide a similar range of adaptive responses for pseudomonads.
Affiliation(s)
- Andrew J. Spiers
- School of Applied Sciences, Abertay University, Dundee, United Kingdom
- Helge C. Dorfmueller
- Division of Molecular Microbiology, School of Life Sciences, University of Dundee, Dundee, United Kingdom
- Robyn Jerdan
- School of Applied Sciences, Abertay University, Dundee, United Kingdom
- Jessica McGregor
- Nuffield Research Placement Students, School of Applied Sciences, Abertay University, Dundee, United Kingdom
- Abbie Nicoll
- Nuffield Research Placement Students, School of Applied Sciences, Abertay University, Dundee, United Kingdom
- Kenzie Steel
- Nuffield Research Placement Students, School of Applied Sciences, Abertay University, Dundee, United Kingdom
- Scott Cameron
- School of Applied Sciences, Abertay University, Dundee, United Kingdom
6
The Structure of Evolutionary Model Space for Proteins across the Tree of Life. Biology 2023; 12:biology12020282. [PMID: 36829559 PMCID: PMC9952988 DOI: 10.3390/biology12020282] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2022] [Revised: 02/04/2023] [Accepted: 02/08/2023] [Indexed: 02/12/2023]
Abstract
The factors that determine the relative rates of amino acid substitution during protein evolution are complex and known to vary among taxa. We estimated relative exchangeabilities for pairs of amino acids from clades spread across the tree of life and assessed the historical signal in the distances among these clade-specific models. We separately trained these models on collections of arbitrarily selected protein alignments and on ribosomal protein alignments. In both cases, we found a clear separation between the models trained using multiple sequence alignments from bacterial clades and the models trained on archaeal and eukaryotic data. We assessed the predictive power of our novel clade-specific models of sequence evolution by asking whether fit to the models could be used to identify the source of multiple sequence alignments. Model fit was generally able to correctly classify protein alignments at the level of domain (bacterial versus archaeal), but the accuracy of classification at finer scales was much lower. The only exceptions to this were the relatively high classification accuracy for two archaeal lineages: Halobacteriaceae and Thermoprotei. Genomic GC content had a modest impact on relative exchangeabilities despite having a large impact on amino acid frequencies. Relative exchangeabilities involving aromatic residues exhibited the largest differences among models. There were a small number of exchangeabilities that exhibited large differences in comparisons among major clades and between generalized models and ribosomal protein models. Taken as a whole, these results reveal that a small number of relative exchangeabilities are responsible for much of the structure of the "model space" for protein sequence evolution. The clade-specific models we generated may be useful tools for protein phylogenetics, and the structure of evolutionary model space that they revealed has implications for phylogenomic inference across the tree of life.
7
Fang Y, Yang Y, Liu C. New feature extraction from phylogenetic profiles improved the performance of pathogen-host interactions. Front Cell Infect Microbiol 2022; 12:931072. [PMID: 35982784 PMCID: PMC9378789 DOI: 10.3389/fcimb.2022.931072] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2022] [Accepted: 07/11/2022] [Indexed: 11/13/2022]
Abstract
Motivation: Understanding pathogen-host interactions (PHIs) is an essential and challenging research problem because it can reveal the mechanisms of molecular interaction between different organisms. The experimental exploration of PHIs is time-consuming and labor-intensive, and computational approaches are playing a crucial role in discovering new PHIs between different organisms. Although many machine learning (ML)-based methods have been proposed to predict PHIs, these methods are all based on structure-based information extracted from the sequence. The selection of feature values is critical to improving the performance of ML-based PHI prediction. Results: This work proposes a new method to extract features from phylogenetic profiles as evolutionary information for predicting PHIs. The performance of our approach is better than that of structure-based and ML-based PHI prediction methods. The five different extraction models proposed by our approach, combined with structure-based information, significantly improved the performance of PHI prediction, suggesting that combining phylogenetic profile features with structure-based methods could be applied to the exploration of PHIs and the discovery of new biological relationships. Availability and implementation: The KPP method is implemented in the Java language and is available at https://github.com/yangfangs/KPP.
Affiliation(s)
- Yang Fang
- Key Laboratory of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, China
- Department of Laboratory Medicine, Third Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- Yi Yang
- Key Laboratory of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, China
- *Correspondence: Chengcheng Liu; Yi Yang
- Chengcheng Liu
- State Key Laboratory of Oral Diseases, Department of Periodontics, National Clinical Research Center for Oral Diseases, West China School & Hospital of Stomatology, Sichuan University, Chengdu, China
- *Correspondence: Chengcheng Liu; Yi Yang
8
Fang Y, Li M, Li X, Yang Y. GFICLEE: ultrafast tree-based phylogenetic profile method inferring gene function at the genomic-wide level. BMC Genomics 2021; 22:774. [PMID: 34715785 PMCID: PMC8557005 DOI: 10.1186/s12864-021-08070-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/16/2021] [Accepted: 10/10/2021] [Indexed: 11/25/2022]
Abstract
Background Phylogenetic profiling is widely used to predict novel members of large protein complexes and biological pathways. Although methods combined with phylogenetic trees have significantly improved prediction accuracy, computational efficiency is still an issue that limits their genome-wide application. Results Here we introduce a new tree-based phylogenetic profiling algorithm named GFICLEE, which infers common single and continuous loss (SCL) events in the evolutionary patterns. We validated our algorithm with human pathways from three databases and compared its computational efficiency with that of current tree-based methods on genome datasets of 10 different scales. Our algorithm has better predictive performance with high computational efficiency. Conclusions GFICLEE is a new method to infer genome-wide gene function. The accuracy and computational efficiency of GFICLEE make it possible to explore gene functions at the genome-wide level on a personal computer. Supplementary Information The online version contains supplementary material available at 10.1186/s12864-021-08070-7.
Affiliation(s)
- Yang Fang
- Key Laboratory of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, People's Republic of China
- Menglong Li
- College of Chemistry, Sichuan University, Chengdu, People's Republic of China
- Xufeng Li
- Key Laboratory of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, People's Republic of China
- Yi Yang
- Key Laboratory of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, People's Republic of China.
9
Unterman I, Bloch I, Cazacu S, Kazimirsky G, Ben-Zeev B, Berman BP, Brodie C, Tabach Y. Expanding the MECP2 network using comparative genomics reveals potential therapeutic targets for Rett syndrome. eLife 2021; 10:e67085. [PMID: 34355696 PMCID: PMC8346285 DOI: 10.7554/elife.67085] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 02/01/2021] [Accepted: 07/23/2021] [Indexed: 12/12/2022]
Abstract
Inactivating mutations in the Methyl-CpG Binding Protein 2 (MECP2) gene are the main cause of Rett syndrome (RTT). Despite extensive research into MECP2 function, no treatments for RTT are currently available. Here, we used an evolutionary genomics approach to construct an unbiased MECP2 gene network, using 1028 eukaryotic genomes to prioritize proteins with strong co-evolutionary signatures with MECP2. Focusing on proteins targeted by FDA-approved drugs led to three promising targets, two of which were previously linked to MECP2 function (IRAK, KEAP1) and one that was not (EPOR). The drugs targeting these three proteins (Pacritinib, DMF, and EPO) were able to rescue different phenotypes of MECP2 inactivation in cultured human neural cell types, and appeared to converge on Nuclear Factor Kappa B (NF-κB) signaling in inflammation. This study highlights the potential of comparative genomics to accelerate drug discovery, and yields potential new avenues for the treatment of RTT.
Affiliation(s)
- Irene Unterman
- Department of Developmental Biology and Cancer Research, Institute for Medical Research Israel-Canada, Jerusalem, Israel
- Idit Bloch
- Department of Developmental Biology and Cancer Research, Institute for Medical Research Israel-Canada, Jerusalem, Israel
- Simona Cazacu
- Hermelin Brain Tumor Center, Henry Ford Hospital, Detroit, United States
- Gila Kazimirsky
- The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan, Israel
- Bruria Ben-Zeev
- Edmond and Lily Safra Children's Hospital, Chaim Sheba Medical Center, Ramat Gan, Israel
- Benjamin P Berman
- Department of Developmental Biology and Cancer Research, Institute for Medical Research Israel-Canada, Jerusalem, Israel
- Chaya Brodie
- The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan, Israel
- Yuval Tabach
- Department of Developmental Biology and Cancer Research, Institute for Medical Research Israel-Canada, Jerusalem, Israel
10
Linard B, Ebersberger I, McGlynn SE, Glover N, Mochizuki T, Patricio M, Lecompte O, Nevers Y, Thomas PD, Gabaldón T, Sonnhammer E, Dessimoz C, Uchiyama I. Ten Years of Collaborative Progress in the Quest for Orthologs. Mol Biol Evol 2021; 38:3033-3045. [PMID: 33822172 PMCID: PMC8321534 DOI: 10.1093/molbev/msab098] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 10/27/2020] [Revised: 02/07/2021] [Accepted: 04/01/2021] [Indexed: 12/19/2022]
Abstract
Accurate determination of the evolutionary relationships between genes is a foundational challenge in biology. Homology (evolutionary relatedness) is in many cases readily determined based on sequence similarity analysis. By contrast, determining whether two genes directly descended from a common ancestor by a speciation event (orthologs) or a duplication event (paralogs) is more challenging, yet provides critical information on the history of a gene. Since 2009, this task has been the focus of the Quest for Orthologs (QFO) Consortium. The sixth QFO meeting took place in Okazaki, Japan in conjunction with the 67th National Institute for Basic Biology conference. Here, we report recent advances, applications, and oncoming challenges that were discussed during the conference. Steady progress has been made toward standardization and scalability of new and existing tools. A feature of the conference was the presentation of a panel of accessible tools for phylogenetic profiling and several developments to bring orthology beyond the gene unit, from domains to networks. This meeting brought to light several challenges to come: leveraging orthology computations to get the most out of the incoming avalanche of genomic data, integrating orthology from domain to biological network levels, building better gene models, and adapting orthology approaches to the broad evolutionary and genomic diversity recognized in different forms of life and viruses.
Affiliation(s)
- Benjamin Linard
- LIRMM, University of Montpellier, CNRS, Montpellier, France; SPYGEN, Le Bourget-du-Lac, France
- Ingo Ebersberger
- Institute of Cell Biology and Neuroscience, Goethe University Frankfurt, Frankfurt, Germany; Senckenberg Biodiversity and Climate Research Centre (S-BIKF), Frankfurt, Germany; LOEWE Center for Translational Biodiversity Genomics (TBG), Frankfurt, Germany
- Shawn E McGlynn
- Earth-Life Science Institute, Tokyo Institute of Technology, Meguro, Tokyo, Japan; Blue Marble Space Institute of Science, Seattle, WA, USA
- Natasha Glover
- Swiss Institute of Bioinformatics, Lausanne, Switzerland; Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland; Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- Tomohiro Mochizuki
- Earth-Life Science Institute, Tokyo Institute of Technology, Meguro, Tokyo, Japan
- Mateus Patricio
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
- Odile Lecompte
- Department of Computer Science, ICube, UMR 7357, University of Strasbourg, CNRS, Fédération de Médecine Translationnelle de Strasbourg, Strasbourg, France
- Yannis Nevers
- Swiss Institute of Bioinformatics, Lausanne, Switzerland; Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland; Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- Paul D Thomas
- Division of Bioinformatics, Department of Preventive Medicine, University of Southern California, Los Angeles, CA, USA
- Toni Gabaldón
- Barcelona Supercomputing Centre (BCS-CNS), Jordi Girona, Barcelona, Spain; Institute for Research in Biomedicine (IRB), The Barcelona Institute of Science and Technology (BIST), Barcelona, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
- Erik Sonnhammer
- Science for Life Laboratory, Department of Biochemistry and Biophysics, Stockholm University, Solna, Sweden
- Christophe Dessimoz
- Swiss Institute of Bioinformatics, Lausanne, Switzerland; Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland; Department of Computational Biology, University of Lausanne, Lausanne, Switzerland; Department of Computer Science, University College London, London, United Kingdom; Department of Genetics, Evolution and Environment, University College London, London, United Kingdom
- Ikuo Uchiyama
- Department of Theoretical Biology, National Institute for Basic Biology, National Institutes of Natural Sciences, Okazaki, Aichi, Japan
| | | |
Collapse
|
11
|
Bloch I, Sherill-Rofe D, Stupp D, Unterman I, Beer H, Sharon E, Tabach Y. Optimization of co-evolution analysis through phylogenetic profiling reveals pathway-specific signals. Bioinformatics 2021; 36:4116-4125. [PMID: 32353123 DOI: 10.1093/bioinformatics/btaa281] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 11/04/2019] [Revised: 04/17/2020] [Accepted: 04/23/2020] [Indexed: 12/11/2022]
Abstract
SUMMARY The exponential growth in available genomic data is expected to reach full sequencing of a million genomes in the coming decade. Improving and developing methods to analyze these genomes and to reveal their utility is of major interest in a wide variety of fields, such as comparative and functional genomics, evolution and bioinformatics. Phylogenetic profiling is an established method for predicting functional interactions between proteins based on similarities in their evolutionary patterns across species. Proteins that function together (i.e. generate complexes, interact in the same pathways or improve adaptation to environmental niches) tend to show coordinated evolution across the tree of life. The normalized phylogenetic profiling (NPP) method takes into account minute changes in proteins across species to identify protein co-evolution. Despite the success of this method, it is still not clear what set of parameters is required for optimal use of co-evolution in predicting functional interactions. Moreover, it is not clear if pathway evolution or function should direct parameter choice. Here, we create a reliable and usable NPP construction pipeline. We explore the effect of parameter selection on functional interaction prediction using NPP from 1028 genomes, both separately and in various value combinations. We identify several parameter sets that optimize performance for pathways with certain biological annotation. This work reveals the importance of choosing the right parameters for optimized function prediction based on a biological context. AVAILABILITY AND IMPLEMENTATION Source code and documentation are available on GitHub: https://github.com/iditam/CompareNPPs. CONTACT yuvaltab@ekmd.huji.ac.il. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
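As a rough illustration of the idea behind a normalized phylogenetic profile, the sketch below scales each gene's best-hit score by its self-hit score, z-scores every species column, and compares genes by Pearson correlation. The scaling and z-scoring steps, the function names `normalize_profiles` and `pearson`, and the toy scores are assumptions made for illustration only; they are not taken from the CompareNPPs pipeline, whose parameter choices are the subject of the paper.

```python
import math
from statistics import mean, pstdev


def normalize_profiles(bitscores, self_scores):
    """Return a z-scored profile per gene (hypothetical helper, not the NPP code).

    bitscores:   {gene: [best-hit score in each species]}
    self_scores: {gene: score of the gene against itself}
    """
    # Scale by the self-hit score so that long proteins do not dominate.
    scaled = {
        gene: [s / self_scores[gene] for s in scores]
        for gene, scores in bitscores.items()
    }
    genes = list(scaled)
    n_species = len(scaled[genes[0]])
    normalized = {gene: [] for gene in genes}
    # Z-score each species column so distant and close species are comparable.
    for j in range(n_species):
        column = [scaled[g][j] for g in genes]
        mu, sigma = mean(column), pstdev(column)
        for g in genes:
            z = (scaled[g][j] - mu) / sigma if sigma > 0 else 0.0
            normalized[g].append(z)
    return normalized


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is flat)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


# Toy data: three made-up genes scored against four made-up species.
bitscores = {
    "geneA": [200, 180, 20, 10],
    "geneB": [400, 350, 50, 30],
    "geneC": [30, 20, 280, 300],
}
self_scores = {"geneA": 220, "geneB": 450, "geneC": 320}

profiles = normalize_profiles(bitscores, self_scores)
print("A vs B:", round(pearson(profiles["geneA"], profiles["geneB"]), 3))
print("A vs C:", round(pearson(profiles["geneA"], profiles["geneC"]), 3))
```

In this toy set, geneA and geneB are retained in the same species and so correlate strongly, whereas geneC shows the opposite pattern. Real analyses would use many more genes and species, and the choice of normalization is exactly the kind of parameter the abstract describes as context dependent.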
Affiliation(s)
- Idit Bloch
  - Department of Developmental Biology and Cancer Research, Institute for Medical Research Israel-Canada, Hebrew University of Jerusalem, Jerusalem 9112102, Israel
- Dana Sherill-Rofe
  - Department of Developmental Biology and Cancer Research, Institute for Medical Research Israel-Canada, Hebrew University of Jerusalem, Jerusalem 9112102, Israel
- Doron Stupp
  - Department of Developmental Biology and Cancer Research, Institute for Medical Research Israel-Canada, Hebrew University of Jerusalem, Jerusalem 9112102, Israel
- Irene Unterman
  - Department of Developmental Biology and Cancer Research, Institute for Medical Research Israel-Canada, Hebrew University of Jerusalem, Jerusalem 9112102, Israel
- Hodaya Beer
  - Department of Developmental Biology and Cancer Research, Institute for Medical Research Israel-Canada, Hebrew University of Jerusalem, Jerusalem 9112102, Israel
- Elad Sharon
  - Department of Developmental Biology and Cancer Research, Institute for Medical Research Israel-Canada, Hebrew University of Jerusalem, Jerusalem 9112102, Israel
- Yuval Tabach
  - Department of Developmental Biology and Cancer Research, Institute for Medical Research Israel-Canada, Hebrew University of Jerusalem, Jerusalem 9112102, Israel
|
12
|
Chen Y, Klinkhamer PGL, Memelink J, Vrieling K. Diversity and evolution of cytochrome P450s of Jacobaea vulgaris and Jacobaea aquatica. BMC PLANT BIOLOGY 2020; 20:342. [PMID: 32689941 PMCID: PMC7372880 DOI: 10.1186/s12870-020-02532-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 08/27/2019] [Accepted: 06/28/2020] [Indexed: 06/11/2023]
Abstract
BACKGROUND Collectively, plants produce a huge variety of secondary metabolites (SMs) which are involved in the adaptation of plants to biotic and abiotic stresses. The most characteristic feature of SMs is their striking inter- and intraspecific chemical diversity. Cytochrome P450 monooxygenases (CYPs) often play an important role in the biosynthesis of SMs and thus in the evolution of chemical diversity. Here we studied the diversity and evolution of CYPs of two Jacobaea species which contain a characteristic group of SMs, namely the pyrrolizidine alkaloids (PAs). RESULTS We retrieved CYPs from RNA-seq data of J. vulgaris and J. aquatica, resulting in 221 and 157 full-length CYP genes, respectively. The analyses of conserved motifs confirmed that Jacobaea CYP proteins share conserved motifs including the heme-binding signature, the PERF motif, the K-helix and the I-helix. KEGG annotation revealed that the CYPs assigned as being SM metabolic pathway genes were all from the CYP71 clan but no CYPs were assigned as being involved in alkaloid pathways. Phylogenetic analyses of full-length CYPs were conducted for the six largest CYP families of Jacobaea (CYP71, CYP76, CYP706, CYP82, CYP93 and CYP72) and were compared with CYPs of two other members of the Asteraceae, Helianthus annuus and Lactuca sativa, and with Arabidopsis thaliana. The phylogenetic trees showed strong lineage-specific diversification of CYPs, implying that the evolution of CYPs has been very fast even within the Asteraceae family. Only in the closely related species J. vulgaris and J. aquatica were CYPs often found in pairs, confirming a close relationship in the evolutionary history. CONCLUSIONS This study discovered 378 full-length CYPs in Jacobaea species, which can be used for future exploration of their functions, including possible involvement in PA biosynthesis and PA diversity.
Affiliation(s)
- Yangan Chen
  - Plant Ecology and Phytochemistry, Institute of Biology, Leiden University, Sylviusweg 72, P. O. Box 9505, 2300 RA, Leiden, The Netherlands
  - Plant Cell Physiology, Institute of Biology, Leiden University, Sylviusweg 72, P. O. Box 9505, 2300 RA, Leiden, The Netherlands
- Peter G L Klinkhamer
  - Plant Ecology and Phytochemistry, Institute of Biology, Leiden University, Sylviusweg 72, P. O. Box 9505, 2300 RA, Leiden, The Netherlands
- Johan Memelink
  - Plant Cell Physiology, Institute of Biology, Leiden University, Sylviusweg 72, P. O. Box 9505, 2300 RA, Leiden, The Netherlands
- Klaas Vrieling
  - Plant Ecology and Phytochemistry, Institute of Biology, Leiden University, Sylviusweg 72, P. O. Box 9505, 2300 RA, Leiden, The Netherlands
|
13
|
Aminoglycoside antibiotic resistance conferred by Hpa2 of MDR Acinetobacter baumannii: an unusual adaptation of a common histone acetyltransferase. Biochem J 2019; 476:795-808. [PMID: 30573651 DOI: 10.1042/bcj20180791] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 10/03/2018] [Revised: 12/18/2018] [Accepted: 12/20/2018] [Indexed: 12/20/2022]
Abstract
Antibiotic-resistant bacteria pose the greatest threat to human health. Among the list of such bacteria released by WHO, carbapenem-resistant Acinetobacter baumannii, for which almost no treatment exists, tops the list. A. baumannii is one of the most troublesome ESKAPE pathogens and mechanisms that have facilitated its rise as a successful pathogen are not well studied. Efforts in this direction have resulted in the identification of Hpa2-Ab, an uncharacterized histone acetyltransferase enzyme of GNAT superfamily. Here, we show that Hpa2-Ab confers resistance against aminoglycoside antibiotics using Escherichia coli DH5α strains in which Hpa2 gene is expressed. Resistivity for aminoglycoside antibiotics is demonstrated with the help of CLSI-2010 and KB tests. Isothermal titration calorimetry, MALDI and acetylation assays indicate that conferred resistance is an outcome of evolved antibiotic acetylation capacity in this. Hpa2 is known to acetylate nuclear molecules; however, here it is found to cross its boundary and participate in other functions. An array of biochemical and biophysical techniques were also used to study this protein, which demonstrates that Hpa2-Ab is intrinsically oligomeric in nature, exists primarily as a dimer and its interface is mainly stabilized by hydrophobic interactions. Our work demonstrates an evolved survival strategy by A. baumannii and provides insights into the mechanism that facilitates it to rise as a successful pathogen.
|
14
|
Sherill-Rofe D, Rahat D, Findlay S, Mellul A, Guberman I, Braun M, Bloch I, Lalezari A, Samiei A, Sadreyev R, Goldberg M, Orthwein A, Zick A, Tabach Y. Mapping global and local coevolution across 600 species to identify novel homologous recombination repair genes. Genome Res 2019; 29:439-448. [PMID: 30718334 PMCID: PMC6396423 DOI: 10.1101/gr.241414.118] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 07/04/2018] [Accepted: 01/22/2019] [Indexed: 12/02/2022]
Abstract
The homologous recombination repair (HRR) pathway repairs DNA double-strand breaks in an error-free manner. Mutations in HRR genes can result in increased mutation rate and genomic rearrangements, and are associated with numerous genetic disorders and cancer. Despite intensive research, the HRR pathway is not yet fully mapped. Phylogenetic profiling analysis, which detects functional linkage between genes using coevolution, is a powerful approach to identify factors in many pathways. Nevertheless, phylogenetic profiling has limited predictive power when analyzing pathways with complex evolutionary dynamics such as the HRR. To map novel HRR genes systematically, we developed clade phylogenetic profiling (CladePP). CladePP detects local coevolution across hundreds of genomes and points to the evolutionary scale (e.g., mammals, vertebrates, animals, plants) at which coevolution occurred. We found that multiscale coevolution analysis is significantly more biologically relevant and more sensitive in detecting gene function. By using CladePP, we identified dozens of unrecognized genes that coevolved with the HRR pathway, either globally across all eukaryotes or locally in different clades. We validated eight genes in functional biological assays to have a role in DNA repair at both the cellular and organismal levels. These genes are expected to play a role in the HRR pathway and might lead to a better understanding of missing heredity in HRR-associated cancers (e.g., hereditary breast and ovarian cancer). Our platform presents an innovative approach to predict gene function, identify novel factors related to different diseases and pathways, and characterize gene evolution.
Affiliation(s)
- Dana Sherill-Rofe
  - Department of Developmental Biology and Cancer Research, Institute for Medical Research-Israel-Canada, Hebrew University of Jerusalem, Jerusalem 91120, Israel
- Dolev Rahat
  - Department of Developmental Biology and Cancer Research, Institute for Medical Research-Israel-Canada, Hebrew University of Jerusalem, Jerusalem 91120, Israel; Sharett Institute of Oncology, Hadassah Medical Center, Ein-Kerem, Jerusalem 91120, Israel
- Steven Findlay
  - Lady Davis Institute for Medical Research, Segal Cancer Centre, Jewish General Hospital, Montreal, Quebec H3T 1E2, Canada; Division of Experimental Medicine, McGill University, Montreal, Quebec H4A 3J1, Canada
- Anna Mellul
  - Department of Developmental Biology and Cancer Research, Institute for Medical Research-Israel-Canada, Hebrew University of Jerusalem, Jerusalem 91120, Israel
- Irene Guberman
  - Department of Developmental Biology and Cancer Research, Institute for Medical Research-Israel-Canada, Hebrew University of Jerusalem, Jerusalem 91120, Israel
- Maya Braun
  - Department of Developmental Biology and Cancer Research, Institute for Medical Research-Israel-Canada, Hebrew University of Jerusalem, Jerusalem 91120, Israel
- Idit Bloch
  - Department of Developmental Biology and Cancer Research, Institute for Medical Research-Israel-Canada, Hebrew University of Jerusalem, Jerusalem 91120, Israel
- Alon Lalezari
  - Department of Developmental Biology and Cancer Research, Institute for Medical Research-Israel-Canada, Hebrew University of Jerusalem, Jerusalem 91120, Israel
- Arash Samiei
  - Lady Davis Institute for Medical Research, Segal Cancer Centre, Jewish General Hospital, Montreal, Quebec H3T 1E2, Canada; Division of Experimental Medicine, McGill University, Montreal, Quebec H4A 3J1, Canada
- Ruslan Sadreyev
  - Department of Molecular Biology, Massachusetts General Hospital, Boston, Massachusetts 02114, USA; Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA; Department of Pathology, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts 02114, USA
- Michal Goldberg
  - Department of Genetics, Alexander Silberman Institute of Life Sciences, Hebrew University of Jerusalem, Jerusalem 91904, Israel
- Alexandre Orthwein
  - Lady Davis Institute for Medical Research, Segal Cancer Centre, Jewish General Hospital, Montreal, Quebec H3T 1E2, Canada; Division of Experimental Medicine, McGill University, Montreal, Quebec H4A 3J1, Canada; Department of Microbiology and Immunology, McGill University, Montreal, Quebec H3A 2B4, Canada; Gerald Bronfman Department of Oncology, McGill University, Montreal, Quebec H4A 3T2, Canada
- Aviad Zick
  - Sharett Institute of Oncology, Hadassah Medical Center, Ein-Kerem, Jerusalem 91120, Israel
- Yuval Tabach
  - Department of Developmental Biology and Cancer Research, Institute for Medical Research-Israel-Canada, Hebrew University of Jerusalem, Jerusalem 91120, Israel
|
15
|
Ziemert N, Alanjary M, Weber T. The evolution of genome mining in microbes - a review. Nat Prod Rep 2016; 33:988-1005. [PMID: 27272205 DOI: 10.1039/c6np00025h] [Citation(s) in RCA: 404] [Impact Index Per Article: 50.5] [Indexed: 12/23/2022]
Abstract
Covering: 2006 to 2016. The computational mining of genomes has become an important part of the discovery of novel natural products as drug leads. Thousands of bacterial genome sequences are now publicly available, containing an even larger number and diversity of secondary metabolite gene clusters that await linkage to their encoded natural products. With the development of high-throughput sequencing methods and the wealth of DNA data available, a variety of genome mining methods and tools have been developed to guide discovery and characterisation of these compounds. This article reviews the development of these computational approaches during the last decade and shows how the revolution of next generation sequencing methods has led to an evolution of various genome mining approaches, techniques and tools. After a short introduction and brief overview of important milestones, this article will focus on the different approaches of mining genomes for secondary metabolites, from detecting biosynthetic genes to resistance-based methods and "evo-mining" strategies, including a short evaluation of the impact of the development of genome mining methods and tools on the field of natural products and microbial ecology.
Affiliation(s)
- Nadine Ziemert
  - Interfaculty Institute for Microbiology and Infection Medicine Tübingen (IMIT), Microbiology and Biotechnology, University of Tuebingen, Germany
|
16
|
Palma-Silva C, Ferro M, Bacci M, Turchetto-Zolet AC. De novo assembly and characterization of leaf and floral transcriptomes of the hybridizing bromeliad species (Pitcairnia spp.) adapted to Neotropical Inselbergs. Mol Ecol Resour 2016; 16:1012-22. [PMID: 26849180 DOI: 10.1111/1755-0998.12504] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 06/09/2015] [Revised: 12/17/2015] [Accepted: 12/22/2015] [Indexed: 02/06/2023]
Abstract
We present the leaf and floral transcriptomes of two hybridizing bromeliad species that differ in their major pollinator systems. Here we identified candidate genes responsible for pollinator attraction and reproductive isolation in these two species. We searched for candidate genes involved in floral traits, such as colour. Approximately 34 Gbp of cDNA sequence data were produced from both tissues and species, resulting in a total of 424 506 914 raw reads. The de novo-assembled transcriptomes consisted of a total of 263 955 contigs, further clustered into 110 977 unigenes. Over 58% of the unigenes were functionally annotated and assigned to one or more Gene Ontology terms. The transcriptomes revealed 144 unique transcripts that encode key enzymes in the flavonoid and anthocyanin biosynthesis pathways. The domain/family annotation and phylogenetic analysis allowed us to infer, by homology, potential functions of the genes encoding MYB, HD-ZIP and bZIP-HY5 transcription factors, as well as WD40 protein, which may be involved in anthocyanin and flavonoid regulation in these species. These candidate genes are associated with natural regulation in flower colour in other plant species and will facilitate future studies aimed at elucidating the molecular basis of adaptive differentiation and the evolution of mechanisms of pollinator-mediated reproductive isolation in these two bromeliads. In addition, we identified a total of 49 439 microsatellite loci. These resources will assist future research into adaptation and speciation events in bromeliad species, thus providing a starting point for investigation of the molecular mechanisms of the traits responsible for their reproductive isolation.
Affiliation(s)
- C Palma-Silva
  - Departamento de Ecologia, Programa de Pós-graduação em Ecologia e Biodiversidade, Instituto de Biociências, Universidade Estadual Paulista Julio Mesquita Filho, 13506-900, Rio Claro, SP, Brazil
- M Ferro
  - Centro de Estudos de Insetos Sociais, Instituto de Biociências, Universidade Estadual Paulista Julio Mesquita Filho, 13506-900, Rio Claro, SP, Brazil
- M Bacci
  - Centro de Estudos de Insetos Sociais, Instituto de Biociências, Universidade Estadual Paulista Julio Mesquita Filho, 13506-900, Rio Claro, SP, Brazil
- A C Turchetto-Zolet
  - Departamento de Genética, Programa de Pós-graduação em Genética e Biologia Molecular, Instituto de Biociências, Universidade Federal do Rio Grande do Sul, 91501-970, Porto Alegre, RS, Brazil
|
17
|
Valdivia HO, Scholte LLS, Oliveira G, Gabaldón T, Bartholomeu DC. The Leishmania metaphylome: a comprehensive survey of Leishmania protein phylogenetic relationships. BMC Genomics 2015; 16:887. [PMID: 26518129 PMCID: PMC4628237 DOI: 10.1186/s12864-015-2091-2] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Received: 05/22/2015] [Accepted: 10/15/2015] [Indexed: 11/22/2022]
Abstract
BACKGROUND Leishmaniasis is a neglected parasitic disease with diverse clinical manifestations and a complex epidemiology. It has been shown that its parasite-related traits vary between species and that they modulate infectivity, pathogenicity, and virulence. However, understanding of the species-specific adaptations responsible for these features and their evolutionary background is limited. To improve our knowledge regarding the parasite biology and adaptation mechanisms of different Leishmania species, we conducted a proteome-wide phylogenomic analysis to gain insights into Leishmania evolution. RESULTS The analysis of the reconstructed phylomes (totaling 45,918 phylogenies) allowed us to detect genes that are shared in pathogenic Leishmania species, such as calpain-like cysteine peptidases and 3'a2rel-related proteins, or genes that could be associated with visceral or cutaneous development. This analysis also established the phylogenetic relationship of several hypothetical proteins whose roles remain to be characterized. Our findings demonstrated that gene duplication constitutes an important evolutionary force in Leishmania, acting on protein families that mediate host-parasite interactions, such as amastins, GP63 metallopeptidases, and cathepsin L-like proteases, and our methods permitted a deeper analysis of their phylogenetic relationships. CONCLUSIONS Our results highlight the importance of proteome-wide phylogenetic analyses to detect adaptation and evolutionary processes in different organisms and underscore the need to characterize the role of expanded and species-specific proteins in the context of Leishmania evolution by providing a framework for the phylogenetic relationships of Leishmania proteins. Phylogenomic data are publicly available for use through PhylomeDB (http://www.phylomedb.org).
Affiliation(s)
- Hugo O Valdivia
  - Laboratório de Imunologia e Genômica de Parasitos, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Av. Presidente Antonio Carlos, 6627 - Pampulha, Belo Horizonte, MG, 31270-901, Brazil; Department of Parasitology, U.S. Naval Medical Research Unit No. 6, Lima, Peru; Centro de Investigaciones Tecnológicas, Biomédicas y Medioambientales, Lima, Peru
- Larissa L S Scholte
  - Genomics and Computational Biology Group, Centro de Pesquisas René Rachou, Belo Horizonte, Brazil
- Guilherme Oliveira
  - Genomics and Computational Biology Group, Centro de Pesquisas René Rachou, Belo Horizonte, Brazil; Instituto Tecnológico Vale - ITV, Belém, Brazil
- Toni Gabaldón
  - Bioinformatics and Genomics Programme, Centre for Genomic Regulation (CRG), Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
- Daniella C Bartholomeu
  - Laboratório de Imunologia e Genômica de Parasitos, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Av. Presidente Antonio Carlos, 6627 - Pampulha, Belo Horizonte, MG, 31270-901, Brazil; Centro de Investigaciones Tecnológicas, Biomédicas y Medioambientales, Lima, Peru
|
18
|
Shin JH, Han JH, Kim KS. Genome-wide analyses of DNA-binding proteins harboring AT-hook motifs and their functional roles in the rice blast pathogen, Magnaporthe oryzae. Genes Genomics 2014. [DOI: 10.1007/s13258-014-0233-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/24/2022]
|
20
|
Kleist TJ, Spencley AL, Luan S. Comparative phylogenomics of the CBL-CIPK calcium-decoding network in the moss Physcomitrella, Arabidopsis, and other green lineages. FRONTIERS IN PLANT SCIENCE 2014; 5:187. [PMID: 24860579 PMCID: PMC4030171 DOI: 10.3389/fpls.2014.00187] [Citation(s) in RCA: 56] [Impact Index Per Article: 5.6] [Received: 01/31/2014] [Accepted: 04/21/2014] [Indexed: 05/22/2023]
Abstract
Land plants have evolved a host of anatomical and molecular adaptations for terrestrial growth. Many of these adaptations are believed to be elaborations of features that were present in their algal-like progenitors. In the model plant Arabidopsis, 10 Calcineurin B-Like proteins (CBLs) function as calcium sensors and modulate the activity of 26 CBL-Interacting Protein Kinases (CIPKs). The CBL-CIPK network coordinates environmental responses and helps maintain proper ion balances, especially during abiotic stress. We identified and analyzed CBL and CIPK homologs in green lineages, including CBLs and CIPKs from charophyte green algae, the closest living relatives of land plants. Phylogenomic evidence suggests that the network expanded from a small module, likely a single CBL-CIPK pair, present in the ancestor of modern plants and algae. Extreme conservation of the NAF motif, which mediates CBL-CIPK physical interactions, among all identified CIPKs supports the interpretation of CBL and CIPK homologs in green algae and early diverging land plants as functionally linked network components. We identified the full complement of CBL and CIPK loci in the genome of Physcomitrella, a model moss. These analyses demonstrate the strong effects of a recent moss whole genome duplication: CBL and CIPK loci appear in cognate pairs, some of which appear to be pseudogenes, with high sequence similarity. We cloned all full-length transcripts from these loci and performed yeast two-hybrid analyses to demonstrate CBL-CIPK interactions and identify specific connections within the network. Using phylogenomics, we have identified three ancient types of CBLs that are discernible by N-terminal localization motifs and a "green algal-type" clade of CIPKs with members from Physcomitrella and Arabidopsis.
Affiliation(s)
- Thomas J. Kleist
  - Department of Plant and Microbial Biology, University of California Berkeley, Berkeley, CA, USA
- Andrew L. Spencley
  - Department of Plant and Microbial Biology, University of California Berkeley, Berkeley, CA, USA
  - Department of Dermatology, Stanford University, Stanford, CA, USA
- Sheng Luan
  - Department of Plant and Microbial Biology, University of California Berkeley, Berkeley, CA, USA
|
21
|
Lohse M, Nagel A, Herter T, May P, Schroda M, Zrenner R, Tohge T, Fernie AR, Stitt M, Usadel B. Mercator: a fast and simple web server for genome scale functional annotation of plant sequence data. PLANT, CELL & ENVIRONMENT 2014; 37:1250-8. [PMID: 24237261 DOI: 10.1111/pce.12231] [Citation(s) in RCA: 379] [Impact Index Per Article: 37.9] [Received: 08/01/2013] [Revised: 10/23/2013] [Accepted: 10/28/2013] [Indexed: 05/18/2023]
Abstract
Next-generation technologies generate an overwhelming amount of gene sequence data. Efficient annotation tools are required to make these data amenable to functional genomics analyses. The Mercator pipeline automatically assigns functional terms to protein or nucleotide sequences. It uses the MapMan 'BIN' ontology, which is tailored for functional annotation of plant 'omics' data. The classification procedure performs parallel sequence searches against reference databases, compiles the results and computes the most likely MapMan BINs for each query. In the current version, the pipeline relies on manually curated reference classifications originating from the three reference organisms (Arabidopsis, Chlamydomonas, rice), various other plant species that have a reviewed SwissProt annotation, and more than 2000 protein domain and family profiles at InterPro, CDD and KOG. Functional annotations predicted by Mercator achieve accuracies above 90% when benchmarked against manual annotation. In addition to mapping files for direct use in the visualization software MapMan, Mercator provides graphical overview charts, detailed annotation information in a convenient web browser interface and a MapMan-to-GO translation table to export results as GO terms. Mercator is available free of charge via http://mapman.gabipd.org/web/guest/app/Mercator.
Affiliation(s)
- Marc Lohse
- Max-Planck-Institute of Molecular Plant Physiology, 14476, Potsdam, Germany
|
22
|
Human disease locus discovery and mapping to molecular pathways through phylogenetic profiling. Mol Syst Biol 2013; 9:692. [PMID: 24084807 PMCID: PMC3817400 DOI: 10.1038/msb.2013.50] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.7] [Received: 06/21/2013] [Accepted: 08/29/2013] [Indexed: 12/16/2022] Open
Abstract
By analyzing the conservation of human proteins across 87 species, we sorted proteins into clusters of coevolution. Some clusters are enriched for genes assigned to particular human diseases or molecular pathways; the other genes in the same cluster may function in related pathways and diseases.
Many genes that were thought to map to different diseases are actually coevolved together and mapped into the same phylogenetic clusters. Many molecular pathways map to the same phylogenetic clusters as genes associated with specific human diseases. Focusing on proteins coevolved with the microphthalmia-associated transcription factor (MITF), we identified the Notch pathway suppressor of hairless (RBP-Jk/SuH) transcription factor, and showed that RBP-Jk functions as an MITF cofactor. Our analysis thus establishes a connectivity between different diseases and pathways, linking diseases phenotypes and functional gene groups.
Genes with common profiles of the presence and absence in disparate genomes tend to function in the same pathway. By mapping all human genes into about 1000 clusters of genes with similar patterns of conservation across eukaryotic phylogeny, we determined that sets of genes associated with particular diseases have similar phylogenetic profiles. By focusing on those human phylogenetic gene clusters that significantly overlap some of the thousands of human gene sets defined by their coexpression or annotation to pathways or other molecular attributes, we reveal the evolutionary map that connects molecular pathways and human diseases. The other genes in the phylogenetic clusters enriched for particular known disease genes or molecular pathways identify candidate genes for roles in those same disorders and pathways. Focusing on proteins coevolved with the microphthalmia-associated transcription factor (MITF), we identified the Notch pathway suppressor of hairless (RBP-Jk/SuH) transcription factor, and showed that RBP-Jk functions as an MITF cofactor.
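As a rough illustration of the phylogenetic-profile idea in this abstract, the sketch below represents each gene by the set of species in which a homolog is present and reports gene pairs whose profiles are similar. The Jaccard measure, the 0.8 threshold, and the gene and species names are assumptions chosen for illustration; the abstract does not state which similarity measure or clustering procedure the authors used.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets of species names."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_profile_pairs(profiles, threshold=0.8):
    """Return gene pairs whose presence profiles have Jaccard similarity >= threshold.

    profiles: dict mapping a gene name to the set of species in which a
    homolog of that gene is present (hypothetical input format).
    """
    genes = sorted(profiles)
    pairs = []
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            if jaccard(profiles[g], profiles[h]) >= threshold:
                pairs.append((g, h))
    return pairs


# Toy data: geneA and geneB share the same presence pattern, geneC does not.
toy_profiles = {
    "geneA": {"sp1", "sp2", "sp3"},
    "geneB": {"sp1", "sp2", "sp3"},
    "geneC": {"sp4"},
}
print(similar_profile_pairs(toy_profiles))
```

Jaccard similarity only rewards shared presence; a measure that also counts shared absence (for example, Hamming distance over a fixed species list) would be an equally plausible choice.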
|
23
|
Bouzat JL, Hoostal MJ. Evolutionary Analysis and Lateral Gene Transfer of Two-Component Regulatory Systems Associated with Heavy-Metal Tolerance in Bacteria. J Mol Evol 2013; 76:267-79. [DOI: 10.1007/s00239-013-9558-z] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Received: 07/27/2012] [Accepted: 03/23/2013] [Indexed: 11/28/2022]
|
24
|
Silva LL, Marcet-Houben M, Nahum LA, Zerlotini A, Gabaldón T, Oliveira G. The Schistosoma mansoni phylome: using evolutionary genomics to gain insight into a parasite's biology. BMC Genomics 2012; 13:617. [PMID: 23148687 PMCID: PMC3534613 DOI: 10.1186/1471-2164-13-617] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Received: 09/30/2012] [Accepted: 10/22/2012] [Indexed: 01/10/2023] Open
Abstract
BACKGROUND Schistosoma mansoni is one of the causative agents of schistosomiasis, a neglected tropical disease that affects about 237 million people worldwide. Despite recent efforts, we still lack a general understanding of the relevant host-parasite interactions, and the possible treatments are limited by the emergence of resistant strains and the absence of a vaccine. The S. mansoni genome has been completely sequenced and is still under continuous annotation. Nevertheless, more than 45% of the encoded proteins remain without experimental characterization or even functional prediction. To improve our knowledge regarding the biology of this parasite, we conducted a proteome-wide evolutionary analysis to provide a broad view of S. mansoni proteome evolution and to improve its functional annotation. RESULTS Using a phylogenomic approach, we reconstructed the S. mansoni phylome, which comprises the evolutionary histories of all parasite proteins and their homologs across 12 other organisms. The analysis of a total of 7,964 phylogenies allowed a deeper understanding of genomic complexity and evolutionary adaptations to a parasitic lifestyle. In particular, the identification of lineage-specific gene duplications pointed to the diversification of several protein families that are relevant for host-parasite interaction, including proteases, tetraspanins, fucosyltransferases, venom allergen-like proteins, and tegumental-allergen-like proteins. In addition to the evolutionary knowledge, the phylome data enabled us to automatically re-annotate 3,451 proteins through a phylogenetic-based approach rather than solely sequence similarity searches. To allow further exploitation of this valuable data, all information has been made available at PhylomeDB (http://www.phylomedb.org). CONCLUSIONS In this study, we used an evolutionary approach to assess S. mansoni parasite biology, improve genome/proteome functional annotation, and provide insights into host-parasite interactions.
Taking advantage of a proteome-wide perspective rather than focusing on individual proteins, we identified that this parasite has experienced specific gene duplication events, particularly affecting genes that are potentially related to the parasitic lifestyle. These innovations may be related to the mechanisms that protect S. mansoni against host immune responses being important adaptations for the parasite survival in a potentially hostile environment. Continuing this work, a comparative analysis involving genomic, transcriptomic, and proteomic data from other helminth parasites, other parasites, and vectors will supply more information regarding parasite's biology as well as host-parasite interactions.
Affiliation(s)
- Larissa Lopes Silva
- Grupo de Genômica e Biologia Computacional, Centro de Pesquisas René Rachou. Instituto Nacional de Ciência e Tecnologia em Doenças Tropicais. Fundação Oswaldo Cruz - FIOCRUZ, Belo Horizonte, MG, 30190-002, Brazil
- Centro de Excelência em Bioinformática, Fundação Oswaldo Cruz – FIOCRUZ, Belo Horizonte, MG, Brazil
- Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais – UFMG, Belo Horizonte, MG, Brazil
| | - Marina Marcet-Houben
- Bioinformatics and Genomics Programme, Centre for Genomic Regulation (CRG), Dr. Aiguader, 88, 08003, Barcelona, Spain
- Universitat Pompeu Fabra (UPF), 08003, Barcelona, Spain
| | - Laila Alves Nahum
- Grupo de Genômica e Biologia Computacional, Centro de Pesquisas René Rachou. Instituto Nacional de Ciência e Tecnologia em Doenças Tropicais. Fundação Oswaldo Cruz - FIOCRUZ, Belo Horizonte, MG, 30190-002, Brazil
- Centro de Excelência em Bioinformática, Fundação Oswaldo Cruz – FIOCRUZ, Belo Horizonte, MG, Brazil
- Faculdade Infórium de Tecnologia, Belo Horizonte, MG, 30130-180, Brazil
| | - Adhemar Zerlotini
- Centro de Excelência em Bioinformática, Fundação Oswaldo Cruz – FIOCRUZ, Belo Horizonte, MG, Brazil
- Laboratório Multiusuário de Bioinformática, Embrapa Informática Agropecuária, Campinas, São Paulo, Brazil
| | - Toni Gabaldón
- Bioinformatics and Genomics Programme, Centre for Genomic Regulation (CRG), Dr. Aiguader, 88, 08003, Barcelona, Spain
- Universitat Pompeu Fabra (UPF), 08003, Barcelona, Spain
| | - Guilherme Oliveira
- Grupo de Genômica e Biologia Computacional, Centro de Pesquisas René Rachou. Instituto Nacional de Ciência e Tecnologia em Doenças Tropicais. Fundação Oswaldo Cruz - FIOCRUZ, Belo Horizonte, MG, 30190-002, Brazil
- Centro de Excelência em Bioinformática, Fundação Oswaldo Cruz – FIOCRUZ, Belo Horizonte, MG, Brazil
|
25
|
Orthopoxvirus genome evolution: the role of gene loss. Viruses 2010; 2:1933-1967. [PMID: 21994715 PMCID: PMC3185746 DOI: 10.3390/v2091933] [Citation(s) in RCA: 125] [Impact Index Per Article: 8.9] [Received: 07/14/2010] [Revised: 08/25/2010] [Accepted: 09/01/2010] [Indexed: 12/26/2022] Open
Abstract
Poxviruses are highly successful pathogens, known to infect a variety of hosts. The family Poxviridae includes Variola virus, the causative agent of smallpox, which has been eradicated as a public health threat but could potentially reemerge as a bioterrorist threat. The risk scenario includes other animal poxviruses and genetically engineered manipulations of poxviruses. Studies of orthologous gene sets have established the evolutionary relationships of members within the Poxviridae family. It is not clear, however, how variations between family members arose in the past, an important issue in understanding how these viruses may vary and possibly produce future threats. Using a newly developed poxvirus-specific tool, we predicted accurate gene sets for viruses with completely sequenced genomes in the genus Orthopoxvirus. Employing sensitive sequence comparison techniques together with comparison of syntenic gene maps, we established the relationships between all viral gene sets. These techniques allowed us to unambiguously identify the gene loss/gain events that have occurred over the course of orthopoxvirus evolution. It is clear that for all existing Orthopoxvirus species, no individual species has acquired protein-coding genes unique to that species. All existing species contain genes that are all present in members of the species Cowpox virus and that cowpox virus strains contain every gene present in any other orthopoxvirus strain. These results support a theory of reductive evolution in which the reduction in size of the core gene set of a putative ancestral virus played a critical role in speciation and confining any newly emerging virus species to a particular environmental (host or tissue) niche.
|
26
|
Cibrián-Jaramillo A, De la Torre-Bárcena JE, Lee EK, Katari MS, Little DP, Stevenson DW, Martienssen R, Coruzzi GM, DeSalle R. Using phylogenomic patterns and gene ontology to identify proteins of importance in plant evolution. Genome Biol Evol 2010; 2:225-39. [PMID: 20624728 PMCID: PMC2997538 DOI: 10.1093/gbe/evq012] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.9] [Accepted: 03/14/2010] [Indexed: 01/01/2023] Open
Abstract
We use measures of congruence on a combined expressed sequenced tag genome phylogeny to identify proteins that have potential significance in the evolution of seed plants. Relevant proteins are identified based on the direction of partitioned branch and hidden support on the hypothesis obtained on a 16-species tree, constructed from 2,557 concatenated orthologous genes. We provide a general method for detecting genes or groups of genes that may be under selection in directions that are in agreement with the phylogenetic pattern. Gene partitioning methods and estimates of the degree and direction of support of individual gene partitions to the overall data set are used. Using this approach, we correlate positive branch support of specific genes for key branches in the seed plant phylogeny. In addition to basic metabolic functions, such as photosynthesis or hormones, genes involved in posttranscriptional regulation by small RNAs were significantly overrepresented in key nodes of the phylogeny of seed plants. Two genes in our matrix are of critical importance as they are involved in RNA-dependent regulation, essential during embryo and leaf development. These are Argonaute and the RNA-dependent RNA polymerase 6 found to be overrepresented in the angiosperm clade. We use these genes as examples of our phylogenomics approach and show that identifying partitions or genes in this way provides a platform to explain some of the more interesting organismal differences among species, and in particular, in the evolution of plants.
Affiliation(s)
- Angélica Cibrián-Jaramillo
- Sackler Institute for Comparative Genomics, American Museum of Natural History, New York, New York, USA.
|
27
|
Towfic F, VanderPlas S, Oliver CA, Couture OI, Tuggle CK, West Greenlee MH, Honavar V. Detection of gene orthology from gene co-expression and protein interaction networks. BMC Bioinformatics 2010; 11 Suppl 3:S7. [PMID: 20438654 PMCID: PMC2863066 DOI: 10.1186/1471-2105-11-s3-s7] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Ortholog detection methods present a powerful approach for finding genes that participate in similar biological processes across different organisms, extending our understanding of interactions between genes across different pathways, and understanding the evolution of gene families. RESULTS We exploit features derived from the alignment of protein-protein interaction networks and gene-coexpression networks to reconstruct KEGG orthologs for Drosophila melanogaster, Saccharomyces cerevisiae, Mus musculus and Homo sapiens protein-protein interaction networks extracted from the DIP repository and Mus musculus and Homo sapiens and Sus scrofa gene coexpression networks extracted from NCBI's Gene Expression Omnibus using the decision tree, Naive-Bayes and Support Vector Machine classification algorithms. CONCLUSIONS The performance of our classifiers in reconstructing KEGG orthologs is compared against a basic reciprocal BLAST hit approach. We provide implementations of the resulting algorithms as part of BiNA, an open source biomolecular network alignment toolkit.
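The baseline named in this abstract, a reciprocal best hit approach, can be sketched as follows. The input format (two dictionaries of best hits, one per search direction) and all identifiers are hypothetical, and the sketch assumes the best hits have already been extracted from BLAST output; it is not the BiNA implementation.

```python
def reciprocal_best_hits(best_a_to_b, best_b_to_a):
    """Return sorted (a, b) pairs where b is a's best hit and a is b's best hit.

    best_a_to_b: dict mapping each species-A protein to its best hit in species B.
    best_b_to_a: dict mapping each species-B protein to its best hit in species A.
    """
    return sorted(
        (a, b)
        for a, b in best_a_to_b.items()
        if best_b_to_a.get(b) == a
    )


# Toy data: a2 and a3 both hit b2, but b2's best hit is a3, so only a3 pairs with b2.
a_to_b = {"a1": "b1", "a2": "b2", "a3": "b2"}
b_to_a = {"b1": "a1", "b2": "a3"}
print(reciprocal_best_hits(a_to_b, b_to_a))
```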
Affiliation(s)
- Fadi Towfic
- Bioinformatics and Computational Biology Graduate Program Iowa State University, Ames, IA, USA
- Department of Computer Science, Iowa State University, Ames, IA, USA
| | - Susan VanderPlas
- Bioinformatics and Computational Biology Graduate Program Iowa State University, Ames, IA, USA
| | | | - Oliver Couture
- Department of Animal Science, Iowa State University, Ames, IA, USA
| | - Christopher K Tuggle
- Bioinformatics and Computational Biology Graduate Program Iowa State University, Ames, IA, USA
- Department of Animal Science, Iowa State University, Ames, IA, USA
| | - M Heather West Greenlee
- Bioinformatics and Computational Biology Graduate Program Iowa State University, Ames, IA, USA
- Department of Biomedical Sciences, Iowa State University, Ames, IA, USA
| | - Vasant Honavar
- Bioinformatics and Computational Biology Graduate Program Iowa State University, Ames, IA, USA
- Department of Computer Science, Iowa State University, Ames, IA, USA
|
28
|
Timmins J, Gordon E, Caria S, Leonard G, Acajjaoui S, Kuo MS, Monchois V, McSweeney S. Structural and mutational analyses of Deinococcus radiodurans UvrA2 provide insight into DNA binding and damage recognition by UvrAs. Structure 2009; 17:547-58. [PMID: 19368888 DOI: 10.1016/j.str.2009.02.008] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.1] [Received: 06/30/2008] [Revised: 02/03/2009] [Accepted: 02/04/2009] [Indexed: 10/20/2022]
Abstract
UvrA proteins are key actors in DNA damage repair and play an essential role in prokaryotic nucleotide excision repair (NER), a pathway that is unique in its ability to remove a broad spectrum of DNA lesions. Understanding the DNA binding and damage recognition activities of the UvrA family is a critical component for establishing the molecular basis of this process. Here we report the structure of the class II UvrA2 from Deinococcus radiodurans in two crystal forms. These structures, coupled with mutational analyses and comparison with the crystal structure of class I UvrA from Bacillus stearothermophilus, suggest a previously unsuspected role for the identified insertion domains of UvrAs in both DNA binding and damage recognition. Taken together, the available information suggests a model for how UvrA interacts with DNA and thus sheds new light on the molecular mechanisms underlying the role of UvrA in the early steps of NER.
Affiliation(s)
- Joanna Timmins
- European Synchrotron Radiation Facility, 38043 Grenoble, France
|
29
|
Zhou JM, Seo YW, Ibrahim RK. Biochemical characterization of a putative wheat caffeic acid O-methyltransferase. Plant Physiology and Biochemistry: PPB 2009; 47:322-326. [PMID: 19211254 DOI: 10.1016/j.plaphy.2008.11.011] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Received: 10/01/2008] [Accepted: 11/26/2008] [Indexed: 05/27/2023]
Abstract
A wheat (Triticum aestivum L., near isogenic line of Hamlet) O-methyltransferase (OMT) was previously reported as a putative caffeic acid OMT (TaCOMT1), involved in lignin biosynthesis, based on its high sequence similarity with a number of graminaceous COMTs. The fact that the putative TaCOMT1 exhibits a significantly high sequence homology to another recently characterized wheat flavone-specific OMT (TaOMT2), and that molecular modeling studies indicated several conserved amino acid residues involved in substrate binding and catalysis of both proteins, prompted an investigation of its appropriate substrate specificity. We report here that TaCOMT1 exhibits highest preference for the flavone tricetin, and lowest activity with the lignin precursors, caffeic acid/5-hydroxyferulic acid as the methyl acceptor molecules, indicating that it is not involved in lignin biosynthesis. We recommend its reannotation to a flavone-specific TaOMT1 that is distinct from TaOMT2.
Affiliation(s)
- Jian-Min Zhou
- Plant Biochemistry Laboratory, Concordia University, Montréal, Québec, Canada H4B 1R6
|
30
|
Jiang Z. Protein Function Predictions Based on the Phylogenetic Profile Method. Crit Rev Biotechnol 2008; 28:233-8. [PMID: 19051102 DOI: 10.1080/07388550802512633] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Indexed: 10/21/2022]
|
31
|
Singh S, Stavrinides J, Christendat D, Guttman DS. A phylogenomic analysis of the shikimate dehydrogenases reveals broadscale functional diversification and identifies one functionally distinct subclass. Mol Biol Evol 2008; 25:2221-32. [PMID: 18669580 DOI: 10.1093/molbev/msn170] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Indexed: 11/13/2022] Open
Abstract
The shikimate dehydrogenases (SDH) represent a widely distributed enzyme family with an essential role in secondary metabolism. This superfamily had been previously subdivided into 4 enzyme groups (AroE, YdiB, SdhL, and RifI), which show clear biochemical and functional differences ranging from amino acid biosynthesis to antibiotic production. Despite the importance of this group, little is known about how such essential enzymatic functions can evolve and diversify. We dissected the enzyme superfamily with a phylogenomic analysis of approximately 250 fully sequenced genomes, making use of previously characterized representatives from each enzyme class, and the key substrate-binding residues known to distinguish substrate specificity. We identified 5 major evolutionary and functional SDH subgroups and several other potentially unique functional classes within this complex enzyme family and then validated the functional distinctiveness of each group by characterizing the 5 SDH homologs found in Pseudomonas putida KT2440 biochemically. We identified an entirely novel functionally distinct subgroup, which we designated Ael1 (AroE-like1) and also delineated a new group of shikimate/quinate dehydrogenases (YdiB2), which is phylogenetically distinct from the previously described Escherichia coli YdiB. The combination of biochemical, phylogenetic, and genomic approaches has revealed the broad extent to which the SDH enzyme superfamily has diversified. Five functional groups were validated with the potential for at least 5 additional subgroups. Our analysis also identified a new SDH functional group, which appears to have evolved recently from an ancestral AroE, illustrating a very prominent role of horizontal transmission and neofunctionalization in the evolutionary and functional diversification of this enzyme family.
Affiliation(s)
- Sasha Singh
- Department of Pathology, Children's Hospital Boston, Harvard Medical School, Boston, MA, USA
|
32
|
Levasseur A, Pontarotti P, Poch O, Thompson JD. Strategies for reliable exploitation of evolutionary concepts in high throughput biology. Evol Bioinform Online 2008; 4:121-37. [PMID: 19204813 PMCID: PMC2614184 DOI: 10.4137/ebo.s597] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Indexed: 11/27/2022] Open
Abstract
The recent availability of the complete genome sequences of a large number of model organisms, together with the immense amount of data being produced by the new high-throughput technologies, means that we can now begin comparative analyses to understand the mechanisms involved in the evolution of the genome and their consequences in the study of biological systems. Phylogenetic approaches provide a unique conceptual framework for performing comparative analyses of all this data, for propagating information between different systems and for predicting or inferring new knowledge. As a result, phylogeny-based inference systems are now playing an increasingly important role in most areas of high throughput genomics, including studies of promoters (phylogenetic footprinting), interactomes (based on the presence and degree of conservation of interacting proteins), and in comparisons of transcriptomes or proteomes (phylogenetic proximity and co-regulation/co-expression). Here we review the recent developments aimed at making automatic, reliable phylogeny-based inference feasible in large-scale projects. We also discuss how evolutionary concepts and phylogeny-based inference strategies are now being exploited in order to understand the evolution and function of biological systems. Such advances will be fundamental for the success of the emerging disciplines of systems biology and synthetic biology, and will have wide-reaching effects in applied fields such as biotechnology, medicine and pharmacology.
Affiliation(s)
- Anthony Levasseur
- Phylogenomics Laboratory, EA 3781 Evolution Biologique, Université de Provence, 13331 Marseille, France
|
33
|
Woody OZ, Doxey AC, McConkey BJ. Assessing the evolution of gene expression using microarray data. Evol Bioinform Online 2008; 4:139-52. [PMID: 19204814 PMCID: PMC2614203 DOI: 10.4137/ebo.s628] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 12/04/2022] Open
Abstract
Classical studies of the evolution of gene function have predominantly focused on mutations within protein coding regions. With the advent of microarrays, however, it has become possible to evaluate the transcriptional activity of a gene as an additional characteristic of function. Recent studies have revealed an equally important role for gene regulation in the retention and evolution of duplicate genes. Here we review approaches to assessing the evolution of gene expression using microarray data, and discuss potential influences on expression divergence. Currently, there are no established standards on how best to identify and quantify instances of expression divergence. There have also been few efforts to date that incorporate suspected influences into mathematical models of expression divergence. Such developments will be crucial to a comprehensive understanding of the role gene duplications and expression evolution play in the emergence of complex traits and functional diversity. An integrative approach to gene family evolution, including both orthologous and paralogous genes, has the potential to bring strong predictive power both to the functional annotation of extant proteins and to the inference of functional characteristics of ancestral gene family members.
Affiliation(s)
- Owen Z Woody
- Department of Biology, University of Waterloo, Waterloo, Ontario Canada
|
34
|
Fuellen G. Homology and phylogeny and their automated inference. Naturwissenschaften 2008; 95:469-81. [PMID: 18288471 DOI: 10.1007/s00114-008-0348-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 03/13/2007] [Revised: 12/20/2007] [Accepted: 01/12/2008] [Indexed: 11/25/2022]
Abstract
The analysis of the ever-increasing amount of biological and biomedical data can be pushed forward by comparing the data within and among species. For example, an integrative analysis of data from the genome sequencing projects for various species traces the evolution of the genomes and identifies conserved and innovative parts. Here, I review the foundations and advantages of this "historical" approach and evaluate recent attempts at automating such analyses. Biological data is comparable if a common origin exists (homology), as is the case for members of a gene family originating via duplication of an ancestral gene. If the family has relatives in other species, we can assume that the ancestral gene was present in the ancestral species from which all the other species evolved. In particular, describing the relationships among the duplicated biological sequences found in the various species is often possible by a phylogeny, which is more informative than homology statements. Detecting and elaborating on common origins may answer how certain biological sequences developed, and predict what sequences are in a particular species and what their function is. Such knowledge transfer from sequences in one species to the homologous sequences of the other is based on the principle of 'my closest relative looks and behaves like I do', often referred to as 'guilt by association'. To enable knowledge transfer on a large scale, several automated 'phylogenomics pipelines' have been developed in recent years, and seven of these will be described and compared. Overall, the examples in this review demonstrate that homology and phylogeny analyses, done on a large (and automated) scale, can give insights into function in biology and biomedicine.
Affiliation(s)
- Georg Fuellen
- Bioinformatics Research Group, Institute for Mathematics and Computer Science, Ernst-Moritz-Arndt-University Greifswald, Greifswald, Germany.
|
35
|
Phylogenomics, Protein Family Evolution, and the Tree of Life: An Integrated Approach between Molecular Evolution and Computational Intelligence. Applications of Computational Intelligence in Biology 2008. [DOI: 10.1007/978-3-540-78534-7_11] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 12/23/2022]
|
36
|
Martins-Pinheiro M, Marques RCP, Menck CFM. Genome analysis of DNA repair genes in the alpha proteobacterium Caulobacter crescentus. BMC Microbiol 2007; 7:17. [PMID: 17352799 PMCID: PMC1839093 DOI: 10.1186/1471-2180-7-17] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.4] [Received: 10/04/2006] [Accepted: 03/12/2007] [Indexed: 11/10/2022] Open
Abstract
Background The integrity of DNA molecules is fundamental for maintaining life. The DNA repair proteins protect organisms against genetic damage, by removal of DNA lesions or helping to tolerate them. DNA repair genes are best known from the gamma-proteobacterium Escherichia coli, which is the most understood bacterial model. However, genome sequencing raises questions regarding uniformity and ubiquity of these DNA repair genes and pathways, reinforcing the need for identifying genes and proteins, which may respond to DNA damage in other bacteria. Results In this study, we employed a bioinformatic approach, to analyse and describe the open reading frames potentially related to DNA repair from the genome of the alpha-proteobacterium Caulobacter crescentus. This was performed by comparison with known DNA repair related genes found in public databases. As expected, although C. crescentus and E. coli bacteria belong to separate phylogenetic groups, many of their DNA repair genes are very similar. However, some important DNA repair genes are absent in the C. crescentus genome and other interesting functionally related gene duplications are present, which do not occur in E. coli. These include DNA ligases, exonuclease III (xthA), endonuclease III (nth), O6-methylguanine-DNA methyltransferase (ada gene), photolyase-like genes, and uracil-DNA-glycosylases. On the other hand, the genes imuA and imuB, which are involved in DNA damage induced mutagenesis, have recently been described in C. crescentus, but are absent in E. coli. Particularly interesting are the potential atypical phylogeny of one of the photolyase genes in alpha-proteobacteria, indicating an origin by horizontal transfer, and the duplication of the Ada orthologs, which have diverse structural configurations, including one that is still unique for C. crescentus. Conclusion The absence and the presence of certain genes are discussed and predictions are made considering the particular aspects of the C. crescentus among other known DNA repair pathways. The observed differences enlarge what is known for DNA repair in the Bacterial world, and provide a useful framework for further experimental studies in this organism.
Affiliation(s)
- Marinalva Martins-Pinheiro
- Department of Microbiology, Institute of Biomedical Sciences, Av. Prof. Lineu Prestes 1374, São Paulo, 05508-900, SP, Brazil
| | - Regina CP Marques
- Department of Microbiology, Institute of Biomedical Sciences, Av. Prof. Lineu Prestes 1374, São Paulo, 05508-900, SP, Brazil
| | - Carlos FM Menck
- Department of Microbiology, Institute of Biomedical Sciences, Av. Prof. Lineu Prestes 1374, São Paulo, 05508-900, SP, Brazil
|
37
|
Lee I, Narayanaswamy R, Marcotte EM. 24 Bioinformatic Prediction of Yeast Gene Function. J Microbiol Methods 2007. [DOI: 10.1016/s0580-9517(06)36024-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/27/2022]
|
38
|
Bandyopadhyay S, Sharan R, Ideker T. Systematic identification of functional orthologs based on protein network comparison. Genome Res 2006; 16:428-35. [PMID: 16510899 PMCID: PMC1415213 DOI: 10.1101/gr.4526006] [Citation(s) in RCA: 148] [Impact Index Per Article: 8.2] [Indexed: 01/12/2023]
Abstract
Annotating protein function across species is an important task that is often complicated by the presence of large paralogous gene families. Here, we report a novel strategy for identifying functionally related proteins that supplements sequence-based comparisons with information on conserved protein-protein interactions. First, the protein interaction networks of two species are aligned by assigning proteins to sequence homology clusters using the Inparanoid algorithm. Next, probabilistic inference is performed on the aligned networks to identify pairs of proteins, one from each species, that are likely to retain the same function based on conservation of their interacting partners. Applying this method to Drosophila melanogaster and Saccharomyces cerevisiae, we analyze 121 cases for which functional orthology assignment is ambiguous when sequence similarity is used alone. In 61 of these cases, the network supports a different protein pair than that favored by sequence comparisons. These results suggest that network analysis can be used to provide a key source of information for refining sequence-based homology searches.
Affiliation(s)
- Sourav Bandyopadhyay
- Program in Bioinformatics, University of California at San Diego, La Jolla, California 92093, USA
|
39
|
Zhang W, Culley DE, Gritsenko MA, Moore RJ, Nie L, Scholten JCM, Petritis K, Strittmatter EF, Camp DG, Smith RD, Brockman FJ. LC-MS/MS based proteomic analysis and functional inference of hypothetical proteins in Desulfovibrio vulgaris. Biochem Biophys Res Commun 2006; 349:1412-9. [PMID: 16982031 DOI: 10.1016/j.bbrc.2006.09.019] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Received: 09/01/2006] [Accepted: 09/07/2006] [Indexed: 11/26/2022]
Abstract
High efficiency capillary liquid chromatography-tandem mass spectrometry (LC-MS/MS) was used to examine the proteins extracted from Desulfovibrio vulgaris cells across six treatment conditions. While our previous study provided a proteomic overview of the cellular metabolism based on proteins with known functions [W. Zhang, M.A. Gritsenko, R.J. Moore, D.E. Culley, L. Nie, K. Petritis, E.F. Strittmatter, D.G. Camp II, R.D. Smith, F.J. Brockman, A proteomic view of the metabolism in Desulfovibrio vulgaris determined by liquid chromatography coupled with tandem mass spectrometry, Proteomics 6 (2006) 4286-4299], this study describes the global detection and functional inference for hypothetical D. vulgaris proteins. Using criteria that a given peptide of a protein is identified from at least two out of three independent LC-MS/MS measurements and that for any protein at least two different peptides are identified among the three measurements, 129 open reading frames (ORFs) originally annotated as hypothetical proteins were found to encode expressed proteins. Functional inference for the conserved hypothetical proteins was performed by a combination of several non-homology based methods: genomic context analysis, phylogenomic profiling, and analysis of a combination of experimental information, including peptide detection in cells grown under specific culture conditions and cellular location of the proteins. Using this approach we were able to assign possible functions to 20 conserved hypothetical proteins. This study demonstrated that a combination of proteomics and bioinformatics methodologies can provide verification of the expression of hypothetical proteins and improve genome annotation.
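The two detection criteria in this abstract can be read as a simple filter. The sketch below is one possible reading, not the authors' pipeline: it assumes a peptide counts only if it was seen in at least two of the three runs, and a protein counts only if it has at least two such peptides. The function name, the tuple layout, and the ORF and peptide labels are all hypothetical.

```python
from collections import defaultdict


def expressed_proteins(detections, min_runs=2, min_peptides=2):
    """Return proteins supported by enough reproducibly detected peptides.

    detections: iterable of (protein_id, peptide, run_index) tuples,
    one tuple per peptide identification in one LC-MS/MS run.
    """
    # Collect the set of runs in which each (protein, peptide) pair was seen.
    runs_by_peptide = defaultdict(set)
    for protein, peptide, run in detections:
        runs_by_peptide[(protein, peptide)].add(run)

    # Keep only peptides seen in at least `min_runs` independent runs.
    confident = defaultdict(set)
    for (protein, peptide), runs in runs_by_peptide.items():
        if len(runs) >= min_runs:
            confident[protein].add(peptide)

    # A protein passes if it has at least `min_peptides` distinct such peptides.
    return {p for p, peps in confident.items() if len(peps) >= min_peptides}


# Hypothetical toy data: ORF1 has two reproducible peptides, ORF2 only one.
example = [
    ("ORF1", "AAK", 1), ("ORF1", "AAK", 2),
    ("ORF1", "GGR", 2), ("ORF1", "GGR", 3),
    ("ORF2", "LLK", 1), ("ORF2", "LLK", 2),
    ("ORF2", "TTR", 3),
]
passing = expressed_proteins(example)
```

With the toy data, only ORF1 passes, because the second ORF2 peptide appears in a single run. A looser reading, in which the second peptide need not be reproducible, would change only the final filter.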
Affiliation(s)
- Weiwen Zhang
- Microbiology Group, Pacific Northwest National Laboratory, 902 Battelle Boulevard, P.O. Box 999, Richland, WA 99352, USA.
|
40
|
Alako BTF, Rainey D, Nijveen H, Leunissen JAM. TreeDomViewer: a tool for the visualization of phylogeny and protein domain structure. Nucleic Acids Res 2006; 34:W104-9. [PMID: 16844970 PMCID: PMC1538806 DOI: 10.1093/nar/gkl171] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Indexed: 11/27/2022] Open
Abstract
Phylogenetic analysis and examination of protein domains allow accurate genome annotation and are invaluable to study proteins and protein complex evolution. However, two sequences can be homologous without sharing statistically significant amino acid or nucleotide identity, presenting a challenging bioinformatics problem. We present TreeDomViewer, a visualization tool available as a web-based interface that combines phylogenetic tree description, multiple sequence alignment and InterProScan data of sequences and generates a phylogenetic tree projecting the corresponding protein domain information onto the multiple sequence alignment. Thereby it makes use of existing domain prediction tools such as InterProScan. TreeDomViewer adopts an evolutionary perspective on how domain structure of two or more sequences can be aligned and compared, to subsequently infer the function of an unknown homolog. This provides insight into the function assignment of, in terms of amino acid substitution, very divergent but yet closely related family members. Our tool produces an interactive scalable vector graphics (SVG) image that provides orthological relationship and domain content of proteins of interest at a glance. In addition, PDF, JPEG or PNG formatted output is also provided. These features make TreeDomViewer a valuable addition to the annotation pipeline of unknown genes or gene products. TreeDomViewer is available at .
Affiliation(s)
- Blaise T. F. Alako
- Laboratory of Bioinformatics, Wageningen University and Research Centre, PO Box 8128, 6700 ET Wageningen, The Netherlands
- Centre for BioSystems Genomics, PO Box 98, 6700 AB Wageningen, The Netherlands
- Daphne Rainey
- KEYGENE NV, PO Box 216, 6700 AE Wageningen, The Netherlands
- Harm Nijveen
- Laboratory of Bioinformatics, Wageningen University and Research Centre, PO Box 8128, 6700 ET Wageningen, The Netherlands
- Jack A. M. Leunissen
- Laboratory of Bioinformatics, Wageningen University and Research Centre, PO Box 8128, 6700 ET Wageningen, The Netherlands
- To whom correspondence should be addressed. Tel: +31 317 482 036; Fax: +31 317 483 584
|
41
|
Jothi R, Zotenko E, Tasneem A, Przytycka TM. COCO-CL: hierarchical clustering of homology relations based on evolutionary correlations. Bioinformatics 2006; 22:779-88. [PMID: 16434444 PMCID: PMC1620014 DOI: 10.1093/bioinformatics/btl009] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.0] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION Determining orthology relations among genes across multiple genomes is an important problem in the post-genomic era. Identifying orthologous genes can not only help predict functional annotations for newly sequenced or poorly characterized genomes, but can also help predict new protein-protein interactions. Unfortunately, determining orthology relation through computational methods is not straightforward due to the presence of paralogs. Traditional approaches have relied on pairwise sequence comparisons to construct graphs, which were then partitioned into putative clusters of orthologous groups. These methods do not attempt to preserve the non-transitivity and hierarchic nature of the orthology relation. RESULTS We propose a new method, COCO-CL, for hierarchical clustering of homology relations and identification of orthologous groups of genes. Unlike previous approaches, which are based on pairwise sequence comparisons, our method explores the correlation of evolutionary histories of individual genes in a more global context. COCO-CL can be used as a semi-independent method to delineate the orthology/paralogy relation for a refined set of homologous proteins obtained using a less-conservative clustering approach, or as a refiner that removes putative out-paralogs from clusters computed using a more inclusive approach. We analyze our clustering results manually, with support from literature and functional annotations. Since our orthology determination procedure does not employ a species tree to infer duplication events, it can be used in situations when the species tree is unknown or uncertain. CONTACT jothi@mail.nih.gov, przytyck@mail.nih.gov SUPPLEMENTARY INFORMATION Supplementary materials are available at Bioinformatics online.
Affiliation(s)
- Raja Jothi
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.
|
42
|
Fan J, Lefebvre J, Manjunath P. Bovine seminal plasma proteins and their relatives: A new expanding superfamily in mammals. Gene 2006; 375:63-74. [PMID: 16678981 DOI: 10.1016/j.gene.2006.02.025] [Citation(s) in RCA: 52] [Impact Index Per Article: 2.9] [Received: 11/04/2005] [Revised: 02/10/2006] [Accepted: 02/11/2006] [Indexed: 11/17/2022]
Abstract
BSP proteins represent three major proteins of bovine seminal plasma: BSP-A1/-A2, -A3 and -30 kDa. The BSP protein signature is characterized by two tandemly repeated fibronectin type 2 (Fn2) domains. Although classical affinity chromatography and protein sequencing have proven that the BSP protein homologs may be ubiquitous in mammals and functionally related to sperm capacitation, only the three bovine genes have been reported thus far. In this study, we report three new BSP protein-related genes from bovine, as well as other BSP protein-related DNA sequences from human, chimpanzee, mouse, rat, dog, horse and rabbit. Analysis of the relationships between all Fn2 domain-containing proteins revealed that the Fn2 domains found in BSP-related proteins have special features that distinguish them from non-BSP-related proteins. These features can be used to identify new BSP protein-related sequences. Further molecular evolutionary analysis of the BSP protein lineage revealed that all BSP proteins and their related sequences can be grouped into three subfamilies: BSPH4, BSPH5 and BSPH6, which indicates that the BSP protein family is much bigger than previously envisioned. More interestingly, the three BSP proteins in bovine within the BSPH4-subfamily were shown to evolve rapidly. The ratio of nonsynonymous to synonymous substitutions was higher than 1. The analysis also indicated that the rate of evolution was heterogeneous between the first and second Fn2 domains of the genes. These data may reflect that some amino acids in BSP proteins are under a strong positive selection after gene duplication and that each BSP protein evolves rapidly, possibly to acquire new functions.
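The ratio of nonsynonymous to synonymous substitutions mentioned above is commonly written dN/dS, with values above 1 taken as a sign of positive selection. The abstract does not say how the ratio was estimated, so the sketch below is only an illustration: it assumes the counts of differences and sites are already available (for example from a Nei-Gojobori-style count) and applies a Jukes-Cantor correction to each proportion. The function names and the numbers are hypothetical.

```python
import math


def jukes_cantor(p):
    """Correct an observed proportion of differing sites for multiple hits."""
    if not 0 <= p < 0.75:
        raise ValueError("proportion must lie in [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)


def dn_ds(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """Return dN/dS from pre-computed difference and site counts."""
    dn = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    ds = jukes_cantor(syn_diffs / syn_sites)
    if ds == 0:
        raise ValueError("dS is zero; the ratio is undefined")
    return dn / ds


# Hypothetical counts: 30 of 300 nonsynonymous sites differ,
# 5 of 100 synonymous sites differ.
omega = dn_ds(30, 300, 5, 100)
under_positive_selection = omega > 1
```

A ratio computed over a whole gene averages across sites, so a value above 1 for the gene as a whole is a fairly strict criterion; site-specific methods would be needed to locate the individual amino acids under selection.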
Affiliation(s)
- Jinjiang Fan
- Guy-Bernier Research Center, Maisonneuve-Rosemont Hospital, Montreal, Québec, Canada H1T 2M4
|
43
|
Wu M, Ren Q, Durkin AS, Daugherty SC, Brinkac LM, Dodson RJ, Madupu R, Sullivan SA, Kolonay JF, Nelson WC, Tallon LJ, Jones KM, Ulrich LE, Gonzalez JM, Zhulin IB, Robb FT, Eisen JA. Life in hot carbon monoxide: the complete genome sequence of Carboxydothermus hydrogenoformans Z-2901. PLoS Genet 2005; 1:e65. [PMID: 16311624 PMCID: PMC1287953 DOI: 10.1371/journal.pgen.0010065] [Citation(s) in RCA: 177] [Impact Index Per Article: 9.3] [Received: 08/08/2005] [Accepted: 10/19/2005] [Indexed: 11/20/2022] Open
Abstract
We report here the sequencing and analysis of the genome of the thermophilic bacterium Carboxydothermus hydrogenoformans Z-2901. This species is a model for studies of hydrogenogens, which are diverse bacteria and archaea that grow anaerobically utilizing carbon monoxide (CO) as their sole carbon source and water as an electron acceptor, producing carbon dioxide and hydrogen as waste products. Organisms that make use of CO do so through carbon monoxide dehydrogenase complexes. Remarkably, analysis of the genome of C. hydrogenoformans reveals the presence of at least five highly differentiated anaerobic carbon monoxide dehydrogenase complexes, which may in part explain how this species is able to grow so much more rapidly on CO than many other species. Analysis of the genome also has provided many general insights into the metabolism of this organism which should make it easier to use it as a source of biologically produced hydrogen gas. One surprising finding is the presence of many genes previously found only in sporulating species in the Firmicutes Phylum. Although this species is also a Firmicutes, it was not known to sporulate previously. Here we show that it does sporulate and because it is missing many of the genes involved in sporulation in other species, this organism may serve as a “minimal” model for sporulation studies. In addition, using phylogenetic profile analysis, we have identified many uncharacterized gene families found in all known sporulating Firmicutes, but not in any non-sporulating bacteria, including a sigma factor not known to be involved in sporulation previously. Carboxydothermus hydrogenoformans, a bacterium isolated from a Russian hotspring, is studied for three major reasons: it grows at very high temperature, it lives almost entirely on a diet of carbon monoxide (CO), and it converts water to hydrogen gas as part of its metabolism. 
Understanding this organism's unique biology gets a boost from the decoding of its genome, reported in this issue of PLoS Genetics. For example, genome analysis reveals that it encodes five different forms of the protein machine carbon monoxide dehydrogenase (CODH). Most species have no CODH and even species that utilize CO usually have only one or two. The five CODH in C. hydrogenoformans likely allow it to both use CO for diverse cellular processes and out-compete for it when it is limiting. The genome sequence also led the researchers to experimentally document new aspects of this species' biology including the ability to form spores. The researchers then used comparative genomic analysis to identify conserved genes found in all spore-forming species, including Bacillus anthracis, and not in any other species. Finally, the genome sequence and analysis reported here will aid in those trying to develop this and other species into systems to biologically produce hydrogen gas from water.
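The phylogenetic profile analysis described above looks for gene families present in every sporulating genome and absent from every non-sporulating one. A minimal sketch of that filter follows, assuming presence and absence calls are already available as sets of genome names per family; the function name, family labels, and genome labels are hypothetical and do not come from the paper.

```python
def signature_families(profiles, trait_positive, trait_negative):
    """Return families found in all trait-positive genomes and no trait-negative ones.

    profiles: dict mapping a gene family name to the set of genomes that encode it.
    """
    positive = set(trait_positive)
    negative = set(trait_negative)
    return sorted(
        family
        for family, genomes in profiles.items()
        # Present in every positive genome and in none of the negative ones.
        if positive <= genomes and not (genomes & negative)
    )


# Hypothetical toy profiles over four placeholder genomes.
sporulating = {"sporulator_A", "sporulator_B"}
non_sporulating = {"nonsporulator_C", "nonsporulator_D"}
profiles = {
    "family_1": {"sporulator_A", "sporulator_B"},                      # signature
    "family_2": {"sporulator_A", "sporulator_B", "nonsporulator_C"},   # too broad
    "family_3": {"sporulator_A"},                                      # incomplete
}
candidates = signature_families(profiles, sporulating, non_sporulating)
```

In practice the result depends heavily on how families are defined and on the completeness of each genome, so a filter this strict is best treated as a way to shortlist candidates for closer inspection.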
Affiliation(s)
- Martin Wu
- The Institute for Genomic Research, Rockville, Maryland, United States of America
- Qinghu Ren
- The Institute for Genomic Research, Rockville, Maryland, United States of America
- A. Scott Durkin
- The Institute for Genomic Research, Rockville, Maryland, United States of America
- Sean C Daugherty
- The Institute for Genomic Research, Rockville, Maryland, United States of America
- Lauren M Brinkac
- The Institute for Genomic Research, Rockville, Maryland, United States of America
- Robert J Dodson
- The Institute for Genomic Research, Rockville, Maryland, United States of America
- Ramana Madupu
- The Institute for Genomic Research, Rockville, Maryland, United States of America
- Steven A Sullivan
- The Institute for Genomic Research, Rockville, Maryland, United States of America
- James F Kolonay
- The Institute for Genomic Research, Rockville, Maryland, United States of America
- William C Nelson
- The Institute for Genomic Research, Rockville, Maryland, United States of America
- Luke J Tallon
- The Institute for Genomic Research, Rockville, Maryland, United States of America
- Kristine M Jones
- The Institute for Genomic Research, Rockville, Maryland, United States of America
- Luke E Ulrich
- Center for Bioinformatics and Computational Biology, School of Biology, Georgia Institute of Technology, Atlanta, Georgia, United States of America
- Juan M Gonzalez
- Center of Marine Biotechnology, University of Maryland Biotechnology Institute, Baltimore, Maryland, United States of America
- Igor B Zhulin
- Center for Bioinformatics and Computational Biology, School of Biology, Georgia Institute of Technology, Atlanta, Georgia, United States of America
- Frank T Robb
- Center of Marine Biotechnology, University of Maryland Biotechnology Institute, Baltimore, Maryland, United States of America
- Jonathan A Eisen
- The Institute for Genomic Research, Rockville, Maryland, United States of America
- Johns Hopkins University, Baltimore, Maryland, United States of America
- * To whom correspondence should be addressed. E-mail:
|
44
|
Fuellen G, Spitzer M, Cullen P, Lorkowski S. Correspondence of function and phylogeny of ABC proteins based on an automated analysis of 20 model protein data sets. Proteins 2005; 61:888-99. [PMID: 16254912 DOI: 10.1002/prot.20616] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.9] [Indexed: 11/11/2022]
Abstract
Using our BLAST-based procedure RiPE (Retrieval-induced Phylogeny Environment), which automates the evolutionary analysis of a protein family, we assembled a set of 1138 ABC protein components [adenosine triphosphate (ATP)-binding cassette and transmembrane domain] from the protein data sets of 20 model organisms and subjected them to phylogenetic and functional analysis. For maximum speed, we based the alignment directly on a homology search with a profile of all known human ABC proteins and used neighbor-joining tree estimation. All but 11 sequences from Homo sapiens, Arabidopsis thaliana, Drosophila melanogaster, and Saccharomyces cerevisiae were placed into the correct subtree/subfamily, reproducing published classifications of the individual organisms. By following a simple "function transfer rule", our comparative phylogenetic analysis successfully predicted the known function of human ABC proteins in 19 of 22 cases. Three functional predictions did not correspond, and 10 were novel. Predictions based on BLAST alone were inferior in five cases and superior in two. Bacterial sequences were placed close to the root of most subtrees. This placement coincides with domain architecture, suggesting an early diversification of the ABC family before the kingdoms split apart. Our approach can, in principle, be used to annotate any protein family of any organism included in the study.
Affiliation(s)
- Georg Fuellen
- Department of Medicine, AG Bioinformatics, University of Münster, Münster, Germany.
|
45
|
Krishnamurthy N, Sjölander K. Phylogenomic Inference of Protein Molecular Function. Curr Protoc Bioinformatics 2005; Chapter 6:Unit 6.9. [DOI: 10.1002/0471250953.bi0609s11] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/05/2022]
|
46
|
Francke C, Siezen RJ, Teusink B. Reconstructing the metabolic network of a bacterium from its genome. Trends Microbiol 2005; 13:550-8. [PMID: 16169729 DOI: 10.1016/j.tim.2005.09.001] [Citation(s) in RCA: 112] [Impact Index Per Article: 5.9] [Received: 05/05/2005] [Revised: 08/25/2005] [Accepted: 09/08/2005] [Indexed: 10/25/2022]
Abstract
The prospect of understanding the relationship between the genome and the physiology of an organism is an important incentive to reconstruct metabolic networks. The first steps in the process can be automated and it does not take much effort to obtain an initial metabolic reconstruction from a genome sequence. However, such a reconstruction is certainly not flawless and correction of the many imperfections is laborious. It requires the combined analysis of the available information on protein sequence, phylogeny, gene-context and co-occurrence but is also aided by high-throughput experimental data. Simultaneously, the reconstructed network provides the opportunity to visualize the "omics" data within a relevant biological functional context and thus aids the interpretation of those data.
Affiliation(s)
- Christof Francke
- Wageningen Centre for Food Sciences, PO Box 557, 6700 AN Wageningen, the Netherlands.
|
47
|
Ibrahim RK. A forty-year journey in plant research: original contributions to flavonoid biochemistry. Can J Bot 2005. [DOI: 10.1139/b05-030] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.2] [Indexed: 11/22/2022]
Abstract
This review highlights original contributions by the author to the field of flavonoid biochemistry during his research career of more than four decades. These include elucidation of novel aspects of some of the common enzymatic reactions involved in the later steps of flavonoid biosynthesis, with emphasis on methyltransferases, glucosyltransferases, sulfotransferases, and an oxoglutarate-dependent dioxygenase, as well as cloning, and inferences about phylogenetic relationships, of the genes encoding some of these enzymes. The three-dimensional structure of a flavonol O-methyltransferase was studied through homology-based modeling, using a caffeic acid O-methyltransferase as a template, to explain their strict substrate preferences. In addition, the biological significance of enzymatic prenylation of isoflavones, as well as their role as phytoanticipins and inducers of nodulation genes, are emphasized. Finally, the potential application of knowledge about the genes encoding these enzyme reactions is discussed in terms of improving plant productivity and survival, modification of flavonoid profiles, and the search for new compounds with pharmaceutical and (or) nutraceutical value. Key words: flavonoid enzymology, metabolite localization, gene cloning, 3-D structure, phylogeny.
|
48
|
Ward N, Larsen Ø, Sakwa J, Bruseth L, Khouri H, Durkin AS, Dimitrov G, Jiang L, Scanlan D, Kang KH, Lewis M, Nelson KE, Methé B, Wu M, Heidelberg JF, Paulsen IT, Fouts D, Ravel J, Tettelin H, Ren Q, Read T, DeBoy RT, Seshadri R, Salzberg SL, Jensen HB, Birkeland NK, Nelson WC, Dodson RJ, Grindhaug SH, Holt I, Eidhammer I, Jonasen I, Vanaken S, Utterback T, Feldblyum TV, Fraser CM, Lillehaug JR, Eisen JA. Genomic insights into methanotrophy: the complete genome sequence of Methylococcus capsulatus (Bath). PLoS Biol 2004; 2:e303. [PMID: 15383840 PMCID: PMC517821 DOI: 10.1371/journal.pbio.0020303] [Citation(s) in RCA: 204] [Impact Index Per Article: 10.2] [Received: 02/23/2004] [Accepted: 07/14/2004] [Indexed: 11/23/2022] Open
Abstract
Methanotrophs are ubiquitous bacteria that can use the greenhouse gas methane as a sole carbon and energy source for growth, thus playing major roles in global carbon cycles, and in particular, substantially reducing emissions of biologically generated methane to the atmosphere. Despite their importance, and in contrast to organisms that play roles in other major parts of the carbon cycle such as photosynthesis, no genome-level studies have been published on the biology of methanotrophs. We report the first complete genome sequence to our knowledge from an obligate methanotroph, Methylococcus capsulatus (Bath), obtained by the shotgun sequencing approach. Analysis revealed a 3.3-Mb genome highly specialized for a methanotrophic lifestyle, including redundant pathways predicted to be involved in methanotrophy and duplicated genes for essential enzymes such as the methane monooxygenases. We used phylogenomic analysis, gene order information, and comparative analysis with the partially sequenced methylotroph Methylobacterium extorquens to detect genes of unknown function likely to be involved in methanotrophy and methylotrophy. Genome analysis suggests the ability of M. capsulatus to scavenge copper (including a previously unreported nonribosomal peptide synthetase) and to use copper in regulation of methanotrophy, but the exact regulatory mechanisms remain unclear. One of the most surprising outcomes of the project is evidence suggesting the existence of previously unsuspected metabolic flexibility in M. capsulatus, including an ability to grow on sugars, oxidize chemolithotrophic hydrogen and sulfur, and live under reduced oxygen tension, all of which have implications for methanotroph ecology. The availability of the complete genome of M. capsulatus (Bath) deepens our understanding of methanotroph biology and its relationship to global carbon cycles. 
We have gained evidence for greater metabolic flexibility than was previously known, and for genetic components that may have biotechnological potential.
Affiliation(s)
- Naomi Ward
- The Institute for Genomic Research, Rockville, Maryland, USA.
|
49
|
Premzl M, Gready JE, Jermiin LS, Simonic T, Marshall Graves JA. Evolution of vertebrate genes related to prion and Shadoo proteins--clues from comparative genomic analysis. Mol Biol Evol 2004; 21:2210-31. [PMID: 15342797 DOI: 10.1093/molbev/msh245] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.2] [Indexed: 02/07/2023] Open
Abstract
Recent findings of new genes in fish related to the prion protein (PrP) gene PRNP, including our recent report of SPRN coding for Shadoo (Sho) protein found also in mammals, raise issues of their function and evolution. Here we report additional novel fish genes found in public databases, including a duplicated SPRN gene, SPRNB, in Fugu, Tetraodon, carp, and zebrafish encoding the Sho2 protein, and we use comparative genomic analysis to analyze the evolutionary relationships and to infer evolutionary trajectories of the complete data set. Phylogenetic footprinting performed on aligned human, mouse, and Fugu SPRN genes to define candidate regulatory promoter regions, detected 16 conserved motifs, three of which are known transcription factor-binding sites for a receptor and transcription factors specific to or associated with expression in brain. This result and other homology-based (VISTA global genomic alignment; protein sequence alignment and phylogenetics) and context-dependent (genomic context; relative gene order and orientation) criteria indicate fish and mammalian SPRN genes are orthologous and suggest a strongly conserved basic function in brain. Whereas tetrapod PRNPs share context with the analogous stPrP-2-coding gene in fish, their sequences are diverged, suggesting that the tetrapod and fish genes are likely to have significantly different functions. Phylogenetic analysis predicts the SPRN/SPRNB duplication occurred before divergence of fish from tetrapods, whereas that of stPrP-1 and stPrP-2 occurred in fish. Whereas Sho appears to have a conserved function in vertebrate brain, PrP seems to have an adaptive role fine-tuned in a lineage-specific fashion. An evolutionary model consistent with our findings and literature knowledge is proposed that has an ancestral prevertebrate SPRN-like gene leading to all vertebrate PrP-related and Sho-related genes. 
This provides a new framework for exploring the evolution of this unusual family of proteins and for searching for members in other fish branches and intermediate vertebrate groups.
Affiliation(s)
- Marko Premzl
- Computational Proteomics Group, John Curtin School of Medical Research, Australian National University, Canberra, Australia
|
50
|
Wu M, Sun LV, Vamathevan J, Riegler M, Deboy R, Brownlie JC, McGraw EA, Martin W, Esser C, Ahmadinejad N, Wiegand C, Madupu R, Beanan MJ, Brinkac LM, Daugherty SC, Durkin AS, Kolonay JF, Nelson WC, Mohamoud Y, Lee P, Berry K, Young MB, Utterback T, Weidman J, Nierman WC, Paulsen IT, Nelson KE, Tettelin H, O'Neill SL, Eisen JA. Phylogenomics of the reproductive parasite Wolbachia pipientis wMel: a streamlined genome overrun by mobile genetic elements. PLoS Biol 2004; 2:E69. [PMID: 15024419 PMCID: PMC368164 DOI: 10.1371/journal.pbio.0020069] [Citation(s) in RCA: 587] [Impact Index Per Article: 29.4] [Received: 11/19/2003] [Accepted: 01/06/2004] [Indexed: 12/17/2022] Open
Abstract
The complete sequence of the 1,267,782 bp genome of Wolbachia pipientis wMel, an obligate intracellular bacteria of Drosophila melanogaster, has been determined. Wolbachia, which are found in a variety of invertebrate species, are of great interest due to their diverse interactions with different hosts, which range from many forms of reproductive parasitism to mutualistic symbioses. Analysis of the wMel genome, in particular phylogenomic comparisons with other intracellular bacteria, has revealed many insights into the biology and evolution of wMel and Wolbachia in general. For example, the wMel genome is unique among sequenced obligate intracellular species in both being highly streamlined and containing very high levels of repetitive DNA and mobile DNA elements. This observation, coupled with multiple evolutionary reconstructions, suggests that natural selection is somewhat inefficient in wMel, most likely owing to the occurrence of repeated population bottlenecks. Genome analysis predicts many metabolic differences with the closely related Rickettsia species, including the presence of intact glycolysis and purine synthesis, which may compensate for an inability to obtain ATP directly from its host, as Rickettsia can. Other discoveries include the apparent inability of wMel to synthesize lipopolysaccharide and the presence of the most genes encoding proteins with ankyrin repeat domains of any prokaryotic genome yet sequenced. Despite the ability of wMel to infect the germline of its host, we find no evidence for either recent lateral gene transfer between wMel and D. melanogaster or older transfers between Wolbachia and any host. Evolutionary analysis further supports the hypothesis that mitochondria share a common ancestor with the α-Proteobacteria, but shows little support for the grouping of mitochondria with species in the order Rickettsiales. With the availability of the complete genomes of both species and excellent genetic tools for the host, the wMel–D. melanogaster symbiosis is now an ideal system for studying the biology and evolution of Wolbachia infections. The genome sequence of Wolbachia provides insights into the origins of mitochondria, as well as the ecology and evolution of endosymbiosis.
Affiliation(s)
- Martin Wu
- 1 The Institute for Genomic Research, Rockville, Maryland, United States of America
- Ling V Sun
- 2 Department of Epidemiology and Public Health, Yale University School of Medicine, New Haven, Connecticut, United States of America
- Jessica Vamathevan
- 1 The Institute for Genomic Research, Rockville, Maryland, United States of America
- Markus Riegler
- 3 Department of Zoology and Entomology, School of Life Sciences, The University of Queensland, St Lucia, Queensland, Australia
- Robert Deboy
- 1 The Institute for Genomic Research, Rockville, Maryland, United States of America
- Jeremy C Brownlie
- 3 Department of Zoology and Entomology, School of Life Sciences, The University of Queensland, St Lucia, Queensland, Australia
- Elizabeth A McGraw
- 3 Department of Zoology and Entomology, School of Life Sciences, The University of Queensland, St Lucia, Queensland, Australia
- William Martin
- 4 Institut für Botanik III, Heinrich-Heine Universität, Düsseldorf, Germany
- Christian Esser
- 4 Institut für Botanik III, Heinrich-Heine Universität, Düsseldorf, Germany
- Nahal Ahmadinejad
- 4 Institut für Botanik III, Heinrich-Heine Universität, Düsseldorf, Germany
- Christian Wiegand
- 4 Institut für Botanik III, Heinrich-Heine Universität, Düsseldorf, Germany
- Ramana Madupu
- 1 The Institute for Genomic Research, Rockville, Maryland, United States of America
- Maureen J Beanan
- 1 The Institute for Genomic Research, Rockville, Maryland, United States of America
- Lauren M Brinkac
- 1 The Institute for Genomic Research, Rockville, Maryland, United States of America
- Sean C Daugherty
- 1 The Institute for Genomic Research, Rockville, Maryland, United States of America
- A. Scott Durkin
- 1 The Institute for Genomic Research, Rockville, Maryland, United States of America
- James F Kolonay
- 1 The Institute for Genomic Research, Rockville, Maryland, United States of America
- William C Nelson
- 1 The Institute for Genomic Research, Rockville, Maryland, United States of America
- Yasmin Mohamoud
- 1 The Institute for Genomic Research, Rockville, Maryland, United States of America
- Perris Lee
- 1 The Institute for Genomic Research, Rockville, Maryland, United States of America
- Kristi Berry
- 1 The Institute for Genomic Research, Rockville, Maryland, United States of America
- M. Brook Young
- 1 The Institute for Genomic Research, Rockville, Maryland, United States of America
- Teresa Utterback
- 1 The Institute for Genomic Research, Rockville, Maryland, United States of America
- Janice Weidman
- 1 The Institute for Genomic Research, Rockville, Maryland, United States of America
- William C Nierman
- 1 The Institute for Genomic Research, Rockville, Maryland, United States of America
- Ian T Paulsen
- 1 The Institute for Genomic Research, Rockville, Maryland, United States of America
- Karen E Nelson
- 1 The Institute for Genomic Research, Rockville, Maryland, United States of America
- Hervé Tettelin
- 1 The Institute for Genomic Research, Rockville, Maryland, United States of America
- Scott L O'Neill
- 2 Department of Epidemiology and Public Health, Yale University School of Medicine, New Haven, Connecticut, United States of America
- 3 Department of Zoology and Entomology, School of Life Sciences, The University of Queensland, St Lucia, Queensland, Australia
- Jonathan A Eisen
- 1 The Institute for Genomic Research, Rockville, Maryland, United States of America
|