1
Nishihara A, Tsukatani Y, Azai C, Nobu MK. Illuminating the coevolution of photosynthesis and Bacteria. Proc Natl Acad Sci U S A 2024; 121:e2322120121. [PMID: 38875151 PMCID: PMC11194577 DOI: 10.1073/pnas.2322120121] [Received: 12/18/2023] [Accepted: 05/06/2024] [Indexed: 06/16/2024]
Abstract
Life harnessing light energy transformed the relationship between biology and Earth, bringing a massive flux of organic carbon and oxidants to Earth's surface that gave way to today's organotrophy- and respiration-dominated biosphere. However, our understanding of how life drove this transition has largely relied on the geological record; much remains unresolved due to the complexity and paucity of the genetic record tied to photosynthesis. Here, through holistic phylogenetic comparison of the bacterial domain and all photosynthetic machinery (spanning >10,000 genomes in total), we identify evolutionary congruence between three independent biological systems, namely bacteria, (bacterio)chlorophyll-mediated light metabolism (chlorophototrophy), and carbon fixation, and uncover their intertwined history. Our analyses uniformly mapped progenitors of extant light-metabolizing machinery (reaction centers, [bacterio]chlorophyll synthases, and magnesium-chelatases) and enzymes facilitating the Calvin-Benson-Bassham cycle (form I RuBisCO and phosphoribulokinase) to the same ancient Terrabacteria organism near the base of the bacterial domain. These phylogenies consistently showed that extant phototrophs ultimately derived light metabolism from this bacterium, the last phototroph common ancestor (LPCA). LPCA was a non-oxygen-generating (anoxygenic) phototroph that already possessed carbon fixation and two reaction centers, a type I analogous to extant forms and a primitive type II. Analyses also indicate chlorophototrophy originated before LPCA. We further reconstructed evolution of chlorophototrophs/chlorophototrophy post-LPCA, including vertical inheritance in Terrabacteria, the rise of oxygen-generating chlorophototrophy in one descendant branch near the Great Oxidation Event, and subsequent emergence of Cyanobacteria. These collectively unveil a detailed view of the coevolution of light metabolism and Bacteria having clear congruence with the geological record.
Affiliation(s)
- Arisa Nishihara
  - Department of Life Science and Biotechnology, The National Institute of Advanced Industrial Science and Technology, Ibaraki 305-0817, Japan
- Yusuke Tsukatani
  - Biogeochemistry Research Center, Japan Agency for Marine-Earth Science and Technology (JAMSTEC), Kanagawa 237-0061, Japan
  - Institute for Extra-Cutting-Edge Science and Technology Avant-Garde Research (X-star), Japan Agency for Marine-Earth Science and Technology (JAMSTEC), Kanagawa 237-0061, Japan
- Chihiro Azai
  - College of Life Sciences, Ritsumeikan University, Shiga 525-8577, Japan
  - Department of Biological Sciences, Faculty of Science and Engineering, Chuo University, Tokyo 112-8551, Japan
- Masaru K. Nobu
  - Department of Life Science and Biotechnology, The National Institute of Advanced Industrial Science and Technology, Ibaraki 305-0817, Japan
  - Institute for Extra-Cutting-Edge Science and Technology Avant-Garde Research (X-star), Japan Agency for Marine-Earth Science and Technology (JAMSTEC), Kanagawa 237-0061, Japan
2
Cannon EK, Portwood JL, Hayford RK, Haley OC, Gardiner JM, Andorf CM, Woodhouse MR. Enhanced pan-genomic resources at the maize genetics and genomics database. Genetics 2024; 227:iyae036. [PMID: 38577974 DOI: 10.1093/genetics/iyae036] [Received: 11/13/2023] [Accepted: 01/13/2024] [Indexed: 04/06/2024]
Abstract
Pan-genomes, encompassing the entirety of genetic sequences found in a collection of genomes within a clade, are more useful than single reference genomes for studying species diversity. This is especially true for a species like Zea mays, which has a particularly diverse and complex genome. Presenting pan-genome data, analyses, and visualization is challenging, especially for a diverse species, but more so when pan-genomic data is linked to extensive gene model and gene data, including classical gene information, markers, insertions, expression and proteomic data, and protein structures, as is the case at MaizeGDB. Here, we describe MaizeGDB's expansion to include the genic subset of the Zea pan-genome in a pan-gene data center featuring the maize genomes hosted at MaizeGDB, and the outgroup teosinte Zea genomes from the Pan-Andropogoneae project. The new data center offers a variety of browsing and visualization tools, including sequence alignment visualization, gene trees and other tools, to explore pan-genes in Zea that were calculated by the pipeline Pandagma. Combined, these data will help maize researchers study the complexity and diversity of Zea, and use the comparative functions to validate pan-gene relationships for a selected gene model.
Affiliation(s)
- Ethalinda K Cannon
  - USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, USA
- John L Portwood
  - USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, USA
- Rita K Hayford
  - USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, USA
- Olivia C Haley
  - USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, USA
- Jack M Gardiner
  - Division of Animal Sciences, University of Missouri, Columbia, MO 65211, USA
- Carson M Andorf
  - USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, USA
  - Department of Computer Science, Iowa State University, Ames, IA 50011, USA
3
Lam WYS, Kwong E, Chan HWT, Zheng YP. Using Sequence Analyses to Quantitatively Measure Oropharyngeal Swallowing Temporality in Point-of-Care Ultrasound Examinations: A Pilot Study. J Clin Med 2024; 13:2288. [PMID: 38673561 PMCID: PMC11051012 DOI: 10.3390/jcm13082288] [Received: 03/04/2024] [Revised: 04/02/2024] [Accepted: 04/12/2024] [Indexed: 04/28/2024]
Abstract
(1) Background: Swallowing is a complex process that comprises well-timed control of oropharyngeal and laryngeal structures to achieve airway protection and swallowing efficiency. To understand its temporality, previous research adopted adherence measures and revealed obligatory pairs in healthy swallows and the effect of aging and bolus type on the variability of event timing and order. This study aimed to (i) propose a systemic conceptualization of swallowing physiology, (ii) apply sequence analyses, a set of information-theoretic and bioinformatic methods, to quantify and characterize swallowing temporality, and (iii) investigate the effect of aging and dysphagia on the quantified variables using sequence analyses measures. (2) Method: Forty-three participants (17 young adults, 15 older adults, and 11 dysphagic adults) underwent B-mode ultrasound swallowing examinations at the mid-sagittal plane of the submental region. The onset, maximum, and offset states of hyoid bone displacement, geniohyoid muscle contraction, and tongue base retraction were identified and sorted to form sequences which were analyzed using an inventory of sequence analytic techniques; namely, overlap coefficients, Shannon entropy, and longest common subsequence algorithms. (3) Results: The concurrency of movement sequence was found to be significantly impacted by aging and dysphagia. Swallowing sequence variability was also found to be reduced with age and the presence of dysphagia (H(2) = 52.253, p < 0.001, η2 = 0.260). Four obligatory sequences were identified, and high adherence was also indicated in two previously reported pairs. These results provided preliminary support for the validity of sequence analyses for quantifying swallowing sequence temporality. (4) Conclusions: A systemic conceptualization of human deglutition permits a multi-level quantitative analysis of swallowing physiology. 
Sequence analyses are a set of promising quantitative measurement techniques for point-of-care ultrasound (POCUS) swallowing examinations and outcome measures for swallowing rehabilitation and evaluation of associated physiological conditions, such as sarcopenia. Findings in the current study revealed physiological differences among healthy young, healthy older, and dysphagic adults. They also helped lay the groundwork for future AI-assisted dysphagia assessment and outcome measures using POCUSs. Arguably, the proposed conceptualization and analyses are also modality-independent measures that can potentially be generalized for other instrumental swallowing assessment modalities.
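The three sequence-analytic measures named in the Method section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the study's implementation: the abstract does not say how each measure was applied, so the function names, the set-based form of the overlap coefficient, and the event labels in the usage note are all hypothetical.

```python
from collections import Counter
from math import log2


def overlap_coefficient(a, b):
    """Overlap coefficient of two collections: |A & B| / min(|A|, |B|)."""
    a, b = set(a), set(b)
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def shannon_entropy(items):
    """Shannon entropy (in bits) of the empirical distribution of items."""
    counts = Counter(items)
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())


def lcs_length(x, y):
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(y) + 1)
    for xi in x:
        curr = [0]
        for j, yj in enumerate(y, start=1):
            if xi == yj:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]
```

As a usage illustration with made-up event labels, two swallows ordered as `["hyoid_onset", "genio_onset", "tongue_onset"]` and `["genio_onset", "hyoid_onset", "tongue_onset"]` share a longest common subsequence of length 2, while the entropy of the observed orderings across many swallows would grow as the orderings become more variable.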
Affiliation(s)
- Wilson Yiu Shun Lam
  - Department of Chinese and Bilingual Studies, The Hong Kong Polytechnic University, Hong Kong SAR, China
  - Research Institute for Smart Ageing, The Hong Kong Polytechnic University, Hong Kong SAR, China
- Elaine Kwong
  - Department of Chinese and Bilingual Studies, The Hong Kong Polytechnic University, Hong Kong SAR, China
  - Research Institute for Smart Ageing, The Hong Kong Polytechnic University, Hong Kong SAR, China
- Huberta Wai Tung Chan
  - Department of Chinese and Bilingual Studies, The Hong Kong Polytechnic University, Hong Kong SAR, China
- Yong-Ping Zheng
  - Research Institute for Smart Ageing, The Hong Kong Polytechnic University, Hong Kong SAR, China
  - Department of Biomedical Engineering, The Hong Kong Polytechnic University, Hong Kong SAR, China
4
Kahne SC, Yoo JH, Chen J, Nakedi K, Iyer LM, Putzel G, Samhadaneh NM, Pironti A, Aravind L, Ekiert DC, Bhabha G, Rhee KY, Darwin KH. Identification of a proteolysis regulator for an essential enzyme in Mycobacterium tuberculosis. bioRxiv 2024:2024.03.29.587195. [PMID: 38585835 PMCID: PMC10996600 DOI: 10.1101/2024.03.29.587195] [Indexed: 04/09/2024]
Abstract
In Mycobacterium tuberculosis, proteins that are post-translationally modified with Pup, a prokaryotic ubiquitin-like protein, can be degraded by proteasomes. While pupylation is reversible, mechanisms regulating substrate specificity have not been identified. Here, we identify the first depupylation regulators: CoaX, a pseudokinase, and pantothenate, an essential, central metabolite. In a ΔcoaX mutant, pantothenate synthesis enzymes were more abundant, including PanB, a substrate of the Pup-proteasome system. Media supplementation with pantothenate decreased PanB levels in a coaX- and Pup-proteasome-dependent manner. In vitro, CoaX accelerated depupylation of Pup∼PanB, while addition of pantothenate inhibited this reaction. Collectively, we propose CoaX contributes to proteasomal degradation of PanB by modulating depupylation of Pup∼PanB in response to pantothenate levels. One Sentence Summary: A pseudo-pantothenate kinase regulates proteasomal degradation of a pantothenate synthesis enzyme in M. tuberculosis.
5
Updegrove TB, Delerue T, Anantharaman V, Cho H, Chan C, Nipper T, Choo-Wosoba H, Jenkins LM, Zhang L, Su Y, Shroff H, Chen J, Bewley CA, Aravind L, Ramamurthi KS. Altruistic feeding and cell-cell signaling during bacterial differentiation actively enhance phenotypic heterogeneity. bioRxiv 2024:2024.03.27.587046. [PMID: 38903092 PMCID: PMC11188070 DOI: 10.1101/2024.03.27.587046] [Indexed: 06/22/2024]
Abstract
Starvation triggers bacterial spore formation, a committed differentiation program that transforms a vegetative cell into a dormant spore. Cells in a population enter sporulation non-uniformly to secure against the possibility that favorable growth conditions, which puts sporulation-committed cells at a disadvantage, may resume. This heterogeneous behavior is initiated by a passive mechanism: stochastic activation of a master transcriptional regulator. Here, we identify a cell-cell communication pathway that actively promotes phenotypic heterogeneity, wherein Bacillus subtilis cells that start sporulating early utilize a calcineurin-like phosphoesterase to release glycerol, which simultaneously acts as a signaling molecule and a nutrient to delay non-sporulating cells from entering sporulation. This produced a more diverse population that was better poised to exploit a sudden influx of nutrients compared to those generating heterogeneity via stochastic gene expression alone. Although conflict systems are prevalent among microbes, genetically encoded cooperative behavior in unicellular organisms can evidently also boost inclusive fitness.
Affiliation(s)
- Taylor B. Updegrove
  - Laboratory of Molecular Biology, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- Thomas Delerue
  - Laboratory of Molecular Biology, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- Vivek Anantharaman
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Hyomoon Cho
  - Laboratory of Bioorganic Chemistry, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD, USA
- Carissa Chan
  - Laboratory of Molecular Biology, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- Thomas Nipper
  - Laboratory of Molecular Biology, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- Hyoyoung Choo-Wosoba
  - Biostatistics and Data Management Support Section, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- Lisa M. Jenkins
  - Laboratory of Cell Biology, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- Lixia Zhang
  - Advanced Imaging and Microscopy Resource, National Institutes of Health, Bethesda, MD, USA
- Yijun Su
  - Laboratory of High Resolution Optical Imaging, National Institute of Biomedical Imaging and Bioengineering, National Institutes of Health, Bethesda, MD, USA
  - Janelia Farm Research Campus, Howard Hughes Medical Institute (HHMI), Ashburn, VA, USA
- Hari Shroff
  - Laboratory of High Resolution Optical Imaging, National Institute of Biomedical Imaging and Bioengineering, National Institutes of Health, Bethesda, MD, USA
  - Janelia Farm Research Campus, Howard Hughes Medical Institute (HHMI), Ashburn, VA, USA
- Jiji Chen
  - Advanced Imaging and Microscopy Resource, National Institutes of Health, Bethesda, MD, USA
- Carole A. Bewley
  - Laboratory of Bioorganic Chemistry, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD, USA
- L. Aravind
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Kumaran S. Ramamurthi
  - Laboratory of Molecular Biology, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
6
Chiteri KO, Rairdin A, Sandhu K, Redsun S, Farmer A, O'Rourke JA, Cannon SB, Singh A. Combining GWAS and comparative genomics to fine map candidate genes for days to flowering in mung bean. BMC Genomics 2024; 25:270. [PMID: 38475739 DOI: 10.1186/s12864-024-10156-x] [Received: 07/20/2023] [Accepted: 02/22/2024] [Indexed: 03/14/2024]
Abstract
BACKGROUND Mung bean (Vigna radiata (L.) Wilczek) is an important pulse crop in the global south. Early flowering and maturation are advantageous traits for adaptation to northern and southern latitudes. This study investigates the genetic basis of the Days-to-Flowering trait (DTF) in mung bean, combining genome-wide association studies (GWAS) in mung bean and comparisons with orthologous genes involved with control of DTF responses in soybean (Glycine max (L.) Merr.) and Arabidopsis (Arabidopsis thaliana). RESULTS The most significant associations for DTF were on mung bean chromosomes 1, 2, and 4. Only the SNPs on chromosomes 1 and 4 were heavily investigated using downstream analysis. The chromosome 1 DTF association is tightly linked with a cluster of locally duplicated FERONIA (FER) receptor-like protein kinase genes, and the SNP occurs within one of the FERONIA genes. In Arabidopsis, an orthologous FERONIA gene (AT3G51550) has been reported to regulate the expression of FLOWERING LOCUS C (FLC). For the chromosome 4 DTF locus, the strongest candidates are Vradi04g00002773 and Vradi04g00002778, orthologous to the Arabidopsis PhyA and PIF3 genes, encoding phytochrome A (a photoreceptor protein sensitive to red to far-red light) and phytochrome-interacting factor 3, respectively. The soybean PhyA orthologs include the classical loci E3 and E4 (genes GmPhyA3, Glyma.19G224200, and GmPhyA2, Glyma.20G090000). The mung bean PhyA ortholog has been previously reported as a candidate for DTF in studies conducted in South Korea. CONCLUSION The top two identified SNPs accounted for a significant proportion (~65%) of the phenotypic variability in mung bean DTF explained by the six significant SNPs (39.61%), with a broad-sense heritability of 0.93. The strong associations of DTF with genes that have orthologs with analogous functions in soybean and Arabidopsis provide strong circumstantial evidence that these genes are causal for this trait. The three reported loci and candidate genes provide useful targets for marker-assisted breeding in mung beans.
Affiliation(s)
- Kevin O Chiteri
  - Department of Agronomy, Iowa State University, Ames, IA, United States
- Ashlyn Rairdin
  - Department of Agronomy, Iowa State University, Ames, IA, United States
- Sven Redsun
  - National Center for Genome Resources, Santa Fe, NM, 87505, United States
- Andrew Farmer
  - National Center for Genome Resources, Santa Fe, NM, 87505, United States
- Jamie A O'Rourke
  - Department of Agronomy, Iowa State University, Ames, IA, United States
  - USDA - Agricultural Research Service, Corn Insects, and Crop Genetics Research Unit, Ames, IA, United States
- Steven B Cannon
  - Department of Agronomy, Iowa State University, Ames, IA, United States
  - USDA - Agricultural Research Service, Corn Insects, and Crop Genetics Research Unit, Ames, IA, United States
- Arti Singh
  - Department of Agronomy, Iowa State University, Ames, IA, United States
7
Zheng Z, Zhu M, Zhang J, Liu X, Hou L, Liu W, Yuan S, Luo C, Yao X, Liu J, Yang Y. A sequence-aware merger of genomic structural variations at population scale. Nat Commun 2024; 15:960. [PMID: 38307885 PMCID: PMC10837428 DOI: 10.1038/s41467-024-45244-9] [Received: 09/19/2023] [Accepted: 01/18/2024] [Indexed: 02/04/2024]
Abstract
Merging structural variations (SVs) at the population level presents a significant challenge, yet it is essential for conducting comprehensive genotypic analyses, especially in the era of pangenomics. Here, we introduce PanPop, a tool that utilizes an advanced sequence-aware SV merging algorithm to efficiently merge SVs of various types. We demonstrate that PanPop can merge and optimize the majority of multiallelic SVs into informative biallelic variants. We show its superior precision and lower rates of missing data compared to alternative software solutions. Our approach not only enables the filtering of SVs by leveraging multiple SV callers for enhanced accuracy but also facilitates the accurate merging of large-scale population SVs. These capabilities of PanPop will help to accelerate future SV-related studies.
Affiliation(s)
- Zeyu Zheng
  - State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystems, College of Ecology, Lanzhou University, Lanzhou, China
- Mingjia Zhu
  - State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystems, College of Ecology, Lanzhou University, Lanzhou, China
- Jin Zhang
  - State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystems, College of Ecology, Lanzhou University, Lanzhou, China
- Xinfeng Liu
  - State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystems, College of Ecology, Lanzhou University, Lanzhou, China
- Liqiang Hou
  - State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystems, College of Ecology, Lanzhou University, Lanzhou, China
- Wenyu Liu
  - State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystems, College of Ecology, Lanzhou University, Lanzhou, China
- Shuai Yuan
  - State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystems, College of Ecology, Lanzhou University, Lanzhou, China
- Changhong Luo
  - State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystems, College of Ecology, Lanzhou University, Lanzhou, China
- Xinhao Yao
  - State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystems, College of Ecology, Lanzhou University, Lanzhou, China
- Jianquan Liu
  - State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystems, College of Ecology, Lanzhou University, Lanzhou, China
- Yongzhi Yang
  - State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystems, College of Ecology, Lanzhou University, Lanzhou, China
8
Ngou BPM, Wyler M, Schmid MW, Kadota Y, Shirasu K. Evolutionary trajectory of pattern recognition receptors in plants. Nat Commun 2024; 15:308. [PMID: 38302456 PMCID: PMC10834447 DOI: 10.1038/s41467-023-44408-3] [Received: 08/17/2023] [Accepted: 12/12/2023] [Indexed: 02/03/2024]
Abstract
Cell-surface receptors play pivotal roles in many biological processes, including immunity, development, and reproduction, across diverse organisms. How cell-surface receptors evolve to become specialised in different biological processes remains elusive. To shed light on the immune-specificity of cell-surface receptors, we analyzed more than 200,000 genes encoding cell-surface receptors from 350 genomes and traced the evolutionary origin of immune-specific leucine-rich repeat receptor-like proteins (LRR-RLPs) in plants. Surprisingly, we discovered that the motifs crucial for co-receptor interaction in LRR-RLPs are closely related to those of the LRR-receptor-like kinase (RLK) subgroup Xb, which perceives phytohormones and primarily governs growth and development. Functional characterisation further reveals that LRR-RLPs initiate immune responses through their juxtamembrane and transmembrane regions, while LRR-RLK-Xb members regulate development through their cytosolic kinase domains. Our data suggest that the cell-surface receptors involved in immunity and development share a common origin. After diversification, their ectodomains, juxtamembrane, transmembrane, and cytosolic regions have either diversified or stabilised to recognise diverse ligands and activate differential downstream responses. Our work reveals a mechanism by which plants evolve to perceive diverse signals to activate the appropriate responses in a rapidly changing environment.
Affiliation(s)
- Yasuhiro Kadota
  - RIKEN Center for Sustainable Resource Science, Yokohama, Japan
- Ken Shirasu
  - RIKEN Center for Sustainable Resource Science, Yokohama, Japan
9
Andorf CM, Haley OC, Hayford RK, Portwood JL, Harding S, Sen S, Cannon EK, Gardiner JM, Kim HS, Woodhouse MR. PanEffect: a pan-genome visualization tool for variant effects in maize. Bioinformatics 2024; 40:btae073. [PMID: 38337024 PMCID: PMC10881103 DOI: 10.1093/bioinformatics/btae073] [Received: 12/16/2023] [Revised: 01/30/2024] [Accepted: 02/06/2024] [Indexed: 02/12/2024]
Abstract
SUMMARY Understanding the effects of genetic variants is crucial for accurately predicting traits and functional outcomes. Recent approaches have utilized artificial intelligence and protein language models to score all possible missense variant effects at the proteome level for a single genome, but a reliable tool is needed to explore these effects at the pan-genome level. To address this gap, we introduce a new tool called PanEffect. We implemented PanEffect at MaizeGDB to enable a comprehensive examination of the potential effects of coding variants across 50 maize genomes. The tool allows users to visualize over 550 million possible amino acid substitutions in the B73 maize reference genome and to observe the effects of the 2.3 million natural variations in the maize pan-genome. Each variant effect score, calculated from the Evolutionary Scale Modeling (ESM) protein language model, shows the log-likelihood ratio difference between B73 and all variants in the pan-genome. These scores are shown using heatmaps spanning benign outcomes to potential functional consequences. In addition, PanEffect displays secondary structures and functional domains along with the variant effects, offering additional functional and structural context. Using PanEffect, researchers now have a platform to explore protein variants and identify genetic targets for crop enhancement. AVAILABILITY AND IMPLEMENTATION The PanEffect code is freely available on GitHub (https://github.com/Maize-Genetics-and-Genomics-Database/PanEffect). A maize implementation of PanEffect and underlying datasets are available at MaizeGDB (https://www.maizegdb.org/effect/maize/).
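The log-likelihood ratio score described in the summary can be illustrated with a short Python sketch. This is only a hedged example of the general idea (log-probability of the variant residue minus log-probability of the reference residue at one position); it does not call the ESM model or reproduce PanEffect's code, and the function name `llr_score` and the probability table are hypothetical.

```python
from math import log


def llr_score(position_probs, ref_aa, alt_aa):
    """Log-likelihood ratio of a substitution at one protein position.

    position_probs is a hypothetical mapping from amino acid letter to the
    probability a protein language model assigns to it at this position.
    A score of 0 means the variant is as likely as the reference residue;
    negative scores mean the model finds the variant less likely.
    """
    return log(position_probs[alt_aa]) - log(position_probs[ref_aa])


# Made-up probabilities for a single position, for illustration only.
probs = {"A": 0.6, "V": 0.3, "P": 0.1}
conservative = llr_score(probs, "A", "V")  # closer to zero
disruptive = llr_score(probs, "A", "P")    # more negative
```

Under this convention, more negative scores would map to the "potential functional consequences" end of a heatmap and scores near zero to the benign end.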
Affiliation(s)
- Carson M Andorf
  - USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, United States
  - Department of Computer Science, Iowa State University, Ames, IA 50011, United States
- Olivia C Haley
  - USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, United States
- Rita K Hayford
  - USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, United States
- John L Portwood
  - USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, United States
- Stephen Harding
  - USDA-ARS, Mycotoxin Prevention and Applied Microbiology Research Unit, National Center for Agricultural Utilization Research, Peoria, IL 61604, United States
- Shatabdi Sen
  - Department of Plant Pathology & Microbiology, Iowa State University, Ames, IA 50011, United States
- Ethalinda K Cannon
  - USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, United States
- Jack M Gardiner
  - Division of Animal Sciences, University of Missouri, Columbia, MO 65211, United States
- Hye-Seon Kim
  - USDA-ARS, Mycotoxin Prevention and Applied Microbiology Research Unit, National Center for Agricultural Utilization Research, Peoria, IL 61604, United States
- Margaret R Woodhouse
  - USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, United States
10
Barcia-Cruz R, Goudenège D, Moura de Sousa JA, Piel D, Marbouty M, Rocha EPC, Le Roux F. Phage-inducible chromosomal minimalist islands (PICMIs), a novel family of small marine satellites of virulent phages. Nat Commun 2024; 15:664. [PMID: 38253718 PMCID: PMC10803314 DOI: 10.1038/s41467-024-44965-1] [Received: 08/10/2023] [Accepted: 01/10/2024] [Indexed: 01/24/2024]
Abstract
Phage satellites are bacterial genetic elements that co-opt phage machinery for their own dissemination. Here we identify a family of satellites, named Phage-Inducible Chromosomal Minimalist Islands (PICMIs), that are broadly distributed in marine bacteria of the family Vibrionaceae. A typical PICMI is characterized by reduced gene content, does not encode genes for capsid remodelling, and packages its DNA as a concatemer. PICMIs integrate in the bacterial host genome next to the fis regulator, and encode three core proteins necessary for excision and replication. PICMIs are dependent on virulent phage particles to spread to other bacteria, and protect their hosts from other competitive phages without interfering with their helper phage. Thus, our work broadens our understanding of phage satellites and narrows down the minimal number of functions necessary to hijack a tailed phage.
Affiliation(s)
- Rubén Barcia-Cruz
  - Sorbonne Université, CNRS, UMR 8227, Integrative Biology of Marine Models, Station Biologique de Roscoff, CS 90074, F-29688, Roscoff cedex, France
  - Department of Microbiology and Parasitology, CIBUS-Faculty of Biology, Universidade de Santiago de Compostela, Santiago de Compostela, Spain
- David Goudenège
  - Sorbonne Université, CNRS, UMR 8227, Integrative Biology of Marine Models, Station Biologique de Roscoff, CS 90074, F-29688, Roscoff cedex, France
  - Ifremer, Unité Physiologie Fonctionnelle des Organismes Marins, ZI de la Pointe du Diable, CS 10070, F-29280, Plouzané, France
- Jorge A Moura de Sousa
  - Institut Pasteur, Université Paris Cité, CNRS UMR3525, Microbial Evolutionary Genomics, Paris, France
- Damien Piel
  - Sorbonne Université, CNRS, UMR 8227, Integrative Biology of Marine Models, Station Biologique de Roscoff, CS 90074, F-29688, Roscoff cedex, France
  - Ifremer, Unité Physiologie Fonctionnelle des Organismes Marins, ZI de la Pointe du Diable, CS 10070, F-29280, Plouzané, France
- Martial Marbouty
  - Institut Pasteur, Université Paris Cité, Organization and Dynamics of Viral Genomes Group, CNRS UMR 3525, Paris, F-75015, France
- Eduardo P C Rocha
  - Institut Pasteur, Université Paris Cité, CNRS UMR3525, Microbial Evolutionary Genomics, Paris, France
- Frédérique Le Roux
  - Sorbonne Université, CNRS, UMR 8227, Integrative Biology of Marine Models, Station Biologique de Roscoff, CS 90074, F-29688, Roscoff cedex, France
  - Ifremer, Unité Physiologie Fonctionnelle des Organismes Marins, ZI de la Pointe du Diable, CS 10070, F-29280, Plouzané, France
  - Département de microbiologie, infectiologie et immunologie, Université de Montréal, Montréal, Canada
11
Monecke S, Braun SD, Collatz M, Diezel C, Müller E, Reinicke M, Cabal Rosel A, Feßler AT, Hanke D, Loncaric I, Schwarz S, Cortez de Jäckel S, Ruppitsch W, Gavier-Widén D, Hotzel H, Ehricht R. Molecular Characterization of Chimeric Staphylococcus aureus Strains from Waterfowl. Microorganisms 2024; 12:96. [PMID: 38257923 PMCID: PMC10821479 DOI: 10.3390/microorganisms12010096] [Received: 11/10/2023] [Revised: 12/15/2023] [Accepted: 12/29/2023] [Indexed: 01/24/2024]
Abstract
Staphylococcus aureus is a versatile pathogen that occurs not only in humans but also in various wild and domestic animals, including several avian species. During characterization of S. aureus isolates from waterfowl, isolates were identified as atypical CC133 by DNA microarray analysis. They differed from previously sequenced CC133 strains in the presence of the collagen adhesin gene cna; some also showed a different capsule type and a deviant spa type. Thus, they were subjected to whole-genome sequencing. This revealed multiple insertions of large regions of DNA from other S. aureus lineages into a CC133-derived backbone genome. Three distinct strains were identified based on the size and extent of these inserts. One strain comprised two small inserts of foreign DNA up- and downstream of oriC; one of about 7000 nt (0.25% of the genome) originated from CC692, and the other, slightly larger at ca. 38,000 nt (1.3%), was of CC522 provenance. The second strain carried a larger CC692 insert (nearly 257,000 nt or 10% of the strain's genome), and its CC522-derived insert was also larger, at about 53,500 nt or 2% of the genome. The third strain carried an identical CC692-derived region (in which the same mutations were observed as in the second strain), but it had a considerably larger CC522-like insertion of about 167,000 nt or 5.9% of the genome. Both isolates of the first strain and two of the four isolates of the second strain also harbored a hemolysin-beta-integrating prophage carrying "bird-specific" virulence factors, ornithine cyclodeaminase D0K6J8 and a putative protease D0K6J9. Furthermore, isolates had two different variants of SCC elements that lacked mecA/mecC genes. These findings highlight the role of horizontal gene transfer in the evolution of S. aureus facilitated by SCC elements, by phages, and by a yet undescribed mechanism for large-scale exchange of core genomic DNA.
Affiliation(s)
- Stefan Monecke
- Leibniz Institute of Photonic Technology (IPHT), Leibniz Center for Photonics in Infection Research (LPI), 07745 Jena, Germany
- InfectoGnostics Research Campus, 07743 Jena, Germany
- Institute for Medical Microbiology and Virology, Dresden University Hospital, 01307 Dresden, Germany
| | - Sascha D. Braun
- Leibniz Institute of Photonic Technology (IPHT), Leibniz Center for Photonics in Infection Research (LPI), 07745 Jena, Germany
- InfectoGnostics Research Campus, 07743 Jena, Germany
| | - Maximillian Collatz
- Leibniz Institute of Photonic Technology (IPHT), Leibniz Center for Photonics in Infection Research (LPI), 07745 Jena, Germany
- InfectoGnostics Research Campus, 07743 Jena, Germany
| | - Celia Diezel
- Leibniz Institute of Photonic Technology (IPHT), Leibniz Center for Photonics in Infection Research (LPI), 07745 Jena, Germany
- InfectoGnostics Research Campus, 07743 Jena, Germany
| | - Elke Müller
- Leibniz Institute of Photonic Technology (IPHT), Leibniz Center for Photonics in Infection Research (LPI), 07745 Jena, Germany
- InfectoGnostics Research Campus, 07743 Jena, Germany
| | - Martin Reinicke
- Leibniz Institute of Photonic Technology (IPHT), Leibniz Center for Photonics in Infection Research (LPI), 07745 Jena, Germany
- InfectoGnostics Research Campus, 07743 Jena, Germany
| | - Adriana Cabal Rosel
- Austrian Agency for Health and Food Safety, Institute for Medical Microbiology and Hygiene, 1220 Vienna, Austria
| | - Andrea T. Feßler
- Institute of Microbiology and Epizootics, Centre for Infection Medicine, School of Veterinary Medicine, Freie Universität Berlin, 14163 Berlin, Germany
- Veterinary Centre for Resistance Research (TZR), School of Veterinary Medicine, Freie Universität Berlin, 14163 Berlin, Germany
| | - Dennis Hanke
- Institute of Microbiology and Epizootics, Centre for Infection Medicine, School of Veterinary Medicine, Freie Universität Berlin, 14163 Berlin, Germany
- Veterinary Centre for Resistance Research (TZR), School of Veterinary Medicine, Freie Universität Berlin, 14163 Berlin, Germany
| | - Igor Loncaric
- Institute of Microbiology, University of Veterinary Medicine, 1210 Vienna, Austria
| | - Stefan Schwarz
- Institute of Microbiology and Epizootics, Centre for Infection Medicine, School of Veterinary Medicine, Freie Universität Berlin, 14163 Berlin, Germany
- Veterinary Centre for Resistance Research (TZR), School of Veterinary Medicine, Freie Universität Berlin, 14163 Berlin, Germany
| | | | - Werner Ruppitsch
- Austrian Agency for Health and Food Safety, Institute for Medical Microbiology and Hygiene, 1220 Vienna, Austria
| | - Dolores Gavier-Widén
- Department of Pathology and Wildlife Disease, National Veterinary Institute (SVA), 75189 Uppsala, Sweden
- Department of Biomedical Sciences and Veterinary Public Health, Swedish University of Agricultural Sciences (SLU), 75007 Uppsala, Sweden
| | - Helmut Hotzel
- Institute of Bacterial Infections and Zoonoses, Friedrich-Loeffler-Institut (Federal Research Institute for Animal Health), 07743 Jena, Germany
| | - Ralf Ehricht
- Leibniz Institute of Photonic Technology (IPHT), Leibniz Center for Photonics in Infection Research (LPI), 07745 Jena, Germany
- InfectoGnostics Research Campus, 07743 Jena, Germany
- Institute of Physical Chemistry, Friedrich-Schiller University, 07743 Jena, Germany
| |
|
12
|
Wright Z, Seymour M, Paszczak K, Truttmann T, Senn K, Stilp S, Jansen N, Gosz M, Goeden L, Anantharaman V, Aravind L, Waters LS. The small protein MntS evolved from a signal peptide and acquired a novel function regulating manganese homeostasis in Escherichia coli. Mol Microbiol 2024; 121:152-166. [PMID: 38104967 PMCID: PMC10842292 DOI: 10.1111/mmi.15206] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2023] [Revised: 11/17/2023] [Accepted: 11/24/2023] [Indexed: 12/19/2023]
Abstract
Small proteins (<50 amino acids) are emerging as ubiquitous and important regulators in organisms ranging from bacteria to humans, where they commonly bind to and regulate larger proteins during stress responses. However, fundamental aspects of small proteins, such as their molecular mechanism of action, downregulation after they are no longer needed, and their evolutionary provenance, are poorly understood. Here, we show that the MntS small protein involved in manganese (Mn) homeostasis binds and inhibits the MntP Mn transporter. Mn is crucial for bacterial survival in stressful environments but is toxic in excess. Thus, Mn transport is tightly controlled at multiple levels to maintain optimal Mn levels. The small protein MntS adds a new level of regulation for Mn transporters, beyond the known transcriptional and post-transcriptional control. We also found that MntS binds to itself in the presence of Mn, providing a possible mechanism of downregulating MntS activity to terminate its inhibition of MntP Mn export. MntS is homologous to the signal peptide of SitA, the periplasmic metal-binding subunit of a Mn importer. Remarkably, the homologous signal peptide regions can substitute for MntS, demonstrating a functional relationship between MntS and these signal peptides. Conserved gene neighborhoods support that MntS evolved from the signal peptide of an ancestral SitA protein, acquiring a life of its own with a distinct function in Mn homeostasis.
Affiliation(s)
- Zachary Wright
- Department of Chemistry, 800 Algoma Blvd, University of Wisconsin, Oshkosh, WI 54901, USA
| | - Mackenzie Seymour
- Department of Chemistry, 800 Algoma Blvd, University of Wisconsin, Oshkosh, WI 54901, USA
| | - Kalista Paszczak
- Department of Chemistry, 800 Algoma Blvd, University of Wisconsin, Oshkosh, WI 54901, USA
| | - Taylor Truttmann
- Department of Chemistry, 800 Algoma Blvd, University of Wisconsin, Oshkosh, WI 54901, USA
| | - Katherine Senn
- Department of Chemistry, 800 Algoma Blvd, University of Wisconsin, Oshkosh, WI 54901, USA
| | - Samuel Stilp
- Department of Chemistry, 800 Algoma Blvd, University of Wisconsin, Oshkosh, WI 54901, USA
| | - Nickolas Jansen
- Department of Chemistry, 800 Algoma Blvd, University of Wisconsin, Oshkosh, WI 54901, USA
| | - Magdalyn Gosz
- Department of Chemistry, 800 Algoma Blvd, University of Wisconsin, Oshkosh, WI 54901, USA
| | - Lindsay Goeden
- Department of Chemistry, 800 Algoma Blvd, University of Wisconsin, Oshkosh, WI 54901, USA
| | - Vivek Anantharaman
- National Center for Biotechnology Information, National Library of Medicine, 8600 Rockville Pike, Bethesda, MD 20894, USA
| | - L. Aravind
- National Center for Biotechnology Information, National Library of Medicine, 8600 Rockville Pike, Bethesda, MD 20894, USA
| | - Lauren S. Waters
- Department of Chemistry, 800 Algoma Blvd, University of Wisconsin, Oshkosh, WI 54901, USA
| |
|
13
|
Nicastro GG, Burroughs AM, Iyer L, Aravind L. Functionally comparable but evolutionarily distinct nucleotide-targeting effectors help identify conserved paradigms across diverse immune systems. Nucleic Acids Res 2023; 51:11479-11503. [PMID: 37889040 PMCID: PMC10681802 DOI: 10.1093/nar/gkad879] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/17/2023] [Revised: 09/21/2023] [Accepted: 09/28/2023] [Indexed: 10/28/2023] Open
Abstract
While nucleic acid-targeting effectors are known to be central to biological conflicts and anti-selfish element immunity, recent findings have revealed immune effectors that target their building blocks and the cellular energy currency: free nucleotides. Through comparative genomics and sequence-structure analysis, we identified several distinct effector domains, which we named Calcineurin-CE, HD-CE, and PRTase-CE. These domains, along with specific versions of the ParB and MazG domains, are widely present in diverse prokaryotic immune systems and are predicted to degrade nucleotides by targeting phosphate or glycosidic linkages. Our findings unveil multiple potential immune systems associated with at least 17 different functional themes featuring these effectors. Some of these systems sense modified DNA/nucleotides from phages or operate downstream of novel enzymes generating signaling nucleotides. We also uncovered a class of systems utilizing HSP90- and HSP70-related modules as analogs of STAND and GTPase domains that are coupled to these nucleotide-targeting- or proteolysis-induced complex-forming effectors. While widespread in bacteria, only a limited subset of nucleotide-targeting effectors was integrated into eukaryotic immune systems, suggesting barriers to interoperability across subcellular contexts. This work establishes nucleotide-degrading effectors as an emerging immune paradigm and traces their origins back to homologous domains in housekeeping systems.
Affiliation(s)
- Gianlucca G Nicastro
- Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, USA
| | - A Maxwell Burroughs
- Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, USA
| | - Lakshminarayan M Iyer
- Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, USA
| | - L Aravind
- Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, USA
| |
|
14
|
Sobala ŁF. Evolution and phylogenetic distribution of endo-α-mannosidase. Glycobiology 2023; 33:687-699. [PMID: 37202179 PMCID: PMC11025385 DOI: 10.1093/glycob/cwad041] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2022] [Revised: 05/12/2023] [Accepted: 05/16/2023] [Indexed: 05/20/2023] Open
Abstract
While glycans underlie many biological processes, such as protein folding, cell adhesion, and cell-cell recognition, the deep evolution of glycosylation machinery remains an understudied topic. N-linked glycosylation is a conserved process in which mannosidases are key trimming enzymes. One of them is the glycoprotein endo-α-1,2-mannosidase, which participates in the initial trimming of mannose moieties from an N-linked glycan inside the cis-Golgi. It is unique as the only endo-acting mannosidase found in this organelle. Relatively little is known about its origins and evolutionary history; so far it was reported to occur only in vertebrates. In this work, a taxon-rich bioinformatic survey to unravel the evolutionary history of this enzyme, including all major eukaryotic clades and a wide representation of animals, is presented. The endomannosidase was found to be more widely distributed in animals and other eukaryotes. The protein motif changes in the context of the canonical animal enzyme were tracked. Additionally, the data show that the two canonical vertebrate endomannosidase genes, MANEA and MANEAL, arose at the second round of the two vertebrate genome duplications, and one more vertebrate paralog, CMANEAL, is uncovered. Finally, a framework where N-glycosylation co-evolved with complex multicellularity is described. A better understanding of the evolution of core glycosylation pathways is pivotal to understanding the biology of eukaryotes in general, and the Golgi apparatus in particular. This systematic analysis of endomannosidase evolution is one step toward this goal.
Affiliation(s)
- Łukasz F Sobala
- Laboratory of Glycobiology, Hirszfeld Institute of Immunology and Experimental Therapy, Weigla 12, 53-114 Wroclaw, Poland
| |
|
15
|
Delerue T, Chareyre S, Anantharaman V, Gilmore MC, Popham DL, Cava F, Aravind L, Ramamurthi KS. Bacterial cell surface nanoenvironment requires a specialized chaperone to activate a peptidoglycan biosynthetic enzyme. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.10.06.561273. [PMID: 37986874 PMCID: PMC10659427 DOI: 10.1101/2023.10.06.561273] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2023]
Abstract
Bacillus subtilis spores are produced inside the cytosol of a mother cell. Spore surface assembly requires the SpoVK protein in the mother cell, but its function is unknown. Here, we report that SpoVK is a dedicated chaperone from a distinct higher-order clade of AAA+ ATPases that activates the peptidoglycan glycosyltransferase MurG during sporulation, even though MurG does not normally require activation by a chaperone during vegetative growth. MurG redeploys to the spore surface during sporulation, where we show that the local pH is reduced and propose that this change in cytosolic nanoenvironment necessitates a specific chaperone for proper MurG function. Further, we show that SpoVK participates in a developmental checkpoint in which improper spore surface assembly inactivates SpoVK, which leads to sporulation arrest. The AAA+ ATPase clade containing SpoVK includes other dedicated chaperones involved in secretion, cell-envelope biosynthesis, and carbohydrate metabolism, suggesting that such fine-tuning might be a widespread feature of different subcellular nanoenvironments.
Affiliation(s)
- Thomas Delerue
- Laboratory of Molecular Biology, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, USA
| | - Sylvia Chareyre
- Laboratory of Molecular Biology, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, USA
| | - Vivek Anantharaman
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA
| | - Michael C. Gilmore
- The Laboratory for Molecular Infection Medicine Sweden (MIMS), Umeå Center for Microbial Research (UCMR), Science for Life Laboratory (SciLifeLab), Department of Molecular Biology, Umeå University, Umeå, Sweden
| | - David L. Popham
- Department of Biological Sciences, Virginia Tech, Blacksburg, Virginia, USA
| | - Felipe Cava
- The Laboratory for Molecular Infection Medicine Sweden (MIMS), Umeå Center for Microbial Research (UCMR), Science for Life Laboratory (SciLifeLab), Department of Molecular Biology, Umeå University, Umeå, Sweden
| | - L. Aravind
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA
| | - Kumaran S. Ramamurthi
- Laboratory of Molecular Biology, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, USA
| |
|
16
|
Patil PR, Burroughs AM, Misra M, Cerullo F, Costas-Insua C, Hung HC, Dikic I, Aravind L, Joazeiro CAP. Mechanism and evolutionary origins of alanine-tail C-degron recognition by E3 ligases Pirh2 and CRL2-KLHDC10. Cell Rep 2023; 42:113100. [PMID: 37676773 PMCID: PMC10591846 DOI: 10.1016/j.celrep.2023.113100] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/20/2023] [Revised: 07/11/2023] [Accepted: 08/22/2023] [Indexed: 09/09/2023] Open
Abstract
In ribosome-associated quality control (RQC), nascent polypeptides produced by interrupted translation are modified with C-terminal polyalanine tails ("Ala-tails") that function outside ribosomes to induce ubiquitylation by E3 ligases Pirh2 (p53-induced RING-H2 domain-containing) or CRL2 (Cullin-2 RING ligase2)-KLHDC10. Here, we investigate the molecular basis of Ala-tail function using biochemical and in silico approaches. We show that Pirh2 and KLHDC10 directly bind to Ala-tails and that structural predictions identify candidate Ala-tail-binding sites, which we experimentally validate. The degron-binding pockets and specific pocket residues implicated in Ala-tail recognition are conserved among Pirh2 and KLHDC10 homologs, suggesting that an important function of these ligases across eukaryotes is in targeting Ala-tailed substrates. Moreover, we establish that the two Ala-tail-binding pockets have convergently evolved, either from an ancient module of bacterial provenance (Pirh2) or via tinkering of a widespread C-degron-recognition element (KLHDC10). These results shed light on the recognition of a simple degron sequence and the evolution of Ala-tail proteolytic signaling.
Affiliation(s)
- Pratik Rajendra Patil
- Center for Molecular Biology of Heidelberg University (ZMBH), DKFZ-ZMBH-Alliance, 69120 Heidelberg, Germany
| | - A Maxwell Burroughs
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
| | - Mohit Misra
- Institute of Biochemistry II, Goethe University Faculty of Medicine, Theodor-Stern-Kai 7, 60590 Frankfurt am Main, Germany; Buchmann Institute for Molecular Life Sciences, Goethe University Frankfurt, Max-von-Laue-Strasse 15, 60438 Frankfurt am Main, Germany
| | - Federico Cerullo
- Center for Molecular Biology of Heidelberg University (ZMBH), DKFZ-ZMBH-Alliance, 69120 Heidelberg, Germany
| | - Carlos Costas-Insua
- Center for Molecular Biology of Heidelberg University (ZMBH), DKFZ-ZMBH-Alliance, 69120 Heidelberg, Germany
| | - Hao-Chih Hung
- Center for Molecular Biology of Heidelberg University (ZMBH), DKFZ-ZMBH-Alliance, 69120 Heidelberg, Germany
| | - Ivan Dikic
- Institute of Biochemistry II, Goethe University Faculty of Medicine, Theodor-Stern-Kai 7, 60590 Frankfurt am Main, Germany; Buchmann Institute for Molecular Life Sciences, Goethe University Frankfurt, Max-von-Laue-Strasse 15, 60438 Frankfurt am Main, Germany
| | - L Aravind
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
| | - Claudio A P Joazeiro
- Center for Molecular Biology of Heidelberg University (ZMBH), DKFZ-ZMBH-Alliance, 69120 Heidelberg, Germany; Department of Molecular Medicine, UF Scripps Biomedical Research, Jupiter, FL 33458, USA
| |
|
17
|
Baker JL. Illuminating the oral microbiome and its host interactions: recent advancements in omics and bioinformatics technologies in the context of oral microbiome research. FEMS Microbiol Rev 2023; 47:fuad051. [PMID: 37667515 PMCID: PMC10503653 DOI: 10.1093/femsre/fuad051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2023] [Revised: 08/02/2023] [Accepted: 09/01/2023] [Indexed: 09/06/2023] Open
Abstract
The oral microbiota has an enormous impact on human health, with oral dysbiosis now linked to many oral and systemic diseases. Recent advancements in sequencing, mass spectrometry, bioinformatics, computational biology, and machine learning are revolutionizing oral microbiome research, enabling analysis at an unprecedented scale and level of resolution using omics approaches. This review presents a comprehensive perspective on the current state-of-the-art tools available to perform genomics, metagenomics, phylogenomics, pangenomics, transcriptomics, proteomics, metabolomics, lipidomics, and multi-omics analysis on (all) microbiomes, and then provides examples of how the techniques have been applied to research of the oral microbiome, specifically. Key findings of these studies and remaining challenges for the field are highlighted. Although the methods discussed here are placed in the context of their contributions to oral microbiome research specifically, they are pertinent to the study of any microbiome, and the intended audience of this review includes researchers who would simply like to get an introduction to microbial omics and/or an update on the latest omics methods. Continued research of the oral microbiota using omics approaches is crucial and will lead to dramatic improvements in human health, longevity, and quality of life.
Affiliation(s)
- Jonathon L Baker
- Department of Oral Rehabilitation & Biosciences, School of Dentistry, Oregon Health & Science University, 3181 Sam Jackson Park Road, Portland, OR 97202, United States
- Genomic Medicine Group, J. Craig Venter Institute, La Jolla, CA 92037, United States
- Department of Pediatrics, UC San Diego School of Medicine, La Jolla, CA 92093, United States
| |
|
18
|
Ramos-León F, Anjuwon-Foster BR, Anantharaman V, Ferreira CN, Ibrahim AM, Tai CH, Missiakas DM, Camberg JL, Aravind L, Ramamurthi KS. Protein coopted from a phage restriction system dictates orthogonal cell division plane selection in Staphylococcus aureus. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.09.03.556088. [PMID: 37886572 PMCID: PMC10602043 DOI: 10.1101/2023.09.03.556088] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/28/2023]
Abstract
The spherical bacterium Staphylococcus aureus, a leading cause of nosocomial infections, undergoes binary fission by dividing in two alternating orthogonal planes, but the mechanism by which S. aureus correctly selects the next cell division plane is not known. To identify cell division placement factors, we performed a chemical genetic screen that revealed a gene which we termed pcdA. We show that PcdA is a member of the McrB family of AAA+ NTPases that has undergone structural changes and a concomitant functional shift from a restriction enzyme subunit to an early cell division protein. PcdA directly interacts with the tubulin-like central divisome component FtsZ and localizes to future cell division sites before membrane invagination initiates. This parallels the action of another McrB family protein, CTTNBP2, which stabilizes microtubules in animals. We show that PcdA also interacts with the structural protein DivIVA and propose that the DivIVA/PcdA complex recruits unpolymerized FtsZ to assemble along the proper cell division plane. Deletion of pcdA conferred abnormal, non-orthogonal division plane selection, increased sensitivity to cell wall-targeting antibiotics, and reduced virulence in a murine infection model. Targeting PcdA could therefore highlight a treatment strategy for combatting antibiotic-resistant strains of S. aureus.
Affiliation(s)
- Félix Ramos-León
- Laboratory of Molecular Biology, National Cancer Institute, National Institutes of Health, Bethesda, USA
| | - Brandon R. Anjuwon-Foster
- Laboratory of Molecular Biology, National Cancer Institute, National Institutes of Health, Bethesda, USA
| | - Vivek Anantharaman
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, USA
| | - Colby N. Ferreira
- Department of Cell and Molecular Biology, University of Rhode Island, Kingston, USA
| | - Amany M. Ibrahim
- Department of Microbiology, Howard Taylor Ricketts Laboratory, University of Chicago, Lemont, USA
| | - Chin-Hsien Tai
- Laboratory of Molecular Biology, National Cancer Institute, National Institutes of Health, Bethesda, USA
| | - Dominique M. Missiakas
- Department of Microbiology, Howard Taylor Ricketts Laboratory, University of Chicago, Lemont, USA
| | - Jodi L. Camberg
- Department of Cell and Molecular Biology, University of Rhode Island, Kingston, USA
| | - L. Aravind
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, USA
| | - Kumaran S. Ramamurthi
- Laboratory of Molecular Biology, National Cancer Institute, National Institutes of Health, Bethesda, USA
| |
|
19
|
McWhite CD, Armour-Garb I, Singh M. Leveraging protein language models for accurate multiple sequence alignments. Genome Res 2023; 33:1145-1153. [PMID: 37414576 PMCID: PMC10538487 DOI: 10.1101/gr.277675.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2023] [Accepted: 06/29/2023] [Indexed: 07/08/2023]
Abstract
Multiple sequence alignment (MSA) is a critical step in the study of protein sequence and function. Typically, MSA algorithms progressively align pairs of sequences and combine these alignments with the aid of a guide tree. These alignment algorithms use scoring systems based on substitution matrices to measure amino acid similarities. Although successful, standard methods struggle on sets of proteins with low sequence identity: the so-called twilight zone of protein alignment. For these difficult cases, another source of information is needed. Protein language models are a powerful new approach that leverages massive sequence data sets to produce high-dimensional contextual embeddings for each amino acid in a sequence. These embeddings have been shown to reflect physicochemical and higher-order structural and functional attributes of amino acids within proteins. Here, we present a novel approach to MSA, based on clustering and ordering amino acid contextual embeddings. Our method for aligning semantically consistent groups of proteins circumvents the need for many standard components of MSA algorithms, avoiding initial guide tree construction, intermediate pairwise alignments, gap penalties, and substitution matrices. The added information from contextual embeddings leads to higher accuracy alignments for structurally similar proteins with low amino-acid similarity. We anticipate that protein language models will become a fundamental component of the next generation of algorithms for generating MSAs.
Affiliation(s)
- Claire D McWhite
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey 08544, USA
| | - Isabel Armour-Garb
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey 08544, USA
- Department of Computer Science, Princeton University, Princeton, New Jersey 08544, USA
| | - Mona Singh
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey 08544, USA
- Department of Computer Science, Princeton University, Princeton, New Jersey 08544, USA
| |
|
20
|
Wright Z, Seymour M, Paszczak K, Truttmann T, Senn K, Stilp S, Jansen N, Gosz M, Goeden L, Anantharaman V, Aravind L, Waters LS. The small protein MntS evolved from a signal peptide and acquired a novel function regulating manganese homeostasis in Escherichia coli. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.06.02.543501. [PMID: 37398132 PMCID: PMC10312517 DOI: 10.1101/2023.06.02.543501] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/04/2023]
Abstract
Small proteins (<50 amino acids) are emerging as ubiquitous and important regulators in organisms ranging from bacteria to humans, where they commonly bind to and regulate larger proteins during stress responses. However, fundamental aspects of small proteins, such as their molecular mechanism of action, downregulation after they are no longer needed, and their evolutionary provenance, are poorly understood. Here we show that the MntS small protein involved in manganese (Mn) homeostasis binds and inhibits the MntP Mn transporter. Mn is crucial for bacterial survival in stressful environments, but is toxic in excess. Thus, Mn transport is tightly controlled at multiple levels to maintain optimal Mn levels. The small protein MntS adds a new level of regulation for Mn transporters, beyond the known transcriptional and post-transcriptional control. We also found that MntS binds to itself in the presence of Mn, providing a possible mechanism of downregulating MntS activity to terminate its inhibition of MntP Mn export. MntS is homologous to the signal peptide of SitA, the periplasmic metal-binding subunit of a Mn importer. Remarkably, the homologous signal peptide regions can substitute for MntS, demonstrating a functional relationship between MntS and these signal peptides. Conserved gene neighborhoods support that MntS evolved from an ancestral SitA, acquiring a life of its own with a distinct function in Mn homeostasis.
Significance
This study demonstrates that the MntS small protein binds and inhibits the MntP Mn exporter, adding another layer to the complex regulation of Mn homeostasis. MntS also interacts with itself in cells with Mn, which could prevent it from regulating MntP. We propose that MntS and other small proteins might sense environmental signals and shut off their own regulation via binding to ligands (e.g., metals) or other proteins. We also provide evidence that MntS evolved from the signal peptide region of the Mn importer, SitA. Homologous SitA signal peptides can recapitulate MntS activities, showing that they have a second function beyond protein secretion. Overall, we establish that small proteins can emerge and develop novel functionalities from gene remnants.
|
21
|
Santus L, Garriga E, Deorowicz S, Gudyś A, Notredame C. Towards the accurate alignment of over a million protein sequences: Current state of the art. Curr Opin Struct Biol 2023; 80:102577. [PMID: 37012200 DOI: 10.1016/j.sbi.2023.102577] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 12/20/2022] [Revised: 02/21/2023] [Accepted: 02/27/2023] [Indexed: 04/04/2023]
Abstract
Large-scale genomics requires highly scalable and accurate multiple sequence alignment methods. Results collected over the last decade suggest a loss of accuracy when scaling up beyond a few thousand sequences. This issue has been actively addressed with a number of innovative algorithmic solutions that combine low-level hardware optimization with novel higher-level heuristics. This review provides an extensive critical overview of these recent methods. Using established reference datasets, we conclude that, although significant progress has been achieved, a unified framework able to consistently and efficiently produce high-accuracy large-scale multiple alignments is still lacking.
|
22
|
Tierney BT, Foox J, Ryon KA, Butler D, Damle N, Young BG, Mozsary C, Babler KM, Yin X, Carattini Y, Andrews D, Solle NS, Kumar N, Shukla B, Vidovic D, Currall B, Williams SL, Schürer SC, Stevenson M, Amirali A, Beaver CC, Kobetz E, Boone MM, Reding B, Laine J, Comerford S, Lamar WE, Tallon JJ, Hirschberg JW, Proszynski J, Sharkey ME, Church GM, Grills GS, Solo-Gabriele HM, Mason CE. Geospatially-resolved public-health surveillance via wastewater sequencing. medRxiv 2023:2023.05.31.23290781. [PMID: 37398062 PMCID: PMC10312847 DOI: 10.1101/2023.05.31.23290781] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/04/2023]
Abstract
Wastewater, which contains everything from pathogens to pollutants, is a geospatially and temporally linked microbial fingerprint of a given population. As a result, it can be leveraged for monitoring multiple dimensions of public health across locales and time. Here, we integrate targeted and bulk RNA sequencing (n=1,419 samples) to track the viral, bacterial, and functional content over geospatially distinct areas within Miami-Dade County from 2020 to 2022. First, we used targeted amplicon sequencing (n=966) to track diverse SARS-CoV-2 variants across space and time, and we found a tight correspondence with clinical caseloads from University students (N = 1,503) and Miami-Dade County hospital patients (N = 3,939 patients), as well as an 8-day earlier detection of the Delta variant in wastewater vs. in patients. Additionally, in 453 metatranscriptomic samples, we demonstrate that different wastewater sampling locations have clinically and public-health-relevant microbiota that vary as a function of the size of the human population they represent. Through assembly, alignment-based, and phylogenetic approaches, we also detect multiple clinically important viruses (e.g., norovirus) and describe geospatial and temporal variation in microbial functional genes that indicate the presence of pollutants. Moreover, we found distinct profiles of antimicrobial resistance (AMR) genes and virulence factors across campus buildings, dorms, and hospitals, with hospital wastewater containing a significant increase in AMR abundance. Overall, this effort lays the groundwork for systematic characterization of wastewater to improve public health decision making and a broad platform to detect emerging pathogens.
|
23
|
Du S, Tong X, Lai ACK, Chan CK, Mason CE, Lee PKH. Highly host-linked viromes in the built environment possess habitat-dependent diversity and functions for potential virus-host coevolution. Nat Commun 2023; 14:2676. [PMID: 37160974 PMCID: PMC10169181 DOI: 10.1038/s41467-023-38400-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2022] [Accepted: 04/27/2023] [Indexed: 05/11/2023] Open
Abstract
Viruses in built environments (BEs) raise public health concerns, yet they are generally less studied than bacteria. To better understand viral dynamics in BEs, this study assesses viromes from 11 habitats across four types of BEs with low to high occupancy. The diversity, composition, metabolic functions, and lifestyles of the viromes are found to be habitat dependent. Caudoviricetes species are ubiquitous on surface habitats in the BEs, and some of them are distinct from those present in other environments. Antimicrobial resistance genes are identified in viruses inhabiting surfaces frequently touched by occupants and in viruses inhabiting occupants' skin. Diverse CRISPR/Cas immunity systems and anti-CRISPR proteins are found in bacterial hosts and viruses, respectively, consistent with the strongly coupled virus-host links. Evidence of viruses potentially aiding host adaptation in a specific-habitat manner is identified through a unique gene insertion. This work illustrates that virus-host interactions occur frequently in BEs and that viruses are integral members of BE microbiomes.
Affiliation(s)
- Shicong Du
  - School of Energy and Environment, City University of Hong Kong, Hong Kong SAR, China
- Xinzhao Tong
  - School of Energy and Environment, City University of Hong Kong, Hong Kong SAR, China
  - Department of Biological Sciences, School of Science, Xi'an Jiaotong-Liverpool University, Suzhou, P. R. China
- Alvin C K Lai
  - School of Energy and Environment, City University of Hong Kong, Hong Kong SAR, China
- Chak K Chan
  - School of Energy and Environment, City University of Hong Kong, Hong Kong SAR, China
- Christopher E Mason
  - Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA
  - The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA
  - The WorldQuant Initiative for Quantitative Prediction, Weill Cornell Medicine, New York, NY, USA
  - The Feil Family Brain and Mind Research Institute, Weill Cornell Medicine, New York, NY, USA
- Patrick K H Lee
  - School of Energy and Environment, City University of Hong Kong, Hong Kong SAR, China
  - State Key Laboratory of Marine Pollution, City University of Hong Kong, Hong Kong SAR, China
|
24
|
Patil PR, Burroughs AM, Misra M, Cerullo F, Dikic I, Aravind L, Joazeiro CAP. Mechanism and evolutionary origins of Alanine-tail C-degron recognition by E3 ligases Pirh2 and CRL2-KLHDC10. bioRxiv 2023:2023.05.03.539038. [PMID: 37205381 PMCID: PMC10187211 DOI: 10.1101/2023.05.03.539038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/21/2023]
Abstract
In Ribosome-associated Quality Control (RQC), nascent-polypeptides produced by interrupted translation are modified with C-terminal polyalanine tails ('Ala-tails') that function outside ribosomes to induce ubiquitylation by Pirh2 or CRL2-KLHDC10 E3 ligases. Here we investigate the molecular basis of Ala-tail function using biochemical and in silico approaches. We show that Pirh2 and KLHDC10 directly bind to Ala-tails, and structural predictions identify candidate Ala-tail binding sites, which we experimentally validate. The degron-binding pockets and specific pocket residues implicated in Ala-tail recognition are conserved among Pirh2 and KLHDC10 homologs, suggesting that an important function of these ligases across eukaryotes is in targeting Ala-tailed substrates. Moreover, we establish that the two Ala-tail binding pockets have convergently evolved, either from an ancient module of bacterial provenance (Pirh2) or via tinkering of a widespread C-degron recognition element (KLHDC10). These results shed light on the recognition of a simple degron sequence and the evolution of Ala-tail proteolytic signaling.
|
25
|
Saheb Kashaf S, Harkins CP, Deming C, Joglekar P, Conlan S, Holmes CJ, Almeida A, Finn RD, Segre JA, Kong HH. Staphylococcal diversity in atopic dermatitis from an individual to a global scale. Cell Host Microbe 2023; 31:578-592.e6. [PMID: 37054678 PMCID: PMC10151067 DOI: 10.1016/j.chom.2023.03.010] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 10/17/2022] [Revised: 12/08/2022] [Accepted: 03/10/2023] [Indexed: 04/15/2023]
Abstract
Atopic dermatitis (AD) is a multifactorial, chronic relapsing disease associated with genetic and environmental factors. Among skin microbes, Staphylococcus aureus and Staphylococcus epidermidis are associated with AD, but how genetic variability and staphylococcal strains shape the disease remains unclear. We investigated the skin microbiome of an AD cohort (n = 54) as part of a prospective natural history study using shotgun metagenomic and whole genome sequencing, which we analyzed alongside publicly available data (n = 473). AD status and global geographical regions exhibited associations with strains and genomic loci of S. aureus and S. epidermidis. In addition, antibiotic prescribing patterns and within-household transmission between siblings shaped colonizing strains. Comparative genomics determined that S. aureus AD strains were enriched in virulence factors, whereas S. epidermidis AD strains varied in genes involved in interspecies interactions and metabolism. In both species, staphylococcal interspecies genetic transfer shaped gene content. These findings reflect the staphylococcal genomic diversity and dynamics associated with AD.
Affiliation(s)
- Sara Saheb Kashaf
  - Microbial Genomics Section, Translational and Functional Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Catriona P Harkins
  - Microbial Genomics Section, Translational and Functional Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
  - Dermatology Branch, National Institute of Arthritis and Musculoskeletal and Skin Diseases, National Institutes of Health, Bethesda, MD 20892, USA
- Clay Deming
  - Microbial Genomics Section, Translational and Functional Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
- Payal Joglekar
  - Microbial Genomics Section, Translational and Functional Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
- Sean Conlan
  - Microbial Genomics Section, Translational and Functional Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
- Cassandra J Holmes
  - Dermatology Branch, National Institute of Arthritis and Musculoskeletal and Skin Diseases, National Institutes of Health, Bethesda, MD 20892, USA
- Alexandre Almeida
  - Department of Veterinary Medicine, University of Cambridge, Cambridge, UK
- Robert D Finn
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Julia A Segre
  - Microbial Genomics Section, Translational and Functional Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
- Heidi H Kong
  - Dermatology Branch, National Institute of Arthritis and Musculoskeletal and Skin Diseases, National Institutes of Health, Bethesda, MD 20892, USA
|
26
|
Saldivar EV, Ding Y, Poretsky E, Bird S, Block AK, Huffaker A, Schmelz EA. Maize Terpene Synthase 8 (ZmTPS8) Contributes to a Complex Blend of Fungal-Elicited Antibiotics. Plants (Basel) 2023; 12:1111. [PMID: 36903970 PMCID: PMC10005556 DOI: 10.3390/plants12051111] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 02/10/2023] [Revised: 02/23/2023] [Accepted: 02/23/2023] [Indexed: 06/18/2023]
Abstract
In maize (Zea mays), fungal-elicited immune responses include the accumulation of terpene synthase (TPS) and cytochrome P450 monooxygenases (CYP) enzymes resulting in complex antibiotic arrays of sesquiterpenoids and diterpenoids, including α/β-selinene derivatives, zealexins, kauralexins and dolabralexins. To uncover additional antibiotic families, we conducted metabolic profiling of elicited stem tissues in mapping populations, which included B73 × M162W recombinant inbred lines and the Goodman diversity panel. Five candidate sesquiterpenoids associated with a chromosome 1 locus spanning the location of ZmTPS27 and ZmTPS8. Heterologous enzyme co-expression studies of ZmTPS27 in Nicotiana benthamiana resulted in geraniol production while ZmTPS8 yielded α-copaene, δ-cadinene and sesquiterpene alcohols consistent with epi-cubebol, cubebol, copan-3-ol and copaborneol matching the association mapping efforts. ZmTPS8 is an established multiproduct α-copaene synthase; however, ZmTPS8-derived sesquiterpene alcohols are rarely encountered in maize tissues. A genome wide association study further linked an unknown sesquiterpene acid to ZmTPS8 and combined ZmTPS8-ZmCYP71Z19 heterologous enzyme co-expression studies yielded the same product. To consider defensive roles for ZmTPS8, in vitro bioassays with cubebol demonstrated significant antifungal activity against both Fusarium graminearum and Aspergillus parasiticus. As a genetically variable biochemical trait, ZmTPS8 contributes to the cocktail of terpenoid antibiotics present following complex interactions between wounding and fungal elicitation.
Affiliation(s)
- Evan V. Saldivar
  - Department of Cell and Developmental Biology, University of California at San Diego, San Diego, CA 92093, USA
  - Department of Plant Biology, Carnegie Institution for Science, Stanford University, Palo Alto, CA 94305, USA
- Yezhang Ding
  - Department of Cell and Developmental Biology, University of California at San Diego, San Diego, CA 92093, USA
  - Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Elly Poretsky
  - Department of Cell and Developmental Biology, University of California at San Diego, San Diego, CA 92093, USA
- Skylar Bird
  - Department of Cell and Developmental Biology, University of California at San Diego, San Diego, CA 92093, USA
- Anna K. Block
  - Chemistry Research Unit, U.S. Department of Agriculture-Agricultural Research Service, Center for Medical, Agricultural and Veterinary Entomology, Gainesville, FL 32608, USA
- Alisa Huffaker
  - Department of Cell and Developmental Biology, University of California at San Diego, San Diego, CA 92093, USA
- Eric A. Schmelz
  - Department of Cell and Developmental Biology, University of California at San Diego, San Diego, CA 92093, USA
|
27
|
Vineis JH, Bulseco AN, Bowen JL. Microbial chemolithoautotrophs are abundant in salt marsh sediment following long-term experimental nitrate enrichment. FEMS Microbiol Lett 2023; 370:fnad082. [PMID: 37541957 DOI: 10.1093/femsle/fnad082] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/31/2023] [Revised: 07/13/2023] [Accepted: 07/25/2023] [Indexed: 08/06/2023] Open
Abstract
Long-term anthropogenic nitrate (NO₃⁻) enrichment is a serious threat to many coastal systems. Nitrate reduction coupled with the oxidation of reduced forms of sulfur is conducted by chemolithoautotrophic microbial populations in a process that decreases nitrogen (N) pollution. However, little is known about the diversity and distribution of microbes capable of carbon fixation within salt marsh sediment and how they respond to long-term NO₃⁻ loading. We used genome-resolved metagenomics to characterize the distribution, phylogenetic relationships, and adaptations important to microbial communities within NO₃⁻-enriched sediment. We found NO₃⁻-reducing sulfur oxidizers became dominant members of the microbial community throughout the top 25 cm of the sediment following long-term NO₃⁻ enrichment. We also found that most of the chemolithoautotrophic genomes recovered contained striking metabolic versatility, including the potential for complete denitrification and evidence of mixotrophy. Phylogenetic reconstruction indicated that similar carbon fixation strategies and metabolic versatility can be found in several phylogenetic groups, but the genomes recovered here represent novel organisms. Our results suggest that the role of chemolithoautotrophy within NO₃⁻-enriched salt marsh sediments may be quantitatively more important for retaining carbon and filtering NO₃⁻ than previously indicated and further inquiry is needed to explicitly measure their contribution to carbon turnover and removal of N pollution.
Affiliation(s)
- Joseph H Vineis
  - Department of Marine and Environmental Sciences, Marine Science Center, Northeastern University, 30 Nahant Road, Nahant, MA 01908, United States
- Ashley N Bulseco
  - Department of Marine and Environmental Sciences, Marine Science Center, Northeastern University, 30 Nahant Road, Nahant, MA 01908, United States
- Jennifer L Bowen
  - Department of Marine and Environmental Sciences, Marine Science Center, Northeastern University, 30 Nahant Road, Nahant, MA 01908, United States
|
28
|
Fullam A, Letunic I, Schmidt TSB, Ducarmon QR, Karcher N, Khedkar S, Kuhn M, Larralde M, Maistrenko OM, Malfertheiner L, Milanese A, Rodrigues JFM, Sanchis-López C, Schudoma C, Szklarczyk D, Sunagawa S, Zeller G, Huerta-Cepas J, von Mering C, Bork P, Mende DR. proGenomes3: approaching one million accurately and consistently annotated high-quality prokaryotic genomes. Nucleic Acids Res 2023; 51:D760-D766. [PMID: 36408900 PMCID: PMC9825469 DOI: 10.1093/nar/gkac1078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2022] [Revised: 10/15/2022] [Accepted: 11/07/2022] [Indexed: 11/22/2022] Open
Abstract
The interpretation of genomic, transcriptomic and other microbial 'omics data is highly dependent on the availability of well-annotated genomes. As the number of publicly available microbial genomes continues to increase exponentially, the need for quality control and consistent annotation is becoming critical. We present proGenomes3, a database of 907 388 high-quality genomes containing 4 billion genes that passed stringent criteria and have been consistently annotated using multiple functional and taxonomic databases including mobile genetic elements and biosynthetic gene clusters. proGenomes3 encompasses 41 171 species-level clusters, defined based on universal single copy marker genes, for which pan-genomes and contextual habitat annotations are provided. The database is available at http://progenomes.embl.de/.
Affiliation(s)
- Anthony Fullam
  - Structural and Computational Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
- Ivica Letunic
  - Biobyte solutions GmbH, Bothestr. 142, 69117 Heidelberg, Germany
- Thomas S B Schmidt
  - Structural and Computational Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
- Quinten R Ducarmon
  - Structural and Computational Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
- Nicolai Karcher
  - Structural and Computational Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
- Supriya Khedkar
  - Structural and Computational Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
- Michael Kuhn
  - Structural and Computational Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
- Martin Larralde
  - Structural and Computational Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
- Oleksandr M Maistrenko
  - Royal Netherlands Institute for Sea Research (NIOZ), Department of Marine Microbiology & Biogeochemistry, 1797 SZ, 't Horntje (Texel), Netherlands
- Lukas Malfertheiner
  - Department of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, 8057 Zurich, Switzerland
- Alessio Milanese
  - Institute of Microbiology, Department of Biology and Swiss Institute of Bioinformatics, ETH Zurich, Vladimir-Prelog-Weg 4, 8093 Zurich, Switzerland
- Claudia Sanchis-López
  - Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA-CSIC), Campus de Montegancedo-UPM, 28223, Pozuelo de Alarcón, Madrid, Spain
- Christian Schudoma
  - Structural and Computational Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
- Damian Szklarczyk
  - Department of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, 8057 Zurich, Switzerland
- Shinichi Sunagawa
  - Institute of Microbiology, Department of Biology and Swiss Institute of Bioinformatics, ETH Zurich, Vladimir-Prelog-Weg 4, 8093 Zurich, Switzerland
- Georg Zeller
  - Structural and Computational Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
- Jaime Huerta-Cepas
  - Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA-CSIC), Campus de Montegancedo-UPM, 28223, Pozuelo de Alarcón, Madrid, Spain
- Christian von Mering
  - Department of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, 8057 Zurich, Switzerland
- Peer Bork
  - Structural and Computational Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
  - Max Delbrück Centre for Molecular Medicine, 13125 Berlin, Germany
  - Department of Bioinformatics, Biocenter, University of Würzburg, 97074 Würzburg, Germany
  - Yonsei Frontier Lab (YFL), Yonsei University, 03722 Seoul, South Korea
- Daniel R Mende
  - Department of Medical Microbiology, Amsterdam University Medical Centers, Amsterdam, The Netherlands
|
29
|
Abstract
Future applications of synthetic biology will rely on deploying engineered cells outside of lab environments for long periods of time. Currently, a significant roadblock to this application is the potential for deactivating mutations in engineered genes. A recently developed method to protect engineered coding sequences from mutation is called Constraining Adaptive Mutations using Engineered Overlapping Sequences (CAMEOS). In this chapter we provide a workflow for utilizing CAMEOS to create synthetic overlaps between two genes, one essential (infA) and one non-essential (aroB), to protect the non-essential gene from mutation and loss of protein function. In this workflow we detail the methods to collect large numbers of related protein sequences, produce multiple sequence alignments (MSAs), use the MSAs to generate hidden Markov models and Markov random field models, and finally generate a library of overlapping coding sequences through CAMEOS scripts. To assist practitioners with basic coding skills to try out the CAMEOS method, we have created a virtual machine containing all the required packages already installed that can be downloaded and run locally.
Affiliation(s)
- Dominic Y Logel
  - School of Natural Sciences, Macquarie University, Sydney, NSW, Australia
  - ARC Centre of Excellence in Synthetic Biology, Macquarie University, Sydney, NSW, Australia
- Paul R Jaschke
  - School of Natural Sciences, Macquarie University, Sydney, NSW, Australia
  - ARC Centre of Excellence in Synthetic Biology, Macquarie University, Sydney, NSW, Australia
|
30
|
Wei Y, Zou Q, Tang F, Yu L. WMSA: a novel method for multiple sequence alignment of DNA sequences. Bioinformatics 2022; 38:5019-5025. [PMID: 36179076 DOI: 10.1093/bioinformatics/btac658] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 07/11/2022] [Revised: 08/30/2022] [Accepted: 09/29/2022] [Indexed: 12/24/2022] Open
Abstract
MOTIVATION Multiple sequence alignment (MSA) is a fundamental problem in bioinformatics. The quality of alignment will affect downstream analysis. MAFFT adopts the Fast Fourier Transform method to search for homologous segments, uses them as anchors to divide the sequences, and then aligns only the segments, which saves time and memory without overly reducing the sequence alignment quality. However, MAFFT becomes slow when the dataset is large. RESULTS We developed a software tool, WMSA, which uses the divide-and-conquer method to split the sequences into clusters, aligns those clusters into profiles with the center star strategy and then makes a progressive profile-profile alignment. The alignment is conducted by the compiled algorithms of MAFFT, K-Band with multithread parallelism. Our method can balance time, space and quality and performs better than MAFFT in test experiments on highly conserved datasets. AVAILABILITY AND IMPLEMENTATION Source code is freely available at https://github.com/malabz/WMSA/, which is implemented in C/C++ and supported on Linux, and datasets are available at https://github.com/malabz/WMSA-dataset. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
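The center star strategy named in this abstract can be illustrated with a small sketch of its first step only: choosing, within one cluster, the sequence that is closest to all the others. This is not the WMSA code. The function names `edit_distance` and `pick_center` are hypothetical, and plain Levenshtein distance is an assumption standing in for whatever distance the tool actually uses.

```python
from typing import List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two sequences (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # match or substitution
        prev = cur
    return prev[-1]


def pick_center(seqs: List[str]) -> int:
    """Index of the sequence with the smallest summed distance to all others.

    Hypothetical helper: in a center star alignment, every other sequence in
    the cluster would then be aligned pairwise against this center.
    """
    def total(i: int) -> int:
        return sum(edit_distance(seqs[i], s) for k, s in enumerate(seqs) if k != i)

    return min(range(len(seqs)), key=total)


if __name__ == "__main__":
    cluster = ["ACGT", "ACGA", "ACGT", "TCGT"]
    print(pick_center(cluster))
```

The quadratic number of pairwise distances is the reason a divide-and-conquer clustering step would come first: the center is chosen per cluster rather than over the whole dataset.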
Affiliation(s)
- Yanming Wei
  - School of Computer Science and Technology, Xidian University, Xi'an, Shaanxi 710071, China
- Quan Zou
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, Zhejiang 324003, China
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, Sichuan 610054, China
- Furong Tang
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, Zhejiang 324003, China
- Liang Yu
  - School of Computer Science and Technology, Xidian University, Xi'an, Shaanxi 710071, China
|
31
|
Pipes L, Chen Z, Afanaseva S, Nielsen R. Estimating the relative proportions of SARS-CoV-2 haplotypes from wastewater samples. Cell Rep Methods 2022; 2:100313. [PMID: 36159190 PMCID: PMC9485417 DOI: 10.1016/j.crmeth.2022.100313] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/04/2022] [Revised: 06/27/2022] [Accepted: 09/14/2022] [Indexed: 12/02/2022]
Abstract
Wastewater surveillance has become essential for monitoring the spread of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The quantification of SARS-CoV-2 RNA in wastewater correlates with the coronavirus disease 2019 (COVID-19) caseload in a community. However, estimating the proportions of different SARS-CoV-2 haplotypes has remained technically difficult. We present a phylogenetic imputation method for improving the SARS-CoV-2 reference database and a method for estimating the relative proportions of SARS-CoV-2 haplotypes from wastewater samples. The phylogenetic imputation method uses the global SARS-CoV-2 phylogeny and imputes based on the maximum of the posterior probability of each nucleotide. We show that the imputation method has error rates comparable to, or lower than, typical sequencing error rates, which substantially improves the reference database and allows for accurate inferences of haplotype composition. Our method for estimating relative proportions of haplotypes uses an initial step to remove unlikely haplotypes and an expectation maximization (EM) algorithm for obtaining maximum likelihood estimates of the proportions of different haplotypes in a sample. Using simulations with a reference database of >3 million SARS-CoV-2 genomes, we show that the estimated proportions reflect the true proportions given sufficiently high sequencing depth.
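As a rough illustration of the expectation maximization step this abstract mentions, the sketch below estimates mixture proportions from a table of per-read likelihoods. It is a generic mixture-proportion EM written under stated assumptions, not the authors' implementation: the function name `em_proportions`, the input layout (`likelihoods[r][h]` is the probability of read `r` given haplotype `h`), and the stopping rule are all hypothetical.

```python
from typing import List


def em_proportions(likelihoods: List[List[float]],
                   n_iter: int = 200,
                   tol: float = 1e-10) -> List[float]:
    """Maximum likelihood estimate of haplotype proportions via EM.

    likelihoods[r][h] is assumed to hold P(read r | haplotype h).
    """
    n_hap = len(likelihoods[0])
    pi = [1.0 / n_hap] * n_hap  # start from uniform proportions
    for _ in range(n_iter):
        counts = [0.0] * n_hap
        # E-step: split each read across haplotypes by posterior weight.
        for row in likelihoods:
            weights = [p * l for p, l in zip(pi, row)]
            total = sum(weights)
            if total == 0.0:
                continue  # read is incompatible with every haplotype
            for h in range(n_hap):
                counts[h] += weights[h] / total
        norm = sum(counts)
        if norm == 0.0:
            break  # no informative reads; keep the current estimate
        # M-step: new proportions are the normalized expected counts.
        new_pi = [c / norm for c in counts]
        converged = max(abs(a - b) for a, b in zip(pi, new_pi)) < tol
        pi = new_pi
        if converged:
            break
    return pi


if __name__ == "__main__":
    # Three reads match only haplotype 0, one read matches only haplotype 1.
    reads = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    print(em_proportions(reads))
```

With a reference database of millions of genomes, an initial filter that removes unlikely haplotypes, as the abstract describes, would be needed before a loop like this becomes practical.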
Affiliation(s)
- Lenore Pipes
  - Department of Integrative Biology, University of California-Berkeley, 4098 Valley Life Sciences Building, Berkeley, CA 94720, USA
- Zihao Chen
  - School of Mathematical Sciences, Peking University, Beijing 100871, China
- Svetlana Afanaseva
  - Department of Integrative Biology, University of California-Berkeley, 4098 Valley Life Sciences Building, Berkeley, CA 94720, USA
- Rasmus Nielsen
  - Department of Integrative Biology, University of California-Berkeley, 4098 Valley Life Sciences Building, Berkeley, CA 94720, USA
  - GLOBE Institute, University of Copenhagen, Copenhagen, Denmark
|
32
|
Ngou BPM, Heal R, Wyler M, Schmid MW, Jones JDG. Concerted expansion and contraction of immune receptor gene repertoires in plant genomes. Nat Plants 2022; 8:1146-1152. [PMID: 36241733 PMCID: PMC9579050 DOI: 10.1038/s41477-022-01260-5] [Citation(s) in RCA: 31] [Impact Index Per Article: 15.5] [Received: 01/04/2022] [Accepted: 09/09/2022] [Indexed: 05/10/2023]
Abstract
Recent reports suggest that cell-surface and intracellular immune receptors function synergistically to activate robust defence against pathogens, but whether they co-evolve is unclear. Here we determined the numbers of cell-surface and intracellular immune receptors in 350 species. Surprisingly, the number of receptor genes that are predicted to encode cell-surface and intracellular immune receptors is strongly correlated. We suggest this is consistent with mutual potentiation of immunity initiated by cell-surface and intracellular receptors being reflected in the concerted co-evolution of the size of their repertoires across plant species.
Affiliation(s)
- Bruno Pok Man Ngou
  - The Sainsbury Laboratory, University of East Anglia, Norwich Research Park, Norwich, UK
  - RIKEN Center for Sustainable Resource Science, Yokohama, Japan
- Robert Heal
  - The Sainsbury Laboratory, University of East Anglia, Norwich Research Park, Norwich, UK
- Jonathan D G Jones
  - The Sainsbury Laboratory, University of East Anglia, Norwich Research Park, Norwich, UK
|
33
|
Baltzis A, Mansouri L, Jin S, Langer BE, Erb I, Notredame C. Highly significant improvement of protein sequence alignments with AlphaFold2. Bioinformatics 2022; 38:5007-5011. [PMID: 36130276 PMCID: PMC9665868 DOI: 10.1093/bioinformatics/btac625] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2022] [Revised: 08/29/2022] [Indexed: 12/24/2022] Open
Abstract
MOTIVATION Protein sequence alignments are essential to structural, evolutionary and functional analysis, but their accuracy is often limited by sequence similarity unless molecular structures are available. Protein structures predicted at experimental grade accuracy, as achieved by AlphaFold2, could therefore have a major impact on sequence analysis. RESULTS Here, we find that multiple sequence alignments estimated on AlphaFold2 predictions are almost as accurate as alignments estimated on experimental structures and significantly closer to the structural reference than sequence-based alignments. We also show that AlphaFold2 structural models of relatively low quality can be used to obtain highly accurate alignments. These results suggest that, besides structure modeling, AlphaFold2 encodes higher-order dependencies that can be exploited for sequence analysis. AVAILABILITY AND IMPLEMENTATION All data, analyses and results are available on Zenodo (https://doi.org/10.5281/zenodo.7031286). The code and scripts have been deposited in GitHub (https://github.com/cbcrg/msa-af2-nf) and the various containers in (https://cloud.sylabs.io/library/athbaltzis/af2/alphafold, https://hub.docker.com/r/athbaltzis/pred). SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Suzanne Jin
  - Bioinformatics and Genomics Programme, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona 08003, Spain
- Björn E Langer
  - Bioinformatics and Genomics Programme, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona 08003, Spain
- Ionas Erb
  - Bioinformatics and Genomics Programme, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona 08003, Spain
|
34
|
Singh A, Schnürer A. AcetoBase Version 2: a database update and re-analysis of formyltetrahydrofolate synthetase amplicon sequencing data from anaerobic digesters. Database (Oxford) 2022; 2022:6609150. [PMID: 35708586 PMCID: PMC9216588 DOI: 10.1093/database/baac041] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2022] [Revised: 05/03/2022] [Accepted: 05/04/2022] [Indexed: 11/14/2022]
Abstract
AcetoBase is a public repository and database of formyltetrahydrofolate synthetase (FTHFS) sequences. It is the first systematic collection of bacterial FTHFS nucleotide and protein sequences from genomes and metagenome-assembled genomes and of sequences generated by clone library sequencing. At its publication in 2019, AcetoBase (Version 1) was also the first database to establish connections between the FTHFS gene, the Wood–Ljungdahl pathway and 16S ribosomal RNA genes. Since the publication of AcetoBase, there have been significant improvements in the taxonomy of many bacterial lineages and accessibility/availability of public genomics and metagenomics data. The update to the AcetoBase reference database described here (Version 2) provides new sequence data and taxonomy, along with improvements in web functionality and user interface. The evaluation of this latest update by re-analysis of publicly accessible FTHFS amplicon sequencing data previously analysed with AcetoBase Version 1 revealed significant improvements in the taxonomic assignment of FTHFS sequences. Database URL: https://acetobase.molbio.slu.se
Affiliation(s)
- Abhijeet Singh
  - Department of Molecular Sciences, BioCenter, Anaerobic Microbiology and Biotechnology Group, Swedish University of Agricultural Sciences, Almas Allé 5, Uppsala SE-750 07, Sweden
- Anna Schnürer
  - Department of Molecular Sciences, BioCenter, Anaerobic Microbiology and Biotechnology Group, Swedish University of Agricultural Sciences, Almas Allé 5, Uppsala SE-750 07, Sweden
35
Wei Q, Zou H, Zhong C, Xu J. RPfam: A refiner towards curated-like multiple sequence alignments of the Pfam protein families. J Bioinform Comput Biol 2022; 20:2240002. [DOI: 10.1142/s0219720022400029] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/18/2022]
Abstract
High-quality multiple sequence alignments (MSAs) can provide insights into the architecture and function of protein families. Existing MSA tools often generate results inconsistent with the biological distribution of conserved regions because they position amino acid residues and gaps by symbols alone. We propose RPfam, a refiner towards curated-like MSAs for modeling the protein families in the Pfam database. RPfam refines the automatic alignments by scoring alignments with the PFASUM matrix, restricting realignments to badly aligned blocks, optimizing the block scores by dynamic programming, and running refinements iteratively using the simulated annealing algorithm. Experiments show that RPfam effectively refined the alignments produced by the MSA tools ClustalO and Muscle with reference to the curated seed alignments of the Pfam protein families. In particular, RPfam improved the quality of the ClustalO alignments by 4.4% and the Muscle alignments by 2.8% on the gp32 DNA binding protein-like family. A supplementary table is available at http://www.worldscinet.com/jbcb/.
Affiliation(s)
- Qingting Wei
  - School of Software, Nanchang University, Nanchang 330047, Jiangxi Province, P. R. China
- Hong Zou
  - Jiangxi Provincial Armed Force Unit Hospital, Nanchang 330043, Jiangxi Province, P. R. China
- Cuncong Zhong
  - Department of Electrical Engineering and Computer Science, The University of Kansas, Lawrence, KS 66045, USA
- Jianfeng Xu
  - School of Software, Nanchang University, Nanchang 330047, Jiangxi Province, P. R. China
36
Chao J, Tang F, Xu L. Developments in Algorithms for Sequence Alignment: A Review. Biomolecules 2022; 12:biom12040546. [PMID: 35454135 PMCID: PMC9024764 DOI: 10.3390/biom12040546] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/11/2022] [Revised: 03/29/2022] [Accepted: 03/31/2022] [Indexed: 01/27/2023] Open
Abstract
The continuous development of sequencing technologies has enabled researchers to obtain large amounts of biological sequence data, and this has resulted in increasing demands for software that can perform sequence alignment fast and accurately. A number of algorithms and tools for sequence alignment have been designed to meet the various needs of biologists. Here, the ideas that prevail in the research of sequence alignment and some quality estimation methods for multiple sequence alignment tools are summarized.
Affiliation(s)
- Jiannan Chao
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China
- Furong Tang
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324003, China
  - School of Electronic and Communication Engineering, Shenzhen Polytechnic, Shenzhen 518055, China
- Lei Xu
  - School of Electronic and Communication Engineering, Shenzhen Polytechnic, Shenzhen 518055, China
  - Correspondence:
37
Vanni C, Schechter MS, Acinas SG, Barberán A, Buttigieg PL, Casamayor EO, Delmont TO, Duarte CM, Eren AM, Finn RD, Kottmann R, Mitchell A, Sánchez P, Siren K, Steinegger M, Gloeckner FO, Fernàndez-Guerra A. Unifying the known and unknown microbial coding sequence space. eLife 2022; 11:67667. [PMID: 35356891 PMCID: PMC9132574 DOI: 10.7554/elife.67667] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Received: 02/18/2021] [Accepted: 03/30/2022] [Indexed: 12/02/2022] Open
Abstract
Genes of unknown function are among the biggest challenges in molecular biology, especially in microbial systems, where 40–60% of the predicted genes are unknown. Despite previous attempts, systematic approaches to include the unknown fraction into analytical workflows are still lacking. Here, we present a conceptual framework, its translation into the computational workflow AGNOSTOS and a demonstration on how we can bridge the known-unknown gap in genomes and metagenomes. By analyzing 415,971,742 genes predicted from 1749 metagenomes and 28,941 bacterial and archaeal genomes, we quantify the extent of the unknown fraction, its diversity, and its relevance across multiple organisms and environments. The unknown sequence space is exceptionally diverse, phylogenetically more conserved than the known fraction and predominantly taxonomically restricted at the species level. From the 71 M genes identified to be of unknown function, we compiled a collection of 283,874 lineage-specific genes of unknown function for Cand. Patescibacteria (also known as Candidate Phyla Radiation, CPR), which provides a significant resource to expand our understanding of their unusual biology. Finally, by identifying a target gene of unknown function for antibiotic resistance, we demonstrate how we can enable the generation of hypotheses that can be used to augment experimental data.
It is estimated that scientists do not know what half of microbial genes actually do. When these genes are discovered in microorganisms grown in the lab or found in environmental samples, it is not possible to identify what their roles are. Many of these genes are excluded from further analyses for these reasons, meaning that the study of microbial genes tends to be limited to genes that have already been described. These limitations hinder research into microbiology, because information from newly discovered genes cannot be integrated to better understand how these organisms work. Experiments to understand what role these genes have in the microorganisms are labor-intensive, so new analytical strategies are needed. To do this, Vanni et al. developed a new framework to categorize genes with unknown roles, and a computational workflow to integrate them into traditional analyses. When this approach was applied to over 400 million microbial genes (both with known and unknown roles), it showed that the share of genes with unknown functions is only about 30 per cent, smaller than previously thought. The analysis also showed that these genes are very diverse, revealing a huge space for future research and potential applications. Combining their approach with experimental data, Vanni et al. were able to identify a gene with a previously unknown purpose that could be involved in antibiotic resistance. This system could be useful for other scientists studying microorganisms to get a more complete view of microbial systems. In future, it may also be used to analyze the genetics of other organisms, such as plants and animals.
Affiliation(s)
- Chiara Vanni
  - Microbial Genomics and Bioinformatics Research G, Max Planck Institute for Marine Microbiology, Bremen, Germany
- Silvia G Acinas
  - Department of Marine Biology and Oceanography, Institut de Ciències del Mar-CMIMA (CSIC), Barcelona, Spain
- Albert Barberán
  - Department of Environmental Science, University of Arizona, Tucson, United States
- Pier Luigi Buttigieg
  - Helmholtz Centre for Polar and Marine Research, Alfred Wegener Institute, Bremerhaven, Germany
- Emilio O Casamayor
  - Center for Advanced Studies of Blanes CEAB-CSIC, Spanish Council for Research, Blanes, Spain
- Tom O Delmont
  - Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Paris, France
- Carlos M Duarte
  - Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- A Murat Eren
  - Department of Medicine, University of Chicago, Chicago, United States
- Robert D Finn
  - European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Hinxton, United Kingdom
- Renzo Kottmann
  - Microbial Genomics and Bioinformatics Research G, Max Planck Institute for Marine Microbiology, Bremen, Germany
- Alex Mitchell
  - European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Hinxton, United Kingdom
- Pablo Sánchez
  - Department of Marine Biology and Oceanography, Institut de Ciències del Mar-CMIMA (CSIC), Barcelona, Spain
- Kimmo Siren
  - Section for Evolutionary Genomics, The GLOBE Institute, University of Copenhagen, Copenhagen, Denmark
- Martin Steinegger
  - School of Biological Sciences, Seoul National University, Seoul, Republic of Korea
- Frank Oliver Gloeckner
  - MARUM, Helmholtz Center for Polar and Marine Research, University of Bremen, Bremen, Germany
- Antonio Fernàndez-Guerra
  - Lundbeck Foundation GeoGenetics Centre, GLOBE Institute, University of Copenhagen, Copenhagen, Denmark
38
Functional Classification and Characterization of the Fungal Glycoside Hydrolase 28 Protein Family. J Fungi (Basel) 2022; 8:jof8030217. [PMID: 35330219 PMCID: PMC8952511 DOI: 10.3390/jof8030217] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/13/2022] [Revised: 02/13/2022] [Accepted: 02/15/2022] [Indexed: 02/01/2023] Open
Abstract
Pectin is a major constituent of the plant cell wall, comprising compounds with important industrial applications such as homogalacturonan, rhamnogalacturonan and xylogalacturonan. A large array of enzymes is involved in the degradation of this amorphous substrate. The Glycoside Hydrolase 28 (GH28) family includes polygalacturonases (PG), rhamnogalacturonases (RG) and xylogalacturonases (XG) that share a structure of three to four pleated β-sheets that form a rod with the catalytic site amidst a long, narrow groove. Although these enzymes have been studied for many years, there has been no systematic analysis. We have collected a comprehensive set of GH28-encoding sequences to study their evolution in fungi, directed at obtaining a functional classification, as well as at the identification of substrate specificity as a functional constraint. Computational tools such as AlphaFold, ConSurf and MEME were used to identify the subfamilies' characteristics. A hierarchic classification defines the major classes of endoPG, endoRG and endoXG as well as three exoPG classes. Ascomycete endoPGs are further classified in two subclasses whereas we identify four exoRG subclasses. Diversification towards exomode is explained by loops that appear inserted in a number of turns. Substrate-driven diversification can be identified by various specificity determining positions that appear to surround the binding groove.
39
Merényi Z, Virágh M, Gluck-Thaler E, Slot JC, Kiss B, Varga T, Geösel A, Hegedüs B, Bálint B, Nagy LG. Gene age shapes the transcriptional landscape of sexual morphogenesis in mushroom forming fungi (Agaricomycetes). eLife 2022; 11:71348. [PMID: 35156613 PMCID: PMC8893723 DOI: 10.7554/elife.71348] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 06/17/2021] [Accepted: 02/11/2022] [Indexed: 11/13/2022] Open
Abstract
Multicellularity has been one of the most important innovations in the history of life. The role of gene regulatory changes in driving transitions to multicellularity is being increasingly recognized; however, factors influencing gene expression patterns are poorly known in many clades. Here, we compared the developmental transcriptomes of complex multicellular fruiting bodies of eight Agaricomycetes and Cryptococcus neoformans, a closely related human pathogen with a simple morphology. In-depth analysis in Pleurotus ostreatus revealed that allele-specific expression, natural antisense transcripts, and developmental gene expression, but not RNA editing or a ‘developmental hourglass,’ act in concert to shape its transcriptome during fruiting body development. We found that transcriptional patterns of genes strongly depend on their evolutionary ages. Young genes showed more developmental and allele-specific expression variation, possibly because of weaker evolutionary constraint, suggestive of nonadaptive expression variance in fruiting bodies. These results prompted us to define a set of conserved genes specifically regulated only during complex morphogenesis by excluding young genes and accounting for deeply conserved ones shared with species showing simple sexual development. Analysis of the resulting gene set revealed evolutionary and functional associations with complex multicellularity, which allowed us to speculate they are involved in complex multicellular morphogenesis of mushroom fruiting bodies.
Affiliation(s)
- Zsolt Merényi
  - Synthetic and Systems Biology Unit, Biological Research Center, Szeged, Hungary
- Máté Virágh
  - Synthetic and Systems Biology Unit, Biological Research Center, Szeged, Hungary
- Emile Gluck-Thaler
  - Department of Biology, University of Pennsylvania, Philadelphia, United States
- Jason C Slot
  - Department of Plant Pathology, Ohio State University, Columbus, United States
- Brigitta Kiss
  - Synthetic and Systems Biology Unit, Biological Research Center, Szeged, Hungary
- Torda Varga
  - Synthetic and Systems Biology Unit, Biological Research Center, Szeged, Hungary
- András Geösel
  - Department of Vegetable and Mushroom Growing, Hungarian University of Agriculture and Life Sciences, Budapest, Hungary
- Botond Hegedüs
  - Synthetic and Systems Biology Unit, Biological Research Center, Szeged, Hungary
- Balázs Bálint
  - Synthetic and Systems Biology Unit, Biological Research Center, Szeged, Hungary
- László G Nagy
  - Synthetic and Systems Biology Unit, Biological Research Center, Szeged, Hungary
40
Raghavan V, Kraft L, Mesny F, Rigerte L. A simple guide to de novo transcriptome assembly and annotation. Brief Bioinform 2022; 23:6514404. [PMID: 35076693 PMCID: PMC8921630 DOI: 10.1093/bib/bbab563] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 09/09/2021] [Revised: 12/03/2021] [Accepted: 12/09/2021] [Indexed: 12/13/2022] Open
Abstract
A transcriptome constructed from short-read RNA sequencing (RNA-seq) is an easily attainable proxy catalog of protein-coding genes when genome assembly is unnecessary, expensive or difficult. In the absence of a sequenced genome to guide the reconstruction process, the transcriptome must be assembled de novo using only the information available in the RNA-seq reads. Subsequently, the sequences must be annotated in order to identify sequence-intrinsic and evolutionary features in them (for example, protein-coding regions). Although straightforward at first glance, de novo transcriptome assembly and annotation can quickly prove to be challenging undertakings. In addition to familiarizing themselves with the conceptual and technical intricacies of the tasks at hand and the numerous pre- and post-processing steps involved, those interested must also grapple with an overwhelmingly large choice of tools. The lack of standardized workflows, fast pace of development of new tools and techniques and paucity of authoritative literature have served to exacerbate the difficulty of the task even further. Here, we present a comprehensive overview of de novo transcriptome assembly and annotation. We discuss the procedures involved, including pre- and post-processing steps, and present a compendium of corresponding tools.
Affiliation(s)
- Venket Raghavan
  - Corresponding authors: Venket Raghavan, Quantitative and Computational Biology, Max Planck Institute for Biophysical Chemistry, 37077 Göttingen, Germany; Louis Kraft, Quantitative and Computational Biology, Max Planck Institute for Biophysical Chemistry, 37077 Göttingen, Germany
- Louis Kraft
  - Corresponding authors: Venket Raghavan, Quantitative and Computational Biology, Max Planck Institute for Biophysical Chemistry, 37077 Göttingen, Germany; Louis Kraft, Quantitative and Computational Biology, Max Planck Institute for Biophysical Chemistry, 37077 Göttingen, Germany
41
Poosapati S, Poretsky E, Dressano K, Ruiz M, Vazquez A, Sandoval E, Estrada-Cardenas A, Duggal S, Lim JH, Morris G, Szczepaniec A, Walse SS, Ni X, Schmelz EA, Huffaker A. A sorghum genome-wide association study (GWAS) identifies a WRKY transcription factor as a candidate gene underlying sugarcane aphid (Melanaphis sacchari) resistance. Planta 2022; 255:37. [PMID: 35020066 DOI: 10.1007/s00425-021-03814-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/06/2021] [Accepted: 12/19/2021] [Indexed: 06/14/2023]
Abstract
A WRKY transcription factor identified through forward genetics is associated with sorghum resistance to the sugarcane aphid and through heterologous expression reduces aphid populations in multiple plant species. Crop plant resistance to insect pests is based on genetically encoded traits which often display variability across diverse germplasm. In a comparatively recent event, a predominant sugarcane aphid (SCA: Melanaphis sacchari) biotype has become a significant agronomic pest of grain sorghum (Sorghum bicolor). To uncover candidate genes underlying SCA resistance, we used a forward genetics approach combining the genetic diversity present in the Sorghum Association Panel (SAP) and the Bioenergy Association Panel (BAP) for a genome-wide association study, employing an established SCA damage rating. One major association was found on Chromosome 9 within the WRKY transcription factor 86 (SbWRKY86). Transcripts encoding SbWRKY86 were previously identified as upregulated in SCA-resistant germplasm and the syntenic ortholog in maize accumulates following Rhopalosiphum maidis infestation. Analyses of SbWRKY86 transcripts displayed patterns of increased SCA-elicited accumulation in additional SCA-resistant sorghum lines. Heterologous expression of SbWRKY86 in both tobacco (Nicotiana benthamiana) and Arabidopsis resulted in reduced population growth of green peach aphid (Myzus persicae). Comparative RNA-Seq analyses of Arabidopsis lines expressing 35S:SbWRKY86-YFP identified changes in expression for a small network of genes associated with carbon-nitrogen metabolism and callose deposition, both contributing factors to defense against aphids. As a test of altered plant responses, 35S:SbWRKY86-YFP Arabidopsis lines were activated using the flagellin epitope elicitor, flg22, and displayed significant increases in callose deposition. 
Our findings indicate that both heterologous and increased native expression of the transcription factor SbWRKY86 contributes to reduced aphid levels in diverse plant models.
Affiliation(s)
- Sowmya Poosapati
  - Section of Cell and Developmental Biology, University of California at San Diego, 9500 Gilman Dr., La Jolla, CA, 92093-0116, USA
- Elly Poretsky
  - Section of Cell and Developmental Biology, University of California at San Diego, 9500 Gilman Dr., La Jolla, CA, 92093-0116, USA
- Keini Dressano
  - Section of Cell and Developmental Biology, University of California at San Diego, 9500 Gilman Dr., La Jolla, CA, 92093-0116, USA
- Miguel Ruiz
  - Section of Cell and Developmental Biology, University of California at San Diego, 9500 Gilman Dr., La Jolla, CA, 92093-0116, USA
- Armando Vazquez
  - Section of Cell and Developmental Biology, University of California at San Diego, 9500 Gilman Dr., La Jolla, CA, 92093-0116, USA
- Evan Sandoval
  - Section of Cell and Developmental Biology, University of California at San Diego, 9500 Gilman Dr., La Jolla, CA, 92093-0116, USA
- Adelaida Estrada-Cardenas
  - Section of Cell and Developmental Biology, University of California at San Diego, 9500 Gilman Dr., La Jolla, CA, 92093-0116, USA
- Sarthak Duggal
  - Section of Cell and Developmental Biology, University of California at San Diego, 9500 Gilman Dr., La Jolla, CA, 92093-0116, USA
- Jia-Hui Lim
  - Section of Cell and Developmental Biology, University of California at San Diego, 9500 Gilman Dr., La Jolla, CA, 92093-0116, USA
- Geoffrey Morris
  - Soil and Crop Sciences, Colorado State University, 307 University Ave., Fort Collins, CO, 80523-1177, USA
- Adrianna Szczepaniec
  - Agricultural Biology, Colorado State University, 307 University Ave., Fort Collins, CO, 80523-1177, USA
- Spencer S Walse
  - USDA-Agricultural Research Service, San Joaquin Valley Agricultural Sciences Center, 9611 South Riverbend Avenue, Parlier, CA, 93648-9757, USA
- Xinzhi Ni
  - Crop Genetics and Breeding Research Unit, USDA-ARS, 115 Coastal Way, Tifton, GA, 31793, USA
- Eric A Schmelz
  - Section of Cell and Developmental Biology, University of California at San Diego, 9500 Gilman Dr., La Jolla, CA, 92093-0116, USA
- Alisa Huffaker
  - Section of Cell and Developmental Biology, University of California at San Diego, 9500 Gilman Dr., La Jolla, CA, 92093-0116, USA
42
Showers WM, Leach SM, Kechris K, Strong M. Longitudinal analysis of SARS-CoV-2 spike and RNA-dependent RNA polymerase protein sequences reveals the emergence and geographic distribution of diverse mutations. Infection, Genetics and Evolution: Journal of Molecular Epidemiology and Evolutionary Genetics in Infectious Diseases 2022; 97:105153. [PMID: 34801754 PMCID: PMC8600767 DOI: 10.1016/j.meegid.2021.105153] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 04/30/2021] [Revised: 11/05/2021] [Accepted: 11/16/2021] [Indexed: 01/18/2023]
Abstract
Amid the ongoing COVID-19 pandemic, it has become increasingly important to monitor the mutations that arise in the SARS-CoV-2 virus, to prepare public health strategies and guide the further development of vaccines and therapeutics. The spike (S) protein and the proteins comprising the RNA-Dependent RNA Polymerase (RdRP) are key vaccine and drug targets, respectively, making mutation surveillance of these proteins of great importance. Full protein sequences were downloaded from the GISAID database, aligned, and the variants identified. 437,006 unique viral genomes were analyzed. Polymorphisms in the protein sequence were investigated and examined longitudinally to identify sequence and strain variants appearing between January 5th, 2020 and January 16th, 2021. A structural analysis was also performed to investigate mutations in the receptor binding domain and the N-terminal domain of the spike protein. Within the spike protein, there were 766 unique mutations observed in the N-terminal domain and 360 in the receptor binding domain. Four residues that directly contact ACE2 were mutated in more than 100 sequences, including positions K417, Y453, S494, and N501. Within the furin cleavage site of the spike protein, a high degree of conservation was observed, but the P681H mutation was observed in 10.47% of sequences analyzed. Within the RNA dependent RNA polymerase complex proteins, 327 unique mutations were observed in Nsp8, 166 unique mutations were observed in Nsp7, and 1157 unique mutations were observed in Nsp12. Only 4 sequences analyzed contained mutations in the 9 residues that directly interact with the therapeutic Remdesivir, suggesting limited mutations in drug interacting residues. The identification of new variants emphasizes the need for further study on the effects of the mutations and the implications of increased prevalence, particularly for vaccine or therapeutic efficacy.
Affiliation(s)
- William M Showers
  - University of Colorado Anschutz Medical Campus, 13001 East 17th Place, Aurora, CO, USA; Center for Genes, Environment, and Health, National Jewish Health, Smith Building, Room A651, 1400 Jackson Street, Denver, CO, USA
- Sonia M Leach
  - University of Colorado Anschutz Medical Campus, 13001 East 17th Place, Aurora, CO, USA; Center for Genes, Environment, and Health, National Jewish Health, Smith Building, Room A651, 1400 Jackson Street, Denver, CO, USA
- Katerina Kechris
  - University of Colorado Anschutz Medical Campus, 13001 East 17th Place, Aurora, CO, USA
- Michael Strong
  - University of Colorado Anschutz Medical Campus, 13001 East 17th Place, Aurora, CO, USA; Center for Genes, Environment, and Health, National Jewish Health, Smith Building, Room A651, 1400 Jackson Street, Denver, CO, USA
43
Mesny F, Miyauchi S, Thiergart T, Pickel B, Atanasova L, Karlsson M, Hüttel B, Barry KW, Haridas S, Chen C, Bauer D, Andreopoulos W, Pangilinan J, LaButti K, Riley R, Lipzen A, Clum A, Drula E, Henrissat B, Kohler A, Grigoriev IV, Martin FM, Hacquard S. Genetic determinants of endophytism in the Arabidopsis root mycobiome. Nat Commun 2021; 12:7227. [PMID: 34893598 PMCID: PMC8664821 DOI: 10.1038/s41467-021-27479-y] [Citation(s) in RCA: 44] [Impact Index Per Article: 14.7] [Received: 04/12/2021] [Accepted: 11/11/2021] [Indexed: 02/03/2023] Open
Abstract
The roots of Arabidopsis thaliana host diverse fungal communities that affect plant health and disease states. Here, we sequence the genomes of 41 fungal isolates representative of the A. thaliana root mycobiota for comparative analysis with 79 other plant-associated fungi. Our analyses indicate that root mycobiota members evolved from ancestors with diverse lifestyles and retain large repertoires of plant cell wall-degrading enzymes (PCWDEs) and effector-like small secreted proteins. We identify a set of 84 gene families associated with endophytism, including genes encoding PCWDEs acting on xylan (family GH10) and cellulose (family AA9). Transcripts encoding these enzymes are also part of a conserved transcriptional program activated by phylogenetically distant mycobiota members upon host contact. Recolonization experiments with individual fungi indicate that strains with detrimental effects in mono-association with the host colonize roots more aggressively than those with beneficial activities, and dominate in natural root samples. Furthermore, we show that the pectin-degrading enzyme family PL1_7 links aggressiveness of endophytic colonization to plant health.
Collapse
Affiliation(s)
- Fantin Mesny
- Max Planck Institute for Plant Breeding Research, 50829, Cologne, Germany
| | - Shingo Miyauchi
- Max Planck Institute for Plant Breeding Research, 50829, Cologne, Germany
- Université de Lorraine, Institut national de recherche pour l'agriculture, l'alimentation et l'environnement, UMR Interactions Arbres/Microorganismes, Centre INRAE Grand Est-Nancy, 54280, Champenoux, France
| | - Thorsten Thiergart
- Max Planck Institute for Plant Breeding Research, 50829, Cologne, Germany
| | - Brigitte Pickel
- Max Planck Institute for Plant Breeding Research, 50829, Cologne, Germany
| | - Lea Atanasova
- Research division of Biochemical Technology, Institute of Chemical, Environmental and Biological Engineering, Vienna University of Technology, Vienna, Austria
- Institute of Food Technology, University of Natural Resources and Life Sciences, Vienna, Austria
| | - Magnus Karlsson
- Forest Mycology and Plant Pathology, Swedish University of Agricultural Sciences, SE-75007, Uppsala, Sweden
| | - Bruno Hüttel
- Max Planck Genome Centre Cologne, Max Planck Institute for Plant Breeding Research, Cologne, Germany
| | - Kerrie W Barry
- U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Sajeet Haridas
- U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Cindy Chen
- U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Diane Bauer
- U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - William Andreopoulos
- U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Jasmyn Pangilinan
- U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Kurt LaButti
- U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Robert Riley
- U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Anna Lipzen
- U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Alicia Clum
- U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Elodie Drula
- INRAE, USC1408 Architecture et Fonction des Macromolécules Biologiques, 13009, Marseille, France
- Architecture et Fonction des Macromolécules Biologiques (AFMB), CNRS, Aix-Marseille Univ., 13009, Marseille, France
| | - Bernard Henrissat
- Department of Biological Sciences, King Abdulaziz University, Jeddah, Saudi Arabia
| | - Annegret Kohler
- Université de Lorraine, Institut national de recherche pour l'agriculture, l'alimentation et l'environnement, UMR Interactions Arbres/Microorganismes, Centre INRAE Grand Est-Nancy, 54280, Champenoux, France
| | - Igor V Grigoriev
- U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Department of Plant and Microbial Biology, University of California Berkeley, Berkeley, CA, USA
| | - Francis M Martin
- Université de Lorraine, Institut national de recherche pour l'agriculture, l'alimentation et l'environnement, UMR Interactions Arbres/Microorganismes, Centre INRAE Grand Est-Nancy, 54280, Champenoux, France.
- Beijing Advanced Innovation Centre for Tree Breeding by Molecular Design (BAIC-TBMD), Institute of Microbiology, Beijing Forestry University, Tsinghua East Road Haidian District, Beijing, China.
| | - Stéphane Hacquard
- Max Planck Institute for Plant Breeding Research, 50829, Cologne, Germany.
- Cluster of Excellence on Plant Sciences (CEPLAS), Max Planck Institute for Plant Breeding Research, 50829, Cologne, Germany.
|
44
|
Jumper J, Evans R, Pritzel A, Green T, Figurnov M, Ronneberger O, Tunyasuvunakool K, Bates R, Žídek A, Potapenko A, Bridgland A, Meyer C, Kohl SAA, Ballard AJ, Cowie A, Romera-Paredes B, Nikolov S, Jain R, Adler J, Back T, Petersen S, Reiman D, Clancy E, Zielinski M, Steinegger M, Pacholska M, Berghammer T, Bodenstein S, Silver D, Vinyals O, Senior AW, Kavukcuoglu K, Kohli P, Hassabis D. Highly accurate protein structure prediction with AlphaFold. Nature 2021; 596:583-589. [PMID: 34265844 PMCID: PMC8371605 DOI: 10.1038/s41586-021-03819-2] [Citation(s) in RCA: 14655] [Impact Index Per Article: 4885.0] [Received: 05/11/2021] [Accepted: 07/12/2021] [Indexed: 02/07/2023]
Abstract
Proteins are essential to life, and understanding their structure can facilitate a mechanistic understanding of their function. Through an enormous experimental effort [1-4], the structures of around 100,000 unique proteins have been determined [5], but this represents a small fraction of the billions of known protein sequences [6,7]. Structural coverage is bottlenecked by the months to years of painstaking effort required to determine a single protein structure. Accurate computational approaches are needed to address this gap and to enable large-scale structural bioinformatics. Predicting the three-dimensional structure that a protein will adopt based solely on its amino acid sequence, the structure prediction component of the 'protein folding problem' [8], has been an important open research problem for more than 50 years [9]. Despite recent progress [10-14], existing methods fall far short of atomic accuracy, especially when no homologous structure is available. Here we provide the first computational method that can regularly predict protein structures with atomic accuracy even in cases in which no similar structure is known. We validated an entirely redesigned version of our neural network-based model, AlphaFold, in the challenging 14th Critical Assessment of protein Structure Prediction (CASP14) [15], demonstrating accuracy competitive with experimental structures in a majority of cases and greatly outperforming other methods. Underpinning the latest version of AlphaFold is a novel machine learning approach that incorporates physical and biological knowledge about protein structure, leveraging multi-sequence alignments, into the design of the deep learning algorithm.
Affiliation(s)
| | - Martin Steinegger
- School of Biological Sciences, Seoul National University, Seoul, South Korea
- Artificial Intelligence Institute, Seoul National University, Seoul, South Korea
|
45
|
Singh A, Müller B, Schnürer A. Profiling temporal dynamics of acetogenic communities in anaerobic digesters using next-generation sequencing and T-RFLP. Sci Rep 2021; 11:13298. [PMID: 34168213 PMCID: PMC8225771 DOI: 10.1038/s41598-021-92658-2] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 02/03/2021] [Accepted: 06/14/2021] [Indexed: 02/06/2023] Open
Abstract
Acetogens play a key role in anaerobic degradation of organic material and in maintaining biogas process efficiency. Profiling this community and its temporal changes can help evaluate process stability and function, especially under disturbance/stress conditions, and avoid complete process failure. The formyltetrahydrofolate synthetase (FTHFS) gene can be used as a marker for acetogenic community profiling in diverse environments. In this study, we developed a new high-throughput FTHFS gene sequencing method for acetogenic community profiling and compared it with conventional terminal restriction fragment length polymorphism of the FTHFS gene, 16S rRNA gene-based profiling of the whole bacterial community, and indirect analysis via 16S rRNA profiling of the FTHFS gene-harbouring community. Analyses and method comparisons were made using samples from two laboratory-scale biogas processes, one operated under stable control and one exposed to controlled overloading disturbance. Comparative analysis revealed satisfactory detection of the bacterial community and its changes for all methods, but with some differences in resolution and taxonomic identification. FTHFS gene sequencing was found to be the most suitable and reliable method to study acetogenic communities. These results pave the way for community profiling in various biogas processes and in other environments where the dynamics of acetogenic bacteria have not been well studied.
Affiliation(s)
- Abhijeet Singh
- Anaerobic Microbiology and Biotechnology Group, Department of Molecular Sciences, Swedish University of Agricultural Sciences, Almas Allé 5, Box 7025, 750 07 Uppsala, Sweden
| | - Bettina Müller
- Anaerobic Microbiology and Biotechnology Group, Department of Molecular Sciences, Swedish University of Agricultural Sciences, Almas Allé 5, Box 7025, 750 07 Uppsala, Sweden
| | - Anna Schnürer
- Anaerobic Microbiology and Biotechnology Group, Department of Molecular Sciences, Swedish University of Agricultural Sciences, Almas Allé 5, Box 7025, 750 07 Uppsala, Sweden
|
46
|
Abstract
Mavericks are virus-like mobile genetic elements found in the genomes of eukaryotes. Although Mavericks encode capsid morphogenesis homologs, their viral particles have not been observed. Here, we provide new evidence supporting the viral nature of Mavericks and the potential existence of virions. To this end, we conducted a phylogenomic analysis of Mavericks in hundreds of vertebrate genomes, discovering 134 elements with an intact coding capacity in 17 host species. We reveal an extensive genomic fossil record in 143 species and date three groups of elements to the Late Cretaceous. Bayesian phylogenetic analysis using genomic fossil orthologs suggests that Mavericks have infected osteichthyans for ∼419 My. They have undergone frequent cross-species transmissions in cyprinid fish and all core genes are subject to strong purifying selection. We conclude that vertebrate Mavericks form an ancient lineage of aquatic dsDNA viruses which are probably still functional in some vertebrate lineages.
Affiliation(s)
| | - Aris Katzourakis
- Department of Zoology, University of Oxford, Oxford, United Kingdom
|
47
|
Anton B, Besalú M, Fornes O, Bonet J, Molina A, Molina-Fernandez R, De Las Cuevas G, Fernandez-Fuentes N, Oliva B. On the use of direct-coupling analysis with a reduced alphabet of amino acids combined with super-secondary structure motifs for protein fold prediction. NAR Genom Bioinform 2021; 3:lqab027. [PMID: 33937764 PMCID: PMC8061457 DOI: 10.1093/nargab/lqab027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2020] [Revised: 02/27/2021] [Accepted: 03/26/2021] [Indexed: 11/12/2022] Open
Abstract
Direct-coupling analysis (DCA) for studying the coevolution of residues in proteins has been widely used to predict the three-dimensional structure of a protein from its sequence. We present RADI/raDIMod, a variation of the original DCA algorithm that groups chemically equivalent residues and combines this grouping with super-secondary structure motifs to model protein structures. Interestingly, the simplification produced by grouping amino acids into only two groups (polar and non-polar) is still representative of the physicochemical nature that characterizes the protein structure, and it is in line with the role of hydrophobic forces in protein-folding funneling. As a result of the compressed alphabet, the number of sequences required for the multiple sequence alignment is reduced. The number of long-range contacts predicted is limited; therefore, our approach requires the use of neighboring sequence positions. We use the prediction of secondary structure and motifs of super-secondary structures to predict local contacts. Using RADI together with raDIMod, a fragment-based protein structure modelling approach, we achieve near-native conformations when the super-secondary motifs cover >30-50% of the sequence. Interestingly, although different contacts are predicted with different alphabets, they produce similar structures.
Affiliation(s)
- Bernat Anton
- Structural Bioinformatics Lab (GRIB-IMIM), Department of Experimental and Health Science, University Pompeu Fabra, Barcelona 08005, Catalonia, Spain
| | - Mireia Besalú
- Departament de Genètica, Microbiologia i Estadística, Universitat de Barcelona, Barcelona 08028, Catalonia, Spain
| | - Oriol Fornes
- Structural Bioinformatics Lab (GRIB-IMIM), Department of Experimental and Health Science, University Pompeu Fabra, Barcelona 08005, Catalonia, Spain
| | - Jaume Bonet
- Structural Bioinformatics Lab (GRIB-IMIM), Department of Experimental and Health Science, University Pompeu Fabra, Barcelona 08005, Catalonia, Spain
| | - Alexis Molina
- Electronic and Atomic Protein Modeling, Life Sciences, Barcelona Supercomputing Center, Barcelona 08034, Catalonia, Spain
| | - Ruben Molina-Fernandez
- Structural Bioinformatics Lab (GRIB-IMIM), Department of Experimental and Health Science, University Pompeu Fabra, Barcelona 08005, Catalonia, Spain
| | - Gemma De Las Cuevas
- Institut für Theoretische Physik, School of Mathematics, Computer Science and Physics, Universität Innsbruck, A-6020 Innsbruck, Austria
| | - Narcis Fernandez-Fuentes
- Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, SY233EB Aberystwyth, United Kingdom
| | - Baldo Oliva
- Structural Bioinformatics Lab (GRIB-IMIM), Department of Experimental and Health Science, University Pompeu Fabra, Barcelona 08005, Catalonia, Spain
|
48
|
Feldbauer R, Gosch L, Lüftinger L, Hyden P, Flexer A, Rattei T. DeepNOG: fast and accurate protein orthologous group assignment. Bioinformatics 2021; 36:5304-5312. [PMID: 33367584 PMCID: PMC8016488 DOI: 10.1093/bioinformatics/btaa1051] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 04/24/2020] [Revised: 12/02/2020] [Accepted: 12/10/2020] [Indexed: 11/30/2022] Open
Abstract
MOTIVATION: Protein orthologous group databases are powerful tools for evolutionary analysis, functional annotation or metabolic pathway modeling across lineages. Sequences are typically assigned to orthologous groups with alignment-based methods, such as profile hidden Markov models, which have become a computational bottleneck. RESULTS: We present DeepNOG, an extremely fast and accurate, alignment-free orthology assignment method based on deep convolutional networks. We compare DeepNOG against state-of-the-art alignment-based (HMMER, DIAMOND) and alignment-free methods (DeepFam) on two orthology databases (COG, eggNOG 5). DeepNOG can be scaled to large orthology databases like eggNOG, for which it outperforms DeepFam in terms of precision and recall by large margins. While alignment-based methods still provide the most accurate assignments among the investigated methods, computing time of DeepNOG is an order of magnitude lower on CPUs. Optional GPU usage further increases throughput massively. A command-line tool enables rapid adoption by users. AVAILABILITY AND IMPLEMENTATION: Source code and packages are freely available at https://github.com/univieCUBE/deepnog. Install the platform-independent Python program with $ pip install deepnog. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Roman Feldbauer
- Department of Microbiology and Ecosystem Science, University of Vienna, Vienna 1090, Austria
| | - Lukas Gosch
- Department of Microbiology and Ecosystem Science, University of Vienna, Vienna 1090, Austria
| | - Lukas Lüftinger
- Department of Microbiology and Ecosystem Science, University of Vienna, Vienna 1090, Austria
- Ares Genetics GmbH, Vienna 1030, Austria
| | - Patrick Hyden
- Department of Microbiology and Ecosystem Science, University of Vienna, Vienna 1090, Austria
| | - Arthur Flexer
- Institute of Computational Perception, Johannes Kepler University Linz, Linz 4040, Austria
| | - Thomas Rattei
- Department of Microbiology and Ecosystem Science, University of Vienna, Vienna 1090, Austria
|
49
|
Nayfach S, Roux S, Seshadri R, Udwary D, Varghese N, Schulz F, Wu D, Paez-Espino D, Chen IM, Huntemann M, Palaniappan K, Ladau J, Mukherjee S, Reddy TBK, Nielsen T, Kirton E, Faria JP, Edirisinghe JN, Henry CS, Jungbluth SP, Chivian D, Dehal P, Wood-Charlson EM, Arkin AP, Tringe SG, Visel A, Woyke T, Mouncey NJ, Ivanova NN, Kyrpides NC, Eloe-Fadrosh EA. A genomic catalog of Earth's microbiomes. Nat Biotechnol 2021; 39:499-509. [PMID: 33169036 PMCID: PMC8041624 DOI: 10.1038/s41587-020-0718-6] [Citation(s) in RCA: 336] [Impact Index Per Article: 112.0] [Received: 12/24/2019] [Accepted: 09/28/2020] [Indexed: 01/02/2023]
Abstract
The reconstruction of bacterial and archaeal genomes from shotgun metagenomes has enabled insights into the ecology and evolution of environmental and host-associated microbiomes. Here we applied this approach to >10,000 metagenomes collected from diverse habitats covering all of Earth's continents and oceans, including metagenomes from human and animal hosts, engineered environments, and natural and agricultural soils, to capture extant microbial, metabolic and functional potential. This comprehensive catalog includes 52,515 metagenome-assembled genomes representing 12,556 novel candidate species-level operational taxonomic units spanning 135 phyla. The catalog expands the known phylogenetic diversity of bacteria and archaea by 44% and is broadly available for streamlined comparative analyses, interactive exploration, metabolic modeling and bulk download. We demonstrate the utility of this collection for understanding secondary-metabolite biosynthetic potential and for resolving thousands of new host linkages to uncultivated viruses. This resource underscores the value of genome-centric approaches for revealing genomic properties of uncultivated microorganisms that affect ecosystem processes.
Affiliation(s)
| | - Simon Roux
- DOE Joint Genome Institute, Berkeley, CA, USA
| | - Dongying Wu
- DOE Joint Genome Institute, Berkeley, CA, USA
| | - I-Min Chen
- DOE Joint Genome Institute, Berkeley, CA, USA
| | - T B K Reddy
- DOE Joint Genome Institute, Berkeley, CA, USA
| | - Sean P Jungbluth
- DOE Joint Genome Institute, Berkeley, CA, USA
- Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Dylan Chivian
- Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Paramvir Dehal
- Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Adam P Arkin
- Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Axel Visel
- DOE Joint Genome Institute, Berkeley, CA, USA
| | - Tanja Woyke
- DOE Joint Genome Institute, Berkeley, CA, USA
|
50
|
Mistry J, Chuguransky S, Williams L, Qureshi M, Salazar GA, Sonnhammer ELL, Tosatto SCE, Paladin L, Raj S, Richardson LJ, Finn RD, Bateman A. Pfam: The protein families database in 2021. Nucleic Acids Res 2021; 49:D412-D419. [PMID: 33125078 PMCID: PMC7779014 DOI: 10.1093/nar/gkaa913] [Citation(s) in RCA: 2514] [Impact Index Per Article: 838.0] [Received: 09/11/2020] [Revised: 10/01/2020] [Accepted: 10/06/2020] [Indexed: 12/19/2022] Open
Abstract
The Pfam database is a widely used resource for classifying protein sequences into families and domains. Since Pfam was last described in this journal, over 350 new families have been added in Pfam 33.1 and numerous improvements have been made to existing entries. To facilitate research on COVID-19, we have revised the Pfam entries that cover the SARS-CoV-2 proteome, and built new entries for regions that were not covered by Pfam. We have reintroduced Pfam-B which provides an automatically generated supplement to Pfam and contains 136 730 novel clusters of sequences that are not yet matched by a Pfam family. The new Pfam-B is based on a clustering by the MMseqs2 software. We have compared all of the regions in the RepeatsDB to those in Pfam and have started to use the results to build and refine Pfam repeat families. Pfam is freely available for browsing and download at http://pfam.xfam.org/.
Affiliation(s)
- Jaina Mistry
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Sara Chuguransky
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Lowri Williams
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Matloob Qureshi
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Gustavo A Salazar
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Erik L L Sonnhammer
- Department of Biochemistry and Biophysics, Science for Life Laboratory, Stockholm University, Box 1031, 17121 Solna, Sweden
| | - Silvio C E Tosatto
- Department of Biomedical Sciences, University of Padua, 35131 Padova, Italy
| | - Lisanna Paladin
- Department of Biomedical Sciences, University of Padua, 35131 Padova, Italy
| | - Shriya Raj
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Lorna J Richardson
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Robert D Finn
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Alex Bateman
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
|