1
Shene C, Leyton A, Flores L, Chavez D, Asenjo JA, Chisti Y. Genome-scale metabolic modeling of Thraustochytrium sp. RT2316-16: Effects of nutrients on metabolism. Biotechnol Bioeng 2024; 121:1986-2001. [PMID: 38500406 DOI: 10.1002/bit.28689] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2023] [Revised: 01/17/2024] [Accepted: 02/20/2024] [Indexed: 03/20/2024]
Abstract
Marine thraustochytrids produce metabolically important lipids such as the long-chain omega-3 polyunsaturated fatty acids, carotenoids, and sterols. Growth and lipid production in thraustochytrids depend on the composition of the culture medium, which often contains yeast extract as a source of amino acids. This work discusses the effects of individual amino acids, provided in the culture medium as the only source of nitrogen, on the production of biomass and lipids by the thraustochytrid Thraustochytrium sp. RT2316-16. A reconstructed metabolic network based on the annotated genome of RT2316-16 in combination with flux balance analysis was used to explain the observed growth and consumption of the nutrients. The culture kinetic parameters estimated from the experimental data were used to constrain the flux via the nutrient consumption rates and the specific growth rate of the triacylglycerol-free biomass in the genome-scale metabolic model (GEM) to predict the specific rate of ATP production for cell maintenance. A relationship was identified between the specific rate of ATP production for maintenance and the specific rate of glucose consumption. The GEM and the derived relationship for the production of ATP for maintenance were used in linear optimization problems to successfully predict the specific growth rate of RT2316-16 in different experimental conditions.
Affiliation(s)
- Carolina Shene
- Department of Chemical Engineering, Center of Food Biotechnology and Bioseparations, BIOREN, and Centre of Biotechnology and Bioengineering (CeBiB), Universidad de La Frontera, Temuco, Chile
- Allison Leyton
- Department of Chemical Engineering, Center of Food Biotechnology and Bioseparations, BIOREN, and Centre of Biotechnology and Bioengineering (CeBiB), Universidad de La Frontera, Temuco, Chile
- Liset Flores
- Department of Chemical Engineering, Center of Food Biotechnology and Bioseparations, BIOREN, and Centre of Biotechnology and Bioengineering (CeBiB), Universidad de La Frontera, Temuco, Chile
- Daniela Chavez
- Department of Chemical Engineering, Center of Food Biotechnology and Bioseparations, BIOREN, and Centre of Biotechnology and Bioengineering (CeBiB), Universidad de La Frontera, Temuco, Chile
- Juan A Asenjo
- Department of Chemical Engineering, Biotechnology and Materials, Centre for Biotechnology and Bioengineering (CeBiB), Universidad de Chile, Santiago, Chile
- Yusuf Chisti
- Institute of Tropical Aquaculture and Fisheries, Universiti Malaysia Terengganu, Kuala Nerus, Terengganu, Malaysia
2
Pena MM, Martins TZ, Teper D, Zamuner C, Alves HA, Ferreira H, Wang N, Ferro MIT, Ferro JA. EnvC Homolog Encoded by Xanthomonas citri subsp. citri Is Necessary for Cell Division and Virulence. Microorganisms 2024; 12:691. [PMID: 38674634 PMCID: PMC11051873 DOI: 10.3390/microorganisms12040691] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2023] [Revised: 03/16/2024] [Accepted: 03/21/2024] [Indexed: 04/28/2024] Open
Abstract
Peptidoglycan hydrolases are enzymes responsible for breaking the peptidoglycan present in the bacterial cell wall, facilitating cell growth, cell division and peptidoglycan turnover. Xanthomonas citri subsp. citri (X. citri), the causal agent of citrus canker, encodes a homolog of the Escherichia coli M23 peptidase EnvC. EnvC is a LytM factor essential for cleaving the septal peptidoglycan, thereby facilitating the separation of daughter cells. This study investigated the contribution of EnvC to the virulence and cell separation of X. citri. Disruption of the X. citri envC gene (ΔenvC) led to a reduction in virulence. Upon inoculation into leaves of Rangpur lime (Citrus limonia Osbeck), X. citri ΔenvC exhibited a delayed onset of citrus canker symptoms compared with wild-type X. citri. Mutant complementation restored the wild-type phenotype. Sub-cellular localization confirmed that X. citri EnvC is a periplasmic protein. Moreover, the X. citri ΔenvC mutant exhibited elongated cells, indicating a defect in cell division. These findings support the role of EnvC in the regulation of cell wall organization and cell division, and they clarify the role of this peptidase in X. citri virulence.
Affiliation(s)
- Michelle M. Pena
- Agricultural and Livestock Microbiology Graduation Program, School of Agricultural and Veterinarian Sciences, São Paulo State University (UNESP), Jaboticabal 14884-900, SP, Brazil; (M.M.P.); (T.Z.M.)
- Thaisa Z. Martins
- Agricultural and Livestock Microbiology Graduation Program, School of Agricultural and Veterinarian Sciences, São Paulo State University (UNESP), Jaboticabal 14884-900, SP, Brazil; (M.M.P.); (T.Z.M.)
- Doron Teper
- Department of Plant Pathology and Weed Research, Institute of Plant Protection Agricultural Research Organization (ARO), Volcani Institute, Rishon LeZion 7505101, Israel;
- Caio Zamuner
- Biochemistry Building, Institute of Biosciences, São Paulo State University (UNESP), Rio Claro 13506-900, SP, Brazil; (C.Z.); (H.F.)
- Helen A. Alves
- Department of Agricultural, Livestock and Environmental Biotechnology, School of Agricultural and Veterinary Sciences, São Paulo State University (UNESP), Jaboticabal 14884-900, SP, Brazil; (H.A.A.); (M.I.T.F.)
- Henrique Ferreira
- Biochemistry Building, Institute of Biosciences, São Paulo State University (UNESP), Rio Claro 13506-900, SP, Brazil; (C.Z.); (H.F.)
- Nian Wang
- Citrus Research and Education Center, Department of Microbiology and Cell Science, Institute of Food and Agricultural Sciences, University of Florida, Lake Alfred, FL 33850, USA;
- Maria Inês T. Ferro
- Department of Agricultural, Livestock and Environmental Biotechnology, School of Agricultural and Veterinary Sciences, São Paulo State University (UNESP), Jaboticabal 14884-900, SP, Brazil; (H.A.A.); (M.I.T.F.)
- Jesus A. Ferro
- Department of Agricultural, Livestock and Environmental Biotechnology, School of Agricultural and Veterinary Sciences, São Paulo State University (UNESP), Jaboticabal 14884-900, SP, Brazil; (H.A.A.); (M.I.T.F.)
3
Fang T, Szklarczyk D, Hachilif R, von Mering C. Enhancing coevolutionary signals in protein-protein interaction prediction through clade-wise alignment integration. Sci Rep 2024; 14:6009. [PMID: 38472223 PMCID: PMC10933411 DOI: 10.1038/s41598-024-55655-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2023] [Accepted: 02/26/2024] [Indexed: 03/14/2024] Open
Abstract
Protein-protein interactions (PPIs) play essential roles in most biological processes. The binding interfaces between interacting proteins impose evolutionary constraints that have successfully been employed to predict PPIs from multiple sequence alignments (MSAs). To construct MSAs, critical choices have to be made: how to ensure the reliable identification of orthologs, and how to optimally balance the need for large alignments versus sufficient alignment quality. Here, we propose a divide-and-conquer strategy for MSA generation: instead of building a single, large alignment for each protein, multiple distinct alignments are constructed under distinct clades in the tree of life. Coevolutionary signals are searched separately within these clades, and are only subsequently integrated using machine learning techniques. We find that this strategy markedly improves overall prediction performance, concomitant with better alignment quality. Using the popular DCA algorithm to systematically search pairs of such alignments, a genome-wide all-against-all interaction scan in a bacterial genome is demonstrated. Given the recent successes of AlphaFold in predicting direct PPIs at atomic detail, a discover-and-refine approach is proposed: our method could provide a fast and accurate strategy for pre-screening the entire genome, submitting to AlphaFold only promising interaction candidates, thus reducing false positives as well as computation time.
Affiliation(s)
- Tao Fang
- Department of Molecular Life Sciences, University of Zurich, 8057, Zurich, Switzerland
- SIB Swiss Institute of Bioinformatics, 1015, Lausanne, Switzerland
- Damian Szklarczyk
- Department of Molecular Life Sciences, University of Zurich, 8057, Zurich, Switzerland
- SIB Swiss Institute of Bioinformatics, 1015, Lausanne, Switzerland
- Radja Hachilif
- Department of Molecular Life Sciences, University of Zurich, 8057, Zurich, Switzerland
- SIB Swiss Institute of Bioinformatics, 1015, Lausanne, Switzerland
- Christian von Mering
- Department of Molecular Life Sciences, University of Zurich, 8057, Zurich, Switzerland.
- SIB Swiss Institute of Bioinformatics, 1015, Lausanne, Switzerland.
4
Raghuraman P, Ramireddy S, Raman G, Park S, Sudandiradoss C. Understanding a point mutation signature D54K in the caspase activation recruitment domain of NOD1 capitulating concerted immunity via atomistic simulation. J Biomol Struct Dyn 2024:1-17. [PMID: 38415678 DOI: 10.1080/07391102.2024.2322618] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2023] [Accepted: 12/11/2023] [Indexed: 02/29/2024]
Abstract
Point mutation D54K in the human N-terminal caspase recruitment domain (CARD) of nucleotide-binding oligomerization domain 1 (NOD1) abrogates an imperative downstream interaction with receptor-interacting protein kinase (RIPK2) that entails combating bacterial infections and inflammatory dysfunction. Here, we addressed the molecular details concerning conformational changes and interaction patterns (monomeric-dimeric states) of D54K by signature-based molecular dynamics simulation. Initially, the sequence analysis prioritized D54K as a pathogenic mutation, among other variants, based on a sequence signature. Since the mutation is highly conserved, we derived the distant ortholog to predict the sequence and structural similarity between native and mutant. This analysis showed the utility of 33 communal core residues associated with structural-functional preservation and variations, and concurrently served to infer the cryptic hotspots Cys39, Glu53, Asp54, Glu56, Ile57, Leu74, and Lys78 determining the inter-helical fold forming homodimers for putative receptor interaction. Subsequently, the atomistic simulations with free energy (MM/PB(GB)SA) calculations predicted a structural alteration in the N-terminal mutant CARD, where coils changed to helices (residues 45-83: α3-L4-α4-L6-α6) in contrast to native (residues 45-83: T2-L4-α4-L6-T4). Likewise, the C-terminal helices (residues 93-105: T1-α7) connected to the loops were distorted compared to native (residues 93-105: α6-L7), which may result in conformational misfolding that promotes functional regulation and activation. These structural perturbations of D54K possibly destabilize the flexible adaptation of critical homotypic NOD1 CARD-RIPK2 CARD interactions (α4 Asp42-Arg488 α5 and α6 Phe86-Lys471 α4), which is consistent with earlier experimental reports. Altogether, our findings unveil the conformational plasticity of mutation-dependent immunomodulatory response and may aid in functional validation exploring clinical investigation on CARD-regulated immunotherapies to prevent systemic infection and inflammation. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- P Raghuraman
- Department of Biotechnology, School of Bioscience and Technology, Vellore Institute of Technology, Vellore, India
- Department of Life Sciences, Yeungnam University, Gyeongsan, Gyeongsangbuk-do, Republic of Korea
- Sriroopreddy Ramireddy
- Department of Biotechnology, School of Bioscience and Technology, Vellore Institute of Technology, Vellore, India
- Department of Genetics and Molecular Biology, School of Health Sciences, The Apollo University, Chittoor, India
- Gurusamy Raman
- Department of Life Sciences, Yeungnam University, Gyeongsan, Gyeongsangbuk-do, Republic of Korea
- SeonJoo Park
- Department of Life Sciences, Yeungnam University, Gyeongsan, Gyeongsangbuk-do, Republic of Korea
- C Sudandiradoss
- Department of Biotechnology, School of Bioscience and Technology, Vellore Institute of Technology, Vellore, India
5
Bonello J, Orengo C. FunPredCATH: An ensemble method for predicting protein function using CATH. Biochim Biophys Acta Proteins Proteom 2024; 1872:140985. [PMID: 38122964 DOI: 10.1016/j.bbapap.2023.140985] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2023] [Revised: 12/05/2023] [Accepted: 12/06/2023] [Indexed: 12/23/2023]
Abstract
MOTIVATION The number of unannotated proteins in UniProt grows at a very high rate every year due to more efficient sequencing methods. However, the experimental annotation of proteins is a lengthy and expensive process. Using computational techniques to narrow the search can speed up the process by providing highly specific Gene Ontology (GO) terms. METHODOLOGY We propose an ensemble approach that combines three generic base predictors that predict Gene Ontology (BP, CC and MF) terms from sequences across different species. We train our models on UniProtGOA annotation data and use the CATH domain resources to identify the protein families. We then calculate a score based on the prevalence of individual GO terms in the functional families, which is used as an indicator of confidence when assigning the GO term to an uncharacterised protein. METHODS In the ensemble, we use a statistics-based method that scores the occurrence of GO terms in a CATH FunFam against a background set of proteins annotated by the same GO term. We also developed a set-based method that uses Set Intersection and Set Union to score the occurrence of GO terms within the same CATH FunFam. Finally, we also use FunFams-Plus, a predictor method developed by the Orengo Group at UCL to predict GO terms for uncharacterised proteins in the CAFA3 challenge. EVALUATION We evaluated the methods against the CAFA3 benchmark and DomFun. We used the Precision, Recall and Fmax metrics and the benchmark datasets that are used in CAFA3 to evaluate our models and compare them to the CAFA3 results. Our results show that FunPredCATH compares well with top CAFA methods in the different ontologies and benchmarks. CONTRIBUTIONS FunPredCATH compares well with other prediction methods on CAFA3, and the ensemble approach outperforms the base methods. We show that non-IEA models obtain higher Fmax scores than the IEA counterparts, while the models including IEA annotations have higher coverage at the expense of a lower Fmax score.
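The Fmax metric named in the abstract can be illustrated with a minimal sketch. It assumes the protein-centric definition commonly used in CAFA evaluations: at each score threshold, precision is averaged over proteins with at least one prediction, recall is averaged over all benchmark proteins, and Fmax is the best harmonic mean over thresholds. The function name, input layout, and threshold grid are hypothetical choices for illustration, not taken from FunPredCATH.

```python
from typing import Dict, Set


def fmax(predictions: Dict[str, Dict[str, float]],
         truth: Dict[str, Set[str]],
         steps: int = 100) -> float:
    """Protein-centric Fmax over thresholds 1/steps, 2/steps, ..., 1.

    predictions maps protein -> {GO term: confidence score in [0, 1]}.
    truth maps protein -> non-empty set of true GO terms (assumed layout).
    """
    best = 0.0
    for i in range(1, steps + 1):
        threshold = i / steps
        precisions = []   # one entry per protein with >= 1 prediction
        recall_sum = 0.0  # summed over every benchmark protein
        for protein, true_terms in truth.items():
            scored = predictions.get(protein, {})
            predicted = {t for t, s in scored.items() if s >= threshold}
            hits = len(predicted & true_terms)
            if predicted:
                precisions.append(hits / len(predicted))
            recall_sum += hits / len(true_terms)
        if not precisions:
            continue  # no protein has a prediction at this threshold
        precision = sum(precisions) / len(precisions)
        recall = recall_sum / len(truth)
        if precision + recall > 0:
            best = max(best, 2 * precision * recall / (precision + recall))
    return best


example_predictions = {
    "P1": {"GO:1": 0.9, "GO:2": 0.4},
    "P2": {"GO:3": 0.8},
}
example_truth = {"P1": {"GO:1"}, "P2": {"GO:3", "GO:4"}}
print(round(fmax(example_predictions, example_truth), 3))
```

Because precision ignores proteins without predictions while recall does not, a model with higher coverage can still score a lower Fmax, which is the trade-off the abstract reports for IEA versus non-IEA models.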
Affiliation(s)
- Joseph Bonello
- Department of Structural and Molecular Biology, University College London, Gower Street, London WC1E 6BT, United Kingdom; Department of Computer Information Systems, University of Malta, Faculty of ICT, Msida, MSD 2080, Malta.
- Christine Orengo
- Department of Structural and Molecular Biology, University College London, Gower Street, London WC1E 6BT, United Kingdom
6
Hellmuth M, Stadler PF. The Theory of Gene Family Histories. Methods Mol Biol 2024; 2802:1-32. [PMID: 38819554 DOI: 10.1007/978-1-0716-3838-5_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/01/2024]
Abstract
Most genes are part of larger families of evolutionarily related genes. The history of gene families typically involves duplications and losses of genes as well as horizontal transfers into other organisms. The reconstruction of detailed gene family histories, i.e., the precise dating of evolutionary events relative to the phylogenetic tree of the underlying species, has remained a challenging topic despite its importance as a basis for detailed investigations into adaptation and functional evolution of individual members of the gene family. The identification of orthologs, moreover, is a particularly important subproblem of the more general setting considered here. In the last few years, an extensive body of mathematical results has appeared that tightly links orthology, a formal notion of best matches among genes, and horizontal gene transfer. The purpose of this chapter is to broadly outline some of the key mathematical insights and to discuss their implications for practical applications. In particular, we focus on tree-free methods, i.e., methods to infer orthology or horizontal gene transfer as well as gene trees, species trees, and reconciliations between them without using a priori knowledge of the underlying trees or statistical models for the inference of phylogenetic trees. Instead, the initial step aims to extract binary relations among genes.
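As a rough illustration of extracting a binary relation among genes without any tree, the sketch below computes reciprocal best matches between two species. It is not the chapter's formal construction: it assumes that a pairwise similarity score is an acceptable proxy for evolutionary relatedness, and the function name and input layout are hypothetical.

```python
from typing import Dict, Set, Tuple


def reciprocal_best_matches(
        scores: Dict[Tuple[str, str], float]) -> Set[Tuple[str, str]]:
    """Return gene pairs (a, b) that are each other's best-scoring match.

    scores maps (gene in species A, gene in species B) -> similarity.
    Ties are broken by first occurrence, a simplification of this sketch.
    """
    best_for_a: Dict[str, str] = {}
    best_for_b: Dict[str, str] = {}
    for (a, b), score in scores.items():
        # Track the highest-scoring partner seen so far for each gene.
        if a not in best_for_a or score > scores[(a, best_for_a[a])]:
            best_for_a[a] = b
        if b not in best_for_b or score > scores[(best_for_b[b], b)]:
            best_for_b[b] = a
    # Keep only pairs where the best-match relation holds in both directions.
    return {(a, b) for a, b in best_for_a.items() if best_for_b[b] == a}


example_scores = {
    ("a1", "b1"): 0.9,
    ("a1", "b2"): 0.5,
    ("a2", "b1"): 0.7,
    ("a2", "b2"): 0.6,
}
print(reciprocal_best_matches(example_scores))
```

In this toy input, a2's best match is b1, but b1's best match is a1, so only the pair (a1, b1) is symmetric. Pairs of this kind are commonly treated as candidate orthologs, to be refined by further analysis.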
Affiliation(s)
- Marc Hellmuth
- Department of Mathematics, Faculty of Science, Stockholm University, Stockholm, Sweden
- Peter F Stadler
- Bioinformatics Group, Department of Computer Science, Leipzig University, Leipzig, Germany.
- Interdisciplinary Center for Bioinformatics, Leipzig University, Leipzig, Germany.
- Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany.
- Universidad Nacional de Colombia, Bogotá, Colombia.
- Institute for Theoretical Chemistry, University of Vienna, Wien, Austria.
- Center for non-coding RNA in Technology and Health, University of Copenhagen, Frederiksberg, Denmark.
- Santa Fe Institute, Santa Fe, NM, USA.
7
Mallikarjuna MG, Tomar R, Lohithaswa HC, Sahu S, Mishra DC, Rao AR, Chinnusamy V. Genome-wide identification of potassium channels in maize showed evolutionary patterns and variable functional responses to abiotic stresses. Plant Physiol Biochem 2024; 206:108235. [PMID: 38039585 DOI: 10.1016/j.plaphy.2023.108235] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2023] [Revised: 11/08/2023] [Accepted: 11/22/2023] [Indexed: 12/03/2023]
Abstract
Potassium (K) channels are essential components of plant biology, mediating not only K ion (K+) homeostasis but also regulating several physiological processes and stress tolerance. In the current investigation, we identified 27 K+ channels in maize and deciphered their evolution and divergence patterns relative to four monocot and five dicot species. Chromosomal localization and expansion of K+ channel genes showed uneven distribution and were independent of genome size. Dispersed duplication is the major force in expanding K+ channels in the target genomes. The mean Ka/Ks ratio of <0.5 in paralogs and orthologs indicates horizontal and vertical expansions of K+ channel genes under strong purifying selection. The one-to-one K+ channel orthologs were prominent among the closely related species, with higher synteny between maize and the rest of the monocots. Comprehensive promoter analysis of K+ channels revealed various cis-regulatory elements mediating stress tolerance, with a predominance of MYB and STRE binding sites. The regulatory network showed that AP2-EREBP TFs, miR164 and miR399 are prominent regulatory elements of K+ channels. The qRT-PCR analysis of K+ channels and regulatory miRNAs showed significant expression in response to drought and waterlogging stresses. The present study expands the knowledge on K+ channels in maize and will serve as a basis for an in-depth functional analysis.
Affiliation(s)
- Rakhi Tomar
- Division of Genetics, ICAR- Indian Agricultural Research Institute, New Delhi, 110012, India
- Sarika Sahu
- Division of Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, 110012, India
- Dwijesh Chandra Mishra
- Division of Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, 110012, India
- Atmakuri Ramakrishna Rao
- Division of Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, 110012, India
- Viswanathan Chinnusamy
- Division of Plant Physiology, ICAR- Indian Agricultural Research Institute, New Delhi, 110012, India
8
Carhuaricra-Huaman D, Setubal JC. Protein-Coding Gene Families in Prokaryote Genome Comparisons. Methods Mol Biol 2024; 2802:33-55. [PMID: 38819555 DOI: 10.1007/978-1-0716-3838-5_2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/01/2024]
Abstract
The identification of orthologous genes is relevant for comparative genomics, phylogenetic analysis, and functional annotation. There are many computational tools for the prediction of orthologous groups as well as web-based resources that offer orthology datasets for download and online analysis. This chapter presents a simple and practical guide to the process of orthologous group prediction, using a dataset of 10 prokaryotic proteomes as an example. The orthology methods covered are OrthoMCL, COGtriangles, OrthoFinder2, and OMA. The authors compare the number of orthologous groups predicted by these methods and present a brief workflow for the functional annotation and reconstruction of phylogenies from inferred single-copy orthologous genes. The chapter also demonstrates how to explore two orthology databases: eggNOG6 and OrthoDB.
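The step of selecting single-copy orthologous genes for phylogeny reconstruction can be sketched as a simple filter. This is a minimal illustration: the dictionary layout and function name are hypothetical and do not correspond to the output format of OrthoMCL, COGtriangles, OrthoFinder2, or OMA. A group is kept only if every species in the dataset contributes exactly one gene.

```python
from collections import Counter
from typing import Dict, List, Tuple


def single_copy_groups(groups: Dict[str, List[Tuple[str, str]]],
                       species: List[str]) -> List[str]:
    """Return IDs of groups with exactly one gene from every species.

    groups maps group ID -> list of (species, gene) members (assumed layout).
    """
    wanted = set(species)
    selected = []
    for group_id, members in groups.items():
        copies = Counter(sp for sp, _gene in members)
        # All species present, and none represented more than once.
        if set(copies) == wanted and all(n == 1 for n in copies.values()):
            selected.append(group_id)
    return selected


example_groups = {
    "OG1": [("A", "a1"), ("B", "b1"), ("C", "c1")],              # single-copy
    "OG2": [("A", "a2"), ("A", "a3"), ("B", "b2"), ("C", "c2")],  # paralogs in A
    "OG3": [("A", "a4"), ("B", "b3")],                            # C is missing
}
print(single_copy_groups(example_groups, ["A", "B", "C"]))
```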
Affiliation(s)
- Dennis Carhuaricra-Huaman
- Programa de Pós-Graduação Interunidades em Bioinformática, Instituto de Matemática e Estatística, Universidade de São Paulo, São Paulo, SP, Brazil
- Research Group in Biotechnology Applied to Animal Health, Production and Conservation (SANIGEN), Laboratory of Biology and Molecular Genetics, Faculty of Veterinary Medicine, Universidad Nacional Mayor de San Marcos, Lima, Peru
- João Carlos Setubal
- Departamento de Bioquímica, Instituto de Química, Universidade de São Paulo, São Paulo, SP, Brazil.
9
Singleton M, Eisen M. Leveraging genomic redundancy to improve inference and alignment of orthologous proteins. G3 (Bethesda) 2023; 13:jkad222. [PMID: 37770067 PMCID: PMC10700111 DOI: 10.1093/g3journal/jkad222] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2023] [Revised: 09/11/2023] [Accepted: 09/19/2023] [Indexed: 10/03/2023]
Abstract
Identifying protein sequences with common ancestry is a core task in bioinformatics and evolutionary biology. However, methods for inferring and aligning such sequences in annotated genomes have not kept pace with the increasing scale and complexity of the available data. Thus, in this work, we implemented several improvements to the traditional methodology that more fully leverage the redundancy of closely related genomes and the organization of their annotations. Two highlights include the application of the more flexible k-clique percolation algorithm for identifying clusters of orthologous proteins and the development of a novel technique for removing poorly supported regions of alignments with a phylogenetic hidden Markov model (phylo-HMM). In making the latter, we wrote a fully documented Python package Homomorph that implements standard HMM algorithms and created a set of tutorials to promote its use by a wide audience. We applied the resulting pipeline to a set of 33 annotated Drosophila genomes, generating 22,813 orthologous groups and 8,566 high-quality alignments.
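The k-clique percolation idea mentioned in the abstract can be sketched on a toy graph. This is a brute-force illustration for small inputs and not the authors' pipeline: it enumerates all k-cliques, treats two k-cliques as adjacent when they share k-1 nodes, and reports the union of each connected set of cliques as a community. The function name and edge-list input are assumptions made for the example.

```python
from itertools import combinations
from typing import Dict, Hashable, Iterable, List, Set, Tuple


def k_clique_communities(edges: Iterable[Tuple[Hashable, Hashable]],
                         k: int) -> List[Set[Hashable]]:
    """Brute-force k-clique percolation; suitable for small graphs only."""
    adjacency: Dict[Hashable, Set[Hashable]] = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    # Enumerate every k-subset of nodes whose members are pairwise connected.
    cliques = [
        frozenset(nodes)
        for nodes in combinations(sorted(adjacency), k)
        if all(b in adjacency[a] for a, b in combinations(nodes, 2))
    ]

    # Union-find over cliques: merge cliques that overlap in k-1 nodes.
    parent = list(range(len(cliques)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)

    communities: Dict[int, Set[Hashable]] = {}
    for i, clique in enumerate(cliques):
        communities.setdefault(find(i), set()).update(clique)
    return list(communities.values())


example_edges = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4),
                 (4, 5), (5, 6), (5, 7), (6, 7)]
print(k_clique_communities(example_edges, 3))
```

With k = 3, the triangles {1, 2, 3} and {2, 3, 4} share an edge and merge into one community, while the single edge between nodes 4 and 5 is not enough to join them to {5, 6, 7}. Unlike a partition, this scheme lets a node belong to more than one community, which may be the flexibility the abstract refers to.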
Affiliation(s)
- Marc Singleton
- Howard Hughes Medical Institute, University of California Berkeley, Berkeley, CA 94720, USA
- Michael Eisen
- Howard Hughes Medical Institute, University of California Berkeley, Berkeley, CA 94720, USA
- Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, CA 94720, USA
10
Kirilenko BM, Munegowda C, Osipova E, Jebb D, Sharma V, Blumer M, Morales AE, Ahmed AW, Kontopoulos DG, Hilgers L, Lindblad-Toh K, Karlsson EK, Hiller M, Andrews G, Armstrong JC, Bianchi M, Birren BW, Bredemeyer KR, Breit AM, Christmas MJ, Clawson H, Damas J, Di Palma F, Diekhans M, Dong MX, Eizirik E, Fan K, Fanter C, Foley NM, Forsberg-Nilsson K, Garcia CJ, Gatesy J, Gazal S, Genereux DP, Goodman L, Grimshaw J, Halsey MK, Harris AJ, Hickey G, Hiller M, Hindle AG, Hubley RM, Hughes GM, Johnson J, Juan D, Kaplow IM, Karlsson EK, Keough KC, Kirilenko B, Koepfli KP, Korstian JM, Kowalczyk A, Kozyrev SV, Lawler AJ, Lawless C, Lehmann T, Levesque DL, Lewin HA, Li X, Lind A, Lindblad-Toh K, Mackay-Smith A, Marinescu VD, Marques-Bonet T, Mason VC, Meadows JRS, Meyer WK, Moore JE, Moreira LR, Moreno-Santillan DD, Morrill KM, Muntané G, Murphy WJ, Navarro A, Nweeia M, Ortmann S, Osmanski A, Paten B, Paulat NS, Pfenning AR, Phan BN, Pollard KS, Pratt HE, Ray DA, Reilly SK, Rosen JR, Ruf I, Ryan L, Ryder OA, Sabeti PC, Schäffer DE, Serres A, Shapiro B, Smit AFA, Springer M, Srinivasan C, Steiner C, Storer JM, Sullivan KAM, Sullivan PF, Sundström E, Supple MA, Swofford R, Talbot JE, Teeling E, Turner-Maier J, Valenzuela A, Wagner F, Wallerman O, Wang C, Wang J, Weng Z, Wilder AP, Wirthlin ME, Xue JR, Zhang X. Integrating gene annotation with orthology inference at scale. Science 2023; 380:eabn3107. [PMID: 37104600 DOI: 10.1126/science.abn3107] [Citation(s) in RCA: 31] [Impact Index Per Article: 31.0] [Indexed: 04/29/2023]
Abstract
Annotating coding genes and inferring orthologs are two classical challenges in genomics and evolutionary biology that have traditionally been approached separately, limiting scalability. We present TOGA (Tool to infer Orthologs from Genome Alignments), a method that integrates structural gene annotation and orthology inference. TOGA implements a different paradigm to infer orthologous loci, improves ortholog detection and annotation of conserved genes compared with state-of-the-art methods, and handles even highly fragmented assemblies. TOGA scales to hundreds of genomes, which we demonstrate by applying it to 488 placental mammal and 501 bird assemblies, creating the largest comparative gene resources so far. Additionally, TOGA detects gene losses, enables selection screens, and automatically provides a superior measure of mammalian genome quality. TOGA is a powerful and scalable method to annotate and compare genes in the genomic era.
Affiliation(s)
- Bogdan M Kirilenko
- Max Planck Institute of Molecular Cell Biology and Genetics, 01307 Dresden, Germany
- Max Planck Institute for the Physics of Complex Systems, 01187 Dresden, Germany
- Center for Systems Biology Dresden, 01307 Dresden, Germany
- LOEWE Centre for Translational Biodiversity Genomics, 60325 Frankfurt, Germany
- Senckenberg Research Institute, 60325 Frankfurt, Germany
- Goethe University Frankfurt, Faculty of Biosciences, 60438 Frankfurt, Germany
- Chetan Munegowda
- Max Planck Institute of Molecular Cell Biology and Genetics, 01307 Dresden, Germany
- Max Planck Institute for the Physics of Complex Systems, 01187 Dresden, Germany
- Center for Systems Biology Dresden, 01307 Dresden, Germany
- LOEWE Centre for Translational Biodiversity Genomics, 60325 Frankfurt, Germany
- Senckenberg Research Institute, 60325 Frankfurt, Germany
- Goethe University Frankfurt, Faculty of Biosciences, 60438 Frankfurt, Germany
- Ekaterina Osipova
- Max Planck Institute of Molecular Cell Biology and Genetics, 01307 Dresden, Germany
- Max Planck Institute for the Physics of Complex Systems, 01187 Dresden, Germany
- Center for Systems Biology Dresden, 01307 Dresden, Germany
- LOEWE Centre for Translational Biodiversity Genomics, 60325 Frankfurt, Germany
- Senckenberg Research Institute, 60325 Frankfurt, Germany
- Goethe University Frankfurt, Faculty of Biosciences, 60438 Frankfurt, Germany
- David Jebb
- Max Planck Institute of Molecular Cell Biology and Genetics, 01307 Dresden, Germany
- Max Planck Institute for the Physics of Complex Systems, 01187 Dresden, Germany
- Center for Systems Biology Dresden, 01307 Dresden, Germany
- Virag Sharma
- Max Planck Institute of Molecular Cell Biology and Genetics, 01307 Dresden, Germany
- Max Planck Institute for the Physics of Complex Systems, 01187 Dresden, Germany
- Center for Systems Biology Dresden, 01307 Dresden, Germany
- Moritz Blumer
- Max Planck Institute of Molecular Cell Biology and Genetics, 01307 Dresden, Germany
- Max Planck Institute for the Physics of Complex Systems, 01187 Dresden, Germany
- Center for Systems Biology Dresden, 01307 Dresden, Germany
- Ariadna E Morales
- LOEWE Centre for Translational Biodiversity Genomics, 60325 Frankfurt, Germany
- Senckenberg Research Institute, 60325 Frankfurt, Germany
- Goethe University Frankfurt, Faculty of Biosciences, 60438 Frankfurt, Germany
- Alexis-Walid Ahmed
- LOEWE Centre for Translational Biodiversity Genomics, 60325 Frankfurt, Germany
- Senckenberg Research Institute, 60325 Frankfurt, Germany
- Goethe University Frankfurt, Faculty of Biosciences, 60438 Frankfurt, Germany
- Dimitrios-Georgios Kontopoulos
- LOEWE Centre for Translational Biodiversity Genomics, 60325 Frankfurt, Germany
- Senckenberg Research Institute, 60325 Frankfurt, Germany
- Goethe University Frankfurt, Faculty of Biosciences, 60438 Frankfurt, Germany
- Leon Hilgers
- LOEWE Centre for Translational Biodiversity Genomics, 60325 Frankfurt, Germany
- Senckenberg Research Institute, 60325 Frankfurt, Germany
- Goethe University Frankfurt, Faculty of Biosciences, 60438 Frankfurt, Germany
- Kerstin Lindblad-Toh
- Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, 751 32 Uppsala, Sweden
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Elinor K Karlsson
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
- Program in Molecular Medicine, UMass Chan Medical School, Worcester, MA 01605, USA
- Michael Hiller
- Max Planck Institute of Molecular Cell Biology and Genetics, 01307 Dresden, Germany
- Max Planck Institute for the Physics of Complex Systems, 01187 Dresden, Germany
- Center for Systems Biology Dresden, 01307 Dresden, Germany
- LOEWE Centre for Translational Biodiversity Genomics, 60325 Frankfurt, Germany
- Senckenberg Research Institute, 60325 Frankfurt, Germany
- Goethe University Frankfurt, Faculty of Biosciences, 60438 Frankfurt, Germany
11
Persson E, Sonnhammer ELL. InParanoiDB 9: Ortholog Groups for Protein Domains and Full-Length Proteins. J Mol Biol 2023:168001. [PMID: 36764355 DOI: 10.1016/j.jmb.2023.168001] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 11/30/2022] [Revised: 01/20/2023] [Accepted: 02/01/2023] [Indexed: 02/11/2023]
Abstract
Prediction of orthologs is an important bioinformatics pursuit that is frequently used for inferring protein function and evolutionary analyses. The InParanoid database is a well known resource of ortholog predictions between a wide variety of organisms. Although orthologs have historically been inferred at the level of full-length protein sequences, many proteins consist of several independent protein domains that may be orthologous to domains in other proteins in a way that differs from the full-length protein case. To be able to capture all types of orthologous relations, conventional full-length protein orthologs can be complemented with orthologs inferred at the domain level. We here present InParanoiDB 9, covering 640 species and providing orthologs for both protein domains and full-length proteins. InParanoiDB 9 was built using the faster InParanoid-DIAMOND algorithm for orthology analysis, as well as Domainoid and Pfam to infer orthologous domains. InParanoiDB 9 is based on proteomes from 447 eukaryotes, 158 bacteria and 35 archaea, and includes over one billion predicted ortholog groups. A new website has been built for the database, providing multiple search options as well as visualization of groups of orthologs and orthologous domains. This release constitutes a major upgrade of the InParanoid database in terms of the number of species as well as the new capability to operate on the domain level. InParanoiDB 9 is available at https://inparanoidb.sbc.su.se/.
Affiliation(s)
- Emma Persson
- Department of Biochemistry and Biophysics, Stockholm University, Science for Life Laboratory, Box 1031, 17121 Solna, Sweden.
- Erik L L Sonnhammer
- Department of Biochemistry and Biophysics, Stockholm University, Science for Life Laboratory, Box 1031, 17121 Solna, Sweden.
12
Moris VC, Podsiadlowski L, Martin S, Oeyen JP, Donath A, Petersen M, Wilbrandt J, Misof B, Liedtke D, Thamm M, Scheiner R, Schmitt T, Niehuis O. Intrasexual cuticular hydrocarbon dimorphism in a wasp sheds light on hydrocarbon biosynthesis genes in Hymenoptera. Commun Biol 2023; 6:147. [PMID: 36737661 PMCID: PMC9898505 DOI: 10.1038/s42003-022-04370-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/28/2022] [Accepted: 12/13/2022] [Indexed: 02/05/2023] Open
Abstract
Cuticular hydrocarbons (CHCs) cover the cuticle of insects and serve as desiccation barrier and as semiochemicals. While the main enzymatic steps of CHC biosynthesis are well understood, few of the underlying genes have been identified. Here we show how exploitation of intrasexual CHC dimorphism in a mason wasp, Odynerus spinipes, in combination with whole-genome sequencing and comparative transcriptomics facilitated identification of such genes. RNAi-mediated knockdown of twelve candidate gene orthologs in the honey bee, Apis mellifera, confirmed nine genes impacting CHC profile composition. Most of them have predicted functions consistent with current knowledge of CHC metabolism. However, we found first-time evidence for a fatty acid amide hydrolase also influencing CHC profile composition. In situ hybridization experiments furthermore suggest trophocytes participating in CHC biosynthesis. Our results set the base for experimental CHC profile manipulation in Hymenoptera and imply that the evolutionary origin of CHC biosynthesis predates the arthropods' colonization of land.
Affiliation(s)
- Victoria C. Moris
- Department of Evolutionary Biology and Ecology, Institute of Biology I (Zoology), University of Freiburg, 79104 Freiburg, Germany; Laboratory of Molecular Biology & Evolution (MBE), Department of Biology, Université Libre de Bruxelles, 1000 Brussels, Belgium
- Lars Podsiadlowski
- Centre for Molecular Biodiversity Research, Leibniz Institute for the Analysis of Biodiversity Change / ZFMK, Museum Koenig, Adenauerallee 160, 53113 Bonn, Germany; Institute of Evolutionary Biology and Ecology, University of Bonn, An der Immenburg 1, 53121 Bonn, Germany
- Sebastian Martin
- Centre for Molecular Biodiversity Research, Leibniz Institute for the Analysis of Biodiversity Change / ZFMK, Museum Koenig, Adenauerallee 160, 53113 Bonn, Germany; Institute of Evolutionary Biology and Ecology, University of Bonn, An der Immenburg 1, 53121 Bonn, Germany
- Jan Philip Oeyen
- Centre for Molecular Biodiversity Research, Leibniz Institute for the Analysis of Biodiversity Change / ZFMK, Museum Koenig, Adenauerallee 160, 53113 Bonn, Germany; Centre for Ecological and Evolutionary Synthesis, Department of Biosciences, University of Oslo, NO-0316 Oslo, Norway
- Alexander Donath
- Centre for Molecular Biodiversity Research, Leibniz Institute for the Analysis of Biodiversity Change / ZFMK, Museum Koenig, Adenauerallee 160, 53113 Bonn, Germany
- Malte Petersen
- Centre for Molecular Biodiversity Research, Leibniz Institute for the Analysis of Biodiversity Change / ZFMK, Museum Koenig, Adenauerallee 160, 53113 Bonn, Germany; High Performance Computing & Analytics Lab, University of Bonn, Friedrich-Hirzebruch-Allee 8, 53115 Bonn, Germany
- Jeanne Wilbrandt
- Centre for Molecular Biodiversity Research, Leibniz Institute for the Analysis of Biodiversity Change / ZFMK, Museum Koenig, Adenauerallee 160, 53113 Bonn, Germany; Leibniz Institute on Aging – Fritz Lipmann Institute, Beutenbergstraße 11, 07745 Jena, Germany
- Bernhard Misof
- Centre for Molecular Biodiversity Research, Leibniz Institute for the Analysis of Biodiversity Change / ZFMK, Museum Koenig, Adenauerallee 160, 53113 Bonn, Germany
- Daniel Liedtke
- Institute of Human Genetics, University of Würzburg, Am Hubland, 97074 Würzburg, Germany
- Markus Thamm
- Department of Behavioral Physiology and Sociobiology, University of Würzburg, Am Hubland, 97074 Würzburg, Germany
- Ricarda Scheiner
- Department of Behavioral Physiology and Sociobiology, University of Würzburg, Am Hubland, 97074 Würzburg, Germany
- Thomas Schmitt
- Department of Animal Ecology and Tropical Biology Biocenter, University of Würzburg, Am Hubland, 97074 Würzburg, Germany
- Oliver Niehuis
- Department of Evolutionary Biology and Ecology, Institute of Biology I (Zoology), University of Freiburg, 79104, Freiburg, Germany.
13
Parey E, Roest Crollius H, Berthelot C. SCORPiOs, a Novel Method to Reconstruct Gene Phylogenies in the Context of a Known WGD Event. Methods Mol Biol 2023; 2545:155-173. [PMID: 36720812 DOI: 10.1007/978-1-0716-2561-3_8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/02/2023]
Abstract
Phylogenetic gene trees recapitulate the evolutionary history of genes across species, forming an essential framework for comparative genomic studies. In particular, within the context of whole-genome duplications (WGDs), they serve as a basis to investigate patterns of duplicate gene retention and loss, timing of genome rediploidization, and, more generally, to explore the functional consequences of the duplication in descending species. Yet, despite ever more sophisticated models to describe the evolution of gene sequences, building accurate gene trees remains a challenge in ancient polyploid taxa. WGDs generate complex gene families with many duplicated copies and recurrent gene losses, which complicate this task even more. Here, we describe how to use SCORPiOs, a novel method that leverages synteny conservation to provide more accurate phylogenies in the presence of a known WGD event.
Affiliation(s)
- Elise Parey
- Institut de Biologie de l'Ecole Normale Supérieure (IBENS), Ecole Normale Supérieure, CNRS, INSERM, Université PSL, Paris, France
- INRAE, LPGP, Rennes, France
- Hugues Roest Crollius
- Institut de Biologie de l'Ecole Normale Supérieure (IBENS), Ecole Normale Supérieure, CNRS, INSERM, Université PSL, Paris, France
- Camille Berthelot
- Institut de Biologie de l'Ecole Normale Supérieure (IBENS), Ecole Normale Supérieure, CNRS, INSERM, Université PSL, Paris, France.
14
Kasianov AS, Klepikova AV, Mayorov AV, Buzanov GS, Logacheva MD, Penin AA. Interspecific comparison of gene expression profiles using machine learning. PLoS Comput Biol 2023; 19:e1010743. [PMID: 36626392 PMCID: PMC9879537 DOI: 10.1371/journal.pcbi.1010743] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2021] [Revised: 01/26/2023] [Accepted: 11/16/2022] [Indexed: 01/11/2023] Open
Abstract
Interspecific gene comparisons are the keystones for many areas of biological research and are especially important for the translation of knowledge from model organisms to economically important species. Currently they are hampered by the low resolution of methods based on sequence analysis and by the complex evolutionary history of eukaryotic genes. This is especially critical for plants, whose genomes are shaped by multiple whole genome duplications and subsequent gene loss. This requires the development of new methods for comparing the functions of genes in different species. Here, we report ISEEML (Interspecific Similarity of Expression Evaluated using Machine Learning), a novel machine learning-based algorithm for interspecific gene classification. In contrast to previous studies focused on sequence similarity, our algorithm focuses on functional similarity inferred from the comparison of gene expression profiles. We propose a novel metric for expression pattern similarity, the expression score (ES), that is suitable for species with differing morphologies. As a proof of concept, we compare detailed transcriptome maps of Arabidopsis thaliana, the model species, Zea mays (maize) and Fagopyrum esculentum (common buckwheat), which are species that represent distant clades within flowering plants. The classifier resulted in an AUC of 0.91; under the ES threshold of 0.5, the specificity was 94%, and sensitivity was 72%.
Affiliation(s)
- Artem S. Kasianov
- Institute for Information Transmission Problems of the Russian Academy of Sciences, Moscow, Russia
- Anna V. Klepikova
- Institute for Information Transmission Problems of the Russian Academy of Sciences, Moscow, Russia
- Alexey V. Mayorov
- Institute for Information Transmission Problems of the Russian Academy of Sciences, Moscow, Russia
- Maria D. Logacheva
- Institute for Information Transmission Problems of the Russian Academy of Sciences, Moscow, Russia
- Skolkovo Institute of Science and Technology, Moscow, Russia
- Aleksey A. Penin
- Institute for Information Transmission Problems of the Russian Academy of Sciences, Moscow, Russia
15
Sosa CC, Clavijo-Buriticá DC, García-Merchán VH, López-Rozo N, Riccio-Rengifo C, Diaz MV, Londoño DA, Quimbaya MA. GOCompare: An R package to compare functional enrichment analysis between two species. Genomics 2023; 115:110528. [PMID: 36462728 DOI: 10.1016/j.ygeno.2022.110528] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2022] [Revised: 11/11/2022] [Accepted: 11/29/2022] [Indexed: 12/03/2022]
Abstract
Functional enrichment analysis is a cornerstone of bioinformatics, as it makes it possible to identify functional information using a gene list as the source. Different tools are available to compare gene ontology (GO) terms, based on a directed acyclic graph structure or content-based algorithms, which are time-consuming and require a priori information on GO terms. Nevertheless, quantitative procedures to compare GO terms among gene lists and species are not available. Here we present a computational procedure, implemented in R, to infer functional information derived from comparative strategies. GOCompare provides a framework for functional comparative genomics starting from comparable lists of GO terms. The program uses functional enrichment analysis (FEA) results and implements graph theory to identify statistically relevant GO terms for both GO categories and analyzed species. Thus, GOCompare allows finding new functional information, complementing current FEA approaches and extending their use to a comparative perspective. To test our approach, GO terms were obtained for a list of aluminum tolerance-associated genes in Oryza sativa subsp. japonica and their orthologues in Arabidopsis thaliana. GOCompare was able to detect functional similarities for reactive oxygen species and ion binding capabilities, which are common in plants as molecular mechanisms to tolerate aluminum toxicity. Consequently, the R package performed well on complex datasets, making it possible to establish hypotheses that might explain a biological process from a functional perspective and narrowing down the possible landscapes for designing wet lab experiments.
Affiliation(s)
- Chrystian C Sosa
- Department of Natural Sciences and Mathematics, Pontificia Universidad Javeriana, Cali, Cali, Colombia; Evolution, Ecology and Conservation Research Group EECO, Biology Program, Faculty of Basic Sciences and Technologies, Universidad del Quindío, Armenia, Colombia
- Victor Hugo García-Merchán
- Evolution, Ecology and Conservation Research Group EECO, Biology Program, Faculty of Basic Sciences and Technologies, Universidad del Quindío, Armenia, Colombia
- Nicolas López-Rozo
- Department of Natural Sciences and Mathematics, Pontificia Universidad Javeriana, Cali, Cali, Colombia
- Camila Riccio-Rengifo
- Department of Natural Sciences and Mathematics, Pontificia Universidad Javeriana, Cali, Cali, Colombia
- David Arango Londoño
- Department of Natural Sciences and Mathematics, Pontificia Universidad Javeriana, Cali, Cali, Colombia
- Mauricio Alberto Quimbaya
- Department of Natural Sciences and Mathematics, Pontificia Universidad Javeriana, Cali, Cali, Colombia.
16
Suetsugu K, Fukushima K, Makino T, Ikematsu S, Sakamoto T, Kimura S. Transcriptomic heterochrony and completely cleistogamous flower development in the mycoheterotrophic orchid Gastrodia. New Phytol 2023; 237:323-338. [PMID: 36110047 DOI: 10.1111/nph.18495] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/02/2022] [Accepted: 09/09/2022] [Indexed: 06/15/2023]
Abstract
Cleistogamy, in which plants can reproduce via self-fertilization within permanently closed flowers, has evolved in > 30 angiosperm lineages; however, consistent with Darwin's doubts about its existence, complete cleistogamy - the production of only cleistogamous flowers - has rarely been recognized. Thus far, the achlorophyllous orchid genus, Gastrodia, is the only known genus with several plausible completely cleistogamous species. Here, we analyzed the floral developmental transcriptomes of two recently evolved, completely cleistogamous Gastrodia species and their chasmogamous sister species to elucidate the possible changes involved in producing common cleistogamous traits. The ABBA-BABA test did not support introgression and protein sequence convergence as evolutionary mechanisms leading to cleistogamy, leaving convergence in gene expression as a plausible mechanism. Regarding transcriptomic differentiation, the two cleistogamous species had common modifications in the expression of developmental regulators, exhibiting a gene family-wide signature of convergent expression changes in MADS-box genes. Our transcriptomic pseudotime analysis revealed a prolonged juvenile state and eventual maturation, a heterochronic pattern consistent with partial neoteny, in cleistogamous flower development. These findings indicate that transcriptomic partial neoteny, arising from changes in the expression of conserved developmental regulators, might have contributed to the rapid and repeated evolution of cleistogamous flowers in Gastrodia.
Affiliation(s)
- Kenji Suetsugu
- Department of Biology, Graduate School of Science, Kobe University, 1-1 Rokkodai, Nada-ku, Kobe, 657-8501, Japan
- Kenji Fukushima
- Institute for Molecular Plant Physiology and Biophysics, University of Würzburg, Julius-von-Sachs Platz 2, 97082, Würzburg, Germany
- Takashi Makino
- Graduate School of Life Sciences, Tohoku University, 6-3, Aramaki Aza Aoba, Aoba-ku, Sendai, 980-8578, Japan
- Shuka Ikematsu
- Faculty of Life Sciences, Kyoto Sangyo University, Kamigamo-motoyama, Kita-ku, Kyoto, 603-8555, Japan
- Center for Plant Sciences, Kyoto Sangyo University, Kamigamo-motoyama, Kita-ku, Kyoto, 603-8555, Japan
- Tomoaki Sakamoto
- Faculty of Life Sciences, Kyoto Sangyo University, Kamigamo-motoyama, Kita-ku, Kyoto, 603-8555, Japan
- Center for Plant Sciences, Kyoto Sangyo University, Kamigamo-motoyama, Kita-ku, Kyoto, 603-8555, Japan
- Seisuke Kimura
- Faculty of Life Sciences, Kyoto Sangyo University, Kamigamo-motoyama, Kita-ku, Kyoto, 603-8555, Japan
- Center for Plant Sciences, Kyoto Sangyo University, Kamigamo-motoyama, Kita-ku, Kyoto, 603-8555, Japan
17
Cenci A, Concepción-Hernández M, Guignon V, Angenon G, Rouard M. Genome-Wide Classification and Phylogenetic Analyses of the GDSL-Type Esterase/Lipase (GELP) Family in Flowering Plants. Int J Mol Sci 2022; 23:12114. [PMID: 36292971 PMCID: PMC9602515 DOI: 10.3390/ijms232012114] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2022] [Revised: 10/05/2022] [Accepted: 10/07/2022] [Indexed: 11/16/2022] Open
Abstract
GDSL-type esterase/lipase (GELP) enzymes have key functions in plants, such as developmental processes, anther and pollen development, and responses to biotic and abiotic stresses. Genes that encode GELP belong to a complex and large gene family, ranging from tens to more than hundreds of members per plant species. To facilitate functional transfer between them, we conducted a genome-wide classification of GELP in 46 plant species. First, we applied an iterative phylogenetic method using a selected set of representative angiosperm genomes (three monocots and five dicots) and identified 10 main clusters, subdivided into 44 orthogroups (OGs). An expert curation for gene structures, orthogroup composition, and functional annotation was made based on a literature review. Then, using the HMM profiles as seeds, we expanded the classification to 46 plant species. Our results revealed the variable evolutionary dynamics between OGs in which some expanded, mostly through tandem duplications, while others were maintained as single copies. Among these, dicot-specific clusters and specific amplifications in monocots and wheat were characterized. This approach, by combining manual curation and automatic identification, was effective in characterizing a large gene family, allowing the establishment of a classification framework for gene function transfer and a better understanding of the evolutionary history of GELP.
Affiliation(s)
- Alberto Cenci
- Bioversity International, Parc Scientifique Agropolis II, 34397 Montpellier, France
- Correspondence: (A.C.); (M.R.)
- Mairenys Concepción-Hernández
- Instituto de Biotecnología de las Plantas, Universidad Central “Marta Abreu” de Las Villas (UCLV), Carretera a Camajuaní km 5.5, Santa Clara C.P. 54830, Villa Clara, Cuba
- Research Group Plant Genetics, Vrije Universiteit Brussel (VUB), Pleinlaan 2, 1050 Brussels, Belgium
- Valentin Guignon
- Bioversity International, Parc Scientifique Agropolis II, 34397 Montpellier, France
- Geert Angenon
- Research Group Plant Genetics, Vrije Universiteit Brussel (VUB), Pleinlaan 2, 1050 Brussels, Belgium
- Mathieu Rouard
- Bioversity International, Parc Scientifique Agropolis II, 34397 Montpellier, France
- Correspondence: (A.C.); (M.R.)
18
Escorcia-Rodríguez JM, Esposito M, Freyre-González JA, Moreno-Hagelsieb G. Non-synonymous to synonymous substitutions suggest that orthologs tend to keep their functions, while paralogs are a source of functional novelty. PeerJ 2022; 10:e13843. [PMID: 36065404 PMCID: PMC9440661 DOI: 10.7717/peerj.13843] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/25/2020] [Accepted: 07/14/2022] [Indexed: 01/18/2023] Open
Abstract
Orthologs separate after lineages split from each other and paralogs after gene duplications. Thus, orthologs are expected to remain more functionally coherent across lineages, while paralogs have been proposed as a source of new functions. Because protein functional divergence follows from non-synonymous substitutions, we performed an analysis based on the ratio of non-synonymous to synonymous substitutions (dN/dS), as a proxy for functional divergence. We used five working definitions of orthology, including reciprocal best hits (RBH), among other definitions based on network analyses and clustering. The results showed that orthologs, by all definitions tested, had values of dN/dS noticeably lower than those of paralogs, suggesting that orthologs generally tend to be more functionally stable than paralogs. The differences in dN/dS ratios, which suggest the functional stability of orthologs, remained after eliminating gene comparisons with potential problems, such as genes with high codon usage biases, low coverage of either of the aligned sequences, or sequences with very high similarities. Separation by percent identity of the encoded proteins showed that the differences between the dN/dS ratios of orthologs and paralogs were more evident at high sequence identity, less so as identity dropped. These last results suggest that the differences between dN/dS ratios were partially related to differences in protein identity. However, they also suggested that paralogs undergo functional divergence relatively early after duplication. Our analyses indicate that choosing orthologs as probably functionally coherent remains the right approach in comparative genomics.
Affiliation(s)
- Juan M. Escorcia-Rodríguez
- Regulatory Systems Biology Research Group, Program of Systems Biology, Center for Genomic Sciences, Universidad Nacional Autónoma de México, Cuernavaca, Morelos, México
- Mario Esposito
- Department of Biology, Wilfrid Laurier University, Waterloo, Canada
- Julio A. Freyre-González
- Regulatory Systems Biology Research Group, Program of Systems Biology, Center for Genomic Sciences, Universidad Nacional Autónoma de México, Cuernavaca, Morelos, México
19
Kille B, Balaji A, Sedlazeck FJ, Nute M, Treangen TJ. Multiple genome alignment in the telomere-to-telomere assembly era. Genome Biol 2022; 23:182. [PMID: 36038949 PMCID: PMC9421119 DOI: 10.1186/s13059-022-02735-6] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 10/20/2021] [Accepted: 07/21/2022] [Indexed: 01/22/2023] Open
Abstract
With the arrival of telomere-to-telomere (T2T) assemblies of the human genome comes the computational challenge of efficiently and accurately constructing multiple genome alignments at an unprecedented scale. By identifying nucleotides across genomes which share a common ancestor, multiple genome alignments commonly serve as the bedrock for comparative genomics studies. In this review, we provide an overview of the algorithmic template that most multiple genome alignment methods follow. We also discuss prospective areas of improvement of multiple genome alignment for keeping up with continuously arriving high-quality T2T assembled genomes and for unlocking clinically-relevant insights.
Affiliation(s)
- Bryce Kille
- Department of Computer Science, Rice University, Houston, TX, USA
- Advait Balaji
- Department of Computer Science, Rice University, Houston, TX, USA
- Fritz J Sedlazeck
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Michael Nute
- Department of Computer Science, Rice University, Houston, TX, USA
- Todd J Treangen
- Department of Computer Science, Rice University, Houston, TX, USA.
20
Persson E, Sonnhammer ELL. InParanoid-DIAMOND: faster orthology analysis with the InParanoid algorithm. Bioinformatics 2022; 38:2918-2919. [PMID: 35561192 PMCID: PMC9113356 DOI: 10.1093/bioinformatics/btac194] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 06/17/2021] [Revised: 03/14/2022] [Accepted: 03/29/2022] [Indexed: 02/03/2023] Open
Abstract
Summary: Predicting orthologs, genes in different species having shared ancestry, is an important task in bioinformatics. Orthology prediction tools are required to make accurate and fast predictions, in order to analyze large amounts of data within a feasible time frame. InParanoid is a well-known algorithm for orthology analysis, shown to perform well in benchmarks, but having the major limitation of long runtimes on large datasets. Here, we present an update to the InParanoid algorithm that can use the faster tool DIAMOND instead of BLAST for the homolog search step. We show that it reduces the runtime by 94%, while still obtaining similar performance in the Quest for Orthologs benchmark. Availability and implementation: The source code is available at https://bitbucket.org/sonnhammergroup/inparanoid. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Emma Persson
- Department of Biochemistry and Biophysics, Stockholm University, Science for Life Laboratory, 17121 Solna, Sweden
21
Zmasek CM, Lefkowitz EJ, Niewiadomska A, Scheuermann RH. Genomic evolution of the Coronaviridae family. Virology 2022; 570:123-133. [PMID: 35398776 PMCID: PMC8965632 DOI: 10.1016/j.virol.2022.03.005] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 11/12/2021] [Revised: 03/11/2022] [Accepted: 03/18/2022] [Indexed: 01/03/2023]
Abstract
The current outbreak of coronavirus disease-2019 (COVID-19) caused by SARS-CoV-2 poses unparalleled challenges to global public health. SARS-CoV-2 is a Betacoronavirus, one of four genera belonging to the Coronaviridae subfamily Orthocoronavirinae. Coronaviridae, in turn, are members of the order Nidovirales, a group of enveloped, positive-stranded RNA viruses. Here we present a systematic phylogenetic and evolutionary study based on protein domain architecture, encompassing the entire proteomes of all Orthocoronavirinae, as well as other Nidovirales. This analysis has revealed that the genomic evolution of Nidovirales is associated with extensive gains and losses of protein domains. In Orthocoronavirinae, the sections of the genomes that show the largest divergence in protein domains are found in the proteins encoded in the amino-terminal end of the polyprotein (PP1ab), the spike protein (S), and many of the accessory proteins. The diversity among the accessory proteins is particularly striking, as each subgenus possesses a set of accessory proteins that is almost entirely specific to that subgenus. The only notable exception to this is ORF3b, which is present and orthologous over all Alphacoronaviruses. In contrast, the membrane protein (M), envelope small membrane protein (E), nucleoprotein (N), as well as proteins encoded in the central and carboxy-terminal end of PP1ab (such as the 3C-like protease, RNA-dependent RNA polymerase, and Helicase) show stable domain architectures across all Orthocoronavirinae. This comprehensive analysis of the Coronaviridae domain architecture has important implications for efforts to develop broadly cross-protective coronavirus vaccines.
Affiliation(s)
- Christian M Zmasek
- Department of Informatics, J. Craig Venter Institute, La Jolla, CA, 92037, USA
- Elliot J Lefkowitz
- Department of Microbiology, UAB School of Medicine, Birmingham, AL, 35294, USA
- Anna Niewiadomska
- Department of Informatics, J. Craig Venter Institute, La Jolla, CA, 92037, USA
- Richard H Scheuermann
- Department of Informatics, J. Craig Venter Institute, La Jolla, CA, 92037, USA; Department of Pathology, University of California, San Diego, CA, 92093, USA; Division of Vaccine Discovery, La Jolla Institute for Immunology, La Jolla, CA, 92037, USA; Global Virus Network, Baltimore MD, 21201, USA.
22
Nicheperovich A, Altenhoff AM, Dessimoz C, Majidian S. OMAMO: orthology-based alternative model organism selection. Bioinformatics 2022; 38:2965-2966. [PMID: 35561194 PMCID: PMC9113245 DOI: 10.1093/bioinformatics/btac163] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2021] [Revised: 02/18/2022] [Accepted: 03/17/2022] [Indexed: 11/20/2022] Open
Abstract
Summary: The conservation of pathways and genes across species has allowed scientists to use non-human model organisms to gain a deeper understanding of human biology. However, the use of traditional model systems such as mice, rats and zebrafish is costly, time-consuming and increasingly raises ethical concerns, which highlights the need to search for less complex model organisms. Existing tools only focus on the few well-studied model systems, most of which are complex animals. To address these issues, we have developed Orthologous Matrix and Alternative Model Organism (OMAMO), a software and a web service that provides the user with the best non-complex organism for research into a biological process of interest based on orthologous relationships between human and the species. The outputs provided by OMAMO were supported by a systematic literature review. Availability and implementation: https://omabrowser.org/omamo/, https://github.com/DessimozLab/omamo. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Alina Nicheperovich
- Department of Structural and Molecular Biology, University College London, London WC1E, UK
- Adrian M Altenhoff
- Department of Computer Science, ETH, 8092 Zurich, Switzerland; SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Christophe Dessimoz
- Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland; SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland; Department of Computer Science, University College London, London WC1E 6BT, UK; Department of Genetics, Evolution and Environment, University College London, London WC1E, UK
- Sina Majidian
- Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland; SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
23
Raghavan V, Kraft L, Mesny F, Rigerte L. A simple guide to de novo transcriptome assembly and annotation. Brief Bioinform 2022; 23:6514404. [PMID: 35076693 PMCID: PMC8921630 DOI: 10.1093/bib/bbab563] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 09/09/2021] [Revised: 12/03/2021] [Accepted: 12/09/2021] [Indexed: 12/13/2022] Open
Abstract
A transcriptome constructed from short-read RNA sequencing (RNA-seq) is an easily attainable proxy catalog of protein-coding genes when genome assembly is unnecessary, expensive or difficult. In the absence of a sequenced genome to guide the reconstruction process, the transcriptome must be assembled de novo using only the information available in the RNA-seq reads. Subsequently, the sequences must be annotated in order to identify sequence-intrinsic and evolutionary features in them (for example, protein-coding regions). Although straightforward at first glance, de novo transcriptome assembly and annotation can quickly prove to be challenging undertakings. In addition to familiarizing themselves with the conceptual and technical intricacies of the tasks at hand and the numerous pre- and post-processing steps involved, those interested must also grapple with an overwhelmingly large choice of tools. The lack of standardized workflows, fast pace of development of new tools and techniques and paucity of authoritative literature have served to exacerbate the difficulty of the task even further. Here, we present a comprehensive overview of de novo transcriptome assembly and annotation. We discuss the procedures involved, including pre- and post-processing steps, and present a compendium of corresponding tools.
Affiliation(s)
- Venket Raghavan
- Quantitative and Computational Biology, Max Planck Institute for Biophysical Chemistry, 37077 Göttingen, Germany (corresponding author)
- Louis Kraft
- Quantitative and Computational Biology, Max Planck Institute for Biophysical Chemistry, 37077 Göttingen, Germany (corresponding author)
24
The Landscape of Autophagy-Related (ATG) Genes and Functional Characterization of TaVAMP727 to Autophagy in Wheat. Int J Mol Sci 2022; 23:ijms23020891. [PMID: 35055085 PMCID: PMC8776105 DOI: 10.3390/ijms23020891] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2021] [Revised: 01/07/2022] [Accepted: 01/11/2022] [Indexed: 11/17/2022] Open
Abstract
Autophagy is an indispensable biological process and plays crucial roles in plant growth and plant responses to both biotic and abiotic stresses. This study systematically identified autophagy-related proteins (ATGs) in wheat and its diploid and tetraploid progenitors and investigated their genomic organization, structural characteristics, expression patterns, genetic variation, and regulatory network. We identified a total of 77, 51, 29, and 30 ATGs in wheat, wild emmer, T. urartu and A. tauschii, respectively, and grouped them into 19 subfamilies. We found that these autophagy-related genes (ATGs) underwent various degrees of selection during wheat’s domestication and breeding processes. The genetic variations in the promoter region of Ta2A_ATG8a were associated with differences in seed size, which might have been artificially selected for during the domestication process of tetraploid wheat. Overexpression of TaVAMP727 improved the cold, drought, and salt stress resistance of transgenic Arabidopsis and wheat. It also promoted wheat heading by regulating the expression of most ATGs. Our findings demonstrate how ATGs regulate wheat plant development and improve abiotic stress resistance. The results presented here provide the basis for wheat breeding programs for selecting higher-yielding varieties that are capable of growing in colder, drier, and saltier areas.
25
Vazquez JM, Pena MT, Muhammad B, Kraft M, Adams LB, Lynch VJ. Parallel evolution of reduced cancer risk and tumor suppressor duplications in Xenarthra. eLife 2022; 11:82558. [PMID: 36480266 PMCID: PMC9810328 DOI: 10.7554/elife.82558] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2022] [Accepted: 12/07/2022] [Indexed: 12/14/2022] Open
Abstract
The risk of developing cancer is correlated with body size and lifespan within species, but there is no correlation between cancer and either body size or lifespan between species, indicating that large, long-lived species have evolved enhanced cancer protection mechanisms. Previously we showed that several large-bodied Afrotherian lineages evolved reduced intrinsic cancer risk, particularly elephants and their extinct relatives (Proboscideans), coincident with pervasive duplication of tumor suppressor genes (Vazquez and Lynch, 2021). Unexpectedly, we also found that Xenarthrans (sloths, armadillos, and anteaters) evolved very low intrinsic cancer risk. Here, we show that: (1) several Xenarthran lineages independently evolved large bodies, long lifespans, and reduced intrinsic cancer risk; (2) the reduced cancer risk in the stem lineages of Xenarthra and Pilosa coincided with bursts of tumor suppressor gene duplications; (3) cells from sloths proliferate extremely slowly while Xenarthran cells induce apoptosis at very low doses of DNA damaging agents; and (4) the prevalence of cancer is extremely low in Xenarthrans, and cancer is nearly absent from armadillos. These data implicate the duplication of tumor suppressor genes in the evolution of remarkably large body sizes and decreased cancer risk in Xenarthrans and suggest they are a remarkably cancer-resistant group of mammals.
Affiliation(s)
- Juan Manuel Vazquez
- Department of Integrative Biology, Valley Life Sciences, University of California, Berkeley, Berkeley, United States
- Maria T Pena
- United States Department of Health and Human Services, Health Resources and Services Administration, Health Systems Bureau, National Hansen's Disease Program, Baton Rouge, United States
- Baaqeyah Muhammad
- Department of Biological Sciences, University at Buffalo, SUNY, Buffalo, United States
- Morgan Kraft
- Department of Biological Sciences, University at Buffalo, SUNY, Buffalo, United States
- Linda B Adams
- United States Department of Health and Human Services, Health Resources and Services Administration, Health Systems Bureau, National Hansen's Disease Program, Baton Rouge, United States
- Vincent J Lynch
- Department of Biological Sciences, University at Buffalo, SUNY, Buffalo, United States
26
Analyses of Lysin-motif Receptor-like Kinase (LysM-RLK) Gene Family in Allotetraploid Brassica napus L. and Its Progenitor Species: An In Silico Study. Cells 2021; 11:cells11010037. [PMID: 35011598 PMCID: PMC8750388 DOI: 10.3390/cells11010037] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/02/2021] [Revised: 12/10/2021] [Accepted: 12/20/2021] [Indexed: 12/11/2022] Open
Abstract
The LysM receptor-like kinases (LysM-RLKs) play a crucial role in plant symbiosis and response to environmental stresses. Brassica napus, B. rapa, and B. oleracea are utilized as valuable vegetables. Different biotic and abiotic stressors affect these crops, resulting in yield losses. Therefore, a genome-wide analysis of the LysM-RLK gene family was conducted. In the genomes of the examined species, 33 LysM-RLKs were found. The conserved domains of Brassica LysM-RLKs were divided into three groups: LYK, LYP, and LysMn. In the Brassica LysM-RLK gene family, only segmental duplication has occurred. The Ka/Ks ratio for the duplicated pair of genes was less than one, indicating that the genes’ function had not changed over time. The Brassica LysM-RLKs contain 70 cis-elements, indicating that they are involved in stress response. Thirty-nine miRNA molecules were responsible for the post-transcriptional regulation of 12 Brassica LysM-RLKs. A total of 22 SSR loci were discovered in 16 Brassica LysM-RLKs. According to RNA-seq data, the highest expression in response to biotic stresses was related to BnLYP6. According to the docking simulations, several residues in the active sites of BnLYP6 are in direct contact with the docked chitin and could be useful in future studies to develop pathogen-resistant B. napus. This research reveals comprehensive information that could lead to the identification of potential genes for Brassica species genetic manipulation.
27
Opazo JC, Hoffmann FG, Zavala K, Edwards SV. Evolution of the DAN gene family in vertebrates. Dev Biol 2021; 482:34-43. [PMID: 34902310 DOI: 10.1016/j.ydbio.2021.12.002] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/25/2021] [Revised: 12/07/2021] [Accepted: 12/08/2021] [Indexed: 11/26/2022]
Abstract
The DAN gene family (DAN, Differential screening-selected gene Aberrant in Neuroblastoma) is a group of genes that is expressed during development and plays fundamental roles in limb bud formation and digitation, kidney formation and morphogenesis and left-right axis specification. During adulthood the expression of these genes is associated with diseases, including cancer. Although most of the attention to this group of genes has been dedicated to understanding its role in physiology and development, its evolutionary history remains poorly understood. Thus, the goal of this study is to investigate the evolutionary history of the DAN gene family in vertebrates, with the objective of complementing the already abundant physiological information with an evolutionary context. Our results recovered the monophyly of all DAN gene family members and divided them into five main groups. In addition to the well-known DAN genes, our phylogenetic results revealed the presence of two new DAN gene lineages; one is only retained in cephalochordates, whereas the other one (GREM3) was only identified in cartilaginous fish, holostean fish, and coelacanth. According to the phyletic distribution of the genes, the ancestor of gnathostomes possessed a repertoire of eight DAN genes, and during the radiation of the group GREM1, GREM2, SOST, SOSTDC1, and NBL1 were retained in all major groups, whereas GREM3, CER1, and DAND5 were differentially lost.
Affiliation(s)
- Juan C Opazo
- Integrative Biology Group, Universidad Austral de Chile, Valdivia, Chile; Instituto de Ciencias Ambientales y Evolutivas, Facultad de Ciencias, Universidad Austral de Chile, Valdivia, Chile; David Rockefeller Center for Latin American Studies, Harvard University, Cambridge, MA, 02138, USA; Millennium Nucleus of Ion Channels-Associated Diseases (MiNICAD), Chile.
- Federico G Hoffmann
- Department of Biochemistry, Molecular Biology, Entomology, and Plant Pathology, Mississippi State University, Mississippi State, 39762, USA; Institute for Genomics, Biocomputing, and Biotechnology, Mississippi State University, Mississippi State, 39762, USA
- Kattina Zavala
- Instituto de Ciencias Ambientales y Evolutivas, Facultad de Ciencias, Universidad Austral de Chile, Valdivia, Chile
- Scott V Edwards
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, 02138, USA
28
Sánchez AL, Lafond M. Colorful orthology clustering in bounded-degree similarity graphs. J Bioinform Comput Biol 2021; 19:2140010. [PMID: 34775924 DOI: 10.1142/s0219720021400102] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/18/2022]
Abstract
Clustering genes in similarity graphs is a popular approach for orthology prediction. Most algorithms group genes without considering their species, which results in clusters that contain several paralogous genes. Moreover, clustering is known to be problematic when in-paralogs arise from ancient duplications. Recently, we proposed a two-step process that avoids these problems. First, we infer clusters of only orthologs (i.e. with only genes from distinct species), and second, we infer the missing inter-cluster orthologs. In this paper, we focus on the first step, which leads to a problem we call Colorful Clustering. In general, this is as hard as classical clustering. However, in similarity graphs, the number of species is usually small, as well as the neighborhood size of genes in other species. We therefore study the problem of clustering in which the number of colors is bounded by [Formula: see text], and each gene has at most [Formula: see text] neighbors in another species. We show that the well-known cluster editing formulation remains NP-hard even when [Formula: see text] and [Formula: see text]. We then propose a fixed-parameter algorithm in [Formula: see text] to find the single best cluster in the graph. We implemented this algorithm and included it in the aforementioned two-step approach. Experiments on simulated data show that this approach performs favorably to applying only an unconstrained clustering step.
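The constraint behind the first step described above, that a cluster of orthologs may contain at most one gene per species, can be illustrated with a small check. This is only a sketch under stated assumptions: the function name `is_colorful`, the gene identifiers, and the `species_of` mapping are hypothetical and are not taken from the paper or its software, and the sketch does not implement the cluster editing or fixed-parameter algorithm itself.

```python
from collections import Counter


def is_colorful(cluster, species_of):
    """Return True if no two genes in `cluster` share a species ("color")."""
    # Count how many genes of the cluster belong to each species.
    counts = Counter(species_of[gene] for gene in cluster)
    # A colorful cluster has exactly one gene for every species it touches.
    return all(n == 1 for n in counts.values())


# Hypothetical toy data: gene identifier -> species.
species_of = {"g1": "human", "g2": "mouse", "g3": "zebrafish", "g4": "mouse"}

# One gene per species: a valid orthology cluster under this constraint.
print(is_colorful({"g1", "g2", "g3"}, species_of))
# g2 and g4 are both mouse genes (paralogs), so the constraint is violated.
print(is_colorful({"g1", "g2", "g4"}, species_of))
```

A check like this only validates a candidate cluster; finding the best colorful clusters in a similarity graph is the hard problem the abstract refers to.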
Affiliation(s)
- Alitzel López Sánchez
- Computer Science Department, Université de Sherbrooke, 2500 Boulevard de l'Université, Sherbrooke, Québec J1K 2R1, Canada
- Manuel Lafond
- Computer Science Department, Université de Sherbrooke, 2500 Boulevard de l'Université, Sherbrooke, Québec J1K 2R1, Canada
29
Birikmen M, Bohnsack KE, Tran V, Somayaji S, Bohnsack MT, Ebersberger I. Tracing Eukaryotic Ribosome Biogenesis Factors Into the Archaeal Domain Sheds Light on the Evolution of Functional Complexity. Front Microbiol 2021; 12:739000. [PMID: 34603269 PMCID: PMC8481954 DOI: 10.3389/fmicb.2021.739000] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 07/09/2021] [Accepted: 08/17/2021] [Indexed: 01/23/2023] Open
Abstract
Ribosome assembly is an essential and carefully choreographed cellular process. In eukaryotes, several hundred proteins, distributed across the nucleolus, nucleus, and cytoplasm, co-ordinate the step-wise assembly of four ribosomal RNAs (rRNAs) and approximately 80 ribosomal proteins (RPs) into the mature ribosomal subunits. Due to the inherent complexity of the assembly process, functional studies identifying ribosome biogenesis factors and, more importantly, their precise functions and interplay are confined to a few very well-established model organisms. Although best characterized in yeast (Saccharomyces cerevisiae), emerging links to disease and the discovery of additional layers of regulation have recently encouraged deeper analysis of the pathway in human cells. In archaea, ribosome biogenesis is less well-understood. However, their simpler sub-cellular structure should allow a less elaborate assembly procedure, potentially providing insights into the functional essentials of ribosome biogenesis that evolved long before the diversification of archaea and eukaryotes. Here, we use a comprehensive phylogenetic profiling setup, integrating targeted ortholog searches with automated scoring of protein domain architecture similarities and an assessment of when search sensitivity becomes limiting, to trace 301 curated eukaryotic ribosome biogenesis factors across 982 taxa spanning the tree of life and including 727 archaea. We show that both factor loss and lineage-specific modifications of factor function modulate ribosome biogenesis, and we highlight that limited sensitivity of the ortholog search can confound evolutionary conclusions. Projecting into the archaeal domain, we find that only a few factors are consistently present across the analyzed taxa, and lineage-specific loss is common. While members of the Asgard group are not special with respect to their inventory of ribosome biogenesis factors (RBFs), they unite the highest number of orthologs to eukaryotic RBFs in one taxon. Using large ribosomal subunit maturation as an example, we demonstrate that archaea pursue a simplified version of the corresponding steps in eukaryotes. Much of the complexity of this process evolved on the eukaryotic lineage by the duplication of ribosomal proteins and their subsequent functional diversification into ribosome biogenesis factors. This highlights that studying ribosome biogenesis in archaea provides fundamental information also for understanding the process in eukaryotes.
Affiliation(s)
- Mehmet Birikmen
- Applied Bioinformatics Group, Institute of Cell Biology and Neuroscience, Goethe University Frankfurt, Frankfurt, Germany
- Katherine E Bohnsack
- Department of Molecular Biology, University Medical Center Göttingen, Göttingen, Germany
- Vinh Tran
- Applied Bioinformatics Group, Institute of Cell Biology and Neuroscience, Goethe University Frankfurt, Frankfurt, Germany
- Sharvari Somayaji
- Applied Bioinformatics Group, Institute of Cell Biology and Neuroscience, Goethe University Frankfurt, Frankfurt, Germany
- Markus T Bohnsack
- Department of Molecular Biology, University Medical Center Göttingen, Göttingen, Germany; Göttingen Center for Molecular Biosciences, Georg-August University, Göttingen, Germany
- Ingo Ebersberger
- Applied Bioinformatics Group, Institute of Cell Biology and Neuroscience, Goethe University Frankfurt, Frankfurt, Germany; Senckenberg Biodiversity and Climate Research Center (S-BIK-F), Frankfurt, Germany; LOEWE Center for Translational Biodiversity Genomics (LOEWE-TBG), Frankfurt, Germany
30
Matsubara S, Osugi T, Shiraishi A, Wada A, Satake H. Comparative analysis of transcriptomic profiles among ascidians, zebrafish, and mice: Insights from tissue-specific gene expression. PLoS One 2021; 16:e0254308. [PMID: 34559810 PMCID: PMC8462739 DOI: 10.1371/journal.pone.0254308] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2021] [Accepted: 09/12/2021] [Indexed: 11/18/2022] Open
Abstract
Tissue/organ-specific genes (TSGs) are important not only for understanding organ development and function, but also for investigating the evolutionary lineages of organs in animals. Here, we investigate the TSGs of 9 adult tissues of an ascidian, Ciona intestinalis Type A (Ciona robusta), which lies in the important position of being the sister group of vertebrates. RNA-seq and qRT-PCR identified the Ciona TSGs in each tissue, and BLAST searches identified their homologs in zebrafish and mice. Tissue distributions of the vertebrate homologs were analyzed and clustered using public RNA-seq data for 12 zebrafish and 30 mouse tissues. Among the vertebrate homologs of the Ciona TSGs in the neural complex, 48% and 63% showed high expression in the zebrafish and mouse brain, respectively, suggesting that the central nervous system is evolutionarily conserved in chordates. In contrast, vertebrate homologs of Ciona TSGs in the ovary, pharynx, and intestine were not consistently highly expressed in the corresponding tissues of vertebrates, suggesting that these organs have evolved in Ciona-specific lineages. Intriguingly, more TSG homologs of the Ciona stomach were highly expressed in the vertebrate liver (17-29%) and intestine (22-33%) than in the mouse stomach (5%). Expression profiles for these genes suggest that the biological roles of the Ciona stomach are distinct from those of their vertebrate counterparts. Collectively, Ciona tissues were categorized into 3 groups: i) high similarity to the corresponding vertebrate tissues (neural complex and heart), ii) low similarity to the corresponding vertebrate tissues (ovary, pharynx, and intestine), and iii) low similarity to the corresponding vertebrate tissues, but high similarity to other vertebrate tissues (stomach, endostyle, and siphons). The present study provides transcriptomic catalogs of adult ascidian tissues and significant insights into the evolutionary lineages of the brain, heart, and digestive tract of chordates.
Affiliation(s)
- Shin Matsubara
- Bioorganic Research Institute, Suntory Foundation for Life Sciences, Kyoto, Japan
- Tomohiro Osugi
- Bioorganic Research Institute, Suntory Foundation for Life Sciences, Kyoto, Japan
- Akira Shiraishi
- Bioorganic Research Institute, Suntory Foundation for Life Sciences, Kyoto, Japan
- Azumi Wada
- Bioorganic Research Institute, Suntory Foundation for Life Sciences, Kyoto, Japan
- Honoo Satake
- Bioorganic Research Institute, Suntory Foundation for Life Sciences, Kyoto, Japan
31
Abstract
Recent human activity has profoundly transformed Earth biomes on a scale and at rates that are unprecedented. Given the central role of symbioses in ecosystem processes, functions, and services throughout the Earth biosphere, the impacts of human-driven change on symbioses are critical to understand. Symbioses are not merely collections of organisms, but co-evolved partners that arise from the synergistic combination and action of different genetic programs. They function with varying degrees of permanence and selection as emergent units with substantial potential for combinatorial and evolutionary innovation in both structure and function. Following an articulation of operational definitions of symbiosis and related concepts and characteristics of the Anthropocene, we outline a basic typology of anthropogenic change (AC) and a conceptual framework for how AC might mechanistically impact symbioses with select case examples to highlight our perspective. We discuss surprising connections between symbiosis and the Anthropocene, suggesting ways in which new symbioses could arise due to AC, how symbioses could be agents of ecosystem change, and how symbioses, broadly defined, of humans and “farmed” organisms may have launched the Anthropocene. We conclude with reflections on the robustness of symbioses to AC and our perspective on the importance of symbioses as ecosystem keystones and the need to tackle anthropogenic challenges as wise and humble stewards embedded within the system.
Affiliation(s)
- Erik F Y Hom
- Department of Biology and Center for Biodiversity and Conservation Research, University of Mississippi, University, MS 38677 USA
- Alexandra S Penn
- Department of Sociology and Centre for Evaluation of Complexity Across the Nexus, University of Surrey, Guildford, Surrey, GU2 7XH UK
32
Moiseenko KV, Glazunova OA, Savinova OS, Vasina DV, Zherebker AY, Kulikova NA, Nikolaev EN, Fedorova TV. Relation between lignin molecular profile and fungal exo-proteome during kraft lignin modification by Trametes hirsuta LE-BIN 072. Bioresour Technol 2021; 335:125229. [PMID: 34010738 DOI: 10.1016/j.biortech.2021.125229] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 03/18/2021] [Revised: 04/23/2021] [Accepted: 04/24/2021] [Indexed: 05/11/2023]
Abstract
The process of kraft lignin modification by the white-rot fungus Trametes hirsuta was investigated using electrospray ionization Fourier transform ion cyclotron resonance mass spectrometry (ESI FT-ICR MS), and groups of systematically changing compounds were delineated. In the course of cultivation, the fungus tended to degrade progressively more reduced compounds and produced more oxidized ones. However, this process was not gradual: a substantial discontinuity was observed between the 6th and 10th days of cultivation. Simultaneously, the secretion of ligninolytic peroxidases by the fungus changed in a cascade manner: new isoenzymes were added to the mixture of the already secreted ones, and once a new isoenzyme appeared, both its relative quantity and its number of isoforms increased as cultivation proceeded. It was proposed that the later secreted peroxidases (MnP7 and MnP1) possess higher substrate affinity for some phenolic compounds and act in a more specialized manner than the early secreted ones (MnP5 and VP2).
Affiliation(s)
- Konstantin V Moiseenko
- A. N. Bach Institute of Biochemistry, Research Center of Biotechnology, Russian Academy of Sciences, Leninsky Ave. 33/2, Moscow 119071, Russia.
- Olga A Glazunova
- A. N. Bach Institute of Biochemistry, Research Center of Biotechnology, Russian Academy of Sciences, Leninsky Ave. 33/2, Moscow 119071, Russia
- Olga S Savinova
- A. N. Bach Institute of Biochemistry, Research Center of Biotechnology, Russian Academy of Sciences, Leninsky Ave. 33/2, Moscow 119071, Russia
- Daria V Vasina
- A. N. Bach Institute of Biochemistry, Research Center of Biotechnology, Russian Academy of Sciences, Leninsky Ave. 33/2, Moscow 119071, Russia
- Natalia A Kulikova
- A. N. Bach Institute of Biochemistry, Research Center of Biotechnology, Russian Academy of Sciences, Leninsky Ave. 33/2, Moscow 119071, Russia; Department of Soil Science, Lomonosov Moscow State University, Moscow 119991, Russia
- Evgeny N Nikolaev
- Skolkovo Institute of Science and Technology, Skolkovo, Moscow Region 143025, Russia
- Tatiana V Fedorova
- A. N. Bach Institute of Biochemistry, Research Center of Biotechnology, Russian Academy of Sciences, Leninsky Ave. 33/2, Moscow 119071, Russia
33
Liu Y, Han N, Wang S, Chen C, Lu J, Riaz MW, Si H, Sun G, Ma C. Genome-Wide Identification of Triticum aestivum Xylanase Inhibitor Gene Family and Inhibitory Effects of XI-2 Subfamily Proteins on Fusarium graminearum GH11 Xylanase. Front Plant Sci 2021; 12:665501. [PMID: 34381472 PMCID: PMC8350787 DOI: 10.3389/fpls.2021.665501] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/08/2021] [Accepted: 04/21/2021] [Indexed: 06/13/2023]
Abstract
The Triticum aestivum xylanase inhibitor (TaXI) gene plays an important role in plant defense. Recently, the TaXI-III inhibitor has been shown to play a dual role in wheat resistance to Fusarium graminearum infection. Thus, identifying the members of the TaXI gene family and clarifying its role in wheat resistance to stresses are essential for wheat resistance breeding. However, to date, no comprehensive research on TaXIs in wheat (Triticum aestivum L.) has been conducted. In this study, a total of 277 TaXI genes, including six genes that we cloned, were identified from the recently released wheat genome database (IWGSC RefSeq v1.1); they were unevenly located on the 21 chromosomes of wheat. Phylogenetic analysis divided these genes into six subfamilies; all six genes we cloned belonged to the XI-2 subfamily. The exon/intron structure of most TaXI genes and the conserved motifs of proteins in the same subfamily are similar. The TaXI gene family contains 92 homologous gene pairs or clusters; 63 and 193 genes were identified as tandemly duplicated and segmentally duplicated genes, respectively. Analysis of the cis-acting elements in the promoters of TaXI genes showed that they are involved in wheat growth, hormone-mediated signal transduction, and response to biotic and abiotic stresses. RNA-seq data analysis revealed that TaXI genes exhibited expression preference or specificity in different organs and developmental stages, as well as in diverse stress responses, which can be regulated or induced by a variety of plant hormones and stresses. In addition, the qRT-PCR data and heterologous expression analysis of six TaXI genes revealed that the genes of the XI-2 subfamily have a double inhibitory effect on GH11 xylanase of F. graminearum, suggesting their potential important roles in wheat resistance to F. graminearum infection. The outcomes of this study not only enhance our understanding of the TaXI gene family in wheat, but also help us to screen more candidate genes for further exploring the resistance mechanism in wheat.
Collapse
Affiliation(s)
- Yang Liu
- College of Agronomy, Anhui Agricultural University, Hefei, China
- Key Laboratory of Wheat Biology and Genetic Improvement on Southern Yellow and Huai River Valley, Ministry of Agriculture and Rural Affairs, Hefei, China
| | - Nannan Han
- College of Agronomy, Anhui Agricultural University, Hefei, China
- Key Laboratory of Wheat Biology and Genetic Improvement on Southern Yellow and Huai River Valley, Ministry of Agriculture and Rural Affairs, Hefei, China
| | - Sheng Wang
- College of Agronomy, Anhui Agricultural University, Hefei, China
- Key Laboratory of Wheat Biology and Genetic Improvement on Southern Yellow and Huai River Valley, Ministry of Agriculture and Rural Affairs, Hefei, China
| | - Can Chen
- College of Agronomy, Anhui Agricultural University, Hefei, China
- Key Laboratory of Wheat Biology and Genetic Improvement on Southern Yellow and Huai River Valley, Ministry of Agriculture and Rural Affairs, Hefei, China
| | - Jie Lu
- College of Agronomy, Anhui Agricultural University, Hefei, China
- Key Laboratory of Wheat Biology and Genetic Improvement on Southern Yellow and Huai River Valley, Ministry of Agriculture and Rural Affairs, Hefei, China
| | - Muhammad Waheed Riaz
- College of Agronomy, Anhui Agricultural University, Hefei, China
- Key Laboratory of Wheat Biology and Genetic Improvement on Southern Yellow and Huai River Valley, Ministry of Agriculture and Rural Affairs, Hefei, China
| | - Hongqi Si
- College of Agronomy, Anhui Agricultural University, Hefei, China
- Key Laboratory of Wheat Biology and Genetic Improvement on Southern Yellow and Huai River Valley, Ministry of Agriculture and Rural Affairs, Hefei, China
| | - Genlou Sun
- College of Agronomy, Anhui Agricultural University, Hefei, China
- Biology Department, Saint Mary’s University, Halifax, NS, Canada
| | - Chuanxi Ma
- College of Agronomy, Anhui Agricultural University, Hefei, China
- Key Laboratory of Wheat Biology and Genetic Improvement on Southern Yellow and Huai River Valley, Ministry of Agriculture and Rural Affairs, Hefei, China
- National United Engineering Laboratory for Crop Stress Resistance Breeding, Hefei, China
- Anhui Key Laboratory of Crop Biology, Hefei, China
34
Begum T, Robinson-Rechavi M. Special Care Is Needed in Applying Phylogenetic Comparative Methods to Gene Trees with Speciation and Duplication Nodes. Mol Biol Evol 2021; 38:1614-1626. [PMID: 33169790 PMCID: PMC8042747 DOI: 10.1093/molbev/msaa288] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 12/03/2022]
Abstract
How gene function evolves is a central question of evolutionary biology. It can be investigated by comparing functional genomics results between species and between genes. Most comparative studies of functional genomics have used pairwise comparisons. Yet it has been shown that this can provide biased results, as genes, like species, are phylogenetically related. Phylogenetic comparative methods should be used to correct for this, but they depend on strong assumptions, including unbiased tree estimates relative to the hypothesis being tested. Such methods have recently been used to test the “ortholog conjecture,” the hypothesis that functional evolution is faster in paralogs than in orthologs. Although pairwise comparisons of tissue specificity (τ) provided support for the ortholog conjecture, phylogenetic independent contrasts did not. Our reanalysis on the same gene trees identified problems with the time calibration of duplication nodes. We find that the gene trees used suffer from important biases, due to the inclusion of trees with no duplication nodes, to the relative age of speciations and duplications, to systematic differences in branch lengths, and to non-Brownian motion of tissue specificity on many trees. We find that incorrect implementation of phylogenetic methods on empirical gene trees with duplications can be problematic. Controlling for biases allows successful use of phylogenetic methods to study the evolution of gene function and provides some support for the ortholog conjecture using three different phylogenetic approaches.
Affiliation(s)
- Tina Begum
- Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Marc Robinson-Rechavi
- Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
35
Corynebacterium glutamicum Regulation beyond Transcription: Organizing Principles and Reconstruction of an Extended Regulatory Network Incorporating Regulations Mediated by Small RNA and Protein-Protein Interactions. Microorganisms 2021; 9:1395. [PMID: 34203422 PMCID: PMC8303971 DOI: 10.3390/microorganisms9071395] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/20/2020] [Revised: 01/08/2021] [Accepted: 01/12/2021] [Indexed: 11/16/2022]
Abstract
Corynebacterium glutamicum is a Gram-positive bacterium found in soil, where changing conditions demand plasticity of the regulatory machinery. The study of such machinery at the global scale has been hampered by the lack of data integration. Here, we report three regulatory network models for C. glutamicum: strong (3040 interactions), constructed solely with regulations previously supported by directed experiments; all evidence (4665 interactions), containing the strong network, regulations previously supported by nondirected experiments, and protein-protein interactions with a direct effect on gene transcription; and sRNA (5222 interactions), containing the all evidence network and sRNA-mediated regulations. Compared to the previous version (2018), the strong and all evidence networks increased by 75 and 1225 interactions, respectively. We analyzed the system-level components of the three networks to identify how they differ and compared their structures against those of the networks of more than 40 species. The inclusion of the sRNA-mediated regulations changed the proportions of the system-level components and increased the number of modules but decreased their size. The C. glutamicum regulatory structure contrasted with other bacterial regulatory networks. Finally, we used the strong networks of three model organisms to provide insights and future directions for the characterization of the C. glutamicum regulatory network.
36
Lai DL, Yan J, Fan Y, Li Y, Ruan JJ, Wang JZ, Fan Y, Cheng XB, Cheng JP. Genome-wide identification and phylogenetic relationships of the Hsp70 gene family of Aegilops tauschii, wild emmer wheat (Triticum dicoccoides) and bread wheat (Triticum aestivum). 3 Biotech 2021; 11:301. [PMID: 34194894 DOI: 10.1007/s13205-021-02639-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/19/2020] [Accepted: 01/03/2021] [Indexed: 12/22/2022]
Abstract
Heat shock protein 70 (Hsp70) plays an important role in plant development. It is closely related to the physiological process of cell development and the response to abiotic and biotic stress. However, the classification and evolution of Hsp70 genes in bread wheat, wild emmer wheat and Aegilops tauschii are still unclear. Therefore, this study conducted a comprehensive bioinformatics analysis of the Hsp70 gene family in the three species. Among these three species, 113, 79 and 36 Hsp70 genes were identified. They are divided into six subfamilies. Group vi-1 differs from that of Arabidopsis thaliana, which may be the result of early evolutionary divergence. The number of exons differed between subfamilies (from 1 to 13), but the distribution patterns of exons/introns within the same subfamily were similar. Analysis of the Hsp70 promoter regions showed that the cis-regulatory elements of A. tauschii and wild emmer wheat were different from those of wheat. In addition, the CpG island proportion of wild emmer Hsp70 was higher than that of wheat, which may be the molecular basis of the heat resistance of wild wheat relative to cultivated wheat. Further comprehensive analysis of the chromosome location and duplication events of Hsp70 genes showed that whole-genome duplication and tandem duplication events contributed to the evolution and expansion of the Hsp70 gene family in wheat. Analysis of non-synonymous and synonymous substitutions showed that the Hsp70 genes of the three species had undergone purifying selection. Expression profile analysis showed that Hsp70 genes were highly expressed in the roots during the vegetative growth period. In addition, TaHsp70 genes were highly expressed under various stresses. The identification, classification and evolution of Hsp70 in wheat and its relatives provide a basis for further research on its evolution and its molecular mechanism in response to stress.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s13205-021-02639-5.
Affiliation(s)
- Di-Li Lai
- College of Agriculture, Guizhou University, Guiyang, 550025 People's Republic of China
- Jun Yan
- School of Pharmacy and Bioengineering, Chengdu University, Chengdu, 610106 People's Republic of China
- Yu Fan
- College of Agriculture, Guizhou University, Guiyang, 550025 People's Republic of China
- Yao Li
- School of Public Health, Chengdu University of Traditional Chinese Medicine, Chengdu, 611137 People's Republic of China
- Jing-Jun Ruan
- College of Agriculture, Guizhou University, Guiyang, 550025 People's Republic of China
- Jun-Zhen Wang
- Research Station of Alpine Crops, Xichang Institute of Agricultural Sciences, Liangshan, 616150 People's Republic of China
- Yue Fan
- College of Agriculture, Guizhou University, Guiyang, 550025 People's Republic of China
- Xiao-Bin Cheng
- Department of Environmental and Life Sciences, Sichuan MinZu College, Kangding, 626001 People's Republic of China
- Jian-Ping Cheng
- College of Agriculture, Guizhou University, Guiyang, 550025 People's Republic of China
37
Linard B, Ebersberger I, McGlynn SE, Glover N, Mochizuki T, Patricio M, Lecompte O, Nevers Y, Thomas PD, Gabaldón T, Sonnhammer E, Dessimoz C, Uchiyama I. Ten Years of Collaborative Progress in the Quest for Orthologs. Mol Biol Evol 2021; 38:3033-3045. [PMID: 33822172 PMCID: PMC8321534 DOI: 10.1093/molbev/msab098] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 10/27/2020] [Revised: 02/07/2021] [Accepted: 04/01/2021] [Indexed: 12/19/2022]
Abstract
Accurate determination of the evolutionary relationships between genes is a foundational challenge in biology. Homology (evolutionary relatedness) is in many cases readily determined based on sequence similarity analysis. By contrast, whether two genes directly descended from a common ancestor by a speciation event (orthologs) or a duplication event (paralogs) is more challenging to determine, yet provides critical information on the history of a gene. Since 2009, this task has been the focus of the Quest for Orthologs (QFO) Consortium. The sixth QFO meeting took place in Okazaki, Japan, in conjunction with the 67th National Institute for Basic Biology conference. Here, we report recent advances, applications, and upcoming challenges that were discussed during the conference. Steady progress has been made toward standardization and scalability of new and existing tools. A feature of the conference was the presentation of a panel of accessible tools for phylogenetic profiling and several developments to bring orthology beyond the gene unit, from domains to networks. This meeting brought to light several challenges to come: leveraging orthology computations to get the most out of the incoming avalanche of genomic data, integrating orthology from domain to biological network levels, building better gene models, and adapting orthology approaches to the broad evolutionary and genomic diversity recognized in different forms of life and viruses.
Affiliation(s)
- Benjamin Linard
- LIRMM, University of Montpellier, CNRS, Montpellier, France
- SPYGEN, Le Bourget-du-Lac, France
- Ingo Ebersberger
- Institute of Cell Biology and Neuroscience, Goethe University Frankfurt, Frankfurt, Germany
- Senckenberg Biodiversity and Climate Research Centre (S-BIKF), Frankfurt, Germany
- LOEWE Center for Translational Biodiversity Genomics (TBG), Frankfurt, Germany
- Shawn E McGlynn
- Earth-Life Science Institute, Tokyo Institute of Technology, Meguro, Tokyo, Japan
- Blue Marble Space Institute of Science, Seattle, WA, USA
- Natasha Glover
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- Tomohiro Mochizuki
- Earth-Life Science Institute, Tokyo Institute of Technology, Meguro, Tokyo, Japan
- Mateus Patricio
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
- Odile Lecompte
- Department of Computer Science, ICube, UMR 7357, University of Strasbourg, CNRS, Fédération de Médecine Translationnelle de Strasbourg, Strasbourg, France
- Yannis Nevers
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- Paul D Thomas
- Division of Bioinformatics, Department of Preventive Medicine, University of Southern California, Los Angeles, CA, USA
- Toni Gabaldón
- Barcelona Supercomputing Centre (BCS-CNS), Jordi Girona, Barcelona, Spain
- Institute for Research in Biomedicine (IRB), The Barcelona Institute of Science and Technology (BIST), Barcelona, Spain
- Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
- Erik Sonnhammer
- Science for Life Laboratory, Department of Biochemistry and Biophysics, Stockholm University, Solna, Sweden
- Christophe Dessimoz
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- Department of Computer Science, University College London, London, United Kingdom
- Department of Genetics, Evolution and Environment, University College London, London, United Kingdom
- Ikuo Uchiyama
- Department of Theoretical Biology, National Institute for Basic Biology, National Institutes of Natural Sciences, Okazaki, Aichi, Japan
38
Li J, Liu X, Wang Q, Sun J, He D. Genome-wide identification and analysis of cystatin family genes in Sorghum (Sorghum bicolor (L.) Moench). PeerJ 2021; 9:e10617. [PMID: 33552717 PMCID: PMC7827979 DOI: 10.7717/peerj.10617] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/30/2019] [Accepted: 11/30/2020] [Indexed: 11/20/2022]
Abstract
To establish a systematic study of the Sorghum cystatin (SbCys) gene family, a genome-wide analysis of the SbCys family genes was performed by bioinformatics-based methods. In total, 18 SbCys genes were identified in Sorghum, which were distributed unevenly on chromosomes, and two genes were involved in a tandem duplication event. All SbCys genes had similar exon/intron structure and motifs, indicating their high evolutionary conservation. Transcriptome analysis showed that 16 SbCys genes were expressed in different tissues, and most genes displayed higher expression levels in reproductive tissues than in vegetative tissues, indicating that the SbCys genes participated in the regulation of seed formation. Furthermore, the expression profiles of the SbCys genes revealed that seven cystatin family genes were induced during Bipolaris sorghicola infection and only two genes were responsive to aphid infestation. In addition, quantitative real-time polymerase chain reaction (qRT-PCR) confirmed that 17 SbCys genes were induced by one or two abiotic stresses (dehydration, salt, and ABA stresses). The interaction network indicated that SbCys proteins were associated with several biological processes, including seed development and stress responses. Notably, the expression of SbCys4 was up-regulated under biotic and abiotic stresses, suggesting its potential roles in mediating the responses of Sorghum to adverse environmental impact. Our results provide new insights into the structural and functional characteristics of the SbCys gene family, which lay the foundation for better understanding the roles and regulatory mechanism of Sorghum cystatins in seed development and responses to different stress conditions.
Affiliation(s)
- Jie Li
- College of Agronomy, Xinyang Agriculture and Forestry University, Xinyang, Henan Province, China
- Xinhao Liu
- Central Laboratory, Xinyang Agriculture and Forestry University, Xinyang, Henan Province, China
- Qingmei Wang
- Central Laboratory, Xinyang Agriculture and Forestry University, Xinyang, Henan Province, China
- Junyan Sun
- College of Agronomy, Xinyang Agriculture and Forestry University, Xinyang, Henan Province, China
- Dexian He
- Collaborative Innovation Center of Henan Grain Crops/National Key Laboratory of Wheat and Maize Crop Science, College of Agronomy, Henan Agricultural University, Zhengzhou, China
39
Dong N, Bandura J, Zhang Z, Wang Y, Labadie K, Noel B, Davison A, Koene JM, Sun HS, Coutellec MA, Feng ZP. Ion channel profiling of the Lymnaea stagnalis ganglia via transcriptome analysis. BMC Genomics 2021; 22:18. [PMID: 33407100 PMCID: PMC7789530 DOI: 10.1186/s12864-020-07287-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 06/04/2020] [Accepted: 11/28/2020] [Indexed: 12/27/2022]
Abstract
BACKGROUND: The pond snail Lymnaea stagnalis (L. stagnalis) has been widely used as a model organism in neurobiology, ecotoxicology, and parasitology due to the relative simplicity of its central nervous system (CNS). However, its usefulness is restricted by a limited availability of transcriptome data. While sequence information for the L. stagnalis CNS transcripts has been obtained from EST libraries and a de novo RNA-seq assembly, the quality of these assemblies is limited by a combination of low coverage of EST libraries, the fragmented nature of de novo assemblies, and lack of reference genome.
RESULTS: In this study, taking advantage of the recent availability of a preliminary L. stagnalis genome, we generated an RNA-seq library from the adult L. stagnalis CNS, using a combination of genome-guided and de novo assembly programs to identify 17,832 protein-coding L. stagnalis transcripts. We combined our library with existing resources to produce a transcript set with greater sequence length, completeness, and diversity than previously available ones. Using our assembly and functional domain analysis, we profiled L. stagnalis CNS transcripts encoding ion channels and ionotropic receptors, which are key proteins for CNS function, and compared their sequences to other vertebrate and invertebrate model organisms. Interestingly, L. stagnalis transcripts encoding numerous putative Ca²⁺ channels showed the most sequence similarity to those of Mus musculus, Danio rerio, Xenopus tropicalis, Drosophila melanogaster, and Caenorhabditis elegans, suggesting that many calcium channel-related signaling pathways may be evolutionarily conserved.
CONCLUSIONS: Our study provides the most thorough characterization to date of the L. stagnalis transcriptome and provides insights into differences between vertebrates and invertebrates in CNS transcript diversity, according to function and protein class. Furthermore, this study provides a complete characterization of the ion channels of Lymnaea stagnalis, opening new avenues for future research on fundamental neurobiological processes in this model system.
Affiliation(s)
- Nancy Dong
- Department of Physiology, University of Toronto, 3308 MSB, 1 King's College Circle, Toronto, ON, M5S 1A8, Canada
- Julia Bandura
- Department of Physiology, University of Toronto, 3308 MSB, 1 King's College Circle, Toronto, ON, M5S 1A8, Canada
- Zhaolei Zhang
- Donnelly Centre for Cellular and Biomolecular Research and Department of Molecular Genetics, University of Toronto, Toronto, ON, M5S 3E1, Canada
- Yan Wang
- Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, Ontario, M5S 3B2, Canada
- Department of Biological Sciences, University of Toronto Scarborough, Toronto, Ontario, M1C 1A4, Canada
- Karine Labadie
- Genoscope, Institut de biologie François Jacob, Commissariat à l'Energie Atomique (CEA), Université Paris-Saclay, BP5706, 91057, Evry, France
- Benjamin Noel
- Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, University of Evry, Université Paris-Saclay, 91057, Evry, France
- Angus Davison
- School of Life Sciences, University of Nottingham, University Park, Nottingham, UK, NG7 2RD, UK
- Joris M Koene
- Department of Ecological Science, Faculty of Science, Vrije Universiteit, Amsterdam, The Netherlands
- Hong-Shuo Sun
- Department of Physiology, University of Toronto, 3308 MSB, 1 King's College Circle, Toronto, ON, M5S 1A8, Canada
- Department of Surgery, University of Toronto, Toronto, Ontario, M5S 1A8, Canada
- Zhong-Ping Feng
- Department of Physiology, University of Toronto, 3308 MSB, 1 King's College Circle, Toronto, ON, M5S 1A8, Canada
40
Penin AA, Kasianov AS, Klepikova AV, Kirov IV, Gerasimov ES, Fesenko AN, Logacheva MD. High-Resolution Transcriptome Atlas and Improved Genome Assembly of Common Buckwheat, Fagopyrum esculentum. Front Plant Sci 2021; 12:612382. [PMID: 33815435 PMCID: PMC8010679 DOI: 10.3389/fpls.2021.612382] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 09/30/2020] [Accepted: 02/03/2021] [Indexed: 05/06/2023]
Abstract
Common buckwheat (Fagopyrum esculentum) is an important non-cereal grain crop and a prospective component of functional food. Despite this, the genomic resources for this species and for the whole family Polygonaceae, to which it belongs, are scarce. Here, we report the assembly of the buckwheat genome using long-read technology and a high-resolution expression atlas including 46 organs and developmental stages. We found that the buckwheat genome has an extremely high content of transposable elements, including several classes of recently (0.5-1 Mya) multiplied TEs ("transposon burst") and gradually accumulated TEs. The difference in TE content is a major factor contributing to the three-fold increase in the genome size of F. esculentum compared with its sister species F. tataricum. Moreover, we detected the differences in TE content between the wild ancestral subspecies F. esculentum ssp. ancestrale and buckwheat cultivars, suggesting that TE activity accompanied buckwheat domestication. Expression profiling allowed us to test a hypothesis about the genetic control of petaloidy of tepals in buckwheat. We showed that it is not mediated by B-class gene activity, in contrast to the prediction from the ABC model. Based on a survey of expression profiles and phylogenetic analysis, we identified the MYB family transcription factor gene tr_18111 as a potential candidate for the determination of conical cells in buckwheat petaloid tepals. The information on expression patterns has been integrated into the publicly available database TraVA: http://travadb.org/browse/Species=Fesc/. The improved genome assembly and transcriptomic resources will enable research on buckwheat, including practical applications.
Affiliation(s)
- Aleksey A. Penin
- Institute for Information Transmission Problems of the Russian Academy of Sciences, Moscow, Russia
- Artem S. Kasianov
- Institute for Information Transmission Problems of the Russian Academy of Sciences, Moscow, Russia
- Anna V. Klepikova
- Institute for Information Transmission Problems of the Russian Academy of Sciences, Moscow, Russia
- Ilya V. Kirov
- All-Russia Research Institute of Agricultural Biotechnology, Moscow, Russia
- Maria D. Logacheva
- Institute for Information Transmission Problems of the Russian Academy of Sciences, Moscow, Russia
- Skolkovo Institute of Science and Technology, Moscow, Russia
- *Correspondence: Maria D. Logacheva,
41
Salmanian S, Pezeshk H, Sadeghi M. Inter-protein residue covariation information unravels physically interacting protein dimers. BMC Bioinformatics 2020; 21:584. [PMID: 33334319 PMCID: PMC7745481 DOI: 10.1186/s12859-020-03930-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/12/2020] [Accepted: 12/09/2020] [Indexed: 01/04/2023]
Abstract
BACKGROUND: Predicting physical interaction between proteins is one of the greatest challenges in computational biology. There is a considerable variety of protein interactions and a huge number of protein sequences and synthetic peptides with unknown interacting counterparts. Most co-evolutionary methods discover a combination of physical interplays and functional associations. However, there are only a handful of approaches which specifically infer physical interactions. Hybrid co-evolutionary methods exploit inter-protein residue coevolution to unravel specific physically interacting proteins. In this study, we introduce a hybrid co-evolutionary-based approach to predict physical interplays between pairs of protein families, starting from protein sequences only.
RESULTS: In the present analysis, pairs of multiple sequence alignments are constructed for each dimer, and the covariation between residues in those pairs is calculated by CCMpred (Contacts from Correlated Mutations predicted) and three mutual information based approaches for ten accessible surface area threshold groups. Then, all residue couplings between the proteins of each dimer are unified into a single Frobenius norm value. Norms of residue contact matrices of all dimers at different accessible surface area thresholds are fed into a support vector machine as single or multiple feature models. Training the classifiers on single features shows no apparent difference in accuracy among methods across accessible surface area thresholds. Nevertheless, the mutual information product and context likelihood of relatedness procedures may have roughly higher and lower overall performance, respectively, than the other two methods across accessible surface area cut-offs. The results also demonstrate that training the support vector machine with multiple norm features for several accessible surface area thresholds leads to a considerable improvement in prediction performance. In this context, CCMpred achieves roughly better overall performance than the mutual information based approaches. The best accuracy, sensitivity, specificity, precision and negative predictive value for that method are 0.98, 1, 0.962, 0.96, and 0.962, respectively.
CONCLUSIONS: In this paper, by feeding norm values of protein dimers into support vector machines at different accessible surface area thresholds, we demonstrate that even a small number of proteins in pairs of multiple alignments can allow one to accurately discriminate between positive and negative dimers.
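As an illustration only, the step in which the residue couplings of a dimer are unified into a single Frobenius norm value can be sketched as follows. The function name `frobenius_norm` and the toy matrix are hypothetical and are not taken from the paper; real coupling scores would come from a tool such as CCMpred.

```python
import math
from typing import Sequence


def frobenius_norm(coupling: Sequence[Sequence[float]]) -> float:
    """Collapse an inter-protein residue coupling matrix into one scalar.

    The Frobenius norm is the square root of the sum of squared entries.
    """
    return math.sqrt(sum(v * v for row in coupling for v in row))


# Hypothetical toy matrix: rows are residues of one protein, columns are
# residues of its partner; entries stand in for coupling scores.
toy_couplings = [
    [3.0, 0.0, 0.0],
    [0.0, 4.0, 0.0],
]

# One scalar feature per dimer (and per surface-area threshold, if several
# thresholds are used) that a classifier could then consume.
norm_feature = frobenius_norm(toy_couplings)
print(norm_feature)
```

Computing one such value per accessible surface area threshold would give the kind of multi-feature vector the abstract describes, under the assumptions stated above.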
Affiliation(s)
- Sara Salmanian
- Department of Bioinformatics, Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
- Hamid Pezeshk
- School of Mathematics, Statistics and Computer Science, College of Science, University of Tehran, Tehran, Iran
- Present Address: Department of Mathematics and Statistics, Concordia University, Montreal, Canada
- School of Biological Sciences, Institute for Research in Fundamental Sciences, Tehran, Iran
- Mehdi Sadeghi
- National Institute of Genetic Engineering and Biotechnology, Tehran, Iran
42
Purification and Characterization of Two Novel Laccases from Peniophora lycii. J Fungi (Basel) 2020; 6:340. [PMID: 33291231 PMCID: PMC7762197 DOI: 10.3390/jof6040340] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 11/16/2020] [Revised: 12/02/2020] [Accepted: 12/03/2020] [Indexed: 01/09/2023]
Abstract
Although more than 100 laccases have currently been purified from basidiomycete fungi, the majority of these laccases were obtained from fungi of the Polyporales order, and only scarce data are available about the laccases from other fungi. In this article, laccase production by the white-rot basidiomycete fungus Peniophora lycii, belonging to the Russulales order, was investigated. It was shown that, under copper induction, this fungus secreted three different laccase isozymes. Two laccase isozymes, Lac5 and LacA, were purified and their corresponding nucleotide sequences were determined. Both purified laccases were relatively thermostable, with half-lives at 70 °C of 10 and 8 min for Lac5 and LacA, respectively. The laccases demonstrated the highest activity toward ABTS (97 U·mg⁻¹ for Lac5 and 121 U·mg⁻¹ for LacA at pH 4.5); Lac5 demonstrated the lowest activity toward 2,6-DMP (2.5 U·mg⁻¹ at pH 4.5), while LacA demonstrated its lowest activity toward gallic acid (1.4 U·mg⁻¹ at pH 4.5). Both Lac5 and LacA were able to efficiently decolorize such dyes as RBBR and Bromcresol Green. Additionally, phylogenetic relationships among laccases of Peniophora spp. were reconstructed, and groups of orthologous genes were determined. Based on these groups, all currently available data about laccases of Peniophora spp. were systematized.
43
Agarwal PR, Lahiri A. Comparative study of the SBP-box gene family in rice siblings. J Biosci 2020. [DOI: 10.1007/s12038-020-00048-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
44
Ahrens JB, Teufel AI, Siltberg-Liberles J. A Phylogenetic Rate Parameter Indicates Different Sequence Divergence Patterns in Orthologs and Paralogs. J Mol Evol 2020; 88:720-730. [PMID: 33118098 DOI: 10.1007/s00239-020-09969-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2020] [Accepted: 10/15/2020] [Indexed: 10/23/2022]
Abstract
Heterotachy (the change in sequence evolutionary rate over time) is a common feature of protein molecular evolution. Decades of studies have shed light on the conditions under which heterotachy occurs, and there is evidence that site-specific evolutionary rate shifts are correlated with changes in protein function. Here, we present a large-scale, computational analysis using thousands of protein sequence alignments from animal and plant proteomes, representing genes related either by orthology (speciation events) or paralogy (gene duplication), to compare sequence divergence patterns in orthologous vs. paralogous sequence alignments. We use sequence-based phylogenetic analyses to infer overall sequence divergence (tree length/number of sequences) and to fit site-specific rates to a discrete gamma distribution with a shape parameter α. This inference method is applied to real protein sequence alignments, as well as alignments simulated under various models of protein sequence evolution. Our simulations indicate that sequence divergence and the α parameter are positively correlated when sequences evolve with heterotachy, meaning that inferred site rate distributions appear more uniform as sequences diverge. Divergence and α are also positively correlated in both orthologous and paralogous genes, but the average increase in α (as a function of divergence) is significantly higher in paralogous protein alignments than in orthologous alignments. This result is consistent with the widely held view that recently duplicated proteins initially evolve under relaxed selective pressure, promoting functional divergence by accumulation of amino acid replacements, and hence experience more evolutionary rate fluctuations than orthologous proteins. We discuss these findings in the context of the ortholog conjecture, a long-standing assumption in molecular evolution, which posits that protein sequences related by orthology tend to be more functionally conserved than paralogous proteins.
Affiliation(s)
- Joseph B Ahrens
- Department of Biological Sciences, Biomolecular Sciences Institute, Florida International University, Miami, FL, USA
- Department of Biochemistry and Molecular Genetics, Computational Bioscience Program, University of Colorado Denver, Aurora, CO, USA
- Ashley I Teufel
- Department of Integrative Biology, The University of Texas At Austin, Austin, TX, USA
- Santa Fe Institute, Santa Fe, NM, USA
- Jessica Siltberg-Liberles
- Department of Biological Sciences, Biomolecular Sciences Institute, Florida International University, Miami, FL, USA
| |
Collapse
|
45
|
Prometheus, an omics portal for interkingdom comparative genomic analyses. PLoS One 2020; 15:e0240191. [PMID: 33112870 PMCID: PMC7592745 DOI: 10.1371/journal.pone.0240191] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2020] [Accepted: 09/21/2020] [Indexed: 11/27/2022] Open
Abstract
Functional analyses of genes are crucial for unveiling biological responses, genetic engineering, and developing new medicines. However, functional analyses have largely been restricted to model organisms, representing a major hurdle for functional studies and industrial applications. To resolve this, comparative genome analyses can be used to provide clues to gene functions as well as their evolutionary history. To this end, we present Prometheus, a web-based omics portal that contains more than 17,215 sequences from prokaryotic and eukaryotic genomes. This portal supports interkingdom comparative analyses via a domain architecture-based gene identification system and Gene Search, and users can easily and rapidly identify single or entire gene sets in specific pathways. Bioinformatics tools for further analyses are provided in Prometheus or through Bio-Express, a cloud-based bioinformatics analysis platform. Prometheus is a new paradigm for comparative analyses of large amounts of genomic information.
|
46
|
Hernández-Salmerón JE, Moreno-Hagelsieb G. Progress in quickly finding orthologs as reciprocal best hits: comparing blast, last, diamond and MMseqs2. BMC Genomics 2020; 21:741. [PMID: 33099302 PMCID: PMC7585182 DOI: 10.1186/s12864-020-07132-6] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 05/26/2020] [Accepted: 10/09/2020] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND: Finding orthologs remains an important bottleneck in comparative genomics analyses. While the authors of software for the quick comparison of protein sequences evaluate the speed of their software and compare their results against the most usual software for the task, it is not common for them to evaluate their software for more particular uses, such as finding orthologs as reciprocal best hits (RBH). Here we compared RBH results obtained using software that runs faster than blastp, namely lastal, diamond, and MMseqs2. RESULTS: We found that lastal required the least time to produce results. However, it yielded fewer results than any other program when comparing the proteins encoded by evolutionarily distant genomes. The program producing the most similar number of RBH to blastp was diamond run with the "ultra-sensitive" option. However, this option was diamond's slowest, with the "very-sensitive" option offering the best balance between speed and RBH results. The speed-up of the programs was much more evident when dealing with eukaryotic genomes, which code for more numerous proteins. For example, lastal took a median of approx. 1.5% of the blastp time to run with bacterial proteomes and 0.6% with eukaryotic ones, while diamond with the very-sensitive option took 7.4% and 5.2%, respectively. Though estimated error rates were very similar among the RBH obtained with all programs, RBH obtained with MMseqs2 had the lowest error rates among the programs tested. CONCLUSIONS: The fast algorithms for pairwise protein comparison produced results very similar to blast in a fraction of the time, with diamond offering the best compromise in speed, sensitivity and quality, as long as a sensitivity option other than the default was chosen.
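The reciprocal best hit criterion mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes the pairwise search results (from blastp, lastal, diamond, or MMseqs2) have already been parsed into dictionaries mapping each query to a list of (subject, bit score) pairs. The function names, the data layout, and the toy scores are hypothetical and are not taken from the paper; real pipelines typically also apply coverage and E-value filters.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical layout: query ID -> list of (subject ID, bit score).
Hits = Dict[str, List[Tuple[str, float]]]


def best_hit(hits: Hits) -> Dict[str, str]:
    """Map each query to its highest-scoring subject (first one wins on ties)."""
    best: Dict[str, str] = {}
    for query, candidates in hits.items():
        if candidates:
            best[query] = max(candidates, key=lambda c: c[1])[0]
    return best


def reciprocal_best_hits(a_vs_b: Hits, b_vs_a: Hits) -> Set[Tuple[str, str]]:
    """Return (a, b) pairs where b is a's best hit and a is b's best hit."""
    best_ab = best_hit(a_vs_b)
    best_ba = best_hit(b_vs_a)
    return {(a, b) for a, b in best_ab.items() if best_ba.get(b) == a}


# Toy, made-up scores for two proteomes A and B.
a_vs_b = {
    "a1": [("b1", 250.0), ("b2", 90.0)],
    "a2": [("b1", 120.0), ("b2", 80.0)],
}
b_vs_a = {
    "b1": [("a1", 245.0), ("a2", 118.0)],
    "b2": [("a2", 85.0), ("a1", 88.0)],
}
rbh_pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(rbh_pairs)
```

In this toy input only a1 and b1 select each other, so a2 and b2 remain unpaired. A less sensitive search program can lose RBH pairs in the same way, by missing the true best hit in either direction.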
Affiliation(s)
- Gabriel Moreno-Hagelsieb
- Wilfrid Laurier University, Department of Biology, 75 University Ave W, Waterloo, N2L 3C5 ON Canada
|
47
|
Eshkiki EM, Hajiahmadi Z, Abedi A, Kordrostami M, Jacquard C. In Silico Analyses of Autophagy-Related Genes in Rapeseed (Brassica napus L.) under Different Abiotic Stresses and in Various Tissues. PLANTS 2020; 9:plants9101393. [PMID: 33092180 PMCID: PMC7594038 DOI: 10.3390/plants9101393] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 10/02/2020] [Revised: 10/14/2020] [Accepted: 10/15/2020] [Indexed: 12/21/2022]
Abstract
The autophagy-related genes (ATGs) play important roles in plant growth and response to environmental stresses. Brassica napus (B. napus) is among the most important oilseed crops, but ATGs are largely unknown in this species. Therefore, a genome-wide analysis of the B. napus ATG gene family (BnATGs) was performed. One hundred and twenty-seven ATGs, belonging to 20 main groups, were identified in the B. napus genome. Segmental duplication occurred more often than tandem duplication in BnATGs. Ka/Ks ratios for most duplicated gene pairs were less than one, indicating that negative selection acted to maintain their function during the evolution of B. napus. Based on the results, BnATGs are involved in various developmental processes and respond to biotic and abiotic stresses. One hundred and seven miRNA molecules are involved in the post-transcriptional regulation of 41 BnATGs. Overall, 127 simple sequence repeat (SSR) marker loci were also detected in BnATGs. Based on the RNA-seq data, the highest expression in root and silique was that of BnVTI12e, while in shoot and seed, it was that of BnATG8p. The expression of most BnATGs was significantly up-regulated or down-regulated in response to dehydration, salinity, abscisic acid, and cold. This research provides information that can help identify candidate genes for genetic manipulation in B. napus.
Affiliation(s)
- Elham Mehri Eshkiki
- Department of Agricultural Biotechnology, Payame Noor University (PNU), Tehran P.O. Box 19395-4697, Iran
- Zahra Hajiahmadi
- Department of Biotechnology, Faculty of Agricultural Sciences, University of Guilan, Rasht P.O. Box 41635-1314, Iran
- Amin Abedi
- Department of Biotechnology, Faculty of Agricultural Sciences, University of Guilan, Rasht P.O. Box 41635-1314, Iran
- Mojtaba Kordrostami
- Nuclear Agriculture Research School, Nuclear Science and Technology Research Institute (NSTRI), Karaj P.O. Box 31485498, Iran
- Cédric Jacquard
- Resistance Induction and Bioprotection of Plants Unit (RIBP), EA4707, SFR Condorcet FR CNRS 3417, University of Reims Champagne-Ardenne, Moulin de la Housse, CEDEX 2, BP 1039, 51687 Reims, France
- Correspondence: Tel.: +33-3-26-91-34-36
|
48
|
Amalgamated cross-species transcriptomes reveal organ-specific propensity in gene expression evolution. Nat Commun 2020; 11:4459. [PMID: 32900997 PMCID: PMC7479108 DOI: 10.1038/s41467-020-18090-8] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Received: 10/16/2019] [Accepted: 07/29/2020] [Indexed: 12/24/2022] Open
Abstract
The origins of multicellular physiology are tied to evolution of gene expression. Genes can shift expression as organisms evolve, but how ancestral expression influences altered descendant expression is not well understood. To examine this, we amalgamate 1,903 RNA-seq datasets from 182 research projects, including 6 organs in 21 vertebrate species. Quality control eliminates project-specific biases, and expression shifts are reconstructed using gene-family-wise phylogenetic Ornstein-Uhlenbeck models. Expression shifts following gene duplication result in more drastic changes in expression properties than shifts without gene duplication. The expression properties are tightly coupled with protein evolutionary rate, depending on whether and how gene duplication occurred. Fluxes in expression patterns among organs are nonrandom, forming modular connections that are reshaped by gene duplication. Thus, if expression shifts, ancestral expression in some organs induces a strong propensity for expression in particular organs in descendants. Regardless of whether the shifts are adaptive or not, this supports a major role for what might be termed preadaptive pathways of gene expression evolution.
|
49
|
Recurrent sequence evolution after independent gene duplication. BMC Evol Biol 2020; 20:98. [PMID: 32770961 PMCID: PMC7414715 DOI: 10.1186/s12862-020-01660-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 01/17/2020] [Accepted: 07/17/2020] [Indexed: 11/10/2022] Open
Abstract
Background: Convergent and parallel evolution provide unique insights into the mechanisms of natural selection. Some of the most striking convergent and parallel (collectively recurrent) amino acid substitutions in proteins are adaptive, but there are also many that are selectively neutral. Accordingly, genome-wide assessment has shown that recurrent sequence evolution in orthologs is chiefly explained by nearly neutral evolution. For paralogs, more frequent functional change is expected because additional copies are generally not retained if they do not acquire their own niche. Yet, it is unknown to what extent recurrent sequence differentiation is discernible after independent gene duplications in different eukaryotic taxa. Results: We develop a framework that detects patterns of recurrent sequence evolution in duplicated genes. This is used to analyze the genomes of 90 diverse eukaryotes. We find a remarkable number of families with a potentially predictable functional differentiation following gene duplication. In some protein families, more than ten independent duplications show a similar sequence-level differentiation between paralogs. Based on further analysis, the sequence divergence is found to be generally asymmetric. Moreover, about 6% of the recurrent sequence evolution between paralog pairs can be attributed to recurrent differentiation of subcellular localization. Finally, we reveal the specific recurrent patterns for the gene families Hint1/Hint2, Sco1/Sco2 and vma11/vma3. Conclusions: The presented methodology provides a means to study the biochemical underpinning of functional differentiation between paralogs. For instance, two abundantly repeated substitutions are identified between independently derived Sco1 and Sco2 paralogs. Such identified substitutions allow direct experimental testing of the biological role of these residues for the repeated functional differentiation. We also uncover a diverse set of families with recurrent sequence evolution and reveal trends in the functional and evolutionary trajectories of this hitherto understudied phenomenon.
|
50
|
Costa SS, Guimarães LC, Silva A, Soares SC, Baraúna RA. First Steps in the Analysis of Prokaryotic Pan-Genomes. Bioinform Biol Insights 2020; 14:1177932220938064. [PMID: 32843837 PMCID: PMC7418249 DOI: 10.1177/1177932220938064] [Citation(s) in RCA: 39] [Impact Index Per Article: 9.8] [Received: 05/13/2020] [Accepted: 05/26/2020] [Indexed: 01/14/2023] Open
Abstract
The pan-genome is defined as the set of orthologous and unique genes of a specific group of organisms. The pan-genome is composed of the core genome, accessory genome, and species- or strain-specific genes. The pan-genome is considered open or closed based on the alpha value of Heaps' law. In an open pan-genome, the number of gene families will continuously increase with the addition of new genomes to the analysis, while in a closed pan-genome, the number of gene families will not increase considerably. The first step of a pan-genome analysis is the homogenization of genome annotation. The same software, such as GeneMark or RAST, should be used to annotate all genomes. Subsequently, several software tools, such as BPGA, GET_HOMOLOGUES, and PGAP, among others, are used to calculate the pan-genome. This review presents all these initial steps for those who want to perform a pan-genome analysis, explaining key concepts of the area. Furthermore, we present the pan-genomic analysis of 9 bacterial species. These are the species with the highest number of genomes deposited in GenBank. We also show the influence of the identity and coverage parameters on the prediction of orthologous and paralogous genes. Finally, we cite the perspectives of several research areas where pan-genome analysis can be used to address important questions.
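The open versus closed distinction described in this abstract can be sketched numerically. The example below assumes one commonly used formulation, in which the number of new genes found when the N-th genome is added follows n = κ·N^(−α), with α ≤ 1 read as an open pan-genome and α > 1 as a closed one. That threshold, the function names, and the synthetic data are assumptions for illustration and are not stated in the abstract itself.

```python
import math
from typing import Sequence, Tuple


def fit_heaps_alpha(n_genomes: Sequence[int],
                    new_genes: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of log(new_genes) = log(kappa) - alpha * log(n_genomes).

    Returns (kappa, alpha). All inputs must be strictly positive.
    """
    xs = [math.log(n) for n in n_genomes]
    ys = [math.log(g) for g in new_genes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    kappa = math.exp(mean_y - slope * mean_x)
    return kappa, -slope


def classify_pan_genome(alpha: float) -> str:
    """Assumed convention: alpha > 1 means closed, otherwise open."""
    return "closed" if alpha > 1.0 else "open"


# Synthetic (made-up) data generated from kappa = 500, alpha = 0.6.
genomes = list(range(2, 11))
new_genes = [500.0 * n ** -0.6 for n in genomes]

kappa, alpha = fit_heaps_alpha(genomes, new_genes)
status = classify_pan_genome(alpha)
print(f"kappa={kappa:.1f}, alpha={alpha:.2f}, pan-genome is {status}")
```

With real data the new-gene counts are usually averaged over many random genome orderings before fitting, and dedicated tools fit the curve with nonlinear regression rather than this log-log shortcut.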
Affiliation(s)
- Sávio Souza Costa
- Centro de Genômica e Biologia de Sistemas, Universidade Federal do Pará, Belém, Brazil
- Laboratório de Engenharia Biológica, Espaço Inovação, Parque de Ciência e Tecnologia Guamá, Belém, Brazil
- Luís Carlos Guimarães
- Centro de Genômica e Biologia de Sistemas, Universidade Federal do Pará, Belém, Brazil
- Artur Silva
- Centro de Genômica e Biologia de Sistemas, Universidade Federal do Pará, Belém, Brazil
- Laboratório de Engenharia Biológica, Espaço Inovação, Parque de Ciência e Tecnologia Guamá, Belém, Brazil
- Siomar Castro Soares
- Instituto de Ciências Biológicas e Naturais, Universidade Federal do Triângulo Mineiro, Uberaba, Brazil
- Rafael Azevedo Baraúna
- Centro de Genômica e Biologia de Sistemas, Universidade Federal do Pará, Belém, Brazil
- Laboratório de Engenharia Biológica, Espaço Inovação, Parque de Ciência e Tecnologia Guamá, Belém, Brazil
|