1
Langleib M, Calvelo J, Costábile A, Castillo E, Tort JF, Hoffmann FG, Protasio AV, Koziol U, Iriarte A. Evolutionary analysis of species-specific duplications in flatworm genomes. Mol Phylogenet Evol 2024; 199:108141. [PMID: 38964593 DOI: 10.1016/j.ympev.2024.108141] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2023] [Revised: 06/15/2024] [Accepted: 07/01/2024] [Indexed: 07/06/2024]
Abstract
Platyhelminthes, also known as flatworms, is a phylum of bilaterian invertebrates infamous for its parasitic representatives. The classes Cestoda, Monogenea, and Trematoda comprise parasitic helminths inhabiting multiple hosts, including fishes, humans, and livestock, and are responsible for considerable economic damage and burden on human health. As in other animals, the genomes of flatworms have a wide variety of paralogs, genes related via duplication, whose origins can be mapped throughout the evolution of the phylum. Through in silico analysis, we studied inparalogs, i.e., species-specific duplications, focusing on their biological functions, expression changes, and evolutionary rate. These genes are thought to be key players in the adaptation of species to their particular niches. Our results showed that genes related to specific functional terms, such as response to stress, transferase activity, oxidoreductase activity, and peptidases, are overrepresented among inparalogs. This trend is conserved among species from different classes, including free-living species. Available expression data from Schistosoma mansoni, a parasite from the trematode class, demonstrated high conservation of expression patterns between inparalogs, but with notable exceptions, which also display evidence of rapid evolution. We discuss how natural selection may operate to maintain these genes and the particular duplication models that best fit the observations. Our work supports the critical role of gene duplication in the evolution of flatworms and represents the first genome-wide study of inparalog evolution in this group.
Affiliation(s)
- Mauricio Langleib
  - Laboratorio de Biología Computacional, Departamento de Desarrollo Biotecnológico, Instituto de Higiene, Facultad de Medicina, Universidad de la República, Montevideo, Uruguay; Departamento de Genética, Facultad de Medicina, Universidad de la República, Montevideo, Uruguay
- Javier Calvelo
  - Laboratorio de Biología Computacional, Departamento de Desarrollo Biotecnológico, Instituto de Higiene, Facultad de Medicina, Universidad de la República, Montevideo, Uruguay
- Alicia Costábile
  - Sección Bioquímica, Facultad de Ciencias, Universidad de la República, Montevideo, Uruguay
- Estela Castillo
  - Laboratorio de Biología Parasitaria, Instituto de Higiene, Facultad de Ciencias, Universidad de la República, Montevideo, Uruguay
- José F Tort
  - Departamento de Genética, Facultad de Medicina, Universidad de la República, Montevideo, Uruguay
- Federico G Hoffmann
  - Department of Biochemistry, Molecular Biology, Entomology, and Plant Pathology, Mississippi State University, Mississippi, United States of America; Institute for Genomics, Biocomputing and Biotechnology, Mississippi State University, Mississippi, United States of America
- Anna V Protasio
  - Department of Pathology, University of Cambridge, Tennis Court Road, CB2 1QP, Cambridge, United Kingdom
- Uriel Koziol
  - Sección Biología Celular, Facultad de Ciencias, Universidad de la República, Montevideo, Uruguay
- Andrés Iriarte
  - Laboratorio de Biología Computacional, Departamento de Desarrollo Biotecnológico, Instituto de Higiene, Facultad de Medicina, Universidad de la República, Montevideo, Uruguay

2
Ludwig J, Mrázek J. OrthoRefine: automated enhancement of prior ortholog identification via synteny. BMC Bioinformatics 2024; 25:163. [PMID: 38664637 PMCID: PMC11044567 DOI: 10.1186/s12859-024-05786-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2023] [Accepted: 04/15/2024] [Indexed: 04/29/2024] Open
Abstract
BACKGROUND Identifying orthologs continues to be an early and imperative step in genome analysis but remains a challenging problem. While synteny (conservation of gene order) has previously been used independently and in combination with other methods to identify orthologs, applying synteny in ortholog identification has yet to be automated in a user-friendly manner. This desire for automation and ease-of-use led us to develop OrthoRefine, a standalone program that uses synteny to refine ortholog identification. RESULTS We developed OrthoRefine to improve the detection of orthologous genes by implementing a look-around window approach to detect synteny. We tested OrthoRefine in tandem with OrthoFinder, one of the most widely used programs for ortholog identification in recent years. We evaluated improvements provided by OrthoRefine in several bacterial datasets and one eukaryotic dataset. OrthoRefine efficiently eliminates paralogs from orthologous groups detected by OrthoFinder. Using synteny increased specificity and functional ortholog identification; additionally, analysis of BLAST e-value, phylogenetics, and operon occurrence further supported using synteny for ortholog identification. A comparison of several window sizes suggested that smaller window sizes (eight genes) were generally the most suitable for identifying orthologs via synteny. However, larger windows (30 genes) performed better in datasets containing less closely related genomes. A typical run of OrthoRefine with ~10 bacterial genomes can be completed in a few minutes on a regular desktop PC. CONCLUSION OrthoRefine is a simple-to-use, standalone tool that automates the application of synteny to improve ortholog detection. OrthoRefine is particularly efficient in eliminating paralogs from orthologous groups delineated by standard methods.
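The look-around window idea mentioned in this abstract can be illustrated with a minimal Python sketch. This is not OrthoRefine's implementation: the function names, the scoring rule (the fraction of window slots whose orthogroup also appears in the other genome's window), and the toy gene identifiers are all assumptions made for illustration.

```python
from typing import Dict, List


def neighbours(genes: List[str], idx: int, half: int) -> List[str]:
    """Return genes within `half` positions of `idx`, excluding the gene at `idx`."""
    lo = max(0, idx - half)
    hi = min(len(genes), idx + half + 1)
    return genes[lo:idx] + genes[idx + 1:hi]


def synteny_ratio(
    genome_a: List[str],
    genome_b: List[str],
    idx_a: int,
    idx_b: int,
    orthogroup: Dict[str, str],
    window_size: int = 8,
) -> float:
    """Fraction of the window whose orthogroups are shared by both neighbourhoods.

    Hypothetical scoring rule: windows truncated at a contig end simply
    score lower because the denominator stays at `window_size`.
    """
    half = window_size // 2
    groups_a = {orthogroup[g] for g in neighbours(genome_a, idx_a, half) if g in orthogroup}
    groups_b = {orthogroup[g] for g in neighbours(genome_b, idx_b, half) if g in orthogroup}
    return len(groups_a & groups_b) / window_size


# Toy data (hypothetical): genes listed in chromosomal order.
example_a = ["a1", "a2", "a3", "a4", "a5"]
example_b = ["b1", "b2", "b3", "b4", "b5", "b6", "b7", "b8", "b9", "b10"]
# a3, b3 and b9 fall in the same orthogroup; b9 plays the role of a paralog
# sitting in an unrelated genomic neighbourhood.
example_groups = {
    "a1": "OG1", "b1": "OG1",
    "a2": "OG2", "b2": "OG2",
    "a3": "OG3", "b3": "OG3", "b9": "OG3",
    "a4": "OG4", "b4": "OG4",
    "a5": "OG5", "b5": "OG5",
}

syntenic = synteny_ratio(example_a, example_b, 2, 2, example_groups, window_size=4)
paralog = synteny_ratio(example_a, example_b, 2, 8, example_groups, window_size=4)
```

In this toy case the a3/b3 pair shares its whole neighbourhood while the a3/b9 pair shares none of it, so a cutoff on the ratio would keep b3 and discard b9. Where to place that cutoff, and how large to make the window, are tuning decisions; the abstract reports that windows of eight genes generally worked best and that windows of 30 genes suited more distantly related genomes.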
Affiliation(s)
- J Ludwig
  - Institute of Bioinformatics, The University of Georgia, Athens, GA, 30602, USA
- J Mrázek
  - Department of Microbiology and Institute of Bioinformatics, The University of Georgia, Athens, GA, 30602, USA

3
Lopes JML, Nascimento LSDQ, Souza VC, de Matos EM, Fortini EA, Grazul RM, Santos MO, Soltis DE, Soltis PS, Otoni WC, Viccini LF. Water stress modulates terpene biosynthesis and morphophysiology at different ploidal levels in Lippia alba (Mill.) N. E. Brown (Verbenaceae). Protoplasma 2024; 261:227-243. [PMID: 37665420 DOI: 10.1007/s00709-023-01890-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2023] [Accepted: 08/18/2023] [Indexed: 09/05/2023]
Abstract
Monoterpenes are the main component in essential oils of Lippia alba. In this species, the chemical composition of essential oils varies with genome size: citral (geranial and neral) is dominant in diploids and tetraploids, and linalool in triploids. Because environmental stress impacts various metabolic pathways, we hypothesized that stress responses in L. alba could alter the relationship between genome size and essential oil composition. Water stress affects the flowering, production, and reproduction of plants. Here, we evaluated the effect of water stress on morphophysiology, essential oil production, and the expression of genes related to monoterpene synthesis in diploid, triploid, and tetraploid accessions of L. alba cultivated in vitro for 40 days. First, using transcriptome data, we performed de novo gene assembly and identified orthologous genes using phylogenetic and clustering-based approaches. The expression of candidate genes related to terpene biosynthesis was estimated by real-time quantitative PCR. Next, we assessed the expression of these genes under water stress conditions, whereby 1% PEG-4000 was added to MS medium. Water stress modulated L. alba morphophysiology at all ploidal levels. Gene expression and essential oil production were affected in triploid accessions. Polyploid accessions showed greater growth and metabolic tolerance under stress compared to diploids. These results confirm the complex regulation of metabolic pathways such as the production of essential oils in polyploid genomes. In addition, they highlight aspects of genotype and environment interactions, which may be important for the conservation of tropical biodiversity.
Affiliation(s)
- Juliana Mainenti Leal Lopes
  - Department of Biology, Institute of Biological Science, Federal University of Juiz de Fora, Juiz de Fora, Minas Gerais, 36036-900, Brazil
  - School of Life Science and Environment, Department of Genetic and Biotechnology, University of Trás-Os-Montes and Alto Douro, 5001-801, Vila Real, Portugal
  - BioISI - Biosystems & Integrative Sciences Institute, Faculty of Sciences, University of Lisboa, 1649-004, Lisbon, Portugal
- Vinicius Carius Souza
  - Department of Biology, Institute of Biological Science, Federal University of Juiz de Fora, Juiz de Fora, Minas Gerais, 36036-900, Brazil
- Elyabe Monteiro de Matos
  - Department of Biology, Institute of Biological Science, Federal University of Juiz de Fora, Juiz de Fora, Minas Gerais, 36036-900, Brazil
- Evandro Alexandre Fortini
  - Laboratory of Plant Tissue Culture (LCTII), Department of Plant Biology/BIOAGRO, Universidade Federal de Viçosa, Av. P.H. Rolfs S/N, Campus Universitário, Viçosa, MG, 36570-000, Brazil
- Marcelo Oliveira Santos
  - Department of Biology, Institute of Biological Science, Federal University of Juiz de Fora, Juiz de Fora, Minas Gerais, 36036-900, Brazil
- Douglas E Soltis
  - Florida Museum of Natural History, University of Florida, Gainesville, FL, 32611, USA
  - Department of Biology, University of Florida, Gainesville, FL, 32611, USA
- Pamela S Soltis
  - Florida Museum of Natural History, University of Florida, Gainesville, FL, 32611, USA
- Wagner Campos Otoni
  - Laboratory of Plant Tissue Culture (LCTII), Department of Plant Biology/BIOAGRO, Universidade Federal de Viçosa, Av. P.H. Rolfs S/N, Campus Universitário, Viçosa, MG, 36570-000, Brazil
- Lyderson Facio Viccini
  - Department of Biology, Institute of Biological Science, Federal University of Juiz de Fora, Juiz de Fora, Minas Gerais, 36036-900, Brazil

4
|
Raghuraman P, Ramireddy S, Raman G, Park S, Sudandiradoss C. Understanding a point mutation signature D54K in the caspase activation recruitment domain of NOD1 capitulating concerted immunity via atomistic simulation. J Biomol Struct Dyn 2024:1-17. [PMID: 38415678 DOI: 10.1080/07391102.2024.2322618] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2023] [Accepted: 12/11/2023] [Indexed: 02/29/2024]
Abstract
Point mutation D54K in the human N-terminal caspase recruitment domain (CARD) of nucleotide-binding oligomerization domain-1 (NOD1) abrogates an imperative downstream interaction with receptor-interacting protein kinase (RIPK2) that entails combating bacterial infections and inflammatory dysfunction. Here, we addressed the molecular details concerning conformational changes and interaction patterns (monomeric-dimeric states) of D54K by signature-based molecular dynamics simulation. Initially, the sequence analysis prioritized D54K as a pathogenic mutation, among other variants, based on a sequence signature. Since the mutation is highly conserved, we derived the distant ortholog to predict the sequence and structural similarity between native and mutant. This analysis showed the utility of 33 communal core residues associated with structural-functional preservation and variations, and concurrently served to infer the cryptic hotspots Cys39, Glu53, Asp54, Glu56, Ile57, Leu74, and Lys78 determining the interhelical fold forming homodimers for putative receptor interaction. Subsequently, the atomistic simulations with free energy (MM/PB(GB)SA) calculations predicted structural alteration that takes place in the N-terminal mutant CARD where coils changed to helices (⁴⁵α3-L4-α4-L6-α6⁸³) in contrast to native (⁴⁵T2-L4-α4-L6-T4⁸³). Likewise, the C-terminal helices ⁹³T1-α7¹⁰⁵ connected to the loops distorted compared to native ⁹³α6-L7¹⁰⁵ may result in conformational misfolding that promotes functional regulation and activation. These structural perturbations of D54K possibly destabilize the flexible adaptation of critical homotypic NOD1CARD-CARDRIPK2 interactions (α4Asp42-Arg488α5 and α6Phe86-Lys471α4), which is consistent with earlier experimental reports. Altogether, our findings unveil the conformational plasticity of mutation-dependent immunomodulatory response and may aid in functional validation exploring clinical investigation on CARD-regulated immunotherapies to prevent systemic infection and inflammation. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- P Raghuraman
  - Department of Biotechnology, School of Bioscience and Technology, Vellore Institute of Technology, Vellore, India
  - Department of Life Sciences, Yeungnam University, Gyeongsan, Gyeongsangbuk-do, Republic of Korea
- Sriroopreddy Ramireddy
  - Department of Biotechnology, School of Bioscience and Technology, Vellore Institute of Technology, Vellore, India
  - Department of Genetics and Molecular Biology, School of Health Sciences, The Apollo University, Chittoor, India
- Gurusamy Raman
  - Department of Life Sciences, Yeungnam University, Gyeongsan, Gyeongsangbuk-do, Republic of Korea
- SeonJoo Park
  - Department of Life Sciences, Yeungnam University, Gyeongsan, Gyeongsangbuk-do, Republic of Korea
- C Sudandiradoss
  - Department of Biotechnology, School of Bioscience and Technology, Vellore Institute of Technology, Vellore, India

5
|
Puerta-Arias JD, Isaza Agudelo JP, Naranjo Preciado TW. Identification and production of novel potential pathogen-specific biomarkers for diagnosis of histoplasmosis. Microbiol Spectr 2023; 11:e0093923. [PMID: 37882565 PMCID: PMC10714873 DOI: 10.1128/spectrum.00939-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2023] [Accepted: 09/08/2023] [Indexed: 10/27/2023] Open
Abstract
IMPORTANCE Histoplasmosis is considered one of the most important mycoses due to the increasing number of individuals susceptible to develop severe clinical forms, particularly those with HIV/AIDS or receiving immunosuppressive biological therapies, the high mortality rates reported when antifungal treatment is not initiated in a timely manner, and the limitations of conventional diagnostic methods. In this context, there is a clear need to improve the capacity of diagnostic tools to specifically detect the fungal pathogen, regardless of the patient's clinical condition or the presence of other co-infections. The proposed novel pathogen-specific biomarkers have the potential to be used in immunodiagnostic platforms and antifungal treatment monitoring in histoplasmosis. In addition, the bioinformatics strategy used in this study could be applied to identify potential diagnostic biomarkers in other models of fungal infection of public health importance.
Affiliation(s)
- Juan David Puerta-Arias
  - Medical and Experimental Mycology Group, Corporación para Investigaciones Biológicas (CIB-UdeA-UPB-UDES), Medellín, Colombia
  - School of Health Sciences, Universidad Pontificia Bolivariana, Medellín, Colombia
  - Universidad de Santander (UDES), Facultad de Ciencias Médicas y de la Salud, Bucaramanga, Colombia
- Tonny Williams Naranjo Preciado
  - Medical and Experimental Mycology Group, Corporación para Investigaciones Biológicas (CIB-UdeA-UPB-UDES), Medellín, Colombia
  - School of Health Sciences, Universidad Pontificia Bolivariana, Medellín, Colombia

6
|
Khan F, Jeong GJ, Javaid A, Thuy Nguyen Pham D, Tabassum N, Kim YM. Surface adherence and vacuolar internalization of bacterial pathogens to the Candida spp. cells: Mechanism of persistence and propagation. J Adv Res 2023; 53:115-136. [PMID: 36572338 PMCID: PMC10658324 DOI: 10.1016/j.jare.2022.12.013] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 10/13/2022] [Revised: 12/17/2022] [Accepted: 12/21/2022] [Indexed: 12/24/2022] Open
Abstract
BACKGROUND The co-existence of Candida albicans with the bacteria in the host tissues and organs displays interactions at competitive, antagonistic, and synergistic levels. Several pathogenic bacteria take advantage of such types of interaction for their survival and proliferation. The chemical interaction involves the signaling molecules produced by the bacteria or Candida spp., whereas the physical attachment occurs by involving the surface proteins of the bacteria and Candida. In addition, bacterial pathogens have emerged to internalize inside the C. albicans vacuole, which is one of the inherent properties of the endosymbiotic relationship between the bacteria and the eukaryotic host. AIM OF REVIEW The interaction occurring by the involvement of surface protein from diverse bacterial species with Candida species has been discussed in detail in this paper. An in silico molecular docking study was performed between the surface proteins of different bacterial species and Als3P of C. albicans to explain the molecular mechanism involved in the Als3P-dependent interaction. Furthermore, in order to understand the specificity of C. albicans interaction with Als3P, the evolutionary relatedness of several bacterial surface proteins has been investigated. Furthermore, the environmental factors that influence bacterial pathogen internalization into the Candida vacuole have been addressed. Moreover, the review presented future perspectives for disrupting the cross-kingdom interaction and eradicating the endosymbiotic bacterial pathogens. KEY SCIENTIFIC CONCEPTS OF REVIEW With the involvement of cross-kingdom interactions and endosymbiotic relationships, the bacterial pathogens escape from the environmental stresses and the antimicrobial activity of the host immune system. Thus, the study of interactions between Candida and bacterial pathogens is of high clinical significance.
Affiliation(s)
- Fazlurrahman Khan
  - Marine Integrated Biomedical Technology Center, The National Key Research Institutes in Universities, Pukyong National University, Busan 48513, Republic of Korea; Research Center for Marine Integrated Bionics Technology, Pukyong National University, Busan 48513, Republic of Korea
- Geum-Jae Jeong
  - Department of Food Science and Technology, Pukyong National University, Busan 48513, Republic of Korea
- Aqib Javaid
  - Department of Biotechnology and Bioinformatics, University of Hyderabad, India
- Dung Thuy Nguyen Pham
  - Institute of Applied Technology and Sustainable Development, Nguyen Tat Thanh University, Ho Chi Minh City 70000, Vietnam
- Nazia Tabassum
  - Marine Integrated Biomedical Technology Center, The National Key Research Institutes in Universities, Pukyong National University, Busan 48513, Republic of Korea; Research Center for Marine Integrated Bionics Technology, Pukyong National University, Busan 48513, Republic of Korea
- Young-Mog Kim
  - Marine Integrated Biomedical Technology Center, The National Key Research Institutes in Universities, Pukyong National University, Busan 48513, Republic of Korea; Research Center for Marine Integrated Bionics Technology, Pukyong National University, Busan 48513, Republic of Korea; Department of Food Science and Technology, Pukyong National University, Busan 48513, Republic of Korea

7
|
Bourret J, Borvető F, Bravo IG. Subfunctionalisation of paralogous genes and evolution of differential codon usage preferences: The showcase of polypyrimidine tract binding proteins. J Evol Biol 2023; 36:1375-1392. [PMID: 37667674 DOI: 10.1111/jeb.14212] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/10/2023] [Revised: 07/11/2023] [Accepted: 07/12/2023] [Indexed: 09/06/2023]
Abstract
Gene paralogs are copies of an ancestral gene that appear after gene or full genome duplication. When two sister gene copies are maintained in the genome, redundancy may release certain evolutionary pressures, allowing one of them to access novel functions. Here, we focused our study of gene paralogs on the evolutionary history of the three polypyrimidine tract binding protein (PTBP) genes and the concurrent evolution of their differential codon usage preferences (CUPrefs) in vertebrate species. PTBP1-3 show high identity at the amino acid level (up to 80%) but display strongly different nucleotide composition, divergent CUPrefs and, in humans and in many other vertebrates, distinct tissue-specific expression levels. Our phylogenetic inference results show that the duplication events leading to the three extant PTBP1-3 lineages predate the basal diversification within vertebrates, and genomic context analysis illustrates that local synteny has been well preserved over time for the three paralogs. We identify a distinct evolutionary pattern towards GC3-enriching substitutions in PTBP1, concurrent with enrichment in frequently used codons and with a tissue-wide expression. In contrast, PTBP2s are enriched in AT-ending, rare codons, and display tissue-restricted expression. As a result of this substitution trend, CUPrefs sharply differ between mammalian PTBP1s and the rest of PTBPs. Genomic context analysis suggests that GC3-rich nucleotide composition in PTBP1s is driven by local substitution processes, while the evidence in this direction is thinner for PTBP2-3. An actual lack of co-variation between the observed GC composition of PTBP2-3 and that of the surrounding non-coding genomic environment would raise questions about the origin of CUPrefs, warranting further research on a putative tissue-specific translational selection. Finally, we communicate an intriguing trend for the use of the UUG-Leu codon, which matches the trends of AT-ending codons. Our results are compatible with a scenario in which a combination of directional mutation-selection processes would have differentially shaped CUPrefs of PTBPs in vertebrates: the observed GC-enrichment of PTBP1 in placental mammals may be linked to genomic location and to the strong and broad tissue expression, while AT-enrichment of PTBP2 and PTBP3 would be associated with rare CUPrefs and thus, possibly to specialized spatio-temporal expression. Our interpretation is coherent with a gene subfunctionalisation process by differential expression regulation associated with the evolution of specific CUPrefs.
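The GC3 measure referred to in this abstract is the fraction of codons whose third position is G or C. A minimal Python sketch of that calculation follows; the function name `gc3` and the short example sequences are hypothetical and are not taken from the paper, which does not describe its own tooling here.

```python
def gc3(cds: str) -> float:
    """Return the fraction of codons in a coding sequence that end in G or C.

    Assumes `cds` is an in-frame coding sequence (DNA or RNA letters);
    this is an illustrative helper, not the authors' pipeline.
    """
    seq = cds.upper().replace("U", "T")  # accept RNA notation such as UUG
    if not seq or len(seq) % 3 != 0:
        raise ValueError("sequence must be non-empty and a multiple of three bases")
    third_positions = seq[2::3]  # every third base, starting at codon position 3
    gc_count = sum(base in "GC" for base in third_positions)
    return gc_count / len(third_positions)


# Hypothetical toy sequences, four codons each.
gc_rich = gc3("ATGGCCCTGAAA")  # codons ATG GCC CTG AAA
at_rich = gc3("AUGUUAUUAAAU")  # codons AUG UUA UUA AAU
```

Comparing this value between paralogs, and against the GC content of the flanking non-coding sequence, is one simple way to look at the contrast the abstract draws between GC3-rich PTBP1 and the AT-ending codons of PTBP2 and PTBP3.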
Affiliation(s)
- Jérôme Bourret
  - Laboratoire MIVEGEC (CNRS IRD Univ Montpellier), Centre National de la Recherche Scientifique (CNRS), Montpellier, France
- Fanni Borvető
  - Laboratoire MIVEGEC (CNRS IRD Univ Montpellier), Centre National de la Recherche Scientifique (CNRS), Montpellier, France
- Ignacio G Bravo
  - Laboratoire MIVEGEC (CNRS IRD Univ Montpellier), Centre National de la Recherche Scientifique (CNRS), Montpellier, France

8
|
Cicconardi F, Milanetti E, Pinheiro de Castro EC, Mazo-Vargas A, Van Belleghem SM, Ruggieri AA, Rastas P, Hanly J, Evans E, Jiggins CD, Owen McMillan W, Papa R, Di Marino D, Martin A, Montgomery SH. Evolutionary dynamics of genome size and content during the adaptive radiation of Heliconiini butterflies. Nat Commun 2023; 14:5620. [PMID: 37699868 PMCID: PMC10497600 DOI: 10.1038/s41467-023-41412-5] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 10/25/2022] [Accepted: 08/30/2023] [Indexed: 09/14/2023] Open
Abstract
Heliconius butterflies, a speciose genus of Müllerian mimics, represent a classic example of an adaptive radiation that includes a range of derived dietary, life history, physiological and neural traits. However, key lineages within the genus, and across the broader Heliconiini tribe, lack genomic resources, limiting our understanding of how adaptive and neutral processes shaped genome evolution during their radiation. Here, we generate highly contiguous genome assemblies for nine Heliconiini, 29 additional reference-assembled genomes, and improve 10 existing assemblies. Altogether, we provide a dataset of annotated genomes for a total of 63 species, including 58 species within the Heliconiini tribe. We use this extensive dataset to generate a robust and dated heliconiine phylogeny, describe major patterns of introgression, explore the evolution of genome architecture, and the genomic basis of key innovations in this enigmatic group, including an assessment of the evolution of putative regulatory regions at the Heliconius stem. Our work illustrates how the increased resolution provided by such dense genomic sampling improves our power to generate and test gene-phenotype hypotheses, and precisely characterize how genomes evolve.
Affiliation(s)
- Francesco Cicconardi
  - School of Biological Sciences, Bristol University, Bristol, United Kingdom
  - Department of Zoology, University of Cambridge, Cambridge, United Kingdom
- Edoardo Milanetti
  - Department of Physics, Sapienza University, Piazzale Aldo Moro 5, 00185, Rome, Italy
  - Center for Life Nano- & Neuro-Science, Italian Institute of Technology, Viale Regina Elena 291, 00161, Rome, Italy
- Anyi Mazo-Vargas
  - Department of Ecology and Evolutionary Biology, Cornell University, Ithaca, NY, 14853, USA
- Steven M Van Belleghem
  - Department of Biology, University of Puerto Rico, Rio Piedras, PR, Puerto Rico
  - Ecology, Evolution and Conservation Biology, Biology Department, KU Leuven, Leuven, Belgium
- Pasi Rastas
  - Institute of Biotechnology, University of Helsinki, Helsinki, Finland
- Joseph Hanly
  - Department of Biological Sciences, The George Washington University, Washington, DC, 20052, USA
  - Smithsonian Tropical Research Institute, Panama City, Panama
- Elizabeth Evans
  - Department of Biology, University of Puerto Rico, Rio Piedras, PR, Puerto Rico
- Chris D Jiggins
  - Department of Zoology, University of Cambridge, Cambridge, United Kingdom
- W Owen McMillan
  - Smithsonian Tropical Research Institute, Panama City, Panama
- Riccardo Papa
  - Department of Biology, University of Puerto Rico, Rio Piedras, PR, Puerto Rico
  - Molecular Sciences and Research Center, University of Puerto Rico, San Juan, PR, Puerto Rico
  - Comprehensive Cancer Center, University of Puerto Rico, San Juan, PR, Puerto Rico
- Daniele Di Marino
  - Department of Life and Environmental Sciences, New York-Marche Structural Biology Center (NY-MaSBiC), Polytechnic University of Marche, Via Brecce Bianche, 60131, Ancona, Italy
  - Neuronal Death and Neuroprotection Unit, Department of Neuroscience, Mario Negri Institute for Pharmacological Research-IRCCS, Via Mario Negri 2, 20156, Milano, Italy
  - National Biodiversity Future Center (NBFC), Palermo, Italy
- Arnaud Martin
  - Department of Biological Sciences, The George Washington University, Washington, DC, 20052, USA
- Stephen H Montgomery
  - School of Biological Sciences, Bristol University, Bristol, United Kingdom
  - Smithsonian Tropical Research Institute, Panama City, Panama

9
|
Orosz F. The Unicellular, Parasitic Fungi, Sanchytriomycota, Possess a DNA Sequence Possibly Encoding a Long Tubulin Polymerization Promoting Protein (TPPP) but Not a Fungal-Type One. Microorganisms 2023; 11:2029. [PMID: 37630588 PMCID: PMC10459994 DOI: 10.3390/microorganisms11082029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2023] [Revised: 08/02/2023] [Accepted: 08/06/2023] [Indexed: 08/27/2023] Open
Abstract
The unicellular, parasitic fungi of the phylum Sanchytriomycota (sanchytrids) were discovered a few years ago. These unusual chytrid-like fungi parasitize algae. The zoospores of the species of the phylum contain an extremely long kinetosome composed of microtubular singlets or doublets and a non-motile pseudocilium (i.e., a reduced posterior flagellum). Fungi provide an ideal opportunity to test and confirm the correlation between the occurrence of flagellar proteins (the ciliome) and that of the eukaryotic cilium/flagellum since the flagellum occurs in the early-branching phyla and not in terrestrial fungi. Tubulin polymerization promoting protein (TPPP)-like proteins, which contain a p25alpha domain, were also suggested to belong to the ciliome and are present in flagellated fungi. Although sanchytrids have lost many of the flagellar proteins, here it is shown that they possess a DNA sequence possibly encoding long (animal-type) TPPP, but not the fungal-type one characteristic of chytrid fungi. Phylogenetic analysis of p25alpha domains placed sanchytrids into a sister position to Blastocladiomycota, similarly to species phylogeny, with maximal support.
Affiliation(s)
- Ferenc Orosz
  - Institute of Enzymology, Research Centre for Natural Sciences, 1117 Budapest, Hungary

10
|
Orosz F. p25alpha Domain-Containing Proteins of Apicomplexans and Related Taxa. Microorganisms 2023; 11:1528. [PMID: 37375031 DOI: 10.3390/microorganisms11061528] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2023] [Revised: 05/31/2023] [Accepted: 06/05/2023] [Indexed: 06/29/2023] Open
Abstract
TPPP (tubulin polymerization promoting protein)-like proteins contain one or more p25alpha (Pfam05517) domains. TPPP-like proteins occur in different types as determined by their length (e.g., long-, short-, truncated-, and fungal-type TPPP) and include the protein apicortin, which possesses another domain, doublecortin (DCX, Pfam 03607). These various TPPP-like proteins are found in various phylogenomic groups. In particular, short-type TPPPs and apicortin are well-represented in the Myzozoa, which include apicomplexans and related taxa, chrompodellids, dinoflagellates, and perkinsids. The long-, truncated-, and fungal-type TPPPs are not found in the myzozoans. Apicortins are found in all apicomplexans except one piroplasmid species, present in several other myzozoans, and seem to be correlated with the conoid and apical complex. Short-type TPPPs are predominantly found in myzozoans that have flagella, suggesting a role in flagellum assembly or structure.
Affiliation(s)
- Ferenc Orosz
  - Institute of Enzymology, Research Centre for Natural Sciences, 1117 Budapest, Hungary

11
|
Tubulin Polymerization Promoting Proteins (TPPPs) of Aphelidiomycota: Correlation between the Incidence of p25alpha Domain and the Eukaryotic Flagellum. J Fungi (Basel) 2023; 9:jof9030376. [PMID: 36983544 PMCID: PMC10057920 DOI: 10.3390/jof9030376] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2023] [Revised: 03/16/2023] [Accepted: 03/17/2023] [Indexed: 03/22/2023] Open
Abstract
The seven earliest-diverging lineages of the 18 phyla of fungi are the non-terrestrial fungi, which reproduce through motile flagellated zoospores. There are genes/proteins that are present only in organisms with a flagellum or cilium. It was suggested that TPPP-like proteins (proteins containing at least one complete or partial p25alpha domain) are among them, and a correlation between the incidence of the p25alpha domain and the eukaryotic flagellum was hypothesized. Of the seven phyla of flagellated fungi, six have been known to contain TPPP-like proteins. Aphelidiomycota, one of the early-branching phyla, has some species (e.g., Paraphelidium tribonematis) that retain the flagellum, whereas the Amoeboaphelidium genus has lost the flagellum. The first two Aphelidiomycota genomes (Amoeboaphelidium protococcorum and Amoeboaphelidium occidentale) were sequenced and published last year. A BLASTP search revealed that A. occidentale does not have a TPPP, but A. protococcorum, which possesses a pseudocilium, does have a TPPP. This TPPP is the ‘long-type’, which occurs mostly in animals as well as other Opisthokonta. P. tribonematis has a ‘fungal-type’ TPPP, which is found only in some flagellated fungi. These data on Aphelidiomycota TPPP proteins strengthen the correlation between the incidence of p25alpha domain-containing proteins and that of the eukaryotic flagellum/cilium.
|
12
|
Persson E, Sonnhammer ELL. InParanoiDB 9: Ortholog Groups for Protein Domains and Full-Length Proteins. J Mol Biol 2023:168001. [PMID: 36764355 DOI: 10.1016/j.jmb.2023.168001] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 11/30/2022] [Revised: 01/20/2023] [Accepted: 02/01/2023] [Indexed: 02/11/2023]
Abstract
Prediction of orthologs is an important bioinformatics pursuit that is frequently used for inferring protein function and evolutionary analyses. The InParanoid database is a well known resource of ortholog predictions between a wide variety of organisms. Although orthologs have historically been inferred at the level of full-length protein sequences, many proteins consist of several independent protein domains that may be orthologous to domains in other proteins in a way that differs from the full-length protein case. To be able to capture all types of orthologous relations, conventional full-length protein orthologs can be complemented with orthologs inferred at the domain level. We here present InParanoiDB 9, covering 640 species and providing orthologs for both protein domains and full-length proteins. InParanoiDB 9 was built using the faster InParanoid-DIAMOND algorithm for orthology analysis, as well as Domainoid and Pfam to infer orthologous domains. InParanoiDB 9 is based on proteomes from 447 eukaryotes, 158 bacteria and 35 archaea, and includes over one billion predicted ortholog groups. A new website has been built for the database, providing multiple search options as well as visualization of groups of orthologs and orthologous domains. This release constitutes a major upgrade of the InParanoid database in terms of the number of species as well as the new capability to operate on the domain level. InParanoiDB 9 is available at https://inparanoidb.sbc.su.se/.
Affiliation(s)
- Emma Persson
- Department of Biochemistry and Biophysics, Stockholm University, Science for Life Laboratory, Box 1031, 17121 Solna, Sweden. https://twitter.com/eriksonnhammer
- Erik L L Sonnhammer
- Department of Biochemistry and Biophysics, Stockholm University, Science for Life Laboratory, Box 1031, 17121 Solna, Sweden.
|
13
|
Xiong H, Wang D, Shao C, Yang X, Yang J, Ma T, Davis CC, Liu L, Xi Z. Species Tree Estimation and the Impact of Gene Loss Following Whole-Genome Duplication. Syst Biol 2022; 71:1348-1361. [PMID: 35689633 PMCID: PMC9558847 DOI: 10.1093/sysbio/syac040] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 08/08/2021] [Revised: 06/03/2022] [Accepted: 06/07/2022] [Indexed: 12/02/2022] Open
Abstract
Whole-genome duplication (WGD) occurs broadly and repeatedly across the history of eukaryotes and is recognized as a prominent evolutionary force, especially in plants. Immediately following WGD, most genes are present in two copies as paralogs. Due to this redundancy, one copy of a paralog pair commonly undergoes pseudogenization and is eventually lost. When speciation occurs shortly after WGD, however, differential loss of paralogs may lead to spurious phylogenetic inference resulting from the inclusion of pseudoorthologs, that is, paralogous genes mistakenly identified as orthologs because they are present in single copies within each sampled species. The influence and impact of including pseudoorthologs versus true orthologs as a result of gene extinction (or incomplete laboratory sampling) are only recently gaining empirical attention in the phylogenomics community. Moreover, few studies have investigated this phenomenon in an explicit coalescent framework. Here, using mathematical models, numerous simulated data sets, and two newly assembled empirical data sets, we assess the effect of pseudoorthologs on species tree estimation under varying degrees of incomplete lineage sorting (ILS) and differential gene loss scenarios following WGD. When gene loss occurs along the terminal branches of the species tree, alignment-based (BPP) and gene-tree-based (ASTRAL, MP-EST, and STAR) coalescent methods are adversely affected as the degree of ILS increases. This can be greatly improved by sampling a sufficiently large number of genes. Under the same circumstances, however, concatenation methods consistently estimate incorrect species trees as the number of genes increases. Additionally, pseudoorthologs can greatly mislead species tree inference when gene loss occurs along the internal branches of the species tree. Here, both coalescent and concatenation methods yield inconsistent results. These results underscore the importance of understanding the influence of pseudoorthologs in the phylogenomics era. [Coalescent method; concatenation method; incomplete lineage sorting; pseudoorthologs; single-copy gene; whole-genome duplication.]
Affiliation(s)
- Haifeng Xiong
- Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610065, China
- Danying Wang
- Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610065, China
- Chen Shao
- Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610065, China
- Xuchen Yang
- Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610065, China
- Jialin Yang
- Department of Statistics and Institute of Bioinformatics, University of Georgia, Athens, GA 30602, USA
- Tao Ma
- Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610065, China
- Charles C Davis
- Department of Organismic and Evolutionary Biology, Harvard University Herbaria, Cambridge, MA 02138, USA
- Liang Liu
- Department of Statistics and Institute of Bioinformatics, University of Georgia, Athens, GA 30602, USA
- Zhenxiang Xi
- Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610065, China
|
14
|
Chen Z, Schrödl M. How many single-copy orthologous genes from whole genomes reveal deep gastropod relationships? PeerJ 2022; 10:e13285. [PMID: 35497189 PMCID: PMC9048639 DOI: 10.7717/peerj.13285] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/19/2022] [Accepted: 03/28/2022] [Indexed: 01/13/2023] Open
Abstract
The Gastropoda contains 80% of existing mollusks and is the second most diverse animal class after the Insecta. However, the deep phylogeny of gastropods has been controversial for a long time. The position of Patellogastropoda, in particular, is a major uncertainty. Morphology and some mitochondrial studies concluded that Patellogastropoda is likely to be sister to all other gastropods (Orthogastropoda hypothesis), while transcriptomic and other mitogenomic studies indicated that Patellogastropoda and Vetigastropoda are sister taxa (Psilogastropoda). With the release of high-quality genomes, orthologous genes can be better identified and serve as powerful candidates for phylogenetic analysis. The question is, given the current limitations on the taxon sampling side, how many markers are needed to provide robust results. Here, we identified single-copy orthologous genes (SOGs) from 14 gastropod species with whole genomes available, which cover five main gastropod subclasses. We generated different datasets of 395 to 1610 SOGs by allowing different levels of missing species. We constructed gene trees for each SOG and inferred species trees from different collections of gene trees. We found that, as the number of SOGs increased, the inferred topology changed from Patellogastropoda being sister to all other gastropods to Patellogastropoda being sister to Vetigastropoda + Neomphalina (Psilogastropoda s.l.), with considerable support. Our study thus rejects the Orthogastropoda concept, showing that the selection of the representative species and use of sufficient informative sites greatly influence the analysis of deep gastropod phylogeny.
Affiliation(s)
- Zeyuan Chen
- Mollusca, SNSB-Bavarian State Collection of Zoology, Munich, Bavaria, Germany, Department Biology II, Ludwig-Maximilians-Universität München, Munich, Bavaria, Germany
- Michael Schrödl
- Mollusca, SNSB-Bavarian State Collection of Zoology, Munich, Bavaria, Germany, Department Biology II, Ludwig-Maximilians-Universität München, Munich, Bavaria, Germany, GeoBio-Center LMU, Munich, Bavaria, Germany
|
15
|
Diversity and Functional Evolution of Terpene Synthases in Rosaceae. Plants (Basel) 2022; 11:plants11060736. [PMID: 35336617 PMCID: PMC8953233 DOI: 10.3390/plants11060736] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/31/2021] [Revised: 02/22/2022] [Accepted: 03/04/2022] [Indexed: 11/20/2022]
Abstract
Terpenes are organic compounds that play important roles in plant development and stress response. Terpene synthases (TPSs) are the key enzymes for the biosynthesis of terpenes. For Rosaceae species, terpene composition represents a critical quality attribute, but limited information is available regarding the evolution and expansion of the terpene synthase gene family. Here, we selected eight Rosaceae species with sequenced and annotated genomes for the identification of TPSs, including three Prunoideae, three Maloideae, and two Rosoideae species. Our data showed that the TPS gene family in the Rosaceae species varied in family size and function among different subfamilies. Lineage- and species-specific expansion of the TPSs accompanied by frequent domain loss was widely observed within different TPS clades, which might have contributed to speciation or environmental adaptation in Rosaceae. In contrast to Maloideae and Rosoideae species, Prunoideae species had fewer TPSs; over the course of Prunoideae evolution, TPSs expanded in modern peach. Both tandem and segmental duplication significantly contributed to TPS expansion. Ka/Ks calculations revealed that TPS genes mainly evolved under purifying selection except for several pairs, while divergence times indicated that the TPS-e clade diverged relatively early. Gene function classification of TPSs further demonstrated the functional diversity among clades and species. Moreover, based on already published RNA-Seq data from NCBI, the expression of most TPSs in Malus domestica, Prunus persica, and Fragaria vesca displayed tissue specificity and distinct expression patterns, either in tissues or in expression abundance, between species and TPS clades. Certain putative TPS-like proteins lacking both domains were found to be highly expressed, indicating potential functional or regulatory roles. These results provide insight into TPS family evolution and genetic information that could help improve the quality of Rosaceae species.
|
16
|
Csűrös M. Gain-loss-duplication models for copy number evolution on a phylogeny: Exact algorithms for computing the likelihood and its gradient. Theor Popul Biol 2022; 145:80-94. [DOI: 10.1016/j.tpb.2022.03.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/23/2021] [Revised: 03/07/2022] [Accepted: 03/10/2022] [Indexed: 10/18/2022]
|
17
|
Ma X, Øvrebø JI, Thompson EM. Evolution of CDK1 Paralog Specializations in a Lineage With Fast Developing Planktonic Embryos. Front Cell Dev Biol 2022; 9:770939. [PMID: 35155443 PMCID: PMC8832800 DOI: 10.3389/fcell.2021.770939] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/05/2021] [Accepted: 12/27/2021] [Indexed: 12/03/2022] Open
Abstract
The active site of the essential CDK1 kinase is generated by core structural elements, among which the PSTAIRE motif in the critical αC-helix is universally conserved in the single CDK1 ortholog of all metazoans. We report serial CDK1 duplications in the chordate Oikopleura. Paralog diversifications in the PSTAIRE, activation loop substrate binding platform, ATP entrance site, hinge region, and main Cyclin binding interface have undergone positive selection to subdivide ancestral CDK1 functions along the S-M phase cell cycle axis. Apparent coevolution of an exclusive CDK1d:Cyclin Ba/b pairing is required for oogenic meiosis and early embryogenesis, a period during which, unusually, CDK1d levels, rather than Cyclin Ba/b levels, oscillate to drive very rapid cell cycles. Strikingly, the modified PSTAIRE of odCDK1d shows convergence over great evolutionary distance with plant CDKB, and in both cases, these variants exhibit increased specialization to M-phase.
Affiliation(s)
- Xiaofei Ma
- College of Life Sciences, Northwest Normal University, Lanzhou, China
- Sars International Centre, University of Bergen, Bergen, Norway
- *Correspondence: Xiaofei Ma; Eric M. Thompson
- Jan Inge Øvrebø
- Sars International Centre, University of Bergen, Bergen, Norway
- Department of Biological Sciences, University of Bergen, Bergen, Norway
- Eric M. Thompson
- Sars International Centre, University of Bergen, Bergen, Norway
- Department of Biological Sciences, University of Bergen, Bergen, Norway
- *Correspondence: Xiaofei Ma; Eric M. Thompson
|
18
|
Wafula EK, Zhang H, Von Kuster G, Leebens-Mack JH, Honaas LA, dePamphilis CW. PlantTribes2: Tools for comparative gene family analysis in plant genomics. Front Plant Sci 2022; 13:1011199. [PMID: 36798801 PMCID: PMC9928214 DOI: 10.3389/fpls.2022.1011199] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 08/03/2022] [Accepted: 12/02/2022] [Indexed: 05/12/2023]
Abstract
Plant genome-scale resources are being generated at an increasing rate as sequencing technologies continue to improve and raw data costs continue to fall; however, the cost of downstream analyses remains large. This has resulted in a considerable range of genome assembly and annotation qualities across plant genomes due to their varying sizes, complexity, and the technology used for the assembly and annotation. To effectively work across genomes, researchers increasingly rely on comparative genomic approaches that integrate across plant community resources and data types. Such efforts have aided the genome annotation process and yielded novel insights into the evolutionary history of genomes and gene families, including complex non-model organisms. The essential tools to achieve these insights rely on gene family analysis at a genome-scale, but they are not well integrated for rapid analysis of new data, and the learning curve can be steep. Here we present PlantTribes2, a scalable, easily accessible, highly customizable, and broadly applicable gene family analysis framework with multiple entry points including user provided data. It uses objective classifications of annotated protein sequences from existing, high-quality plant genomes for comparative and evolutionary studies. PlantTribes2 can improve transcript models and then sort them, either genome-scale annotations or individual gene coding sequences, into pre-computed orthologous gene family clusters with rich functional annotation information. Then, for gene families of interest, PlantTribes2 performs downstream analyses and customizable visualizations including (1) multiple sequence alignment, (2) gene family phylogeny, (3) estimation of synonymous and non-synonymous substitution rates among homologous sequences, and (4) inference of large-scale duplication events. We give examples of PlantTribes2 applications in functional genomic studies of economically important plant families, namely transcriptomics in the weedy Orobanchaceae and a core orthogroup analysis (CROG) in Rosaceae. PlantTribes2 is freely available for use within the main public Galaxy instance and can be downloaded from GitHub or Bioconda. Importantly, PlantTribes2 can be readily adapted for use with genomic and transcriptomic data from any kind of organism.
Affiliation(s)
- Eric K Wafula
- Department of Biology, The Pennsylvania State University, University Park, PA, United States
- Huiting Zhang
- Tree Fruit Research Laboratory, United States Department of Agriculture (USDA), Agricultural Research Service (ARS), Wenatchee, WA, United States
- Department of Horticulture, Washington State University, Pullman, WA, United States
- Gregory Von Kuster
- Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA, United States
- Loren A Honaas
- Tree Fruit Research Laboratory, United States Department of Agriculture (USDA), Agricultural Research Service (ARS), Wenatchee, WA, United States
- Claude W dePamphilis
- Department of Biology, The Pennsylvania State University, University Park, PA, United States
- Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA, United States
|
19
|
Niu H, Xia P, Hu Y, Zhan C, Li Y, Gong S, Li Y, Ma D. Genome-wide identification of ZF-HD gene family in Triticum aestivum: Molecular evolution mechanism and function analysis. PLoS One 2021; 16:e0256579. [PMID: 34559835 PMCID: PMC8462724 DOI: 10.1371/journal.pone.0256579] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 06/18/2021] [Accepted: 08/11/2021] [Indexed: 12/04/2022] Open
Abstract
ZF-HD family genes play important roles in plant growth and development. Genome-wide analyses of the ZF-HD gene family have been reported in some plant species. In this study, the genome-wide identification and expression profile of the ZF-HD gene family were analyzed for the first time in wheat. A total of 37 TaZF-HD genes were identified and divided into TaMIF and TaZHD subfamilies according to the conserved domain. The phylogenetic tree of the TaZF-HD proteins was further divided into six groups based on the phylogenetic relationship. The 37 TaZF-HDs were distributed on 18 of 21 chromosomes, and almost all the genes had no introns. Gene duplication and Ka/Ks analysis showed that the gene family may have experienced strong purifying selection during wheat evolution. The qRT-PCR analysis showed that TaZF-HD genes had significant expression patterns under different biotic and abiotic stresses. Through subcellular localization experiments, we found that TaZHD6-3B was located in the nucleus, while TaMIF4-5D was located in the cell membrane and nucleus. Our research contributes to a comprehensive understanding of the TaZF-HD family and provides a new perspective for further research on the biological functions of TaZF-HD genes in wheat.
Affiliation(s)
- Hongli Niu
- Hubei Collaborative Innovation Center for Grain Industry/Engineering Research Center of Ecology and Agricultural Use of Wetland, Ministry of Education/College of Agriculture, Yangtze University, Jingzhou, China
- Pengliang Xia
- Enshi Tobacco Company of Hubei Province, Enshi, China
- Yifeng Hu
- Hubei Collaborative Innovation Center for Grain Industry/Engineering Research Center of Ecology and Agricultural Use of Wetland, Ministry of Education/College of Agriculture, Yangtze University, Jingzhou, China
- Chuang Zhan
- Hubei Collaborative Innovation Center for Grain Industry/Engineering Research Center of Ecology and Agricultural Use of Wetland, Ministry of Education/College of Agriculture, Yangtze University, Jingzhou, China
- Yiting Li
- Hubei Collaborative Innovation Center for Grain Industry/Engineering Research Center of Ecology and Agricultural Use of Wetland, Ministry of Education/College of Agriculture, Yangtze University, Jingzhou, China
- Shuangjun Gong
- Hubei Collaborative Innovation Center for Grain Industry/Engineering Research Center of Ecology and Agricultural Use of Wetland, Ministry of Education/College of Agriculture, Yangtze University, Jingzhou, China
- Yan Li
- Hubei Collaborative Innovation Center for Grain Industry/Engineering Research Center of Ecology and Agricultural Use of Wetland, Ministry of Education/College of Agriculture, Yangtze University, Jingzhou, China
- * E-mail: (YL); (DM)
- Dongfang Ma
- Hubei Collaborative Innovation Center for Grain Industry/Engineering Research Center of Ecology and Agricultural Use of Wetland, Ministry of Education/College of Agriculture, Yangtze University, Jingzhou, China
- Key Laboratory of Integrated Pest Management on Crop in Central China, Ministry of Agriculture/Hubei Province Key Laboratory for Control of Crop Diseases, Pest and Weeds/Institute of Plant Protection and Soil Science, Hubei Academy of Agricultural Sciences, Wuhan, China
- * E-mail: (YL); (DM)
|
20
|
Fan T, Lv T, Xie C, Zhou Y, Tian C. Genome-Wide Analysis of the IQM Gene Family in Rice (Oryza sativa L.). Plants (Basel) 2021; 10:plants10091949. [PMID: 34579481 PMCID: PMC8469326 DOI: 10.3390/plants10091949] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/04/2021] [Revised: 09/06/2021] [Accepted: 09/14/2021] [Indexed: 06/01/2023]
Abstract
Members of the IQM (IQ-Motif Containing) gene family are involved in plant growth and developmental processes and in biotic and abiotic stress responses. To systematically analyze the IQM gene family and their expression profiles under diverse biotic and abiotic stresses, we identified 8 IQM genes in the rice genome. In the current study, the whole genome identification and characterization of OsIQMs, including the gene and protein structure, genome localization, phylogenetic relationship, gene expression and yeast two-hybrid, were performed. The eight IQM genes were classified into three subfamilies (I-III) according to the phylogenetic analysis. Gene structure and protein motif analyses showed that these IQM genes are relatively conserved within each subfamily of rice. The 8 OsIQM genes are distributed on seven out of the twelve chromosomes, with three IQM gene pairs involved in segmental duplication events. The evolutionary patterns analysis revealed that the IQM genes underwent a large-scale event within the last 20 to 9 million years. In addition, quantitative real-time PCR analysis of the eight OsIQM genes displayed different expression patterns at different developmental stages and in different tissues and showed that most IQM genes were responsive to PEG, NaCl, jasmonic acid (JA), and abscisic acid (ABA) treatments, suggesting crucial roles in biotic and abiotic stress responses. Additionally, a yeast two-hybrid assay showed that OsIQMs can interact with OsCaMs, and the IQ motif of OsIQMs is required for OsIQMs to combine with OsCaMs. Our results will be valuable to further characterize the important biological functions of rice IQM genes.
|
21
|
Moiseenko KV, Glazunova OA, Savinova OS, Vasina DV, Zherebker AY, Kulikova NA, Nikolaev EN, Fedorova TV. Relation between lignin molecular profile and fungal exo-proteome during kraft lignin modification by Trametes hirsuta LE-BIN 072. Bioresour Technol 2021; 335:125229. [PMID: 34010738 DOI: 10.1016/j.biortech.2021.125229] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 03/18/2021] [Revised: 04/23/2021] [Accepted: 04/24/2021] [Indexed: 05/11/2023]
Abstract
The process of kraft lignin modification by the white-rot fungus Trametes hirsuta was investigated using electrospray ionization Fourier transform ion cyclotron resonance mass spectrometry (ESI FT-ICR MS), and groups of systematically changing compounds were delineated. In the course of cultivation, the fungus tended to degrade progressively more reduced compounds and produced more oxidized ones. However, this process was not gradual: a substantial discontinuity was observed between the 6th and 10th days of cultivation. Simultaneously, the secretion of ligninolytic peroxidases by the fungus changed in a cascade manner: new isoenzymes were added to the mixture of the already secreted ones, and once a new isoenzyme appeared, both its relative quantity and number of isoforms increased as cultivation proceeded. It was proposed that the later secreted peroxidases (MnP7 and MnP1) possess higher substrate affinity for some phenolic compounds and act in a more specialized manner than the early secreted ones (MnP5 and VP2).
Affiliation(s)
- Konstantin V Moiseenko
- A. N. Bach Institute of Biochemistry, Research Center of Biotechnology, Russian Academy of Sciences, Leninsky Ave. 33/2, Moscow 119071, Russia.
- Olga A Glazunova
- A. N. Bach Institute of Biochemistry, Research Center of Biotechnology, Russian Academy of Sciences, Leninsky Ave. 33/2, Moscow 119071, Russia
- Olga S Savinova
- A. N. Bach Institute of Biochemistry, Research Center of Biotechnology, Russian Academy of Sciences, Leninsky Ave. 33/2, Moscow 119071, Russia
- Daria V Vasina
- A. N. Bach Institute of Biochemistry, Research Center of Biotechnology, Russian Academy of Sciences, Leninsky Ave. 33/2, Moscow 119071, Russia
- Natalia A Kulikova
- A. N. Bach Institute of Biochemistry, Research Center of Biotechnology, Russian Academy of Sciences, Leninsky Ave. 33/2, Moscow 119071, Russia; Department of Soil Science, Lomonosov Moscow State University, Moscow 119991, Russia
- Evgeny N Nikolaev
- Skolkovo Institute of Science and Technology, Skolkovo, Moscow Region 143025, Russia
- Tatiana V Fedorova
- A. N. Bach Institute of Biochemistry, Research Center of Biotechnology, Russian Academy of Sciences, Leninsky Ave. 33/2, Moscow 119071, Russia
|
22
|
Orosz F. Truncated TPPP - An Endopterygota-specific protein. Heliyon 2021; 7:e07135. [PMID: 34136696 PMCID: PMC8180608 DOI: 10.1016/j.heliyon.2021.e07135] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2020] [Revised: 01/18/2021] [Accepted: 05/19/2021] [Indexed: 11/26/2022] Open
Abstract
TPPP proteins exhibiting microtubule stabilizing function constitute a eukaryotic protein superfamily, characterized by the presence of the p25alpha domain of various lengths. Vertebrate species possess three TPPP paralogs; all of them possess a full-length p25alpha domain of 160-170 amino acids and are encoded by three exons. Species of Endopterygota (Holometabola) have, besides a full-size TPPP ortholog, a protein with a truncated p25alpha domain as well, where the last coding exon, responsible for microtubule binding, is missing. It is not the result of alternative splicing but is encoded by a separate gene. In Drosophila melanogaster, they are named CG45057 (long-type) and CG6709 (truncated). The truncated protein has been found in the Endopterygota orders Diptera, Coleoptera, Hymenoptera, Lepidoptera and Raphidioptera. In Lepidoptera, in several superfamilies (Gelechioidea, Bombycoidea, Noctuoidea, Pyraloidea) two paralogs of the truncated TPPP occur. Truncated orthologs (CG6709) were not found in other insects or arthropods, nor in any other organisms, while the long-type TPPPs (CG45057 orthologs) occur commonly in all animals. Thus it seems that CG6709 orthologs occur only in insects undergoing metamorphosis.
Affiliation(s)
- Ferenc Orosz
- Institute of Enzymology, Research Centre for Natural Sciences, Magyar Tudósok körútja 2, 1117 Budapest, Hungary
|
23
|
GenOrigin: A comprehensive protein-coding gene origination database on the evolutionary timescale of life. J Genet Genomics 2021; 48:1122-1129. [PMID: 34538772 DOI: 10.1016/j.jgg.2021.03.018] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 03/02/2021] [Revised: 03/21/2021] [Accepted: 03/29/2021] [Indexed: 11/20/2022]
Abstract
The origination of new genes contributes to the biological diversity of life. New genes may quickly build their network, exert important functions, and generate novel phenotypes. Dating gene age and inferring the origination mechanisms of new genes, like primate-specific genes, is the basis for the functional study of the genes. However, no comprehensive resource of gene age estimates across species is available. Here, we systematically date the age of 9,102,113 protein-coding genes from 565 species in the Ensembl and Ensembl Genomes databases, including 82 bacteria, 57 protists, 134 fungi, 58 plants, 56 metazoa, and 178 vertebrates, using a protein-family-based pipeline with Wagner parsimony algorithm. We also collect gene age estimate data from other studies and uniformly distribute the gene age estimates to time ranges in a million years for comparison across studies. All the data are cataloged into GenOrigin (http://genorigin.chenzxlab.cn/), a user-friendly new database of gene age estimates, where users can browse gene age estimates by species, age, and gene ontology. In GenOrigin, the information such as gene age estimates, annotation, gene ontology, ortholog, and paralog, as well as detailed gene presence/absence views for gene age inference based on the species tree with evolutionary timescale, is provided to researchers for exploring gene functions.
24
Thompson NA, Ranzani M, van der Weyden L, Iyer V, Offord V, Droop A, Behan F, Gonçalves E, Speak A, Iorio F, Hewinson J, Harle V, Robertson H, Anderson E, Fu B, Yang F, Zagnoli-Vieira G, Chapman P, Del Castillo Velasco-Herrera M, Garnett MJ, Jackson SP, Adams DJ. Combinatorial CRISPR screen identifies fitness effects of gene paralogues. Nat Commun 2021; 12:1302. [PMID: 33637726 PMCID: PMC7910459 DOI: 10.1038/s41467-021-21478-9] [Citation(s) in RCA: 51] [Impact Index Per Article: 17.0] [Received: 04/09/2020] [Accepted: 01/25/2021] [Indexed: 12/15/2022]
Abstract
Genetic redundancy has evolved as a way for human cells to survive the loss of genes that are single copy and essential in other organisms, but also allows tumours to survive despite having highly rearranged genomes. In this study we CRISPR screen 1191 gene pairs, including paralogues and known and predicted synthetic lethal interactions to identify 105 gene combinations whose co-disruption results in a loss of cellular fitness. 27 pairs influence fitness across multiple cell lines including the paralogues FAM50A/FAM50B, two genes of unknown function. Silencing of FAM50B occurs across a range of tumour types and in this context disruption of FAM50A reduces cellular fitness whilst promoting micronucleus formation and extensive perturbation of transcriptional programmes. Our studies reveal the fitness effects of FAM50A/FAM50B in cancer cells.
Affiliation(s)
- Nicola A Thompson
- Wellcome Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK
- Marco Ranzani
- Wellcome Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK
- Vivek Iyer
- Wellcome Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK
- Victoria Offord
- Wellcome Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK
- Alastair Droop
- Wellcome Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK
- Fiona Behan
- Wellcome Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK
- Emanuel Gonçalves
- Wellcome Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK
- Anneliese Speak
- Wellcome Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK
- Francesco Iorio
- Wellcome Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK
- Human Technopole, Milano, Italy
- James Hewinson
- Wellcome Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK
- Victoria Harle
- Wellcome Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK
- Holly Robertson
- Wellcome Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK
- Beiyuan Fu
- Wellcome Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK
- Fengtang Yang
- Wellcome Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK
- Phil Chapman
- Cancer Research UK, Manchester Institute, Manchester, UK
- Mathew J Garnett
- Wellcome Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK
- David J Adams
- Wellcome Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK
25
Yang H, Bayer PE, Tirnaz S, Edwards D, Batley J. Genome-Wide Identification and Evolution of Receptor-Like Kinases (RLKs) and Receptor like Proteins (RLPs) in Brassica juncea. BIOLOGY 2020; 10:biology10010017. [PMID: 33396674 PMCID: PMC7823396 DOI: 10.3390/biology10010017] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 10/27/2020] [Revised: 12/21/2020] [Accepted: 12/21/2020] [Indexed: 12/19/2022]
Abstract
Brassica juncea, an allotetraploid species, is an important germplasm resource for canola improvement, due to its many beneficial agronomic traits, such as heat and drought tolerance and blackleg resistance. Receptor-like kinase (RLK) and receptor-like protein (RLP) genes are two types of resistance gene analogues (RGA) that play important roles in plant innate immunity, stress response and various development processes. In this study, genome wide analysis of RLKs and RLPs is performed in B. juncea. In total, 493 RLKs (LysM-RLKs and LRR-RLKs) and 228 RLPs (LysM-RLPs and LRR-RLPs) are identified in the genome of B. juncea, using RGAugury. Only 13.54% RLKs and 11.79% RLPs are observed to be grouped within gene clusters. The majority of RLKs (90.17%) and RLPs (52.83%) are identified as duplicates, indicating that gene duplications significantly contribute to the expansion of RLK and RLP families. Comparative analysis between B. juncea and its progenitor species, B. rapa and B. nigra, indicate that 83.62% RLKs and 41.98% RLPs are conserved in B. juncea, and RLPs are likely to have a faster evolution than RLKs. This study provides a valuable resource for the identification and characterisation of candidate RLK and RLP genes.
Affiliation(s)
- Hua Yang
- School of Biological Sciences, University of Western Australia, Crawley, WA 6009, Australia; (H.Y.); (P.E.B.); (S.T.); (D.E.)
- School of Agriculture and Food Sciences, University of Queensland, St Lucia, QLD 4067, Australia
- Philipp E. Bayer
- School of Biological Sciences, University of Western Australia, Crawley, WA 6009, Australia; (H.Y.); (P.E.B.); (S.T.); (D.E.)
- Soodeh Tirnaz
- School of Biological Sciences, University of Western Australia, Crawley, WA 6009, Australia; (H.Y.); (P.E.B.); (S.T.); (D.E.)
- David Edwards
- School of Biological Sciences, University of Western Australia, Crawley, WA 6009, Australia; (H.Y.); (P.E.B.); (S.T.); (D.E.)
- Jacqueline Batley
- School of Biological Sciences, University of Western Australia, Crawley, WA 6009, Australia; (H.Y.); (P.E.B.); (S.T.); (D.E.)
- Correspondence: ; Tel.: +61-8-6488-5929
26
Orosz F. On the TPPP-like proteins of flagellated fungi. Fungal Biol 2020; 125:357-367. [PMID: 33910677 DOI: 10.1016/j.funbio.2020.12.001] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/18/2020] [Revised: 12/02/2020] [Accepted: 12/06/2020] [Indexed: 12/12/2022]
Abstract
TPPP-like proteins, exhibiting microtubule stabilizing function, constitute a eukaryotic superfamily, characterized by the presence of the p25alpha domain. TPPPs in the strict sense are present in animals except Trichoplax adhaerens, which instead contains apicortin where a part of the p25alpha domain is combined with a DCX domain. Apicortin is absent in other animals and occurs mostly in the protozoan phylum, Apicomplexa. A strong correlation between the occurrence of p25alpha domain and that of the eukaryotic cilium/flagellum was suggested. Species of the deeper branching clades of Fungi possess flagellum but others lost it thus investigation of fungal genomes can help testing of this suggestion. Indeed, these proteins are present in early branching Fungi. Both TPPP and apicortin are present in Rozellomycota (Cryptomycota) and Chytridiomycota, TPPP in Blastocladiomycota, apicortin in Neocallimastigomycota, Monoblepharomycota and the non-flagellated Mucoromycota. Beside the "normal" TPPP occurring in animals, a special, fungal-type TPPP is also present in Fungi, in which a part of the p25alpha domain is duplicated. Dikarya, the most developed subkingdom of Fungi, lacks both flagellum and TPPPs. Thus it is strengthened that each ciliated/flagellated organism contains p25alpha domain-containing proteins while there are very few non-flagellated ones where p25alpha domain can be found.
Affiliation(s)
- Ferenc Orosz
- Institute of Enzymology, Research Centre for Natural Sciences, Magyar Tudósok Körútja 2, 1117, Budapest, Hungary
27
Altenhoff AM, Garrayo-Ventas J, Cosentino S, Emms D, Glover NM, Hernández-Plaza A, Nevers Y, Sundesha V, Szklarczyk D, Fernández JM, Codó L, Quest for Orthologs Consortium, Gelpi JL, Huerta-Cepas J, Iwasaki W, Kelly S, Lecompte O, Muffato M, Martin MJ, Capella-Gutierrez S, Thomas PD, Sonnhammer E, Dessimoz C. The Quest for Orthologs benchmark service and consensus calls in 2020. Nucleic Acids Res 2020; 48:W538-W545. [PMID: 32374845 PMCID: PMC7319555 DOI: 10.1093/nar/gkaa308] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 02/20/2020] [Revised: 04/16/2020] [Accepted: 04/20/2020] [Indexed: 12/18/2022]
Abstract
The identification of orthologs—genes in different species which descended from the same gene in their last common ancestor—is a prerequisite for many analyses in comparative genomics and molecular evolution. Numerous algorithms and resources have been conceived to address this problem, but benchmarking and interpreting them is fraught with difficulties (need to compare them on a common input dataset, absence of ground truth, computational cost of calling orthologs). To address this, the Quest for Orthologs consortium maintains a reference set of proteomes and provides a web server for continuous orthology benchmarking (http://orthology.benchmarkservice.org). Furthermore, consensus ortholog calls derived from public benchmark submissions are provided on the Alliance of Genome Resources website, the joint portal of NIH-funded model organism databases.
Affiliation(s)
- Adrian M Altenhoff
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- ETH Zurich, Department of Computer Science, Zurich, Switzerland
- Salvatore Cosentino
- Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Tokyo, Japan
- David Emms
- Department of Plant Sciences, University of Oxford, South Parks Road, Oxford, UK
- Natasha M Glover
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland
- Ana Hernández-Plaza
- Centro de Biotecnologia y Genomica de Plantas, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA), Campus de Montegancedo-UPM, 28223, Pozuelo de Alarcón, Madrid, Spain
- Yannis Nevers
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland
- Department of Computer Science, ICube, UMR 7357, University of Strasbourg, CNRS, Fédération de Médecine Translationnelle de Strasbourg, Strasbourg, France
- Vicky Sundesha
- Life Sciences Department, Barcelona Supercomputing Center (BSC), Barcelona, Spain
- Damian Szklarczyk
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Institute of Molecular Life Sciences, University of Zurich, Winterthurerstrasse 190, Zurich, 8057, Switzerland
- José M Fernández
- Life Sciences Department, Barcelona Supercomputing Center (BSC), Barcelona, Spain
- Laia Codó
- Life Sciences Department, Barcelona Supercomputing Center (BSC), Barcelona, Spain
- Josep Ll Gelpi
- Life Sciences Department, Barcelona Supercomputing Center (BSC), Barcelona, Spain
- Department of Biochemistry and Molecular Biomedicine, University of Barcelona, Barcelona, Spain
- Jaime Huerta-Cepas
- Centro de Biotecnologia y Genomica de Plantas, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA), Campus de Montegancedo-UPM, 28223, Pozuelo de Alarcón, Madrid, Spain
- Wataru Iwasaki
- Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Tokyo, Japan
- Steven Kelly
- Department of Plant Sciences, University of Oxford, South Parks Road, Oxford, UK
- Odile Lecompte
- Department of Computer Science, ICube, UMR 7357, University of Strasbourg, CNRS, Fédération de Médecine Translationnelle de Strasbourg, Strasbourg, France
- Matthieu Muffato
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Maria J Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Paul D Thomas
- Division of Bioinformatics, Department of Preventive Medicine, University of Southern California, Los Angeles, USA
- Erik Sonnhammer
- Science for Life Laboratory, Department of Biochemistry and Biophysics, Stockholm University, Solna, Sweden
- Christophe Dessimoz
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland
- Department of Genetics, Evolution & Environment, University College London, London, UK
- Department of Computer Science, University College London, London, UK
28
Mohanta TK, Mishra AK, Khan A, Hashem A, Abd_Allah EF, Al-Harrasi A. Gene Loss and Evolution of the Plastome. Genes (Basel) 2020; 11:E1133. [PMID: 32992972 PMCID: PMC7650654 DOI: 10.3390/genes11101133] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Received: 07/16/2020] [Revised: 09/07/2020] [Accepted: 09/14/2020] [Indexed: 12/13/2022]
Abstract
Chloroplasts are unique organelles within the plant cells and are responsible for sustaining life forms on the earth due to their ability to conduct photosynthesis. Multiple functional genes within the chloroplast are responsible for a variety of metabolic processes that occur in the chloroplast. Considering its fundamental role in sustaining life on the earth, it is important to identify the level of diversity present in the chloroplast genome, what genes and genomic content have been lost, what genes have been transferred to the nuclear genome, duplication events, and the overall origin and evolution of the chloroplast genome. Our analysis of 2511 chloroplast genomes indicated that the genome size and number of coding DNA sequences (CDS) in the chloroplasts genome of algae are higher relative to other lineages. Approximately 10.31% of the examined species have lost the inverted repeats (IR) in the chloroplast genome that span across all the lineages. Genome-wide analyses revealed the loss of the Rbcl gene in parasitic and heterotrophic plants occurred approximately 56 Ma ago. PsaM, Psb30, ChlB, ChlL, ChlN, and Rpl21 were found to be characteristic signature genes of the chloroplast genome of algae, bryophytes, pteridophytes, and gymnosperms; however, none of these genes were found in the angiosperm or magnoliid lineage which appeared to have lost them approximately 203-156 Ma ago. A variety of chloroplast-encoded genes were lost across different species lineages throughout the evolutionary process. The Rpl20 gene, however, was found to be the most stable and intact gene in the chloroplast genome and was not lost in any of the analyzed species, suggesting that it is a signature gene of the plastome. Our evolutionary analysis indicated that chloroplast genomes evolved from multiple common ancestors ~1293 Ma ago and have undergone vivid recombination events across different taxonomic lineages.
Affiliation(s)
- Tapan Kumar Mohanta
- Biotech and Omics Laboratory, Natural and Medical Sciences Research Centre, University of Nizwa, Nizwa 616, Oman
- Adil Khan
- Biotech and Omics Laboratory, Natural and Medical Sciences Research Centre, University of Nizwa, Nizwa 616, Oman
- Abeer Hashem
- Botany and Microbiology Department, College of Science, King Saud University, Riyadh 11451, Saudi Arabia
- Mycology and Plant Disease Survey Department, Plant Pathology Research Institute, Giza 12511, Egypt
- Elsayed Fathi Abd_Allah
- Plant Production Department, College of Food and Agricultural Sciences, King Saud University, P.O. Box. 2460, Riyadh 11451, Saudi Arabia
- Ahmed Al-Harrasi
- Natural Product Laboratory, Natural and Medical Sciences Research Centre, University of Nizwa, Nizwa 616, Oman
29
Lallemand T, Leduc M, Landès C, Rizzon C, Lerat E. An Overview of Duplicated Gene Detection Methods: Why the Duplication Mechanism Has to Be Accounted for in Their Choice. Genes (Basel) 2020; 11:E1046. [PMID: 32899740 PMCID: PMC7565063 DOI: 10.3390/genes11091046] [Citation(s) in RCA: 51] [Impact Index Per Article: 12.8] [Received: 07/30/2020] [Revised: 09/01/2020] [Accepted: 09/02/2020] [Indexed: 12/11/2022]
Abstract
Gene duplication is an important evolutionary mechanism allowing to provide new genetic material and thus opportunities to acquire new gene functions for an organism, with major implications such as speciation events. Various processes are known to allow a gene to be duplicated and different models explain how duplicated genes can be maintained in genomes. Due to their particular importance, the identification of duplicated genes is essential when studying genome evolution but it can still be a challenge due to the various fates duplicated genes can encounter. In this review, we first describe the evolutionary processes allowing the formation of duplicated genes but also describe the various bioinformatic approaches that can be used to identify them in genome sequences. Indeed, these bioinformatic approaches differ according to the underlying duplication mechanism. Hence, understanding the specificity of the duplicated genes of interest is a great asset for tool selection and should be taken into account when exploring a biological question.
Affiliation(s)
- Tanguy Lallemand
- IRHS, Agrocampus-Ouest, INRAE, Université d’Angers, SFR 4207 QuaSaV, 49071 Beaucouzé, France; (T.L.); (M.L.); (C.L.)
- Martin Leduc
- IRHS, Agrocampus-Ouest, INRAE, Université d’Angers, SFR 4207 QuaSaV, 49071 Beaucouzé, France; (T.L.); (M.L.); (C.L.)
- Claudine Landès
- IRHS, Agrocampus-Ouest, INRAE, Université d’Angers, SFR 4207 QuaSaV, 49071 Beaucouzé, France; (T.L.); (M.L.); (C.L.)
- Carène Rizzon
- Laboratoire de Mathématiques et Modélisation d’Evry (LaMME), Université d’Evry Val d’Essonne, Université Paris-Saclay, UMR CNRS 8071, ENSIIE, USC INRAE, 23 bvd de France, CEDEX, 91037 Evry Paris, France
- Emmanuelle Lerat
- Université de Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Evolutive UMR 5558, F-69622 Villeurbanne, France
30
Kim S, Park J, Kim T, Lee JS. The functional study of human proteins using humanized yeast. J Microbiol 2020; 58:343-349. [PMID: 32342338 DOI: 10.1007/s12275-020-0136-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 03/12/2020] [Revised: 04/13/2020] [Accepted: 04/13/2020] [Indexed: 12/18/2022]
Abstract
The functional and optimal expression of genes is crucial for survival of all living organisms. Numerous experiments and efforts have been performed to reveal the mechanisms required for the functional and optimal expression of human genes. The yeast Saccharomyces cerevisiae has evolved independently of humans for billions of years. Nevertheless, S. cerevisiae has many conserved genes and expression mechanisms that are similar to those in humans. Yeast is the most commonly used model organism for studying the function and expression mechanisms of human genes because it has a relatively simple genome structure, which is easy to manipulate. Many previous studies have focused on understanding the functions and mechanisms of human proteins using orthologous genes and biological systems of yeast. In this review, we mainly introduce two recent studies that replaced human genes and nucleosomes with those of yeast. Here, we suggest that, although yeast is a relatively small eukaryotic cell, its humanization is useful for the direct study of human proteins. In addition, yeast can be used as a model organism in a broader range of studies, including drug screening.
Affiliation(s)
- Seho Kim
- Department of Molecular Bioscience, College of Biomedical Science, Kangwon National University, Chuncheon, 24341, Republic of Korea
- Juhee Park
- Department of Molecular Bioscience, College of Biomedical Science, Kangwon National University, Chuncheon, 24341, Republic of Korea
- Taekyung Kim
- Department of Biology Education, Pusan National University, Busan, 26241, Republic of Korea
- Jung-Shin Lee
- Department of Molecular Bioscience, College of Biomedical Science, Kangwon National University, Chuncheon, 24341, Republic of Korea
31
Galperin MY, Kristensen DM, Makarova KS, Wolf YI, Koonin EV. Microbial genome analysis: the COG approach. Brief Bioinform 2020; 20:1063-1070. [PMID: 28968633 DOI: 10.1093/bib/bbx117] [Citation(s) in RCA: 144] [Impact Index Per Article: 36.0] [Received: 05/30/2017] [Revised: 08/01/2017] [Indexed: 11/15/2022]
Abstract
For the past 20 years, the Clusters of Orthologous Genes (COG) database had been a popular tool for microbial genome annotation and comparative genomics. Initially created for the purpose of evolutionary classification of protein families, the COG have been used, apart from straightforward functional annotation of sequenced genomes, for such tasks as (i) unification of genome annotation in groups of related organisms; (ii) identification of missing and/or undetected genes in complete microbial genomes; (iii) analysis of genomic neighborhoods, in many cases allowing prediction of novel functional systems; (iv) analysis of metabolic pathways and prediction of alternative forms of enzymes; (v) comparison of organisms by COG functional categories; and (vi) prioritization of targets for structural and functional characterization. Here we review the principles of the COG approach and discuss its key advantages and drawbacks in microbial genome analysis.
32
Andermann T, Torres Jiménez MF, Matos-Maraví P, Batista R, Blanco-Pastor JL, Gustafsson ALS, Kistler L, Liberal IM, Oxelman B, Bacon CD, Antonelli A. A Guide to Carrying Out a Phylogenomic Target Sequence Capture Project. Front Genet 2020; 10:1407. [PMID: 32153629 PMCID: PMC7047930 DOI: 10.3389/fgene.2019.01407] [Citation(s) in RCA: 44] [Impact Index Per Article: 11.0] [Received: 10/02/2019] [Accepted: 12/24/2019] [Indexed: 12/17/2022]
Abstract
High-throughput DNA sequencing techniques enable time- and cost-effective sequencing of large portions of the genome. Instead of sequencing and annotating whole genomes, many phylogenetic studies focus sequencing effort on large sets of pre-selected loci, which further reduces costs and bioinformatic challenges while increasing coverage. One common approach that enriches loci before sequencing is often referred to as target sequence capture. This technique has been shown to be applicable to phylogenetic studies of greatly varying evolutionary depth. Moreover, it has proven to produce powerful, large multi-locus DNA sequence datasets suitable for phylogenetic analyses. However, target capture requires careful considerations, which may greatly affect the success of experiments. Here we provide a simple flowchart for designing phylogenomic target capture experiments. We discuss necessary decisions from the identification of target loci to the final bioinformatic processing of sequence data. We outline challenges and solutions related to the taxonomic scope, sample quality, and available genomic resources of target capture projects. We hope this review will serve as a useful roadmap for designing and carrying out successful phylogenetic target capture studies.
Affiliation(s)
- Tobias Andermann
- Department of Biological and Environmental Sciences, University of Gothenburg, Gothenburg, Sweden
- Gothenburg Global Biodiversity Centre, Gothenburg, Sweden
- Maria Fernanda Torres Jiménez
- Department of Biological and Environmental Sciences, University of Gothenburg, Gothenburg, Sweden
- Gothenburg Global Biodiversity Centre, Gothenburg, Sweden
- Pável Matos-Maraví
- Department of Biological and Environmental Sciences, University of Gothenburg, Gothenburg, Sweden
- Gothenburg Global Biodiversity Centre, Gothenburg, Sweden
- Institute of Entomology, Biology Centre of the Czech Academy of Sciences, České Budějovice, Czechia
- Romina Batista
- Gothenburg Global Biodiversity Centre, Gothenburg, Sweden
- Programa de Pós-Graduação em Genética, Conservação e Biologia Evolutiva, PPG GCBEv–Instituto Nacional de Pesquisas da Amazônia—INPA Campus II, Manaus, Brazil
- Coordenação de Zoologia, Museu Paraense Emílio Goeldi, Belém, Brazil
- José L. Blanco-Pastor
- Department of Biological and Environmental Sciences, University of Gothenburg, Gothenburg, Sweden
- INRAE, Centre Nouvelle-Aquitaine-Poitiers, Lusignan, France
- Logan Kistler
- Department of Anthropology, National Museum of Natural History, Smithsonian Institution, Washington, DC, United States
- Isabel M. Liberal
- Department of Biological and Environmental Sciences, University of Gothenburg, Gothenburg, Sweden
- Bengt Oxelman
- Department of Biological and Environmental Sciences, University of Gothenburg, Gothenburg, Sweden
- Gothenburg Global Biodiversity Centre, Gothenburg, Sweden
- Christine D. Bacon
- Department of Biological and Environmental Sciences, University of Gothenburg, Gothenburg, Sweden
- Gothenburg Global Biodiversity Centre, Gothenburg, Sweden
- Alexandre Antonelli
- Department of Biological and Environmental Sciences, University of Gothenburg, Gothenburg, Sweden
- Gothenburg Global Biodiversity Centre, Gothenburg, Sweden
- Royal Botanic Gardens, Kew, Richmond-Surrey, United Kingdom
33
Qu Y, Bi C, He B, Ye N, Yin T, Xu LA. Genome-wide identification and characterization of the MADS-box gene family in Salix suchowensis. PeerJ 2019; 7:e8019. [PMID: 31720123 PMCID: PMC6842560 DOI: 10.7717/peerj.8019] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 09/07/2018] [Accepted: 10/09/2019] [Indexed: 01/19/2023]
Abstract
MADS-box genes encode transcription factors that participate in various plant growth and development processes, particularly floral organogenesis. To date, MADS-box genes have been reported in many species, and the completion of the sequence of the willow genome provides us with the opportunity to conduct a comprehensive analysis of the willow MADS-box gene family. Here, we identified 60 willow MADS-box genes using bioinformatics-based methods and classified them into 22 M-type (11 Mα, seven Mβ and four Mγ) and 38 MIKC-type (32 MIKCc and six MIKC*) genes based on a phylogenetic analysis. Fifty-six of the 60 SsMADS genes were randomly distributed on 19 putative willow chromosomes. By combining gene structure analysis with evolutionary analysis, we found that the MIKC-type genes were more conserved and played a more important role in willow growth. Further study showed that the MIKC* type was a transition between the M-type and MIKC-type. Additionally, the number of MADS-box genes in gymnosperms was notably lower than that in angiosperms. Finally, the expression profiles of these willow MADS-box genes were analysed in five different tissues (root, stem, leaf, bud and bark) and validated by RT-qPCR experiments. This study is the first genome-wide analysis of the willow MADS-box gene family, and the results establish a basis for further functional studies of willow MADS-box genes and serve as a reference for related studies of other woody plants.
Affiliation(s)
- Yanshu Qu
- Co-Innovation Center for Sustainable Forestry in Southern China, Nanjing Forestry University, Nanjing, China
- Changwei Bi
- School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
- Bing He
- Co-Innovation Center for Sustainable Forestry in Southern China, Nanjing Forestry University, Nanjing, China
- Ning Ye
- College of Information Science and Technology, Nanjing Forestry University, Nanjing, China
- Tongming Yin
- Co-Innovation Center for Sustainable Forestry in Southern China, Nanjing Forestry University, Nanjing, China
- Li-An Xu
- Co-Innovation Center for Sustainable Forestry in Southern China, Nanjing Forestry University, Nanjing, China
34
Larson RT, Dacks JB, Barlow LD. Recent gene duplications dominate evolutionary dynamics of adaptor protein complex subunits in embryophytes. Traffic 2019; 20:961-973. [DOI: 10.1111/tra.12698] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 07/08/2019] [Revised: 09/10/2019] [Accepted: 09/11/2019] [Indexed: 12/14/2022]
Affiliation(s)
- Raegan T. Larson
- Division of Infectious Disease, Department of Medicine, Faculty of Medicine and Dentistry, University of Alberta, Edmonton, Alberta, Canada
- Joel B. Dacks
- Division of Infectious Disease, Department of Medicine, Faculty of Medicine and Dentistry, University of Alberta, Edmonton, Alberta, Canada
- Department of Life Sciences, The Natural History Museum, Cromwell Road, London, UK
- Lael D. Barlow
- Department of Biological Sciences, Faculty of Science, University of Alberta, Edmonton, Alberta, Canada
35
Bioinformatics for Marine Products: An Overview of Resources, Bottlenecks, and Perspectives. Mar Drugs 2019; 17:md17100576. [PMID: 31614509 PMCID: PMC6835618 DOI: 10.3390/md17100576] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 08/20/2019] [Revised: 10/01/2019] [Accepted: 10/02/2019] [Indexed: 12/13/2022]
Abstract
The sea represents a major source of biodiversity. It exhibits many different ecosystems in a huge variety of environmental conditions where marine organisms have evolved with extensive diversification of structures and functions, making the marine environment a treasure trove of molecules with potential for biotechnological applications and innovation in many different areas. Rapid progress of the omics sciences has revealed novel opportunities to advance the knowledge of biological systems, paving the way for an unprecedented revolution in the field and expanding marine research from model organisms to an increasing number of marine species. Multi-level approaches based on molecular investigations at genomic, metagenomic, transcriptomic, metatranscriptomic, proteomic, and metabolomic levels are essential to discover marine resources and further explore key molecular processes involved in their production and action. As a consequence, omics approaches, accompanied by the associated bioinformatic resources and computational tools for molecular analyses and modeling, are boosting the rapid advancement of biotechnologies. In this review, we provide an overview of the most relevant bioinformatic resources and major approaches, highlighting perspectives and bottlenecks for an appropriate exploitation of these opportunities for biotechnology applications from marine resources.
|
36
|
Hu X, Friedberg I. SwiftOrtho: A fast, memory-efficient, multiple genome orthology classifier. Gigascience 2019; 8:giz118. [PMID: 31648300 PMCID: PMC6812468 DOI: 10.1093/gigascience/giz118] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 02/13/2019] [Revised: 06/07/2019] [Accepted: 09/05/2019] [Indexed: 11/13/2022] Open
Abstract
BACKGROUND: Gene homology type classification is required for many types of genome analyses, including comparative genomics, phylogenetics, and protein function annotation. Consequently, a large variety of tools have been developed to perform homology classification across genomes of different species. However, when applied to large genomic data sets, these tools require high memory and CPU usage, typically available only in computational clusters. FINDINGS: Here we present a new graph-based orthology analysis tool, SwiftOrtho, which is optimized for speed and memory usage when applied to large-scale data. SwiftOrtho uses long k-mers to speed up homology search, while using a reduced amino acid alphabet and spaced seeds to compensate for the loss of sensitivity due to long k-mers. In addition, it uses an affinity propagation algorithm to reduce the memory usage when clustering large-scale orthology relationships into orthologous groups. In our tests, SwiftOrtho was the only tool that completed orthology analysis of proteins from 1,760 bacterial genomes on a computer with only 4 GB RAM. Using various standard orthology data sets, we also show that SwiftOrtho has a high accuracy. CONCLUSIONS: SwiftOrtho enables accurate comparative genomic analyses of thousands of genomes using low-memory computers. SwiftOrtho is available at https://github.com/Rinoahu/SwiftOrtho.
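As a rough illustration of the seeding idea described in this abstract, the sketch below collapses residues into a reduced amino acid alphabet and extracts spaced seeds, so that two sequences can share a seed even when they differ at a conservative substitution or at a "don't care" position. The residue grouping, the seed pattern, and all function names are assumptions chosen for the example; they are not taken from SwiftOrtho.

```python
# Illustrative grouping only (an assumption, not SwiftOrtho's alphabet).
# Each group is represented by its first letter.
REDUCED_GROUPS = ["AG", "C", "DENQ", "FWY", "H", "ILMV", "KR", "P", "ST"]
REDUCED = {aa: group[0] for group in REDUCED_GROUPS for aa in group}


def reduce_sequence(seq):
    """Map each residue to its group representative; unknown residues become 'X'."""
    return "".join(REDUCED.get(aa, "X") for aa in seq.upper())


def spaced_seeds(seq, pattern="1101011"):
    """Yield (position, seed) pairs from the reduced sequence.

    Only residues under a '1' in the pattern are kept; positions under
    a '0' are ignored, so mismatches there do not change the seed.
    """
    keep = [i for i, c in enumerate(pattern) if c == "1"]
    span = len(pattern)
    reduced = reduce_sequence(seq)
    for start in range(len(reduced) - span + 1):
        window = reduced[start:start + span]
        yield start, "".join(window[i] for i in keep)


def shared_seeds(a, b, pattern="1101011"):
    """Return the set of spaced seeds present in both sequences."""
    seeds_a = {s for _, s in spaced_seeds(a, pattern)}
    seeds_b = {s for _, s in spaced_seeds(b, pattern)}
    return seeds_a & seeds_b
```

With the pattern "11011", the sequences MKVLA and MKALA differ only at the ignored third position and therefore share a seed, which is the sensitivity gain that spaced seeds are meant to provide.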
Affiliation(s)
- Xiao Hu
- Department of Veterinary Microbiology and Preventive Medicine, 2118 Veterinary Medicine, College of Veterinary Medicine, Iowa State University, Ames, IA, 50011, USA
| | - Iddo Friedberg
- Department of Veterinary Microbiology and Preventive Medicine, 2118 Veterinary Medicine, College of Veterinary Medicine, Iowa State University, Ames, IA, 50011, USA
| |
|
37
|
Siu-Ting K, Torres-Sánchez M, San Mauro D, Wilcockson D, Wilkinson M, Pisani D, O'Connell MJ, Creevey CJ. Inadvertent Paralog Inclusion Drives Artifactual Topologies and Timetree Estimates in Phylogenomics. Mol Biol Evol 2019; 36:1344-1356. [PMID: 30903171 PMCID: PMC6526904 DOI: 10.1093/molbev/msz067] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Indexed: 11/13/2022] Open
Abstract
Increasingly, large phylogenomic data sets include transcriptomic data from nonmodel organisms. This not only has allowed controversial and unexplored evolutionary relationships in the tree of life to be addressed but also increases the risk of inadvertent inclusion of paralogs in the analysis. Although this may be expected to result in decreased phylogenetic support, it is not clear if it could also drive highly supported artifactual relationships. Many groups, including the hyperdiverse Lissamphibia, are especially susceptible to these issues due to ancient gene duplication events and small numbers of sequenced genomes and because transcriptomes are increasingly applied to resolve historically conflicting taxonomic hypotheses. We tested the potential impact of paralog inclusion on the topologies and timetree estimates of the Lissamphibia using published and de novo sequencing data including 18 amphibian species, from which 2,656 single-copy gene families were identified. A novel paralog filtering approach resulted in four differently curated data sets, which were used for phylogenetic reconstructions using Bayesian inference, maximum likelihood, and quartet-based supertrees. We found that paralogs drive strongly supported conflicting hypotheses within the Lissamphibia (Batrachia and Procera) and older divergence time estimates even within groups where no variation in topology was observed. All investigated methods, except Bayesian inference with the CAT-GTR model, were found to be sensitive to paralogs, but with filtering convergence to the same answer (Batrachia) was observed. This is the first large-scale study to address the impact of orthology selection using transcriptomic data and emphasizes the importance of quality over quantity particularly for understanding relationships of poorly sampled taxa.
Affiliation(s)
- Karen Siu-Ting
- Institute for Global Food Security, School of Biological Sciences, Queen's University Belfast, Belfast, United Kingdom; School of Biotechnology, Dublin City University, Glasnevin, Dublin, Ireland; Dpto. de Herpetología, Museo de Historia Natural, Universidad Nacional Mayor de San Marcos, Lima, Perú
| | - María Torres-Sánchez
- Department of Biodiversity, Ecology, and Evolution, Complutense University of Madrid, Madrid, Spain; Department of Neuroscience, Spinal Cord and Brain Injury Research Center and Ambystoma Genetic Stock Center, University of Kentucky, Lexington, KY
| | - Diego San Mauro
- Department of Biodiversity, Ecology, and Evolution, Complutense University of Madrid, Madrid, Spain
| | - David Wilcockson
- Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, United Kingdom
| | - Mark Wilkinson
- Department of Life Sciences, Natural History Museum, London, United Kingdom
| | - Davide Pisani
- Life Sciences Building, University of Bristol, Bristol, United Kingdom
| | - Mary J O'Connell
- School of Biology, Faculty of Biological Sciences, University of Leeds, Leeds, United Kingdom; School of Life Sciences, University of Nottingham, University Park, United Kingdom
| | - Christopher J Creevey
- Institute for Global Food Security, School of Biological Sciences, Queen's University Belfast, Belfast, United Kingdom
| |
|
38
|
Genome-Wide Analysis of Serine/Arginine-Rich Protein Family in Wheat and Brachypodium distachyon. PLANTS 2019; 8:plants8070188. [PMID: 31247888 PMCID: PMC6681277 DOI: 10.3390/plants8070188] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 04/02/2019] [Revised: 06/20/2019] [Accepted: 06/22/2019] [Indexed: 12/15/2022]
Abstract
By regulating the pre-mRNA splicing of other genes and themselves, plant serine/arginine-rich (SR) proteins play important roles in development and in response to abiotic stresses. Presently, the functions of most plant SR protein genes remain unclear. Wheat (Triticum aestivum) and Brachypodium distachyon are closely related species. In this study, 40 TaSR and 18 BdSR proteins were identified respectively, and they were classified into seven subfamilies: SR, RS, SCL, RSZ, RS2Z, SC35, and SR45. Similar to Arabidopsis and rice SR protein genes, most TaSR and BdSR protein genes are expressed extensively. Surprisingly, real-time polymerase chain reaction (RT-PCR) analyses showed that no alternative splicing event was found in TaSR protein genes, and only six BdSR protein genes are alternatively spliced genes. The detected alternatively spliced BdSR protein genes and transcripts are much fewer than in Arabidopsis, rice, maize, and sorghum. In the promoter regions, 92 development-related, stress-related, and hormone-related cis-elements were detected, indicating their functions in development and in response to environmental stresses. Meanwhile, 19 TaSR and 16 BdSR proteins were predicted to interact with other SR proteins or non-SR proteins, implying that they are involved in other functions in addition to modulating pre-mRNA splicing as essential components of the spliceosome. These results lay a foundation for further analyses of these genes.
|
39
|
Zhang M, Xie S, Zhao Y, Meng X, Song L, Feng H, Huang L. Hce2 domain-containing effectors contribute to the full virulence of Valsa mali in a redundant manner. MOLECULAR PLANT PATHOLOGY 2019; 20:843-856. [PMID: 30912612 PMCID: PMC6637899 DOI: 10.1111/mpp.12796] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Indexed: 05/10/2023]
Abstract
Valsa mali is the causal agent of apple Valsa canker, a destructive disease in East Asia. Effector proteins play important roles in the virulence of phytopathogenic fungi, and we identified five Hce2 domain-containing effectors (VmHEP1, VmHEP2, VmHEP3, VmHEP4 and VmHEP5) from the V. mali genome. Amongst these, VmHEP1 and VmHEP2 were found to be up-regulated during the early infection stage and VmHEP1 was also identified as a cell death inducer through its transient expression in Nicotiana benthamiana. Although the deletion of each single VmHEP gene did not lead to a reduction in virulence, the double-deletion of VmHEP1 and VmHEP2 notably attenuated V. mali virulence in both apple twigs and leaves. An evolutionary analysis revealed that VmHEP1 and VmHEP2 are two paralogues, under purifying selection. VmHEP1 and VmHEP2 are located next to each other on chromosome 11 as tandem genes with only a 604 bp physical distance. Interestingly, the deletion of VmHEP1 promoted the expression of VmHEP2 and, vice versa, the deletion of VmHEP2 promoted the expression of VmHEP1. The present results provide insights into the functions of Hce2 domain-containing effectors acting as virulence factors of V. mali, and provide a new perspective regarding the contribution of tandem genes to the virulence of phytopathogenic fungi.
Affiliation(s)
- Mian Zhang
- State Key Laboratory of Crop Stress Biology for Arid Areas and College of Plant Protection, Northwest A&F University, Yangling, China
| | - Shichang Xie
- State Key Laboratory of Crop Stress Biology for Arid Areas and College of Plant Protection, Northwest A&F University, Yangling, China
| | - Yuhuan Zhao
- State Key Laboratory of Crop Stress Biology for Arid Areas and College of Plant Protection, Northwest A&F University, Yangling, China
| | - Xiang Meng
- State Key Laboratory of Crop Stress Biology for Arid Areas and College of Plant Protection, Northwest A&F University, Yangling, China
| | - Linlin Song
- State Key Laboratory of Crop Stress Biology for Arid Areas and College of Plant Protection, Northwest A&F University, Yangling, China
| | - Hao Feng
- State Key Laboratory of Crop Stress Biology for Arid Areas and College of Plant Protection, Northwest A&F University, Yangling, China
| | - Lili Huang
- State Key Laboratory of Crop Stress Biology for Arid Areas and College of Plant Protection, Northwest A&F University, Yangling, China
| |
|
40
|
Heller D, Szklarczyk D, von Mering C. Tree reconciliation combined with subsampling improves large scale inference of orthologous group hierarchies. BMC Bioinformatics 2019; 20:228. [PMID: 31060495 PMCID: PMC6501302 DOI: 10.1186/s12859-019-2828-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 09/19/2018] [Accepted: 04/17/2019] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: An orthologous group (OG) comprises a set of orthologous and paralogous genes that share a last common ancestor (LCA). OGs are defined with respect to a chosen taxonomic level, which delimits the position of the LCA in time to a specified speciation event. A hierarchy of OGs expands on this notion, connecting more general OGs, distant in time, to more recent, fine-grained OGs, thereby spanning multiple levels of the tree of life. Large scale inference of OG hierarchies with independently computed taxonomic levels can suffer from inconsistencies between successive levels, such as the position in time of a duplication event. This can be due to confounding genetic signal or algorithmic limitations. Importantly, inconsistencies limit the potential use of OGs for functional annotation and third-party applications. RESULTS: Here we present a new methodology to ensure hierarchical consistency of OGs across taxonomic levels. To resolve an inconsistency, we subsample the protein space of the OG members and perform gene tree-species tree reconciliation for each sampling. Differently from previous approaches, by subsampling the protein space, we avoid the notoriously difficult task of accurately building and reconciling very large phylogenies. We implement the method into a high-throughput pipeline and apply it to the eggNOG database. We use independent protein domain definitions to validate its performance. CONCLUSION: The presented consistency pipeline shows that, contrary to previous limitations, tree reconciliation can be a useful instrument for the construction of OG hierarchies. The key lies in the combination of sampling smaller trees and aggregating their reconciliations for robustness. Results show comparable or greater performance to previous pipelines. The code is available on GitHub at: https://github.com/meringlab/og_consistency_pipeline.
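The "sample smaller trees, then aggregate" idea in this abstract can be sketched generically as repeated subsampling followed by a vote. This is only a schematic under stated assumptions: the `decide` callable is a hypothetical stand-in for building and reconciling a small gene tree, and majority voting is an assumed aggregation rule, not necessarily the one used in the published pipeline.

```python
import random
from collections import Counter


def consensus_by_subsampling(members, decide, sample_size, n_samples, seed=0):
    """Aggregate decisions made on random subsamples of OG members.

    members     : sequence of protein identifiers in the orthologous group
    decide      : hypothetical callable taking a list of members and returning
                  a hashable decision (for example "split" or "merge")
    sample_size : number of members drawn per subsample
    n_samples   : number of subsamples to evaluate
    Returns (most common decision, Counter of all votes).
    """
    rng = random.Random(seed)
    members = list(members)
    k = min(sample_size, len(members))
    votes = Counter()
    for _ in range(n_samples):
        sample = rng.sample(members, k)  # sampling without replacement
        votes[decide(sample)] += 1
    return votes.most_common(1)[0][0], votes
```

Keeping the per-sample decision behind a callable means each small reconciliation stays cheap, and the vote counts give a simple indication of how stable the decision is across samples.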
Affiliation(s)
- Davide Heller
- Institute of Molecular Life Sciences, University of Zurich, Winterthurerstrasse 190, Zurich, 8057 Switzerland
- SIB Swiss Institute of Bioinformatics, Quartier Sorge, Batiment Genopode, Lausanne, 1015 Switzerland
| | - Damian Szklarczyk
- Institute of Molecular Life Sciences, University of Zurich, Winterthurerstrasse 190, Zurich, 8057 Switzerland
- SIB Swiss Institute of Bioinformatics, Quartier Sorge, Batiment Genopode, Lausanne, 1015 Switzerland
| | - Christian von Mering
- Institute of Molecular Life Sciences, University of Zurich, Winterthurerstrasse 190, Zurich, 8057 Switzerland
- SIB Swiss Institute of Bioinformatics, Quartier Sorge, Batiment Genopode, Lausanne, 1015 Switzerland
| |
|
41
|
Chang CH, Yan HY. Plasticity of opsin gene expression in the adult red shiner (Cyprinella lutrensis) in response to turbid habitats. PLoS One 2019; 14:e0215376. [PMID: 30978235 PMCID: PMC6461250 DOI: 10.1371/journal.pone.0215376] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 10/26/2018] [Accepted: 04/01/2019] [Indexed: 11/30/2022] Open
Abstract
Vision is very important to fish as it is required for foraging for food, fighting competitors, fleeing from predators, and finding potential mates. Vertebrates express opsin genes in photoreceptor cells to receive visual signals, and the variety of light levels in aquatic habitats has driven fish to evolve multiple opsin genes with expression profiles that are highly plastic. In this study, red shiners (Cyprinella lutrensis) were exposed to four water turbidity treatments and their opsin genes were cloned to elucidate how opsin gene expression could be modulated by ambient light conditions. Opsin gene cloning revealed that these fish have single RH1, SWS1, SWS2 and LWS genes and two RH2 genes. Phylogenetic analysis also indicated that these two RH2 opsin genes, RH2A and RH2B, are in-paralogous. Using quantitative PCR, we found evidence that opsin expression is plastic in adults. Elevated proportional expression of LWS in the cone under ambient light and turbid treatment indicated that the red shiner's visual spectrum displays a red shift in response to increased turbidity.
Affiliation(s)
- Chia-Hao Chang
- Department of Life Science, Tunghai University, Taichung City, Taiwan
- Center for Ecology and Environment, Tunghai University, Taichung City, Taiwan
| | - Hong Young Yan
- National Museum of Marine Biology & Aquarium, Checheng, Pingtung, Taiwan
| |
|
42
|
Moreno LF, Vicente VA, de Hoog S. Black yeasts in the omics era: Achievements and challenges. Med Mycol 2018. [PMID: 29538737 DOI: 10.1093/mmy/myx129] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Indexed: 12/14/2022] Open
Abstract
Black yeasts (BY) comprise a group of polyextremotolerant fungi, mainly belonging to the order Chaetothyriales, which are capable of colonizing a wide range of extreme environments. The tolerance to hostile habitats can be explained by their intrinsic ability to survive under acidic, alkaline, and toxic conditions, high temperature, low nutrient availability, and osmotic and mechanical stress. Occasionally, some species can cause human chromoblastomycosis, a chronic subcutaneous infection, as well as disseminated or cerebral phaeohyphomycosis. Three years after the release of the first black yeast genome, the number of projects for sequencing these organisms has significantly increased. Over 37 genomes of important opportunistic and saprobic black yeasts and relatives are now available in different databases. Whole-genome sequencing, together with the analysis of differentially expressed mRNAs and the determination of protein expression profiles, has generated an unprecedented amount of data, requiring the development of a curated repository to provide easy access to this information. In the present article, we review various aspects of the impact of genomics, transcriptomics, and proteomics on black yeast studies. We discuss recent key findings achieved by the use of these technologies and further directions for medical mycology in this area. An important vehicle is the Working Groups on Black Yeasts and Chromoblastomycosis, under the umbrella of ISHAM, which unite clinicians and a highly diverse population of fundamental scientists to exchange data for joint publications.
Affiliation(s)
- Leandro Ferreira Moreno
- Westerdijk Fungal Biodiversity Institute, Utrecht, The Netherlands; Institute of Biodiversity and Ecosystem Dynamics, University of Amsterdam, Amsterdam, Netherlands; Department of Basic Pathology, Federal University of Paraná State, Curitiba, PR, Brazil
| | | | - Sybren de Hoog
- Westerdijk Fungal Biodiversity Institute, Utrecht, The Netherlands; Institute of Biodiversity and Ecosystem Dynamics, University of Amsterdam, Amsterdam, Netherlands; Department of Basic Pathology, Federal University of Paraná State, Curitiba, PR, Brazil; Center of Expertise in Mycology of Radboudumc / CWZ, Nijmegen, The Netherlands
| |
|
43
|
Battenberg K, Potter D, Tabuloc CA, Chiu JC, Berry AM. Comparative Transcriptomic Analysis of Two Actinorhizal Plants and the Legume Medicago truncatula Supports the Homology of Root Nodule Symbioses and Is Congruent With a Two-Step Process of Evolution in the Nitrogen-Fixing Clade of Angiosperms. FRONTIERS IN PLANT SCIENCE 2018; 9:1256. [PMID: 30349546 PMCID: PMC6187967 DOI: 10.3389/fpls.2018.01256] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 01/26/2018] [Accepted: 08/08/2018] [Indexed: 05/18/2023]
Abstract
Root nodule symbiosis (RNS) is a symbiotic interaction established between angiosperm hosts and nitrogen-fixing soil bacteria in specialized organs called root nodules. The host plants provide photosynthate and the microsymbionts supply fixed nitrogen. The origin of RNS represents a major evolutionary event in the angiosperms, and understanding the genetic underpinnings of this event is of major economic and agricultural importance. Plants that engage in RNS are restricted to a single angiosperm clade known as the nitrogen-fixing clade (NFC), yet occur in multiple lineages scattered within the NFC. It has been postulated that RNS evolved in two steps: a gain-of-predisposition event occurring at the base of the NFC, followed by a gain-of-function event in each host plant lineage. Here, we first explore the premise that RNS has evolved from a single common background, and then we explore whether a two-step process better explains the evolutionary origin of RNS than either a single-step process, or multiple origins. We assembled the transcriptomes of root and nodule of two actinorhizal plants, Ceanothus thyrsiflorus and Datisca glomerata. Together with the corresponding published transcriptomes of the model legume Medicago truncatula, the gene expression patterns in roots and nodules were compared across the three lineages. We found that orthologs of many genes essential for RNS in the model legumes are expressed in all three lineages, and that the overall nodule gene expression patterns were more similar to each other than expected by random chance, a finding that supports a common evolutionary background for RNS shared by the three lineages. Moreover, phylogenetic analyses suggested that a substantial portion of the genes experiencing selection pressure changes at the base of the NFC also experienced additional changes at the base of each host plant lineage. 
Our results (1) support the occurrence of an event that led to RNS at the base of the NFC, and (2) suggest a subsequent change in each lineage, most consistent with a two-step origin of RNS. Among several conserved functions identified, strigolactone-related genes were down-regulated in nodules of all three species, suggesting a shared function similar to that shown for arbuscular mycorrhizal symbioses.
Affiliation(s)
- Kai Battenberg
- Department of Plant Sciences, University of California, Davis, Davis, CA, United States
| | - Daniel Potter
- Department of Plant Sciences, University of California, Davis, Davis, CA, United States
| | - Christine A. Tabuloc
- Department of Entomology and Nematology, University of California, Davis, Davis, CA, United States
| | - Joanna C. Chiu
- Department of Entomology and Nematology, University of California, Davis, Davis, CA, United States
| | - Alison M. Berry
- Department of Plant Sciences, University of California, Davis, Davis, CA, United States
| |
|
44
|
Abstract
This chapter covers the theory and practice of ortholog gene set computation. In the theoretical part we give detailed and formal descriptions of the relevant concepts. We also cover the topic of graph-based clustering as a tool to compute ortholog gene sets. In the second part we provide an overview of practical considerations intended for researchers who need to determine orthologous genes from a collection of annotated genomes, briefly describing some of the most popular programs and resources currently available for this task.
|
45
|
Yang X, Wang J, Bing G, Bie P, De Y, Lyu Y, Wu Q. Ortholog-based screening and identification of genes related to intracellular survival. Gene 2018; 651:134-142. [DOI: 10.1016/j.gene.2018.01.059] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 06/10/2017] [Revised: 10/29/2017] [Accepted: 01/17/2018] [Indexed: 12/29/2022]
|
46
|
Uchiyama I. Ortholog Identification and Comparative Analysis of Microbial Genomes Using MBGD and RECOG. Methods Mol Biol 2018; 1611:147-168. [PMID: 28451978 DOI: 10.1007/978-1-4939-7015-5_12] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 03/21/2023]
Abstract
Comparative genomics is becoming an essential approach for identification of genes associated with a specific function or phenotype. Here, we introduce the microbial genome database for comparative analysis (MBGD), which is a comprehensive ortholog database among the microbial genomes available so far. MBGD contains several precomputed ortholog tables including the standard ortholog table covering the entire taxonomic range and taxon-specific ortholog tables for various major taxa. In addition, MBGD allows the users to create an ortholog table within any specified set of genomes through dynamic calculations. In particular, MBGD has a "My MBGD" mode where users can upload their original genome sequences and incorporate them into orthology analysis. The created ortholog table can serve as the basis for various comparative analyses. Here, we describe the use of MBGD and briefly explain how to utilize the orthology information during comparative genome analysis in combination with the stand-alone comparative genomics software RECOG, focusing on the application to comparison of closely related microbial genomes.
Affiliation(s)
- Ikuo Uchiyama
- Laboratory of Genome Informatics, National Institute for Basic Biology, National Institutes of Natural Sciences, Nishigonaka 38, Myodaiji, Okazaki, Aichi, 444-8585, Japan.
| |
|
47
|
Darby CA, Stolzer M, Ropp PJ, Barker D, Durand D. Xenolog classification. Bioinformatics 2017; 33:640-649. [PMID: 27998934 PMCID: PMC5860392 DOI: 10.1093/bioinformatics/btw686] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Received: 08/05/2016] [Accepted: 10/26/2016] [Indexed: 01/31/2023] Open
Abstract
Motivation: Orthology analysis is a fundamental tool in comparative genomics. Sophisticated methods have been developed to distinguish between orthologs and paralogs and to classify paralogs into subtypes depending on the duplication mechanism and timing, relative to speciation. However, no comparable framework exists for xenologs: gene pairs whose history, since their divergence, includes a horizontal transfer. Further, the diversity of gene pairs that meet this broad definition calls for classification of xenologs with similar properties into subtypes. Results: We present a xenolog classification that uses phylogenetic reconciliation to assign each pair of genes to a class based on the event responsible for their divergence and the historical association between genes and species. Our classes distinguish between genes related through transfer alone and genes related through duplication and transfer. Further, they separate closely-related genes in distantly-related species from distantly-related genes in closely-related species. We present formal rules that assign gene pairs to specific xenolog classes, given a reconciled gene tree with an arbitrary number of duplications and transfers. These xenology classification rules have been implemented in software and tested on a collection of ∼13 000 prokaryotic gene families. In addition, we present a case study demonstrating the connection between xenolog classification and gene function prediction. Availability and Implementation: The xenolog classification rules have been implemented in Notung 2.9, a freely available phylogenetic reconciliation software package (http://www.cs.cmu.edu/~durand/Notung). Gene trees are available at http://dx.doi.org/10.7488/ds/1503. Contact: durand@cmu.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Charlotte A Darby
- Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Maureen Stolzer
- Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Patrick J Ropp
- Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Daniel Barker
- School of Biology, University of St. Andrews, St. Andrews, Fife KY16 9TH, UK
| | - Dannie Durand
- Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA; Department of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| |
|
48
|
Abstract
The patchy distribution of genes across the prokaryotes may be caused by multiple gene losses or lateral transfer. Probabilistic models of gene gain and loss are needed to distinguish between these possibilities. Existing models allow only single genes to be gained and lost, despite the empirical evidence for multi-gene events. We compare birth-death models (currently the only widely-used models, in which only one gene can be gained or lost at a time) to blocks models (allowing gain and loss of multiple genes within a family). We analyze two pairs of genomes: two E. coli strains, and the distantly-related Archaeoglobus fulgidus (archaea) and Bacillus subtilis (gram positive bacteria). Blocks models describe the data much better than birth-death models. Our models suggest that lateral transfers of multiple genes from the same family are rare (although transfers of single genes are probably common). For both pairs, the estimated median time that a gene will remain in the genome is not much greater than the time separating the common ancestors of the archaea and bacteria. Deep phylogenetic reconstruction from sequence data will therefore depend on choosing genes likely to remain in the genome for a long time. Phylogenies based on the blocks model are more biologically plausible than phylogenies based on the birth-death model.
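The contrast drawn here between single-gene and multi-gene events can be made concrete with a toy simulation of gene family size. The sketch below is a discrete-step simplification with fixed per-step event probabilities and a uniform block size, all of which are assumptions made for illustration; it is not the continuous-time likelihood model fitted in the study.

```python
import random


def simulate_family_size(n0, gain, loss, steps, max_block=1, seed=0):
    """Toy trajectory of gene family size over a number of discrete steps.

    gain, loss : per-step probabilities of a gain or loss event (assumed constant)
    max_block  : largest number of genes changed by one event;
                 1 mimics a birth-death process (one gene at a time),
                 values above 1 mimic a blocks-style process
    """
    rng = random.Random(seed)
    size = n0
    trajectory = [n0]
    for _ in range(steps):
        u = rng.random()
        block = rng.randint(1, max_block)  # assumed uniform block size
        if u < gain:
            size += block
        elif u < gain + loss:
            size = max(0, size - block)  # family size cannot go below zero
        trajectory.append(size)
    return trajectory
```

Running the function with `max_block=1` and again with a larger `max_block` shows how allowing multi-gene events produces larger jumps in family size for the same number of events.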
Affiliation(s)
- Matthew Spencer
- Department of Mathematics and Statistics, Dalhousie University, Halifax, Nova Scotia, Canada
| | - Edward Susko
- Department of Mathematics and Statistics, Dalhousie University, Halifax, Nova Scotia, Canada
| | - Andrew J. Roger
- Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia, Canada
| |
|
49
|
Burgetz IJ, Shariff S, Pang A, Tillier ERM. Positional Homology in Bacterial Genomes. Evol Bioinform Online 2017. [DOI: 10.1177/117693430600200031] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/16/2022] Open
Abstract
In comparative genomic studies, syntenic groups of homologous sequences in the same order have been used as supplementary information to help determine the orthology of the compared sequences. The assumption is that orthologous gene copies are more likely to share the same genome positions and the same gene neighbors. In this study we have defined positional homologs as those that also have homologous neighboring genes, and we investigated the usefulness of this distinction for bacterial comparative genomics. We considered the identification of positionally homologous gene pairs in bacterial genomes using protein and DNA sequence level alignments and found that the positional homologs had on average relatively lower rates of substitution at the DNA level (synonymous substitutions) than duplicate homologs in different genomic locations, regardless of the level of protein sequence divergence (measured with non-synonymous substitution rate). Since gene order conservation can indicate accuracy of orthology assignments, we also considered the effect of imposing certain alignment quality requirements on the sensitivity and specificity of identification of protein pairs by BLAST and FASTA when neighboring information is not available and in comparisons where gene order is not conserved. We found that the addition of a stringency filter based on the second best hits was an efficient way to remove dubious ortholog identifications in BLAST and FASTA analyses. Gene order conservation and DNA sequence homology are useful to consider in comparative genomic studies as they may indicate different orthology assignments than protein sequence homology alone.
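The definition of positional homologs given in this abstract (homologous genes that also have homologous neighbors) can be sketched as a small filter over gene-order lists. The data layout, the `window` parameter, and the function name are assumptions made for the example, and the sketch treats each genome as linear, ignoring the wrap-around of circular bacterial chromosomes.

```python
def positional_homologs(order_a, order_b, homologs, window=1):
    """Return homologous pairs (a, b) that also have a homologous neighbor pair.

    order_a, order_b : gene identifiers in chromosomal order for each genome
    homologs         : set of (gene_in_a, gene_in_b) tuples judged homologous
    window           : how many genes on each side count as neighbors (assumed)
    """
    index_a = {gene: i for i, gene in enumerate(order_a)}
    index_b = {gene: i for i, gene in enumerate(order_b)}

    def neighbors(order, i):
        lo = max(0, i - window)
        hi = min(len(order), i + window + 1)
        return [order[j] for j in range(lo, hi) if j != i]

    result = set()
    for a, b in homologs:
        if a not in index_a or b not in index_b:
            continue  # skip pairs that cannot be placed on both gene orders
        near_a = neighbors(order_a, index_a[a])
        near_b = neighbors(order_b, index_b[b])
        if any((x, y) in homologs for x in near_a for y in near_b):
            result.add((a, b))
    return result
```

For example, with gene orders a1..a4 and b1..b4 and homolog pairs (a1, b1), (a2, b2), and (a4, b1), the first two pairs support each other as neighbors, while (a4, b1) has no homologous neighbors and is left out as a duplicate in a different genomic context.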
Affiliation(s)
- Ingrid J. Burgetz
- Dept. of Medical Biophysics, Ontario Cancer Institute, University Health Network, Toronto, Ontario, Canada
- Salimah Shariff
- Dept. of Medical Biophysics, Ontario Cancer Institute, University Health Network, Toronto, Ontario, Canada
- Andy Pang
- Dept. of Medical Biophysics, Ontario Cancer Institute, University Health Network, Toronto, Ontario, Canada
|
50
|
Mans BJ, Featherston J, de Castro MH, Pienaar R. Gene Duplication and Protein Evolution in Tick-Host Interactions. Front Cell Infect Microbiol 2017; 7:413. [PMID: 28993800 PMCID: PMC5622192 DOI: 10.3389/fcimb.2017.00413] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 05/27/2017] [Accepted: 09/06/2017] [Indexed: 01/01/2023]
Abstract
Ticks modulate their hosts' defense responses by secreting a biopharmacopoeia of hundreds to thousands of proteins and bioactive chemicals into the feeding site (the tick-host interface). These molecules and their functions evolved over millions of years as ticks adapted to blood-feeding, tick lineages diverged, and host shifts occurred. The evolution of new proteins with new functions depends mainly on gene duplication events. Central questions here are the rates of gene duplication, when the duplications occurred, and how new functions evolve after gene duplication. The current review investigates these questions in the light of tick biology and considers the possibilities of ancient genome duplication, lineage-specific expansion events, and the role that positive selection played in the evolution of tick protein function. It contrasts current views in tick biology regarding adaptive evolution with the more general view that neutral evolution may account for the majority of biological innovations observed in ticks.
Affiliation(s)
- Ben J Mans
- Epidemiology, Parasites and Vectors, Agricultural Research Council-Onderstepoort Veterinary Research, Onderstepoort, South Africa; Department of Veterinary Tropical Diseases, University of Pretoria, Pretoria, South Africa; Department of Life and Consumer Sciences, University of South Africa, Pretoria, South Africa
- Jonathan Featherston
- Agricultural Research Council-The Biotechnology Platform, Onderstepoort, South Africa
- Minique H de Castro
- Epidemiology, Parasites and Vectors, Agricultural Research Council-Onderstepoort Veterinary Research, Onderstepoort, South Africa; Department of Life and Consumer Sciences, University of South Africa, Pretoria, South Africa; Agricultural Research Council-The Biotechnology Platform, Onderstepoort, South Africa
- Ronel Pienaar
- Epidemiology, Parasites and Vectors, Agricultural Research Council-Onderstepoort Veterinary Research, Onderstepoort, South Africa
|