1
Rahiminejad S, De Sanctis B, Pevzner P, Mushegian A. Synthetic lethality and the minimal genome size problem. mSphere 2024:e0013924. [PMID: 38904396 DOI: 10.1128/msphere.00139-24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/19/2024] [Accepted: 05/13/2024] [Indexed: 06/22/2024] Open
Abstract
Gene knockout studies suggest that ~300 genes in a bacterial genome and ~1,100 genes in a yeast genome cannot be deleted without loss of viability. These single-gene knockout experiments do not account for negative genetic interactions, in which two or more genes can each be deleted without effect but their joint deletion is lethal. Thus, large-scale single-gene deletion studies underestimate the size of a minimal gene set compatible with cell survival. In the yeast Saccharomyces cerevisiae, the viability of all possible deletions of gene pairs (2-tuples), and of some deletions of gene triplets (3-tuples), has been experimentally tested. To estimate the size of a yeast minimal genome from these data, we first established that finding the size of a minimal gene set is equivalent to finding the minimum vertex cover in the lethality (hyper)graph, where the vertices are genes and (hyper)edges connect k-tuples of genes whose joint deletion is lethal. Using the Lovász-Johnson-Chvátal greedy approximation algorithm, we computed the minimum vertex cover of the synthetic-lethal 2-tuples graph to be 1,723 genes. We next simulated the genetic interactions in 3-tuples, extrapolating from the existing triplet sample, and again estimated minimum vertex covers. The size of a minimal gene set in yeast rapidly approaches the size of the entire genome even when considering only synthetic lethalities in k-tuples with small k. In contrast, several studies reported successful experimental reductions of yeast and bacterial genomes by simultaneous deletions of hundreds of genes, without eliciting synthetic lethality. We discuss possible reasons for this apparent contradiction.
IMPORTANCE: How can we estimate the smallest number of genes sufficient for a unicellular organism to survive on a rich medium? One approach is to remove genes one at a time and count how many of the resulting deletion strains are unable to grow. However, the single-gene knockout data are insufficient, because joint gene deletions may result in negative genetic interactions, also known as synthetic lethality. We used a technique from graph theory to estimate the size of a minimal yeast genome from partial data on synthetic lethality. The number of potential synthetic lethal interactions grows very fast when multiple genes are deleted, revealing a paradoxical contrast with the experimental reductions of the yeast genome by ~100 genes, and of bacterial genomes by several hundreds of genes.
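The vertex-cover formulation in this abstract can be illustrated with a small sketch. It assumes the usual greedy rule (repeatedly keep the gene that appears in the most still-uncovered lethal tuples); the function name and the toy gene labels are hypothetical, and this is not the authors' code or data.

```python
from collections import defaultdict


def greedy_vertex_cover(lethal_tuples):
    """Greedy approximation of a minimum vertex cover (hitting set).

    lethal_tuples: iterable of gene tuples whose joint deletion is lethal.
    Returns a set of genes containing at least one gene from every tuple,
    i.e. a gene set that must be retained to avoid every listed lethality.
    """
    uncovered = {frozenset(t) for t in lethal_tuples}
    cover = set()
    while uncovered:
        # Count how many still-uncovered tuples each gene participates in.
        degree = defaultdict(int)
        for edge in uncovered:
            for gene in edge:
                degree[gene] += 1
        # Keep the gene hitting the most uncovered tuples (ties: alphabetical).
        best = max(sorted(degree), key=lambda g: degree[g])
        cover.add(best)
        uncovered = {edge for edge in uncovered if best not in edge}
    return cover


# Toy lethality hypergraph; gene names are made up for illustration.
toy_tuples = [("A", "B"), ("A", "C"), ("A", "D"), ("E", "F"), ("B", "C", "E")]
cover = greedy_vertex_cover(toy_tuples)
print(sorted(cover))
```

A single essential gene can be passed as a 1-tuple, so the same routine handles essential genes, pairs, and triplets. Being greedy, the result is an approximation rather than a guaranteed minimum.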
Affiliation(s)
- Sara Rahiminejad
- Department of Bioengineering, University of California-San Diego, La Jolla, California, USA
- Bianca De Sanctis
- Department of Genetics, University of Cambridge, Cambridge, United Kingdom
- Department of Ecology and Evolutionary Biology, University of California-Santa Cruz, Santa Cruz, California, USA
- Pavel Pevzner
- Department of Computer Science and Engineering, University of California-San Diego, La Jolla, California, USA
- Arcady Mushegian
- Molecular and Cellular Biosciences Division, National Science Foundation, Alexandria, Virginia, USA
- Clare Hall College, Cambridge, United Kingdom
2
Dong H, Wang Y, Zhi T, Guo H, Guo Y, Liu L, Yin Y, Shi J, He B, Hu L, Jiang G. Construction of protein-protein interaction network in sulfate-reducing bacteria: Unveiling of global response to Hg. Environmental Pollution (Barking, Essex: 1987) 2024; 351:124048. [PMID: 38714230 DOI: 10.1016/j.envpol.2024.124048] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/28/2024] [Revised: 04/20/2024] [Accepted: 04/23/2024] [Indexed: 05/09/2024]
Abstract
Sulfate-reducing bacteria (SRB) play pivotal roles in the biotransformation of mercury (Hg). However, the global responses of SRB to Hg remain uncharacterized, which has limited our understanding of the details of Hg biotransformation processes. The absence of a protein-protein interaction (PPI) network under Hg stimuli has been a bottleneck in proteomic analysis of the molecular mechanisms of Hg transformation. This study constructed the first comprehensive PPI network of SRB in response to Hg, encompassing 67 connected nodes, 26 independent nodes, and 121 edges, covering 93% of differentially expressed proteins from both previous studies and this study. The network suggested that proteomic changes of SRB in response to Hg occurred globally, including microbial metabolism in diverse environments, carbon metabolism, nucleic acid metabolism and translation, nucleic acid repair, transport systems, nitrogen metabolism, and methyltransferase activity, part of which is consistent with existing knowledge. Antibiotic resistance was a novel response revealed by this network, providing insights into Hg biotransformation mechanisms. This study provides the first foundational network for a comprehensive understanding of SRB's responses to Hg, which is convenient for exploring potential targets for Hg biotransformation. Furthermore, the network indicated that Hg enhances the metabolic activities and modification pathways of SRB to maintain cellular activities, shedding light on the influences of Hg on the carbon, nitrogen, and sulfur cycles at the cellular level.
Affiliation(s)
- Hongzhe Dong
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing, 100085, China; Sino-Danish College, University of Chinese Academy of Sciences, Beijing, 100049, China; Sino-Danish Centre for Education and Research, Beijing, 100049, China
- Yuchuan Wang
- Hebei Key Laboratory for Chronic Diseases, School of Basic Medical Sciences, North China University of Science and Technology, Tangshan, Hebei, 063210, China
- Tingting Zhi
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing, 100085, China
- Hua Guo
- School of Environment, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou, 310024, China
- Yingying Guo
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing, 100085, China
- Lihong Liu
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing, 100085, China
- Yongguang Yin
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing, 100085, China; School of Environment, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou, 310024, China
- Jianbo Shi
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing, 100085, China; School of Environment and Health, Jianghan University, Wuhan, 430056, China
- Bin He
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing, 100085, China; School of Environment, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou, 310024, China
- Ligang Hu
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing, 100085, China; Sino-Danish College, University of Chinese Academy of Sciences, Beijing, 100049, China; Sino-Danish Centre for Education and Research, Beijing, 100049, China; School of Environment, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou, 310024, China.
- Guibin Jiang
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing, 100085, China; School of Environment, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou, 310024, China
3
Ludwig J, Mrázek J. OrthoRefine: automated enhancement of prior ortholog identification via synteny. BMC Bioinformatics 2024; 25:163. [PMID: 38664637 PMCID: PMC11044567 DOI: 10.1186/s12859-024-05786-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2023] [Accepted: 04/15/2024] [Indexed: 04/29/2024] Open
Abstract
BACKGROUND: Identifying orthologs continues to be an early and imperative step in genome analysis but remains a challenging problem. While synteny (conservation of gene order) has previously been used independently and in combination with other methods to identify orthologs, applying synteny in ortholog identification has yet to be automated in a user-friendly manner. This desire for automation and ease of use led us to develop OrthoRefine, a standalone program that uses synteny to refine ortholog identification.
RESULTS: We developed OrthoRefine to improve the detection of orthologous genes by implementing a look-around window approach to detect synteny. We tested OrthoRefine in tandem with OrthoFinder, one of the most widely used programs for ortholog identification in recent years. We evaluated the improvements provided by OrthoRefine in several bacterial datasets and one eukaryotic dataset. OrthoRefine efficiently eliminates paralogs from orthologous groups detected by OrthoFinder. Using synteny increased specificity and functional ortholog identification; additionally, analysis of BLAST e-value, phylogenetics, and operon occurrence further supported using synteny for ortholog identification. A comparison of several window sizes suggested that smaller window sizes (eight genes) were generally the most suitable for identifying orthologs via synteny. However, larger windows (30 genes) performed better in datasets containing less closely related genomes. A typical run of OrthoRefine with ~10 bacterial genomes can be completed in a few minutes on a regular desktop PC.
CONCLUSION: OrthoRefine is a simple-to-use, standalone tool that automates the application of synteny to improve ortholog detection. OrthoRefine is particularly efficient in eliminating paralogs from orthologous groups delineated by standard methods.
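The look-around window idea mentioned in this abstract might be sketched as follows. The abstract does not give OrthoRefine's actual scoring, so the scoring rule here (the fraction of a gene's neighbours whose homolog-group label also appears near the candidate partner) is an assumption for illustration only, as are the function name, the parameters, and the toy gene orders.

```python
def synteny_support(order_a, order_b, i, j, window=8):
    """Fraction of neighbours of order_a[i] whose homolog-group label also
    occurs within the window around order_b[j].

    order_a, order_b: lists of homolog-group labels in chromosomal order
    (hypothetical input format). window: total number of flanking genes
    considered, split evenly between both sides.
    """
    half = window // 2

    def neighbours(order, k):
        lo, hi = max(0, k - half), min(len(order), k + half + 1)
        # Exclude the focal gene itself from its own neighbourhood.
        return [g for idx, g in enumerate(order[lo:hi], start=lo) if idx != k]

    near_a = neighbours(order_a, i)
    near_b = set(neighbours(order_b, j))
    if not near_a:
        return 0.0
    return sum(g in near_b for g in near_a) / len(near_a)


# Toy gene orders: "g3" has a candidate partner in a conserved neighbourhood
# (genome_b) and a paralog-like copy in an unrelated neighbourhood (genome_c).
genome_a = ["g1", "g2", "g3", "g4", "g5"]
genome_b = ["g1", "g2", "g3", "g4", "g5"]
genome_c = ["h7", "h8", "g3", "h9", "h10"]

conserved = synteny_support(genome_a, genome_b, 2, 2, window=4)
unrelated = synteny_support(genome_a, genome_c, 2, 2, window=4)
print(conserved, unrelated)
```

Under this assumed rule, a candidate pair with low support would be flagged as a likely paralog; the threshold for that decision is left open here because the abstract does not state one.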
Affiliation(s)
- J Ludwig
- Institute of Bioinformatics, The University of Georgia, Athens, GA, 30602, USA.
- J Mrázek
- Department of Microbiology and Institute of Bioinformatics, The University of Georgia, Athens, GA, 30602, USA
4
McCartney N, Kondakath G, Tai A, Trimmer BA. Functional annotation of insecta transcriptomes: A cautionary tale from Lepidoptera. Insect Biochemistry and Molecular Biology 2024; 165:104038. [PMID: 37952902 DOI: 10.1016/j.ibmb.2023.104038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2023] [Revised: 10/30/2023] [Accepted: 11/07/2023] [Indexed: 11/14/2023]
Abstract
Functional annotation is a critical step in the analysis of genomic data, as it provides insight into the function of individual genes and the pathways in which they participate. Currently, there is no consensus on the best computational approach for assigning functional annotation. This study compares three functional annotation methods (BLAST, eggNOG-Mapper, and InterProScan) in their ability to assign Gene Ontology terms in two species of Insecta with differing levels of annotation, Bombyx mori and Manduca sexta. The methods were compared for their annotation coverage, number of term assignments, term agreement, and non-overlapping terms. Here we show that there are large discrepancies in Gene Ontology term assignment among the three computational methods, which could lead to confounding interpretations of data and non-comparable results. This study provides insight into the strengths and weaknesses of each computational method and highlights the need for more standardized methods of functional annotation.
Affiliation(s)
- Naya McCartney
- Department of Biology, Tufts University, 200 Boston Ave, Medford, MA, 02155, USA
- Gayathri Kondakath
- Department of Biology, Tufts University, 200 Boston Ave, Medford, MA, 02155, USA
- Albert Tai
- School of Medicine, Tufts University, 136 Harrison Ave, Boston, MA, 02111, USA
- Barry A Trimmer
- Department of Biology, Tufts University, 200 Boston Ave, Medford, MA, 02155, USA.
5
Hellmuth M, Stadler PF. The Theory of Gene Family Histories. Methods Mol Biol 2024; 2802:1-32. [PMID: 38819554 DOI: 10.1007/978-1-0716-3838-5_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/01/2024]
Abstract
Most genes are part of larger families of evolutionarily related genes. The history of gene families typically involves duplications and losses of genes as well as horizontal transfers into other organisms. The reconstruction of detailed gene family histories, i.e., the precise dating of evolutionary events relative to the phylogenetic tree of the underlying species, has remained a challenging topic despite its importance as a basis for detailed investigations into adaptation and functional evolution of individual members of the gene family. The identification of orthologs, moreover, is a particularly important subproblem of the more general setting considered here. In the last few years, an extensive body of mathematical results has appeared that tightly links orthology, a formal notion of best matches among genes, and horizontal gene transfer. The purpose of this chapter is to broadly outline some of the key mathematical insights and to discuss their implications for practical applications. In particular, we focus on tree-free methods, i.e., methods to infer orthology or horizontal gene transfer as well as gene trees, species trees, and reconciliations between them without using a priori knowledge of the underlying trees or statistical models for the inference of phylogenetic trees. Instead, the initial step aims to extract binary relations among genes.
Affiliation(s)
- Marc Hellmuth
- Department of Mathematics, Faculty of Science, Stockholm University, Stockholm, Sweden
- Peter F Stadler
- Bioinformatics Group, Department of Computer Science, Leipzig University, Leipzig, Germany.
- Interdisciplinary Center for Bioinformatics, Leipzig University, Leipzig, Germany.
- Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany.
- Universidad Nacional de Colombia, Bogotá, Colombia.
- Institute for Theoretical Chemistry, University of Vienna, Wien, Austria.
- Center for non-coding RNA in Technology and Health, University of Copenhagen, Frederiksberg, Denmark.
- Santa Fe Institute, Santa Fe, NM, USA.
6
Klemm P, Stadler PF, Lechner M. Proteinortho6: pseudo-reciprocal best alignment heuristic for graph-based detection of (co-)orthologs. Frontiers in Bioinformatics 2023; 3:1322477. [PMID: 38152702 PMCID: PMC10751348 DOI: 10.3389/fbinf.2023.1322477] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2023] [Accepted: 11/06/2023] [Indexed: 12/29/2023] Open
Abstract
Proteinortho is a widely used tool to predict (co-)orthologous groups of genes for any set of species. It finds application in comparative and functional genomics, phylogenomics, and evolutionary reconstructions. With a rapidly increasing number of available genomes, the demand for large-scale predictions is also growing. In this contribution, we evaluate and implement major algorithmic improvements that significantly enhance the speed of the analysis without reducing precision. Graph-based detection of (co-)orthologs is typically based on a reciprocal best alignment heuristic that requires an all vs. all comparison of proteins from all species under study. The initial identification of similar proteins is accelerated by introducing an alternative search tool along with a revised search strategy (the pseudo-reciprocal best alignment heuristic) that reduces the number of required sequence comparisons by one-half. The clustering algorithm was reworked to efficiently decompose very large clusters and accelerate processing. Proteinortho6 reduces the overall processing time by an order of magnitude compared to its predecessor while maintaining its small memory footprint and good predictive quality.
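The reciprocal best alignment idea that this abstract builds on can be sketched in a few lines. This is the plain textbook version requiring searches in both directions, not Proteinortho's implementation or its pseudo-reciprocal variant (whose details the abstract does not give); the function name, the nested-dictionary score format, and the toy scores are assumptions.

```python
def reciprocal_best_hits(scores_ab, scores_ba):
    """Return pairs (a, b) where b is the best hit of a and a is the best hit of b.

    scores_ab[a][b]: alignment score of query a (species A) against subject b
    (species B); scores_ba is the reverse search. Hypothetical input format.
    """

    def best(scores):
        # Best-scoring subject per query; ties broken alphabetically.
        return {q: max(sorted(hits), key=hits.get)
                for q, hits in scores.items() if hits}

    best_ab = best(scores_ab)
    best_ba = best(scores_ba)
    return {(a, b) for a, b in best_ab.items() if best_ba.get(b) == a}


# Toy scores; protein names and values are made up for illustration.
scores_ab = {"a1": {"b1": 200, "b2": 50}, "a2": {"b1": 180, "b2": 60}}
scores_ba = {"b1": {"a1": 210, "a2": 170}, "b2": {"a2": 65, "a1": 40}}

pairs = reciprocal_best_hits(scores_ab, scores_ba)
print(pairs)
```

In the toy data, a2 and b2 are not paired even though a2 is the best hit of b2, because the best hit of a2 is b1; this is the reciprocity requirement that makes the all vs. all comparison necessary in the plain heuristic.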
Affiliation(s)
- Paul Klemm
- Center for Synthetic Microbiology (SYNMIKRO), Philipps-Universität Marburg, Marburg, Germany
- Peter F. Stadler
- Bioinformatics Group, Institute of Computer Science and Interdisciplinary Center for Bioinformatics, Leipzig University, Leipzig, Germany
- Max-Planck-Institute for Mathematics in the Sciences, Leipzig, Germany
- Institute for Theoretical Chemistry, University of Vienna, Wien, Austria
- Facultad de Ciencias, Universidad Nacional de Colombia, Bogotá, Colombia
- Santa Fe Institute, Santa Fe, NM, United States
- Marcus Lechner
- Center for Synthetic Microbiology (SYNMIKRO), Philipps-Universität Marburg, Marburg, Germany
7
Wang C, Ran F, Zang Y, Liu L, Wang D, Min Y. Genome-wide identification and expression analysis of heat shock protein gene family in cassava. The Plant Genome 2023; 16:e20407. [PMID: 37899677 DOI: 10.1002/tpg2.20407] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2023] [Revised: 10/12/2023] [Accepted: 10/12/2023] [Indexed: 10/31/2023]
Abstract
Heat shock proteins (Hsps) are important molecular chaperones that are involved in plant growth and stress responses. However, members of the Hsp family have been poorly studied in cassava. In this study, 225 MeHsp genes were identified in the cassava genome, and their genetic structures exhibited relatively conserved features within each subfamily. The 225 MeHsp genes showed a random chromosomal distribution and included at least 74 pairs of segmentally duplicated MeHsp genes. Eleven tandemly duplicated MeHsp genes were identified. Cis-element analysis revealed the importance of MeHsps in plant adaptations to the environment. The prediction of protein interactions suggested that MeHsp70-20 may play a critical regulatory role in the interactive network. Furthermore, the expression profiles of MeHsps in different tissues and cell subsets were analyzed using bulk and single-cell transcriptomic data. Several subfamily genes exhibited unique expression patterns in the transcriptome and were selected for detailed analysis of the single-cell transcriptome. Quantitative real-time polymerase chain reaction (qRT-PCR) revealed the expression patterns of these genes under temperature stress, further supporting the prediction of cis-acting elements. This study provides valuable information for understanding the functional characteristics of MeHsp genes and the evolutionary relationships between MeHsps.
Affiliation(s)
- Changyi Wang
- Department of Biotechnology, School of Life Sciences, Hainan University, Haikou, China
- Laboratory of Biopharmaceuticals and Molecular Pharmacology, School of Pharmaceutical Sciences, Hainan University, Haikou, China
- Fangfang Ran
- Department of Biotechnology, School of Life Sciences, Hainan University, Haikou, China
- Laboratory of Biopharmaceuticals and Molecular Pharmacology, School of Pharmaceutical Sciences, Hainan University, Haikou, China
- Yuwei Zang
- Department of Biotechnology, School of Life Sciences, Hainan University, Haikou, China
- Laboratory of Biopharmaceuticals and Molecular Pharmacology, School of Pharmaceutical Sciences, Hainan University, Haikou, China
- Liangwang Liu
- Department of Biotechnology, School of Life Sciences, Hainan University, Haikou, China
- Laboratory of Biopharmaceuticals and Molecular Pharmacology, School of Pharmaceutical Sciences, Hainan University, Haikou, China
- Dayong Wang
- Laboratory of Biopharmaceuticals and Molecular Pharmacology, School of Pharmaceutical Sciences, Hainan University, Haikou, China
- Key Laboratory of Tropical Biological Resources, Hainan University, Haikou, China
- One Health Cooperative Innovation Center, Hainan University, Haikou, China
- Yi Min
- Department of Biotechnology, School of Life Sciences, Hainan University, Haikou, China
- Laboratory of Biopharmaceuticals and Molecular Pharmacology, School of Pharmaceutical Sciences, Hainan University, Haikou, China
- One Health Cooperative Innovation Center, Hainan University, Haikou, China
8
Bernot JP, Owen CL, Wolfe JM, Meland K, Olesen J, Crandall KA. Major Revisions in Pancrustacean Phylogeny and Evidence of Sensitivity to Taxon Sampling. Mol Biol Evol 2023; 40:msad175. [PMID: 37552897 PMCID: PMC10414812 DOI: 10.1093/molbev/msad175] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2022] [Revised: 06/14/2023] [Accepted: 06/19/2023] [Indexed: 08/10/2023] Open
Abstract
The clade Pancrustacea, comprising crustaceans and hexapods, is the most diverse group of animals on earth, containing over 80% of animal species and half of animal biomass. It has been the subject of several recent phylogenomic analyses, yet relationships within Pancrustacea show a notable lack of stability. Here, the phylogeny is estimated with expanded taxon sampling, particularly of malacostracans. We show small changes in taxon sampling have large impacts on phylogenetic estimation. By analyzing identical orthologs between two slightly different taxon sets, we show that the differences in the resulting topologies are due primarily to the effects of taxon sampling on the phylogenetic reconstruction method. We compare trees resulting from our phylogenomic analyses with those from the literature to explore the large tree space of pancrustacean phylogenetic hypotheses and find that statistical topology tests reject the previously published trees in favor of the maximum likelihood trees produced here. Our results reject several clades including Caridoida, Eucarida, Multicrustacea, Vericrustacea, and Syncarida. Notably, we find Copepoda nested within Allotriocarida with high support and recover a novel relationship between decapods, euphausiids, and syncarids that we refer to as the Syneucarida. With denser taxon sampling, we find Stomatopoda sister to this latter clade, which we collectively name Stomatocarida, dividing Malacostraca into three clades: Leptostraca, Peracarida, and Stomatocarida. A new Bayesian divergence time estimation is conducted using 13 vetted fossils. We review our results in the context of other pancrustacean phylogenetic hypotheses and highlight 15 key taxa to sample in future studies.
Affiliation(s)
- James P Bernot
- Department of Invertebrate Zoology, US National Museum of Natural History, Smithsonian Institution, Washington, DC, USA
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
- Christopher L Owen
- Systematic Entomology Laboratory, USDA-ARS, c/o National Museum of Natural History, Smithsonian Institution, Washington, DC, USA
- Joanna M Wolfe
- Museum of Comparative Zoology and Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA
- Kenneth Meland
- Department of Biology, University of Bergen, Bergen, Norway
- Jørgen Olesen
- Natural History Museum of Denmark, University of Copenhagen, Copenhagen, Denmark
- Keith A Crandall
- Department of Invertebrate Zoology, US National Museum of Natural History, Smithsonian Institution, Washington, DC, USA
- Department of Biostatistics and Bioinformatics, Milken Institute School of Public Health, George Washington University, Washington, DC, USA
9
Bano N, Fakhrah S, Lone RA, Mohanty CS, Bag SK. Genome-wide identification and expression analysis of the HD2 protein family and its response to drought and salt stress in Gossypium species. Frontiers in Plant Science 2023; 14:1109031. [PMID: 36860898 PMCID: PMC9968887 DOI: 10.3389/fpls.2023.1109031] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/27/2022] [Accepted: 01/26/2023] [Indexed: 06/18/2023]
Abstract
Histone deacetylase 2 (HD2) proteins play an important role in the regulation of gene expression, which supports the growth and development of plants and is crucial in responses to biotic and abiotic stresses. HD2s comprise a C2H2-type Zn2+ finger at their C-terminus and an HD2 label, deacetylation and phosphorylation sites, and NLS motifs at their N-terminus. In this study, a total of 27 HD2 members were identified, using Hidden Markov model profiles, in two diploid cotton genomes (Gossypium raimondii and Gossypium arboreum) and two tetraploid cotton genomes (Gossypium hirsutum and Gossypium barbadense). These cotton HD2 members were classified into 10 major phylogenetic groups (I-X), of which group III was found to be the largest with 13 cotton HD2 members. An evolutionary investigation showed that the expansion of HD2 members primarily occurred as a result of segmental duplication in paralogous gene pairs. Further qRT-PCR validation of nine putative genes using RNA-Seq data suggested that GhHDT3D.2 exhibits significantly higher levels of expression at 12 h, 24 h, 48 h, and 72 h of exposure to both drought and salt stress conditions compared to a control measure at 0 h. Furthermore, gene ontology, pathway, and co-expression network analyses of the GhHDT3D.2 gene affirmed its significance in drought and salt stress responses.
Affiliation(s)
- Nasreen Bano
- Council of Scientific & Industrial Research (CSIR)-National Botanical Research Institute (CSIR-NBRI), Lucknow, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
- Shafquat Fakhrah
- Council of Scientific & Industrial Research (CSIR)-National Botanical Research Institute (CSIR-NBRI), Lucknow, India
- Department of Botany, University of Lucknow, Lucknow, India
- Rayees Ahmad Lone
- Council of Scientific & Industrial Research (CSIR)-National Botanical Research Institute (CSIR-NBRI), Lucknow, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
- Chandra Sekhar Mohanty
- Council of Scientific & Industrial Research (CSIR)-National Botanical Research Institute (CSIR-NBRI), Lucknow, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
- Sumit Kumar Bag
- Council of Scientific & Industrial Research (CSIR)-National Botanical Research Institute (CSIR-NBRI), Lucknow, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
10
Graham AM, Barreto FS. Myxozoans (Cnidaria) do not Retain Key Oxygen-Sensing and Homeostasis Toolkit Genes. Genome Biol Evol 2023; 15:6989568. [PMID: 36648250 PMCID: PMC9887271 DOI: 10.1093/gbe/evad003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2022] [Revised: 01/03/2023] [Accepted: 01/09/2023] [Indexed: 01/18/2023] Open
Abstract
For aerobic organisms, both the hypoxia-inducible factor pathway and the mitochondrial genomes are key players in regulating oxygen homeostasis. Recent work has suggested that these mechanisms are not as highly conserved as previously thought, prompting more surveys across animal taxonomic levels, which would permit testing of hypotheses about the ecological conditions facilitating evolutionary loss of such genes. The Phylum Cnidaria is known to harbor wide variation in mitochondrial chromosome morphology, including an extreme example, in the Myxozoa, of mitochondrial genome loss. Because myxozoans are obligate endoparasites, frequently encountering hypoxic environments, we hypothesize that variation in environmental oxygen availability could be a key determinant in the evolution of metabolic gene networks associated with oxygen-sensing, hypoxia-response, and energy production. Here, we surveyed genomes and transcriptomes across 46 cnidarian species for the presence of HIF pathway members, as well as for an assortment of hypoxia, mitochondrial, and stress-response toolkit genes. We find that presence of the HIF pathway, as well as number of genes associated with mitochondria, hypoxia, and stress response, do not vary in parallel to mitochondrial genome morphology. More interestingly, we uncover evidence that myxozoans have lost the canonical HIF pathway repression machinery, potentially altering HIF pathway functionality to work under the specific conditions of their parasitic lifestyles. In addition, relative to other cnidarians, myxozoans show loss of large proportions of genes associated with the mitochondrion and involved in response to hypoxia and general stress. Our results provide additional evidence that the HIF regulatory machinery is evolutionarily labile and that variations in the canonical system have evolved in many animal groups.
Affiliation(s)
- Felipe S Barreto
- Department of Integrative Biology, Oregon State University, Corvallis, Oregon
11
Duan G, Wu G, Chen X, Tian D, Li Z, Sun Y, Du Z, Hao L, Song S, Gao Y, Xiao J, Zhang Z, Bao Y, Tang B, Zhao W. HGD: an integrated homologous gene database across multiple species. Nucleic Acids Res 2022; 51:D994-D1002. [PMID: 36318261 PMCID: PMC9825607 DOI: 10.1093/nar/gkac970] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2022] [Revised: 09/28/2022] [Accepted: 10/17/2022] [Indexed: 11/06/2022] Open
Abstract
Homology is fundamental to infer genes' evolutionary processes and relationships with shared ancestry. Existing homolog gene resources vary in terms of inferring methods, homologous relationship and identifiers, posing inevitable difficulties for choosing and mapping homology results from one to another. Here, we present HGD (Homologous Gene Database, https://ngdc.cncb.ac.cn/hgd), a comprehensive homologs resource integrating multi-species, multi-resources and multi-omics, as a complement to existing resources providing public and one-stop data service. Currently, HGD houses a total of 112 383 644 homologous pairs for 37 species, including 19 animals, 16 plants and 2 microorganisms. Meanwhile, HGD integrates various annotations from public resources, including 16 909 homologs with traits, 276 670 homologs with variants, 398 573 homologs with expression and 536 852 homologs with gene ontology (GO) annotations. HGD provides a wide range of omics gene function annotations to help users gain a deeper understanding of gene function.
Affiliation(s)
- Xiaoning Chen
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Dongmei Tian
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- Zhaohua Li
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Yanling Sun
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- Zhenglin Du
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- Lili Hao
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- Shuhui Song
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Yuan Gao
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Jingfa Xiao
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Zhang Zhang
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Yiming Bao
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Bixia Tang
- Correspondence may also be addressed to Bixia Tang.
- Wenming Zhao
- To whom correspondence should be addressed. Tel: +86 1084097636; Fax: +86 1084097720;
12
Hajibarat Z, Saidi A, Gorji AM, Zeinalabedini M, Ghaffari MR, Hajibarat Z, Nasrollahi A. Identification of myosin genes and their expression in response to biotic (PVY, PVX, PVS, and PVA) and abiotic (Drought, Heat, Cold, and High-light) stress conditions in potato. Mol Biol Rep 2022; 49:11983-11996. [PMID: 36271979 DOI: 10.1007/s11033-022-08007-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2022] [Accepted: 10/04/2022] [Indexed: 10/24/2022]
Abstract
BACKGROUND Plant organelles are highly motile where their movement is significant for fast distribution of material around the cell, facilitation of the plant's ability to respond to abiotic and biotic signals, and for appropriate growth. Abiotic and biotic stresses are among the major factors limiting crop yields, and biological membranes are the first target of these stresses. Plants utilize adaptive mechanisms namely myosin to repair injured membranes following exposure to abiotic and biotic stresses. OBJECTIVE Due to the economic importance and cultivation of potato grown under abiotic and biotic stress prone areas, identification and characterization of myosin family members in potato were performed in the present research. METHODS To identify the myosin genes in potato, we performed genome-wide analysis of myosin genes in the S. tuberosum genome using the phytozome. All putative sequences were approved with the interproscan. Bioinformatics analysis was conducted using phylogenetic tree, gene structure, cis-regulatory elements, protein-protein interaction, and gene expression. RESULT The majority of the cell machinery contain actin cytoskeleton and myosins, where motility of organelles are dependent on them. Homology-based analysis was applied to determine seven myosin genes in the potato genome. The members of myosin could be categorized into two groups (XI and VIII). Some of myosin proteins were sub-cellularly located in the nucleus containing 71.5% of myosin proteins and other myosin proteins were localized in the mitochondria, plasma-membrane, and cytoplasm. Determination of co-expressed network, promoter analysis, and gene structure were also performed and gene expression pattern of each gene was surveyed. Number of introns in the gene family members varied from 1 to 39. 
Gene expression analysis demonstrated that StMyoXI-B and StMyoVIII-2 had the highest transcripts, induced by biotic and abiotic stresses in all three tissues of stem, root, and leaves, respectively. Overall, different cis-elements including abiotic and biotic responsive, hormonal responsive, light responsive, defense responsive elements were found in the myosin promoter sequences. Among the cis-elements, the MYB, G-box, ABRE, JA, and SA contributed the most in the plant growth and development, and in response to abiotic and biotic stress conditions. CONCLUSION Our results showed that myosin genes can be utilized in breeding programs and genetic engineering of plants with the aim of increasing tolerance to abiotic and biotic stresses, especially to viral stresses such as PVY, PVX, PVA, PVS, high light, drought, cold and heat.
Affiliation(s)
- Zahra Hajibarat
- Department of Plant Sciences and Biotechnology, Faculty of Life Sciences and Biotechnology, Shahid Beheshti University, Tehran, Iran
- Abbas Saidi
- Department of Plant Sciences and Biotechnology, Faculty of Life Sciences and Biotechnology, Shahid Beheshti University, Tehran, Iran
- Ahmad Mosuapour Gorji
- Department of Vegetable Research, Seed and Plant Improvement Institute (SPII), Agricultural Research, Education and Extension Organization (AREEO), Karaj, Iran
- Mehrshad Zeinalabedini
- Agricultural Biotechnology Research Institute of Iran, Agricultural Research, Education and Extension Organization (AREEO), Karaj, Iran
- Mohammad Reza Ghaffari
- Agricultural Biotechnology Research Institute of Iran, Agricultural Research, Education and Extension Organization (AREEO), Karaj, Iran
- Zohreh Hajibarat
- Department of Plant Sciences and Biotechnology, Faculty of Life Sciences and Biotechnology, Shahid Beheshti University, Tehran, Iran
- Ali Nasrollahi
- Department of Vegetable Research, Seed and Plant Improvement Institute (SPII), Agricultural Research, Education and Extension Organization (AREEO), Karaj, Iran
13
BuscoPhylo: a webserver for Busco-based phylogenomic analysis for non-specialists. Sci Rep 2022; 12:17352. [PMID: 36253435 PMCID: PMC9576783 DOI: 10.1038/s41598-022-22461-0] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 08/25/2022] [Accepted: 10/14/2022] [Indexed: 01/10/2023] Open
Abstract
Here we present the BuscoPhylo tool, which enables both students and established scientists to easily perform Busco-based phylogenomic analysis starting from a set of genome sequences. BuscoPhylo is an efficient and user-friendly web server freely accessible at https://buscophylo.inra.org.ma/. The source code, along with documentation, is freely available under an MIT license at https://github.com/alaesahbou/BuscoPhylo.
14
Escorcia-Rodríguez JM, Esposito M, Freyre-González JA, Moreno-Hagelsieb G. Non-synonymous to synonymous substitutions suggest that orthologs tend to keep their functions, while paralogs are a source of functional novelty. PeerJ 2022; 10:e13843. [PMID: 36065404 PMCID: PMC9440661 DOI: 10.7717/peerj.13843] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/25/2020] [Accepted: 07/14/2022] [Indexed: 01/18/2023] Open
Abstract
Orthologs separate after lineages split from each other and paralogs after gene duplications. Thus, orthologs are expected to remain more functionally coherent across lineages, while paralogs have been proposed as a source of new functions. Because protein functional divergence follows from non-synonymous substitutions, we performed an analysis based on the ratio of non-synonymous to synonymous substitutions (dN/dS) as a proxy for functional divergence. We used five working definitions of orthology, including reciprocal best hits (RBH), among other definitions based on network analyses and clustering. The results showed that orthologs, by all definitions tested, had values of dN/dS noticeably lower than those of paralogs, suggesting that orthologs generally tend to be more functionally stable than paralogs. The differences in dN/dS ratios remained, still suggesting the functional stability of orthologs, after eliminating gene comparisons with potential problems, such as genes with high codon usage biases, low coverage of either of the aligned sequences, or sequences with very high similarities. Separation by percent identity of the encoded proteins showed that the differences between the dN/dS ratios of orthologs and paralogs were more evident at high sequence identity and less so as identity dropped. These last results suggest that the differences between dN/dS ratios were partially related to differences in protein identity. However, they also suggested that paralogs undergo functional divergence relatively early after duplication. Our analyses indicate that choosing orthologs as probably functionally coherent remains the right approach in comparative genomics.
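As a rough illustration of the reciprocal best hits (RBH) definition of orthology named in the abstract above, the following Python sketch pairs genes from two genomes when each is the other's top-scoring hit. The score dictionaries, gene identifiers, and function names are hypothetical and chosen only for illustration; the paper's actual pipeline and scoring are not described in this listing.

```python
from typing import Dict, List, Tuple

# Hypothetical input type: (query gene, subject gene) -> alignment score.
Scores = Dict[Tuple[str, str], float]


def best_hits(scores: Scores) -> Dict[str, str]:
    """Map each query gene to its single highest-scoring subject gene."""
    best: Dict[str, Tuple[str, float]] = {}
    for (query, subject), score in scores.items():
        # Keep the subject with the strictly highest score seen so far.
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b: Scores, b_vs_a: Scores) -> List[Tuple[str, str]]:
    """Return (a, b) pairs where a's best hit is b and b's best hit is a."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Toy, made-up scores for two genomes A and B (assumption, not real data).
a_vs_b = {("a1", "b1"): 200.0, ("a1", "b2"): 50.0,
          ("a2", "b2"): 90.0, ("a3", "b1"): 120.0}
b_vs_a = {("b1", "a1"): 195.0, ("b1", "a3"): 110.0, ("b2", "a2"): 88.0}

rbh_pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(rbh_pairs)
```

In this toy case, a3 is left unpaired because its best hit, b1, has a1 as its own best hit. Ties in score are resolved here by first occurrence, which is a simplification.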
Affiliation(s)
- Juan M. Escorcia-Rodríguez
- Regulatory Systems Biology Research Group, Program of Systems Biology, Center for Genomic Sciences, Universidad Nacional Autónoma de México, Cuernavaca, Morelos, México
- Mario Esposito
- Department of Biology, Wilfrid Laurier University, Waterloo, Canada
- Julio A. Freyre-González
- Regulatory Systems Biology Research Group, Program of Systems Biology, Center for Genomic Sciences, Universidad Nacional Autónoma de México, Cuernavaca, Morelos, México
15
Effect of the PmARF6 Gene from Masson Pine (Pinus massoniana) on the Development of Arabidopsis. Genes (Basel) 2022; 13:genes13030469. [PMID: 35328022 PMCID: PMC8949783 DOI: 10.3390/genes13030469] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2022] [Revised: 03/01/2022] [Accepted: 03/03/2022] [Indexed: 11/16/2022] Open
Abstract
Masson pine (Pinus massoniana) is a core industrial tree species that is used for afforestation in southern China. Previous studies have shown that Auxin Response Factors (ARFs) are involved in the growth and development of various species, but the function of ARFs in Masson pine is unclear. In this research, we cloned and identified Masson pine ARF6 cDNA (PmARF6). The results showed that PmARF6 encodes a protein of 681 amino acids that is highly expressed in female flowers. Subcellular analysis showed that the PmARF6 protein occurred predominantly in the nucleus and cytomembrane of Masson pine cells. Compared with wild-type (WT) Arabidopsis, transgenic Arabidopsis plants overexpressing PmARF6 had fewer rosette leaves, and their flower development was slower. These results suggest that overexpression of PmARF6 may inhibit the flower and leaf development of Masson pine and provide new insights into the underlying developmental mechanism.
16
Canário Viana MV, Profeta R, Cerqueira JC, Wattam AR, Barh D, Silva A, Azevedo V. Evidence of episodic positive selection in Corynebacterium diphtheriae complex of species and its implementations in identification of drug and vaccine targets. PeerJ 2022; 10:e12662. [PMID: 35190783 PMCID: PMC8857904 DOI: 10.7717/peerj.12662] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/22/2021] [Accepted: 11/30/2021] [Indexed: 01/07/2023] Open
Abstract
BACKGROUND Within the genus Corynebacterium, six pathogenic species that can produce diphtheria toxin (C. belfantii, C. diphtheriae, C. pseudotuberculosis, C. rouxii, C. silvaticum and C. ulcerans) form a clade referred to as the C. diphtheriae complex. These species have been found in humans and other animals, causing diphtheria or other diseases. Here we show the results of a genome-scale analysis to identify positive selection in protein-coding genes that may have resulted in the adaptations of these species to their ecological niches and suggest drug and vaccine targets. METHODS Forty genomes were sampled to represent species, subspecies or biovars of Corynebacterium. Ten phylogenetic groups were tested for positive selection using the PosiGene pipeline, including species and biovars from the C. diphtheriae complex. The detected genes were tested for recombination and had their sequence alignments and homology manually examined. The final genes were investigated for their function and a probable role as vaccine or drug targets. RESULTS Nineteen genes were detected in the species C. diphtheriae (two), C. pseudotuberculosis (10), C. rouxii (one), and C. ulcerans (six). Those were found to be involved in defense, translation, energy production, and transport and in the metabolism of carbohydrates, amino acids, nucleotides, and coenzymes. Fourteen were identified as essential genes, and six as virulence factors. Thirteen of the 19 genes were identified as potential drug targets and four as potential vaccine candidates. These genes could be important in the prevention and treatment of the diseases caused by these bacteria.
Affiliation(s)
- Marcus Vinicius Canário Viana
- Departamento de Genética, Ecologia e Evolução, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil; Departamento de Genética, Universidade Federal do Pará, Belém, Pará, Brazil
- Rodrigo Profeta
- Departamento de Genética, Ecologia e Evolução, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
- Janaína Canário Cerqueira
- Departamento de Genética, Ecologia e Evolução, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
- Alice Rebecca Wattam
- Biocomplexity Institute, University of Virginia, Charlottesville, Virginia, United States
- Debmalya Barh
- Departamento de Genética, Ecologia e Evolução, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil; Institute of Integrative Omics and Applied Biotechnology, Nonakuri, West Bengal, India
- Artur Silva
- Departamento de Genética, Universidade Federal do Pará, Belém, Pará, Brazil
- Vasco Azevedo
- Departamento de Genética, Ecologia e Evolução, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
17
Hajibarat Z, Saidi A, Zeinalabedini M, Gorji AM, Ghaffari MR, Shariati V, Ahmadvand R. Genome-wide identification of StU-box gene family and assessment of their expression in developmental stages of Solanum tuberosum. Journal of Genetic Engineering and Biotechnology 2022; 20:25. [PMID: 35147812 PMCID: PMC8837765 DOI: 10.1186/s43141-022-00306-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/18/2021] [Accepted: 01/19/2022] [Indexed: 12/16/2022]
Abstract
Background Plant U-box (PUB) ubiquitin ligase genes have a highly conserved domain. However, little information is available about U-box genes in potato (Solanum tuberosum). In this study, 62 U-box genes were detected in the potato genome using bioinformatics methods. Further, motif analysis, gene structure, gene expression, TFBS, and synteny analysis were performed on the U-box genes. Results Based on in silico analysis, most of the StU-boxes included a U-box domain; however, some of them lacked harbored domain the ARM, Pkinase_Tyr, and other domains. Based on their phylogenetic relationships, the StU-box family members were categorized into four classes. Analysis of transcription factor binding sites (TFBS) in the promoter region of StU-box genes revealed that StU-box genes had the highest and the lowest number of TFBS in MYB and CSD, respectively. Moreover, based on in silico and gene expression data, variable frequencies of TFBS in StU-box genes could indicate that these genes control different developmental stages and are involved in complex regulatory mechanisms. The number of exons in U-box genes ranged from one to sixteen. For most U-box genes, the exon–intron composition and the conserved motif composition of the proteins in each group were similar. The intron–exon patterns and the composition of conserved motifs validated the phylogenetic classification of the U-box genes. Based on the results of genome distribution, StU-box genes were distributed unevenly on the 12 S. tuberosum chromosomes. The results showed that gene duplication may play a significant role in genome expansion of S. tuberosum. Furthermore, genome evolution of S. tuberosum was surveyed by identifying orthologous and paralogous genes. We identified 40 orthologous gene pairs between S. tuberosum and Solanum lycopersicum, Oryza sativa, Triticum aestivum, Gossypium hirsutum, Zea mays, Coriaria myrtifolia, and Arabidopsis thaliana, as well as eight duplicated (paralogous) genes in S. tuberosum. The StU-box 51 gene is one of the important genes among the StU-boxes in S. tuberosum under drought stress, during which it was expressed in tuber and leaf. Furthermore, the StU-box 51 gene had the highest expression levels in four tissues (stem, root, leaf, and tuber) of potato, as well as the highest number of TFBS in its promoter region. Based on our results, StU-box 51 can be suggested to researchers for use in breeding programs and genetic engineering in potato. Conclusions The results of this survey will be useful for further investigation of the probable role and molecular mechanisms of U-box genes in response to different stresses. Supplementary Information The online version contains supplementary material available at 10.1186/s43141-022-00306-7.
Affiliation(s)
- Zahra Hajibarat
- Department of Plant Sciences and Biotechnology, Faculty of Life Sciences and Biotechnology, Shahid Beheshti University, Tehran, Iran
- Abbas Saidi
- Department of Plant Sciences and Biotechnology, Faculty of Life Sciences and Biotechnology, Shahid Beheshti University, Tehran, Iran
- Mehrshad Zeinalabedini
- Department of Systems and Synthetic Biology, Agricultural Biotechnology Research Institute of Iran, Karaj, Iran; Agricultural Research, Education and Extension Organization (AREEO), Karaj, Iran
- Ahmad Mosuapour Gorji
- Department of Vegetable Research, Seed and Plant Improvement Institute (SPII), Agricultural Research, Education and Extension Organization (AREEO), Karaj, Iran
- Mohammad Reza Ghaffari
- Department of Systems and Synthetic Biology, Agricultural Biotechnology Research Institute of Iran, Karaj, Iran; Agricultural Research, Education and Extension Organization (AREEO), Karaj, Iran
- Vahid Shariati
- NIGEB Genome Center, National Institute of Genetic Engineering and Biotechnology, Tehran, Iran
- Rahim Ahmadvand
- Department of Vegetable Research, Seed and Plant Improvement Institute (SPII), Agricultural Research, Education and Extension Organization (AREEO), Karaj, Iran
18
Ji M, Sun K, Fang H, Zhuang Z, Chen H, Chen Q, Cao Z, Wang Y, Ditta A, Khan MKR, Wang K, Wang B. Genome-wide identification and characterization of the CLASP_N gene family in upland cotton (Gossypium hirsutum L.). PeerJ 2022; 10:e12733. [PMID: 35036102 PMCID: PMC8734470 DOI: 10.7717/peerj.12733] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/01/2021] [Accepted: 12/12/2021] [Indexed: 01/07/2023] Open
Abstract
BACKGROUND Cytoplasmic linker-associated proteins (CLASPs) are tubule proteins that can bind to microtubules and participate in regulating the structure and function of microtubules, which significantly affects the development and growth of plants. These proteins have been identified in Arabidopsis; however, little research has been performed in upland cotton. METHODS In this study, the whole genome of the CLASP_N family was analyzed to provide theoretical support for the function of this gene family in the development of upland cotton fiber. Bioinformatics was used to analyze the family characteristics of CLASP_N in upland cotton, such as member identification, sequence characteristics, conserved domain structure and coevolutionary relationships. Real-time fluorescent quantitative PCR (qRT-PCR) was used to clarify the expression pattern of the upland cotton CLASP_N gene family in cotton fiber. RESULTS At the genome-wide level, we identified 16 upland cotton CLASP_N genes. A chromosomal localization analysis revealed that these 16 genes were located on 13 chromosomes. The motif results showed that all CLASP_N proteins have the CLASP_N domain. Gene structure analysis showed that the structure and length of exons and introns were consistent in the subgroups. In the evolutionary analysis with other species, the gene family clearly diverged from the other species in the evolutionary process. A promoter sequence analysis showed that this gene family contains a large number of cis-acting elements related to a variety of plant hormones. qRT-PCR was used to clarify the expression pattern of the upland cotton CLASP_N gene family in cotton fiber and leaves, and Gh210800 was found to be highly expressed in the later stages of fiber development. The results of this study provide a foundation for further research on the molecular role of the CLASP_N genes in cotton fiber development.
Affiliation(s)
- Meijun Ji
- School of Life Sciences, Nantong University, Nantong, Jiangsu, China
- Kangtai Sun
- School of Life Sciences, Nantong University, Nantong, Jiangsu, China
- Hui Fang
- School of Life Sciences, Nantong University, Nantong, Jiangsu, China
- Zhimin Zhuang
- School of Life Sciences, Nantong University, Nantong, Jiangsu, China
- Haodong Chen
- Cotton Sciences Research Institute of Hunan/National Hybrid Cotton Research Promotion Center, Changde, Hunan, China
- Qi Chen
- School of Life Sciences, Nantong University, Nantong, Jiangsu, China
- Ziyi Cao
- School of Life Sciences, Nantong University, Nantong, Jiangsu, China
- Yiting Wang
- School of Life Sciences, Nantong University, Nantong, Jiangsu, China
- Allah Ditta
- Plant Breeding and Genetics Division, Nuclear Institute for Agriculture and Biology, Faisalabad, Pakistan
- Muhammad Kashif Riaz Khan
- Plant Breeding and Genetics Division, Nuclear Institute for Agriculture and Biology, Faisalabad, Pakistan
- Kai Wang
- School of Life Sciences, Nantong University, Nantong, Jiangsu, China
- Baohua Wang
- School of Life Sciences, Nantong University, Nantong, Jiangsu, China
19
Vazquez JM, Pena MT, Muhammad B, Kraft M, Adams LB, Lynch VJ. Parallel evolution of reduced cancer risk and tumor suppressor duplications in Xenarthra. eLife 2022; 11:82558. [PMID: 36480266 PMCID: PMC9810328 DOI: 10.7554/elife.82558] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2022] [Accepted: 12/07/2022] [Indexed: 12/14/2022] Open
Abstract
The risk of developing cancer is correlated with body size and lifespan within species, but there is no correlation between cancer and either body size or lifespan between species, indicating that large, long-lived species have evolved enhanced cancer protection mechanisms. Previously we showed that several large-bodied Afrotherian lineages evolved reduced intrinsic cancer risk, particularly elephants and their extinct relatives (Proboscideans), coincident with pervasive duplication of tumor suppressor genes (Vazquez and Lynch, 2021). Unexpectedly, we also found that Xenarthrans (sloths, armadillos, and anteaters) evolved very low intrinsic cancer risk. Here, we show that: (1) several Xenarthran lineages independently evolved large bodies, long lifespans, and reduced intrinsic cancer risk; (2) the reduced cancer risk in the stem lineages of Xenarthra and Pilosa coincided with bursts of tumor suppressor gene duplications; (3) cells from sloths proliferate extremely slowly while Xenarthran cells induce apoptosis at very low doses of DNA damaging agents; and (4) the prevalence of cancer is extremely low in Xenarthrans, and cancer is nearly absent from armadillos. These data implicate the duplication of tumor suppressor genes in the evolution of remarkably large body sizes and decreased cancer risk in Xenarthrans and suggest they are a remarkably cancer-resistant group of mammals.
Affiliation(s)
- Juan Manuel Vazquez
- Department of Integrative Biology, Valley Life Sciences, University of California, Berkeley, Berkeley, United States
- Maria T Pena
- United States Department of Health and Human Services, Health Resources and Services Administration, Health Systems Bureau, National Hansen's Disease Program, Baton Rouge, United States
- Baaqeyah Muhammad
- Department of Biological Sciences, University at Buffalo, SUNY, Buffalo, United States
- Morgan Kraft
- Department of Biological Sciences, University at Buffalo, SUNY, Buffalo, United States
- Linda B Adams
- United States Department of Health and Human Services, Health Resources and Services Administration, Health Systems Bureau, National Hansen's Disease Program, Baton Rouge, United States
- Vincent J Lynch
- Department of Biological Sciences, University at Buffalo, SUNY, Buffalo, United States
20
Amaral DT, Romeiro-Brito M, Bonatelli IAS. Exploring Phylogenetic Relationships and Divergence Times of Bioluminescent Species Using Genomic and Transcriptomic Data. Methods Mol Biol 2022; 2525:409-423. [PMID: 35836087 DOI: 10.1007/978-1-0716-2473-9_32] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Next-generation sequencing (NGS) has dominated the scene of genomics and evolutionary biology as a great amount of genomic data have been accumulated for a diverse set of species. At the same time, phylogenetic approaches and programs are in development to allow better use of such large-size datasets. Phylogenomics appears as a promising field to accommodate and explore all the information of NGS data in phylogenetic methods, being an important approach to investigate the evolution of bioluminescence in different organisms. To guarantee accurate results in phylogenomic studies, it is mandatory to correctly identify orthologous genes in phylogenetic reconstruction. Here, we show a simplified step-by-step framework to perform phylogenetic analysis along with divergence time estimation, beginning with an orthologous search. As empirical data, we exemplify transcriptome sequences of six species of the Elateroidea superfamily (Coleoptera). We introduce several bioinformatics tools for handling genomic data, especially those available in the software OrthoFinder, IQTREE, BEAST2, and TreePL.
Affiliation(s)
- Danilo T Amaral
- Departamento de Biologia, Centro de Ciências Humanas e Biológicas, Universidade Federal de São Carlos (UFSCar), Sorocaba, Brazil
- Programa de Pós Graduação em Biologia Comparada, Faculdade de Filosofia, Ciências e Letras de Ribeirão Preto, Universidade de São Paulo (USP), Ribeirão Preto, Brazil
- Monique Romeiro-Brito
- Departamento de Biologia, Centro de Ciências Humanas e Biológicas, Universidade Federal de São Carlos (UFSCar), Sorocaba, Brazil
- Isabel A S Bonatelli
- Departamento de Ecologia e Biologia Evolutiva, Universidade Federal de São Paulo (UNIFESP), Diadema, São Paulo, Brazil
21
Ali F, Li Y, Li F, Wang Z. Genome-wide characterization and expression analysis of cystathionine β-synthase genes in plant development and abiotic stresses of cotton (Gossypium spp.). Int J Biol Macromol 2021; 193:823-837. [PMID: 34687765 DOI: 10.1016/j.ijbiomac.2021.10.079] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/17/2021] [Revised: 10/09/2021] [Accepted: 10/11/2021] [Indexed: 11/20/2022]
Abstract
Cystathionine β-synthase (CBS) domain-containing proteins (CDCPs) form a large family and play roles in development via regulation of the thioredoxin system, as well as in abiotic and biotic stress responses of plants. However, a comprehensive study of CBS genes in cotton has been lacking. Here, we identified 237 CBS genes in 11 plant species, and the phylogenetic analysis categorized CBS genes into four groups. Whole-genome or segmental duplications, together with dispersed duplication events, contributed to GhCBS gene family expansion. Moreover, orthologous/paralogous genes among three cotton species (G. hirsutum, G. arboreum, and G. raimondii) were detected from the syntenic map among eight plant species. Strong purifying selection for dicotyledonous and monocotyledonous CBS genes, and cis-elements related to plant growth and development and to abiotic and hormonal responses, were observed. Transcriptomic data and qRT-PCR validation of 12 GhCBS genes indicated their critical role in ovule development, as most of the genes showed high enrichment. Further, some GhCBS genes (GhCBS5, GhCBS16, GhCBS17, GhCBS24, GhCBS25, GhCBS26, and GhCBS52) were regulated under various abiotic and hormonal treatments at different time points and are involved in ovule and fiber development, providing key genes for future cotton breeding programs. In addition, transgenic tobacco plants transiently overexpressing GhCBS4 exhibited higher water and chlorophyll content, indicating improved tolerance toward drought stress. Overall, this study provides a characterization of GhCBS genes in plant growth and in abiotic and hormonal stresses, thereby indicating their significance in cotton molecular breeding for resistant cultivars.
Affiliation(s)
- Faiza Ali
- Zhengzhou Research Base, State Key Laboratory of Cotton Biology, Zhengzhou University, 450001 Zhengzhou, China
- Yonghui Li
- Zhengzhou Research Base, State Key Laboratory of Cotton Biology, Zhengzhou University, 450001 Zhengzhou, China
- Fuguang Li
- Zhengzhou Research Base, State Key Laboratory of Cotton Biology, Zhengzhou University, 450001 Zhengzhou, China; State Key Laboratory of Cotton Biology, Key Laboratory of Biological and Genetic Breeding of Cotton, Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang 455000, China.
- Zhi Wang
- Zhengzhou Research Base, State Key Laboratory of Cotton Biology, Zhengzhou University, 450001 Zhengzhou, China; State Key Laboratory of Cotton Biology, Key Laboratory of Biological and Genetic Breeding of Cotton, Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang 455000, China.
|
22
|
Wollenberg Valero KC, Garcia-Porta J, Irisarri I, Feugere L, Bates A, Kirchhof S, Jovanović Glavaš O, Pafilis P, Samuel SF, Müller J, Vences M, Turner AP, Beltran-Alvarez P, Storey KB. Functional genomics of abiotic environmental adaptation in lacertid lizards and other vertebrates. J Anim Ecol 2021; 91:1163-1179. [PMID: 34695234 DOI: 10.1111/1365-2656.13617] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/05/2021] [Accepted: 09/27/2021] [Indexed: 11/27/2022]
Abstract
Understanding the genomic basis of adaptation to different abiotic environments is important in the context of climate change and resulting short-term environmental fluctuations. Using functional and comparative genomics approaches, we here investigated whether signatures of genomic adaptation to a set of environmental parameters are concentrated in specific subsets of genes and functions in lacertid lizards and other vertebrates. We first identify 200 genes with signatures of positive diversifying selection from transcriptomes of 24 species of lacertid lizards and demonstrate their involvement in physiological and morphological adaptations to climate. To understand how functionally similar these genes are to previously predicted candidate functions for climate adaptation and to compare them with other vertebrate species, we then performed a meta-analysis of 1,100 genes under selection obtained from -omics studies in vertebrate species adapted to different abiotic factors. We found that the vertebrate gene set formed a tightly connected interactome, which was to 23% enriched in previously predicted functions of adaptation to climate, and to a large part (18%) involved in organismal stress response. We found a much higher degree of identical genes being repeatedly selected among different animal groups (43.6%), and of functional similarity and post-translational modifications than expected by chance, and no clear functional division between genes used for ectotherm and endotherm physiological strategies. In total, 171 out of 200 genes of Lacertidae were part of this network. These results highlight an important role of a comparatively small set of genes and their functions in environmental adaptation and narrow the set of candidate pathways and markers to be used in future research on adaptation and stress response related to climate change.
Affiliation(s)
- Joan Garcia-Porta
- Department of Biology, Washington University in St. Louis, St. Louis, MO, USA
- Iker Irisarri
- Department of Applied Bioinformatics, Institute for Microbiology and Genetics, University of Göttingen, Göttingen, Germany; Campus Institut Data Science (CIDAS), Göttingen, Germany
- Lauric Feugere
- Department of Biological and Marine Sciences, University of Hull, Kingston-Upon-Hull, UK
- Adam Bates
- Department of Biological and Marine Sciences, University of Hull, Kingston-Upon-Hull, UK
- Sebastian Kirchhof
- Museum für Naturkunde, Leibniz Institute for Evolution and Biodiversity Science, Berlin, Germany; New York University Abu Dhabi, Abu Dhabi, United Arab Emirates
- Panayiotis Pafilis
- Section of Zoology and Marine Biology, Department of Biology, National and Kapodistrian University of Athens, Athens, Greece
- Sabrina F Samuel
- Department of Biomedical Sciences, University of Hull, Kingston-Upon-Hull, UK
- Johannes Müller
- Museum für Naturkunde, Leibniz Institute for Evolution and Biodiversity Science, Berlin, Germany
- Miguel Vences
- Zoological Institute, Braunschweig University of Technology, Braunschweig, Germany
- Alexander P Turner
- Department of Computer Science, University of Nottingham, Nottingham, UK
|
23
|
Genome wide identification of StKNOX gene family and characterization of their expression in Solanum tuberosum. BIOCATALYSIS AND AGRICULTURAL BIOTECHNOLOGY 2021. [DOI: 10.1016/j.bcab.2021.102160] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/20/2022]
|
24
|
Wang R, Liu L, Kong Z, Li S, Lu L, Chen G, Zhang J, Qanmber G, Liu Z. Identification of GhLOG gene family revealed that GhLOG3 is involved in regulating salinity tolerance in cotton (Gossypium hirsutum L.). PLANT PHYSIOLOGY AND BIOCHEMISTRY : PPB 2021; 166:328-340. [PMID: 34147725 DOI: 10.1016/j.plaphy.2021.06.011] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 03/09/2021] [Accepted: 06/08/2021] [Indexed: 06/12/2023]
Abstract
Cytokinin (CK) is an important plant hormone that promotes plant cell division and differentiation, and participates in salt response under osmotic stress. LOGs (LONELY GUY) are CK-activating enzymes involved in CK synthesis. The LOG gene family has not been comprehensively characterized in cotton. In this study we identified 151 LOG genes from nine plant species, including 28 LOG genes in Gossypium hirsutum. Phylogenetic analysis divided LOG genes into three groups. Exon/intron structures and protein motifs of GhLOG genes were highly conserved. Synteny analysis revealed that several gene loci were highly conserved between the A and D sub-genomes of G. hirsutum with purifying selection pressure during evolution. Expression profiles showed that most LOG genes were constitutively expressed in eight different tissues. Furthermore, LOG genes can be regulated by abiotic stresses and phytohormone treatments. Moreover, subcellular localization revealed that GhLOG3_At resides inside the cell membrane. Overexpression of GhLOG3 enhanced salt tolerance in Arabidopsis. Virus-induced gene silencing (VIGS) of GhLOG3_At in cotton enhanced sensitivity of plants to salt stress with increased H2O2 contents and decreased chlorophyll and proline (PRO) activity. Our results suggested that GhLOG3_At induces salt stress tolerance in cotton, and provides a basis for the use of CK synthesis genes to regulate cotton growth and stress resistance.
Affiliation(s)
- Rong Wang
- Zhengzhou Research Base, State Key Laboratory of Cotton Biology, Zhengzhou University, Zhengzhou, 450001, Henan, China.
- Le Liu
- Zhengzhou Research Base, State Key Laboratory of Cotton Biology, Zhengzhou University, Zhengzhou, 450001, Henan, China.
- Zhaosheng Kong
- Zhengzhou Research Base, State Key Laboratory of Cotton Biology, Zhengzhou University, Zhengzhou, 450001, Henan, China; State Key Laboratory of Plant Genomics, Institute of Microbiology, Academy of Seed Design, Chinese Academy of Sciences, Beijing, China.
- Shengdong Li
- Zhengzhou Research Base, State Key Laboratory of Cotton Biology, Zhengzhou University, Zhengzhou, 450001, Henan, China.
- Lili Lu
- State Key Laboratory of Cotton Biology, Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang, 455000, Henan, China.
- Guoquan Chen
- Zhengzhou Research Base, State Key Laboratory of Cotton Biology, Zhengzhou University, Zhengzhou, 450001, Henan, China.
- Jiaxin Zhang
- Saint John Paul the Great Catholic High School, Dumfries, VA, 22172, USA.
- Ghulam Qanmber
- State Key Laboratory of Cotton Biology, Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang, 455000, Henan, China.
- Zhao Liu
- Zhengzhou Research Base, State Key Laboratory of Cotton Biology, Zhengzhou University, Zhengzhou, 450001, Henan, China.
|
25
|
Saidi A, Hajibarat Z, Hajibarat Z. Phylogeny, gene structure and GATA genes expression in different tissues of solanaceae species. BIOCATALYSIS AND AGRICULTURAL BIOTECHNOLOGY 2021. [DOI: 10.1016/j.bcab.2021.102015] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 01/22/2023]
|
26
|
Wang J, Sheng J, Zhu J, Hu Z, Diao Y. Comparative transcriptome analysis and identification of candidate adaptive evolution genes of Miscanthus lutarioriparius and Miscanthus sacchariflorus. PHYSIOLOGY AND MOLECULAR BIOLOGY OF PLANTS : AN INTERNATIONAL JOURNAL OF FUNCTIONAL PLANT BIOLOGY 2021; 27:1499-1512. [PMID: 34366592 PMCID: PMC8295449 DOI: 10.1007/s12298-021-01030-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2021] [Revised: 06/24/2021] [Accepted: 06/27/2021] [Indexed: 06/13/2023]
Abstract
Miscanthus species are perennial C4 grasses that are considered promising energy crops because of their high biomass yields, excellent adaptability and low management costs. Miscanthus lutarioriparius and Miscanthus sacchariflorus are closely related subspecies that are distributed in different habitats. However, there are only a few reports on the mechanisms by which Miscanthus adapts to different environments. Here, comparative transcriptomic and morphological analyses were used to study the evolutionary adaptation of M. lutarioriparius and M. sacchariflorus to different habitats. In total, among 7586 identified orthologs, 2060 orthologs involved in phenylpropanoid biosynthesis and plant hormones were differentially expressed between the two species. Through an analysis of the Ka/Ks ratios of the orthologs, we estimated that the divergence time between the two species was approximately 4.37 Mya. In addition, 37 candidate positively selected orthologs (PSGs) that played important roles in the adaptation of these species to different habitats were identified. Then, the expression levels of 20 PSGs in response to flooding and drought stress were analyzed, and the analysis revealed significant changes in their expression levels. These results facilitate our understanding of the evolutionary adaptation to habitats and the speciation of M. lutarioriparius and M. sacchariflorus. We hypothesise that lignin synthesis genes are the main cause of the morphological differences between the two species. In summary, the plant nonspecific phospholipase C gene family and the receptor-like protein kinase gene family played important roles in the evolution of these two species. Supplementary information: The online version contains supplementary material available at 10.1007/s12298-021-01030-1.
Affiliation(s)
- Jia Wang
- School of Medicine, Anhui University of Science and Technology, Huainan, 232001 People’s Republic of China
- Jiajing Sheng
- College of Life Sciences, Nantong University, Nantong, 226019 People’s Republic of China
- Jianyong Zhu
- College of Forestry and Life Sciences, Chongqing University of Arts and Sciences, Chongqing, 402160 People’s Republic of China
- Zhongli Hu
- State Key Laboratory of Hybrid Rice, College of Life Sciences, Hubei Lotus Engineering Center, Wuhan University, Wuhan, 430072 People’s Republic of China
- Ying Diao
- School of Life Science and Technology, Wuhan Polytechnic University, Wuhan, 430023 People’s Republic of China
|
27
|
Harris CD, Torrance EL, Raymann K, Bobay LM. CoreCruncher: Fast and Robust Construction of Core Genomes in Large Prokaryotic Data Sets. Mol Biol Evol 2021; 38:727-734. [PMID: 32886787 PMCID: PMC7826169 DOI: 10.1093/molbev/msaa224] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 01/03/2023] Open
Abstract
The core genome represents the set of genes shared by all, or nearly all, strains of a given population or species of prokaryotes. Inferring the core genome is integral to many genomic analyses, however, most methods rely on the comparison of all the pairs of genomes; a step that is becoming increasingly difficult given the massive accumulation of genomic data. Here, we present CoreCruncher; a program that robustly and rapidly constructs core genomes across hundreds or thousands of genomes. CoreCruncher does not compute all pairwise genome comparisons and uses a heuristic based on the distributions of identity scores to classify sequences as orthologs or paralogs/xenologs. Although it is much faster than current methods, our results indicate that our approach is more conservative than other tools and less sensitive to the presence of paralogs and xenologs. CoreCruncher is freely available from: https://github.com/lbobay/CoreCruncher. CoreCruncher is written in Python 3.7 and can also run on Python 2.7 without modification. It requires the python library Numpy and either Usearch or Blast. Certain options require the programs muscle or mafft.
Affiliation(s)
- Connor D Harris
- Department of Biology, University of North Carolina Greensboro, Greensboro, NC
- Ellis L Torrance
- Department of Biology, University of North Carolina Greensboro, Greensboro, NC
- Kasie Raymann
- Department of Biology, University of North Carolina Greensboro, Greensboro, NC
- Louis-Marie Bobay
- Department of Biology, University of North Carolina Greensboro, Greensboro, NC
|
28
|
GenOrigin: A comprehensive protein-coding gene origination database on the evolutionary timescale of life. J Genet Genomics 2021; 48:1122-1129. [PMID: 34538772 DOI: 10.1016/j.jgg.2021.03.018] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 03/02/2021] [Revised: 03/21/2021] [Accepted: 03/29/2021] [Indexed: 11/20/2022]
Abstract
The origination of new genes contributes to the biological diversity of life. New genes may quickly build their network, exert important functions, and generate novel phenotypes. Dating gene age and inferring the origination mechanisms of new genes, like primate-specific genes, is the basis for the functional study of the genes. However, no comprehensive resource of gene age estimates across species is available. Here, we systematically date the age of 9,102,113 protein-coding genes from 565 species in the Ensembl and Ensembl Genomes databases, including 82 bacteria, 57 protists, 134 fungi, 58 plants, 56 metazoa, and 178 vertebrates, using a protein-family-based pipeline with Wagner parsimony algorithm. We also collect gene age estimate data from other studies and uniformly distribute the gene age estimates to time ranges in a million years for comparison across studies. All the data are cataloged into GenOrigin (http://genorigin.chenzxlab.cn/), a user-friendly new database of gene age estimates, where users can browse gene age estimates by species, age, and gene ontology. In GenOrigin, the information such as gene age estimates, annotation, gene ontology, ortholog, and paralog, as well as detailed gene presence/absence views for gene age inference based on the species tree with evolutionary timescale, is provided to researchers for exploring gene functions.
|
29
|
Glover N, Sheppard S, Dessimoz C. Homoeolog Inference Methods Requiring Bidirectional Best Hits or Synteny Miss Many Pairs. Genome Biol Evol 2021; 13:6237894. [PMID: 33871639 PMCID: PMC8214411 DOI: 10.1093/gbe/evab077] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Accepted: 04/12/2021] [Indexed: 12/22/2022] Open
Abstract
Homoeologs are pairs of genes or chromosomes in the same species that originated by speciation and were brought back together in the same genome by allopolyploidization. Bioinformatic methods for accurate homoeology inference are crucial for studying the evolutionary consequences of polyploidization, and homoeology is typically inferred on the basis of bidirectional best hit (BBH) and/or positional conservation (synteny). However, these methods neglect the fact that genes can duplicate and move, both prior to and after the allopolyploidization event. These duplications and movements can result in many-to-many and/or nonsyntenic homoeologs, which thus remain undetected and unstudied. Here, using the allotetraploid upland cotton (Gossypium hirsutum) as a case study, we show that conventional approaches indeed miss a substantial proportion of homoeologs. Additionally, we found that many of the missed pairs of homoeologs are broadly and highly expressed. A gene ontology analysis revealed that a high proportion of the nonsyntenic and non-BBH homoeologs are involved in protein translation and are likely to contribute to the functional repertoire of cotton. Thus, from an evolutionary and functional genomics standpoint, choosing a homoeolog inference method which does not solely rely on 1:1 relationship cardinality or synteny is crucial for not missing these potentially important homoeolog pairs.
Affiliation(s)
- Natasha Glover
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland; Center for Integrative Genomics, University of Lausanne, Switzerland; Department of Computational Biology, University of Lausanne, Switzerland
- Christophe Dessimoz
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland; Center for Integrative Genomics, University of Lausanne, Switzerland; Department of Computational Biology, University of Lausanne, Switzerland; Department of Genetics, Evolution, and Environment, University College London, United Kingdom; Department of Computer Science, University College London, United Kingdom
|
30
|
Vazquez JM, Lynch VJ. Pervasive duplication of tumor suppressors in Afrotherians during the evolution of large bodies and reduced cancer risk. eLife 2021; 10:e65041. [PMID: 33513090 PMCID: PMC7952090 DOI: 10.7554/elife.65041] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 11/19/2020] [Accepted: 01/28/2021] [Indexed: 12/11/2022] Open
Abstract
The risk of developing cancer is correlated with body size and lifespan within species. Between species, however, there is no correlation between cancer and either body size or lifespan, indicating that large, long-lived species have evolved enhanced cancer protection mechanisms. Elephants and their relatives (Proboscideans) are a particularly interesting lineage for the exploration of mechanisms underlying the evolution of augmented cancer resistance because they evolved large bodies recently within a clade of smaller-bodied species (Afrotherians). Here, we explore the contribution of gene duplication to body size and cancer risk in Afrotherians. Unexpectedly, we found that tumor suppressor duplication was pervasive in Afrotherian genomes, rather than restricted to Proboscideans. Proboscideans, however, have duplicates in unique pathways that may underlie some aspects of their remarkable anti-cancer cell biology. These data suggest that duplication of tumor suppressor genes facilitated the evolution of increased body size by compensating for decreasing intrinsic cancer risk.
Affiliation(s)
- Juan M Vazquez
- Department of Human Genetics, The University of Chicago, Chicago, United States
- Vincent J Lynch
- Department of Biological Sciences, University at Buffalo, Buffalo, United States
|
31
|
Wang J, Du Z, Huo X, Zhou J, Chen Y, Zhang J, Pan A, Wang X, Wang F, Zhang J. Genome-wide analysis of PRR gene family uncovers their roles in circadian rhythmic changes and response to drought stress in Gossypium hirsutum L. PeerJ 2020; 8:e9936. [PMID: 33033660 PMCID: PMC7521341 DOI: 10.7717/peerj.9936] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 05/14/2020] [Accepted: 08/24/2020] [Indexed: 11/24/2022] Open
Abstract
Background: The circadian clock not only participates in regulating various stages of plant growth, development and metabolism, but also confers on plants environmental adaptability to stresses such as drought. Pseudo-Response Regulators (PRRs) are important components of the central oscillator (the core of the circadian clock) and play a significant role in the plant photoperiod pathway. However, no systematic study of this gene family has been performed in cotton. Methods: PRR genes were identified in diploid and tetraploid cotton using bioinformatics methods to investigate their homology, duplication and evolutionary relationships. Differential gene expression, KEGG enrichment analysis and qRT-PCR were conducted to analyze PRR gene expression patterns under diurnal changes and their response to drought stress. Results: A total of 44 PRR family members were identified in four Gossypium species, with 16 in G. hirsutum, 10 in G. raimondii, and nine each in G. barbadense and G. arboreum. Phylogenetic analysis indicated that PRR proteins were divided into five subfamilies and that whole genome duplication or segmental duplication contributed to the expansion of the Gossypium PRR gene family. Gene structure analysis revealed that members in the same clade are similar, and multiple cis-elements related to light and drought stress response were enriched in the promoters of GhPRR genes. qRT-PCR results showed that GhPRR gene transcripts presented four expression peaks (6 h, 9 h, 12 h, 15 h) over 24 h and formed an obvious rhythmic expression trend. Transcriptome data with PEG treatment, along with qRT-PCR verification, suggested that members of clade III (GhPRR5a, b, d) and clade V (GhPRR3a and GhPRR3c) may be involved in drought response. This study provides insight into the function of PRR genes in circadian rhythm and in response to drought stress in cotton.
Affiliation(s)
- Jingjing Wang
- Key Laboratory of Cotton Breeding and Cultivation in Huang-Huai-Hai Plain, Ministry of Agriculture and Rural Affairs, Cotton Research Center, Shandong Academy of Agricultural Sciences, Jinan, P. R. China; College of Life Sciences, Shandong Normal University, Jinan, P. R. China
- Zhaohai Du
- Key Laboratory of Cotton Breeding and Cultivation in Huang-Huai-Hai Plain, Ministry of Agriculture and Rural Affairs, Cotton Research Center, Shandong Academy of Agricultural Sciences, Jinan, P. R. China
- Xuehan Huo
- Key Laboratory of Cotton Breeding and Cultivation in Huang-Huai-Hai Plain, Ministry of Agriculture and Rural Affairs, Cotton Research Center, Shandong Academy of Agricultural Sciences, Jinan, P. R. China; College of Life Sciences, Shandong Normal University, Jinan, P. R. China
- Juan Zhou
- Key Laboratory of Cotton Breeding and Cultivation in Huang-Huai-Hai Plain, Ministry of Agriculture and Rural Affairs, Cotton Research Center, Shandong Academy of Agricultural Sciences, Jinan, P. R. China
- Yu Chen
- Key Laboratory of Cotton Breeding and Cultivation in Huang-Huai-Hai Plain, Ministry of Agriculture and Rural Affairs, Cotton Research Center, Shandong Academy of Agricultural Sciences, Jinan, P. R. China
- Jingxia Zhang
- Key Laboratory of Cotton Breeding and Cultivation in Huang-Huai-Hai Plain, Ministry of Agriculture and Rural Affairs, Cotton Research Center, Shandong Academy of Agricultural Sciences, Jinan, P. R. China
- Ao Pan
- Key Laboratory of Cotton Breeding and Cultivation in Huang-Huai-Hai Plain, Ministry of Agriculture and Rural Affairs, Cotton Research Center, Shandong Academy of Agricultural Sciences, Jinan, P. R. China
- Xiaoyang Wang
- State Key Laboratory of Cotton Biology, Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang, P. R. China
- Furong Wang
- Key Laboratory of Cotton Breeding and Cultivation in Huang-Huai-Hai Plain, Ministry of Agriculture and Rural Affairs, Cotton Research Center, Shandong Academy of Agricultural Sciences, Jinan, P. R. China; College of Life Sciences, Shandong Normal University, Jinan, P. R. China
- Jun Zhang
- Key Laboratory of Cotton Breeding and Cultivation in Huang-Huai-Hai Plain, Ministry of Agriculture and Rural Affairs, Cotton Research Center, Shandong Academy of Agricultural Sciences, Jinan, P. R. China; College of Life Sciences, Shandong Normal University, Jinan, P. R. China
|
32
|
Lafond M, Hellmuth M. Reconstruction of time-consistent species trees. Algorithms Mol Biol 2020; 15:16. [PMID: 32843891 PMCID: PMC7439642 DOI: 10.1186/s13015-020-00175-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/29/2020] [Accepted: 07/25/2020] [Indexed: 02/04/2023] Open
Abstract
BACKGROUND: The history of gene families, which are equivalent to event-labeled gene trees, can to some extent be reconstructed from empirically estimated evolutionary event-relations containing pairs of orthologous, paralogous or xenologous genes. The question then arises as to whether inferred event-labeled gene trees are "biologically feasible", which is the case if one can find a species tree with which the gene tree can be reconciled in a time-consistent way. RESULTS: In this contribution, we consider event-labeled gene trees that contain speciations, duplications as well as horizontal gene transfer (HGT), and we assume that the species tree is unknown. Although many problems become NP-hard as soon as HGT and time-consistency are involved, we show, in contrast, that the problem of finding a time-consistent species tree for a given event-labeled gene tree can be solved in polynomial time. We provide a cubic-time algorithm to decide whether a "time-consistent" species tree for a given event-labeled gene tree exists and, in the affirmative case, to construct the species tree within the same time complexity.
Affiliation(s)
- Manuel Lafond
- Department of Computer Science, Université de Sherbrooke, 2500 Boul. de l’Université, Sherbrooke, J1K 2R1 Canada
- Marc Hellmuth
- School of Computing, University of Leeds, E C Stoner Building, Leeds, LS2 9JT UK
|
33
|
Christian RW, Hewitt SL, Roalson EH, Dhingra A. Genome-Scale Characterization of Predicted Plastid-Targeted Proteomes in Higher Plants. Sci Rep 2020; 10:8281. [PMID: 32427841 PMCID: PMC7237471 DOI: 10.1038/s41598-020-64670-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 06/21/2019] [Accepted: 04/20/2020] [Indexed: 12/20/2022] Open
Abstract
Plastids are morphologically and functionally diverse organelles that are dependent on nuclear-encoded, plastid-targeted proteins for all biochemical and regulatory functions. However, how plastid proteomes vary temporally, spatially, and taxonomically has been historically difficult to analyze at a genome-wide scale using experimental methods. A bioinformatics workflow was developed and evaluated using a combination of fast and user-friendly subcellular prediction programs to maximize performance and accuracy for chloroplast transit peptides and demonstrate this technique on the predicted proteomes of 15 sequenced plant genomes. Gene family grouping was then performed in parallel using modified approaches of reciprocal best BLAST hits (RBH) and UCLUST. A total of 628 protein families were found to have conserved plastid targeting across angiosperm species using RBH, and 828 using UCLUST. However, thousands of clusters were also detected where only one species had predicted plastid targeting, most notably in Panicum virgatum which had 1,458 proteins with species-unique targeting. An average of 45% overlap was found in plastid-targeted protein-coding gene families compared with Arabidopsis, but an additional 20% of proteins matched against the full Arabidopsis proteome, indicating a unique evolution of plastid targeting. Neofunctionalization through subcellular relocalization is known to impart novel biological functions but has not been described before on a genome-wide scale for the plastid proteome. Further work to correlate these predicted novel plastid-targeted proteins to transcript abundance and high-throughput proteomics will uncover unique aspects of plastid biology and shed light on how the plastid proteome has evolved to influence plastid morphology and biochemistry.
Affiliation(s)
- Ryan W Christian
- Department of Horticulture, Washington State University, Pullman, WA, USA
- Molecular Plant Sciences Program, Washington State University, Pullman, WA, USA
- Seanna L Hewitt
- Department of Horticulture, Washington State University, Pullman, WA, USA
- Molecular Plant Sciences Program, Washington State University, Pullman, WA, USA
- Eric H Roalson
- Molecular Plant Sciences Program, Washington State University, Pullman, WA, USA
- School of Biological Sciences, Washington State University, Pullman, WA, USA
- Amit Dhingra
- Department of Horticulture, Washington State University, Pullman, WA, USA.
- Molecular Plant Sciences Program, Washington State University, Pullman, WA, USA.
|
34
|
Wang X, Zhang Y, Wang L, Pan Z, He S, Gao Q, Chen B, Gong W, Du X. Casparian strip membrane domain proteins in Gossypium arboreum: genome-wide identification and negative regulation of lateral root growth. BMC Genomics 2020; 21:340. [PMID: 32366264 PMCID: PMC7199351 DOI: 10.1186/s12864-020-6723-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/19/2019] [Accepted: 04/06/2020] [Indexed: 11/28/2022] Open
Abstract
Background: Root systems are critical for plant growth and development. The Casparian strip in root systems is involved in stress resistance and maintaining homeostasis. Casparian strip membrane domain proteins (CASPs) are responsible for the formation of Casparian strips. Results: To investigate the function of CASPs in cotton, we identified and characterized 48, 54, 91 and 94 CASPs from Gossypium arboreum, Gossypium raimondii, Gossypium barbadense and Gossypium hirsutum, respectively, at the genome-wide level. However, only 29 common homologous CASP genes were detected in the four Gossypium species. A collinearity analysis revealed that whole genome duplication (WGD) was the primary reason for the expansion of the genes of the CASP family in the four cotton species. However, dispersed duplication could also contribute to the expansion of the GaCASPs gene family in the ancestors of G. arboreum. Phylogenetic analysis was used to cluster a total of 85 CASP genes from G. arboreum and Arabidopsis into six distinct groups, while the genetic structure and motifs of CASPs were conserved in the same group. Most GaCASPs were expressed in diverse tissues, with the exception of five GaCASPs (Ga08G0113, Ga08G0114, Ga08G0116, Ga08G0117 and Ga08G0118) that were highly expressed in root tissues. Analyses of the tissue and subcellular localization suggested that GaCASP27 genes (Ga08G0117) are membrane protein genes located in the root. In the GaCASP27 silenced plants and the Arabidopsis mutants, the lateral root number significantly increased. Furthermore, GaMYB36, which is related to root development, was found to regulate lateral root growth by targeting GaCASP27. Conclusions: This study provides a fundamental understanding of the CASP gene family in cotton and demonstrates the regulatory role of GaCASP27 on lateral root growth and development.
Affiliation(s)
- Xiaoyang Wang
  - State Key Laboratory of Cotton Biology, Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang, 455000, Henan, China
  - Crop Information Center, College of Plant Science and Technology, Huazhong Agricultural University, Wuhan, 430070, China
- Yuanming Zhang
  - Crop Information Center, College of Plant Science and Technology, Huazhong Agricultural University, Wuhan, 430070, China
- Liyuan Wang
  - State Key Laboratory of Cotton Biology, Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang, 455000, Henan, China
- Zhaoe Pan
  - State Key Laboratory of Cotton Biology, Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang, 455000, Henan, China
- Shoupu He
  - State Key Laboratory of Cotton Biology, Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang, 455000, Henan, China
- Qiong Gao
  - State Key Laboratory of Cotton Biology, Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang, 455000, Henan, China
- Baojun Chen
  - State Key Laboratory of Cotton Biology, Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang, 455000, Henan, China
- Wenfang Gong
  - State Key Laboratory of Cotton Biology, Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang, 455000, Henan, China
  - Key Laboratory of Cultivation and Protection for Non-Wood Forest Trees, Ministry of Education, Central South University of Forestry and Technology, Ministry of Education, Changsha, 410004, China
- Xiongming Du
  - State Key Laboratory of Cotton Biology, Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang, 455000, Henan, China
35
Galperin MY, Kristensen DM, Makarova KS, Wolf YI, Koonin EV. Microbial genome analysis: the COG approach. Brief Bioinform 2020; 20:1063-1070. [PMID: 28968633 DOI: 10.1093/bib/bbx117] [Citation(s) in RCA: 144] [Impact Index Per Article: 36.0] [Received: 05/30/2017] [Revised: 08/01/2017] [Indexed: 11/15/2022] Open
Abstract
For the past 20 years, the Clusters of Orthologous Genes (COG) database has been a popular tool for microbial genome annotation and comparative genomics. Initially created for the purpose of evolutionary classification of protein families, the COGs have been used, apart from straightforward functional annotation of sequenced genomes, for such tasks as (i) unification of genome annotation in groups of related organisms; (ii) identification of missing and/or undetected genes in complete microbial genomes; (iii) analysis of genomic neighborhoods, in many cases allowing prediction of novel functional systems; (iv) analysis of metabolic pathways and prediction of alternative forms of enzymes; (v) comparison of organisms by COG functional categories; and (vi) prioritization of targets for structural and functional characterization. Here we review the principles of the COG approach and discuss its key advantages and drawbacks in microbial genome analysis.
36
Stadler PF, Geiß M, Schaller D, López Sánchez A, González Laffitte M, Valdivia DI, Hellmuth M, Hernández Rosales M. From pairs of most similar sequences to phylogenetic best matches. Algorithms Mol Biol 2020; 15:5. [PMID: 32308731 PMCID: PMC7147060 DOI: 10.1186/s13015-020-00165-2] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 09/16/2019] [Accepted: 03/26/2020] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Many of the commonly used methods for orthology detection start from mutually most similar pairs of genes (reciprocal best hits) as an approximation for evolutionarily most closely related pairs of genes (reciprocal best matches). This approximation of best matches by best hits becomes exact for ultrametric dissimilarities, i.e., under the Molecular Clock Hypothesis. It fails, however, whenever there are large lineage-specific rate variations among paralogous genes. In practice, this introduces a high level of noise into the input data for best-hit-based orthology detection methods. RESULTS If additive distances between genes are known, then evolutionarily most closely related pairs can be identified by considering certain quartets of genes, provided that in each quartet the outgroup relative to the remaining three genes is known. A priori knowledge of the underlying species phylogeny greatly facilitates the identification of the required outgroup. Although the workflow remains a heuristic since the correct outgroup cannot be determined reliably in all cases, simulations with lineage-specific biases and rate asymmetries show that nearly perfect results can be achieved. In a realistic setting, where distance data have to be estimated from sequence data and hence are noisy, it is still possible to obtain highly accurate sets of best matches. CONCLUSION Improvements of tree-free orthology assessment methods can be expected from a combination of the accurate inference of best matches reported here and recent mathematical advances in the understanding of (reciprocal) best match graphs and orthology relations. AVAILABILITY Accompanying software is available at https://github.com/david-schaller/AsymmeTree.
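The reciprocal best hit heuristic that this abstract takes as its starting point can be illustrated with a minimal sketch. This is not the quartet-based method of the paper and not code from AsymmeTree; the function names, gene labels, and similarity scores below are hypothetical, and ties are broken arbitrarily by keeping the first best score seen.

```python
from typing import Dict, Set, Tuple

Scores = Dict[Tuple[str, str], float]  # (query gene, target gene) -> similarity score


def best_hits(scores: Scores) -> Dict[str, str]:
    """Map each query gene to its highest-scoring target gene."""
    best: Dict[str, Tuple[str, float]] = {}
    for (query, target), score in scores.items():
        # Keep the first hit seen on ties (an arbitrary choice for this sketch).
        if query not in best or score > best[query][1]:
            best[query] = (target, score)
    return {query: target for query, (target, _) in best.items()}


def reciprocal_best_hits(a_vs_b: Scores, b_vs_a: Scores) -> Set[Tuple[str, str]]:
    """Return pairs (a, b) where b is a's best hit and a is b's best hit."""
    ab = best_hits(a_vs_b)
    ba = best_hits(b_vs_a)
    return {(a, b) for a, b in ab.items() if ba.get(b) == a}


# Toy scores (made up): a2 and a3 both prefer b2, but b2 prefers a3.
a_vs_b = {("a1", "b1"): 250.0, ("a1", "b2"): 90.0,
          ("a2", "b2"): 180.0, ("a3", "b2"): 200.0}
b_vs_a = {("b1", "a1"): 245.0, ("b2", "a3"): 198.0, ("b2", "a2"): 175.0}

rbh = reciprocal_best_hits(a_vs_b, b_vs_a)
print(sorted(rbh))
```

In this toy input, a2 is left without a partner because its best hit is not reciprocated, which is the kind of case where best hits and true best matches can disagree when evolutionary rates differ between paralogs.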
Affiliation(s)
- Peter F. Stadler
  - Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Universität Leipzig, Härtelstraße 16–18, 04107 Leipzig, Germany
  - Competence Center for Scalable Data Services and Solutions Dresden/Leipzig, Interdisciplinary Center for Bioinformatics, German Centre for Integrative Biodiversity Research (iDiv), and Leipzig Research Center for Civilization Diseases, Universität Leipzig, Augustusplatz 12, 04107 Leipzig, Germany
  - Max Planck Institute for Mathematics in the Sciences, Inselstraße 22, 04103 Leipzig, Germany
  - Department of Theoretical Chemistry, University of Vienna, Währinger Straße 17, 1090 Vienna, Austria
  - Facultad de Ciencias, Universidad Nacional de Colombia, Sede Bogotá, Ciudad Universitaria, 111321 Bogotá, D.C. Colombia
  - Santa Fe Institute, 1399 Hyde Park Rd., Santa Fe, NM 87501 USA
- Manuela Geiß
  - Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Universität Leipzig, Härtelstraße 16–18, 04107 Leipzig, Germany
  - Software Competence Center Hagenberg GmbH, Softwarepark 21, 4232 Hagenberg, Austria
- David Schaller
  - Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Universität Leipzig, Härtelstraße 16–18, 04107 Leipzig, Germany
- Alitzel López Sánchez
  - CONACYT-Instituto de Matemáticas, UNAM Juriquilla, Blvd. Juriquilla 3001, 76230 Juriquilla, Querétaro, QRO México
- Marcos González Laffitte
  - CONACYT-Instituto de Matemáticas, UNAM Juriquilla, Blvd. Juriquilla 3001, 76230 Juriquilla, Querétaro, QRO México
- Dulce I. Valdivia
  - Departamento de Ingeniería Genética, Centro de Investigación y de Estudios Avanzados del IPN (CINVESTAV), Km. 9.6 Libramiento Norte Carretera Irapuato-León, 36821 Irapuato, GTO México
- Marc Hellmuth
  - School of Computing, University of Leeds, E C Stoner Building, Leeds, LS2 9JT UK
- Maribel Hernández Rosales
  - CONACYT-Instituto de Matemáticas, UNAM Juriquilla, Blvd. Juriquilla 3001, 76230 Juriquilla, Querétaro, QRO México
37
Avelar RA, Ortega JG, Tacutu R, Tyler EJ, Bennett D, Binetti P, Budovsky A, Chatsirisupachai K, Johnson E, Murray A, Shields S, Tejada-Martinez D, Thornton D, Fraifeld VE, Bishop CL, de Magalhães JP. A multidimensional systems biology analysis of cellular senescence in aging and disease. Genome Biol 2020; 21:91. [PMID: 32264951 PMCID: PMC7333371 DOI: 10.1186/s13059-020-01990-9] [Citation(s) in RCA: 155] [Impact Index Per Article: 38.8] [Received: 10/17/2019] [Accepted: 03/08/2020] [Indexed: 01/10/2023] Open
Abstract
BACKGROUND Cellular senescence, a permanent state of replicative arrest in otherwise proliferating cells, is a hallmark of aging and has been linked to aging-related diseases. Many genes play a role in cellular senescence, yet a comprehensive understanding of its pathways is still lacking. RESULTS We develop CellAge (http://genomics.senescence.info/cells), a manually curated database of 279 human genes driving cellular senescence, and perform various integrative analyses. Genes inducing cellular senescence tend to be overexpressed with age in human tissues and are significantly overrepresented in anti-longevity and tumor-suppressor genes, while genes inhibiting cellular senescence overlap with pro-longevity and oncogenes. Furthermore, cellular senescence genes are strongly conserved in mammals but not in invertebrates. We also build cellular senescence protein-protein interaction and co-expression networks. Clusters in the networks are enriched for cell cycle and immunological processes. Network topological parameters also reveal novel potential cellular senescence regulators. Using siRNAs, we observe that all 26 candidates tested induce at least one marker of senescence with 13 genes (C9orf40, CDC25A, CDCA4, CKAP2, GTF3C4, HAUS4, IMMT, MCM7, MTHFD2, MYBL2, NEK2, NIPA2, and TCEB3) decreasing cell number, activating p16/p21, and undergoing morphological changes that resemble cellular senescence. CONCLUSIONS Overall, our work provides a benchmark resource for researchers to study cellular senescence, and our systems biology analyses reveal new insights and gene regulators of cellular senescence.
Affiliation(s)
- Roberto A Avelar
  - Integrative Genomics of Ageing Group, Institute of Ageing and Chronic Disease, University of Liverpool, Liverpool, L7 8TX, UK
- Javier Gómez Ortega
  - Integrative Genomics of Ageing Group, Institute of Ageing and Chronic Disease, University of Liverpool, Liverpool, L7 8TX, UK
  - School of Biological Sciences, Monash University, Melbourne, VIC, 3800, Australia
- Robi Tacutu
  - Integrative Genomics of Ageing Group, Institute of Ageing and Chronic Disease, University of Liverpool, Liverpool, L7 8TX, UK
  - Computational Biology of Aging Group, Institute of Biochemistry, Romanian Academy, 060031, Bucharest, Romania
  - Chronos Biosystems SRL, 060117, Bucharest, Romania
- Eleanor J Tyler
  - Centre for Cell Biology and Cutaneous Research, Blizard Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London, E1 2AT, UK
- Dominic Bennett
  - Integrative Genomics of Ageing Group, Institute of Ageing and Chronic Disease, University of Liverpool, Liverpool, L7 8TX, UK
- Paolo Binetti
  - Integrative Genomics of Ageing Group, Institute of Ageing and Chronic Disease, University of Liverpool, Liverpool, L7 8TX, UK
- Arie Budovsky
  - Research and Development Authority, Barzilai Medical Center, Ashkelon, Israel
- Kasit Chatsirisupachai
  - Integrative Genomics of Ageing Group, Institute of Ageing and Chronic Disease, University of Liverpool, Liverpool, L7 8TX, UK
- Emily Johnson
  - Integrative Genomics of Ageing Group, Institute of Ageing and Chronic Disease, University of Liverpool, Liverpool, L7 8TX, UK
- Alex Murray
  - Integrative Genomics of Ageing Group, Institute of Ageing and Chronic Disease, University of Liverpool, Liverpool, L7 8TX, UK
- Samuel Shields
  - Integrative Genomics of Ageing Group, Institute of Ageing and Chronic Disease, University of Liverpool, Liverpool, L7 8TX, UK
- Daniela Tejada-Martinez
  - Integrative Genomics of Ageing Group, Institute of Ageing and Chronic Disease, University of Liverpool, Liverpool, L7 8TX, UK
  - Doctorado en Ciencias mención Ecología y Evolución, Instituto de Ciencias Ambientales y Evolutivas, Facultad de Ciencias, Universidad Austral de Chile, Independencia 631, Valdivia, Chile
- Daniel Thornton
  - Integrative Genomics of Ageing Group, Institute of Ageing and Chronic Disease, University of Liverpool, Liverpool, L7 8TX, UK
- Vadim E Fraifeld
  - The Shraga Segal Department of Microbiology, Immunology and Genetics, Faculty of Health Sciences, Center for Multidisciplinary Research on Aging, Ben-Gurion University of the Negev, 8410501, Beer Sheva, Israel
- Cleo L Bishop
  - Centre for Cell Biology and Cutaneous Research, Blizard Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London, E1 2AT, UK
- João Pedro de Magalhães
  - Integrative Genomics of Ageing Group, Institute of Ageing and Chronic Disease, University of Liverpool, Liverpool, L7 8TX, UK
38
Mota MBS, Carvalho MA, Monteiro ANA, Mesquita RD. DNA damage response and repair in perspective: Aedes aegypti, Drosophila melanogaster and Homo sapiens. Parasit Vectors 2019; 12:533. [PMID: 31711518 PMCID: PMC6849265 DOI: 10.1186/s13071-019-3792-1] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 06/17/2019] [Accepted: 11/05/2019] [Indexed: 01/18/2023] Open
Abstract
Background The maintenance of genomic integrity is the responsibility of a complex network, termed the DNA damage response (DDR), which controls lesion detection and DNA repair. The main repair pathways are base excision repair (BER), nucleotide excision repair (NER), mismatch repair (MMR), homologous recombination repair (HR) and non-homologous end joining repair (NHEJ). They correct double-strand breaks (DSB), single-strand breaks, mismatches and other lesions; when the damage is too extensive and repair is insufficient, apoptosis is activated. Methods In this study we used the BLAST reciprocal best-hit methodology to search for DDR protein orthologs in Aedes aegypti. We also provided a comparison between the Ae. aegypti, D. melanogaster and human DDR networks. Results Our analysis revealed the presence of ATR and ATM signaling, including the H2AX ortholog, in Ae. aegypti. Key DDR proteins (orthologs to RAD51, Ku and MRN complexes, XP-components, MutS and MutL) were also identified in this insect. Other proteins were not identified in either Ae. aegypti or D. melanogaster, including BRCA1 and its partners from the BRCA1-A complex, TP53BP1, PALB2, POLk, CSA, CSB and POLβ. In humans, their absence affects DSB signaling, HR and sub-pathways of NER and BER. Seven orthologs not known in D. melanogaster were found in Ae. aegypti (RNF168, RIF1, WRN, RAD54B, RMI1, DNAPKcs, ARTEMIS). Conclusions The presence of key DDR proteins in Ae. aegypti suggests that the main DDR pathways are functional in this insect, and the identification of proteins not known in D. melanogaster can help fill gaps in the DDR network. The mapping of the DDR network in Ae. aegypti can support mosquito biology studies and inform genetic manipulation approaches applied to this vector.
Affiliation(s)
- Maria Beatriz S Mota
  - Departamento de Bioquímica, Instituto de Química, Universidade Federal do Rio de Janeiro, Rio de Janeiro, RJ, Brazil
- Marcelo Alex Carvalho
  - Instituto Federal do Rio de Janeiro, Rio de Janeiro, RJ, Brazil
  - Instituto Nacional de Câncer, Divisão de Pesquisa Clínica, Rio de Janeiro, RJ, Brazil
- Alvaro N A Monteiro
  - Cancer Epidemiology Program, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, USA
- Rafael D Mesquita
  - Departamento de Bioquímica, Instituto de Química, Universidade Federal do Rio de Janeiro, Rio de Janeiro, RJ, Brazil
  - Instituto Nacional de Ciência e Tecnologia em Entomologia Molecular, Universidade Federal do Rio de Janeiro, Rio de Janeiro, RJ, Brazil
39
Mercatelli D, Scalambra L, Triboli L, Ray F, Giorgi FM. Gene regulatory network inference resources: A practical overview. Biochim Biophys Acta Gene Regul Mech 2019; 1863:194430. [PMID: 31678629 DOI: 10.1016/j.bbagrm.2019.194430] [Citation(s) in RCA: 58] [Impact Index Per Article: 11.6] [Received: 05/31/2019] [Revised: 09/06/2019] [Accepted: 09/09/2019] [Indexed: 02/08/2023]
Abstract
Transcriptional regulation is a fundamental molecular mechanism involved in almost every aspect of life, from homeostasis to development, from metabolism to behavior, from reaction to stimuli to disease progression. In recent years, the concept of Gene Regulatory Networks (GRNs) has grown popular as an effective applied biology approach for describing the complex and highly dynamic set of transcriptional interactions, due to its easy-to-interpret features. Since cataloguing, predicting and understanding every GRN connection in all species and cellular contexts remains a great challenge for biology, researchers have developed numerous tools and methods to infer regulatory processes. In this review, we catalogue these methods in six major areas, based on the dominant underlying information leveraged to infer GRNs: Coexpression, Sequence Motifs, Chromatin Immunoprecipitation (ChIP), Orthology, Literature and Protein-Protein Interaction (PPI) specifically focused on transcriptional complexes. The methods described here cover a wide range of user-friendliness: from web tools that require no prior computational expertise to command line programs and algorithms for large scale GRN inferences. Each method for GRN inference described herein effectively illustrates a type of transcriptional relationship, with many methods being complementary to others. While a truly holistic approach for inferring and displaying GRNs remains one of the greatest challenges in the field of systems biology, we believe that the integration of multiple methods described herein provides an effective means with which experimental and computational biologists alike may obtain the most complete pictures of transcriptional relationships. This article is part of a Special Issue entitled: Transcriptional Profiles and Regulatory Gene Networks edited by Dr. Federico Manuel Giorgi and Dr. Shaun Mahony.
Affiliation(s)
- Daniele Mercatelli
  - Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
- Laura Scalambra
  - Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
- Luca Triboli
  - Centre for Integrative Biology (CIBIO), University of Trento, Italy
- Forest Ray
  - Department of Systems Biology, Columbia University Medical Center, New York, NY, United States
- Federico M Giorgi
  - Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
40
Geiß M, Stadler PF, Hellmuth M. Reciprocal best match graphs. J Math Biol 2019; 80:865-953. [PMID: 31691135 DOI: 10.1007/s00285-019-01444-2] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 03/19/2019] [Revised: 06/10/2019] [Indexed: 11/24/2022]
Abstract
Reciprocal best matches play an important role in numerous applications in computational biology, in particular as the basis of many widely used tools for orthology assessment. Nevertheless, very little is known about their mathematical structure. Here, we investigate the structure of reciprocal best match graphs (RBMGs). In order to abstract from the details of measuring distances, we define reciprocal best matches here as pairwise most closely related leaves in a gene tree, arguing that conceptually this is the notion that is pragmatically approximated by distance- or similarity-based heuristics. We start by showing that a graph G is an RBMG if and only if its quotient graph w.r.t. a certain thinness relation is an RBMG. Furthermore, it is necessary and sufficient that all connected components of G are RBMGs. The main result of this contribution is a complete characterization of RBMGs with 3 colors/species that can be checked in polynomial time. For 3 colors, there are three distinct classes of trees that are related to the structure of the phylogenetic trees explaining them. We derive an approach to recognize RBMGs with an arbitrary number of colors; it remains open, however, whether a polynomial-time algorithm for RBMG recognition exists. In addition, we show that RBMGs that at the same time are cographs (co-RBMGs) can be recognized in polynomial time. Co-RBMGs are characterized in terms of hierarchically colored cographs, a particular class of vertex-colored cographs that is introduced here. The (least resolved) trees that explain co-RBMGs can be constructed in polynomial time.
Affiliation(s)
- Manuela Geiß
  - Bioinformatics Group, Department of Computer Science, Leipzig University, Härtelstraße 16-18, 04107, Leipzig, Germany
  - Interdisciplinary Center of Bioinformatics, Leipzig University, Härtelstraße 16-18, 04107, Leipzig, Germany
- Peter F Stadler
  - Bioinformatics Group, Department of Computer Science, Leipzig University, Härtelstraße 16-18, 04107, Leipzig, Germany
  - Interdisciplinary Center of Bioinformatics, Leipzig University, Härtelstraße 16-18, 04107, Leipzig, Germany
  - German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Leipzig University, Härtelstraße 16-18, 04107, Leipzig, Germany
  - Competence Center for Scalable Data Services and Solutions, Leipzig University, Härtelstraße 16-18, 04107, Leipzig, Germany
  - Leipzig Research Center for Civilization Diseases, Leipzig University, Härtelstraße 16-18, 04107, Leipzig, Germany
  - Max-Planck-Institute for Mathematics in the Sciences, Inselstraße 22, 04103, Leipzig, Germany
  - Institute for Theoretical Chemistry, University of Vienna, Währingerstraße 17, 1090, Vienna, Austria
  - Santa Fe Institute, 1399 Hyde Park Rd., Santa Fe, NM, 87501, USA
- Marc Hellmuth
  - Institute of Mathematics and Computer Science, University of Greifswald, Walther-Rathenau-Straße 47, 17487, Greifswald, Germany
  - Center for Bioinformatics, Saarland University, Building E 2.1, P.O. Box 151150, 66041, Saarbrücken, Germany
41
Bioinformatics for Marine Products: An Overview of Resources, Bottlenecks, and Perspectives. Mar Drugs 2019; 17:md17100576. [PMID: 31614509 PMCID: PMC6835618 DOI: 10.3390/md17100576] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 08/20/2019] [Revised: 10/01/2019] [Accepted: 10/02/2019] [Indexed: 12/13/2022] Open
Abstract
The sea represents a major source of biodiversity. It exhibits many different ecosystems in a huge variety of environmental conditions where marine organisms have evolved with extensive diversification of structures and functions, making the marine environment a treasure trove of molecules with potential for biotechnological applications and innovation in many different areas. Rapid progress of the omics sciences has revealed novel opportunities to advance the knowledge of biological systems, paving the way for an unprecedented revolution in the field and expanding marine research from model organisms to an increasing number of marine species. Multi-level approaches based on molecular investigations at genomic, metagenomic, transcriptomic, metatranscriptomic, proteomic, and metabolomic levels are essential to discover marine resources and further explore key molecular processes involved in their production and action. As a consequence, omics approaches, accompanied by the associated bioinformatic resources and computational tools for molecular analyses and modeling, are boosting the rapid advancement of biotechnologies. In this review, we provide an overview of the most relevant bioinformatic resources and major approaches, highlighting perspectives and bottlenecks for an appropriate exploitation of these opportunities for biotechnology applications from marine resources.
42
Hu X, Friedberg I. SwiftOrtho: A fast, memory-efficient, multiple genome orthology classifier. Gigascience 2019; 8:giz118. [PMID: 31648300 PMCID: PMC6812468 DOI: 10.1093/gigascience/giz118] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 02/13/2019] [Revised: 06/07/2019] [Accepted: 09/05/2019] [Indexed: 11/13/2022] Open
Abstract
BACKGROUND Gene homology type classification is required for many types of genome analyses, including comparative genomics, phylogenetics, and protein function annotation. Consequently, a large variety of tools have been developed to perform homology classification across genomes of different species. However, when applied to large genomic data sets, these tools require high memory and CPU usage, typically available only in computational clusters. FINDINGS Here we present a new graph-based orthology analysis tool, SwiftOrtho, which is optimized for speed and memory usage when applied to large-scale data. SwiftOrtho uses long k-mers to speed up homology search, while using a reduced amino acid alphabet and spaced seeds to compensate for the loss of sensitivity due to long k-mers. In addition, it uses an affinity propagation algorithm to reduce the memory usage when clustering large-scale orthology relationships into orthologous groups. In our tests, SwiftOrtho was the only tool that completed orthology analysis of proteins from 1,760 bacterial genomes on a computer with only 4 GB RAM. Using various standard orthology data sets, we also show that SwiftOrtho has a high accuracy. CONCLUSIONS SwiftOrtho enables the accurate comparative genomic analyses of thousands of genomes using low-memory computers. SwiftOrtho is available at https://github.com/Rinoahu/SwiftOrtho.
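The combination of a reduced amino acid alphabet with spaced seeds mentioned in this abstract can be sketched as follows. This is only an illustration of the general idea, not SwiftOrtho's implementation: the residue grouping, the seed pattern `1101011`, and the function names are assumptions chosen for the example.

```python
from typing import Dict, List

# Hypothetical grouping of the 20 amino acids into 8 classes; each residue is
# mapped to the first letter of its group. SwiftOrtho's real alphabet may differ.
GROUPS = ["AVLIMC", "FWY", "ST", "NQ", "DE", "KRH", "G", "P"]
REDUCED: Dict[str, str] = {aa: group[0] for group in GROUPS for aa in group}


def reduce_sequence(seq: str) -> str:
    """Rewrite a protein sequence in the reduced alphabet (unknown residues -> 'X')."""
    return "".join(REDUCED.get(aa, "X") for aa in seq.upper())


def spaced_seeds(seq: str, pattern: str = "1101011") -> List[str]:
    """Extract spaced seeds: positions marked '1' are kept, '0' are ignored."""
    reduced = reduce_sequence(seq)
    keep = [i for i, flag in enumerate(pattern) if flag == "1"]
    span = len(pattern)
    return [
        "".join(reduced[start + i] for i in keep)
        for start in range(len(reduced) - span + 1)
    ]


# Two toy peptides that differ by conservative substitutions and at the
# ignored positions still produce the same seed.
seeds_a = spaced_seeds("MKVLDEA")
seeds_b = spaced_seeds("IRGLEDA")
print(seeds_a, seeds_b)
```

A long seed is more selective and therefore faster to search with, while collapsing similar residues and skipping some positions lets related sequences still share seeds, which is the trade-off the abstract describes.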
Affiliation(s)
- Xiao Hu
  - Department of Veterinary Microbiology and Preventive Medicine, 2118 Veterinary Medicine, College of Veterinary Medicine, Iowa State University, Ames, IA, 50011, USA
- Iddo Friedberg
  - Department of Veterinary Microbiology and Preventive Medicine, 2118 Veterinary Medicine, College of Veterinary Medicine, Iowa State University, Ames, IA, 50011, USA
43
Paraskevopoulou S, Dennis AB, Weithoff G, Hartmann S, Tiedemann R. Within species expressed genetic variability and gene expression response to different temperatures in the rotifer Brachionus calyciflorus sensu stricto. PLoS One 2019; 14:e0223134. [PMID: 31568501 PMCID: PMC6768451 DOI: 10.1371/journal.pone.0223134] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 05/17/2019] [Accepted: 09/14/2019] [Indexed: 01/14/2023] Open
Abstract
Genetic divergence is impacted by many factors, including phylogenetic history, gene flow, genetic drift, and divergent selection. Rotifers are an important component of aquatic ecosystems, and genetic variation is essential to their ongoing adaptive diversification and local adaptation. In addition to coding sequence divergence, variation in gene expression may relate to variable heat tolerance, and can impose ecological barriers within species. Temperature plays a significant role in aquatic ecosystems by affecting species abundance, spatio-temporal distribution, and habitat colonization. Recently described (formerly cryptic) species of the Brachionus calyciflorus complex exhibit different temperature tolerance both in natural and in laboratory studies, and show that B. calyciflorus sensu stricto (s.s.) is a thermotolerant species. Even within B. calyciflorus s.s., there is a tendency for further temperature specializations. Comparison of expressed genes allows us to assess the impact of stressors on both expression and sequence divergence among disparate populations within a single species. Here, we have used RNA-seq to explore expressed genetic diversity in B. calyciflorus s.s. in two mitochondrial DNA lineages with different phylogenetic histories and differences in thermotolerance. We identify a suite of candidate genes that may underlie local adaptation, with a particular focus on the response to sustained high or low temperatures. We do not find adaptive divergence in established candidate genes for thermal adaptation. Rather, we detect divergent selection among our two lineages in genes related to metabolism (lipid metabolism, metabolism of xenobiotics).
Affiliation(s)
- Sofia Paraskevopoulou
  - Unit of Evolutionary Biology/Systematic Zoology, Institute of Biochemistry and Biology, University of Potsdam, Potsdam, Germany
  - Unit of Ecology and Ecosystem Modelling, Institute of Biochemistry and Biology, University of Potsdam, Potsdam, Germany
- Alice B. Dennis
  - Unit of Evolutionary Biology/Systematic Zoology, Institute of Biochemistry and Biology, University of Potsdam, Potsdam, Germany
- Guntram Weithoff
  - Unit of Ecology and Ecosystem Modelling, Institute of Biochemistry and Biology, University of Potsdam, Potsdam, Germany
  - Berlin-Brandenburg Institute of Advanced Biodiversity Research (BBIB), Berlin, Germany
- Stefanie Hartmann
  - Unit of Evolutionary Adaptive Genomics, Institute of Biochemistry and Biology, University of Potsdam, Potsdam, Germany
- Ralph Tiedemann
  - Unit of Evolutionary Biology/Systematic Zoology, Institute of Biochemistry and Biology, University of Potsdam, Potsdam, Germany
44
Qanmber G, Lu L, Liu Z, Yu D, Zhou K, Huo P, Li F, Yang Z. Genome-wide identification of GhAAI genes reveals that GhAAI66 triggers a phase transition to induce early flowering. J Exp Bot 2019; 70:4721-4736. [PMID: 31106831 PMCID: PMC6760319 DOI: 10.1093/jxb/erz239] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 11/22/2018] [Accepted: 05/11/2019] [Indexed: 05/20/2023]
Abstract
Plants undergo a phase transition from vegetative to reproductive development that triggers floral induction. Genes containing an AAI (α-amylase inhibitor) domain form a large gene family, but there have been no comprehensive analyses of this gene family in any plant species. Here, we identified 336 AAI genes from nine plant species, including 122 AAI genes in cotton (Gossypium hirsutum). The AAI gene family has evolutionarily conserved amino acid residues throughout the plant kingdom. Phylogenetic analysis classified AAI genes into five major clades, with significant polyploidization and effects of genome duplication. Our study identified 42 paralogous and 216 orthologous gene pairs resulting from segmental and whole-genome duplication, respectively, demonstrating significant contributions of gene duplication to expansion of the cotton AAI gene family. Further, GhAAI66 was preferentially expressed in flower tissue and in response to phytohormone treatments. Ectopic expression of GhAAI66 in Arabidopsis and silencing in cotton revealed that GhAAI66 triggers a phase transition to induce early flowering. Further, GO (Gene Ontology) and KEGG (Kyoto Encyclopedia of Genes and Genomes) analysis of RNA sequencing data and qRT-PCR (quantitative reverse transcription-PCR) analysis indicated that GhAAI66 integrates multiple flower signaling pathways including gibberellin, jasmonic acid, and floral integrators to trigger an early flowering cascade in Arabidopsis. Therefore, characterization of the AAI family provides invaluable insights for improving cotton breeding.
Affiliation(s)
- Ghulam Qanmber
  - State Key Laboratory of Cotton Biology, Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang, Henan, China
- Lili Lu
  - State Key Laboratory of Cotton Biology, Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang, Henan, China
- Zhao Liu
  - State Key Laboratory of Cotton Biology, Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang, Henan, China
- Daoqian Yu
  - State Key Laboratory of Cotton Biology, Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang, Henan, China
- Kehai Zhou
  - State Key Laboratory of Cotton Biology, Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang, Henan, China
- Peng Huo
  - State Key Laboratory of Cotton Biology, Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang, Henan, China
- Fuguang Li
  - State Key Laboratory of Cotton Biology, Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang, Henan, China
  - Zhengzhou Research Base, State Key Laboratory of Cotton Biology, Zhengzhou University, Zhengzhou, Henan, China
  - Corresponding author
- Zuoren Yang
  - State Key Laboratory of Cotton Biology, Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang, Henan, China
  - Zhengzhou Research Base, State Key Laboratory of Cotton Biology, Zhengzhou University, Zhengzhou, Henan, China
  - Corresponding author
45
dos Santos Gomes AC, Falkoski D, Battaglia E, Peng M, Nicolau de Almeida M, Coconi Linares N, Meijnen JP, Visser J, de Vries RP. Myceliophthora thermophila Xyr1 is predominantly involved in xylan degradation and xylose catabolism. Biotechnology for Biofuels 2019; 12:220. [PMID: 31534479 PMCID: PMC6745793 DOI: 10.1186/s13068-019-1556-y] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 04/18/2019] [Accepted: 08/31/2019] [Indexed: 06/10/2023]
Abstract
BACKGROUND: Myceliophthora thermophila is a thermophilic ascomycete fungus that is used as a producer of enzyme cocktails used in plant biomass saccharification. Further development of this species as an industrial enzyme factory requires a detailed understanding of its regulatory systems driving the production of plant biomass-degrading enzymes. In this study, we analyzed the function of MtXlr1, an ortholog of the (hemi-)cellulolytic regulator XlnR first identified in another industrially relevant fungus, Aspergillus niger. RESULTS: The Mtxlr1 gene was deleted and the resulting strain was compared to the wild type using growth profiling and transcriptomics. The deletion strain was unable to grow on xylan and D-xylose, but showed only a small growth reduction on L-arabinose, and grew similar to the wild type on Avicel and cellulose. These results were supported by the transcriptome analyses which revealed reduction of genes encoding xylan-degrading enzymes, enzymes of the pentose catabolic pathway and putative pentose transporters. In contrast, no or minimal effects were observed for the expression of cellulolytic genes. CONCLUSIONS: Myceliophthora thermophila MtXlr1 controls the expression of xylanolytic genes and genes involved in pentose transport and catabolism, but has no significant effects on the production of cellulases. It therefore resembles more the role of its ortholog in Neurospora crassa, rather than the broader role described for this regulator in A. niger and Trichoderma reesei. By revealing the range of genes controlled by MtXlr1, our results provide the basic knowledge for targeted strain improvement by overproducing or constitutively activating this regulator, to further improve the biotechnological value of M. thermophila.
Affiliation(s)
- Ana Carolina dos Santos Gomes
  - Fungal Physiology, Westerdijk Fungal Biodiversity Institute & Fungal Molecular Physiology, Utrecht University, Uppsalalaan 8, 3584 CT Utrecht, The Netherlands
- Daniel Falkoski
  - Fungal Physiology, Westerdijk Fungal Biodiversity Institute & Fungal Molecular Physiology, Utrecht University, Uppsalalaan 8, 3584 CT Utrecht, The Netherlands
  - Present Address: Novozymes Latin America, Professor Francisco Ribeiro Street 683, Araucária, PR 83707-660 Brazil
- Evy Battaglia
  - Fungal Physiology, Westerdijk Fungal Biodiversity Institute & Fungal Molecular Physiology, Utrecht University, Uppsalalaan 8, 3584 CT Utrecht, The Netherlands
- Mao Peng
  - Fungal Physiology, Westerdijk Fungal Biodiversity Institute & Fungal Molecular Physiology, Utrecht University, Uppsalalaan 8, 3584 CT Utrecht, The Netherlands
- Maira Nicolau de Almeida
  - Fungal Physiology, Westerdijk Fungal Biodiversity Institute & Fungal Molecular Physiology, Utrecht University, Uppsalalaan 8, 3584 CT Utrecht, The Netherlands
  - DuPont Industrial Biosciences, Archimedesweg 30, 2333 CN Leiden, The Netherlands
  - Present Address: Federal University of São João del Rei, Praça Dom Helvécio, 74, São João del Rei, Minas Gerais Brazil
- Nancy Coconi Linares
  - Fungal Physiology, Westerdijk Fungal Biodiversity Institute & Fungal Molecular Physiology, Utrecht University, Uppsalalaan 8, 3584 CT Utrecht, The Netherlands
- Jean-Paul Meijnen
  - DuPont Industrial Biosciences, Archimedesweg 30, 2333 CN Leiden, The Netherlands
  - Present Address: Dutch DNA Biotech BV, Padualaan 8, 3584 CH Utrecht, The Netherlands
- Jaap Visser
  - Fungal Physiology, Westerdijk Fungal Biodiversity Institute & Fungal Molecular Physiology, Utrecht University, Uppsalalaan 8, 3584 CT Utrecht, The Netherlands
- Ronald P. de Vries
  - Fungal Physiology, Westerdijk Fungal Biodiversity Institute & Fungal Molecular Physiology, Utrecht University, Uppsalalaan 8, 3584 CT Utrecht, The Netherlands
46
Hellmuth M, Huber KT, Moulton V. Reconciling event-labeled gene trees with MUL-trees and species networks. J Math Biol 2019; 79:1885-1925. [PMID: 31410552 DOI: 10.1007/s00285-019-01414-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 12/21/2018] [Revised: 05/08/2019] [Indexed: 11/30/2022]
Abstract
Phylogenomics commonly aims to construct evolutionary trees from genomic sequence information. One way to approach this problem is to first estimate event-labeled gene trees (i.e., rooted trees whose non-leaf vertices are labeled by speciation or gene duplication events), and to then look for a species tree which can be reconciled with this tree through a reconciliation map between the trees. In practice, however, it can happen that there is no such map from a given event-labeled tree to any species tree. An important situation where this might arise is where the species evolution is better represented by a network instead of a tree. In this paper, we therefore consider the problem of reconciling event-labeled trees with species networks. In particular, we prove that any event-labeled gene tree can be reconciled with some network and that, under certain mild assumptions on the gene tree, the network can even be assumed to be multi-arc free. To prove this result, we show that we can always reconcile the gene tree with some multi-labeled (MUL-)tree, which can then be "folded up" to produce the desired reconciliation and network. In addition, we study the interplay between reconciliation maps from event-labeled gene trees to MUL-trees and networks. Our results could be useful for understanding how genomes have evolved after undergoing complex evolutionary events such as polyploidy.
Affiliation(s)
- Marc Hellmuth
  - Institute of Mathematics and Computer Science, University of Greifswald, Greifswald, Germany
  - Center for Bioinformatics, Saarland University, Saarbrücken, Germany
- Katharina T Huber
  - School of Computing Sciences, University of East Anglia, Norwich, UK
- Vincent Moulton
  - School of Computing Sciences, University of East Anglia, Norwich, UK
47
Dong Y, Chen S, Cheng S, Zhou W, Ma Q, Chen Z, Fu CX, Liu X, Zhao YP, Soltis PS, Wong GKS, Soltis DE, Xiang QYJ. Natural selection and repeated patterns of molecular evolution following allopatric divergence. eLife 2019; 8:45199. [PMID: 31373555 PMCID: PMC6744222 DOI: 10.7554/elife.45199] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 01/15/2019] [Accepted: 08/01/2019] [Indexed: 11/13/2022] Open
Abstract
Although geographic isolation is a leading driver of speciation, the tempo and pattern of divergence at the genomic level remain unclear. We examine genome-wide divergence of putatively single-copy orthologous genes (POGs) in 20 allopatric species/variety pairs from diverse angiosperm clades, with 16 pairs reflecting the classic eastern Asia-eastern North America floristic disjunction. In each pair, >90% of POGs are under purifying selection, and <10% are under positive selection. A set of POGs are under strong positive selection, 14 of which are shared by 10-15 pairs, and one shared by all pairs; 15 POGs are annotated to biological processes responding to various stimuli. The relative abundance of POGs under different selective forces exhibits a repeated pattern among pairs despite an ~10 million-year difference in divergence time. Species divergence times are positively correlated with abundance of POGs under moderate purifying selection, but negatively correlated with abundance of POGs under strong purifying selection.
Affiliation(s)
- Yibo Dong
  - Department of Plant and Microbial Biology, North Carolina State University, Raleigh, United States
  - Plant Biology Division, Noble Research Institute, Ardmore, United States
- Shichao Chen
  - Florida Museum of Natural History, University of Florida, Gainesville, United States
  - Department of Biology, University of Florida, Gainesville, United States
  - School of Life Sciences and Technology, Tongji University, Shanghai, China
- Wenbin Zhou
  - Department of Plant and Microbial Biology, North Carolina State University, Raleigh, United States
- Qing Ma
  - Department of Plant and Microbial Biology, North Carolina State University, Raleigh, United States
- Zhiduan Chen
  - State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing, China
- Cheng-Xin Fu
  - Laboratory of Systematic & Evolutionary Botany and Biodiversity, College of Life Sciences, Zhejiang University, Hangzhou, China
- Xin Liu
  - Beijing Genomics Institute, Shenzhen, China
- Yun-Peng Zhao
  - Laboratory of Systematic & Evolutionary Botany and Biodiversity, College of Life Sciences, Zhejiang University, Hangzhou, China
- Pamela S Soltis
  - Florida Museum of Natural History, University of Florida, Gainesville, United States
- Gane Ka-Shu Wong
  - Beijing Genomics Institute, Shenzhen, China
  - Department of Biological Sciences, University of Alberta, Edmonton, Canada
  - Department of Medicine, University of Alberta, Edmonton, Canada
- Douglas E Soltis
  - Florida Museum of Natural History, University of Florida, Gainesville, United States
  - Department of Biology, University of Florida, Gainesville, United States
- Qiu-Yun Jenny Xiang
  - Department of Plant and Microbial Biology, North Carolina State University, Raleigh, United States
48
Rey C, Veber P, Boussau B, Sémon M. CAARS: comparative assembly and annotation of RNA-Seq data. Bioinformatics 2019; 35:2199-2207. [PMID: 30452539 PMCID: PMC6596894 DOI: 10.1093/bioinformatics/bty903] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 06/28/2017] [Revised: 09/13/2018] [Accepted: 11/16/2018] [Indexed: 02/05/2023] Open
Abstract
MOTIVATION: RNA sequencing (RNA-Seq) is a widely used approach to obtain transcript sequences in non-model organisms, notably for performing comparative analyses. However, current bioinformatic pipelines do not take full advantage of pre-existing reference data in related species for improving RNA-Seq assembly, annotation and gene family reconstruction. RESULTS: We built an automated pipeline named CAARS to combine novel data from RNA-Seq experiments with existing multi-species gene family alignments. RNA-Seq reads are assembled into transcripts by both de novo and assisted assemblies. Then, CAARS incorporates transcripts into gene families, builds gene alignments and trees and uses phylogenetic information to classify the genes as orthologs and paralogs of existing genes. We used CAARS to assemble and annotate RNA-Seq data in rodents and fishes using distantly related genomes as reference, a difficult case for this kind of analysis. We showed CAARS assemblies are more complete and accurate than those assembled by a standard pipeline consisting of de novo assembly coupled with annotation by sequence similarity on a guide species. In addition to annotated transcripts, CAARS provides gene family alignments and trees, annotated with orthology relationships, directly usable for downstream comparative analyses. AVAILABILITY AND IMPLEMENTATION: CAARS is implemented in Python and OCaml and is freely available at https://github.com/carinerey/caars. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Carine Rey
  - UnivLyon, Université Claude Bernard Lyon 1, ENS de Lyon, CNRS UMR, INSERM U1210, LBMC, F-69007, Lyon, France
- Philippe Veber
  - UnivLyon, Université Claude Bernard Lyon 1, CNRS, UMR, LBBE, F-69100, Villeurbanne, France
- Bastien Boussau
  - UnivLyon, Université Claude Bernard Lyon 1, CNRS, UMR, LBBE, F-69100, Villeurbanne, France
- Marie Sémon
  - UnivLyon, Université Claude Bernard Lyon 1, ENS de Lyon, CNRS UMR, INSERM U1210, LBMC, F-69007, Lyon, France
49
Altenhoff AM, Levy J, Zarowiecki M, Tomiczek B, Warwick Vesztrocy A, Dalquen DA, Müller S, Telford MJ, Glover NM, Dylus D, Dessimoz C. OMA standalone: orthology inference among public and custom genomes and transcriptomes. Genome Res 2019; 29:1152-1163. [PMID: 31235654 PMCID: PMC6633268 DOI: 10.1101/gr.243212.118] [Citation(s) in RCA: 68] [Impact Index Per Article: 13.6] [Received: 08/22/2018] [Accepted: 05/24/2019] [Indexed: 11/24/2022]
Abstract
Genomes and transcriptomes are now typically sequenced by individual laboratories but analyzing them often remains challenging. One essential step in many analyses lies in identifying orthologs—corresponding genes across multiple species—but this is far from trivial. The Orthologous MAtrix (OMA) database is a leading resource for identifying orthologs among publicly available, complete genomes. Here, we describe the OMA pipeline available as a standalone program for Linux and Mac. When run on a cluster, it has native support for the LSF, SGE, PBS Pro, and Slurm job schedulers and can scale up to thousands of parallel processes. Another key feature of OMA standalone is that users can combine their own data with existing public data by exporting genomes and precomputed alignments from the OMA database, which currently contains over 2100 complete genomes. We compare OMA standalone to other methods in the context of phylogenetic tree inference, by inferring a phylogeny of Lophotrochozoa, a challenging clade within the protostomes. We also discuss other potential applications of OMA standalone, including identifying gene families having undergone duplications/losses in specific clades, and identifying potential drug targets in nonmodel organisms. OMA standalone is available under the permissive open source Mozilla Public License Version 2.0.
Affiliation(s)
- Adrian M Altenhoff
  - Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
  - Department of Computer Science, ETH Zurich, 8092 Zurich, Switzerland
- Jeremy Levy
  - Centre for Mathematics and Physics in the Life Sciences and Experimental Biology (CoMPLEX), University College London, London WC1E 6BT, United Kingdom
  - Centre for Life's Origins and Evolution, Department of Genetics, Evolution & Environment, University College London, London WC1E 6BT, United Kingdom
- Magdalena Zarowiecki
  - Genomics England, Queen Mary University of London, London EC1M 6BQ, United Kingdom
- Bartłomiej Tomiczek
  - Centre for Life's Origins and Evolution, Department of Genetics, Evolution & Environment, University College London, London WC1E 6BT, United Kingdom
  - Intercollegiate Faculty of Biotechnology, University of Gdansk and Medical University of Gdansk, 80-307 Gdansk, Poland
- Alex Warwick Vesztrocy
  - Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
  - Centre for Life's Origins and Evolution, Department of Genetics, Evolution & Environment, University College London, London WC1E 6BT, United Kingdom
- Daniel A Dalquen
  - Department of Computer Science, ETH Zurich, 8092 Zurich, Switzerland
- Steven Müller
  - Centre for Life's Origins and Evolution, Department of Genetics, Evolution & Environment, University College London, London WC1E 6BT, United Kingdom
- Maximilian J Telford
  - Centre for Life's Origins and Evolution, Department of Genetics, Evolution & Environment, University College London, London WC1E 6BT, United Kingdom
- Natasha M Glover
  - Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
  - Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland
  - Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland
- David Dylus
  - Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
  - Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland
  - Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland
- Christophe Dessimoz
  - Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
  - Centre for Life's Origins and Evolution, Department of Genetics, Evolution & Environment, University College London, London WC1E 6BT, United Kingdom
  - Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland
  - Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland
  - Department of Computer Science, University College London, London WC1E 6BT, United Kingdom
50
Abstract
Best match graphs arise naturally as the first processing intermediate in algorithms for orthology detection. Let $T$ be a phylogenetic (gene) tree and $\sigma$ an assignment of leaves of $T$ to species. The best match graph $(G,\sigma)$ is a digraph that contains an arc from $x$ to $y$ if the genes $x$ and $y$ reside in different species and $y$ is one of possibly many (evolutionary) closest relatives of $x$ compared to all other genes contained in the species $\sigma(y)$. Here, we characterize best match graphs and show that it can be decided in cubic time and quadratic space whether $(G,\sigma)$ is derived from a tree in this manner. If the answer is affirmative, there is a unique least resolved tree that explains $(G,\sigma)$, which can also be constructed in cubic time.
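The definition above can be illustrated with a minimal Python sketch that builds the best match graph from a given rooted gene tree. It assumes the tree is supplied as a child-to-parent dictionary `parent` and the leaf-to-species assignment as a dictionary `species`; both names, the helper `ancestors`, and the toy tree are hypothetical and not taken from the paper. "Closest relative" is read here as the gene whose last common ancestor with x is lowest in the tree. The sketch covers only the forward construction, not the recognition algorithm the abstract refers to.

```python
from collections import defaultdict

def ancestors(parent, v):
    """Return the path from v up to the root, with v first."""
    path = [v]
    while v in parent:
        v = parent[v]
        path.append(v)
    return path

def best_match_graph(parent, species):
    """Build the arc set of the best match graph.

    parent:  child -> parent for every non-root vertex (assumed input format)
    species: leaf (gene) -> species label
    """
    paths = {g: ancestors(parent, g) for g in species}
    arcs = set()
    for x in species:
        # Rank ancestors of x: 0 is x itself, larger values are closer to the root.
        rank = {v: i for i, v in enumerate(paths[x])}
        by_species = defaultdict(list)
        for y in species:
            if species[y] == species[x]:
                continue
            # The first ancestor of y that also lies above x is lca(x, y).
            lca_rank = next(rank[v] for v in paths[y] if v in rank)
            by_species[species[y]].append((lca_rank, y))
        # Within each other species, keep every gene with the lowest lca.
        for candidates in by_species.values():
            best = min(r for r, _ in candidates)
            arcs.update((x, y) for r, y in candidates if r == best)
    return arcs

# Toy tree: r = (u, c1), u = (a1, w), w = (a2, b1)
parent = {"u": "r", "c1": "r", "a1": "u", "w": "u", "a2": "w", "b1": "w"}
species = {"a1": "A", "a2": "A", "b1": "B", "c1": "C"}
arcs = best_match_graph(parent, species)
```

In this toy tree, b1 is the only gene of species B, so it is a best match of a1, whereas the best match of b1 in species A is a2 rather than a1. The relation is therefore not symmetric, which is why the graph is directed.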