101
|
Hufnagel B, Soriano A, Taylor J, Divol F, Kroc M, Sanders H, Yeheyis L, Nelson M, Péret B. Pangenome of white lupin provides insights into the diversity of the species. PLANT BIOTECHNOLOGY JOURNAL 2021; 19:2532-2543. [PMID: 34346542 PMCID: PMC8633493 DOI: 10.1111/pbi.13678] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 05/19/2021] [Revised: 07/07/2021] [Accepted: 07/22/2021] [Indexed: 05/21/2023]
Abstract
White lupin is an old crop that is attracting renewed interest because of the high protein content and nutritional value of its seeds. Despite a long domestication history in the Mediterranean basin, modern breeding efforts have been fairly scarce. Recent sequencing of its genome has provided tools for further description of genetic resources, but detailed characterization of genomic diversity is still missing. Here, we report the genome sequencing of 39 accessions that were used to establish a white lupin pangenome. We defined 32 068 core genes that are present in all individuals and 14 822 that are absent in some and may represent a gene pool for breeding for improved productivity, grain quality, and stress adaptation. We used this new pangenome resource to identify candidate genes for alkaloid synthesis, a key grain quality trait. The white lupin pangenome provides a novel genetic resource to better understand how domestication has shaped the genomic variability within this crop. Thus, this pangenome resource is an important step towards the effective and efficient genetic improvement of white lupin to help meet the rapidly growing demand for plant protein sources for human and animal consumption.
Affiliation(s)
- Bárbara Hufnagel
- BPMP, Univ Montpellier, CNRS, INRAE, Institut Agro, Montpellier, France
- Present address: CIRAD, UMR AGAP Institut, SEAPAG Team, Petit‐Bourg, Guadeloupe, F‐97170, French West Indies
| | - Fanchon Divol
- BPMP, Univ Montpellier, CNRS, INRAE, Institut Agro, Montpellier, France
| | - Magdalena Kroc
- Institute of Plant Genetics, Polish Academy of Sciences, Poznan, Poland
| | - Benjamin Péret
- BPMP, Univ Montpellier, CNRS, INRAE, Institut Agro, Montpellier, France
| |
|
102
|
Abstract
Animal tuberculosis (TB) is an emergent disease caused by Mycobacterium bovis, one of the animal-adapted ecotypes of the Mycobacterium tuberculosis complex (MTC). In this work, whole-genome comparative analyses of 70 M. bovis genomes were performed to gain insights into the pan-genome architecture. The comparison across M. bovis predicted genome composition enabled clustering into the core- and accessory-genome components, with 2736 CDS for the former, while the accessory moiety included 3897 CDS, of which 2656 are restricted to one/two genomes only. These analyses predicted an open pan-genome architecture, with an average of 32 CDS added by each genome, and show the diversification of discrete M. bovis subpopulations supported by both core- and accessory-genome components. The functional annotation of the pan-genome classified each CDS into one or several COG (Clusters of Orthologous Groups) categories, revealing ‘transcription’ (total average CDSs, n=258), ‘lipid metabolism and transport’ (n=242), ‘energy production and conversion’ (n=214) and ‘unknown function’ (n=876) as the most represented. The closer analysis of polymorphisms in virulence-related genes in a restricted group of M. bovis from a multi-host system enabled the identification of clade-monomorphic non-synonymous SNPs, illustrating clade-specific virulence landscapes and correlating with disease severity. This first comparative pan-genome study of a diverse collection of M. bovis encompassing all clonal complexes indicates a high percentage of accessory genes and denotes an open, dynamic non-conservative pan-genome structure, with high evolutionary potential, defying the canons of MTC biology. Furthermore, it shows that M. bovis can shape its virulence repertoire, either by acquisition and loss of genes or by SNP-based diversification, likely towards host immune evasion, adaptation and persistence.
Affiliation(s)
- Ana C Reis
- Centre for Ecology, Evolution and Environmental Changes (cE3c), Faculdade de Ciências da Universidade de Lisboa, Lisboa, Portugal; Biosystems & Integrative Sciences Institute (BioISI), Faculdade de Ciências da Universidade de Lisboa, Lisboa, Portugal
| | - Mónica V Cunha
- Centre for Ecology, Evolution and Environmental Changes (cE3c), Faculdade de Ciências da Universidade de Lisboa, Lisboa, Portugal; Biosystems & Integrative Sciences Institute (BioISI), Faculdade de Ciências da Universidade de Lisboa, Lisboa, Portugal
| |
|
103
|
Jiao D, Dong X, Yu Y, Wei C. Gene Presence/Absence Variation analysis of coronavirus family displays its pan-genomic diversity. Int J Biol Sci 2021; 17:3717-3727. [PMID: 34671195 PMCID: PMC8495401 DOI: 10.7150/ijbs.58220] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/14/2021] [Accepted: 08/07/2021] [Indexed: 11/15/2022]
Abstract
SARS-CoV-2 belongs to the coronavirus family. Comparing the genomic features of viral genomes across the coronavirus family can improve our understanding of SARS-CoV-2. Here we present the first pan-genome analysis of 3,932 whole genomes of 101 species out of 4 genera from the coronavirus family. We found a total of 181 genes in the pan-genome of the coronavirus family, among which only 3 genes, the S gene, M gene and N gene, are highly conserved. We also constructed a pan-genome from 23,539 whole genomes of SARS-CoV-2. There are 13 genes in total in the SARS-CoV-2 pan-genome. All of the 13 genes are core genes for SARS-CoV-2. The pan-genome of coronaviruses shows a lower level of diversity than the pan-genomes of other RNA viruses, which contain no core gene. The three highly conserved genes in the coronavirus family, which are also core genes in the SARS-CoV-2 pan-genome, could be potential targets in developing nucleic acid diagnostic reagents with a decreased possibility of cross-reaction with other coronavirus species.
Affiliation(s)
- Du Jiao
- Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai 200240, China
| | - Xiaorui Dong
- Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai 200240, China
| | - Yingyan Yu
- Department of General Surgery of Ruijin Hospital, Shanghai Institute of Digestive Surgery, and Shanghai Key Laboratory for Gastric Neoplasms, Shanghai Jiao Tong University School of Medicine, 200025, Shanghai, China
| | - Chaochun Wei
- Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai 200240, China; SJTU-Yale Joint Center for Biostatistics and Data Science, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai 200240, China
| |
|
104
|
Liu H, Prajapati V, Prajapati S, Bais H, Lu J. Comparative Genome Analysis of Bacillus amyloliquefaciens Focusing on Phylogenomics, Functional Traits, and Prevalence of Antimicrobial and Virulence Genes. Front Genet 2021; 12:724217. [PMID: 34659348 PMCID: PMC8514880 DOI: 10.3389/fgene.2021.724217] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/12/2021] [Accepted: 08/26/2021] [Indexed: 11/13/2022]
Abstract
Bacillus amyloliquefaciens is a gram-positive, nonpathogenic, endospore-forming member of a group of free-living soil bacteria with a variety of traits including plant growth promotion, production of antifungal and antibacterial metabolites, and production of industrially important enzymes. We have attempted to reconstruct the biogeographical structure according to functional traits and the evolutionary lineage of B. amyloliquefaciens using comparative genomics analysis. All the available 96 genomes of B. amyloliquefaciens strains were curated from the NCBI genome database, having a variety of important functionalities in all sectors keeping a high focus on agricultural aspects. In-depth analysis was carried out to deduce the orthologous gene groups and whole-genome similarity. Pan genome analysis revealed that shell genes, soft core genes, core genes, and cloud genes comprise 17.09, 5.48, 8.96, and 68.47%, respectively, which demonstrates that the genomes differ greatly in gene content. It also indicates that the strains may have flexible environmental adaptability or versatile functions. Phylogenetic analysis showed that B. amyloliquefaciens is divided into two clades, and clade 2 is further divided into two different clusters. This reflects the difference in the sequence similarity and diversification that happened in the B. amyloliquefaciens genome. The majority of plant-associated strains of B. amyloliquefaciens were grouped in clade 2 (73 strains), while food-associated strains were in clade 1 (23 strains). Genome mining has been adopted to deduce antimicrobial resistance and virulence genes and their prevalence among all strains. The genes tmrB and yuaB, which code for tunicamycin resistance protein and hydrophobic coat forming protein, exist only in clade 2, while clpP, which codes for serine proteases, is only in clade 1. Genome plasticity of all strains of B. amyloliquefaciens reflects their adaptation to different niches.
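As a rough illustration of how shell, soft core, core, and cloud shares like those reported above can be derived from a gene presence/absence table, the sketch below labels each gene family by the fraction of genomes that carry it. The cutoffs (99%, 95%, 15%) are the Roary-style defaults often used in bacterial pan-genome work and are an assumption here; the abstract does not state which thresholds or tool the authors used. The function name and the toy data are hypothetical.

```python
from collections import Counter


def classify_pangenome(presence, n_genomes, core=0.99, soft_core=0.95, shell=0.15):
    """Label each gene family by the fraction of genomes that carry it.

    presence: dict mapping gene family -> set of genome identifiers.
    The default cutoffs are assumed (Roary-style), not taken from the paper.
    """
    labels = {}
    for gene, genomes in presence.items():
        fraction = len(genomes) / n_genomes
        if fraction >= core:
            labels[gene] = "core"
        elif fraction >= soft_core:
            labels[gene] = "soft core"
        elif fraction >= shell:
            labels[gene] = "shell"
        else:
            labels[gene] = "cloud"
    return labels


# Toy example with 100 hypothetical genomes.
genomes = [f"g{i}" for i in range(100)]
presence = {
    "geneA": set(genomes),       # present in 100/100 genomes
    "geneB": set(genomes[:96]),  # present in 96/100 genomes
    "geneC": set(genomes[:50]),  # present in 50/100 genomes
    "geneD": set(genomes[:2]),   # present in 2/100 genomes
}
labels = classify_pangenome(presence, n_genomes=len(genomes))
shares = Counter(labels.values())  # number of gene families per category
```

Dividing each count in `shares` by the total number of gene families gives percentage shares comparable in form to those quoted in the abstract.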
Affiliation(s)
- Hualin Liu
- School of Marine Sciences, Sun Yat-sen University, Zhuhai, China
| | - Vimalkumar Prajapati
- Division of Microbiology and Environmental Biotechnology, Aspee Shakilam Biotechnology Institute, Navsari Agricultural University, Surat, India
| | - Shobha Prajapati
- SVP-A School of Sardar Vallabhbhai National Institute of Technology, Surat, India
| | - Harsh Bais
- Delaware Biotechnology Institute, University of Delaware, Newark, DE, United States
| | - Jianguo Lu
- School of Marine Sciences, Sun Yat-sen University, Zhuhai, China; Southern Marine Science and Engineering Guangdong Laboratory, Zhuhai, China
| |
|
105
|
Zou W, Ye G, Liu C, Zhang K, Li H, Yang J. Comparative genome analysis of Clostridium beijerinckii strains isolated from pit mud of Chinese strong flavor baijiu ecosystem. G3 (BETHESDA, MD.) 2021; 11:6364901. [PMID: 34542586 PMCID: PMC8527462 DOI: 10.1093/g3journal/jkab317] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/07/2021] [Accepted: 08/26/2021] [Indexed: 12/24/2022]
Abstract
Clostridium beijerinckii is a well-known anaerobic solventogenic bacterium which inhabits a wide range of different niches. Previously, we isolated five butyrate-producing C. beijerinckii strains from pit mud (PM) of strong-flavor baijiu (SFB) ecosystems. Genome annotation of the five strains showed that they could assimilate various carbon sources as well as ammonium to produce acetate, butyrate, lactate, hydrogen, and esters but did not produce the undesirable flavors isopropanol and acetone, making them useful for further exploration in SFB production. Our analysis of the genomes of an additional 233 C. beijerinckii strains revealed a pangenome that is open based on current sampling and will likely change with additional genomes. The core genome, accessory genome, and strain-specific genes comprised 1567, 8851, and 2154 genes, respectively. A total of 298 genes were found only in the five C. beijerinckii strains from PM, among which only 77 genes were assigned to Clusters of Orthologous Genes categories. In addition, 15 transposase and 12 phage integrase families were found in all five C. beijerinckii strains from PM. Between 18 and 21 genome islands were predicted for the five C. beijerinckii genomes. The existence of a large number of mobile genetic elements indicated that the genomes of the five C. beijerinckii strains evolved with the loss or insertion of DNA fragments in the PM of SFB ecosystems. This study presents a genomic framework of C. beijerinckii strains from PM that could be used for genetic diversification studies and further exploration of these strains.
Affiliation(s)
- Wei Zou
- College of Bioengineering, Sichuan University of Science & Engineering, Yibin, Sichuan 644005, China
| | - Guangbin Ye
- College of Bioengineering, Sichuan University of Science & Engineering, Yibin, Sichuan 644005, China
| | - Chaojie Liu
- College of Bioengineering, Sichuan University of Science & Engineering, Yibin, Sichuan 644005, China
| | - Kaizheng Zhang
- College of Bioengineering, Sichuan University of Science & Engineering, Yibin, Sichuan 644005, China
| | - Hehe Li
- Beijing Key Laboratory of Flavor Chemistry, Beijing Technology and Business University (BTBU), Beijing 100048, China
| | - Jiangang Yang
- College of Bioengineering, Sichuan University of Science & Engineering, Yibin, Sichuan 644005, China
| |
|
106
|
Mehrotra T, Devi TB, Kumar S, Talukdar D, Karmakar SP, Kothidar A, Verma J, Kumari S, Alexander SM, Retnakumar RJ, Devadas K, Ray A, Mutreja A, Nair GB, Chattopadhyay S, Das B. Antimicrobial resistance and virulence in Helicobacter pylori: Genomic insights. Genomics 2021; 113:3951-3966. [PMID: 34619341 DOI: 10.1016/j.ygeno.2021.10.002] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 05/14/2021] [Revised: 09/10/2021] [Accepted: 10/01/2021] [Indexed: 12/26/2022]
Abstract
Microbes evolve rapidly by modifying their genome through mutations or acquisition of genetic elements. Antimicrobial resistance in Helicobacter pylori is increasingly prevalent in India. However, limited information is available about the genome of resistant H. pylori isolated from India. Our pan- and core-genome based analyses of 54 Indian H. pylori strains revealed plasticity of its genome. H. pylori is highly heterogeneous both in terms of the genomic content and DNA sequence homology of ARGs and virulence factors. We observed that the H. pylori strains are clustered according to their geographical locations. The presence of point mutations in the ARGs and absence of acquired genetic elements linked with ARGs suggest target modifications are the primary mechanism of its antibiotic resistance. The findings of the present study would help in better understanding the emergence of drug-resistant H. pylori and controlling gastric disorders by advancing clinical guidance on selected treatment regimens.
Affiliation(s)
- Tanshi Mehrotra
- Molecular Genetics Laboratory, Infection and Immunology Division, Translational Health Science and Technology Institute, Faridabad, India
| | - T Barani Devi
- Microbiome Laboratory, Pathogen Biology, Rajiv Gandhi Centre for Biotechnology, Trivandrum, Kerala, India
| | - Shakti Kumar
- Molecular Genetics Laboratory, Infection and Immunology Division, Translational Health Science and Technology Institute, Faridabad, India
| | - Daizee Talukdar
- Molecular Genetics Laboratory, Infection and Immunology Division, Translational Health Science and Technology Institute, Faridabad, India
| | - Sonali Porey Karmakar
- Molecular Genetics Laboratory, Infection and Immunology Division, Translational Health Science and Technology Institute, Faridabad, India
| | - Akansha Kothidar
- Molecular Genetics Laboratory, Infection and Immunology Division, Translational Health Science and Technology Institute, Faridabad, India
| | - Jyoti Verma
- Molecular Genetics Laboratory, Infection and Immunology Division, Translational Health Science and Technology Institute, Faridabad, India
| | - Shashi Kumari
- Molecular Genetics Laboratory, Infection and Immunology Division, Translational Health Science and Technology Institute, Faridabad, India
| | - Sneha Mary Alexander
- Microbiome Laboratory, Pathogen Biology, Rajiv Gandhi Centre for Biotechnology, Trivandrum, Kerala, India
| | - R J Retnakumar
- Microbiome Laboratory, Pathogen Biology, Rajiv Gandhi Centre for Biotechnology, Trivandrum, Kerala, India
| | - Krishnadas Devadas
- Department of Gastroenterology, Government Medical College, Thiruvananthapuram, Kerala, India
| | - Animesh Ray
- Department of Medicine, All India Institute of Medical Sciences, New Delhi, India
| | - Ankur Mutreja
- Molecular Genetics Laboratory, Infection and Immunology Division, Translational Health Science and Technology Institute, Faridabad, India; Department of Medicine, Addenbrookes Hospital, University of Cambridge, Cambridge CB20QQ, United Kingdom
| | - G Balakrish Nair
- Microbiome Laboratory, Pathogen Biology, Rajiv Gandhi Centre for Biotechnology, Trivandrum, Kerala, India
| | - Santanu Chattopadhyay
- Microbiome Laboratory, Pathogen Biology, Rajiv Gandhi Centre for Biotechnology, Trivandrum, Kerala, India.
| | - Bhabatosh Das
- Molecular Genetics Laboratory, Infection and Immunology Division, Translational Health Science and Technology Institute, Faridabad, India.
| |
|
107
|
Ferrés I, Iraola G. An object-oriented framework for evolutionary pangenome analysis. CELL REPORTS METHODS 2021; 1:100085. [PMID: 35474671 PMCID: PMC9017228 DOI: 10.1016/j.crmeth.2021.100085] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 03/09/2021] [Revised: 06/04/2021] [Accepted: 08/25/2021] [Indexed: 05/13/2023]
Abstract
Pangenome analysis is fundamental to exploring the molecular evolution occurring in bacterial populations. Here, we introduce Pagoo, an R framework that enables straightforward handling of pangenome data. The encapsulated nature of Pagoo allows the storage of complex molecular and phenotypic information using an object-oriented approach. This makes it easy to move back and forth through the data within a single programming environment and to save any stage of analysis (including the raw data) in a single file, making it shareable and reproducible. Pagoo provides tools to query, subset, compare, visualize, and perform statistical analyses, in concert with other microbial genomics packages available in the R ecosystem. As working examples, we used 1,000 Escherichia coli genomes to show that Pagoo is scalable, and a global dataset of Campylobacter fetus genomes to identify evolutionary patterns and genomic markers of host-adaptation in this pathogen.
Affiliation(s)
- Ignacio Ferrés
- Microbial Genomics Laboratory, Institut Pasteur Montevideo, Montevideo, Uruguay
- Center for Innovation in Epidemiological Surveillance, Institut Pasteur Montevideo, Montevideo, Uruguay
| | - Gregorio Iraola
- Microbial Genomics Laboratory, Institut Pasteur Montevideo, Montevideo, Uruguay
- Center for Innovation in Epidemiological Surveillance, Institut Pasteur Montevideo, Montevideo, Uruguay
- Wellcome Sanger Institute, Hinxton, UK
- Center for Integrative Biology, Universidad Mayor, Santiago de Chile, Chile
| |
|
108
|
Reis AC, Cunha MV. Genome-wide estimation of recombination, mutation and positive selection enlightens diversification drivers of Mycobacterium bovis. Sci Rep 2021; 11:18789. [PMID: 34552144 PMCID: PMC8458382 DOI: 10.1038/s41598-021-98226-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/20/2021] [Accepted: 08/27/2021] [Indexed: 02/08/2023]
Abstract
Genome sequencing has reinvigorated the infectious disease research field, shedding light on disease epidemiology, pathogenesis, host-pathogen interactions and also evolutionary processes exerted upon pathogens. Mycobacterium tuberculosis complex (MTBC), enclosing M. bovis as one of its animal-adapted members causing tuberculosis (TB) in terrestrial mammals, is a paradigmatic model of bacterial evolution. As other MTBC members, M. bovis is postulated as a strictly clonal, slowly evolving pathogen, with apparently no signs of recombination or horizontal gene transfer. In this work, we applied comparative genomics to a whole genome sequence (WGS) dataset composed by 70 M. bovis from different lineages (European and African) to gain insights into the evolutionary forces that shape genetic diversification in M. bovis. Three distinct approaches were used to estimate signs of recombination. Globally, a small number of recombinant events was identified and confirmed by two independent methods with solid support. Still, recombination reveals a weaker effect on M. bovis diversity compared with mutation (overall r/m = 0.037). The differential r/m average values obtained across the clonal complexes of M. bovis in our dataset are consistent with the general notion that the extent of recombination may vary widely among lineages assigned to the same taxonomical species. Based on this work, recombination in M. bovis cannot be excluded and should thus be a topic of further effort in future comparative genomics studies for which WGS of large datasets from different epidemiological scenarios across the world is crucial. A smaller M. bovis dataset (n = 42) from a multi-host TB endemic scenario was then subjected to additional analyses, with the identification of more than 1,800 sites wherein at least one strain showed a single nucleotide polymorphism (SNP). 
The majority (87.1%) was located in coding regions, with the global ratio of non-synonymous upon synonymous alterations (dN/dS) exceeding 1.5, suggesting that positive selection is an important evolutionary force exerted upon M. bovis. A higher percentage of SNPs was detected in genes enriched into "lipid metabolism", "cell wall and cell processes" and "intermediary metabolism and respiration" functional categories, revealing their underlying importance in M. bovis biology and evolution. A closer look at genes prone to horizontal gene transfer in the MTBC ancestor and included in the 3R (DNA repair, replication and recombination) system revealed a global average negative value for Tajima's D neutrality test, suggesting that past selective sweeps and population expansion after a recent bottleneck remain as major evolutionary drivers of the obligatory pathogen M. bovis in its struggle with the host.
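For readers unfamiliar with the neutrality test mentioned above, the following is a minimal sketch of Tajima's D computed from a small alignment using the standard Tajima (1989) formulas. It is not the authors' pipeline: the function name and the toy sequences are hypothetical, and the code assumes equal-length, gap-free sequences and at least four samples.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(sequences):
    """Tajima's D for aligned, equal-length, gap-free sequences.

    Returns None when the statistic is undefined (fewer than four
    sequences or no segregating sites).
    """
    n = len(sequences)
    # Segregating sites: alignment columns with more than one allele.
    s = sum(1 for column in zip(*sequences) if len(set(column)) > 1)
    if n < 4 or s == 0:
        return None

    # pi: mean number of pairwise differences.
    n_pairs = n * (n - 1) / 2
    pi = sum(
        sum(a != b for a, b in zip(x, y))
        for x, y in combinations(sequences, 2)
    ) / n_pairs

    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Toy data: an excess of rare variants versus intermediate-frequency variants.
rare_variants = ["TAA", "ATA", "AAT", "AAA"]
balanced_variants = ["AAA", "AAA", "TTT", "TTT"]
d_rare = tajimas_d(rare_variants)
d_balanced = tajimas_d(balanced_variants)
```

In this toy case, the alignment dominated by singletons gives a negative D, the direction reported in the abstract, while the alignment with intermediate-frequency variants gives a positive D.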
Affiliation(s)
- Ana C Reis
- Centre for Ecology, Evolution and Environmental Changes (cE3c), Faculdade de Ciências, Universidade de Lisboa, Campo Grande, C2, Room 2.4.11, 1749-016, Lisbon, Portugal
- Biosystems and Integrative Sciences Institute (BioISI), Faculdade de Ciências da Universidade de Lisboa, Lisbon, Portugal
| | - Mónica V Cunha
- Centre for Ecology, Evolution and Environmental Changes (cE3c), Faculdade de Ciências, Universidade de Lisboa, Campo Grande, C2, Room 2.4.11, 1749-016, Lisbon, Portugal.
- Biosystems and Integrative Sciences Institute (BioISI), Faculdade de Ciências da Universidade de Lisboa, Lisbon, Portugal.
| |
|
109
|
Agarwal G, Choudhary D, Stice SP, Myers BK, Gitaitis RD, Venter SN, Kvitko BH, Dutta B. Pan-Genome-Wide Analysis of Pantoea ananatis Identified Genes Linked to Pathogenicity in Onion. Front Microbiol 2021; 12:684756. [PMID: 34489883 PMCID: PMC8417944 DOI: 10.3389/fmicb.2021.684756] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/23/2021] [Accepted: 07/28/2021] [Indexed: 11/13/2022]
Abstract
Pantoea ananatis, a gram negative and facultative anaerobic bacterium is a member of a Pantoea spp. complex that causes center rot of onion, which significantly affects onion yield and quality. This pathogen does not have typical virulence factors like type II or type III secretion systems but appears to require a biosynthetic gene-cluster, HiVir/PASVIL (located chromosomally comprised of 14 genes), for a phosphonate secondary metabolite, and the 'alt' gene cluster (located in plasmid and comprised of 11 genes) that aids in bacterial colonization in onion bulbs by imparting tolerance to thiosulfinates. We conducted a deep pan-genome-wide association study (pan-GWAS) to predict additional genes associated with pathogenicity in P. ananatis using a panel of diverse strains (n = 81). We utilized a red-onion scale necrosis assay as an indicator of pathogenicity. Based on this assay, we differentiated pathogenic (n = 51)- vs. non-pathogenic (n = 30)-strains phenotypically. Pan-genome analysis revealed a large core genome of 3,153 genes and a flexible accessory genome. Pan-GWAS using the presence and absence variants (PAVs) predicted 42 genes, including 14 from the previously identified HiVir/PASVIL cluster associated with pathogenicity, and 28 novel genes that were not previously associated with pathogenicity in onion. Of the 28 novel genes identified, eight have annotated functions of site-specific tyrosine kinase, N-acetylmuramoyl-L-alanine amidase, conjugal transfer, and HTH-type transcriptional regulator. The remaining 20 genes are currently hypothetical. Further, a core-genome SNPs-based phylogeny and horizontal gene transfer (HGT) studies were also conducted to assess the extent of lateral gene transfer among diverse P. ananatis strains. 
Phylogenetic analysis based on PAVs and whole genome multi locus sequence typing (wgMLST) rather than core-genome SNPs distinguished red-scale necrosis inducing (pathogenic) strains from non-scale necrosis inducing (non-pathogenic) strains of P. ananatis. A total of 1182 HGT events including the HiVir/PASVIL and alt cluster genes were identified. These events could be regarded as a major contributing factor to the diversification, niche-adaptation and potential acquisition of pathogenicity/virulence genes in P. ananatis.
Affiliation(s)
- Gaurav Agarwal
- Department of Plant Pathology, Coastal Plain Experimental Station, University of Georgia, Tifton, GA, United States
| | - Divya Choudhary
- Department of Plant Pathology, Coastal Plain Experimental Station, University of Georgia, Tifton, GA, United States
| | - Shaun P Stice
- Department of Plant Pathology, University of Georgia, Athens, GA, United States
| | - Brendon K Myers
- Department of Plant Pathology, Coastal Plain Experimental Station, University of Georgia, Tifton, GA, United States
| | - Ronald D Gitaitis
- Department of Plant Pathology, Coastal Plain Experimental Station, University of Georgia, Tifton, GA, United States
| | - Stephanus N Venter
- Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute, University of Pretoria, Pretoria, South Africa
| | - Brian H Kvitko
- Department of Plant Pathology, University of Georgia, Athens, GA, United States
| | - Bhabesh Dutta
- Department of Plant Pathology, Coastal Plain Experimental Station, University of Georgia, Tifton, GA, United States
| |
|
110
|
Abstract
The reference human genome sequence is inarguably the most important and widely used resource in the fields of human genetics and genomics. It has transformed the conduct of biomedical sciences and brought invaluable benefits to the understanding and improvement of human health. However, the commonly used reference sequence has profound limitations, because across much of its span, it represents the sequence of just one human haplotype. This single, monoploid reference structure presents a critical barrier to representing the broad genomic diversity in the human population. In this review, we discuss the modernization of the reference human genome sequence to a more complete reference of human genomic diversity, known as a human pangenome.
Affiliation(s)
- Karen H Miga
- UC Santa Cruz Genomics Institute and Department of Biomedical Engineering, University of California, Santa Cruz, California 95064, USA;
| | - Ting Wang
- Department of Genetics, Edison Family Center for Genome Sciences and Systems Biology, and McDonnell Genome Institute, Washington University School of Medicine, St. Louis, Missouri 63110, USA;
| |
|
111
|
Senkevich TG, Yutin N, Wolf YI, Koonin EV, Moss B. Ancient Gene Capture and Recent Gene Loss Shape the Evolution of Orthopoxvirus-Host Interaction Genes. mBio 2021; 12:e0149521. [PMID: 34253028 PMCID: PMC8406176 DOI: 10.1128/mbio.01495-21] [Citation(s) in RCA: 59] [Impact Index Per Article: 19.7] [Received: 05/20/2021] [Accepted: 05/24/2021] [Indexed: 01/27/2023]
Abstract
The survival of viruses depends on their ability to resist host defenses and, of all animal virus families, the poxviruses have the most antidefense genes. Orthopoxviruses (ORPV), a genus within the subfamily Chordopoxvirinae, infect diverse mammals and include one of the most devastating human pathogens, the now eradicated smallpox virus. ORPV encode ∼200 genes, of which roughly half are directly involved in virus genome replication and expression as well as virion morphogenesis. The remaining ∼100 "accessory" genes are responsible for virus-host interactions, particularly counter-defense of innate immunity. Complete sequences are currently available for several hundred ORPV genomes isolated from a variety of mammalian hosts, providing a rich resource for comparative genomics and reconstruction of ORPV evolution. To identify the provenance and evolutionary trends of the ORPV accessory genes, we constructed clusters including the orthologs of these genes from all chordopoxviruses. Most of the accessory genes were captured in three major waves early in chordopoxvirus evolution, prior to the divergence of ORPV and the sister genus Centapoxvirus from their common ancestor. The capture of these genes from the host was followed by extensive gene duplication, yielding several paralogous gene families. In addition, nine genes were gained during the evolution of ORPV themselves. In contrast, nearly every accessory gene was lost, some on multiple, independent occasions in numerous lineages of ORPV, so that no ORPV retains them all. A variety of functional interactions could be inferred from examination of pairs of ORPV accessory genes that were either often or rarely lost concurrently. IMPORTANCE Orthopoxviruses (ORPV) include smallpox (variola) virus, one of the most devastating human pathogens, and vaccinia virus, comprising the vaccine used for smallpox eradication. 
Among roughly 200 ORPV genes, about half are essential for genome replication and expression as well as virion morphogenesis, whereas the remaining half consists of accessory genes counteracting the host immune response. We reannotated the accessory genes of ORPV, predicting the functions of uncharacterized genes, and reconstructed the history of their gain and loss during the evolution of ORPV. Most of the accessory genes were acquired in three major waves antedating the origin of ORPV from chordopoxviruses. The evolution of ORPV themselves was dominated by gene loss, with numerous genes lost at the base of each major group of ORPV. Examination of pairs of ORPV accessory genes that were either often or rarely lost concurrently during ORPV evolution allows prediction of different types of functional interactions.
Affiliation(s)
- Tatiana G. Senkevich
- Laboratory of Viral Diseases, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland, USA
| | - Natalya Yutin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA
| | - Yuri I. Wolf
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA
| | - Eugene V. Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA
| | - Bernard Moss
- Laboratory of Viral Diseases, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland, USA
| |
|
112
|
Li Q, Tian S, Yan B, Liu CM, Lam TW, Li R, Luo R. Building a Chinese pan-genome of 486 individuals. Commun Biol 2021; 4:1016. [PMID: 34462542 PMCID: PMC8405635 DOI: 10.1038/s42003-021-02556-6] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 12/12/2020] [Accepted: 08/13/2021] [Indexed: 02/07/2023] Open
Abstract
Pan-genome sequence analysis of human population ancestry is critical for expanding and better defining human genome sequence diversity. However, the amount of genetic variation missing from current human reference sequences remains unknown. Here, we used 486 deep-sequenced Han Chinese genomes to identify 276 Mbp of DNA sequences that, to our knowledge, are absent in the current human reference. We classified these sequences into individual-specific and common sequences, and propose that the common sequence size is uncapped with a growing population. The 46.646 Mbp of common sequences obtained from the 486 individuals improved the accuracy of variant calling and the mapping rate when added to the reference genome. We also analyzed the genomic positions of these common sequences and found that they came from genomic regions characterized by high mutation rate and low pathogenicity. Our study authenticates the Chinese pan-genome as representative of DNA sequences specific to the Han Chinese population missing from the GRCh38 reference genome and establishes the newly defined common sequences as candidates to supplement the current human reference.
Affiliation(s)
- Qiuhui Li
- Department of Computer Science, The University of Hong Kong, Hong Kong, China
| | - Shilin Tian
- Novogene Bioinformatics Institute, Beijing, China
| | - Bin Yan
- Department of Computer Science, The University of Hong Kong, Hong Kong, China
| | - Chi Man Liu
- Department of Computer Science, The University of Hong Kong, Hong Kong, China
| | - Tak-Wah Lam
- Department of Computer Science, The University of Hong Kong, Hong Kong, China
| | - Ruiqiang Li
- Novogene Bioinformatics Institute, Beijing, China
| | - Ruibang Luo
- Department of Computer Science, The University of Hong Kong, Hong Kong, China
| |
|
113
|
Shapiro JW, Putonti C. Rephine.r: a pipeline for correcting gene calls and clusters to improve phage pangenomes and phylogenies. PeerJ 2021; 9:e11950. [PMID: 34434663 PMCID: PMC8351571 DOI: 10.7717/peerj.11950] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/30/2021] [Accepted: 07/20/2021] [Indexed: 12/05/2022] Open
Abstract
Background: A pangenome is the collection of all genes found in a set of related genomes. For microbes, these genomes are often different strains of the same species, and the pangenome offers a means to compare gene content variation with differences in phenotypes, ecology, and phylogenetic relatedness. Though most frequently applied to bacteria, there is growing interest in adapting pangenome analysis to bacteriophages. However, working with phage genomes presents new challenges. First, most phage families are under-sampled, and homologous genes in related viruses can be difficult to identify. Second, homing endonucleases and intron-like sequences may be present, resulting in fragmented gene calls. Each of these issues can reduce the accuracy of standard pangenome analysis tools. Methods: We developed an R pipeline called Rephine.r that takes as input the gene clusters produced by an initial pangenomics workflow. Rephine.r then proceeds in two primary steps. First, it identifies three common causes of fragmented gene calls: (1) indels creating early stop codons and new start codons; (2) interruption by a selfish genetic element; and (3) splitting at the ends of the reported genome. Fragmented genes are then fused to create new sequence alignments. In tandem, Rephine.r searches for distant homologs separated into different gene families using Hidden Markov Models. Significant hits are used to merge families into larger clusters. A final round of fragment identification is then run, and results may be used to infer single-copy core genomes and phylogenetic trees. Results: We applied Rephine.r to three well-studied phage groups: the Tevenvirinae (e.g., T4), the Studiervirinae (e.g., T7), and the Pbunaviruses (e.g., PB1). In each case, Rephine.r recovered additional members of the single-copy core genome and increased the overall bootstrap support of the phylogeny. 
The Rephine.r pipeline is provided through GitHub (https://www.github.com/coevoeco/Rephine.r) as a single script for automated analysis and with utility functions to assist in building single-copy core genomes and predicting the sources of fragmented genes.
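The family-merging step described in the abstract (significant hits between gene families are used to merge them into larger clusters) can be illustrated with a small union-find sketch. This is not Rephine.r's code, which is written in R; the function name `merge_families`, the family IDs, and the hit pairs below are hypothetical, and the sketch assumes the significance filtering of HMM hits has already been done elsewhere.

```python
from collections import defaultdict


def merge_families(families, hits):
    """Merge gene families that are linked by significant hits.

    families: iterable of family IDs (hypothetical labels).
    hits: iterable of (family_a, family_b) pairs already judged significant;
          both IDs must appear in `families`.
    Returns a sorted list of sorted lists, one per merged cluster.
    """
    parent = {f: f for f in families}

    def find(x):
        # Follow parent links to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in hits:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # join the two clusters

    clusters = defaultdict(list)
    for f in parent:
        clusters[find(f)].append(f)
    return sorted(sorted(members) for members in clusters.values())


families = ["fam1", "fam2", "fam3", "fam4", "fam5"]
hits = [("fam1", "fam3"), ("fam3", "fam4")]  # illustrative hits only
merged = merge_families(families, hits)
print(merged)  # [['fam1', 'fam3', 'fam4'], ['fam2'], ['fam5']]
```

Merging is transitive here: fam1 and fam4 end up together because both are linked to fam3, which is one plausible reading of "merge families into larger clusters".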
Affiliation(s)
- Jason W Shapiro
- Department of Biology, Loyola University Chicago, Chicago, IL, United States of America
| | - Catherine Putonti
- Department of Biology, Loyola University Chicago, Chicago, IL, United States of America; Department of Microbiology and Immunology, Stritch School of Medicine, Loyola University Chicago, Maywood, IL, United States of America; Bioinformatics Program, Loyola University Chicago, Chicago, IL, United States of America
| |
|
114
|
Abstract
Pangenomes are organized collections of the genomic information from related individuals or groups. Graphical pangenomics is the study of these pangenomes using graphical methods to identify and analyze genes, regions, and mutations of interest to an array of biological questions. This field has seen significant progress in recent years, including the development of graph-based models that better resolve biological phenomena, and an explosion of new tools for mapping reads, creating graphical genomes, and performing pangenome analysis. In this review, we discuss recent developments in models, algorithms associated with graphical genomes, and comparisons between similar tools. In addition, we briefly discuss what these developments may mean for the future of genomics.
|
115
|
Integrated mass spectrometry-based multi-omics for elucidating mechanisms of bacterial virulence. Biochem Soc Trans 2021; 49:1905-1926. [PMID: 34374408 DOI: 10.1042/bst20191088] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/10/2021] [Revised: 07/19/2021] [Accepted: 07/21/2021] [Indexed: 11/17/2022]
Abstract
Despite being considered the simplest form of life, bacteria remain enigmatic, particularly in light of pathogenesis and evolving antimicrobial resistance. After three decades of genomics, we remain some way from understanding these organisms, and a substantial proportion of genes remain functionally unknown. Methodological advances, principally mass spectrometry (MS), are paving the way for parallel analysis of the proteome, metabolome and lipidome. Each provides a global, complementary assay, in addition to genomics, and the ability to better comprehend how pathogens respond to changes in their internal (e.g. mutation) and external environments consistent with infection-like conditions. Such responses include accessing necessary nutrients for survival in a hostile environment where co-colonizing bacteria and normal flora are acclimated to the prevailing conditions. Multi-omics can be harnessed across temporal and spatial (sub-cellular) dimensions to understand adaptation at the molecular level. Gene deletion libraries, in conjunction with large-scale approaches and evolving bioinformatics integration, will greatly facilitate next-generation vaccines and antimicrobial interventions by highlighting novel targets and pathogen-specific pathways. MS is also central in phenotypic characterization of surface biomolecules such as lipid A, as well as aiding in the determination of protein interactions and complexes. There is increasing evidence that bacteria are capable of widespread post-translational modification, including phosphorylation, glycosylation and acetylation; with each contributing to virulence. This review focuses on the bacterial genotype to phenotype transition and surveys the recent literature showing how the genome can be validated at the proteome, metabolome and lipidome levels to provide an integrated view of organism response to host conditions.
|
116
|
Sharma D, Sharma A, Singh B, Verma SK. Pan-proteome profiling of emerging and re-emerging zoonotic pathogen Orientia tsutsugamushi for getting insight into microbial pathogenesis. Microb Pathog 2021; 158:105103. [PMID: 34298125 DOI: 10.1016/j.micpath.2021.105103] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/02/2021] [Revised: 07/13/2021] [Accepted: 07/16/2021] [Indexed: 01/21/2023]
Abstract
With the occurrence and evolution of antibiotic and multidrug resistance in bacteria, most of the existing remedies are becoming ineffective. Pan-proteome exploration of bacterial pathogens helps to identify wide-spectrum therapeutic targets that will be effective against all strains in a species. The current study focuses on pan-proteome profiling of the zoonotic pathogen Orientia tsutsugamushi (Ott) for the identification of potential therapeutic targets. The pan-proteome of Ott is estimated to be extensive, comprising 1429 protein clusters, of which 694 were core, 391 were accessory, and 344 were unique. It was revealed that 622 proteins were essential, 222 proteins were virulence factors, and 42 proteins were involved in antibiotic resistance. The potential therapeutic targets were further classified into eleven broad classes, among which gene expression and regulation, transport, and metabolism were dominant. The biological interactome analysis of therapeutic targets revealed that numerous interactions were present among the proteins involved in DNA replication, ribosome assembly, cell wall metabolism, cell division, and antimicrobial resistance. The predicted therapeutic targets from the pan-proteome of Ott are involved in various biological processes, virulence, and antibiotic resistance; hence they are envisioned as potential candidates for drug discovery to combat scrub typhus.
Affiliation(s)
- Dixit Sharma
- Centre for Computational Biology and Bioinformatics, School of Life Sciences, Central University of Himachal Pradesh, Kangra, Himachal Pradesh, 176206, India
| | - Ankita Sharma
- Centre for Computational Biology and Bioinformatics, School of Life Sciences, Central University of Himachal Pradesh, Kangra, Himachal Pradesh, 176206, India
| | - Birbal Singh
- ICAR-Indian Veterinary Research Institute, Regional Station, Palampur, Himachal Pradesh, 176061, India
| | - Shailender Kumar Verma
- Centre for Computational Biology and Bioinformatics, School of Life Sciences, Central University of Himachal Pradesh, Kangra, Himachal Pradesh, 176206, India
| |
|
117
|
Koeksoy E, Bezuidt OM, Bayer T, Chan CS, Emerson D. Zetaproteobacteria Pan-Genome Reveals Candidate Gene Cluster for Twisted Stalk Biosynthesis and Export. Front Microbiol 2021; 12:679409. [PMID: 34220764 PMCID: PMC8250860 DOI: 10.3389/fmicb.2021.679409] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 03/11/2021] [Accepted: 05/06/2021] [Indexed: 12/15/2022] Open
Abstract
Twisted stalks are morphologically unique bacterial extracellular organo-metallic structures containing Fe(III) oxyhydroxides that are produced by microaerophilic Fe(II)-oxidizers belonging to the Betaproteobacteria and Zetaproteobacteria. Understanding the underlying genetic and physiological mechanisms of stalk formation is of great interest based on their potential as novel biogenic nanomaterials and their relevance as putative biomarkers for microbial Fe(II) oxidation on ancient Earth. Despite the recognition of these special biominerals for over 150 years, the genetic foundation for the stalk phenotype has remained unresolved. Here we present a candidate gene cluster for the biosynthesis and secretion of the stalk organic matrix that we identified with trait-based analyses of a pan-genome comprising 16 Zetaproteobacteria isolate genomes. The “stalk formation in Zetaproteobacteria” (sfz) cluster comprises six genes (sfz1-sfz6), of which sfz1 and sfz2 were predicted with functions in exopolysaccharide synthesis, regulation, and export, sfz4 and sfz6 with functions in cell wall synthesis manipulation and carbohydrate hydrolysis, and sfz3 and sfz5 with unknown functions. The stalk-forming Betaproteobacteria Ferriphaselus R-1 and OYT-1, as well as dread-forming Zetaproteobacteria Mariprofundus aestuarium CP-5 and Mariprofundus ferrinatatus CP-8 contain distant sfz gene homologs, whereas stalk-less Zetaproteobacteria and Betaproteobacteria lack the entire gene cluster. Our pan-genome analysis further revealed a significant enrichment of clusters of orthologous groups (COGs) across all Zetaproteobacteria isolate genomes that are associated with the regulation of a switch between sessile and motile growth controlled by the intracellular signaling molecule c-di-GMP. Potential interactions between stalk-former unique transcription factor genes, sfz genes, and c-di-GMP point toward a c-di-GMP regulated surface attachment function of stalks during sessile growth.
Affiliation(s)
- Elif Koeksoy
- Bigelow Laboratory for Ocean Sciences, East Boothbay, ME, United States; Leibniz Institute DSMZ (German Collection of Microorganisms and Cell Cultures), Braunschweig, Germany
| | - Oliver M Bezuidt
- Bigelow Laboratory for Ocean Sciences, East Boothbay, ME, United States
| | - Timm Bayer
- Geomicrobiology Group, Center for Applied Geoscience, University of Tübingen, Tübingen, Germany
| | - Clara S Chan
- Department of Earth Sciences, University of Delaware, Newark, DE, United States; School of Marine Sciences and Policy, University of Delaware, Newark, DE, United States
| | - David Emerson
- Bigelow Laboratory for Ocean Sciences, East Boothbay, ME, United States
| |
|
118
|
Harris CD, Torrance EL, Raymann K, Bobay LM. CoreCruncher: Fast and Robust Construction of Core Genomes in Large Prokaryotic Data Sets. Mol Biol Evol 2021; 38:727-734. [PMID: 32886787 PMCID: PMC7826169 DOI: 10.1093/molbev/msaa224] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 01/03/2023] Open
Abstract
The core genome represents the set of genes shared by all, or nearly all, strains of a given population or species of prokaryotes. Inferring the core genome is integral to many genomic analyses; however, most methods rely on the comparison of all pairs of genomes, a step that is becoming increasingly difficult given the massive accumulation of genomic data. Here, we present CoreCruncher, a program that robustly and rapidly constructs core genomes across hundreds or thousands of genomes. CoreCruncher does not compute all pairwise genome comparisons and uses a heuristic based on the distributions of identity scores to classify sequences as orthologs or paralogs/xenologs. Although it is much faster than current methods, our results indicate that our approach is more conservative than other tools and less sensitive to the presence of paralogs and xenologs. CoreCruncher is freely available from: https://github.com/lbobay/CoreCruncher. CoreCruncher is written in Python 3.7 and can also run on Python 2.7 without modification. It requires the Python library NumPy and either USEARCH or BLAST. Certain options require the programs MUSCLE or MAFFT.
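The definition at the start of this abstract (genes shared by all, or nearly all, strains) can be made concrete with a short sketch. It is not CoreCruncher's algorithm, which avoids all-pairs comparison and relies on identity-score distributions. It only shows the final thresholding step, assuming ortholog assignment has already produced a set of gene family IDs per genome. The function name `core_genes`, the `min_fraction` parameter, and the strain data are hypothetical.

```python
def core_genes(presence, min_fraction=1.0):
    """Return gene families present in at least `min_fraction` of genomes.

    presence: dict mapping genome name -> set of gene family IDs.
    min_fraction: 1.0 gives the strict core; lower values give a relaxed
                  ("nearly all strains") core.
    """
    if not 0 < min_fraction <= 1:
        raise ValueError("min_fraction must be in the interval (0, 1]")
    n_genomes = len(presence)
    if n_genomes == 0:
        return set()

    # Count in how many genomes each gene family occurs.
    counts = {}
    for genes in presence.values():
        for gene in genes:
            counts[gene] = counts.get(gene, 0) + 1

    return {g for g, c in counts.items() if c / n_genomes >= min_fraction}


# Illustrative presence/absence data for four hypothetical strains.
strains = {
    "A": {"dnaA", "gyrB", "recA", "tox1"},
    "B": {"dnaA", "gyrB", "recA"},
    "C": {"dnaA", "gyrB", "tox1"},
    "D": {"dnaA", "gyrB", "recA"},
}
print(sorted(core_genes(strains)))        # ['dnaA', 'gyrB']
print(sorted(core_genes(strains, 0.75)))  # ['dnaA', 'gyrB', 'recA']
```

Relaxing the threshold to 75% admits recA (present in three of four strains) but not tox1 (two of four), which shows how sensitive the core size is to the chosen cutoff.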
Affiliation(s)
- Connor D Harris
- Department of Biology, University of North Carolina Greensboro, Greensboro, NC
| | - Ellis L Torrance
- Department of Biology, University of North Carolina Greensboro, Greensboro, NC
| | - Kasie Raymann
- Department of Biology, University of North Carolina Greensboro, Greensboro, NC
| | - Louis-Marie Bobay
- Department of Biology, University of North Carolina Greensboro, Greensboro, NC
| |
|
119
|
Lei L, Goltsman E, Goodstein D, Wu GA, Rokhsar DS, Vogel JP. Plant Pan-Genomics Comes of Age. ANNUAL REVIEW OF PLANT BIOLOGY 2021; 72:411-435. [PMID: 33848428 DOI: 10.1146/annurev-arplant-080720-105454] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Indexed: 05/22/2023]
Abstract
A pan-genome is the nonredundant collection of genes and/or DNA sequences in a species. Numerous studies have shown that plant pan-genomes are typically much larger than the genome of any individual and that a sizable fraction of the genes in any individual are present in only some genomes. The construction and interpretation of plant pan-genomes are challenging due to the large size and repetitive content of plant genomes. Most pan-genomes are largely focused on nontransposable element protein coding genes because they are more easily analyzed and defined than noncoding and repetitive sequences. Nevertheless, noncoding and repetitive DNA play important roles in determining the phenotype and genome evolution. Fortunately, it is now feasible to make multiple high-quality genomes that can be used to construct high-resolution pan-genomes that capture all the variation. However, assembling, displaying, and interacting with such high-resolution pan-genomes will require the development of new tools.
Affiliation(s)
- Li Lei
- DOE Joint Genome Institute, Berkeley, California 94720, USA
| | - Eugene Goltsman
- DOE Joint Genome Institute, Berkeley, California 94720, USA
| | - David Goodstein
- DOE Joint Genome Institute, Berkeley, California 94720, USA
| | | | - Daniel S Rokhsar
- DOE Joint Genome Institute, Berkeley, California 94720, USA
- Department of Molecular and Cell Biology, University of California, Berkeley, California 94720, USA
| | - John P Vogel
- DOE Joint Genome Institute, Berkeley, California 94720, USA
- Department of Plant and Microbial Biology, University of California, Berkeley, California 94720, USA
| |
|
120
|
Isla A, Martinez-Hernandez JE, Levipan HA, Haussmann D, Figueroa J, Rauch MC, Maracaja-Coutinho V, Yañez A. Development of a Multiplex PCR Assay for Genotyping the Fish Pathogen Piscirickettsia salmonis Through Comparative Genomics. Front Microbiol 2021; 12:673216. [PMID: 34177855 PMCID: PMC8226252 DOI: 10.3389/fmicb.2021.673216] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 02/27/2021] [Accepted: 05/17/2021] [Indexed: 11/20/2022] Open
Abstract
Piscirickettsia salmonis is a bacterial pathogen that severely impacts aquaculture in several countries, such as Canada, Scotland, Ireland, Norway, and Chile. It provokes Piscirickettsiosis outbreaks in the marine phase of salmonid farming, resulting in economic losses. The monophyletic genogroup LF-89 and a divergent genogroup EM-90 are responsible for the most severe Piscirickettsiosis outbreaks in Chile. Therefore, the development of methods for quick genotyping of P. salmonis genogroups in field samples is vital for veterinary diagnoses and understanding the population structure of this pathogen. The present study reports the development of a multiplex PCR for genotyping LF-89 and EM-90 genogroups based on comparative genomics of 73 fully sequenced P. salmonis genomes. The results revealed 2,322 sequences shared between 35 LF-89 genomes, 2,280 sequences in the core-genome of 38 EM-90 genomes, and 331 and 534 accessory coding sequences in each genogroup, respectively. A total of 1,801 clusters of coding sequences were shared among all tested genomes of P. salmonis (LF-89 and EM-90), with 253 and 291 unique sequences for LF-89 and EM-90 genogroups, respectively. The Multiplex-1 prototype was chosen for reliable genotyping because of differences in annealing temperatures and respective reaction efficiencies. This method also identified the pathogen in field samples infected with LF-89 or EM-90 strains, which is not possible with other methods currently available. Finally, the genome-based multiplex PCR protocol presented in this study is a rapid and affordable alternative to classical sequencing of PCR products and analyzing the length of restriction fragment polymorphisms.
Affiliation(s)
- Adolfo Isla
- Instituto de Bioquímica y Microbiología, Universidad Austral de Chile, Valdivia, Chile; Interdisciplinary Center for Aquaculture Research (INCAR), University of Concepcion, Concepción, Chile; Departamento de Ciencias Básicas, Facultad de Ciencias, Universidad Santo Tomás, Santiago, Chile
| | - J Eduardo Martinez-Hernandez
- Centro de Modelamiento Molecular, Biofísica y Bioinformática - CM2B2, Facultad de Ciencias Químicas y Farmacéuticas, Universidad de Chile, Santiago, Chile; Programa de Doctorado en Genómica Integrativa, Vicerrectoría de Investigación, Universidad Mayor, Santiago, Chile; Laboratorio de Biología de Redes, Centro de Genómica y Bioinformática, Facultad de Ciencias, Universidad Mayor, Santiago, Chile
| | - Héctor A Levipan
- Laboratorio de Ecopatología y Nanobiomateriales, Departamento de Biología, Facultad de Ciencias Naturales y Exactas, Universidad de Playa Ancha, Valparaiso, Chile
| | - Denise Haussmann
- Departamento de Ciencias Básicas, Facultad de Ciencias, Universidad Santo Tomás, Santiago, Chile
| | - Jaime Figueroa
- Instituto de Bioquímica y Microbiología, Universidad Austral de Chile, Valdivia, Chile; Interdisciplinary Center for Aquaculture Research (INCAR), University of Concepcion, Concepción, Chile
| | - Maria Cecilia Rauch
- Instituto de Bioquímica y Microbiología, Universidad Austral de Chile, Valdivia, Chile
| | - Vinicius Maracaja-Coutinho
- Centro de Modelamiento Molecular, Biofísica y Bioinformática - CM2B2, Facultad de Ciencias Químicas y Farmacéuticas, Universidad de Chile, Santiago, Chile; Instituto Vandique, João Pessoa, Brazil; Beagle Bioinformatics, Santiago, Chile
| | - Alejandro Yañez
- Interdisciplinary Center for Aquaculture Research (INCAR), University of Concepcion, Concepción, Chile; Facultad de Ciencias, Universidad Austral de Chile, Valdivia, Chile
| |
|
121
|
He Z, Ji R, Havlickova L, Wang L, Li Y, Lee HT, Song J, Koh C, Yang J, Zhang M, Parkin IAP, Wang X, Edwards D, King GJ, Zou J, Liu K, Snowdon RJ, Banga SS, Machackova I, Bancroft I. Genome structural evolution in Brassica crops. NATURE PLANTS 2021; 7:757-765. [PMID: 34045706 DOI: 10.1038/s41477-021-00928-8] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 11/06/2020] [Accepted: 04/22/2021] [Indexed: 05/15/2023]
Abstract
The cultivated Brassica species include numerous vegetable and oil crops of global importance. Three genomes (designated A, B and C) share mesohexapolyploid ancestry and occur both singly and in each pairwise combination to define the Brassica species. With organizational errors (such as misplaced genome segments) corrected, we showed that the fundamental structure of each of the genomes is the same, irrespective of the species in which it occurs. This enabled us to clarify genome evolutionary pathways, including updating the Ancestral Crucifer Karyotype (ACK) block organization and providing support for the Brassica mesohexaploidy having occurred via a two-step process. We then constructed genus-wide pan-genomes, drawing from genes present in any species in which the respective genome occurs, which enabled us to provide a global gene nomenclature system for the cultivated Brassica species and develop a methodology to cost-effectively elucidate the genomic impacts of alien introgressions. Our advances not only underpin knowledge-based approaches to the more efficient breeding of Brassica crops but also provide an exemplar for the study of other polyploids.
Affiliation(s)
- Zhesi He
- Department of Biology, University of York, York, UK
| | - Ruiqin Ji
- Department of Biology, University of York, York, UK
- Department of Horticulture, Shenyang Agricultural University, Shenyang, China
| | | | - Lihong Wang
- Department of Biology, University of York, York, UK
| | - Yi Li
- Department of Biology, University of York, York, UK
| | - Huey Tyng Lee
- Department of Plant Breeding, Justus Liebig University of Giessen, Giessen, Germany
| | - Jiaming Song
- National Key Laboratory of Crop Genetic Improvement, College of Plant Science & Technology, Huazhong Agricultural University, Wuhan, China
| | - Chushin Koh
- Global Institute for Food Security (GIFS), University of Saskatchewan, Saskatoon, Saskatchewan, Canada
| | - Jinghua Yang
- Department of Horticulture, College of Agriculture & Biotechnology, Zhejiang University, Hangzhou, China
| | - Mingfang Zhang
- Department of Horticulture, College of Agriculture & Biotechnology, Zhejiang University, Hangzhou, China
| | | | - Xiaowu Wang
- Institute of Vegetables and Flowers, Chinese Academy of Agricultural Sciences (IVF, CAAS), Beijing, China
| | - David Edwards
- School of Biological Sciences and the Institute of Agriculture, Faculty of Science, The University of Western Australia, Crawley, Western Australia, Australia
| | - Graham J King
- Southern Cross Plant Science, Southern Cross University, Lismore, New South Wales, Australia
| | - Jun Zou
- National Key Laboratory of Crop Genetic Improvement, College of Plant Science & Technology, Huazhong Agricultural University, Wuhan, China
| | - Kede Liu
- National Key Laboratory of Crop Genetic Improvement, College of Plant Science & Technology, Huazhong Agricultural University, Wuhan, China
| | - Rod J Snowdon
- Department of Plant Breeding, Justus Liebig University of Giessen, Giessen, Germany
| | - Surinder S Banga
- Department of Plant Breeding and Genetics, Punjab Agricultural University, Ludhiana, India
| | - Ivana Machackova
- Selgen, a.s., Plant breeding station, Chlumec nad Cidlinou, Czech Republic
| | - Ian Bancroft
- Department of Biology, University of York, York, UK
| |
|
122
|
Lomsadze A, Bonny C, Strozzi F, Borodovsky M. GeneMark-HM: improving gene prediction in DNA sequences of human microbiome. NAR Genom Bioinform 2021; 3:lqab047. [PMID: 34056597 PMCID: PMC8153819 DOI: 10.1093/nargab/lqab047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2020] [Revised: 04/27/2021] [Accepted: 05/24/2021] [Indexed: 11/14/2022] Open
Abstract
Computational reconstruction of nearly complete genomes from metagenomic reads may identify thousands of new uncultured candidate bacterial species. We have shown that reconstructed prokaryotic genomes along with genomes of sequenced microbial isolates can be used to support more accurate gene prediction in novel metagenomic sequences. We have proposed an approach that used three types of gene prediction algorithms and found for all contigs in a metagenome nearly optimal models of protein-coding regions either in libraries of pre-computed models or constructed de novo. The model selection process and gene annotation were done by the new GeneMark-HM pipeline. We have created a database of the species level pan-genomes for the human microbiome. To create a library of models representing each pan-genome we used a self-training algorithm GeneMarkS-2. Genes initially predicted in each contig served as queries for a fast similarity search through the pan-genome database. The best matches led to selection of the model for gene prediction. Contigs not assigned to pan-genomes were analyzed by crude, but still accurate models designed for sequences with particular GC compositions. Tests of GeneMark-HM on simulated metagenomes demonstrated improvement in gene annotation of human metagenomic sequences in comparison with the current state-of-the-art gene prediction tools.
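The fallback described in the abstract (contigs not assigned to a pan-genome are analyzed with models designed for particular GC compositions) implies some mapping from a contig's GC content to a model. The sketch below shows one simple way such a lookup key could be computed. It is not taken from GeneMark-HM: the function name `gc_bin_label`, the 5% bin width, and the label format are all assumptions made for illustration.

```python
def gc_bin_label(seq, bin_width_percent=5):
    """Return a label such as 'gc_65_70' for the GC-content bin of a sequence.

    Only A, C, G and T are counted, so ambiguous bases (e.g. N) are ignored.
    bin_width_percent is a hypothetical setting and must be an integer
    that divides 100.
    """
    if bin_width_percent <= 0 or 100 % bin_width_percent != 0:
        raise ValueError("bin_width_percent must be a positive divisor of 100")
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")

    n_bins = 100 // bin_width_percent
    # Integer arithmetic avoids floating-point edge cases at bin boundaries;
    # a sequence with 100% GC is placed in the last bin.
    index = min((100 * gc) // (total * bin_width_percent), n_bins - 1)
    low = index * bin_width_percent
    return f"gc_{low}_{low + bin_width_percent}"


print(gc_bin_label("ATGCGC"))  # gc_65_70
print(gc_bin_label("ATNNGC"))  # gc_50_55
```

The returned label could then serve as a key into a table of precomputed models, one per GC bin; how GeneMark-HM actually organizes and selects its GC-specific models is not stated in the abstract.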
Affiliation(s)
| | | | | | - Mark Borodovsky
- Gene Probe, Inc., 1106 Wrights Mill Ct, Atlanta, GA 30324, USA
| |
|
123
|
Barragan AC, Weigel D. Plant NLR diversity: the known unknowns of pan-NLRomes. THE PLANT CELL 2021; 33:814-831. [PMID: 33793812 PMCID: PMC8226294 DOI: 10.1093/plcell/koaa002] [Citation(s) in RCA: 69] [Impact Index Per Article: 23.0] [Received: 07/13/2020] [Accepted: 10/23/2020] [Indexed: 05/20/2023]
Abstract
Plants and pathogens constantly adapt to each other. As a consequence, many members of the plant immune system, and especially the intracellular nucleotide-binding site leucine-rich repeat receptors, also known as NOD-like receptors (NLRs), are highly diversified, both among family members in the same genome, and between individuals in the same species. While this diversity has long been appreciated, its true extent has remained unknown. With pan-genome and pan-NLRome studies becoming more and more comprehensive, our knowledge of NLR sequence diversity is growing rapidly, and pan-NLRomes provide powerful platforms for assigning function to NLRs. These efforts are an important step toward the goal of comprehensively predicting from sequence alone whether an NLR provides disease resistance, and if so, to which pathogens.
Affiliation(s)
- A Cristina Barragan
- Department of Molecular Biology, Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany
| | | |
|
124
|
Calcino AD, Kenny NJ, Gerdol M. Single individual structural variant detection uncovers widespread hemizygosity in molluscs. Philos Trans R Soc Lond B Biol Sci 2021; 376:20200153. [PMID: 33813894 PMCID: PMC8059565 DOI: 10.1098/rstb.2020.0153] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Accepted: 01/07/2021] [Indexed: 11/12/2022] Open
Abstract
The advent of complete genomic sequencing has opened a window into genomic phenomena obscured by fragmented assemblies. A good example of these is the existence of hemizygous regions of autosomal chromosomes, which can result in marked differences in gene content between individuals within species. While these hemizygous regions, and presence/absence variation of genes that can result, are well known in plants, firm evidence has only recently emerged for their existence in metazoans. Here, we use recently published, complete genomes from wild-caught molluscs to investigate the prevalence of hemizygosity across a well-known and ecologically important clade. We show that hemizygous regions are widespread in mollusc genomes, not clustered in individual chromosomes, and often contain genes linked to transposition, DNA repair and stress response. With targeted investigations of HSP70-12 and C1qDC, we also show how individual gene families are distributed within pan-genomes. This work suggests that extensive pan-genomes are widespread across the conchiferan Mollusca, and represent useful tools for genomic evolution, allowing the maintenance of additional genetic diversity within the population. As genomic sequencing and re-sequencing becomes more routine, the prevalence of hemizygosity, and its impact on selection and adaptation, are key targets for research across the tree of life. This article is part of the Theo Murphy meeting issue 'Molluscan genomics: broad insights and future directions for a neglected phylum'.
Affiliation(s)
- Andrew D. Calcino
  - Department of Evolutionary Biology, Integrative Zoology, University of Vienna, Althanstrasse 14, Vienna 1090, Austria
- Nathan J. Kenny
  - Life Sciences, The Natural History Museum, Cromwell Road, London SW7 5BD, UK
- Marco Gerdol
  - Department of Life Sciences, University of Trieste, Via Licio Giorgieri 5, 34127 Trieste, Italy

125
Foerster H, Battey JND, Sierro N, Ivanov NV, Mueller LA. Metabolic networks of the Nicotiana genus in the spotlight: content, progress and outlook. Brief Bioinform 2021; 22:bbaa136. [PMID: 32662816 PMCID: PMC8138835 DOI: 10.1093/bib/bbaa136] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/25/2020] [Revised: 05/19/2020] [Accepted: 06/04/2020] [Indexed: 01/09/2023] Open
Abstract
Manually curated metabolic databases residing at the Sol Genomics Network comprise two taxon-specific databases for the Solanaceae family, i.e. SolanaCyc and the genus Nicotiana, i.e. NicotianaCyc as well as six species-specific databases for Nicotiana tabacum TN90, N. tabacum K326, Nicotiana benthamiana, N. sylvestris, N. tomentosiformis and N. attenuata. New pathways were created through the extraction, examination and verification of related data from the literature and the aid of external database guided by an expert-led curation process. Here we describe the curation progress that has been achieved in these databases since the first release version 1.0 in 2016, the curation flow and the curation process using the example metabolic pathway for cholesterol in plants. The current content of our databases comprises 266 pathways and 36 superpathways in SolanaCyc and 143 pathways plus 21 superpathways in NicotianaCyc, manually curated and validated specifically for the Solanaceae family and Nicotiana genus, respectively. The curated data have been propagated to the respective Nicotiana-specific databases, which resulted in the enrichment and more accurate presentation of their metabolic networks. The quality and coverage in those databases have been compared with related external databases and discussed in terms of literature support and metabolic content.

126
Reyes-Cortes JL, Azaola-Espinosa A, Lozano-Aguirre L, Ponce-Alquicira E. Physiological and Genomic Analysis of Bacillus pumilus UAMX Isolated from the Gastrointestinal Tract of Overweight Individuals. Microorganisms 2021; 9:microorganisms9051076. [PMID: 34067853 PMCID: PMC8156450 DOI: 10.3390/microorganisms9051076] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/16/2021] [Revised: 05/08/2021] [Accepted: 05/13/2021] [Indexed: 12/15/2022] Open
Abstract
The study aimed to evaluate the metabolism and resistance to gastrointestinal tract conditions of Bacillus pumilus UAMX (BP-UAMX), isolated from overweight individuals, using genomic tools. Specifically, we assessed its ability to metabolize various carbon sources, its resistance to low pH exposure, and its growth in the presence of bile salts. The genomic and bioinformatic analyses included the prediction of gene and protein metabolic functions, a pan-genome analysis, and a phylogenomic analysis. BP-UAMX survived at pH 3, while bile salts (0.2-0.3% w/v) increased its growth rate. Moreover, it showed the ability to metabolize simple and complex carbon sources (glucose, starch, carboxymethyl-cellulose, inulin, and tributyrin), showing a differentiated electrophoretic profile. The genome was assembled into a single contig, with a high percentage of genes and proteins associated with the metabolism of amino acids, carbohydrates, and lipids. Antibiotic resistance genes were detected, but only one beta-lactam resistance protein related to the inhibition of peptidoglycan biosynthesis was identified. The pan-genome of BP-UAMX is still open, with phylogenetic similarities to other Bacillus of human origin. Therefore, BP-UAMX seems to be adapted to the intestinal environment, with physiological and genomic analyses demonstrating the ability to metabolize complex carbon sources; the strain has an open pan-genome with continuous evolution and adaptation.
Affiliation(s)
- José Luis Reyes-Cortes
  - Departamento de Biotecnología, Universidad Autónoma Metropolitana Unidad Iztapalapa, Av. San Rafael Atlixco 186, Col. Vicentina, Ciudad de México 09340, Mexico
- Alejandro Azaola-Espinosa
  - Departamento de Sistemas Biológicos, Universidad Autónoma Metropolitana Unidad Xochimilco, Calzada del Hueso 1100, Coyoacán, Ciudad de México 04960, Mexico
- Luis Lozano-Aguirre
  - Unidad de Análisis Bioinformáticos del Centro de Ciencias Genómicas, UNAM, Cuernavaca, Morelos 62210, Mexico
- Edith Ponce-Alquicira
  - Departamento de Biotecnología, Universidad Autónoma Metropolitana Unidad Iztapalapa, Av. San Rafael Atlixco 186, Col. Vicentina, Ciudad de México 09340, Mexico
  - Correspondence: Tel.: +52-55-58044600 (ext. 2676)

127
Almeida OGGD, Furlan JPR, Stehling EG, De Martinis ECP. Comparative phylo-pangenomics reveals generalist lifestyles in representative Acinetobacter species and proposes candidate gene markers for species identification. Gene 2021; 791:145707. [PMID: 33979679 DOI: 10.1016/j.gene.2021.145707] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 02/10/2021] [Revised: 04/09/2021] [Accepted: 05/06/2021] [Indexed: 02/05/2023]
Abstract
Acinetobacter species have the potential to invade and colonize immunocompromised patients and are therefore well known as opportunistic pathogens. Among these bacteria, the species of the Acinetobacter calcoaceticus-Acinetobacter baumannii "complex" (Acb members) are those most often isolated from clinical specimens. An unequivocal taxonomy is crucial to correctly identify these species and, combined with comparative genomic analyses, helps to understand their lifestyles as well. In this study, all Acinetobacter species publicly available at the time of preparation were analyzed. The results revealed that the Acb members are in fact a complex when phenotypic methods are compared, while for comparative and phylogenomic analyses this term is misleading, since they form a monophyletic group instead. The nine best gene markers (response regulator, recJ, recG, phosphomannomutase, pepSY, monovalent cation/H+ antiporter subunit D, mnmE, glnE, and bamA) were selected for identification of Acinetobacter species. Moreover, representative strains of each species were split according to their isolation sources into the categories environmental, human, insect and non-human vertebrate. Neither a niche-specific genome signature nor niche-associated functional and pathogenic potential was associated with isolation source, meaning that it is not the main force acting on Acinetobacter adaptation to a given niche and corroborating that their ubiquitous distribution is a reflection of their generalist lifestyles.
Affiliation(s)
- Eliana Guedes Stehling
  - Faculdade de Ciências Farmacêuticas de Ribeirão Preto, Universidade de São Paulo, Brazil

128
Fu X, Gong L, Liu Y, Lai Q, Li G, Shao Z. Bacillus pumilus Group Comparative Genomics: Toward Pangenome Features, Diversity, and Marine Environmental Adaptation. Front Microbiol 2021; 12:571212. [PMID: 34025591 PMCID: PMC8139322 DOI: 10.3389/fmicb.2021.571212] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 06/10/2020] [Accepted: 04/12/2021] [Indexed: 11/13/2022] Open
Abstract
Background Members of the Bacillus pumilus group (abbreviated as the Bp group) are quite diverse and ubiquitous in marine environments, but little is known about correlation with their terrestrial counterparts. In this study, 16 marine strains that we had isolated before were sequenced and comparative genome analyses were performed with a total of 52 Bp group strains. The analyses included 20 marine isolates (which included the 16 new strains) and 32 terrestrial isolates, and their evolutionary relationships, differentiation, and environmental adaptation. Results Phylogenomic analysis revealed that the marine Bp group strains were grouped into three species: B. pumilus, B. altitudinis and B. safensis. All the three share a common ancestor. However, members of B. altitudinis were observed to cluster independently, separating from the other two, thus diverging from the others. Consistent with the universal nature of genes involved in the functioning of the translational machinery, the genes related to translation were enriched in the core genome. Functional genomic analyses revealed that the marine-derived and the terrestrial strains showed differences in certain hypothetical proteins, transcriptional regulators, K+ transporter (TrK) and ABC transporters. However, species differences showed the precedence of environmental adaptation discrepancies. In each species, land specific genes were found with possible functions that likely facilitate survival in diverse terrestrial niches, while marine bacteria were enriched with genes of unknown functions and those related to transcription, phage defense, DNA recombination and repair. Conclusion Our results indicated that the Bp isolates show distinct genomic features even as they share a common core. The marine and land isolates did not evolve independently; the transition between marine and non-marine habitats might have occurred multiple times. 
The lineage exhibited a priority effect over the niche in driving their dispersal. Certain intra-species niche-specific genes could be related to a strain's adaptation to its respective marine or terrestrial environment(s). In summary, this report describes the systematic evolution of 52 Bp group strains and will facilitate future studies toward understanding their ecological role and adaptation to marine and/or terrestrial environments.
Affiliation(s)
- Xiaoteng Fu
  - Key Laboratory of Marine Genetic Resources, Third Institute of Oceanography, Ministry of Natural Resources, Xiamen, China
  - State Key Laboratory Breeding Base of Marine Genetic Resources, Xiamen, China
  - Key Laboratory of Marine Genetic Resources of Fujian Province, Xiamen, China
- Linfeng Gong
  - Key Laboratory of Marine Genetic Resources, Third Institute of Oceanography, Ministry of Natural Resources, Xiamen, China
  - State Key Laboratory Breeding Base of Marine Genetic Resources, Xiamen, China
  - Key Laboratory of Marine Genetic Resources of Fujian Province, Xiamen, China
- Yang Liu
  - State Key Laboratory of Applied Microbiology Southern China, Guangdong Provincial Key Laboratory of Microbial Culture Collection and Application, Guangdong Open Laboratory of Applied Microbiology, Guangdong Microbial Culture Collection Center (GDMCC), Guangdong Institute of Microbiology, Guangdong Academy of Sciences, Guangzhou, China
- Qiliang Lai
  - Key Laboratory of Marine Genetic Resources, Third Institute of Oceanography, Ministry of Natural Resources, Xiamen, China
  - State Key Laboratory Breeding Base of Marine Genetic Resources, Xiamen, China
  - Key Laboratory of Marine Genetic Resources of Fujian Province, Xiamen, China
- Guangyu Li
  - Key Laboratory of Marine Genetic Resources, Third Institute of Oceanography, Ministry of Natural Resources, Xiamen, China
  - State Key Laboratory Breeding Base of Marine Genetic Resources, Xiamen, China
  - Key Laboratory of Marine Genetic Resources of Fujian Province, Xiamen, China
- Zongze Shao
  - Key Laboratory of Marine Genetic Resources, Third Institute of Oceanography, Ministry of Natural Resources, Xiamen, China
  - State Key Laboratory Breeding Base of Marine Genetic Resources, Xiamen, China
  - Key Laboratory of Marine Genetic Resources of Fujian Province, Xiamen, China
  - Southern Marine Science and Engineering Guangdong Laboratory, Zhuhai, China

129
Nikulin NA, Zimin AA. Influence of Non-canonical DNA Bases on the Genomic Diversity of Tevenvirinae. Front Microbiol 2021; 12:632686. [PMID: 33889139 PMCID: PMC8056088 DOI: 10.3389/fmicb.2021.632686] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/23/2020] [Accepted: 03/08/2021] [Indexed: 12/03/2022] Open
Abstract
The Tevenvirinae viruses are some of the most common viruses on Earth. Representatives of this subfamily have long been used in the molecular biology studies as model organisms – since the emergence of the discipline. Tevenvirinae are promising agents for phage therapy in animals and humans, since their representatives have only lytic life cycle and many of their host bacteria are pathogens. As confirmed experimentally, some Tevenvirinae have non-canonical DNA bases. Non-canonical bases can play an essential role in the diversification of closely related viruses. The article performs a comparative and evolutionary analysis of Tevenvirinae genomes and components of Tevenvirinae genomes. A comparative analysis of these genomes and the genes associated with the synthesis of non-canonical bases allows us to conclude that non-canonical bases have a major influence on the divergence of Tevenvirinae viruses within the same habitats. Supposedly, Tevenvirinae developed a strategy for changing HGT frequency in individual populations, which was based on the accumulation of proteins for the synthesis of non-canonical bases and proteins that used those bases as substrates. Owing to this strategy, ancestors of Tevenvirinae with the highest frequency of HGT acquired genes that allowed them to exist in a certain niche, and ancestors with the lowest HGT frequency preserved the most adaptive of those genes. Given the origin and characteristics of genes associated with the synthesis of non-canonical bases in Tevenvirinae, one can assume that other phages may have similar strategies. The article demonstrates the dependence of genomic diversity of closely related Tevenvirinae on non-canonical bases.
Affiliation(s)
- Nikita A Nikulin
  - Laboratory of Bacteriophage Biology, G.K. Skryabin Institute of Biochemistry and Physiology of Microorganisms, Pushchino Scientific Center for Biological Research of the Russian Academy of Sciences, Pushchino, Russia
- Andrei A Zimin
  - Laboratory of Molecular Microbiology, G.K. Skryabin Institute of Biochemistry and Physiology of Microorganisms, Pushchino Scientific Center for Biological Research of the Russian Academy of Sciences, Pushchino, Russia

130
Sharma A, Sanduja P, Anand A, Mahajan P, Guzman CA, Yadav P, Awasthi A, Hanski E, Dua M, Johri AK. Advanced strategies for development of vaccines against human bacterial pathogens. World J Microbiol Biotechnol 2021; 37:67. [PMID: 33748926 PMCID: PMC7982316 DOI: 10.1007/s11274-021-03021-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/19/2020] [Accepted: 02/17/2021] [Indexed: 12/18/2022]
Abstract
Infectious diseases are one of the leading causes of death and disability in humans globally. The lack of effective treatment and immunization for many deadly infectious diseases, together with emerging drug resistance in pathogens, underlines the need to either develop new vaccines or sufficiently improve the effectiveness of currently available drugs and vaccines. In this review, we discuss the application of advanced tools like bioinformatics, genomics, proteomics and associated techniques for rational vaccine design.
Affiliation(s)
- Abhinay Sharma
  - School of Life Sciences, Jawaharlal Nehru University, Aruna Asaf Ali Marg, New Delhi, 110067, India
  - Department of Vaccinology, Helmholtz Centre for Infection Research, Inhoffenstraße 7, 38124, Braunschweig, Germany
  - Department of Microbiology and Molecular Genetics, The Institute for Medical Research, Israel-Canada (IMRIC), Faculty of Medicine, The Hebrew University of Jerusalem, 9112102, Jerusalem, Israel
- Pooja Sanduja
  - School of Life Sciences, Jawaharlal Nehru University, Aruna Asaf Ali Marg, New Delhi, 110067, India
- Aparna Anand
  - Department of Microbiology and Molecular Genetics, The Institute for Medical Research, Israel-Canada (IMRIC), Faculty of Medicine, The Hebrew University of Jerusalem, 9112102, Jerusalem, Israel
- Pooja Mahajan
  - School of Life Sciences, Jawaharlal Nehru University, Aruna Asaf Ali Marg, New Delhi, 110067, India
- Carlos A Guzman
  - Department of Vaccinology, Helmholtz Centre for Infection Research, Inhoffenstraße 7, 38124, Braunschweig, Germany
- Puja Yadav
  - Department of Microbiology, Central University of Haryana, Mahendragarh, Haryana, India
- Amit Awasthi
  - Translational Health Science and Technology Institute, Faridabad-Gurgaon Expressway, PO box #04, NCR Biotech Science Cluster, 3rd Milestone, Faridabad, Haryana, 121001, India
- Emanuel Hanski
  - Department of Microbiology and Molecular Genetics, The Institute for Medical Research, Israel-Canada (IMRIC), Faculty of Medicine, The Hebrew University of Jerusalem, 9112102, Jerusalem, Israel
- Meenakshi Dua
  - School of Environmental Sciences, Jawaharlal Nehru University, Aruna Asaf Ali Marg, New Delhi, 110067, India
- Atul Kumar Johri
  - School of Life Sciences, Jawaharlal Nehru University, Aruna Asaf Ali Marg, New Delhi, 110067, India

131
Ramírez-Durán N, de la Haba RR, Vera-Gargallo B, Sánchez-Porro C, Alonso-Carmona S, Sandoval-Trujillo H, Ventosa A. Taxogenomic and Comparative Genomic Analysis of the Genus Saccharomonospora Focused on the Identification of Biosynthetic Clusters PKS and NRPS. Front Microbiol 2021; 12:603791. [PMID: 33776952 PMCID: PMC7990883 DOI: 10.3389/fmicb.2021.603791] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 09/07/2020] [Accepted: 02/17/2021] [Indexed: 11/13/2022] Open
Abstract
Actinobacteria are prokaryotes of great biotechnological interest due to their ability to produce secondary metabolites, which are synthesized by two main types of biosynthetic gene clusters (BGCs): polyketide synthase (PKS) and non-ribosomal peptide synthetase (NRPS). Most studies on bioactive products have been carried out on actinobacteria isolated from soil, freshwater or marine habitats, while very few have focused on halophilic actinobacteria isolated from extreme environments. In this study we have carried out a comparative genomic analysis of the actinobacterial genus Saccharomonospora, which includes species isolated from soils, lake sediments, marine or hypersaline habitats. A total of 19 genome sequences of members of Saccharomonospora were retrieved and analyzed. We compared the 16S rRNA gene-based phylogeny of this genus with evolutionary relationships inferred using a phylogenomic approach and obtained almost identical topologies with both strategies. This method allowed us to unequivocally assign strains to species and to identify some taxonomic relationships that need to be revised. Our study supports a recent speciation event occurring between Saccharomonospora halophila and Saccharomonospora iraqiensis. Concerning the identification of BGCs, a total of 18 different types of BGCs were detected in the analyzed genomes of Saccharomonospora, including PKS, NRPS and hybrid clusters, which might be able to synthesize 40 different putative products. In comparison to other genera of the Actinobacteria, members of the genus Saccharomonospora showed a high degree of novelty and diversity of BGCs.
Affiliation(s)
- Ninfa Ramírez-Durán
  - Faculty of Medicine, Autonomous University of the State of Mexico, Toluca, Mexico
  - Department of Microbiology and Parasitology, Faculty of Pharmacy, University of Sevilla, Seville, Spain
- Rafael R de la Haba
  - Department of Microbiology and Parasitology, Faculty of Pharmacy, University of Sevilla, Seville, Spain
- Blanca Vera-Gargallo
  - Department of Microbiology and Parasitology, Faculty of Pharmacy, University of Sevilla, Seville, Spain
- Cristina Sánchez-Porro
  - Department of Microbiology and Parasitology, Faculty of Pharmacy, University of Sevilla, Seville, Spain
- Horacio Sandoval-Trujillo
  - Department of Biological Systems, Metropolitan Autonomous University-Xochimilco, Mexico City, Mexico
- Antonio Ventosa
  - Department of Microbiology and Parasitology, Faculty of Pharmacy, University of Sevilla, Seville, Spain

132
Alessandri G, van Sinderen D, Ventura M. The genus bifidobacterium: From genomics to functionality of an important component of the mammalian gut microbiota running title: Bifidobacterial adaptation to and interaction with the host. Comput Struct Biotechnol J 2021; 19:1472-1487. [PMID: 33777340 PMCID: PMC7979991 DOI: 10.1016/j.csbj.2021.03.006] [Citation(s) in RCA: 51] [Impact Index Per Article: 17.0] [Received: 12/22/2020] [Revised: 03/03/2021] [Accepted: 03/03/2021] [Indexed: 02/07/2023] Open
Abstract
Members of the genus Bifidobacterium are dominant and symbiotic inhabitants of the mammalian gastrointestinal tract. Being vertically transmitted, bifidobacterial host colonization commences immediately after birth and leads to a phase of host infancy during which bifidobacteria are highly prevalent and abundant, before transitioning to a reduced, yet stable abundance phase during host adulthood. However, in order to reach and stably colonize their elective niche, i.e. the large intestine, bifidobacteria have to cope with a multitude of oxidative, osmotic and bile salt/acid stress challenges that occur along the gastrointestinal tract (GIT). Concurrently, bifidobacteria not only have to compete with the myriad of other gut commensals for nutrient acquisition, but they also require protection against bacterial viruses. In this context, Next-Generation Sequencing (NGS) techniques, allowing large-scale comparative and functional genome analyses, have helped to identify the genetic strategies that bifidobacteria have developed in order to colonize, survive in and adapt to the highly competitive mammalian gastrointestinal environment. The current review is aimed at providing a comprehensive overview of the molecular strategies on which bifidobacteria rely to stably and successfully colonize the mammalian gut.
Affiliation(s)
- Giulia Alessandri
  - Laboratory of Probiogenomics, Department of Chemistry, Life Sciences, and Environmental Sustainability, University of Parma, Parma, Italy
- Douwe van Sinderen
  - APC Microbiome Ireland and School of Microbiology, University College Cork, Western Road, Cork, Ireland
- Marco Ventura
  - Laboratory of Probiogenomics, Department of Chemistry, Life Sciences, and Environmental Sustainability, University of Parma, Parma, Italy
  - Microbiome Research Hub, University of Parma, Parma, Italy

133
Zhong C, Chen C, Wang L, Ning K. Integrating pan-genome with metagenome for microbial community profiling. Comput Struct Biotechnol J 2021; 19:1458-1466. [PMID: 33841754 PMCID: PMC8010324 DOI: 10.1016/j.csbj.2021.02.021] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 09/02/2020] [Revised: 02/24/2021] [Accepted: 02/27/2021] [Indexed: 02/07/2023] Open
Abstract
Advances in sequencing technology have led to the increased availability of genomes and metagenomes, which has greatly facilitated microbial pan-genome and metagenome analysis in the community. In line with this trend, studies on microbial genomes and phenotypes have gradually shifted from individuals to environmental communities. Pan-genomics and metagenomics are powerful strategies for in-depth profiling study of microbial communities. Pan-genomics focuses on genetic diversity, dynamics, and phylogeny at the multi-genome level, while metagenomics profiles the distribution and function of culture-free microbial communities in special environments. Combining pan-genome and metagenome analysis can reveal the microbial complicated connections from an individual complete genome to a mixture of genomes, thereby extending the catalog of traditional individual genomic profile to community microbial profile. Therefore, the combination of pan-genome and metagenome approaches has become a promising method to track the sources of various microbes and decipher the population-level evolution and ecosystem functions. This review summarized the pan-genome and metagenome approaches, the combined strategies of pan-genome and metagenome, and applications of these combined strategies in studies of microbial dynamics, evolution, and function in communities. We discussed emerging strategies for the study of microbial communities that integrate information in both pan-genome and metagenome. We emphasized studies in which the integrating pan-genome with metagenome approach improved the understanding of models of microbial community profiles, both structural and functional. Finally, we illustrated future perspectives of microbial community profile: more advanced analytical techniques, including big-data based artificial intelligence, will lead to an even better understanding of the patterns of microbial communities.
Affiliation(s)
- Chaofang Zhong
  - Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular-imaging, Center of AI Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, Hubei, China
  - Department of Computer Science, City University of Hong Kong, 83 Tat Chee Avenue, Kowloon, Hong Kong, China
- Chaoyun Chen
  - Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular-imaging, Center of AI Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, Hubei, China
- Lusheng Wang
  - Department of Computer Science, City University of Hong Kong, 83 Tat Chee Avenue, Kowloon, Hong Kong, China
  - City University of Hong Kong Shenzhen Research Institute, Shenzhen, China
- Kang Ning
  - Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular-imaging, Center of AI Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, Hubei, China

134
Wang Q, Kille B, Liu TR, Elworth RAL, Treangen TJ. PlasmidHawk improves lab of origin prediction of engineered plasmids using sequence alignment. Nat Commun 2021; 12:1167. [PMID: 33637701 PMCID: PMC7910462 DOI: 10.1038/s41467-021-21180-w] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 05/22/2020] [Accepted: 01/12/2021] [Indexed: 12/26/2022] Open
Abstract
With advances in synthetic biology and genome engineering comes a heightened awareness of potential misuse related to biosafety concerns. A recent study employed machine learning to identify the lab-of-origin of DNA sequences to help mitigate some of these concerns. Despite their promising results, this deep learning based approach had limited accuracy, was computationally expensive to train, and wasn't able to provide the precise features that were used in its predictions. To address these shortcomings, we developed PlasmidHawk for lab-of-origin prediction. Compared to a machine learning approach, PlasmidHawk has higher prediction accuracy; PlasmidHawk can successfully predict unknown sequences' depositing labs 76% of the time and 85% of the time the correct lab is in the top 10 candidates. In addition, PlasmidHawk can precisely single out the signature sub-sequences that are responsible for the lab-of-origin detection. In summary, PlasmidHawk represents an explainable and accurate tool for lab-of-origin prediction of synthetic plasmid sequences. PlasmidHawk is available at https://gitlab.com/treangenlab/plasmidhawk.git .
Affiliation(s)
- Qi Wang
  - Systems, Synthetic, and Physical Biology (SSPB) Graduate Program, Rice University, Houston, Texas, 77005, USA
- Bryce Kille
  - Department of Computer Science, Rice University, Houston, Texas, 77005, United States
- Tian Rui Liu
  - Department of Computer Science, Rice University, Houston, Texas, 77005, United States
- R A Leo Elworth
  - Department of Computer Science, Rice University, Houston, Texas, 77005, United States
- Todd J Treangen
  - Department of Computer Science, Rice University, Houston, Texas, 77005, United States

135
Mizzi R, Timms VJ, Price-Carter ML, Gautam M, Whittington R, Heuer C, Biggs PJ, Plain KM. Comparative Genomics of Mycobacterium avium Subspecies Paratuberculosis Sheep Strains. Front Vet Sci 2021; 8:637637. [PMID: 33659287 PMCID: PMC7917049 DOI: 10.3389/fvets.2021.637637] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 12/04/2020] [Accepted: 01/25/2021] [Indexed: 12/15/2022] Open
Abstract
Mycobacterium avium subspecies paratuberculosis (MAP) is the aetiological agent of Johne's disease (JD), a chronic enteritis that causes major losses to the global livestock industry. Further, it has been associated with human Crohn's disease. Several strains of MAP have been identified, the two major groups being sheep strain MAP, which includes the Type I and Type III sub-lineages, and the cattle strain or Type II MAP lineage, of which bison strains are a sub-grouping. Major genotypic, phenotypic and pathogenic variations have been identified in prior comparisons, but the research has predominately focused on cattle strains of MAP. In countries where the sheep industries are more prevalent, however, such as Australia and New Zealand, ovine JD is a substantial burden. An information gap exists regarding the genomic differences between sheep strain sub-lineages and the relevance of Type I and Type III MAP in terms of epidemiology and/or pathogenicity. We therefore investigated sheep MAP isolates from Australia and New Zealand using whole genome sequencing. For additional context, sheep MAP genome datasets were downloaded from the Sequence Read Archive and GenBank. The final dataset contained 18 Type III and 16 Type I isolates and the K10 cattle strain MAP reference genome. Using a pan-genome approach, an updated global phylogeny for sheep MAP from de novo assemblies was produced. When rooted with the K10 cattle reference strain, two distinct clades representing the lineages were apparent. The Australian and New Zealand isolates formed a distinct sub-clade within the type I lineage, while the European type I isolates formed another less closely related group. Within the type III lineage, isolates appeared more genetically diverse and were from a greater number of continents. Querying of the pan-genome and verification using BLAST analysis revealed lineage-specific variations (n = 13) including genes responsible for metabolism and stress responses. 
The genetic differences identified may represent important epidemiological and virulence traits specific to sheep MAP. This knowledge will potentially contribute to improved vaccine development and control measures for these strains.
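The pan-genome query described in this abstract (finding genes present in one lineage and absent from the other) can be illustrated with a few set operations. This is a minimal sketch under stated assumptions, not the authors' pipeline: the function name `lineage_specific_genes`, the isolate labels, and the gene identifiers are all hypothetical, and real analyses would start from a presence/absence table produced by a pan-genome tool and then verify candidates, as the authors did with BLAST.

```python
from typing import Dict, Set


def lineage_specific_genes(
    presence: Dict[str, Set[str]],
    lineage_of: Dict[str, str],
    lineage: str,
) -> Set[str]:
    """Return genes found in every isolate of `lineage` and in no other isolate."""
    inside = [genes for iso, genes in presence.items() if lineage_of[iso] == lineage]
    outside = [genes for iso, genes in presence.items() if lineage_of[iso] != lineage]
    if not inside:
        return set()
    shared = set.intersection(*inside)                      # present in all members
    seen_elsewhere = set().union(*outside) if outside else set()
    return shared - seen_elsewhere                          # and absent everywhere else


# Toy presence/absence data (hypothetical isolates and gene IDs).
presence = {
    "iso_A": {"g1", "g2", "g3"},
    "iso_B": {"g1", "g2", "g4"},
    "iso_C": {"g1", "g5"},
    "iso_D": {"g1", "g5", "g4"},
}
lineage_of = {
    "iso_A": "Type I",
    "iso_B": "Type I",
    "iso_C": "Type III",
    "iso_D": "Type III",
}

type_i_specific = lineage_specific_genes(presence, lineage_of, "Type I")
type_iii_specific = lineage_specific_genes(presence, lineage_of, "Type III")
```

The strict "all inside, none outside" rule is a deliberate simplification; with draft assemblies a tolerance for missing genes is usually needed.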
Affiliation(s)
- Rachel Mizzi
  - Farm Animal Health Group, Sydney School of Veterinary Science, Faculty of Science, The University of Sydney, Camden, NSW, Australia
- Verlaine J Timms
  - Centre for Infectious Diseases and Microbiology, Public Health, Westmead Hospital, Westmead, NSW, Australia
- Milan Gautam
  - School of Veterinary Science, Massey University, Palmerston North, New Zealand
- Richard Whittington
  - Farm Animal Health Group, Sydney School of Veterinary Science, Faculty of Science, The University of Sydney, Camden, NSW, Australia
- Cord Heuer
  - School of Veterinary Science, Massey University, Palmerston North, New Zealand
- Patrick J Biggs
  - School of Veterinary Science, Massey University, Palmerston North, New Zealand
  - School of Fundamental Sciences, Massey University, Palmerston North, New Zealand
- Karren M Plain
  - Farm Animal Health Group, Sydney School of Veterinary Science, Faculty of Science, The University of Sydney, Camden, NSW, Australia
|
136
|
Copy number variation: Characteristics, evolutionary and pathological aspects. Biomed J 2021; 44:548-559. [PMID: 34649833 PMCID: PMC8640565 DOI: 10.1016/j.bj.2021.02.003] [Citation(s) in RCA: 73] [Impact Index Per Article: 24.3] [Received: 11/27/2020] [Revised: 02/01/2021] [Accepted: 02/05/2021] [Indexed: 12/12/2022]
Abstract
Copy number variants (CNVs) have been the subject of extensive research in recent years. They are common features of the human genome that play an important role in evolution, contribute to population diversity and to the development of certain diseases, and influence host–microbiome interactions. CNVs have found application in the molecular diagnosis of many diseases and in non-invasive prenatal care, but their full potential is only now emerging. CNVs are expected to have a tremendous impact on screening, diagnosis, prognosis, and monitoring of several disorders, including cancer and cardiovascular disease. Here, we comprehensively review basic definitions of the term CNV, outline mechanisms and factors involved in CNV formation, and discuss their evolutionary and pathological aspects. We suggest a need for better-defined distinguishing criteria and boundaries between known types of CNVs.
|
137
|
Sela I, Wolf YI, Koonin EV. Assessment of assumptions underlying models of prokaryotic pangenome evolution. BMC Biol 2021; 19:27. [PMID: 33563283 PMCID: PMC7874442 DOI: 10.1186/s12915-021-00960-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/27/2020] [Accepted: 01/15/2021] [Indexed: 11/10/2022]
Abstract
BACKGROUND The genomes of bacteria and archaea evolve by extensive loss and gain of genes which, for any group of related prokaryotic genomes, result in the formation of a pangenome with the universal, asymmetrical U-shaped distribution of gene commonality. However, the evolutionary factors that define the specific shape of this distribution are not thoroughly understood. RESULTS We investigate the fit of simple models of genome evolution to the empirically observed gene commonality distributions and genome intersections for 33 groups of closely related bacterial genomes. A model with an infinite external gene pool available for gene acquisition and constant genome size (IGP-CGS model), and two gene turnover rates, one for slow- and the other one for fast-evolving genes, allows two approaches to estimate the parameters for gene content dynamics. One is by fitting the model prediction to the distribution of the number of genes shared by precisely k genomes (gene commonality distribution) and another by analyzing the distribution of the number of genes common for k genome sets (k-cores). Both approaches produce a comparable overall quality of fit, although the former significantly overestimates the number of the universally conserved genes, while the latter overestimates the number of singletons. We further explore the effect of dropping each of the assumptions of the IGP-CGS model on the fit to the gene commonality distributions and show that models with either a finite gene pool or unequal rates of gene loss and gain (greater gene loss rate) eliminate the overestimate of the number of singletons or the core genome size. CONCLUSIONS We examine the assumptions that are usually adopted for modeling the evolution of the U-shaped gene commonality distributions in prokaryote genomes, namely, those of infinitely many genes and constant genome size. 
The combined analysis of genome intersections and gene commonality suggests that at least one of these assumptions is invalid. The violation of both these assumptions reflects the limited ability of prokaryotes to gain new genes. This limitation seems to stem, at least partly, from the horizontal gene transfer barrier, i.e., the cost of accommodation of foreign genes by prokaryotes. Further development of models taking into account the complexity of microbial evolution is necessary for an improved understanding of the evolution of prokaryotes.
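The two empirical statistics this abstract fits models to can be made concrete with a small sketch. It computes only the observed quantities (the gene commonality distribution and the mean k-core size), not the IGP-CGS model or its fitting, and the function names and toy genomes are hypothetical. The k-core calculation enumerates every subset of k genomes, so it is suitable for toy inputs only.

```python
from collections import Counter
from itertools import combinations
from statistics import mean
from typing import Dict, List, Set


def gene_commonality(genomes: List[Set[str]]) -> Dict[int, int]:
    """Number of genes present in exactly k genomes, for k = 1..N."""
    per_gene = Counter(gene for genes in genomes for gene in genes)
    dist = Counter(per_gene.values())
    return {k: dist.get(k, 0) for k in range(1, len(genomes) + 1)}


def mean_k_core_sizes(genomes: List[Set[str]]) -> Dict[int, float]:
    """Mean number of genes shared by all members of a k-genome subset."""
    n = len(genomes)
    return {
        k: mean(len(set.intersection(*combo)) for combo in combinations(genomes, k))
        for k in range(1, n + 1)
    }


# Toy data: three genomes represented as sets of gene-family IDs.
toy = [{"a", "b", "c"}, {"a", "b", "d"}, {"a", "e"}]
commonality = gene_commonality(toy)
k_cores = mean_k_core_sizes(toy)
```

In the toy data the commonality distribution has three singletons, one gene shared by two genomes, and one universal gene, which is a miniature of the U-shape the abstract describes.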
Affiliation(s)
- Itamar Sela
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
- Yuri I Wolf
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
- Eugene V Koonin
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
|
138
|
Sielemann K, Weisshaar B, Pucker B. Reference-based QUantification Of gene Dispensability (QUOD). PLANT METHODS 2021; 17:18. [PMID: 33563309 PMCID: PMC7871624 DOI: 10.1186/s13007-021-00718-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2020] [Accepted: 02/03/2021] [Indexed: 05/03/2023]
Abstract
BACKGROUND Dispensability of genes in a phylogenetic lineage, e.g. a species, genus, or higher-level clade, is gaining relevance as most genome sequencing projects move to a pangenome level. Most analyses classify genes as core genes, which are present in all investigated individual genomes, and dispensable genes, which only occur in a single or a few investigated genomes. The binary classification as 'core' or 'dispensable' is often based on arbitrary cutoffs of presence/absence in the analysed genomes. Even when extended to 'conditionally dispensable', this concept still requires the assignment of genes to distinct groups. RESULTS Here, we present a new method which overcomes this distinct classification by quantifying gene dispensability and present a dedicated tool for reference-based QUantification Of gene Dispensability (QUOD). As a proof of concept, sequence data of 966 Arabidopsis thaliana accessions (Ath-966) were processed to calculate a gene-specific dispensability score for each gene based on normalised coverage in read mappings. We validated this score by comparison of highly conserved Benchmarking Universal Single Copy Orthologs (BUSCOs) to all other genes. The average scores of BUSCOs were significantly lower than the scores of non-BUSCOs. Analysis of variation demonstrated lower variation values between replicates of a single accession than between iteratively, randomly selected accessions from the whole dataset Ath-966. Functional investigations revealed defense and antimicrobial response genes among the genes with high-dispensability scores. CONCLUSIONS Instead of classifying a gene as core or dispensable, QUOD assigns a dispensability score to each gene. Hence, QUOD facilitates the identification of candidate dispensable genes, associated with high dispensability scores, which often underlie lineage-specific adaptation to varying environmental conditions.
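The idea of replacing a core/dispensable label with a continuous score derived from normalised read coverage can be sketched as follows. The formula here is an illustrative stand-in and is explicitly not QUOD's published score, which the abstract does not specify; the function name, the capping at 1.0, and the toy depths are all assumptions made for the example.

```python
from statistics import mean
from typing import Dict


def illustrative_dispensability(
    gene_depth: Dict[str, Dict[str, float]],
    genome_depth: Dict[str, float],
) -> Dict[str, float]:
    """Toy score in [0, 1]: 0 = covered like the genome average in every
    accession, 1 = no coverage in any accession. NOT the QUOD formula."""
    scores = {}
    for gene, per_accession in gene_depth.items():
        # Normalise each gene's depth by that accession's genome-wide mean depth,
        # capping at 1.0 so duplicated genes do not push the score below zero.
        normalised = [
            min(1.0, depth / genome_depth[acc]) for acc, depth in per_accession.items()
        ]
        scores[gene] = 1.0 - mean(normalised)
    return scores


# Hypothetical mean read depths for two accessions and two genes.
genome_depth = {"acc1": 20.0, "acc2": 40.0}
gene_depth = {
    "geneA": {"acc1": 20.0, "acc2": 40.0},  # covered everywhere
    "geneB": {"acc1": 0.0, "acc2": 10.0},   # absent or weakly covered
}
scores = illustrative_dispensability(gene_depth, genome_depth)
```

Ranking genes by such a score avoids choosing an arbitrary presence/absence cutoff, which is the design point the abstract emphasises.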
Affiliation(s)
- Katharina Sielemann
  - Genetics and Genomics of Plants, Center for Biotechnology (CeBiTec) & Faculty of Biology, Bielefeld University, 33615 Bielefeld, Germany
  - Graduate School DILS, Bielefeld Institute for Bioinformatics Infrastructure (BIBI), Bielefeld University, 33615 Bielefeld, Germany
- Bernd Weisshaar
  - Genetics and Genomics of Plants, Center for Biotechnology (CeBiTec) & Faculty of Biology, Bielefeld University, 33615 Bielefeld, Germany
- Boas Pucker
  - Genetics and Genomics of Plants, Center for Biotechnology (CeBiTec) & Faculty of Biology, Bielefeld University, 33615 Bielefeld, Germany
  - Evolution and Diversity, Department of Plant Sciences, University of Cambridge, Cambridge, UK
|
139
|
Schulz T, Wittler R, Rahmann S, Hach F, Stoye J. Detecting High Scoring Local Alignments in Pangenome Graphs. Bioinformatics 2021; 37:2266-2274. [PMID: 33532821 PMCID: PMC8388040 DOI: 10.1093/bioinformatics/btab077] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/24/2020] [Revised: 12/02/2020] [Accepted: 01/29/2021] [Indexed: 11/23/2022]
Abstract
Motivation Increasing amounts of individual genomes sequenced per species motivate the usage of pangenomic approaches. Pangenomes may be represented as graphical structures, e.g. compacted colored de Bruijn graphs, which offer a low memory usage and facilitate reference-free sequence comparisons. While sequence-to-graph mapping to graphical pangenomes has been studied for some time, no local alignment search tool in the vein of BLAST has been proposed yet. Results We present a new heuristic method to find maximum scoring local alignments of a DNA query sequence to a pangenome represented as a compacted colored de Bruijn graph. Our approach additionally allows a comparison of similarity among sequences within the pangenome. We show that local alignment scores follow an exponential-tail distribution similar to BLAST scores, and we discuss how to estimate its parameters to separate local alignments representing sequence homology from spurious findings. An implementation of our method is presented, and its performance and usability are shown. Our approach scales sublinearly in running time and memory usage with respect to the number of genomes under consideration. This is an advantage over classical methods that do not make use of sequence similarity within the pangenome. Availability and implementation Source code and test data are available from https://gitlab.ub.uni-bielefeld.de/gi/plast. Supplementary information Supplementary data are available at Bioinformatics online.
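The colored de Bruijn graph mentioned in this abstract can be pictured as a map from each k-mer to the set of genomes ("colors") that contain it. The sketch below builds that map and counts, per genome, how many k-mers of a query it shares. It is an uncompacted toy with implicit edges and is not the authors' alignment heuristic; the function names and sequences are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Set


def build_colored_kmer_index(genomes: Dict[str, str], k: int) -> Dict[str, Set[str]]:
    """Map every k-mer to the set of genome names (colors) in which it occurs."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for name, seq in genomes.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(name)
    return index


def kmer_hits(query: str, index: Dict[str, Set[str]], k: int) -> Dict[str, int]:
    """Count, per genome, how many k-mers of the query are present in it."""
    hits: Dict[str, int] = defaultdict(int)
    for i in range(len(query) - k + 1):
        # .get avoids creating empty entries for k-mers that are not indexed.
        for name in index.get(query[i:i + k], ()):
            hits[name] += 1
    return dict(hits)


# Two toy genomes that differ by a single base.
genomes = {"g1": "ACGTACGA", "g2": "ACGTTCGA"}
index = build_colored_kmer_index(genomes, 4)
hits = kmer_hits("CGTAC", index, 4)
```

Because shared k-mers are stored once with several colors, sequence that is common to many genomes is not duplicated, which is the property the abstract credits for the method's sublinear scaling.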
Affiliation(s)
- Tizian Schulz
  - Faculty of Technology and Center for Biotechnology (CeBiTec), Bielefeld University, Bielefeld, 33615, Germany
  - Bielefeld Institute for Bioinformatics Infrastructure (BIBI), Bielefeld University, Bielefeld, 33615, Germany
  - Graduate School "Digital Infrastructure for the Life Sciences" (DILS), Bielefeld University, Bielefeld, 33615, Germany
- Roland Wittler
  - Faculty of Technology and Center for Biotechnology (CeBiTec), Bielefeld University, Bielefeld, 33615, Germany
  - Bielefeld Institute for Bioinformatics Infrastructure (BIBI), Bielefeld University, Bielefeld, 33615, Germany
- Sven Rahmann
  - Genome Informatics, Institute of Human Genetics, University Hospital Essen, University of Duisburg-Essen, Essen, 45122, Germany
- Faraz Hach
  - Vancouver Prostate Centre, Vancouver, V6H 3Z6, Canada
  - Department of Urologic Sciences, University of British Columbia, Vancouver, V6T 1Z4, Canada
- Jens Stoye
  - Faculty of Technology and Center for Biotechnology (CeBiTec), Bielefeld University, Bielefeld, 33615, Germany
  - Bielefeld Institute for Bioinformatics Infrastructure (BIBI), Bielefeld University, Bielefeld, 33615, Germany
|
140
|
González-Dominici LI, Saati-Santamaría Z, García-Fraile P. Genome Analysis and Genomic Comparison of the Novel Species Arthrobacter ipsi Reveal Its Potential Protective Role in Its Bark Beetle Host. MICROBIAL ECOLOGY 2021; 81:471-482. [PMID: 32901388 DOI: 10.1007/s00248-020-01593-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 05/02/2020] [Accepted: 08/30/2020] [Indexed: 06/11/2023]
Abstract
The pine engraver beetle, Ips acuminatus Gyll, is a bark beetle that causes significant damage in Scots pine (Pinus sylvestris) forests and plantations. Like almost all higher organisms, Ips acuminatus harbours a microbiome, although the role of most of its members is not well understood. As part of a study of the bacterial diversity associated with Ips acuminatus, we isolated the strain Arthrobacter sp. IA7. To study its potential role within the bark beetle holobiont, we sequenced and explored its genome and performed a pan-genome analysis of the genus Arthrobacter, which revealed genes specific to strain IA7 that might be related to its particular role in its niche. Based on these investigations, we suggest several potential roles of the bacterium within the beetle. Analysis of genes related to secondary metabolism indicated potential antifungal capability, confirmed by the inhibition of several entomopathogenic fungal strains (Metarhizium anisopliae CCF0966, Lecanicillium muscarium CCF6041, L. muscarium CCF3297, Isaria fumosorosea CCF4401, I. farinosa CCF4808, Beauveria bassiana CCF4422 and B. brongniartii CCF1547). Phylogenetic analyses of the 16S rRNA gene, six concatenated housekeeping genes (tuf-secY-rpoB-recA-fusA-atpD) and genome sequences indicated that strain IA7 is closely related to A. globiformis NBRC 12137T but forms a new species within the genus Arthrobacter; this was confirmed by digital DNA-DNA hybridization (37.10%) and average nucleotide identity (ANIb) (88.9%). Based on phenotypic and genotypic features, we propose strain IA7T as the novel species Arthrobacter ipsi sp. nov. (type strain IA7T = CECT 30100T = LMG 31782T) and suggest its protective role for its host.
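The species-level conclusion rests on two genome-relatedness values falling below conventional cutoffs. The sketch below checks the reported values against the commonly used thresholds of 70% digital DNA-DNA hybridization and roughly 95% average nucleotide identity; these cutoffs are general conventions assumed for the example rather than values stated in the abstract, and the function name is hypothetical.

```python
# Conventional prokaryotic species boundaries (assumed here, not from the abstract).
DDH_SPECIES_THRESHOLD = 70.0   # percent, digital DNA-DNA hybridization
ANI_SPECIES_THRESHOLD = 95.0   # percent, average nucleotide identity (lower bound of 95-96%)


def supports_novel_species(ddh_percent: float, ani_percent: float) -> bool:
    """True when both values fall below the conventional species cutoffs."""
    return ddh_percent < DDH_SPECIES_THRESHOLD and ani_percent < ANI_SPECIES_THRESHOLD


# Values reported for strain IA7 against its closest relative.
ia7_is_distinct = supports_novel_species(ddh_percent=37.10, ani_percent=88.9)
```

Both reported values sit well below the cutoffs, which is consistent with the proposal of a new species.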
Affiliation(s)
- Lihuén Iraí González-Dominici
  - Microbiology and Genetics Department, University of Salamanca, Salamanca, Spain
  - Spanish-Portuguese Institute for Agricultural Research (CIALE), Villamayor, Salamanca, Spain
- Zaki Saati-Santamaría
  - Microbiology and Genetics Department, University of Salamanca, Salamanca, Spain
  - Spanish-Portuguese Institute for Agricultural Research (CIALE), Villamayor, Salamanca, Spain
- Paula García-Fraile
  - Microbiology and Genetics Department, University of Salamanca, Salamanca, Spain
  - Spanish-Portuguese Institute for Agricultural Research (CIALE), Villamayor, Salamanca, Spain
  - Institute of Microbiology of the Czech Academy of Sciences, Prague, Czech Republic
  - Associated R&D Unit, USAL-CSIC (IRNASA), Salamanca, Spain
|
141
|
Ruiz-Roldán L, de Toro M, Sáenz Y. Whole Genome Analysis of Environmental Pseudomonas mendocina Strains: Virulence Mechanisms and Phylogeny. Genes (Basel) 2021; 12:115. [PMID: 33477842 PMCID: PMC7832885 DOI: 10.3390/genes12010115] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/21/2020] [Revised: 01/10/2021] [Accepted: 01/16/2021] [Indexed: 12/15/2022]
Abstract
Pseudomonas mendocina is an environmental bacterium that is rarely isolated from clinical specimens, although it has been reported to cause endocarditis and sepsis. Little is known about its genome. Whole genome sequencing can be used to learn about the phylogeny, evolution, or pathogenicity of these isolates. Thus, the aim of this study was to analyze the resistome, virulome, and phylogenetic relationship of two P. mendocina strains, Ps542 and Ps799, isolated from a healthy Anas platyrhynchos fecal sample and a lettuce, respectively. When compared with the small number of P. mendocina genomes available in the National Center for Biotechnology Information (NCBI) repository, both strains were placed within one of two well-defined phylogenetic clusters. Both P. mendocina strains lacked antimicrobial resistance genes, but the Ps799 genome showed a MOBP3 family relaxase. Nevertheless, this study revealed that P. mendocina possesses a substantial number of virulence factors, including a leukotoxin, flagella, pili, and the Type 2 and Type 6 Secretion Systems, that could be responsible for its pathogenesis. More phenotypic and in vivo studies are needed to clarify the association with human infections and the potential pathogenicity of P. mendocina.
Affiliation(s)
- Lidia Ruiz-Roldán
  - Área de Microbiología Molecular, Centro de Investigación Biomédica de La Rioja (CIBIR), C/Piqueras 98, 26006 Logroño, Spain
- María de Toro
  - Plataforma de Genómica y Bioinformática, Centro de Investigación Biomédica de La Rioja (CIBIR), C/Piqueras 98, 26006 Logroño, Spain
- Yolanda Sáenz
  - Área de Microbiología Molecular, Centro de Investigación Biomédica de La Rioja (CIBIR), C/Piqueras 98, 26006 Logroño, Spain
|
142
|
Genomic Analysis of a Newly Isolated Acidithiobacillus ferridurans JAGS Strain Reveals Its Adaptation to Acid Mine Drainage. MINERALS 2021. [DOI: 10.3390/min11010074] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 11/16/2022]
Abstract
Acidithiobacillus ferridurans JAGS is a newly isolated acidophile from an acid mine drainage (AMD). The genome of isolate JAGS was sequenced and compared with eight other published genomes of Acidithiobacillus. The pairwise mutation distance (Mash) and average nucleotide identity (ANI) revealed that isolate JAGS had a close evolutionary relationship with A. ferridurans JCM18981, but whole-genome alignment showed that it had higher similarity in genomic structure with A. ferrooxidans species. Pan-genome analysis revealed that nine genomes were comprised of 4601 protein coding sequences, of which 43% were core genes (1982) and 23% were unique genes (1064). A. ferridurans species had more unique genes (205–246) than A. ferrooxidans species (21–234). Functional gene categorizations showed that A. ferridurans strains had a higher portion of genes involved in energy production and conversion while A. ferrooxidans had more for inorganic ion transport and metabolism. A high abundance of kdp, mer and ars genes, as well as mobile genetic elements, was found in isolate JAGS, which might contribute to its resistance to harsh environments. These findings expand our understanding of the evolutionary adaptation of Acidithiobacillus and indicate that A. ferridurans JAGS is a promising candidate for biomining and AMD biotreatment applications.
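The pan-genome proportions quoted in this abstract can be checked with simple arithmetic. The sketch below reproduces the 43% core and 23% unique figures from the reported counts; the remaining "shared by some but not all" category is an inference that assumes the core and unique sets are mutually exclusive and that every other coding sequence falls into that remainder. Variable names are hypothetical.

```python
# Counts reported in the abstract.
PAN_GENOME_SIZE = 4601   # protein coding sequences across nine genomes
CORE_GENES = 1982        # present in all nine genomes
UNIQUE_GENES = 1064      # present in exactly one genome

# Inferred remainder (assumption: categories are mutually exclusive).
shared_by_some = PAN_GENOME_SIZE - CORE_GENES - UNIQUE_GENES


def percent(part: int, whole: int) -> int:
    """Share of `whole` rounded to the nearest whole percent."""
    return round(100 * part / whole)


core_pct = percent(CORE_GENES, PAN_GENOME_SIZE)
unique_pct = percent(UNIQUE_GENES, PAN_GENOME_SIZE)
shared_pct = percent(shared_by_some, PAN_GENOME_SIZE)
```

Under that assumption roughly a third of the pan-genome is neither core nor strain-specific.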
|
143
|
Perrin A, Rocha EPC. PanACoTA: a modular tool for massive microbial comparative genomics. NAR Genom Bioinform 2021; 3:lqaa106. [PMID: 33575648 PMCID: PMC7803007 DOI: 10.1093/nargab/lqaa106] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 09/16/2020] [Revised: 11/10/2020] [Accepted: 12/01/2020] [Indexed: 02/06/2023]
Abstract
The study of the gene repertoires of microbial species, their pangenomes, has become a key part of microbial evolution and functional genomics. Yet, the increasing number of genomes available complicates the establishment of the basic building blocks of comparative genomics. Here, we present PanACoTA (https://github.com/gem-pasteur/PanACoTA), a tool that lets users download all genomes of a species, build a database with those passing quality and redundancy controls, uniformly annotate them, and then build their pangenome, several variants of core genomes, their alignments and a rapid but accurate phylogenetic tree. While many programs for building pangenomes have become available in the last few years, we have focused on a modular method that tackles all the key steps of the process, from download to phylogenetic inference. While all steps are integrated, they can also be run separately and multiple times to allow rapid and extensive exploration of the parameters of interest. PanACoTA is built in Python3 and includes a Singularity container and features to facilitate its future development. We believe PanACoTA is an interesting addition to the current set of comparative genomics tools, since it will accelerate and standardize the more routine parts of the work, allowing microbial genomicists to more quickly tackle their specific questions.
Affiliation(s)
- Amandine Perrin
  - Microbial Evolutionary Genomics, CNRS, UMR3525, Institut Pasteur, 28, rue Dr Roux, Paris 75015, France
- Eduardo P C Rocha
  - Microbial Evolutionary Genomics, CNRS, UMR3525, Institut Pasteur, 28, rue Dr Roux, Paris 75015, France
|
144
|
Domingo-Sananes MR, McInerney JO. Mechanisms That Shape Microbial Pangenomes. Trends Microbiol 2021; 29:493-503. [PMID: 33423895 DOI: 10.1016/j.tim.2020.12.004] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 10/12/2020] [Revised: 12/09/2020] [Accepted: 12/10/2020] [Indexed: 01/02/2023]
Abstract
Analyses of multiple whole-genome sequences from the same species have revealed that differences in gene content can be substantial, particularly in prokaryotes. Such variation has led to the recognition of pangenomes, the complete set of genes present in a species - consisting of core genes, present in all individuals, and accessory genes whose presence is variable. Questions now arise about how pangenomes originate and evolve. We describe how gene content variation can arise as a result of the combination of several processes, including random drift, selection, gain/loss balance, and the influence of ecological and epistatic interactions. We believe that identifying the contributions of these processes to pangenomes will need novel theoretical approaches and empirical data.
Affiliation(s)
- Maria Rosa Domingo-Sananes
  - School of Life Sciences, University of Nottingham, Nottingham, UK
  - School of Science and Technology, Nottingham Trent University, Nottingham, UK
|
145
|
High-Throughput Genotyping Technologies in Plant Taxonomy. Methods Mol Biol 2021; 2222:149-166. [PMID: 33301093 DOI: 10.1007/978-1-0716-0997-2_9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/25/2023]
Abstract
Molecular markers provide researchers with a powerful tool for variation analysis between plant genomes. They are heritable and widely distributed across the genome and for this reason have many applications in plant taxonomy and genotyping. Over the last decade, molecular marker technology has developed rapidly and is now a crucial component for genetic linkage analysis, trait mapping, diversity analysis, and association studies. This chapter focuses on molecular marker discovery, its application, and future perspectives for plant genotyping through pangenome assemblies. Included are descriptions of automated methods for genome and sequence distance estimation, genome contaminant analysis in sequence reads, genome structural variation, and SNP discovery methods.
|
146
|
Higdon SM, Huang BC, Bennett AB, Weimer BC. Identification of Nitrogen Fixation Genes in Lactococcus Isolated from Maize Using Population Genomics and Machine Learning. Microorganisms 2020; 8:2043. [PMID: 33419343 PMCID: PMC7768417 DOI: 10.3390/microorganisms8122043] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 11/10/2020] [Revised: 12/08/2020] [Accepted: 12/17/2020] [Indexed: 02/06/2023]
Abstract
Sierra Mixe maize is a landrace variety from Oaxaca, Mexico, that utilizes nitrogen derived from the atmosphere via an undefined nitrogen fixation mechanism. The diazotrophic microbiota associated with the plant’s mucilaginous aerial root exudate composed of complex carbohydrates was previously identified and characterized by our group where we found 23 lactococci capable of biological nitrogen fixation (BNF) without containing any of the proposed essential genes for this trait (nifHDKENB). To determine the genes in Lactococcus associated with this phenotype, we selected 70 lactococci from the dairy industry that are not known to be diazotrophic to conduct a comparative population genomic analysis. This showed that the diazotrophic lactococcal genomes were distinctly different from the dairy isolates. Examining the pangenome followed by genome-wide association study and machine learning identified genes with the functions needed for BNF in the maize isolates that were absent from the dairy isolates. Many of the putative genes received an ‘unknown’ annotation, which led to the domain analysis of the 135 homologs. This revealed genes with molecular functions needed for BNF, including mucilage carbohydrate catabolism, glycan-mediated host adhesion, iron/siderophore utilization, and oxidation/reduction control. This is the first report of this pathway in this organism to underpin BNF. Consequently, we proposed a model needed for BNF in lactococci that plausibly accounts for BNF in the absence of the nif operon in this organism.
Affiliation(s)
- Shawn M. Higdon
  - Department of Plant Sciences, University of California, Davis, CA 95616, USA
- Bihua C. Huang
  - Department of Population Health and Reproduction, School of Veterinary Medicine, University of California, Davis, CA 95616, USA
  - 100 K Pathogen Genome Project, University of California, Davis, CA 95616, USA
- Alan B. Bennett
  - Department of Plant Sciences, University of California, Davis, CA 95616, USA
- Bart C. Weimer
  - Department of Population Health and Reproduction, School of Veterinary Medicine, University of California, Davis, CA 95616, USA
  - 100 K Pathogen Genome Project, University of California, Davis, CA 95616, USA
|
147
|
Utter DR, Borisy GG, Eren AM, Cavanaugh CM, Mark Welch JL. Metapangenomics of the oral microbiome provides insights into habitat adaptation and cultivar diversity. Genome Biol 2020; 21:293. [PMID: 33323129 PMCID: PMC7739467 DOI: 10.1186/s13059-020-02200-2] [Citation(s) in RCA: 35] [Impact Index Per Article: 8.8] [Received: 01/27/2020] [Accepted: 11/09/2020] [Indexed: 12/12/2022]
Abstract
BACKGROUND The increasing availability of microbial genomes and environmental shotgun metagenomes provides unprecedented access to the genomic differences within related bacteria. The human oral microbiome with its diverse habitats and abundant, relatively well-characterized microbial inhabitants presents an opportunity to investigate bacterial population structures at an ecosystem scale. RESULTS Here, we employ a metapangenomic approach that combines public genomes with Human Microbiome Project (HMP) metagenomes to study the diversity of microbial residents of three oral habitats: tongue dorsum, buccal mucosa, and supragingival plaque. For two exemplar taxa, Haemophilus parainfluenzae and the genus Rothia, metapangenomes reveal distinct genomic groups based on shared genome content. H. parainfluenzae genomes separate into three distinct subgroups with differential abundance between oral habitats. Functional enrichment analyses identify an operon encoding oxaloacetate decarboxylase as diagnostic for the tongue-abundant subgroup. For the genus Rothia, grouping by shared genome content recapitulates species-level taxonomy and habitat preferences. However, while most R. mucilaginosa are restricted to the tongue as expected, two genomes represent a cryptic population of R. mucilaginosa in many buccal mucosa samples. For both H. parainfluenzae and the genus Rothia, we identify not only limitations in the ability of cultivated organisms to represent populations in their native environment, but also specifically which cultivar gene sequences are absent or ubiquitous. CONCLUSIONS Our findings provide insights into population structure and biogeography in the mouth and form specific hypotheses about habitat adaptation. These results illustrate the power of combining metagenomes and pangenomes to investigate the ecology and evolution of bacteria across analytical scales.
Affiliation(s)
- Daniel R Utter
  - Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, 02138, USA
- A Murat Eren
  - The Josephine Bay Paul Center for Comparative Molecular Biology and Evolution, Marine Biological Laboratory, Woods Hole, MA, 02543, USA
  - Department of Medicine, University of Chicago, Chicago, IL, 60637, USA
- Colleen M Cavanaugh
  - Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, 02138, USA
- Jessica L Mark Welch
  - The Josephine Bay Paul Center for Comparative Molecular Biology and Evolution, Marine Biological Laboratory, Woods Hole, MA, 02543, USA
|
148
|
Verma DK, Chaudhary C, Singh L, Sidhu C, Siddhardha B, Prasad SE, Thakur KG. Isolation and Taxonomic Characterization of Novel Haloarchaeal Isolates From Indian Solar Saltern: A Brief Review on Distribution of Bacteriorhodopsins and V-Type ATPases in Haloarchaea. Front Microbiol 2020; 11:554927. [PMID: 33362726 PMCID: PMC7755889 DOI: 10.3389/fmicb.2020.554927] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/23/2020] [Accepted: 09/17/2020] [Indexed: 01/10/2023]
Abstract
Haloarchaea inhabit high-salinity environments worldwide. They are a potentially rich source of crucial biomolecules such as carotenoids and industrially useful proteins. However, the diversity of haloarchaea in Indian high-salinity environments is poorly studied. In the present study, we isolated 12 haloarchaeal strains from a hypersaline solar saltern at Kottakuppam, Tamil Nadu, India. 16S rRNA-based taxonomic characterization of these isolates suggested that nine of them are novel strains that belong to the genera Haloarcula, Halomicrobium, and Haloferax. Transmission electron microscopy suggests the polymorphic nature of these haloarchaeal isolates. Most haloarchaeal species are known to be high producers of carotenoids, and we were able to isolate carotenoids from all 12 isolates. UV-Vis spectroscopy-based analysis suggests that bacterioruberin and lycopene are the major carotenoids produced by these isolates. Based on visual inspection of the purified carotenoids, the isolates were classified into two broad categories, yellow and orange, a difference attributed to the ratio of bacterioruberin to lycopene, as confirmed by the UV-Vis spectral analysis. Using a PCR-based screening assay, we detected the bacteriorhodopsin gene (bop) in 11 isolates. We performed whole-genome sequencing for three bop-positive isolates and one bop-negative isolate. Whole-genome sequencing, followed by pan-genome analysis, identified multiple unique genes involved in various biological functions. We also successfully cloned, expressed, and purified functional recombinant bacteriorhodopsin (BR) from one of the isolates using Escherichia coli as an expression host. BR has light-driven proton pumping activity, resulting in a proton gradient across the membrane that is utilized by V-type ATPases to produce ATP.
We analyzed the distribution of bop and other accessory genes involved in functional BR expression and ATP synthesis in all the representative haloarchaeal species. Our bioinformatics-based analysis of all the sequenced members of the genus Haloarcula suggests that bop, if present, is usually inserted between the genes coding for the B and D subunits of the V-type ATPase operon. This study provides new insights into the genomic variations in haloarchaea and reports a new BR variant that expresses well in functional form in E. coli.
Affiliation(s)
- Dipesh Kumar Verma
- Structural Biology Laboratory, G. N. Ramachandran Protein Centre, Council of Scientific and Industrial Research-Institute of Microbial Technology (CSIR-IMTECH), Chandigarh, India
- Chetna Chaudhary
- Structural Biology Laboratory, G. N. Ramachandran Protein Centre, Council of Scientific and Industrial Research-Institute of Microbial Technology (CSIR-IMTECH), Chandigarh, India
- Latika Singh
- Structural Biology Laboratory, G. N. Ramachandran Protein Centre, Council of Scientific and Industrial Research-Institute of Microbial Technology (CSIR-IMTECH), Chandigarh, India
- Chandni Sidhu
- MTCC-Microbial Type Culture Collection & Gene Bank, Council of Scientific and Industrial Research-Institute of Microbial Technology (CSIR-IMTECH), Chandigarh, India
- Busi Siddhardha
- Department of Microbiology, School of Life Sciences, Pondicherry University, Puducherry, India
- Senthil E Prasad
- Biochemical Engineering Research and Process Development Centre, Council of Scientific and Industrial Research-Institute of Microbial Technology (CSIR-IMTECH), Chandigarh, India
- Krishan Gopal Thakur
- Structural Biology Laboratory, G. N. Ramachandran Protein Centre, Council of Scientific and Industrial Research-Institute of Microbial Technology (CSIR-IMTECH), Chandigarh, India
|
149
|
Pangenome Analysis of Mycobacterium tuberculosis Reveals Core-Drug Targets and Screening of Promising Lead Compounds for Drug Discovery. Antibiotics (Basel) 2020; 9:antibiotics9110819. [PMID: 33213029 PMCID: PMC7698547 DOI: 10.3390/antibiotics9110819] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 10/16/2020] [Revised: 11/13/2020] [Accepted: 11/15/2020] [Indexed: 12/03/2022] Open
Abstract
Tuberculosis, caused by Mycobacterium tuberculosis (M. tuberculosis), is one of the leading causes of human deaths globally according to the WHO TB 2019 report. The continuous rise in multi- and extensive-drug resistance in M. tuberculosis broadens the challenges of controlling tuberculosis. The availability of a large number of completely sequenced genomes of M. tuberculosis has provided an opportunity to explore the pangenome of the species along with the pan-phylogeny and to identify potential novel drug targets leading to drug discovery. We calculated the pangenome of M. tuberculosis from a total of 150 complete genomes and performed phylogenomic classification and analysis. Further, the conserved core genome (1251 proteins) was subjected to a series of sequential filters (non-human homology, essentiality, virulence, physicochemical parameters, and pathway analysis), resulting in the identification of eight putative broad-spectrum drug targets. Molecular docking analyses of these targets with ligands available in the DrugBank database shortlisted a total of five promising ligands with projected inhibitory potential, namely 2′-deoxy-thymidine-5′-diphospho-alpha-d-glucose, uridine diphosphate glucose, 2′-deoxy-thymidine-beta-l-rhamnose, thymidine-5′-triphosphate, and citicoline. We are confident that with further lead optimization and experimental validation, these lead compounds may provide a sound basis to develop safe and effective drugs against tuberculosis in humans.
|
150
|
Abstract
A description of the genetic makeup of a species based on a single genome is often insufficient because it ignores the variability in gene repertoire among multiple strains. Estimating the pangenome of a species addresses this issue, as it provides an overview of the genes that are shared by all strains and the genes that are present in only some of the genomes. These different sets of genes can then be analyzed functionally to explore correlations with unique phenotypes and adaptations. This protocol presents the usage of Roary, a Linux-native pangenome application. Roary is straightforward software that provides 1) an overview of core and accessory genes for those interested in general trends and 2) detailed information on gene presence/absence in each genome for in-depth analyses. Results are provided in both text and graphic formats.
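The distinction between genes shared by all strains and genes present in only some can be illustrated with a minimal Python sketch. This is not Roary's implementation or its file format; the function name `split_pangenome`, the strain labels, and the toy gene lists are hypothetical and chosen only for illustration.

```python
from typing import Dict, Set, Tuple


def split_pangenome(genomes: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Return (core, accessory) gene sets for a mapping of strain -> gene names.

    Core genes occur in every strain; accessory genes occur in at least one
    strain but not in all of them.
    """
    if not genomes:
        return set(), set()
    gene_sets = list(genomes.values())
    pan = set().union(*gene_sets)        # every gene seen in any strain
    core = set.intersection(*gene_sets)  # genes present in all strains
    return core, pan - core


# Hypothetical toy data, not taken from any of the studies listed here.
strains = {
    "strain_A": {"gyrB", "recA", "bop"},
    "strain_B": {"gyrB", "recA"},
    "strain_C": {"gyrB", "recA", "crtI"},
}

core, accessory = split_pangenome(strains)
print("core:", sorted(core))
print("accessory:", sorted(accessory))
```

In this toy input, the two genes found in all three strains form the core set, and the two genes found in a single strain each form the accessory set. A real analysis would first cluster orthologous genes across genomes, which this sketch assumes has already been done.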
Affiliation(s)
- Farrah Sitto
- Department of Biological Sciences, Oakland University, Rochester, MI
- Fabia U Battistuzzi
- Department of Biological Sciences, Oakland University, Rochester, MI
- Center for Data Science and Big Data Analytics, Oakland University, Rochester, MI
- Corresponding author: E-mail:
|