1
Jayasundara SL, Algewatta HR, Jayawardana S, Perera M, Peiris LDC. Molecular Identification and Evolutionary Divergence of the Sri Lankan Sambar Deer, Rusa unicolor (Kerr 1792). Animals (Basel) 2023; 13:2877. [PMID: 37760277 PMCID: PMC10525601 DOI: 10.3390/ani13182877] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2023] [Revised: 09/05/2023] [Accepted: 09/08/2023] [Indexed: 09/29/2023] Open
Abstract
The Sambar is one of the largest deer species distributed mainly in Asia, and it has been listed as a vulnerable species. Taxonomy based on morphological characterization has been the gold standard method used to identify the Sambar deer species. Yet, morphological identification is challenging and requires expertise. To conduct species identification and taxonomic decisions, we performed the molecular identification of R. unicolor found in Sri Lanka using DNA barcodes, COI, and Cyt b to compare the Sri Lankan R. unicolor with the Indian R. unicolor and other R. unicolor subspecies. We obtained mitochondrial DNA sequences from COI and Cyt b from blood samples collected from the wet zone in Sri Lanka. A phylogenetic tree was constructed based on the Bayesian analyses using MrBayes 3.2.7. Molecular dating was implemented in Bayesian Evolutionary Analysis Sampling Trees (BEAST v1.8.2) on the concatenated sequence using a log-normal relaxed clock and Yule species tree prior, with four categories. The results showed that the Sri Lankan R. unicolor is genetically different from the Indian R. unicolor and other R. unicolor subspecies. The divergence occurred approximately 1.1 MYA (million years ago) in the Pleistocene era. The results are essential for designing new conservation platforms for these Sambar deer species.
Affiliation(s)
- Subodha Lakruwani Jayasundara
  - Department of Zoology, Faculty of Applied Sciences, University of Sri Jayewardenepura, Nugegoda 10250, Sri Lanka
- Hirusha Randimal Algewatta
  - Department of Zoology, Faculty of Applied Sciences, University of Sri Jayewardenepura, Nugegoda 10250, Sri Lanka
- Suhada Jayawardana
  - Wildlife Rehabilitation Center, Department of Wildlife Conservation, 811A, Jayanthipura, Battaramulla 10120, Sri Lanka
- Minoli Perera
  - Department of Zoology, Faculty of Applied Sciences, University of Sri Jayewardenepura, Nugegoda 10250, Sri Lanka
- L. Dinithi C. Peiris
  - Genetics & Molecular Biology Unit/Department of Zoology, Faculty of Applied Sciences, University of Sri Jayewardenepura, Nugegoda 10250, Sri Lanka
2
Mposhi A, Cortés-Mancera F, Heegsma J, de Meijer VE, van de Sluis B, Sydor S, Bechmann LP, Theys C, de Rijk P, De Pooter T, Vanden Berghe W, İnce İA, Faber KN, Rots MG. Mitochondrial DNA methylation in metabolic associated fatty liver disease. Front Nutr 2023; 10:964337. [PMID: 37305089 PMCID: PMC10249072 DOI: 10.3389/fnut.2023.964337] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2022] [Accepted: 02/07/2023] [Indexed: 06/13/2023] Open
Abstract
Introduction Hepatic lipid accumulation and mitochondrial dysfunction are hallmarks of metabolic associated fatty liver disease (MAFLD), yet molecular parameters underlying MAFLD progression are not well understood. Differential methylation within the mitochondrial DNA (mtDNA) has been suggested to be associated with dysfunctional mitochondria, also during progression to Metabolic Steatohepatitis (MeSH). This study further investigates whether mtDNA methylation is associated with hepatic lipid accumulation and MAFLD. Methods HepG2 cells were constructed to stably express mitochondria-targeted viral and prokaryotic cytosine DNA methyltransferases (mtM.CviPI or mtM.SssI for GpC or CpG methylation, respectively). A catalytically inactive variant (mtM.CviPI-Mut) was constructed as a control. Mouse and human patients' samples were also investigated. mtDNA methylation was assessed by pyro- or nanopore sequencing. Results and discussion Differentially induced mtDNA hypermethylation impaired mitochondrial gene expression and metabolic activity in HepG2-mtM.CviPI and HepG2-mtM.SssI cells and was associated with increased lipid accumulation, when compared to the controls. To test whether lipid accumulation causes mtDNA methylation, HepG2 cells were subjected to 1 or 2 weeks of fatty acid treatment, but no clear differences in mtDNA methylation were detected. In contrast, hepatic Nd6 mitochondrial gene body cytosine methylation and Nd6 gene expression were increased in mice fed a high-fat high cholesterol diet (HFC for 6 or 20 weeks), when compared to controls, while mtDNA content was unchanged. For patients with simple steatosis, a higher ND6 methylation was confirmed using Methylation Specific PCR, but no additional distinctive cytosines could be identified using pyrosequencing. This study warrants further investigation into a role for mtDNA methylation in promoting mitochondrial dysfunction and impaired lipid metabolism in MAFLD.
Affiliation(s)
- Archibold Mposhi
  - Department of Pathology and Medical Biology, University of Groningen, University Medical Center Groningen, Groningen, Netherlands
  - Department of Gastroenterology and Hepatology, University of Groningen, University Medical Center Groningen, Groningen, Netherlands
- Fabian Cortés-Mancera
  - Department of Pathology and Medical Biology, University of Groningen, University Medical Center Groningen, Groningen, Netherlands
  - Departamento de Ciencias Aplicadas, Instituto Tecnológico Metropolitano, Medellín, Colombia
- Janette Heegsma
  - Department of Gastroenterology and Hepatology, University of Groningen, University Medical Center Groningen, Groningen, Netherlands
- Vincent E. de Meijer
  - Department of Surgery, Division of Hepato-Pancreato-Biliary Surgery and Liver Transplantation, University of Groningen, University Medical Center Groningen, Groningen, Netherlands
- Bart van de Sluis
  - Section of Molecular Genetics, University of Groningen, University Medical Center Groningen, Groningen, Netherlands
- Svenja Sydor
  - Department of Internal Medicine, University Hospital Knappschaftskrankenhaus, Bochum, Germany
  - Ruhr-University Bochum, Bochum, Germany
- Lars P. Bechmann
  - Department of Internal Medicine, University Hospital Knappschaftskrankenhaus, Bochum, Germany
  - Ruhr-University Bochum, Bochum, Germany
- Claudia Theys
  - Department of Biomedical Sciences, University of Antwerp, Antwerp, Belgium
- Peter de Rijk
  - Department of Biomedical Sciences, University of Antwerp, Antwerp, Belgium
  - Neuromics Support Facility, VIB-UAntwerp Center for Molecular Neurology, University of Antwerp, Antwerp, Belgium
- Tim De Pooter
  - Department of Biomedical Sciences, University of Antwerp, Antwerp, Belgium
  - Neuromics Support Facility, VIB-UAntwerp Center for Molecular Neurology, University of Antwerp, Antwerp, Belgium
- Wim Vanden Berghe
  - Department of Biomedical Sciences, University of Antwerp, Antwerp, Belgium
- İkbal Agah İnce
  - Department of Pathology and Medical Biology, University of Groningen, University Medical Center Groningen, Groningen, Netherlands
  - Department of Medical Microbiology, School of Medicine, Acıbadem Mehmet Ali Aydınlar University, Istanbul, Türkiye
- Klaas Nico Faber
  - Department of Gastroenterology and Hepatology, University of Groningen, University Medical Center Groningen, Groningen, Netherlands
- Marianne G. Rots
  - Department of Pathology and Medical Biology, University of Groningen, University Medical Center Groningen, Groningen, Netherlands
3
Pinto LDA, Machado FP, Esteves R, Farias VM, Köptcke FBN, Ricci-Junior E, Rocha L, Keller LAM. Characterization and Inhibitory Effects of Essential Oil and Nanoemulsion from Ocotea indecora (Shott) Mez in Aspergillus Species. Molecules 2023; 28:3437. [PMID: 37110671 PMCID: PMC10142328 DOI: 10.3390/molecules28083437] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/18/2023] [Revised: 04/05/2023] [Accepted: 04/09/2023] [Indexed: 04/29/2023] Open
Abstract
The Aspergillus genus, the etiological agent of aspergillosis, is an important food contaminant and mycotoxin producer. Plant extracts and essential oils are a source of bioactive substances with antimicrobial potential that can be used instead of synthetic food preservatives. Species from the Lauraceae family and the Ocotea genus have been used as traditional medicinal herbs. Their essential oils can be nanoemulsified to enhance their stability and bioavailability and increase their use. Therefore, this study sought to prepare and characterize both nanoemulsion and essential oil from Ocotea indecora leaves, a native and endemic species from the Mata Atlântica forest in Brazil, and evaluate the activity against Aspergillus flavus RC 2054, Aspergillus parasiticus NRRL 2999, and Aspergillus westerdijkiae NRRL 3174. The products were added to Sabouraud Dextrose Agar at concentrations of 256, 512, 1024, 2048, and 4096 µg/mL. The strains were inoculated and incubated for up to 96 h with two daily measurements. The results did not show fungicidal activity under these conditions. A fungistatic effect, however, was observed. The nanoemulsion decreased the fungistatic concentration of the essential oil more than ten times, mainly in A. westerdijkiae. There were no significant changes in aflatoxin production.
Affiliation(s)
- Leonardo de Assunção Pinto
  - Programa de Pós-Graduação em Biotecnologia Vegetal e Bioprocessos, Centro de Ciências em Saúde, Universidade Federal do Rio de Janeiro, Rio de Janeiro CEP 21941-590, Brazil
- Francisco Paiva Machado
  - Programa de Pós-Graduação em Biotecnologia Vegetal e Bioprocessos, Centro de Ciências em Saúde, Universidade Federal do Rio de Janeiro, Rio de Janeiro CEP 21941-590, Brazil
- Ricardo Esteves
  - Programa de Pós-Graduação em Biotecnologia Vegetal e Bioprocessos, Centro de Ciências em Saúde, Universidade Federal do Rio de Janeiro, Rio de Janeiro CEP 21941-590, Brazil
- Victor Moebus Farias
  - Programa de Pós-Graduação em Higiene Veterinária e Processamento Tecnológico de Produtos de Origem Animal, Faculdade de Veterinária, Universidade Federal Fluminense, Niterói, Rio de Janeiro CEP 24220-000, Brazil
- Eduardo Ricci-Junior
  - Departamento de Medicamentos, Faculdade de Farmácia, Universidade Federal do Rio de Janeiro, Rio de Janeiro CEP 21941-902, Brazil
- Leandro Rocha
  - Faculdade de Farmácia, Universidade Federal Fluminense, Niterói, Rio de Janeiro CEP 24241-000, Brazil
  - Laboratório de Tecnologia de Produtos Naturais, Faculdade de Farmácia, Universidade Federal Fluminense, Niterói, Rio de Janeiro CEP 24241-002, Brazil
- Luiz Antonio Moura Keller
  - Departamento de Zootecnia e Desenvolvimento Agrosustentável, Faculdade de Veterinária, Universidade Federal Fluminense, Niterói, Rio de Janeiro CEP 24220-000, Brazil
4
de la Fuente R, Díaz-Villanueva W, Arnau V, Moya A. Genomic Signature in Evolutionary Biology: A Review. BIOLOGY 2023; 12:322. [PMID: 36829597 PMCID: PMC9953303 DOI: 10.3390/biology12020322] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2022] [Revised: 02/11/2023] [Accepted: 02/13/2023] [Indexed: 02/19/2023]
Abstract
Organisms are unique physical entities in which information is stored and continuously processed. The digital nature of DNA sequences enables the construction of a dynamic information reservoir. However, the distinction between the hardware and software components in the information flow is crucial to identify the mechanisms generating specific genomic signatures. In this work, we perform a bibliometric analysis to identify the different purposes of looking for particular patterns in DNA sequences associated with a given phenotype. This study has enabled us to make a conceptual breakdown of the genomic signature and differentiate the leading applications. On the one hand, it refers to gene expression profiling associated with a biological function, which may be shared across taxa. This signature is the focus of study in precision medicine. On the other hand, it also refers to characteristic patterns in species-specific DNA sequences. This interpretation plays a key role in comparative genomics, identifying evolutionary relationships. Looking at the relevant studies in our bibliographic database, we highlight the main factors causing heterogeneities in genome composition and how they can be quantified. All these findings lead us to reformulate some questions relevant to evolutionary biology.
Affiliation(s)
- Rebeca de la Fuente
  - Institute of Integrative Systems Biology (I2Sysbio), University of Valencia and Spanish Research Council (CSIC), 46980 Valencia, Spain
- Wladimiro Díaz-Villanueva
  - Institute of Integrative Systems Biology (I2Sysbio), University of Valencia and Spanish Research Council (CSIC), 46980 Valencia, Spain
- Vicente Arnau
  - Institute of Integrative Systems Biology (I2Sysbio), University of Valencia and Spanish Research Council (CSIC), 46980 Valencia, Spain
- Andrés Moya
  - Institute of Integrative Systems Biology (I2Sysbio), University of Valencia and Spanish Research Council (CSIC), 46980 Valencia, Spain
  - Foundation for the Promotion of Sanitary and Biomedical Research of the Valencian Community (FISABIO), 46020 Valencia, Spain
  - CIBER in Epidemiology and Public Health (CIBEResp), 28029 Madrid, Spain
5
Lo HY, Martínez-Lavanchy PM, Goris T, Heider J, Boll M, Kaster AK, Müller JA. IncP-type plasmids carrying genes for antibiotic resistance or for aromatic compound degradation are prevalent in sequenced Aromatoleum and Thauera strains. Environ Microbiol 2022; 24:6411-6425. [PMID: 36306376 DOI: 10.1111/1462-2920.16262] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2022] [Accepted: 10/25/2022] [Indexed: 01/12/2023]
Abstract
Self-transferable plasmids of the incompatibility group P-1 (IncP-1) are considered important carriers of genes for antibiotic resistance and other adaptive functions. In the laboratory, these plasmids have a broad host range; however, little is known about their in situ host profile. In this study, we discovered that Thauera aromatica K172T, a facultative denitrifying microorganism capable of degrading various aromatic compounds, contains a plasmid highly similar to the IncP-1 ε archetype pKJK5. The plasmid harbours multiple antibiotic resistance genes and is maintained in strain K172T for at least 1000 generations without selection pressure from antibiotics. In a subsequent search, we found nine additional IncP-type plasmids in a total of 40 sequenced genomes of the closely related genera Aromatoleum and Thauera. Six of these plasmids form a novel IncP-1 subgroup designated θ, four of which carry genes for anaerobic or aerobic degradation of aromatic compounds. Pentanucleotide sequence analyses (k-mer profiling) indicated that Aromatoleum spp. and Thauera spp. are among the most suitable hosts for the θ plasmids. Our results highlight the importance of IncP-1 plasmids for the genetic adaptation of these common facultative denitrifying bacteria and provide novel insights into the in situ host profile of these plasmids.
Affiliation(s)
- Hao-Yu Lo
  - Department of Environmental Biotechnology, Helmholtz Centre for Environmental Research - UFZ, Leipzig, Germany
  - Institute for Biological Interfaces (IBG-5), Karlsruhe Institute of Technology, Eggenstein-Leopoldshafen, Germany
- Paula M Martínez-Lavanchy
  - Department of Environmental Biotechnology, Helmholtz Centre for Environmental Research - UFZ, Leipzig, Germany
- Tobias Goris
  - Department of Molecular Toxicology, Intestinal Microbiology, German Institute of Human Nutrition, Potsdam-Rehbruecke, Germany
- Johann Heider
  - Department of Biology, Philipps-Universität Marburg, Germany
- Matthias Boll
  - Institute of Biology II, Albert-Ludwigs-Universität Freiburg, Germany
- Anne-Kristin Kaster
  - Institute for Biological Interfaces (IBG-5), Karlsruhe Institute of Technology, Eggenstein-Leopoldshafen, Germany
- Jochen A Müller
  - Department of Environmental Biotechnology, Helmholtz Centre for Environmental Research - UFZ, Leipzig, Germany
  - Institute for Biological Interfaces (IBG-5), Karlsruhe Institute of Technology, Eggenstein-Leopoldshafen, Germany
6
Hitherto-Unnoticed Self-Transmissible Plasmids Widely Distributed among Different Environments in Japan. Appl Environ Microbiol 2022; 88:e0111422. [PMID: 36069618 PMCID: PMC9499019 DOI: 10.1128/aem.01114-22] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 11/20/2022] Open
Abstract
Various conjugative plasmids were obtained by exogenous plasmid capture, biparental mating, and/or triparental mating methods from different environmental samples in Japan. Based on phylogenetic analyses of their whole-nucleotide sequences, new IncP/P-1 plasmids that could be classified into novel subgroups were obtained. Mini-replicons of the plasmids were constructed, and each of them was incompatible with at least one of the IncP/P-1 plasmids, although they showed diverse iteron sequences in their oriV regions. There were two large clades of IncP/P-1 plasmids, clade I and II. Plasmids in clade I and II included antibiotic resistance genes. Notably, nucleotide compositions of newly found plasmids exhibited different tendencies compared with those of the previously well-studied IncP/P-1 plasmids. Indeed, the host range of plasmids of clade II was different from that of clade I. Although few PromA plasmids have been reported, the number of plasmids belonging to PromAβ, and -γ subgroups detected in this study was close to that of IncP/P-1 plasmids. The host ranges of PromAγ and PromAδ plasmids were broad and transferred to different and distinct classes of Proteobacteria. Interestingly, PromA plasmids and many IncP/P-1 plasmids do not carry any accessory genes. These findings indicate the presence of "hitherto-unnoticed" conjugative plasmids, including IncP/P-1 or PromA derivative ones in nature. These plasmids would have important roles in the exchange of various genes, including antibiotic resistance genes, among different bacteria in nature.
IMPORTANCE Plasmids are known to spread among different bacteria. However, which plasmids spread among environmental samples and in which environments they are present is still poorly understood. This study showed that unidentified conjugative plasmids were present in various environments. Different novel IncP/P-1 plasmids were found, whose host ranges were different from those of known plasmids, showing wide diversity of IncP/P-1 plasmids. PromA plasmids, exhibiting a broad host range, were diversified into several subgroups and widely distributed in varied environments. These findings are important for understanding how bacteria naturally exchange their genes, including antibiotic resistance genes, a growing threat to human health worldwide.
7
Hernández-Salmerón JE, Moreno-Hagelsieb G. FastANI, Mash and Dashing equally differentiate between Klebsiella species. PeerJ 2022; 10:e13784. [PMID: 35891643 PMCID: PMC9308963 DOI: 10.7717/peerj.13784] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/26/2021] [Accepted: 07/05/2022] [Indexed: 01/17/2023] Open
Abstract
Bacteria of the genus Klebsiella are among the most important multi-drug resistant human pathogens, though they have been isolated from a variety of environments. The importance and ubiquity of these organisms call for quick and accurate methods for their classification. Average Nucleotide Identity (ANI) is becoming a standard for species delimitation based on whole genome sequence comparison. However, much faster genome comparison tools have been appearing in the literature. In this study we tested the quality of different approaches for genome-based species delineation against ANI. To this end, we compared 1,189 Klebsiella genomes using measures calculated with Mash, Dashing, and DNA compositional signatures, all of which run in a fraction of the time required to obtain ANI. Receiver Operating Characteristic (ROC) curve analyses showed equal quality in species discrimination for ANI, Mash and Dashing, with Area Under the Curve (AUC) values above 0.99, followed by DNA signatures (AUC: 0.96). Accordingly, groups obtained at optimized cutoffs largely agree with species designation, with ANI, Mash and Dashing producing 15 species-level groups. DNA signatures broke the dataset into more than 30 groups. Testing Mash to map species after adding draft genomes to the dataset also showed excellent results (AUC above 0.99), producing a total of 26 Klebsiella species-level groups. The ecological niches of Klebsiella strains were found to neither be related to species delimitation, nor to protein functional content, suggesting that a single Klebsiella species can have a wide repertoire of ecological functions.
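As a rough illustration of the kind of fast, alignment-free genome comparison the abstract refers to, the sketch below computes a Mash-style distance from the Jaccard index of two k-mer sets. It is a simplification under stated assumptions: it uses exact k-mer sets instead of a MinHash sketch, ignores reverse complements, and the function names (`kmer_set`, `jaccard`, `mash_distance`) are hypothetical. It is not the pipeline used in the study.

```python
import math


def kmer_set(seq, k):
    """Return the set of all k-mers in a sequence (forward strand only)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard index of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mash_distance(seq_a, seq_b, k=21):
    """Mash-style distance D = -(1/k) * ln(2j / (1 + j)).

    Assumption: j is the exact Jaccard index of the full k-mer sets,
    whereas Mash itself estimates j from a MinHash sketch.
    """
    j = jaccard(kmer_set(seq_a, k), kmer_set(seq_b, k))
    if j == 0.0:
        return 1.0  # no shared k-mers: report the maximum distance
    return max(0.0, -math.log(2 * j / (1 + j)) / k)
```

Identical sequences give a distance of 0, and sequences with no shared k-mers give 1; real genomes would normally be compared with a larger `k` and a sketch of fixed size.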
8
Malla MA, Dubey A, Raj A, Kumar A, Upadhyay N, Yadav S. Emerging frontiers in microbe-mediated pesticide remediation: Unveiling role of omics and In silico approaches in engineered environment. ENVIRONMENTAL POLLUTION (BARKING, ESSEX : 1987) 2022; 299:118851. [PMID: 35085655 DOI: 10.1016/j.envpol.2022.118851] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Received: 09/09/2021] [Revised: 01/09/2022] [Accepted: 01/11/2022] [Indexed: 06/14/2023]
Abstract
The overuse of pesticides for augmenting agriculture productivity always comes at the cost of environment, biodiversity, and human health and has put the land, water, and environmental footprints under severe threat throughout the globe. Underpinning and maximizing the microbiome functions in pesticide-contaminated environments has become a prerequisite for a sustainable environment and resilient agriculture. It is imperative to elucidate the metabolic network of the microbial communities and environmental variables at the contaminated site to predict the best strategy for remediation and soil microbe-pesticide interactions. High throughput next-generation sequencing and in silico analysis allow us to identify and discern the members and characteristics of core microbiomes at the contaminated site. Integration of modern high throughput multi-omics investigations and informatics pipelines provide novel approaches and pathways to capitalize on the core microbiomes for enhancing environmental functioning and mitigation. The role of eco-genomics tools in visualising the microbial network, taxonomy, functional potential, and environmental variables in contaminated habitats is discussed in this review. The integrated role of the potential microbe identification as individual or consortia, mechanistic approach for pesticide degradation, identification of responsible enzymes/genes, and in silico approach is emphasized for the prospects of the area.
Affiliation(s)
- Muneer Ahmad Malla
  - Department of Zoology, Dr. Harisingh Gour University (Central University), Sagar, 470003, MP, India
  - Metagenomics and Secretomics Research Laboratory, Department of Botany, Dr. Harisingh Gour University (Central University), Sagar, 470003, MP, India
- Anamika Dubey
  - Metagenomics and Secretomics Research Laboratory, Department of Botany, Dr. Harisingh Gour University (Central University), Sagar, 470003, MP, India
- Aman Raj
  - Metagenomics and Secretomics Research Laboratory, Department of Botany, Dr. Harisingh Gour University (Central University), Sagar, 470003, MP, India
- Ashwani Kumar
  - Metagenomics and Secretomics Research Laboratory, Department of Botany, Dr. Harisingh Gour University (Central University), Sagar, 470003, MP, India
- Niraj Upadhyay
  - Department of Chemistry, Dr. Harisingh Gour University (Central University), Sagar, 470003, MP, India
- Shweta Yadav
  - Department of Zoology, Dr. Harisingh Gour University (Central University), Sagar, 470003, MP, India
9
Thummadi NB, Charutha S, Pal M, Manimaran P. Multifractal and cross-correlation analysis on mitochondrial genome sequences using chaos game representation. Mitochondrion 2021; 60:121-128. [PMID: 34375735 DOI: 10.1016/j.mito.2021.08.006] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/07/2021] [Revised: 08/02/2021] [Accepted: 08/05/2021] [Indexed: 11/25/2022]
Abstract
We characterized the multifractality and power-law cross-correlation of mitochondrial genomes of various species through the recently developed method which combines the chaos game representation theory and 2D-multifractal detrended cross-correlation analysis. In the present paper, we analyzed 32 mitochondrial genomes of different species and the obtained results show that all the analyzed data exhibit multifractal nature and power-law cross-correlation behaviour. Further, we performed a cluster analysis from the calculated scaling exponents to identify the class affiliation and its outcome is represented as a dendrogram. We suggest that this integrative approach may help the researchers to understand the phylogeny of any kingdom with their varying genome lengths and also this approach may find applications in characterizing the protein sequences, mRNA sequences, next-generation sequencing, and drug development, etc.
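For readers unfamiliar with chaos game representation (CGR), the sketch below shows the standard construction for a DNA sequence: each base pulls the current point halfway toward its assigned corner of the unit square, and the visited points can then be binned into a grid. The corner assignment, the function names (`cgr_points`, `cgr_grid`), and the grid size are illustrative assumptions; the multifractal detrended cross-correlation analysis described in the abstract is not implemented here.

```python
# One common corner assignment for the unit square (an assumption;
# other conventions permute the corners).
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Return the CGR trajectory of a DNA sequence as (x, y) points."""
    x, y = 0.5, 0.5  # start at the centre of the square
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        # Move halfway from the current point toward the base's corner.
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


def cgr_grid(seq, size=8):
    """Bin the CGR points into a size x size count matrix."""
    grid = [[0] * size for _ in range(size)]
    for x, y in cgr_points(seq):
        col = min(int(x * size), size - 1)
        row = min(int(y * size), size - 1)
        grid[row][col] += 1
    return grid
```

The resulting count matrix is the kind of two-dimensional signal on which scaling or clustering analyses could then be run.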
Affiliation(s)
- N B Thummadi
  - Department of Animal Biology, School of Life Sciences, University of Hyderabad, Gachibowli, Hyderabad 500 046, India
- S Charutha
  - School of Physics, University of Hyderabad, Gachibowli, Hyderabad 500 046, India
- Mayukha Pal
  - ABB Ability Innovation Centre, Asea Brown Boveri Company, Hyderabad 500084, India
- P Manimaran
  - School of Physics, University of Hyderabad, Gachibowli, Hyderabad 500 046, India
10
Tay AP, Hosking B, Hosking C, Bauer DC, Wilson LO. INSIDER: alignment-free detection of foreign DNA sequences. Comput Struct Biotechnol J 2021; 19:3810-3816. [PMID: 34285780 PMCID: PMC8273350 DOI: 10.1016/j.csbj.2021.06.045] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/15/2021] [Revised: 06/28/2021] [Accepted: 06/28/2021] [Indexed: 11/21/2022] Open
Abstract
External DNA sequences can be inserted into an organism's genome either through natural processes such as gene transfer, or through targeted genome engineering strategies. Being able to robustly identify such foreign DNA is a crucial capability for health and biosecurity applications, such as anti-microbial resistance (AMR) detection or monitoring gene drives. This capability does not exist for poorly characterised host genomes or with limited information about the integrated sequence. To address this, we developed the INserted Sequence Information DEtectoR (INSIDER). INSIDER analyses whole genome sequencing data and identifies segments of potentially foreign origin by their significant shift in k-mer signatures. We demonstrate the power of INSIDER to separate integrated DNA sequences from normal genomic sequences on a synthetic dataset simulating the insertion of a CRISPR-Cas gene drive into wild-type yeast. As a proof-of-concept, we use INSIDER to detect the exact AMR plasmid in whole genome sequencing data from a Citrobacter freundii patient isolate. INSIDER streamlines the process of identifying integrated DNA in poorly characterised wild species or when the insert is of unknown origin, thus enhancing the monitoring of emerging biosecurity threats.
Affiliation(s)
- Aidan P. Tay
  - Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation, New South Wales, Sydney, Australia
  - Applied BioSciences, Faculty of Science and Engineering, Macquarie University, New South Wales, Sydney, Australia
- Brendan Hosking
  - Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation, New South Wales, Sydney, Australia
- Cameron Hosking
  - Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation, New South Wales, Sydney, Australia
- Denis C. Bauer
  - Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation, New South Wales, Sydney, Australia
  - Department of Biomedical Sciences, Macquarie University, New South Wales, Sydney, Australia
  - Applied BioSciences, Faculty of Science and Engineering, Macquarie University, New South Wales, Sydney, Australia
- Laurence O.W. Wilson
  - Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation, New South Wales, Sydney, Australia
  - Applied BioSciences, Faculty of Science and Engineering, Macquarie University, New South Wales, Sydney, Australia
11
Bize A, Midoux C, Mariadassou M, Schbath S, Forterre P, Da Cunha V. Exploring short k-mer profiles in cells and mobile elements from Archaea highlights the major influence of both the ecological niche and evolutionary history. BMC Genomics 2021; 22:186. [PMID: 33726663 PMCID: PMC7962313 DOI: 10.1186/s12864-021-07471-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 10/13/2020] [Accepted: 02/24/2021] [Indexed: 12/16/2022] Open
Abstract
BACKGROUND K-mer-based methods have greatly advanced in recent years, largely driven by the realization of their biological significance and by the advent of next-generation sequencing. Their speed and their independence from the annotation process are major advantages. Their utility in the study of the mobilome has recently emerged and they seem a priori adapted to the patchy gene distribution and the lack of universal marker genes of viruses and plasmids. To provide a framework for the interpretation of results from k-mer based methods applied to archaea or their mobilome, we analyzed the 5-mer DNA profiles of close to 600 archaeal cells, viruses and plasmids. Archaea is one of the three domains of life. Archaea seem enriched in extremophiles and are associated with a high diversity of viral and plasmid families, many of which are specific to this domain. We explored the dataset structure by multivariate and statistical analyses, seeking to identify the underlying factors. RESULTS For cells, the 5-mer profiles were inconsistent with the phylogeny of archaea. At a finer taxonomic level, the influence of the taxonomy and the environmental constraints on 5-mer profiles was very strong. These two factors were interdependent to a significant extent, and the respective weights of their contributions varied according to the clade. A convergent adaptation was observed for the class Halobacteria, for which a strong 5-mer signature was identified. For mobile elements, coevolution with the host had a clear influence on their 5-mer profile. This enabled us to identify one previously known and one new case of recent host transfer based on the atypical composition of the mobile elements involved. Beyond the effect of coevolution, extrachromosomal elements strikingly retain the specific imprint of their own viral or plasmid taxonomic family in their 5-mer profile. 
CONCLUSION This specific imprint confirms that the evolution of extrachromosomal elements is driven by multiple parameters and is not restricted to host adaptation. In addition, we detected only recent host transfer events, suggesting the fast evolution of short k-mer profiles. This calls for caution when using k-mers for host prediction, metagenomic binning or phylogenetic reconstruction.
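As an illustration of the kind of 5-mer profile this abstract refers to, the following is a minimal sketch, not the authors' pipeline. The function names `kmer_profile` and `profile_distance`, the restriction to the A/C/G/T alphabet, and the use of a Euclidean distance are assumptions made for the example.

```python
from collections import Counter
from itertools import product
from math import sqrt


def kmer_profile(seq: str, k: int = 5) -> dict[str, float]:
    """Relative frequency of every k-mer over A/C/G/T.

    Windows containing any other symbol (e.g. N) are ignored.
    Hypothetical helper for illustration only.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return {m: 0.0 for m in kmers}
    return {m: counts[m] / total for m in kmers}


def profile_distance(p: dict[str, float], q: dict[str, float]) -> float:
    """Euclidean distance between two profiles built with the same k."""
    return sqrt(sum((p[m] - q[m]) ** 2 for m in p))


if __name__ == "__main__":
    host = kmer_profile("ACGTACGTTTGACCA" * 20)
    element = kmer_profile("ACGTACGTTTGACCA" * 5)
    print(profile_distance(host, element))
```

With k = 5 each profile has 4^5 = 1024 entries, so profiles from genomes of very different lengths can be compared directly once they are expressed as relative frequencies.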
Affiliation(s)
- Ariane Bize
- Université Paris-Saclay, INRAE, PROSE, F-92761, Antony, France.
- Cédric Midoux
- Université Paris-Saclay, INRAE, PROSE, F-92761, Antony, France.,Université Paris-Saclay, INRAE, MaIAGE, F-78350, Jouy-en-Josas, France.,Université Paris-Saclay, INRAE, BioinfOmics, MIGALE bioinformatics facility, F-78350, Jouy-en-Josas, France
- Mahendra Mariadassou
- Université Paris-Saclay, INRAE, MaIAGE, F-78350, Jouy-en-Josas, France.,Université Paris-Saclay, INRAE, BioinfOmics, MIGALE bioinformatics facility, F-78350, Jouy-en-Josas, France
- Sophie Schbath
- Université Paris-Saclay, INRAE, MaIAGE, F-78350, Jouy-en-Josas, France.,Université Paris-Saclay, INRAE, BioinfOmics, MIGALE bioinformatics facility, F-78350, Jouy-en-Josas, France
- Patrick Forterre
- Institut Pasteur, Unité de Virologie des Archées, Département de Microbiologie, 25 Rue du Docteur Roux, 75015, Paris, France.,Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France.
- Violette Da Cunha
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France
|
12
|
Hernández-Salmerón JE, Moreno-Hagelsieb G. Progress in quickly finding orthologs as reciprocal best hits: comparing blast, last, diamond and MMseqs2. BMC Genomics 2020; 21:741. [PMID: 33099302 PMCID: PMC7585182 DOI: 10.1186/s12864-020-07132-6] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 05/26/2020] [Accepted: 10/09/2020] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND Finding orthologs remains an important bottleneck in comparative genomics analyses. While the authors of software for the quick comparison of protein sequences evaluate the speed of their software and compare their results against the most usual software for the task, it is not common for them to evaluate their software for more particular uses, such as finding orthologs as reciprocal best hits (RBH). Here we compared RBH results obtained using software that runs faster than blastp, namely lastal, diamond, and MMseqs2. RESULTS We found that lastal required the least time to produce results. However, it yielded fewer results than any other program when comparing the proteins encoded by evolutionarily distant genomes. The program producing the most similar number of RBH to blastp was diamond run with the "ultra-sensitive" option. However, this option was diamond's slowest, with the "very-sensitive" option offering the best balance between speed and RBH results. The speeding up of the programs was much more evident when dealing with eukaryotic genomes, which code for more numerous proteins. For example, lastal took a median of approx. 1.5% of the blastp time to run with bacterial proteomes and 0.6% with eukaryotic ones, while diamond with the very-sensitive option took 7.4% and 5.2%, respectively. Though estimated error rates were very similar among the RBH obtained with all programs, RBH obtained with MMseqs2 had the lowest error rates among the programs tested. CONCLUSIONS The fast algorithms for pairwise protein comparison produced results very similar to blast in a fraction of the time, with diamond offering the best compromise in speed, sensitivity and quality, as long as a sensitivity option, other than the default, was chosen.
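The reciprocal-best-hit idea mentioned in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' workflow: hits are assumed to be already parsed into `(query, subject, bitscore)` tuples from whichever aligner was used, the best hit is taken to be the one with the highest bit score, and the function names `best_hits` and `reciprocal_best_hits` are hypothetical.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    `hits` is an iterable of (query, subject, bitscore) tuples.
    Ties keep the first hit seen (an arbitrary choice for this sketch).
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Pairs (a, b) where b is a's best hit and a is b's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


if __name__ == "__main__":
    a_vs_b = [("a1", "b1", 200.0), ("a1", "b2", 150.0), ("a2", "b2", 90.0)]
    b_vs_a = [("b1", "a1", 190.0), ("b2", "a1", 160.0), ("b2", "a2", 80.0)]
    print(reciprocal_best_hits(a_vs_b, b_vs_a))
```

In the toy data, `a2` has `b2` as its best hit, but `b2` prefers `a1`, so only the pair `("a1", "b1")` is reciprocal.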
Affiliation(s)
- Gabriel Moreno-Hagelsieb
- Wilfrid Laurier University, Department of Biology, 75 University Ave W, Waterloo, N2L 3C5 ON Canada
|
13
|
Tokuda M, Suzuki H, Yanagiya K, Yuki M, Inoue K, Ohkuma M, Kimbara K, Shintani M. Determination of Plasmid pSN1216-29 Host Range and the Similarity in Oligonucleotide Composition Between Plasmid and Host Chromosomes. Front Microbiol 2020; 11:1187. [PMID: 32582111 PMCID: PMC7296055 DOI: 10.3389/fmicb.2020.01187] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 03/14/2020] [Accepted: 05/11/2020] [Indexed: 12/17/2022] Open
Abstract
Plasmids are extrachromosomal DNA that can be horizontally transferred between different bacterial cells by conjugation. Horizontal gene transfer of plasmids can promote rapid evolution and adaptation of bacteria by imparting various traits involved in antibiotic resistance, virulence, and metabolism to their hosts. The host range of plasmids is an important feature for understanding how they spread in environmental microbial communities. Earlier bioinformatics studies have demonstrated that plasmids are likely to have similar oligonucleotide (k-mer) compositions to their host chromosomes and that evolutionary host ranges of plasmids could be predicted from this similarity. However, there are no complementary studies to assess the consistency between the predicted evolutionary host range and experimentally determined replication/transfer host range of a plasmid. In the present study, the replication/transfer host range of a model plasmid, pSN1216-29, exogenously isolated from cow manure as a newly discovered self-transmissible plasmid, was experimentally determined within microbial communities extracted from soil and cow manure. In silico prediction of evolutionary host range was performed with the pSN1216-29 using its oligonucleotide compositions independently. The results showed that oligonucleotide compositions of the plasmid pSN1216-29 had more similarities to those of hosts (transconjugants genera) than those of non-hosts (other genera). These findings can contribute to the understanding of how plasmids behave in microbial communities, and aid in the designing of appropriate plasmid vectors for different bacteria.
Affiliation(s)
- Maho Tokuda
- Applied Chemistry and Biochemical Engineering Course, Department of Engineering, Graduate School of Integrated Science and Technology, Shizuoka University, Shizuoka, Japan
- Haruo Suzuki
- Institute for Advanced Biosciences, Keio University, Tsuruoka, Japan.,Faculty of Environment and Information Studies, Keio University, Fujisawa, Japan
- Kosuke Yanagiya
- Applied Chemistry and Biochemical Engineering Course, Department of Engineering, Graduate School of Integrated Science and Technology, Shizuoka University, Shizuoka, Japan
- Masahiro Yuki
- Japan Collection of Microorganisms, RIKEN BioResource Research Center, Tsukuba, Japan
- Kengo Inoue
- Faculty of Agriculture, University of Miyazaki, Miyazaki, Japan
- Moriya Ohkuma
- Japan Collection of Microorganisms, RIKEN BioResource Research Center, Tsukuba, Japan
- Kazuhide Kimbara
- Applied Chemistry and Biochemical Engineering Course, Department of Engineering, Graduate School of Integrated Science and Technology, Shizuoka University, Shizuoka, Japan
- Masaki Shintani
- Applied Chemistry and Biochemical Engineering Course, Department of Engineering, Graduate School of Integrated Science and Technology, Shizuoka University, Shizuoka, Japan.,Japan Collection of Microorganisms, RIKEN BioResource Research Center, Tsukuba, Japan.,Research Institute of Green Science and Technology, Shizuoka University, Shizuoka, Japan
|
14
|
Yan F, Fang J, Cao J, Wei Y, Liu R, Wang L, Xie Z. Halomonas piezotolerans sp. nov., a multiple-stress-tolerant bacterium isolated from a deep-sea sediment sample of the New Britain Trench. Int J Syst Evol Microbiol 2020; 70:2560-2568. [PMID: 32129736 DOI: 10.1099/ijsem.0.004069] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 11/18/2022] Open
Abstract
A piezotolerant, H2O2-tolerant, heavy-metal-tolerant, slightly halophilic bacterium (strain NBT06E8T) was isolated from a deep-sea sediment sample collected from the New Britain Trench at a depth of 8900 m. The strain was aerobic, motile, Gram-stain-negative, rod-shaped, oxidase-positive and catalase-positive. Growth of the strain was observed at 4-45 °C (optimum, 30 °C), at pH 5-11 (optimum, pH 8-9) and in 0.5-21 % (w/v) NaCl (optimum, 3-7 %). The optimum pressure for growth was 0.1-30 MPa with tolerance up to 60 MPa. Under optimum growth conditions, the strain could tolerate 15 mM H2O2. Results of 16S rRNA gene sequence analysis showed that strain NBT06E8T is closely related to Halomonas aquamarina DSM 30161T (99.5 %), Halomonas meridiana DSM 5425T (99.43 %) and Halomonas axialensis Althf1T (99.35 %). The digital DNA-DNA hybridization values between strain NBT06E8T and the three related type strains, H. aquamarina, H. meridiana and H. axialensis, were 30.5±2.4 %, 30.7±2.5 % and 31.5±2.5 %, respectively. The average nucleotide identity values between strain NBT06E8T and the three related type strains were 86.26, 86.26 and 83.63 %, respectively. The major fatty acids were summed feature 8 (C18 : 1 ω7c and/or C18 : 1 ω6c) and C16 : 0. The predominant respiratory quinone detected was ubiquinone-9 (Q-9). Based on its phenotypic and phylogenetic characteristics, we conclude that strain NBT06E8T represents a novel species of the genus Halomonas, for which the name Halomonas piezotolerans sp. nov. is proposed (type strain NBT06E8T=MCCC 1K04228T=KCTC 72680T).
Affiliation(s)
- Fangfang Yan
- Shanghai Engineering Research Center of Hadal Science and Technology, College of Marine Sciences, Shanghai Ocean University, Shanghai 201306, PR China
- Jiasong Fang
- Department of Natural Sciences, Hawaii Pacific University, Honolulu, HI 96813, USA.,Laboratory for Marine Biology and Biotechnology, Qingdao National Laboratory for Marine Science and Technology, Qingdao 266237, PR China.,Shanghai Engineering Research Center of Hadal Science and Technology, College of Marine Sciences, Shanghai Ocean University, Shanghai 201306, PR China
- Junwei Cao
- National Engineering Research Center for Oceanic Fisheries, Shanghai Ocean University, Shanghai 201306, PR China.,Shanghai Engineering Research Center of Hadal Science and Technology, College of Marine Sciences, Shanghai Ocean University, Shanghai 201306, PR China
- Yuli Wei
- National Engineering Research Center for Oceanic Fisheries, Shanghai Ocean University, Shanghai 201306, PR China.,Shanghai Engineering Research Center of Hadal Science and Technology, College of Marine Sciences, Shanghai Ocean University, Shanghai 201306, PR China
- Rulong Liu
- National Engineering Research Center for Oceanic Fisheries, Shanghai Ocean University, Shanghai 201306, PR China.,Shanghai Engineering Research Center of Hadal Science and Technology, College of Marine Sciences, Shanghai Ocean University, Shanghai 201306, PR China
- Li Wang
- National Engineering Research Center for Oceanic Fisheries, Shanghai Ocean University, Shanghai 201306, PR China.,Shanghai Engineering Research Center of Hadal Science and Technology, College of Marine Sciences, Shanghai Ocean University, Shanghai 201306, PR China
- Zhe Xie
- Shanghai Engineering Research Center of Hadal Science and Technology, College of Marine Sciences, Shanghai Ocean University, Shanghai 201306, PR China.,Laboratory for Marine Biology and Biotechnology, Qingdao National Laboratory for Marine Science and Technology, Qingdao 266237, PR China
|
15
|
Lewis CJ, Dixit B, Batiuk E, Hall CJ, O'Connor MS, Boominathan A. Codon optimization is an essential parameter for the efficient allotopic expression of mtDNA genes. Redox Biol 2020; 30:101429. [PMID: 31981894 PMCID: PMC6976934 DOI: 10.1016/j.redox.2020.101429] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 09/07/2019] [Revised: 12/29/2019] [Accepted: 01/10/2020] [Indexed: 11/29/2022] Open
Abstract
Mutations in mitochondrial DNA can be inherited or occur de novo, leading to several debilitating myopathies with no curative option and few or no effective treatments. Allotopic expression of recoded mitochondrial genes from the nucleus has potential as a gene therapy strategy for such conditions; however, progress in this field has been hampered by technical challenges. Here we employed codon optimization as a tool to re-engineer the protein-coding genes of the human mitochondrial genome for robust, efficient expression from the nucleus. All 13 codon-optimized constructs exhibited substantially higher protein expression than minimally-recoded genes when expressed transiently, and steady-state mRNA levels for optimized gene constructs were 5-180 fold enriched over recoded versions in stably-selected wildtype cells. Eight of thirteen mitochondria-encoded oxidative phosphorylation (OxPhos) proteins maintained protein expression following stable selection, with mitochondrial localization of expression products. We also assessed the utility of this strategy in rescuing mitochondrial disease cell models and found the rescue capacity of allotopic expression constructs to be gene specific. Allotopic expression of codon-optimized ATP8 in disease models could restore protein levels and respiratory function; however, rescue of the pathogenic phenotype for another gene, ND1, was only partially successful. These results imply that though codon optimization alone is not sufficient for functional allotopic expression of most mitochondrial genes, it is an essential consideration in their design.
Affiliation(s)
- Caitlin J Lewis
- Department of Mitochondrial Research, SENS Research Foundation, Mountain View, CA, 94041, USA
- Bhavna Dixit
- Department of Mitochondrial Research, SENS Research Foundation, Mountain View, CA, 94041, USA
- Elizabeth Batiuk
- Department of Mitochondrial Research, SENS Research Foundation, Mountain View, CA, 94041, USA
- Carter J Hall
- Department of Mitochondrial Research, SENS Research Foundation, Mountain View, CA, 94041, USA
- Matthew S O'Connor
- Department of Mitochondrial Research, SENS Research Foundation, Mountain View, CA, 94041, USA.
- Amutha Boominathan
- Department of Mitochondrial Research, SENS Research Foundation, Mountain View, CA, 94041, USA.
|
16
|
Dong Z, Pu L, Cui H. Mitoepigenetics and Its Emerging Roles in Cancer. Front Cell Dev Biol 2020; 8:4. [PMID: 32039210 PMCID: PMC6989428 DOI: 10.3389/fcell.2020.00004] [Citation(s) in RCA: 34] [Impact Index Per Article: 8.5] [Received: 08/07/2019] [Accepted: 01/08/2020] [Indexed: 12/11/2022] Open
Abstract
In human beings, there is a ∼16,569 bp circular mitochondrial DNA (mtDNA) encoding 22 tRNAs, 12S and 16S rRNAs, 13 polypeptides that constitute the central core of ETC/OxPhos complexes, and some non-coding RNAs. Recently, mtDNA has been shown to have some covalent modifications such as methylation or hydroxymethylation, which play pivotal epigenetic roles in mtDNA replication and transcription. Post-translational modifications of proteins in mitochondrial nucleoids such as mitochondrial transcription factor A (TFAM) also emerge as essential epigenetic modulations in mtDNA replication and transcription. Post-transcriptional modifications of mitochondrial RNAs (mtRNAs) including mt-rRNAs, mt-tRNAs and mt-mRNAs are important epigenetic modulations. Besides, mtDNA or nuclear DNA (nDNA)-derived non-coding RNAs also play important roles in the regulation of translation and function of mitochondrial genes. This evidence introduces a novel concept of mitoepigenetics that refers to the study of modulations in the mitochondria that alter heritable phenotype in the mitochondria themselves without changing the mtDNA sequence. Since mitochondrial dysfunction contributes to carcinogenesis and tumor development, mitoepigenetics is also essential for cancer. Understanding the modes of action of mitoepigenetics in cancers may shed light on the clinical diagnosis and prevention of these diseases. In this review, we summarize the present study about modifications in mtDNA, mtRNA and nucleoids and modulations of mtDNA/nDNA-derived non-coding RNAs that affect mtDNA translation/function, and overview recent studies of mitoepigenetic alterations in cancer.
Affiliation(s)
- Zhen Dong
- State Key Laboratory of Silkworm Genome Biology, Institute of Sericulture and Systems Biology, Southwest University, Chongqing, China.,Cancer Center, Medical Research Institute, Southwest University, Chongqing, China.,Engineering Research Center for Cancer Biomedical and Translational Medicine, Southwest University, Chongqing, China.,Chongqing Engineering and Technology Research Center for Silk Biomaterials and Regenerative Medicine, Southwest University, Chongqing, China
- Longjun Pu
- Umeå Centre for Molecular Medicine, Umeå University, Umeå, Sweden
- Hongjuan Cui
- State Key Laboratory of Silkworm Genome Biology, Institute of Sericulture and Systems Biology, Southwest University, Chongqing, China.,Cancer Center, Medical Research Institute, Southwest University, Chongqing, China.,Engineering Research Center for Cancer Biomedical and Translational Medicine, Southwest University, Chongqing, China.,Chongqing Engineering and Technology Research Center for Silk Biomaterials and Regenerative Medicine, Southwest University, Chongqing, China
|
17
|
Zhou Y, Zhang W, Wu H, Huang K, Jin J. A high-resolution genomic composition-based method with the ability to distinguish similar bacterial organisms. BMC Genomics 2019; 20:754. [PMID: 31638897 PMCID: PMC6805505 DOI: 10.1186/s12864-019-6119-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 06/26/2019] [Accepted: 09/20/2019] [Indexed: 12/03/2022] Open
Abstract
Background Genomic composition has been found to be species specific and is used to differentiate bacterial species. To date, almost no published composition-based approaches are able to distinguish between most closely related organisms, including intra-genus species and intra-species strains. Thus, it is necessary to develop a novel approach to address this problem. Results Here, we initially determine that the “tetranucleotide-derived z-value Pearson correlation coefficient” (TETRA) approach is representative of other published statistical methods. Then, we devise a novel method called “Tetranucleotide-derived Z-value Manhattan Distance” (TZMD) and compare it with the TETRA approach. Our results show that TZMD reflects the maximal genome difference, while TETRA does not in most conditions, demonstrating in theory that TZMD provides improved resolution. Additionally, our analysis of real data shows that TZMD improves species differentiation and clearly differentiates similar organisms, including similar species belonging to the same genospecies, subspecies and intraspecific strains, most of which cannot be distinguished by TETRA. Furthermore, TZMD is able to determine clonal strains with the TZMD = 0 criterion, which intrinsically encompasses identical composition, high average nucleotide identity and high percentage of shared genomes. Conclusions Our extensive assessment demonstrates that TZMD has high resolution. This study is the first to propose a composition-based method for differentiating bacteria at the strain level and to demonstrate that composition is also strain specific. TZMD is a powerful tool and the first easy-to-use approach for differentiating clonal and non-clonal strains. Therefore, as the first composition-based algorithm for strain typing, TZMD will facilitate bacterial studies in the future.
Affiliation(s)
- Yizhuang Zhou
- Laboratory of Hepatobiliary and Pancreatic Surgery, The Affiliated Hospital of Guilin Medical University, Guilin, Guangxi, 541001, People's Republic of China.,Peking-Tsinghua Center for Life Science, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, 100871, People's Republic of China.
- Wenting Zhang
- Laboratory of Hepatobiliary and Pancreatic Surgery, The Affiliated Hospital of Guilin Medical University, Guilin, Guangxi, 541001, People's Republic of China
- Huixian Wu
- China-USA Lipids in Health and Disease Research Center, Guilin Medical University, Guilin, Guangxi, 541001, People's Republic of China.,Guangxi Key Laboratory of Molecular Medicine in Liver Injury and Repair, Guilin Medical University, Guilin, Guangxi, 541001, People's Republic of China
- Kai Huang
- Laboratory of Hepatobiliary and Pancreatic Surgery, The Affiliated Hospital of Guilin Medical University, Guilin, Guangxi, 541001, People's Republic of China.,China-USA Lipids in Health and Disease Research Center, Guilin Medical University, Guilin, Guangxi, 541001, People's Republic of China.,Guangxi Key Laboratory of Molecular Medicine in Liver Injury and Repair, Guilin Medical University, Guilin, Guangxi, 541001, People's Republic of China
- Junfei Jin
- Laboratory of Hepatobiliary and Pancreatic Surgery, The Affiliated Hospital of Guilin Medical University, Guilin, Guangxi, 541001, People's Republic of China.,China-USA Lipids in Health and Disease Research Center, Guilin Medical University, Guilin, Guangxi, 541001, People's Republic of China.,Guangxi Key Laboratory of Molecular Medicine in Liver Injury and Repair, Guilin Medical University, Guilin, Guangxi, 541001, People's Republic of China.
|
18
|
Li W, Freudenberg J, Freudenberg J. Alignment-free approaches for predicting novel Nuclear Mitochondrial Segments (NUMTs) in the human genome. Gene 2019; 691:141-152. [PMID: 30630097 DOI: 10.1016/j.gene.2018.12.040] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 09/05/2018] [Revised: 12/07/2018] [Accepted: 12/14/2018] [Indexed: 10/27/2022]
Abstract
The nuclear human genome harbors sequences of mitochondrial origin, indicating an ancestral transfer of DNA from the mitogenome. Several Nuclear Mitochondrial Segments (NUMTs) have been detected by alignment-based sequence similarity search, as implemented in the Basic Local Alignment Search Tool (BLAST). Identifying NUMTs is important for the comprehensive annotation and understanding of the human genome. Here we explore the possibility of detecting NUMTs in the human genome by alignment-free sequence similarity search, such as k-mers (k-tuples, k-grams, oligos of length k) distributions. We find that when k=6 or larger, the k-mer approach and BLAST search produce almost identical results, e.g., detect the same set of NUMTs longer than 3 kb. However, when k=5 or k=4, certain signals are only detected by the alignment-free approach, and these may indicate yet unrecognized, and potentially more ancestral NUMTs. We introduce a "Manhattan plot" style representation of NUMT predictions across the genome, which are calculated based on the reciprocal of the Jensen-Shannon divergence between the nuclear and mitochondrial k-mer frequencies. The further inspection of the k-mer-based NUMT predictions however shows that most of them contain long-terminal-repeat (LTR) annotations, whereas BLAST-based NUMT predictions do not. Thus, similarity of the mitogenome to LTR sequences is recognized, which we validate by finding the mitochondrial k-mer distribution closer to those for transposable sequences and specifically, close to some types of LTR.
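The Jensen-Shannon divergence between two k-mer frequency distributions, which this abstract uses (via its reciprocal) as a similarity score, can be sketched as below. This is a minimal illustration, not the authors' code: the function names `kmer_freqs` and `js_divergence` are hypothetical, base-2 logarithms are an assumption (giving a value between 0 and 1), and windowing of the nuclear genome is left out.

```python
import math
from collections import Counter


def kmer_freqs(seq: str, k: int) -> dict[str, float]:
    """Relative k-mer frequencies of a sequence (hypothetical helper)."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence is shorter than k")
    return {kmer: n / total for kmer, n in counts.items()}


def js_divergence(p: dict[str, float], q: dict[str, float]) -> float:
    """Jensen-Shannon divergence in bits: 0 for identical, 1 for disjoint."""
    jsd = 0.0
    for kmer in set(p) | set(q):
        pk, qk = p.get(kmer, 0.0), q.get(kmer, 0.0)
        mid = 0.5 * (pk + qk)  # mixture distribution at this k-mer
        if pk > 0:
            jsd += 0.5 * pk * math.log2(pk / mid)
        if qk > 0:
            jsd += 0.5 * qk * math.log2(qk / mid)
    return jsd


if __name__ == "__main__":
    window = kmer_freqs("ACGTTGCAACGTAGCTAGCTA", 4)
    mito = kmer_freqs("ACGTTGCATCGTAGCAAGCTA", 4)
    d = js_divergence(window, mito)
    # A reciprocal score would need a guard against d == 0.
    print(d)
```

A small divergence means the window's k-mer composition resembles the reference, so a reciprocal-based score rises sharply for the most similar windows.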
Affiliation(s)
- Wentian Li
- The Robert S. Boas Center for Genomics and Human Genetics, The Feinstein Institute for Medical Research, Northwell Health, Manhasset, NY, USA.
- Jerome Freudenberg
- The Robert S. Boas Center for Genomics and Human Genetics, The Feinstein Institute for Medical Research, Northwell Health, Manhasset, NY, USA
- Jan Freudenberg
- Regeneron Genetics Center, Regeneron Pharmaceuticals, Inc., Tarrytown, NY, USA
|
19
|
Yano H, Shintani M, Tomita M, Suzuki H, Oshima T. Reconsidering plasmid maintenance factors for computational plasmid design. Comput Struct Biotechnol J 2018; 17:70-81. [PMID: 30619542 PMCID: PMC6312765 DOI: 10.1016/j.csbj.2018.12.001] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 09/23/2018] [Revised: 12/08/2018] [Accepted: 12/09/2018] [Indexed: 12/18/2022] Open
Abstract
Plasmids are genetic parasites of microorganisms. The genomes of naturally occurring plasmids are expected to be polished via natural selection to achieve long-term persistence in the microbial cell population. However, plasmid genomes are extremely diverse, and the rules governing plasmid genomes are not fully understood. Therefore, computationally designing plasmid genomes optimized for model and nonmodel organisms remains challenging. Here, we summarize current knowledge of the plasmid genome organization and the factors that can affect plasmid persistence, with the aim of constructing synthetic plasmids for use in gram-negative bacteria. Then, we introduce publicly available resources, plasmid data, and bioinformatics tools that are useful for computational plasmid design.
Affiliation(s)
- Hirokazu Yano
- Graduate School of Life Sciences, Tohoku University, 2-1-1, Katahira, Aoba-ku, Sendai 980-8577, Japan
- Masaki Shintani
- Department of Engineering, Graduate School of Integrated Science and Technology, Shizuoka University, 3-5-1, Hamamatsu 432-8561, Japan
- Department of Bioscience, Graduate School of Science and Technology, Shizuoka University, 3-5-1, Hamamatsu 432-8561, Japan
- Masaru Tomita
- Institute for Advanced Biosciences, Keio University, 14-1, Baba-cho, Tsuruoka, Yamagata 997-0035, Japan
- Faculty of Environment and Information Studies, Keio University, 5322, Endo, Fujisawa, Kanagawa 252-0882, Japan
- Haruo Suzuki
- Institute for Advanced Biosciences, Keio University, 14-1, Baba-cho, Tsuruoka, Yamagata 997-0035, Japan
- Faculty of Environment and Information Studies, Keio University, 5322, Endo, Fujisawa, Kanagawa 252-0882, Japan
- Taku Oshima
- Department of Biotechnology, Toyama Prefectural University, 5180, Kurokawa, Imizu, Toyama 939-0398, Japan
|
20
|
Serrano-Solís V, Toscano Soares PE, de Farías ST. Genomic Signatures Among Acanthamoeba polyphaga Entoorganisms Unveil Evidence of Coevolution. J Mol Evol 2018; 87:7-15. [PMID: 30456441 DOI: 10.1007/s00239-018-9877-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 08/06/2018] [Accepted: 11/09/2018] [Indexed: 11/30/2022]
Abstract
The definition of a genomic signature (GS) is "the total net response to selective pressure". Recent isolation and sequencing of naturally occurring organisms, hereby named entoorganisms, within Acanthamoeba polyphaga, raised the hypothesis of a common genomic signature despite their diverse and unrelated evolutionary origin. Widely accepted and implemented tests for GS detection are oligonucleotide relative frequencies (OnRF) and relative codon usage (RCU) surveys. A common pattern and strong correlations were unveiled from OnRFs among A. polyphaga's Mimivirus and virophage Sputnik. RCU showed a common A-T bias at third codon position. We expanded tests to the amoebal mitochondrial genome and amoeba-resistant bacteria, achieving strikingly coherent results to the aforementioned viral analyses. The GSs in these entoorganisms of diverse evolutionary origin are coevolutionarily conserved within an intracellular environment that provides sanctuary for species of ecological and biomedical relevance.
Affiliation(s)
- Víctor Serrano-Solís
- Laboratório de Genética Evolutiva Paulo Leminsk, Departamento de Biologia Molecular, Centro de Ciencias Exatas e da Natureza, Universidade Federal da Paraíba, João Pessoa, Brazil.
- Paulo Eduardo Toscano Soares
- Laboratório de Genética Evolutiva Paulo Leminsk, Departamento de Biologia Molecular, Centro de Ciencias Exatas e da Natureza, Universidade Federal da Paraíba, João Pessoa, Brazil
- Sávio T de Farías
- Laboratório de Genética Evolutiva Paulo Leminsk, Departamento de Biologia Molecular, Centro de Ciencias Exatas e da Natureza, Universidade Federal da Paraíba, João Pessoa, Brazil
|
21
|
An open-source k-mer based machine learning tool for fast and accurate subtyping of HIV-1 genomes. PLoS One 2018; 13:e0206409. [PMID: 30427878 PMCID: PMC6235296 DOI: 10.1371/journal.pone.0206409] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Received: 07/24/2018] [Accepted: 10/14/2018] [Indexed: 01/11/2023] Open
Abstract
For many disease-causing virus species, global diversity is clustered into a taxonomy of subtypes with clinical significance. In particular, the classification of infections among the subtypes of human immunodeficiency virus type 1 (HIV-1) is a routine component of clinical management, and there are now many classification algorithms available for this purpose. Although several of these algorithms are similar in accuracy and speed, the majority are proprietary and require laboratories to transmit HIV-1 sequence data over the network to remote servers. This potentially exposes sensitive patient data to unauthorized access, and makes it impossible to determine how classifications are made and to maintain the data provenance of clinical bioinformatic workflows. We propose an open-source supervised and alignment-free subtyping method (Kameris) that operates on k-mer frequencies in HIV-1 sequences. We performed a detailed study of the accuracy and performance of subtype classification in comparison to four state-of-the-art programs. Based on our testing data set of manually curated real-world HIV-1 sequences (n = 2,784), Kameris obtained an overall accuracy of 97%, which matches or exceeds all other tested software, with a processing rate of over 1,500 sequences per second. Furthermore, our fully standalone general-purpose software provides key advantages in terms of data security and privacy, transparency and reproducibility. Finally, we show that our method is readily adaptable to subtype classification of other viruses including dengue, influenza A, and hepatitis B and C virus.
|
22
|
Llarena A, Ribeiro‐Gonçalves BF, Nuno Silva D, Halkilahti J, Machado MP, Da Silva MS, Jaakkonen A, Isidro J, Hämäläinen C, Joenperä J, Borges V, Viera L, Gomes JP, Correia C, Lunden J, Laukkanen‐Ninios R, Fredriksson‐Ahomaa M, Bikandi J, Millan RS, Martinez‐Ballesteros I, Laorden L, Mäesaar M, Grantina‐Ievina L, Hilbert F, Garaizar J, Oleastro M, Nevas M, Salmenlinna S, Hakkinen M, Carriço JA, Rossi M. INNUENDO: A cross‐sectoral platform for the integration of genomics in the surveillance of food‐borne pathogens. EFSA Supporting Publications 2018. [DOI: 10.2903/sp.efsa.2018.en-1498] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Indexed: 01/09/2023]
|
23
|
Bordignon V, Cavallo I, D'Agosto G, Trento E, Pontone M, Abril E, Di Domenico EG, Ensoli F. Nucleic Acid Sensing Perturbation: How Aberrant Recognition of Self-Nucleic Acids May Contribute to Autoimmune and Autoinflammatory Diseases. International Review of Cell and Molecular Biology 2018; 344:117-137. [PMID: 30798986 DOI: 10.1016/bs.ircmb.2018.09.001] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 01/26/2023]
Abstract
Bacteria and mammalian cells have developed sophisticated sensing mechanisms to detect and eliminate foreign genetic material or to restrict its expression and replication. Progress has been made in the understanding of these mechanisms, which keep foreign or unwanted nucleic acids in check. The complex of mechanisms involved in RNA and DNA sensing is part of a system which is now appreciated as "immune sensing of nucleic acids" or, better, "nucleic acid immunity." Nucleic acids, which are critical components for inheriting genetic information in all species, including pathogens, are key structures recognized by the innate immune system. However, while nucleic acid recognition is required for host defense against pathogens, there is a potential risk of self-nucleic acid recognition. In fact, besides its essential contribution to antiviral or antimicrobial defense and restriction of endogenous retroelements, deregulation of nucleic acid immunity can also lead to human diseases due to erroneous detection of and response to self-nucleic acids, causing sterile inflammation and autoimmunity. In this review we will discuss the roles of nucleic acid receptors in guarding against pathogen invasion, how the microbial environment could interfere with or influence immune sensing in discriminating between self and non-self, and how this may contribute to autoimmune or inflammatory diseases.
Affiliation(s)
- Valentina Bordignon
- Clinical Pathology and Microbiology, San Gallicano Dermatological Institute IRCCS, Rome, Italy
- Ilaria Cavallo
- Clinical Pathology and Microbiology, San Gallicano Dermatological Institute IRCCS, Rome, Italy
- Giovanna D'Agosto
- Clinical Pathology and Microbiology, San Gallicano Dermatological Institute IRCCS, Rome, Italy
- Elisabetta Trento
- Clinical Pathology and Microbiology, San Gallicano Dermatological Institute IRCCS, Rome, Italy
- Martina Pontone
- Clinical Pathology and Microbiology, San Gallicano Dermatological Institute IRCCS, Rome, Italy
- Elva Abril
- Clinical Pathology and Microbiology, San Gallicano Dermatological Institute IRCCS, Rome, Italy
- Enea Gino Di Domenico
- Clinical Pathology and Microbiology, San Gallicano Dermatological Institute IRCCS, Rome, Italy
- Fabrizio Ensoli
- Clinical Pathology and Microbiology, San Gallicano Dermatological Institute IRCCS, Rome, Italy
|
24
|
Deng Y, Hsiang T, Li S, Lin L, Wang Q, Chen Q, Xie B, Ming R. Comparison of the Mitochondrial Genome Sequences of Six Annulohypoxylon stygium Isolates Suggests Short Fragment Insertions as a Potential Factor Leading to Larger Genomic Size. Front Microbiol 2018; 9:2079. [PMID: 30250455 PMCID: PMC6140425 DOI: 10.3389/fmicb.2018.02079] [Citation(s) in RCA: 49] [Impact Index Per Article: 8.2] [Received: 05/18/2018] [Accepted: 08/14/2018] [Indexed: 12/22/2022] Open
Abstract
Mitochondrial DNA (mtDNA) is a core non-nuclear genetic material found in all eukaryotic organisms, the size of which varies extensively in the eumycota, even within species. In this study, mitochondrial genomes of six isolates of Annulohypoxylon stygium (Lév.) were assembled from raw reads from PacBio and Illumina sequencing. The diversity of genomic structures, conserved genes, intergenic regions and introns were analyzed and compared. Genome sizes ranged from 132 to 147 kb and contained the same sets of conserved protein-coding, tRNA and rRNA genes and shared the same gene arrangements and orientation. In addition, most intergenic regions were homogeneous and had similar sizes except for the region between cytochrome b (cob) and cytochrome c oxidase I (cox1) genes which ranged from 2,998 to 8,039 bp among the six isolates. Sixty-five intron insertion sites and 99 different introns were detected in these genomes. Each genome contained 45 or more introns, which varied in distribution and content. Introns from homologous insertion sites also showed high diversity in size, type and content. Comparison of introns at the same loci showed some complex introns, such as twintrons and ORF-less introns. There were 44 short fragment insertions detected within introns, intergenic regions, or as introns, some of them located at conserved domain regions of homing endonuclease genes. Insertions of short fragments such as small inverted repeats might affect or hinder the movement of introns, and these allowed for intron accumulation in the mitochondrial genomes analyzed, and enlarged their size. This study showed that the evolution of fungal mitochondrial introns is complex, and the results suggest short fragment insertions as a potential factor leading to larger mitochondrial genomes in A. stygium.
Affiliation(s)
- Youjin Deng
- Center for Genomics and Biotechnology, Haixia Institute of Science and Technology, College of Life Sciences, Fujian Agriculture and Forestry University, Fuzhou, China; Department of Plant Biology, University of Illinois at Urbana-Champaign, Urbana, IL, United States
- Tom Hsiang
- Environmental Sciences, University of Guelph, Guelph, ON, Canada
- Shuxian Li
- USDA-Agricultural Research Service, Crop Genetics Research Unit, Stoneville, MS, United States
- Longji Lin
- Center for Genomics and Biotechnology, Haixia Institute of Science and Technology, College of Life Sciences, Fujian Agriculture and Forestry University, Fuzhou, China
- Qingfu Wang
- Center for Genomics and Biotechnology, Haixia Institute of Science and Technology, College of Life Sciences, Fujian Agriculture and Forestry University, Fuzhou, China
- Qinghe Chen
- Center for Genomics and Biotechnology, Haixia Institute of Science and Technology, College of Life Sciences, Fujian Agriculture and Forestry University, Fuzhou, China; Department of Plant Biology, University of Illinois at Urbana-Champaign, Urbana, IL, United States
- Baogui Xie
- Center for Genomics and Biotechnology, Haixia Institute of Science and Technology, College of Life Sciences, Fujian Agriculture and Forestry University, Fuzhou, China
- Ray Ming
- Center for Genomics and Biotechnology, Haixia Institute of Science and Technology, College of Life Sciences, Fujian Agriculture and Forestry University, Fuzhou, China; Department of Plant Biology, University of Illinois at Urbana-Champaign, Urbana, IL, United States
|
25
|
Stauft CB, Shen SH, Song Y, Gorbatsevych O, Asare E, Futcher B, Mueller S, Payne A, Brecher M, Kramer L, Wimmer E. Extensive recoding of dengue virus type 2 specifically reduces replication in primate cells without gain-of-function in Aedes aegypti mosquitoes. PLoS One 2018; 13:e0198303. [PMID: 30192757 PMCID: PMC6128446 DOI: 10.1371/journal.pone.0198303] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 05/14/2018] [Accepted: 07/10/2018] [Indexed: 12/13/2022] Open
Abstract
Dengue virus (DENV), an arthropod-borne ("arbovirus") virus, causes a range of human maladies ranging from self-limiting dengue fever to the life-threatening dengue shock syndrome and proliferates well in two different taxa of the Animal Kingdom, mosquitoes and primates. Mosquitoes and primates show taxonomic group-specific intolerance to certain codon pairs when expressing their genes by translation. This is called "codon pair bias". By necessity, dengue viruses evolved to delicately balance this fundamental difference in their open reading frames (ORFs). We have undone the evolutionarily conserved genomic balance in the DENV2 ORF sequence and specifically shifted the encoding preference away from primates. However, this recoding of DENV2 raised concerns of 'gain-of-function,' namely whether recoding could inadvertently increase fitness for replication in the arthropod vector. Using mosquito cell lines and two strains of Aedes aegypti we did not observe any increase in fitness in DENV2 variants codon pair deoptimized for humans. This ability to disrupt and control DENV2's host preference has great promise towards developing the next generation of synthetic vaccines not only for DENV but for other emerging arboviral pathogens such as chikungunya virus and Zika virus.
Affiliation(s)
- Charles B. Stauft
- Stony Brook University, Department of Molecular Genetics and Microbiology, Stony Brook University School of Medicine, Stony Brook, New York, United States of America
- Codagenix, Incorporated, Farmingdale, New York, United States of America
- Sam H. Shen
- Stony Brook University, Department of Molecular Genetics and Microbiology, Stony Brook University School of Medicine, Stony Brook, New York, United States of America
- Yutong Song
- Stony Brook University, Department of Molecular Genetics and Microbiology, Stony Brook University School of Medicine, Stony Brook, New York, United States of America
- Oleksandr Gorbatsevych
- Stony Brook University, Department of Molecular Genetics and Microbiology, Stony Brook University School of Medicine, Stony Brook, New York, United States of America
- Emmanuel Asare
- Stony Brook University, Department of Molecular Genetics and Microbiology, Stony Brook University School of Medicine, Stony Brook, New York, United States of America
- Bruce Futcher
- Stony Brook University, Department of Molecular Genetics and Microbiology, Stony Brook University School of Medicine, Stony Brook, New York, United States of America
- Steffen Mueller
- Stony Brook University, Department of Molecular Genetics and Microbiology, Stony Brook University School of Medicine, Stony Brook, New York, United States of America
- Codagenix, Incorporated, Farmingdale, New York, United States of America
- Anne Payne
- Wadsworth Center, New York State Department of Health, Slingerlands, New York, United States of America
- Matthew Brecher
- Wadsworth Center, New York State Department of Health, Slingerlands, New York, United States of America
- Laura Kramer
- Wadsworth Center, New York State Department of Health, Slingerlands, New York, United States of America
- School of Public Health, State University of New York at Albany, Rensselaer, New York, United States of America
- Eckard Wimmer
- Stony Brook University, Department of Molecular Genetics and Microbiology, Stony Brook University School of Medicine, Stony Brook, New York, United States of America
- Codagenix, Incorporated, Farmingdale, New York, United States of America
|
26
|
Franzo G, Segales J, Tucciarone CM, Cecchinato M, Drigo M. The analysis of genome composition and codon bias reveals distinctive patterns between avian and mammalian circoviruses which suggest a potential recombinant origin for Porcine circovirus 3. PLoS One 2018; 13:e0199950. [PMID: 29958294 PMCID: PMC6025852 DOI: 10.1371/journal.pone.0199950] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 04/27/2018] [Accepted: 06/15/2018] [Indexed: 01/30/2023] Open
Abstract
Members of the genus Circovirus are host-specific viruses, which are totally dependent on cell machinery for their replication. Consequently, some mimicry of host genome features is expected, to maximize exploitation of the cellular replicative system and minimize recognition by the innate immune system. In the present study, the analysis of several genome composition and codon bias parameters of circoviruses infecting avian and mammalian species demonstrated the presence of quite distinctive patterns between the two groups. Remarkably, a higher deviation from the expected values based only on mutational patterns was observed for mammalian circoviruses both at dinucleotide and codon levels. Accordingly, a stronger selective pressure was estimated to shape the genome of mammalian circoviruses, particularly in the Cap encoding gene, compared to avian circoviruses. These differences could be attributed to different physiological and immunological features of the two host classes and suggest a trade-off between a tendency to optimize the capsid protein translation while minimizing the recognition of the genome and the transcript molecules. Interestingly, the recently identified Porcine circovirus 3 (PCV-3) had an intermediate pattern in terms of genome composition and codon bias. In particular, its Rep gene appeared closely related to other mammalian circoviruses (especially bat circoviruses) while the Cap gene more closely resembled avian circoviruses. This evidence, coupled with the strong selective forces apparently shaping the PCV-3 Cap gene composition, suggests a potential recombinant origin, followed or preceded by a host jump, for this virus.
Affiliation(s)
- Giovanni Franzo
- Department of Animal Medicine, Production and Health (MAPS), University of Padua, Legnaro, Padua, Italy
- Joaquim Segales
- Departament de Sanitat i Anatomia Animals, Universitat Autònoma de Barcelona, Bellaterra, Barcelona, Spain
- UAB, Centre de Recerca en Sanitat Animal (CReSA, IRTA-UAB), Campus de la Universitat Autònoma de Barcelona, Bellaterra, Barcelona, Spain
- Claudia Maria Tucciarone
- Department of Animal Medicine, Production and Health (MAPS), University of Padua, Legnaro, Padua, Italy
- Mattia Cecchinato
- Department of Animal Medicine, Production and Health (MAPS), University of Padua, Legnaro, Padua, Italy
- Michele Drigo
- Department of Animal Medicine, Production and Health (MAPS), University of Padua, Legnaro, Padua, Italy
|
27
|
Almpanis A, Swain M, Gatherer D, McEwan N. Correlation between bacterial G+C content, genome size and the G+C content of associated plasmids and bacteriophages. Microb Genom 2018; 4:e000168. [PMID: 29633935 PMCID: PMC5989581 DOI: 10.1099/mgen.0.000168] [Citation(s) in RCA: 63] [Impact Index Per Article: 10.5] [Received: 11/27/2017] [Accepted: 03/06/2018] [Indexed: 02/06/2023] Open
Abstract
Based on complete bacterial genome sequence data, we demonstrate a correlation between bacterial chromosome length and the G+C content of the genome, with longer genomes having higher G+C contents. The correlation value decreases at shorter genome sizes, where there is a wider spread of G+C values. However, although significant (P<0.001), the correlation value (Pearson R=0.58) suggests that other factors also have a significant influence. A similar pattern was seen for plasmids; longer plasmids had higher G+C values, although the large number of shorter plasmids had a wide spread of G+C values. There was also a significant (P<0.0001) correlation between the G+C content of plasmids and the G+C content of their bacterial host. Conversely, the G+C content of bacteriophages tended to reduce with larger genome sizes, and although there was a correlation between host genome G+C content and that of the bacteriophage, it was not as strong as that seen between plasmids and their hosts.
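The two quantities this abstract relies on, G+C content and a Pearson correlation coefficient, can be sketched as follows. This is an illustrative example only: the helper names `gc_content` and `pearson_r` are hypothetical, and no data from the cited study are reproduced here.

```python
import math


def gc_content(seq):
    """Fraction of G and C among unambiguous (A, C, G, T) bases."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    return (seq.count("G") + seq.count("C")) / acgt if acgt else 0.0


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Toy sequences, not real genomes: pair each length with its G+C content.
toy = ["ATATAT", "ATGCATGC", "GCGCGCATGC"]
lengths = [len(s) for s in toy]
gc_values = [gc_content(s) for s in toy]
print(pearson_r(lengths, gc_values))
```

In practice the inputs would be chromosome lengths and G+C fractions from complete genome records; a significance test (the P values quoted in the abstract) would need an additional statistical routine.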
Affiliation(s)
- Apostolos Almpanis
- Aberystwyth University, Aberystwyth, UK
- Newcastle University, Newcastle-upon-Tyne, UK
- Neil McEwan
- Aberystwyth University, Aberystwyth, UK
- School of Pharmacy and Life Sciences, Robert Gordon University, Aberdeen, UK
|
28
|
Nagalakshmi B, Sagarkar S, Sakharkar AJ. Epigenetic Mechanisms of Traumatic Brain Injuries. Progress in Molecular Biology and Translational Science 2018; 157:263-298. [DOI: 10.1016/bs.pmbts.2017.12.013] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Indexed: 12/14/2022]
|
29
|
Gatherer D. Genome Signatures, Self-Organizing Maps and Higher Order Phylogenies: A Parametric Analysis. Evol Bioinform Online 2017. [DOI: 10.1177/117693430700300001] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 11/15/2022] Open
Abstract
Genome signatures are data vectors derived from the compositional statistics of DNA. The self-organizing map (SOM) is a neural network method for the conceptualisation of relationships within complex data, such as genome signatures. The various parameters of the SOM training phase are investigated for their effect on the accuracy of the resulting output map. It is concluded that larger SOMs, as well as taking longer to train, are less sensitive in phylogenetic classification of unknown DNA sequences. However, where a classification can be made, a larger SOM is more accurate. Increasing the number of iterations in the training phase of the SOM only slightly increases accuracy, without improving sensitivity. The optimal length of the DNA sequence k-mer from which the genome signature should be derived is 4 or 5, but shorter values are almost as effective. In general, these results indicate that small, rapidly trained SOMs are generally as good as larger, longer trained ones for the analysis of genome signatures. These results may also be more generally applicable to the use of SOMs for other complex data sets, such as microarray data.
Affiliation(s)
- Derek Gatherer
- MRC Virology Unit, Institute of Virology, Church Street, Glasgow G11 5JR, UK
|
30
|
Wang G, Hou Y, Zhang X, Zhang J, Li J, Chen Z. Strong population genetic structure of an invasive species, Rhynchophorus ferrugineus (Olivier), in southern China. Ecol Evol 2017; 7:10770-10781. [PMID: 29299256 PMCID: PMC5743574 DOI: 10.1002/ece3.3599] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 02/27/2017] [Revised: 08/31/2017] [Accepted: 10/11/2017] [Indexed: 11/22/2022] Open
Abstract
The red palm weevil (RPW), Rhynchophorus ferrugineus (Olivier), was initially reported in China in the 1990s and is now considered one of the most successful invasive pests of palm plants in the country. A total of 14 microsatellite loci and one mitochondrial cytochrome oxidase subunit Ι (cox I) gene fragment were used to investigate the genetic characteristics and structure of R. ferrugineus in southern China. High levels of genetic differentiation among populations and significant correlations between genetic and geographical distances indicated an important role of geographical distance in the distribution of the RPW in southern China. High gene flow between Fujian and Taiwan province populations illustrated the increased effects of frequent anthropogenic activities on gene flow between them. Genetic similarity (i.e., haplotype similarity) indicated that RPW individuals from Taiwan and Fujian invaded from a different source than those from Hainan. To some extent, the genetic structure of the RPW in southern China correlated well with the geographic origins of this pest. We propose that geographical distance, anthropogenic activities, and the biological attributes of this pest are responsible for the distribution pattern of the RPW in southern China. The phylogenetic analysis suggests that the most likely native sources of the RPW in southern China are India, the Philippines, and Vietnam.
Affiliation(s)
- Guihua Wang
- State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, Fujian Agriculture and Forestry University, Fuzhou, China; Fujian Province Key Laboratory of Insect Ecology, College of Plant Protection, Fujian Agriculture and Forestry University, Fuzhou, China
- Youming Hou
- State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, Fujian Agriculture and Forestry University, Fuzhou, China; Fujian Province Key Laboratory of Insect Ecology, College of Plant Protection, Fujian Agriculture and Forestry University, Fuzhou, China
- Xiang Zhang
- State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, Fujian Agriculture and Forestry University, Fuzhou, China; Fujian Province Key Laboratory of Insect Ecology, College of Plant Protection, Fujian Agriculture and Forestry University, Fuzhou, China
- Jie Zhang
- State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, Fujian Agriculture and Forestry University, Fuzhou, China; Fujian Province Key Laboratory of Insect Ecology, College of Plant Protection, Fujian Agriculture and Forestry University, Fuzhou, China
- Jinlei Li
- State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, Fujian Agriculture and Forestry University, Fuzhou, China; Fujian Province Key Laboratory of Insect Ecology, College of Plant Protection, Fujian Agriculture and Forestry University, Fuzhou, China
- Zhiming Chen
- Fuzhou Entry-Exit Inspection & Quarantine Bureau of P.R.C., Fuzhou, China
|
31
|
McCarthy CGP, Fitzpatrick DA. Multiple Approaches to Phylogenomic Reconstruction of the Fungal Kingdom. Advances in Genetics 2017; 100:211-266. [PMID: 29153401 DOI: 10.1016/bs.adgen.2017.09.006] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Indexed: 12/02/2022]
Abstract
Fungi are possibly the most diverse eukaryotic kingdom, with over a million member species and an evolutionary history dating back a billion years. Fungi have been at the forefront of eukaryotic genomics, and owing to initiatives like the 1000 Fungal Genomes Project the amount of fungal genomic data has increased considerably over the last 5 years, enabling large-scale comparative genomics of species across the kingdom. In this chapter, we first review fungal evolution and the history of fungal genomics. We then review in detail seven phylogenomic methods and reconstruct the phylogeny of 84 fungal species from 8 phyla using each method. Six methods have seen extensive use in previous fungal studies, while a Bayesian supertree method is novel to fungal phylogenomics. We find that both established and novel phylogenomic methods can accurately reconstruct the fungal kingdom. Finally, we discuss the accuracy and suitability of each phylogenomic method utilized.
|
32
|
Vabret N, Bhardwaj N, Greenbaum BD. Sequence-Specific Sensing of Nucleic Acids. Trends Immunol 2016; 38:53-65. [PMID: 27856145 DOI: 10.1016/j.it.2016.10.006] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.6] [Received: 09/12/2016] [Revised: 10/14/2016] [Accepted: 10/14/2016] [Indexed: 12/25/2022]
Abstract
Innate immune cells are endowed with many nucleic acid receptors, but the role of sequence in the detection of foreign organisms remains unclear. Can sequence patterns influence recognition? In addition, how can we infer those patterns from sequence data? Here, we detail recent computational and experimental evidence associated with sequence-specific sensing. We review the mechanisms underlying the detection and discrimination of foreign sequences from self. We also describe quantitative approaches used to infer the stimulatory capacity of a given pathogen nucleic acid species, and the influence of sequence-specific sensing on host-pathogen coevolution, including endogenous sequences of foreign origin. Finally, we speculate how further studies of sequence-specific sensing will be useful to improve vaccine design, gene therapy and cancer treatment.
Affiliation(s)
- Nicolas Vabret
- Tisch Cancer Institute, Departments of Medicine, Hematology, and Medical Oncology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Departments of Oncological Sciences and Pathology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Nina Bhardwaj
- Tisch Cancer Institute, Departments of Medicine, Hematology, and Medical Oncology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Benjamin D Greenbaum
- Tisch Cancer Institute, Departments of Medicine, Hematology, and Medical Oncology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Departments of Oncological Sciences and Pathology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
|
33
|
Karamichalis R, Kari L, Konstantinidis S, Kopecki S, Solis-Reyes S. Additive methods for genomic signatures. BMC Bioinformatics 2016; 17:313. [PMID: 27549194 PMCID: PMC4994249 DOI: 10.1186/s12859-016-1157-8] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 05/13/2016] [Accepted: 07/19/2016] [Indexed: 01/09/2023] Open
Abstract
Background: Studies exploring the potential of Chaos Game Representations (CGR) of genomic sequences to act as “genomic signatures” (to be species- and genome-specific) showed that CGR patterns of nuclear and organellar DNA sequences of the same organism can be very different. While the hypothesis that CGRs of mitochondrial DNA sequences can act as genomic signatures was validated for a snapshot of all sequenced mitochondrial genomes available in the NCBI GenBank sequence database, to our knowledge no such extensive analysis of CGRs of nuclear DNA sequences exists to date.
Results: We analyzed an extensive dataset, totalling 1.45 gigabase pairs, of nuclear/nucleoid genomic sequences (nDNA) from 42 different organisms, spanning all major kingdoms of life. Our computational experiments indicate that CGR signatures of nDNA of two different origins cannot always be differentiated, especially if they originate from closely-related species such as H. sapiens and P. troglodytes or E. coli and E. fergusonii. To address this issue, we propose the general concept of additive DNA signature of a set (collection) of DNA sequences. One particular instance, the composite DNA signature, combines information from nDNA fragments and organellar (mitochondrial, chloroplast, or plasmid) genomes. We demonstrate that, in this dataset, composite DNA signatures originating from two different organisms can be differentiated in all cases, including those where the use of CGR signatures of nDNA failed or was inconclusive. Another instance, the assembled DNA signature, combines information from many short DNA subfragments (e.g., 100 basepairs) of a given DNA fragment, to produce its signature. We show that an assembled DNA signature has the same distinguishing power as a conventionally computed CGR signature, while using shorter contiguous sequences and potentially less sequence information.
Conclusions: Our results suggest that, while CGR signatures of nDNA cannot always play the role of genomic signatures, composite and assembled DNA signatures (separately or in combination) could potentially be used instead. Such additive signatures could be used, e.g., with raw unassembled next-generation sequencing (NGS) read data, when high-quality sequencing data is not available, or to complement information obtained by other methods of species identification or classification.
Electronic supplementary material: The online version of this article (doi:10.1186/s12859-016-1157-8) contains supplementary material, which is available to authorized users.
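For readers unfamiliar with the Chaos Game Representation named in this abstract, the basic construction can be sketched as below. This is a generic illustration and not the authors' pipeline: the corner assignment (A, C, G, T at the corners of the unit square), the starting point at the centre, and the decision to skip ambiguous bases are assumptions of the example.

```python
# Assumed corner layout on the unit square; other layouts are in use.
CORNERS = {
    "A": (0.0, 0.0),
    "C": (0.0, 1.0),
    "G": (1.0, 1.0),
    "T": (1.0, 0.0),
}


def cgr_points(seq):
    """Return the CGR trajectory of a DNA sequence as (x, y) tuples.

    Each base moves the current point halfway towards that base's corner.
    """
    x, y = 0.5, 0.5  # start at the centre of the square
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # assumption: ambiguous symbols such as N are skipped
        corner_x, corner_y = CORNERS[base]
        x, y = (x + corner_x) / 2, (y + corner_y) / 2
        points.append((x, y))
    return points


print(cgr_points("AG"))
```

Binning these points on a 2^k by 2^k grid gives the k-mer counts of the sequence, which is one way a CGR image can serve as a numeric signature.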
Affiliation(s)
- Rallis Karamichalis
- Department of Computer Science, University of Western Ontario, London ON, N6A 5B7, Canada
- Lila Kari
- School of Computing Science, University of Waterloo, Waterloo, ON, N2L 3G1, Canada; Department of Computer Science, University of Western Ontario, London ON, N6A 5B7, Canada
- Stavros Konstantinidis
- Department of Mathematics and Computing Science, Saint Mary's University, Halifax NS, Canada
- Steffen Kopecki
- Department of Computer Science, University of Western Ontario, London ON, N6A 5B7, Canada; Department of Mathematics and Computing Science, Saint Mary's University, Halifax NS, Canada
- Stephen Solis-Reyes
- Department of Computer Science, University of Western Ontario, London ON, N6A 5B7, Canada
|
34
|
Bonnici V, Manca V. Informational laws of genome structures. Sci Rep 2016; 6:28840. [PMID: 27354155 PMCID: PMC4937431 DOI: 10.1038/srep28840] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 02/01/2016] [Accepted: 06/09/2016] [Indexed: 01/06/2023] Open
Abstract
In recent years, the analysis of genomes by means of strings of length k occurring in the genomes, called k-mers, has provided important insights into the basic mechanisms and design principles of genome structures. In the present study, we focus on the proper choice of the value of k for applying information theoretic concepts that express intrinsic aspects of genomes. The value k = lg2(n), where n is the genome length, is determined to be the best choice in the definition of some genomic informational indexes that are studied and computed for seventy genomes. These indexes, which are based on information entropies and on suitable comparisons with random genomes, suggest five informational laws, which all of the considered genomes obey. Moreover, an informational genome complexity measure is proposed, which is a generalized logistic map that balances entropic and anti-entropic components of genomes and is related to their evolutionary dynamics. Finally, applications to computational synthetic biology are briefly outlined.
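The choice of k described in this abstract, and an empirical k-mer entropy, can be illustrated with a short sketch. Two assumptions are made here and are not taken from the abstract: lg2(n) is read as the base-2 logarithm of the genome length, and the result is rounded to the nearest integer. The function names are hypothetical, and the informational indexes of the cited study are not reproduced.

```python
import math
from collections import Counter


def suggested_k(n):
    """k = log2(n), rounded to an integer (rounding is an assumption)."""
    return max(1, round(math.log2(n)))


def kmer_entropy(seq, k):
    """Shannon entropy, in bits, of the empirical k-mer distribution."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return -sum(
        (c / total) * math.log2(c / total) for c in counts.values()
    )


genome_length = 4_600_000  # illustrative length, roughly bacterial scale
print(suggested_k(genome_length))
print(kmer_entropy("ACGTACGTAC", 2))
```

Comparing such an entropy with the value obtained for a random sequence of the same length is the kind of comparison the abstract alludes to.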
Affiliation(s)
- Vincenzo Bonnici
- University of Verona, Department of Computer Science, University of Verona, Verona 37134, Italy; Center for BioMedical Computing, University of Verona, Verona 37134, Italy
- Vincenzo Manca
- University of Verona, Department of Computer Science, University of Verona, Verona 37134, Italy; Center for BioMedical Computing, University of Verona, Verona 37134, Italy
|
35
|
Apostolou-Karampelis K, Nikolaou C, Almirantis Y. A novel skew analysis reveals substitution asymmetries linked to genetic code GC-biases and PolIII α-subunit isoforms. DNA Res 2016; 23:353-63. [PMID: 27345720 PMCID: PMC4991834 DOI: 10.1093/dnares/dsw021] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 02/11/2016] [Accepted: 05/09/2016] [Indexed: 11/30/2022] Open
Abstract
Strand biases reflect deviations from a null expectation of DNA evolution that assumes strand-symmetric substitution rates. Here, we present strong evidence that nearest-neighbour preferences are a strand-biased feature of bacterial genomes, indicating neighbour-dependent substitution asymmetries. To detect such asymmetries we introduce an alignment free index (relative abundance skews). The profiles of relative abundance skews along coding sequences can trace the phylogenetic relations of bacteria, suggesting that the patterns of neighbour-dependent substitution strand-biases are not common among different lineages, but are rather species-specific. Analysis of neighbour-dependent and codon-site skews sheds light on the origins of substitution asymmetries. Via a simple model we argue that the structure of the genetic code imposes position-dependent substitution strand-biases along coding sequences, as a response to GC mutation pressure. Thus, the organization of the genetic code per se can lead to an uneven distribution of nucleotides among different codon sites, even when requirements for specific codons and amino-acids are not accounted for. Moreover, our results suggest that strand-biases in replication fidelity of PolIII α-subunit induce substitution asymmetries, both neighbour-dependent and independent, on a genome scale. The role of DNA repair systems, such as transcription-coupled repair, is also considered.
Affiliation(s)
| | - Christoforos Nikolaou
- Computational Genomics Group, Department of Biology, University of Crete, 71409 Heraklion, Greece
| | - Yannis Almirantis
- Institute of Biosciences and Applications, National Center for Scientific Research "Demokritos", 15310 Athens, Greece
|
36
|
Kunec D, Osterrieder N. Codon Pair Bias Is a Direct Consequence of Dinucleotide Bias. Cell Rep 2016; 14:55-67. [DOI: 10.1016/j.celrep.2015.12.011] [Citation(s) in RCA: 89] [Impact Index Per Article: 11.1] [Received: 07/21/2015] [Revised: 11/03/2015] [Accepted: 11/23/2015] [Indexed: 11/25/2022] Open
|
37
|
Ghosh S, Singh KK, Sengupta S, Scaria V. Mitoepigenetics: The different shades of grey. Mitochondrion 2015; 25:60-6. [PMID: 26437363 DOI: 10.1016/j.mito.2015.09.003] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.1] [Received: 08/18/2015] [Accepted: 09/28/2015] [Indexed: 11/24/2022]
Abstract
Epigenetic modifications of the nuclear genome have been well studied, and it is established that these modifications play a key role in nuclear gene expression. However, the status of mitochondrial epigenetic modifications has not been explored in detail. Recent technological advancements in genome analysis tools and techniques have helped in investigating mitochondrial epigenetic modifications with greater resolution, and studies have indicated a regulatory role of the mitochondrial epigenome. Association of mitochondrial DNA methylation with various disease conditions, drug treatment, aging, exposure to environmental pollutants, etc. has lent credence to this belief. Herein, we have reviewed studies on mitochondrial epigenetic modifications with a focus on understanding their regulatory role in gene expression and disease association.
Affiliation(s)
- Sourav Ghosh
- Genomics and Molecular Medicine, CSIR Institute of Genomics and Integrative Biology (CSIR IGIB), Mathura Road, Delhi, 110 020 Delhi, India; Academy of Scientific and Innovative Research (AcSIR), CSIR IGIB South Campus, Mathura Road, Delhi, 110020 Delhi, India
| | - Keshav K Singh
- Departments of Genetics, Pathology, Environmental Health, University of Alabama at Birmingham, Birmingham, Alabama; Center for Free Radical Biology, Center for Aging and UAB Comprehensive Cancer Center, University of Alabama at Birmingham, Birmingham, Alabama; Birmingham Veterans Affairs Medical Center, Birmingham, AL, USA 35294
| | - Shantanu Sengupta
- Genomics and Molecular Medicine, CSIR Institute of Genomics and Integrative Biology (CSIR IGIB), Mathura Road, Delhi, 110 020 Delhi, India; Academy of Scientific and Innovative Research (AcSIR), CSIR IGIB South Campus, Mathura Road, Delhi, 110020 Delhi, India
| | - Vinod Scaria
- GN Ramachandran Knowledge Center for Genome Informatics, CSIR Institute of Genomics and Integrative Biology (CSIR IGIB), Mathura Road, Delhi, 110 020 Delhi, India; Academy of Scientific and Innovative Research (AcSIR), CSIR IGIB South Campus, Mathura Road, Delhi, 110020 Delhi, India.
|
38
|
Hou T, Liu F, Liu Y, Zou QY, Zhang X, Wang K. Classification of metagenomics data at lower taxonomic level using a robust supervised classifier. Evol Bioinform Online 2015; 11:3-10. [PMID: 25673967 PMCID: PMC4309676 DOI: 10.4137/ebo.s20523] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 09/28/2014] [Revised: 11/25/2014] [Accepted: 12/14/2014] [Indexed: 11/11/2022] Open
Abstract
As more and more completely sequenced genomes become available, the taxonomic classification of metagenomic data will benefit greatly from supervised classifiers that can be updated instantaneously in response to new genomes. Currently, some supervised classifiers have been developed to assign metagenomic sequences to their source organisms. We have found that the existing supervised classifiers usually cannot discriminate the training data from different classes accurately when the data contain some outliers. However, the training genomic data (bacterial and archaeal genomes) usually contain a portion of outliers, which come from sequencing errors, phage invasions, some highly expressed genes, etc. The outliers, treated as noise, hinder the development of classifiers with better prediction accuracy. To solve the problem, we present a robust supervised classifier, weighted support vector domain description (WSVDD), which can eliminate the interference from some outliers in the training genomic data and then generate more accurate data domain descriptions for each taxonomic class. The experimental results demonstrate that WSVDD is more robust than other classifiers for simulated Sanger and 454 reads with different outlier rates. In addition, in experiments performed on simulated metagenomes and real gut metagenomes, WSVDD also achieved better prediction accuracy than other classifiers.
Affiliation(s)
- Tao Hou
- College of Communications Engineering, Jilin University, Changchun, China
| | - Fu Liu
- College of Communications Engineering, Jilin University, Changchun, China
| | - Yun Liu
- College of Communications Engineering, Jilin University, Changchun, China
| | - Qing Yu Zou
- College of Communications Engineering, Jilin University, Changchun, China
| | - Xiao Zhang
- College of Communications Engineering, Jilin University, Changchun, China
| | - Ke Wang
- College of Communications Engineering, Jilin University, Changchun, China
|
39
|
Abstract
Dinucleotide usage is known to vary in the genomes of organisms. The dinucleotide usage profiles or genome signatures are similar for sequence samples taken from the same genome, but are different for taxonomically distant species. This concept of genome signatures has been used to study several organisms including viruses, to elucidate the signatures of evolutionary processes at the genome level. Genome signatures assume greater importance in the case of host-pathogen interactions, where molecular interactions between the two species take place continuously, and can influence their genomic composition. In this study, analyses of whole genome sequences of HIV-1 subtype B, a retrovirus that caused the global AIDS pandemic, have been carried out to analyse the variation in genome signatures of the virus from 1983 to 2007. We show statistically significant temporal variations in some dinucleotide patterns, highlighting the selective evolution of the dinucleotide profiles of HIV-1 subtype B, possibly a consequence of host-specific selection.
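A dinucleotide usage profile of the kind described in this abstract can be sketched as follows. This is a minimal illustration, assuming the common odds-ratio formulation rho_xy = f_xy / (f_x * f_y); the abstract does not state which formulation the study used, and the function names `dinucleotide_signature` and `signature_distance` are hypothetical.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"


def dinucleotide_signature(seq):
    """Return {dinucleotide: rho}, with rho = f_xy / (f_x * f_y).

    Assumed odds-ratio formulation; ambiguous symbols (e.g. N) are skipped.
    """
    seq = seq.upper()
    # Count single bases and overlapping dinucleotides over A, C, G, T only.
    mono = Counter(b for b in seq if b in BASES)
    di = Counter(
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in BASES and seq[i + 1] in BASES
    )
    n_mono = sum(mono.values())
    n_di = sum(di.values())

    signature = {}
    for x, y in product(BASES, repeat=2):
        fx = mono[x] / n_mono if n_mono else 0.0
        fy = mono[y] / n_mono if n_mono else 0.0
        fxy = di[x + y] / n_di if n_di else 0.0
        # Define rho as 0 when either base is absent from the sequence.
        signature[x + y] = fxy / (fx * fy) if fx and fy else 0.0
    return signature


def signature_distance(seq_a, seq_b):
    """Mean absolute difference between two signatures (16 dinucleotides)."""
    sa = dinucleotide_signature(seq_a)
    sb = dinucleotide_signature(seq_b)
    return sum(abs(sa[k] - sb[k]) for k in sa) / len(sa)
```

Computing `signature_distance` between isolates sampled in different years would give one simple way to look for the temporal drift the abstract reports; a significance test would still be needed on top of it.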
|
40
|
Torriani SF, Penselin D, Knogge W, Felder M, Taudien S, Platzer M, McDonald BA, Brunner PC. Comparative analysis of mitochondrial genomes from closely related Rhynchosporium species reveals extensive intron invasion. Fungal Genet Biol 2014; 62:34-42. [DOI: 10.1016/j.fgb.2013.11.001] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.9] [Received: 06/11/2013] [Revised: 10/08/2013] [Accepted: 11/01/2013] [Indexed: 01/07/2023]
|
41
|
Song K, Ren J, Reinert G, Deng M, Waterman MS, Sun F. New developments of alignment-free sequence comparison: measures, statistics and next-generation sequencing. Brief Bioinform 2013; 15:343-53. [PMID: 24064230 DOI: 10.1093/bib/bbt067] [Citation(s) in RCA: 112] [Impact Index Per Article: 10.2] [Indexed: 02/07/2023] Open
Abstract
With the development of next-generation sequencing (NGS) technologies, a large amount of short read data has been generated. Assembly of these short reads can be challenging for genomes and metagenomes without template sequences, making alignment-based genome sequence comparison difficult. In addition, sequence reads from NGS can come from different regions of various genomes and they may not be alignable. Sequence signature-based methods for genome comparison based on the frequencies of word patterns in genomes and metagenomes can potentially be useful for the analysis of short reads data from NGS. Here we review the recent development of alignment-free genome and metagenome comparison based on the frequencies of word patterns with emphasis on the dissimilarity measures between sequences, the statistical power of these measures when two sequences are related and the applications of these measures to NGS data.
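The idea of comparing sequences by word-pattern frequencies can be illustrated with a short sketch. It computes the classical D2 statistic (the sum over all words of the product of their counts in the two sequences) and, as a simplifying assumption, a cosine-style dissimilarity on the same count vectors. The normalised variants discussed in the review are not reproduced here, and the function names are hypothetical.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq, k):
    """Count overlapping words of length k in seq."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2(seq_a, seq_b, k):
    """D2 statistic: sum over words w of count_a(w) * count_b(w)."""
    ca, cb = kmer_counts(seq_a, k), kmer_counts(seq_b, k)
    return sum(n * cb[w] for w, n in ca.items())


def cosine_dissimilarity(seq_a, seq_b, k):
    """1 - cosine similarity of the k-mer count vectors (illustrative only)."""
    ca, cb = kmer_counts(seq_a, k), kmer_counts(seq_b, k)
    dot = sum(n * cb[w] for w, n in ca.items())
    norm_a = sqrt(sum(n * n for n in ca.values()))
    norm_b = sqrt(sum(n * n for n in cb.values()))
    if norm_a == 0 or norm_b == 0:
        # A sequence shorter than k has no words; treat it as maximally distant.
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)
```

Because neither function needs an alignment, the same code applies to unassembled reads, which is the setting the abstract has in mind.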
Affiliation(s)
- Kai Song
- Molecular and Computational Biology Program, University of Southern California, 1050 Childs Way, Los Angeles, CA 90089, USA.
|
42
|
An analysis of trypanosomatids kDNA minicircle by absolute dinucleotide frequency. Parasitol Int 2013; 62:397-403. [DOI: 10.1016/j.parint.2013.04.005] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 12/27/2012] [Revised: 03/05/2013] [Accepted: 04/08/2013] [Indexed: 11/20/2022]
|
43
|
Salmonella utilizes D-glucosaminate via a mannose family phosphotransferase system permease and associated enzymes. J Bacteriol 2013; 195:4057-66. [PMID: 23836865 DOI: 10.1128/jb.00290-13] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Indexed: 01/21/2023] Open
Abstract
Salmonella enterica is a globally significant bacterial food-borne pathogen that utilizes a variety of carbon sources. We report here that Salmonella enterica subsp. enterica serovar Typhimurium (S. Typhimurium) uses d-glucosaminate (2-amino-2-deoxy-d-gluconic acid) as a carbon and nitrogen source via a previously uncharacterized mannose family phosphotransferase system (PTS) permease, and we designate the genes encoding the permease dgaABCD (d-glucosaminate PTS permease components EIIA, EIIB, EIIC, and EIID). Two other genes in the dga operon (dgaE and dgaF) were required for wild-type growth of S. Typhimurium with d-glucosaminate. Transcription of dgaABCDEF was dependent on RpoN (σ(54)) and an RpoN-dependent activator gene we designate dgaR. Introduction of a plasmid bearing dgaABCDEF under the control of the lac promoter into Escherichia coli strains DH5α, BL21, and JM101 allowed these strains to grow on minimal medium containing d-glucosaminate as the sole carbon and nitrogen source. Biochemical and genetic data support a catabolic pathway in which d-glucosaminate, as it is transported across the cell membrane, is phosphorylated at the C-6 position by DgaABCD. DgaE converts the resulting d-glucosaminate-6-phosphate to 2-keto-3-deoxygluconate 6-phosphate (KDGP), which is subsequently cleaved by the aldolase DgaF to form glyceraldehyde-3-phosphate and pyruvate. DgaF catalyzes the same reaction as that catalyzed by Eda, a KDGP aldolase in the Entner-Doudoroff pathway, and the two enzymes can substitute for each other in their respective pathways. Examination of the Integrated Microbial Genomes database revealed that orthologs of the dga genes are largely restricted to certain enteric bacteria and a few species in the phylum Firmicutes.
|
44
|
Krimitzas A, Pyrri I, Kouvelis VN, Kapsanaki-Gotsi E, Typas MA. A phylogenetic analysis of Greek isolates of Aspergillus species based on morphology and nuclear and mitochondrial gene sequences. BioMed Research International 2013; 2013:260395. [PMID: 23762830 PMCID: PMC3665174 DOI: 10.1155/2013/260395] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Received: 03/13/2013] [Accepted: 04/09/2013] [Indexed: 12/13/2022]
Abstract
Aspergillus species originating from Greece were examined by morphological and molecular criteria to explore the diversity of this genus. The phylogenetic relationships of these species were determined using sequences from the ITS and IGS region of the nuclear rRNA gene complex, two nuclear genes (β-tubulin (benA) and RNA polymerase II second largest subunit (rpb2)) and two mitochondrial genes (small rRNA subunit (rns) and cytochrome oxidase subunit I (cox1)) and, where available, related sequences from databases. The morphological characters of the anamorphs and teleomorphs, and the single gene phylogenetic trees, differentiated and placed the species examined in the well-supported sections of Aenei, Aspergillus, Bispori, Candidi, Circumdati, Clavati, Cremei, Flavi, Flavipedes, Fumigati, Nidulantes, Nigri, Restricti, Terrei, Usti, and Zonati, with few uncertainties. The combined use of the three commonly employed nuclear genes (benA, rpb2, and ITS), the IGS region, and two less often used mitochondrial gene sequences (rns and cox1) as a single unit resolved several taxonomic ambiguities. A phylogenetic tree was inferred using Neighbour-Joining, Maximum Parsimony, and Bayesian methods. The strains examined formed seven well-supported clades within the genus Aspergillus. Altogether, the concatenated nuclear and mitochondrial sequences offer additional tools for an improved understanding of phylogenetic relationships within this genus.
Affiliation(s)
- Antonios Krimitzas
- Department of Genetics and Biotechnology, Faculty of Biology, National and Kapodistrian University of Athens, Panepistemiopolis, 15701 Athens, Greece
| | - Ioanna Pyrri
- Department of Ecology and Systematics, Faculty of Biology, National and Kapodistrian University of Athens, Panepistemiopolis, 15784 Athens, Greece
| | - Vassili N. Kouvelis
- Department of Genetics and Biotechnology, Faculty of Biology, National and Kapodistrian University of Athens, Panepistemiopolis, 15701 Athens, Greece
| | - Evangelia Kapsanaki-Gotsi
- Department of Ecology and Systematics, Faculty of Biology, National and Kapodistrian University of Athens, Panepistemiopolis, 15784 Athens, Greece
| | - Milton A. Typas
- Department of Genetics and Biotechnology, Faculty of Biology, National and Kapodistrian University of Athens, Panepistemiopolis, 15701 Athens, Greece
|
45
|
Moreno-Hagelsieb G, Wang Z, Walsh S, ElSherbiny A. Phylogenomic clustering for selecting non-redundant genomes for comparative genomics. Bioinformatics 2013; 29:947-9. [PMID: 23396122 DOI: 10.1093/bioinformatics/btt064] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.1] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION: Analyses in comparative genomics often require non-redundant genome datasets. Eliminating redundancy is not as simple as keeping one strain for each named species because genomes might be redundant at a higher taxonomic level than that of species for some analyses; some strains with different species names can be as similar as most strains sharing a species name, whereas some strains sharing a species name can be so different that they should be put into different groups; and some genomes lack a species name. RESULTS: We have implemented a method and Web server that clusters a genome dataset into groups of redundant genomes at different thresholds based on a few phylogenomic distance measures. AVAILABILITY: The Web interface, similarity and distance data and R-scripts can be accessed at http://microbiome.wlu.ca/research/redundancy/.
Affiliation(s)
- Gabriel Moreno-Hagelsieb
- Department of Biology and Department of Mathematics, Wilfrid Laurier University, Waterloo, ON N2L 3C5, Canada.
|
46
|
Mahfooz S, Singh P, Maurya DK, Yadav MC, Tahoor A, Sahay H, Srivastava A, Prakash A. Microsatellite repeat dynamics in mitochondrial genomes of phytopathogenic fungi: frequency and distribution in the genic and intergenic regions. Bioinformation 2012; 8:1171-5. [PMID: 23275715 PMCID: PMC3530887 DOI: 10.6026/97320630081171] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 11/02/2012] [Accepted: 11/05/2012] [Indexed: 11/23/2022] Open
Abstract
The frequency and distribution of microsatellites were analyzed in the 19 mitogenomes of phytopathogenic fungi covering five phyla. Our analysis revealed that in all the mitogenomes studied, the frequency and relative abundance varied, and they were influenced neither by genome size nor by GC content. SSRs were found to be differentially distributed in genic and intergenic regions. An average of 5.14 (23.6%) SSRs were present in genic sequences and 21.7 (76.4%) SSRs were located in the intergenic sequences. Relative abundance of SSRs in mitogenomes was highest in Aspergillus tubingensis and lowest in Phaeosphaeria nodorum, the average being 0.45. Trinucleotide repeats were the most abundant motifs in the genic and intergenic regions of the mitogenomes of the phytopathogenic fungi. Among the genes, cox1 harbors the most SSRs, whereas cox3 and nad7 contain the fewest. Based on the presence of SSRs in a particular gene, genetic relationships among individual organisms were also established.
Affiliation(s)
- Sahil Mahfooz
- Department of Biotechnology and Bioinformatics, Barkatullah University, Bhopal 462 026, Madhya Pradesh, India
| | - Pallavi Singh
- National Bureau of Agriculturally Important Microorganisms, Mau 275 101, Uttar Pradesh, India
| | - Deepak K Maurya
- National Bureau of Agriculturally Important Microorganisms, Mau 275 101, Uttar Pradesh, India
| | - Mahesh C Yadav
- National Research Centre on DNA Fingerprinting, National Bureau of Plant Genetic Resources, Pusa Campus, New Delhi 110 012, India
| | - Azram Tahoor
- Department of Wildlife Science, Aligarh Muslim University, Aligarh 202002, Uttar Pradesh, India
| | - Harmesh Sahay
- National Bureau of Agriculturally Important Microorganisms, Mau 275 101, Uttar Pradesh, India
| | - Arpita Srivastava
- National Bureau of Agriculturally Important Microorganisms, Mau 275 101, Uttar Pradesh, India
| | - Anil Prakash
- Department of Biotechnology and Bioinformatics, Barkatullah University, Bhopal 462 026, Madhya Pradesh, India
|
47
|
Arakawa K, Tomita M. Measures of compositional strand bias related to replication machinery and its applications. Curr Genomics 2012; 13:4-15. [PMID: 22942671 PMCID: PMC3269016 DOI: 10.2174/138920212799034749] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Received: 07/23/2011] [Revised: 09/10/2011] [Accepted: 09/20/2011] [Indexed: 11/22/2022] Open
Abstract
The compositional asymmetry of complementary bases in nucleotide sequences implies the existence of a mutational or selectional bias in the two strands of the DNA duplex, which is commonly shaped by strand-specific mechanisms in transcription or replication. Such strand bias in genomes, frequently visualized by GC skew graphs, is used for the computational prediction of transcription start sites and replication origins, as well as for comparative evolutionary genomics studies. The use of measures of compositional strand bias in order to quantify the degree of strand asymmetry is crucial, as it is the basis for determining the applicability of compositional analysis and comparing the strength of the mutational bias in different biological machineries in various species. Here, we review the measures of strand bias that have been proposed to date, including the ∆GC skew, the B1 index, the predictability score of linear discriminant analysis for gene orientation, the signal-to-noise ratio of the oligonucleotide bias, and the GC skew index. These measures have been predominantly designed for and applied to the analysis of replication-related mutational processes in prokaryotes, but we also give research examples in eukaryotes.
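The GC skew graphs mentioned in this abstract are built from the quantity (G - C) / (G + C) evaluated in windows along a sequence. The sketch below is a minimal illustration of that basic calculation and of the cumulative skew often plotted alongside it. The window and step defaults are arbitrary assumptions, the function names are hypothetical, and none of the derived indices reviewed in the paper (such as the ∆GC skew or the GC skew index) is implemented.

```python
def gc_skew(seq, window=1000, step=None):
    """Return (G - C) / (G + C) for each window along seq.

    `window` and `step` are illustrative defaults; step defaults to
    non-overlapping windows. Windows with no G or C are given a skew of 0.
    """
    step = step or window
    seq = seq.upper()
    skews = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        skews.append((g - c) / (g + c) if g + c else 0.0)
    return skews


def cumulative_skew(skews):
    """Running sum of window skews along the sequence."""
    total, out = 0.0, []
    for s in skews:
        total += s
        out.append(total)
    return out
```

In bacterial chromosomes the extremes of the cumulative curve are commonly used as a first guess for the replication origin and terminus, which is the kind of prediction the abstract refers to.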
Affiliation(s)
- Kazuharu Arakawa
- Institute for Advanced Biosciences, Keio University, Fujisawa 252-8520, Japan
|
48
|
Zhou Y, Call DR, Broschat SL. Genetic relationships among 527 Gram-negative bacterial plasmids. Plasmid 2012; 68:133-41. [DOI: 10.1016/j.plasmid.2012.05.002] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 02/23/2012] [Revised: 05/03/2012] [Accepted: 05/07/2012] [Indexed: 11/28/2022]
|
49
|
Zhai Z, Reinert G, Song K, Waterman MS, Luan Y, Sun F. Normal and compound Poisson approximations for pattern occurrences in NGS reads. J Comput Biol 2012; 19:839-54. [PMID: 22697250 PMCID: PMC3375642 DOI: 10.1089/cmb.2012.0029] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/12/2022] Open
Abstract
Next generation sequencing (NGS) technologies are now widely used in many biological studies. In NGS, sequence reads are randomly sampled from the genome sequence of interest. Most computational approaches for NGS data first map the reads to the genome and then analyze the data based on the mapped reads. Since many organisms have unknown genome sequences and many reads cannot be uniquely mapped to the genomes even if the genome sequences are known, alternative analytical methods are needed for the study of NGS data. Here we suggest using word patterns to analyze NGS data. Word pattern counting (the study of the probabilistic distribution of the number of occurrences of word patterns in one or multiple long sequences) has played an important role in molecular sequence analysis. However, no studies are available on the distribution of the number of occurrences of word patterns in NGS reads. In this article, we build probabilistic models for the background sequence and the sampling process of the sequence reads from the genome. Based on the models, we provide normal and compound Poisson approximations for the number of occurrences of word patterns from the sequence reads, with bounds on the approximation error. The main challenge is to consider the randomness in generating the long background sequence, as well as in the sampling of the reads using NGS. We show the accuracy of these approximations under a variety of conditions for different patterns with various characteristics. Under realistic assumptions, the compound Poisson approximation seems to outperform the normal approximation in most situations. These approximate distributions can be used to evaluate the statistical significance of the occurrence of patterns from NGS data. The theory and the computational algorithm for calculating the approximate distributions are then used to analyze ChIP-Seq data using transcription factor GABP. 
Software is available online (www-rcf.usc.edu/∼fsun/Programs/NGS_motif_power/NGS_motif_power.html). In addition, Supplementary Material can be found online (www.liebertonline.com/cmb).
Affiliation(s)
- Zhiyuan Zhai
- School of Mathematics, Shandong University, Jinan, Shandong, China
| | - Gesine Reinert
- Department of Statistics, University of Oxford, Oxford, United Kingdom
| | - Kai Song
- School of Mathematics, Peking University, Beijing, China
| | - Michael S. Waterman
- Molecular and Computational Biology, University of Southern California, Los Angeles, California
- TNLIST/Department of Automation, Tsinghua University, Beijing, China
| | - Yihui Luan
- School of Mathematics, Shandong University, Jinan, Shandong, China
| | - Fengzhu Sun
- Molecular and Computational Biology, University of Southern California, Los Angeles, California
- TNLIST/Department of Automation, Tsinghua University, Beijing, China
|
50
|
Unsupervised two-way clustering of metagenomic sequences. J Biomed Biotechnol 2012; 2012:153647. [PMID: 22577288 PMCID: PMC3336163 DOI: 10.1155/2012/153647] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 12/15/2011] [Accepted: 01/26/2012] [Indexed: 11/30/2022] Open
Abstract
A major challenge facing metagenomics is the development of tools for the characterization of functional and taxonomic content of vast amounts of short metagenome reads. The efficacy of clustering methods depends on the number of reads in the dataset, the read length and relative abundances of source genomes in the microbial community. In this paper, we formulate an unsupervised naive Bayes multispecies, multidimensional mixture model for reads from a metagenome. We use the proposed model to cluster metagenomic reads by their species of origin and to characterize the abundance of each species. We model the distribution of word counts along a genome as a Gaussian for shorter, frequent words and as a Poisson for longer words that are rare. We employ either a mixture of Gaussians or a mixture of Poissons to model reads within each bin. Further, we handle the high dimensionality and sparsity associated with the data by grouping the set of words comprising the reads, resulting in a two-way mixture model. Finally, we demonstrate the accuracy and applicability of this method on simulated and real metagenomes. Our method can accurately cluster reads as short as 100 bp and is robust to varying abundances, divergences and read lengths.
|