1
Music of metagenomics-a review of its applications, analysis pipeline, and associated tools. Funct Integr Genomics 2021; 22:3-26. [PMID: 34657989 DOI: 10.1007/s10142-021-00810-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/29/2021] [Revised: 09/25/2021] [Accepted: 10/03/2021] [Indexed: 10/20/2022]
Abstract
This humble effort highlights the intricate details of metagenomics in a simple, poetic, and rhythmic way. The paper enforces the significance of the research area, provides details about major analytical methods, examines the taxonomy and assembly of genomes, emphasizes some tools, and concludes by celebrating the richness of the ecosystem populated by the "metagenome."
2
Giles HH, Hegde MR, Lyon E, Stanley CM, Kerr ID, Garlapow ME, Eggington JM. The Science and Art of Clinical Genetic Variant Classification and Its Impact on Test Accuracy. Annu Rev Genomics Hum Genet 2021; 22:285-307. [PMID: 33900788 DOI: 10.1146/annurev-genom-121620-082709] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/09/2022]
Abstract
Clinical genetic variant classification science is a growing subspecialty of clinical genetics and genomics. The field's continued improvement is essential for the success of precision medicine in both germline (hereditary) and somatic (oncology) contexts. This review focuses on variant classification for DNA next-generation sequencing tests. We first summarize current limitations in variant discovery and definition, and then describe the current five- and four-tier classification systems outlined in dominant standards and guideline publications for germline and somatic tests, respectively. We then discuss measures of variant classification discordance and the field's bias for positive results, as well as considerations for panel size and population screening in the context of estimates of positive predictive value that incorporate estimated variant classification imperfections. Finally, we share opinions on the current state of variant classification from some of the authors of the most widely used standards and guideline publications and from other domain experts.
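To make the link between classification imperfections and positive predictive value concrete, the sketch below applies the standard Bayes relationship between prevalence, sensitivity, specificity, and PPV. It is not the model used in the review; the function name and the example numbers are assumptions chosen purely for illustration.

```python
def positive_predictive_value(prevalence: float,
                              sensitivity: float,
                              specificity: float) -> float:
    """Standard Bayes formula: PPV = TP / (TP + FP), per person tested."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    if true_pos + false_pos == 0:
        raise ValueError("no positive results are possible with these inputs")
    return true_pos / (true_pos + false_pos)


# Hypothetical screening scenario (illustrative numbers, not from the review):
# 1% of the screened population carries a truly pathogenic variant, and
# classification is 99% sensitive and 99% specific.
ppv = positive_predictive_value(prevalence=0.01,
                                sensitivity=0.99,
                                specificity=0.99)
print(f"PPV = {ppv:.2f}")
```

With these assumed inputs only about half of the positive results are true positives, which illustrates why small classification error rates matter when the condition is rare in the tested population.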
Affiliation(s)
- Hunter H Giles
  - Center for Genomic Interpretation, Sandy, Utah 84092, USA
- Madhuri R Hegde
  - PerkinElmer Genomics, Waltham, Massachusetts 02450, USA
  - Department of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia 30332, USA
- Elaine Lyon
  - HudsonAlpha Clinical Services Lab, Huntsville, Alabama 35806, USA
- Christine M Stanley
  - C2i Genomics, Cambridge, Massachusetts 02139, USA
  - Variantyx, Framingham, Massachusetts 01701, USA
3
García-López R, Cornejo-Granados F, Lopez-Zavala AA, Cota-Huízar A, Sotelo-Mundo RR, Gómez-Gil B, Ochoa-Leyva A. OTUs and ASVs Produce Comparable Taxonomic and Diversity from Shrimp Microbiota 16S Profiles Using Tailored Abundance Filters. Genes (Basel) 2021; 12:genes12040564. [PMID: 33924545 PMCID: PMC8070570 DOI: 10.3390/genes12040564] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 03/04/2021] [Revised: 04/03/2021] [Accepted: 04/10/2021] [Indexed: 12/12/2022] Open
Abstract
The interplay between the shrimp immune system, its environment, and microbiota contributes to the organism’s homeostasis and optimal production. The metagenomic composition is typically studied using 16S rDNA profiling by clustering amplicon sequences into operational taxonomic units (OTUs) and, more recently, amplicon sequence variants (ASVs). Establishing the compatibility of the taxonomy, α, and β diversity described by both methods is necessary to compare past and future shrimp microbiota studies. Here, we used identical sequences to survey the V3 16S hypervariable region using 97% and 99% OTUs and ASVs to assess the hepatopancreas and intestine microbiota of L. vannamei from two ponds under standardized rearing conditions. We found that applying filters to retain clusters >0.1% of the total abundance per sample enabled a consistent taxonomy comparison while preserving >94% of the total reads. The three sets became comparable at the family level, whereas the 97% identity OTU set produced divergent genus and species profiles. Interestingly, the detection of organ and pond variations was robust to the choice of clustering method, producing comparable α and β-diversity profiles. For comparisons on shrimp microbiota between past and future studies, we strongly recommend that ASVs be compared at the family level to 97% identity OTUs or use 99% identity OTUs, both using tailored frequency filters.
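The per-sample abundance filter described in this abstract can be sketched as follows. This is a minimal illustration, assuming the counts are held in a nested dictionary (sample, then cluster, then read count); the function name, the data layout, and the example counts are hypothetical and are not taken from the authors' pipeline.

```python
from typing import Dict, Tuple

Counts = Dict[str, Dict[str, int]]  # sample -> cluster (OTU or ASV) -> reads


def filter_low_abundance(counts: Counts,
                         min_fraction: float = 0.001
                         ) -> Tuple[Counts, Dict[str, float]]:
    """Keep clusters whose share of a sample's reads exceeds min_fraction.

    Returns the filtered table and, per sample, the fraction of reads
    that survive the filter.
    """
    filtered: Counts = {}
    retained: Dict[str, float] = {}
    for sample, clusters in counts.items():
        total = sum(clusters.values())
        if total == 0:
            filtered[sample] = {}
            retained[sample] = 0.0
            continue
        kept = {c: n for c, n in clusters.items() if n / total > min_fraction}
        filtered[sample] = kept
        retained[sample] = sum(kept.values()) / total
    return filtered, retained


# Hypothetical counts for one sample (10,000 reads in total).
example = {"pond1_intestine": {"ASV_1": 9000, "ASV_2": 995, "ASV_3": 5}}
filtered, retained = filter_low_abundance(example)
print(filtered["pond1_intestine"])   # ASV_3 (0.05% of reads) is dropped
print(retained["pond1_intestine"])
```

Reporting the retained read fraction alongside the filtered table makes it easy to check how much data a given threshold discards in each sample.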
Affiliation(s)
- Rodrigo García-López
  - Departamento de Microbiología Molecular, Instituto de Biotecnología (IBT), Universidad Nacional Autónoma de México (UNAM), Avenida Universidad #2001, Colonia Chamilpa, Cuernavaca, Morelos 62210, Mexico
- Fernanda Cornejo-Granados
  - Departamento de Microbiología Molecular, Instituto de Biotecnología (IBT), Universidad Nacional Autónoma de México (UNAM), Avenida Universidad #2001, Colonia Chamilpa, Cuernavaca, Morelos 62210, Mexico
- Alonso A. Lopez-Zavala
  - Departamento de Ciencias Químico Biológicas, Universidad de Sonora (UNISON), Blvd. Rosales y Luis Encinas, Hermosillo, Sonora 83000, Mexico
- Andrés Cota-Huízar
  - Camarones el Renacimiento S.P.R. de R.I., Justino Rubio 26, Colonia Ejidal, Higuera de Zaragoza, Sinaloa 81330, Mexico
- Rogerio R. Sotelo-Mundo
  - Laboratorio de Estructura Biomolecular, Centro de Investigación en Alimentación y Desarrollo, A.C., Hermosillo, Sonora 83304, Mexico
- Bruno Gómez-Gil
  - Centro de Investigación en Alimentación y Desarrollo, A.C., Mazatlán, Sinaloa 82100, Mexico
- Adrian Ochoa-Leyva
  - Departamento de Microbiología Molecular, Instituto de Biotecnología (IBT), Universidad Nacional Autónoma de México (UNAM), Avenida Universidad #2001, Colonia Chamilpa, Cuernavaca, Morelos 62210, Mexico
  - Correspondence: Tel.: +52-777-3291614
4
Stoler N, Nekrutenko A. Sequencing error profiles of Illumina sequencing instruments. NAR Genom Bioinform 2021; 3:lqab019. [PMID: 33817639 PMCID: PMC8002175 DOI: 10.1093/nargab/lqab019] [Citation(s) in RCA: 149] [Impact Index Per Article: 49.7] [Received: 06/05/2020] [Revised: 02/01/2021] [Accepted: 03/16/2021] [Indexed: 12/13/2022] Open
Abstract
Sequencing technology has achieved great advances in the past decade. Studies have previously shown the quality of specific instruments in controlled conditions. Here, we developed a method able to retroactively determine the error rate of most public sequencing datasets. To do this, we utilized the overlaps between reads that are a feature of many sequencing libraries. With this method, we surveyed 1943 different datasets from seven different sequencing instruments produced by Illumina. We show that among public datasets, the more expensive platforms like HiSeq and NovaSeq have a lower error rate and less variation. But we also discovered that there is great variation within each platform, with the accuracy of a sequencing experiment depending greatly on the experimenter. We show the importance of sequence context, especially the phenomenon where preceding bases bias the following bases toward the same identity. We also show the difference in patterns of sequence bias between instruments. Contrary to expectations based on the underlying chemistry, HiSeq X Ten and NovaSeq 6000 share notable exceptions to the preceding-base bias. Our results demonstrate the importance of the specific circumstances of every sequencing experiment, and the importance of evaluating the quality of each one.
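The idea of using read overlaps as an internal error check can be illustrated with a simplified sketch. It is not the authors' implementation: it assumes a paired-end library, an overlap length that is already known, and no insertions or deletions, and the function names are hypothetical. Where the two reads of a pair cover the same bases, any disagreement means at least one of the two base calls is wrong.

```python
from typing import Tuple

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def overlap_mismatches(read1: str, read2: str, overlap_len: int) -> Tuple[int, int]:
    """Count disagreements in the region covered by both mates.

    read1 is the forward mate; read2 is the reverse mate as sequenced.
    After reverse-complementing read2, the shared region is the last
    overlap_len bases of read1 and the first overlap_len bases of read2.
    Returns (mismatches, positions compared).
    """
    if not 0 < overlap_len <= min(len(read1), len(read2)):
        raise ValueError("overlap_len must fit inside both reads")
    tail = read1[-overlap_len:]
    head = reverse_complement(read2)[:overlap_len]
    mismatches = sum(a != b for a, b in zip(tail, head))
    return mismatches, overlap_len


# Toy fragment of 10 bases read with 8-base mates, so 6 bases overlap.
fragment = "ACGTTGCAAG"
read1 = fragment[:8]
read2 = reverse_complement(fragment[2:])
clean = overlap_mismatches(read1, read2, 6)

# Simulate a single miscalled base in the reverse mate.
read2_err = read2[:6] + "T" + read2[7:]
noisy = overlap_mismatches(read1, read2_err, 6)
print(clean, noisy)
```

Under the additional assumption that errors are rare and independent, dividing the mismatch count by twice the number of compared positions gives a rough per-base error estimate, since each compared position contains two base calls.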
Affiliation(s)
- Nicholas Stoler
  - Graduate Program in Bioinformatics and Genomics, The Huck Institutes for Life Sciences, The Pennsylvania State University, University Park, PA 16802, USA
- Anton Nekrutenko
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA
5
Castaño C, Berlin A, Brandström Durling M, Ihrmark K, Lindahl BD, Stenlid J, Clemmensen KE, Olson Å. Optimized metabarcoding with Pacific biosciences enables semi-quantitative analysis of fungal communities. New Phytol 2020; 228:1149-1158. [PMID: 32531109 DOI: 10.1111/nph.16731] [Citation(s) in RCA: 34] [Impact Index Per Article: 8.5] [Received: 02/14/2020] [Accepted: 05/29/2020] [Indexed: 06/11/2023]
Abstract
Recent studies have questioned the use of high-throughput sequencing of the nuclear ribosomal internal transcribed spacer (ITS) region to derive a semi-quantitative representation of fungal community composition. However, comprehensive studies that quantify biases occurring during PCR and sequencing of ITS amplicons are still lacking. We used artificially assembled communities consisting of 10 ITS-like fragments of varying lengths and guanine-cytosine (GC) contents to evaluate and quantify biases during PCR and sequencing with Illumina MiSeq, PacBio RS II and PacBio Sequel I technologies. Fragment length variation was the main source of bias in observed community composition relative to the template, with longer fragments generally being under-represented for all sequencing platforms. This bias was three times higher for Illumina MiSeq than for PacBio RS II and Sequel I. All 10 fragments in the artificial community were recovered when sequenced with PacBio technologies, whereas the three longest fragments (> 447 bases) were lost when sequenced with Illumina MiSeq. Fragment length bias also increased linearly with increasing number of PCR cycles but could be mitigated by optimization of the PCR setup. No significant biases related to GC content were observed. Despite lower sequencing output, PacBio sequencing was better able to reflect the community composition of the template than Illumina MiSeq sequencing.
Affiliation(s)
- Carles Castaño
  - Department of Forest Mycology and Plant Pathology, Swedish University of Agricultural Sciences, Uppsala, SE-75007, Sweden
- Anna Berlin
  - Department of Forest Mycology and Plant Pathology, Swedish University of Agricultural Sciences, Uppsala, SE-75007, Sweden
- Mikael Brandström Durling
  - Department of Forest Mycology and Plant Pathology, Swedish University of Agricultural Sciences, Uppsala, SE-75007, Sweden
- Katharina Ihrmark
  - Department of Forest Mycology and Plant Pathology, Swedish University of Agricultural Sciences, Uppsala, SE-75007, Sweden
- Björn D Lindahl
  - Department of Soil and Environment, Swedish University of Agricultural Sciences, Uppsala, SE-75007, Sweden
- Jan Stenlid
  - Department of Forest Mycology and Plant Pathology, Swedish University of Agricultural Sciences, Uppsala, SE-75007, Sweden
- Karina E Clemmensen
  - Department of Forest Mycology and Plant Pathology, Swedish University of Agricultural Sciences, Uppsala, SE-75007, Sweden
- Åke Olson
  - Department of Forest Mycology and Plant Pathology, Swedish University of Agricultural Sciences, Uppsala, SE-75007, Sweden
6
McElhoe JA, Holland MM. Characterization of background noise in MiSeq MPS data when sequencing human mitochondrial DNA from various sample sources and library preparation methods. Mitochondrion 2020; 52:40-55. [PMID: 32068127 DOI: 10.1016/j.mito.2020.02.005] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 07/30/2019] [Revised: 12/18/2019] [Accepted: 02/12/2020] [Indexed: 12/20/2022]
Abstract
Improved resolution of massively parallel sequencing (MPS) allows for the characterization of mitochondrial (mt) DNA heteroplasmy to levels previously unattainable with traditional sequencing approaches. An essential criterion for the reporting of heteroplasmy is the ability of the MPS method to distinguish minor sequence variants (MSVs) from system noise, or error. Therefore, an assessment of the background noise in the MPS method is desirable to identify the point at which reliable data can be reported. Substitution and sequence-specific error (SSE) were evaluated for a variety of sample types and two library preparations. Substitution error rates ranged from 0.18 to 0.49 per 100 nucleotides, with C positions generally having the highest rate of misincorporation. Comparison of error rates across sample types indicated a significant increase for samples with damaged DNA. The positions of error varied across datasets (pairwise concordance 0-68%), but had greater consistency within the damaged samples (80-96%). The most commonly observed motif preceding error in forward reads was CCG, while GGT was most common in reverse reads, both consistent with previous findings. The findings illustrate that for datasets containing samples with damaged DNA, reporting thresholds for heteroplasmy may have to be modified and individual sites with error levels exceeding thresholds should be scrutinized. Collectively, the shifting error profiles observed across the various sample types and library preparation methods demonstrate the need for an assessment of error under these varying circumstances. Characterization of the applicable background noise will help to ensure that thresholds are reliably set for detection of true MSVs.
Affiliation(s)
- Jennifer A McElhoe
  - Department of Biochemistry & Molecular Biology, Forensic Science Program, The Pennsylvania State University, University Park, PA 16802, USA
- Mitchell M Holland
  - Department of Biochemistry & Molecular Biology, Forensic Science Program, The Pennsylvania State University, University Park, PA 16802, USA
7
Gallon R, Sheth H, Hayes C, Redford L, Alhilal G, O'Brien O, Spiewak H, Waltham A, McAnulty C, Izuogu OG, Arends MJ, Oniscu A, Alonso AM, Laguna SM, Borthwick GM, Santibanez‐Koref M, Jackson MS, Burn J. Sequencing-based microsatellite instability testing using as few as six markers for high-throughput clinical diagnostics. Hum Mutat 2020; 41:332-341. [PMID: 31471937 PMCID: PMC6973255 DOI: 10.1002/humu.23906] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 06/06/2019] [Revised: 08/14/2019] [Accepted: 08/26/2019] [Indexed: 12/24/2022]
Abstract
Microsatellite instability (MSI) testing of colorectal cancers (CRCs) is used to screen for Lynch syndrome (LS), a hereditary cancer-predisposition, and can be used to predict response to immunotherapy. Here, we present a single-molecule molecular inversion probe and sequencing-based MSI assay and demonstrate its clinical validity according to existing guidelines. We amplified 24 microsatellites in multiplex and trained a classifier using 98 CRCs, which accommodates marker specific sensitivities to MSI. Sample classification achieved 100% concordance with the MSI Analysis System v1.2 (Promega) in three independent cohorts, totaling 220 CRCs. Backward-forward stepwise selection was used to identify a 6-marker subset of equal accuracy to the 24-marker panel. Assessment of assay detection limits showed that the 24-marker panel is marginally more robust to sample variables than the 6-marker subset, detecting as little as 3% high levels of MSI DNA in sample mixtures, and requiring a minimum of 10 template molecules to be sequenced per marker for >95% accuracy. BRAF c.1799 mutation analysis was also included to streamline LS testing, with all c.1799T>A variants being correctly identified. The assay, therefore, provides a cheap, robust, automatable, and scalable MSI test with internal quality controls, suitable for clinical cancer diagnostics.
Affiliation(s)
- Richard Gallon
  - Institute of Genetic Medicine, Newcastle University, Newcastle upon Tyne, United Kingdom
- Harsh Sheth
  - Institute of Genetic Medicine, Newcastle University, Newcastle upon Tyne, United Kingdom
  - FRIGE's Institute of Human Genetics, FRIGE House, Ahmedabad, India
- Christine Hayes
  - Institute of Genetic Medicine, Newcastle University, Newcastle upon Tyne, United Kingdom
- Lisa Redford
  - Institute of Genetic Medicine, Newcastle University, Newcastle upon Tyne, United Kingdom
- Ghanim Alhilal
  - Institute of Genetic Medicine, Newcastle University, Newcastle upon Tyne, United Kingdom
- Ottilia O'Brien
  - Northern Genetics Service, Newcastle Hospitals NHS Foundation Trust, Newcastle upon Tyne, United Kingdom
- Helena Spiewak
  - Northern Genetics Service, Newcastle Hospitals NHS Foundation Trust, Newcastle upon Tyne, United Kingdom
- Amanda Waltham
  - Northern Genetics Service, Newcastle Hospitals NHS Foundation Trust, Newcastle upon Tyne, United Kingdom
- Ciaron McAnulty
  - Northern Genetics Service, Newcastle Hospitals NHS Foundation Trust, Newcastle upon Tyne, United Kingdom
- Osagie G. Izuogu
  - Institute of Genetic Medicine, Newcastle University, Newcastle upon Tyne, United Kingdom
- Mark J. Arends
  - Division of Pathology, Institute of Genetics & Molecular Medicine, University of Edinburgh, Edinburgh, United Kingdom
- Anca Oniscu
  - Department of Molecular Pathology, Laboratory Medicine, Royal Infirmary of Edinburgh, Edinburgh, United Kingdom
- Angel M. Alonso
  - Oncogenetics and Hereditary Cancer Group, Navarrabiomed, Complejo Hospitalario de Navarra (CHN), Instituto de Investigación Sanitaria de Navarra (IdiSNA), Universidad Pública de Navarra (UPNA), Pamplona, Spain
- Sira M. Laguna
  - Oncogenetics and Hereditary Cancer Group, Navarrabiomed, Complejo Hospitalario de Navarra (CHN), Instituto de Investigación Sanitaria de Navarra (IdiSNA), Universidad Pública de Navarra (UPNA), Pamplona, Spain
- Gillian M. Borthwick
  - Institute of Genetic Medicine, Newcastle University, Newcastle upon Tyne, United Kingdom
- Michael S. Jackson
  - Institute of Genetic Medicine, Newcastle University, Newcastle upon Tyne, United Kingdom
- John Burn
  - Institute of Genetic Medicine, Newcastle University, Newcastle upon Tyne, United Kingdom
8
Holland MM, Bonds RM, Holland CA, McElhoe JA. Recovery of mtDNA from unfired metallic ammunition components with an assessment of sequence profile quality and DNA damage through MPS analysis. Forensic Sci Int Genet 2018; 39:86-96. [PMID: 30611826 DOI: 10.1016/j.fsigen.2018.12.008] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 08/03/2018] [Revised: 11/27/2018] [Accepted: 12/19/2018] [Indexed: 12/14/2022]
Abstract
Recovery of suitable amounts of quality DNA from copper and brass surfaces, like those encountered in ammunition, has been a challenge for the forensic community. The ability of copper ions to rapidly facilitate oxidative damage leading to fragmentation of DNA significantly reduces the pool of templates for PCR amplification. We compared two methods for recovering mitochondrial (mt) DNA from the surface of unfired copper projectiles, brass casings, and aluminum casings, and found that using a cotton swab moistened with 0.5M EDTA was the favored approach, especially when the metallic surface was etched. Degradation was significantly higher for DNA samples recovered from copper and brass surfaces, when compared to aluminum. Massively parallel sequencing (MPS) of the control region, using the PowerSeq™ CRM Nested System kit and the Illumina MiSeq instrument, produced full haplotypes for aluminum samples regardless of the method used to deposit or collect DNA, while less than 60% of the copper and brass samples produced partial or full profile information. Touch DNA collected from copper and brass samples produced higher rates of partial or full MPS profile information (∼88-96%), while collection with 0.5M EDTA produced better results than when collection was performed with water; average of ∼70% versus ∼47%. While MPS data was not impacted by noise in the sequencing process, a higher than expected rate of noise was observed, potentially due to an increase in low-level damage lesions. Noise patterns were strikingly different when compared to control data, suggesting that noisy sites may be predictable when testing samples with high levels of oxidative damage. Library preparation was a poor predictor of MPS data quality, as a large percentage of reads did not align with the reference genome. This may impact the number of samples that can be run when a deep-coverage MPS approach is being considered for analysis of mtDNA heteroplasmy. 
Overall, when applying an MPS approach to the analysis of mtDNA recovered from ammunition, results are expected from touch DNA, will be limited for copper and brass components when the DNA is exposed to an aqueous environment, and DNA degradation will be accelerated when DNA comes in contact with copper or brass surfaces. Practitioners should consider collecting DNA from metallic surfaces with 0.5M EDTA, as this will maximize yield and mitigate degradation. The results of this study directly impact MPS analysis of minor mtDNA sequence variants from metallic surfaces, and are particularly relevant to forensic investigations.
Affiliation(s)
- Mitchell M Holland
  - Forensic Science Program, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, 014 Thomas Building, University Park, PA, 16802, United States
- Rachel M Bonds
  - Forensic Science Program, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, 014 Thomas Building, University Park, PA, 16802, United States
- Charity A Holland
  - Forensic Science Program, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, 014 Thomas Building, University Park, PA, 16802, United States
- Jennifer A McElhoe
  - Forensic Science Program, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, 014 Thomas Building, University Park, PA, 16802, United States
9
Pfeiffer F, Gröber C, Blank M, Händler K, Beyer M, Schultze JL, Mayer G. Systematic evaluation of error rates and causes in short samples in next-generation sequencing. Sci Rep 2018; 8:10950. [PMID: 30026539 PMCID: PMC6053417 DOI: 10.1038/s41598-018-29325-6] [Citation(s) in RCA: 177] [Impact Index Per Article: 29.5] [Received: 03/27/2018] [Accepted: 07/09/2018] [Indexed: 01/08/2023] Open
Abstract
Next-generation sequencing (NGS) is the method of choice when large numbers of sequences have to be obtained. While the technique is widely applied, varying error rates have been observed. We analysed millions of reads obtained after sequencing of one single sequence on an Illumina sequencer. According to our analysis, the index-PCR for sample preparation has no effect on the observed error rate, even though PCR is traditionally seen as one of the major contributors to enhanced error rates in NGS. In addition, we observed very persistent pre-phasing effects although the base calling software corrects for these. Removal of shortened sequences abolished these effects and allowed analysis of the actual mutations. The average error rate determined was 0.24 ± 0.06% per base and the percentage of mutated sequences was found to be 6.4 ± 1.24%. Constant regions at the 5'- and 3'-end, e.g., primer binding sites used in in vitro selection procedures seem to have no effect on mutation rates and re-sequencing of samples obtains very reproducible results. As phasing effects and other sequencing problems vary between equipment and individual setups, we recommend evaluation of error rates and types to all NGS-users to improve the quality and analysis of NGS data.
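The two summary figures reported in this abstract, a per-base error rate and a percentage of mutated sequences, can be computed from reads of a single known sequence along the following lines. This is a simplified sketch that counts substitutions only and drops reads whose length differs from the reference, loosely mirroring the removal of shortened sequences; the function name and the toy data are hypothetical, and the authors' actual analysis may differ.

```python
from typing import Iterable, Tuple


def error_stats(reference: str, reads: Iterable[str]) -> Tuple[float, float]:
    """Return (per-base substitution rate, fraction of reads with >= 1 mismatch).

    Reads that are not exactly as long as the reference are discarded so that
    positions can be compared one to one without an alignment step.
    """
    full_length = [r for r in reads if len(r) == len(reference)]
    if not full_length:
        raise ValueError("no full-length reads to evaluate")
    mismatches = [sum(a != b for a, b in zip(reference, r)) for r in full_length]
    per_base = sum(mismatches) / (len(full_length) * len(reference))
    mutated = sum(m > 0 for m in mismatches) / len(full_length)
    return per_base, mutated


# Toy data: four full-length reads (one carries a substitution) and one
# shortened read that is excluded from the calculation.
reference = "ACGTACGTAC"
reads = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAC", "ACGTTCGTAC", "ACGTACGT"]
per_base, mutated = error_stats(reference, reads)
print(f"per-base error rate: {per_base:.3f}, mutated reads: {mutated:.2f}")
```

The two numbers answer different questions: the per-base rate describes how often an individual base call is wrong, while the mutated fraction describes how many reads are affected at all and therefore grows with read length.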
Affiliation(s)
- Franziska Pfeiffer
  - University of Bonn, LIMES Institute, Chemical Biology, Gerhard-Domagk-Str. 1, 53121, Bonn, Germany
- Carsten Gröber
  - AptaIT GmbH, Am Klopferspitz 19A, 82152, Planegg, Germany
- Michael Blank
  - AptaIT GmbH, Am Klopferspitz 19A, 82152, Planegg, Germany
- Kristian Händler
  - University of Bonn, LIMES Institute, Genomics and Immunoregulation, Carl-Troll-Str. 31, 53115, Bonn, Germany
  - German Center for Neurodegenerative Diseases (DZNE) and University of Bonn, Platform for Single Cell Genomics and Epigenomics, Sigmund-Freud-Str. 25, 53127, Bonn, Germany
- Marc Beyer
  - University of Bonn, LIMES Institute, Genomics and Immunoregulation, Carl-Troll-Str. 31, 53115, Bonn, Germany
  - German Center for Neurodegenerative Diseases (DZNE) and University of Bonn, Platform for Single Cell Genomics and Epigenomics, Sigmund-Freud-Str. 25, 53127, Bonn, Germany
  - DZNE, Molecular Immunology in Neurodegeneration, Sigmund-Freud-Str. 27, 53127, Bonn, Germany
- Joachim L Schultze
  - University of Bonn, LIMES Institute, Genomics and Immunoregulation, Carl-Troll-Str. 31, 53115, Bonn, Germany
  - German Center for Neurodegenerative Diseases (DZNE) and University of Bonn, Platform for Single Cell Genomics and Epigenomics, Sigmund-Freud-Str. 25, 53127, Bonn, Germany
- Günter Mayer
  - University of Bonn, LIMES Institute, Chemical Biology, Gerhard-Domagk-Str. 1, 53121, Bonn, Germany
  - Center of Aptamer Research and Development, Gerhard-Domagk-Str. 1, 53121, Bonn, Germany
10
Lendvay B, Hartmann M, Brodbeck S, Nievergelt D, Reinig F, Zoller S, Parducci L, Gugerli F, Büntgen U, Sperisen C. Improved recovery of ancient DNA from subfossil wood - application to the world's oldest Late Glacial pine forest. New Phytol 2018; 217:1737-1748. [PMID: 29243821 DOI: 10.1111/nph.14935] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 07/26/2017] [Accepted: 10/13/2017] [Indexed: 06/07/2023]
Abstract
Ancient DNA from historical and subfossil wood has a great potential to provide new insights into the history of tree populations. However, its extraction and analysis have not become routine, mainly because contamination of the wood with modern plant material can complicate the verification of genetic information. Here, we used sapwood tissue from 22 subfossil pines that were growing c. 13 000 yr bp in Zurich, Switzerland. We developed and evaluated protocols to eliminate surface contamination, and we tested ancient DNA authenticity based on plastid DNA metabarcoding and the assessment of post-mortem DNA damage. A novel approach using laser irradiation coupled with bleaching and surface removal was most efficient in eliminating contaminating DNA. DNA metabarcoding confirmed which ancient DNA samples repeatedly amplified pine DNA and were free of exogenous plant taxa. Pine DNA sequences of these samples showed a high degree of cytosine to thymine mismatches, typical of post-mortem damage. Stringent decontamination of wood surfaces combined with DNA metabarcoding and assessment of post-mortem DNA damage allowed us to authenticate ancient DNA retrieved from the oldest Late Glacial pine forest. These techniques can be applied to any subfossil wood and are likely to improve the accessibility of relict wood for genome-scale ancient DNA studies.
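The cytosine-to-thymine signal mentioned in this abstract can be quantified in a very simplified form as the share of reference C positions that are read as T. The sketch below assumes gap-free, equal-length reference and read pairs and ignores position along the read and strand, both of which dedicated damage-profiling tools take into account; the function name and the toy sequences are hypothetical.

```python
from typing import Iterable, Tuple


def c_to_t_fraction(aligned_pairs: Iterable[Tuple[str, str]]) -> float:
    """Fraction of reference C positions observed as T in the aligned reads."""
    c_sites = 0
    c_to_t = 0
    for ref, read in aligned_pairs:
        if len(ref) != len(read):
            raise ValueError("reference and read must be aligned to equal length")
        for ref_base, read_base in zip(ref, read):
            if ref_base == "C":
                c_sites += 1
                if read_base == "T":
                    c_to_t += 1
    return c_to_t / c_sites if c_sites else 0.0


# Toy alignments: five reference C positions in total, two of them read as T.
pairs = [("CCGATC", "TCGATC"), ("ACCGT", "ACTGT")]
fraction = c_to_t_fraction(pairs)
print(f"C->T fraction: {fraction:.2f}")
```

An elevated value on its own is only suggestive, because sequencing error and true polymorphism also produce C to T differences; comparing against modern control material helps to interpret it.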
Affiliation(s)
- Bertalan Lendvay
  - Swiss Federal Institute for Forest, Snow and Landscape Research WSL, Zürcherstrasse 111, CH-8903, Birmensdorf, Switzerland
- Martin Hartmann
  - Swiss Federal Institute for Forest, Snow and Landscape Research WSL, Zürcherstrasse 111, CH-8903, Birmensdorf, Switzerland
- Sabine Brodbeck
  - Swiss Federal Institute for Forest, Snow and Landscape Research WSL, Zürcherstrasse 111, CH-8903, Birmensdorf, Switzerland
- Daniel Nievergelt
  - Swiss Federal Institute for Forest, Snow and Landscape Research WSL, Zürcherstrasse 111, CH-8903, Birmensdorf, Switzerland
- Frederick Reinig
  - Swiss Federal Institute for Forest, Snow and Landscape Research WSL, Zürcherstrasse 111, CH-8903, Birmensdorf, Switzerland
- Stefan Zoller
  - Genetic Diversity Centre, ETH Zurich, Universitätstrasse 16, CH-8092, Zurich, Switzerland
- Laura Parducci
  - Department of Ecology and Genetics, Evolutionary Biology Centre, Uppsala University, Norbyvägen 18D, 75236, Uppsala, Sweden
- Felix Gugerli
  - Swiss Federal Institute for Forest, Snow and Landscape Research WSL, Zürcherstrasse 111, CH-8903, Birmensdorf, Switzerland
- Ulf Büntgen
  - Swiss Federal Institute for Forest, Snow and Landscape Research WSL, Zürcherstrasse 111, CH-8903, Birmensdorf, Switzerland
  - Department of Geography, University of Cambridge, Downing Place, Cambridge, CB2 3EN, UK
  - Global Change Research Centre, Masaryk University, 613 00, Brno, Czech Republic
- Christoph Sperisen
  - Swiss Federal Institute for Forest, Snow and Landscape Research WSL, Zürcherstrasse 111, CH-8903, Birmensdorf, Switzerland
11
Deep-Coverage MPS Analysis of Heteroplasmic Variants within the mtGenome Allows for Frequent Differentiation of Maternal Relatives. Genes (Basel) 2018; 9:genes9030124. [PMID: 29495418 PMCID: PMC5867845 DOI: 10.3390/genes9030124] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 01/01/2018] [Revised: 02/15/2018] [Accepted: 02/20/2018] [Indexed: 12/11/2022] Open
Abstract
Distinguishing between maternal relatives through mitochondrial (mt) DNA sequence analysis has been a longstanding desire of the forensic community. Using a deep-coverage, massively parallel sequencing (DCMPS) approach, we studied the pattern of mtDNA heteroplasmy across the mtgenomes of 39 mother-child pairs of European descent; haplogroups H, J, K, R, T, U, and X. Both shared and differentiating heteroplasmy were observed on a frequent basis in these closely related maternal relatives, with the minor variant often presented as 2–10% of the sequencing reads. A total of 17 pairs exhibited differentiating heteroplasmy (44%), with the majority of sites (76%, 16 of 21) occurring in the coding region, further illustrating the value of conducting sequence analysis on the entire mtgenome. A number of the sites of differentiating heteroplasmy resulted in non-synonymous changes in protein sequence (5 of 21), and in changes in transfer or ribosomal RNA sequences (5 of 21), highlighting the potentially deleterious nature of these heteroplasmic states. Shared heteroplasmy was observed in 12 of the 39 mother-child pairs (31%), with no duplicate sites of either differentiating or shared heteroplasmy observed; a single nucleotide position (16093) was duplicated between the data sets. Finally, rates of heteroplasmy in blood and buccal cells were compared, as it is known that rates can vary across tissue types, with similar observations in the current study. Our data support the view that differentiating heteroplasmy across the mtgenome can be used to frequently distinguish maternal relatives, and could be of interest to both the medical genetics and forensic communities.
12
Helbing S, Lattorff HMG, Moritz RFA, Buttstedt A. Comparative analyses of the major royal jelly protein gene cluster in three Apis species with long amplicon sequencing. DNA Res 2017; 24:279-287. [PMID: 28170034 PMCID: PMC5499652 DOI: 10.1093/dnares/dsw064] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 04/06/2016] [Accepted: 12/20/2016] [Indexed: 01/04/2023]
Abstract
The western honeybee, Apis mellifera, is a prominent model organism in the field of sociogenomics, and a recent upgrade substantially improved annotations of the reference genome. Nevertheless, genome assemblies based on short sequencing reads suffer from problems in regions comprising, e.g., multi-copy genes. We used single-molecule nanopore-based sequencing with extensive read lengths to reconstruct the organization of the major royal jelly protein (mrjp) region in three species of the genus Apis. Long-amplicon sequencing provides evidence for lineage-specific evolutionary fates of Apis mrjps. Whereas the most basal species, A. florea, seems to encode ten mrjps, different patterns of gene loss and retention were observed for A. mellifera and A. dorsata. Furthermore, we show that a previously reported pseudogene in A. mellifera, mrjp2-like, is an assembly artefact arising from short-read sequencing.
Affiliation(s)
- Sophie Helbing: Institut für Biologie, Molekulare Ökologie, Martin-Luther-Universität Halle-Wittenberg, Halle (Saale), Germany
- H Michael G Lattorff: Institut für Biologie, Molekulare Ökologie, Martin-Luther-Universität Halle-Wittenberg, Halle (Saale), Germany; German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Leipzig, Germany
- Robin F A Moritz: Institut für Biologie, Molekulare Ökologie, Martin-Luther-Universität Halle-Wittenberg, Halle (Saale), Germany; German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Leipzig, Germany; Department of Zoology and Entomology, University of Pretoria, Pretoria, South Africa
- Anja Buttstedt: Institut für Biologie, Molekulare Ökologie, Martin-Luther-Universität Halle-Wittenberg, Halle (Saale), Germany
|
13
|
Production of individualized V gene databases reveals high levels of immunoglobulin genetic diversity. Nat Commun 2016; 7:13642. [PMID: 27995928 PMCID: PMC5187446 DOI: 10.1038/ncomms13642] [Citation(s) in RCA: 138] [Impact Index Per Article: 17.3] [Received: 04/22/2016] [Accepted: 10/21/2016] [Indexed: 12/19/2022]
Abstract
Comprehensive knowledge of immunoglobulin genetics is required to advance our understanding of B cell biology. Validated immunoglobulin variable (V) gene databases are close to completion only for human and mouse. We present a novel computational approach, IgDiscover, that identifies germline V genes from expressed repertoires to a specificity of 100%. IgDiscover uses a cluster identification process to produce candidate sequences that, once filtered, result in individualized germline V gene databases. IgDiscover was tested in multiple species, validated by genomic cloning and cross-library comparisons, and produces comprehensive gene databases even where limited genomic sequence is available. IgDiscover analysis of the allelic content of Indian- and Chinese-origin rhesus macaques reveals high levels of immunoglobulin gene diversity in this species. Further, we describe a novel human IGHV3-21 allele and confirm significant gene differences between Balb/c and C57BL6 mouse strains, demonstrating the power of IgDiscover as a germline V gene discovery tool. Current databases of V genes for antibody repertoires have limitations. Here, Corcoran et al. develop a computational approach named IgDiscover that can identify germline V gene sequences from expressed antibody repertoires with high specificity and completeness.
|