1
McElhoe JA, Addesso A, Young B, Holland MM. A New Tool for Probabilistic Assessment of MPS Data Associated with mtDNA Mixtures. Genes (Basel) 2024; 15:194. [PMID: 38397184 PMCID: PMC10887502 DOI: 10.3390/genes15020194] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2023] [Revised: 01/24/2024] [Accepted: 01/26/2024] [Indexed: 02/25/2024] Open
Abstract
Mitochondrial (mt) DNA plays an important role in the fields of forensic and clinical genetics, molecular anthropology, and population genetics, with mixture interpretation being of particular interest in medical and forensic genetics. The high copy number, haploid state (only a single haplotype contributed per individual), high mutation rate, and well-known phylogeny of mtDNA make it an attractive marker for mixture deconvolution in damaged and low-quantity samples of all types. Given the desire to deconvolute mtDNA mixtures, the goals of this study were to (1) create a new software tool, MixtureAceMT™, to deconvolute mtDNA mixtures by assessing and combining two existing software tools, MixtureAce™ and Mixemt, (2) create a dataset of in silico MPS mixtures from whole mitogenome haplotypes representing a diverse set of population groups and consisting of two and three contributors at different dilution ratios, and (3) since amplicon-targeted sequencing is desirable and is a commonly used approach in forensic laboratories, create biological mixture data associated with two amplification kits, PowerSeq™ Whole Genome Mito (Promega™, Madison, WI, USA) and Precision ID mtDNA Whole Genome Panel (Thermo Fisher Scientific by AB™, Waltham, MA, USA), to further validate the software for use in forensic laboratories. MixtureAceMT™ provides a user-friendly interface while reducing confounding features such as NUMTs and noise and shortening traditionally prohibitive processing times. The new software was able to detect the correct contributing haplogroups and closely estimate contributor proportions in sequencing data generated from small amplicons for mixtures with minor contributions of ≥5%. A challenge of mixture deconvolution using small amplicon sequencing is the potential generation of spurious haplogroups resulting from private mutations that differ from Phylotree. MixtureAceMT™ was able to resolve these additional haplogroups by including known haplotype(s) in the evaluation.
In addition, for some samples, the inclusion of known haplotypes was also able to resolve trace contributors (minor contribution 1-2%), which remain challenging to resolve even with deep sequencing.
Affiliation(s)
- Jennifer A McElhoe
  - Forensic Science Program, Department of Biochemistry & Molecular Biology, Pennsylvania State University, University Park, PA 16802, USA
- Alyssa Addesso
  - Forensic Science Program, Department of Biochemistry & Molecular Biology, Pennsylvania State University, University Park, PA 16802, USA
- Brian Young
  - NicheVision LLC, 526 South Main St., Akron, OH 44311, USA
- Mitchell M Holland
  - Forensic Science Program, Department of Biochemistry & Molecular Biology, Pennsylvania State University, University Park, PA 16802, USA
2
Woerner AE, Crysup B, Hewitt FC, Gardner MW, Freitas MA, Budowle B. Techniques for estimating genetically variable peptides and semi-continuous likelihoods from massively parallel sequencing data. Forensic Sci Int Genet 2022; 59:102719. [DOI: 10.1016/j.fsigen.2022.102719] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/20/2022] [Revised: 04/25/2022] [Accepted: 05/01/2022] [Indexed: 11/25/2022]
3
Post hoc deconvolution of human mitochondrial DNA mixtures by EMMA 2 using fine-tuned Phylotree nomenclature. Comput Struct Biotechnol J 2022; 20:3630-3638. [PMID: 35860401 PMCID: PMC9283771 DOI: 10.1016/j.csbj.2022.06.053] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2022] [Revised: 06/24/2022] [Accepted: 06/25/2022] [Indexed: 11/23/2022] Open
Abstract
MtDNA mixtures are observed frequently and are difficult to deconvolute. Most previous methods require raw data or quantitative information. EMMA 2 produces valid splittings from consensus sequences of any sequencing technology. EMMA 2 can deconvolute two- and three-person mixtures in a fast and traceable way.
In this paper we present a new algorithm for splitting (partial) human mitogenomes into components with high similarity to haplogroup motifs of Phylotree. The algorithm reads a (partial) mitogenome coded by the differences to the reference (rCRS) and outputs the estimated haplogroups of the putative components. The algorithm requires no special information on the raw data of the sequencing process and is therefore suited for the post hoc analysis of mixtures of any sequencing technology. The software EMMA 2 implementing the algorithm will be made available via the EMPOP (https://empop.online) database and extends the nine-year-old software EMMA, which haplogroups single mitogenomes, to mixtures with at most three components.
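The idea of splitting a variant list into haplogroup-like components can be illustrated with a toy brute-force search. This is only a minimal sketch and not the EMMA 2 algorithm: the motif names (`HgA`, `HgB`, `HgC`), their variant sets, the cost function, and the helper names `split_cost` and `best_split` are all assumptions made up for illustration and do not come from Phylotree or from the paper.

```python
from itertools import combinations

# Hypothetical haplogroup motifs, coded as differences to the rCRS.
# These variant sets are invented for illustration; they are not Phylotree motifs.
MOTIFS = {
    "HgA": frozenset({"73G", "263G", "1438G"}),
    "HgB": frozenset({"73G", "150T", "16223T"}),
    "HgC": frozenset({"263G", "750G", "16519C"}),
}


def split_cost(observed, components):
    """Count observed variants left unexplained plus motif variants not observed."""
    explained = frozenset().union(*(MOTIFS[name] for name in components))
    return len(observed - explained) + len(explained - observed)


def best_split(observed, max_components=3):
    """Exhaustively search motif combinations and return the lowest-cost one."""
    observed = frozenset(observed)
    candidates = [
        combo
        for k in range(1, max_components + 1)
        for combo in combinations(sorted(MOTIFS), k)
    ]
    # Ties are broken in favour of fewer components, then alphabetical order.
    return min(candidates, key=lambda c: (split_cost(observed, c), len(c), c))


# A mixed profile given only as a list of differences to the reference.
mixture = {"73G", "263G", "1438G", "150T", "16223T"}
result = best_split(mixture)
print(result, split_cost(frozenset(mixture), result))
```

Like the approach described in the abstract, the sketch needs only the consensus variant list and no read-level or quantitative data. An exhaustive search is workable only for a handful of motifs; a real haplogroup tree would need a far more efficient search.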
4
Novroski NMM, Moo-Choy A, Wendt FR. Allele frequencies and minor contributor match statistic convergence using simulated population replicates. Int J Legal Med 2022; 136:1227-1235. [PMID: 35396663 DOI: 10.1007/s00414-022-02822-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2021] [Accepted: 03/30/2022] [Indexed: 10/18/2022]
Abstract
Probabilistic genotyping permits a comparison of forensic evidence given hypotheses regarding the origin of observed short tandem repeat alleles in a mixed DNA profile. Using the publicly available R package forensim, it has been proposed that mixtures with non-contributors from low genetic diversity populations are more likely to be mistakenly identified as contributors to a mixture than non-contributors from high genetic diversity populations. We hypothesized that these observations are attributed to the unique distribution of alleles in the reference population and may not generalize to other samplings of the same population. We used forensim to simulate 200 US populations (50 each of self-reported African-American, Asian-American, European-American, and Hispanic descent). We compared likelihood ratios for 2400 mixtures to those derived from published data and identified stark differences. A minimum of ten population replicates were required to reduce observed differences relative to published data. Deviations from Hardy-Weinberg equilibrium and allele frequency distributions suggest that simulated populations should be sufficiently evaluated for expectations of population genetic parameters prior to use in DNA mixture modeling experiments. Overall, our findings support the utility of forensim and further describe its suitability to model population genetic parameters but suggest that a single population replicate (directly ascertained or simulated) may be insufficient to make conclusions about a given DNA mixture.
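One way to screen a simulated population for deviations from Hardy-Weinberg equilibrium is a chi-square goodness-of-fit test on genotype counts. The sketch below is a simplified assumption and not the procedure used in the study or in forensim: it handles a single biallelic locus (STR loci are multiallelic), and the function name `hwe_chi_square` and the genotype counts are hypothetical.

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic for Hardy-Weinberg proportions at one biallelic locus."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p                        # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical genotype counts (AA, AB, BB) from two simulated replicates.
in_equilibrium = hwe_chi_square(25, 50, 25)
heterozygote_deficit = hwe_chi_square(40, 20, 40)
print(in_equilibrium, heterozygote_deficit)
```

With one degree of freedom, a statistic above 3.84 indicates a deviation at the 5% level, so the second replicate would be flagged before it is used in mixture modeling.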
Affiliation(s)
- Nicole M M Novroski
  - Forensic Science Program, Department of Anthropology, University of Toronto Mississauga, Mississauga, ON, L5L 1C6, Canada
- Ashley Moo-Choy
  - Forensic Science Program, Department of Anthropology, University of Toronto Mississauga, Mississauga, ON, L5L 1C6, Canada
- Frank R Wendt
  - Division of Human Genetics in Psychiatry, Yale School of Medicine & VA CT Healthcare System, New Haven, CT, 06516, USA
5
Sturk-Andreaggi K, Ring JD, Ameur A, Gyllensten U, Bodner M, Parson W, Marshall C, Allen M. The Value of Whole-Genome Sequencing for Mitochondrial DNA Population Studies: Strategies and Criteria for Extracting High-Quality Mitogenome Haplotypes. Int J Mol Sci 2022; 23:ijms23042244. [PMID: 35216360 PMCID: PMC8876724 DOI: 10.3390/ijms23042244] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/13/2022] [Revised: 02/08/2022] [Accepted: 02/09/2022] [Indexed: 02/04/2023] Open
Abstract
Whole-genome sequencing (WGS) data present a readily available resource for mitochondrial genome (mitogenome) haplotypes that can be utilized for genetics research including population studies. However, the reconstruction of the mitogenome is complicated by nuclear mitochondrial DNA (mtDNA) segments (NUMTs) that co-align with the mtDNA sequences and mimic authentic heteroplasmy. Two minimum variant detection thresholds, 5% and 10%, were assessed for the ability to produce authentic mitogenome haplotypes from a previously generated WGS dataset. Variants associated with NUMTs were detected in the mtDNA alignments for 91 of 917 (~10%) Swedish samples when the 5% frequency threshold was applied. The 413 observed NUMT variants were predominantly detected in two regions (nps 12,612–13,105 and 16,390–16,527), which were consistent with previously documented NUMTs. The number of NUMT variants was reduced by ~97% (400) using a 10% frequency threshold. Furthermore, the 5% frequency data were inconsistent with a platinum-quality mitogenome dataset with respect to observed heteroplasmy. These analyses illustrate that a 10% variant detection threshold may be necessary to ensure the generation of reliable mitogenome haplotypes from WGS data resources.
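The effect of a minimum variant detection threshold can be sketched as a simple filter on the fraction of reads supporting each variant. This is an illustrative assumption and not the authors' pipeline: the function name `filter_variants`, the record layout, and the example calls are hypothetical. The last line reproduces the reported reduction in NUMT variants from the figures given in the abstract.

```python
def filter_variants(variants, min_fraction):
    """Keep variants whose supporting read fraction meets the detection threshold."""
    return [v for v in variants if v["fraction"] >= min_fraction]


# Hypothetical calls: position, variant base, and fraction of reads supporting it.
calls = [
    {"pos": 263, "alt": "G", "fraction": 0.99},
    {"pos": 12684, "alt": "A", "fraction": 0.06},  # low-level, NUMT-like signal
    {"pos": 16519, "alt": "C", "fraction": 0.45},
]

at_5_percent = filter_variants(calls, 0.05)
at_10_percent = filter_variants(calls, 0.10)

# Arithmetic behind the reported reduction: 400 of 413 NUMT variants removed.
numt_reduction = 400 / 413

print(len(at_5_percent), len(at_10_percent), round(numt_reduction, 3))
```

In this toy input the 6% call survives the 5% threshold but is dropped at 10%, which mirrors how raising the threshold removes most low-level NUMT signal at the cost of also discarding any authentic heteroplasmy below 10%.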
Affiliation(s)
- Kimberly Sturk-Andreaggi
  - Department of Immunology Genetics and Pathology, Uppsala University, Uppsala 751 08, Sweden
  - Armed Forces Medical Examiner System’s Armed Forces DNA Identification Laboratory (AFMES-AFDIL), Dover Air Force Base, Dover, DE 19902, USA
  - SNA International, LLC, Alexandria, VA 22314, USA
  - Correspondence: K.S.-A.; M.A.
- Joseph D. Ring
  - Armed Forces Medical Examiner System’s Armed Forces DNA Identification Laboratory (AFMES-AFDIL), Dover Air Force Base, Dover, DE 19902, USA
  - SNA International, LLC, Alexandria, VA 22314, USA
- Adam Ameur
  - Department of Immunology Genetics and Pathology, Uppsala University, Uppsala 751 08, Sweden
- Ulf Gyllensten
  - Department of Immunology Genetics and Pathology, Uppsala University, Uppsala 751 08, Sweden
- Martin Bodner
  - Institute of Legal Medicine, Medical University of Innsbruck, Innsbruck 6020, Austria
- Walther Parson
  - Institute of Legal Medicine, Medical University of Innsbruck, Innsbruck 6020, Austria
  - Forensic Science Program, The Pennsylvania State University, University Park, PA 16801, USA
- Charla Marshall
  - Armed Forces Medical Examiner System’s Armed Forces DNA Identification Laboratory (AFMES-AFDIL), Dover Air Force Base, Dover, DE 19902, USA
  - SNA International, LLC, Alexandria, VA 22314, USA
  - Forensic Science Program, The Pennsylvania State University, University Park, PA 16801, USA
- Marie Allen
  - Department of Immunology Genetics and Pathology, Uppsala University, Uppsala 751 08, Sweden
  - Correspondence: K.S.-A.; M.A.
6
Mandape SN, Smart U, King JL, Muenzler M, Kapema KB, Budowle B, Woerner AE. MMDIT: A tool for the deconvolution and interpretation of mitochondrial DNA mixtures. Forensic Sci Int Genet 2021; 55:102568. [PMID: 34416654 DOI: 10.1016/j.fsigen.2021.102568] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 03/04/2021] [Revised: 06/22/2021] [Accepted: 08/03/2021] [Indexed: 01/01/2023]
Abstract
Short tandem repeats of the nuclear genome have been the preferred markers for analyzing forensic DNA mixtures. However, when nuclear DNA in a sample is degraded or limited, mitochondrial DNA (mtDNA) markers provide a powerful alternative. Though historically considered challenging, the interpretation and analysis of mtDNA mixtures have recently seen renewed interest with the advent of massively parallel sequencing. However, there are only a few software tools available for mtDNA mixture interpretation. To address this gap, the Mitochondrial Mixture Deconvolution and Interpretation Tool (MMDIT) was developed. MMDIT is an interactive application complete with a graphical user interface that allows users to deconvolve mtDNA (whole or partial genomes) mixtures into constituent donor haplotypes and estimate random match probabilities on these resultant haplotypes. In cases where deconvolution might not be feasible, the software allows mixture analysis directly within a binary framework (i.e. qualitatively, only using data on allele presence/absence). This paper explains the functionality of MMDIT, using an example of an in vitro two-person mtDNA mixture with a ratio of 1:4. The uniqueness of MMDIT lies in its ability to resolve mixtures into complete donor haplotypes using a statistical phasing framework before mixture analysis and evaluating statistical weights employing a novel graph algorithm approach. MMDIT is the first available open-source software that can automate mtDNA mixture deconvolution and analysis. The MMDIT web application can be accessed online at https://www.unthsc.edu/mmdit/. The source code is available at https://github.com/SammedMandape/MMDIT_UI and archived on zenodo (https://doi.org/10.5281/zenodo.4770184).
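A binary (presence/absence) mixture comparison can be sketched as an inclusion check: a haplotype is consistent with the mixture if its allele at every site was observed in the mixture. The code below is a minimal illustration under that assumption and is not MMDIT's implementation or its graph algorithm; the function name `is_included`, the mixture, and the small reference database are hypothetical.

```python
def is_included(haplotype, mixture_alleles):
    """True if, at every site, the haplotype's allele was observed in the mixture."""
    return all(
        allele in mixture_alleles.get(site, set())
        for site, allele in haplotype.items()
    )


# Hypothetical mixture: site -> set of alleles observed (presence/absence only).
mixture_alleles = {73: {"A", "G"}, 150: {"C", "T"}, 263: {"G"}}

# Hypothetical reference database of haplotypes (site -> allele).
database = [
    {73: "G", 150: "C", 263: "G"},
    {73: "A", 150: "T", 263: "G"},
    {73: "A", 150: "C", 263: "A"},
    {73: "G", 150: "T", 263: "A"},
]

included = [h for h in database if is_included(h, mixture_alleles)]
inclusion_fraction = len(included) / len(database)
print(len(included), inclusion_fraction)
```

The fraction of database haplotypes that cannot be excluded gives a rough, purely qualitative weight for the comparison. It ignores read depth and contributor proportions, which is why deconvolution into donor haplotypes is preferred when it is feasible.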
Affiliation(s)
- Sammed N Mandape
  - Center for Human Identification, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX 76107, USA
- Utpal Smart
  - Center for Human Identification, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX 76107, USA
- Jonathan L King
  - Center for Human Identification, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX 76107, USA
- Melissa Muenzler
  - Center for Human Identification, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX 76107, USA
- Kapema Bupe Kapema
  - Center for Human Identification, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX 76107, USA
- Bruce Budowle
  - Center for Human Identification, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX 76107, USA; Department of Microbiology, Immunology, and Genetics, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX 76107, USA
- August E Woerner
  - Center for Human Identification, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX 76107, USA; Department of Microbiology, Immunology, and Genetics, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX 76107, USA
7
Marshall C, Parson W. Interpreting NUMTs in forensic genetics: Seeing the forest for the trees. Forensic Sci Int Genet 2021; 53:102497. [PMID: 33740708 DOI: 10.1016/j.fsigen.2021.102497] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 01/29/2021] [Revised: 03/10/2021] [Accepted: 03/11/2021] [Indexed: 01/29/2023]
Abstract
Nuclear mitochondrial DNA (mtDNA) segments (NUMTs) were discovered shortly after sequencing the first human mitochondrial genome. They were earlier considered to represent archaic elements of ancient insertion events, but modern sequencing technologies and growing databases of mtDNA and NUMT sequences confirm that they are abundant and that some of them are phylogenetically young. Here, we build upon mtDNA/NUMT review articles published in the mid-2010s and focus on the distinction between NUMTs and other artefacts that can be observed in aligned sequence reads, such as mixtures (contamination), point heteroplasmy, sequencing error and cytosine deamination. We show practical examples of the effect of the mtDNA enrichment method on the representation of NUMTs in the mapped sequence data and discuss methods to bioinformatically filter NUMTs from mtDNA reads.
Affiliation(s)
- Charla Marshall
  - Armed Forces Medical Examiner System's Armed Forces DNA Identification Laboratory (AFMES-AFDIL), Dover Air Force Base, DE 19902, USA; SNA International, Contractor Supporting the AFMES-AFDIL, Alexandria, VA 22314, USA; Forensic Science Program, The Pennsylvania State University, University Park, PA 16802, USA
- Walther Parson
  - Forensic Science Program, The Pennsylvania State University, University Park, PA 16802, USA; Institute of Legal Medicine, Medical University of Innsbruck, 6020 Innsbruck, Austria