1
Gettings KB, Tillmar A, Sturk-Andreaggi K, Marshall C. Review of SNP assays for disaster victim identification: Cost, time, and performance information for decision-makers. J Forensic Sci 2024. [PMID: 39021258 DOI: 10.1111/1556-4029.15585] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2024] [Revised: 07/01/2024] [Accepted: 07/03/2024] [Indexed: 07/20/2024]
Abstract
In mass disaster events, forensic DNA laboratories may be called upon to quickly pivot their operations toward identifying bodies and reuniting remains with family members. Ideally, laboratories have considered this possibility in advance and have a plan in place. Compared with traditional short tandem repeat (STR) typing, single nucleotide polymorphisms (SNPs) may be better suited to these disaster victim identification (DVI) scenarios due to their small genomic target size, resulting in an improved success rate in degraded DNA samples. As the landscape of technology has shifted toward DNA sequencing, many forensic laboratories now have benchtop instruments available for massively parallel sequencing (MPS), facilitating this operational pivot from routine forensic STR casework to DVI SNP typing. Herein, we present the commercially available SNP sequencing assays amenable to DVI, we use data simulations to explore the potential for kinship prediction from SNP panels of varying sizes, and we give an example DVI scenario as context for presenting the matrix of considerations: kinship predictive potential, cost, and throughput of current SNP assay options. This information is intended to assist laboratories in choosing a SNP system for disaster preparedness.
Affiliation(s)
- Andreas Tillmar
  - Department of Forensic Genetics and Forensic Toxicology, National Board of Forensic Medicine, Linköping, Sweden
  - Department of Biomedical and Clinical Sciences, Faculty of Medicine and Health Sciences, Linköping University, Linköping, Sweden
- Kimberly Sturk-Andreaggi
  - Armed Forces Medical Examiner System's Armed Forces DNA Identification Laboratory (AFMES-AFDIL), 10 Defense Health Agency, Dover Air Force Base, Dover, Delaware, USA
  - SNA International, LLC (Contractor Supporting the AFMES-AFDIL), Alexandria, Virginia, USA
- Charla Marshall
  - Armed Forces Medical Examiner System's Armed Forces DNA Identification Laboratory (AFMES-AFDIL), 10 Defense Health Agency, Dover Air Force Base, Dover, Delaware, USA
  - Forensic Science Program, The Pennsylvania State University, State College, Pennsylvania, USA
2
Guo LL, Yuan JH, Zhang C, Zhao J, Yao YR, Guo KL, Meng Y, Ji AQ, Kang KL, Wang L. Developmental validation of the STRSeqTyper122 kit for massively parallel sequencing of forensic STRs. Int J Legal Med 2024; 138:1255-1264. [PMID: 38416217 DOI: 10.1007/s00414-024-03195-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2023] [Accepted: 02/09/2024] [Indexed: 02/29/2024]
Abstract
Massively parallel sequencing allows for integrated genotyping of different types of forensic markers, which reduces DNA consumption, simplifies experimental processes, and provides additional sequence-based genetic information. The STRseqTyper122 kit genotypes 63 autosomal STRs, 16 X-STRs, 42 Y-STRs, and the Amelogenin locus. Amplicon sizes of 117 loci were below 300 bp. In this study, MiSeq FGx sequencing metrics for STRseqTyper122 were presented. The genotyping accuracy of this kit was examined by comparing to certified genotypes of NIST standard reference materials and results from five capillary electrophoresis-based kits. The sensitivity of STRseqTyper122 reached 125 pg, and > 80% of the loci were correctly called with 62.5 pg and 31.25 pg input genomic DNA. Repeatability, species specificity, and tolerance for DNA degradation and PCR inhibitors of this kit were also evaluated. STRseqTyper122 demonstrated reliable performance with routine case-work samples and provided a powerful tool for forensic applications.
Affiliation(s)
- Li-Liang Guo
  - Key Laboratory of Forensic Genetics of Ministry of Public Security, Institute of Forensic Science, Ministry of Public Security, Beijing, 100038, China
- Jia-Hui Yuan
  - School of Forensic Medicine, Kunming Medical University, Kunming, 650500, China
- Chi Zhang
  - Key Laboratory of Forensic Genetics of Ministry of Public Security, Institute of Forensic Science, Ministry of Public Security, Beijing, 100038, China
- Jie Zhao
  - Key Laboratory of Forensic Genetics of Ministry of Public Security, Institute of Forensic Science, Ministry of Public Security, Beijing, 100038, China
- Yi-Ren Yao
  - Key Laboratory of Forensic Genetics of Ministry of Public Security, Institute of Forensic Science, Ministry of Public Security, Beijing, 100038, China
- Ke-Li Guo
  - Key Laboratory of Forensic Genetics of Ministry of Public Security, Institute of Forensic Science, Ministry of Public Security, Beijing, 100038, China
- Yang Meng
  - Key Laboratory of Forensic Genetics of Ministry of Public Security, Institute of Forensic Science, Ministry of Public Security, Beijing, 100038, China
- An-Quan Ji
  - Key Laboratory of Forensic Genetics of Ministry of Public Security, Institute of Forensic Science, Ministry of Public Security, Beijing, 100038, China
- Ke-Lai Kang
  - Key Laboratory of Forensic Genetics of Ministry of Public Security, Institute of Forensic Science, Ministry of Public Security, Beijing, 100038, China
- Le Wang
  - Key Laboratory of Forensic Genetics of Ministry of Public Security, Institute of Forensic Science, Ministry of Public Security, Beijing, 100038, China
  - School of Forensic Medicine, Kunming Medical University, Kunming, 650500, China
3
Sidstedt M, Gynnå AH, Kiesler KM, Jansson L, Steffen CR, Håkansson J, Johansson G, Österlund T, Bogestål Y, Tillmar A, Rådström P, Ståhlberg A, Vallone PM, Hedman J. Ultrasensitive sequencing of STR markers utilizing unique molecular identifiers and the SiMSen-Seq method. Forensic Sci Int Genet 2024; 71:103047. [PMID: 38598919 DOI: 10.1016/j.fsigen.2024.103047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2023] [Revised: 03/27/2024] [Accepted: 04/01/2024] [Indexed: 04/12/2024]
Abstract
Massively parallel sequencing (MPS) is increasingly applied in forensic short tandem repeat (STR) analysis. The presence of stutter artefacts and other PCR or sequencing errors in the MPS-STR data partly limits the detection of low DNA amounts, e.g., in complex mixtures. Unique molecular identifiers (UMIs) have been applied in several scientific fields to reduce noise in sequencing. UMIs consist of a stretch of random nucleotides, a unique barcode for each starting DNA molecule, that is incorporated in the DNA template using either ligation or PCR. The barcode is used to generate consensus reads, thus removing errors. The SiMSen-Seq (Simple, multiplexed, PCR-based barcoding of DNA for sensitive mutation detection using sequencing) method relies on PCR-based introduction of UMIs and includes a sophisticated hairpin design to reduce unspecific primer binding as well as PCR protocol adjustments to further optimize the reaction. In this study, SiMSen-Seq is applied to develop a proof-of-concept seven STR multiplex for MPS library preparation and an associated bioinformatics pipeline. Additionally, machine learning (ML) models were evaluated to further improve UMI allele calling. Overall, the seven STR multiplex resulted in complete detection and concordant alleles for 47 single-source samples at 1 ng input DNA as well as for low-template samples at 62.5 pg input DNA. For twelve challenging mixtures with minor contributions of 10 pg to 150 pg and ratios of 1-15% relative to the major donor, 99.2% of the expected alleles were detected by applying the UMIs in combination with an ML filter. The main impact of UMIs was a substantially lowered number of artefacts as well as reduced stutter ratios, which were generally below 5% of the parental allele. In conclusion, UMI-based STR sequencing opens new means for improved analysis of challenging crime scene samples including complex mixtures.
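The consensus step described in this abstract (reads sharing a UMI are collapsed into one error-reduced read) can be illustrated with a minimal sketch. It assumes reads have already been parsed into (UMI, sequence) pairs; the function name `umi_consensus`, the thresholds `min_family_size` and `min_agreement`, and the whole-sequence majority vote are illustrative assumptions, not the SiMSen-Seq bioinformatics pipeline.

```python
from collections import Counter, defaultdict

def umi_consensus(reads, min_family_size=3, min_agreement=0.6):
    """Collapse reads sharing a UMI into one consensus sequence per UMI.

    reads: iterable of (umi, sequence) tuples (hypothetical input format).
    A UMI family is kept only if it holds at least min_family_size reads
    and its most common sequence makes up at least min_agreement of them.
    """
    families = defaultdict(list)
    for umi, seq in reads:
        families[umi].append(seq)

    consensus = {}
    for umi, seqs in families.items():
        if len(seqs) < min_family_size:
            continue  # too few reads to trust this starting molecule
        top_seq, top_count = Counter(seqs).most_common(1)[0]
        if top_count / len(seqs) >= min_agreement:
            consensus[umi] = top_seq
    return consensus

# Toy data: one family contains a shorter (stutter-like) read,
# and one UMI is seen only once.
reads = [
    ("AACG", "TCTATCTATCTA"), ("AACG", "TCTATCTATCTA"), ("AACG", "TCTATCTA"),
    ("GGTC", "TCTATCTATCTA"), ("GGTC", "TCTATCTATCTA"), ("GGTC", "TCTATCTATCTA"),
    ("TTAG", "TCTATCTA"),
]
result = umi_consensus(reads)
```

In this toy input the stutter-like read is outvoted within its family and the singleton family is discarded, which is the mechanism by which UMIs lower artefact counts.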
Affiliation(s)
- Maja Sidstedt
  - National Forensic Centre, Swedish Police Authority, Linköping SE-581 94, Sweden
- Arvid H Gynnå
  - National Forensic Centre, Swedish Police Authority, Linköping SE-581 94, Sweden
- Kevin M Kiesler
  - National Institute of Standards and Technology, 100 Bureau Drive, M/S 8314, Gaithersburg, MD 20899, USA
- Linda Jansson
  - National Forensic Centre, Swedish Police Authority, Linköping SE-581 94, Sweden; Applied Microbiology, Department of Chemistry, Lund University, Lund SE-221 00, Sweden
- Carolyn R Steffen
  - National Institute of Standards and Technology, 100 Bureau Drive, M/S 8314, Gaithersburg, MD 20899, USA
- Joakim Håkansson
  - RISE Unit of Biological Function, Division Materials and Production, Box 857, Borås SE-501 15, Sweden; Department of Laboratory Medicine, Institute of Biomedicine, University of Gothenburg, Gothenburg SE-405 30, Sweden; Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg SE-405 30, Sweden
- Gustav Johansson
  - SIMSEN Diagnostics, Sahlgrenska Science Park, Gothenburg, Sweden
- Tobias Österlund
  - Department of Laboratory Medicine, Sahlgrenska Center for Cancer Research, Institute of Biomedicine, Sahlgrenska Academy, University of Gothenburg, Medicinaregatan 1F, Gothenburg 41390, Sweden; Wallenberg Center for Molecular and Translational Medicine, University of Gothenburg, Gothenburg 41390, Sweden; Department of Clinical Genetics and Genomics, Sahlgrenska University Hospital, Gothenburg, Region Västra Götaland 41390, Sweden
- Yalda Bogestål
  - RISE Unit of Biological Function, Division Materials and Production, Box 857, Borås SE-501 15, Sweden
- Andreas Tillmar
  - Department of Forensic Genetics and Forensic Toxicology, National Board of Forensic Medicine, Linköping SE-587 58, Sweden
- Peter Rådström
  - Applied Microbiology, Department of Chemistry, Lund University, Lund SE-221 00, Sweden
- Anders Ståhlberg
  - Department of Laboratory Medicine, Sahlgrenska Center for Cancer Research, Institute of Biomedicine, Sahlgrenska Academy, University of Gothenburg, Medicinaregatan 1F, Gothenburg 41390, Sweden; Wallenberg Center for Molecular and Translational Medicine, University of Gothenburg, Gothenburg 41390, Sweden; Department of Clinical Genetics and Genomics, Sahlgrenska University Hospital, Gothenburg, Region Västra Götaland 41390, Sweden
- Peter M Vallone
  - National Institute of Standards and Technology, 100 Bureau Drive, M/S 8314, Gaithersburg, MD 20899, USA
- Johannes Hedman
  - National Forensic Centre, Swedish Police Authority, Linköping SE-581 94, Sweden; Applied Microbiology, Department of Chemistry, Lund University, Lund SE-221 00, Sweden
4
Agudo MM, Aanes H, Albert M, Janssen K, Gill P, Bleka Ø. An overview of autosomal STRs and identity SNPs in a Norwegian population using massively parallel sequencing. Forensic Sci Int Genet 2024; 71:103057. [PMID: 38733649 DOI: 10.1016/j.fsigen.2024.103057] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2023] [Revised: 02/27/2024] [Accepted: 04/28/2024] [Indexed: 05/13/2024]
Abstract
In recent years, probabilistic genotyping software has been adapted for the analysis of massively parallel sequencing (MPS) forensic data. Likelihood ratios (LR) are based on allele frequencies selected from populations of interest. This study provides an outline of sequence-based (SB) allele frequencies for autosomal short tandem repeats (aSTRs) and identity single nucleotide polymorphisms (iSNPs) in 371 individuals from Southern Norway. 27 aSTRs and 94 iSNPs were previously analysed with the ForenSeq™ DNA Signature Prep Kit (Verogen). The number of alleles with frequencies less than 0.05 for sequence-based alleles was 4.6 times higher than for length-based alleles. Consistent with previous studies, it was observed that sequence-based data (both with and without flanks) exhibited higher allele diversity compared to length-based (LB) data; random match probabilities were lower for SB alleles confirming their advantage to discriminate between individuals. Two alleles in markers D22S1045 and Penta D were observed with SNPs in the 3′ flanking region, which have not been reported before. Also, a novel SNP with a minor allele frequency (MAF) of 0.001 was found in marker TH01. The impact of the sample size on MAF values was studied in 88 iSNPs from Southern Norway (n = 371). The findings were then compared to a larger Norwegian population dataset (n = 15,769). The results showed that the smaller Southern Norway dataset provided similar results, and it was a representative sample. Population structure was analyzed for regions within Southern Norway; FST estimates for aSTRs and iSNPs did not indicate any genetic structure. Finally, we investigated the genetic differences between Southern Norway and two other populations: Northern Norway and Denmark. Allele frequencies between these populations were compared, and we found no significant frequency differences (p-values > 0.0001). We also calculated the pairwise FST values per marker and comparisons between Southern and Northern Norway showed small differences. In contrast, the comparisons between Southern Norway and Denmark showed higher FST values for some markers, possibly driven by distinct alleles that were present in only one of the populations. In summary, we propose that allele frequencies from each population considered in this study could be used interchangeably to calculate genotype probabilities.
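The abstract's observation that random match probabilities drop when alleles are resolved by sequence rather than length can be shown with a small worked sketch. It assumes Hardy-Weinberg equilibrium at a single locus; the allele labels and frequencies are invented for illustration and are not taken from the Norwegian dataset.

```python
from itertools import combinations

def match_probability(allele_freqs):
    """Per-locus random match probability under Hardy-Weinberg equilibrium:
    the sum of squared genotype frequencies."""
    freqs = list(allele_freqs.values())
    homozygotes = sum((p * p) ** 2 for p in freqs)
    heterozygotes = sum((2 * p * q) ** 2 for p, q in combinations(freqs, 2))
    return homozygotes + heterozygotes

# Hypothetical locus: length-based allele "10" is split into two
# sequence variants ("10a", "10b") once the repeat region is sequenced.
length_based = {"9": 0.30, "10": 0.40, "11": 0.30}
sequence_based = {"9": 0.30, "10a": 0.25, "10b": 0.15, "11": 0.30}

pm_lb = match_probability(length_based)    # about 0.189
pm_sb = match_probability(sequence_based)  # about 0.120
```

Splitting one length-based allele into two sequence-based alleles leaves the frequencies summing to 1 but spreads them over more genotypes, so the chance that two unrelated individuals share a genotype falls.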
Affiliation(s)
- Maria Martin Agudo
  - Department of Forensic Sciences, Oslo University Hospital, Oslo, Norway; Department of Forensic Medicine, Institute of Clinical Medicine, University of Oslo, Oslo, Norway
- Håvard Aanes
  - Department of Forensic Sciences, Oslo University Hospital, Oslo, Norway
- Michel Albert
  - Department of Forensic Sciences, Oslo University Hospital, Oslo, Norway
- Kirstin Janssen
  - Centre for Forensic Genetics, UiT The Arctic University of Norway, Norway
- Peter Gill
  - Department of Forensic Sciences, Oslo University Hospital, Oslo, Norway; Department of Forensic Medicine, Institute of Clinical Medicine, University of Oslo, Oslo, Norway
- Øyvind Bleka
  - Department of Forensic Sciences, Oslo University Hospital, Oslo, Norway
5
Liu YY, Cheng K, Just R, Enke S, Bright JA. Sequencing-induced artefacts in NGS STR data. Forensic Sci Int Genet 2024; 72:103086. [PMID: 38897164 DOI: 10.1016/j.fsigen.2024.103086] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2024] [Revised: 06/10/2024] [Accepted: 06/13/2024] [Indexed: 06/21/2024]
Abstract
Significant progress has been made in recent years in the development of techniques for Next Generation Sequencing (NGS), or Massively Parallel Sequencing (MPS), of forensically relevant short tandem repeat (STR) loci. However, as these technologies are investigated and adopted by forensic laboratories, new challenges unfold that require further scrutiny. In the analysis of DNA profiles generated using the MiSeq FGx sequencing system, we have observed noise sequences with relatively high read counts that are challenging to distinguish from genuine alleles. These high-read-count noise sequences appear as allele sequences with one or a few substituted bases compared to a known allele sequence within the profile. An examination of ForenSeq DNA Signature Prep Kit STR noise sequences revealed that the substituted base of a parent allele can align to the same position on the sequence across noise sequences. This suggests that these substitution events occur at specific positions within the amplicon, resulting in multiple noise reads with substitutions at the same position. Mapping of the noise events onto the original raw read positions revealed a high number of events, or "noise spikes", occurring at specific positions within a given sequencing run. These noise spikes affected reads across the entire run, agnostic of locus or sample, while the position, occurrence, and amplitude of the spikes differed across runs. The majority of noise sequences with high read counts in a DNA profile were generated from base changes at these spike positions, and could be classified as "noise spike artefacts". In this paper we present evidence of the noise spike artefacts and their genesis during the sequencing process in the sequencing-by-synthesis (SBS) cycles, as well as the methods developed to detect them. The information and methods will assist laboratories with detecting noise spikes in MiSeq FGx sequencing runs, differentiating authentic allele sequences from noise spike artefacts, and developing protocols for analyst review and handling of MiSeq FGx data.
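The idea of mapping substitution events back to raw read positions and looking for run-wide spikes can be sketched as follows. This is not the authors' detection method, which the abstract does not specify; the function name `find_noise_spikes`, the median baseline, and the `fold` threshold are assumptions chosen only to illustrate the concept.

```python
from collections import Counter
from statistics import median

def find_noise_spikes(substitution_cycles, n_cycles, fold=5.0):
    """Flag sequencing cycles with an unusually high number of substitutions.

    substitution_cycles: one entry per observed substitution event, giving
        the 1-based read position (cycle) at which it occurred, pooled
        across all loci and samples in a run (hypothetical input format).
    n_cycles: read length of the run.
    fold: a cycle is flagged when its count is at least fold times the
        median per-cycle count (an assumed, illustrative rule).
    """
    counts = Counter(substitution_cycles)
    per_cycle = [counts.get(c, 0) for c in range(1, n_cycles + 1)]
    baseline = max(median(per_cycle), 1)  # avoid a zero threshold
    return [c for c, n in enumerate(per_cycle, start=1) if n >= fold * baseline]

# Toy run: 2 background events at each of 20 cycles, plus a spike at cycle 7.
events = [c for c in range(1, 21) for _ in range(2)] + [7] * 38
spikes = find_noise_spikes(events, n_cycles=20)
```

Because events are pooled across loci and samples, a flagged cycle reflects a run-level effect rather than a property of any one amplicon, matching the abstract's description of spikes as agnostic of locus or sample.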
Affiliation(s)
- Yao-Yuan Liu
  - ESR Limited, Private Bag 92021, Auckland, New Zealand
- Kevin Cheng
  - ESR Limited, Private Bag 92021, Auckland, New Zealand
- Rebecca Just
  - National Bioforensic Analysis Center, National Biodefense Analysis and Countermeasures Center, 8300 Research Plaza, Fort Detrick, MD, United States
- Sana Enke
  - National Bioforensic Analysis Center, National Biodefense Analysis and Countermeasures Center, 8300 Research Plaza, Fort Detrick, MD, United States
6
Liu Z, Wu E, Li R, Liu J, Zang Y, Cong B, Wu R, Xie B, Sun H. Improved individual identification in DNA mixtures of unrelated or related contributors through massively parallel sequencing. Forensic Sci Int Genet 2024; 72:103078. [PMID: 38889491 DOI: 10.1016/j.fsigen.2024.103078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2023] [Revised: 06/07/2024] [Accepted: 06/11/2024] [Indexed: 06/20/2024]
Abstract
DNA mixtures are a common sample type in forensic genetics, and we typically assume that contributors to the mixture are unrelated when calculating the likelihood ratio (LR). However, scenarios involving mixtures with related contributors, such as in family murder or incest cases, can also be encountered. Compared to the mixtures with unrelated contributors, the kinship within the mixture would bring additional challenges for the inference of the number of contributors (NOC) and the construction of probabilistic genotyping models. To evaluate the influence of potential kinship on the individual identification of the person of interest (POI), we conducted simulations of two-person (2P) and three-person (3P) DNA mixtures containing unrelated or related contributors (parent-child, full-sibling, and uncle-nephew) at different mixing ratios (for 2P: 1:1, 4:1, 9:1, and 19:1; for 3P: 1:1:1, 2:1:1, 5:4:1, and 10:5:1), and performed massively parallel sequencing (MPS) using the MGIEasy Signature Identification Library Prep Kit on the MGI platform. In addition, in silico simulations of mixtures with unrelated and related contributors were also performed. In this study, we evaluated 1) the MPS performance; 2) the influence of multiple genetic markers on determining the presence of related contributors and inferring the NOC within the mixture; 3) the probability distribution of MAC (maximum allele count) and TAC (total allele count) based on in silico mixture profiles; 4) trends in LR values with and without considering kinship in mixtures with related and unrelated contributors; 5) trends in LR values with length- and sequence-based STR genotypes. Results indicated that multiple numbers and types of genetic markers positively influenced kinship and NOC inference in a mixture. The LR values of POI were strongly dependent on the mixing ratio. Non- and correct-kinship hypotheses essentially did not affect the individual identification of the major POI; the correct kinship hypothesis yielded more conservative LR values; the incorrect kinship hypothesis did not necessarily lead to the failure of POI individual identification. However, it is noteworthy that these considerations could lead to uncertain outcomes in the identification of minor contributors. Compared to length-based STR genotyping, using sequence-based STR genotypes increases the individual identification power of the POI, concurrently improving the accuracy of mixing ratio inference using EuroForMix. In conclusion, the MGIEasy Signature Identification Library Prep Kit demonstrated robust individual identification power and is a viable MPS panel for forensic DNA mixture interpretations, whether involving unrelated or related contributors.
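The two summary statistics named in this abstract, MAC (maximum allele count) and TAC (total allele count), are simple to compute from a mixture profile. The sketch below assumes a profile stored as a mapping from locus to the set of distinct alleles detected; the locus data are invented, and the helper names are hypothetical. The lower bound on NOC uses the standard argument that each contributor carries at most two alleles at an autosomal locus.

```python
from math import ceil

def mac_tac(profile):
    """Return (MAC, TAC) for a mixture profile.

    profile: dict mapping locus name -> set of distinct alleles observed
    (hypothetical input format).
    MAC is the largest allele count at any single locus; TAC is the sum
    of allele counts over all loci.
    """
    counts = [len(alleles) for alleles in profile.values()]
    return max(counts), sum(counts)

def min_contributors(mac):
    """Lower bound on the number of contributors for autosomal loci:
    each diploid contributor adds at most two alleles per locus."""
    return ceil(mac / 2)

# Invented three-locus mixture profile.
mixture = {
    "D3S1358": {"15", "16", "17"},
    "TH01": {"6", "7", "9", "9.3"},
    "vWA": {"16", "18"},
}
mac, tac = mac_tac(mixture)
noc_lower_bound = min_contributors(mac)
```

Related contributors share alleles more often than unrelated ones, which lowers both MAC and TAC for a given NOC; this is why kinship within a mixture makes NOC inference harder.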
Affiliation(s)
- Zhiyong Liu
  - Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, China; Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-sen University, Guangzhou 510080, China
- Enlin Wu
  - Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, China; Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-sen University, Guangzhou 510080, China
- Ran Li
  - Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, China; Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-sen University, Guangzhou 510080, China; School of Medicine, Jiaying University, Meizhou 514015, China
- Jiajun Liu
  - Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, China; Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-sen University, Guangzhou 510080, China
- Yu Zang
  - Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, China; Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-sen University, Guangzhou 510080, China
- Bin Cong
  - College of Forensic Medicine, Hebei Medical University, Hebei Key Laboratory of Forensic Medicine, Shijiazhuang 050017, China
- Riga Wu
  - Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, China; Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-sen University, Guangzhou 510080, China
- Bo Xie
  - Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, China; Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-sen University, Guangzhou 510080, China
- Hongyu Sun
  - Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, China; Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-sen University, Guangzhou 510080, China
7
Wang X, Muenzler M, King J, Liu M, Li H, Budowle B, Ge J. A complete pipeline enables haplotyping and phasing macrohaplotype in long sequencing reads for polyploidy samples and a multi-source DNA mixture. Electrophoresis 2024; 45:877-884. [PMID: 38196015 DOI: 10.1002/elps.202300143] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2023] [Revised: 11/19/2023] [Accepted: 11/30/2023] [Indexed: 01/11/2024]
Abstract
Macrohaplotype combines multiple types of phased DNA variants, increasing forensic discrimination power. High-quality long-sequencing reads, for example, PacBio HiFi reads, provide data to detect macrohaplotypes in multiploidy and DNA mixtures. However, the bioinformatics tools for detecting macrohaplotypes are lacking. In this study, we developed a bioinformatics software, MacroHapCaller, in which targeted loci (i.e., short TRs [STRs], single nucleotide polymorphisms, and insertion and deletions) are genotyped and combined with novel algorithms to call macrohaplotypes from long reads. MacroHapCaller uses physical phasing (i.e., read-backed phasing) to identify macrohaplotypes, and thus it can detect multi-allelic macrohaplotypes for a given sample. MacroHapCaller was validated with data generated from our designed targeted PacBio HiFi sequencing pipeline, which sequenced ∼8-kb amplicon regions harboring 20 core forensic STR loci in human benchmark samples HG002 and HG003. MacroHapCaller also was validated in whole-genome long-read sequencing data. Robust and accurate genotyping and phased macrohaplotypes were obtained with MacroHapCaller compared with the known ground truth. MacroHapCaller achieved a higher or consistent genotyping accuracy and faster speed than existing tools HipSTR and DeepVar. MacroHapCaller enables efficient macrohaplotype analysis from high-throughput sequencing data and supports applications using discriminating macrohaplotypes.
Affiliation(s)
- Xuewen Wang
  - Health Science Center, University of North Texas, Fort Worth, Texas, USA
- Melissa Muenzler
  - Health Science Center, University of North Texas, Fort Worth, Texas, USA
- Jonathan King
  - Health Science Center, University of North Texas, Fort Worth, Texas, USA
- Muyi Liu
  - Health Science Center, University of North Texas, Fort Worth, Texas, USA
- Hongmin Li
  - College of Science, Cal State East Bay, Hayward, California, USA
- Bruce Budowle
  - Department of Forensic Medicine, University of Helsinki, Helsinki, Finland
  - Forensic Science Institute, Radford University, Radford, Virginia, USA
- Jianye Ge
  - Health Science Center, University of North Texas, Fort Worth, Texas, USA
8
Feng Y, Zhao Y, Lu X, Li H, Zhao K, Shi M, Wen S. Forensic analysis and sequence variation of 133 STRs in the Hakka population. Front Genet 2024; 15:1347868. [PMID: 38317659 PMCID: PMC10839782 DOI: 10.3389/fgene.2024.1347868] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2023] [Accepted: 01/05/2024] [Indexed: 02/07/2024] Open
Abstract
Introduction: Short Tandem Repeats (STRs) are highly valuable genetic markers in forensic science. However, the conventional PCR-CE technique has limitations, and the emergence of massively parallel sequencing (MPS) technology presents new opportunities for STR analysis. Yet, there is limited research on Chinese population diversity using MPS. Methods: In this study, we obtained genotype data for 52 A-STRs and 81 Y-STRs from the Hakka population in Meizhou, Guangdong, China, using the Forensic Analysis System Multiplecues SetB Kit on the MGISEQ-2000 platform. Results: Our findings demonstrate that these 133 STRs are highly efficient for forensic applications within the Meizhou Hakka population. Statistical analysis revealed Hobs values ranging from 0.61306 to 0.91083 and Hexp values ranging from 0.59156 to 0.91497 for A-STRs based on length polymorphism. For sequence polymorphism, Hobs values ranged from 0.61306 to 0.94586, and Hexp values fluctuated between 0.59156 and 0.94487. The CPE values were 1-5.0779620E-21 and 1-3.257436E-24 for length and sequence polymorphism, respectively, while the CPD values were 1-1.727007E-59 and 1-5.517015E-66, respectively. Among the 80 Y-STR loci, the HD values for length and sequence polymorphism were 0.99764282 and 0.99894195, respectively. The HMP values stood at 0.00418102 and 0.00288427, respectively, and the DC values were 0.75502742 and 0.83363803, respectively. For the 52 A-STR loci, we identified 554 and 989 distinct alleles based on length and sequence polymorphisms, respectively. For the 81 Y-STR loci, 464 and 652 unique alleles were detected at the length and sequence level, respectively. Population genetic analysis revealed that the Meizhou Hakka population has a close kinship relationship with the Asian populations THI and KOR based on length polymorphism data of A-STRs. Conversely, based on length polymorphism data of Y-STRs, the Meizhou Hakka population has the closest kinship relationship with the Henan Han population. Discussion: Overall, the variation information of repeat region sequences significantly enhances the forensic identification efficacy of STR genetic markers, providing an essential database for forensic individual and paternity testing in this region. Additionally, the data generated by our study will serve as a vital resource for research into the genetic structure and historical origins of the Meizhou Hakka population.
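For readers unfamiliar with the parameters reported here, the sketch below shows how observed and expected heterozygosity (Hobs, Hexp) for an autosomal locus, and haplotype diversity (HD), haplotype match probability (HMP), and discrimination capacity (DC) for Y-STR haplotypes, are commonly computed. The formulas are the standard textbook definitions, with Hexp in its uncorrected form; whether the authors applied a sample-size correction is not stated in the abstract. All input data are invented.

```python
from collections import Counter

def heterozygosity(genotypes):
    """Return (Hobs, Hexp) for one autosomal locus.

    genotypes: list of (allele1, allele2) tuples, one per individual.
    Hexp is 1 minus the sum of squared allele frequencies (uncorrected).
    """
    n = len(genotypes)
    h_obs = sum(a != b for a, b in genotypes) / n
    allele_counts = Counter(allele for g in genotypes for allele in g)
    total = 2 * n
    h_exp = 1 - sum((c / total) ** 2 for c in allele_counts.values())
    return h_obs, h_exp

def haplotype_stats(haplotypes):
    """Return (HD, HMP, DC) for a list of haplotypes (n must exceed 1).

    HMP = sum of squared haplotype frequencies
    HD  = n / (n - 1) * (1 - HMP)
    DC  = number of distinct haplotypes / n
    """
    n = len(haplotypes)
    counts = Counter(haplotypes)
    hmp = sum((c / n) ** 2 for c in counts.values())
    hd = n / (n - 1) * (1 - hmp)
    dc = len(counts) / n
    return hd, hmp, dc

# Invented data: four genotypes at one locus, five Y haplotypes.
genotypes = [("10", "11"), ("10", "10"), ("11", "12"), ("10", "12")]
h_obs, h_exp = heterozygosity(genotypes)

haplotypes = ["H1", "H1", "H2", "H3", "H4"]
hd, hmp, dc = haplotype_stats(haplotypes)
```

Sequence-level typing can only split existing length-based alleles or haplotypes, so it tends to raise Hexp, HD, and DC and lower HMP, which is the direction of change reported in the abstract.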
Affiliation(s)
- Yuhang Feng
  - MOE Key Laboratory of Contemporary Anthropology, School of Life Sciences, Fudan University, Shanghai, China
- Yutao Zhao
  - Public Security Bureau of Zhaoqing Municipality, Zhaoqing, China
- Xiaoyu Lu
  - Deepreads Biotech Company Limited, Guangzhou, China
- Haiyan Li
  - Criminal Technology Center of Guangdong Provincial Public Security Department, Guangzhou, China
- Kai Zhao
  - Criminal Technology Center of Guangdong Provincial Public Security Department, Guangzhou, China
- Meisen Shi
  - Criminal Justice College of China University of Political Science and Law, Beijing, China
- Shaoqing Wen
  - MOE Key Laboratory of Contemporary Anthropology, School of Life Sciences, Fudan University, Shanghai, China
  - Institute of Archaeological Science, Fudan University, Shanghai, China
  - MOE Laboratory for National Development and Intelligent Governance, Fudan University, Shanghai, China
9
Wang X, Huang M, Budowle B, Ge J. TRcaller: a novel tool for precise and ultrafast tandem repeat variant genotyping in massively parallel sequencing reads. Front Genet 2023; 14:1227176. [PMID: 37533432 PMCID: PMC10390829 DOI: 10.3389/fgene.2023.1227176] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 05/22/2023] [Accepted: 06/13/2023] [Indexed: 08/04/2023] Open
Abstract
Calling tandem repeat (TR) variants from DNA sequences is of both theoretical and practical significance. Some bioinformatics tools have been developed for detecting or genotyping TRs. However, few studies have addressed genotyping TR alleles from long-read sequencing data, and the accuracy of genotyping TR alleles from next-generation sequencing data still needs to be improved. Herein, a novel algorithm is described to retrieve TR regions from sequence alignment, and a software program, TRcaller, has been developed and integrated into a web portal to call TR alleles from both short- and long-read sequences, both whole genome and targeted sequences generated from multiple sequencing platforms. All TR alleles are genotyped as haplotypes, and robust alleles are reported, including multiple alleles in a DNA mixture. TRcaller could provide substantially higher accuracy (>99% in 289 human individuals) in detecting TR alleles and runs orders of magnitude faster (e.g., ∼2 s for 300x human sequence data) than the mainstream software tools. The web portal offers 119 preselected TR loci relevant to forensics, genealogy, and disease. TRcaller is validated to be scalable in various applications, such as DNA forensics and disease diagnosis, and can be expanded into other fields like breeding programs. Availability: TRcaller is available at https://www.trcaller.com/SignIn.aspx.
Collapse
Affiliation(s)
- Xuewen Wang
- Center for Human Identification, University of North Texas Health Science Center, Fort Worth, TX, United States
| | - Meng Huang
- Center for Human Identification, University of North Texas Health Science Center, Fort Worth, TX, United States
| | - Bruce Budowle
- Center for Human Identification, University of North Texas Health Science Center, Fort Worth, TX, United States
- Department of Microbiology, Immunology, and Genetics, University of North Texas Health Science Center, Fort Worth, TX, United States
| | - Jianye Ge
- Center for Human Identification, University of North Texas Health Science Center, Fort Worth, TX, United States
- Department of Microbiology, Immunology, and Genetics, University of North Texas Health Science Center, Fort Worth, TX, United States
| |
|
10
|
Kulthammanit N, Sathirapatya T, Sukawutthiya P, Noh H, Vongpaisarnsin K, Wichadakul D. STRategy: A support system for collecting and analyzing next-generation sequencing data of short tandem repeats for forensic science. PLoS One 2023; 18:e0282551. [PMID: 37459339 PMCID: PMC10351723 DOI: 10.1371/journal.pone.0282551] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2023] [Accepted: 05/30/2023] [Indexed: 07/20/2023] Open
Abstract
Short tandem repeats (STRs) are short repeated sequences commonly found in the human genome and valuable in forensic science, used for human identity and relatedness markers. Next-generation sequencing (NGS) technologies, e.g., ForenSeq Signature Prep, can sequence STRs, inferring length-based alleles and single nucleotide polymorphisms (SNPs) and providing valuable insights into population and sub-population structures. Despite the potential benefits of NGS for STRs, no open-source software platform integrates the collection, management, and analysis of STR data from NGS into one place. Users must use multiple programs to process their STR data and then collect the results into a separate database or a file system folder. Moreover, analyzing repeat structures (STR repeat motifs) may require learning multiple software tools, making the process inefficient and cumbersome. To address this gap, we introduce the STRategy, a standalone web-based application supporting essential STR data management and analysis capabilities. The STRategy allows users to collect their data into its database, automatically calculates forensic parameters, and visualizes the analyzed data in various forms. Users can search the database using different options, such as by profile, loci, and genotypes, with and without a specific test kit. Moreover, users can also find the nucleotide variants of a locus among the samples. We designed the STRategy for internal use in a laboratory or an organization. Hence, our system includes role-based access control that allows users to search for or access specific data based on their responsibilities. The administrator role can customize the system, for example, configure maps according to the samples' geographic data, and manage reference STR repeat motifs. A laboratory or an organization can download and install a copy of STRategy on their local system using Docker, as described in https://github.com/cucpbioinfo/STRategy. 
In summary, the STRategy is an end-to-end system that provides users with a database to collect the analyzed STR data from NGS, the dynamic analyses of forensic parameters, and the variants of STR patterns according to the newly added samples, which are then explorable via various search options and visualizations. The system is helpful for both forensic investigations and forensic genetics.
Affiliation(s)
- Nuttachai Kulthammanit
- Department of Computer Engineering, Faculty of Engineering, Chulalongkorn University, Bangkok, Thailand
| | - Tikumphorn Sathirapatya
- Department of Forensic Medicine, Faculty of Medicine, Chulalongkorn University, Bangkok, Thailand
- Forensic Serology and DNA, King Chulalongkorn Memorial Hospital and Thai Red Cross Society, Bangkok, Thailand
| | - Poonyapat Sukawutthiya
- Department of Forensic Medicine, Faculty of Medicine, Chulalongkorn University, Bangkok, Thailand
- Forensic Serology and DNA, King Chulalongkorn Memorial Hospital and Thai Red Cross Society, Bangkok, Thailand
| | - Hasnee Noh
- Department of Forensic Medicine, Faculty of Medicine, Chulalongkorn University, Bangkok, Thailand
- Forensic Serology and DNA, King Chulalongkorn Memorial Hospital and Thai Red Cross Society, Bangkok, Thailand
| | - Kornkiat Vongpaisarnsin
- Department of Forensic Medicine, Faculty of Medicine, Chulalongkorn University, Bangkok, Thailand
- Forensic Serology and DNA, King Chulalongkorn Memorial Hospital and Thai Red Cross Society, Bangkok, Thailand
- Forensic Genetics Research Unit, Ratchadapiseksompotch Fund, Faculty of Medicine, Chulalongkorn University, Bangkok, Thailand
| | - Duangdao Wichadakul
- Department of Computer Engineering, Faculty of Engineering, Chulalongkorn University, Bangkok, Thailand
- Center of Excellence in Systems Biology, Faculty of Medicine, Chulalongkorn University, Bangkok, Thailand
| |
|
11
|
Kocsis B, Mátrai N, Egyed B. Forensic Implications of the Discrepancies Caused between NGS and CE Results by New Microvariant Allele at Penta E Microsatellite. Genes (Basel) 2023; 14:1109. [PMID: 37239469 DOI: 10.3390/genes14051109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2023] [Revised: 05/05/2023] [Accepted: 05/17/2023] [Indexed: 05/28/2023] Open
Abstract
Examination of STR markers using MPS technology is becoming more common in forensic genetics, but scientists still have insufficient experience in dealing with ambiguous results. However, it is always essential to resolve discordant data if we want to use the technology as an accredited method in routine forensic casework. During the internal laboratory validation of the Precision ID GlobalFiler NGS STR Panel v2 kit, we observed two discrepant genotypes at the Penta E locus compared to the previous capillary electrophoresis (CE) results. Each NGS software tool that we applied (i.e., Converge, STRaitRazor and IGV) returned the same 12,14 and 12,16 genotypes in the two samples, respectively, instead of the 11.3,14 and 11.3,16 genotypes previously observed with CE typing. In the case of the length variant 11.3 alleles, traditional Sanger sequencing confirmed a complete twelve-repeat-unit structure in both samples. However, after sequencing was extended to the flanking regions of the variant alleles, sequence data revealed a two-base GG deletion downstream of the last TCTTT repeat motif in the forward strand. The determined allele variant has not been previously reported in the scientific literature and highlights the need for careful evaluation and thorough concordance studies before using NGS STR data in forensic cases.
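The discrepancy described in this abstract follows from simple arithmetic, sketched below in Python. The function name and parameters are hypothetical and chosen for illustration only; the 5-bp motif length is taken from the TCTTT repeat named in the abstract. A length-based method sees only the total fragment size, so twelve full repeats minus two flanking bases look like eleven repeats plus three bases.

```python
def length_based_allele(repeat_count: int, motif_len: int, flank_indel_bp: int = 0) -> str:
    """Name an STR allele the way a purely length-based method would see it.

    repeat_count   -- full repeat units present in the repeat region
    motif_len      -- repeat unit length in bp (5 for a TCTTT motif)
    flank_indel_bp -- net bp gained (+) or lost (-) in the flanking region
    """
    total_bp = repeat_count * motif_len + flank_indel_bp
    full_units, extra_bp = divmod(total_bp, motif_len)
    # Microvariant notation: full units, then leftover bases after the dot.
    return str(full_units) if extra_bp == 0 else f"{full_units}.{extra_bp}"


print(length_based_allele(12, 5))      # 12
print(length_based_allele(12, 5, -2))  # 11.3
```

A sequence-based caller that counts repeat units inside the repeat region would still report 12 for the second case, which matches the NGS calls reported in the abstract.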
Affiliation(s)
- Balázs Kocsis
- Department of Genetics, Hungarian Institute for Forensic Sciences, 1087 Budapest, Hungary
- Department of Genetics, ELTE Eötvös Loránd University, 1117 Budapest, Hungary
| | - Norbert Mátrai
- Department of Genetics, Hungarian Institute for Forensic Sciences, 1087 Budapest, Hungary
| | - Balázs Egyed
- Department of Genetics, ELTE Eötvös Loránd University, 1117 Budapest, Hungary
| |
|
12
|
Kiesler KM, Borsuk LA, Steffen CR, Vallone PM, Gettings KB. US Population Data for 94 Identity-Informative SNP Loci. Genes (Basel) 2023; 14:1071. [PMID: 37239431 DOI: 10.3390/genes14051071] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2023] [Revised: 05/05/2023] [Accepted: 05/09/2023] [Indexed: 05/28/2023] Open
Abstract
The US National Institute of Standards and Technology (NIST) analyzed a set of 1036 samples representing four major US population groups (African American, Asian American, Caucasian, and Hispanic) with 94 single nucleotide polymorphisms (SNPs) used for individual identification (iiSNPs). The compact size of iiSNP amplicons compared to short tandem repeat (STR) markers increases the likelihood of successful amplification with degraded DNA samples. Allele frequencies and relevant forensic statistics were calculated for each population group as well as the aggregate population sample. Examination of sequence data in the regions flanking the targeted SNPs identified additional variants, which can be combined with the target SNPs to form microhaplotypes (multiple phased SNPs within a short-read sequence). Comparison of iiSNP performance with and without flanking SNP variation identified four amplicons containing microhaplotypes with observed heterozygosity increases of greater than 15% over the targeted SNP alone. For this set of 1036 samples, comparison of average match probabilities from iiSNPs with the 20 CODIS core STR markers yielded an estimate of 1.7 × 10⁻³⁸ for iiSNPs (assuming independence between all 94 SNPs), which was four orders of magnitude lower (more discriminating) than STRs where internal sequence variation was considered, and 10 orders of magnitude lower than STRs using established capillary electrophoresis length-based genotypes.
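The independence assumption mentioned above is what allows per-locus match probabilities to be multiplied. A minimal Python sketch follows, assuming biallelic SNPs in Hardy-Weinberg equilibrium; the function names and the example allele frequencies are illustrative assumptions and are not taken from the study's data.

```python
import math


def snp_match_probability(p: float) -> float:
    """Probability that two unrelated people share a genotype at one biallelic SNP.

    Assumes Hardy-Weinberg genotype frequencies p^2, 2pq and q^2.
    """
    q = 1.0 - p
    genotype_freqs = (p * p, 2 * p * q, q * q)
    return sum(g * g for g in genotype_freqs)


def combined_match_probability(allele_freqs) -> float:
    """Multiply per-locus values; valid only if the loci are independent."""
    return math.prod(snp_match_probability(p) for p in allele_freqs)


# Hypothetical panel of three SNPs (frequencies invented for the example).
print(combined_match_probability([0.5, 0.4, 0.3]))
```

Because every per-locus value is below 1, each additional independent SNP lowers the combined value, which is why a 94-SNP panel can reach very small match probabilities.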
Affiliation(s)
- Kevin M Kiesler
- National Institute of Standards and Technology, 100 Bureau Drive, Mailstop 8314, Gaithersburg, MD 20899, USA
| | - Lisa A Borsuk
- National Institute of Standards and Technology, 100 Bureau Drive, Mailstop 8314, Gaithersburg, MD 20899, USA
| | - Carolyn R Steffen
- National Institute of Standards and Technology, 100 Bureau Drive, Mailstop 8314, Gaithersburg, MD 20899, USA
| | - Peter M Vallone
- National Institute of Standards and Technology, 100 Bureau Drive, Mailstop 8314, Gaithersburg, MD 20899, USA
| | - Katherine B Gettings
- National Institute of Standards and Technology, 100 Bureau Drive, Mailstop 8314, Gaithersburg, MD 20899, USA
| |
|
13
|
Riman S, Ghemrawi M, Borsuk LA, Mahfouz R, Walsh S, Vallone PM. Sequence-based allelic variations and frequencies for 22 autosomal STR loci in the Lebanese population. Forensic Sci Int Genet 2023; 65:102872. [PMID: 37068444 DOI: 10.1016/j.fsigen.2023.102872] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2022] [Revised: 04/06/2023] [Accepted: 04/08/2023] [Indexed: 04/19/2023]
Abstract
This is the first study that characterizes the sequence-based allelic variations of 22 autosomal Short Tandem Repeat (aSTR) loci in a population dataset collected from Lebanon. Genomic DNA extracts from 195 unrelated Lebanese individuals were amplified with PowerSeq 46GY System Prototype. Targeted amplicons were subjected to DNA library preparation and sequenced on the Verogen MiSeq FGx Sequencing System. Raw FASTQ data files were processed by STRait Razor v3. Sequence strings were annotated according to the considerations of the DNA Commission of the International Society for Forensic Genetics (ISFG) and tabulated herein with their respective allelic frequencies and GenBank accession and version numbers. The sequenced Lebanese dataset resulted in 429 distinct allelic sequences as compared to the 236 alleles identified by length only. The increase in the number of alleles was observed at 18 out of 22 aSTR loci and was attributed to the sequence variations residing in both the STR repeat motifs and flanking regions. The study uncovered 25 novel aSTR allelic sequences across 12 loci for which GenBank records did not previously exist in the STRSeq BioProject, PRJNA380127. For a concordance check, the length-based allelic calls derived from the full sequences were compared to those genotyped using capillary electrophoresis (CE) methods. Population genetic parameters relevant to the evaluation of forensic DNA evidence were assessed for the sequence-based data and compared to the parameters generated from the length-based information. Using the sequence-based data, Analysis of MOlecular VAriance (AMOVA), genetic distances, and population genetic structure were evaluated for 1231 individuals sampled from the Lebanese and four U.S. populations (African American, Asian, Caucasian, and Hispanic). The results were tabulated and visualized in a population tree, multidimensional scaling scatter plots, and bar plots. 
This newly established sequence-based database for the Lebanese population can be beneficial for extending NGS applicability to casework or paternity testing and assessing the strength of evidence for NGS-STR profiles. The described novel sequence variants at certain loci can further help in the effort to characterize the sequence diversity of STR markers from different populations around the world.
Affiliation(s)
- Sarah Riman
- Applied Genetics Group, National Institute of Standards and Technology, Gaithersburg, MD 20899, USA.
| | - Mirna Ghemrawi
- Department of Chemistry and Biochemistry and International Forensic Research Institute, Florida International University, Miami, FL 33199, USA
| | - Lisa A Borsuk
- Applied Genetics Group, National Institute of Standards and Technology, Gaithersburg, MD 20899, USA
| | - Rami Mahfouz
- Department of Pathology and Laboratory Medicine, American University of Beirut Medical Center, Beirut, Lebanon
| | - Susan Walsh
- Department of Biology, Indiana University Purdue University Indianapolis, Indianapolis, IN 46202, USA
| | - Peter M Vallone
- Applied Genetics Group, National Institute of Standards and Technology, Gaithersburg, MD 20899, USA
| |
|
14
|
Joo SM, Kwon YL, Moon MH, Shin KJ. Genetic investigation of 124 SNPs in a Myanmar population using the Precision ID Identity Panel and the Illumina MiSeq. Leg Med (Tokyo) 2023; 63:102256. [PMID: 37058993 DOI: 10.1016/j.legalmed.2023.102256] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2022] [Revised: 03/16/2023] [Accepted: 04/07/2023] [Indexed: 04/16/2023]
Abstract
Single nucleotide polymorphisms (SNPs) have become popular in forensic genetics as an alternative to short tandem repeats (STRs). The Precision ID Identity Panel (Thermo Fisher Scientific), consisting of 90 autosomal SNPs and 34 Y-chromosomal SNPs, enabled human identification studies on global populations through next-generation sequencing (NGS). However, most previous studies on the panel have used the Ion Torrent platform, and there are few reports on the Southeast Asian population. Here, a total of 96 unrelated males from Myanmar (Yangon) were analyzed with the Precision ID Identity Panel on a MiSeq (Illumina) using an in-house TruSeq compatible universal adapter and a custom variant caller, Visual SNP. The sequencing performance evaluated by locus balance and heterozygote balance was comparable to that of the Ion Torrent platform. For 90 autosomal SNPs, the combined match probability (CMP) was 6.994 × 10⁻³⁴, lower than that of 22 PowerPlex Fusion autosomal STRs (3.130 × 10⁻²⁶). For 34 Y-SNPs, 14 Y-haplogroups (mostly O2 and O1b) were observed. We found 51 cryptic variations (42 haplotypes) around target SNPs, of which haplotypes corresponding to 33 autosomal SNPs decreased CMP. Interpopulation analysis revealed that the Myanmar population is genetically closer to the East and Southeast Asian populations. In conclusion, the Precision ID Identity Panel can be successfully analyzed on the Illumina MiSeq and provides high discrimination power for human identification in the Myanmar population. This study broadened the accessibility of the NGS-based SNP panel by expanding the available NGS platforms and adopting a robust NGS data analysis tool.
Affiliation(s)
- Su Min Joo
- Department of Forensic Medicine, Yonsei University College of Medicine, 50-1 Yonsei-ro, Seodaemun-gu, Seoul 03722, Republic of Korea; Graduate School of Medical Science, Brain Korea 21 Project, Yonsei University, 50-1 Yonsei-ro, Seodaemun-gu, Seoul 03722, Republic of Korea
| | - Ye-Lim Kwon
- Department of Forensic Medicine, Yonsei University College of Medicine, 50-1 Yonsei-ro, Seodaemun-gu, Seoul 03722, Republic of Korea; Graduate School of Medical Science, Brain Korea 21 Project, Yonsei University, 50-1 Yonsei-ro, Seodaemun-gu, Seoul 03722, Republic of Korea
| | - Mi Hyeon Moon
- Department of Forensic Medicine, Yonsei University College of Medicine, 50-1 Yonsei-ro, Seodaemun-gu, Seoul 03722, Republic of Korea; Graduate School of Medical Science, Brain Korea 21 Project, Yonsei University, 50-1 Yonsei-ro, Seodaemun-gu, Seoul 03722, Republic of Korea
| | - Kyoung-Jin Shin
- Department of Forensic Medicine, Yonsei University College of Medicine, 50-1 Yonsei-ro, Seodaemun-gu, Seoul 03722, Republic of Korea; Graduate School of Medical Science, Brain Korea 21 Project, Yonsei University, 50-1 Yonsei-ro, Seodaemun-gu, Seoul 03722, Republic of Korea.
| |
|
15
|
Using unique molecular identifiers to improve allele calling in low-template mixtures. Forensic Sci Int Genet 2023; 63:102807. [PMID: 36462297 DOI: 10.1016/j.fsigen.2022.102807] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 06/14/2022] [Revised: 10/20/2022] [Accepted: 11/18/2022] [Indexed: 11/27/2022]
Abstract
PCR artifacts are an ever-present challenge in sequencing applications. These artifacts can seriously limit the analysis and interpretation of low-template samples and mixtures, especially with respect to a minor contributor. In medicine, molecular barcoding techniques have been employed to decrease the impact of PCR error and to allow the examination of low-abundance somatic variation. In principle, it should be possible to apply the same techniques to the forensic analysis of mixtures. To that end, several short tandem repeat loci were selected for targeted sequencing, and a bioinformatic pipeline for analyzing the sequence data was developed. The pipeline notes the relevant unique molecular identifiers (UMIs) attached to each read and, using machine learning, filters the noise products out of the set of potential alleles. To evaluate this pipeline, DNA from pairs of individuals were mixed at different ratios (1-1, 1-9) and sequenced with different starting amounts of DNA (10, 1 and 0.1 ng). Naïvely using the information in the molecular barcodes led to increased performance, with the machine learning resulting in an additional benefit. In concrete terms, using the UMI data results in less noise for a given amount of drop out. For instance, if thresholds are selected that filter out a quarter of the true alleles, using read counts accepts 2381 noise alleles and using raw UMI counts accepts 1726 noise alleles, while the machine learning approach only accepts 307.
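The contrast between read counts and UMI counts can be illustrated with a short Python sketch. This is not the authors' pipeline and contains no machine learning step: it assumes reads arrive as (UMI, allele) pairs and uses a simple majority-vote consensus per UMI, which is one common way to collapse PCR duplicates. All names and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def read_counts(reads):
    """Count every read per allele; PCR artifacts are counted like true alleles."""
    return Counter(allele for _umi, allele in reads)


def umi_consensus_counts(reads):
    """Count original molecules per allele using a majority vote within each UMI."""
    by_umi = defaultdict(Counter)
    for umi, allele in reads:
        by_umi[umi][allele] += 1
    consensus = Counter()
    for allele_counts in by_umi.values():
        allele, _n = allele_counts.most_common(1)[0]
        consensus[allele] += 1
    return consensus


# Toy data: molecule "AAC" carries allele 12, and one of its copies
# picked up a stutter-like error that reads as allele 11.
reads = [
    ("AAC", "12"), ("AAC", "12"), ("AAC", "12"), ("AAC", "11"),
    ("GGT", "12"), ("GGT", "12"),
    ("TTA", "13"), ("TTA", "13"),
]
print(read_counts(reads))           # allele 11 appears with one read
print(umi_consensus_counts(reads))  # allele 11 is absent
```

In this toy case the erroneous read is outvoted within its UMI family, so the noise allele never reaches the candidate list, whereas a read-count threshold would have to be set above 1 to exclude it.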
|
16
|
Stephens KM, Barta R, Fleming K, Perez JC, Wu SF, Snedecor J, Holt CL, LaRue B, Budowle B. Developmental validation of the ForenSeq MainstAY kit, MiSeq FGx sequencing system and ForenSeq Universal Analysis Software. Forensic Sci Int Genet 2023; 64:102851. [PMID: 36907074 DOI: 10.1016/j.fsigen.2023.102851] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2022] [Revised: 02/24/2023] [Accepted: 02/27/2023] [Indexed: 03/06/2023]
Abstract
For human identification purposes, forensic genetics has primarily relied upon a core set of autosomal (and to a lesser extent Y chromosome) short tandem repeat (STR) markers that are enriched by amplification using the polymerase chain reaction (PCR) that are subsequently separated and detected using capillary electrophoresis (CE). While STR typing conducted in this manner is well-developed and robust, advances in molecular biology that have occurred over the last 15 years, in particular massively parallel sequencing (MPS) [1-7], offer certain advantages as compared to CE-based typing. First and foremost is the high throughput capacity of MPS. Current bench top high throughput sequencers enable larger batteries of markers to be multiplexed and multiple samples to be sequenced simultaneously (e.g., millions to billions of nucleotides can be sequenced in one run). Second, compared to the length-based CE approach, sequencing STRs increases discrimination power, enhances sensitivity of detection, reduces noise due to instrumentation, and improves mixture interpretation [4,8-23]. Third, since detection of STRs is based on sequence and not fluorescence, amplicons can be designed that are shorter in length and of similar lengths among loci, where possible, which can improve amplification efficiency and analysis of degraded samples. Lastly, MPS offers a single format approach that can be applied to analysis of a wide variety of genetic markers of forensic interest (e.g., STRs, mitochondrial DNA, single nucleotide polymorphisms, insertion/deletions). These features make MPS a desirable technology for casework [14,15,24,25-48]. The developmental validation of the ForenSeq MainstAY library preparation kit with the MiSeq FGx Sequencing System and ForenSeq Universal Software is reported here to assist with validation of this MPS system for casework [49]. 
The results show that the system is sensitive, accurate and precise, specific, and performs well with mixtures and mock case-type samples.
Affiliation(s)
| | - Richelle Barta
- Verogen, Inc., 11111 Flintkote Ave., San Diego, CA 92121, USA
| | - Keenan Fleming
- Verogen, Inc., 11111 Flintkote Ave., San Diego, CA 92121, USA
| | | | - Shan-Fu Wu
- Verogen, Inc., 11111 Flintkote Ave., San Diego, CA 92121, USA
| | - June Snedecor
- Verogen, Inc., 11111 Flintkote Ave., San Diego, CA 92121, USA
| | - Cydne L Holt
- Verogen, Inc., 11111 Flintkote Ave., San Diego, CA 92121, USA
| | - Bobby LaRue
- Verogen, Inc., 11111 Flintkote Ave., San Diego, CA 92121, USA
| | - Bruce Budowle
- University of Helsinki, Department of Forensic Medicine, Haartmaninkatu 8, P.O. Box 63, Helsinki 00014, Finland
| |
|
17
|
Cheng K, Bright JA, Kelly H, Liu YY, Lin MH, Kruijver M, Taylor D, Buckleton J. Developmental validation of STRmix™ NGS, a probabilistic genotyping tool for the interpretation of autosomal STRs from forensic profiles generated using NGS. Forensic Sci Int Genet 2023; 62:102804. [PMID: 36370677 DOI: 10.1016/j.fsigen.2022.102804] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2022] [Revised: 11/04/2022] [Accepted: 11/07/2022] [Indexed: 11/09/2022]
Abstract
We describe the developmental validation of the probabilistic genotyping software STRmix™ NGS, developed for the interpretation of forensic DNA profiles containing autosomal STRs generated using next generation sequencing (NGS), also known as massively parallel sequencing (MPS), technologies. Developmental validation was carried out in accordance with the Scientific Working Group on DNA Analysis Methods (SWGDAM) Guidelines for the Validation of Probabilistic Genotyping Systems and the International Society for Forensic Genetics (ISFG) recommendations and included sensitivity and specificity testing, accuracy, precision, and the interpretation of case-type samples. The results of developmental validation demonstrate the appropriateness of the software for the interpretation of profiles developed using NGS technology.
Affiliation(s)
- Kevin Cheng
- Institute of Environmental Science and Research Limited, Private Bag 92021, Auckland 1142, New Zealand.
| | - Jo-Anne Bright
- Institute of Environmental Science and Research Limited, Private Bag 92021, Auckland 1142, New Zealand
| | - Hannah Kelly
- Institute of Environmental Science and Research Limited, Private Bag 92021, Auckland 1142, New Zealand
| | - Yao-Yuan Liu
- Institute of Environmental Science and Research Limited, Private Bag 92021, Auckland 1142, New Zealand
| | - Meng-Han Lin
- Institute of Environmental Science and Research Limited, Private Bag 92021, Auckland 1142, New Zealand
| | - Maarten Kruijver
- Institute of Environmental Science and Research Limited, Private Bag 92021, Auckland 1142, New Zealand
| | - Duncan Taylor
- Forensic Science SA, GPO Box 2790, Adelaide, SA 5001, Australia
| | - John Buckleton
- Institute of Environmental Science and Research Limited, Private Bag 92021, Auckland 1142, New Zealand
| |
|
18
|
Wang X, Budowle B, Ge J. USAT: a bioinformatic toolkit to facilitate interpretation and comparative visualization of tandem repeat sequences. BMC Bioinformatics 2022; 23:497. [PMID: 36402991 PMCID: PMC9675219 DOI: 10.1186/s12859-022-05021-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/10/2022] [Accepted: 10/29/2022] [Indexed: 11/21/2022] Open
Abstract
Background: Tandem repeats (TRs), highly variable genomic variants, are widely used in individual identification, disease diagnostics, and evolutionary studies. Recent advances in sequencing technologies and bioinformatic tools facilitate genome-wide calling of TR haplotypes. Both length-based and sequence-based TR alleles are used in different applications. However, sequence-based TR alleles could provide the highest precision in characterizing TR haplotypes. An easy-to-use bioinformatic tool is needed to identify differences at the single-nucleotide level between or among TR haplotypes. Results: In this study, we developed a Universal STR Allele Toolkit (USAT) for TR haplotype analysis, which takes TR haplotype output from existing tools to perform allele size conversion, sequence comparison of haplotypes, figure plotting, comparison for allele distribution, and interactive visualization. An exemplary application of USAT to the CODIS core STR loci for DNA forensics, using benchmark human individuals, demonstrated the capabilities of USAT. USAT has user-friendly graphic interfaces and runs fast on major operating systems with parallel computing enabled. Conclusion: USAT is a user-friendly bioinformatics software for interpretation, visualization, and comparisons of TRs. Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-022-05021-1.
Affiliation(s)
- Xuewen Wang
- Center for Human Identification, Health Science Center, University of North Texas, Fort Worth, TX, USA
| | - Bruce Budowle
- Center for Human Identification, Health Science Center, University of North Texas, Fort Worth, TX, USA
- Department of Microbiology, Immunology, and Genetics, University of North Texas Health Science Center, Fort Worth, TX, USA
| | - Jianye Ge
- Center for Human Identification, Health Science Center, University of North Texas, Fort Worth, TX, USA
- Department of Microbiology, Immunology, and Genetics, University of North Texas Health Science Center, Fort Worth, TX, USA
| |
|
19
|
Assessing Sequence Variation and Genetic Diversity of Currently Untapped Y-STR Loci. Forensic Science International: Reports 2022. [DOI: 10.1016/j.fsir.2022.100298] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022] Open
|
20
|
An ambiguous sequence-based allele of SE33. Forensic Science International Genetics Supplement Series 2022. [DOI: 10.1016/j.fsigss.2022.10.013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
|
21
|
Kwon YL, Lee EY, Kim BM, Joo SM, Jeong KS, Chun BW, Lee YH, Park KW, Shin KJ. Application of a custom haplotype caller to analyze sequence-based data of 56 microhaplotypes. Forensic Sci Int Genet 2022; 61:102778. [PMID: 36166997 DOI: 10.1016/j.fsigen.2022.102778] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2022] [Revised: 08/23/2022] [Accepted: 09/16/2022] [Indexed: 11/18/2022]
Abstract
Microhaplotypes (microhaps) are recently introduced markers that aim to complement the limitations of conventional forensic markers such as short tandem repeats (STRs) and single nucleotide polymorphisms (SNPs). With the potential of microhaps in forensics becoming clearer through massively parallel sequencing (MPS), MPS-based studies on microhaps are being actively reported. However, simpler workflow schemes for the generation and analysis of MPS data are still required to facilitate the practical application of MPS in forensics. In this study, we developed an in-house MPS panel that simultaneously amplifies 56 microhaps and a custom haplotype caller, Visual Microhap. The developed tool works on a web browser and provides four analysis options to extract SNP-based haplotypes from sequence-based data obtained by STRait Razor 3.0. To demonstrate the utility of the MPS panel and data analysis workflow scheme, we also analyzed 56 microhaps of 286 samples from four populations (African-American, Caucasian, Hispanic, and Korean). The average effective number of alleles (Ae) for the four groups was 3.45, ranging from 1.74 to 6.98. Forensic statistical parameters showed that this microhap panel is more powerful than conventional autosomal STRs for human identification. Meanwhile, the 56-plex panel mostly comprised microhaps with high Ae; however, the four populations were grossly distinguishable from each other by cluster analysis. Consequently, the developed in-house MPS panel for 56 microhaps and the adopted workflow using open-source tools can increase the utility of microhap MPS in forensic research and practice.
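The effective number of alleles (Ae) reported above is commonly defined as the reciprocal of the sum of squared allele (here, haplotype) frequencies. The Python sketch below assumes that definition; the function name and the example frequencies are illustrative and not taken from the study.

```python
def effective_number_of_alleles(freqs) -> float:
    """Ae = 1 / sum(p_i^2) for the haplotype frequencies at one locus."""
    if abs(sum(freqs) - 1.0) > 1e-6:
        raise ValueError("haplotype frequencies must sum to 1")
    return 1.0 / sum(p * p for p in freqs)


# Four equally common haplotypes behave like four alleles.
print(effective_number_of_alleles([0.25, 0.25, 0.25, 0.25]))  # 4.0
# Four haplotypes with one dominant behave like fewer than two.
print(effective_number_of_alleles([0.7, 0.1, 0.1, 0.1]))
```

Ae therefore rewards loci whose haplotypes are both numerous and evenly distributed, which is the property that makes a microhaplotype informative for identification.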
Affiliation(s)
- Ye-Lim Kwon
- Department of Forensic Medicine, Yonsei University College of Medicine, Seoul 03722, Republic of Korea; Graduate School of Medical Science, Brain Korea 21 Project, Yonsei University, Seoul 03722, Republic of Korea.
| | - Eun Young Lee
- Department of Forensic Medicine, Yonsei University College of Medicine, Seoul 03722, Republic of Korea.
| | - Bo Min Kim
- Department of Forensic Medicine, Yonsei University College of Medicine, Seoul 03722, Republic of Korea; Graduate School of Medical Science, Brain Korea 21 Project, Yonsei University, Seoul 03722, Republic of Korea.
| | - Su Min Joo
- Department of Forensic Medicine, Yonsei University College of Medicine, Seoul 03722, Republic of Korea; Graduate School of Medical Science, Brain Korea 21 Project, Yonsei University, Seoul 03722, Republic of Korea.
| | - Kyu Sik Jeong
- Forensic DNA division, National Forensic Service, Wonju-si, Gangwon-do 26460, Republic of Korea.
| | - Byung Won Chun
- DNA analysis Division, National Forensic Service Daejeon Institute, Daejeon 34054, Republic of Korea.
| | - Yang Han Lee
- Forensic DNA division, National Forensic Service, Wonju-si, Gangwon-do 26460, Republic of Korea.
| | - Ki Won Park
- Forensic DNA division, National Forensic Service, Wonju-si, Gangwon-do 26460, Republic of Korea.
| | - Kyoung-Jin Shin
- Department of Forensic Medicine, Yonsei University College of Medicine, Seoul 03722, Republic of Korea; Graduate School of Medical Science, Brain Korea 21 Project, Yonsei University, Seoul 03722, Republic of Korea.
|
22
|
Steffen CR, Romsos EL, Kiesler KM, Borsuk LA, Gettings KB, Vallone PM. Make it "SNPPY" - Updates to SRM 2391d: PCR-Based DNA Profiling Standard. Forensic Science International: Genetics Supplement Series 2022. [DOI: 10.1016/j.fsigss.2022.09.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/01/2022]
|
23
|
A New String Edit Distance and Applications. Algorithms 2022. [DOI: 10.3390/a15070242] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/01/2023]
Abstract
String edit distances have been used for decades in applications ranging from spelling correction and web search suggestions to DNA analysis. Most string edit distances are variations of the Levenshtein distance and consider only single-character edits. In forensic applications polymorphic genetic markers such as short tandem repeats (STRs) are used. At these repetitive motifs the DNA copying errors consist of more than just single base differences. More often the phenomenon of “stutter” is observed, where the number of repeated units differs (by whole units) from the template. To adapt the Levenshtein distance to be suitable for forensic applications where DNA sequence similarity is of interest, a generalized string edit distance is defined that accommodates the addition or deletion of whole motifs in addition to single-nucleotide edits. A dynamic programming implementation is developed for computing this distance between sequences. The novelty of this algorithm is in handling the complex interactions that arise between multiple- and single-character edits. Forensic examples illustrate the purpose and use of the Restricted Forensic Levenshtein (RFL) distance measure, but applications extend to sequence alignment and string similarity in other biological areas, as well as dynamic programming algorithms more broadly.
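To make the idea of a motif-aware edit distance concrete, the sketch below extends the standard Levenshtein dynamic programme with two extra moves: inserting or deleting one whole copy of a repeat motif at a fixed cost. It is a simplified illustration under stated assumptions, not the RFL algorithm from the paper. In particular, it only recognises motif copies that appear verbatim in one of the two strings, so it does not handle the interactions between multi- and single-character edits that the abstract describes. The function name and cost parameters are hypothetical.

```python
def motif_edit_distance(a, b, motif, motif_cost=1, base_cost=1):
    """Levenshtein distance extended with whole-motif insertion/deletion.

    Single-base insertions, deletions and substitutions cost `base_cost`;
    removing (or adding) one verbatim copy of `motif` costs `motif_cost`.
    """
    m = len(motif)
    if m == 0:
        raise ValueError("motif must be non-empty")
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            if i == 0 and j == 0:
                continue
            best = float("inf")
            if i > 0:  # delete one base from a
                best = min(best, d[i - 1][j] + base_cost)
            if j > 0:  # insert one base from b
                best = min(best, d[i][j - 1] + base_cost)
            if i > 0 and j > 0:  # match or substitute
                sub = 0 if a[i - 1] == b[j - 1] else base_cost
                best = min(best, d[i - 1][j - 1] + sub)
            if i >= m and a[i - m:i] == motif:  # delete a whole motif copy from a
                best = min(best, d[i - m][j] + motif_cost)
            if j >= m and b[j - m:j] == motif:  # insert a whole motif copy from b
                best = min(best, d[i][j - m] + motif_cost)
            d[i][j] = best
    return d[-1][-1]


# A stutter-like pair: three versus four copies of the motif GATA.
parent = "GATA" * 4
stutter = "GATA" * 3
print(motif_edit_distance(stutter, parent, "GATA"))
```

With plain single-character edits the same pair would be four edits apart; treating the missing repeat unit as one operation reflects the whole-unit nature of stutter.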
|
24
|
Sequence polymorphisms of forensic Y-STRs revealed by a 68-plex in-house massively parallel sequencing panel. Forensic Sci Int Genet 2022; 59:102727. [DOI: 10.1016/j.fsigen.2022.102727] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2022] [Revised: 05/03/2022] [Accepted: 05/23/2022] [Indexed: 11/20/2022]
|
25
|
Agudo MM, Aanes H, Roseth A, Albert M, Gill P, Bleka Ø. A comprehensive characterization of MPS-STR stutter artefacts. Forensic Sci Int Genet 2022; 60:102728. [DOI: 10.1016/j.fsigen.2022.102728] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/17/2022] [Revised: 05/03/2022] [Accepted: 05/24/2022] [Indexed: 11/04/2022]
|
26
|
Huszar TI, Bodmer WF, Hutnik K, Wetton JH, Jobling MA. Sequencing of autosomal, mitochondrial and Y-chromosomal forensic markers in the People of the British Isles cohort detects population structure dominated by patrilineages. Forensic Sci Int Genet 2022; 59:102725. [DOI: 10.1016/j.fsigen.2022.102725] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2022] [Revised: 05/08/2022] [Accepted: 05/13/2022] [Indexed: 11/27/2022]
|
27
|
Microhaplotype and Y-SNP/STR (MY): A novel MPS-based system for genotype pattern recognition in two-person DNA mixtures. Forensic Sci Int Genet 2022; 59:102705. [DOI: 10.1016/j.fsigen.2022.102705] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/23/2021] [Revised: 03/10/2022] [Accepted: 04/10/2022] [Indexed: 12/13/2022]
|
28
|
Moon MH, Hong SR, Shin KJ. Sequence Variations of 31 Y-Chromosomal Short Tandem Repeats Analyzed by Massively Parallel Sequencing in Three U.S. Population Groups and Korean Population. J Korean Med Sci 2022; 37:e40. [PMID: 35166077 PMCID: PMC8845103 DOI: 10.3346/jkms.2022.37.e40] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2021] [Accepted: 12/19/2021] [Indexed: 12/02/2022]
Abstract
BACKGROUND Rapidly mutating (RM) Y-chromosomal short tandem repeats (Y-STRs) have been shown to improve the ability to distinguish between male relatives because of their higher mutation rate compared with conventional Y-STRs. Massively parallel sequencing (MPS) can be useful for forensic DNA typing as it allows the detection of sequence variants of many forensic markers. Here, we present sequence variations of 31 Y-STRs including nine RM Y-STRs (DYF387S1, DYF399S1, DYF404S1, DYS449, DYS518, DYS570, DYS576, DYS612, and DYS627), their frequencies, distribution, and the gain in the number of alleles using MPS. METHODS We constructed a multiplex MPS assay capable of simultaneously amplifying 32 Y-chromosomal markers, producing amplicons ranging from 85 to 274 bp. Barcoded libraries from 220 unrelated males from four populations (African Americans, Caucasians, Hispanics, and Koreans) were generated via two-step polymerase chain reaction and sequenced on a MiSeq system. Genotype concordance between the capillary electrophoresis (CE) and MPS methods and the sequence variation of Y-STRs were investigated. RESULTS In total, MPS increased the number of alleles by 195 compared with CE (from 261 to 456). The DYS518 marker showed the largest increase due to repeat region variation (a 3.69-fold increase). The highest increase in the number of alleles due to single nucleotide polymorphisms in the flanking region was found in DYF399S1. RM Y-STRs had more diverse sequences than conventional Y-STRs. Furthermore, null alleles were observed in DYS576 due to a primer-binding site mutation, and allele drop-outs in DYS449 resulted from marker coverage below the threshold. CONCLUSION The results suggest that the expanded and discriminative MPS assay could provide more genetic information for Y-STRs, especially for RM Y-STRs, and could advance male individualization. Compiling sequence-based Y-STR data for worldwide populations would facilitate the application of MPS in the field of forensic genetics and could be applicable in solving male-related forensic cases.
Affiliation(s)
- Mi Hyeon Moon
- Department of Forensic Medicine, Yonsei University College of Medicine, Seoul, Korea
- Graduate School of Medical Science and Brain Korea 21 Project, Yonsei University, Seoul, Korea
| | - Sae Rom Hong
- Department of Forensic Medicine, Yonsei University College of Medicine, Seoul, Korea
| | - Kyoung-Jin Shin
- Department of Forensic Medicine, Yonsei University College of Medicine, Seoul, Korea
- Graduate School of Medical Science and Brain Korea 21 Project, Yonsei University, Seoul, Korea.
|
29
|
Hall CL, Kesharwani RK, Phillips NR, Planz JV, Sedlazeck FJ, Zascavage RR. Accurate profiling of forensic autosomal STRs using the Oxford Nanopore Technologies MinION device. Forensic Sci Int Genet 2021; 56:102629. [PMID: 34837788 DOI: 10.1016/j.fsigen.2021.102629] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/01/2021] [Revised: 09/28/2021] [Accepted: 11/01/2021] [Indexed: 01/23/2023]
Abstract
The high variability characteristic of short tandem repeat (STR) markers is harnessed for human identification in forensic genetic analyses. Despite the power and reliability of current typing techniques, sequence-level information both within and around STRs is masked in the length-based profiles generated. Forensic STR typing using next generation sequencing (NGS) has therefore gained attention as an alternative to traditional capillary electrophoresis (CE) approaches. In this proof-of-principle study, we evaluate the forensic applicability of the newest and smallest NGS platform available, the Oxford Nanopore Technologies (ONT) MinION device. Although nanopore sequencing on the handheld MinION offers numerous advantages, including low startup cost and on-site sample processing, the relatively high error rate and lack of forensic-specific analysis software have prevented accurate profiling across STR panels in previous studies. Here we present STRspy, a streamlined method capable of producing length- and sequence-based STR allele designations from noisy, error-prone third generation sequencing reads. To assess the capabilities of STRspy, seven reference samples (female: n = 2; male: n = 5) were amplified at 15 and 30 PCR cycles using the Promega PowerSeq 46GY System and sequenced on the ONT MinION device in triplicate. Basecalled reads were then processed with STRspy using a custom database containing alleles reported in the STRSeq BioProject NIST 1036 dataset. Resultant STR allele designations and flanking region single nucleotide polymorphism (SNP) calls were compared to the manufacturer-validated genotypes for each sample. STRspy generated robust and reliable genotypes across all autosomal STR loci amplified with 30 PCR cycles, achieving 100% concordance based on both length and sequence. Furthermore, we were able to identify flanking region SNPs in the 15-cycle dataset with > 90% accuracy. These results demonstrate that, when analyzed with STRspy, ONT reads can reveal additional variation in and around STR loci, depending on read coverage. As the first and only third generation sequencing platform-specific method to successfully profile the entire panel of autosomal STRs amplified by a commercially available multiplex, STRspy significantly increases the feasibility of nanopore sequencing in forensic applications.
Affiliation(s)
- Courtney L Hall
- Department of Microbiology, Immunology & Genetics, University of North Texas Health Science Center, 3400 Camp Bowie Blvd, Fort Worth, TX 76107, USA.
| | - Rupesh K Kesharwani
- Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza, Houston TX 77030, USA
| | - Nicole R Phillips
- Department of Microbiology, Immunology & Genetics, University of North Texas Health Science Center, 3400 Camp Bowie Blvd, Fort Worth, TX 76107, USA
| | - John V Planz
- Department of Microbiology, Immunology & Genetics, University of North Texas Health Science Center, 3400 Camp Bowie Blvd, Fort Worth, TX 76107, USA
| | - Fritz J Sedlazeck
- Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza, Houston TX 77030, USA
| | - Roxanne R Zascavage
- Department of Microbiology, Immunology & Genetics, University of North Texas Health Science Center, 3400 Camp Bowie Blvd, Fort Worth, TX 76107, USA; Department of Criminology and Criminal Justice, University of Texas at Arlington, 701 S Nedderman Dr, Arlington, TX 76109, USA
|
30
|
Development and validation of a novel 133-plex forensic STR panel (52 STRs and 81 Y-STRs) using single-end 400 bp massive parallel sequencing. Int J Legal Med 2021; 136:447-464. [PMID: 34741666 DOI: 10.1007/s00414-021-02738-1] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 08/17/2021] [Accepted: 10/25/2021] [Indexed: 12/15/2022]
Abstract
Short tandem repeats (STRs) are the preferred genetic markers in forensic DNA analysis and are routinely typed by capillary electrophoresis (CE) based on fragment length. Massively parallel sequencing (MPS), by contrast, can simultaneously target a large number of forensic STRs, bypassing the intrinsic limitations of amplicon size separation and available fluorophores in CE, which makes it an efficient and promising approach for identifying forensic biological evidence. Here, we developed a novel MPS-based Forensic Analysis System Multiplecues SetB Kit of 133-plex forensic STR markers (52 STRs and 81 Y-STRs) and one Y-InDel (M175) based on multiplex PCR and a single-end 400 bp sequencing strategy. This panel was subjected to developmental validation studies according to the SWGDAM Validation Guidelines. Approximately 2185 MPS-based reactions using 6 human DNA standards and 8 male donors were conducted for substrate studies (filter paper, gauze, cotton swab, four different types of FTA cards, peripheral venous blood, saliva, and exfoliated cells), sensitivity studies (from 2 ng down to 0.0625 ng), mixture studies (two-person DNA mixtures), PCR inhibitor studies (seven commonly encountered PCR inhibitors), species specificity studies (11 non-human species), and repeatability studies. Concordance studies (413 Han males and 6 human DNA standards), with results generated by STRait Razor and in-house Python scripts, indicated a 99.98% concordance rate in STR calling relative to CE across 41,900 genotypes at 100 STR markers. Moreover, the limitations of the present studies, the nomenclature rules, and forensic MPS applications are also described. In conclusion, the validation studies based on ~ 2200 MPS-based and ~ 2500 CE-based DNA profiles demonstrated that the novel MPS-based panel meets forensic DNA quality assurance guidelines with robust, reliable, and reproducible performance on samples of various quantities and qualities, and that the STR nomenclature rules should be further regulated to reconcile the inconsistencies between MPS-based and CE-based methods.
|
31
|
Wu JZ, Wang LX, Yang XY, Pan DH, Lu XY, Liu CH, Han XL, Liu H, Shi MS, Liu C, Wen SQ. Forensic application of a novel MPS-based panel (90 STRs and 100 SNPs) in a non-exclusion parentage case with three autosomal STRs incompatibilities. Leg Med (Tokyo) 2021; 54:101987. [PMID: 34768042 DOI: 10.1016/j.legalmed.2021.101987] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2021] [Revised: 10/22/2021] [Accepted: 10/26/2021] [Indexed: 10/19/2022]
Abstract
In kinship tests, the investigation of forensic STRs usually provides decisive information to resolve relationship cases. We describe a parentage case with 3 genetic incompatibilities (D6S1043, D18S51 and D2S1338) between the child and the alleged parent. With 90 STR loci and 100 SNP loci, the massively parallel sequencing (MPS)-based genotyping results supported parentage, and the mismatched alleles were considered to be mutations. MPS can provide additional allele sequence structures that can be used to infer the origins of the mutations. SNPs as supplementary markers can provide effective information to give an unequivocal statement of the parentage.
Affiliation(s)
- Jia-Zi Wu
- MOE Key Laboratory of Contemporary Anthropology, Department of Anthropology and Human Genetics, School of Life Sciences, Fudan University, Shanghai 200438, China; Institute of Archaeological Science, Fudan University, Shanghai 200433, China
| | - Ling-Xiang Wang
- Institute of Archaeological Science, Fudan University, Shanghai 200433, China; Deepreads Biotech, Guangzhou 510663, China
| | - Xing-Yi Yang
- Guangzhou Forensic Science Institute, Guangzhou 510030, China
| | - Dong-Hua Pan
- Forensic Science Centre of Maoming Public Security Department, Guangdong Province, Maoming 525000, China
| | - Xiao-Yu Lu
- Institute of Archaeological Science, Fudan University, Shanghai 200433, China; Deepreads Biotech, Guangzhou 510663, China
| | - Chang-Hui Liu
- Guangzhou Forensic Science Institute, Guangzhou 510030, China
| | - Xiao-Long Han
- Guangzhou Forensic Science Institute, Guangzhou 510030, China
| | - Hong Liu
- Guangzhou Forensic Science Institute, Guangzhou 510030, China
| | - Mei-Sen Shi
- Criminal Justice College of China University of Political Science and Law, Beijing 100088, China.
| | - Chao Liu
- Guangzhou Forensic Science Institute, Guangzhou 510030, China.
| | - Shao-Qing Wen
- MOE Key Laboratory of Contemporary Anthropology, Department of Anthropology and Human Genetics, School of Life Sciences, Fudan University, Shanghai 200438, China; Institute of Archaeological Science, Fudan University, Shanghai 200433, China.
|
32
|
An Introductory Overview of Open-Source and Commercial Software Options for the Analysis of Forensic Sequencing Data. Genes (Basel) 2021; 12:genes12111739. [PMID: 34828345 PMCID: PMC8618049 DOI: 10.3390/genes12111739] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/15/2021] [Revised: 10/27/2021] [Accepted: 10/27/2021] [Indexed: 12/30/2022]
Abstract
The top challenges of adopting new methods for forensic DNA analysis in routine laboratories are often the capital investment and the expertise required to implement and validate such methods locally. In the case of next-generation sequencing, several specifically forensic commercial options have become available in the last decade, offering reliable and validated solutions. Despite this, the readily available expertise to analyze, interpret and understand such data is still perceived to be lagging behind. This review gives an introductory overview for forensic scientists who are at the beginning of their journey with implementing next-generation sequencing locally and who, because most in the field do not have a bioinformatics background, may find it difficult to navigate the new terms and analysis options available. The currently available open-source and commercial software for forensic sequencing data analysis is summarized here to provide an accessible starting point for those fairly new to the forensic application of massively parallel sequencing.
|
33
|
Reverse complement-PCR, an innovative and effective method for multiplexing forensically relevant single nucleotide polymorphism marker systems. Biotechniques 2021; 71:484-489. [PMID: 34350776 DOI: 10.2144/btn-2021-0031] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/23/2022]
Abstract
DNA analyses from challenging samples such as touch evidence, hairs and skeletal remains push the limits of the current forensic DNA typing technologies. Reverse complement PCR (RC-PCR) is a novel, single-step PCR target enrichment method adapted to amplify degraded DNA. The sample preparation process involves a limited number of steps, decreasing the labor required for library preparation and reducing the possibility of contamination due to less sample manipulation. These features of the RC-PCR make the technology a unique application to successfully target single nucleotide polymorphisms (SNPs) in fragmented and low copy number DNA and yield results from samples in which no or limited data are obtained with standard DNA typing methods. The developed RC-PCR short amplicon 85 SNP-plex panel is a substantial improvement over the previously reported 27-plex RC-PCR multiplex that will provide higher discrimination power for challenging DNA sample analyses.
|
34
|
The forensic landscape and the population genetic analyses of Hainan Li based on massively parallel sequencing DNA profiling. Int J Legal Med 2021; 135:1295-1317. [PMID: 33847803 DOI: 10.1007/s00414-021-02590-3] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 01/28/2021] [Accepted: 03/26/2021] [Indexed: 12/30/2022]
Abstract
Due to the formation of the Qiongzhou Strait by climate change and marine transition, Hainan island was isolated from mainland southern China during the Last Glacial Maximum. Hainan island, located at the southernmost part of China and separated from the Leizhou Peninsula by the Qiongzhou Strait, lay on one of the modern human northward migration routes from Southeast Asia to East Asia. The Hlai language-speaking Li minority, the second largest population after Han Chinese in Hainan island, are the direct descendants of the initial migrants in Hainan island and have unique ethnic properties and derived characteristics; however, forensic studies on the Hainan Li population are still insufficient. Hence, 136 Hainan Li individuals were genotyped in this study using the MPS-based ForenSeq™ DNA Signature Prep Kit (DNA Primer Set A, DPMA) to characterize the forensic genetic polymorphism landscape, and DNA profiles were obtained from 152 different molecular genetic markers (27 autosomal STRs, 24 Y-STRs, 7 X-STRs, and 94 iiSNPs). A total of 419 distinct length variants and 586 repeat sequence sub-variants, with 31 novel alleles (at 17 loci), were identified across the 58 STR loci from the DNA profiles of the Hainan Li population. We evaluated the forensic characteristics and efficiencies of DPMA, demonstrating that the STRs and iiSNPs in DPMA were highly polymorphic in the Hainan Li population and could be employed in forensic applications. In addition, we set up three datasets, which included the genetic data of (i) iiSNPs (27 populations, 2640 individuals), (ii) Y-STRs (42 populations, 8281 individuals), and (iii) Y haplogroups (123 populations, 4837 individuals) along with the population ancestries and language families, to perform population genetic analyses separately from different perspectives. 
In conclusion, the phylogenetic analyses indicated that the Hainan Li, a Tai-Kadai-speaking population of southern East Asian origin, are a relatively isolated population, although their gene pool has been influenced by limited gene flow from other Tai-Kadai populations and Hainan populations. Furthermore, establishing isolated population models will help to clarify fine-scale population structures and to develop specific genetic markers for subpopulations in forensic genetics.
|
35
|
Shen X, Li R, Li H, Gao Y, Chen H, Qu N, Peng D, Wu R, Sun H. Noninvasive Prenatal Paternity Testing with a Combination of Well-Established SNP and STR Markers Using Massively Parallel Sequencing. Genes (Basel) 2021; 12:genes12030454. [PMID: 33810139 PMCID: PMC8004970 DOI: 10.3390/genes12030454] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/09/2021] [Revised: 03/05/2021] [Accepted: 03/20/2021] [Indexed: 01/04/2023]
Abstract
Cell-free fetal DNA (cffDNA) from maternal plasma has made it possible to develop noninvasive prenatal paternity testing (NIPPT). However, most studies have focused on customized single nucleotide polymorphism (SNP) typing systems and few have used conventional short tandem repeat (STR) markers. Based on massively parallel sequencing (MPS), this study used a widely accepted forensic multiplex assay system to evaluate the performance of noninvasive prenatal paternity testing with a combination of well-established SNP and STR markers. Using a ForenSeq DNA Signature Prep Kit, NIPPT was performed in 17 real parentage cases with monovular unborn fetuses at 7 to 24 gestational weeks. Different analytical strategies for the identification of paternally inherited alleles (PIAs) were developed to deal with SNPs and STRs. The combined paternity index (CPI) was calculated for the 17 real trios as well as for 272 unrelated trios. With the combination of SNPs and A-STRs, 82.35% (14/17), 88.24% (15/17), 94.12% (16/17), and 94.12% (16/17) of real trios could be accurately determined when the likelihood ratio (LR) threshold for paternity inclusion was set to 10,000, 1000, 100, and 10, respectively. This reveals that simultaneous surveys of SNP and STR markers included in the ForenSeq DNA Signature Prep Kit offer a promising method for NIPPT using MPS technology.
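The threshold comparison described in this abstract can be sketched as follows, assuming the usual convention that the combined paternity index is the product of per-locus paternity indices at independent loci. This is an illustrative sketch only, not the authors' analysis pipeline; the function names and the per-locus values are hypothetical, while the four thresholds are those quoted in the abstract.

```python
from math import prod


def combined_paternity_index(locus_indices):
    """Multiply per-locus paternity indices (assumes independent loci)."""
    return prod(locus_indices)


def inclusion_by_threshold(cpi, thresholds=(10_000, 1_000, 100, 10)):
    """Report, for each LR threshold, whether the CPI supports inclusion."""
    return {t: cpi >= t for t in thresholds}


# Hypothetical per-locus paternity indices for one trio.
example_indices = [2.0, 5.0, 10.0, 4.0, 2.5]
example_cpi = combined_paternity_index(example_indices)
print(example_cpi, inclusion_by_threshold(example_cpi))
```

A stricter threshold admits fewer trios, which matches the pattern in the abstract: 14 of 17 trios pass at 10,000 and 16 of 17 at 100 or 10.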
Affiliation(s)
- Xuefeng Shen
- Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou 510080, China
- Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-Sen University, Guangzhou 510080, China
| | - Ran Li
- Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou 510080, China
- Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-Sen University, Guangzhou 510080, China
| | - Haixia Li
- Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou 510080, China
- Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-Sen University, Guangzhou 510080, China
| | - Yu Gao
- Department of Obstetrics, The Sixth Affiliated Hospital of Sun Yat-Sen University, Guangzhou 510630, China
| | - Hui Chen
- Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou 510080, China
- Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-Sen University, Guangzhou 510080, China
| | - Ning Qu
- Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou 510080, China
- Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-Sen University, Guangzhou 510080, China
| | - Dan Peng
- Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou 510080, China
- Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-Sen University, Guangzhou 510080, China
| | - Riga Wu
- Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-Sen University, Guangzhou 510080, China
- Correspondence: (R.W.); (H.S.)
| | - Hongyu Sun
- Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou 510080, China
- Correspondence: (R.W.); (H.S.)
|
36
|
Kwon YL, Kim BM, Lee EY, Shin KJ. Massively parallel sequencing of 25 autosomal STRs including SE33 in four population groups for forensic applications. Sci Rep 2021; 11:4701. [PMID: 33633141 PMCID: PMC7907369 DOI: 10.1038/s41598-021-82814-z] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 09/04/2020] [Accepted: 01/25/2021] [Indexed: 11/09/2022]
Abstract
The introduction of massively parallel sequencing (MPS) in forensic investigation enables sequence-based large-scale multiplexing beyond size-based analysis using capillary electrophoresis (CE). For the practical application of MPS to forensic casework, many population studies have provided sequence data for autosomal short tandem repeats (STRs). However, little sequence-based data is available for SE33, a highly polymorphic STR marker, because of difficulties in analysis. In this study, 25 autosomal STRs, including SE33, were analyzed using an in-house MPS panel for 350 samples from four populations (African-American, Caucasian, Hispanic, and Korean). The barcoded MPS library was generated using a two-step PCR method and sequenced using a MiSeq System. As a result, 99.88% genotype concordance was obtained between length- and sequence-based analyses. The most discordances (eight samples, 0.08%) were observed in SE33 because of a 4 bp deletion between the CE and MPS primer binding sites. Compared with the length-based CE method, the number of alleles increased from 332 to 725 (2.18-fold) for 25 autosomal STRs in the sequence-based MPS method. Notably, an additional 129 unique alleles, a 4.15-fold increase, were detected in SE33 by identifying sequence variations. This population data set provides sequence variations and sequence-based allele frequencies for 25 autosomal STRs.
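The allele gain quoted in this abstract is simple arithmetic on the length-based (CE) and sequence-based (MPS) allele counts. A minimal sketch, using the panel-wide totals from the abstract and a hypothetical helper name:

```python
def allele_gain(ce_alleles, mps_alleles):
    """Return (additional alleles, fold increase) of MPS over CE counts."""
    if ce_alleles <= 0:
        raise ValueError("CE allele count must be positive")
    added = mps_alleles - ce_alleles
    fold = mps_alleles / ce_alleles
    return added, fold


# Totals reported for the 25 autosomal STRs: 332 (CE) versus 725 (MPS).
added, fold = allele_gain(332, 725)
print(f"{added} additional alleles, {fold:.2f}-fold")  # 393 additional alleles, 2.18-fold
```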
Affiliation(s)
- Ye-Lim Kwon
- Department of Forensic Medicine, Yonsei University College of Medicine, 50-1 Yonsei-ro, Seodaemun-gu, Seoul, 03722, Korea; Brain Korea 21 PLUS Project for Medical Science, Yonsei University, 50-1 Yonsei-ro, Seodaemun-gu, Seoul, 03722, Korea
| | - Bo Min Kim
- Department of Forensic Medicine, Yonsei University College of Medicine, 50-1 Yonsei-ro, Seodaemun-gu, Seoul, 03722, Korea; Brain Korea 21 PLUS Project for Medical Science, Yonsei University, 50-1 Yonsei-ro, Seodaemun-gu, Seoul, 03722, Korea
| | - Eun Young Lee
- Department of Forensic Medicine, Yonsei University College of Medicine, 50-1 Yonsei-ro, Seodaemun-gu, Seoul, 03722, Korea
| | - Kyoung-Jin Shin
- Department of Forensic Medicine, Yonsei University College of Medicine, 50-1 Yonsei-ro, Seodaemun-gu, Seoul, 03722, Korea; Brain Korea 21 PLUS Project for Medical Science, Yonsei University, 50-1 Yonsei-ro, Seodaemun-gu, Seoul, 03722, Korea
|
37
|
Li R, Shen X, Chen H, Peng D, Wu R, Sun H. Developmental validation of the MGIEasy Signature Identification Library Prep Kit, an all-in-one multiplex system for forensic applications. Int J Legal Med 2021; 135:739-753. [PMID: 33523251 DOI: 10.1007/s00414-021-02507-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/11/2020] [Accepted: 01/08/2021] [Indexed: 01/23/2023]
Abstract
Analyzing genetic markers in nuclear and mitochondrial genomes is helpful in various forensic applications, such as individual identification and kinship analysis. However, most commercial kits detect these markers separately, which is time-consuming, laborious, and more prone to errors such as mislabelling and contamination. The MGIEasy Signature Identification Library Prep Kit (hereinafter "MGIEasy identification system"; MGI Tech, Shenzhen, China) has been designed to provide a simple, fast, and robust way to detect appropriate markers in one multiplex PCR reaction: 52 autosomal STRs, 27 X-chromosomal STRs, 48 Y-chromosomal STRs, 145 identity-informative SNPs, 53 ancestry-informative SNPs, 29 phenotype-informative SNPs, and the hypervariable regions of mitochondrial DNA (mtDNA). Here, we validated the performance of the MGIEasy identification system following the guidelines of the Scientific Working Group on DNA Analysis Methods (SWGDAM), assessing species specificity, sensitivity, mixture identification, stability under non-optimal conditions (degraded samples, inhibitor contamination, and various substrates), repeatability, and concordance. Libraries prepared using the MGIEasy identification system were sequenced on an MGISEQ-2000 instrument (MGI Tech). MGIEasy-derived STR, SNP, and mtDNA genotypes were highly concordant with CE-based STR genotypes (99.79%), MiSeq FGx-based SNP genotypes (99.78%), and Sanger-based mtDNA genotypes (100%), respectively. This system was strongly human-specific, resistant to four common PCR inhibitors, and reliably amplified both low quantities of DNA (as low as 0.125 ng) and degraded DNA (~ 150 nt). Most of the unique alleles from the minor contributor were detected in 1:10 male-female and male-male mixtures; some minor Y-STR alleles were even detected in 1:1000 male-female mixtures. MGIEasy also successfully amplified markers directly from blood stains on FTA cards, filter papers, and swabs. Thus, our results demonstrated that the MGIEasy identification system is suitable for use in forensic analyses owing to its robust and reliable performance on samples of varying quality and quantity.
Affiliation(s)
- Ran Li
- Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, 510080, People's Republic of China.,Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-sen University, No. 74 Zhongshan Road II, Guangzhou, 510080, Guangdong, People's Republic of China
| | - Xuefeng Shen
- Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, 510080, People's Republic of China.,Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-sen University, No. 74 Zhongshan Road II, Guangzhou, 510080, Guangdong, People's Republic of China
| | - Hui Chen
- Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, 510080, People's Republic of China.,Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-sen University, No. 74 Zhongshan Road II, Guangzhou, 510080, Guangdong, People's Republic of China
| | - Dan Peng
- Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, 510080, People's Republic of China.,Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-sen University, No. 74 Zhongshan Road II, Guangzhou, 510080, Guangdong, People's Republic of China
| | - Riga Wu
- Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, 510080, People's Republic of China.,Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-sen University, No. 74 Zhongshan Road II, Guangzhou, 510080, Guangdong, People's Republic of China
| | - Hongyu Sun
- Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, 510080, People's Republic of China. .,Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-sen University, No. 74 Zhongshan Road II, Guangzhou, 510080, Guangdong, People's Republic of China.
| |
|
38
|
King JL, Woerner AE, Mandape SN, Kapema KB, Moura-Neto RS, Silva R, Budowle B. STRait Razor Online: An enhanced user interface to facilitate interpretation of MPS data. Forensic Sci Int Genet 2021; 52:102463. [PMID: 33493821 DOI: 10.1016/j.fsigen.2021.102463] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 09/04/2020] [Revised: 11/06/2020] [Accepted: 12/29/2020] [Indexed: 12/17/2022]
Abstract
Since 2013, STRait Razor has enabled analysis of massively parallel sequencing (MPS) data from various marker systems such as short tandem repeats, single nucleotide polymorphisms, insertion/deletions, and mitochondrial DNA. In this paper, STRait Razor Online (SRO), available at https://www.unthsc.edu/straitrazor, is introduced as an interactive, Shiny-based user interface for primary analysis of MPS data and secondary analysis of STRait Razor haplotype pileups. This software can be accessed from any common browser via desktop, tablet, or smartphone device. SRO is available also as a standalone application and open-source R script available at https://github.com/ExpectationsManaged/STRaitRazorOnline. The local application is capable of batch processing of both fastq files and primary analysis output. Processed batches generate individual report folders and summary reports at the locus- and haplotype-level in a matter of minutes. For example, the processing of data from ∼700 samples generated with the ForenSeq Signature Preparation Kit from allsequences.txt to a final table can be performed in ∼40 min whereas the Excel-based workbooks can take 35-60 h to compile a subset of the tables generated by SRO. To facilitate analysis of single-source, reference samples, a preliminary triaging system was implemented that calls potential alleles and flags loci suspected of severe heterozygote imbalance. When compared to published, manually curated data sets, 98.72 % of software-assigned allele calls without manual interpretation were consistent with curated data sets, 0.99 % loci were presented to the user for interpretation due to heterozygote imbalance, and the remaining 0.29 % of loci were inconsistent due to the analytical thresholds used across the studies.
Affiliation(s)
- Jonathan L King
- Center for Human Identification, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX, 76107, USA.
| | - August E Woerner
- Center for Human Identification, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX, 76107, USA; Department of Microbiology, Immunology, and Genetics, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX, 76107, USA
| | - Sammed N Mandape
- Center for Human Identification, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX, 76107, USA
| | - Kapema Bupe Kapema
- Center for Human Identification, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX, 76107, USA
| | | | - Rosane Silva
- Instituto de Biofisica Carlos Chagas Filho, Universidade Federal do Rio de Janeiro, Rio de Janeiro, RJ, Brazil
| | - Bruce Budowle
- Center for Human Identification, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX, 76107, USA; Department of Microbiology, Immunology, and Genetics, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX, 76107, USA
| |
|
39
|
Woerner AE, Mandape S, King JL, Muenzler M, Crysup B, Budowle B. Reducing noise and stutter in short tandem repeat loci with unique molecular identifiers. Forensic Sci Int Genet 2020; 51:102459. [PMID: 33429137 DOI: 10.1016/j.fsigen.2020.102459] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 08/11/2020] [Revised: 10/28/2020] [Accepted: 12/21/2020] [Indexed: 12/24/2022]
Abstract
Unique molecular identifiers (UMIs) are a promising approach to contend with errors generated during PCR and massively parallel sequencing (MPS). With UMI technology, random molecular barcodes are ligated to template DNA molecules prior to PCR, allowing PCR and sequencing error to be tracked and corrected bioinformatically. UMIs have the potential to be particularly informative for the interpretation of short tandem repeats (STRs). Traditional MPS approaches may simply lead to the observation of alleles that are consistent with the hypotheses of stutter, while with UMIs stutter products may be bioinformatically re-associated with their parental alleles and subsequently removed. Herein, a bioinformatics pipeline named strumi is described that is designed for the analysis of STRs that are tagged with UMIs. Unlike other tools, strumi is an alignment-free, machine-learning-driven algorithm that clusters individual MPS reads into UMI families, infers consensus super-reads that represent each family, and provides an estimate of the resulting haplotype's accuracy. Super-reads, in turn, approximate independent measurements not of the PCR products, but of the original template molecules, both in terms of quantity and sequence identity. Provisional assessments show that naïve threshold-based approaches generate super-reads that are accurate (∼97 % haplotype accuracy, compared to ∼78 % when UMIs are not used), and the application of a more nuanced machine learning approach increases the accuracy to ∼99.5 % depending on the level of certainty desired. With these features, UMIs may greatly simplify probabilistic genotyping systems and reduce uncertainty. However, the ability to interpret alleles at trace levels also permits the interpretation, characterization and quantification of contamination as well as somatic variation (including somatic stutter), which may present newfound challenges.
Affiliation(s)
- August E Woerner
- Center for Human Identification, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX 76107, USA; Department of Microbiology, Immunology and Genetics, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX 76107, USA.
| | - Sammed Mandape
- Center for Human Identification, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX 76107, USA
| | - Jonathan L King
- Center for Human Identification, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX 76107, USA
| | - Melissa Muenzler
- Center for Human Identification, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX 76107, USA
| | - Benjamin Crysup
- Center for Human Identification, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX 76107, USA
| | - Bruce Budowle
- Center for Human Identification, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX 76107, USA; Department of Microbiology, Immunology and Genetics, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX 76107, USA
| |
|
40
|
Borsuk LA, Steffen CR, Kiesler KM, Vallone PM, Gettings KB. Sequence-based U.S. population data for 7 X-STR loci. Forensic Science International: Reports 2020. [DOI: 10.1016/j.fsir.2020.100160] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022] Open
|
41
|
Khubrani YM, Jobling MA, Wetton JH. Massively parallel sequencing of sex-chromosomal STRs in Saudi Arabia reveals patrilineage-associated sequence variants. Forensic Sci Int Genet 2020; 49:102402. [PMID: 33035796 DOI: 10.1016/j.fsigen.2020.102402] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 06/17/2020] [Revised: 09/18/2020] [Accepted: 09/27/2020] [Indexed: 11/27/2022]
Abstract
Massively parallel sequencing (MPS) of forensic STRs has the potential to reveal additional allele diversity compared to conventional capillary electrophoresis (CE) typing strategies, but population studies are currently relatively few in number. The Verogen ForenSeq™ DNA Signature Prep Kit includes both Y-STRs and X-STRs among its targeted loci, and here we report the sequences of these loci, analysed using Verogen's ForenSeq™ Universal Analysis Software (UAS) v1.3 and STRait Razor v3.0, in a representative sample of 89 Saudi Arabian males. We identified 56 length variants (equivalent to CE alleles) and 75 repeat sequence sub-variants across the six X-STRs analysed; equivalent figures for the set of 24 Y-STRs were 147 and 192 respectively. We also observed two flanking sequence variants for the X-, and six for the Y-STRs. Recovery of sequence data and concordance with CE data (where available) across the tested loci was good, though rare flanking variation affected interpretation and allele calling at DYF387S1 and DXS7132. Examination of flanking sequences of the Y-STRs revealed five SNPs (L255, M4790, BY7692, Z16708 and S17543) previously shown to define specific haplogroups by Y-chromosome sequencing. These define Y-haplogroups in 62 % of our sample, a proportion that increases to 91 % when haplogroup-associated repeat-sequence motifs are also considered. A population-level comparison of the Saudi Arabian X-STRs with a global sample showed our dataset to be part of a large cluster of populations of West Eurasian and Middle Eastern origin.
Affiliation(s)
- Yahya M Khubrani
- Department of Genetics & Genome Biology, University of Leicester, University Road, Leicester, UK; Forensic Genetics Laboratory, General Administration of Criminal Evidence, Public Security, Ministry of Interior, Saudi Arabia
| | - Mark A Jobling
- Department of Genetics & Genome Biology, University of Leicester, University Road, Leicester, UK.
| | - Jon H Wetton
- Department of Genetics & Genome Biology, University of Leicester, University Road, Leicester, UK.
| |
|
42
|
Barrio PA, García Ó, Phillips C, Prieto L, Gusmão L, Fernández C, Casals F, Freitas JM, González-Albo MDC, Martín P, Mosquera A, Navarro-Vera I, Paredes M, Pérez JA, Pinzón A, Rasal R, Ruiz-Ramírez J, Trindade BR, Alonso A. The first GHEP-ISFG collaborative exercise on forensic applications of massively parallel sequencing. Forensic Sci Int Genet 2020; 49:102391. [PMID: 32957016 DOI: 10.1016/j.fsigen.2020.102391] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/09/2020] [Revised: 08/20/2020] [Accepted: 08/28/2020] [Indexed: 01/17/2023]
Abstract
One of the main goals of the Spanish and Portuguese-Speaking Working Group of the International Society for Forensic Genetics (GHEP-ISFG) is to promote and contribute to the development and dissemination of scientific knowledge in the field of forensic genetics. The GHEP-ISFG supports several Working Commissions which develop different scientific activities. One of them, the Working Commission on "Massively Parallel Sequencing (MPS): Forensic Applications", organized its first collaborative exercise on forensic applications of MPS technology in 2019. The aim of this exercise was to assess the concordance between the MPS results and those obtained with conventional technologies (capillary electrophoresis and Sanger sequencing), as well as to compare the results obtained within the different MPS platforms and/or the different kits/panels and analysis software packages (commercial and open-access) available on the market. The seven participating laboratories analyzed some samples of the annual GHEP-ISFG proficiency test (EIADN No. 27 (2019)), using Ion Torrent™ or MiSeq FGx® platforms. Six of them sent autosomal STR sequence data, five laboratories performed MPS analysis of individual identification SNPs, four laboratories reported MPS data of Y-chromosomal STRs, and X-chromosomal STRs, three laboratories performed MPS analysis of ancestry informative SNPs and phenotype informative SNPs, two labs performed MPS analysis of the mitochondrial DNA control region, and only one lab produced MPS data of lineage informative SNPs. Autosomal STR sequencing results were highly concordant to the consensus obtained by capillary electrophoresis in the EIADN No. 27 (2019) exercise. Furthermore, in general, a high level of concordance was observed between the results of the participating laboratories, regardless of the platform used. The main discordances were due to errors during the analysis process or from sequence data obtained with low depth of coverage. 
In this paper we highlight some issues that still arise, such as standardization of the nomenclature for STRs analyzed by sequencing with MPS, the universal uptake of a nomenclature framework by the analysis software, and well established validation and accreditation of the new MPS platforms for use in routine forensic case-work.
Affiliation(s)
- Pedro A Barrio
- Working Commission on "Massively Parallel Sequencing (MPS): Forensic Applications" of the GHEP-ISFG (The Spanish and Portuguese Speaking Working Group of the International Society for Forensic Genetics), Spain; Biology Service, National Institute of Toxicology and Forensic Sciences, Department of Madrid, Spain.
| | - Óscar García
- Working Commission on "Massively Parallel Sequencing (MPS): Forensic Applications" of the GHEP-ISFG (The Spanish and Portuguese Speaking Working Group of the International Society for Forensic Genetics), Spain; Forensic Science Unit, Forensic Genetics Section, Basque Country Police, Erandio, Bizkaia, Spain
| | - Christopher Phillips
- Working Commission on "Massively Parallel Sequencing (MPS): Forensic Applications" of the GHEP-ISFG (The Spanish and Portuguese Speaking Working Group of the International Society for Forensic Genetics), Spain; Forensic Genetics Unit, University of Santiago de Compostela, Spain
| | - Lourdes Prieto
- Working Commission on "Massively Parallel Sequencing (MPS): Forensic Applications" of the GHEP-ISFG (The Spanish and Portuguese Speaking Working Group of the International Society for Forensic Genetics), Spain; Forensic Genetics Unit, University of Santiago de Compostela, Spain; Comisaría General de Policía Científica, Madrid, Spain
| | - Leonor Gusmão
- Working Commission on "Massively Parallel Sequencing (MPS): Forensic Applications" of the GHEP-ISFG (The Spanish and Portuguese Speaking Working Group of the International Society for Forensic Genetics), Spain; DNA Diagnostics Laboratory (LDD), State University of Rio de Janeiro (UERJ), Rio de Janeiro, Brazil
| | - Coro Fernández
- Quality Service, National Institute of Toxicology and Forensic Sciences, Department of Madrid, Spain
| | - Ferran Casals
- Servei de Genòmica, Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Barcelona, Spain
| | - Jorge M Freitas
- Instituto Nacional de Criminalística, Polícia Federal, Brazil
| | | | - Pablo Martín
- Biology Service, National Institute of Toxicology and Forensic Sciences, Department of Madrid, Spain
| | - Ana Mosquera
- Forensic Genetics Unit, University of Santiago de Compostela, Spain
| | | | - Manuel Paredes
- Subdirección de Investigación Científica, Instituto Nacional de Medicina Legal y Ciencias Forenses, Colombia
| | - Juan Antonio Pérez
- Forensic Science Unit, Forensic Genetics Section, Basque Country Police, Erandio, Bizkaia, Spain
| | - Andrea Pinzón
- Grupo Nacional de Ciencias Forenses, Instituto Nacional de Medicina Legal y Ciencias Forenses, Colombia
| | - Raquel Rasal
- Servei de Genòmica, Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Barcelona, Spain
| | | | | | - Antonio Alonso
- Working Commission on "Massively Parallel Sequencing (MPS): Forensic Applications" of the GHEP-ISFG (The Spanish and Portuguese Speaking Working Group of the International Society for Forensic Genetics), Spain; Biology Service, National Institute of Toxicology and Forensic Sciences, Department of Madrid, Spain
| |
|
43
|
Wu R, Peng D, Ren H, Li R, Li H, Wang N, Shen X, Huang E, Zhang Y, Sun H. Characterization of genetic polymorphisms in Nigerians residing in Guangzhou using massively parallel sequencing. Forensic Sci Int Genet 2020; 48:102323. [PMID: 32574994 DOI: 10.1016/j.fsigen.2020.102323] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/16/2020] [Revised: 05/21/2020] [Accepted: 06/03/2020] [Indexed: 01/13/2023]
Abstract
African populations exhibit extensive linguistic and cultural diversity but are less studied from a population genetic standpoint. Although much genetic data on admixed African individuals, such as African Americans, have been published, genetic polymorphism data, especially those based on sequence-based typing, are still insufficient in indigenous Africans. In this study, we examined the genetic diversity of 85 Nigerians residing in Guangzhou, China. Forensically relevant genetic markers, including autosomal short tandem repeats (A-STRs), X-chromosomal STRs (X-STRs), Y-chromosomal STRs (Y-STRs), and identity-informative single nucleotide polymorphisms (iiSNPs) were genotyped to uncover the genetic polymorphisms of this population. Sequence-based allelic variations were observed in 22 A-STRs, ten Y-STRs, and four X-STRs. Using massively parallel sequencing (MPS), the allele number increased from 475 length-based alleles to 683 sequence-based alleles. Compared to other populations, the overall observed heterozygosity of the 27 A-STRs was the highest in Nigerians, which reflected the higher genetic diversity of this population. The combined match probability of the 27 A-STRs was low at 9.06 × 10^-38. When both A-STRs and iiSNPs were considered, the cumulative discrimination power and the combined power of exclusion for duo and trio paternity testing were 1 - 2.97 × 10^-57, 1 - 2.20 × 10^-10, and 1 - 4.61 × 10^-17, respectively, which demonstrated that the STRs and SNPs analyzed here can be applied to forensic investigations. In summary, this study uncovers the genetic features of the Nigerian population and provides valuable frequency data for forensic applications.
Affiliation(s)
- Riga Wu
- Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, PR China; Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-sen University, Guangzhou 510080, PR China
| | - Dan Peng
- Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, PR China
| | - Han Ren
- Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, PR China
| | - Ran Li
- Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, PR China; Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-sen University, Guangzhou 510080, PR China
| | - Haixia Li
- Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, PR China; Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-sen University, Guangzhou 510080, PR China
| | - Nana Wang
- Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, PR China; Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-sen University, Guangzhou 510080, PR China
| | - Xuefeng Shen
- Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, PR China; Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-sen University, Guangzhou 510080, PR China
| | - Erwen Huang
- Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, PR China; Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-sen University, Guangzhou 510080, PR China.
| | - Yinming Zhang
- Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, PR China; Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-sen University, Guangzhou 510080, PR China.
| | - Hongyu Sun
- Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, PR China; Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-sen University, Guangzhou 510080, PR China.
| |
|
44
|
Cho S, Shin KJ, Bae SJ, Kwon YL, Lee SD. Improved STR analysis of degraded DNA from human skeletal remains through in-house MPS-STR panel. Electrophoresis 2020; 41:1600-1605. [PMID: 32725901 DOI: 10.1002/elps.202000070] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 03/19/2020] [Revised: 07/06/2020] [Accepted: 07/24/2020] [Indexed: 11/09/2022]
Abstract
DNA analysis of degraded samples and low-copy number DNA derived from skeletal remains, one of the most challenging forensic tasks, is common in disaster victim identification and genetic analysis of historical materials. Massively parallel sequencing (MPS) is a useful technique for STR analysis that enables the sequencing of smaller amplicons compared with conventional capillary electrophoresis (CE), which is valuable for the analysis of degraded DNA. In this study, 92 samples of human skeletal remains (70+ years postmortem) were tested using an in-house MPS-STR system designed for the analysis of degraded DNA. Multiple intrinsic factors of DNA from skeletal remains that affect STR typing were assessed. The recovery of STR alleles was influenced more by the DNA input amount for amplification than by DNA degradation, which may be attributed to the high quantity and quality of the libraries prepared for the MPS run. In addition, a higher success rate of STR typing was achieved using the MPS-STR system compared with a commercial CE-STR system, because it provides smaller fragments for amplification. The results can provide constructive information for the analysis of degraded samples, and this MPS-STR system will contribute to forensic applications involving the investigation of skeletal remains.
Affiliation(s)
- Sohee Cho
- Institute of Forensic and Anthropological Science, Seoul National University College of Medicine, Seoul, Korea
| | - Kyoung-Jin Shin
- Department of Forensic Medicine, Yonsei University College of Medicine, Seoul, Korea
| | - Su-Jin Bae
- Department of Forensic Medicine, Yonsei University College of Medicine, Seoul, Korea
| | - Ye-Lim Kwon
- Department of Forensic Medicine, Yonsei University College of Medicine, Seoul, Korea
| | - Soong Deok Lee
- Institute of Forensic and Anthropological Science, Seoul National University College of Medicine, Seoul, Korea.,Department of Forensic Medicine, Seoul National University College of Medicine, Seoul, Korea
| |
|
45
|
Identification of sequence polymorphisms at 58 STRs and 94 iiSNPs in a Tibetan population using massively parallel sequencing. Sci Rep 2020; 10:12225. [PMID: 32699278 PMCID: PMC7376188 DOI: 10.1038/s41598-020-69137-1] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 01/19/2020] [Accepted: 06/16/2020] [Indexed: 01/26/2023] Open
Abstract
Massively parallel sequencing (MPS) has rapidly become a promising method for forensic DNA typing, due to its ability to detect a large number of markers and samples simultaneously in a single reaction and to provide sequence information directly. In the present study, two kinds of forensic genetic markers, short tandem repeats (STRs) and identity-informative single nucleotide polymorphisms (iiSNPs), were analyzed simultaneously using the ForenSeq DNA Signature Prep Kit, a commercially available kit on an MPS platform. A total of 152 DNA markers, including 27 autosomal STR (A-STR) loci, 24 Y chromosomal STR (Y-STR) loci, 7 X chromosomal STR (X-STR) loci and 94 iiSNP loci were genotyped for 107 Tibetan individuals (53 males and 54 females). Compared with length-based STR typing methods, 112 more A-STR alleles, 41 more Y-STR alleles, and 24 more X-STR alleles were observed at 17 A-STRs, 9 Y-STRs, and 5 X-STRs using sequence-based approaches. Thirty-nine novel sequence variations were observed at 20 STR loci. When the flanking regions were also analyzed in addition to target SNPs at the 94 iiSNPs, 38 more alleles were identified. Our study provided adequate genotype and frequency data for the two types of genetic markers for forensic practice. Moreover, we also showed that this panel is highly polymorphic and informative in the Tibetan population, and should be efficient in forensic kinship testing and personal identification cases.
|
46
|
A case of 46,XX/46,XX chimerism in a phenotypically normal woman. Int J Legal Med 2020; 134:2045-2051. [PMID: 32361859 DOI: 10.1007/s00414-020-02296-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2019] [Accepted: 04/03/2020] [Indexed: 10/24/2022]
Abstract
Chimerism is the presence of two genetically different cell lines within a single organism, which is rarely observed in humans. Usually, chimerism in the human body is revealed by the finding of an abnormal phenotype during a medical examination or is unexpectedly detected in routine genetic analysis. However, the incidence and underlying mechanism of chimerism remain unclear due to the lack of information on this infrequent biological event. A phenotypically normal woman with a 46,XX karyotype and atypical short tandem repeat (STR) allelic patterns observed in DNA analysis was investigated with various genetic testing methods, including STR typing based on capillary electrophoresis and massively parallel sequencing, genome-wide SNP array, and a differentially methylated parental allele assay (DMPA). The proband's parents were not available for testing to discriminate the parental allelic contribution, but the parents' alleles were recovered from testing the proband's siblings. Based on the results consistently found in multiple analyses using STR and single nucleotide polymorphism (SNP) markers, dispermic fertilization was suggested as the underlying mechanism. Various molecular genetic testing methods were applied to elucidate the chimerism observed in the proband in this study. In the future, the development of novel genetic markers or techniques, such as DMPA, may have potential use in the investigation of chimerism.
|
47
|
Wang D, Tao R, Li Z, Pan D, Wang Z, Li C, Shi Y. STRsearch: a new pipeline for targeted profiling of short tandem repeats in massively parallel sequencing data. Hereditas 2020; 157:8. [PMID: 32172688 PMCID: PMC7075041 DOI: 10.1186/s41065-020-00120-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 11/07/2019] [Accepted: 02/18/2020] [Indexed: 12/23/2022] Open
Abstract
BACKGROUND: Short tandem repeats (STRs) are important polymorphic markers for human identification and kinship analyses in forensic science. With the continuous development of massively parallel sequencing (MPS), more laboratories have utilized this technology for forensic applications. Existing STR genotyping tools, mostly developed for whole-genome sequencing data, are not effective for MPS data. More importantly, their backward compatibility with the conventional capillary electrophoresis (CE) technology has not been evaluated or guaranteed. RESULTS: In this study, we developed a new end-to-end pipeline called STRsearch for STR-MPS data analysis. STRsearch not only determines the allele by counting repeat patterns and INDELs that are actually in the STR region, but also translates MPS results into standard STR nomenclature (numbers and letters). We evaluated the performance of STRsearch in two forensic sequencing datasets, and the concordance with CE genotypes was 75.73 and 75.75%, which is 12.32 and 9.05% higher, respectively, than that of the existing tool STRScan. Additionally, we trained a base classifier using sequence properties and used it to predict the probability of correct genotyping at a given locus, resulting in the highest accuracy of 96.13%. CONCLUSIONS: All these results demonstrated that STRsearch was a better tool for preserving backward compatibility with CE for targeted STR profiling in MPS data. STRsearch is available as open-source software at https://github.com/AnJingwd/STRsearch.
Affiliation(s)
- Dong Wang
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders (Ministry of Education), Collaborative Innovation Center for Brain Science, Shanghai Jiao Tong University, Shanghai, China
- Ruiyang Tao
- Shanghai Key Laboratory of Forensic Medicine, Shanghai Forensic Service Platform, Academy of Forensic Science, Ministry of Justice, Shanghai, 200063, People's Republic of China
- Zhiqiang Li
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders (Ministry of Education), Collaborative Innovation Center for Brain Science, Shanghai Jiao Tong University, Shanghai, China
- Dun Pan
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders (Ministry of Education), Collaborative Innovation Center for Brain Science, Shanghai Jiao Tong University, Shanghai, China
- Zhuo Wang
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders (Ministry of Education), Collaborative Innovation Center for Brain Science, Shanghai Jiao Tong University, Shanghai, China
- Chengtao Li
- Shanghai Key Laboratory of Forensic Medicine, Shanghai Forensic Service Platform, Academy of Forensic Science, Ministry of Justice, Shanghai, 200063, People's Republic of China
- Yongyong Shi
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders (Ministry of Education), Collaborative Innovation Center for Brain Science, Shanghai Jiao Tong University, Shanghai, China
|
48
|
Ambers A, Bus MM, King JL, Jones B, Durst J, Bruseth JE, Gill-King H, Budowle B. Forensic genetic investigation of human skeletal remains recovered from the La Belle shipwreck. Forensic Sci Int 2020; 306:110050. [DOI: 10.1016/j.forsciint.2019.110050] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 08/12/2019] [Revised: 10/10/2019] [Accepted: 11/08/2019] [Indexed: 10/25/2022]
|
49
|
Reverse Complement PCR: A novel one-step PCR system for typing highly degraded DNA for human identification. Forensic Sci Int Genet 2020; 44:102201. [DOI: 10.1016/j.fsigen.2019.102201] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 08/30/2019] [Revised: 10/22/2019] [Accepted: 11/03/2019] [Indexed: 12/12/2022]
|
50
|
Sturk-Andreaggi K, Parson W, Allen M, Marshall C. Impact of the sequencing method on the detection and interpretation of mitochondrial DNA length heteroplasmy. Forensic Sci Int Genet 2020; 44:102205. [DOI: 10.1016/j.fsigen.2019.102205] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 05/31/2019] [Revised: 11/09/2019] [Accepted: 11/09/2019] [Indexed: 02/04/2023]
|