1. Song M, Zhou Y, Zhao C, Song F, Hou Y. YHP: Y-chromosome Haplogroup Predictor for predicting male lineages based on Y-STRs. Forensic Sci Int 2024; 361:112113. [PMID: 38936202 DOI: 10.1016/j.forsciint.2024.112113] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2024] [Revised: 05/24/2024] [Accepted: 06/16/2024] [Indexed: 06/29/2024]
Abstract
The human Y chromosome reflects the evolutionary history of males. Male lineage tracing by Y chromosome is of great use in evolutionary, forensic, and anthropological studies. Identifying the male lineage based on the specific distribution of Y haplogroups narrows down the investigation scope, which has been used in forensic scenarios. However, although existing software aids familial searching by using Y-STRs (Y-chromosome short tandem repeats) to predict Y-SNP (Y-chromosome single nucleotide polymorphism) haplogroups, it often lacks resolution. In this study, we developed YHP (Y Haplogroup Predictor), a novel software offering high-resolution haplogroup inference without requiring extensive Y-SNP sequencing. Leveraging existing datasets (219 haplogroups, 4064 samples in total), YHP predicts haplogroups with 0.923 accuracy under the highest haplogroup resolution, employing a random forest algorithm. YHP, available on GitHub (https://github.com/cissy123/YHP-Y-Haplogroup-Predictor-), facilitates high-resolution haplogroup prediction, haplotype mismatch analysis, and haplotype similarity comparison. Notably, it demonstrates efficacy in East Asian populations, benefiting from training data from eight distinct East Asian ethnic populations. Moreover, it enables seamless integration of additional training sets, extending its utility to diverse populations.
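The haplotype mismatch analysis mentioned in this abstract can be illustrated with a minimal sketch. This is not YHP's code: the function name `mismatched_loci`, the dictionary layout, and the repeat counts are assumptions made for illustration, and only loci typed in both haplotypes are compared.

```python
def mismatched_loci(hap_a, hap_b):
    """Return the sorted loci at which two Y-STR haplotypes differ.

    Each haplotype is a dict mapping a locus name to its repeat count.
    Loci typed in only one of the two haplotypes are ignored.
    """
    shared = hap_a.keys() & hap_b.keys()
    return sorted(locus for locus in shared if hap_a[locus] != hap_b[locus])


# Hypothetical repeat counts, for illustration only.
sample_1 = {"DYS19": 15, "DYS390": 24, "DYS391": 10}
sample_2 = {"DYS19": 15, "DYS390": 25, "DYS391": 10, "DYS392": 13}

print(mismatched_loci(sample_1, sample_2))  # ['DYS390']
```

Counting the returned loci gives a simple mismatch score; a real tool would also need a policy for multi-copy loci and missing data, which this sketch leaves out.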
Affiliation(s)
- Mengyuan Song
- Department of Forensic Genetics, West China School of Basic Medical Sciences & Forensic Medicine, Sichuan University, Chengdu 610041, China; Department of Laboratory Medicine, West China Hospital, Sichuan University, Chengdu, China
- Yuxiang Zhou
- Department of Forensic Genetics, West China School of Basic Medical Sciences & Forensic Medicine, Sichuan University, Chengdu 610041, China
- Chenxi Zhao
- College of Computer Science, Sichuan University, Chengdu, China
- Feng Song
- Department of Forensic Genetics, West China School of Basic Medical Sciences & Forensic Medicine, Sichuan University, Chengdu 610041, China
- Yiping Hou
- Department of Forensic Genetics, West China School of Basic Medical Sciences & Forensic Medicine, Sichuan University, Chengdu 610041, China
2. Link V, Zavaleta YJA, Reyes RJ, Ding L, Wang J, Rohlfs RV, Edge MD. Microsatellites used in forensics are in regions enriched for trait-associated variants. iScience 2023; 26:107992. [PMID: 37841589 PMCID: PMC10570123 DOI: 10.1016/j.isci.2023.107992] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2023] [Revised: 08/10/2023] [Accepted: 09/18/2023] [Indexed: 10/17/2023] Open
Abstract
The 20 short tandem repeat (STR) loci of the combined DNA index system (CODIS) are the basis of the vast majority of forensic genetics in the United States. One argument for permissive rules about the collection of CODIS genotypes is that the CODIS loci are thought to contain little information about ancestry or traits. However, in the past 20 years, a growing field has identified hundreds of thousands of genotype-trait associations. Here, we conduct a survey of the landscape of such associations surrounding the CODIS loci as compared with non-CODIS STRs. Although this study cannot establish or quantify associations between CODIS genotypes and phenotypes, we find that the regions around the CODIS loci are enriched for both known pathogenic variants (> 90th percentile) and for trait-associated SNPs identified in genome-wide association studies (GWAS) (≥ 95th percentile in 10kb and 100kb flanking regions), compared with other random sets of autosomal tetranucleotide-repeat STRs.
Affiliation(s)
- Vivian Link
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
- Rochelle-Jan Reyes
- Department of Biology, San Francisco State University, San Francisco, CA, USA
- Linda Ding
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
- Judy Wang
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
- Rori V Rohlfs
- Department of Biology, San Francisco State University, San Francisco, CA, USA
- Department of Data Science and Institute of Ecology and Evolution, University of Oregon, Eugene, OR, USA
- Michael D Edge
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
3. Link V, Zavaleta YJA, Reyes RJ, Ding L, Wang J, Rohlfs RV, Edge MD. Microsatellites used in forensics are located in regions unusually rich in trait-associated variants. bioRxiv 2023:2023.03.07.531629. [PMID: 36945578 PMCID: PMC10028909 DOI: 10.1101/2023.03.07.531629] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/12/2023]
Abstract
The 20 short tandem repeat (STR) markers of the combined DNA index system (CODIS) are the basis of the vast majority of forensic genetics in the United States. One argument for permissive rules about the collection of CODIS genotypes is that the CODIS markers are thought to contain information relevant to identification only (such as a human fingerprint would), with little information about ancestry or traits. However, in the past 20 years, a quickly growing field has identified hundreds of thousands of genotype-trait associations. Here we conduct a survey of the landscape of such associations surrounding the CODIS loci as compared with non-CODIS STRs. We find that the regions around the CODIS markers are enriched for both known pathogenic variants (>90th percentile) and for SNPs identified as trait-associated in genome-wide association studies (GWAS) (≥95th percentile in 10kb and 100kb flanking regions), compared with other random sets of autosomal tetranucleotide-repeat STRs. Although it is not obvious how much phenotypic information CODIS would need to convey to strain the "DNA fingerprint" analogy, the CODIS markers, considered as a set, are in regions unusually dense with variants with known phenotypic associations.
Affiliation(s)
- Vivian Link
- Department of Quantitative and Computational Biology, University of Southern California
- Linda Ding
- Department of Quantitative and Computational Biology, University of Southern California
- Judy Wang
- Department of Quantitative and Computational Biology, University of Southern California
- Rori V. Rohlfs
- Department of Biology, San Francisco State University
- Department of Computer Science and Institute of Ecology and Evolution, University of Oregon
- Michael D. Edge
- Department of Quantitative and Computational Biology, University of Southern California
4. Ogbunugafor CB, Edge MD. Gattaca as a lens on contemporary genetics: marking 25 years into the film's "not-too-distant" future. Genetics 2022; 222:iyac142. [PMID: 36218390 PMCID: PMC9713434 DOI: 10.1093/genetics/iyac142] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2022] [Accepted: 09/05/2022] [Indexed: 11/13/2022] Open
Abstract
The 1997 film Gattaca has emerged as a canonical pop culture reference used to discuss modern controversies in genetics and bioethics. It appeared in theaters a few years prior to the announcement of the "completion" of the human genome (2000), as the science of human genetics was developing a renewed sense of its social implications. The story is set in a near-future world in which parents can, with technological assistance, influence the genetic composition of their offspring on the basis of predicted life outcomes. The current moment, 25 years after the film's release, offers an opportunity to reflect on where society currently stands with respect to the ideas explored in Gattaca. Here, we review and discuss several active areas of genetic research (genetic prediction, embryo selection, forensic genetics, and others) that interface directly with scenes and concepts in the film. On its silver anniversary, we argue that Gattaca remains an important reflection of society's expectations and fears with respect to the ways that genetic science has manifested in the real world. In accompanying supplemental material, we offer some thought questions to guide group discussions inside and outside of the classroom.
Affiliation(s)
- C Brandon Ogbunugafor
- Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT 06520, USA
- Santa Fe Institute, Santa Fe, NM 87501, USA
- Vermont Complex Systems Center, Burlington, VT 05401, USA
- Michael D Edge
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
5. Bañuelos MM, Zavaleta YJA, Roldan A, Reyes RJ, Guardado M, Chavez Rojas B, Nyein T, Rodriguez Vega A, Santos M, Huerta-Sanchez E, Rohlfs RV. Associations between forensic loci and expression levels of neighboring genes may compromise medical privacy. Proc Natl Acad Sci U S A 2022; 119:e2121024119. [PMID: 36166477 PMCID: PMC9546536 DOI: 10.1073/pnas.2121024119] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 11/18/2021] [Accepted: 08/29/2022] [Indexed: 11/18/2022] Open
Abstract
A set of 20 short tandem repeats (STRs) is used by the US criminal justice system to identify suspects and to maintain a database of genetic profiles for individuals who have been previously convicted or arrested. Some of these STRs were identified in the 1990s, with a preference for markers in putative gene deserts to avoid forensic profiles revealing protected medical information. We revisit that assumption, investigating whether forensic genetic profiles reveal information about gene-expression variation or potential medical information. We find six significant correlations (false discovery rate = 0.23) between the forensic STRs and the expression levels of neighboring genes in lymphoblastoid cell lines. We explore possible mechanisms for these associations, showing evidence compatible with forensic STRs causing expression variation or being in linkage disequilibrium with a causal locus in three cases and weaker or potentially spurious associations in the other three cases. Together, these results suggest that forensic genetic loci may reveal expression levels and, perhaps, medical information.
Affiliation(s)
- Mayra M. Bañuelos
- Department of Mathematics, San Francisco State University, San Francisco, CA 94132
- Ecology, Evolution and Organismal Biology, Brown University, Providence, RI 02912
- Center for Computational and Molecular Biology, Brown University, Providence, RI 02912
- Alennie Roldan
- Department of Biology, San Francisco State University, San Francisco, CA 94132
- Rochelle-Jan Reyes
- Department of Biology, San Francisco State University, San Francisco, CA 94132
- Miguel Guardado
- Department of Mathematics, San Francisco State University, San Francisco, CA 94132
- Thet Nyein
- Department of Mathematics, San Francisco State University, San Francisco, CA 94132
- Ana Rodriguez Vega
- Department of Biology, San Francisco State University, San Francisco, CA 94132
- Maribel Santos
- Department of Biology, San Francisco State University, San Francisco, CA 94132
- Emilia Huerta-Sanchez
- Ecology, Evolution and Organismal Biology, Brown University, Providence, RI 02912
- Center for Computational and Molecular Biology, Brown University, Providence, RI 02912
- Rori V. Rohlfs
- Department of Biology, San Francisco State University, San Francisco, CA 94132
6. Morrison ML, Alcala N, Rosenberg NA. FSTruct: An FST-based tool for measuring ancestry variation in inference of population structure. Mol Ecol Resour 2022; 22:2614-2626. [PMID: 35596736 PMCID: PMC9544611 DOI: 10.1111/1755-0998.13647] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2021] [Revised: 03/09/2022] [Accepted: 05/13/2022] [Indexed: 11/30/2022]
Abstract
In model-based inference of population structure from individual-level genetic data, individuals are assigned membership coefficients in a series of statistical clusters generated by clustering algorithms. Distinct patterns of variability in membership coefficients can be produced for different groups of individuals, for example, representing different predefined populations, sampling sites or time periods. Such variability can be difficult to capture in a single numerical value; membership coefficient vectors are multivariate and potentially incommensurable across predefined groups, as the number of clusters over which individuals are distributed can vary among groups of interest. Further, two groups might share few clusters in common, so that membership coefficient vectors are concentrated on different clusters. We introduce a method for measuring the variability of membership coefficients of individuals in a predefined group, making use of an analogy between variability across individuals in membership coefficient vectors and variation across populations in allele frequency vectors. We show that in a model in which membership coefficient vectors in a population follow a Dirichlet distribution, the measure increases linearly with a parameter describing the variance of a specified component of the membership vector and does not depend on its mean. We apply the approach, which makes use of a normalized FST statistic, to data on inferred population structure in three example scenarios. We also introduce a bootstrap test for equivalence of two or more predefined groups in their level of membership coefficient variability. Our methods are implemented in the R package FSTruct.
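The analogy described in this abstract (individuals play the role of populations, clusters play the role of alleles) can be sketched with the textbook heterozygosity form of FST. This is a hedged illustration only: it computes the unnormalized ratio (H_T - H_S) / H_T, whereas the abstract describes a normalized FST statistic, and that normalization is not reproduced here. The function name and the example matrices are hypothetical.

```python
def membership_fst(q_matrix):
    """Unnormalized FST over a group of membership coefficient vectors.

    q_matrix: list of rows, one per individual; each row holds that
    individual's membership coefficients across K clusters (summing to 1).
    """
    n = len(q_matrix)
    k = len(q_matrix[0])
    # Mean membership per cluster, analogous to a pooled allele frequency.
    mean_q = [sum(row[j] for row in q_matrix) / n for j in range(k)]
    # "Total" diversity computed from the mean membership vector.
    h_t = 1 - sum(m * m for m in mean_q)
    # Mean "within-individual" diversity.
    h_s = sum(1 - sum(q * q for q in row) for row in q_matrix) / n
    if h_t == 0:
        # Every individual sits entirely in the same cluster; the ratio is
        # undefined, and this sketch returns 0.0 by convention.
        return 0.0
    return (h_t - h_s) / h_t


# Identical membership vectors: no variability among individuals.
print(membership_fst([[0.5, 0.5], [0.5, 0.5]]))  # 0.0
# Each individual fully assigned to a different cluster: maximal variability.
print(membership_fst([[1, 0], [0, 1]]))          # 1.0
```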
Affiliation(s)
- Nicolas Alcala
- Rare Cancers Genomics Team (RCG), Genomic Epidemiology Branch (GEM), International Agency for Research on Cancer/World Health Organisation (IARC/WHO), Lyon, France
7. Kidd KK, Pakstis AJ, Gandotra N, Scharfe C, Podini D. A multipurpose panel of microhaplotypes for use with STR markers in casework. Forensic Sci Int Genet 2022; 60:102729. [PMID: 35696960 PMCID: PMC11071123 DOI: 10.1016/j.fsigen.2022.102729] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/25/2022] [Revised: 05/30/2022] [Accepted: 05/31/2022] [Indexed: 11/19/2022]
Abstract
A small panel of highly informative loci that can be genotyped on the same equipment as the standard CODIS short tandem repeat (STR) markers has strong potential for application in forensic casework. Single nucleotide polymorphisms (SNPs) can be typed by a couple of methods on capillary electrophoresis (CE) machines and on sequencers, but the amount of information relative to the laboratory effort has hindered use of SNPs in actual casework. Insertion-deletion markers (InDels) suffer from similar problems. Microhaplotypes (MHs) are much more informative per locus but have similar technical difficulties unless they are typed by massively parallel sequencing (MPS). As forensic labs are acquiring sequencing machines, MHs become more likely to be used in casework, especially if multiplexed with STRs. Here we present the details of a multipurpose panel of 24 MHs with the highest effective number of alleles (Ae) from previous work. An augmented STR panel of 24 loci (20 CODIS markers plus four commonly typed STRs) is also considered. The Ae and ancestry informativeness (In) distributions of these two datasets are compared. The MH panel is shown to have better individualization and population distinction than the augmented CODIS STRs. We note that the 24 MHs should be better for mixture analyses than the STRs. Finally, we suggest that a commercial kit including both the standard CODIS markers and this set of 24 MH would greatly improve the discrimination power over that of current commercial assays.
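The effective number of alleles (Ae) referred to in this abstract is commonly defined as the reciprocal of the sum of squared allele frequencies at a locus, Ae = 1 / Σ p_i². A minimal sketch under that standard definition follows; the function name and the example frequencies are illustrative and are not taken from the paper.

```python
def effective_number_of_alleles(freqs, tol=1e-9):
    """Ae = 1 / sum(p_i^2) for the allele frequencies at one locus."""
    if abs(sum(freqs) - 1.0) > tol:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 / sum(p * p for p in freqs)


# Four equally frequent alleles behave like exactly four alleles.
print(effective_number_of_alleles([0.25, 0.25, 0.25, 0.25]))  # 4.0
# A skewed locus with four alleles is less informative than that.
print(effective_number_of_alleles([0.7, 0.1, 0.1, 0.1]))
```

Ae equals the raw allele count only when all alleles are equally common, which is why it is used to compare the per-locus informativeness of markers with different frequency spectra.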
Affiliation(s)
- Kenneth K Kidd
- Yale University School of Medicine, Department of Genetics, 333 Cedar Street, New Haven, CT 06520, United States
- Andrew J Pakstis
- Yale University School of Medicine, Department of Genetics, 333 Cedar Street, New Haven, CT 06520, United States
- Neeru Gandotra
- Yale University School of Medicine, Department of Genetics, 333 Cedar Street, New Haven, CT 06520, United States
- Curt Scharfe
- Yale University School of Medicine, Department of Genetics, 333 Cedar Street, New Haven, CT 06520, United States
- Daniele Podini
- The George Washington University, Department of Forensic Science, 2100 Foxhall Road, NW, Washington, DC 20007, United States
8. Alcala N, Rosenberg NA. Mathematical constraints on FST: multiallelic markers in arbitrarily many populations. Philos Trans R Soc Lond B Biol Sci 2022; 377:20200414. [PMID: 35430885 PMCID: PMC9014193 DOI: 10.1098/rstb.2020.0414] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/22/2021] [Accepted: 10/23/2021] [Indexed: 11/12/2022] Open
Abstract
Interpretations of values of the FST measure of genetic differentiation rely on an understanding of its mathematical constraints. Previously, it has been shown that FST values computed from a biallelic locus in a set of multiple populations and FST values computed from a multiallelic locus in a pair of populations are mathematically constrained as a function of the frequency of the allele that is most frequent across populations. We generalize from these cases to report here the mathematical constraint on FST given the frequency M of the most frequent allele at a multiallelic locus in a set of multiple populations. Using coalescent simulations of an island model of migration with an infinitely-many-alleles mutation model, we argue that the joint distribution of FST and M helps in disentangling the separate influences of mutation and migration on FST. Finally, we show that our results explain a puzzling pattern of microsatellite differentiation: the lower FST in an interspecific comparison between humans and chimpanzees than in the comparison of chimpanzee populations. We discuss the implications of our results for the use of FST. This article is part of the theme issue 'Celebrating 50 years since Lewontin's apportionment of human diversity'.
Affiliation(s)
- Nicolas Alcala
- Rare Cancers Genomics Team (RCG), Genetic Epidemiology Branch (GEM), International Agency for Research on Cancer/World Health Organization, Lyon 69008, France
- Noah A. Rosenberg
- Department of Biology, Stanford University, Stanford, CA 94305-5020, USA
9. Huszar TI, Bodmer WF, Hutnik K, Wetton JH, Jobling MA. Sequencing of autosomal, mitochondrial and Y-chromosomal forensic markers in the People of the British Isles cohort detects population structure dominated by patrilineages. Forensic Sci Int Genet 2022; 59:102725. [DOI: 10.1016/j.fsigen.2022.102725] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2022] [Revised: 05/08/2022] [Accepted: 05/13/2022] [Indexed: 11/27/2022]
10. Maróstica AS, Nunes K, Castelli EC, Silva NSB, Weir BS, Goudet J, Meyer D. How HLA diversity is apportioned: influence of selection and relevance to transplantation. Philos Trans R Soc Lond B Biol Sci 2022; 377:20200420. [PMID: 35430892 PMCID: PMC9014195 DOI: 10.1098/rstb.2020.0420] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Indexed: 12/19/2022] Open
Abstract
In his 1972 paper ‘The apportionment of human diversity’, Lewontin showed that, when averaged over loci, genetic diversity is predominantly attributable to differences among individuals within populations. However, selection can alter the apportionment of diversity of specific genes or genomic regions. We examine genetic diversity at the human leucocyte antigen (HLA) loci, located within the major histocompatibility complex (MHC) region. HLA genes code for proteins that are critical to adaptive immunity and are well-documented targets of balancing selection. The single-nucleotide polymorphisms (SNPs) within HLA genes show strong signatures of balancing selection on large timescales and are broadly shared among populations, displaying low FST values. However, when we analyse haplotypes defined by these SNPs (which define ‘HLA alleles’), we find marked differences in frequencies between geographic regions. These differences are not reflected in the FST values because of the extreme polymorphism at HLA loci, illustrating challenges in interpreting FST. Differences in the frequency of HLA alleles among geographic regions are relevant to bone-marrow transplantation, which requires genetic identity at HLA loci between patient and donor. We discuss the case of Brazil's bone marrow registry, where a deficit of enrolled volunteers with African ancestry reduces the chance of finding donors for individuals with an MHC region of African ancestry. This article is part of the theme issue ‘Celebrating 50 years since Lewontin's apportionment of human diversity’.
Affiliation(s)
- André Silva Maróstica
- Departamento de Genética e Biologia Evolutiva, Universidade de São Paulo, São Paulo, SP, Brazil
- Kelly Nunes
- Departamento de Genética e Biologia Evolutiva, Universidade de São Paulo, São Paulo, SP, Brazil
- Erick C. Castelli
- Departamento de Patologia, Universidade Estadual Paulista - Unesp, Faculdade de Medicina de Botucatu, Botucatu, SP, Brazil
- Molecular Genetics and Bioinformatics Laboratory, Experimental Research Unit, School of Medicine, São Paulo State University - Unesp, Botucatu, SP, Brazil
- Nayane S. B. Silva
- Molecular Genetics and Bioinformatics Laboratory, Experimental Research Unit, School of Medicine, São Paulo State University - Unesp, Botucatu, SP, Brazil
- Bruce S. Weir
- Department of Biostatistics, University of Washington, Seattle, WA 98195, USA
- Jérôme Goudet
- Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland
- Swiss Institute of Bioinformatics, University of Lausanne, 1015 Lausanne, Switzerland
- Diogo Meyer
- Departamento de Genética e Biologia Evolutiva, Universidade de São Paulo, São Paulo, SP, Brazil
11. Jobling MA. Forensic genetics through the lens of Lewontin: population structure, ancestry and race. Philos Trans R Soc Lond B Biol Sci 2022; 377:20200422. [PMID: 35430883 PMCID: PMC9014189 DOI: 10.1098/rstb.2020.0422] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/13/2022] Open
Abstract
In his famous 1972 paper, Richard Lewontin used ‘classical’ protein-based markers to show that greater than 85% of human genetic diversity was contained within, rather than between, populations. At that time, these same markers also formed the basis of forensic technology aiming to identify individuals. This review describes the evolution of forensic genetic methods into DNA profiling, and how the field has accounted for the apportionment of genetic diversity in considering the weight of forensic evidence. When investigative databases fail to provide a match to a crime-scene profile, specific markers can be used to seek intelligence about a suspect: these include inferences on population of origin (biogeographic ancestry) and externally visible characteristics, chiefly pigmentation of skin, hair and eyes. In this endeavour, ancestry and phenotypic variation are closely entangled. The markers used show patterns of inter- and intrapopulation diversity that are very atypical compared to the genome as a whole, and reinforce an apparent link between ancestry and racial divergence that is not systematically present otherwise. Despite the legacy of Lewontin's result, therefore, in a major area in which genetics coincides with issues of public interest, methods tend to exaggerate human differences and could thereby contribute to the reification of biological race. This article is part of the theme issue ‘Celebrating 50 years since Lewontin's apportionment of human diversity’.
Affiliation(s)
- Mark A. Jobling
- Department of Genetics and Genome Biology, University of Leicester, University Road, Leicester LE1 7RH, UK
12. Laurent FX, Fischer A, Oldt RF, Kanthaswamy S, Buckleton JS, Hitchin S. Streamlining the decision-making process for international DNA kinship matching using Worldwide allele frequencies and tailored cutoff log10LR thresholds. Forensic Sci Int Genet 2021; 57:102634. [PMID: 34871915 DOI: 10.1016/j.fsigen.2021.102634] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/16/2021] [Revised: 10/13/2021] [Accepted: 11/15/2021] [Indexed: 11/30/2022]
Abstract
The identification of human remains belonging to missing persons is one of the main challenges for forensic genetics. Although other means of identification can be applied to missing person investigations, DNA is often extremely valuable to further support or refute potential associations. When reference DNA samples cannot be collected from personal items belonging to a missing person, a direct DNA identification cannot be carried out. However, identifications can be made indirectly using DNA from the missing person's relatives. The ranking of likelihood ratio (LR) values, which measure the fit of a missing person for any given pedigree, is often the first step in selecting candidates in a DNA database. Although implementing DNA kinship matching in a national environment is feasible, many challenges need to be resolved before applying this method to an international configuration. In this study, we present an innovative and intuitive method to perform international DNA kinship matching and facilitate the comparison of DNA profiles when the ancestry is unknown or unsure and/or when different marker sets are used. This straightforward method, which is based on calculations performed with the DNA matching software BONAPARTE, Worldwide allele frequencies and tailored cutoff log10LR thresholds, allows for the classification of potential candidates according to the strength of the DNA evidence and the predicted proportion of adventitious matches. This is a powerful method for streamlining the decision-making process in missing person investigations and DVI processes, especially when there are low numbers of overlapping typed STRs. Intuitive interpretation tables and a decision tree will help strengthen international data comparison for the identification of reported missing individuals discovered outside their national borders.
Affiliation(s)
- François-Xavier Laurent
- International Criminal Police Organization - INTERPOL, DNA Unit, 200 quai Charles de Gaulle, 69006 Lyon, France
- Andrea Fischer
- International Criminal Police Organization - INTERPOL, DNA Unit, 200 quai Charles de Gaulle, 69006 Lyon, France; Landeskriminalamt Baden-Württemberg, Taubenheimstr. 85, 70372 Stuttgart, Germany
- Robert F Oldt
- School of Mathematical and Natural Sciences, Arizona State University, Phoenix, AZ 85004, USA
- Sree Kanthaswamy
- School of Mathematical and Natural Sciences, Arizona State University, Phoenix, AZ 85004, USA
- John S Buckleton
- University of Auckland, Department of Statistics, Private Bag, 92019 Auckland, New Zealand
- Susan Hitchin
- International Criminal Police Organization - INTERPOL, DNA Unit, 200 quai Charles de Gaulle, 69006 Lyon, France
13. Tran LH, Chu PTM, Nguyen TH, La HV, Nguyen HTH, Tran HT, Nguyen HM, Hoang H, Chu HH. Genetic structure and population connection of two Bouyei populations in northern Vietnam based on short tandem repeat analysis. Am J Hum Biol 2021; 34:e23702. [PMID: 34784439 DOI: 10.1002/ajhb.23702] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2020] [Revised: 10/05/2021] [Accepted: 11/01/2021] [Indexed: 11/07/2022] Open
Abstract
OBJECTIVES: Genetic characteristics were investigated based on short tandem repeat (STR) data to assess the relationship between two Bouyei populations in Vietnam. METHODS: We collected hair and buccal swab samples from two separate Bouyei populations in the mountainous region of Northern Vietnam, the Bo Y in Ha Giang Province and the Tu Di in Lao Cai Province. The study included data from 23 autosomal and 27 Y-chromosome STR loci of 96 unrelated participants from a total Vietnamese Bouyei population of under 3300 individuals. RESULTS: The results showed that these STR markers are valuable for differentiation of individuals and human genetic studies in Vietnamese Bouyei populations. Genetic analysis indicated that Tu Di and Bo Y people were from the same Bouyei population in China. CONCLUSIONS: The results supported the official historical records of the region and the classification of the Vietnamese government. Furthermore, the genetic data provided in this study will be helpful in investigating the genetic genealogy, evolution, and settlement or migration patterns of the Bouyei populations in Vietnam.
Affiliation(s)
- Linh Huyen Tran
- National Key Laboratory of Gene Technology, Institute of Biotechnology (IBT), Vietnam Academy of Science and Technology (VAST), Hanoi, Vietnam
- Trang Hong Nguyen
- National Key Laboratory of Gene Technology, Institute of Biotechnology (IBT), Vietnam Academy of Science and Technology (VAST), Hanoi, Vietnam
- Hong Viet La
- Hanoi Pedagogical University 2, Vinh Phuc, Vietnam
- Hanh Thi Hong Nguyen
- National Key Laboratory of Gene Technology, Institute of Biotechnology (IBT), Vietnam Academy of Science and Technology (VAST), Hanoi, Vietnam
- Hoai Thu Tran
- National Key Laboratory of Gene Technology, Institute of Biotechnology (IBT), Vietnam Academy of Science and Technology (VAST), Hanoi, Vietnam
- Ha Hoang
- National Key Laboratory of Gene Technology, Institute of Biotechnology (IBT), Vietnam Academy of Science and Technology (VAST), Hanoi, Vietnam; Centre of DNA Identification, IBT, VAST, Hanoi, Vietnam
- Hoang Ha Chu
- National Key Laboratory of Gene Technology, Institute of Biotechnology (IBT), Vietnam Academy of Science and Technology (VAST), Hanoi, Vietnam; Graduate University of Science and Technology, VAST, Hanoi, Vietnam
|
14
|
Zhao X, Fan Y, Zeye MMJ, He W, Wen D, Wang C, Li J, Hua Z. A novel set of short microhaplotypes based on non-binary SNPs for forensic challenging samples. Int J Legal Med 2021; 136:43-53. [PMID: 34654943 DOI: 10.1007/s00414-021-02719-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/20/2021] [Accepted: 09/28/2021] [Indexed: 01/23/2023]
Abstract
Short tandem repeats (STRs) are the most widely used genetic markers in forensic applications, but they are not ideal for the analysis of forensically challenging samples, such as highly degraded or unbalanced mixed samples, because of their relatively large amplicons and stutter peaks. In this study, we developed a set of short microhaplotypes based on non-binary SNPs with molecular extent sizes no longer than 60 bases and genotyped 100 unrelated individuals from northern Han groups. Our results showed that this panel has discrimination power similar to that of STR kits: the combined random match probability (CMP) reached 1.396 × 10⁻²² and the mean effective number of alleles (Ae) was 3.59. The cumulative probability of exclusion for duos (CPE-duos) was 0.999919 and the cumulative probability of exclusion for trios (CPE-trios) was 0.9999999987, suggesting that this panel could be applied independently for forensic personal identification and parentage testing. Population differentiation in 26 populations from the 1000 Genomes Project indicated that this panel could distinguish populations from Africa, East Asia, South Asia, America, and Europe. These microhaplotypes based on non-binary SNPs have short amplicons, good discrimination power, and no stutter artifacts, and they have great potential in the detection of highly degraded and unbalanced mixtures for personal identification, paternity testing, and ancestry inference.
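The two summary statistics quoted in this abstract can be illustrated with a minimal sketch. It assumes Hardy-Weinberg proportions and independent loci, takes Ae as the reciprocal of the summed squared allele frequencies, and takes the per-locus match probability as the sum of squared genotype frequencies. The function names and the toy frequencies are hypothetical and are not taken from the paper.

```python
from itertools import combinations_with_replacement
from math import prod


def effective_alleles(freqs):
    """Effective number of alleles at one locus: Ae = 1 / sum(p_i^2)."""
    return 1.0 / sum(p * p for p in freqs)


def match_probability(freqs):
    """Random match probability at one locus.

    Sums the squared expected genotype frequencies, assuming
    Hardy-Weinberg proportions (p^2 for homozygotes, 2pq for heterozygotes).
    """
    total = 0.0
    for i, j in combinations_with_replacement(range(len(freqs)), 2):
        genotype = freqs[i] ** 2 if i == j else 2 * freqs[i] * freqs[j]
        total += genotype ** 2
    return total


def combined_match_probability(loci):
    """Product of per-locus match probabilities, assuming independent loci."""
    return prod(match_probability(freqs) for freqs in loci)


if __name__ == "__main__":
    # Toy panel: two loci with made-up haplotype frequencies.
    panel = [[0.25, 0.25, 0.25, 0.25], [0.5, 0.3, 0.2]]
    for locus in panel:
        print(effective_alleles(locus), match_probability(locus))
    print(combined_match_probability(panel))
```

Because the combined value is a product of per-locus terms, each additional locus with a high Ae shrinks it multiplicatively, which is how a panel of short markers can reach values comparable to STR kits.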
Affiliation(s)
- Xingchun Zhao
- School of Biopharmacy, China Pharmaceutical University, Nanjing, 211198, China; National Engineering Laboratory for Forensic Science, Beijing, 100038, China
- Yang Fan
- National Engineering Laboratory for Forensic Science, Beijing, 100038, China
- Moutanou Modeste Judes Zeye
- Department of Forensic Science, School of Basic Medical Sciences, Central South University, No. 172 Tongzipo Road, Changsha, Hunan Province, 410013, People's Republic of China
- Wei He
- Department of Forensic Science, School of Basic Medical Sciences, Central South University, No. 172 Tongzipo Road, Changsha, Hunan Province, 410013, People's Republic of China
- Dan Wen
- Department of Forensic Science, School of Basic Medical Sciences, Central South University, No. 172 Tongzipo Road, Changsha, Hunan Province, 410013, People's Republic of China
- Chudong Wang
- Department of Forensic Science, School of Basic Medical Sciences, Central South University, No. 172 Tongzipo Road, Changsha, Hunan Province, 410013, People's Republic of China
- Jienan Li
- Department of Forensic Science, School of Basic Medical Sciences, Central South University, No. 172 Tongzipo Road, Changsha, Hunan Province, 410013, People's Republic of China
- Zichun Hua
- School of Biopharmacy, China Pharmaceutical University, Nanjing, 211198, China

15
Bae JH, Zhang DY. Predicting stability of DNA bulge at mononucleotide microsatellite. Nucleic Acids Res 2021; 49:7901-7908. [PMID: 34308470 PMCID: PMC8373066 DOI: 10.1093/nar/gkab616] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2021] [Revised: 06/28/2021] [Accepted: 07/07/2021] [Indexed: 11/14/2022] Open
Abstract
Mononucleotide microsatellites are clinically and forensically crucial DNA sequences due to their high mutability and abundance in the human genome. As a mutagenic intermediate of an indel in a microsatellite, and as a consequence of probe hybridization after such mutagenesis, a bulge with structural degeneracy that slides within the microsatellite is formed. The stability of such dynamic bulges, however, is still poorly understood despite their critical role in cancer genomics and neurological disease studies. In this paper, we have built a model that predicts the thermodynamics of a sliding bulge at a microsatellite. We first identified 40 common bulge states that can be assembled into any sliding bulge, and then characterized them with toehold exchange energy measurement and the partition function. Our model, which is the first to predict the free energy of sliding bulges with more than three repeats, can infer the stability penalty of a sliding bulge of any sequence and length with a median prediction error of 0.22 kcal/mol. Patterns from the prediction clearly explain landscapes of microsatellites observed in the literature, such as higher mutation rates of longer microsatellites and C/G microsatellites.
Affiliation(s)
- Jin H Bae
- Department of Bioengineering, Rice University, Houston, TX 77005, USA
- David Yu Zhang
- Department of Bioengineering, Rice University, Houston, TX 77005, USA; Systems, Synthetic, and Physical Biology, Rice University, Houston, TX 77005, USA

16
Forensic proteomics. Forensic Sci Int Genet 2021; 54:102529. [PMID: 34139528 DOI: 10.1016/j.fsigen.2021.102529] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 09/24/2020] [Revised: 05/06/2021] [Accepted: 05/07/2021] [Indexed: 12/19/2022]
Abstract
Protein is a major component of all biological evidence, often the matrix that embeds other biomolecules such as polynucleotides, lipids, carbohydrates, and small molecules. The proteins in a sample reflect the transcriptional and translational program of the originating cell types. Because of this, proteins can be used to identify body fluids and tissues, as well as convey genetic information in the form of single amino acid polymorphisms, the result of non-synonymous SNPs. This review explores the application and potential of forensic proteomics. The historical role that protein analysis played in the development of forensic science is examined. This review details how innovations in proteomic mass spectrometry have addressed many of the historical limitations of forensic protein science, and how the application of forensic proteomics differs from proteomics in the life sciences. Two more developed applications of forensic proteomics are examined in detail: body fluid and tissue identification, and proteomic genotyping. The review then highlights developing areas of proteomics that have the potential to impact forensic science in the near future: fingermark analysis, species identification, peptide toxicology, proteomic sex estimation, and estimation of post-mortem intervals. Finally, the review highlights some of the newer innovations in proteomics that may drive further development of the field. In addition to potential impact, this review also attempts to evaluate the stage of each application in the development, validation and implementation process. This review is targeted at investigators who are interested in learning about proteomics in a forensic context and expanding the amount of information they can extract from biological evidence.

17
Lloyd JP, Soellner MB, Merajver SD, Li JZ. Impact of between-tissue differences on pan-cancer predictions of drug sensitivity. PLoS Comput Biol 2021; 17:e1008720. [PMID: 33630864 PMCID: PMC7906305 DOI: 10.1371/journal.pcbi.1008720] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 05/19/2020] [Accepted: 01/18/2021] [Indexed: 11/24/2022] Open
Abstract
Increased availability of drug response and genomics data for many tumor cell lines has accelerated the development of pan-cancer prediction models of drug response. However, it is unclear how much between-tissue differences in drug response and molecular characteristics may contribute to pan-cancer predictions. Also unknown is whether the performance of pan-cancer models could vary by cancer type. Here, we built a series of pan-cancer models using two datasets containing 346 and 504 cell lines, each with MEK inhibitor (MEKi) response and mRNA expression, point mutation, and copy number variation data, and found that, while the tissue-level drug responses are accurately predicted (between-tissue ρ = 0.88–0.98), only 5 of 10 cancer types showed successful within-tissue prediction performance (within-tissue ρ = 0.11–0.64). Between-tissue differences make substantial contributions to the performance of pan-cancer MEKi response predictions, as exclusion of between-tissue signals leads to a decrease in Spearman’s ρ from a range of 0.43–0.62 to 0.30–0.51. In practice, joint analysis of multiple cancer types usually has a larger sample size, hence greater power, than analysis of one cancer type; and we observe that higher accuracy of pan-cancer prediction of MEKi response is almost entirely due to the sample size advantage. Success of pan-cancer prediction reveals how drug response in different cancers may invoke shared regulatory mechanisms despite tissue-specific routes of oncogenesis, yet predictions in different cancer types require flexible incorporation of between-cancer and within-cancer signals. As most datasets in genome sciences contain multiple levels of heterogeneity, careful parsing of group characteristics and within-group, individual variation is essential when making robust inference.
One of the central goals for precision oncology is to tailor treatment of individual tumors by their molecular characteristics. While drug response predictions have traditionally been sought within each cancer type, it has long been hoped to develop more robust predictions by jointly considering diverse cancer types. While such pan-cancer approaches have improved in recent years, it remains unclear whether between-tissue differences are contributing to the reported pan-cancer prediction performance. This concern stems from the observation that, when cancer types differ in both molecular features and drug response, strong predictive information can come mainly from differences among tissue types. Our study finds that both between- and within-cancer type signals provide substantial contributions to pan-cancer drug response prediction models, and about half of the cancer types examined are poorly predicted despite strong overall performance across all cancer types. We also find that pan-cancer prediction models perform similarly to or better than cancer type-specific models, and in many cases the advantage of pan-cancer models is due to the larger number of samples available for pan-cancer analysis. Our results highlight tissue-of-origin as a key consideration for pan-cancer drug response prediction models, and recommend cancer type-specific considerations when translating pan-cancer prediction models for clinical use.
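The distinction between between-tissue and within-tissue performance can be made concrete with a small sketch. It is an illustration under stated assumptions rather than the authors' pipeline: between-tissue ρ is taken here as Spearman's ρ across per-tissue means, within-tissue ρ as Spearman's ρ inside each tissue, and all function names and the toy data are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def between_and_within(tissues, observed, predicted):
    """Return (between-tissue rho, {tissue: within-tissue rho})."""
    groups = defaultdict(list)
    for tissue, obs, pred in zip(tissues, observed, predicted):
        groups[tissue].append((obs, pred))
    obs_means = [mean(o for o, _ in g) for g in groups.values()]
    pred_means = [mean(p for _, p in g) for g in groups.values()]
    between = spearman(obs_means, pred_means)
    within = {
        t: spearman([o for o, _ in g], [p for _, p in g])
        for t, g in groups.items()
    }
    return between, within


if __name__ == "__main__":
    # Toy data: predictions track tissue means but are reversed inside each tissue.
    tissues = ["A"] * 3 + ["B"] * 3 + ["C"] * 3
    observed = [1, 2, 3, 10, 11, 12, 20, 21, 22]
    predicted = [3, 2, 1, 12, 11, 10, 22, 21, 20]
    print(round(spearman(observed, predicted), 3))
    print(between_and_within(tissues, observed, predicted))
```

In this toy case the pooled ρ is high even though every within-tissue ρ is negative, which is the kind of masking by between-tissue signal that the abstract warns about.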
Affiliation(s)
- John P Lloyd
- Department of Human Genetics, University of Michigan, Ann Arbor, Michigan, United States of America; Department of Internal Medicine, University of Michigan, Ann Arbor, Michigan, United States of America; Rogel Cancer Center, University of Michigan, Ann Arbor, Michigan, United States of America
- Matthew B Soellner
- Department of Internal Medicine, University of Michigan, Ann Arbor, Michigan, United States of America; Rogel Cancer Center, University of Michigan, Ann Arbor, Michigan, United States of America
- Sofia D Merajver
- Department of Internal Medicine, University of Michigan, Ann Arbor, Michigan, United States of America; Rogel Cancer Center, University of Michigan, Ann Arbor, Michigan, United States of America
- Jun Z Li
- Department of Human Genetics, University of Michigan, Ann Arbor, Michigan, United States of America; Rogel Cancer Center, University of Michigan, Ann Arbor, Michigan, United States of America

18
Katsanis SH. Pedigrees and Perpetrators: Uses of DNA and Genealogy in Forensic Investigations. Annu Rev Genomics Hum Genet 2020; 21:535-564. [DOI: 10.1146/annurev-genom-111819-084213] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Indexed: 11/09/2022]
Abstract
In the past few years, cases with DNA evidence that could not be solved with direct matches in DNA databases have benefited from comparing single-nucleotide polymorphism data with private and public genomic databases. Using a combination of genome comparisons and traditional genealogical research, investigators can triangulate distant relatives to the contributor of DNA data from a crime scene, ultimately identifying perpetrators of violent crimes. This approach has also been successful in identifying unknown deceased persons and perpetrators of lesser crimes. Such advances are bringing into focus ethical questions on how much access to DNA databases should be granted to law enforcement and how best to empower public genome contributors with control over their data. The necessary policies will take time to develop but can be informed by reflection on the familial searching policies developed for searches of the federal DNA database and considerations of the anonymity and privacy interests of civilians.
Affiliation(s)
- Sara H. Katsanis
- Mary Ann & J. Milburn Smith Child Health Research, Outreach, and Advocacy Center, Ann & Robert H. Lurie Children's Hospital of Chicago, Chicago, Illinois 60611, USA
- Department of Pediatrics, Northwestern University, Chicago, Illinois 60611, USA

19
Wyner N, Barash M, McNevin D. Forensic Autosomal Short Tandem Repeats and Their Potential Association With Phenotype. Front Genet 2020; 11:884. [PMID: 32849844 PMCID: PMC7425049 DOI: 10.3389/fgene.2020.00884] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 05/19/2020] [Accepted: 07/17/2020] [Indexed: 12/11/2022] Open
Abstract
Forensic DNA profiling utilizes autosomal short tandem repeat (STR) markers to establish the identity of missing persons, confirm familial relations, and link persons of interest to crime scenes. It is a widely accepted notion that genetic markers used in forensic applications are not predictive of phenotype. At present, there has been no demonstration of forensic STR variants directly causing or predicting disease. Such a demonstration would have many legal and ethical implications. For example, is there a duty to inform a DNA donor if a medical condition is discovered during routine analysis of their sample? In this review, we evaluate the possibility that forensic STRs could provide information beyond mere identity. An extensive search of the literature returned 107 articles associating a forensic STR with a trait. A total of 57 of these studies met our inclusion criteria: a reported link between an STR-inclusive gene and a phenotype, and a statistical analysis reporting a p-value less than 0.05. A total of 50 unique traits were associated with the 24 markers included in the 57 studies. TH01 had the greatest number of associations, with 27 traits reportedly linked to 40 different genotypes. Five of the articles associated TH01 with schizophrenia. None of the associations found were independently causative or predictive of disease. Regardless, the likelihood of identifying significant associations is increasing as the function of non-coding STRs in gene expression is steadily revealed. It is recommended that regular reviews take place in order to remain aware of future studies that identify a functional role for any forensic STRs.
Affiliation(s)
- Nicole Wyner
- Centre for Forensic Science, School of Mathematical and Physical Sciences, Faculty of Science, University of Technology Sydney, Sydney, NSW, Australia
- Mark Barash
- Centre for Forensic Science, School of Mathematical and Physical Sciences, Faculty of Science, University of Technology Sydney, Sydney, NSW, Australia; Department of Justice Studies, San José State University, San Jose, CA, United States
- Dennis McNevin
- Centre for Forensic Science, School of Mathematical and Physical Sciences, Faculty of Science, University of Technology Sydney, Sydney, NSW, Australia

20
Fortier AL, Kim J, Rosenberg NA. Human-Genetic Ancestry Inference and False Positives in Forensic Familial Searching. G3 (Bethesda) 2020; 10:2893-2902. [PMID: 32586848 PMCID: PMC7407470 DOI: 10.1534/g3.120.401473] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 03/02/2020] [Accepted: 06/20/2020] [Indexed: 11/18/2022]
Abstract
In forensic familial search methods, a query DNA profile is tested against a database to determine if the query profile represents a close relative of a database entrant. One challenge for familial search is that the calculations may require specification of allele frequencies for the unknown population from which the query profile has originated. The choice of allele frequencies affects the rate at which non-relatives are erroneously classified as relatives, and allele-frequency misspecification can substantially inflate false positive rates compared to use of allele frequencies drawn from the same population as the query profile. Here, we use ancestry inference on the query profile to circumvent the high false positive rates that result from highly misspecified allele frequencies. In particular, we perform ancestry inference on the query profile and make use of allele frequencies based on its inferred genetic ancestry. In a test for sibling matches on profiles that represent unrelated individuals, we demonstrate that false positive rates for familial search with use of ancestry inference to specify the allele frequencies are similar to those seen when allele frequencies align with the population of origin of a profile. Because ancestry inference is possible to perform on query profiles, the extreme allele-frequency misspecifications that produce the highest false positive rates can be avoided. We discuss the implications of the results in the context of concerns about the forensic use of familial searching.
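How the choice of allele frequencies drives false positives can be sketched with the standard single-locus sibling likelihood ratio, which weights the probabilities of sharing 0, 1, or 2 alleles identical by descent as 1/4, 1/2, and 1/4. This is a textbook-style illustration under Hardy-Weinberg assumptions, not the authors' implementation; the function names and the toy frequencies are hypothetical.

```python
def genotype_prob(genotype, freq):
    """Hardy-Weinberg probability of an unordered genotype (a, b)."""
    a, b = genotype
    return freq[a] ** 2 if a == b else 2 * freq[a] * freq[b]


def prob_given_one_ibd(g1, g2, freq):
    """P(g2 | g1, exactly one allele shared identical by descent).

    Each allele of g1 is the shared one with probability 1/2; the other
    allele of g2 is then drawn from the population frequencies.
    """
    x, y = g2
    total = 0.0
    for shared in g1:
        if x == y:
            total += 0.5 * (freq[x] if shared == x else 0.0)
        else:
            total += 0.5 * ((freq[y] if shared == x else 0.0)
                            + (freq[x] if shared == y else 0.0))
    return total


def sibling_lr(g1, g2, freq):
    """Single-locus likelihood ratio: full siblings versus unrelated."""
    p2 = genotype_prob(g2, freq)
    identical = 1.0 if sorted(g1) == sorted(g2) else 0.0
    return 0.25 + 0.5 * prob_given_one_ibd(g1, g2, freq) / p2 + 0.25 * identical / p2


if __name__ == "__main__":
    query, entrant = ("A", "A"), ("A", "A")
    matched = {"A": 0.5, "B": 0.5}       # frequencies from the query's own population
    misspecified = {"A": 0.1, "B": 0.9}  # frequencies from a population where A is rare
    print(sibling_lr(query, entrant, matched))
    print(sibling_lr(query, entrant, misspecified))
```

With the same pair of genotypes, treating a locally common allele as rare inflates the likelihood ratio by more than an order of magnitude, so unrelated people from the query's population are more easily misclassified as siblings. A multi-locus statistic would multiply such per-locus ratios, assuming independent loci.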
Affiliation(s)
- Jaehee Kim
- Department of Biology, Stanford University, CA 94305

21
Kureshi A, Li J, Wen D, Sun S, Yang Z, Zha L. Construction and forensic application of 20 highly polymorphic microhaplotypes. R Soc Open Sci 2020; 7:191937. [PMID: 32537197 PMCID: PMC7277291 DOI: 10.1098/rsos.191937] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 11/05/2019] [Accepted: 04/07/2020] [Indexed: 06/11/2023]
Abstract
Microhaplotype markers have become an important research focus in forensic genetics. However, many reported microhaplotype markers have limited polymorphism. In this study, we developed a set of highly polymorphic microhaplotype markers based on tri-allelic single-nucleotide polymorphisms. Eleven newly discovered microhaplotypes, along with nine previously identified in our laboratory, were studied. The microhaplotype genotypes of unrelated individuals and familial samples were generated on the MiSeq PE300 platform. These 20 loci have an average effective number of alleles greater than 3.5. Over the whole set, the cumulative power of discrimination was 1 - 3.3 × 10⁻¹⁸, the cumulative power of exclusion was 1 - 1.928 × 10⁻⁷, and the theoretical probability of detecting a mixture was 1 - 1.427 × 10⁻⁶. Differentiation comparisons of 26 populations from the 1000 Genomes Project distinguished among East Asian, South Asian, African and European populations. Overall, these markers enrich the current microhaplotype marker databases and can be applied for individual identification, paternity testing and biogeographic ancestry distinction.
Affiliation(s)
- Aliye Kureshi
- School of Basic Medical Sciences, Xinjiang Medical University, Urumqi 830011, Xinjiang, People's Republic of China
- Jienan Li
- Department of Forensic Medicine, School of Basic Medical Sciences, Central South University, No. 172, Tongzipo Road, Changsha 410013, Hunan, People's Republic of China
- Dan Wen
- Department of Forensic Medicine, School of Basic Medical Sciences, Central South University, No. 172, Tongzipo Road, Changsha 410013, Hunan, People's Republic of China
- Shule Sun
- Department of Forensic Medicine, School of Basic Medical Sciences, Central South University, No. 172, Tongzipo Road, Changsha 410013, Hunan, People's Republic of China
- Zedeng Yang
- Department of Forensic Medicine, School of Basic Medical Sciences, Central South University, No. 172, Tongzipo Road, Changsha 410013, Hunan, People's Republic of China
- Lagabaiyila Zha
- School of Basic Medical Sciences, Xinjiang Medical University, Urumqi 830011, Xinjiang, People's Republic of China
- Shanghai Key Laboratory of Forensic Medicine, Academy of Forensic Science, Shanghai 200063, People's Republic of China
- Department of Forensic Medicine, School of Basic Medical Sciences, Central South University, No. 172, Tongzipo Road, Changsha 410013, Hunan, People's Republic of China

22
West FL, Algee-Hewitt BF. Cadaveric blood cards: Assessing DNA quality and quantity and the utility of STRs for the individual estimation of trihybrid ancestry and admixture proportions. Forensic Sci Int Synerg 2020; 2:114-122. [PMID: 32412010 PMCID: PMC7219121 DOI: 10.1016/j.fsisyn.2020.03.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/23/2020] [Revised: 03/04/2020] [Accepted: 03/06/2020] [Indexed: 02/04/2023]
Abstract
As part of a body donation program, blood samples were collected and stored on untreated (non-FTA) blood cards. The blood cards were evaluated in terms of DNA preservation and STR typing success, and the resulting profiles were assessed with special consideration given to profile matching for positive identification and biogeographic ancestry estimation. While STR profiles were successfully generated for all samples, results indicate that the time interval between date of death and sample collection has an impact on DNA quantity and quality. There is a statistically significant decrease in relative fluorescent unit (RFU) values with increasing time interval between date of death and sample collection, indicating degradation in the blood card samples related to the post-mortem interval prior to sample collection. The STR profiles were used to estimate ancestry and admixture using the program STRUCTURE, demonstrating the utility of these markers beyond individual identification purposes, with caveats for application based on population history.
Affiliation(s)
- Frankie L. West
- Forensic Science Program, Western Carolina University, USA
- Corresponding author.

23
Kinney N, Kang L, Eckstrand L, Pulenthiran A, Samuel P, Anandakrishnan R, Varghese RT, Michalak P, Garner HR. Abundance of ethnically biased microsatellites in human gene regions. PLoS One 2019; 14:e0225216. [PMID: 31830051 PMCID: PMC6907796 DOI: 10.1371/journal.pone.0225216] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 07/30/2019] [Accepted: 10/29/2019] [Indexed: 12/16/2022] Open
Abstract
Microsatellites, a type of short tandem repeat (STR), have been used for decades as putatively neutral markers to study the genetic structure of diverse human populations. However, recent studies have demonstrated that some microsatellites contribute to gene expression, cis heritability, and phenotype. As a corollary, some microsatellites may contribute to differential gene expression and RNA/protein structure stability in distinct human populations. To test this hypothesis, we investigate genotype frequencies, functional relevance, and adaptive potential of microsatellites in five super-populations (ethnicities) drawn from the 1000 Genomes Project. We discover 3,984 ethnically-biased microsatellite loci (EBML); for each EBML, at least one ethnicity has genotype frequencies statistically different from the remaining four. South Asian, East Asian, European, and American EBML show significant overlap; in contrast, the set of African EBML is mostly unique. We cross-reference the 3,984 EBML with 2,060 previously identified expression STRs (eSTRs); repeats known to affect gene expression (64 total) are over-represented. The most significant pathway enrichments are those associated with the matrisome: a broad collection of genes encoding the extracellular matrix and its associated proteins. At least 14 of the EBML have established links to human disease. Analysis of the 3,984 EBML with respect to known selective sweep regions in the genome shows that allelic variation in some of them is likely associated with adaptive evolution.
Affiliation(s)
- Nick Kinney
- Edward Via College of Osteopathic Medicine, Blacksburg, VA, United States of America
- Gibbs Cancer Center & Research Institute, Spartanburg, SC, United States of America
- Lin Kang
- Edward Via College of Osteopathic Medicine, Blacksburg, VA, United States of America
- Gibbs Cancer Center & Research Institute, Spartanburg, SC, United States of America
- Laurel Eckstrand
- Virginia-Maryland College of Veterinary Medicine, Blacksburg, VA, United States of America
- Arichanah Pulenthiran
- Edward Via College of Osteopathic Medicine, Blacksburg, VA, United States of America
- Peter Samuel
- Edward Via College of Osteopathic Medicine, Blacksburg, VA, United States of America
- Ramu Anandakrishnan
- Edward Via College of Osteopathic Medicine, Blacksburg, VA, United States of America
- Robin T. Varghese
- Edward Via College of Osteopathic Medicine, Blacksburg, VA, United States of America
- P. Michalak
- Edward Via College of Osteopathic Medicine, Blacksburg, VA, United States of America
- Virginia-Maryland College of Veterinary Medicine, Blacksburg, VA, United States of America
- Institute of Evolution, University of Haifa, Haifa, Israel
- Harold R. Garner
- Edward Via College of Osteopathic Medicine, Blacksburg, VA, United States of America
- Gibbs Cancer Center & Research Institute, Spartanburg, SC, United States of America

24
Forgacs D, Wallen RL, Boedeker AL, Derr JN. Evaluation of fecal samples as a valid source of DNA by comparing paired blood and fecal samples from American bison (Bison bison). BMC Genet 2019; 20:22. [PMID: 30808294 PMCID: PMC6390568 DOI: 10.1186/s12863-019-0722-3] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 03/20/2018] [Accepted: 02/08/2019] [Indexed: 11/25/2022] Open
Abstract
Background The collection and analysis of fecal DNA is a common practice, especially when dealing with wildlife species that are difficult to track or capture. While fecal DNA is known to be lower quality than traditional sources of DNA, such as blood or other tissues, few investigations have verified fecal samples as a valid source of DNA by directly comparing the results to high quality DNA samples from the same individuals. Our goal was to compare DNA from fecal and blood samples from the same 50 American plains bison (Bison bison) from Yellowstone National Park, analyze 35 short tandem repeat (STR) loci for genotyping efficiency, and compare heterozygosity estimates. Results We discovered that some of the fecal-derived genotypes obtained were significantly different from the blood-derived genotypes from the same bison. We also found that fecal-derived DNA samples often underestimated heterozygosity values, in some cases by over 20%. Conclusions These findings highlight a potential shortcoming inherent in previous wildlife studies that relied solely on a multi-tube approach, using exclusively low quality fecal DNA samples with no quality control to account for false alleles and allelic dropout. Herein, we present a rigorous marker selection protocol that is applicable for a wide range of species and report a set of 15 STR markers for use in future bison studies that yielded consistent results from both fecal and blood-derived DNA. Electronic supplementary material The online version of this article (10.1186/s12863-019-0722-3) contains supplementary material, which is available to authorized users.
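The paired comparison described in this abstract can be sketched as follows. It is a minimal illustration, assuming each genotype is a pair of allele sizes and that the blood-derived genotype is the reference; the function names, error labels, and toy genotypes are hypothetical and are not drawn from the paper's protocol.

```python
def observed_heterozygosity(genotypes):
    """Fraction of genotypes whose two alleles differ."""
    return sum(a != b for a, b in genotypes) / len(genotypes)


def classify_pair(blood, fecal):
    """Compare a fecal-derived genotype with its blood-derived reference.

    Returns "match", "allelic dropout" (a reference heterozygote typed as a
    homozygote for one of its own alleles), or "false allele" (the fecal
    genotype carries an allele absent from the reference).
    """
    if sorted(blood) == sorted(fecal):
        return "match"
    if blood[0] != blood[1] and fecal[0] == fecal[1] and fecal[0] in blood:
        return "allelic dropout"
    return "false allele"


if __name__ == "__main__":
    # Toy genotypes at four loci for one animal (allele sizes in base pairs).
    blood = [(100, 104), (100, 100), (102, 106), (98, 102)]
    fecal = [(100, 104), (100, 100), (102, 102), (98, 110)]
    print(observed_heterozygosity(blood), observed_heterozygosity(fecal))
    print([classify_pair(b, f) for b, f in zip(blood, fecal)])
```

Allelic dropout converts reference heterozygotes into apparent homozygotes, which is one way fecal-derived data can underestimate heterozygosity.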
Affiliation(s)
- David Forgacs
- Interdisciplinary Graduate Program of Genetics, Texas A&M University, College Station, TX, 77843, USA; Department of Veterinary Pathobiology, Texas A&M University, College Station, TX, 77843, USA
- Rick L Wallen
- National Park Service, Yellowstone National Park, Hot Springs, Mammoth, WY, 82190, USA
- Amy L Boedeker
- Department of Veterinary Pathobiology, Texas A&M University, College Station, TX, 77843, USA
- James N Derr
- Interdisciplinary Graduate Program of Genetics, Texas A&M University, College Station, TX, 77843, USA; Department of Veterinary Pathobiology, Texas A&M University, College Station, TX, 77843, USA

25
Moriot A, Santos C, Freire-Aradas A, Phillips C, Hall D. Inferring biogeographic ancestry with compound markers of slow and fast evolving polymorphisms. Eur J Hum Genet 2018; 26:1697-1707. [PMID: 29995845 PMCID: PMC6189140 DOI: 10.1038/s41431-018-0215-2] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 12/01/2017] [Revised: 04/23/2018] [Accepted: 06/12/2018] [Indexed: 11/09/2022] Open
Abstract
Biogeographic ancestry is an area of considerable interest in medical genetics, anthropology and forensics. Although genome-wide panels are ideal because they provide dense genotyping data, small sets of ancestry-informative markers provide a cost-effective way to investigate genetic ancestry and population structure. Here, we investigate the performance of a reduced marker set that combines different types of autosomal markers through haplotype analysis. In particular, recently described DIP-STR markers should offer the advantage of comprising both low-mutation-rate indels (DIPs), to study human history over longer time scales, and high-mutation-rate STRs, to trace relatively recent demographic events. In this study, we assessed the ability of an initial set of 23 DIP-STRs to distinguish major population groups using the HGDP-CEPH reference samples. The results obtained by applying the STRUCTURE algorithm show that the discrimination capacity of the DIP-STRs is comparable to that of currently used small-scale ancestry-informative marker sets, approaching seven major demographic groups. Moreover, the DIP-STRs show an improved success rate in assigning individuals to populations of Europe and the Middle East. These data show a remarkable ability of a preliminary set of 23 DIP-STR markers to infer major biogeographic origins. A novel set of DIP-STRs preselected to contain ancestry information should lead to further improvements.
Affiliation(s)
- Amandine Moriot
- Unité de Génétique Forensique, Centre Universitaire Romand de Médecine Légale, Centre Hospitalier Universitaire Vaudois et Université de Lausanne, Lausanne, Switzerland
- Carla Santos
- Forensic Genetics Unit, Institute of Forensic Science, University of Santiago de Compostela, Santiago de Compostela, Spain
- Ana Freire-Aradas
- Forensic Genetics Unit, Institute of Forensic Science, University of Santiago de Compostela, Santiago de Compostela, Spain
- Christopher Phillips
- Forensic Genetics Unit, Institute of Forensic Science, University of Santiago de Compostela, Santiago de Compostela, Spain
- Diana Hall
- Unité de Génétique Forensique, Centre Universitaire Romand de Médecine Légale, Centre Hospitalier Universitaire Vaudois et Université de Lausanne, Lausanne, Switzerland.
|
26
|
Oldoni F, Kidd KK, Podini D. Microhaplotypes in forensic genetics. Forensic Sci Int Genet 2018; 38:54-69. [PMID: 30347322 DOI: 10.1016/j.fsigen.2018.09.009] [Citation(s) in RCA: 93] [Impact Index Per Article: 15.5] [Received: 07/13/2018] [Revised: 09/21/2018] [Accepted: 09/25/2018] [Indexed: 01/28/2023]
Abstract
Microhaplotype loci (microhaps, MHs) are a novel type of molecular marker of less than 300 nucleotides, defined by two or more closely linked SNPs associated in multiple allelic combinations. The value of these markers is enhanced by massively parallel sequencing (MPS), which allows the sequencing of both parental haplotypes at each of the many multiplexed loci. This review describes the features of these multi-SNP markers and documents their value in forensic genetics, focusing on individualization, biogeographic ancestry inference, and mixture deconvolution. Foreseeable applications also include missing person identification, relationship testing, and medical diagnostic applications. The technique is not restricted to humans.
Affiliation(s)
- Fabio Oldoni
- Department of Forensic Sciences, The George Washington University, 2100 Foxhall Road NW, Washington, DC, 20007, United States
- Kenneth K Kidd
- Yale University School of Medicine, Department of Genetics, 333 Cedar Street, New Haven, CT, 06520, United States
- Daniele Podini
- Department of Forensic Sciences, The George Washington University, 2100 Foxhall Road NW, Washington, DC, 20007, United States.
|
27
|
Benvisto A, Messina F, Finocchio A, Popa L, Stefan M, Stefanescu G, Mironeanu C, Novelletto A, Rapone C, Berti A. A genetic portrait of the South-Eastern Carpathians based on autosomal short tandem repeats loci used in forensics. Am J Hum Biol 2018; 30:e23139. [PMID: 30099799 DOI: 10.1002/ajhb.23139] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 04/02/2018] [Revised: 04/02/2018] [Accepted: 05/17/2018] [Indexed: 11/09/2022] Open
Abstract
OBJECTIVES: This work aimed to describe the genetic landscape of the Balkan Peninsula, as revealed by STR markers commonly used in forensics and spatial methods specifically developed for genetic data. METHODS: We generated and analyzed autosomal genotypes at 16 short tandem repeat (STR) loci in 287 subjects from ten administrative/geographical regions of Eastern Europe (Romania and the Republic of Moldova). We report estimates of the allele frequencies in these sub-populations and their fixation indices, and use these results to complement previous spatial analyses of Southern Europe. RESULTS: In seven out of ten analyzed regional samples the heterozygosity, averaged across loci, was lower than expected. The average Fis was 0.011. Among the 16 loci, five returned a significant fixation index Fst. The composite Fst across the 16 loci, among the 10 regional samples, was 0.00417, a figure twice as large as that obtained with the same markers across the entire Northern Mediterranean. The first spatial principal component (sPC1) returned the picture of a Central-European pattern of frequencies for the Carpathians, which extended to the Southern boundary of the Balkan Peninsula. However, the 8 alleles extracted by sPC1 returned a picture of a strong reduction of the migration rate in the Carpathian region, mostly between the inner locations. CONCLUSIONS: Our results revealed an unexpected heterogeneity in the area. We believe that populations from some regions will require treatment as distinct entities when considered in forensic applications.
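To illustrate the statistics named in this abstract, the sketch below computes observed heterozygosity, expected heterozygosity, and the inbreeding coefficient Fis = 1 − Ho/He for a single locus. The function name and the toy genotypes are hypothetical, and the plain estimator without small-sample correction is an assumption made here; the study may have used a different estimator.

```python
from collections import Counter

def fis_single_locus(genotypes):
    """Return (h_obs, h_exp, f_is) for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per individual.
    h_exp uses 1 - sum(p_i^2) with no small-sample correction
    (an assumption made for this illustration).
    """
    n = len(genotypes)
    # Observed heterozygosity: share of individuals carrying two different alleles.
    h_obs = sum(1 for a, b in genotypes if a != b) / n
    # Allele frequencies pooled over both chromosomes of every individual.
    allele_counts = Counter(allele for g in genotypes for allele in g)
    total = 2 * n
    h_exp = 1.0 - sum((c / total) ** 2 for c in allele_counts.values())
    # Positive values indicate fewer heterozygotes than expected.
    f_is = 1.0 - h_obs / h_exp
    return h_obs, h_exp, f_is

# Hypothetical genotypes (repeat counts) at one STR locus.
toy_genotypes = [(9, 9), (9, 9), (10, 10), (9, 10)]
h_obs, h_exp, f_is = fis_single_locus(toy_genotypes)
```

A positive Fis, such as the small average reported above, corresponds to observed heterozygosity falling below the expected value.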
Affiliation(s)
- Alessandro Benvisto
- Reparto Carabinieri Investigazioni Scientifiche - Sezione di Biologia, Rome, 00191, Italy
- Francesco Messina
- Department of Biology, University of Rome Tor Vergata, Rome, 00133, Italy
- Andrea Finocchio
- Department of Biology, University of Rome Tor Vergata, Rome, 00133, Italy
- Luis Popa
- "Grigore Antipa" National Museum of Natural History, Bucharest, 011341, Romania
- Mihaela Stefan
- Department of Genetics, University of Bucharest, Bucharest, 76258, Romania
- Andrea Novelletto
- Department of Biology, University of Rome Tor Vergata, Rome, 00133, Italy
- Cesare Rapone
- Reparto Carabinieri Investigazioni Scientifiche - Sezione di Biologia, Rome, 00191, Italy
- Andrea Berti
- Reparto Carabinieri Investigazioni Scientifiche - Sezione di Biologia, Rome, 00191, Italy
|
28
|
Messina F, Finocchio A, Akar N, Loutradis A, Michalodimitrakis EI, Brdicka R, Jodice C, Novelletto A. Enlarging the gene-geography of Europe and the Mediterranean area to STR loci of common forensic use: longitudinal and latitudinal frequency gradients. Ann Hum Biol 2018; 45:77-85. [PMID: 29382282 DOI: 10.1080/03014460.2017.1409365] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 02/07/2023]
Abstract
BACKGROUND: Tetranucleotide Short Tandem Repeats (STRs) for human identification and common use in forensic cases have recently been used to address the population genetics of the North-Eastern Mediterranean area. However, to gain confidence in the inferences made using STRs, this kind of analysis should be challenged with changes in three main aspects of the data, i.e. the sizes of the samples, their distance across space and the genetic background from which they are drawn. AIM: To test the resilience of the gradients previously detected in the North-Eastern Mediterranean to the enlargement of the surveyed area and population set, using revised data. SUBJECTS AND METHODS: STR genotype profiles were obtained from a publicly available database (PopAffiliator databank) and a dataset was assembled including >7000 subjects from the Arabian Peninsula to Scandinavia, genotyped at eight loci. Spatial principal component analysis (sPCA) was applied and the frequency maps of the nine alleles which contributed most strongly to sPC1 were examined in detail. RESULTS: By far the greatest part of diversity was summarised by a single spatial principal component (sPC1), oriented along a SouthEast-to-NorthWest axis. The alleles with the top 5% squared loadings were TH01(9.3), D19S433(14), TH01(6), D19S433(15.2), FGA(20), FGA(24), D3S1358(14), FGA(21) and D2S1338(19). These results confirm a clinal pattern over the whole range for at least four loci (TH01, D19S433, FGA, D3S1358). CONCLUSIONS: Four of the eight STR loci (or even alleles) considered here can reproducibly capture continental arrangements of diversity. This would, in principle, allow for the exploitation of forensic data to clarify important aspects in the formation of local gene pools.
Affiliation(s)
- Francesco Messina
- Department of Biology, University of Rome Tor Vergata, Rome, Italy
- Andrea Finocchio
- Department of Biology, University of Rome Tor Vergata, Rome, Italy
- Nejat Akar
- Pediatrics Department, TOBB-Economy and Technology University Hospital, Ankara, Turkey
- Radim Brdicka
- Institute of Hematology and Blood Transfusion, Praha, Czech Republic
- Carla Jodice
- Department of Biology, University of Rome Tor Vergata, Rome, Italy
- Andrea Novelletto
- Department of Biology, University of Rome Tor Vergata, Rome, Italy
|
29
|
Signs of continental ancestry in urban populations of Peru through autosomal STR loci and mitochondrial DNA typing. PLoS One 2018; 13:e0200796. [PMID: 30020992 PMCID: PMC6051651 DOI: 10.1371/journal.pone.0200796] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 03/20/2018] [Accepted: 07/03/2018] [Indexed: 11/30/2022] Open
Abstract
Human genetic diversity around the world has been studied through several highly variable genetic markers. In South America, the demic consequences of admixture events between Native people, European colonists and African slaves have been displayed by the variability of uniparental markers. Mitochondrial DNA (mtDNA) has been the most widely used genetic marker for studying American mixed populations, although nuclear markers, such as the microsatellite loci (STRs) commonly used in forensic science, have been shown to be genetically and geographically structured. In this work, we analyzed DNA from buccal swab samples of 296 individuals across Peru: 156 Native Amazons (Ashaninka, Cashibo and Shipibo from Ucayali, Huambiza from Loreto and Moche from Lambayeque) and 140 urban Peruvians from Lima and 33 other urban areas. The aim was to evaluate, through STR and mtDNA variability, recent migrations in urban Peruvian populations and to gain more information about their continental ancestry. STR data highlighted that most individuals (67%) of the urban Peruvian sample have a strong similarity to the Amazon Native population, whereas 22% have similarity to African populations and only ~1% to European populations. The maternally transmitted mtDNA also confirmed the strong Native contribution (~90% of Native American haplogroups) and the lower frequencies of African (~6%) and European (~3%) haplogroups. This study provides a detailed description of the urban Peruvian genetic structure and proposes forensic STRs as a useful tool for studying recent migrations, especially when coupled with mtDNA.
|
30
|
Kidd KK, Pakstis AJ, Speed WC, Lagace R, Wootton S, Chang J. Selecting microhaplotypes optimized for different purposes. Electrophoresis 2018; 39:2815-2823. [PMID: 29931757 DOI: 10.1002/elps.201800092] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Received: 02/15/2018] [Revised: 06/13/2018] [Accepted: 06/14/2018] [Indexed: 12/22/2022]
Abstract
Massively parallel sequencing is transforming forensic work by allowing various useful forensic markers, such as STRPs and SNPs, to be multiplexed, providing information on ancestry, individual and familial identification, phenotypes for eye/hair/skin pigmentation, and the deconvolution of mixtures. Microhaplotypes also become feasible with massively parallel sequencing; these are DNA segments (smaller than 300 nucleotides) that are selected to contain multiple SNPs unambiguously defining three or more haplotype alleles occurring at common frequencies. The physical extent of a microhaplotype can thus be covered by a single sequence read, making these loci phase-known codominant genetic systems. Such microhaplotypes supply significantly more information than a single SNP can. Our efforts to develop useful sets of microhaplotypes have already identified 182 such loci that we have studied in a large number of human populations from around the world. For a subset of the best microhaplotypes currently available, we present various analyses on 83 populations in our ongoing study, illustrating their characteristics and potential utility for ancestry, identification, and mixture deconvolution.
Affiliation(s)
- Kenneth K Kidd
- Department of Genetics, Yale University School of Medicine, New Haven, CT, USA
- Andrew J Pakstis
- Department of Genetics, Yale University School of Medicine, New Haven, CT, USA
- William C Speed
- Department of Genetics, Yale University School of Medicine, New Haven, CT, USA
- Robert Lagace
- Human Identification Group, ThermoFisher Scientific, South San Francisco, CA, USA
- Sharon Wootton
- Human Identification Group, ThermoFisher Scientific, South San Francisco, CA, USA
- Joseph Chang
- Human Identification Group, ThermoFisher Scientific, South San Francisco, CA, USA
|
31
|
Kidd KK, Soundararajan U, Rajeevan H, Pakstis AJ, Moore KN, Ropero-Miller JD. The redesigned Forensic Research/Reference on Genetics-knowledge base, FROG-kb. Forensic Sci Int Genet 2018; 33:33-37. [DOI: 10.1016/j.fsigen.2017.11.009] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 06/16/2017] [Revised: 10/13/2017] [Accepted: 11/13/2017] [Indexed: 01/22/2023]
|
32
|
Tau T, Wally A, Fanie TP, Ngono GL, Mpoloka SW, Davison S, D'Amato ME. Genetic variation and population structure of Botswana populations as identified with AmpFLSTR Identifiler short tandem repeat (STR) loci. Sci Rep 2017; 7:6768. [PMID: 28754995 PMCID: PMC5533702 DOI: 10.1038/s41598-017-06365-y] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 05/11/2016] [Accepted: 06/14/2017] [Indexed: 11/09/2022] Open
Abstract
Population structure was investigated in 990 Botswana individuals according to ethno-linguistics (Bantu and Khoisan) and geography (the nine administrative districts) using the Identifiler autosomal microsatellite markers. Genetic diversity and forensic parameters were calculated for the overall population, and according to ethno-linguistics and geography. The overall combined power of exclusion (CPE) was 0.9999965412 and the combined match probability (CMP) was 6.28 × 10⁻¹⁹. CPE was highest for the Khoisan Tuu ethnolinguistic group and the Northeast District at 0.9999582029 and 0.9999922652 respectively. CMP ranged from 6.28 × 10⁻¹⁹ (Khoisan Tuu) to 1.02 × 10⁻¹⁸ (Northwest district). Using pairwise genetic distances (FST), analysis of molecular variance (AMOVA), factorial correspondence analysis (FCA), and the unsupervised Bayesian clustering methods found in STRUCTURE and TESS, ethno-linguistics were found to have a greater influence on population structure than geography. FCA showed clustering between Bantu and Khoisan, and within the Bantu. This Bantu sub-structuring was not seen with STRUCTURE and TESS, which detected clustering only between Bantu and Khoisan. The patterns of population structure revealed highlight the need for regional reference databases that include ethno-linguistic and geographic location information. These markers have important potential for bio-anthropological studies as well as for forensic applications.
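The two forensic parameters quoted in this abstract are conventionally combined across loci as products, assuming the loci are independent. The sketch below shows that arithmetic: CMP is the product of per-locus match probabilities, and CPE is one minus the product of per-locus non-exclusion probabilities. The per-locus values and function names are hypothetical and are not taken from the study.

```python
import math

def combined_match_probability(per_locus_pm):
    """Product of per-locus match probabilities (assumes independent loci)."""
    return math.prod(per_locus_pm)

def combined_power_of_exclusion(per_locus_pe):
    """1 minus the product of per-locus non-exclusion probabilities (1 - PE)."""
    return 1.0 - math.prod(1.0 - pe for pe in per_locus_pe)

# Hypothetical per-locus values for three loci, for illustration only.
pm_values = [0.05, 0.08, 0.04]
pe_values = [0.60, 0.55, 0.70]

cmp_value = combined_match_probability(pm_values)   # 0.05 * 0.08 * 0.04
cpe_value = combined_power_of_exclusion(pe_values)  # 1 - 0.40 * 0.45 * 0.30
```

With 15 or more loci the product of match probabilities shrinks quickly, which is how combined values on the order of 10⁻¹⁹ arise.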
Affiliation(s)
- Tiroyamodimo Tau
- University of the Western Cape, Department of Biotechnology, Forensic DNA Laboratory, Private Bag X17, 7535, Bellville, Cape Town, South Africa
- Anthony Wally
- Botswana Police Service, Forensic Science Laboratory, Private Bag 0400, Gaborone, Botswana
- Goitseone Lorato Ngono
- Botswana Police Service, Forensic Science Laboratory, Private Bag 0400, Gaborone, Botswana
- Sununguko Wata Mpoloka
- University of Botswana, Biological Sciences Department, Private Bag 00704, Gaborone, Botswana
- Sean Davison
- University of the Western Cape, Department of Biotechnology, Forensic DNA Laboratory, Private Bag X17, 7535, Bellville, Cape Town, South Africa
- María Eugenia D'Amato
- University of the Western Cape, Department of Biotechnology, Forensic DNA Laboratory, Private Bag X17, 7535, Bellville, Cape Town, South Africa.
|
33
|
Algee-Hewitt BFB. Geographic substructure in craniometric estimates of admixture for contemporary American populations. Am J Phys Anthropol 2017. [DOI: 10.1002/ajpa.23267] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 12/16/2022]
|
34
|
Linkage disequilibrium matches forensic genetic records to disjoint genomic marker sets. Proc Natl Acad Sci U S A 2017; 114:5671-5676. [PMID: 28507140 PMCID: PMC5465933 DOI: 10.1073/pnas.1619944114] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Indexed: 01/02/2023] Open
Abstract
Combining genotypes across datasets is central in facilitating advances in genetics. Data aggregation efforts often face the challenge of record matching, that is, the identification of dataset entries that represent the same individual. We show that records can be matched across genotype datasets that have no shared markers based on linkage disequilibrium between loci appearing in different datasets. Using two datasets for the same 872 people (one with 642,563 genome-wide SNPs and the other with 13 short tandem repeats (STRs) used in forensic applications), we find that 90-98% of forensic STR records can be connected to corresponding SNP records and vice versa. Accuracy increases to 99-100% when ~30 STRs are used. Our method expands the potential of data aggregation, but it also suggests privacy risks intrinsic in maintenance of databases containing even small numbers of markers, including databases of forensic significance.
|
35
|
Mathematical Constraints on FST: Biallelic Markers in Arbitrarily Many Populations. Genetics 2017; 206:1581-1600. [PMID: 28476869 DOI: 10.1534/genetics.116.199141] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 12/12/2016] [Accepted: 05/03/2017] [Indexed: 02/01/2023] Open
Abstract
FST is one of the most widely used statistics in population genetics. Recent mathematical studies have identified constraints that challenge interpretations of FST as a measure with potential to range from 0 for genetically similar populations to 1 for divergent populations. We generalize results obtained for population pairs to arbitrarily many populations, characterizing the mathematical relationship between FST, the frequency M of the more frequent allele at a polymorphic biallelic marker, and the number of subpopulations K. We show that for fixed K, FST has a peculiar constraint as a function of M, with a maximum of 1 only if [Formula: see text] for integers i with [Formula: see text]. For fixed M, as K grows large, the range of FST becomes the closed or half-open unit interval. For fixed K, however, some [Formula: see text] always exists at which the upper bound on FST lies below [Formula: see text]. We use coalescent simulations to show that under weak migration, FST depends strongly on M when K is small, but not when K is large. Finally, examining data on human genetic variation, we use our results to explain the generally smaller FST values between pairs of continents relative to global FST values. We discuss implications for the interpretation and use of FST.
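The dependence of FST on the allele frequency M and the number of subpopulations K can be explored numerically. The sketch below assumes the common heterozygosity-based definition FST = (HT − HS)/HT for a biallelic marker with K equally weighted subpopulations; this definition, the function names, and the example frequencies are assumptions made for illustration, since the formulas in the abstract were lost in extraction.

```python
def fst_biallelic(freqs):
    """FST for one biallelic marker.

    freqs: frequency of one allele in each of K equally weighted
    subpopulations. Uses FST = (H_T - H_S) / H_T (assumed definition).
    """
    k = len(freqs)
    p_bar = sum(freqs) / k
    # Total heterozygosity from the pooled allele frequency.
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        raise ValueError("marker is monomorphic; FST is undefined")
    # Mean within-subpopulation heterozygosity.
    h_s = sum(2 * p * (1 - p) for p in freqs) / k
    return (h_t - h_s) / h_t

def major_allele_frequency(freqs):
    """Pooled frequency M of the more frequent allele."""
    p_bar = sum(freqs) / len(freqs)
    return max(p_bar, 1 - p_bar)

# K = 3, every subpopulation fixed for one allele or the other.
fixed = [1.0, 1.0, 0.0]
# K = 3, nearly the same pooled frequency but no subpopulation is fixed.
mixed = [0.9, 0.5, 0.6]

fst_fixed = fst_biallelic(fixed)
fst_mixed = fst_biallelic(mixed)
```

In the first example every subpopulation is fixed, so HS = 0 and FST equals 1 with M = 2/3. The second example has nearly the same M but a much lower FST, because variation remains within subpopulations.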
|
36
|
Evaluating 130 microhaplotypes across a global set of 83 populations. Forensic Sci Int Genet 2017; 29:29-37. [PMID: 28359046 DOI: 10.1016/j.fsigen.2017.03.014] [Citation(s) in RCA: 82] [Impact Index Per Article: 11.7] [Received: 11/02/2016] [Revised: 02/17/2017] [Accepted: 03/12/2017] [Indexed: 01/08/2023]
Abstract
Today the primary DNA markers used in forensics are short tandem repeat (STR) polymorphisms (STRPs), initially selected because they are highly polymorphic. However, the increasingly common need to deal with samples containing a mixture of DNA from two or more individuals is sometimes complicated by the inherent stutter involved with PCR amplification, especially in strongly unbalanced mixtures when the minor component coincides with the stutter range of the major component. Also, the STRPs in use provide little evidence of ancestry of a single source sample beyond broad "continental" resolution. Methodologies for analyzing DNA have become much more powerful in recent years. Massively parallel sequencing (MPS) is a new method being considered for routine use in forensics. Primarily to aid in mixture deconvolution and avoid the issue of stutter, we have begun to investigate a new type of forensic marker, microhaplotype loci, that will provide useful information on mixtures of DNA and on ancestry when typed using MPS. We have identified 130 loci and estimated their haplotype (allele) frequencies in 83 different population samples. Many of these loci are shown to be highly informative for individual identification and for mixture identification and deconvolution.
|
37
|
Messina F, Finocchio A, Akar N, Loutradis A, Michalodimitrakis EI, Brdicka R, Jodice C, Novelletto A. Spatially Explicit Models to Investigate Geographic Patterns in the Distribution of Forensic STRs: Application to the North-Eastern Mediterranean. PLoS One 2016; 11:e0167065. [PMID: 27898725 PMCID: PMC5127579 DOI: 10.1371/journal.pone.0167065] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 08/03/2016] [Accepted: 11/08/2016] [Indexed: 11/18/2022] Open
Abstract
Human forensic STRs used for individual identification have been reported to have little power for inter-population analyses. Several methods have been developed which incorporate information on the spatial distribution of individuals to arrive at a description of the arrangement of diversity. We genotyped a large population sample at 16 forensic STRs. The sample was obtained from many locations in Italy, Greece and Turkey, three countries that are crucial to the understanding of discontinuities at the European/Asian junction and the genetic legacy of ancient migrations, but are seldom represented together in previous studies. Using spatial PCA on the full dataset, we detected patterns of population affinities in the area. Additionally, we devised objective criteria to reduce the overall complexity into reduced datasets. Independent spatially explicit methods applied to these reduced datasets converged in showing that it is possible to extract information on long- to medium-range geographical trends and structuring from the overall diversity. All analyses returned the picture of a background clinal variation, with regional discontinuities captured by each of the reduced datasets. Several aspects of our results are confirmed on external STR datasets and replicate those of genome-wide SNP typings. High levels of gene flow were inferred within the main continental areas by coalescent simulations. These results are promising from a microevolutionary perspective, in view of the fast pace at which forensic data are being accumulated for many locales. It is foreseeable that this will allow the exploitation of an invaluable genotypic resource, assembled for other (forensic) purposes, to clarify important aspects in the formation of local gene pools.
Affiliation(s)
- Nejat Akar
- Pediatrics Department, TOBB-Economy and Technology University Hospital, Ankara, Turkey
- Radim Brdicka
- Institute of Haematology and Blood Transfusion, Praha, Czech Republic
- Carla Jodice
- Department of Biology, University "Tor Vergata", Rome, Italy
- Andrea Novelletto
- Department of Biology, University "Tor Vergata", Rome, Italy
|
38
|
Algee-Hewitt BFB, Goldberg A. Better together: Thinking anthropologically about genetics. Am J Phys Anthropol 2016; 160:557-60. [DOI: 10.1002/ajpa.23022] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 05/16/2016] [Accepted: 05/18/2016] [Indexed: 01/22/2023]
Affiliation(s)
- Amy Goldberg
- Department of Biology, Stanford University, Stanford, California 94305
|