1
Kulhankova L, Bindels E, Kayser M, Mulugeta E. Deconvoluting multi-person biological mixtures and accurate characterization and identification of separated contributors using non-targeted single-cell DNA sequencing. Forensic Sci Int Genet 2024; 71:103030. [PMID: 38513339 DOI: 10.1016/j.fsigen.2024.103030] [Received: 08/22/2023] [Revised: 02/16/2024] [Accepted: 03/04/2024] [Indexed: 03/23/2024]
Abstract
The genetic characterization and identification of individuals who contributed to biological mixtures are complex and mostly unresolved tasks. These tasks are relevant in various fields, particularly in forensic investigations, which frequently encounter crime scene stains generated by more than one person. Currently, forensic mixture deconvolution is mostly performed subsequent to forensic DNA profiling at the level of the mixed DNA profiles, which comes with several limitations. Some previous studies attempted to separate single cells prior to forensic DNA profiling. However, these approaches are biased in the selection of cells and, due to their targeted DNA analysis on low template DNA, provide incomplete and unreliable forensic DNA profiles. We recently demonstrated the feasibility of performing mixture deconvolution prior to forensic DNA profiling using non-targeted single-cell transcriptome sequencing (scRNA-seq). In addition to individual-specific mixture deconvolution, this approach also allowed accurate characterisation of biological sex and biogeographic ancestry, and individual identification of the separated mixture contributors. However, RNA has the forensic disadvantage of being prone to degradation, and sequencing RNA, which focuses on coding regions, limits the number of single nucleotide polymorphisms (SNPs) available for genetic mixture deconvolution, characterization, and identification. These limitations can be overcome by performing single-cell sequencing at the level of DNA instead of RNA. Here, for the first time, we applied non-targeted single-cell DNA sequencing (scDNA-seq), in the form of the scATAC-seq (Assay for Transposase-Accessible Chromatin with sequencing) technique, to address the challenges of mixture deconvolution in the forensic context.
We demonstrated that scATAC-seq, together with our recently developed De-goulash data analysis pipeline, is capable of deconvoluting complex blood mixtures of five individuals from both sexes with varying biogeographic ancestries. We further showed that our approach achieved correct genetic characterization of the biological sex and the biogeographic ancestry of each of the separated mixture contributors and established their identity. Furthermore, by analysing in-silico generated scATAC-seq data mixtures, we demonstrated successful individual-specific mixture deconvolution of i) highly complex mixtures of 11 individuals, ii) balanced mixtures containing as few as 20 cells (10 per individual), and iii) imbalanced mixtures with a ratio as low as 1:80. Overall, our proof-of-principle study demonstrates the feasibility of scDNA-seq in general, and scATAC-seq in particular, for mixture deconvolution, genetic characterization and individual identification of the separated mixture contributors. Furthermore, it shows that, compared to scRNA-seq, scDNA-seq detects more SNPs from fewer cells, providing higher sensitivity, which is valuable in forensic genetics.
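As a rough illustration of what an in-silico mixture with a fixed contributor ratio might look like, the sketch below draws cell barcodes from per-donor pools without replacement. It is not the De-goulash pipeline or the authors' procedure; the function name `build_mixture`, the barcode strings, and the pool sizes are all hypothetical and chosen only to mirror the 1:80 case mentioned in the abstract.

```python
import random
from typing import Dict, List, Tuple


def build_mixture(
    pools: Dict[str, List[str]],
    counts: Dict[str, int],
    seed: int = 0,
) -> List[Tuple[str, str]]:
    """Sample `counts[donor]` cell barcodes from each donor pool (hypothetical helper).

    Returns a shuffled list of (barcode, donor) pairs. The donor label is kept
    only as ground truth for checking a later deconvolution step.
    """
    rng = random.Random(seed)  # fixed seed so the mixture is reproducible
    mixture: List[Tuple[str, str]] = []
    for donor, n_cells in counts.items():
        pool = pools[donor]
        if n_cells > len(pool):
            raise ValueError(f"donor {donor!r} has only {len(pool)} cells")
        # Sampling without replacement: each cell appears at most once.
        mixture.extend((barcode, donor) for barcode in rng.sample(pool, n_cells))
    rng.shuffle(mixture)  # remove any ordering by donor
    return mixture


# Hypothetical pools of 100 single-donor cells each.
pools = {
    "donor_A": [f"A_{i}" for i in range(100)],
    "donor_B": [f"B_{i}" for i in range(100)],
}
# Imbalanced 1:80 mixture, the most extreme ratio quoted in the abstract.
imbalanced = build_mixture(pools, {"donor_A": 1, "donor_B": 80}, seed=1)
print(len(imbalanced))
```

Keeping the donor label next to each barcode is a design choice for evaluation only: a deconvolution method would see the barcodes and their reads, and the labels would be used afterwards to score the clustering.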
Affiliation(s)
- Lucie Kulhankova
  - Department of Genetic Identification, Erasmus MC University Medical Center Rotterdam, Rotterdam, the Netherlands
- Eric Bindels
  - Department of Haematology, Erasmus MC University Medical Center Rotterdam, Rotterdam, the Netherlands
- Manfred Kayser
  - Department of Genetic Identification, Erasmus MC University Medical Center Rotterdam, Rotterdam, the Netherlands
- Eskeatnaf Mulugeta
  - Department of Cell Biology, Erasmus MC University Medical Center Rotterdam, Rotterdam, the Netherlands

2
Grgicak CM, Bhembe Q, Slooten K, Sheth NC, Duffy KR, Lun DS. Single-cell investigative genetics: Single-cell data produces genotype distributions concentrated at the true genotype across all mixture complexities. Forensic Sci Int Genet 2024; 69:103000. [PMID: 38199167 DOI: 10.1016/j.fsigen.2023.103000] [Received: 08/21/2023] [Revised: 11/07/2023] [Accepted: 12/12/2023] [Indexed: 01/12/2024]
Abstract
In the absence of a suspect the forensic aim is investigative, and the focus is one of discerning what genotypes best explain the evidence. In traditional systems, the list of candidate genotypes may become vast if the sample contains DNA from many donors or the information from a minor contributor is swamped by that of major contributors, leading to lower evidential value for a true donor's contribution and, as a result, possibly overlooked or inefficient investigative leads. Recent developments in single-cell analysis offer a way forward, by producing data capable of discriminating genotypes. This is accomplished by first clustering single-cell data by similarity without reference to a known genotype. With good clustering it is reasonable to assume that the scEPGs in a cluster are of a single contributor. With that assumption we determine the probability of a cluster's content given each possible genotype at each locus, which is then used to determine the posterior probability mass distribution for all genotypes by application of Bayes' rule. A decision criterion is then applied such that the sum of the ranked probabilities of all genotypes falling in the set is at least 1-α. This is the credible genotype set and is used to inform database search criteria. Within this work we demonstrate the salience of single-cell analysis by performance testing a set of 630 previously constructed admixtures containing up to 5 donors of balanced and unbalanced contributions. We use scEPGs that were generated by isolating single cells, employing a direct-to-PCR extraction treatment, amplifying STRs that are compliant with existing national databases and applying post-PCR treatments that elicit a detection limit of one DNA copy. We determined that, for these test data, 99.3% of the true genotypes are included in the 99.8% credible set, regardless of the number of donors that comprised the mixture. 
We also determined that the most probable genotype was the true genotype for 97% of the loci when the number of cells in a cluster was at least two. Since efficient investigative leads will be borne by posterior mass distributions that are narrow and concentrated at the true genotype, we report that, for this test set, 47,900 (86%) loci returned only one credible genotype and of these 47,551 (99%) were the true genotype. When determining the LR for true contributors, 91% of the clusters rendered LR > 10^18, showing the potential of single-cell data to positively affect investigative reporting.
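The credible genotype set described in this abstract (a posterior over genotypes from Bayes' rule, then the smallest ranked set whose mass reaches at least 1-α) can be sketched for a single locus as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the uniform prior, and the likelihood values are all hypothetical, and a real analysis would derive the likelihoods from a model of the scEPG data.

```python
from typing import Dict, List


def posterior(prior: Dict[str, float], likelihood: Dict[str, float]) -> Dict[str, float]:
    """Bayes' rule at one locus: P(g | cluster) is proportional to P(cluster | g) * P(g)."""
    unnormalised = {g: prior[g] * likelihood[g] for g in prior}
    total = sum(unnormalised.values())
    if total <= 0:
        raise ValueError("no genotype has positive posterior mass")
    return {g: mass / total for g, mass in unnormalised.items()}


def credible_set(post: Dict[str, float], alpha: float) -> List[str]:
    """Smallest set of top-ranked genotypes whose summed posterior is >= 1 - alpha."""
    ranked = sorted(post.items(), key=lambda item: item[1], reverse=True)
    chosen: List[str] = []
    mass = 0.0
    for genotype, probability in ranked:
        chosen.append(genotype)
        mass += probability
        if mass >= 1.0 - alpha:
            break
    return chosen


# Hypothetical single-locus example with three candidate genotypes.
prior = {"12,12": 1 / 3, "12,14": 1 / 3, "14,14": 1 / 3}    # assumed uniform prior
likelihood = {"12,12": 0.001, "12,14": 0.9, "14,14": 0.05}  # made-up P(cluster | genotype)

post = posterior(prior, likelihood)
print(credible_set(post, alpha=0.002))  # 99.8% credible set
print(credible_set(post, alpha=0.1))    # 90% credible set
```

A narrower posterior gives a smaller credible set, which is the property the abstract highlights: when most of the mass sits on one genotype, the set contains a single candidate and the database search is correspondingly tight.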
Affiliation(s)
- Catherine M Grgicak
  - Department of Chemistry, Rutgers University, Camden, NJ 08102, USA; Center for Computational and Integrative Biology, Rutgers University, Camden, NJ 08102, USA; Program in Biomedical Forensic Sciences, Boston University, Boston, MA 02118, USA
- Qhawe Bhembe
  - Center for Computational and Integrative Biology, Rutgers University, Camden, NJ 08102, USA
- Klaas Slooten
  - Netherlands Forensic Institute, P.O. Box 24044, 2490 AA The Hague, the Netherlands; VU University Amsterdam, De Boelelaan 1081, 1081 HV Amsterdam, the Netherlands
- Nidhi C Sheth
  - Center for Computational and Integrative Biology, Rutgers University, Camden, NJ 08102, USA
- Ken R Duffy
  - Department of Mathematics, Northeastern University, Boston, MA 02115, USA; Department of Electrical and Computer Engineering, Northeastern University, Boston, MA 02115, USA; Hamilton Institute, Maynooth University, Ireland
- Desmond S Lun
  - Center for Computational and Integrative Biology, Rutgers University, Camden, NJ 08102, USA; Department of Computer Science, Rutgers University, Camden, NJ 08102, USA

3
Shome M, MacKenzie TMG, Subbareddy SR, Snyder MP. The Importance, Challenges, and Possible Solutions for Sharing Proteomics Data While Safeguarding Individuals' Privacy. Mol Cell Proteomics 2024; 23:100731. [PMID: 38331191 PMCID: PMC10915627 DOI: 10.1016/j.mcpro.2024.100731] [Received: 08/14/2023] [Revised: 01/28/2024] [Accepted: 02/05/2024] [Indexed: 02/10/2024] Open
Abstract
Proteomics data sharing has profound benefits at the individual level as well as at the community level. While data sharing has increased over the years, mostly due to journal and funding agency requirements, the reluctance of researchers with regard to data sharing is evident, as many share only the bare minimum dataset required to publish an article. In many cases, proper metadata is missing, essentially making the dataset useless. This behavior can be explained by a lack of incentives, insufficient awareness, or a lack of clarity surrounding ethical issues. Through adequate training at research institutes, researchers can realize the benefits associated with data sharing and can accelerate the norm of data sharing for the field of proteomics, as has been the standard in genomics for decades. In this article, we have put together various repository options available for proteomics data. We have also added pros and cons of those repositories to help researchers select the repository most suitable for their data submission. It is also important to note that a few types of proteomics data have the potential to re-identify an individual in certain scenarios. In such cases, extra caution should be taken to remove any personal identifiers before sharing on public repositories. Data sets that would be useless without personal identifiers need to be shared in a controlled access repository so that only authorized researchers can access the data and personal identifiers are kept safe.
Affiliation(s)
- Mahasish Shome
  - Department of Genetics, Stanford University, Palo Alto, California, USA
- Tim M G MacKenzie
  - Department of Genetics, Stanford University, Palo Alto, California, USA
- Michael P Snyder
  - Department of Genetics, Stanford University, Palo Alto, California, USA

4
Huffman K, Ballantyne J. Single cell genomics applications in forensic science: Current state and future directions. iScience 2023; 26:107961. [PMID: 37876804 PMCID: PMC10590970 DOI: 10.1016/j.isci.2023.107961] [Indexed: 10/26/2023] Open
Abstract
Standard methods of mixture analysis involve subjecting a dried crime scene sample to a "bulk" DNA extraction method such that the resulting isolate comprises a homogenized DNA mixture from the individual donors. If, however, instead of bulk DNA extraction, a sufficient number of individual cells from the mixed stain are subsampled prior to genetic analysis, then it should be possible to recover highly probative single source, non-mixed scDNA profiles from each of the donors. This approach can detect low DNA level minor donors to a mixture that otherwise would not be identified using standard methods and can also resolve rare mixtures comprising first degree relatives and thereby also prevent the false inclusion of non-donor relatives. This literature landscape review and associated commentary reports on the history and increasing interest in current and potential future applications of scDNA in forensic genomics, and critically evaluates opportunities and impediments to further progress.
Affiliation(s)
- Kaitlin Huffman
  - Graduate Program in Chemistry, Department of Chemistry, University of Central Florida, PO Box 162366, Orlando, FL 32816-2366, USA
- Jack Ballantyne
  - National Center for Forensic Science, PO Box 162367, Orlando, FL 32816-2367, USA
  - Department of Chemistry, University of Central Florida, PO Box 162366, Orlando, FL 32816-2366, USA

5
Diepenbroek M, Bayer B, Anslinger K. Phenotype predictions of two-person mixture using single cell analysis. Forensic Sci Int Genet 2023; 67:102938. [PMID: 37832204 DOI: 10.1016/j.fsigen.2023.102938] [Received: 08/01/2023] [Revised: 09/19/2023] [Accepted: 09/27/2023] [Indexed: 10/15/2023]
Abstract
Over a decade after the publication of the first forensic DNA phenotyping (FDP) studies, DNA-based appearance predictions are now becoming a reality in routine crime scene investigations. The significant number of publications dedicated to the subject of FDP clearly demonstrates a sustained interest and a strong need for further method development. However, the implementation of FDP in routine work still encounters obstacles, and one of these challenges is making phenotype predictions from DNA mixtures. In this study, we examined single-cell sequencing as a potential tool to enable reliable phenotyping of contributors within mixtures. Two mock mixtures, each containing two contributors with similar and different physical appearances, were analyzed using two different workflows. In the first workflow, the mixtures were sequenced using the Ion AmpliSeq™ PhenoTrivium Panel, which includes 41 HIrisPlex-S (HPS) markers. Subsequently, the genotypes were analyzed using the HPS Deconvolution Tool to predict the phenotypes of both contributors. The second workflow added single-cell separation and collection using the DEPArray™ PLUS System. Two different PhenoTrivium amplification protocols were tested, and the phenotype predictions from single cells were compared with the results obtained using the HPS Tool. Our results suggest that the approach presented here makes it possible to obtain nearly complete HIrisPlex-S profiles with accurate genotypes and reliable phenotype predictions from single cells. This method proves successful in deconvoluting mixtures submitted to forensic DNA phenotyping.
Affiliation(s)
- Marta Diepenbroek
  - Institute of Legal Medicine LMU Munich, Nussbaumstrasse 26, 80336 Munich, Germany
- Birgit Bayer
  - Institute of Legal Medicine LMU Munich, Nussbaumstrasse 26, 80336 Munich, Germany
- Katja Anslinger
  - Institute of Legal Medicine LMU Munich, Nussbaumstrasse 26, 80336 Munich, Germany

6
Plummer JT, George SHL. Challenges and Opportunities in Building a Global Representative Single-Cell and Spatial Atlas in Cancer. Cancer Discov 2023; 13:1969-1972. [PMID: 37671469 DOI: 10.1158/2159-8290.cd-23-0810] [Indexed: 09/07/2023]
Abstract
SUMMARY Cancer health disparities are complex and arise from a mixture of factors that need to be accounted for in planning, implementation, and execution by all researchers, especially in single-cell and spatial technologies, which have a higher burden for adoption in low- and middle-income countries. This commentary tackles the hurdles these technologies face in creating a diverse, representative atlas of cancer and is a call to arms for a strategic plan toward inclusivity across all global populations.
Affiliation(s)
- Jasmine T Plummer
  - Center for Spatial Omics, St. Jude Children's Research Hospital, Memphis, Tennessee
  - Comprehensive Cancer Center, St. Jude Children's Research Hospital, Memphis, Tennessee
  - Department of Developmental Neurobiology, St. Jude Children's Research Hospital, Memphis, Tennessee
  - Department of Cellular and Molecular Biology, St. Jude Children's Research Hospital, Memphis, Tennessee
- Sophia H L George
  - Department of Obstetrics, Gynecology and Reproductive Sciences, University of Miami Miller School of Medicine, Miami, Florida
  - Sylvester Comprehensive Cancer Center, UHealth Medical Systems, University of Miami Miller School of Medicine, Miami, Florida

7
Huffman K, Kruijver M, Ballantyne J, Taylor D. Carrying out common DNA donor analysis using DBLR™ on two or five-cell mini-mixture subsamples for improved discrimination power in complex DNA mixtures. Forensic Sci Int Genet 2023; 66:102908. [PMID: 37402330 DOI: 10.1016/j.fsigen.2023.102908] [Received: 04/13/2023] [Revised: 06/13/2023] [Accepted: 06/15/2023] [Indexed: 07/06/2023]
Abstract
Probabilistic genotyping systems are able to analyse complex mixed DNA profiles and show good power to discriminate contributors from non-contributors. However, the abilities of the statistical analyses are still unavoidably bound by the quality of information being analysed. If a profile has a high number of contributors, or a contributor that is present in trace amounts, then the amount of information about those individuals in the DNA profile is limited. Recent work has shown the ability to gain better resolution of the genotypes of contributors to complex profiles using cell subsampling. This is the process of taking many sets of a limited number of cells and individually profiling each set. These 'mini-mixtures' can provide greater information about the genotypes of underlying contributors. In our work we take the resulting profiles from multiple subsamplings of complex DNA profiles in equal amounts and show how testing for, and then assuming, a common DNA donor can further improve the ability to resolve the genotypes of contributors. Using direct cell sub-sampling and statistical analysis software DBLR™, we were able to recover single source profiles of uploadable quality from five out of the six contributors of an equally proportioned mixture. Through the analysis of mixtures in this work we provide a template for carrying out common donor analysis for maximum effect.
Affiliation(s)
- Kaitlin Huffman
  - Graduate Program in Chemistry, Department of Chemistry, University of Central Florida, P.O. Box 162366, Orlando, FL 32816-2366, USA
- Maarten Kruijver
  - Institute of Environmental Science and Research Limited, Private Bag 92021, Auckland 1142, New Zealand
- Jack Ballantyne
  - Graduate Program in Chemistry, Department of Chemistry, University of Central Florida, P.O. Box 162366, Orlando, FL 32816-2366, USA; National Center for Forensic Science, P.O. Box 162367, Orlando, FL 32816-2367, USA
- Duncan Taylor
  - Forensic Science SA, GPO Box 2790, Adelaide, SA 5001, Australia; School of Biological Sciences, Flinders University, GPO Box 2100, Adelaide, SA 5001, Australia