Song M, Zhou Y, Zhao C, Song F, Hou Y. YHP: Y-chromosome Haplogroup Predictor for predicting male lineages based on Y-STRs. Forensic Sci Int 2024;361:112113. [PMID: 38936202 DOI: 10.1016/j.forsciint.2024.112113]
[Received: 03/18/2024] [Revised: 05/24/2024] [Accepted: 06/16/2024] [Indexed: 06/29/2024]
Abstract
The human Y chromosome reflects the evolutionary history of males, and male lineage tracing by Y chromosome is of great use in evolutionary, forensic, and anthropological studies. Identifying the male lineage from the specific distribution of Y haplogroups narrows the scope of an investigation, an approach that has been used in forensic scenarios. Existing software aids familial searching by using Y-STRs (Y-chromosome short tandem repeats) to predict Y-SNP (Y-chromosome single nucleotide polymorphism) haplogroups, but it often lacks resolution. In this study, we developed YHP (Y Haplogroup Predictor), novel software that offers high-resolution haplogroup inference without requiring extensive Y-SNP sequencing. Trained on existing datasets (219 haplogroups, 4064 samples in total), YHP uses a random forest algorithm and predicts haplogroups with an accuracy of 0.923 at the highest haplogroup resolution. YHP, available on GitHub (https://github.com/cissy123/YHP-Y-Haplogroup-Predictor-), supports high-resolution haplogroup prediction, haplotype mismatch analysis, and haplotype similarity comparison. Notably, it performs well in East Asian populations, benefiting from training data drawn from eight distinct East Asian ethnic populations. Moreover, additional training sets can be integrated seamlessly, extending its utility to diverse populations.
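To make the haplotype mismatch analysis and similarity comparison named in the abstract concrete, the following is a minimal Python sketch of one common way to compare two Y-STR haplotypes: list the shared loci whose alleles differ and report the fraction of shared loci that match. It is not YHP's implementation. The function names `mismatched_loci` and `similarity`, the dictionary representation, and the allele values are all assumptions made for illustration, and the abstract does not state which similarity measure YHP uses.

```python
from typing import Dict, List

# Assumed representation: locus name -> repeat count (integer alleles only).
Haplotype = Dict[str, int]


def mismatched_loci(a: Haplotype, b: Haplotype) -> List[str]:
    """Return the loci typed in both haplotypes whose alleles differ, sorted by name."""
    shared = sorted(set(a) & set(b))
    return [locus for locus in shared if a[locus] != b[locus]]


def similarity(a: Haplotype, b: Haplotype) -> float:
    """Return the fraction of shared loci with identical alleles (0.0 if none are shared)."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    matches = len(shared) - len(mismatched_loci(a, b))
    return matches / len(shared)


# Made-up allele values for illustration only; not taken from the paper's data.
query = {"DYS19": 15, "DYS390": 24, "DYS391": 10, "DYS392": 13, "DYS393": 12}
reference = {"DYS19": 15, "DYS390": 25, "DYS391": 10, "DYS392": 13, "DYS393": 13}

print(mismatched_loci(query, reference))  # ['DYS390', 'DYS393']
print(similarity(query, reference))       # 0.6
```

Only loci typed in both profiles are compared, so partial profiles can still be scored. A real tool would also need to handle multi-copy loci and intermediate alleles, which this sketch ignores.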