1
Chen H, Xu S. Population genomics advances in frontier ethnic minorities in China. SCIENCE CHINA. LIFE SCIENCES 2024:10.1007/s11427-024-2659-2. [PMID: 39643831 DOI: 10.1007/s11427-024-2659-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2024] [Accepted: 06/18/2024] [Indexed: 12/09/2024]
Abstract
China, with its large geographic span, possesses rich genetic diversity across vast frontier regions in addition to the Han Chinese majority. Importantly, demographic events and various natural and cultural environments in Chinese frontier regions have shaped the genomic diversity of ethnic minorities via local adaptations. Thus, insights into the genetic diversity and adaptive evolution of these under-represented ethnic groups are crucial for understanding evolutionary scenarios and biomedical implications in East Asian populations. Here, we focus on ethnic minorities in Chinese frontier regions and review research advances regarding genomic diversity, genetic structure, population history, genetic admixture, and local adaptation. We first provide an overview of the extensive genetic diversity across populations in different Chinese frontier regions. Next, we summarize research progress regarding genetic ancestry, demographic history, the adaptive process, and the archaic identification of multiple ethnic minorities in different Chinese frontier regions. Finally, we discuss the gaps and opportunities in genomic studies of Chinese populations and the need for a more comprehensive understanding of genomic diversity and the evolution of populations of East Asian ancestry in the post-genomic era.
Affiliation(s)
- Hao Chen
  - Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, 200031, China
- Shuhua Xu
  - Center for Evolutionary Biology, School of Life Sciences, Fudan University, Shanghai, 200438, China
3
Elhaik E, Ahsanuddin S, Robinson JM, Foster EM, Mason CE. The impact of cross-kingdom molecular forensics on genetic privacy. MICROBIOME 2021; 9:114. [PMID: 34016161 PMCID: PMC8138925 DOI: 10.1186/s40168-021-01076-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 03/08/2021] [Accepted: 04/07/2021] [Indexed: 05/21/2023]
Abstract
Recent advances in metagenomic technology and computational prediction may inadvertently weaken an individual's reasonable expectation of privacy. Through cross-kingdom genetic and metagenomic forensics, we can already predict at least a dozen human phenotypes with varying degrees of accuracy. There is also growing potential to detect a "molecular echo" of an individual's microbiome from cells deposited on public surfaces. At present, host genetic data from somatic or germ cells provide more reliable information than microbiome samples. However, the emerging ability to infer personal details from different microscopic biological materials left behind on surfaces requires in-depth ethical and legal scrutiny. There is potential to identify and track individuals, along with new, surreptitious means of genetic discrimination. This commentary underscores the need to update legal and policy frameworks for genetic privacy with additional considerations for the information that could be acquired from microbiome-derived data. The article also aims to stimulate ubiquitous discourse to ensure the protection of genetic rights and liberties in the post-genomic era.
Affiliation(s)
- Eran Elhaik
  - Department of Biology, Lund University, 22362, Lund, Sweden
- Sofia Ahsanuddin
  - Department of Medical Education, Icahn School of Medicine at Mount Sinai, New York, USA
- Jake M Robinson
  - The Department of Landscape Architecture, University of Sheffield, Sheffield, S10 2TN, UK
  - The Healthy Urban Microbiome Initiative (HUMI), Adelaide, 5005, South Australia
- Christopher E Mason
  - Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, 10021, USA
  - The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, New York, NY, 10021, USA
  - The Feil Family Brain and Mind Research Institute (BMRI), New York, NY, 10021, USA
  - The Information Society Project, Yale Law School, New Haven, CT, 06511, USA
4
Elhaik E, Ryan DM. Pair Matcher (PaM): fast model-based optimization of treatment/case-control matches. Bioinformatics 2020; 35:2243-2250. [PMID: 30445488 PMCID: PMC6596890 DOI: 10.1093/bioinformatics/bty946] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 08/08/2017] [Revised: 11/03/2018] [Accepted: 11/15/2018] [Indexed: 11/22/2022]
Abstract
Motivation: In clinical trials, individuals are matched using demographic criteria, paired and then randomly assigned to treatment and control groups to determine a drug’s efficacy. A chief cause for the irreproducibility of results across pilot to Phase-III trials is population stratification bias caused by the uneven distribution of ancestries in the treatment and control groups. Results: Pair Matcher (PaM) addresses stratification bias by optimizing pairing assignments a priori and/or a posteriori to the trial using both genetic and demographic criteria. Using simulated and real datasets, we show that PaM identifies ideal and near-ideal pairs that are more genetically homogeneous than those identified based on competing methods, including the commonly used principal component analysis (PCA). Homogenizing the treatment (or case) and control groups can be expected to improve the accuracy and reproducibility of the trial or genetic study. PaM’s ancestral inferences also allow characterizing responders and developing a precision medicine approach to treatment. Availability and implementation: PaM is freely available via R (https://github.com/eelhaik/PAM) and a web-interface at http://elhaik-matcher.sheffield.ac.uk/ElhaikLab/. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Eran Elhaik
  - Department of Animal and Plant Sciences, University of Sheffield, Sheffield, UK
  - INSIGNEO Institute for In Silico Medicine, University of Sheffield, Sheffield, UK
- Desmond M Ryan
  - Department of Animal and Plant Sciences, University of Sheffield, Sheffield, UK
6
Esposito U, Das R, Syed S, Pirooznia M, Elhaik E. Ancient Ancestry Informative Markers for Identifying Fine-Scale Ancient Population Structure in Eurasians. Genes (Basel) 2018; 9:E625. [PMID: 30545160 PMCID: PMC6316245 DOI: 10.3390/genes9120625] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 11/06/2018] [Revised: 12/05/2018] [Accepted: 12/10/2018] [Indexed: 12/23/2022]
Abstract
The rapid accumulation of ancient human genomes from various areas and time periods potentially enables the expansion of studies of biodiversity, biogeography, forensics, population history, and epidemiology into past populations. However, most ancient DNA (aDNA) data were generated through microarrays designed for modern-day populations, which are known to misrepresent the population structure. Past studies addressed these problems by using ancestry informative markers (AIMs). It is therefore unclear whether AIMs derived from contemporary human genomes can capture ancient population structures, and whether AIM-finding methods are applicable to aDNA, given that the high missingness rates in ancient (and oftentimes haploid) DNA can also distort the population structure. Here, we define ancient AIMs (aAIMs) and develop a framework to evaluate established and novel AIM-finding methods in identifying the most informative markers. We show that aAIMs identified by a novel principal component analysis (PCA)-based method outperform all of the competing methods in classifying ancient individuals into populations and identifying admixed individuals. In some cases, predictions made using the aAIMs were more accurate than those made with a complete marker set. We discuss the features of the ancient Eurasian population structure and strategies to identify aAIMs. This work informs the design of single nucleotide polymorphism (SNP) microarrays and the interpretation of aDNA results, which enables a population-wide testing of primordialist theories.
Affiliation(s)
- Umberto Esposito
  - Department of Animal and Plant Sciences, University of Sheffield, Sheffield S10 2TN, UK
- Ranajit Das
  - Manipal University, Manipal Centre for Natural Sciences (MCNS), Manipal, Karnataka, 576104, India
- Syakir Syed
  - Department of Animal and Plant Sciences, University of Sheffield, Sheffield S10 2TN, UK
- Mehdi Pirooznia
  - Bioinformatics and Computational Biology, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, MD 20892, USA
- Eran Elhaik
  - Department of Animal and Plant Sciences, University of Sheffield, Sheffield S10 2TN, UK
7
Baughn LB, Pearce K, Larson D, Polley MY, Elhaik E, Baird M, Colby C, Benson J, Li Z, Asmann Y, Therneau T, Cerhan JR, Vachon CM, Stewart AK, Bergsagel PL, Dispenzieri A, Kumar S, Rajkumar SV. Differences in genomic abnormalities among African individuals with monoclonal gammopathies using calculated ancestry. Blood Cancer J 2018; 8:96. [PMID: 30305608 PMCID: PMC6180134 DOI: 10.1038/s41408-018-0132-1] [Citation(s) in RCA: 45] [Impact Index Per Article: 6.4] [Received: 07/27/2018] [Accepted: 08/31/2018] [Indexed: 12/11/2022]
Abstract
Multiple myeloma (MM) is two- to three-fold more common in African Americans (AAs) than in European Americans (EAs). This striking disparity, one of the highest of any cancer, may be due to differences in underlying genetic predisposition between these groups. There are multiple unique cytogenetic subtypes of MM, and it is likely that the disparity is associated with only certain subtypes. Previous efforts to understand this disparity have relied on self-reported race rather than genetic ancestry, which may result in bias. To mitigate these difficulties, we studied 881 patients with monoclonal gammopathies who had undergone uniform testing to identify primary cytogenetic abnormalities. DNA from bone marrow samples was genotyped on the Precision Medicine Research Array and biogeographical ancestry was quantitatively assessed using the Geographic Population Structure Origins tool. The probability of having one of three specific subtypes, namely t(11;14), t(14;16), or t(14;20), was significantly higher in the 120 individuals with highest African ancestry (≥80%) compared with the 235 individuals with lowest African ancestry (<0.1%) (51% vs. 33%, respectively, p value = 0.008). Using quantitatively measured African ancestry, we demonstrate that a major proportion of the racial disparity in MM is driven by disparity in the occurrence of the t(11;14), t(14;16), and t(14;20) types of MM.
Affiliation(s)
- Linda B Baughn
  - Division of Laboratory Genetics, Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, USA
- Kathryn Pearce
  - Division of Laboratory Genetics, Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, USA
- Dirk Larson
  - Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA
- Mei-Yin Polley
  - Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA
- Eran Elhaik
  - Department of Animal and Plant Sciences, University of Sheffield, Sheffield, UK
- Colin Colby
  - Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA
- Joanne Benson
  - Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA
- Zhuo Li
  - Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA
- Yan Asmann
  - Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA
- Terry Therneau
  - Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA
- James R Cerhan
  - Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA
- Celine M Vachon
  - Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA
- A Keith Stewart
  - Division of Hematology, Department of Internal Medicine, Mayo Clinic, Scottsdale, AZ, USA
  - Division of Hematology, Department of Internal Medicine, Mayo Clinic, Rochester, MN, USA
- P Leif Bergsagel
  - Division of Hematology, Department of Internal Medicine, Mayo Clinic, Scottsdale, AZ, USA
- Angela Dispenzieri
  - Division of Hematology, Department of Internal Medicine, Mayo Clinic, Rochester, MN, USA
- Shaji Kumar
  - Division of Hematology, Department of Internal Medicine, Mayo Clinic, Rochester, MN, USA
- S Vincent Rajkumar
  - Division of Hematology, Department of Internal Medicine, Mayo Clinic, Rochester, MN, USA
8
Node-Based Resilience Measure Clustering with Applications to Noisy and Overlapping Communities in Complex Networks. APPLIED SCIENCES-BASEL 2018. [DOI: 10.3390/app8081307] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Indexed: 11/16/2022]
Abstract
This paper examines a schema for graph-theoretic clustering using node-based resilience measures. Node-based resilience measures optimize an objective based on a critical set of nodes whose removal causes some severity of disconnection in the network. Beyond presenting a general framework for the usage of node-based resilience measures for variations of clustering problems, we experimentally validate the usefulness of such methods in accomplishing the following: (i) clustering a graph in one step without knowing the number of clusters a priori; (ii) removing noise from noisy data; and (iii) detecting overlapping communities. We demonstrate that this clustering schema can be applied successfully using a wide range of data, including both real and synthetic networks, both natively in graph form and also expressed as point sets.