1
George SHL, Medina-Rivera A, Idaghdour Y, Lappalainen T, Gallego Romero I. Increasing diversity of functional genetics studies to advance biological discovery and human health. Am J Hum Genet 2023; 110:1996-2002. [PMID: 37995684 PMCID: PMC10716434 DOI: 10.1016/j.ajhg.2023.10.012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2023] [Revised: 10/25/2023] [Accepted: 10/25/2023] [Indexed: 11/25/2023] Open
Abstract
In this perspective we discuss the current lack of genetic and environmental diversity in functional genomics datasets. There is a well-described Eurocentric bias in genetic and functional genomic research that has a clear impact on the benefit this research can bring to underrepresented populations. Current research focused on genetic variant-to-function experiments aims to identify molecular QTLs, but the lack of data from genetically diverse individuals has limited analyses to mostly populations of European ancestry. Although some efforts have been established to increase diversity in functional genomic studies, much remains to be done to consistently generate data for underrepresented populations from now on. We discuss the major barriers for this continuity and suggest actionable insights, aiming to empower research and researchers from underserved populations.
Affiliation(s)
- Sophia H L George
  - Department of Obstetrics, Gynecology and Reproductive Sciences, Miller School of Medicine, University of Miami, Miami, FL, USA; Sylvester Comprehensive Cancer Center, Miami, FL, USA
- Alejandra Medina-Rivera
  - Laboratorio Internacional de Investigación Sobre El Genoma Humano, Universidad Nacional Autónoma de México, Juriquilla, Querétaro, México
- Youssef Idaghdour
  - Program in Biology, Division of Science and Mathematics, New York University Abu Dhabi, Abu Dhabi, UAE; Public Health Research Center, New York University Abu Dhabi, Abu Dhabi, UAE; Center for Genomics and Systems Biology, New York University Abu Dhabi, Abu Dhabi, UAE
- Tuuli Lappalainen
  - Science for Life Laboratory, Department of Gene Technology, KTH Royal Institute of Technology, Stockholm, Sweden; New York Genome Center, New York, NY, USA
- Irene Gallego Romero
  - Melbourne Integrative Genomics and School of BioSciences, University of Melbourne, Parkville, VIC, Australia; Center for Genomics, Evolution and Medicine, Institute of Genomics, University of Tartu, Tartu, Estonia
2
Ang KC, Canfield VA, Foster TC, Harbaugh TD, Early KA, Harter RL, Reid KP, Leong SL, Kawasawa Y, Liu D, Hawley JW, Cheng KC. Native American genetic ancestry and pigmentation allele contributions to skin color in a Caribbean population. eLife 2023; 12:e77514. [PMID: 37294081 PMCID: PMC10371226 DOI: 10.7554/elife.77514] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2022] [Accepted: 06/08/2023] [Indexed: 06/10/2023] Open
Abstract
Our interest in the genetic basis of skin color variation between populations led us to seek a Native American population with genetically African admixture but low frequency of European light skin alleles. Analysis of 458 genomes from individuals residing in the Kalinago Territory of the Commonwealth of Dominica showed approximately 55% Native American, 32% African, and 12% European genetic ancestry, the highest Native American genetic ancestry among Caribbean populations to date. Skin pigmentation ranged from 20 to 80 melanin units, averaging 46. Three albino individuals were determined to be homozygous for a causative multi-nucleotide polymorphism, OCA2 NW273KV, contained within a haplotype of African origin; its allele frequency was 0.03 and its single-allele effect size was -8 melanin units. Derived allele frequencies of SLC24A5 A111T and SLC45A2 L374F were 0.14 and 0.06, with single-allele effect sizes of -6 and -4 melanin units, respectively. Native American genetic ancestry by itself reduced pigmentation by more than 20 melanin units (range 24-29). The responsible hypopigmenting genetic variants remain to be identified, since none of the published polymorphisms predicted in prior literature to affect skin color in Native Americans caused detectable hypopigmentation in the Kalinago.
Affiliation(s)
- Khai C Ang
  - Department of Pathology, Penn State College of Medicine, Hershey, United States
  - Jake Gittlen Laboratories for Cancer Research, Penn State College of Medicine, Hershey, United States
- Victor A Canfield
  - Department of Pathology, Penn State College of Medicine, Hershey, United States
  - Jake Gittlen Laboratories for Cancer Research, Penn State College of Medicine, Hershey, United States
- Tiffany C Foster
  - Department of Pathology, Penn State College of Medicine, Hershey, United States
  - Jake Gittlen Laboratories for Cancer Research, Penn State College of Medicine, Hershey, United States
- Thaddeus D Harbaugh
  - Department of Pathology, Penn State College of Medicine, Hershey, United States
  - Jake Gittlen Laboratories for Cancer Research, Penn State College of Medicine, Hershey, United States
- Kathryn A Early
  - Department of Pathology, Penn State College of Medicine, Hershey, United States
  - Jake Gittlen Laboratories for Cancer Research, Penn State College of Medicine, Hershey, United States
- Rachel L Harter
  - Department of Pathology, Penn State College of Medicine, Hershey, United States
- Katherine P Reid
  - Department of Pathology, Penn State College of Medicine, Hershey, United States
  - Jake Gittlen Laboratories for Cancer Research, Penn State College of Medicine, Hershey, United States
- Shou Ling Leong
  - Department of Family & Community Medicine, Penn State College of Medicine, Hershey, United States
- Yuka Kawasawa
  - Department of Biochemistry and Molecular Biology, Penn State College of Medicine, Hershey, United States
  - Department of Pharmacology, Penn State College of Medicine, Hershey, United States
  - Institute of Personalized Medicine, Penn State College of Medicine, Hershey, United States
- Dajiang Liu
  - Department of Biochemistry and Molecular Biology, Penn State College of Medicine, Hershey, United States
  - Department of Public Health Sciences, Penn State College of Medicine, Hershey, United States
- Keith C Cheng
  - Department of Pathology, Penn State College of Medicine, Hershey, United States
  - Jake Gittlen Laboratories for Cancer Research, Penn State College of Medicine, Hershey, United States
  - Department of Biochemistry and Molecular Biology, Penn State College of Medicine, Hershey, United States
  - Department of Pharmacology, Penn State College of Medicine, Hershey, United States
3
Marrodan M, Piedrabuena MA, Gaitan MI, Fiol MP, Ysrraelit MC, Carnero Conttenti E, Lopez PA, Peuchot V, Correale J. Performance of McDonald 2017 multiple sclerosis diagnostic criteria and evaluation of genetic ancestry in patients with a first demyelinating event in Argentina. Mult Scler 2023; 29:559-567. [PMID: 36942953 DOI: 10.1177/13524585231157276] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/23/2023]
Abstract
BACKGROUND Information on performance of multiple sclerosis (MS) diagnostic criteria is scarce for populations from Latin America, Asia, or the Caribbean. OBJECTIVE To assess performance of revised 2017 McDonald criteria as well as evaluate genetic ancestry in a group of MS patients from Argentina experiencing a debut demyelinating event. METHODS Demographic and clinical characteristics, cerebrospinal fluid (CSF), and magnetic resonance imaging (MRI) findings and new T2 lesions were recorded at baseline and during relapses. Diagnostic accuracy in predicting conversion to clinically defined MS (CDMS) based on initial imaging applying revised 2017 criteria was evaluated and genetic ancestry-informative markers analyzed. RESULTS Of 201 patients experiencing their first demyelinating event (median follow-up 60 months), CDMS was confirmed in 67. We found 2017 diagnostic criteria were more sensitive (84% vs 67%) and less specific (14% vs 33%) than 2010 criteria, especially in a group of patients revised separately, presenting positive oligoclonal bands (88% vs 8%). Genetic testing performed in 128 cases showed 72% of patients were of European ancestry and 27% presented genetic admixture. CONCLUSION 2017 McDonald criteria showed higher sensitivity and lower specificity compared with 2010 criteria, shortening both time-to-diagnosis and time-to-treatment implementation.
Affiliation(s)
- Marcela P Fiol
  - Departamento de Neurología, Fleni, Buenos Aires, Argentina
- Edgar Carnero Conttenti
  - Unidad de Neuroinmunología, Departamento de Neurociencias, Hospital Alemán, Buenos Aires, Argentina
- Pablo Adrian Lopez
  - Unidad de Neuroinmunología, Departamento de Neurociencias, Hospital Alemán, Buenos Aires, Argentina
- Jorge Correale
  - Departamento de Neurología, Fleni, Buenos Aires, Argentina; Instituto de Química y Fisicoquímica Biológicas (IQUIFIB), CONICET/Universidad de Buenos Aires, Buenos Aires, Argentina
4
Liu X, Ahsan Z, Martheswaran TK, Rosenberg NA. When is the allele-sharing dissimilarity between two populations exceeded by the allele-sharing dissimilarity of a population with itself? Stat Appl Genet Mol Biol 2023; 22:sagmb-2023-0004. [PMID: 38073574 PMCID: PMC10711674 DOI: 10.1515/sagmb-2023-0004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/05/2023] [Accepted: 11/10/2023] [Indexed: 12/18/2023]
Abstract
Allele-sharing statistics for a genetic locus measure the dissimilarity between two populations as a mean of the dissimilarity between random pairs of individuals, one from each population. Owing to within-population variation in genotype, allele-sharing dissimilarities can have the property that they have a nonzero value when computed between a population and itself. We consider the mathematical properties of allele-sharing dissimilarities in a pair of populations, treating the allele frequencies in the two populations parametrically. Examining two formulations of allele-sharing dissimilarity, we obtain the distributions of within-population and between-population dissimilarities for pairs of individuals. We then mathematically explore the scenarios in which, for certain allele-frequency distributions, the within-population dissimilarity - the mean dissimilarity between randomly chosen members of a population - can exceed the dissimilarity between two populations. Such scenarios assist in explaining observations in population-genetic data that members of a population can be empirically more genetically dissimilar from each other on average than they are from members of another population. For a population pair, however, the mathematical analysis finds that at least one of the two populations always possesses smaller within-population dissimilarity than the value of the between-population dissimilarity. We illustrate the mathematical results with an application to human population-genetic data.
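The abstract's central claim can be illustrated with a minimal sketch. It assumes a single biallelic locus and takes dissimilarity to be the probability that two alleles, one drawn at random from each of two individuals, differ. This definition is an assumption made here for illustration and may not coincide with either of the two formulations analyzed in the paper; the function names are hypothetical.

```python
def within_dissimilarity(p: float) -> float:
    """Mean dissimilarity between two random members of one population.

    Assumes a biallelic locus with allele frequency p and defines
    dissimilarity as the chance that two randomly drawn alleles differ.
    """
    return 2.0 * p * (1.0 - p)


def between_dissimilarity(p: float, q: float) -> float:
    """Mean dissimilarity between a member of population A (frequency p)
    and a member of population B (frequency q), same assumed definition."""
    return p * (1.0 - q) + q * (1.0 - p)


def compare(p: float, q: float) -> dict:
    """Collect the three quantities for one pair of allele frequencies."""
    return {
        "within_A": within_dissimilarity(p),
        "within_B": within_dissimilarity(q),
        "between": between_dissimilarity(p, q),
    }


if __name__ == "__main__":
    # With p = 0.6 and q = 0.9, population A is more dissimilar from itself
    # than from population B, while population B is not.
    for name, value in compare(0.6, 0.9).items():
        print(f"{name}: {value:.2f}")
```

Under this assumed definition, the difference between the between-population value and the within-population value for A factors as (q - p)(1 - 2p), and the corresponding difference for B as (p - q)(1 - 2q). The two cannot both be negative, which is consistent with the statement that at least one population always has within-population dissimilarity no larger than the between-population value.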
Affiliation(s)
- Xiran Liu
  - Institute for Computational and Mathematical Engineering, Stanford University, Stanford, CA 94305, USA
- Zarif Ahsan
  - Department of Biology, Stanford University, Stanford, CA 94305, USA
5
Paixão D, Torrezan GT, Santiago KM, Formiga MN, Ahuno ST, Dias-Neto E, Tojal da Silva I, Foulkes WD, Polak P, Carraro DM. Characterization of genetic predisposition to molecular subtypes of breast cancer in Brazilian patients. Front Oncol 2022; 12:976959. [PMID: 36119527 PMCID: PMC9472814 DOI: 10.3389/fonc.2022.976959] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2022] [Accepted: 08/10/2022] [Indexed: 11/13/2022] Open
Abstract
Introduction: BRCA1 and BRCA2 germline pathogenic variants (GPVs) account for most of the 5-10% of breast cancer (BC) that is attributable to inherited genetic variants. BRCA1 GPVs are associated with the triple negative subtype, whereas BRCA2 GPVs are likely to result in higher grade, estrogen-receptor positive BCs. The contribution of other high- and moderate-risk BC genes has not been well defined, and risk estimates for specific BC subtypes are lacking, especially for an admixed population such as that of Brazil. Objective: The aim of this study is to evaluate the value of a multigene panel in detecting germline mutations in cancer-predisposing genes for Brazilian BC patients and its relation to molecular subtypes and the predominant molecular ancestry. Patients and methods: A total of 321 unrelated BC patients who fulfilled NCCN criteria for BRCA1/2 testing between 2016 and 2018 were investigated with a 94-gene panel. Molecular subtypes were retrieved from medical records, and ancestry-specific variants were obtained from off-target reads in the sequencing data. Results: We detected 83 GPVs in 81 patients (positivity rate of 25.2%). Among GPVs, 47% (39/83) were identified in high-risk BC genes (BRCA1/2, PALB2 and TP53) and 18% (15/83) in moderate-penetrance genes (ATM, CHEK2 and RAD51C). The remainder of the GPVs (35%; 29/83) were identified in lower-risk genes. As for the molecular subtypes, triple negative BC had a mutation frequency of 31.6% (25/79), with a predominance of BRCA1 (12.6%; 10/79). Among the luminal subtypes, except Luminal B HER2-positive, 18.7% (29/155) had GPVs, with BRCA1/2 genes contributing 7.1% (11/155) and non-BRCA1/2 genes 12.9% (20/155). For the Luminal B HER2-positive subtype, 40% (16/40) had GPVs, with a predominance of the ATM gene (15%; 6/40) and BRCA2 contributing only 2.5% (1/40). Finally, the HER2-enriched subtype presented a mutation rate of 30.8% (4/13), with a BRCA2 contribution of 7.5% (1/13) and a non-BRCA1/2 contribution of 23% (3/13). Variants of uncertain significance (VUS) were identified in 77.6% (249/321) of the patients, and the number of VUS was higher in patients with Asian and Native American ancestry. Conclusion: The multigene panel helped identify GPVs in genes other than BRCA1/2, increasing the positivity of the genetic test from 9.6% (BRCA1/2) to 25.2% and, considering only the most clinically relevant BC-predisposing genes, to 16.2%. These results indicate that women who meet clinical criteria for hereditary BC may benefit from multigene panel testing, as it allows the identification of GPVs in genes that directly affect the clinical management of these patients and their family members.
Affiliation(s)
- Daniele Paixão
  - Oncogenetics Department, A.C.Camargo Cancer Center, São Paulo, SP, Brazil
- Giovana Tardin Torrezan
  - Clinical and Functional Genomics Group, International Research Center/CIPE, A.C.Camargo Cancer Center, São Paulo, SP, Brazil
  - National Institute of Science and Technology in Oncogenomics and Therapeutic Innovation (INCITO), São Paulo, SP, Brazil
- Karina Miranda Santiago
  - Clinical and Functional Genomics Group, International Research Center/CIPE, A.C.Camargo Cancer Center, São Paulo, SP, Brazil
- Samuel Terkper Ahuno
  - Tri-Institutional PhD Program in Computational Biology and Medicine, Weill Cornell Medicine, New York, NY, United States
- Emmanuel Dias-Neto
  - National Institute of Science and Technology in Oncogenomics and Therapeutic Innovation (INCITO), São Paulo, SP, Brazil
  - Genomic Medicine Group, International Research Center/CIPE, A.C.Camargo Cancer Center, São Paulo, SP, Brazil
- Israel Tojal da Silva
  - National Institute of Science and Technology in Oncogenomics and Therapeutic Innovation (INCITO), São Paulo, SP, Brazil
  - Bioinformatics and Computational Biology Group, International Research Center/CIPE, A.C.Camargo Cancer Center, São Paulo, SP, Brazil
- William D. Foulkes
  - Program in Cancer Genetics, Department of Oncology and Human Genetics, McGill University, Montreal, QC, Canada
- Paz Polak
  - Computational Biology, C2i Genomics, New York, NY, United States
- Dirce Maria Carraro
  - Clinical and Functional Genomics Group, International Research Center/CIPE, A.C.Camargo Cancer Center, São Paulo, SP, Brazil
  - National Institute of Science and Technology in Oncogenomics and Therapeutic Innovation (INCITO), São Paulo, SP, Brazil
  - *Correspondence: Dirce Maria Carraro,
6
Fedorova L, Khrunin A, Khvorykh G, Lim J, Thornton N, Mulyar OA, Limborska S, Fedorov A. Analysis of Common SNPs across Continents Reveals Major Genomic Differences between Human Populations. Genes (Basel) 2022; 13:genes13081472. [PMID: 36011383 PMCID: PMC9408407 DOI: 10.3390/genes13081472] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2022] [Revised: 08/12/2022] [Accepted: 08/17/2022] [Indexed: 12/03/2022] Open
Abstract
Common alleles tend to be more ancient than rare alleles. These common SNPs appeared thousands of years ago and reflect intricate human evolution including various adaptations, admixtures, and migration events. Eighty-four thousand abundant region-specific alleles (ARSAs) that are common in one continent but absent in the rest of the world have been characterized by processing 3100 genomes from 230 populations. Also computed were 17,446 polymorphic sites with regional absence of common alleles (RACAs), which are widespread globally but absent in one region. A majority of these region-specific SNPs were found in Africa. America has the second greatest number of ARSAs (3348) and is even ahead of Europe (1911). Surprisingly, East Asia has the highest number of RACAs (10,524) and the lowest number of ARSAs (362). ARSAs and RACAs have distinct compositions of ancestral versus derived alleles in different geographical regions, reflecting their unique evolution. Genes associated with ARSA and RACA SNPs were identified and their functions were analyzed. The core 100 genes shared by multiple populations and associated with region-specific natural selection were examined. The largest part of them (42%) are related to the nervous system. ARSA and RACA SNPs are important for both association and human evolution studies.
Affiliation(s)
- Andrey Khrunin
  - Institute of Molecular Genetics of National Research Centre, “Kurchatov Institute”, 123182 Moscow, Russia
- Gennady Khvorykh
  - Institute of Molecular Genetics of National Research Centre, “Kurchatov Institute”, 123182 Moscow, Russia
- Jan Lim
  - CRI Genetics LLC, Santa Monica, CA 90404, USA
- Svetlana Limborska
  - Institute of Molecular Genetics of National Research Centre, “Kurchatov Institute”, 123182 Moscow, Russia
- Alexei Fedorov
  - CRI Genetics LLC, Santa Monica, CA 90404, USA
  - Department of Medicine, University of Toledo, Toledo, OH 43606, USA
  - Correspondence: ; Tel.: +1-419-383-5270
7
Griesemer J, Barragán CA. Re-situations of scientific knowledge: a case study of a skirmish over clusters vs clines in human population genomics. History and Philosophy of the Life Sciences 2022; 44:16. [PMID: 35445860 PMCID: PMC9023434 DOI: 10.1007/s40656-022-00497-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2021] [Accepted: 03/12/2022] [Indexed: 06/14/2023]
Abstract
We track and analyze the re-situation of scientific knowledge in the field of human population genomics ancestry studies. We understand re-situation as a process of accommodating the direct or indirect transfer of objects of knowledge from one site/situation to (one or many) other sites/situations. Our take on the concept borrows from Mary S. Morgan's work on facts traveling while expanding it to include other objects of knowledge such as models, data, software, findings, and visualizations. We structure a specific case study by tracking the re-situation of these objects between three research projects studying human population diversity reported in three articles in Science, Genome Research and PLoS Genetics between 2002 and 2005. We characterize these three engagements as a unit of analysis, a "skirmish," in order to compare: (a) the divergence of interests in how life-scientists answer similar research questions and (b) to track the challenging transformation of workflows in research laboratories as these scientific objects are re-situated individually or in bundles. Our analysis of the case study shows that an accurate understanding of re-situation requires tracking the whole bundle of objects in a project because they interact in particular key ways. The absence or dismissal of these interactions opens the door to unforeseen trade-offs, misunderstandings and misrepresentations about research design(s) and workflow(s) and what these say about the questions asked and the findings produced.
Affiliation(s)
- James Griesemer
  - Department of Philosophy, University of California, Davis, One Shields Avenue, Davis, CA 95616 USA
  - Department of Science and Technology Studies, University of California, Davis, One Shields Avenue, Davis, CA 95616 USA
- Carlos Andrés Barragán
  - Department of Philosophy, University of California, Davis, One Shields Avenue, Davis, CA 95616 USA
  - Department of Science and Technology Studies, University of California, Davis, One Shields Avenue, Davis, CA 95616 USA
8
Chiu AM, Molloy EK, Tan Z, Talwalkar A, Sankararaman S. Inferring population structure in biobank-scale genomic data. Am J Hum Genet 2022; 109:727-737. [PMID: 35298920 PMCID: PMC9069078 DOI: 10.1016/j.ajhg.2022.02.015] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/08/2021] [Accepted: 02/21/2022] [Indexed: 01/07/2023] Open
Abstract
Inferring the structure of human populations from genetic variation data is a key task in population and medical genomic studies. Although a number of methods for population structure inference have been proposed, current methods are impractical to run on biobank-scale genomic datasets containing millions of individuals and genetic variants. We introduce SCOPE, a method for population structure inference that is orders of magnitude faster than existing methods while achieving comparable accuracy. SCOPE infers population structure in about a day on a dataset containing one million individuals and variants as well as on the UK Biobank dataset containing 488,363 individuals and 569,346 variants. Furthermore, SCOPE can leverage allele frequencies from previous studies to improve the interpretability of population structure estimates.
Affiliation(s)
- Alec M Chiu
  - Bioinformatics Interdepartmental Program, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Erin K Molloy
  - Department of Computer Science, University of California, Los Angeles, Los Angeles, CA 90095, USA; Institute for Advanced Computer Studies, University of Maryland, College Park, College Park, MD 20742, USA
- Zilong Tan
  - Facebook, Inc., Menlo Park, CA 94025, USA
- Ameet Talwalkar
  - Machine Learning Department, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Sriram Sankararaman
  - Bioinformatics Interdepartmental Program, University of California, Los Angeles, Los Angeles, CA 90095, USA; Department of Computer Science, University of California, Los Angeles, Los Angeles, CA 90095, USA; Department of Human Genetics, University of California, Los Angeles, Los Angeles, CA 90095, USA; Department of Computational Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
9
Further insight into the global variability of the OCA2-HERC2 locus for human pigmentation from multiallelic markers. Sci Rep 2021; 11:22530. [PMID: 34795370 PMCID: PMC8602267 DOI: 10.1038/s41598-021-01940-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/15/2021] [Accepted: 11/02/2021] [Indexed: 11/20/2022] Open
Abstract
The OCA2-HERC2 locus is responsible for the greatest proportion of eye color variation in humans. Numerous studies have extensively described both functional SNPs and associated patterns of variation over this region. The goal of our study is to examine how these haplotype structures and allelic associations vary when highly variable markers such as microsatellites are used. Eleven microsatellites spanning 357 Kb of the OCA2-HERC2 genes are analyzed in 3029 individuals from worldwide populations. We found that several markers display large differences in allele frequency (10% to 35% difference) among Europeans, East Asians and Africans. In Europe, the alleles showing increased frequency can also discriminate between individuals with (IrisPlex) predicted blue and brown eyes. Distinct haplotypes are identified around the variants C and T of the functional SNP rs12913832 (associated with blue eyes), with linkage disequilibrium r² values significant up to 237 Kb. The haplotype carrying the allele rs12913832 C has a high frequency (76%) in individuals predicted to have blue eyes (30% in individuals predicted to have brown eyes), while the haplotype associated with the allele rs12913832 T is restricted to individuals predicted to have brown eyes. Finally, homozygosity values reach levels of 91% near rs12913832. Odds ratios show values of 4.2, 7.4 and 10.4 for four markers around rs12913832 and 7.1 for their core haplotype. Hence, this study provides an example of the informativeness of multiallelic markers and, despite their currently limited potential contribution to forensic eye color prediction, supports the use of microsatellites for identifying causal variants showing similar genetic features and history.
10
Mughal MR, DeGiorgio M. Properties and unbiased estimation of F- and D-statistics in samples containing related and inbred individuals. Genetics 2021; 220:6321956. [PMID: 34849832 PMCID: PMC8733448 DOI: 10.1093/genetics/iyab090] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/20/2021] [Accepted: 05/26/2021] [Indexed: 11/14/2022] Open
Abstract
The Patterson F- and D-statistics are commonly used measures for quantifying population relationships and for testing hypotheses about demographic history. These statistics make use of allele frequency information across populations to infer different aspects of population history, such as population structure and introgression events. Inclusion of related or inbred individuals can bias such statistics, which often leads to the filtering of such individuals. Here, we derive statistical properties of the F- and D-statistics, including their biases due to the inclusion of related or inbred individuals, their variances, and their corresponding mean squared errors. Moreover, for those statistics that are biased, we develop unbiased estimators and evaluate the variances of these new quantities. Comparisons of the new unbiased statistics to the originals demonstrate that our newly derived statistics often have lower error across a wide population parameter space. Furthermore, we apply these unbiased estimators to several global human populations with the inclusion of related individuals to highlight their application on an empirical dataset. Finally, we implement these unbiased estimators in the open-source software package funbiased for easy application by the scientific community.
Affiliation(s)
- Mehreen R Mughal
  - Bioinformatics and Genomics at the Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802, USA
- Michael DeGiorgio
  - Department of Computer and Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
11
The analysis of ancestry with small-scale forensic panels of genetic markers. Emerg Top Life Sci 2021; 5:443-453. [PMID: 33949669 DOI: 10.1042/etls20200327] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 02/03/2021] [Revised: 04/07/2021] [Accepted: 04/19/2021] [Indexed: 11/17/2022]
Abstract
In the last 10 years, forensic genetic analysis has been extended beyond identification tests that link a suspect to crime scene evidence using standard DNA profiling, to new supplementary tests that can provide information to investigators about a suspect in the absence of a database hit or eyewitness testimony. These tests now encompass the prediction of physical appearance, ancestry and age. In this review, we give a comprehensive overview of the full range of DNA-based ancestry inference tests designed to work with forensic contact traces, where the level of DNA is often very low or the DNA is highly degraded. We outline recent developments in the design of ancestry-informative marker sets, forensic assays that use capillary electrophoresis or massively parallel sequencing, and the statistical analysis frameworks that examine the test profile and compare it to reference population variation. Three casework ancestry analysis examples are described which were successfully accomplished in the authors' laboratory, where the ancestry information obtained was critical to the outcome of the DNA analyses made.
12
Galván-Femenía I, Barceló-Vidal C, Sumoy L, Moreno V, de Cid R, Graffelman J. A likelihood ratio approach for identifying three-quarter siblings in genetic databases. Heredity (Edinb) 2021; 126:537-547. [PMID: 33452467 PMCID: PMC8027836 DOI: 10.1038/s41437-020-00392-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/26/2020] [Revised: 11/04/2020] [Accepted: 11/16/2020] [Indexed: 11/09/2022] Open
Abstract
The detection of family relationships in genetic databases is of interest in various scientific disciplines such as genetic epidemiology, population and conservation genetics, forensic science, and genealogical research. Nowadays, screening genetic databases for related individuals forms an important aspect of standard quality control procedures. Relatedness research is usually based on an allele sharing analysis of identity by state (IBS) or identity by descent (IBD) alleles. Existing IBS/IBD methods mainly aim to identify first-degree relationships (parent-offspring or full siblings) and second degree (half-siblings, avuncular, or grandparent-grandchild) pairs. Little attention has been paid to the detection of in-between first and second-degree relationships such as three-quarter siblings (3/4S) who share fewer alleles than first-degree relationships but more alleles than second-degree relationships. With the progressively increasing sample sizes used in genetic research, it becomes more likely that such relationships are present in the database under study. In this paper, we extend existing likelihood ratio (LR) methodology to accurately infer the existence of 3/4S, distinguishing them from full siblings and second-degree relatives. We use bootstrap confidence intervals to express uncertainty in the LRs. Our proposal accounts for linkage disequilibrium (LD) by using marker pruning, and we validate our methodology with a pedigree-based simulation study accounting for both LD and recombination. An empirical genome-wide array data set from the GCAT Genomes for Life cohort project is used to illustrate the method.
Affiliation(s)
- Iván Galván-Femenía
- Department of Computer Science, Applied Mathematics and Statistics, Universitat de Girona, Girona, Spain; Genomes For Life - GCAT lab, Institute for Health Science Research Germans Trias i Pujol (IGTP), Can Ruti Campus, Badalona, Barcelona, Spain
- Carles Barceló-Vidal
- Department of Computer Science, Applied Mathematics and Statistics, Universitat de Girona, Girona, Spain
- Lauro Sumoy
- High Content Genomics and Bioinformatics Unit, Institute for Health Science Research Germans Trias i Pujol (IGTP), Can Ruti Campus, Badalona, Barcelona, Spain
- Victor Moreno
- Oncology Data Analytics Program, Catalan Institute of Oncology (ICO), Badalona, Spain; ONCOBELL Program, Bellvitge Biomedical Research Institute (IDIBELL), Barcelona, Spain; Consortium for Biomedical Research in Epidemiology and Public Health (CIBERESP), Madrid, Spain; Department of Clinical Sciences, University of Barcelona, Barcelona, Spain
- Rafael de Cid
- Genomes For Life - GCAT lab, Institute for Health Science Research Germans Trias i Pujol (IGTP), Can Ruti Campus, Badalona, Barcelona, Spain
- Jan Graffelman
- Department of Statistics and Operations Research, Universitat Politècnica de Catalunya, Barcelona, Spain; Department of Biostatistics, University of Washington, Seattle, WA, USA
13
Andersen JD, Meyer OS, Simão F, Jannuzzi J, Carvalho E, Andersen MM, Pereira V, Børsting C, Morling N, Gusmão L. Skin pigmentation and genetic variants in an admixed Brazilian population of primarily European ancestry. Int J Legal Med 2020; 134:1569-1579. [PMID: 32385594 DOI: 10.1007/s00414-020-02307-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 12/02/2019] [Accepted: 04/22/2020] [Indexed: 01/16/2023]
Abstract
Although many genes have been shown to be associated with human pigmentary traits and forensic prediction assays exist (e.g. HIrisPlex-S), the genetic knowledge about skin colour remains incomplete. The highly admixed Brazilian population is an interesting study population for investigation of the complex genotype-phenotype architecture of human skin colour because of its large variation. Here, we compared variants in 22 pigmentary genes with quantitative skin pigmentation levels on the buttock, arm, and forehead areas of 266 genetically admixed Brazilian individuals. The genetic ancestry of each individual was estimated by typing 46 AIM-InDels. The mean proportion of genetic ancestry was 68.8% European, 20.8% Sub-Saharan African, and 10.4% Native American. A high correlation (adjusted R² = 0.65, p < 0.05) was observed between nine SNPs and quantitative skin pigmentation using multiple linear regression analysis. The correlations were notably smaller between skin pigmentation and biogeographic ancestry (adjusted R² = 0.45, p < 0.05), or markers in the leading forensic skin colour prediction system, the HIrisPlex-S (adjusted R² = 0.54, p < 0.05). Four of the nine SNPs, OCA2 rs1448484 (rank 2), APBA2 rs4424881 (rank 4), MFSD12 rs10424065 (rank 8), and TYRP1 rs1408799 (rank 9), were not investigated as part of the HIrisPlex-S selection process and were therefore not included in the HIrisPlex-S model. Our results indicate that these SNPs account for a substantial part of the skin colour variation in individuals of admixed ancestry. Hence, we suggest that these SNPs be considered when developing future skin colour prediction models.
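The adjusted R² values reported above penalise the ordinary R² for the number of predictors in the regression. A minimal sketch of the standard adjustment follows; the function name `adjusted_r2` and the example numbers in the usage note are hypothetical and are not taken from the study.

```python
def adjusted_r2(r2, n_samples, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1).

    r2: ordinary coefficient of determination of the fitted model
    n_samples: number of observations (n)
    n_predictors: number of predictors, excluding the intercept (p)
    """
    dof = n_samples - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more observations than predictors + 1")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / dof
```

For example, `adjusted_r2(0.8, 101, 4)` gives roughly 0.792: the more predictors a model uses for a given sample size, the larger the downward correction.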
Affiliation(s)
- Jeppe D Andersen
- Section of Forensic Genetics, Department of Forensic Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, DK-2100, Copenhagen, Denmark
- Olivia S Meyer
- Section of Forensic Genetics, Department of Forensic Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, DK-2100, Copenhagen, Denmark
- Filipa Simão
- DNA Diagnostic Laboratory (LDD), Institute of Biology, State University of Rio de Janeiro (UERJ), Rio de Janeiro, Brazil
- Juliana Jannuzzi
- DNA Diagnostic Laboratory (LDD), Institute of Biology, State University of Rio de Janeiro (UERJ), Rio de Janeiro, Brazil
- Elizeu Carvalho
- DNA Diagnostic Laboratory (LDD), Institute of Biology, State University of Rio de Janeiro (UERJ), Rio de Janeiro, Brazil
- Mikkel M Andersen
- Department of Mathematical Sciences, Aalborg University, DK-9000, Aalborg, Denmark
- Vania Pereira
- Section of Forensic Genetics, Department of Forensic Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, DK-2100, Copenhagen, Denmark
- Claus Børsting
- Section of Forensic Genetics, Department of Forensic Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, DK-2100, Copenhagen, Denmark
- Niels Morling
- Section of Forensic Genetics, Department of Forensic Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, DK-2100, Copenhagen, Denmark
- Leonor Gusmão
- DNA Diagnostic Laboratory (LDD), Institute of Biology, State University of Rio de Janeiro (UERJ), Rio de Janeiro, Brazil
14
West FL, Algee-Hewitt BF. Cadaveric blood cards: Assessing DNA quality and quantity and the utility of STRs for the individual estimation of trihybrid ancestry and admixture proportions. Forensic Sci Int Synerg 2020; 2:114-122. [PMID: 32412010 PMCID: PMC7219121 DOI: 10.1016/j.fsisyn.2020.03.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/23/2020] [Revised: 03/04/2020] [Accepted: 03/06/2020] [Indexed: 02/04/2023]
Abstract
As part of a body donation program, blood samples were collected and stored on untreated (non-FTA) blood cards. The blood cards were evaluated in terms of DNA preservation and STR typing success, and the resulting profiles were assessed with special consideration given to profile matching for positive identification and biogeographic ancestry estimation. While STR profiles were successfully generated for all samples, results indicate that the time interval between date of death and sample collection has an impact on DNA quantity and quality. There is a statistically significant decrease in relative fluorescent unit (RFU) values with increasing time interval between date of death and sample collection, indicating degradation in the blood card samples related to the post-mortem interval prior to sample collection. The STR profiles were used to estimate ancestry and admixture using the program STRUCTURE, demonstrating the utility of these markers beyond individual identification purposes, with caveats for application based on population history.
Affiliation(s)
- Frankie L. West
- Forensic Science Program, Western Carolina University, USA
- Corresponding author.
15
Greenbaum G, Rubin A, Templeton AR, Rosenberg NA. Network-based hierarchical population structure analysis for large genomic data sets. Genome Res 2019; 29:2020-2033. [PMID: 31694865 PMCID: PMC6886512 DOI: 10.1101/gr.250092.119] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 03/06/2019] [Accepted: 11/01/2019] [Indexed: 01/24/2023]
Abstract
Analysis of population structure in natural populations using genetic data is a common practice in ecological and evolutionary studies. With large genomic data sets of populations now appearing more frequently across the taxonomic spectrum, it is becoming increasingly possible to reveal many hierarchical levels of structure, including fine-scale genetic clusters. To analyze these data sets, methods need to be appropriately suited to the challenges of extracting multilevel structure from whole-genome data. Here, we present a network-based approach for constructing population structure representations from genetic data. The use of community-detection algorithms from network theory generates a natural hierarchical perspective on the representation that the method produces. The method is computationally efficient, and it requires relatively few assumptions regarding the biological processes that underlie the data. We show the approach by analyzing population structure in the model plant species Arabidopsis thaliana and in human populations. These examples illustrate how network-based approaches for population structure analysis are well-suited to extracting valuable ecological and evolutionary information in the era of large genomic data sets.
Affiliation(s)
- Gili Greenbaum
- Department of Biology, Stanford University, Stanford, California 94305, USA
- Amir Rubin
- Department of Computer Science, Ben-Gurion University of the Negev, Be'er-Sheva, 8410501, Israel
- Alan R Templeton
- Department of Biology, Washington University, St. Louis, Missouri 63130, USA
- Department of Evolutionary and Environmental Ecology, University of Haifa, Haifa, 31905, Israel
- Noah A Rosenberg
- Department of Biology, Stanford University, Stanford, California 94305, USA
16
Waples RK, Albrechtsen A, Moltke I. Allele frequency-free inference of close familial relationships from genotypes or low-depth sequencing data. Mol Ecol 2019; 28:35-48. [PMID: 30462358 PMCID: PMC6850436 DOI: 10.1111/mec.14954] [Citation(s) in RCA: 49] [Impact Index Per Article: 9.8] [Received: 03/21/2018] [Accepted: 10/12/2018] [Indexed: 01/03/2023]
Abstract
Knowledge of how individuals are related is important in many areas of research, and numerous methods for inferring pairwise relatedness from genetic data have been developed. However, the majority of these methods were not developed for situations where data are limited. Specifically, most methods rely on the availability of population allele frequencies, the relative genomic position of variants and accurate genotype data. But in studies of non‐model organisms or ancient samples, such data are not always available. Motivated by this, we present a new method for pairwise relatedness inference, which requires neither allele frequency information nor information on genomic position. Furthermore, it can be applied not only to accurate genotype data but also to low‐depth sequencing data from which genotypes cannot be accurately called. We evaluate it using data from a range of human populations and show that it can be used to infer close familial relationships with a similar accuracy as a widely used method that relies on population allele frequencies. Additionally, we show that our method is robust to SNP ascertainment and applicable to low‐depth sequencing data generated using different strategies, including resequencing and RADseq, which is important for application to a diverse range of populations and species.
Affiliation(s)
- Ryan K Waples
- Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, Copenhagen N, Denmark
- Anders Albrechtsen
- Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, Copenhagen N, Denmark
- Ida Moltke
- Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, Copenhagen N, Denmark
17
Hao W, Storey JD. Extending Tests of Hardy-Weinberg Equilibrium to Structured Populations. Genetics 2019; 213:759-770. [PMID: 31537622 PMCID: PMC6827367 DOI: 10.1534/genetics.119.302370] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 05/28/2019] [Accepted: 08/21/2019] [Indexed: 12/22/2022] Open
Abstract
Testing for Hardy-Weinberg equilibrium (HWE) is an important component in almost all analyses of population genetic data. Genetic markers that violate HWE are often treated as special cases; for example, they may be flagged as possible genotyping errors, or they may be investigated more closely for evolutionary signatures of interest. The presence of population structure is one reason why genetic markers may fail a test of HWE. This is problematic because almost all natural populations studied in the modern setting show some degree of structure. Therefore, it is important to be able to detect deviations from HWE for reasons other than structure. To this end, we extend statistical tests of HWE to allow for population structure, which we call a test of "structural HWE." Additionally, our new test allows one to automatically choose tuning parameters and identify accurate models of structure. We demonstrate our approach on several important studies, provide theoretical justification for the test, and present empirical evidence for its utility. We anticipate the proposed test will be useful in a broad range of analyses of genome-wide population genetic data.
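To make concrete why population structure can cause a marker to fail a test of HWE, the sketch below implements the classical one-degree-of-freedom chi-square test for a biallelic marker. This is the standard unstructured test, not the "structural HWE" test proposed in the paper; the function name `hwe_chi2` and the genotype counts in the usage note are illustrative assumptions.

```python
from math import erfc, sqrt


def hwe_chi2(n_aa, n_ab, n_bb):
    """Classical chi-square test of HWE from genotype counts (AA, AB, BB).

    Returns (chi2, p_value), with the p-value taken from a chi-square
    distribution with 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # estimated frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        raise ValueError("marker is monomorphic; the test is undefined")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of chi-square with 1 df: P(Z^2 > x) = erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(chi2 / 2.0))
    return chi2, p_value
```

Two hypothetical subpopulations with counts (10, 180, 810) and (810, 180, 10) are each in exact HWE, yet pooling them into (820, 360, 820) yields a strong heterozygote deficit and a very large test statistic. This is the kind of structure-driven deviation that the extended test is designed to allow for.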
Affiliation(s)
- Wei Hao
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, New Jersey 08544
- John D Storey
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, New Jersey 08544
18
Genetic Similarity Assessment of Twin-Family Populations by Custom-Designed Genotyping Array. Twin Res Hum Genet 2019; 22:210-219. [PMID: 31379313 DOI: 10.1017/thg.2019.41] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 12/12/2022]
Abstract
Twin registries often take part in large collaborative projects and are major contributors to genome-wide association (GWA) meta-analysis studies. In this article, we describe genotyping of twin-family populations from Australia, the Midwestern USA (Avera Twin Register), the Netherlands (Netherlands Twin Register), as well as a sample of mothers of twins from Nigeria to assess the extent, if any, of genetic differences between them. Genotyping in all cohorts was done using a custom-designed Illumina Global Screening Array (GSA), optimized to improve imputation quality for population-specific GWA studies. We investigated the degree of genetic similarity between the populations using several measures of population variation with genotype data generated from the GSA. Visualization of principal component analysis (PCA) revealed that the Australian, Dutch and Midwestern American populations exhibit negligible interpopulation stratification when compared to each other, to a reference European population and to globally distant populations. Estimations of fixation indices (FST values) between the Australian, Midwestern American and Netherlands populations suggest minimal genetic differentiation compared to the estimates between each population and a genetically distinct cohort (i.e., samples from Nigeria genotyped on GSA). Thus, results from this study demonstrate that genotype data from the Australian, Dutch and Midwestern American twin-family populations can be reasonably combined for joint-genetic analysis.
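The fixation index mentioned above measures how much of the total heterozygosity is explained by differences between populations. The abstract does not say which FST estimator was used, so the following is only a simple sketch of Wright's FST = (H_T - H_S) / H_T for two equally weighted populations, combined across loci as a ratio of sums. The function name and the allele frequencies in the usage note are hypothetical.

```python
def fst_two_populations(freqs1, freqs2):
    """Wright's F_ST for two equally weighted populations.

    freqs1, freqs2: per-locus frequencies of the same reference allele
    in population 1 and population 2.
    """
    h_s_sum = 0.0  # mean within-population expected heterozygosity
    h_t_sum = 0.0  # expected heterozygosity of the pooled population
    for p1, p2 in zip(freqs1, freqs2):
        h_s_sum += (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
        p_bar = (p1 + p2) / 2
        h_t_sum += 2 * p_bar * (1 - p_bar)
    if h_t_sum == 0.0:
        raise ValueError("no polymorphic loci; F_ST is undefined")
    return (h_t_sum - h_s_sum) / h_t_sum
```

With this definition, identical allele frequencies give 0 and fixed differences give 1; for instance `fst_two_populations([0.1], [0.9])` is 0.64. Values close to 0, as reported between the Australian, Dutch, and Midwestern American cohorts, indicate minimal differentiation.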
19
The three-hybrid genetic composition of an Ecuadorian population using AIMs-InDels compared with autosomes, mitochondrial DNA and Y chromosome data. Sci Rep 2019; 9:9247. [PMID: 31239502 PMCID: PMC6592923 DOI: 10.1038/s41598-019-45723-w] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 01/22/2019] [Accepted: 06/04/2019] [Indexed: 11/08/2022] Open
Abstract
The history of Ecuador was marked by the arrival of Europeans with Africans, resulting in the mixture of Native Americans with Africans and Europeans. The present study contributes to the knowledge of the Ecuadorian mestizo population by offering information about ancestry and ethnic heterogeneity. Forty-six AIM-InDels (Ancestry Informative Insertion/Deletion Markers) were used to obtain information on 240 Ecuadorian individuals from three regions (Amazonia, the Highlands, and the Coast). As a result, the population involved a significant contribution from Native Americans (values up to 51%), followed by Europeans (values up to 33%) and Africans (values up to 13%). Furthermore, we compared the data obtained with nine previously reported scientific articles on autosomal, mitochondrial DNA and Y chromosomes. The admixture results correspond to Ecuador's historical background and vary slightly between regions.
20
Nikoghosyan M, Hakobyan S, Hovhannisyan A, Loeffler-Wirth H, Binder H, Arakelyan A. Population Levels Assessment of the Distribution of Disease-Associated Variants With Emphasis on Armenians - A Machine Learning Approach. Front Genet 2019; 10:394. [PMID: 31105750 PMCID: PMC6498285 DOI: 10.3389/fgene.2019.00394] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 12/14/2018] [Accepted: 04/11/2019] [Indexed: 12/25/2022] Open
Abstract
Background: During the last decades, a number of genome-wide association studies (GWASs) have identified numerous single nucleotide polymorphisms (SNPs) associated with different complex diseases. However, associations reported in one population are often conflicting and do not replicate when studied in other populations. One of the reasons could be that most GWASs employ a case-control design in one or a limited number of populations, while little attention has been paid to the global distribution of disease-associated alleles across different populations. Moreover, the majority of GWASs have been performed on selected European, African, and Chinese populations, and a considerable number of populations remain understudied. Aim: We investigated the global distribution of the disease-associated SNPs discovered so far across worldwide populations of different ancestry and geographical regions, with a special focus on the understudied population of Armenians. Data and Methods: We used genotyping data from the Human Genome Diversity Project and from the Armenian population and combined them with disease-associated SNP data taken from public repositories, leading to a final dataset of 44,234 markers. Their frequency distribution across 1039 individuals from 53 populations was analyzed using self-organizing maps (SOM) machine learning. Our SOM portrayal approach reduces data dimensionality, clusters SNPs with similar frequency profiles, and provides two-dimensional data images which enable visual evaluation of disease-associated SNP landscapes among human populations. Results: We find that populations from Africa, Oceania, and America show specific patterns of minor allele frequencies of disease-associated SNPs, while populations from Europe, Middle East, Central South Asia, and Armenia mostly share similar patterns. Importantly, different sets of SNPs are associated with common polygenic diseases, such as cancer, diabetes, and neurodegeneration, in populations from different geographic regions. Armenians are characterized by a set of SNPs that are distinct from other populations from the neighboring geographical regions. Conclusion: Genetic associations of diseases vary considerably across populations, which necessitates health-related genotyping efforts, especially for populations that are so far understudied. SOM portrayal represents a promising novel method in population genetic research, with special strength in visualization-based comparison of SNP data.
Affiliation(s)
- Maria Nikoghosyan
- Institute of Biomedicine and Pharmacy, Russian-Armenian University, Yerevan, Armenia
- Research Group of Bioinformatics, Institute of Molecular Biology NAS RA, Yerevan, Armenia
- Siras Hakobyan
- Research Group of Bioinformatics, Institute of Molecular Biology NAS RA, Yerevan, Armenia
- Anahit Hovhannisyan
- Laboratory of Ethnogenomics, Institute of Molecular Biology NAS RA, Yerevan, Armenia
- Henry Loeffler-Wirth
- Interdisciplinary Centre for Bioinformatics, University of Leipzig, Leipzig, Germany
- Hans Binder
- Interdisciplinary Centre for Bioinformatics, University of Leipzig, Leipzig, Germany
- Arsen Arakelyan
- Institute of Biomedicine and Pharmacy, Russian-Armenian University, Yerevan, Armenia
- Research Group of Bioinformatics, Institute of Molecular Biology NAS RA, Yerevan, Armenia
21
Graffelman J, Galván Femenía I, de Cid R, Barceló Vidal C. A Log-Ratio Biplot Approach for Exploring Genetic Relatedness Based on Identity by State. Front Genet 2019; 10:341. [PMID: 31068965 PMCID: PMC6491861 DOI: 10.3389/fgene.2019.00341] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 12/05/2018] [Accepted: 03/29/2019] [Indexed: 12/31/2022] Open
Abstract
The detection of cryptic relatedness in large population-based cohorts is of great importance in genome research. The usual approach for detecting closely related individuals is to plot allele sharing statistics, based on identity-by-state or identity-by-descent, in a two-dimensional scatterplot. This approach ignores that allele sharing data across individuals in reality have a higher dimensionality, and it does not account for the compositional nature of the underlying counts of shared genotypes. In this paper we develop biplot methodology based on log-ratio principal component analysis that overcomes these restrictions. This leads to entirely new graphics that are essentially useful for exploring relatedness in genetic databases from homogeneous populations. The proposed method can be applied in an iterative manner, acting as a looking glass for more remote relationships that are harder to classify. Datasets from the 1000 Genomes Project and the Genomes For Life-GCAT Project are used to illustrate the proposed method. The discriminatory power of the log-ratio biplot approach is compared with the classical plots in a simulation study. In a non-inbred homogeneous population the classification rate of the log-ratio principal component approach outperforms the classical graphics across the whole allele frequency spectrum, using only identity by state. In these circumstances, simulations show that with 35,000 independent bi-allelic variants, log-ratio principal component analysis, combined with discriminant analysis, can correctly classify relationships up to and including the fourth degree.
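The compositional view of allele sharing described above can be sketched in simplified form. The example below counts markers at which a pair shares 0, 1, or 2 alleles identical by state, then applies a centered log-ratio (clr) transform, which is the usual preprocessing step before a log-ratio principal component analysis. It is an illustration only: the paper's exact composition and zero handling are not given in the abstract, and the function names and the `pseudocount` parameter are assumptions.

```python
from math import log


def ibs_counts(geno1, geno2):
    """Count markers with 0, 1, and 2 alleles shared identical by state.

    Genotypes are coded as 0/1/2 copies of a reference allele, so the
    number of IBS alleles at a marker is 2 - |g1 - g2|.
    """
    counts = [0, 0, 0]
    for g1, g2 in zip(geno1, geno2):
        counts[2 - abs(g1 - g2)] += 1
    return tuple(counts)


def clr(composition, pseudocount=0.5):
    """Centered log-ratio transform of a vector of counts.

    The pseudocount guards against log(0); its value is an assumption.
    """
    logs = [log(x + pseudocount) for x in composition]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]
```

Each pair of individuals then contributes one clr-transformed row, and a principal component analysis of those rows gives the coordinates for a log-ratio biplot.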
Affiliation(s)
- Jan Graffelman
- Department of Statistics and Operations Research, Technical University of Catalonia, Barcelona, Spain; Department of Biostatistics, University of Washington, Seattle, WA, United States
- Iván Galván Femenía
- Department of Computer Science, Applied Mathematics and Statistics, University of Girona, Girona, Spain; Genomes For Life - GCAT Lab, Institute for Health Science Research Germans Trias i Pujol (IGTP), Badalona, Spain
- Rafael de Cid
- Genomes For Life - GCAT Lab, Institute for Health Science Research Germans Trias i Pujol (IGTP), Badalona, Spain
- Carles Barceló Vidal
- Department of Computer Science, Applied Mathematics and Statistics, University of Girona, Girona, Spain
22
Moriot A, Santos C, Freire-Aradas A, Phillips C, Hall D. Inferring biogeographic ancestry with compound markers of slow and fast evolving polymorphisms. Eur J Hum Genet 2018; 26:1697-1707. [PMID: 29995845 PMCID: PMC6189140 DOI: 10.1038/s41431-018-0215-2] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 12/01/2017] [Revised: 04/23/2018] [Accepted: 06/12/2018] [Indexed: 11/09/2022] Open
Abstract
Biogeographic ancestry is an area of considerable interest in medical genetics, anthropology and forensics. Although genome-wide panels are ideal as they provide dense genotyping data, small sets of ancestry informative markers provide a cost-effective way to investigate genetic ancestry and population structure. Here, we investigate the performance of a reduced marker set that combines different types of autosomal markers through haplotype analysis. In particular, recently described DIP-STR markers should offer the advantage of comprising both low mutation rate Indels (DIPs), to study human history over a longer time scale, and high mutation rate STRs, to trace relatively recent demographic events. In this study, we assessed the ability of an initial set of 23 DIP-STRs to distinguish major population groups using the HGDP-CEPH reference samples. The results obtained applying the STRUCTURE algorithm show that the discrimination capacity of the DIP-STRs is comparable to currently used small-scale ancestry informative markers by approaching seven major demographic groups. Yet, the DIP-STRs show an improved success rate in assigning individuals to populations of Europe and Middle East. These data show a remarkable ability of a preliminary set of 23 DIP-STR markers to infer major biogeographic origins. A novel set of DIP-STRs preselected to contain ancestry information should lead to further improvements.
Affiliation(s)
- Amandine Moriot
- Unité de Génétique Forensique, Centre Universitaire Romand de Médecine Légale, Centre Hospitalier Universitaire Vaudois et Université de Lausanne, Lausanne, Switzerland
- Carla Santos
- Forensic Genetics Unit, Institute of Forensic Science, University of Santiago de Compostela, Santiago de Compostela, Spain
- Ana Freire-Aradas
- Forensic Genetics Unit, Institute of Forensic Science, University of Santiago de Compostela, Santiago de Compostela, Spain
- Christopher Phillips
- Forensic Genetics Unit, Institute of Forensic Science, University of Santiago de Compostela, Santiago de Compostela, Spain
- Diana Hall
- Unité de Génétique Forensique, Centre Universitaire Romand de Médecine Légale, Centre Hospitalier Universitaire Vaudois et Université de Lausanne, Lausanne, Switzerland
23
Yuan J, Hu Z, Mahal BA, Zhao SD, Kensler KH, Pi J, Hu X, Zhang Y, Wang Y, Jiang J, Li C, Zhong X, Montone KT, Guan G, Tanyi JL, Fan Y, Xu X, Morgan MA, Long M, Zhang Y, Zhang R, Sood AK, Rebbeck TR, Dang CV, Zhang L. Integrated Analysis of Genetic Ancestry and Genomic Alterations across Cancers. Cancer Cell 2018; 34:549-560.e9. [PMID: 30300578 PMCID: PMC6348897 DOI: 10.1016/j.ccell.2018.08.019] [Citation(s) in RCA: 144] [Impact Index Per Article: 24.0] [Received: 12/21/2017] [Revised: 06/08/2018] [Accepted: 08/29/2018] [Indexed: 12/22/2022]
Abstract
Disparities in cancer care have been a long-standing challenge. We estimated the genetic ancestry of The Cancer Genome Atlas patients, and performed a pan-cancer analysis on the influence of genetic ancestry on genomic alterations. Compared with European Americans, African Americans (AA) with breast, head and neck, and endometrial cancers exhibit a higher level of chromosomal instability, while a lower level of chromosomal instability was observed in AAs with kidney cancers. The frequencies of TP53 mutations and amplification of CCNE1 were increased in AAs in the cancer types showing higher levels of chromosomal instability. We observed lower frequencies of genomic alterations affecting genes in the PI3K pathway in AA patients across cancers. Our result provides insight into genomic contribution to cancer disparities.
Collapse
Affiliation(s)
- Jiao Yuan
- Center for Research on Reproduction & Women's Health, University of Pennsylvania, Philadelphia, PA 19104, USA; Department of Obstetrics and Gynecology, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Zhongyi Hu
- Center for Research on Reproduction & Women's Health, University of Pennsylvania, Philadelphia, PA 19104, USA; Department of Obstetrics and Gynecology, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Brandon A Mahal
- Department of Radiation Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
| | - Sihai D Zhao
- Department of Statistics, University of Illinois at Urbana-Champaign, Champaign, IL 61801, USA
| | - Kevin H Kensler
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA 02115, USA; Department of Epidemiology, Harvard TH Chan School of Public Health, Boston, MA 02115, USA
| | - Jingjiang Pi
- Research Center for Translational Medicine, Shanghai East Hospital, Tongji University School of Medicine, Shanghai 200120, China
| | - Xiaowen Hu
- Center for Research on Reproduction & Women's Health, University of Pennsylvania, Philadelphia, PA 19104, USA; Department of Obstetrics and Gynecology, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Youyou Zhang
- Center for Research on Reproduction & Women's Health, University of Pennsylvania, Philadelphia, PA 19104, USA; Department of Obstetrics and Gynecology, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Yueying Wang
- Center for Research on Reproduction & Women's Health, University of Pennsylvania, Philadelphia, PA 19104, USA; Department of Obstetrics and Gynecology, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Junjie Jiang
- Center for Research on Reproduction & Women's Health, University of Pennsylvania, Philadelphia, PA 19104, USA; Department of Obstetrics and Gynecology, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Chunsheng Li
- Center for Research on Reproduction & Women's Health, University of Pennsylvania, Philadelphia, PA 19104, USA; Department of Obstetrics and Gynecology, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Xiaomin Zhong
- Center for Stem Cell Biology and Tissue Engineering, Department of Biology, Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou 510080, China
| | - Kathleen T Montone
- Department of Pathology and Laboratory Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Guoqiang Guan
- Department of Orthodontics, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Janos L Tanyi
- Department of Obstetrics and Gynecology, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Yi Fan
- Department of Radiation Oncology, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Xiaowei Xu
- Department of Pathology and Laboratory Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Mark A Morgan
- Department of Obstetrics and Gynecology, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Meixiao Long
- Department of Internal Medicine, Division of Hematology, Ohio State University, Columbus, OH 43210, USA
| | - Yuzhen Zhang
- Research Center for Translational Medicine, Shanghai East Hospital, Tongji University School of Medicine, Shanghai 200120, China
| | | | - Anil K Sood
- Center for RNA Interference and Non-coding RNA, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA; Department of Gynecologic Oncology and Reproductive Medicine, University of Texas MD Anderson Cancer Center, Houston, TX 77584, USA
| | - Timothy R Rebbeck
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA 02115, USA; Department of Epidemiology, Harvard TH Chan School of Public Health, Boston, MA 02115, USA
| | - Chi V Dang
- Wistar Institute, Philadelphia, PA 19104, USA; Ludwig Institute for Cancer Research, New York City, NY 10017, USA
| | - Lin Zhang
- Center for Research on Reproduction & Women's Health, University of Pennsylvania, Philadelphia, PA 19104, USA; Department of Obstetrics and Gynecology, University of Pennsylvania, Philadelphia, PA 19104, USA.
| |
|
24
|
Sethuraman A. Estimating Genetic Relatedness in Admixed Populations. G3 (BETHESDA, MD.) 2018; 8:3203-3220. [PMID: 30104261 PMCID: PMC6169378 DOI: 10.1534/g3.118.200485] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 06/06/2018] [Accepted: 07/30/2018] [Indexed: 01/12/2023]
Abstract
Estimating genetic relatedness and inbreeding coefficients is important to the fields of quantitative genetics, conservation, genome-wide association studies (GWAS), and population genetics. Traditional estimators of genetic relatedness assume an underlying model of population structure. Each individual is assigned to a population, depending on a priori assumptions about geographical location of sampling, proximity, or genetic similarity. But often, this population assignment is unknown and assumptions about assignment can lead to erroneous estimates of genetic relatedness. I develop a generalized method of estimating relatedness in admixed populations that accounts for (1) multi-allelic genomic data and (2) all nine Identity By Descent (IBD) states, and implement a maximum-likelihood-based estimator of pairwise genetic relatedness in structured populations as part of the software InRelate. Replicated estimations of genetic relatedness between admixed full sib (FS), half sib (HS), first cousin (FC), parent-offspring (PO) and unrelated (UR) dyads in simulated and empirical data from the HGDP-CEPH panel show considerably lower bias and error with InRelate than with several previously developed methods. I also propose a bootstrap scheme and a series of Wald tests to assign relatedness categories to pairs of individuals.
Affiliation(s)
- Arun Sethuraman
- Department of Biological Sciences, California State University San Marcos, CA 92096
| |
|
25
|
Phillips C, Devesse L, Ballard D, van Weert L, de la Puente M, Melis S, Álvarez Iglesias V, Freire-Aradas A, Oldroyd N, Holt C, Syndercombe Court D, Carracedo Á, Lareu MV. Global patterns of STR sequence variation: Sequencing the CEPH human genome diversity panel for 58 forensic STRs using the Illumina ForenSeq DNA Signature Prep Kit. Electrophoresis 2018; 39:2708-2724. [DOI: 10.1002/elps.201800117] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Received: 03/04/2018] [Revised: 07/02/2018] [Accepted: 07/26/2018] [Indexed: 11/06/2022]
Affiliation(s)
- Christopher Phillips
- Forensic Genetics Unit; Institute of Forensic Sciences; University of Santiago de Compostela; Galicia Spain
| | - Laurence Devesse
- King's Forensics; King's College London; London UK
- Verogen Inc.; San Diego USA
| | | | - Leanne van Weert
- Forensic Genetics Unit; Institute of Forensic Sciences; University of Santiago de Compostela; Galicia Spain
| | - Maria de la Puente
- Forensic Genetics Unit; Institute of Forensic Sciences; University of Santiago de Compostela; Galicia Spain
| | - Stefania Melis
- Forensic Genetics Unit; Institute of Forensic Sciences; University of Santiago de Compostela; Galicia Spain
| | - Vanessa Álvarez Iglesias
- Forensic Genetics Unit; Institute of Forensic Sciences; University of Santiago de Compostela; Galicia Spain
| | - Ana Freire-Aradas
- Forensic Genetics Unit; Institute of Forensic Sciences; University of Santiago de Compostela; Galicia Spain
| | | | | | | | - Ángel Carracedo
- Forensic Genetics Unit; Institute of Forensic Sciences; University of Santiago de Compostela; Galicia Spain
- Genomic Medicine Group; University of Santiago de Compostela; Galicia Spain
| | - Maria Victoria Lareu
- Forensic Genetics Unit; Institute of Forensic Sciences; University of Santiago de Compostela; Galicia Spain
| |
|
26
|
Nafikov RA, Nato AQ, Sohi H, Wang B, Brown L, Horimoto AR, Vardarajan BN, Barral SM, Tosto G, Mayeux RP, Thornton TA, Blue E, Wijsman EM. Analysis of pedigree data in populations with multiple ancestries: Strategies for dealing with admixture in Caribbean Hispanic families from the ADSP. Genet Epidemiol 2018; 42:500-515. [PMID: 29862559 PMCID: PMC6160322 DOI: 10.1002/gepi.22133] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 12/29/2017] [Revised: 05/04/2018] [Accepted: 05/14/2018] [Indexed: 11/12/2022]
Abstract
Multipoint linkage analysis is an important approach for localizing disease-associated loci in pedigrees. Linkage analysis, however, is sensitive to misspecification of marker allele frequencies. Pedigrees from recently admixed populations are particularly susceptible to this problem because of the challenge of accurately accounting for population structure. Therefore, increasing emphasis on use of multiethnic samples in genetic studies requires reevaluation of best practices, given data currently available. Typical strategies have been to compute allele frequencies from the sample, or to use marker allele frequencies determined by admixture proportions averaged over the entire sample. However, admixture proportions vary among pedigrees and throughout the genome in a family-specific manner. Here, we evaluate several approaches to model admixture in linkage analysis, providing different levels of detail about ancestral origin. To perform our evaluations, for specification of marker allele frequencies, we used data on 67 Caribbean Hispanic admixed families from the Alzheimer's Disease Sequencing Project. Our results show that choice of admixture model has an effect on the linkage analysis results. Variant-specific admixture proportions, computed for individual families, provide the most detailed regional admixture estimates, and, as such, are the most appropriate allele frequencies for linkage analysis. This likely decreases the number of false-positive results, and is straightforward to implement.
Affiliation(s)
- Rafael A Nafikov
- Division of Medical Genetics, Department of Medicine, University of Washington, Seattle, Washington
| | - Alejandro Q Nato
- Division of Medical Genetics, Department of Medicine, University of Washington, Seattle, Washington
| | - Harkirat Sohi
- Division of Medical Genetics, Department of Medicine, University of Washington, Seattle, Washington
| | - Bowen Wang
- Department of Statistics, University of Washington, Seattle, Washington
| | - Lisa Brown
- Department of Biostatistics, University of Washington, Seattle, Washington
| | - Andrea R Horimoto
- Division of Medical Genetics, Department of Medicine, University of Washington, Seattle, Washington
| | | | - Sandra M Barral
- Department of Neurology, Columbia University, New York, New York
| | - Giuseppe Tosto
- Department of Neurology, Columbia University, New York, New York
| | - Richard P Mayeux
- Department of Neurology, Columbia University, New York, New York
| | - Timothy A Thornton
- Department of Biostatistics, University of Washington, Seattle, Washington
| | - Elizabeth Blue
- Division of Medical Genetics, Department of Medicine, University of Washington, Seattle, Washington
| | - Ellen M Wijsman
- Division of Medical Genetics, Department of Medicine, University of Washington, Seattle, Washington; Department of Biostatistics, University of Washington, Seattle, Washington
| |
|
27
|
Chaitanya L, Breslin K, Zuñiga S, Wirken L, Pośpiech E, Kukla-Bartoszek M, Sijen T, Knijff PD, Liu F, Branicki W, Kayser M, Walsh S. The HIrisPlex-S system for eye, hair and skin colour prediction from DNA: Introduction and forensic developmental validation. Forensic Sci Int Genet 2018; 35:123-135. [DOI: 10.1016/j.fsigen.2018.04.004] [Citation(s) in RCA: 138] [Impact Index Per Article: 23.0] [Received: 10/16/2017] [Revised: 03/05/2018] [Accepted: 04/06/2018] [Indexed: 11/29/2022]
|
28
|
Abstract
The cavity system of the inner ear—the so-called bony labyrinth—houses the senses of balance and hearing. This structure is embedded in dense petrous bone, fully formed by birth and generally well preserved in human skeletal remains, thus providing a rich source of morphological information about past populations. Here we show that labyrinthine morphology tracks genetic distances and geography in an isolation-by-distance model with dispersal from Africa. Because petrous bones have become prime targets of ancient DNA recovery, we propose that all destructive studies first acquire high-resolution 3D computed-tomography data prior to any invasive sampling. Such data will constitute an important archive of morphological variation in past and present populations, and will permit individual-based genotype–phenotype comparisons. The dispersal of modern humans from Africa is now well documented with genetic data that track population history, as well as gene flow between populations. Phenetic skeletal data, such as cranial and pelvic morphologies, also exhibit a dispersal-from-Africa signal, which, however, tends to be blurred by the effects of local adaptation and in vivo phenotypic plasticity, and that is often deteriorated by postmortem damage to skeletal remains. These complexities raise the question of which skeletal structures most effectively track neutral population history. The cavity system of the inner ear (the so-called bony labyrinth) is a good candidate structure for such analyses. It is already fully formed by birth, which minimizes postnatal phenotypic plasticity, and it is generally well preserved in archaeological samples. Here we use morphometric data of the bony labyrinth to show that it is a surprisingly good marker of the global dispersal of modern humans from Africa. Labyrinthine morphology tracks genetic distances and geography in accordance with an isolation-by-distance model with dispersal from Africa. 
Our data further indicate that the neutral-like pattern of variation is compatible with stabilizing selection on labyrinth morphology. Given the increasingly important role of the petrous bone for ancient DNA recovery from archaeological specimens, we encourage researchers to acquire 3D morphological data of the inner ear structures before any invasive sampling. Such data will constitute an important archive of phenotypic variation in present and past populations, and will permit individual-based genotype–phenotype comparisons.
|
29
|
Yew CW, Hoque MZ, Pugh-Kitingan J, Minsong A, Voo CLY, Ransangan J, Lau STY, Wang X, Saw WY, Ong RTH, Teo YY, Xu S, Hoh BP, Phipps ME, Kumar SV. Genetic relatedness of indigenous ethnic groups in northern Borneo to neighboring populations from Southeast Asia, as inferred from genome-wide SNP data. Ann Hum Genet 2018. [PMID: 29521412 DOI: 10.1111/ahg.12246] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Indexed: 01/01/2023]
Abstract
The region of northern Borneo is home to the current state of Sabah, Malaysia. It is located closest to the southern Philippine islands and may have served as a viaduct for ancient human migration onto or off of Borneo Island. In this study, five indigenous ethnic groups from Sabah were subjected to genome-wide SNP genotyping. These individuals represent the "North Borneo"-speaking group of the great Austronesian family. They have traditionally resided in the inland region of Sabah. The dataset was merged with public datasets, and the genetic relatedness of these groups to neighboring populations from the islands of Southeast Asia, mainland Southeast Asia and southern China was inferred. Genetic structure analysis revealed that these groups formed a genetic cluster that was independent of the clusters of neighboring populations. Additionally, these groups exhibited near-absolute proportions of a genetic component that is also common among Austronesians from Taiwan and the Philippines. They showed no genetic admixture with Austro-Melanesian populations. Furthermore, phylogenetic analysis showed that they are closely related to non-Austro-Melanesian Filipinos as well as to Taiwan natives but are distantly related to populations from mainland Southeast Asia. Relatively lower heterozygosity and higher pairwise genetic differentiation index (FST) values than those of nearby populations indicate that these groups might have experienced genetic drift in the past, resulting in their differentiation from other Austronesians. Subsequent formal testing suggested that these populations have received no gene flow from neighboring populations. Taken together, these results imply that the indigenous ethnic groups of northern Borneo shared a common ancestor with Taiwan natives and non-Austro-Melanesian Filipinos and then isolated themselves on the inland of Sabah.
This isolation presumably led to no admixture with other populations, and these individuals therefore underwent strong genetic differentiation. This report contributes to addressing the paucity of genetic data on representatives from this strategic region of ancient human migration event(s).
Affiliation(s)
- Chee Wei Yew
- Biotechnology Research Institute, Universiti Malaysia Sabah, Jalan UMS, Sabah, Malaysia
| | - Mohd Zahirul Hoque
- Faculty of Medicine and Health Sciences, Universiti Malaysia Sabah, Jalan UMS, Sabah, Malaysia
| | | | - Alexander Minsong
- Faculty of Humanities, Arts & Heritage, Universiti Malaysia Sabah, Jalan UMS, Sabah, Malaysia
| | | | - Julian Ransangan
- Borneo Marine Research Institute, Universiti Malaysia Sabah, Jalan UMS, Sabah, Malaysia
| | - Sophia Tiek Ying Lau
- Biotechnology Research Institute, Universiti Malaysia Sabah, Jalan UMS, Sabah, Malaysia
| | - Xu Wang
- Department of Statistics and Applied Probability, Faculty of Science, National University of Singapore, Singapore
| | - Woei Yuh Saw
- Department of Statistics and Applied Probability, Faculty of Science, National University of Singapore, Singapore
| | - Rick Twee-Hee Ong
- Department of Statistics and Applied Probability, Faculty of Science, National University of Singapore, Singapore; Saw Swee Hock School of Public Health, National University of Singapore, Singapore
| | - Yik-Ying Teo
- Department of Statistics and Applied Probability, Faculty of Science, National University of Singapore, Singapore; Saw Swee Hock School of Public Health, National University of Singapore, Singapore; NUS Graduate School for Integrative Science and Engineering, National University of Singapore, Singapore; Life Sciences Institute, National University of Singapore, Singapore; Genome Institute of Singapore, Agency for Science, Technology and Research, Singapore
| | - Shuhua Xu
- Max Planck Independent Research Group on Population Genomics, Chinese Academy of Sciences and Max Planck Society Partner Institute for Computational Biology (PICB), Shanghai Institute for Biological Sciences, Chinese Academy of Sciences, Shanghai, China; School of Life Science and Technology, ShanghaiTech University, Shanghai, China; Collaborative Innovation Centre of Genetics and Development, Shanghai, China
| | - Boon-Peng Hoh
- Institute for Molecular Medical Biotechnology, Universiti Teknologi MARA, Selangor, Malaysia; Faculty of Medicine and Health Sciences, UCSI University, Kuala Lumpur, Malaysia
| | - Maude E Phipps
- Jeffrey Cheah School of Medicine and Health Sciences, Monash University Malaysia, Selangor, Malaysia
| | - S Vijay Kumar
- Biotechnology Research Institute, Universiti Malaysia Sabah, Jalan UMS, Sabah, Malaysia
| |
|
30
|
Bigham AW, Magnaye K, Dunn DM, Weiss RB, Bamshad M. Complex signatures of natural selection at GYPA. Hum Genet 2018; 137:151-160. [PMID: 29362874 DOI: 10.1007/s00439-018-1866-3] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 10/06/2017] [Accepted: 01/06/2018] [Indexed: 01/07/2023]
Abstract
The human MN blood group antigens are isoforms of glycophorin A (GPA) encoded by the gene, GYPA, and are the most abundant erythrocyte sialoglycoproteins. The distribution of MN antigens has been widely studied in human populations yet the evolutionary and/or demographic factors affecting population variation remain elusive. While the primary function of GPA is yet to be discovered, it serves as the major binding site for the 175-kD erythrocyte-binding antigen (EB-175) of the malarial parasite, Plasmodium falciparum, a major selective pressure in recent human history. More specifically, exon two of GYPA encodes the receptor-binding ligand to which P. falciparum binds. Accordingly, there has been keen interest in understanding what impact, if any, natural selection has had on the distribution of variation in GYPA and exon two in particular. To this end, we resequenced GYPA in individuals sampled from both P. falciparum endemic (sub-Saharan Africa and South India) and non-endemic (Europe and East Asia) regions of the world. Observed patterns of variation suggest that GYPA has been subject to balancing selection in populations living in malaria endemic areas and in Europeans, but no such evidence was found in samples from East Asia, Oceania, and the Americas. These results are consistent with malaria acting as a selective pressure on GYPA, but also suggest that another selective force has resulted in a similar pattern of variation in Europeans. Accordingly, GYPA has perhaps a more complex evolutionary history, wherein on a global scale, spatially varying selective pressures have governed its natural history.
Affiliation(s)
- Abigail W Bigham
- Department of Anthropology, The University of Michigan, 222C West Hall, 1085 S. University, Ann Arbor, MI, 48109-1107, USA.
| | - Kevin Magnaye
- Department of Human Genetics, The University of Chicago, Chicago, IL, USA
| | - Diane M Dunn
- Department of Human Genetics, The University of Utah, Salt Lake City, UT, USA
| | - Robert B Weiss
- Department of Human Genetics, The University of Utah, Salt Lake City, UT, USA
| | - Michael Bamshad
- Departments of Pediatrics and Genome Sciences, The University of Washington, Seattle, WA, USA
| |
|
31
|
Aguilar-Velázquez JA, Martínez-Sevilla VM, Sosa-Macías M, González-Martin A, Muñoz-Valle JF, Rangel-Villalobos H. Evaluation of the contribution of D9S1120 to anthropological studies in Native American populations. HOMO-JOURNAL OF COMPARATIVE HUMAN BIOLOGY 2017; 68:440-451. [PMID: 29175060 DOI: 10.1016/j.jchb.2017.10.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/12/2016] [Accepted: 10/20/2017] [Indexed: 10/18/2022]
Abstract
The D9S1120 locus exhibits a population-specific allele of 9 repeats (9RA) in all Native American and two Siberian populations currently studied, but it is absent in other worldwide populations. Although this feature has been used in anthropological genetic studies, its impact on the evaluation of the structure and genetic relations among Native American populations has been scarcely assessed. Consequently, the aim of this study was to evaluate the anthropological impact of D9S1120 when it was added to STR population datasets in Mexican Native American groups. We analyzed D9S1120 by PCR and capillary electrophoresis (CE) in 1117 unrelated individuals from 13 native groups from the north and west of Mexico. Additional worldwide populations previously studied with D9S1120 and/or 15 autosomal STRs (Identifiler kit) were included for interpopulation analyses. We report statistical results of forensic importance for D9S1120. On average, the modal alleles were the Native American-specific allele 9RA (0.3254) and 16 (0.3362). Genetic distances between Native American and worldwide populations were estimated. When D9S1120 was included in the 15 STR population dataset, we observed improvements for admixture estimation in Mestizo populations and for representing congruent genetic relationships in dendrograms. Analysis of molecular variance (AMOVA) based on D9S1120 confirms that most of the genetic variability in the Mexican population is attributable to their Native American backgrounds, and allows the detection of significant intercontinental differentiation attributed to the exclusive presence of 9RA in America. Our findings demonstrate the contribution of D9S1120 to a better understanding of the genetic relationships and structure among Mexican Native groups.
Affiliation(s)
- J A Aguilar-Velázquez
- Instituto de Investigación en Genética Molecular, Centro Universitario de la Ciénega, Universidad de Guadalajara (CUCI-UdeG), Av. Universidad #1115, Ocotlán, Jalisco, México, CP 47810
| | - V Manuel Martínez-Sevilla
- Instituto de Investigación en Genética Molecular, Centro Universitario de la Ciénega, Universidad de Guadalajara (CUCI-UdeG), Av. Universidad #1115, Ocotlán, Jalisco, México, CP 47810
| | - M Sosa-Macías
- Centro Interdisciplinario de Investigación para el Desarrollo Integral Regional del Instituto Politécnico Nacional, Unidad Durango (CIIDIR-IPN), Durango, México
| | - A González-Martin
- Departamento de Zoología y Antropología Física, Universidad Complutense de Madrid (UCM), 28040 Madrid, Spain
| | - J F Muñoz-Valle
- Instituto de Investigación en Ciencias Biomédicas, Centro Universitario en Ciencias de la Salud (CUCS-UdeG), Guadalajara, Jalisco, México
| | - H Rangel-Villalobos
- Instituto de Investigación en Genética Molecular, Centro Universitario de la Ciénega, Universidad de Guadalajara (CUCI-UdeG), Av. Universidad #1115, Ocotlán, Jalisco, México, CP 47810.
| |
|
32
|
Parolo S, Lacroix S, Kaput J, Scott-Boyer MP. Ancestors' dietary patterns and environments could drive positive selection in genes involved in micronutrient metabolism-the case of cofactor transporters. GENES AND NUTRITION 2017; 12:28. [PMID: 29043008 PMCID: PMC5628472 DOI: 10.1186/s12263-017-0579-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 02/16/2017] [Accepted: 09/19/2017] [Indexed: 02/06/2023]
Abstract
Background: During evolution, humans colonized different ecological niches and adopted a variety of subsistence strategies that gave rise to diverse selective pressures acting across the genome. Environmentally induced selection of vitamin, mineral, or other cofactor transporters could influence micronutrient-requiring molecular reactions and contribute to inter-individual variability in response to foods and nutritional interventions. Methods: A comprehensive list of genes coding for transporters of cofactors or their precursors was built using data mining procedures from the HGDP dataset and then explored to detect evidence of positive genetic selection. This dataset was chosen since it comprises several genetically diverse worldwide populations whose ancestries have evolved in different environments and thus lived following various nutritional habits and lifestyles. Results: We identified 312 cofactor transporter (CT) genes involved in between-cell or sub-cellular compartment distribution of 28 cofactors derived from dietary intake. Twenty-four SNPs distributed across 14 CT genes separated populations into continental and intra-continental groups such as African hunter-gatherers and farmers, and between Native American sub-populations. Notably, four SNPs were located in SLC24A3 with one being a known eQTL of the NCKX3 protein. Conclusions: These findings could support the importance of considering an individual's genetic makeup along with their metabolic profile when tailoring personalized dietary interventions for optimizing health.
Affiliation(s)
- Silvia Parolo
- The Microsoft Research, University of Trento Centre for Computational Systems Biology (COSBI), piazza Manifattura 1, 38068 Rovereto, TN Italy
| | - Sébastien Lacroix
- The Microsoft Research, University of Trento Centre for Computational Systems Biology (COSBI), piazza Manifattura 1, 38068 Rovereto, TN Italy
| | | | - Marie-Pier Scott-Boyer
- The Microsoft Research, University of Trento Centre for Computational Systems Biology (COSBI), piazza Manifattura 1, 38068 Rovereto, TN Italy
| |
|
33
|
Ullah I, Olofsson JK, Margaryan A, Ilardo M, Ahmad H, Sikora M, Hansen AJ, Shahid Nadeem M, Fazal N, Ali M, Buchard A, Hemphill BE, Willerslev E, Allentoft ME. High Y-chromosomal Differentiation Among Ethnic Groups of Dir and Swat Districts, Pakistan. Ann Hum Genet 2017; 81:234-248. [PMID: 28771684 DOI: 10.1111/ahg.12204] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 01/27/2017] [Revised: 04/26/2017] [Accepted: 06/02/2017] [Indexed: 10/19/2022]
Abstract
The ethnic groups that inhabit the mountainous Dir and Swat districts of northern Pakistan are marked by high levels of cultural and phenotypic diversity. To obtain knowledge of the extent of genetic diversity in this region, we investigated Y-chromosomal diversity in five population samples representing the three main ethnic groups residing within these districts, including Gujars, Pashtuns and Kohistanis. A total of 27 Y-chromosomal short tandem repeats (Y-STRs) and 331 Y-chromosomal single nucleotide polymorphisms (Y-SNPs) were investigated. In the Y-STRs, we observed very high and significant levels of genetic differentiation in nine of the 10 pairwise between-group comparisons (RST 0.179-0.746), and the differences were mirrored in the Y-SNP haplogroup frequency distribution. No genetic differences were found between the two Pashtun subethnic groups Tarklanis and Yusafzais (RST = 0.000). Utmankhels, also considered Pashtuns culturally, were not closely related to any of the other population samples (RST 0.451-0.746). Thus, our findings provide examples of both associations and dissociations between cultural and genetic legacies. When analyzed within a larger continental-scale context, these five ethnic groups fall mostly outside the previously characterized Y-chromosomal gene pools of the Indo-Pakistani subcontinent. Male founder effects, coupled with culturally and topographically based constraints upon marriage and movement, are likely responsible for the high degree of genetic structure in this region.
Affiliation(s)
- Inam Ullah
- Department of Genetics, Hazara University, Garden Campus, Mansehra, Pakistan; Centre for GeoGenetics, Natural History Museum, University of Copenhagen, Copenhagen, Denmark
| | - Jill K Olofsson
- Department of Animal and Plant Sciences, University of Sheffield, Western Bank, Sheffield, United Kingdom
| | - Ashot Margaryan
- Centre for GeoGenetics, Natural History Museum, University of Copenhagen, Copenhagen, Denmark
| | - Melissa Ilardo
- Centre for GeoGenetics, Natural History Museum, University of Copenhagen, Copenhagen, Denmark
| | - Habib Ahmad
- Department of Genetics, Hazara University, Garden Campus, Mansehra, Pakistan; Islamia University, Peshawar, Pakistan
| | - Martin Sikora
- Centre for GeoGenetics, Natural History Museum, University of Copenhagen, Copenhagen, Denmark
| | - Anders J Hansen
- Centre for GeoGenetics, Natural History Museum, University of Copenhagen, Copenhagen, Denmark
| | - Muhammad Shahid Nadeem
- Department of Biochemistry, Faculty of Science, King Abdulaziz University, Jeddah, Saudi Arabia
| | - Numan Fazal
- Department of Genetics, Hazara University, Garden Campus, Mansehra, Pakistan
| | - Murad Ali
- Department of Genetics, Hazara University, Garden Campus, Mansehra, Pakistan
| | - Anders Buchard
- Department of Forensic Medicine, University of Copenhagen, Copenhagen, Denmark
| | - Brian E Hemphill
- Department of Anthropology, University of Alaska, Fairbanks, AK, USA
| | - Eske Willerslev
- Centre for GeoGenetics, Natural History Museum, University of Copenhagen, Copenhagen, Denmark
| | - Morten E Allentoft
- Centre for GeoGenetics, Natural History Museum, University of Copenhagen, Copenhagen, Denmark
| |
|
34
|
Population structure and coalescence in pedigrees: Comparisons to the structured coalescent and a framework for inference. Theor Popul Biol 2017; 115:1-12. [DOI: 10.1016/j.tpb.2017.01.004] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 05/23/2016] [Revised: 01/02/2017] [Accepted: 01/18/2017] [Indexed: 01/08/2023]
|
35
|
Walsh S, Chaitanya L, Breslin K, Muralidharan C, Bronikowska A, Pospiech E, Koller J, Kovatsi L, Wollstein A, Branicki W, Liu F, Kayser M. Global skin colour prediction from DNA. Hum Genet 2017; 136:847-863. [PMID: 28500464 PMCID: PMC5487854 DOI: 10.1007/s00439-017-1808-5] [Citation(s) in RCA: 73] [Impact Index Per Article: 10.4] [Received: 02/14/2017] [Accepted: 05/03/2017] [Indexed: 12/14/2022]
Abstract
Human skin colour is highly heritable and externally visible with relevance in medical, forensic, and anthropological genetics. Although eye and hair colour can already be predicted with high accuracies from small sets of carefully selected DNA markers, knowledge about the genetic predictability of skin colour is limited. Here, we investigate the skin colour predictive value of 77 single-nucleotide polymorphisms (SNPs) from 37 genetic loci previously associated with human pigmentation using 2025 individuals from 31 global populations. We identified a minimal set of 36 highly informative skin colour predictive SNPs and developed a statistical prediction model capable of skin colour prediction on a global scale. Average cross-validated prediction accuracies expressed as area under the receiver-operating characteristic curve (AUC) ± standard deviation were 0.97 ± 0.02 for Light, 0.83 ± 0.11 for Dark, and 0.96 ± 0.03 for Dark-Black. When using a 5-category scale, this resulted in 0.74 ± 0.05 for Very Pale, 0.72 ± 0.03 for Pale, 0.73 ± 0.03 for Intermediate, 0.87 ± 0.1 for Dark, and 0.97 ± 0.03 for Dark-Black. A comparative analysis in 194 independent samples from 17 populations demonstrated that our model outperformed a previously proposed 10-SNP-classifier approach with AUCs rising from 0.79 to 0.82 for White, comparable at the intermediate level of 0.63 and 0.62, respectively, and a large increase from 0.64 to 0.92 for Black. Overall, this study demonstrates that the chosen DNA markers and prediction model, particularly at the 5-category level, allow skin colour predictions within and between continental regions for the first time, which will serve as a valuable resource for future applications in forensic and anthropologic genetics.
Affiliation(s)
- Susan Walsh
- Department of Biology, Indiana University Purdue University Indianapolis (IUPUI), Indianapolis, IN, USA.
| | - Lakshmi Chaitanya
- Department of Genetic Identification, Erasmus MC University Medical Centre Rotterdam, Rotterdam, The Netherlands
| | - Krystal Breslin
- Department of Biology, Indiana University Purdue University Indianapolis (IUPUI), Indianapolis, IN, USA
| | - Charanya Muralidharan
- Department of Biology, Indiana University Purdue University Indianapolis (IUPUI), Indianapolis, IN, USA
| | - Agnieszka Bronikowska
- Department of Dermatology, Collegium Medicum of the Jagiellonian University, Kraków, Poland
| | - Ewelina Pospiech
- Faculty of Biology and Earth Sciences, Institute of Zoology, Jagiellonian University, Kraków, Poland
- Malopolska Centre of Biotechnology, Jagiellonian University, Kraków, Poland
| | - Julia Koller
- Department of Genetic Identification, Erasmus MC University Medical Centre Rotterdam, Rotterdam, The Netherlands
| | - Leda Kovatsi
- Laboratory of Forensic Medicine and Toxicology, School of Medicine, Aristotle University of Thessaloniki, Thessaloniki, Greece
| | - Andreas Wollstein
- Section of Evolutionary Biology, Department of Biology II, University of Munich LMU, Planegg-Martinsried, Germany
| | - Wojciech Branicki
- Malopolska Centre of Biotechnology, Jagiellonian University, Kraków, Poland
- Central Forensic Laboratory of the Police, Warsaw, Poland
| | - Fan Liu
- Department of Genetic Identification, Erasmus MC University Medical Centre Rotterdam, Rotterdam, The Netherlands
- Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Manfred Kayser
- Department of Genetic Identification, Erasmus MC University Medical Centre Rotterdam, Rotterdam, The Netherlands.
|
36
|
Galván-Femenía I, Graffelman J, Barceló-I-Vidal C. Graphics for relatedness research. Mol Ecol Resour 2017; 17:1271-1282. [PMID: 28374569 PMCID: PMC5624821 DOI: 10.1111/1755-0998.12674] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 05/27/2016] [Revised: 03/15/2017] [Accepted: 03/21/2017] [Indexed: 11/27/2022]
Abstract
Studies of relatedness have been crucial in molecular ecology over the last decades. Good evidence of this is the fact that studies of population structure, evolution of social behaviours, genetic diversity and quantitative genetics all involve relatedness research. The main aim of this article was to review the most common graphical methods used in allele sharing studies for detecting and identifying family relationships. Both IBS- and IBD-based allele sharing studies are considered. Furthermore, we propose two additional graphical methods from the field of compositional data analysis: the ternary diagram and scatterplots of isometric log-ratios of IBS and IBD probabilities. We illustrate all graphical tools with genetic data from the HGDP-CEPH diversity panel, using mainly 377 microsatellites genotyped for 25 individuals from the Maya population of this panel. We enhance all graphics with convex hulls obtained by simulation and use these to confirm the documented relationships. The proposed compositional graphics are shown to be useful in relatedness research, as they also single out the most prominent related pairs. The ternary diagram is advocated for its ability to display all three allele sharing probabilities simultaneously. The log-ratio plots are advocated as an attempt to overcome the problems with the Euclidean distance interpretation in the classical graphics.
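The isometric log-ratio idea mentioned in this abstract can be sketched for a three-part composition such as the probabilities of sharing 0, 1, or 2 alleles. The basis used below is one common choice and is an assumption, since the abstract does not state which basis the authors used; the function name is hypothetical.

```python
import math
from typing import Tuple


def ilr_3part(p0: float, p1: float, p2: float) -> Tuple[float, float]:
    """Map a strictly positive 3-part composition to two ilr coordinates.

    Assumed basis (one of several valid orthonormal choices):
      z1 = ln(p0 / p1) / sqrt(2)
      z2 = ln(p0 * p1 / p2**2) / sqrt(6)
    """
    if min(p0, p1, p2) <= 0.0:
        raise ValueError("log-ratios are undefined for zero or negative parts")
    total = p0 + p1 + p2
    p0, p1, p2 = p0 / total, p1 / total, p2 / total  # close the composition
    z1 = math.log(p0 / p1) / math.sqrt(2.0)
    z2 = math.log(p0 * p1 / (p2 * p2)) / math.sqrt(6.0)
    return z1, z2


# Hypothetical allele-sharing proportions for one pair of individuals.
z1, z2 = ilr_3part(0.5, 0.25, 0.25)
print(z1, z2)
```

Because the coordinates are log-ratios, parts equal to zero have to be handled separately (for example by a small replacement value) before the transform can be applied.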
Affiliation(s)
- Iván Galván-Femenía
- Department of Computer Science, Applied Mathematics and Statistics, Universitat de Girona, Girona, Spain; Disease Genomics-GCAT Group, Germans Trias Health Research Institute (IGTP)-Program of Predictive and Personalized Medicine of Cancer (PMPPC), Can Ruti Campus, Badalona, Barcelona, Spain
| | - Jan Graffelman
- Department of Statistics and Operations Research, Universitat Politècnica de Catalunya, Barcelona, Spain; Department of Biostatistics, University of Washington, Seattle, WA, USA
| | - Carles Barceló-I-Vidal
- Department of Computer Science, Applied Mathematics and Statistics, Universitat de Girona, Girona, Spain
|
37
|
Li L, Wang Y, Yang S, Xia M, Yang Y, Wang J, Lu D, Pan X, Ma T, Jiang P, Yu G, Zhao Z, Ping Y, Zhou H, Zhao X, Sun H, Liu B, Jia D, Li C, Hu R, Lu H, Liu X, Chen W, Mi Q, Xue F, Su Y, Jin L, Li S. Genome-wide screening for highly discriminative SNPs for personal identification and their assessment in world populations. Forensic Sci Int Genet 2017; 28:118-127. [PMID: 28249201 DOI: 10.1016/j.fsigen.2017.02.005] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 06/23/2016] [Revised: 02/07/2017] [Accepted: 02/10/2017] [Indexed: 10/20/2022]
Abstract
The applications of DNA profiling aim to identify perpetrators, missing family members and disaster victims in forensic investigations. Forensic applications based on single nucleotide polymorphisms (SNPs) are emerging rapidly, with the potential to replace the widely used panels based on short tandem repeats (STRs), and a well-designed SNP panel is needed to support this transition. Here we present a panel of 175 SNP markers (referred to as the Fudan ID Panel or FID), selected from ∼3.6 million SNPs, for the application of personal identification. We optimized and validated the FID panel in 729 Chinese individuals using next-generation sequencing (NGS). We showed that the SNPs in the panel possess very high heterozygosity as well as low within- and among-continent differentiation, enabling the FID panel to exhibit discrimination power in both regional and worldwide populations, with average match probabilities ranging from 4.77×10^-71 to 1.06×10^-64 across 54 world populations. With the advance of biomedical research, SNPs connected to physical anthropological, physiological, behavioral and phenotypic traits will eventually be added to forensic panels, which will revolutionize criminal investigation.
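To give a sense of where match probabilities of this magnitude come from, the sketch below computes a random match probability for a panel of biallelic SNPs. It assumes Hardy-Weinberg proportions and independence between loci, which are simplifying assumptions and not necessarily the authors' procedure; the function names and the uniform allele frequency of 0.5 are hypothetical.

```python
import math
from typing import Iterable


def locus_match_probability(p: float) -> float:
    """Chance that two unrelated people share a genotype at one biallelic SNP.

    Under Hardy-Weinberg proportions the genotype frequencies are
    p^2, 2pq and q^2, and the match probability is the sum of their squares.
    """
    q = 1.0 - p
    return p ** 4 + (2.0 * p * q) ** 2 + q ** 4


def log10_panel_match_probability(freqs: Iterable[float]) -> float:
    """log10 of the product of per-locus match probabilities (independent loci)."""
    return sum(math.log10(locus_match_probability(p)) for p in freqs)


# Hypothetical panel: 175 SNPs, each with allele frequency 0.5.
log10_mp = log10_panel_match_probability([0.5] * 175)
print(f"combined match probability is about 10^{log10_mp:.1f}")
```

Under these assumptions a frequency of 0.5 minimises the per-locus match probability (0.375), so this idealised panel gives a lower bound for 175 such SNPs; real panels with frequencies away from 0.5 would yield larger values.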
Affiliation(s)
- Liming Li
- MOE Key Laboratory of Contemporary Anthropology and State Key Laboratory of Genetic Engineering, Collaborative Innovation Center of Genetics and Developmental Biology and School of Life Sciences, Fudan University, Shanghai, 200438, China
| | - Yi Wang
- MOE Key Laboratory of Contemporary Anthropology and State Key Laboratory of Genetic Engineering, Collaborative Innovation Center of Genetics and Developmental Biology and School of Life Sciences, Fudan University, Shanghai, 200438, China
| | - Shuping Yang
- MOE Key Laboratory of Contemporary Anthropology and State Key Laboratory of Genetic Engineering, Collaborative Innovation Center of Genetics and Developmental Biology and School of Life Sciences, Fudan University, Shanghai, 200438, China
| | - Mingying Xia
- MOE Key Laboratory of Contemporary Anthropology and State Key Laboratory of Genetic Engineering, Collaborative Innovation Center of Genetics and Developmental Biology and School of Life Sciences, Fudan University, Shanghai, 200438, China
| | - Yajun Yang
- MOE Key Laboratory of Contemporary Anthropology and State Key Laboratory of Genetic Engineering, Collaborative Innovation Center of Genetics and Developmental Biology and School of Life Sciences, Fudan University, Shanghai, 200438, China; Fudan-Taizhou Institute of Health Sciences, Jiangsu, 225300, China
| | - Jiucun Wang
- MOE Key Laboratory of Contemporary Anthropology and State Key Laboratory of Genetic Engineering, Collaborative Innovation Center of Genetics and Developmental Biology and School of Life Sciences, Fudan University, Shanghai, 200438, China; Fudan-Taizhou Institute of Health Sciences, Jiangsu, 225300, China
| | - Daru Lu
- MOE Key Laboratory of Contemporary Anthropology and State Key Laboratory of Genetic Engineering, Collaborative Innovation Center of Genetics and Developmental Biology and School of Life Sciences, Fudan University, Shanghai, 200438, China; Fudan-Taizhou Institute of Health Sciences, Jiangsu, 225300, China
| | - Xingwei Pan
- MOE Key Laboratory of Contemporary Anthropology and State Key Laboratory of Genetic Engineering, Collaborative Innovation Center of Genetics and Developmental Biology and School of Life Sciences, Fudan University, Shanghai, 200438, China
| | - Teng Ma
- MOE Key Laboratory of Contemporary Anthropology and State Key Laboratory of Genetic Engineering, Collaborative Innovation Center of Genetics and Developmental Biology and School of Life Sciences, Fudan University, Shanghai, 200438, China
| | - Pei Jiang
- MOE Key Laboratory of Contemporary Anthropology and State Key Laboratory of Genetic Engineering, Collaborative Innovation Center of Genetics and Developmental Biology and School of Life Sciences, Fudan University, Shanghai, 200438, China
| | - Ge Yu
- MOE Key Laboratory of Contemporary Anthropology and State Key Laboratory of Genetic Engineering, Collaborative Innovation Center of Genetics and Developmental Biology and School of Life Sciences, Fudan University, Shanghai, 200438, China
| | - Ziqin Zhao
- Department of Forensic Medicine, Shanghai Medicine College, Fudan University, Shanghai, 200000, China
| | - Yuan Ping
- Shanghai Public Security Bureau-Fudan University Joint Laboratory of Human Biology and Forensic Techniques for Crime Scenes, Shanghai Research Institute of Criminal Science and Technology, Shanghai Key Laboratory of Crime Scene Evidence, Key Laboratory of Forensic Evidence and Science Technology, Ministry of Public Security, Shanghai, 200000, China
| | - Huaigu Zhou
- Shanghai Public Security Bureau-Fudan University Joint Laboratory of Human Biology and Forensic Techniques for Crime Scenes, Shanghai Research Institute of Criminal Science and Technology, Shanghai Key Laboratory of Crime Scene Evidence, Key Laboratory of Forensic Evidence and Science Technology, Ministry of Public Security, Shanghai, 200000, China
| | - Xueying Zhao
- Shanghai Public Security Bureau-Fudan University Joint Laboratory of Human Biology and Forensic Techniques for Crime Scenes, Shanghai Research Institute of Criminal Science and Technology, Shanghai Key Laboratory of Crime Scene Evidence, Key Laboratory of Forensic Evidence and Science Technology, Ministry of Public Security, Shanghai, 200000, China
| | - Hui Sun
- MOE Key Laboratory of Contemporary Anthropology and State Key Laboratory of Genetic Engineering, Collaborative Innovation Center of Genetics and Developmental Biology and School of Life Sciences, Fudan University, Shanghai, 200438, China; Beijing Engineering Research Center of Crime Scene Evidence Examination, Institute of Forensic Science, Ministry of Public Security, Beijing, 100038, China
| | - Bing Liu
- Beijing Engineering Research Center of Crime Scene Evidence Examination, Institute of Forensic Science, Ministry of Public Security, Beijing, 100038, China
| | - Dongtao Jia
- Nantong Bureau of Public Safety, Jiangsu, 226000, China
| | - Chengtao Li
- National Institute of Forensics, Ministry of Justice, Shanghai, 200000, China
| | - Rile Hu
- MOE Key Laboratory of Contemporary Anthropology and State Key Laboratory of Genetic Engineering, Collaborative Innovation Center of Genetics and Developmental Biology and School of Life Sciences, Fudan University, Shanghai, 200438, China; Medical College of Inner Mongolia, Ulaanbaatar, Autonomous Region of Inner Mongolia, 010000, China
| | - Hongzhou Lu
- Department of Infectious Diseases, Shanghai Public Health Clinical Center, Fudan University, Shanghai, 200000, China
| | - Xiaoyang Liu
- China-Japan Friendship Hospital, Jilin University, Changchun, Jilin, 130000, China
| | - Wenqing Chen
- Cancer Hospital, Changchun, Jilin, 130000, China
| | - Qin Mi
- Department of Biology and Geography, Qinghai Normal University, Xining, Qinghai, 810000, China
| | - Fuzhong Xue
- School of Public Health, Shandong University, Jinan, Shandong, 250000, China
| | - Yongdong Su
- Bureau of Public Safety, Lhasa, Autonomous Region of Tibet, 850000, China
| | - Li Jin
- MOE Key Laboratory of Contemporary Anthropology and State Key Laboratory of Genetic Engineering, Collaborative Innovation Center of Genetics and Developmental Biology and School of Life Sciences, Fudan University, Shanghai, 200438, China; Fudan-Taizhou Institute of Health Sciences, Jiangsu, 225300, China.
| | - Shilin Li
- MOE Key Laboratory of Contemporary Anthropology and State Key Laboratory of Genetic Engineering, Collaborative Innovation Center of Genetics and Developmental Biology and School of Life Sciences, Fudan University, Shanghai, 200438, China; Fudan-Taizhou Institute of Health Sciences, Jiangsu, 225300, China.
|
38
|
Prediction of biogeographical ancestry from genotype: a comparison of classifiers. Int J Legal Med 2016; 131:901-912. [DOI: 10.1007/s00414-016-1504-3] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 07/20/2016] [Accepted: 11/21/2016] [Indexed: 12/19/2022]
|
39
|
Messina F, Finocchio A, Akar N, Loutradis A, Michalodimitrakis EI, Brdicka R, Jodice C, Novelletto A. Spatially Explicit Models to Investigate Geographic Patterns in the Distribution of Forensic STRs: Application to the North-Eastern Mediterranean. PLoS One 2016; 11:e0167065. [PMID: 27898725 PMCID: PMC5127579 DOI: 10.1371/journal.pone.0167065] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 08/03/2016] [Accepted: 11/08/2016] [Indexed: 11/18/2022] Open
Abstract
Human forensic STRs used for individual identification have been reported to have little power for inter-population analyses. Several methods have been developed which incorporate information on the spatial distribution of individuals to arrive at a description of the arrangement of diversity. We genotyped at 16 forensic STRs a large population sample obtained from many locations in Italy, Greece and Turkey, i.e. three countries crucial to the understanding of discontinuities at the European/Asian junction and the genetic legacy of ancient migrations, but seldom represented together in previous studies. Using spatial PCA on the full dataset, we detected patterns of population affinities in the area. Additionally, we devised objective criteria to reduce the overall complexity into reduced datasets. Independent spatially explicit methods applied to these latter datasets converged in showing that the extraction of information on long- to medium-range geographical trends and structuring from the overall diversity is possible. All analyses returned the picture of a background clinal variation, with regional discontinuities captured by each of the reduced datasets. Several aspects of our results are confirmed on external STR datasets and replicate those of genome-wide SNP typings. High levels of gene flow were inferred within the main continental areas by coalescent simulations. These results are promising from a microevolutionary perspective, in view of the fast pace at which forensic data are being accumulated for many locales. It is foreseeable that this will allow the exploitation of an invaluable genotypic resource, assembled for other (forensic) purposes, to clarify important aspects in the formation of local gene pools.
Affiliation(s)
| | | | - Nejat Akar
- Pediatrics Department, TOBB-Economy and Technology University Hospital, Ankara, Turkey
| | | | | | - Radim Brdicka
- Institute of Haematology and Blood Transfusion, Praha, Czech Republic
| | - Carla Jodice
- Department of Biology, University "Tor Vergata", Rome, Italy
| | - Andrea Novelletto
- Department of Biology, University "Tor Vergata", Rome, Italy
|
40
|
Scaling probabilistic models of genetic variation to millions of humans. Nat Genet 2016; 48:1587-1590. [PMID: 27819665 PMCID: PMC5127768 DOI: 10.1038/ng.3710] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Received: 05/28/2015] [Accepted: 10/04/2016] [Indexed: 12/20/2022]
Abstract
A major goal of population genetics is to quantitatively understand variation of genetic polymorphisms among individuals. The aggregated number of genotyped humans is currently on the order of millions of individuals, and existing methods do not scale to data of this size. To solve this problem, we developed TeraStructure, an algorithm to fit Bayesian models of genetic variation in structured human populations on tera-sample-sized data sets (10^12 observed genotypes; for example, 1 million individuals at 1 million SNPs). TeraStructure is a scalable approach to Bayesian inference in which subsamples of markers are used to update an estimate of the latent population structure among individuals. We demonstrate that TeraStructure performs as well as existing methods on current globally sampled data, and we show using simulations that TeraStructure continues to be accurate and is the only method that can scale to tera-sample sizes.
|
41
|
Wang Y, Li J, Kolon TF, Olivant Fisher A, Figueroa TE, BaniHani AH, Hagerty JA, Gonzalez R, Noh PH, Chiavacci RM, Harden KR, Abrams DJ, Stabley D, Kim CE, Sol-Church K, Hakonarson H, Devoto M, Barthold JS. Genomic copy number variation association study in Caucasian patients with nonsyndromic cryptorchidism. BMC Urol 2016; 16:62. [PMID: 27769252 PMCID: PMC5073740 DOI: 10.1186/s12894-016-0180-4] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 10/22/2015] [Accepted: 10/14/2016] [Indexed: 11/10/2022] Open
Abstract
Background: Copy number variation (CNV) is a potential contributing factor to many genetic diseases. Here we investigated the potential association of CNV with nonsyndromic cryptorchidism, the most common male congenital genitourinary defect, in a Caucasian population. Methods: Genome-wide genotyping was performed in 559 cases and 1772 controls (Group 1) using Illumina HumanHap550 v1, HumanHap550 v3 or Human610-Quad platforms and in 353 cases and 1149 controls (Group 2) using the Illumina Human OmniExpress 12v1 or Human OmniExpress 12v1-1. Signal intensity data including log R ratio (LRR) and B allele frequency (BAF) for each single nucleotide polymorphism (SNP) were used for CNV detection using PennCNV software. After sample quality control, gene- and CNV-based association tests were performed using cleaned data from Group 1 (493 cases and 1586 controls) and Group 2 (307 cases and 1102 controls) using ParseCNV software. Meta-analysis was performed using gene-based test results as input to identify significant genes, and CNVs in or around significant genes were identified in CNV-based association test results. Called CNVs passing quality control and signal intensity visualization examination were considered for validation using TaqMan CNV assays and the QuantStudio® 3D Digital PCR System. Results: The meta-analysis identified 373 genome-wide significant (p < 5×10^-4) genes/loci including 49 genes/loci with deletions and 324 with duplications. Among them, 17 genes with deletion and 1 gene with duplication were identified in CNV-based association results in both Group 1 and Group 2. Only 2 genes (NUCB2 and UPF2) containing deletions passed CNV quality control in both groups and signal intensity visualization examination, but laboratory validation failed to verify these deletions. Conclusions: Our data do not support that structural variation is a major cause of nonsyndromic cryptorchidism.
Affiliation(s)
- Yanping Wang
- Nemours Biomedical Research, Nemours/Alfred I. duPont Hospital for Children, Wilmington, DE, 19803, USA
| | - Jin Li
- Center for Applied Genomics, The Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
| | - Thomas F Kolon
- Division of Urology, The Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
| | - Alicia Olivant Fisher
- Nemours Biomedical Research, Nemours/Alfred I. duPont Hospital for Children, Wilmington, DE, 19803, USA
| | - T Ernesto Figueroa
- Division of Urology, Nemours/Alfred I. duPont Hospital for Children, Wilmington, DE, 19803, USA
| | - Ahmad H BaniHani
- Division of Urology, Nemours/Alfred I. duPont Hospital for Children, Wilmington, DE, 19803, USA
| | - Jennifer A Hagerty
- Division of Urology, Nemours/Alfred I. duPont Hospital for Children, Wilmington, DE, 19803, USA
| | - Ricardo Gonzalez
- Division of Urology, Nemours/Alfred I. duPont Hospital for Children, Wilmington, DE, 19803, USA; Present address: Auf der Bult Kinder- und Jugendkrankenhaus, Hannover, Germany
| | - Paul H Noh
- Division of Urology, Nemours/Alfred I. duPont Hospital for Children, Wilmington, DE, 19803, USA; Present address: Division of Pediatric Urology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
| | - Rosetta M Chiavacci
- Center for Applied Genomics, The Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
| | - Kisha R Harden
- Center for Applied Genomics, The Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
| | - Debra J Abrams
- Center for Applied Genomics, The Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
| | - Deborah Stabley
- Nemours Biomedical Research, Nemours/Alfred I. duPont Hospital for Children, Wilmington, DE, 19803, USA
| | - Cecilia E Kim
- Center for Applied Genomics, The Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
| | - Katia Sol-Church
- Nemours Biomedical Research, Nemours/Alfred I. duPont Hospital for Children, Wilmington, DE, 19803, USA
| | - Hakon Hakonarson
- Center for Applied Genomics, The Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA; Division of Genetics, The Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA; Department of Pediatrics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Marcella Devoto
- Division of Genetics, The Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA; Department of Pediatrics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA; Department of Biostatistics and Epidemiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA; Department of Molecular Medicine, Sapienza University, Rome, Italy
| | - Julia Spencer Barthold
- Nemours Biomedical Research, Nemours/Alfred I. duPont Hospital for Children, Wilmington, DE, 19803, USA; Division of Urology, Nemours/Alfred I. duPont Hospital for Children, Wilmington, DE, 19803, USA.
|
42
|
Hunley K, Gwin K, Liberman B. A Reassessment of the Impact of European Contact on the Structure of Native American Genetic Diversity. PLoS One 2016; 11:e0161018. [PMID: 27579784 PMCID: PMC5007009 DOI: 10.1371/journal.pone.0161018] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Received: 06/13/2016] [Accepted: 07/28/2016] [Indexed: 11/19/2022] Open
Abstract
Our current understanding of pre-Columbian history in the Americas rests in part on several trends identified in recent genetic studies. The goal of this study is to reexamine these trends in light of the impact of post-Columbian admixture and the methods used to study admixture. The previously-published data consist of 645 autosomal microsatellite genotypes from 1046 individuals in 63 populations. We used STRUCTURE to estimate ancestry proportions and tested the sensitivity of these estimates to the choice of the number of clusters, K. We used partial correlation analyses to examine the relationship between gene diversity and geographic distance from Beringia, controlling for non-Native American ancestry (from Africa, Europe and East Asia), and taking into account alternative paths of migration. Principal component analysis and multidimensional scaling were used to investigate the relationships between Andean and non-Andean populations and to explore gene-language correspondence. We found that 1) European and East Asian ancestry estimates decline as K increases, especially in Native Canadian populations, 2) a north-south decline in gene diversity is driven by low diversity in Amazonian and Paraguayan populations, not serial founder effects from Beringia, 3) controlling for non-Native American ancestry, populations in the Andes and Mesoamerica have higher gene diversity than populations in other regions, and 4) patterns of genetic and linguistic diversity are poorly correlated. We conclude that patterns of diversity previously attributed to pre-Columbian processes may in part reflect post-Columbian admixture and the choice of K in STRUCTURE analyses. Accounting for admixture, the pattern of diversity is inconsistent with a north-south founder effect process, though the genetic similarities between Mesoamerican and Andean populations are consistent with rapid dispersal along the western coast of the Americas. 
Further, even setting aside the disruptive effects of European contact, gene-language congruence is unlikely to have ever existed at the geographic scale analyzed here.
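The partial correlation analysis described in this abstract (gene diversity versus distance from Beringia, controlling for non-Native American ancestry) can be illustrated with the first-order partial correlation formula. This is a generic sketch, not the authors' analysis; the function names and the toy numbers are hypothetical.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def partial_correlation(x: Sequence[float], y: Sequence[float], z: Sequence[float]) -> float:
    """Correlation of x and y after removing the linear effect of z from both."""
    r_xy = pearson(x, y)
    r_xz = pearson(x, z)
    r_yz = pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))


# Toy values only: x could stand for gene diversity, y for geographic distance,
# and z for an admixture proportion used as the control variable.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 1.0, 4.0, 3.0]
z = [1.0, -1.0, -1.0, 1.0]
r_partial = partial_correlation(x, y, z)
print(r_partial)
```

In this toy data the control variable is uncorrelated with both x and y, so the partial correlation equals the ordinary correlation; when the control variable explains part of both, the two values diverge.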
Affiliation(s)
- Keith Hunley
- Department of Anthropology, University of New Mexico, Albuquerque, NM, 87131, United States of America
| | - Kiela Gwin
- Department of Anthropology, University of New Mexico, Albuquerque, NM, 87131, United States of America
| | - Brendan Liberman
- Department of Anthropology, University of New Mexico, Albuquerque, NM, 87131, United States of America
|
43
|
Granot Y, Tal O, Rosset S, Skorecki K. On the Apportionment of Population Structure. PLoS One 2016; 11:e0160413. [PMID: 27505172 PMCID: PMC4978449 DOI: 10.1371/journal.pone.0160413] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 12/02/2015] [Accepted: 07/19/2016] [Indexed: 11/30/2022] Open
Abstract
Measures of population differentiation, such as FST, are traditionally derived from the partition of diversity within and between populations. However, the emergence of population clusters from multilocus analysis is a function of genetic structure (departures from panmixia) rather than of diversity. If the populations are close to panmixia, slight differences between the mean pairwise distance within and between populations (low FST) can manifest as strong separation between the populations, thus population clusters are often evident even when the vast majority of diversity is partitioned within populations rather than between them. For any given FST value, clusters can be tighter (more panmictic) or looser (more stratified), and in this respect higher FST does not always imply stronger differentiation. In this study we propose a measure for the partition of structure, denoted EST, which is more consistent with results from clustering schemes. Crucially, our measure is based on a statistic of the data that is a good measure of internal structure, mimicking the information extracted by unsupervised clustering or dimensionality reduction schemes. To assess the utility of our metric, we ranked various human (HGDP) population pairs based on FST and EST and found substantial differences in ranking order. EST ranking seems more consistent with population clustering and classification and possibly with geographic distance between populations. Thus, EST may at times outperform FST in identifying evolutionary significant differentiation.
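As background for the comparison drawn in this abstract, the sketch below computes the classical FST at a single biallelic locus from the partition of expected heterozygosity within and between populations. It assumes equally weighted populations and does not implement the EST measure proposed by the authors, whose definition is not given in the abstract; the function name and frequencies are hypothetical.

```python
from typing import Sequence


def fst_biallelic(allele_freqs: Sequence[float]) -> float:
    """FST = (H_T - H_S) / H_T for one biallelic locus, equal population weights.

    H_S is the mean expected heterozygosity within populations and
    H_T is the expected heterozygosity of the pooled population.
    """
    k = len(allele_freqs)
    h_s = sum(2.0 * p * (1.0 - p) for p in allele_freqs) / k
    p_bar = sum(allele_freqs) / k
    h_t = 2.0 * p_bar * (1.0 - p_bar)
    if h_t == 0.0:
        raise ValueError("locus is monomorphic in the pooled sample")
    return (h_t - h_s) / h_t


# Hypothetical allele frequencies in two populations.
fst_divergent = fst_biallelic([0.2, 0.8])
fst_identical = fst_biallelic([0.5, 0.5])
print(fst_divergent, fst_identical)
```

Because this statistic depends only on how diversity is partitioned, it says nothing about how tightly individuals cluster within each population, which is the gap the abstract's proposed measure is meant to address.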
Affiliation(s)
- Yaron Granot
- Rappaport Faculty of Medicine and Research Institute, Technion–Israel Institute of Technology, and Rambam Medical Center, Haifa, Israel
| | - Omri Tal
- Max Planck Institute for Mathematics in the Sciences, Inselstr. 22-26, 04103, Leipzig, Germany
| | - Saharon Rosset
- School of Mathematical Sciences Tel Aviv University, Tel Aviv, Israel
| | - Karl Skorecki
- Rappaport Faculty of Medicine and Research Institute, Technion–Israel Institute of Technology, and Rambam Medical Center, Haifa, Israel
|
44
|
Roudnitzky N, Risso D, Drayna D, Behrens M, Meyerhof W, Wooding SP. Copy Number Variation in TAS2R Bitter Taste Receptor Genes: Structure, Origin, and Population Genetics. Chem Senses 2016; 41:649-59. [PMID: 27340135 DOI: 10.1093/chemse/bjw067] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Indexed: 12/16/2022] Open
Abstract
Bitter taste receptor genes (TAS2Rs) harbor extensive diversity, which is broadly distributed across human populations and strongly associated with taste response phenotypes. The majority of TAS2R variation is composed of single-nucleotide polymorphisms. However, 2 closely positioned loci at 12p13, TAS2R43 and -45, harbor high-frequency deletion (Δ) alleles in which genomic segments are absent, resulting in copy number variation (CNV). To resolve their chromosomal structure and organization, we generated maps using long-range contig alignments and local sequencing across the TAS2R43-45 region. These revealed that the deletion alleles (43Δ and 45Δ) are 37.8 and 32.2 kb in length, respectively, and span the complete coding region of each gene (~1 kb) along with extensive up- and downstream flanking sequence, producing separate CNVs at the 2 loci. Comparisons with a chimpanzee genome, which contained intact homologs of TAS2R43, -45, and nearby TAS2Rs, indicated that the deletions evolved recently, through unequal recombination in a cluster of closely related loci. Population genetic analyses in 946 subjects from 52 worldwide populations revealed that copy number ranged from 0 to 2 at both TAS2R43 and TAS2R45, with 43Δ and 45Δ occurring at high global frequencies (0.33 and 0.18). Estimated recombination rates between the loci were low (ρ = 2.7×10^-4; r = 6.6×10^-9) and linkage disequilibrium was high (D' = 1.0), consistent with their adjacent genomic positioning and recent origin. Geographic variation pointed to an African origin for the deletions. However, no signatures of natural selection were found in population structure or integrated haplotype scores spanning the region, suggesting that patterns of diversity at TAS2R43 and -45 are primarily due to genetic drift.
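The linkage disequilibrium statistic D' reported in this abstract can be sketched from haplotype and allele frequencies as follows. The allele frequencies 0.33 and 0.18 are taken from the abstract, but the joint haplotype frequency used in the example is a hypothetical value chosen for illustration, and the function name is an assumption.

```python
def d_prime(p_ab: float, p_a: float, p_b: float) -> float:
    """Absolute normalised linkage disequilibrium |D'| between two biallelic loci.

    p_ab is the frequency of the haplotype carrying allele A and allele B,
    p_a and p_b are the marginal frequencies of A and B.
    """
    d = p_ab - p_a * p_b
    if d >= 0.0:
        d_max = min(p_a * (1.0 - p_b), (1.0 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1.0 - p_a) * (1.0 - p_b))
    if d_max == 0.0:
        raise ValueError("D' is undefined when either locus is monomorphic")
    return abs(d) / d_max


# Reported deletion-allele frequencies, with a hypothetical joint frequency in
# which every haplotype carrying the rarer deletion also carries the other one.
ld_complete = d_prime(p_ab=0.18, p_a=0.33, p_b=0.18)
# Hypothetical contrast: the two alleles are statistically independent.
ld_none = d_prime(p_ab=0.33 * 0.18, p_a=0.33, p_b=0.18)
print(ld_complete, ld_none)
```

A value of 1.0 means that at least one of the four possible haplotypes is absent, which is the pattern expected when little or no recombination has occurred between the loci.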
Affiliation(s)
- Natacha Roudnitzky
- Department of Molecular Genetics, German Institute of Human Nutrition Potsdam-Rehbruecke, Arthur-Scheunert-Allee 114-116, 14558 Nuthetal, Germany
| | - Davide Risso
- National Institute on Deafness and Other Communication Disorders, National Institutes of Health, Bethesda, MD 20892, USA
| | - Dennis Drayna
- National Institute on Deafness and Other Communication Disorders, National Institutes of Health, Bethesda, MD 20892, USA
| | - Maik Behrens
- Department of Molecular Genetics, German Institute of Human Nutrition Potsdam-Rehbruecke, Arthur-Scheunert-Allee 114-116, 14558 Nuthetal, Germany
| | - Wolfgang Meyerhof
- Department of Molecular Genetics, German Institute of Human Nutrition Potsdam-Rehbruecke, Arthur-Scheunert-Allee 114-116, 14558 Nuthetal, Germany
| | - Stephen P Wooding
- Health Sciences Research Institute, University of California, Merced, 5200 North Lake Road, Merced, CA 95343, USA
| |
Collapse
|
45
|
Santos C, Phillips C, Gomez-Tato A, Alvarez-Dios J, Carracedo Á, Lareu MV. Inference of Ancestry in Forensic Analysis II: Analysis of Genetic Data. Methods Mol Biol 2016; 1420:255-285. [PMID: 27259745 DOI: 10.1007/978-1-4939-3597-0_19] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Indexed: 06/05/2023]
Abstract
Three approaches applicable to the analysis of forensic ancestry-informative marker data (STRUCTURE, principal component analysis, and the Snipper Bayesian classification system) are reviewed. Detailed step-by-step guidance is provided for adjusting parameter settings in STRUCTURE with particular regard to their effect when differentiating populations. Several enhancements to the Snipper online forensic classification portal are described, highlighting the added functionality they bring to particular aspects of ancestry-informative SNP analysis in a forensic context.
Affiliation(s)
- Carla Santos
  - Forensic Genetics Unit, Luis Concheiro Institute of Forensic Sciences, Genomic Medicine Group, University of Santiago de Compostela, Galicia, 15782, Spain
- Chris Phillips
  - Forensic Genetics Unit, Luis Concheiro Institute of Forensic Sciences, Genomic Medicine Group, University of Santiago de Compostela, Galicia, 15782, Spain
- A Gomez-Tato
  - Faculty of Mathematics, University of Santiago de Compostela, Galicia, Spain
- J Alvarez-Dios
  - Faculty of Mathematics, University of Santiago de Compostela, Galicia, Spain
- Ángel Carracedo
  - Forensic Genetics Unit, Luis Concheiro Institute of Forensic Sciences, Genomic Medicine Group, University of Santiago de Compostela, Galicia, 15782, Spain
  - Center of Excellence in Genomic Medicine Research, King Abdulaziz University, Jeddah, Saudi Arabia
- Maria Victoria Lareu
  - Forensic Genetics Unit, Luis Concheiro Institute of Forensic Sciences, Genomic Medicine Group, University of Santiago de Compostela, Galicia, 15782, Spain
|
46
|
High level of inbreeding in final phase of 1000 Genomes Project. Sci Rep 2015; 5:17453. [PMID: 26625947 PMCID: PMC4667178 DOI: 10.1038/srep17453] [Citation(s) in RCA: 56] [Impact Index Per Article: 6.2] [Received: 06/25/2015] [Accepted: 10/19/2015] [Indexed: 12/18/2022] Open
Abstract
The 1000 Genomes Project provides a unique source of whole genome sequencing data for studies of human population genetics and human diseases. The last release of this project includes more than 2,500 sequenced individuals from 26 populations. Although relationships among individuals have been investigated in some of the populations, inbreeding has never been studied. In this article, we estimated the genomic inbreeding coefficient of each individual and found an unexpectedly high level of inbreeding in 1000 Genomes data: nearly a quarter of the individuals were inbred and around 4% of them had inbreeding coefficients similar to or greater than those expected for first-cousin offspring. Inbred individuals were found in each of the 26 populations, with some populations showing proportions of inbred individuals above 50%. We also detected 227 previously unreported pairs of close relatives (up to and including first cousins). Thus, we propose subsets of unrelated and outbred individuals for use by the scientific community. In addition, because admixed populations are present in the 1000 Genomes Project, we performed simulations to study the robustness of inbreeding coefficient estimates in the presence of admixture. We found that our multi-point approach (FSuite) was quite robust to admixture, unlike single-point methods (PLINK).
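For orientation, the benchmark the abstract refers to (the inbreeding coefficient expected for first-cousin offspring) can be worked out from a pedigree with Wright's path-counting formula, which gives 1/16. The sketch below is illustrative only: the function name and its input format are assumptions made for this example, it assumes the shared ancestors are not themselves inbred, and it does not reproduce the genomic estimation performed with FSuite or PLINK in the study.

```python
from fractions import Fraction


def inbreeding_coefficient(paths):
    """Pedigree inbreeding coefficient via Wright's path counting.

    `paths` is a list of (n1, n2) pairs, one per common ancestor of the two
    parents, where n1 and n2 are the generations from each parent up to that
    ancestor. Assumes the common ancestors are themselves non-inbred.
    """
    return sum(Fraction(1, 2) ** (n1 + n2 + 1) for n1, n2 in paths)


# Parents who are first cousins share two grandparents; each shared
# grandparent is two generations above either parent.
first_cousin_offspring = inbreeding_coefficient([(2, 2), (2, 2)])
print(first_cousin_offspring)  # 1/16
```

Using exact fractions keeps the result readable as 1/16 (0.0625) rather than a rounded float.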
|
47
|
Forni D, Pozzoli U, Cagliani R, Tresoldi C, Menozzi G, Riva S, Guerini FR, Comi GP, Bolognesi E, Bresolin N, Clerici M, Sironi M. Genetic adaptation of the human circadian clock to day-length latitudinal variations and relevance for affective disorders. Genome Biol 2015; 15:499. [PMID: 25358694 PMCID: PMC4237747 DOI: 10.1186/s13059-014-0499-7] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Received: 04/23/2014] [Indexed: 01/18/2023] Open
Abstract
Background: The temporal coordination of biological processes into daily cycles is a common feature of most living organisms. In humans, disruption of circadian rhythms is commonly observed in psychiatric diseases, including schizophrenia, bipolar disorder, depression and autism. Light therapy is the most effective treatment for seasonal affective disorder and circadian-related treatments sustain antidepressant response in bipolar disorder patients. Day/night cycles represent a major circadian synchronizing signal and vary widely with latitude. Results: We apply a geographically explicit model to show that out-of-Africa migration, which led humans to occupy a wide latitudinal area, affected the evolutionary history of circadian regulatory genes. The SNPs we identify using this model display consistent signals of natural selection using tests based on population genetic differentiation and haplotype homozygosity. Signals of natural selection driven by annual photoperiod variation are detected for schizophrenia, bipolar disorder, and restless leg syndrome risk variants, in line with the circadian component of these conditions. Conclusions: Our results suggest that human populations adapted to life at different latitudes by tuning their circadian clock systems. This process also involves risk variants for neuropsychiatric conditions, suggesting possible genetic modulators for chronotherapies and candidates for interaction analysis with photoperiod-related environmental variables, such as season of birth, country of residence, shift-work or lifestyle habits. Electronic supplementary material: The online version of this article (doi:10.1186/s13059-014-0499-7) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Diego Forni
  - Scientific Institute IRCCS E. Medea, 23842 Bosisio Parini, LC, Italy
|
48
|
Zheng X, Weir BS. Eigenanalysis of SNP data with an identity by descent interpretation. Theor Popul Biol 2015; 107:65-76. [PMID: 26482676 DOI: 10.1016/j.tpb.2015.09.004] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 02/10/2015] [Revised: 09/17/2015] [Accepted: 09/23/2015] [Indexed: 01/11/2023]
Abstract
Principal component analysis (PCA) is widely used in genome-wide association studies (GWAS), and the principal component axes often represent perpendicular gradients in geographic space. The explanation of PCA results is of major interest for geneticists to understand fundamental demographic parameters. Here, we provide an interpretation of PCA based on relatedness measures, which are described by the probability that sets of genes are identical-by-descent (IBD). An approximately linear transformation between ancestral proportions (AP) of individuals with multiple ancestries and their projections onto the principal components is found. In addition, a new method of eigenanalysis "EIGMIX" is proposed to estimate individual ancestries. EIGMIX is a method of moments with computational efficiency suitable for millions of SNP data, and it is not subject to the assumption of linkage equilibrium. With the assumptions of multiple ancestries and their surrogate ancestral samples, EIGMIX is able to infer ancestral proportions (APs) of individuals. The methods were applied to the SNP data from the HapMap Phase 3 project and the Human Genome Diversity Panel. The APs of individuals inferred by EIGMIX are consistent with the findings of the program ADMIXTURE. In conclusion, EIGMIX can be used to detect population structure and estimate genome-wide ancestral proportions with a relatively high accuracy.
Affiliation(s)
- Xiuwen Zheng
  - Department of Biostatistics, University of Washington, Box 359461, Seattle, WA 98195-9461, USA
- Bruce S Weir
  - Department of Biostatistics, University of Washington, Box 359461, Seattle, WA 98195-9461, USA
|
49
|
Medina-Gómez C, Chesi A, Heppe DHM, Zemel BS, Yin JL, Kalkwarf HJ, Hofman A, Lappe JM, Kelly A, Kayser M, Oberfield SE, Gilsanz V, Uitterlinden AG, Shepherd JA, Jaddoe VWV, Grant SFA, Lao O, Rivadeneira F. BMD Loci Contribute to Ethnic and Developmental Differences in Skeletal Fragility across Populations: Assessment of Evolutionary Selection Pressures. Mol Biol Evol 2015; 32:2961-72. [PMID: 26226985 PMCID: PMC4651235 DOI: 10.1093/molbev/msv170] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Indexed: 12/21/2022] Open
Abstract
Bone mineral density (BMD) is a highly heritable trait used both for the diagnosis of osteoporosis in adults and to assess bone health in children. Ethnic differences in BMD have been documented, with markedly higher levels in individuals of African descent, which partially explain disparity in osteoporosis risk across populations. To date, 63 independent genetic variants have been associated with BMD in adults of Northern-European ancestry. Here, we demonstrate that at least 61 of these variants are predictive of BMD early in life by studying their compound effect within two multiethnic pediatric cohorts. Furthermore, we show that within these cohorts and across populations worldwide the frequency of those alleles associated with increased BMD is systematically elevated in individuals of Sub-Saharan African ancestry. The amount of differentiation in the BMD genetic scores among Sub-Saharan and non-Sub-Saharan populations together with neutrality tests, suggest that these allelic differences are compatible with the hypothesis of selective pressures acting on the genetic determinants of BMD. These findings constitute an explorative contribution to the role of selection on ethnic BMD differences and likely a new example of polygenic adaptation acting on a human trait.
Affiliation(s)
- Carolina Medina-Gómez
  - Department of Internal Medicine, Erasmus University Medical Center, Rotterdam, The Netherlands; The Generation R Study Group, Erasmus University Medical Center, Rotterdam, The Netherlands; Department of Epidemiology, Erasmus University Medical Center, Rotterdam, The Netherlands
- Alessandra Chesi
  - Division of Human Genetics, Children's Hospital of Philadelphia, Philadelphia, PA
- Denise H M Heppe
  - The Generation R Study Group, Erasmus University Medical Center, Rotterdam, The Netherlands; Department of Epidemiology, Erasmus University Medical Center, Rotterdam, The Netherlands; Department of Pediatrics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Babette S Zemel
  - Division of GI, Hepatology, and Nutrition, Children's Hospital of Philadelphia, Philadelphia, PA; Department of Pediatrics, Perelman School of Medicine, University of Pennsylvania
- Jia-Lian Yin
  - Department of Internal Medicine, Erasmus University Medical Center, Rotterdam, The Netherlands; The Generation R Study Group, Erasmus University Medical Center, Rotterdam, The Netherlands; Department of Epidemiology, Erasmus University Medical Center, Rotterdam, The Netherlands
- Heidi J Kalkwarf
  - Division of General and Community Pediatrics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH
- Albert Hofman
  - The Generation R Study Group, Erasmus University Medical Center, Rotterdam, The Netherlands; Department of Epidemiology, Erasmus University Medical Center, Rotterdam, The Netherlands
- Joan M Lappe
  - Division of Endocrinology, Creighton University, Omaha, NE
- Andrea Kelly
  - Division of GI, Hepatology, and Nutrition, Children's Hospital of Philadelphia, Philadelphia, PA
- Manfred Kayser
  - Department of Genetic Identification, Erasmus University Medical Center, Rotterdam, The Netherlands
- Sharon E Oberfield
  - Division of Pediatric Endocrinology, Diabetes, and Metabolism, Department of Pediatrics, Columbia University Medical Center, New York, NY
- Vicente Gilsanz
  - Department of Radiology, Children's Hospital Los Angeles, Los Angeles, CA
- André G Uitterlinden
  - Department of Internal Medicine, Erasmus University Medical Center, Rotterdam, The Netherlands; The Generation R Study Group, Erasmus University Medical Center, Rotterdam, The Netherlands; Department of Epidemiology, Erasmus University Medical Center, Rotterdam, The Netherlands
- John A Shepherd
  - Department of Radiology and Biomedical Imaging, University of California, San Francisco
- Vincent W V Jaddoe
  - The Generation R Study Group, Erasmus University Medical Center, Rotterdam, The Netherlands; Department of Epidemiology, Erasmus University Medical Center, Rotterdam, The Netherlands; Department of Pediatrics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Struan F A Grant
  - Division of Human Genetics, Children's Hospital of Philadelphia, Philadelphia, PA
- Oscar Lao
  - Department of Genetic Identification, Erasmus University Medical Center, Rotterdam, The Netherlands
- Fernando Rivadeneira
  - Department of Internal Medicine, Erasmus University Medical Center, Rotterdam, The Netherlands; The Generation R Study Group, Erasmus University Medical Center, Rotterdam, The Netherlands; Department of Epidemiology, Erasmus University Medical Center, Rotterdam, The Netherlands
|
50
|
Barthold JS, Wang Y, Kolon TF, Kollin C, Nordenskjöld A, Olivant Fisher A, Figueroa TE, BaniHani AH, Hagerty JA, Gonzaléz R, Noh PH, Chiavacci RM, Harden KR, Abrams DJ, Kim CE, Li J, Hakonarson H, Devoto M. Pathway analysis supports association of nonsyndromic cryptorchidism with genetic loci linked to cytoskeleton-dependent functions. Hum Reprod 2015. [PMID: 26209787 DOI: 10.1093/humrep/dev180] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Indexed: 12/30/2022] Open
Abstract
STUDY QUESTION: What are the genetic loci that increase susceptibility to nonsyndromic cryptorchidism, or undescended testis? SUMMARY ANSWER: A genome-wide association study (GWAS) suggests that susceptibility to cryptorchidism is heterogeneous, with a subset of suggestive signals linked to cytoskeleton-dependent functions and syndromic forms of the disease. WHAT IS KNOWN ALREADY: Population studies suggest moderate genetic risk of cryptorchidism and possible maternal and environmental contributions to risk. Previous candidate gene analyses have failed to identify a major associated locus, although variants in insulin-like 3 (INSL3), relaxin/insulin-like family peptide receptor 2 (RXFP2) and other hormonal pathway genes may increase risk in a small percentage of patients. STUDY DESIGN, SIZE, DURATION: This is a case-control GWAS of 844 boys with nonsyndromic cryptorchidism and 2718 control subjects without syndromes or genital anomalies, all of European ancestry. PARTICIPANTS/MATERIALS, SETTING, METHODS: All boys with cryptorchidism were diagnosed and treated by a pediatric specialist. In the discovery phase, DNA was extracted from tissue or blood samples and genotyping performed using the Illumina HumanHap550 and Human610-Quad (Group 1) or OmniExpress (Group 2) platform. We imputed genotypes genome-wide, and combined single marker association results in meta-analyses for all cases and for secondary subphenotype analyses based on testis position, laterality and age, and defined genome-wide significance as P = 7 × 10⁻⁹ to correct for multiple testing. Selected markers were genotyped in an independent replication group of European cases (n = 298) and controls (n = 324). We used several bioinformatics tools to analyze top (P < 10⁻⁵) and suggestive (P < 10⁻³) signals for significant enrichment of signaling pathways, cellular functions and custom gene lists after multiple testing correction.
MAIN RESULTS AND THE ROLE OF CHANCE: In the full analysis, we identified 20 top loci, none reaching genome-wide significance, but one passing this threshold in a subphenotype analysis of proximal testis position (rs55867206, near SH3PXD2B, odds ratio = 2.2 (95% confidence interval 1.7, 2.9), P = 2 × 10⁻⁹). An additional 127 top loci emerged in at least one secondary analysis, particularly of more severe phenotypes. Cytoskeleton-dependent molecular and cellular functions were prevalent in pathway analysis of suggestive signals, and may implicate loci encoding cytoskeletal proteins that participate in androgen receptor signaling. Genes linked to human syndromic cryptorchidism, including hypogonadotropic hypogonadism, and to hormone-responsive and/or differentially expressed genes in normal and cryptorchid rat gubernaculum, were also significantly overrepresented. No tested marker showed significant replication in an independent population. The results suggest heterogeneous, multilocus and potentially multifactorial susceptibility to nonsyndromic cryptorchidism. LIMITATIONS, REASONS FOR CAUTION: The present study failed to identify genome-wide significant markers associated with cryptorchidism that could be replicated in an independent population, so further studies are required to define true positive signals among suggestive loci. WIDER IMPLICATIONS OF THE FINDINGS: As the only GWAS to date of nonsyndromic cryptorchidism, these data will provide a basis for future efforts to understand genetic susceptibility to this common reproductive anomaly and the potential for additive risk from environmental exposures.
STUDY FUNDING/COMPETING INTERESTS: This work was supported by R01HD060769 (the Eunice Kennedy Shriver National Institute for Child Health and Human Development (NICHD)), P20RR20173 (the National Center for Research Resources (NCRR), currently P20GM103464 from the National Institute of General Medical Sciences (NIGMS)), an Institute Development Fund to the Center for Applied Genomics at The Children's Hospital of Philadelphia, and Nemours Biomedical Research. The authors have no competing interests to declare.
Affiliation(s)
- Julia Spencer Barthold
  - Nemours Biomedical Research, Alfred I. duPont Hospital for Children, Wilmington, DE 19803, USA; Division of Urology, Alfred I. duPont Hospital for Children, Wilmington, DE 19803, USA
- Yanping Wang
  - Nemours Biomedical Research, Alfred I. duPont Hospital for Children, Wilmington, DE 19803, USA
- Thomas F Kolon
  - Division of Urology, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Claude Kollin
  - Department of Women's and Children's Health, Karolinska Institutet, SE-171 76 Stockholm, Sweden
- Agneta Nordenskjöld
  - Department of Women's and Children's Health, Karolinska Institutet, SE-171 76 Stockholm, Sweden
- Alicia Olivant Fisher
  - Nemours Biomedical Research, Alfred I. duPont Hospital for Children, Wilmington, DE 19803, USA
- T Ernesto Figueroa
  - Division of Urology, Alfred I. duPont Hospital for Children, Wilmington, DE 19803, USA
- Ahmad H BaniHani
  - Division of Urology, Alfred I. duPont Hospital for Children, Wilmington, DE 19803, USA
- Jennifer A Hagerty
  - Division of Urology, Alfred I. duPont Hospital for Children, Wilmington, DE 19803, USA
- Ricardo Gonzaléz
  - Division of Urology, Alfred I. duPont Hospital for Children, Wilmington, DE 19803, USA; Present address: Auf der Bult Kinder- und Jugendkrankenhaus, Hannover, Germany
- Paul H Noh
  - Division of Urology, Alfred I. duPont Hospital for Children, Wilmington, DE 19803, USA; Present address: Division of Pediatric Urology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA
- Rosetta M Chiavacci
  - Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Kisha R Harden
  - Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Debra J Abrams
  - Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Cecilia E Kim
  - Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Jin Li
  - Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Hakon Hakonarson
  - Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Division of Genetics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Department of Pediatrics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Marcella Devoto
  - Division of Genetics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Department of Biostatistics and Epidemiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA; Department of Pediatrics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA; Department of Molecular Medicine, Sapienza University, Rome, Italy
|