1
Reščenko R, Brīvība M, Atava I, Rovīte V, Pečulis R, Silamiķelis I, Ansone L, Megnis K, Birzniece L, Leja M, Xu L, Shi X, Zhou Y, Slaitas A, Hou Y, Kloviņš J. Whole-Genome Sequencing of 502 Individuals from Latvia: The First Step towards a Population-Specific Reference of Genetic Variation. Int J Mol Sci 2023; 24:15345. [PMID: 37895026 PMCID: PMC10607061 DOI: 10.3390/ijms242015345] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2023] [Revised: 10/10/2023] [Accepted: 10/11/2023] [Indexed: 10/29/2023] Open
Abstract
Despite rapid improvements in the accessibility of whole-genome sequencing (WGS), understanding of the extent of human genetic variation is limited by the scarce availability of genome sequences from underrepresented populations. Developing a population-scale reference database of Latvian genetic variation may fill a gap in European genomes and improve human genomics research. In this study, we analysed a high-coverage WGS dataset comprising 502 individuals selected from the Genome Database of the Latvian Population. An assessment of variant type, location in the genome, function, medical relevance, and novelty was performed, and a population-specific imputation reference panel (IRP) was developed. We identified more than 18.2 million variants in total, of which 3.3% are not yet represented in the gnomAD and dbSNP databases. Moreover, we observed a notable though distinct clustering of the Latvian cohort within the European subpopulations. Finally, our findings demonstrate improved imputation performance in the Latvian population when using the Latvian population-specific reference panel rather than established IRPs. In summary, our study provides the first WGS data for a regional reference genome that will serve as a resource for the development of precision medicine and complement the global genome dataset, improving the understanding of human genetic variation.
Affiliation(s)
- Raimonds Reščenko
- Latvian Biomedical Research and Study Centre, LV-1067 Riga, Latvia
- Monta Brīvība
- Latvian Biomedical Research and Study Centre, LV-1067 Riga, Latvia
- Ivanna Atava
- Latvian Biomedical Research and Study Centre, LV-1067 Riga, Latvia
- Vita Rovīte
- Latvian Biomedical Research and Study Centre, LV-1067 Riga, Latvia
- Raitis Pečulis
- Latvian Biomedical Research and Study Centre, LV-1067 Riga, Latvia
- Ivars Silamiķelis
- Latvian Biomedical Research and Study Centre, LV-1067 Riga, Latvia
- Laura Ansone
- Latvian Biomedical Research and Study Centre, LV-1067 Riga, Latvia
- Kaspars Megnis
- Latvian Biomedical Research and Study Centre, LV-1067 Riga, Latvia
- Līga Birzniece
- Latvian Biomedical Research and Study Centre, LV-1067 Riga, Latvia
- Mārcis Leja
- Faculty of Medicine, University of Latvia, LV-1004 Riga, Latvia
- Institute of Clinical and Preventive Medicine, University of Latvia, LV-1079 Riga, Latvia
- Liqin Xu
- Latvia MGI Tech, LV-2167 Mārupe, Latvia
- Xulian Shi
- Latvia MGI Tech, LV-2167 Mārupe, Latvia
- Yan Zhou
- Latvia MGI Tech, LV-2167 Mārupe, Latvia
- Andis Slaitas
- Latvia MGI Tech, LV-2167 Mārupe, Latvia
- Yong Hou
- Latvia MGI Tech, LV-2167 Mārupe, Latvia
- Jānis Kloviņš
- Latvian Biomedical Research and Study Centre, LV-1067 Riga, Latvia
2
Oliveira LC, Dornelles AC, Nisihara RM, Bruginski ERD, Santos PID, Cipolla GA, Boschmann SE, Messias-Reason IJD, Campos FR, Petzl-Erler ML, Boldt ABW. The Second Highest Prevalence of Celiac Disease Worldwide: Genetic and Metabolic Insights in Southern Brazilian Mennonites. Genes (Basel) 2023; 14:genes14051026. [PMID: 37239386 DOI: 10.3390/genes14051026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2023] [Revised: 04/26/2023] [Accepted: 04/27/2023] [Indexed: 05/28/2023] Open
Abstract
Celiac disease (CD), despite its high morbidity, is an often-underdiagnosed autoimmune enteropathy. Using a modified version of the Brazilian questionnaire of the 2013 National Health Survey, we interviewed 604 Mennonites of Frisian/Flemish origin who have been isolated for 25 generations. A subgroup of 576 participants was screened for IgA autoantibodies in serum, and 391 participants were screened for HLA-DQ2.5/DQ8 subtypes. CD seroprevalence was 1:29 (3.48%, 95% CI = 2.16-5.27%) and biopsy-confirmed CD was 1:75 (1.32%, 95% CI = 0.57-2.59%), which exceeds the highest reported global prevalence (1:100). Half (10/21) of the patients did not suspect the disease. HLA-DQ2.5/DQ8 increased CD susceptibility (OR = 12.13 [95% CI = 1.56-94.20], p = 0.003). The HLA-DQ2.5 carrier frequency was higher in Mennonites than in Brazilians (p = 7 × 10⁻⁶). HLA-DQ8 but not HLA-DQ2.5 carrier frequency differed among settlements (p = 0.007) and was higher than in Belgians, a Mennonite ancestral population (p = 1.8 × 10⁻⁶), and higher than in Euro-Brazilians (p = 6.5 × 10⁻⁶). The glutathione pathway, which protects against bowel damage caused by reactive oxygen species, was altered within the metabolic profiles of untreated CD patients. Those with lower serological positivity clustered with controls who had close relatives with CD or rheumatoid arthritis. In conclusion, Mennonites have a high CD prevalence with a strong genetic component and altered glutathione metabolism, which calls for urgent action to alleviate the burden of comorbidities due to late diagnosis.
Affiliation(s)
- Luana Caroline Oliveira
- Laboratory of Human Molecular Genetics, Department of Genetics, Federal University of Paraná (UFPR), Centro Politécnico, Jardim das Américas, Curitiba 81531-990, Paraná, Brazil
- Postgraduate Program in Genetics, Department of Genetics, Federal University of Paraná (UFPR), Centro Politécnico, Jardim das Américas, Curitiba 81531-990, Paraná, Brazil
- Amanda Coelho Dornelles
- Laboratory of Human Molecular Genetics, Department of Genetics, Federal University of Paraná (UFPR), Centro Politécnico, Jardim das Américas, Curitiba 81531-990, Paraná, Brazil
- Renato Mitsunori Nisihara
- Laboratory of Molecular Immunopathology, Department of Clinical Pathology, Clinical Hospital, Federal University of Paraná (UFPR), Rua General Carneiro, 181 Prédio Central, 11° Andar, Alto da Glória, Curitiba 80060-240, Paraná, Brazil
- Estevan Rafael Dutra Bruginski
- Postgraduate Program in Pharmaceutical Sciences, Laboratory of Bioscience and Mass Spectrometry, Department of Pharmacy, Federal University of Paraná (UFPR), Av. Pref. Lothário Meissner, 632, Jardim Botânico, Curitiba 80210-170, Paraná, Brazil
- Priscila Ianzen Dos Santos
- Laboratory of Human Molecular Genetics, Department of Genetics, Federal University of Paraná (UFPR), Centro Politécnico, Jardim das Américas, Curitiba 81531-990, Paraná, Brazil
- Postgraduate Program in Internal Medicine, Federal University of Paraná (UFPR), Rua General Carneiro, 181 Prédio Central, 11° Andar, Alto da Glória, Curitiba 80060-240, Paraná, Brazil
- Gabriel Adelman Cipolla
- Laboratory of Human Molecular Genetics, Department of Genetics, Federal University of Paraná (UFPR), Centro Politécnico, Jardim das Américas, Curitiba 81531-990, Paraná, Brazil
- Postgraduate Program in Genetics, Department of Genetics, Federal University of Paraná (UFPR), Centro Politécnico, Jardim das Américas, Curitiba 81531-990, Paraná, Brazil
- Stefanie Epp Boschmann
- Laboratory of Molecular Immunopathology, Department of Clinical Pathology, Clinical Hospital, Federal University of Paraná (UFPR), Rua General Carneiro, 181 Prédio Central, 11° Andar, Alto da Glória, Curitiba 80060-240, Paraná, Brazil
- Postgraduate Program in Internal Medicine, Federal University of Paraná (UFPR), Rua General Carneiro, 181 Prédio Central, 11° Andar, Alto da Glória, Curitiba 80060-240, Paraná, Brazil
- Iara José de Messias-Reason
- Laboratory of Molecular Immunopathology, Department of Clinical Pathology, Clinical Hospital, Federal University of Paraná (UFPR), Rua General Carneiro, 181 Prédio Central, 11° Andar, Alto da Glória, Curitiba 80060-240, Paraná, Brazil
- Postgraduate Program in Internal Medicine, Federal University of Paraná (UFPR), Rua General Carneiro, 181 Prédio Central, 11° Andar, Alto da Glória, Curitiba 80060-240, Paraná, Brazil
- Francinete Ramos Campos
- Postgraduate Program in Pharmaceutical Sciences, Laboratory of Bioscience and Mass Spectrometry, Department of Pharmacy, Federal University of Paraná (UFPR), Av. Pref. Lothário Meissner, 632, Jardim Botânico, Curitiba 80210-170, Paraná, Brazil
- Maria Luiza Petzl-Erler
- Laboratory of Human Molecular Genetics, Department of Genetics, Federal University of Paraná (UFPR), Centro Politécnico, Jardim das Américas, Curitiba 81531-990, Paraná, Brazil
- Postgraduate Program in Genetics, Department of Genetics, Federal University of Paraná (UFPR), Centro Politécnico, Jardim das Américas, Curitiba 81531-990, Paraná, Brazil
- Angelica Beate Winter Boldt
- Laboratory of Human Molecular Genetics, Department of Genetics, Federal University of Paraná (UFPR), Centro Politécnico, Jardim das Américas, Curitiba 81531-990, Paraná, Brazil
- Postgraduate Program in Genetics, Department of Genetics, Federal University of Paraná (UFPR), Centro Politécnico, Jardim das Américas, Curitiba 81531-990, Paraná, Brazil
3
Xu ZM, Rüeger S, Zwyer M, Brites D, Hiza H, Reinhard M, Rutaihwa L, Borrell S, Isihaka F, Temba H, Maroa T, Naftari R, Hella J, Sasamalo M, Reither K, Portevin D, Gagneux S, Fellay J. Using population-specific add-on polymorphisms to improve genotype imputation in underrepresented populations. PLoS Comput Biol 2022; 18:e1009628. [PMID: 35025869 PMCID: PMC8791479 DOI: 10.1371/journal.pcbi.1009628] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/10/2021] [Revised: 01/26/2022] [Accepted: 11/10/2021] [Indexed: 12/13/2022] Open
Abstract
Genome-wide association studies rely on the statistical inference of untyped variants, called imputation, to increase the coverage of genotyping arrays. However, the results are often suboptimal in populations underrepresented in existing reference panels and array designs, since the selected single nucleotide polymorphisms (SNPs) may fail to capture population-specific haplotype structures, hence the full extent of common genetic variation. Here, we propose to sequence the full genomes of a small subset of an underrepresented study cohort to inform the selection of population-specific add-on tag SNPs and to generate an internal population-specific imputation reference panel, such that the remaining array-genotyped cohort could be more accurately imputed. Using a Tanzania-based cohort as a proof-of-concept, we demonstrate the validity of our approach by showing improvements in imputation accuracy after the addition of our designed add-on tags to the base H3Africa array.
Genome-wide association studies, which study the association between genetic variants and various phenotypes, typically rely on genotyping arrays. Only a small proportion of genetic variants within the genome are typed on genotyping arrays. Untyped variants are statistically inferred through a process known as genotype imputation, where correlations between variants (haplotypes) observed in external reference panels are leveraged to infer untyped variants in the study population. However, for study populations that are underrepresented in existing reference panels, the quality of imputation is often suboptimal. This is because typed variants incorporated on existing genotyping arrays can be unsuitable for the study population, and haplotype structures can be different between the reference and the study population. Here, we illustrate an approach to select a custom set of population-specific typed variants to improve genotype imputation in such underrepresented populations.
Affiliation(s)
- Zhi Ming Xu
- School of Life Sciences, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Sina Rüeger
- School of Life Sciences, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Michaela Zwyer
- Swiss Tropical and Public Health Institute, Basel, Switzerland
- University of Basel, Basel, Switzerland
- Daniela Brites
- Swiss Tropical and Public Health Institute, Basel, Switzerland
- University of Basel, Basel, Switzerland
- Hellen Hiza
- Swiss Tropical and Public Health Institute, Basel, Switzerland
- University of Basel, Basel, Switzerland
- Ifakara Health Institute, Dar es Salaam, Tanzania
- Miriam Reinhard
- Swiss Tropical and Public Health Institute, Basel, Switzerland
- University of Basel, Basel, Switzerland
- Liliana Rutaihwa
- Swiss Tropical and Public Health Institute, Basel, Switzerland
- University of Basel, Basel, Switzerland
- Sonia Borrell
- Swiss Tropical and Public Health Institute, Basel, Switzerland
- University of Basel, Basel, Switzerland
- Thomas Maroa
- Ifakara Health Institute, Dar es Salaam, Tanzania
- Jerry Hella
- Ifakara Health Institute, Dar es Salaam, Tanzania
- Klaus Reither
- Swiss Tropical and Public Health Institute, Basel, Switzerland
- University of Basel, Basel, Switzerland
- Damien Portevin
- Swiss Tropical and Public Health Institute, Basel, Switzerland
- University of Basel, Basel, Switzerland
- Sebastien Gagneux
- Swiss Tropical and Public Health Institute, Basel, Switzerland
- University of Basel, Basel, Switzerland
- Jacques Fellay
- School of Life Sciences, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Precision Medicine Unit, Lausanne University Hospital and University of Lausanne, Lausanne, Switzerland
4
Jain A, Bhoyar RC, Pandhare K, Mishra A, Sharma D, Imran M, Senthivel V, Divakar MK, Rophina M, Jolly B, Batra A, Sharma S, Siwach S, Jadhao AG, Palande N, Jha GN, Ashrafi N, Mishra PK, A. K. V, Jain S, Dash D, Kumar NS, Vanlallawma A, Sarma R, Chhakchhuak L, Kalyanaraman S, Mahadevan R, Kandasamy S, B. M. P, Rajagopal RE, J. ER, P. ND, Bajaj A, Gupta V, Mathew S, Goswami S, Mangla M, Prakash S, Joshi K, S. S, Gajjar D, Soraisham R, Yadav R, Devi YS, Gupta A, Mukerji M, Ramalingam S, B. K. B, Scaria V, Sivasubbu S. IndiGenomes: a comprehensive resource of genetic variants from over 1000 Indian genomes. Nucleic Acids Res 2021; 49:D1225-D1232. [PMID: 33095885 PMCID: PMC7778947 DOI: 10.1093/nar/gkaa923] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 08/10/2020] [Revised: 10/01/2020] [Accepted: 10/22/2020] [Indexed: 12/15/2022] Open
Abstract
With the advent of next-generation sequencing, large-scale initiatives for mining whole genomes and exomes have been employed to better understand global or population-level genetic architecture. India encompasses more than 17% of the world population with extensive genetic diversity, but is under-represented in the global sequencing datasets. This gave us the impetus to perform and analyze the whole genome sequencing of 1029 healthy Indian individuals under the pilot phase of the 'IndiGen' program. We generated a compendium of 55,898,122 single allelic genetic variants from geographically distinct Indian genomes and calculated the allele frequency, allele count, and allele number, along with the number of heterozygous or homozygous individuals. In the present study, these variants were systematically annotated using publicly available population databases and can be accessed through a browsable online database named 'IndiGenomes' (http://clingen.igib.res.in/indigen/). The IndiGenomes database will help clinicians and researchers in exploring the genetic component underlying medical conditions. To date, this is the most comprehensive genetic variant resource for the Indian population and is made freely available for academic utility. The resource has also been accessed extensively by the worldwide community since its launch.
Affiliation(s)
- Abhinav Jain
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh 201002, India
- Rahul C Bhoyar
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
- Kavita Pandhare
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
- Anushree Mishra
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
- Disha Sharma
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
- Mohamed Imran
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh 201002, India
- Vigneshwar Senthivel
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh 201002, India
- Mohit Kumar Divakar
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh 201002, India
- Mercy Rophina
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh 201002, India
- Bani Jolly
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh 201002, India
- Arushi Batra
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh 201002, India
- Sumit Sharma
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
- Sanjay Siwach
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
- Arun G Jadhao
- Department of Zoology, RTM Nagpur University, Nagpur, Maharashtra 440033, India
- Nikhil V Palande
- Department of Zoology, Shri Mathuradas Mohota College of Science, Nagpur, Maharashtra 440009, India
- Ganga Nath Jha
- Department of Anthropology, Vinoba Bhave University, Hazaribag, Jharkhand 825301, India
- Nishat Ashrafi
- Department of Anthropology, Vinoba Bhave University, Hazaribag, Jharkhand 825301, India
- Prashant Kumar Mishra
- Department of Biotechnology, Vinoba Bhave University, Hazaribag, Jharkhand 825301, India
- Vidhya A. K.
- Department of Biochemistry, Dr. Kongu Science and Art College, Erode, Tamil Nadu 638107, India
- Suman Jain
- Thalassemia and Sickle cell Society, Hyderabad, Telangana 500052, India
- Debasis Dash
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh 201002, India
- Andrew Vanlallawma
- Department of Biotechnology, Mizoram University, Aizawl, Mizoram 796004, India
- Ranjan Jyoti Sarma
- Department of Biotechnology, Mizoram University, Aizawl, Mizoram 796004, India
- Radha Mahadevan
- TVMC, Tirunelveli Medical College, Tirunelveli, Tamil Nadu 627011, India
- Sunitha Kandasamy
- TVMC, Tirunelveli Medical College, Tirunelveli, Tamil Nadu 627011, India
- Pabitha B. M.
- TVMC, Tirunelveli Medical College, Tirunelveli, Tamil Nadu 627011, India
- Ezhil Ramya J.
- TVMC, Tirunelveli Medical College, Tirunelveli, Tamil Nadu 627011, India
- Nirmala Devi P.
- TVMC, Tirunelveli Medical College, Tirunelveli, Tamil Nadu 627011, India
- Anjali Bajaj
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh 201002, India
- Vishu Gupta
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh 201002, India
- Samatha Mathew
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh 201002, India
- Sangam Goswami
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh 201002, India
- Mohit Mangla
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh 201002, India
- Savinitha Prakash
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
- Kandarp Joshi
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
- Sreedevi S.
- Department of Microbiology, St. Pious X Degree & PG College for Women, Hyderabad, Telangana 500076, India
- Devarshi Gajjar
- Department of Microbiology, The Maharaja Sayajirao University of Baroda, Vadodara, Gujarat 390002, India
- Ronibala Soraisham
- Department of Dermatology, Venereology and Leprology, Regional Institute of Medical Sciences, Imphal, Manipur 795004, India
- Rohit Yadav
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh 201002, India
- Yumnam Silla Devi
- CSIR-North East Institute of Science and Technology, Jorhat, Assam 785006, India
- Aayush Gupta
- Department of Dermatology, Dr. D.Y. Patil Medical College, Pune, Maharashtra 411018, India
- Mitali Mukerji
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh 201002, India
- Sivaprakash Ramalingam
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh 201002, India
- Binukumar B. K.
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh 201002, India
- Vinod Scaria
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh 201002, India
- Sridhar Sivasubbu
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh 201002, India
5
Kim HI, Ye B, Gosalia N, Köroğlu Ç, Hanson RL, Hsueh WC, Knowler WC, Baier LJ, Bogardus C, Shuldiner AR, Van Hout CV. Characterization of Exome Variants and Their Metabolic Impact in 6,716 American Indians from the Southwest US. Am J Hum Genet 2020; 107:251-264. [PMID: 32640185 DOI: 10.1016/j.ajhg.2020.06.009] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 02/10/2020] [Accepted: 06/10/2020] [Indexed: 12/21/2022] Open
Abstract
Applying exome sequencing to populations with unique genetic architecture has the potential to reveal novel genes and variants associated with traits and diseases. We sequenced and analyzed the exomes of 6,716 individuals from a Southwestern American Indian (SWAI) population with well-characterized metabolic traits. We found that the SWAI population has distinct allelic architecture compared to populations of European and East Asian ancestry, and there were many predicted loss-of-function (pLOF) and nonsynonymous variants that were highly enriched or private in the SWAI population. We used pLOF and nonsynonymous variants in the SWAI population to evaluate gene-burden associations of candidate genes from European genome-wide association studies (GWASs) for type 2 diabetes, body mass index, and four major plasma lipids. We found 19 significant gene-burden associations for 11 genes, providing additional evidence for prioritizing candidate effector genes of GWAS signals. Interestingly, these associations were mainly driven by pLOF and nonsynonymous variants that are unique or highly enriched in the SWAI population. Particularly, we found four pLOF or nonsynonymous variants in APOB, APOE, PCSK9, and TM6SF2 that are private or enriched in the SWAI population and associated with low-density lipoprotein (LDL) cholesterol levels. Their large estimated effects on LDL cholesterol levels suggest strong impacts on protein function and potential clinical implications of these variants in cardiovascular health. In summary, our study illustrates the utility and potential of exome sequencing in genetically unique populations, such as the SWAI population, to prioritize candidate effector genes within GWAS loci and to find additional variants in known disease genes with potential clinical impact.
Affiliation(s)
- Cristopher V Van Hout
- Regeneron Genetics Center, Regeneron Pharmaceuticals Inc., Tarrytown, NY 10591, USA.
6
Gordovez FJA, McMahon FJ. The genetics of bipolar disorder. Mol Psychiatry 2020; 25:544-559. [PMID: 31907381 DOI: 10.1038/s41380-019-0634-7] [Citation(s) in RCA: 124] [Impact Index Per Article: 31.0] [Received: 04/29/2019] [Revised: 11/22/2019] [Accepted: 12/11/2019] [Indexed: 12/11/2022]
Abstract
Bipolar disorder (BD) is one of the most heritable mental illnesses, but the elucidation of its genetic basis has proven to be a very challenging endeavor. Genome-Wide Association Studies (GWAS) have transformed our understanding of BD, providing the first reproducible evidence of specific genetic markers and a highly polygenic architecture that overlaps with that of schizophrenia, major depression, and other disorders. Individual GWAS markers appear to confer little risk, but common variants together account for about 25% of the heritability of BD. A few higher-risk associations have also been identified, such as a rare copy number variant on chromosome 16p11.2. Large-scale next-generation sequencing studies are actively searching for other alleles that confer substantial risk. As our understanding of the genetics of BD improves, there is growing optimism that some clear biological pathways will emerge, providing a basis for future studies aimed at molecular diagnosis and novel therapeutics.
Affiliation(s)
- Francis James A Gordovez
- Human Genetics Branch, National Institute of Mental Health Intramural Research Program, Department of Health and Human Services, National Institutes of Health, Bethesda, MD, USA
- College of Medicine, University of the Philippines Manila, 1000, Ermita, Manila, Philippines
- Francis J McMahon
- Human Genetics Branch, National Institute of Mental Health Intramural Research Program, Department of Health and Human Services, National Institutes of Health, Bethesda, MD, USA
7
Abney M, ElSherbiny A. Kinpute: using identity by descent to improve genotype imputation. Bioinformatics 2019; 35:4321-4326. [PMID: 30918937 PMCID: PMC6821425 DOI: 10.1093/bioinformatics/btz221] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 08/23/2018] [Revised: 02/21/2019] [Accepted: 03/26/2019] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION: Genotype imputation, though generally accurate, often results in many genotypes being poorly imputed, particularly in studies where the individuals are not well represented by standard reference panels. When individuals in the study share regions of the genome identical by descent (IBD), it is possible to use this information in combination with a study-specific reference panel (SSRP) to improve the imputation results. Kinpute uses IBD information (due to recent, familial relatedness or distant, unknown ancestors) in conjunction with the output from linkage disequilibrium (LD) based imputation methods to compute more accurate genotype probabilities. Kinpute uses a novel method for IBD imputation, which works even in the absence of a pedigree, and results in substantially improved imputation quality.
RESULTS: Given initial estimates of average IBD between subjects in the study sample, Kinpute uses a novel algorithm to select an optimal set of individuals to sequence and use as an SSRP. Kinpute is designed to use as input both this SSRP and the genotype probabilities output from other LD-based imputation software, and uses a new method to combine the LD-imputed genotype probabilities with IBD configurations to substantially improve imputation. We tested Kinpute on a human population isolate where 98 individuals have been sequenced. In half of this sample, whose sequence data was masked, we used Impute2 to perform LD-based imputation and Kinpute was used to obtain higher accuracy genotype probabilities. Measures of imputation accuracy improved significantly, particularly for those genotypes that Impute2 imputed with low certainty.
AVAILABILITY AND IMPLEMENTATION: Kinpute is an open-source and freely available C++ software package that can be downloaded from.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Mark Abney
- Department of Human Genetics, University of Chicago, Chicago, IL, USA
- Aisha ElSherbiny
- Department of Human Genetics, University of Chicago, Chicago, IL, USA
8
Yoo SK, Kim CU, Kim HL, Kim S, Shin JY, Kim N, Yang JSW, Lo KW, Cho B, Matsuda F, Schuster SC, Kim C, Kim JI, Seo JS. NARD: whole-genome reference panel of 1779 Northeast Asians improves imputation accuracy of rare and low-frequency variants. Genome Med 2019; 11:64. [PMID: 31640730 PMCID: PMC6805399 DOI: 10.1186/s13073-019-0677-z] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 05/08/2019] [Accepted: 10/11/2019] [Indexed: 12/30/2022] Open
Abstract
Here, we present the Northeast Asian Reference Database (NARD), including whole-genome sequencing data of 1779 individuals from Korea, Mongolia, Japan, China, and Hong Kong. NARD provides the genetic diversity of Korean (n = 850) and Mongolian (n = 384) ancestries that were not present in the 1000 Genomes Project Phase 3 (1KGP3). We combined and re-phased the genotypes from NARD and 1KGP3 to construct a union set of haplotypes. This approach established a robust imputation reference panel for Northeast Asians, which yields the greatest imputation accuracy of rare and low-frequency variants compared with the existing panels. NARD imputation panel is available at https://nard.macrogen.com/ .
Affiliation(s)
- Seong-Keun Yoo
- Precision Medicine Center, Seoul National University Bundang Hospital, 172 Dolma-ro, Seongnam, Bundang-gu, Gyeonggi-do, 13605, Republic of Korea
- Precision Medicine Institute, Macrogen Inc., Seongnam, Republic of Korea
| | - Chang-Uk Kim
- Precision Medicine Institute, Macrogen Inc., Seongnam, Republic of Korea
- Department of Biomedical Sciences, Seoul National University Graduate School, Seoul, Republic of Korea
| | - Hie Lim Kim
- The Asian School of the Environment, Nanyang Technological University, Singapore, Singapore
- Singapore Centre for Environmental Life Sciences Engineering, Nanyang Technological University, Singapore, Singapore
| | - Sungjae Kim
- Precision Medicine Institute, Macrogen Inc., Seongnam, Republic of Korea
- Department of Biomedical Sciences, Seoul National University Graduate School, Seoul, Republic of Korea
| | - Jong-Yeon Shin
- Precision Medicine Institute, Macrogen Inc., Seongnam, Republic of Korea
| | - Namcheol Kim
- Precision Medicine Institute, Macrogen Inc., Seongnam, Republic of Korea
| | | | - Kwok-Wai Lo
- Department of Anatomical & Cellular Pathology and State Key Laboratory of Translational Oncology, The Chinese University of Hong Kong, Hong Kong, China
| | - Belong Cho
- Department of Family Medicine, Seoul National University Hospital, Seoul, Republic of Korea
| | - Fumihiko Matsuda
- Center for Genomic Medicine, Kyoto University Graduate School of Medicine, Kyoto, Japan
| | - Stephan C Schuster
- Singapore Centre for Environmental Life Sciences Engineering, Nanyang Technological University, Singapore, Singapore
- School of Biological Science, Nanyang Technological University, Singapore, Singapore
| | - Changhoon Kim
- Precision Medicine Institute, Macrogen Inc., Seongnam, Republic of Korea
| | - Jong-Il Kim
- Department of Biomedical Sciences, Seoul National University Graduate School, Seoul, Republic of Korea
- Genomic Medicine Institute, Medical Research Center, Seoul National University, Seoul, Republic of Korea
| | - Jeong-Sun Seo
- Precision Medicine Center, Seoul National University Bundang Hospital, 172 Dolma-ro, Seongnam, Bundang-gu, Gyeonggi-do, 13605, Republic of Korea
- Precision Medicine Institute, Macrogen Inc., Seongnam, Republic of Korea
- Department of Biomedical Sciences, Seoul National University Graduate School, Seoul, Republic of Korea
- Genomic Medicine Institute, Medical Research Center, Seoul National University, Seoul, Republic of Korea
- Gong-Wu Genomic Medicine Institute, Seoul National University Bundang Hospital, Seongnam, Republic of Korea
| |
|
9
|
Genetic pleiotropy between mood disorders, metabolic, and endocrine traits in a multigenerational pedigree. Transl Psychiatry 2018; 8:218. [PMID: 30315151 PMCID: PMC6185949 DOI: 10.1038/s41398-018-0226-3] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 01/24/2018] [Revised: 05/10/2018] [Accepted: 07/14/2018] [Indexed: 12/15/2022] Open
Abstract
Bipolar disorder (BD) is a mental disorder characterized by alternating periods of depression and mania. Individuals with BD have higher levels of early mortality than the general population, and a substantial proportion of this is due to increased risk for comorbid diseases. To identify the molecular events that underlie BD and related medical comorbidities, we generated imputed whole-genome sequence data using a population-specific reference panel for an extended multigenerational Old Order Amish pedigree (n = 394), segregating BD and related disorders. First, we investigated all putative disease-causing variants at known Mendelian disease loci present in this pedigree. Second, we performed genomic profiling using polygenic risk scores (PRS) to establish each individual's risk for several complex diseases. We identified a set of Mendelian variants that co-occur in individuals with BD more frequently than their unaffected family members, including the R3527Q mutation in APOB associated with hypercholesterolemia. Using PRS, we demonstrated that BD individuals from this pedigree were enriched for the same common risk alleles for BD as the general population (β = 0.416, p = 6 × 10⁻⁴). Furthermore, we find evidence for a common genetic etiology between BD risk and polygenic risk for clinical autoimmune thyroid disease (p = 1 × 10⁻⁴), diabetes (p = 1 × 10⁻³), and lipid traits such as triglyceride levels (p = 3 × 10⁻⁴) in the pedigree. We identify genomic regions that contribute to the differences between BD individuals and unaffected family members by calculating local genetic risk for independent LD blocks. Our findings provide evidence for the extensive genetic pleiotropy that can drive epidemiological findings of comorbidities between diseases and other complex traits.
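For readers unfamiliar with polygenic risk scores, the following Python sketch shows the common weighted-sum form: each individual's score is the sum of per-variant effect weights multiplied by that individual's risk-allele dosage. The abstract does not describe the exact scoring procedure used in the study, so this form is an assumption, and the variant IDs, weights, and dosages below are hypothetical.

```python
def polygenic_risk_score(dosages, weights):
    """Weighted sum of risk-allele dosages for one individual.

    dosages: dict mapping variant ID -> risk-allele dosage (0.0 to 2.0).
    weights: dict mapping variant ID -> effect-size weight.
    Variants without a weight are ignored.
    """
    return sum(weights[v] * d for v, d in dosages.items() if v in weights)


# Hypothetical effect weights for three variants.
weights = {"var_a": 0.1, "var_b": 0.3, "var_c": -0.2}

# Hypothetical dosages; "var_x" has no weight and is skipped.
individual = {"var_a": 2.0, "var_b": 1.0, "var_c": 0.0, "var_x": 2.0}

score = polygenic_risk_score(individual, weights)
```

Because imputed data supply fractional dosages rather than hard genotype calls, the function accepts any value between 0 and 2 for each variant.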
|
10
|
High-depth whole genome sequencing of an Ashkenazi Jewish reference panel: enhancing sensitivity, accuracy, and imputation. Hum Genet 2018; 137:343-355. [PMID: 29705978 DOI: 10.1007/s00439-018-1886-z] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 01/06/2018] [Accepted: 04/21/2018] [Indexed: 12/31/2022]
Abstract
While increasingly large reference panels for genome-wide imputation have been recently made available, the degree to which imputation accuracy can be enhanced by population-specific reference panels remains an open question. Here, we sequenced at full-depth (≥ 30×), across two platforms (Illumina X Ten and Complete Genomics, Inc.), a moderately large (n = 738) cohort of samples drawn from the Ashkenazi Jewish population. We developed a series of quality control steps to optimize sensitivity, specificity, and comprehensiveness of variant calls in the reference panel, and then tested the accuracy of imputation against target cohorts drawn from the same population. Quality control (QC) thresholds for the Illumina X Ten platform were identified that permitted highly accurate calling of single nucleotide variants across 94% of the genome. QC procedures also identified numerous regions that are poorly mapped using current reference or alternate assemblies. After stringent QC, the population-specific reference panel produced more accurate and comprehensive imputation results relative to publicly available, large cosmopolitan reference panels, especially in the range of rare variants that may be most critical to further progress in mapping of complex phenotypes. The population-specific reference panel also permitted enhanced filtering of clinically irrelevant variants from personal genomes.
|