1. Gudmundsson S, Singer‐Berk M, Watts NA, Phu W, Goodrich JK, Solomonson M, Rehm HL, MacArthur DG, O'Donnell‐Luria A. Variant interpretation using population databases: Lessons from gnomAD. Hum Mutat 2022; 43:1012-1030. [PMID: 34859531 PMCID: PMC9160216 DOI: 10.1002/humu.24309] [Citation(s) in RCA: 285] [Impact Index Per Article: 95.0] [Received: 06/18/2021] [Revised: 11/02/2021] [Accepted: 11/28/2021] [Indexed: 01/22/2023]
Abstract
Reference population databases are an essential tool in variant and gene interpretation. Their use guides the identification of pathogenic variants amidst the sea of benign variation present in every human genome, and supports the discovery of new disease-gene relationships. The Genome Aggregation Database (gnomAD) is currently the largest and most widely used publicly available collection of population variation from harmonized sequencing data. The data is available through the online gnomAD browser (https://gnomad.broadinstitute.org/) that enables rapid and intuitive variant analysis. This review provides guidance on the content of the gnomAD browser, and its usage for variant and gene interpretation. We introduce key features including allele frequency, per-base expression levels, constraint scores, and variant co-occurrence, alongside guidance on how to use these in analysis, with a focus on the interpretation of candidate variants and novel genes in rare disease.
Review | 3 | 285
2. Li H, Borinskaya S, Yoshimura K, Kal’ina N, Marusin A, Stepanov VA, Qin Z, Khaliq S, Lee MY, Yang Y, Mohyuddin A, Gurwitz D, Mehdi SQ, Rogaev E, Jin L, Yankovsky NK, Kidd JR, Kidd KK. Refined geographic distribution of the oriental ALDH2*504Lys (nee 487Lys) variant. Ann Hum Genet 2009; 73:335-45. [PMID: 19456322 PMCID: PMC2846302 DOI: 10.1111/j.1469-1809.2009.00517.x] [Citation(s) in RCA: 215] [Impact Index Per Article: 13.4] [Indexed: 11/27/2022]
Abstract
Mitochondrial aldehyde dehydrogenase (ALDH2) is one of the most important enzymes in human alcohol metabolism. The oriental ALDH2*504Lys variant functions as a dominant negative, greatly reducing activity in heterozygotes and abolishing activity in homozygotes. This allele is associated with serious disorders such as alcohol liver disease, late onset Alzheimer disease, colorectal cancer, and esophageal cancer, and is best known for protection against alcoholism. Many hundreds of papers in various languages have been published on this variant, providing allele frequency data for many different populations. To develop a highly refined global geographic distribution of ALDH2*504Lys, we have collected new data on 4,091 individuals from 86 population samples and assembled published data on a total of 80,691 individuals from 366 population samples. The allele is essentially absent in all parts of the world except East Asia. The ALDH2*504Lys allele has its highest frequency in Southeast China, and occurs in most areas of China, Japan, Korea, Mongolia, and Indochina with frequencies gradually declining radially from Southeast China. As the indigenous populations in South China have much lower frequencies than the southern Han migrants from Central China, we conclude that ALDH2*504Lys was carried by Han Chinese as they spread throughout East Asia. Esophageal cancer, with its highest incidence in East Asia, may be associated with ALDH2*504Lys because of a toxic effect of increased acetaldehyde in the tissue where ingested ethanol has its highest concentration. While the distributions of esophageal cancer and ALDH2*504Lys do not precisely correlate, that does not disprove the hypothesis. In general the study of fine scale geographic distributions of ALDH2*504Lys and diseases may help in understanding the multiple relationships among genes, diseases, environments, and cultures.
Research Support, N.I.H., Extramural | 16 | 215
3. Wassif CA, Cross JL, Iben J, Sanchez-Pulido L, Cougnoux A, Platt FM, Ory DS, Ponting CP, Bailey-Wilson JE, Biesecker LG, Porter FD. High incidence of unrecognized visceral/neurological late-onset Niemann-Pick disease, type C1, predicted by analysis of massively parallel sequencing data sets. Genet Med 2016; 18:41-8. [PMID: 25764212 PMCID: PMC4486368 DOI: 10.1038/gim.2015.25] [Citation(s) in RCA: 150] [Impact Index Per Article: 16.7] [Received: 09/27/2014] [Accepted: 01/21/2015] [Indexed: 11/20/2022]
Abstract
PURPOSE Niemann-Pick disease type C (NPC) is a recessive, neurodegenerative, lysosomal storage disease caused by mutations in either NPC1 or NPC2. The diagnosis is difficult and frequently delayed. Ascertainment is likely incomplete because of both these factors and because the full phenotypic spectrum may not have been fully delineated. Given the recent development of a blood-based diagnostic test and the development of potential therapies, understanding the incidence of NPC and defining at-risk patient populations are important. METHOD We evaluated data from four large, massively parallel exome sequencing data sets. Variant sequences were identified and classified as pathogenic or nonpathogenic based on a combination of literature review and bioinformatic analysis. This methodology provided an unbiased approach to determining the allele frequency. RESULTS Our data suggest an incidence rate for NPC1 and NPC2 of 1/92,104 and 1/2,858,998, respectively. Evaluation of common NPC1 variants, however, suggests that there may be a late-onset NPC1 phenotype with a markedly higher incidence, on the order of 1/19,000-1/36,000. CONCLUSION We determined a combined incidence of classical NPC of 1/89,229, or 1.12 affected patients per 100,000 conceptions, but predict incomplete ascertainment of a late-onset phenotype of NPC1. This finding strongly supports the need for increased screening of potential patients.
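The combined incidence quoted in the conclusion can be reproduced by adding the two gene-specific rates. The following is a minimal Python sketch of that arithmetic, assuming the NPC1 and NPC2 rates are simply additive; the function name `combined_incidence` is illustrative and does not come from the paper.

```python
from fractions import Fraction


def combined_incidence(*one_in_n):
    """Add incidence rates given as '1 in N' denominators.

    Returns the combined rate and its reciprocal (the combined '1 in N').
    """
    rate = sum(Fraction(1, n) for n in one_in_n)
    return rate, 1 / rate


# Values quoted in the abstract: NPC1 at 1/92,104 and NPC2 at 1/2,858,998.
rate, one_in = combined_incidence(92_104, 2_858_998)
print(f"combined incidence: about 1 in {float(one_in):,.0f}")
print(f"per 100,000 conceptions: {float(rate) * 100_000:.2f}")
```

Exact fractions are used so that rounding happens only when the result is printed.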
Research Support, N.I.H., Intramural | 9 | 150
4. Wang W, Zhang W, Zhang J, He J, Zhu F. Distribution of HLA allele frequencies in 82 Chinese individuals with coronavirus disease-2019 (COVID-19). HLA 2020; 96:194-196. [PMID: 32424945 PMCID: PMC7276866 DOI: 10.1111/tan.13941] [Citation(s) in RCA: 144] [Impact Index Per Article: 28.8] [Received: 04/29/2020] [Revised: 05/12/2020] [Accepted: 05/15/2020] [Indexed: 01/04/2023]
Abstract
COVID‐19 is a respiratory disease caused by a novel coronavirus and is currently a global pandemic. HLA variation is associated with COVID‐19 because HLA plays a pivotal role in the immune response to pathogens. Here, 82 individuals with COVID‐19 were genotyped for HLA‐A, ‐B, ‐C, ‐DRB1, ‐DRB3/4/5, ‐DQA1, ‐DQB1, ‐DPA1, and ‐DPB1 loci using next‐generation sequencing (NGS). Frequencies of the HLA‐C*07:29, C*08:01G, B*15:27, B*40:06, DRB1*04:06, and DPB1*36:01 alleles were higher, while the frequencies of the DRB1*12:02 and DPB1*04:01 alleles were lower in COVID‐19 patients than in the control population, with uncorrected statistical significance. Only HLA‐C*07:29 and B*15:27 were significant when the corrected P‐value was considered. These data suggested that some HLA alleles may be associated with the occurrence of COVID‐19.
Research Support, Non-U.S. Gov't | 5 | 144
5. Ikeda N, Kojima H, Nishikawa M, Hayashi K, Futagami T, Tsujino T, Kusunoki Y, Fujii N, Suegami S, Miyazaki Y, Middleton D, Tanaka H, Saji H. Determination of HLA-A, -C, -B, -DRB1 allele and haplotype frequency in Japanese population based on family study. Tissue Antigens 2015; 85:252-9. [PMID: 25789826 PMCID: PMC5054903 DOI: 10.1111/tan.12536] [Citation(s) in RCA: 124] [Impact Index Per Article: 12.4] [Received: 08/18/2014] [Revised: 01/14/2015] [Accepted: 02/03/2015] [Indexed: 12/15/2022]
Abstract
The present study investigates human leucocyte antigen (HLA) allele and haplotype frequencies in the Japanese population. We carried out the frequency analysis in 5824 families living across the Japanese archipelago. The studied population was typed mainly for transplantation purposes, especially hematopoietic stem cell transplantation (HSCT). We determined HLA class I (A, B, and C) and HLA class II (DRB1) alleles using Luminex technology. The haplotypes were directly counted by segregation. A total of 44 HLA‐A, 29 HLA‐C, 75 HLA‐B, and 42 HLA‐DRB1 alleles were identified. In the A‐C‐B‐DRB1 and C‐B haplotypes, the pattern of linkage disequilibrium peculiar to the Japanese population was confirmed. Moreover, the haplotype frequencies based on the family study were compared with the frequencies estimated by maximum likelihood estimation (MLE), and equivalent results were obtained. The allele and haplotype frequencies obtained in this study could be useful for anthropology, transplantation therapy, and disease association studies.
Journal Article | 10 | 124
6. Chen B, Solis-Villa C, Hakenberg J, Qiao W, Srinivasan RR, Yasuda M, Balwani M, Doheny D, Peter I, Chen R, Desnick RJ. Acute Intermittent Porphyria: Predicted Pathogenicity of HMBS Variants Indicates Extremely Low Penetrance of the Autosomal Dominant Disease. Hum Mutat 2016; 37:1215-1222. [PMID: 27539938 DOI: 10.1002/humu.23067] [Citation(s) in RCA: 117] [Impact Index Per Article: 13.0] [Received: 06/10/2016] [Accepted: 08/12/2016] [Indexed: 12/17/2022]
Abstract
Acute intermittent porphyria results from hydroxymethylbilane synthase (HMBS) mutations that markedly decrease HMBS enzymatic activity. This dominant disease is diagnosed when heterozygotes have life-threatening acute attacks, while most heterozygotes remain asymptomatic and undiagnosed. Although >400 HMBS mutations have been reported, the prevalence of pathogenic HMBS mutations in genomic/exomic databases, and the actual disease penetrance are unknown. Thus, we interrogated genomic/exomic databases, identified non-synonymous variants (NSVs) and consensus splice-site variants (CSSVs) in various demographic/racial groups, and determined the NSV's pathogenicity by prediction algorithms and in vitro expression assays. Caucasians had the most: 58 NSVs and two CSSVs among ∼92,000 alleles, a 0.00575 combined allele frequency. In silico algorithms predicted 14 out of 58 NSVs as "likely-pathogenic." In vitro expression identified 10 out of 58 NSVs as likely-pathogenic (seven predicted in silico), which together with two CSSVs had a combined allele frequency of 0.00056. Notably, six presumably pathogenic mutations/NSVs in the Human Gene Mutation Database were benign. Compared with the recent prevalence estimate of symptomatic European heterozygotes (∼0.000005), the prevalence of likely-pathogenic HMBS mutations among Caucasians was >100 times more frequent. Thus, the estimated penetrance of acute attacks was ∼1% of heterozygotes with likely-pathogenic mutations, highlighting the importance of predisposing/protective genes and environmental modifiers that precipitate/prevent the attacks.
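The ">100 times" and "∼1%" figures follow from dividing the two frequencies quoted above. The sketch below is a rough illustration in Python, assuming (as the abstract's own comparison does) that the combined likelihood-pathogenic frequency can be set directly against the prevalence of symptomatic heterozygotes; the function names are hypothetical.

```python
def fold_excess(variant_frequency, symptomatic_prevalence):
    """How many times more frequent the variants are than symptomatic disease."""
    return variant_frequency / symptomatic_prevalence


def estimated_penetrance(variant_frequency, symptomatic_prevalence):
    """Fraction of variant carriers expected to be symptomatic (crude ratio)."""
    return symptomatic_prevalence / variant_frequency


# Figures quoted in the abstract.
LIKELY_PATHOGENIC_FREQUENCY = 0.00056   # combined frequency among Caucasians
SYMPTOMATIC_PREVALENCE = 0.000005       # symptomatic European heterozygotes

fold = fold_excess(LIKELY_PATHOGENIC_FREQUENCY, SYMPTOMATIC_PREVALENCE)
pen = estimated_penetrance(LIKELY_PATHOGENIC_FREQUENCY, SYMPTOMATIC_PREVALENCE)
print(f"variants are about {fold:.0f} times more frequent than symptomatic disease")
print(f"estimated penetrance: about {pen:.1%}")
```

This crude ratio ignores ascertainment and population structure, so it only shows where the order of magnitude comes from.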
Research Support, N.I.H., Extramural | 9 | 117
7. Ghosh R, Harrison SM, Rehm HL, Plon SE, Biesecker LG. Updated recommendation for the benign stand-alone ACMG/AMP criterion. Hum Mutat 2018; 39:1525-1530. [PMID: 30311383 PMCID: PMC6188666 DOI: 10.1002/humu.23642] [Citation(s) in RCA: 106] [Impact Index Per Article: 15.1] [Received: 05/04/2018] [Revised: 08/03/2018] [Accepted: 08/28/2018] [Indexed: 11/11/2022]
Abstract
The Clinical Genome Resource (ClinGen) Sequence Variant Interpretation Working Group set out to refine the American College of Medical Genetics and Genomics and the Association of Molecular Pathologists (ACMG/AMP) variant pathogenicity recommendations for stand-alone rule BA1 (a variant with minor allele frequency [MAF] > 0.05 is benign), by clarifying how it should be used and specifying a set of variants that should be exempted from this rule. We cross-referenced ClinVar and Exome Aggregation Consortium data to identify variants for which there was a plausible argument for pathogenicity and the variant exists in one or more population data sets at MAF > 0.05. We identified nine such variants that were present in these data sets that may not be benign. The ACMG/AMP criteria were applied to these variants that resulted in four pathogenic and five variants of uncertain significance. We have refined benign rule BA1 by clarifying terms used to describe its use, which databases we recommend using, and assumptions made about this rule. We also recognized an initial list of nine variants for which there was some evidence of pathogenicity even though the MAF was high for these variants. We specify processes whereby individuals can petition ClinGen for amendments to our variant-specific assertions and the criteria experts should use when setting a numerically lower threshold for BA1 for specific genes.
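The refined rule described above amounts to a frequency threshold plus an exemption list. The following Python sketch is one way to express that logic, not the working group's implementation: the function name, the variant identifiers, and the idea of passing the exemption list as a set are all assumptions, and the nine exempted variants are deliberately not reproduced here.

```python
BA1_DEFAULT_THRESHOLD = 0.05  # MAF cut-off stated in the abstract


def ba1_applies(max_population_maf, variant_id,
                exempt_variants=frozenset(),
                threshold=BA1_DEFAULT_THRESHOLD):
    """Return True if stand-alone benign rule BA1 would apply.

    max_population_maf: highest MAF seen in any population data set.
    exempt_variants: identifiers excluded from BA1 (placeholder set).
    threshold: may be lowered for specific genes by expert groups.
    """
    if variant_id in exempt_variants:
        return False
    return max_population_maf > threshold


# Hypothetical identifiers, for illustration only.
exemptions = frozenset({"hypothetical_variant_B"})
print(ba1_applies(0.08, "hypothetical_variant_A", exemptions))  # above threshold
print(ba1_applies(0.08, "hypothetical_variant_B", exemptions))  # exempted
print(ba1_applies(0.02, "hypothetical_variant_C", exemptions, threshold=0.01))
```

Keeping the threshold as a parameter mirrors the abstract's point that experts may set a numerically lower BA1 threshold for specific genes.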
Research Support, N.I.H., Extramural | 7 | 106
8. Scott SA, Liu B, Nazarenko I, Martis S, Kozlitina J, Yang Y, Ramirez C, Kasai Y, Hyatt T, Peter I, Desnick RJ. Frequency of the cholesteryl ester storage disease common LIPA E8SJM mutation (c.894G>A) in various racial and ethnic groups. Hepatology 2013; 58:958-65. [PMID: 23424026 PMCID: PMC3690149 DOI: 10.1002/hep.26327] [Citation(s) in RCA: 81] [Impact Index Per Article: 6.8] [Received: 10/02/2012] [Accepted: 02/06/2013] [Indexed: 12/11/2022]
Abstract
Cholesteryl ester storage disease (CESD) and Wolman disease are autosomal recessive later-onset and severe infantile disorders, respectively, which result from the deficient activity of lysosomal acid lipase (LAL). LAL is encoded by LIPA (10q23.31) and the most common mutation associated with CESD is an exon 8 splice junction mutation (c.894G>A; E8SJM), which expresses only ∼3%-5% of normally spliced LAL. However, the frequency of c.894G>A is unknown in most populations. To estimate the prevalence of CESD in different populations, the frequencies of the c.894G>A mutation were determined in 10,000 LIPA alleles from healthy African-American, Asian, Caucasian, Hispanic, and Ashkenazi Jewish individuals from the greater New York metropolitan area and 6,578 LIPA alleles from African-American, Caucasian, and Hispanic subjects enrolled in the Dallas Heart Study. The combined c.894G>A allele frequencies from the two cohorts ranged from 0.0005 (Asian) to 0.0017 (Caucasian and Hispanic), which translated to carrier frequencies of 1 in 1,000 to ∼1 in 300, respectively. No African-American heterozygotes were detected. Additionally, by surveying the available literature, c.894G>A was estimated to account for 60% (95% confidence interval [CI]: 51%-69%) of reported mutations among multiethnic CESD patients. Using this estimate, the predicted prevalence of CESD in the Caucasian and Hispanic populations is ∼0.8 per 100,000 (∼1 in 130,000; 95% CI: ∼1 in 90,000 to 1 in 170,000). CONCLUSION These data indicate that CESD may be underdiagnosed in the general Caucasian and Hispanic populations, which is important since clinical trials of enzyme replacement therapy for LAL deficiency are currently being developed. Moreover, future studies on CESD prevalence in African and Asian populations may require full-gene LIPA sequencing to determine heterozygote frequencies, since c.894G>A is not common in these racial groups.
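The carrier frequencies and the predicted prevalence above are consistent with a standard Hardy-Weinberg calculation. The Python sketch below reproduces that arithmetic under the assumption of Hardy-Weinberg proportions; the function names are illustrative, and the paper's exact method may differ in detail.

```python
def carrier_frequency(allele_freq):
    """Heterozygote frequency 2pq for a rare allele with frequency q."""
    return 2 * allele_freq * (1 - allele_freq)


def recessive_prevalence(common_allele_freq, share_of_disease_alleles):
    """Predicted prevalence of a recessive disease.

    The common mutation's frequency is scaled up by the share of all
    disease alleles it represents, then squared (q^2).
    """
    total_disease_allele_freq = common_allele_freq / share_of_disease_alleles
    return total_disease_allele_freq ** 2


# Allele frequencies and the 60% (51%-69%) share quoted in the abstract.
for q in (0.0005, 0.0017):
    print(f"q = {q}: carriers about 1 in {1 / carrier_frequency(q):,.0f}")

for share in (0.60, 0.51, 0.69):
    prev = recessive_prevalence(0.0017, share)
    print(f"share = {share:.0%}: {prev * 100_000:.2f} per 100,000 "
          f"(about 1 in {1 / prev:,.0f})")
```

The point estimate comes out near 1 in 125,000, slightly below the rounded "∼1 in 130,000" in the abstract, while the two interval ends land close to the quoted 1 in 90,000 and 1 in 170,000.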
research-article | 12 | 81
9. Ochola LI, Tetteh KKA, Stewart LB, Riitho V, Marsh K, Conway DJ. Allele frequency-based and polymorphism-versus-divergence indices of balancing selection in a new filtered set of polymorphic genes in Plasmodium falciparum. Mol Biol Evol 2010; 27:2344-51. [PMID: 20457586 PMCID: PMC2944029 DOI: 10.1093/molbev/msq119] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.8] [Indexed: 12/24/2022]
Abstract
Signatures of balancing selection operating on specific gene loci in endemic pathogens can identify candidate targets of naturally acquired immunity. In malaria parasites, several leading vaccine candidates convincingly show such signatures when subjected to several tests of neutrality, but the discovery of new targets affected by selection to a similar extent has been slow. A small minority of all genes are under such selection, as indicated by a recent study of 26 Plasmodium falciparum merozoite-stage genes that were not previously prioritized as vaccine candidates, of which only one (locus PF10_0348) showed a strong signature. Therefore, to focus discovery efforts on genes that are polymorphic, we scanned all available shotgun genome sequence data from laboratory lines of P. falciparum and chose six loci with more than five single nucleotide polymorphisms per kilobase (including PF10_0348) for in-depth frequency-based analyses in a Kenyan population (allele sample sizes >50 for each locus) and comparison of Hudson-Kreitman-Aguade (HKA) ratios of population diversity (π) to interspecific divergence (K) from the chimpanzee parasite Plasmodium reichenowi. Three of these (the msp3/6-like genes PF10_0348 and PF10_0355 and the surf(4.1) gene PFD1160w) showed exceptionally high positive values of Tajima's D and Fu and Li's F indices and have the highest HKA ratios, indicating that they are under balancing selection and should be prioritized for studies of their protein products as candidate targets of immunity. Combined with earlier results, there is now strong evidence that high HKA ratio (as well as the frequency-independent ratio of Watterson's θ/K) is predictive of high values of Tajima's D. Thus, the former offers value for use in genome-wide screening when numbers of genome sequences within a species are low or in combination with Tajima's D as a 2D test on large population genomic samples.
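For readers unfamiliar with the frequency-based index named above, the following Python sketch computes Tajima's D from the sample size, the number of segregating sites, and the mean pairwise diversity π, using the standard formula from Tajima (1989). The function name and the input values are illustrative and are not taken from this study.

```python
import math


def tajimas_d(n, segregating_sites, pi):
    """Tajima's D for n sequences, S segregating sites and pairwise diversity pi.

    pi and S are totals over the locus (not per-site values).
    """
    if n < 4 or segregating_sites < 1:
        raise ValueError("need at least 4 sequences and 1 segregating site")
    s = segregating_sites
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    theta_w = s / a1  # Watterson's estimator
    return (pi - theta_w) / math.sqrt(e1 * s + e2 * s * (s - 1))


# Made-up inputs: an excess of intermediate-frequency alleles pushes pi above
# Watterson's estimate and D above zero; an excess of rare alleles does the reverse.
print(f"{tajimas_d(n=50, segregating_sites=20, pi=8.0):+.2f}")
print(f"{tajimas_d(n=50, segregating_sites=20, pi=2.0):+.2f}")
```

Positive values, as reported for the three prioritized loci, are the pattern expected under balancing selection, although demographic history can produce them too.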
research-article | 15 | 57
10. Wu YH, Li JY, Wang C, Zhang LM, Qiao H. The ACE2 G8790A Polymorphism: Involvement in Type 2 Diabetes Mellitus Combined with Cerebral Stroke. J Clin Lab Anal 2016; 31. [PMID: 27500554 DOI: 10.1002/jcla.22033] [Citation(s) in RCA: 53] [Impact Index Per Article: 5.9] [Received: 04/20/2016] [Accepted: 07/01/2016] [Indexed: 01/22/2023]
Abstract
BACKGROUND We aimed to investigate the correlations between ACE2 polymorphisms and type 2 diabetes mellitus (T2DM) combined with cerebral stroke (CS). METHODS A total of 346 patients treated or hospitalized in our hospital were enrolled, including 181 cases without cerebrovascular complications (T2DM group) and 165 cases combined with CS (T2DM + CS group); 284 healthy individuals were selected as the control group. PCR-RFLP and ELISA were used to analyze ACE2 G8790A polymorphisms and serum ACE2 levels, respectively. RESULTS Significant differences were observed in the genotype/allele frequency of ACE2 G8790A between the T2DM + CS and control groups, and the T2DM and T2DM + CS groups, and in the genotype frequency of ACE2 G8790A between the T2DM and the control groups. The A allele may increase the risk of T2DM combined with CS. The AA genotype may also increase the risk of T2DM combined with CS (OR = 3.733, 95%CI = 2.069-6.738; OR = 3.597, 95%CI = 1.884-6.867). Serum ACE2 levels showed statistically significant differences among the groups. Systolic pressure and diastolic pressure were protective factors of T2DM combined with CS. CONCLUSION The ACE2 G8790A polymorphism in T2DM patients was correlated with CS, and the A allele might be a risk factor of T2DM combined with CS.
Journal Article | 9 | 53
11. Thomas SC. The estimation of genetic relationships using molecular markers and their efficiency in estimating heritability in natural populations. Philos Trans R Soc Lond B Biol Sci 2005; 360:1457-67. [PMID: 16048788 PMCID: PMC1569511 DOI: 10.1098/rstb.2005.1675] [Citation(s) in RCA: 50] [Impact Index Per Article: 2.5] [Indexed: 11/12/2022]
Abstract
Molecular marker data collected from natural populations allows information on genetic relationships to be established without referencing an exact pedigree. Numerous methods have been developed to exploit the marker data. These fall into two main categories: method of moment estimators and likelihood estimators. Method of moment estimators are essentially unbiased, but utilise weighting schemes that are only optimal if the analysed pair is unrelated. Thus, they differ in their efficiency at estimating parameters for different relationship categories. Likelihood estimators show smaller mean squared errors but are much more biased. Both types of estimator have been used in variance component analysis to estimate heritability. All marker-based heritability estimators require that adequate levels of the true relationship be present in the population of interest and that adequate amounts of informative marker data are available. I review the different approaches to relationship estimation, with particular attention to optimizing the use of this relationship information in subsequent variance component estimation.
Review | 20 | 50
12. McGaughran A, Laver R, Fraser C. Evolutionary Responses to Warming. Trends Ecol Evol 2021; 36:591-600. [PMID: 33726946 DOI: 10.1016/j.tree.2021.02.014] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Received: 12/06/2020] [Revised: 02/23/2021] [Accepted: 02/26/2021] [Indexed: 12/24/2022]
Abstract
Climate change is predicted to dramatically alter biological diversity and distributions, driving extirpations, extinctions, and extensive range shifts across the globe. Warming can also, however, lead to phenotypic or behavioural plasticity, as species adapt to new conditions. Recent genomic research indicates that some species are capable of rapid evolution as selection favours adaptive responses to environmental change and altered or novel niche spaces. New advances are providing mechanistic insights into how temperature might accelerate evolution in the Anthropocene. These discoveries highlight intriguing new research directions - such as using geothermal and polar systems combined with powerful genomic tools - that will help us to understand the processes underpinning adaptive evolution and better project how ecosystems will change in a warming world.
Review | 4 | 33
13. Makgahlela ML, Strandén I, Nielsen US, Sillanpää MJ, Mäntysaari EA. The estimation of genomic relationships using breedwise allele frequencies among animals in multibreed populations. J Dairy Sci 2013; 96:5364-75. [PMID: 23769355 DOI: 10.3168/jds.2012-6523] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.3] [Received: 12/24/2012] [Accepted: 04/24/2013] [Indexed: 01/07/2023]
Abstract
Different approaches of calculating genomic measures of relationship were explored and compared with pedigree relationships (A) within and across base breeds in a crossbreed population, using genotypes for 38,194 loci of 4,106 Nordic Red dairy cattle. Four genomic relationship matrices (G) were calculated using either observed allele frequencies (AF) across breeds or within-breed AF. The G matrices were compared separately when the AF were estimated in the observed and in the base population. Breedwise AF in the current and base population were estimated using linear regression models of individual genotypes on breed composition. Different G matrices were further used to predict direct estimated genomic values using a genomic BLUP model. Higher variability existed in the diagonal elements of G across breeds (standard deviation=0.06, on average) compared with A (0.01). The use of simple observed AF across base breeds to compute G increased coefficients for individuals in distantly related populations. Estimated breedwise AF reduced differences in coefficients similarly within and across populations. The variability of the current adjusted G matrix decreased from 0.055 to 0.035 when breedwise AF were estimated from the base breed population. The direct estimated genomic values and their validation reliabilities were, however, unaffected by AF used to compute G when estimated with a genomic BLUP model, due to inclusion of breed means in the model. In multibreed populations, G adjusted with breedwise AF from the founder population may provide more consistency among relationship coefficients between genotyped and ungenotyped individuals in an across-breed single-step evaluation.
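To make concrete how the choice of allele frequencies enters G, the sketch below builds a genomic relationship matrix in plain Python. It assumes a VanRaden-style construction (genotypes centered by 2p and scaled by 2Σp(1−p)), which the abstract does not confirm; the function names and the toy genotypes are hypothetical, and the paper's regression-based estimation of breedwise frequencies is not reproduced.

```python
def observed_allele_frequencies(genotypes):
    """Per-locus allele frequency from 0/1/2 allele counts (rows = animals)."""
    n_animals = len(genotypes)
    return [sum(column) / (2 * n_animals) for column in zip(*genotypes)]


def genomic_relationship_matrix(genotypes, allele_freqs):
    """G = Z Z' / (2 * sum(p * (1 - p))), with Z = genotypes - 2p per locus."""
    scale = 2 * sum(p * (1 - p) for p in allele_freqs)
    if scale == 0:
        raise ValueError("all loci are monomorphic; G is undefined")
    centered = [[g - 2 * p for g, p in zip(row, allele_freqs)]
                for row in genotypes]
    return [[sum(a * b for a, b in zip(row_i, row_j)) / scale
             for row_j in centered]
            for row_i in centered]


# Toy data: 3 animals genotyped at 4 loci (allele counts 0, 1 or 2).
genotypes = [[0, 1, 2, 1],
             [1, 1, 0, 2],
             [2, 0, 1, 1]]
p_observed = observed_allele_frequencies(genotypes)
G = genomic_relationship_matrix(genotypes, p_observed)
for row in G:
    print(["%.3f" % value for value in row])
```

Passing a different `allele_freqs` vector, for example frequencies estimated for a base population or for a single breed, shifts both the diagonal and off-diagonal coefficients, which is the effect the study compares.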
Research Support, Non-U.S. Gov't | 12 | 27
14. Shikov AE, Barbitoff YA, Glotov AS, Danilova MM, Tonyan ZN, Nasykhova YA, Mikhailova AA, Bespalova ON, Kalinin RS, Mirzorustamova AM, Kogan IY, Baranov VS, Chernov AN, Pavlovich DM, Azarenko SV, Fedyakov MA, Tsay VV, Eismont YA, Romanova OV, Hobotnikov DN, Vologzhanin DA, Mosenko SV, Ponomareva TA, Talts YA, Anisenkova AU, Lisovets DG, Sarana AM, Urazov SP, Scherbak SG, Glotov OS. Analysis of the Spectrum of ACE2 Variation Suggests a Possible Influence of Rare and Common Variants on Susceptibility to COVID-19 and Severity of Outcome. Front Genet 2020; 11:551220. [PMID: 33133145 PMCID: PMC7550667 DOI: 10.3389/fgene.2020.551220] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 04/14/2020] [Accepted: 08/28/2020] [Indexed: 01/08/2023]
Abstract
OBJECTIVES In March 2020, the World Health Organization declared that an infectious respiratory disease caused by a new severe acute respiratory syndrome coronavirus 2 [SARS-CoV-2, causing coronavirus disease 2019 (COVID-19)] became a pandemic. In our study, we have analyzed a large publicly available dataset, the Genome Aggregation Database (gnomAD), as well as a cohort of 37 Russian patients with COVID-19 to assess the influence of different classes of genetic variants in the angiotensin-converting enzyme-2 (ACE2) gene on the susceptibility to COVID-19 and the severity of disease outcome. RESULTS We demonstrate that the European populations slightly differ in alternative allele frequencies at the 2,754 variant sites in ACE2 identified in the gnomAD database. We find that the Southern European population has a lower frequency of missense variants and slightly higher frequency of regulatory variants. However, we found no statistical support for the significance of these differences. We also show that the Russian population is similar to other European populations when comparing the frequencies of the ACE2 variants. Evaluation of the effect of various classes of ACE2 variants on COVID-19 outcome in a cohort of Russian patients showed that common missense and regulatory variants do not explain the differences in disease severity. At the same time, we find several rare ACE2 variants (including rs146598386, rs73195521, rs755766792, and others) that are likely to affect the outcome of COVID-19. Our results demonstrate that the spectrum of genetic variants in ACE2 may partially explain the differences in severity of the COVID-19 outcome.
research-article | 5 | 25
15. Makgahlela ML, Strandén I, Nielsen US, Sillanpää MJ, Mäntysaari EA. Using the unified relationship matrix adjusted by breed-wise allele frequencies in genomic evaluation of a multibreed population. J Dairy Sci 2013; 97:1117-27. [PMID: 24342683 DOI: 10.3168/jds.2013-7167] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.8] [Received: 06/21/2013] [Accepted: 10/16/2013] [Indexed: 12/22/2022]
Abstract
The observed low accuracy of genomic selection in multibreed and admixed populations results from insufficient linkage disequilibrium between markers and trait loci. Failure to remove variation due to the population structure may also hamper the prediction accuracy. We verified if accounting for breed origin of alleles in the calculation of genomic relationships would improve the prediction accuracy in an admixed population. Individual breed proportions derived from the pedigree were used to estimate breed-wise allele frequencies (AF). Breed-wise and across-breed AF were estimated from the currently genotyped population and also in the base population. Genomic relationship matrices (G) were subsequently calculated using across-breed (GAB) and breed-wise (GBW) AF estimated in the currently genotyped and also in the base population. Unified relationship matrices were derived by combining different G with pedigree relationships in the evaluation of genomic estimated breeding values (GEBV) for genotyped and ungenotyped animals. The validation reliabilities and inflation of GEBV were assessed by a linear regression of deregressed breeding value (deregressed proofs) on GEBV, weighted by the reliability of deregressed proofs. The regression coefficients (b1) from GAB ranged from 0.76 for milk to 0.90 for protein. Corresponding b1 terms from GBW ranged from 0.72 to 0.88. The validation reliabilities across 4 evaluations with different G were generally 36, 40, and 46% for milk, protein, and fat, respectively. Unexpectedly, validation reliabilities were generally similar across different evaluations, irrespective of AF used to compute G. Thus, although accounting for the population structure in GBW tends to simplify the blending of genomic- and pedigree-based relationships, it appeared to have little effect on the validation reliabilities.
Validation Study | 12 | 22
16. Zhou Y, Lauschke VM. Comprehensive overview of the pharmacogenetic diversity in Ashkenazi Jews. J Med Genet 2018; 55:617-627. [PMID: 29970487 DOI: 10.1136/jmedgenet-2018-105429] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 04/14/2018] [Revised: 05/28/2018] [Accepted: 06/10/2018] [Indexed: 12/12/2022]
Abstract
BACKGROUND Adverse drug reactions are a major concern in drug development and clinical therapy. Genetic polymorphisms in genes involved in drug metabolism and transport are major determinants of treatment efficacy and adverse reactions, and constitute important biomarkers for drug dosing, efficacy and safety. Importantly, human populations and subgroups differ substantially in their pharmacogenetic variability profiles, with important consequences for personalised medicine strategies and precision public health approaches. Despite their long migration history, Ashkenazi Jews constitute a rather isolated population with a unique genetic signature that is distinctly different from other populations. OBJECTIVE To provide a comprehensive overview of the pharmacogenetic profile in Ashkenazim. METHODS We analysed next-generation sequencing data from 5076 Ashkenazim individuals and used sequence data from 117 425 non-Jewish individuals as reference. RESULTS We derived frequencies of 164 alleles in 17 clinically relevant pharmacogenes and derived profiles of putative functional consequences, providing the most comprehensive data set of Jewish pharmacogenetic diversity published to date. Furthermore, we detected 127 variants with an aggregated frequency of 20.7% that were specifically found in Ashkenazim, of which 55 variants were putatively deleterious (aggregated frequency of 9.4%). CONCLUSION The revealed pattern of pharmacogenetic variability in Ashkenazi Jews is distinctly different from other populations and is expected to translate into unique functional consequences, especially for the metabolism of CYP2A6, CYP2C9, NAT2 and VKORC1 substrates. We anticipate that the presented data will serve as a powerful resource for the guidance of pharmacogenetic treatment decisions and the optimisation of population-specific genotyping strategies in the Ashkenazi diaspora.
|
Research Support, Non-U.S. Gov't |
7 |
22 |
17
|
Pimentel ECG, Edel C, Emmerling R, Götz KU. How imputation errors bias genomic predictions. J Dairy Sci 2015; 98:4131-8. [PMID: 25841966 DOI: 10.3168/jds.2014-9170] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Received: 11/28/2014] [Accepted: 02/20/2015] [Indexed: 12/19/2022]
Abstract
The objective of this study was to investigate in detail the biasing effects of imputation errors on genomic predictions. Direct genomic values (DGV) of 3,494 Brown Swiss selection candidates for 37 production and conformation traits were predicted using either their observed 50K genotypes or their 50K genotypes imputed from a mimicked 6K chip. Changes in DGV caused by imputation errors were shown to be systematic. The DGV of top animals were, on average, underestimated and those of bottom animals were, on average, overestimated when imputed genotypes were used instead of observed genotypes. This pattern might be explained by the fact that imputation algorithms will usually suggest the most frequent haplotype from the sample whenever a haplotype cannot be determined unambiguously. That was empirically shown to cause an advantage for the bottom animals and a disadvantage for the top animals.
|
Research Support, Non-U.S. Gov't |
10 |
21 |
18
|
Do MD, Le LGH, Nguyen VT, Dang TN, Nguyen NH, Vu HA, Mai TP. High-Resolution HLA Typing of HLA-A, -B, -C, -DRB1, and -DQB1 in Kinh Vietnamese by Using Next-Generation Sequencing. Front Genet 2020; 11:383. [PMID: 32425978 PMCID: PMC7204072 DOI: 10.3389/fgene.2020.00383] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 11/07/2019] [Accepted: 03/27/2020] [Indexed: 12/19/2022]
Abstract
Human leukocyte antigen (HLA) genotyping displays the particular characteristics of HLA alleles and haplotype frequencies in each population. Although it is considered the current gold standard for HLA typing, high-resolution sequence-based HLA typing is currently unavailable in Kinh Vietnamese populations. In this study, high-resolution sequence-based HLA typing (3-field) was performed using an amplicon-based next-generation sequencing platform to identify the HLA-A, -B, -C, -DRB1, and -DQB1 alleles of 101 unrelated healthy Kinh Vietnamese individuals from southern Vietnam. A total of 28 HLA-A, 41 HLA-B, 21 HLA-C, 26 HLA-DRB1, and 25 HLA-DQB1 alleles were identified. The most frequently occurring HLA alleles were A∗11:01:01, B∗15:02:01, C∗07:02:01, DRB1∗12:02:01, and DQB1∗03:01:01. Haplotype calculation showed that A∗29:01:01∼B∗07:05:01, DRB1∗12:02:01∼DQB1∗03:01:01, A∗29:01:01∼C∗15:05:02∼B∗07:05:01, A∗33:03:01∼B∗58:01:01∼DRB1∗03:01:01, and A∗29:01:01∼C∗15:05:02∼B∗07:05:01∼DRB1∗10:01:01∼DQB1∗05:01:01 were the most common haplotypes in the southern Kinh Vietnamese population. Allele distribution and haplotype analyses demonstrated that the Vietnamese population shares HLA features with South-East Asians but retains unique characteristics. Data from this study will be potentially applicable in medicine and anthropology.
|
Journal Article |
5 |
21 |
19
|
Eberhard HP, Schmidt AH, Mytilineos J, Fleischhauer K, Müller CR. Common and well-documented HLA alleles of German stem cell donors by haplotype frequency estimation. HLA 2019; 92:206-214. [PMID: 30117303 PMCID: PMC6175154 DOI: 10.1111/tan.13378] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 02/02/2018] [Revised: 06/14/2018] [Accepted: 08/13/2018] [Indexed: 01/11/2023]
Abstract
We present a catalog of common and well-documented (CWD) alleles of the German population for the six HLA loci A, B, C, DRB1, DQB1, and DPB1. This study is based on a sample of over 5 million volunteer adult hematopoietic stem cell donors from the 26 German donor centers. To establish the catalog, allele and haplotype frequencies were estimated with a validated implementation of the expectation-maximization algorithm. CWD criteria similar to existing CWD catalogs were applied in order to be able to put our findings into the context of relevant existing references. Overall, 2155 HLA-A, -B, -C, -DRB1, -DQB1, and -DPB1 alleles were identified as CWD in the German donor population representing about 20% of the HLA alleles at two-field resolution in the IPD-IMGT/HLA Database release v3.25.0 from July 2016 for these six loci. We found a substantial concordance of CWD alleles between the three catalogs and showed the contribution of the German donor population to the CWD alleles domain. In conclusion, the definition of CWD criteria that allow interoperability, scalability, and flexibility will be crucial for the development of a worldwide CWD catalog.
|
Research Support, Non-U.S. Gov't |
6 |
21 |
20
|
Takeshima SN, Miyasaka T, Matsumoto Y, Xue G, Diaz VDLB, Rogberg-Muñoz A, Giovambattista G, Ortiz M, Oltra J, Kanemaki M, Onuma M, Aida Y. Assessment of biodiversity in Chilean cattle using the distribution of major histocompatibility complex class II BoLA-DRB3 allele. Tissue Antigens 2014; 85:35-44. [PMID: 25430590 DOI: 10.1111/tan.12481] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Received: 05/26/2014] [Revised: 10/14/2014] [Accepted: 10/28/2014] [Indexed: 11/30/2022]
Abstract
Bovine leukocyte antigens (BoLAs) are used extensively as markers for bovine disease and immunological traits. In this study, we estimated BoLA-DRB3 allele frequencies using 888 cattle from 10 groups, including seven cattle breeds and three crossbreeds: 99 Red Angus, 100 Black Angus, 81 Chilean Wagyu, 49 Hereford, 95 Hereford × Angus, 71 Hereford × Jersey, 20 Hereford × Overo Colorado, 113 Holstein, 136 Overo Colorado, and 124 Overo Negro cattle. Forty-six BoLA-DRB3 alleles were identified, and each group had between 12 and 29 different BoLA-DRB3 alleles. Overo Negro had the highest number of alleles (29); this breed is considered in Chile to be an 'Old type' European Holstein Friesian descendant. By contrast, we detected 21 alleles in Holstein cattle, which are considered to be 'Present type' Holstein Friesian cattle. Chilean cattle groups and four Japanese breeds were compared by neighbor-joining trees and a principal component analysis (PCA). The phylogenetic tree showed that Red Angus and Black Angus cattle were in the same clade, crossbreeds were closely related to their parent breeds, and Holstein cattle from Chile were closely related to Holstein cattle in Japan. Overall, the tree provided a thorough description of breed history. It also showed that the Overo Negro breed was closely related to the Holstein breed, consistent with historical data indicating that Overo Negro cattle are 'Old type' Holstein Friesian cattle. This allelic information will be important for investigating the relationship between major histocompatibility complex (MHC) and disease.
|
Research Support, Non-U.S. Gov't |
11 |
20 |
21
|
Hariprakash JM, Vellarikkal SK, Keechilat P, Verma A, Jayarajan R, Dixit V, Ravi R, Senthivel V, Kumar A, Sehgal P, Sonakar AK, Ambawat S, Giri AK, Philip A, Sivadas A, Faruq M, Bharadwaj D, Sivasubbu S, Scaria V. Pharmacogenetic landscape of DPYD variants in south Asian populations by integration of genome-scale data. Pharmacogenomics 2017; 19:227-241. [PMID: 29239269 DOI: 10.2217/pgs-2017-0101] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Indexed: 12/21/2022]
Abstract
AIM Adverse drug reactions to 5-fluorouracil (5-FU) are frequent and largely attributable to genetic variation in the DPYD gene, which encodes a rate-limiting enzyme that clears 5-FU. The study aims at understanding the pharmacogenetic landscape of DPYD variants in south Asian populations. MATERIALS & METHODS Systematic analysis of population-scale genome-wide datasets of over 3000 south Asians was performed. Independent evaluation was performed in a small cohort of patients. RESULTS Our analysis revealed significant differences in the allelic distribution of variants in different ethnicities. CONCLUSIONS This is the first and largest genetic map of the DPYD variants associated with adverse drug reactions to 5-FU in south Asian populations. Our study highlights ethnic differences in allelic frequencies.
|
Research Support, Non-U.S. Gov't |
8 |
20 |
22
|
Campbell RF, McGrath PT, Paaby AB. Analysis of Epistasis in Natural Traits Using Model Organisms. Trends Genet 2018; 34:883-898. [PMID: 30166071 PMCID: PMC6541385 DOI: 10.1016/j.tig.2018.08.002] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 01/17/2018] [Revised: 06/06/2018] [Accepted: 08/03/2018] [Indexed: 12/16/2022]
Abstract
The ability to detect and understand epistasis in natural populations is important for understanding how biological traits are influenced by genetic variation. However, identification and characterization of epistasis in natural populations remains difficult due to statistical issues that arise as a result of multiple comparisons, and the fact that most genetic variants segregate at low allele frequencies. In this review, we discuss how model organisms may be used to manipulate genotypic combinations to power the detection of epistasis as well as test interactions between specific genes. Findings from a number of species indicate that statistical epistasis is pervasive between natural genetic variants. However, the properties of experimental systems that enable analysis of epistasis also constrain extrapolation of these results back into natural populations.
|
Review |
7 |
20 |
23
|
Mathematical Constraints on FST: Biallelic Markers in Arbitrarily Many Populations. Genetics 2017; 206:1581-1600. [PMID: 28476869 DOI: 10.1534/genetics.116.199141] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 12/12/2016] [Accepted: 05/03/2017] [Indexed: 02/01/2023]
Abstract
$F_{ST}$ is one of the most widely used statistics in population genetics. Recent mathematical studies have identified constraints that challenge interpretations of $F_{ST}$ as a measure with potential to range from 0 for genetically similar populations to 1 for divergent populations. We generalize results obtained for population pairs to arbitrarily many populations, characterizing the mathematical relationship between $F_{ST}$, the frequency M of the more frequent allele at a polymorphic biallelic marker, and the number of subpopulations K. We show that for fixed K, $F_{ST}$ has a peculiar constraint as a function of M, with a maximum of 1 only if [Formula: see text] for integers i with [Formula: see text]. For fixed M, as K grows large, the range of $F_{ST}$ becomes the closed or half-open unit interval. For fixed K, however, some [Formula: see text] always exists at which the upper bound on $F_{ST}$ lies below [Formula: see text]. We use coalescent simulations to show that under weak migration, $F_{ST}$ depends strongly on M when K is small, but not when K is large. Finally, examining data on human genetic variation, we use our results to explain the generally smaller $F_{ST}$ values between pairs of continents relative to global $F_{ST}$ values. We discuss implications for the interpretation and use of $F_{ST}$.
|
Research Support, U.S. Gov't, Non-P.H.S. |
8 |
20 |
24
|
Henderson LM, Claw KG, Woodahl EL, Robinson RF, Boyer BB, Burke W, Thummel KE. P450 Pharmacogenetics in Indigenous North American Populations. J Pers Med 2018; 8:jpm8010009. [PMID: 29389890 PMCID: PMC5872083 DOI: 10.3390/jpm8010009] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 12/18/2017] [Revised: 01/19/2018] [Accepted: 01/22/2018] [Indexed: 12/14/2022]
Abstract
Indigenous North American populations, including American Indian and Alaska Native peoples in the United States, the First Nations, Métis and Inuit peoples in Canada and Amerindians in Mexico, are historically under-represented in biomedical research, including genomic research on drug disposition and response. Without adequate representation in pharmacogenetic studies establishing genotype-phenotype relationships, Indigenous populations may not benefit fully from new innovations in precision medicine testing to tailor and improve the safety and efficacy of drug treatment, resulting in health care disparities. The purpose of this review is to summarize and evaluate what is currently known about cytochrome P450 genetic variation in Indigenous populations in North America and to highlight the importance of including these groups in future pharmacogenetic studies for implementation of personalized drug therapy.
|
Review |
7 |
19 |
25
|
Barbitoff YA, Skitchenko RK, Poleshchuk OI, Shikov AE, Serebryakova EA, Nasykhova YA, Polev DE, Shuvalova AR, Shcherbakova IV, Fedyakov MA, Glotov OS, Glotov AS, Predeus AV. Whole-exome sequencing provides insights into monogenic disease prevalence in Northwest Russia. Mol Genet Genomic Med 2019; 7:e964. [PMID: 31482689 PMCID: PMC6825859 DOI: 10.1002/mgg3.964] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 05/09/2019] [Accepted: 08/07/2019] [Indexed: 12/30/2022]
Abstract
BACKGROUND Allele frequency data from large exome and genome aggregation projects such as the Genome Aggregation Database (gnomAD) are of utmost importance to the interpretation of medical resequencing data. However, allele frequencies might significantly differ in poorly studied populations that are underrepresented in large-scale projects, such as the Russian population. METHODS In this work, we leveraged our access to a large dataset of 694 exome samples to analyze genetic variation in Northwest Russia. We compared the spectrum of genetic variants to dbSNP build 151, and made estimates of ClinVar-based autosomal recessive (AR) disease allele prevalence as compared to gnomAD r. 2.1. RESULTS An estimated 9.3% of discovered variants were not present in dbSNP. We report statistically significant overrepresentation of pathogenic variants for several Mendelian disorders, including phenylketonuria (PAH, rs5030858), Wilson's disease (ATP7B, rs76151636), factor VII deficiency (F7, rs36209567), kyphoscoliosis type of Ehlers-Danlos syndrome (FKBP14, rs542489955), and several other recessive pathologies. We also make primary estimates of monogenic disease incidence in the population, with retinal dystrophy, cystic fibrosis, and phenylketonuria being the most frequent AR pathologies. CONCLUSION Our observations demonstrate the utility of population-specific allele frequency data to the diagnosis of monogenic disorders using high-throughput technologies.
|
research-article |
6 |
19 |