1. Id-Lahoucine S, Cánovas A, Legarra A, Casellas J. Transmission ratio distortion regions in the context of genomic evaluation and their effects on reproductive traits in cattle. J Dairy Sci 2023; 106:7786-7798. [PMID: 37210358 DOI: 10.3168/jds.2022-23062] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2022] [Accepted: 04/19/2023] [Indexed: 05/22/2023]
Abstract
Transmission ratio distortion (TRD), which is a deviation from Mendelian expectations, has been associated with basic mechanisms of life such as sperm and ova fertility and viability at developmental stages of the reproductive cycle. In this study, different models including TRD regions were tested for different reproductive traits [days from first service to conception (FSTC), number of services, first service nonreturn rate (NRR), and stillbirth (SB)]. Thus, in addition to a basic model with systematic and random effects, including genetic effects modeled through a genomic relationship matrix, we developed 2 additional models, including a second genomic relationship matrix based on TRD regions, and TRD regions as a random effect assuming heterogeneous variances. The analyses were performed with 10,623 cows and 1,520 bulls genotyped for 47,910 SNPs, 590 TRD regions, and a number of records ranging from 9,587 (FSTC) to 19,667 (SB). The results of this study showed the ability of TRD regions to capture some additional genetic variance for some traits; however, this did not translate into higher accuracy for genomic prediction. This could be explained by the nature of TRD itself, which may arise in different stages of the reproductive cycle. Nevertheless, important effects of TRD regions were found on SB (31 regions) and NRR (18 regions) when comparing at-risk versus control matings, especially for regions with allelic TRD pattern. Particularly for NRR, the probability of observing a nonpregnant cow increased by up to 27% for specific TRD regions, and the probability of observing stillbirth increased by up to 254%. These results support the relevance of several TRD regions on some reproductive traits, especially those with allelic patterns that have not received as much attention as recessive TRD patterns.
Affiliation(s)
- S Id-Lahoucine
  - Centre for Genetic Improvement of Livestock, Department of Animal Biosciences, University of Guelph, Guelph N1G 2W1, ON, Canada
- A Cánovas
  - Centre for Genetic Improvement of Livestock, Department of Animal Biosciences, University of Guelph, Guelph N1G 2W1, ON, Canada
- A Legarra
  - INRAE, UR631 SAGA, BP 52627, 32326 Castanet-Tolosan, France
- J Casellas
  - Departament de Ciència Animal i dels Aliments, Universitat Autònoma de Barcelona, Bellaterra 08193, Barcelona, Spain
2. Id-Lahoucine S, Casellas J, Lu D, Sargolzaei M, Miller S, Cánovas A. Distortion of Mendelian segregation across the Angus cattle genome uncovering regions affecting reproduction. Sci Rep 2023; 13:13393. [PMID: 37591956 PMCID: PMC10435455 DOI: 10.1038/s41598-023-37710-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/11/2022] [Accepted: 06/26/2023] [Indexed: 08/19/2023] Open
Abstract
Nowadays, the availability of genotyped trios (sire-dam-offspring) in the livestock industry enables the implementation of the transmission ratio distortion (TRD) approach to discover deleterious alleles in the genome. Various biological mechanisms at different stages of the reproductive cycle such as gametogenesis, embryo development and postnatal viability can induce signals of TRD (i.e., deviation from Mendelian inheritance expectations). In this study, TRD was evaluated using both SNP-by-SNP and sliding windows of 2-, 4-, 7-, 10- and 20-SNP across 92,942 autosomal SNPs for 258,140 genotyped Angus cattle including 7,486 sires, 72,688 dams and 205,966 offspring. Transmission ratio distortion was characterized using allelic (specific- and unspecific-parent TRD) and genotypic parameterizations (additive- and dominance-TRD). Across the Angus autosomal chromosomes, 851 regions were clearly found with decisive evidence for TRD. Among these findings, 19 haplotypes with recessive patterns (potential lethality for homozygote individuals) and 52 regions with allelic patterns exhibiting complete or quasi-complete absence for homozygous individuals in addition to under-representation (potentially reduced viability) of the carrier (heterozygous) offspring were found. In addition, 64 (12) and 20 (4) regions showed significant influence on the trait heifer pregnancy at p-value < 0.05 (after chromosome-wise false discovery rate) and 0.01, respectively, reducing the pregnancy rate up to 15%, thus, supporting the biological importance of TRD phenomenon in reproduction.
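As a rough illustration of what "deviation from Mendelian inheritance expectations" means for a single SNP, the sketch below counts how often a heterozygous parent transmits each allele and compares the split with the expected 50:50 using an exact binomial test. This is only a simplified stand-in, not the allelic or genotypic TRD parameterization used in the study; the function names and the example counts are hypothetical.

```python
from math import comb


def transmission_ratio(n_allele_a: int, n_allele_b: int) -> float:
    """Fraction of informative offspring that received allele A
    from a heterozygous (A/B) parent. Mendelian expectation: 0.5."""
    total = n_allele_a + n_allele_b
    if total == 0:
        raise ValueError("no informative transmissions")
    return n_allele_a / total


def binomial_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value: sum the probabilities of all
    outcomes that are no more likely than the observed count k."""
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = probs[k]
    # Small tolerance so ties with the observed probability are included.
    return min(1.0, sum(q for q in probs if q <= observed * (1 + 1e-9)))


if __name__ == "__main__":
    # Hypothetical counts: 60 offspring inherited allele A, 40 inherited B.
    ratio = transmission_ratio(60, 40)
    p_value = binomial_two_sided_p(60, 100)
    print(f"transmission ratio = {ratio:.2f}, p = {p_value:.4f}")
```

A ratio far from 0.5 with a small p-value would flag the SNP as a candidate for distortion; a real analysis would also need to handle multiple testing and genotyping errors.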
Affiliation(s)
- S Id-Lahoucine
  - Centre for Genetic Improvement of Livestock, Department of Animal Biosciences, University of Guelph, Guelph, ON, N1G 2W1, Canada
- J Casellas
  - Departament de Ciència Animal i dels Aliments, Universitat Autònoma de Barcelona, Bellaterra, 08193, Barcelona, Spain
- D Lu
  - Angus Genetics Inc., St. Joseph, MO, 64506, USA
- M Sargolzaei
  - Centre for Genetic Improvement of Livestock, Department of Animal Biosciences, University of Guelph, Guelph, ON, N1G 2W1, Canada
  - Department of Pathobiology, University of Guelph, Guelph, ON, N1G 2W1, Canada
  - Select Sires, Inc., Plain City, OH, 43064, USA
- S Miller
  - AGBU, a joint venture of NSW Department of Primary Industries and University of New England, Armidale, 2351, Australia
- A Cánovas
  - Centre for Genetic Improvement of Livestock, Department of Animal Biosciences, University of Guelph, Guelph, ON, N1G 2W1, Canada
3. Miller SE, Sheehan MJ. Sex differences in deleterious genetic variants in a haplodiploid social insect. Mol Ecol 2023; 32:4546-4556. [PMID: 37350360 PMCID: PMC10528523 DOI: 10.1111/mec.17057] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2022] [Revised: 06/01/2023] [Accepted: 06/12/2023] [Indexed: 06/24/2023]
Abstract
Deleterious variants are selected against but can linger in populations at low frequencies for long periods of time, decreasing fitness and contributing to disease burden in humans and other species. Deleterious variants occur at low frequency but distinguishing deleterious variants from low-frequency neutral variation is challenging based on population genomics data alone. As a result, we have little sense of the number and identity of deleterious variants in wild populations. For haplodiploid species, it has been hypothesised that deleterious alleles will be directly exposed to selection in haploid males, but selection can be masked in diploid females when deleterious variants are recessive, resulting in more efficient purging of deleterious mutations in males. Therefore, comparisons of the differences between haploid and diploid genomes from the same population may be a useful method for inferring rare deleterious variants. This study provides the first formal test of this hypothesis. Using wild populations of Northern paper wasps (Polistes fuscatus), we find that males have fewer missense and nonsense variants per generation than females from the same population. Allele frequency differences are especially pronounced for rare missense and nonsense variants and these differences lead to a lower mutational load in males than females. Based on these data we infer that many highly deleterious mutations are segregating in the paper wasp population. Stronger selection against deleterious alleles in haploid males may have implications for adaptation in other haplodiploid insects and provides evidence that wild populations harbour abundant deleterious variants.
Affiliation(s)
- Sara E. Miller
  - Laboratory for Animal Social Evolution and Recognition, Department of Neurobiology and Behavior, Cornell University, Ithaca, NY, USA
  - Department of Biology, University of Missouri St. Louis, St. Louis, MO, USA
- Michael J. Sheehan
  - Laboratory for Animal Social Evolution and Recognition, Department of Neurobiology and Behavior, Cornell University, Ithaca, NY, USA
4. Holborn MA, Ford G, Turner S, Mellet J, van Rensburg J, Joubert F, Pepper MS. The NESHIE and CP Genetics Resource (NCGR): A database of genes and variants reported in neonatal encephalopathy with suspected hypoxic ischemic encephalopathy (NESHIE) and consequential cerebral palsy (CP). Genomics 2022; 114:110508. [PMID: 36270382 PMCID: PMC9726645 DOI: 10.1016/j.ygeno.2022.110508] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2022] [Revised: 10/12/2022] [Accepted: 10/17/2022] [Indexed: 01/15/2023]
Abstract
Neonatal encephalopathy (NE) with suspected hypoxic ischaemic encephalopathy (HIE) (NESHIE) is a complex syndrome occurring in newborns, characterised by altered neurological function. It has been suggested that genetic variants may influence NESHIE susceptibility and outcomes. Unlike NESHIE, for which a limited number of genetic studies have been performed, many studies have identified genetic variants associated with cerebral palsy (CP), which can develop from severe NESHIE. Identifying variants in patients with CP, as a consequence of NESHIE, may provide a starting point for the identification of genetic variants associated with NESHIE outcomes. We have constructed NCGR (NESHIE and CP Genetics Resource), a database of genes and variants reported in patients with NESHIE and CP (where relevant to NESHIE), for the purpose of collating and comparing genetic findings between the two conditions. In this paper we describe the construction and functionality of NCGR. Furthermore, we demonstrate how NCGR can be used to prioritise genes and variants of potential clinical relevance that may underlie a genetic predisposition to NESHIE and contribute to an understanding of its pathogenesis.
Affiliation(s)
- Megan A. Holborn
  - Institute for Cellular and Molecular Medicine, Department of Immunology; SAMRC Extramural Unit for Stem Cell Research and Therapy, Faculty of Health Sciences, University of Pretoria, Pretoria, South Africa
- Graeme Ford
  - Institute for Cellular and Molecular Medicine, Department of Immunology; SAMRC Extramural Unit for Stem Cell Research and Therapy, Faculty of Health Sciences, University of Pretoria, Pretoria, South Africa
  - Centre for Bioinformatics and Computational Biology, Genomics Research Institute, Department of Biochemistry, Genetics and Microbiology, University of Pretoria, Pretoria, South Africa
- Sarah Turner
  - Institute for Cellular and Molecular Medicine, Department of Immunology; SAMRC Extramural Unit for Stem Cell Research and Therapy, Faculty of Health Sciences, University of Pretoria, Pretoria, South Africa
  - Centre for Bioinformatics and Computational Biology, Genomics Research Institute, Department of Biochemistry, Genetics and Microbiology, University of Pretoria, Pretoria, South Africa
- Juanita Mellet
  - Institute for Cellular and Molecular Medicine, Department of Immunology; SAMRC Extramural Unit for Stem Cell Research and Therapy, Faculty of Health Sciences, University of Pretoria, Pretoria, South Africa
- Jeanne van Rensburg
  - Institute for Cellular and Molecular Medicine, Department of Immunology; SAMRC Extramural Unit for Stem Cell Research and Therapy, Faculty of Health Sciences, University of Pretoria, Pretoria, South Africa
- Fourie Joubert
  - Centre for Bioinformatics and Computational Biology, Genomics Research Institute, Department of Biochemistry, Genetics and Microbiology, University of Pretoria, Pretoria, South Africa
- Michael S. Pepper (corresponding author)
  - Institute for Cellular and Molecular Medicine, Department of Immunology; SAMRC Extramural Unit for Stem Cell Research and Therapy, Faculty of Health Sciences, University of Pretoria, Pretoria, South Africa
5. Identifying interpretable gene-biomarker associations with functionally informed kernel-based tests in 190,000 exomes. Nat Commun 2022; 13:5332. [PMID: 36088354 PMCID: PMC9464252 DOI: 10.1038/s41467-022-32864-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2022] [Accepted: 08/22/2022] [Indexed: 12/05/2022] Open
Abstract
Here we present an exome-wide rare genetic variant association study for 30 blood biomarkers in 191,971 individuals in the UK Biobank. We compare gene-based association tests for separate functional variant categories to increase interpretability and identify 193 significant gene-biomarker associations. Genes associated with biomarkers were ~4.5-fold enriched for conferring Mendelian disorders. In addition to performing weighted gene-based variant collapsing tests, we design and apply variant-category-specific kernel-based tests that integrate quantitative functional variant effect predictions for missense variants, splicing and the binding of RNA-binding proteins. For these tests, we present a computationally efficient combination of the likelihood-ratio and score tests that found 36% more associations than the score test alone while also controlling the type-1 error. Kernel-based tests identified 13% more associations than their gene-based collapsing counterparts and had advantages in the presence of gain-of-function missense variants. We introduce local collapsing by amino acid position for missense variants and use it to interpret associations and identify potential novel gain-of-function variants in PIEZO1. Our results show the benefits of investigating different functional mechanisms when performing rare-variant association tests, and demonstrate pervasive rare-variant contribution to biomarker variability.
6. Jin F, Li J, Zhang YB, Liu X, Cai M, Liu M, Li M, Ma C, Yue R, Zhu Y, Lai R, Wang Z, Ji X, Wei H, Dong J, Liu Z, Wang Y, Sun Y, Wang X. A functional motif of long noncoding RNA Nron against osteoporosis. Nat Commun 2021; 12:3319. [PMID: 34083547 PMCID: PMC8175706 DOI: 10.1038/s41467-021-23642-7] [Citation(s) in RCA: 43] [Impact Index Per Article: 14.3] [Received: 03/29/2020] [Accepted: 04/30/2021] [Indexed: 12/14/2022] Open
Abstract
Long noncoding RNAs are widely implicated in diverse disease processes. Nonetheless, their regulatory roles in bone resorption are undefined. Here, we identify lncRNA Nron as a critical suppressor of bone resorption. We demonstrate that osteoclastic Nron knockout mice exhibit an osteopenia phenotype with elevated bone resorption activity. Conversely, osteoclastic Nron transgenic mice exhibit lower bone resorption and higher bone mass. Furthermore, the pharmacological overexpression of Nron inhibits bone resorption but causes apparent side effects in mice. To minimize the side effects, we further identify a functional motif of Nron. The delivery of the Nron functional motif to osteoclasts effectively reverses bone loss without obvious side effects. Mechanistically, the functional motif of Nron interacts with E3 ubiquitin ligase CUL4B to regulate ERα stability. These results indicate that Nron is a key bone resorption suppressor, and the lncRNA functional motif could potentially be utilized to treat diseases with less risk of side effects. LncRNAs are implicated in the pathogenesis of a number of diseases. Here, the authors show that the lncRNA Nron suppresses bone resorption, and show that delivery of a functional motif of Nron increases bone mass in mouse models of osteoporosis.
Affiliation(s)
- Fujun Jin
  - Clinical Research Platform for Interdiscipline of Stomatology, The First Affiliated Hospital of Jinan University & Department of Stomatology, College of Stomatology, Jinan University, Guangzhou, China
  - Beijing Advanced Innovation Center for Big Data-Based Precision Medicine, School of Engineering Medicine, Beihang University, Beijing, China
- Junhui Li
  - Department of Oral Implantology, School of Stomatology, Tongji University, Shanghai Engineering Research Center of Tooth Restoration and Regeneration, Shanghai, China
- Yong-Biao Zhang
  - Beijing Advanced Innovation Center for Big Data-Based Precision Medicine, School of Engineering Medicine, Beihang University, Beijing, China
- Xiangning Liu
  - Clinical Research Platform for Interdiscipline of Stomatology, The First Affiliated Hospital of Jinan University & Department of Stomatology, College of Stomatology, Jinan University, Guangzhou, China
  - Guangzhou Jinan Biomedicine Research and Development Center, National Engineering Research Center of Genetic Medicine, Institute of Biomedicine, College of Life Science and Technology, Jinan University, Guangzhou, China
- Mingxiang Cai
  - Clinical Research Platform for Interdiscipline of Stomatology, The First Affiliated Hospital of Jinan University & Department of Stomatology, College of Stomatology, Jinan University, Guangzhou, China
  - Department of Oral Implantology, School of Stomatology, Tongji University, Shanghai Engineering Research Center of Tooth Restoration and Regeneration, Shanghai, China
- Meijing Liu
  - Beijing Advanced Innovation Center for Big Data-Based Precision Medicine, School of Engineering Medicine, Beihang University, Beijing, China
- Mengyao Li
  - Shanghai Institute of Immunology, Department of Immunology and Microbiology, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Cui Ma
  - Beijing Advanced Innovation Center for Big Data-Based Precision Medicine, School of Engineering Medicine, Beihang University, Beijing, China
- Rui Yue
  - Institute for Regenerative Medicine, Shanghai East Hospital, Frontier Science Center for Stem Cell Research, Shanghai Key Laboratory of Signaling and Disease Research, School of Life Sciences and Technology, Tongji University, Shanghai, China
- Yexuan Zhu
  - Guangzhou Jinan Biomedicine Research and Development Center, National Engineering Research Center of Genetic Medicine, Institute of Biomedicine, College of Life Science and Technology, Jinan University, Guangzhou, China
- Renfa Lai
  - Clinical Research Platform for Interdiscipline of Stomatology, The First Affiliated Hospital of Jinan University & Department of Stomatology, College of Stomatology, Jinan University, Guangzhou, China
- Zuolin Wang
  - Department of Oral Implantology, School of Stomatology, Tongji University, Shanghai Engineering Research Center of Tooth Restoration and Regeneration, Shanghai, China
- Xunming Ji
  - Department of Neurosurgery & China-America Institute of Neuroscience, Xuanwu Hospital, Capital Medical University, Beijing, China
- Huawei Wei
  - Zeki Biotechnology & Pharmaceutical Co. Ltd, Beijing, China
- Jun Dong
  - Guangzhou Jinan Biomedicine Research and Development Center, National Engineering Research Center of Genetic Medicine, Institute of Biomedicine, College of Life Science and Technology, Jinan University, Guangzhou, China
- Zhiduo Liu
  - Shanghai Institute of Immunology, Department of Immunology and Microbiology, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Yifei Wang
  - Guangzhou Jinan Biomedicine Research and Development Center, National Engineering Research Center of Genetic Medicine, Institute of Biomedicine, College of Life Science and Technology, Jinan University, Guangzhou, China
- Yao Sun
  - Department of Oral Implantology, School of Stomatology, Tongji University, Shanghai Engineering Research Center of Tooth Restoration and Regeneration, Shanghai, China
- Xiaogang Wang
  - Clinical Research Platform for Interdiscipline of Stomatology, The First Affiliated Hospital of Jinan University & Department of Stomatology, College of Stomatology, Jinan University, Guangzhou, China
  - Beijing Advanced Innovation Center for Big Data-Based Precision Medicine, School of Engineering Medicine, Beihang University, Beijing, China
7. Sonehara K, Okada Y. Genomics-driven drug discovery based on disease-susceptibility genes. Inflamm Regen 2021; 41:8. [PMID: 33691789 PMCID: PMC7944616 DOI: 10.1186/s41232-021-00158-7] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 12/17/2020] [Accepted: 02/26/2021] [Indexed: 12/19/2022] Open
Abstract
Genome-wide association studies have identified numerous disease-susceptibility genes. As knowledge of gene–disease associations accumulates, it is becoming increasingly important to translate this knowledge into clinical practice. This challenge involves finding effective drug targets and estimating their potential side effects, which often results in failure of promising clinical trials. Here, we review recent advances and future perspectives in genetics-led drug discovery, with a focus on drug repurposing, Mendelian randomization, and the use of multifaceted omics data.
Affiliation(s)
- Kyuto Sonehara
  - Department of Statistical Genetics, Osaka University Graduate School of Medicine, 2-2 Yamadaoka, Suita, 565-0871, Japan
- Yukinori Okada
  - Department of Statistical Genetics, Osaka University Graduate School of Medicine, 2-2 Yamadaoka, Suita, 565-0871, Japan
  - Laboratory of Statistical Immunology, Immunology Frontier Research Center (WPI-IFReC), Osaka University, Suita, 565-0871, Japan
  - Integrated Frontier Research for Medical Science Division, Institute for Open and Transdisciplinary Research Initiatives, Osaka University, Suita, 565-0871, Japan
8. Zhang J, Tang SY, Zhu XB, Li P, Lu JQ, Cong JS, Wang LB, Zhang F, Li Z. Whole exome sequencing and trio analysis to broaden the variant spectrum of genes in idiopathic hypogonadotropic hypogonadism. Asian J Androl 2021; 23:288-293. [PMID: 33208564 PMCID: PMC8152424 DOI: 10.4103/aja.aja_65_20] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/24/2022] Open
Abstract
Dozens of genes are associated with idiopathic hypogonadotropic hypogonadism (IHH) and an oligogenic etiology has been suggested. However, the associated genes may account for only approximately 50% of cases. In addition, a genomic systematic pedigree analysis is still lacking. Here, we conducted whole exome sequencing (WES) on 18 unrelated men affected by IHH and their corresponding parents. Notably, one reported and 10 novel variants in eight known IHH causative genes (AXL, CCDC141, CHD7, DMXL2, FGFR1, PNPLA6, POLR3A, and PROKR2), nine variants in nine recently reported candidate genes (DCAF17, DCC, EGF, IGSF10, NOTCH1, PDE3A, RELN, SLIT2, and TRAPPC9), and four variants in four novel candidate genes for IHH (CCDC88C, CDON, GADL1, and SPRED3) were identified in 77.8% (14/18) of IHH cases. Among them, eight (8/18, 44.4%) cases carried more than one variant in IHH-related genes, supporting the oligogenic model. Interestingly, we found that those variants tended to be maternally inherited (maternal with n = 17 vs paternal with n = 7; P = 0.028). Our further retrospective investigation of published reports replicated the maternal bias (maternal with n = 46 vs paternal with n = 28; P = 0.024). Our study extended the variant spectrum for IHH and provided the first evidence that women are probably more tolerant to variants of IHH-related genes than men.
Affiliation(s)
- Jian Zhang
  - Obstetrics and Gynecology Hospital, NHC Key Laboratory of Reproduction Regulation (Shanghai Institute of Planned Parenthood Research), School of Life Sciences, Fudan University, Shanghai 200011, China
- Shu-Yan Tang
  - Obstetrics and Gynecology Hospital, NHC Key Laboratory of Reproduction Regulation (Shanghai Institute of Planned Parenthood Research), School of Life Sciences, Fudan University, Shanghai 200011, China
- Xiao-Bin Zhu
  - Department of Andrology, Center for Men's Health, Urologic Medical Center, Shanghai General Hospital, Shanghai Jiao Tong University, Shanghai 200080, China
- Peng Li
  - Department of Andrology, Center for Men's Health, Urologic Medical Center, Shanghai General Hospital, Shanghai Jiao Tong University, Shanghai 200080, China
- Jian-Qi Lu
  - Department of Research Institute, Reproduction Medical Center, The First Hospital of Lanzhou University, Lanzhou 730000, China
- Jiang-Shan Cong
  - Obstetrics and Gynecology Hospital, NHC Key Laboratory of Reproduction Regulation (Shanghai Institute of Planned Parenthood Research), School of Life Sciences, Fudan University, Shanghai 200011, China
- Ling-Bo Wang
  - Obstetrics and Gynecology Hospital, NHC Key Laboratory of Reproduction Regulation (Shanghai Institute of Planned Parenthood Research), School of Life Sciences, Fudan University, Shanghai 200011, China
- Feng Zhang
  - Obstetrics and Gynecology Hospital, NHC Key Laboratory of Reproduction Regulation (Shanghai Institute of Planned Parenthood Research), School of Life Sciences, Fudan University, Shanghai 200011, China
- Zheng Li
  - Department of Andrology, Center for Men's Health, Urologic Medical Center, Shanghai General Hospital, Shanghai Jiao Tong University, Shanghai 200080, China
9. Carron J, Torricelli C, Silva JK, Queiroz GSR, Ortega MM, Lima CSP, Lourenço GJ. microRNAs deregulation in head and neck squamous cell carcinoma. Head Neck 2020; 43:645-667. [PMID: 33159410 DOI: 10.1002/hed.26533] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 04/28/2020] [Revised: 09/30/2020] [Accepted: 10/23/2020] [Indexed: 12/24/2022] Open
Abstract
Head and neck (HN) squamous cell carcinoma (SCC) is the eighth most common human cancer worldwide. Besides tobacco and alcohol consumption, genetic and epigenetic alterations play an important role in HNSCC occurrence and progression. microRNAs (miRNAs) are small noncoding RNAs that regulate cell cycle, proliferation, development, differentiation, and apoptosis by interfering in gene expression. Expression profiling of miRNAs showed that some miRNAs are upregulated or downregulated in tumor cells when compared with normal cells. The present review focuses on the role of miRNA deregulation in HNSCC risk, development, outcome, and therapy sensitivity. Moreover, the influence of single nucleotide variants in miRNA target sites, miRNA seed sites, and miRNA-processing genes in HNSCC is also reviewed. Owing to their potential in cancer diagnosis and progression and as therapeutic targets, miRNAs may bring new perspectives in HNSCC understanding and therapy, especially for those patients with no or insufficient treatment options.
Affiliation(s)
- Juliana Carron
  - Laboratory of Cancer Genetics, School of Medical Sciences, University of Campinas, Campinas, Brazil
- Caroline Torricelli
  - Laboratory of Cancer Genetics, School of Medical Sciences, University of Campinas, Campinas, Brazil
- Janet K Silva
  - Laboratory of Cancer Genetics, School of Medical Sciences, University of Campinas, Campinas, Brazil
- Gabriela S R Queiroz
  - Laboratory of Cancer Genetics, School of Medical Sciences, University of Campinas, Campinas, Brazil
- Manoela M Ortega
  - Laboratory of Cell and Molecular Tumor Biology and Bioactive Compounds, São Francisco University, Bragança Paulista, Brazil
- Carmen S P Lima
  - Laboratory of Cancer Genetics, School of Medical Sciences, University of Campinas, Campinas, Brazil
- Gustavo J Lourenço
  - Laboratory of Cancer Genetics, School of Medical Sciences, University of Campinas, Campinas, Brazil
10. Yang J, Wang Z, Liu S, Wang W, Zhang H, Gui C. Functional Characterization Reveals the Significance of Rare Coding Variations in Human Organic Anion Transporting Polypeptide 2B1 (SLCO2B1). Mol Pharm 2020; 17:3966-3978. [DOI: 10.1021/acs.molpharmaceut.0c00747] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 12/22/2022]
Affiliation(s)
- Jingjie Yang
  - College of Pharmaceutical Sciences, Soochow University, 199 Renai Road, Suzhou Industrial Park, Suzhou 215123, China
- Zhongmin Wang
  - College of Pharmaceutical Sciences, Soochow University, 199 Renai Road, Suzhou Industrial Park, Suzhou 215123, China
- Shuai Liu
  - College of Pharmaceutical Sciences, Soochow University, 199 Renai Road, Suzhou Industrial Park, Suzhou 215123, China
- Weipeng Wang
  - College of Pharmaceutical Sciences, Soochow University, 199 Renai Road, Suzhou Industrial Park, Suzhou 215123, China
- Hongjian Zhang
  - College of Pharmaceutical Sciences, Soochow University, 199 Renai Road, Suzhou Industrial Park, Suzhou 215123, China
- Chunshan Gui
  - College of Pharmaceutical Sciences, Soochow University, 199 Renai Road, Suzhou Industrial Park, Suzhou 215123, China
11. Cirulli ET, White S, Read RW, Elhanan G, Metcalf WJ, Tanudjaja F, Fath DM, Sandoval E, Isaksson M, Schlauch KA, Grzymski JJ, Lu JT, Washington NL. Genome-wide rare variant analysis for thousands of phenotypes in over 70,000 exomes from two cohorts. Nat Commun 2020; 11:542. [PMID: 31992710 PMCID: PMC6987107 DOI: 10.1038/s41467-020-14288-y] [Citation(s) in RCA: 64] [Impact Index Per Article: 16.0] [Received: 07/30/2019] [Accepted: 12/19/2019] [Indexed: 02/08/2023] Open
Abstract
Understanding the impact of rare variants is essential to understanding human health. We analyze rare (MAF < 0.1%) variants against 4264 phenotypes in 49,960 exome-sequenced individuals from the UK Biobank and 1934 phenotypes (1821 overlapping with UK Biobank) in 21,866 members of the Healthy Nevada Project (HNP) cohort who underwent Exome+ sequencing at Helix. After using our rare-variant-tailored methodology to reduce test statistic inflation, we identify 64 statistically significant gene-based associations in our meta-analysis of the two cohorts and 37 for phenotypes available in only one cohort. Singletons make significant contributions to our results, and the vast majority of the associations could not have been identified with a genotyping chip. Our results are available for interactive browsing in a webapp (https://ukb.research.helix.com). This comprehensive analysis illustrates the biological value of large, deeply phenotyped cohorts of unselected populations coupled with NGS data. Population-based association analyses of rare genetic variants with complex traits are limited by the availability of data from sufficiently large cohorts. Here, Cirulli et al. report gene-based collapsing analysis of exomes from 49,960 participants of the UK Biobank and 21,866 participants of the Healthy Nevada Project over a total of 4377 traits.
Affiliation(s)
| | - Simon White
- Helix, 101S Ellsworth Ave Suite 350, San Mateo, CA, 94401, USA
| | - Robert W Read
- Desert Research Institute, 2215 Raggio Pkwy, Reno, NV, 89512, USA.,Renown Institute of Health Innovation, Reno, NV, 89512, USA
| | - Gai Elhanan
- Desert Research Institute, 2215 Raggio Pkwy, Reno, NV, 89512, USA.,Renown Institute of Health Innovation, Reno, NV, 89512, USA
| | - William J Metcalf
- Desert Research Institute, 2215 Raggio Pkwy, Reno, NV, 89512, USA.,Renown Institute of Health Innovation, Reno, NV, 89512, USA
| | | | - Donna M Fath
- Helix, 101S Ellsworth Ave Suite 350, San Mateo, CA, 94401, USA
| | - Efren Sandoval
- Helix, 101S Ellsworth Ave Suite 350, San Mateo, CA, 94401, USA
| | - Magnus Isaksson
- Helix, 101S Ellsworth Ave Suite 350, San Mateo, CA, 94401, USA
| | - Karen A Schlauch
- Desert Research Institute, 2215 Raggio Pkwy, Reno, NV, 89512, USA.,Renown Institute of Health Innovation, Reno, NV, 89512, USA
| | - Joseph J Grzymski
- Desert Research Institute, 2215 Raggio Pkwy, Reno, NV, 89512, USA.,Renown Institute of Health Innovation, Reno, NV, 89512, USA
| | - James T Lu
- Helix, 101S Ellsworth Ave Suite 350, San Mateo, CA, 94401, USA
|
12
|
Genetic intolerance analysis as a tool for protein science. Biochim Biophys Acta Biomembr 2019; 1862:183058. [PMID: 31494120 DOI: 10.1016/j.bbamem.2019.183058] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 05/18/2019] [Revised: 08/21/2019] [Accepted: 08/30/2019] [Indexed: 01/04/2023]
Abstract
Recent advances in whole genome and exome sequencing have dramatically increased the database of human gene variations. There are now enough sequenced human exomes and genomes to begin to identify gene variations that are notable because they are NOT observed in sequenced human genomes, apparently because they are subject to "purifying selection", exemplifying genetic intolerance. Such "dysprocreative" gene variations are embryonic lethal or prevent reproduction through any one of a number of possible mechanisms. Here we review an emerging quantitative approach, "Missense Tolerance Ratio" (MTR) analysis, that is used to assess protein-encoding gene (cDNA) sequence intolerance to missense mutations based on analysis of the >100 K and growing number of currently available human genome and exome sequences. This approach is already useful for analyzing intolerance to mutations in cDNA segments with a resolution on the order of 90 bases. Moreover, as the number of sequenced genomes/exomes increases by orders of magnitude it may eventually be possible to assess mutational tolerance in a statistically robust manner at or near single site resolution. Here we focus on how cDNA intolerance analysis complements other bioinformatic methods to illuminate structure-folding-function relationships for the encoded proteins. A set of disease-linked membrane proteins is employed to provide examples.
|
13
|
Zhu Q, Zhang J, Chen Y, Hu Q, Shen H, Huang RY, Liu Q, Kaur J, Long M, Battaglia S, Eng KH, Lele SB, Zsiros E, Villella J, Lugade A, Yao S, Liu S, Moysich K, Odunsi KO. Whole-exome sequencing of ovarian cancer families uncovers putative predisposition genes. Int J Cancer 2019; 146:2147-2155. [PMID: 31265121 PMCID: PMC7065147 DOI: 10.1002/ijc.32545] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 04/03/2019] [Revised: 06/10/2019] [Accepted: 06/21/2019] [Indexed: 01/05/2023]
Abstract
Despite the identification of several ovarian cancer (OC) predisposition genes, a large proportion of familial OC risk remains unexplained. We adopted a two-stage design to identify new OC predisposition genes. We first carried out a large germline whole-exome sequencing study on 158 patients from 140 families with significant OC history, but without evidence of genetic predisposition due to BRCA1/2. We then evaluated the potential candidate genes in a large case-control association study involving 381 OC cases in the Cancer Genome Atlas project and 27,173 population controls from the Exome Aggregation Consortium. Two new putative OC risk genes were identified, namely, ANKRD11, a putative tumor suppressor, and POLE, an enzyme involved in DNA repair and replication. These two genes likely confer moderate OC risk. We performed in vitro experiments and showed an ANKRD11 mutation identified in our patients markedly lowered the protein expression by compromising protein stability. Upon future validation and functional characterization, these genes may shed light on cancer etiology along with improving ascertainment power and preventive care of individuals at high risk of OC.
Affiliation(s)
- Qianqian Zhu
- Department of Biostatistics & Bioinformatics, Roswell Park Comprehensive Cancer Center, Buffalo, NY
| | - Jianmin Zhang
- Department of Cancer Genetics and Genomics, Roswell Park Comprehensive Cancer Center, Buffalo, NY
| | - Yanmin Chen
- Department of Cancer Genetics and Genomics, Roswell Park Comprehensive Cancer Center, Buffalo, NY
| | - Qiang Hu
- Department of Biostatistics & Bioinformatics, Roswell Park Comprehensive Cancer Center, Buffalo, NY
| | - He Shen
- Department of Cancer Genetics and Genomics, Roswell Park Comprehensive Cancer Center, Buffalo, NY
| | - Ruea-Yea Huang
- Center for Immunotherapy, Roswell Park Comprehensive Cancer Center, Buffalo, NY
| | - Qian Liu
- Department of Biostatistics & Bioinformatics, Roswell Park Comprehensive Cancer Center, Buffalo, NY
| | - Jasmine Kaur
- Department of Gynecologic Oncology, Roswell Park Comprehensive Cancer Center, Buffalo, NY
| | - Mark Long
- Department of Biostatistics & Bioinformatics, Roswell Park Comprehensive Cancer Center, Buffalo, NY
| | | | - Kevin H Eng
- Department of Biostatistics & Bioinformatics, Roswell Park Comprehensive Cancer Center, Buffalo, NY
| | - Shashikant B Lele
- Department of Gynecologic Oncology, Roswell Park Comprehensive Cancer Center, Buffalo, NY
| | - Emese Zsiros
- Department of Gynecologic Oncology, Roswell Park Comprehensive Cancer Center, Buffalo, NY
| | - Jeannine Villella
- Division of Gynecologic Oncology, Lenox Hill Hospital/ Northwell Health Cancer Institute, Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, New York, NY
| | - Amit Lugade
- Center for Immunotherapy, Roswell Park Comprehensive Cancer Center, Buffalo, NY
| | - Song Yao
- Department of Cancer Prevention and Control, Roswell Park Comprehensive Cancer Center, Buffalo, NY
| | - Song Liu
- Department of Biostatistics & Bioinformatics, Roswell Park Comprehensive Cancer Center, Buffalo, NY
| | - Kirsten Moysich
- Department of Cancer Prevention and Control, Roswell Park Comprehensive Cancer Center, Buffalo, NY
| | - Kunle O Odunsi
- Center for Immunotherapy, Roswell Park Comprehensive Cancer Center, Buffalo, NY.,Department of Gynecologic Oncology, Roswell Park Comprehensive Cancer Center, Buffalo, NY
|
14
|
McCarthy MJ. Missing a beat: assessment of circadian rhythm abnormalities in bipolar disorder in the genomic era. Psychiatr Genet 2019; 29:29-36. [PMID: 30516584 DOI: 10.1097/ypg.0000000000000215] [Citation(s) in RCA: 30] [Impact Index Per Article: 6.0] [Indexed: 12/16/2022]
Abstract
Circadian rhythm abnormalities have been recognized as a central feature of bipolar disorder (BD) but a coherent biological explanation for them remains lacking. Using genetic mutation of 'clock genes', robust animal models of mania and depression have been developed that elucidate key aspects of circadian rhythms and the circadian clock-mood connection. However, translation of this knowledge into humans remains incomplete. In recent years, very large genome-wide association studies (GWAS) have been conducted and the genetic underpinnings of BD are beginning to emerge. However, these genetic studies in BD do not match well with the evidence from animal studies that implicate the circadian clock in mood regulation. Even larger GWAS have been conducted for circadian phenotypes including chronotype, rhythm amplitude, sleep duration, and insomnia. These studies have identified a diverse set of associated genes, including a minority with previously well-characterized functions in the circadian clock. Taken together, the data from recent GWAS of BD and circadian phenotypes indicate that the genetic organization of the circadian clock, both in health and in BD, is complex. The findings from GWAS elucidate potentially novel circadian mechanisms that may be partly distinct from those identified in animal models. Pleiotropy, epistasis and nongenetic factors may play important roles in regulating circadian rhythms, some of which may underlie circadian rhythm disturbances in BD.
Affiliation(s)
- Michael J McCarthy
- Department of Psychiatry, Center for Circadian Biology, VA San Diego Healthcare System, University of California San Diego, San Diego, California, USA
|
15
|
Zhang Q, Sahana G, Su G, Guldbrandtsen B, Lund MS, Calus MPL. Impact of rare and low-frequency sequence variants on reliability of genomic prediction in dairy cattle. Genet Sel Evol 2018; 50:62. [PMID: 30458700 PMCID: PMC6247626 DOI: 10.1186/s12711-018-0432-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 03/12/2018] [Accepted: 11/14/2018] [Indexed: 11/05/2022]
Abstract
Background Availability of whole-genome sequence data for a large number of cattle and efficient imputation methodologies open a new opportunity to include rare and low-frequency variants (RLFV) in genomic prediction in dairy cattle. The objective of this study was to examine the impact of including RLFV that are within genes and selected from whole-genome sequence variants, on the reliability of genomic prediction for fertility, health and longevity in dairy cattle. Results All genic RLFV with a minor allele frequency lower than 0.05 were extracted from imputed sequence data and subsets were created using different strategies. These subsets were subsequently combined with Illumina 50 k single nucleotide polymorphism (SNP) data and used for genomic prediction. Reliability of prediction obtained by using 50 k SNP data alone was used as reference value and absolute changes in reliabilities are referred to as changes in percentage points. Adding a component that included either all the genic or a subset of selected RLFV into the model in addition to the 50 k component changed the reliability of predictions by −2.2 to 1.1%, i.e., hardly any change in reliability of prediction was found, regardless of how the RLFV were selected. In addition to these empirical analyses, a simulation study was performed to evaluate the potential impact of adding RLFV in the model on the reliability of prediction. Three sets of causal RLFV (containing 21,468, 1348 and 235 RLFV) that were randomly selected from different numbers of genes were generated and accounted for 10% additional genetic variance of the estimated variance explained by the 50 k SNPs. When genic RLFV based on mapping results were included in the prediction model, reliabilities improved by up to 4.0% and when the causal RLFV were included they improved by up to 6.8%. Conclusions Using selected RLFV from whole-genome sequence data had only a small impact on the empirical reliability of genomic prediction in dairy cattle. Our simulations revealed that for sequence data to bring a benefit, the key is to identify causal RLFV. Electronic supplementary material The online version of this article (10.1186/s12711-018-0432-8) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Qianqian Zhang
- Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, Tjele, Denmark.,Wageningen University and Research, Animal Breeding and Genomics, Wageningen, The Netherlands.,Department of Veterinary and Animal Sciences, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
| | - Goutam Sahana
- Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, Tjele, Denmark
| | - Guosheng Su
- Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, Tjele, Denmark
| | - Bernt Guldbrandtsen
- Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, Tjele, Denmark
| | - Mogens Sandø Lund
- Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, Tjele, Denmark
| | - Mario P L Calus
- Wageningen University and Research, Animal Breeding and Genomics, Wageningen, The Netherlands
|
16
|
Patel R, Scheinfeldt LB, Sanderford MD, Lanham TR, Tamura K, Platt A, Glicksberg BS, Xu K, Dudley JT, Kumar S. Adaptive Landscape of Protein Variation in Human Exomes. Mol Biol Evol 2018; 35:2015-2025. [PMID: 29846678 PMCID: PMC6063297 DOI: 10.1093/molbev/msy107] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 12/26/2022]
Abstract
The human genome contains hundreds of thousands of missense mutations. However, only a handful of these variants are known to be adaptive, which implies that adaptation through protein sequence change is an extremely rare phenomenon in human evolution. Alternatively, existing methods may lack the power to pinpoint adaptive variation. We have developed and applied an Evolutionary Probability Approach (EPA) to discover candidate adaptive polymorphisms (CAPs) through the discordance between allelic evolutionary probabilities and their observed frequencies in human populations. EPA reveals thousands of missense CAPs, which suggest that a large number of previously optimal alleles experienced a reversal of fortune in the human lineage. We explored nonadaptive mechanisms to explain CAPs, including the effects of demography, mutation rate variability, and negative and positive selective pressures in modern humans. Many nonadaptive hypotheses were tested, but failed to explain the data, which suggests that a large proportion of CAP alleles have increased in frequency due to beneficial selection. This suggestion is supported by the fact that a vast majority of adaptive missense variants discovered previously in humans are CAPs, and hundreds of CAP alleles are protective in genotype-phenotype association data. Our integrated phylogenomic and population genetic EPA approach predicts the existence of thousands of nonneutral candidate variants in the human proteome. We expect this collection to be enriched in beneficial variation. The EPA approach can be applied to discover candidate adaptive variation in any protein, population, or species for which allele frequency data and reliable multispecies alignments are available.
Affiliation(s)
- Ravi Patel
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA
- Department of Biology, Temple University, Philadelphia, PA
| | - Laura B Scheinfeldt
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA
- Department of Biology, Temple University, Philadelphia, PA
- Coriell Institute for Medical Research, Camden, NJ
| | - Maxwell D Sanderford
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA
| | - Tamera R Lanham
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA
| | - Koichiro Tamura
- Department of Biology, Tokyo Metropolitan University, Tokyo, Japan
| | - Alexander Platt
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA
- Department of Biology, Temple University, Philadelphia, PA
- Center for Computational Genetics and Genomics, Temple University, Philadelphia, PA
| | - Benjamin S Glicksberg
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY
| | - Ke Xu
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY
| | - Joel T Dudley
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY
| | - Sudhir Kumar
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA
- Department of Biology, Temple University, Philadelphia, PA
- Center for Excellence in Genome Medicine and Research, King Abdulaziz University, Jeddah, Saudi Arabia
|
17
|
Zhu Q, Yan L, Liu Q, Zhang C, Wei L, Hu Q, Preus L, Clay-Gilmour AI, Onel K, Stram DO, Pooler L, Sheng X, Haiman CA, Zhu X, Spellman SR, Pasquini M, McCarthy PL, Liu S, Hahn T, Sucheston-Campbell LE. Exome chip analyses identify genes affecting mortality after HLA-matched unrelated-donor blood and marrow transplantation. Blood 2018; 131:2490-2499. [PMID: 29610366 PMCID: PMC5981168 DOI: 10.1182/blood-2017-11-817973] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 11/28/2017] [Accepted: 03/20/2018] [Indexed: 02/07/2023]
Abstract
Although survival outcomes have significantly improved, up to 40% of patients die within 1 year of HLA-matched unrelated-donor blood and marrow transplantation (BMT). To identify non-HLA genetic contributors to mortality after BMT, we performed the first exome-wide association study in the DISCOVeRY-BMT cohorts using the Illumina HumanExome BeadChip. This study includes 2473 patients with acute myeloid leukemia, acute lymphoblastic leukemia, or myelodysplastic syndrome and 2221 10/10 HLA-matched donors treated from 2000 to 2011. Single-variant and gene-level analyses were performed on overall survival (OS), transplantation-related mortality (TRM), and disease-related mortality (DRM). Genotype mismatches between recipients and donors in a rare nonsynonymous variant of testis-expressed gene TEX38 significantly increased risk of TRM, which was more dramatic when either the recipient or donor was female. Using the SKAT-O test to evaluate gene-level effects, variant genotypes of OR51D1 in recipients were significantly associated with OS and TRM. In donors, 4 (ALPP, EMID1, SLC44A5, LRP1), 1 (HHAT), and 2 genes (LYZL4, NT5E) were significantly associated with OS, TRM, and DRM, respectively. Inspection of NT5E crystal structures showed 4 of the associated variants affected the enzyme structure and likely decreased the catalytic efficiency of the enzyme. Further confirmation of these findings and additional functional studies may provide individualized risk prediction and prognosis, as well as alternative donor selection strategies.
Affiliation(s)
- Qianqian Zhu
- Biostatistics and Bioinformatics, Roswell Park Cancer Institute, Buffalo, NY
| | - Li Yan
- Biostatistics and Bioinformatics, Roswell Park Cancer Institute, Buffalo, NY
| | - Qian Liu
- Biostatistics and Bioinformatics, Roswell Park Cancer Institute, Buffalo, NY
| | - Chi Zhang
- School of Biological Sciences, University of Nebraska-Lincoln, Lincoln, NE
| | - Lei Wei
- Biostatistics and Bioinformatics, Roswell Park Cancer Institute, Buffalo, NY
| | - Qiang Hu
- Biostatistics and Bioinformatics, Roswell Park Cancer Institute, Buffalo, NY
| | - Leah Preus
- College of Pharmacy, The Ohio State University, Columbus, OH
| | | | - Kenan Onel
- Department of Pediatric Hematology/Oncology, Cohen Children's Medical Center, Northwell Health, Manhasset, NY
| | - Daniel O Stram
- Preventive Medicine, University of Southern California, Los Angeles, CA
| | - Loreall Pooler
- Preventive Medicine, University of Southern California, Los Angeles, CA
| | - Xin Sheng
- Preventive Medicine, University of Southern California, Los Angeles, CA
| | | | - Xiaochun Zhu
- Center for International Blood and Marrow Transplant Research, Medical College of Wisconsin, Milwaukee, WI
| | - Stephen R Spellman
- Center for International Blood and Marrow Transplant Research, National Marrow Donor Program/Be The Match, Minneapolis, MN
| | - Marcelo Pasquini
- Center for International Blood and Marrow Transplant Research, Medical College of Wisconsin, Milwaukee, WI
| | - Philip L McCarthy
- Blood and Marrow Transplant Program, Department of Medicine, Roswell Park Cancer Institute, Buffalo, NY
| | - Song Liu
- Biostatistics and Bioinformatics, Roswell Park Cancer Institute, Buffalo, NY
| | - Theresa Hahn
- Blood and Marrow Transplant Program, Department of Medicine, Roswell Park Cancer Institute, Buffalo, NY
| | - Lara E Sucheston-Campbell
- College of Pharmacy, The Ohio State University, Columbus, OH
- College of Veterinary Medicine, The Ohio State University, Columbus, OH
|
18
|
Koko M, Abdallah MOE, Amin M, Ibrahim M. Challenges imposed by minor reference alleles on the identification and reporting of clinical variants from exome data. BMC Genomics 2018; 19:46. [PMID: 29334895 PMCID: PMC5769444 DOI: 10.1186/s12864-018-4433-3] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 03/27/2017] [Accepted: 01/03/2018] [Indexed: 12/30/2022]
Abstract
Background The conventional variant calling of pathogenic alleles in exome and genome sequencing requires the presence of the non-pathogenic alleles as genome references. This hinders the correct identification of variants with minor and/or pathogenic reference alleles warranting additional approaches for variant calling. Results More than 26,000 Exome Aggregation Consortium (ExAC) variants have a minor reference allele including variants with known ClinVar disease alleles. For instance, in a number of variants related to clotting disorders, the phenotype-associated allele is a human genome reference allele (rs6025, rs6003, rs1799983, and rs2227564 using the assembly hg19). We highlighted how the current variant calling standards miss homozygous reference disease variants in these sites and provided a bioinformatic panel that can be used to screen these variants using commonly available variant callers. We present exome sequencing results from an individual with venous thrombosis to emphasize how pathogenic alleles in clinically relevant variants escape variant calling while non-pathogenic alleles are detected. Conclusions This article highlights the importance of specialized variant calling strategies in clinical variants with minor reference alleles especially in the context of personal genomes and exomes. We provide here a simple strategy to screen potential disease-causing variants when present in homozygous reference state. Electronic supplementary material The online version of this article (10.1186/s12864-018-4433-3) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Mahmoud Koko
- Department of Molecular Biology, Institute of Endemic Diseases, University of Khartoum, P. O. Box 102, Army Road, 11111, Khartoum, Sudan.,Department of Neurology and Epileptology, Hertie Institute for Clinical Brain Research, Tübingen, Germany
| | - Mohammed O E Abdallah
- Department of Molecular Biology, Institute of Endemic Diseases, University of Khartoum, P. O. Box 102, Army Road, 11111, Khartoum, Sudan
| | - Mutaz Amin
- Department of Molecular Biology, Institute of Endemic Diseases, University of Khartoum, P. O. Box 102, Army Road, 11111, Khartoum, Sudan.,Department of Biochemistry, Faculty of Medicine, University of Khartoum, Khartoum, Sudan
| | - Muntaser Ibrahim
- Department of Molecular Biology, Institute of Endemic Diseases, University of Khartoum, P. O. Box 102, Army Road, 11111, Khartoum, Sudan.
|
19
|
He Y, Luo J, Chen Y, Zhou X, Yu S, Jin L, Xiao X, Jia S, Liu Q. ARHGAP18 is a novel gene under positive natural selection that influences HbF levels in β-thalassaemia. Mol Genet Genomics 2017; 293:207-216. [DOI: 10.1007/s00438-017-1377-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 05/05/2017] [Accepted: 09/25/2017] [Indexed: 10/18/2022]
|
20
|
Dutta R, Mainsah J, Yatskiv Y, Chakrabortty S, Brennan P, Khuder B, Qiu S, Fedorova L, Fedorov A. Intricacies in arrangement of SNP haplotypes suggest "Great Admixture" that created modern humans. BMC Genomics 2017; 18:433. [PMID: 28583085 PMCID: PMC5741169 DOI: 10.1186/s12864-017-3776-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 10/21/2016] [Accepted: 05/09/2017] [Indexed: 12/22/2022]
Abstract
Background Inferring history from genomic sequences is challenging and problematic because chromosomes are mosaics of thousands of small identical-by-descent (IBD) fragments, each of them having their own unique story. However, the main events in recent evolution might be deciphered from comparative analysis of numerous loci. A paradox of why humans, whose effective population size is only 10^4, have nearly three million frequent SNPs is formulated and examined. Results We studied 5398 loci evenly covering all human autosomes. Common haplotypes built from frequent SNPs that are present in people from various populations have been examined. We demonstrated highly non-random arrangement of alleles in common haplotypes. Abundance of mutually exclusive pairs of common haplotypes that have different alleles at every polymorphic position (so-called Yin/Yang haplotypes) was found in 56% of loci. A novel widely spread category of common haplotypes named Mosaic has been described. Mosaic consists of numerous pieces of Yin/Yang haplotypes and represents an ancestral stage of one of them. Scenarios of possible appearance of large number of frequent human SNPs and their habitual arrangement in Yin/Yang common haplotypes have been evaluated with an advanced genomic simulation algorithm. Conclusions Computer modeling demonstrated that the observed arrangement of 2.9 million frequent SNPs could not originate from a sole stand-alone population. A “Great Admixture” event has been proposed that can explain peculiarities with frequent SNP distributions. This Great Admixture presumably occurred 100–300 thousand years ago between two ancestral populations that had been separated from each other about a million years ago. Our programs and algorithms can be applied to other species to perform evolutionary and comparative genomics. Electronic supplementary material The online version of this article (doi:10.1186/s12864-017-3776-5) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Rajib Dutta
- Program in Biomedical Sciences, University of Toledo, Health Science Campus, Toledo, 43614, OH, USA.,Department of Medicine, University of Toledo, Health Science Campus, Toledo, 43614, OH, USA
| | - Joseph Mainsah
- Program in Bioinformatics and Proteomics/Genomics, University of Toledo, Health Science Campus, Toledo, 43614, OH, USA
| | - Yuriy Yatskiv
- Program in Bioinformatics and Proteomics/Genomics, University of Toledo, Health Science Campus, Toledo, 43614, OH, USA
| | - Sharmistha Chakrabortty
- Program in Bioinformatics and Proteomics/Genomics, University of Toledo, Health Science Campus, Toledo, 43614, OH, USA
| | - Patrick Brennan
- Program in Bioinformatics and Proteomics/Genomics, University of Toledo, Health Science Campus, Toledo, 43614, OH, USA
| | - Basil Khuder
- Program in Bioinformatics and Proteomics/Genomics, University of Toledo, Health Science Campus, Toledo, 43614, OH, USA
| | - Shuhao Qiu
- Department of Medicine, University of Toledo, Health Science Campus, Toledo, 43614, OH, USA
| | | | - Alexei Fedorov
- Program in Bioinformatics and Proteomics/Genomics, University of Toledo, Health Science Campus, Toledo, 43614, OH, USA.,Department of Medicine, University of Toledo, Health Science Campus, Toledo, 43614, OH, USA
|
21
|
Amos CI, Dennis J, Wang Z, Byun J, Schumacher FR, Gayther SA, Casey G, Hunter DJ, Sellers TA, Gruber SB, Dunning AM, Michailidou K, Fachal L, Doheny K, Spurdle AB, Li Y, Xiao X, Romm J, Pugh E, Coetzee GA, Hazelett DJ, Bojesen SE, Caga-Anan C, Haiman CA, Kamal A, Luccarini C, Tessier D, Vincent D, Bacot F, Van Den Berg DJ, Nelson S, Demetriades S, Goldgar DE, Couch FJ, Forman JL, Giles GG, Conti DV, Bickeböller H, Risch A, Waldenberger M, Brüske-Hohlfeld I, Hicks BD, Ling H, McGuffog L, Lee A, Kuchenbaecker K, Soucy P, Manz J, Cunningham JM, Butterbach K, Kote-Jarai Z, Kraft P, FitzGerald L, Lindström S, Adams M, McKay JD, Phelan CM, Benlloch S, Kelemen LE, Brennan P, Riggan M, O'Mara TA, Shen H, Shi Y, Thompson DJ, Goodman MT, Nielsen SF, Berchuck A, Laboissiere S, Schmit SL, Shelford T, Edlund CK, Taylor JA, Field JK, Park SK, Offit K, Thomassen M, Schmutzler R, Ottini L, Hung RJ, Marchini J, Amin Al Olama A, Peters U, Eeles RA, Seldin MF, Gillanders E, Seminara D, Antoniou AC, Pharoah PDP, Chenevix-Trench G, Chanock SJ, Simard J, Easton DF. The OncoArray Consortium: A Network for Understanding the Genetic Architecture of Common Cancers. Cancer Epidemiol Biomarkers Prev 2017; 26:126-135. [PMID: 27697780 PMCID: PMC5224974 DOI: 10.1158/1055-9965.epi-16-0106] [Citation(s) in RCA: 245] [Impact Index Per Article: 35.0] [Received: 02/01/2016] [Revised: 06/30/2016] [Accepted: 07/29/2016] [Indexed: 12/17/2022]
Abstract
BACKGROUND Common cancers develop through a multistep process often including inherited susceptibility. Collaboration among multiple institutions, and funding from multiple sources, has allowed the development of an inexpensive genotyping microarray, the OncoArray. The array includes a genome-wide backbone, comprising 230,000 SNPs tagging most common genetic variants, together with dense mapping of known susceptibility regions, rare variants from sequencing experiments, pharmacogenetic markers, and cancer-related traits. METHODS The OncoArray can be genotyped using a novel technology developed by Illumina to facilitate efficient genotyping. The consortium developed standard approaches for selecting SNPs for study, for quality control of markers, and for ancestry analysis. The array was genotyped at selected sites and with prespecified replicate samples to permit evaluation of genotyping accuracy among centers and by ethnic background. RESULTS The OncoArray consortium genotyped 447,705 samples. A total of 494,763 SNPs passed quality control steps with a sample success rate of 97% of the samples. Participating sites performed ancestry analysis using a common set of markers and a scoring algorithm based on principal components analysis. CONCLUSIONS Results from these analyses will enable researchers to identify new susceptibility loci, perform fine-mapping of new or known loci associated with either single or multiple cancers, assess the degree of overlap in cancer causation and pleiotropic effects of loci that have been identified for disease-specific risk, and jointly model genetic, environmental, and lifestyle-related exposures. IMPACT Ongoing analyses will shed light on etiology and risk assessment for many types of cancer. Cancer Epidemiol Biomarkers Prev; 26(1); 126-35. ©2016 AACR.
Affiliation(s)
- Christopher I Amos
  - Biomedical Data Science, Geisel School of Medicine at Dartmouth, Hanover, New Hampshire.
- Joe Dennis
  - Centre for Cancer Genetic Epidemiology, University of Cambridge, Cambridge, United Kingdom
- Zhaoming Wang
  - Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, Tennessee
- Jinyoung Byun
  - Biomedical Data Science, Geisel School of Medicine at Dartmouth, Hanover, New Hampshire
- Fredrick R Schumacher
  - Department of Epidemiology and Biostatistics, School of Medicine, Case Western Reserve University, Cleveland, Ohio
- Simon A Gayther
  - The Center for Bioinformatics and Functional Genomics at Cedars Sinai Medical Center, Greater Los Angeles Area, Los Angeles, California
- Graham Casey
  - Department of Preventive Medicine, Keck School of Medicine, University of Southern California Norris Comprehensive Cancer Center, Los Angeles, California
- David J Hunter
  - Department of Epidemiology, Program in Molecular and Genetic Epidemiology, Harvard School of Public Health, Boston, Massachusetts
- Thomas A Sellers
  - Department of Cancer Epidemiology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, Florida
- Stephen B Gruber
  - Department of Preventive Medicine, Keck School of Medicine, University of Southern California Norris Comprehensive Cancer Center, Los Angeles, California
- Alison M Dunning
  - Centre for Cancer Genetic Epidemiology, University of Cambridge, Cambridge, United Kingdom
- Kyriaki Michailidou
  - Centre for Cancer Genetic Epidemiology, University of Cambridge, Cambridge, United Kingdom
- Laura Fachal
  - Centre for Cancer Genetic Epidemiology, University of Cambridge, Cambridge, United Kingdom
- Kimberly Doheny
  - Center for Inherited Disease Research, Institute of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, Maryland
- Amanda B Spurdle
  - Molecular Cancer Epidemiology, QIMR Berghofer Medical Research Institute, Herston, Queensland, Australia
- Yafang Li
  - Biomedical Data Science, Geisel School of Medicine at Dartmouth, Hanover, New Hampshire
- Xiangjun Xiao
  - Biomedical Data Science, Geisel School of Medicine at Dartmouth, Hanover, New Hampshire
- Jane Romm
  - Center for Inherited Disease Research, Institute of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, Maryland
- Elizabeth Pugh
  - Center for Inherited Disease Research, Institute of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, Maryland
- Stig E Bojesen
  - Department of Clinical Biochemistry, Herlev and Gentofte Hospital, Copenhagen University Hospital, Copenhagen, Denmark
- Charlisse Caga-Anan
  - Division of Cancer Control and Population Sciences, National Cancer Institute, Bethesda, Maryland
- Christopher A Haiman
  - The Center for Bioinformatics and Functional Genomics at Cedars Sinai Medical Center, Greater Los Angeles Area, Los Angeles, California
- Ahsan Kamal
  - Biomedical Data Science, Geisel School of Medicine at Dartmouth, Hanover, New Hampshire
- Craig Luccarini
  - Centre for Cancer Genetic Epidemiology, University of Cambridge, Cambridge, United Kingdom
- Daniel Tessier
  - Génome Québec Innovation Centre, Montreal, Canada and McGill University, Montreal, Canada
- Daniel Vincent
  - Génome Québec Innovation Centre, Montreal, Canada and McGill University, Montreal, Canada
- François Bacot
  - Génome Québec Innovation Centre, Montreal, Canada and McGill University, Montreal, Canada
- David J Van Den Berg
  - Department of Preventive Medicine, Keck School of Medicine, University of Southern California Norris Comprehensive Cancer Center, Los Angeles, California
- Stefanie Nelson
  - Division of Cancer Control and Population Sciences, National Cancer Institute, Bethesda, Maryland
- Stephen Demetriades
  - University Health Network - The Princess Margaret Cancer Centre, Toronto, Canada
- Judith L Forman
  - Biomedical Data Science, Geisel School of Medicine at Dartmouth, Hanover, New Hampshire
- Graham G Giles
  - Cancer Epidemiology Centre, Cancer Council Victoria, Melbourne, Australia
  - Cancer, Genetics and Immunology, Menzies Institute for Medical Research, Hobart, Australia
- David V Conti
  - Division of Biostatistics, Department of Preventive Medicine, Zilkha Neurogenetic Institute, University of Southern California, Los Angeles, California
- Heike Bickeböller
  - Department of Genetic Epidemiology, University Medical Center, Georg-August-University, Göttingen, Germany
- Angela Risch
  - University of Salzburg and Cancer Cluster Salzburg, Salzburg, Austria
  - Division of Epigenomics and Cancer Risk Factors, German Cancer Research Center, Heidelberg, Germany
  - Translational Lung Research Center Heidelberg, Member of the German Center for Lung Research, Heidelberg, Germany
- Melanie Waldenberger
  - Research Unit of Molecular Epidemiology, Institute of Epidemiology II, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany
- Irene Brüske-Hohlfeld
  - Helmholtz Zentrum München, Institut für Epidemiologie I, Neuherberg, Oberschleissheim, Germany
- Belynda D Hicks
  - Cancer Genomics Research Laboratory, Frederick National Laboratory for Cancer Research, Frederick, Maryland
- Hua Ling
  - Center for Inherited Disease Research, Institute of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, Maryland
- Lesley McGuffog
  - Cancer Epidemiology Centre, Cancer Council Victoria, Melbourne, Australia
  - Cancer, Genetics and Immunology, Menzies Institute for Medical Research, Hobart, Australia
- Andrew Lee
  - Centre for Cancer Genetic Epidemiology, University of Cambridge, Cambridge, United Kingdom
- Karoline Kuchenbaecker
  - Centre for Cancer Genetic Epidemiology, University of Cambridge, Cambridge, United Kingdom
- Penny Soucy
  - Cancer Genomics Laboratory, Centre Hospitalier Universitaire de Québec and Laval University, Québec City, Canada
- Judith Manz
  - Research Unit of Molecular Epidemiology, Institute of Epidemiology II, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany
- Katja Butterbach
  - Cancer Epidemiology, German Cancer Research Center, Heidelberg, Germany
- Peter Kraft
  - Department of Epidemiology, Program in Molecular and Genetic Epidemiology, Harvard School of Public Health, Boston, Massachusetts
- Liesel FitzGerald
  - Cancer Epidemiology Centre, Cancer Council Victoria, Melbourne, Australia
  - Cancer, Genetics and Immunology, Menzies Institute for Medical Research, Hobart, Australia
- Sara Lindström
  - Department of Epidemiology, Program in Molecular and Genetic Epidemiology, Harvard School of Public Health, Boston, Massachusetts
  - Department of Epidemiology, University of Washington, Seattle, Washington
- Marcia Adams
  - Center for Inherited Disease Research, Institute of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, Maryland
- James D McKay
  - International Agency for Research on Cancer, World Health Organization, Lyon, France
- Catherine M Phelan
  - Department of Cancer Epidemiology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, Florida
- Sara Benlloch
  - Centre for Cancer Genetic Epidemiology, University of Cambridge, Cambridge, United Kingdom
- Linda E Kelemen
  - Department of Public Health Sciences, Medical University of South Carolina, Charleston, South Carolina
- Paul Brennan
  - International Agency for Research on Cancer, World Health Organization, Lyon, France
- Marjorie Riggan
  - Department of Gynecology, Duke University Medical Center, Durham, North Carolina
- Tracy A O'Mara
  - Cancer Division, QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
- Hongbing Shen
  - Department of Epidemiology and Biostatistics, Jiangsu Key Lab of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Medicine, School of Public Health, Nanjing Medical University, Nanjing, P.R. China
- Yongyong Shi
  - Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Bio-X Institutes, Shanghai Jiao Tong University, Shanghai, P.R. China
- Deborah J Thompson
  - Centre for Cancer Genetic Epidemiology, University of Cambridge, Cambridge, United Kingdom
- Sune F Nielsen
  - Department of Clinical Biochemistry, Herlev and Gentofte Hospital, Copenhagen University Hospital, Copenhagen, Denmark
  - Department of Oncology, Herlev and Gentofte Hospital, Copenhagen University Hospital, Copenhagen, Denmark
- Andrew Berchuck
  - Department of Gynecology, Duke University Medical Center, Durham, North Carolina
- Sylvie Laboissiere
  - Génome Québec Innovation Centre, Montreal, Canada and McGill University, Montreal, Canada
- Stephanie L Schmit
  - Department of Cancer Epidemiology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, Florida
  - Department of Gastrointestinal Oncology, H. Lee Moffitt Cancer Center, Tampa, Florida
- Tameka Shelford
  - Center for Inherited Disease Research, Institute of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, Maryland
- Christopher K Edlund
  - Department of Preventive Medicine, Keck School of Medicine, University of Southern California Norris Comprehensive Cancer Center, Los Angeles, California
- Jack A Taylor
  - Molecular and Genetic Epidemiology Group, National Institute for Environmental Health Sciences, Research Triangle Park, North Carolina
- John K Field
  - Institute of Translational Medicine, University of Liverpool, Liverpool, United Kingdom
- Sue K Park
  - College of Medicine, Seoul National University, Gwanak-gu, Seoul, Korea
- Kenneth Offit
  - Clinical Genetics Service, Memorial Hospital, New York, New York
  - Cancer Biology and Genetics Program, Sloan Kettering Institute, New York, New York
  - Department of Medicine, Weill Cornell Medical College, New York, New York
- Mads Thomassen
  - Department of Clinical Genetics, Odense University Hospital, Odense, Denmark
- Rita Schmutzler
  - Zentrum Familiärer Brust- und Eierstockkrebs, Universitätsklinikum Köln, Köln, Germany
- Laura Ottini
  - Department of Molecular Medicine, Sapienza, University of Rome, Rome, Italy
- Rayjean J Hung
  - Lunenfeld-Tanenbaum Research Institute of Mount Sinai Hospital, University of Toronto, Toronto, Canada
- Ali Amin Al Olama
  - Centre for Cancer Genetic Epidemiology, University of Cambridge, Cambridge, United Kingdom
- Ulrike Peters
  - Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington
- Michael F Seldin
  - Department of Biochemistry and Molecular Medicine, University of California at Davis, Davis, California
  - Department of Internal Medicine, University of California at Davis, Davis, California
- Elizabeth Gillanders
  - Division of Cancer Control and Population Sciences, National Cancer Institute, Bethesda, Maryland
- Daniela Seminara
  - Division of Cancer Control and Population Sciences, National Cancer Institute, Bethesda, Maryland
- Antonis C Antoniou
  - Centre for Cancer Genetic Epidemiology, University of Cambridge, Cambridge, United Kingdom
- Paul D P Pharoah
  - Centre for Cancer Genetic Epidemiology, University of Cambridge, Cambridge, United Kingdom
- Stephen J Chanock
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, NIH, Bethesda, Maryland
- Jacques Simard
  - Cancer Genomics Laboratory, Centre Hospitalier Universitaire de Québec and Laval University, Québec City, Canada
- Douglas F Easton
  - Centre for Cancer Genetic Epidemiology, University of Cambridge, Cambridge, United Kingdom
|
22
|
Abstract
The wealth of available genetic information is allowing the reconstruction of human demographic and adaptive history. Demography and purifying selection affect the purge of rare, deleterious mutations from the human population, whereas positive and balancing selection can increase the frequency of advantageous variants, improving survival and reproduction in specific environmental conditions. In this review, I discuss how theoretical and empirical population genetics studies, using both modern and ancient DNA data, are a powerful tool for obtaining new insight into the genetic basis of severe disorders and complex disease phenotypes, rare and common, focusing particularly on infectious disease risk.
Affiliation(s)
- Lluis Quintana-Murci
  - Human Evolutionary Genetics Unit, Department of Genomes & Genetics, Institut Pasteur, Paris, 75015, France.
  - Centre National de la Recherche Scientifique, URA3012, Paris, 75015, France.
  - Center of Bioinformatics, Biostatistics and Integrative Biology, Institut Pasteur, Paris, 75015, France.
|
23
|
Kim JK, Yeom M, Hong JK, Song I, Lee YS, Guengerich FP, Choi JY. Six Germline Genetic Variations Impair the Translesion Synthesis Activity of Human DNA Polymerase κ. Chem Res Toxicol 2016; 29:1741-1754. [PMID: 27603496 DOI: 10.1021/acs.chemrestox.6b00244] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 02/03/2023]
Abstract
DNA polymerase (pol) κ efficiently catalyzes error-free translesion DNA synthesis (TLS) opposite bulky N2-guanyl lesions induced by carcinogens such as polycyclic aromatic hydrocarbons. We investigated the biochemical effects of nine human nonsynonymous germline POLK variations on the TLS properties of pol κ, utilizing recombinant pol κ (residues 1-526) enzymes and DNA templates containing an N2-CH2(9-anthracenyl)G (N2-AnthG), 8-oxo-7,8-dihydroguanine (8-oxoG), O6-methyl(Me)G, or an abasic site. In steady-state kinetic analyses, the R246X, R298H, T473A, and R512W variants displayed 7- to 18-fold decreases in kcat/Km for dCTP insertion opposite G and N2-AnthG, with 2- to 3-fold decreases in DNA binding affinity, compared to that of the wild-type, and further showed 5- to 190-fold decreases in kcat/Km for next-base extension from C paired with N2-AnthG. The A471V variant showed 2- to 4-fold decreases in kcat/Km for correct nucleotide insertion opposite and beyond G (or N2-AnthG) compared to that of the wild-type. These five hypoactive variants also showed similar patterns of attenuation of TLS activity opposite 8-oxoG, O6-MeG, and abasic lesions. By contrast, the T44M variant exhibited 7- to 11-fold decreases in kcat/Km for dCTP insertion opposite N2-AnthG and O6-MeG (as well as for dATP insertion opposite an abasic site) but not opposite both G and 8-oxoG, nor beyond N2-AnthG, compared to that of the wild-type. These results suggest that the R246X, R298H, T473A, R512W, and A471V variants cause a general catalytic impairment of pol κ opposite G and all four lesions, whereas the T44M variant induces opposite lesion-dependent catalytic impairment, i.e., only opposite O6-MeG, abasic, and bulky N2-G lesions but not opposite G and 8-oxoG, in pol κ, which might indicate that these hypoactive pol κ variants are genetic factors in modifying individual susceptibility to genotoxic carcinogens in certain subsets of populations.
Affiliation(s)
- Jae-Kwon Kim
  - Division of Pharmacology, Department of Molecular Cell Biology, Samsung Biomedical Research Institute, Sungkyunkwan University School of Medicine, Suwon, Gyeonggi-do 16419, Republic of Korea
- Mina Yeom
  - Division of Pharmacology, Department of Molecular Cell Biology, Samsung Biomedical Research Institute, Sungkyunkwan University School of Medicine, Suwon, Gyeonggi-do 16419, Republic of Korea
- Jin-Kyung Hong
  - Division of Pharmacology, Department of Molecular Cell Biology, Samsung Biomedical Research Institute, Sungkyunkwan University School of Medicine, Suwon, Gyeonggi-do 16419, Republic of Korea
- Insil Song
  - Division of Pharmacology, Department of Molecular Cell Biology, Samsung Biomedical Research Institute, Sungkyunkwan University School of Medicine, Suwon, Gyeonggi-do 16419, Republic of Korea
- Young-Sam Lee
  - Department of New Biology, Daegu Gyeongbuk Institute of Science and Technology, Daegu 42988, Republic of Korea
- F Peter Guengerich
  - Department of Biochemistry, Vanderbilt University School of Medicine, Nashville, Tennessee 37232-0146, United States
- Jeong-Yun Choi
  - Division of Pharmacology, Department of Molecular Cell Biology, Samsung Biomedical Research Institute, Sungkyunkwan University School of Medicine, Suwon, Gyeonggi-do 16419, Republic of Korea
|
24
|
Abstract
Empirical studies and evolutionary theory support a role for rare variants in the etiology of complex traits. Given this motivation and increasing affordability of whole-exome and whole-genome sequencing, methods for rare variant association have been an active area of research for the past decade. Here, we provide a survey of the current literature and developments from the Genetics Analysis Workshop 19 (GAW19) Collapsing Rare Variants working group. In particular, we present the generalized linear regression framework and associated score statistic for the 2 major types of methods: burden and variance components methods. We further show that by simply modifying weights within these frameworks we arrive at many of the popular existing methods, for example, the cohort allelic sums test and sequence kernel association test. Meta-analysis techniques are also described. Next, we describe the 6 contributions from the GAW19 Collapsing Rare Variants working group. These included development of new methods, such as a retrospective likelihood for family data, a method using genomic structure to compare cases and controls, a haplotype-based meta-analysis, and a permutation-based method for combining different statistical tests. In addition, one contribution compared a mega-analysis of family-based and population-based data to meta-analysis. Finally, the power of existing family-based methods for binary traits was compared. We conclude with suggestions for open research questions.
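As a hedged illustration of the score-statistic framework this abstract describes, the two families of tests can be written compactly. The notation is an assumption chosen here, following a common convention in the rare-variant literature rather than the review itself: $y_i$ is the phenotype of subject $i$, $\hat{\mu}_i$ its fitted mean under the null model, $G_{ij}$ the genotype of subject $i$ at variant $j$, and $w_j$ a per-variant weight.

```latex
\begin{align}
S_j &= \sum_{i=1}^{n} G_{ij}\,\bigl(y_i - \hat{\mu}_i\bigr)
  && \text{(score for variant } j\text{)} \\
Q_{\text{burden}} &= \Bigl(\sum_{j=1}^{m} w_j\, S_j\Bigr)^{2}
  && \text{(burden: weight, sum, then square)} \\
Q_{\text{VC}} &= \sum_{j=1}^{m} w_j^{2}\, S_j^{2}
  && \text{(variance components: square, then sum)}
\end{align}
```

Under this notation, the burden form loses power when the $S_j$ have mixed signs because they cancel inside the square, whereas the variance-components form does not. The choice of $w_j$ is what distinguishes individual methods within each family.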
Affiliation(s)
- Stephanie A Santorico
  - Department of Mathematical and Statistical Sciences, University of Colorado Denver, Denver, CO, 80217-3364, USA.
- Audrey E Hendricks
  - Department of Mathematical and Statistical Sciences, University of Colorado Denver, Denver, CO, 80217-3364, USA.
|
25
|
Halvorsen M, Petrovski S, Shellhaas R, Tang Y, Crandall L, Goldstein D, Devinsky O. Mosaic mutations in early-onset genetic diseases. Genet Med 2015; 18:746-9. [PMID: 26716362 DOI: 10.1038/gim.2015.155] [Citation(s) in RCA: 51] [Impact Index Per Article: 5.7] [Received: 05/27/2015] [Accepted: 09/16/2015] [Indexed: 11/09/2022] Open
Abstract
PURPOSE An emerging approach in medical genetics is to identify de novo mutations in patients with severe early-onset genetic disease that are absent in population controls and in the patient's parents. This approach, however, frequently misses post-zygotic "mosaic" mutations that are present in only a portion of the healthy parents' cells and are transmitted to offspring. METHODS We constructed a mosaic transmission screen for variants that have an ~50% alternative allele ratio in the proband but are significantly less than 50% in the transmitting parent. We applied it to two family-based genetic disease cohorts consisting of 9 cases of sudden unexplained death in childhood (SUDC) and 338 previously published cases of epileptic encephalopathy. RESULTS The screen identified six parental-mosaic transmissions across the two cohorts. The resultant rate of ~0.02 identified transmissions per trio is far lower than that of de novo mutations. Among these transmissions were two likely disease-causing mutations: an SCN1A mutation transmitted to an SUDC proband and her sibling with Dravet syndrome, as well as an SLC6A1 mutation in a proband with epileptic encephalopathy. CONCLUSION These results highlight explicit screening for mosaic mutations as an important complement to the established approach of screening for de novo mutations. Genet Med 18(7), 746-749.
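The screening rule described in METHODS can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, the heterozygous-like window of 0.35 to 0.65, the significance threshold of 1e-3, and the use of an exact one-sided binomial test are all hypothetical choices made here for the example.

```python
from math import comb


def binom_lower_tail(alt_reads: int, total_reads: int, p: float = 0.5) -> float:
    """Exact P(X <= alt_reads) for X ~ Binomial(total_reads, p)."""
    return sum(
        comb(total_reads, k) * p**k * (1 - p) ** (total_reads - k)
        for k in range(alt_reads + 1)
    )


def is_candidate_mosaic_transmission(
    proband_alt: int,
    proband_total: int,
    parent_alt: int,
    parent_total: int,
    alpha: float = 1e-3,            # hypothetical significance threshold
    het_window=(0.35, 0.65),        # hypothetical "about 50%" window
) -> bool:
    """Flag a variant that looks heterozygous in the proband but whose
    alternative allele ratio in the parent is significantly below 50%."""
    if proband_total == 0 or parent_total == 0:
        return False
    # Assumption: the parent must show at least one alternative read,
    # otherwise the variant would look de novo rather than mosaic.
    if parent_alt == 0:
        return False
    proband_ratio = proband_alt / proband_total
    looks_heterozygous = het_window[0] <= proband_ratio <= het_window[1]
    # One-sided test of the parent's read counts against a 50% ratio.
    parent_p_value = binom_lower_tail(parent_alt, parent_total, 0.5)
    return looks_heterozygous and parent_p_value < alpha


if __name__ == "__main__":
    # Proband near 50%, parent at 8% of reads: flagged as a candidate.
    print(is_candidate_mosaic_transmission(48, 100, 8, 100))
    # Proband near 50%, parent also near 50%: ordinary inherited variant.
    print(is_candidate_mosaic_transmission(48, 100, 47, 100))
```

A production screen would also need to handle sequencing error, reference bias, and multiple testing across variants, none of which this sketch attempts.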
Affiliation(s)
- Matt Halvorsen
  - Institute for Genomic Medicine, Columbia University, New York, New York, USA
- Slavé Petrovski
  - Institute for Genomic Medicine, Columbia University, New York, New York, USA
  - Department of Medicine, The University of Melbourne, Austin Health and Royal Melbourne Hospital, Melbourne, Victoria, Australia
- Renée Shellhaas
  - Division of Pediatric Neurology, C.S. Mott Children's Hospital, University of Michigan, Ann Arbor, Michigan, USA
- Yingying Tang
  - Molecular Genetics Laboratory, New York City Office of the Chief Medical Examiner, New York, New York, USA
  - Department of Pathology, NYU Langone Medical Center, New York, New York, USA
- Laura Crandall
  - SUDC Foundation, Herndon, Virginia, USA
  - Comprehensive Epilepsy Center, Department of Neurology, NYU Langone Medical Center, New York, New York, USA
- David Goldstein
  - Institute for Genomic Medicine, Columbia University, New York, New York, USA
- Orrin Devinsky
  - Comprehensive Epilepsy Center, Department of Neurology, NYU Langone Medical Center, New York, New York, USA
|
26
|
Association mapping reveals the role of purifying selection in the maintenance of genomic variation in gene expression. Proc Natl Acad Sci U S A 2015; 112:15390-5. [PMID: 26604315 DOI: 10.1073/pnas.1503027112] [Citation(s) in RCA: 64] [Impact Index Per Article: 7.1] [Indexed: 11/18/2022] Open
Abstract
The evolutionary forces that maintain genetic variation in quantitative traits within populations remain poorly understood. One hypothesis suggests that variation is under purifying selection, resulting in an excess of low-frequency variants and a negative correlation between minor allele frequency and selection coefficients. Here, we test these predictions using the genetic loci associated with total expression variation (eQTLs) and allele-specific expression variation (aseQTLs) mapped within a single population of the plant Capsella grandiflora. In addition to finding eQTLs and aseQTLs for a large fraction of genes, we show that alleles at these loci are rarer than expected and exhibit a negative correlation between phenotypic effect size and frequency. Overall, our results show that the distribution of frequencies and effect sizes of the loci responsible for local expression variation within a single outcrossing population are consistent with the effects of purifying selection.
|
27
|
Genetic Architecture of Complex Human Traits: What Have We Learned from Genome-Wide Association Studies? Current Genetic Medicine Reports 2015. [DOI: 10.1007/s40142-015-0083-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/22/2022]
|
28
|
Gu W, Gurguis CI, Zhou JJ, Zhu Y, Ko EA, Ko JH, Wang T, Zhou T. Functional and Structural Consequence of Rare Exonic Single Nucleotide Polymorphisms: One Story, Two Tales. Genome Biol Evol 2015; 7:2929-40. [PMID: 26454016 PMCID: PMC4684694 DOI: 10.1093/gbe/evv191] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Accepted: 10/05/2015] [Indexed: 01/01/2023] Open
Abstract
Genetic variation arising from single nucleotide polymorphisms (SNPs) is ubiquitously found among human populations. While disease-causing variants are known in some cases, identifying functional or causative variants for most human diseases remains a challenging task. Rare SNPs, rather than common ones, are thought to be more important in the pathology of most human diseases. We propose that rare SNPs should be divided into two categories dependent on whether the minor alleles are derived or ancestral. Derived alleles are less likely to have been purified by evolutionary processes and may be more likely to induce deleterious effects. We therefore hypothesized that the rare SNPs with derived minor alleles would be more important for human diseases and predicted that these variants would have larger functional or structural consequences relative to the rare variants for which the minor alleles are ancestral. We systematically investigated the consequences of the exonic SNPs on protein function, mRNA structure, and translation. We found that the functional and structural consequences are more significant for the rare exonic variants for which the minor alleles are derived. However, this pattern is reversed when the minor alleles are ancestral. Thus, the rare exonic SNPs with derived minor alleles are more likely to be deleterious. Age estimation of rare SNPs confirms that these potentially deleterious SNPs are recently evolved in the human population. These results have important implications for understanding the function of genetic variations in human exonic regions and for prioritizing functional SNPs in genome-wide association studies of human diseases.
Affiliation(s)
- Wanjun Gu
  - Research Center for Learning Sciences, Southeast University, Nanjing, Jiangsu, China
- Jin J Zhou
  - Department of Epidemiology and Biostatistics, The University of Arizona
- Yihua Zhu
  - School of Biological Science and Medical Engineering, Southeast University, Nanjing, Jiangsu, China
  - College of Information Science and Technology, Nanjing Agricultural University, Nanjing, Jiangsu, China
- Eun-A Ko
  - Department of Pharmacology, The University of Nevada School of Medicine, Reno
- Jae-Hong Ko
  - Department of Physiology, College of Medicine, Chung-Ang University, Seoul, South Korea
- Ting Wang
  - Department of Medicine, The University of Arizona
- Tong Zhou
  - Department of Medicine, The University of Arizona
|
29
|
Zhang W, Spector TD, Deloukas P, Bell JT, Engelhardt BE. Predicting genome-wide DNA methylation using methylation marks, genomic position, and DNA regulatory elements. Genome Biol 2015; 16:14. [PMID: 25616342 PMCID: PMC4389802 DOI: 10.1186/s13059-015-0581-9] [Citation(s) in RCA: 125] [Impact Index Per Article: 13.9] [Received: 08/07/2013] [Accepted: 01/02/2015] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND Recent assays for individual-specific genome-wide DNA methylation profiles have enabled epigenome-wide association studies to identify specific CpG sites associated with a phenotype. Computational prediction of CpG site-specific methylation levels is critical to enable genome-wide analyses, but current approaches tackle average methylation within a locus and are often limited to specific genomic regions. RESULTS We characterize genome-wide DNA methylation patterns, and show that correlation among CpG sites decays rapidly, making predictions solely based on neighboring sites challenging. We built a random forest classifier to predict methylation levels at CpG site resolution using features including neighboring CpG site methylation levels and genomic distance, co-localization with coding regions, CpG islands (CGIs), and regulatory elements from the ENCODE project. Our approach achieves 92% prediction accuracy of genome-wide methylation levels at single-CpG-site precision. The accuracy increases to 98% when restricted to CpG sites within CGIs and is robust across platform and cell-type heterogeneity. Our classifier outperforms other types of classifiers and identifies features that contribute to prediction accuracy: neighboring CpG site methylation, CGIs, co-localized DNase I hypersensitive sites, transcription factor binding sites, and histone modifications were found to be most predictive of methylation levels. CONCLUSIONS Our observations of DNA methylation patterns led us to develop a classifier to predict DNA methylation levels at CpG site resolution with high accuracy. Furthermore, our method identified genomic features that interact with DNA methylation, suggesting mechanisms involved in DNA methylation modification and regulation, and linking diverse epigenetic processes.
Affiliation(s)
- Weiwei Zhang
  - Department of Molecular Genetics and Microbiology, Duke University, Durham, NC, USA.
- Tim D Spector
  - Department of Twin Research and Genetic Epidemiology, King's College London, London, UK.
- Panos Deloukas
  - William Harvey Research Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London, UK.
  - Princess Al-Jawhara Al-Brahim Centre of Excellence in Research of Hereditary Disorders (PACER-HD), King Abdulaziz University, Jeddah, 21589, Saudi Arabia.
- Jordana T Bell
  - Department of Twin Research and Genetic Epidemiology, King's College London, London, UK.
|
30
|
Kim J, Song I, Jo A, Shin JH, Cho H, Eoff RL, Guengerich FP, Choi JY. Biochemical analysis of six genetic variants of error-prone human DNA polymerase ι involved in translesion DNA synthesis. Chem Res Toxicol 2014; 27:1837-52. [PMID: 25162224 PMCID: PMC4203391 DOI: 10.1021/tx5002755] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Indexed: 02/06/2023]
Abstract
DNA polymerase (pol) ι is the most error-prone among the Y-family polymerases that participate in translesion synthesis (TLS). Pol ι can bypass various DNA lesions, e.g., N2-ethyl(Et)G, O6-methyl(Me)G, 8-oxo-7,8-dihydroguanine (8-oxoG), and an abasic site, though frequently with low fidelity. We assessed the biochemical effects of six reported genetic variations of human pol ι on its TLS properties, using the recombinant pol ι (residues 1–445) proteins and DNA templates containing a G, N2-EtG, O6-MeG, 8-oxoG, or abasic site. The Δ1–25 variant, which is the N-terminal truncation of 25 residues resulting from an initiation codon variant (c.3G>A) and also is the formerly misassigned wild-type, exhibited considerably higher polymerase activity than wild-type with Mg2+ (but not with Mn2+), coinciding with its steady-state kinetic data showing a ∼10-fold increase in kcat/Km for nucleotide incorporation opposite templates (only with Mg2+). The R96G variant, which lacks a R96 residue known to interact with the incoming nucleotide, lost much of its polymerase activity, consistent with the kinetic data displaying 5- to 72-fold decreases in kcat/Km for nucleotide incorporation opposite templates either with Mg2+ or Mn2+, except for that opposite N2-EtG with Mn2+ (showing a 9-fold increase for dCTP incorporation). The Δ1–25 variant bound DNA 20- to 29-fold more tightly than wild-type (with Mg2+), but the R96G variant bound DNA 2-fold less tightly than wild-type. The DNA-binding affinity of wild-type, but not of the Δ1–25 variant, was ∼7-fold stronger with 0.15 mM Mn2+ than with Mg2+. The results indicate that the R96G variation severely impairs most of the Mg2+- and Mn2+-dependent TLS abilities of pol ι, whereas the Δ1–25 variation selectively and substantially enhances the Mg2+-dependent TLS capability of pol ι, emphasizing the potential translational importance of these pol ι genetic variations, e.g., individual differences in TLS, mutation, and cancer susceptibility to genotoxic carcinogens.
Affiliation(s)
- Jinsook Kim
  - Division of Pharmacology, Department of Molecular Cell Biology, and Department of Physiology, Samsung Biomedical Research Institute, Sungkyunkwan University School of Medicine, Suwon, Gyeonggi-do 440-746, Republic of Korea
|
31
|
Takata A, Xu B, Ionita-Laza I, Roos JL, Gogos JA, Karayiorgou M. Loss-of-function variants in schizophrenia risk and SETD1A as a candidate susceptibility gene. Neuron 2014; 82:773-80. [PMID: 24853937 DOI: 10.1016/j.neuron.2014.04.043] [Citation(s) in RCA: 129] [Impact Index Per Article: 12.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 04/21/2014] [Indexed: 01/08/2023]
Abstract
Loss-of-function (LOF) (i.e., nonsense, splice site, and frameshift) variants that lead to disruption of gene function are likely to contribute to the etiology of neuropsychiatric disorders. Here, we perform a systematic investigation of the role of both de novo and inherited LOF variants in schizophrenia using exome sequencing data from 231 case and 34 control trios. We identify two de novo LOF variants in the SETD1A gene, which encodes a subunit of histone methyltransferase, a finding unlikely to have occurred by chance, and provide evidence for a more general role of chromatin regulators in schizophrenia risk. Transmission pattern analyses reveal that LOF variants are more likely to be transmitted to affected individuals than controls. This is especially true for private LOF variants in genes intolerant to functional genetic variation. These findings highlight the contribution of LOF mutations to the genetic architecture of schizophrenia and provide important insights into disease pathogenesis.
Affiliation(s)
- Atsushi Takata
- Department of Psychiatry, Columbia University Medical Center, New York, NY 10032, USA
- Bin Xu
- Department of Psychiatry, Columbia University Medical Center, New York, NY 10032, USA
- Iuliana Ionita-Laza
- Department of Biostatistics, Columbia University Medical Center, New York, NY 10032, USA
- J Louw Roos
- Department of Psychiatry, Pretoria University, Weskoppies Hospital, Pretoria 0001, Republic of South Africa
- Joseph A Gogos
- Department of Neuroscience, Columbia University Medical Center, New York, NY 10032, USA; Department of Physiology & Cellular Biophysics, Columbia University Medical Center, New York, NY 10032, USA.
- Maria Karayiorgou
- Department of Psychiatry, Columbia University Medical Center, New York, NY 10032, USA; New York State Psychiatric Institute, New York, NY 10032, USA.
|
32
|
Song I, Kim EJ, Kim IH, Park EM, Lee KE, Shin JH, Guengerich FP, Choi JY. Biochemical characterization of eight genetic variants of human DNA polymerase κ involved in error-free bypass across bulky N(2)-guanyl DNA adducts. Chem Res Toxicol 2014; 27:919-30. [PMID: 24725253 DOI: 10.1021/tx500072m] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/23/2022]
Abstract
DNA polymerase (pol) κ, one of the Y-family polymerases, has been shown to function in error-free translesion DNA synthesis (TLS) opposite the bulky N(2)-guanyl DNA lesions induced by many carcinogens such as polycyclic aromatic hydrocarbons. We analyzed the biochemical properties of eight reported human pol κ variants positioned in the polymerase core domain, using the recombinant pol κ (residues 1-526) protein and the DNA template containing an N(2)-CH2(9-anthracenyl)G (N(2)-AnthG). The truncation R219X was devoid of polymerase activity, and the E419G and Y432S variants showed much lower polymerase activity than wild-type pol κ. In steady-state kinetic analyses, E419G and Y432S displayed 20- to 34-fold decreases in kcat/Km for dCTP insertion opposite G and N(2)-AnthG compared to that of wild-type pol κ. The L21F, I39T, and D189G variants, as well as E419G and Y432S, displayed 6- to 22-fold decreases in kcat/Km for next-base extension from C paired with N(2)-AnthG, compared to that of wild-type pol κ. The defective Y432S variant had 4- to 5-fold lower DNA-binding affinity than wild-type, while a slightly more efficient S423R variant possessed 2- to 3-fold higher DNA-binding affinity. These results suggest that R219X abolishes and the E419G, Y432S, L21F, I39T, and D189G variations substantially impair the TLS ability of pol κ opposite bulky N(2)-G lesions in the insertion step opposite the lesion and/or the subsequent extension step, raising the possibility that certain nonsynonymous pol κ genetic variations translate into individual differences in susceptibility to genotoxic carcinogens.
Affiliation(s)
- Insil Song
- Division of Pharmacology, Department of Molecular Cell Biology, Samsung Biomedical Research Institute, Sungkyunkwan University School of Medicine, Suwon, Gyeonggi-do 440-746, Republic of Korea
|
33
|
Nelson CL, Pelak K, Podgoreanu MV, Ahn SH, Scott WK, Allen AS, Cowell LG, Rude TH, Zhang Y, Tong A, Ruffin F, Sharma-Kuinkel BK, Fowler VG. A genome-wide association study of variants associated with acquisition of Staphylococcus aureus bacteremia in a healthcare setting. BMC Infect Dis 2014; 14:83. [PMID: 24524581 PMCID: PMC3928605 DOI: 10.1186/1471-2334-14-83] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/09/2013] [Accepted: 02/06/2014] [Indexed: 01/10/2023] Open
Abstract
Background: Humans vary in their susceptibility to acquiring Staphylococcus aureus infection, and research suggests that there is a genetic basis for this variability. Several recent genome-wide association studies (GWAS) have identified variants that may affect susceptibility to infectious diseases, demonstrating the potential value of GWAS in this arena. Methods: We conducted a GWAS to identify common variants associated with acquisition of S. aureus bacteremia (SAB) resulting from healthcare contact. We performed a logistic regression analysis to compare patients with healthcare contact who developed SAB (361 cases) to patients with healthcare contact in the same hospital who did not develop SAB (699 controls), testing 542,410 SNPs and adjusting for age (by decade), sex, and 6 significant principal components from our EIGENSTRAT analysis. Additionally, we evaluated the joint effect of the host and pathogen genomes in association with severity of SAB infection via logistic regression, including an interaction of host SNP with bacterial genotype, and adjusting for age (by decade), sex, the 6 significant principal components, and dialysis status. Bonferroni corrections were applied in both analyses to control for multiple comparisons. Results: Ours is the first study that has attempted to evaluate the entire human genome for variants potentially involved in the acquisition or severity of SAB. Although this study identified no common variant of large effect size to have genome-wide significance for association with either the risk of acquiring SAB or severity of SAB, the variant (rs2043436) most significantly associated with severity of infection is located in a biologically plausible candidate gene (CDON, a member of the immunoglobulin family) and may warrant further study. Conclusions: The genetic architecture underlying SAB is likely to be complex. Future investigations using larger samples, narrowed phenotypes, and advances in both genotyping and analytical methodologies will be important tools for identifying causative variants for this common and serious cause of healthcare-associated infection.
Affiliation(s)
- Charlotte L Nelson
- Duke Clinical Research Institute, Duke University Medical Center, 2400 Pratt Street, Room 0311 Terrace Level, Durham, NC 27705, USA.
|
34
|
Sabarinathan R, Wenzel A, Novotny P, Tang X, Kalari KR, Gorodkin J. Transcriptome-wide analysis of UTRs in non-small cell lung cancer reveals cancer-related genes with SNV-induced changes on RNA secondary structure and miRNA target sites. PLoS One 2014; 9:e82699. [PMID: 24416147 PMCID: PMC3885406 DOI: 10.1371/journal.pone.0082699] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/02/2013] [Accepted: 10/26/2013] [Indexed: 01/08/2023] Open
Abstract
Traditional mutation assessment methods generally focus on predicting disruptive changes in protein-coding regions rather than non-coding regulatory regions like untranslated regions (UTRs) of mRNAs. The UTRs, however, are known to have many sequence and structural motifs that can regulate translational and transcriptional efficiency and stability of mRNAs through interaction with RNA-binding proteins and other non-coding RNAs like microRNAs (miRNAs). In a recent study, transcriptomes of tumor cells harboring mutant and wild-type KRAS (V-Ki-ras2 Kirsten rat sarcoma viral oncogene homolog) genes in patients with non-small cell lung cancer (NSCLC) have been sequenced to identify single nucleotide variations (SNVs). About 40% of the total SNVs (73,717) identified were mapped to UTRs, but omitted in the previous analysis. To meet this obvious demand for analysis of the UTRs, we designed a comprehensive pipeline to predict the effect of SNVs on two major regulatory elements, secondary structure and miRNA target sites. Out of 29,290 SNVs in 6462 genes, we predict 472 SNVs (in 408 genes) affecting local RNA secondary structure, 490 SNVs (in 447 genes) affecting miRNA target sites and 48 that do both. Together these disruptive SNVs were present in 803 different genes, out of which 188 (23.4%) were previously known to be cancer-associated. Notably, this ratio is significantly higher (one-sided Fisher's exact test p-value = 0.032) than the ratio (20.8%) of known cancer-associated genes (n = 1347) in our initial data set (n = 6462). Network analysis shows that the genes harboring disruptive SNVs were involved in molecular mechanisms of cancer, and the signaling pathways of LPS-stimulated MAPK, IL-6, iNOS, EIF2 and mTOR. In conclusion, we have found hundreds of SNVs which are highly disruptive with respect to changes in the secondary structure and miRNA target sites within UTRs. These changes hold the potential to alter the expression of known cancer genes or genes linked to cancer-associated pathways.
Affiliation(s)
- Radhakrishnan Sabarinathan
- Center for non-coding RNA in Technology and Health, Section for Animal Genetics, Bioinformatics and Breeding, IKVH, University of Copenhagen, Frederiksberg, Denmark
- Anne Wenzel
- Center for non-coding RNA in Technology and Health, Section for Animal Genetics, Bioinformatics and Breeding, IKVH, University of Copenhagen, Frederiksberg, Denmark
- Peter Novotny
- Bioinformatics Centre, Department of Biology and Biotech Research and Innovation Centre, University of Copenhagen, Copenhagen, Denmark
- Xiaojia Tang
- Division of Biostatistics and Bioinformatics, Department of Health Sciences Research, Mayo Clinic, Rochester, Minnesota, United States of America
- Krishna R. Kalari
- Division of Biostatistics and Bioinformatics, Department of Health Sciences Research, Mayo Clinic, Rochester, Minnesota, United States of America
- Department of Cancer Biology, Mayo Clinic Comprehensive Cancer Center, Jacksonville, Florida, United States of America
- Jan Gorodkin
- Center for non-coding RNA in Technology and Health, Section for Animal Genetics, Bioinformatics and Breeding, IKVH, University of Copenhagen, Frederiksberg, Denmark
|
35
|
Gupta PK, Kulwal PL, Jaiswal V. Association mapping in crop plants: opportunities and challenges. ADVANCES IN GENETICS 2014; 85:109-47. [PMID: 24880734 DOI: 10.1016/b978-0-12-800271-1.00002-0] [Citation(s) in RCA: 69] [Impact Index Per Article: 6.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/22/2023]
Abstract
The research area of association mapping (AM) is currently receiving major attention for genetic studies of quantitative traits in all major crops. However, the level of success and utility of AM achieved for crop improvement is not comparable to that in the area of human health care for diagnosis of complex human diseases. These AM studies in plants, as in humans, became possible due to the availability of DNA-based molecular markers and a variety of sophisticated statistical tools that are evolving on a regular basis. In this chapter, we first briefly review the significance of a variety of populations that are used in AM studies, then briefly describe the molecular markers and high-throughput genotyping strategies, and finally describe the approaches used for AM studies. The major part of the chapter is, however, devoted to analysis of reasons why the results of AM have been underutilized in plant breeding. We also examine the opportunities available and challenges faced while using AM for crop improvement programs. This includes a detailed discussion of the issues that have plagued AM studies, and the solutions that have become available to deal with these issues, so that in future, the results of AM studies may prove increasingly fruitful for crop improvement programs.
Affiliation(s)
- Pushpendra K Gupta
- Department of Genetics and Plant Breeding, Ch. Charan Singh University, Meerut, UP, India
- Pawan L Kulwal
- State Level Biotechnology Centre, Mahatma Phule Agricultural University, Rahuri, MS, India
- Vandana Jaiswal
- Department of Genetics and Plant Breeding, Ch. Charan Singh University, Meerut, UP, India
|
36
|
Combarros O. Genetic Risk Factors for Alzheimer’s Disease. NEURODEGENER DIS 2014. [DOI: 10.1007/978-1-4471-6380-0_4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/18/2023] Open
|
37
|
Evaluating empirical bounds on complex disease genetic architecture. Nat Genet 2013; 45:1418-27. [PMID: 24141362 DOI: 10.1038/ng.2804] [Citation(s) in RCA: 106] [Impact Index Per Article: 9.6] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/11/2013] [Accepted: 09/30/2013] [Indexed: 12/13/2022]
Abstract
The genetic architecture of human diseases governs the success of genetic mapping and the future of personalized medicine. Although numerous studies have queried the genetic basis of common disease, contradictory hypotheses have been advocated about features of genetic architecture (for example, the contribution of rare versus common variants). We developed an integrated simulation framework, calibrated to empirical data, to enable the systematic evaluation of such hypotheses. For type 2 diabetes (T2D), two simple parameters--(i) the target size for causal mutation and (ii) the coupling between selection and phenotypic effect--define a broad space of architectures. Whereas extreme models are excluded by the combination of epidemiology, linkage and genome-wide association studies, many models remain consistent, including those where rare variants explain either little (<25%) or most (>80%) of T2D heritability. Ongoing sequencing and genotyping studies will further constrain the space of possible architectures, but very large samples (for example, >250,000 unselected individuals) will be required to localize most of the heritability underlying T2D and other traits characterized by these models.
|
38
|
Battle A, Mostafavi S, Zhu X, Potash JB, Weissman MM, McCormick C, Haudenschild CD, Beckman KB, Shi J, Mei R, Urban AE, Montgomery SB, Levinson DF, Koller D. Characterizing the genetic basis of transcriptome diversity through RNA-sequencing of 922 individuals. Genome Res 2013. [PMID: 24092820 DOI: 10.1101/gr.155192] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/20/2023]
Abstract
Understanding the consequences of regulatory variation in the human genome remains a major challenge, with important implications for understanding gene regulation and interpreting the many disease-risk variants that fall outside of protein-coding regions. Here, we provide a direct window into the regulatory consequences of genetic variation by sequencing RNA from 922 genotyped individuals. We present a comprehensive description of the distribution of regulatory variation--by the specific expression phenotypes altered, the properties of affected genes, and the genomic characteristics of regulatory variants. We detect variants influencing expression of over ten thousand genes, and through the enhanced resolution offered by RNA-sequencing, for the first time we identify thousands of variants associated with specific phenotypes including splicing and allelic expression. Evaluating the effects of both long-range intra-chromosomal and trans (cross-chromosomal) regulation, we observe modularity in the regulatory network, with three-dimensional chromosomal configuration playing a particular role in regulatory modules within each chromosome. We also observe a significant depletion of regulatory variants affecting central and critical genes, along with a trend of reduced effect sizes as variant frequency increases, providing evidence that purifying selection and buffering have limited the deleterious impact of regulatory variation on the cell. Further, generalizing beyond observed variants, we have analyzed the genomic properties of variants associated with expression and splicing and developed a Bayesian model to predict regulatory consequences of genetic variants, applicable to the interpretation of individual genomes and disease studies. Together, these results represent a critical step toward characterizing the complete landscape of human regulatory variation.
Affiliation(s)
- Alexis Battle
- Department of Computer Science, Stanford University, Stanford, California 94305, USA
|
39
|
Characterizing the genetic basis of transcriptome diversity through RNA-sequencing of 922 individuals. Genome Res 2013; 24:14-24. [PMID: 24092820 PMCID: PMC3875855 DOI: 10.1101/gr.155192.113] [Citation(s) in RCA: 386] [Impact Index Per Article: 35.1] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/17/2023]
Abstract
Understanding the consequences of regulatory variation in the human genome remains a major challenge, with important implications for understanding gene regulation and interpreting the many disease-risk variants that fall outside of protein-coding regions. Here, we provide a direct window into the regulatory consequences of genetic variation by sequencing RNA from 922 genotyped individuals. We present a comprehensive description of the distribution of regulatory variation—by the specific expression phenotypes altered, the properties of affected genes, and the genomic characteristics of regulatory variants. We detect variants influencing expression of over ten thousand genes, and through the enhanced resolution offered by RNA-sequencing, for the first time we identify thousands of variants associated with specific phenotypes including splicing and allelic expression. Evaluating the effects of both long-range intra-chromosomal and trans (cross-chromosomal) regulation, we observe modularity in the regulatory network, with three-dimensional chromosomal configuration playing a particular role in regulatory modules within each chromosome. We also observe a significant depletion of regulatory variants affecting central and critical genes, along with a trend of reduced effect sizes as variant frequency increases, providing evidence that purifying selection and buffering have limited the deleterious impact of regulatory variation on the cell. Further, generalizing beyond observed variants, we have analyzed the genomic properties of variants associated with expression and splicing and developed a Bayesian model to predict regulatory consequences of genetic variants, applicable to the interpretation of individual genomes and disease studies. Together, these results represent a critical step toward characterizing the complete landscape of human regulatory variation.
|
40
|
Matullo G, Di Gaetano C, Guarrera S. Next generation sequencing and rare genetic variants: from human population studies to medical genetics. ENVIRONMENTAL AND MOLECULAR MUTAGENESIS 2013; 54:518-532. [PMID: 23922201 DOI: 10.1002/em.21799] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/20/2013] [Revised: 05/31/2013] [Accepted: 06/09/2013] [Indexed: 06/02/2023]
Abstract
The allelic frequency spectrum emerging from several Next Generation Sequencing (NGS) projects is revealing important details about evolutionary and demographic forces that shaped the human genome. Herein, we discuss some of the achievements of the use of low-frequency and rare variants from NGS studies. The majority of variants that affect protein-coding regions are recent and rare. Often, the novel rare variants are enriched for deleterious alleles and are population-specific, making them suitable for the study of disease susceptibility. To investigate this kind of variation and its effects in association studies, very large sample sizes will be necessary to achieve sufficient statistical power. Moreover, as these variants are typically population-specific, the replication of disease associations across populations could be very difficult due to population stratification. Therefore, the design of experiments focusing on the identification of rare variants and their effects should be carefully planned. Although several successes have already been achieved through NGS for genetic epidemiology, pharmacogenetic and clinical purposes, with improvements of the sequencing technology and decreased costs, further advances are expected in the near future.
Affiliation(s)
- Giuseppe Matullo
- Dipartimento di Scienze Mediche, Università di Torino, Torino, Italy.
|
41
|
Goldstein DB, Allen A, Keebler J, Margulies EH, Petrou S, Petrovski S, Sunyaev S. Sequencing studies in human genetics: design and interpretation. Nat Rev Genet 2013; 14:460-70. [PMID: 23752795 DOI: 10.1038/nrg3455] [Citation(s) in RCA: 185] [Impact Index Per Article: 16.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/11/2022]
Abstract
Next-generation sequencing is becoming the primary discovery tool in human genetics. There have been many clear successes in identifying genes that are responsible for Mendelian diseases, and sequencing approaches are now poised to identify the mutations that cause undiagnosed childhood genetic diseases and those that predispose individuals to more common complex diseases. There are, however, growing concerns that the complexity and magnitude of complete sequence data could lead to an explosion of weakly justified claims of association between genetic variants and disease. Here, we provide an overview of the basic workflow in next-generation sequencing studies and emphasize, where possible, measures and considerations that facilitate accurate inferences from human sequencing studies.
Affiliation(s)
- David B Goldstein
- Center for Human Genome Variation, Duke University School of Medicine, 308 Research Drive, Box 91009, LSRC B Wing, Room 330, Durham, North Carolina 27708, USA.
|
42
|
Long N, Dickson SP, Maia JM, Kim HS, Zhu Q, Allen AS. Leveraging prior information to detect causal variants via multi-variant regression. PLoS Comput Biol 2013; 9:e1003093. [PMID: 23762022 PMCID: PMC3675126 DOI: 10.1371/journal.pcbi.1003093] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/23/2012] [Accepted: 04/29/2013] [Indexed: 01/03/2023] Open
Abstract
Although many methods are available to test sequence variants for association with complex diseases and traits, methods that specifically seek to identify causal variants are less developed. Here we develop and evaluate a Bayesian hierarchical regression method that incorporates prior information on the likelihood of variant causality through weighting of variant effects. By simulation studies using both simulated and real sequence variants, we compared a standard single variant test for analyzing variant-disease association with the proposed method using different weighting schemes. We found that by leveraging linkage disequilibrium of variants with known GWAS signals and sequence conservation (phastCons), the proposed method provides a powerful approach for detecting causal variants while controlling false positives. The decline in DNA sequencing cost permits the interrogation of potentially all variants across the entire allele frequency spectrum for their associations with complex human diseases and traits. However, the identification of causal variants remains challenging. Existing single variant tests do not distinguish between causal association and association induced by linkage disequilibrium and tend to be underpowered for rare or low-frequency variants, whereas variant grouping methods do not identify individual causal variants. We propose a novel Bayesian hierarchical regression approach that estimates effects of multiple variants on a disease trait simultaneously and incorporates prior information on the likelihood of causality. By simulation, we show that by combining linkage disequilibrium with known genome wide association signals and functional conservation, the proposed method, the first of its kind, is powerful to correctly detect causal variants.
Affiliation(s)
- Nanye Long
- Center for Human Genome Variation, Duke University School of Medicine, Durham, North Carolina, United States of America.
|
43
|
Huang Y, Lu Y, Mues G, Wang S, Bonds J, D'Souza R. Functional evaluation of a novel tooth agenesis-associated bone morphogenetic protein 4 prodomain mutation. Eur J Oral Sci 2013; 121:313-8. [PMID: 23841782 DOI: 10.1111/eos.12055] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 04/11/2013] [Indexed: 11/28/2022]
Abstract
The detection of gene mutations in patients with congenitally missing teeth is not very complicated; however, proving causality is often quite difficult. Here, we report the detection of a substitution mutation, A42P, within the prodomain of bone morphogenetic protein 4 (BMP4) in a small family with tooth agenesis and describe a functional alteration that may be responsible for the tooth phenotype. As BMP4 is essential for the development of teeth and also for many other organs, it would be of considerable interest to find a BMP4 mutation that is associated only with tooth agenesis. Our in vitro investigations revealed that the A42P mutation neither affected processing and secretion of BMP4 nor altered functional properties, such as the induction of alkaline phosphatase or signaling through Smad1/5/8 phosphorylation by the mature BMP4 ligand. However, immunofluorescence staining revealed that the prodomains of BMP4 which harbor the A42P substitution form fibrillar structures around transfected cells in culture and that this fibrillar network is significantly decreased when mutant prodomains are expressed. Our finding suggests that in vivo, BMP4 prodomain behavior might also be altered by the mutation and could influence storage or transport of mature BMP4 in the extracellular matrix of the developing tooth.
Affiliation(s)
- Yanyu Huang
- The State Key Laboratory Breeding Base of Basic Science of Stomatology (Hubei-MOST), Key Laboratory of Oral Biomedicine, Ministry of Education, School & Hospital of Stomatology, Wuhan University, Wuhan, China
|
44
|
Qiu R, Chen C, Jiang H, Shen L, Wu M, Liu C. Large genomic region free of GWAS-based common variants contains fertility-related genes. PLoS One 2013; 8:e61917. [PMID: 23613972 PMCID: PMC3629113 DOI: 10.1371/journal.pone.0061917] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/21/2012] [Accepted: 03/15/2013] [Indexed: 02/01/2023] Open
Abstract
DNA variants, such as single nucleotide polymorphisms (SNPs) and copy number variants (CNVs), are unevenly distributed across the human genome. Currently, dbSNP contains more than 6 million human SNPs, and whole-genome genotyping arrays can assay more than 4 million of them simultaneously. In our study, we first questioned whether published genome-wide association study (GWAS) assays cover all regions of the genome well. Using dbSNP build 135 data, we identified 50 genomic regions longer than 100 Kb that do not contain any common SNPs, i.e., those with minor allele frequency (MAF) ≥ 1%. Second, because conserved regions are generally of functional importance, we tested genes in those large genomic regions without common SNPs. We found 97 genes, which were enriched for reproductive function. In addition, we further filtered out regions with CNVs listed in the Database of Genomic Variants (DGV), segmental duplications from the Human Genome Project, and common variants identified by personal genome sequencing (UCSC). No region survived this filtering. Our analysis suggests that, while there may not be many large genomic regions free of common variants, there are still some “holes” in the current human genomic map for common SNPs. Because GWAS have focused only on common SNPs, interpretation of GWAS results should take this limitation into account. In particular, two recent GWAS of fertility may be incomplete due to the map deficit. Additional SNP discovery efforts should pay close attention to these regions.
Affiliation(s)
- Rong Qiu
- School of Information Science and Engineering, Central South University, Changsha, China
- Hunan Engineering Laboratory for Advanced Control and Intelligent Automation, Changsha, China
- Chao Chen
- Department of Psychiatry, University of Illinois at Chicago, Chicago, United States of America
- Institute of Human Genetics, University of Illinois at Chicago, Chicago, United States of America
- Hong Jiang
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
- Libing Shen
- School of Life Science, Fudan University, Shanghai, China
- Min Wu
- School of Information Science and Engineering, Central South University, Changsha, China
- Hunan Engineering Laboratory for Advanced Control and Intelligent Automation, Changsha, China
- Chunyu Liu
- Department of Psychiatry, University of Illinois at Chicago, Chicago, United States of America
- Institute of Human Genetics, University of Illinois at Chicago, Chicago, United States of America
- State Key Laboratory of Medical Genetics of China, Central South University, Changsha, China
|
45
|
Hu Q, Wang D, Yan L, Zhao H, Liu S. VPA: an R tool for analyzing sequencing variants with user-specified frequency pattern. BMC Res Notes 2012; 5:31. [PMID: 22243673 PMCID: PMC3293055 DOI: 10.1186/1756-0500-5-31] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/16/2011] [Accepted: 01/14/2012] [Indexed: 11/26/2022] Open
Abstract
Background: The massive number of genetic variants generated by next-generation sequencing systems demands the development of effective computational tools for variant prioritization. Findings: VPA (Variant Pattern Analyzer) is an R tool for prioritizing variants with a specified frequency pattern from multiple study subjects in next-generation sequencing studies. The tool starts from individual files of variant and sequence calls and extracts variants with a user-specified frequency pattern across the study subjects of interest. Several position-level quality criteria can be incorporated into the variant extraction. It can be used in studies with a matched-pair design as well as studies with multiple groups of subjects. Conclusions: VPA can be used as an automatic pipeline to prioritize variants for further functional exploration and hypothesis generation. The package is implemented in the R language and is freely available from http://vpa.r-forge.r-project.org.
|
46
|
Stubbs A, McClellan EA, Horsman S, Hiltemann SD, Palli I, Nouwens S, Koning AH, Hoogland F, Reumers J, Heijsman D, Swagemakers S, Kremer A, Meijerink J, Lambrechts D, van der Spek PJ. Huvariome: a web server resource of whole genome next-generation sequencing allelic frequencies to aid in pathological candidate gene selection. J Clin Bioinforma 2012; 2:19. [PMID: 23164068 PMCID: PMC3549785 DOI: 10.1186/2043-9113-2-19] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Received: 05/29/2012] [Accepted: 10/16/2012] [Indexed: 01/01/2023]
Abstract
BACKGROUND: Next-generation sequencing provides clinical research scientists with direct readout of innumerable variants, including personal, pathological and common benign variants. The aim of resequencing studies is to determine the candidate pathogenic variants from individual genomes, or from family-based or tumor/normal genome comparisons. Whilst the use of appropriate controls within the experimental design will minimize the number of false positive variations selected, this number can be reduced further with the use of high-quality whole genome reference data to minimize false positive variants prior to candidate gene selection. In addition, the use of platform-related sequencing error models can help in the recovery of ambiguous genotypes from lower-coverage data. DESCRIPTION: We have developed a whole genome database of human genetic variations, Huvariome, determined by whole genome deep sequencing data with high coverage and low error rates. The database was designed to be sequencing technology independent but is currently populated with 165 individual whole genomes consisting of small pedigrees and matched tumor/normal samples sequenced with the Complete Genomics sequencing platform. Common variants have been determined for a Benelux population cohort and represented as genotypes alongside the results of two sets of control data (73 of the 165 genomes), Huvariome Core, which comprises 31 healthy individuals from the Benelux region, and Diversity Panel, consisting of 46 healthy individuals representing 10 different populations and 21 samples in three pedigrees. Users can query the database by gene or position via a web interface, and the results are displayed as the frequency of the variations as detected in the datasets. We demonstrate that Huvariome can provide accurate reference allele frequencies to disambiguate sequencing inconsistencies produced in resequencing experiments.
Huvariome has been used to support the selection of candidate cardiomyopathy-related genes which have a homozygous genotype in the reference cohorts. This database allows the users to see which selected variants are common variants (>5% minor allele frequency) in the Huvariome Core samples, thus aiding in the selection of potentially pathogenic variants by filtering out common variants that are not listed in one of the other public genomic variation databases. The no-call rate and the accuracy of allele calling in Huvariome provide the user with the possibility of identifying platform-dependent errors associated with specific regions of the human genome. CONCLUSION: Huvariome is a simple-to-use resource for validation of resequencing results obtained by NGS experiments. The high sequence coverage and low error rates provide scientists with the ability to remove false positive results from pedigree studies. Results are returned via a web interface that displays location-based genetic variation frequency, impact on protein function, association with known genetic variations and a quality score of the variation base derived from Huvariome Core and the Diversity Panel data. These results may be used to identify and prioritize rare variants that, for example, might be disease relevant. In testing the accuracy of the Huvariome database, alleles of a selection of ambiguously called coding single nucleotide variants were successfully predicted in all cases. Data protection of individuals is ensured by restricted access to patient-derived genomes from the host institution, which is relevant for future molecular diagnostics.
Affiliation(s)
- Andrew Stubbs: Department of Bioinformatics, Erasmus University Medical Center, Molewaterplein 50, Rotterdam, The Netherlands.
|
47
|
Zhu Q, Ge D, Heinzen E, Dickson S, Urban T, Zhu M, Maia J, He M, Zhao Q, Shianna K, Goldstein D. Prioritizing genetic variants for causality on the basis of preferential linkage disequilibrium. Am J Hum Genet 2012; 91:422-34. [PMID: 22939045 DOI: 10.1016/j.ajhg.2012.07.010] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Received: 03/05/2012] [Revised: 05/15/2012] [Accepted: 07/13/2012] [Indexed: 02/06/2023]
Abstract
To date, the widely used genome-wide association studies (GWASs) of the human genome have reported thousands of variants that are significantly associated with various human traits. However, in the vast majority of these cases, the causal variants responsible for the observed associations remain unknown. In order to facilitate the identification of causal variants, we designed a simple computational method called the "preferential linkage disequilibrium (LD)" approach, which follows the variants discovered by GWASs to pinpoint the causal variants, even if they are rare compared with the discovery variants. The approach is based on the hypothesis that the GWAS-discovered variant is better at tagging the causal variants than are most other variants evaluated in the original GWAS. Applying the preferential LD approach to the GWAS signals of five human traits for which the causal variants are already known, we successfully placed the known causal variants among the top ten candidates in the majority of these cases. Application of this method to additional GWASs, including those of hepatitis C virus treatment response, plasma levels of clotting factors, and late-onset Alzheimer disease, has led to the identification of a number of promising candidate causal variants. This method represents a useful tool for delineating causal variants by bringing together GWAS signals and the rapidly accumulating variant data from next-generation sequencing.
|
48
|
Role of rare variants in undetermined multiple adenomatous polyposis and early-onset colorectal cancer. J Hum Genet 2012; 57:709-716. [PMID: 22875147 DOI: 10.1038/jhg.2012.99] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 12/22/2022]
Abstract
Some 15-20% of multiple adenomatous polyposis (MAP) cases have no genetic explanation, and 20-30% of colorectal cancer (CRC) cases are thought to be due to inherited multifactorial causes. Accumulation of deleterious effects of low-frequency dominant and independently acting variants may be a partial explanation for such patients. The aim of this study was to type a selection of rare and low-frequency variants (<5%) to elucidate their role in CRC susceptibility. A total of 1181 subjects were included (866 controls; 315 cases). Cases comprised UK (n=184) and French (n=131) patients with MAP (n=187) or early-onset CRC (n=128). Seventy variants in 17 genes were examined in cases and controls. The effect of each variant on protein function was investigated in silico. Out of the 70 variants typed, 36 (51%) were tested for association. Twenty-one variants were rare (minor allele frequency (MAF) <1%). Four rare variants were found to have a significantly higher MAF in cases (EXO1-12, MLH1-1, CTNNB1-1 and BRCA2-37, P<0.05) than in controls. Pooling all rare variants with a MAF <0.5% showed an excess risk in cases (odds ratio=3.2; 95% confidence interval=1.1-9.5; P=0.04). Rare variants are important risk factors in CRC and, as such, should be systematically assayed alongside common variation in the search for the genetic basis of complex diseases.
|
49
|
Xu C, Ladouceur M, Dastani Z, Richards JB, Ciampi A, Greenwood CMT. Multiple regression methods show great potential for rare variant association tests. PLoS One 2012; 7:e41694. [PMID: 22916111 PMCID: PMC3420665 DOI: 10.1371/journal.pone.0041694] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Received: 03/16/2012] [Accepted: 06/25/2012] [Indexed: 01/08/2023]
Abstract
The investigation of associations between rare genetic variants and diseases or phenotypes has two goals: first, identifying which genes or genomic regions are associated, and second, discriminating associated variants from background noise within each region. Over the last few years, many new methods have been developed which associate genomic regions with phenotypes. However, classical methods for high-dimensional data have received little attention. Here we investigate whether several classical statistical methods for high-dimensional data, namely ridge regression (RR), principal components regression (PCR), partial least squares regression (PLS), a sparse version of PLS (SPLS), and the LASSO, are able to detect associations with rare genetic variants. These approaches have been extensively used in statistics to identify the true associations in data sets containing many predictor variables. Using genetic variants identified in three genes that were Sanger sequenced in 1,998 individuals, we simulated continuous phenotypes under several different models, and we show that these feature selection and feature extraction methods can substantially outperform several popular methods for rare variant analysis. Furthermore, these approaches can identify which variants are contributing most to the model fit, and therefore both goals of rare variant analysis can be achieved simultaneously with the use of regression regularization methods. These methods are briefly illustrated with an analysis of adiponectin levels and variants in the ADIPOQ gene.
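Of the methods named in this abstract, ridge regression is the simplest to write out in full. The Python sketch below is a generic textbook version and not the authors' code: it assumes the phenotype and genotype columns are already centered (so no intercept is fitted), and the function name and toy data are hypothetical. It solves the penalized normal equations (XᵀX + λI)β = Xᵀy by Gauss-Jordan elimination.

```python
def ridge_fit(X, y, lam):
    """Ridge estimate: solve (X^T X + lam * I) beta = X^T y.

    X   -- list of rows, one per individual (e.g. centered genotype codes)
    y   -- list of centered phenotype values
    lam -- non-negative penalty; lam > 0 keeps the system solvable even
           when variants are perfectly correlated (an assumption of this sketch)
    """
    n, p = len(X), len(X[0])
    # Augmented matrix [X^T X + lam * I | X^T y], built row by row.
    A = [
        [sum(X[i][j] * X[i][k] for i in range(n)) + (lam if j == k else 0.0)
         for k in range(p)]
        + [sum(X[i][j] * y[i] for i in range(n))]
        for j in range(p)
    ]
    for col in range(p):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        # Eliminate this column from every other row.
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[j][p] / A[j][j] for j in range(p)]


# Toy example with two uncorrelated predictors: a larger penalty
# shrinks both coefficients toward zero.
X_toy = [[1.0, 0.0], [0.0, 1.0]]
y_toy = [2.0, 4.0]
for lam in (0.0, 1.0, 3.0):
    print(lam, ridge_fit(X_toy, y_toy, lam))
```

Ridge regression shrinks every coefficient without setting any exactly to zero. The LASSO, also listed in the abstract, uses an absolute-value penalty instead and can drop variants from the model entirely.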
Affiliation(s)
- ChangJiang Xu: Lady Davis Institute for Medical Research, Jewish General Hospital, Montreal, Quebec, Canada
|
50
|
Abstract
Stature is a classical and highly heritable complex trait, with 80%–90% of variation explained by genetic factors. In recent years, genome-wide association studies (GWAS) have successfully identified many common additive variants influencing human height; however, little attention has been given to the potential role of recessive genetic effects. Here, we investigated genome-wide recessive effects by an analysis of inbreeding depression on adult height in over 35,000 people from 21 different population samples. We found a highly significant inverse association between height and genome-wide homozygosity, equivalent to a height reduction of up to 3 cm in the offspring of first cousins compared with the offspring of unrelated individuals, an effect which remained after controlling for the effects of socio-economic status, an important confounder (χ² = 83.89, df = 1; p = 5.2 × 10⁻²⁰). There was, however, a high degree of heterogeneity among populations: whereas the direction of the effect was consistent across most population samples, the effect size differed significantly among populations. It is likely that this reflects true biological heterogeneity: whether or not an effect can be observed will depend on both the variance in homozygosity in the population and the chance inheritance of individual recessive genotypes. These results predict that multiple, rare, recessive variants influence human height. Although this exploratory work focuses on height alone, the methodology developed is generally applicable to heritable quantitative traits (QT), paving the way for an investigation into inbreeding effects, and therefore genetic architecture, on a range of QT of biomedical importance. Studies investigating the extent to which genetics influences human characteristics such as height have concentrated mainly on common variants of genes, where having one or two copies of a given variant influences the trait or risk of disease.
This study explores whether a different type of genetic variant might also be important. We investigate the role of recessive genetic variants, where two identical copies of a variant are required to have an effect. By measuring genome-wide homozygosity—the phenomenon of inheriting two identical copies at a given point of the genome—in 35,000 individuals from 21 European populations, and by comparing this to individual height, we found that the more homozygous the genome, the shorter the individual. The offspring of first cousins (who have increased homozygosity) were predicted to be up to 3 cm shorter on average than the offspring of unrelated parents. Height is influenced by the combined effect of many recessive variants dispersed across the genome. This may also be true for other human characteristics and diseases, opening up a new way to understand how genetic variation influences our health.
|