Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

Total Articles: 93 (from Reference Citation Analysis)
Article PDFs: 25
Cited by ≥ 1: 59
Searched Name: Sebastian Zöllner

Results Analysis

Number	Citation Analysis
26	Jia X, Goes FS, Locke AE, Palmer D, Wang W, Cohen-Woods S, Genovese G, Jackson AU, Jiang C, Kvale M, Mullins N, Nguyen H, Pirooznia M, Rivera M, Ruderfer DM, Shen L, Thai K, Zawistowski M, Zhuang Y, Abecasis G, Akil H, Bergen S, Burmeister M, Chapman S, DelaBastide M, Juréus A, Kang HM, Kwok PY, Li JZ, Levy SE, Monson ET, Moran J, Sobell J, Watson S, Willour V, Zöllner S, Adolfsson R, Blackwood D, Boehnke M, Breen G, Corvin A, Craddock N, DiFlorio A, Hultman CM, Landen M, Lewis C, McCarroll SA, Richard McCombie W, McGuffin P, McIntosh A, McQuillin A, Morris D, Myers RM, O'Donovan M, Ophoff R, Boks M, Kahn R, Ouwehand W, Owen M, Pato C, Pato M, Posthuma D, Potash JB, Reif A, Sklar P, Smoller J, Sullivan PF, Vincent J, Walters J, Neale B, Purcell S, Risch N, Schaefer C, Stahl EA, Zandi PP, Scott LJ. Investigating rare pathogenic/likely pathogenic exonic variation in bipolar disorder. Mol Psychiatry 2021;26:5239-5250. [PMID: 33483695 PMCID: PMC8295400 DOI: 10.1038/s41380-020-01006-9] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 11/02/2019] [Revised: 12/14/2020] [Accepted: 12/16/2020] [Indexed: 01/30/2023]
Abstract: Bipolar disorder (BD) is a serious mental illness with substantial common variant heritability. However, the role of rare coding variation in BD is not well established. We examined the protein-coding (exonic) sequences of 3,987 unrelated individuals with BD and 5,322 controls of predominantly European ancestry across four cohorts from the Bipolar Sequencing Consortium (BSC). We assessed the burden of rare, protein-altering, single nucleotide variants classified as pathogenic or likely pathogenic (P-LP) both exome-wide and within several groups of genes with phenotypic or biologic plausibility in BD. While we observed an increased burden of rare coding P-LP variants within 165 genes identified as BD GWAS regions in 3,987 BD cases (meta-analysis OR = 1.9, 95% CI = 1.3-2.8, one-sided p = 6.0 × 10⁻⁴), this enrichment did not replicate in an additional 9,929 BD cases and 14,018 controls (OR = 0.9, one-sided p = 0.70). Although BD shares common variant heritability with schizophrenia, in the BSC sample we did not observe a significant enrichment of P-LP variants in SCZ GWAS genes, in two classes of neuronal synaptic genes (RBFOX2 and FMRP) associated with SCZ or in loss-of-function intolerant genes. In this study, the largest analysis of exonic variation in BD, individuals with BD do not carry a replicable enrichment of rare P-LP variants across the exome or in any of several groups of genes with biologic plausibility. Moreover, despite a strong shared susceptibility between BD and SCZ through common genetic variation, we do not observe an association between BD risk and rare P-LP coding variants in genes known to modulate risk for SCZ.
Key Words: genetics; bipolar disorder
MESH Headings: Bipolar Disorder/genetics; Exome/genetics; Genetic Predisposition to Disease/genetics; Genetic Variation/genetics; Genome-Wide Association Study; Humans; Polymorphism, Single Nucleotide/genetics; Schizophrenia/genetics
Grants: R01 MH085543 NIMH NIH HHS; R01 MH104964 NIMH NIH HHS; R01 MH077139 NIMH NIH HHS; R01 MH087979 NIMH NIH HHS; MR/L010305/1 Medical Research Council; R01 MH106527 NIMH NIH HHS; R01 MH094145 NIMH NIH HHS; R01 MH085548 NIMH NIH HHS; NC/C011202/1 National Centre for the Replacement, Refinement and Reduction of Animals in Research; RC2 AG036607 NIA NIH HHS; R01 MH095034 NIMH NIH HHS; U01 MH105653 NIMH NIH HHS; R01 MH110437 NIMH NIH HHS; R01 MH123451 NIMH NIH HHS; P30 CA045508 NCI NIH HHS; R01 MH106531 NIMH NIH HHS; U.S. Department of Health & Human Services | NIH | National Institute of Mental Health (NIMH); Dalio Foundation; U.S. Department of Health & Human Services | NIH | National Institute on Aging (U.S. National Institute on Aging); Wayne and Gladys Valley Foundation; Robert Wood Johnson Foundation (RWJF); The Dalio Foundation
27	Si Y, Vanderwerff B, Zöllner S. Why are rare variants hard to impute? Coalescent models reveal theoretical limits in existing algorithms. Genetics 2021;217:iyab011. [PMID: 33686438 PMCID: PMC8049559 DOI: 10.1093/genetics/iyab011] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/09/2020] [Accepted: 12/15/2020] [Indexed: 01/13/2023]
Abstract: Genotype imputation is an indispensable step in human genetic studies. Large reference panels with deeply sequenced genomes now allow interrogating variants with minor allele frequency < 1% without sequencing. Although it is critical to consider limits of this approach, imputation methods for rare variants have only done so empirically; the theoretical basis of their imputation accuracy has not been explored. To provide theoretical consideration of imputation accuracy under the current imputation framework, we develop a coalescent model of imputing rare variants, leveraging the joint genealogy of the sample to be imputed and reference individuals. We show that broadly used imputation algorithms include model misspecifications about this joint genealogy that limit the ability to correctly impute rare variants. We develop closed-form solutions for the probability distribution of this joint genealogy and quantify the inevitable error rate resulting from the model misspecification across a range of allele frequencies and reference sample sizes. We show that the probability of a falsely imputed minor allele decreases with reference sample size, but the proportion of falsely imputed minor alleles mostly depends on the allele count in the reference sample. We summarize the impact of this error on genotype imputation on association tests by calculating the r² between imputed and true genotype and show that even when modeling other sources of error, the impact of the model misspecification has a significant impact on the r² of rare variants. To evaluate these predictions in practice, we compare the imputation of the same dataset across imputation panels of different sizes. Although this empirical imputation accuracy is substantially lower than our theoretical prediction, modeling misspecification seems to further decrease imputation accuracy for variants with low allele counts in the reference. These results provide a framework for developing new imputation algorithms and for interpreting rare variant association analyses.
Key Words: HMM; coalescent model; genotype imputation; population; rare variants
MESH Headings: Algorithms; Gene Frequency; Genetics, Population/methods; Genome, Human; Humans; Models, Genetic; Polymorphism, Genetic
Grants: R01 HG005855 NHGRI NIH HHS; UM1 HG008901 NHGRI NIH HHS; National Institutes of Health; NIGMS Human Genetic Cell Repository; Coriell Institute for Medical Research; NHGRI
28	Dutta D, VandeHaar P, Fritsche LG, Zöllner S, Boehnke M, Scott LJ, Lee S. A powerful subset-based method identifies gene set associations and improves interpretation in UK Biobank. Am J Hum Genet 2021;108:669-681. [PMID: 33730541 DOI: 10.1016/j.ajhg.2021.02.016] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/23/2020] [Accepted: 02/19/2021] [Indexed: 02/06/2023]
Abstract: Tests of association between a phenotype and a set of genes in a biological pathway can provide insights into the genetic architecture of complex phenotypes beyond those obtained from single-variant or single-gene association analysis. However, most existing gene set tests have limited power to detect gene set-phenotype association when a small fraction of the genes are associated with the phenotype and cannot identify the potentially "active" genes that might drive a gene set-based association. To address these issues, we have developed Gene set analysis Association Using Sparse Signals (GAUSS), a method for gene set association analysis that requires only GWAS summary statistics. For each significantly associated gene set, GAUSS identifies the subset of genes that have the maximal evidence of association and can best account for the gene set association. Using pre-computed correlation structure among test statistics from a reference panel, our p value calculation is substantially faster than other permutation- or simulation-based approaches. In simulations with varying proportions of causal genes, we find that GAUSS effectively controls type 1 error rate and has greater power than several existing methods, particularly when a small proportion of genes account for the gene set signal. Using GAUSS, we analyzed UK Biobank GWAS summary statistics for 10,679 gene sets and 1,403 binary phenotypes. We found that GAUSS is scalable and identified 13,466 phenotype and gene set association pairs. Within these gene sets, we identify an average of 17.2 (max = 405) genes that underlie these gene set associations.
Key Words: UK Biobank; core subset; pathway association; phenome-wide associations; summary statistics
MESH Headings: ATP-Binding Cassette Transporters/genetics; Biological Specimen Banks; Computer Simulation; Data Interpretation, Statistical; Databases, Genetic; Datasets as Topic; Gene Expression/genetics; Genome-Wide Association Study/methods; Humans; Phenotype; Research Design; Time Factors; United Kingdom; Web Browser
Grants: P30 DK020572 NIDDK NIH HHS; R01 HG008773 NHGRI NIH HHS; R01 HG009976 NHGRI NIH HHS; R01 LM012535 NLM NIH HHS
29	Taliun D, Harris DN, Kessler MD, Carlson J, Szpiech ZA, Torres R, Taliun SAG, Corvelo A, Gogarten SM, Kang HM, Pitsillides AN, LeFaive J, Lee SB, Tian X, Browning BL, Das S, Emde AK, Clarke WE, Loesch DP, Shetty AC, Blackwell TW, Smith AV, Wong Q, Liu X, Conomos MP, Bobo DM, Aguet F, Albert C, Alonso A, Ardlie KG, Arking DE, Aslibekyan S, Auer PL, Barnard J, Barr RG, Barwick L, Becker LC, Beer RL, Benjamin EJ, Bielak LF, Blangero J, Boehnke M, Bowden DW, Brody JA, Burchard EG, Cade BE, Casella JF, Chalazan B, Chasman DI, Chen YDI, Cho MH, Choi SH, Chung MK, Clish CB, Correa A, Curran JE, Custer B, Darbar D, Daya M, de Andrade M, DeMeo DL, Dutcher SK, Ellinor PT, Emery LS, Eng C, Fatkin D, Fingerlin T, Forer L, Fornage M, Franceschini N, Fuchsberger C, Fullerton SM, Germer S, Gladwin MT, Gottlieb DJ, Guo X, Hall ME, He J, Heard-Costa NL, Heckbert SR, Irvin MR, Johnsen JM, Johnson AD, Kaplan R, Kardia SLR, Kelly T, Kelly S, Kenny EE, Kiel DP, Klemmer R, Konkle BA, Kooperberg C, Köttgen A, Lange LA, Lasky-Su J, Levy D, Lin X, Lin KH, Liu C, Loos RJF, Garman L, Gerszten R, Lubitz SA, Lunetta KL, Mak ACY, Manichaikul A, Manning AK, Mathias RA, McManus DD, McGarvey ST, Meigs JB, Meyers DA, Mikulla JL, Minear MA, Mitchell BD, Mohanty S, Montasser ME, Montgomery C, Morrison AC, Murabito JM, Natale A, Natarajan P, Nelson SC, North KE, O'Connell JR, Palmer ND, Pankratz N, Peloso GM, Peyser PA, Pleiness J, Post WS, Psaty BM, Rao DC, Redline S, Reiner AP, Roden D, Rotter JI, Ruczinski I, Sarnowski C, Schoenherr S, Schwartz DA, Seo JS, Seshadri S, Sheehan VA, Sheu WH, Shoemaker MB, Smith NL, Smith JA, Sotoodehnia N, Stilp AM, Tang W, Taylor KD, Telen M, Thornton TA, Tracy RP, Van Den Berg DJ, Vasan RS, Viaud-Martinez KA, Vrieze S, Weeks DE, Weir BS, Weiss ST, Weng LC, Willer CJ, Zhang Y, Zhao X, Arnett DK, Ashley-Koch AE, Barnes KC, Boerwinkle E, Gabriel S, Gibbs R, Rice KM, Rich SS, Silverman EK, Qasba P, Gan W, Papanicolaou GJ, Nickerson DA, Browning SR, Zody MC, Zöllner 
S, Wilson JG, Cupples LA, Laurie CC, Jaquish CE, Hernandez RD, O'Connor TD, Abecasis GR. Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program. Nature 2021;590:290-299. [PMID: 33568819 PMCID: PMC7875770 DOI: 10.1038/s41586-021-03205-y] [Citation(s) in RCA: 860] [Impact Index Per Article: 286.7] [Received: 03/06/2019] [Accepted: 01/07/2021] [Indexed: 02/08/2023]
Abstract: The Trans-Omics for Precision Medicine (TOPMed) programme seeks to elucidate the genetic architecture and biology of heart, lung, blood and sleep disorders, with the ultimate goal of improving diagnosis, treatment and prevention of these diseases. The initial phases of the programme focused on whole-genome sequencing of individuals with rich phenotypic data and diverse backgrounds. Here we describe the TOPMed goals and design as well as the available resources and early insights obtained from the sequence data. The resources include a variant browser, a genotype imputation server, and genomic and phenotypic data that are available through dbGaP (Database of Genotypes and Phenotypes). In the first 53,831 TOPMed samples, we detected more than 400 million single-nucleotide and insertion or deletion variants after alignment with the reference genome. Additional previously undescribed variants were detected through assembly of unmapped reads and customized analysis in highly variable loci. Among the more than 400 million detected variants, 97% have frequencies of less than 1% and 46% are singletons that are present in only one individual (53% among unrelated individuals). These rare variants provide insights into mutational processes and recent human evolutionary history. The extensive catalogue of genetic variation in TOPMed studies provides unique opportunities for exploring the contributions of rare and noncoding sequence variants to phenotypic variation. Furthermore, combining TOPMed haplotypes with modern imputation methods improves the power and reach of genome-wide association studies to include variants down to a frequency of approximately 0.01%.
Key Words: rare variants; next-generation sequencing; genetics research
MESH Headings: Cytochrome P-450 CYP2D6/genetics; Genetic Variation/genetics; Genome, Human/genetics; Genomics; Haplotypes/genetics; Heterozygote; Humans; INDEL Mutation; Loss of Function Mutation; Mutagenesis; National Heart, Lung, and Blood Institute (U.S.); Phenotype; Polymorphism, Single Nucleotide; Population Density; Precision Medicine/standards; Quality Control; Sample Size; United States; Whole Genome Sequencing/standards
Grants: R35 HG010692 NHGRI NIH HHS; P30 ES010126 NIEHS NIH HHS; P50 HL118006 NHLBI NIH HHS; T32 HG000040 NHGRI NIH HHS; R01 HL090620 NHLBI NIH HHS; K08 HL141601 NHLBI NIH HHS; HHSN268201800001C NHLBI NIH HHS; K24 HL148521 NHLBI NIH HHS; R01 HL111314 NHLBI NIH HHS; R01 AI132476 NIAID NIH HHS; R01 HL117626 NHLBI NIH HHS; R01 HL131565 NHLBI NIH HHS; P01 HL132825 NHLBI NIH HHS; R01 DA044283 NIDA NIH HHS; R01 HL123915 NHLBI NIH HHS; R01 HL120393 NHLBI NIH HHS; R03 HL141439 NHLBI NIH HHS; K01 AG059898 NIA NIH HHS; R01 HL155742 NHLBI NIH HHS; U01 HL120393 NHLBI NIH HHS; R01 DK117445 NIDDK NIH HHS; UH3 HL151865 NHLBI NIH HHS; R01 HL163972 NHLBI NIH HHS; R01 AR072199 NIAMS NIH HHS; R01 HL142711 NHLBI NIH HHS; I01 BX005295 BLRD VA; R01 HL113326 NHLBI NIH HHS; U01 HL137162 NHLBI NIH HHS; R01 HG005701 NHGRI NIH HHS; R01 GM075091 NIGMS NIH HHS; T32 HL007085 NHLBI NIH HHS; R01 DA037904 NIDA NIH HHS; R21 HL123677 NHLBI NIH HHS; P30 DK020572 NIDDK NIH HHS; R03 HL154284 NHLBI NIH HHS; R01 MD012765 NIMHD NIH HHS; U01 HG009088 NHGRI NIH HHS; UM1 DK078616 NIDDK NIH HHS; UG3 HL151865 NHLBI NIH HHS; U01 CA182913 NCI NIH HHS; T32 CA154274 NCI NIH HHS; R01 HL149836 NHLBI NIH HHS; K01 HL135405 NHLBI NIH HHS
30	Cochran AL, Nieser KJ, Forger DB, Zöllner S, McInnis MG. Gene-set Enrichment with Mathematical Biology (GEMB). Gigascience 2020;9:giaa091. [PMID: 33034635 PMCID: PMC7546080 DOI: 10.1093/gigascience/giaa091] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/14/2020] [Revised: 06/01/2020] [Accepted: 08/14/2020] [Indexed: 11/14/2022]
Abstract: BACKGROUND: Gene-set analyses measure the association between a disease of interest and a "set" of genes related to a biological pathway. These analyses often incorporate gene network properties to account for differential contributions of each gene. We extend this concept further, defining gene contributions based on biophysical properties, by leveraging mathematical models of biology to predict the effects of genetic perturbations on a particular downstream function. RESULTS: We present a method that combines gene weights from model predictions and gene ranks from genome-wide association studies into a weighted gene-set test. We demonstrate in simulation how such a method can improve statistical power. To this effect, we identify a gene set, weighted by model-predicted contributions to intracellular calcium ion concentration, that is significantly related to bipolar disorder in a small dataset (P = 0.04; n = 544). We reproduce this finding using publicly available summary data from the Psychiatric Genomics Consortium (P = 1.7 × 10⁻⁴; n = 41,653). By contrast, an approach using a general calcium signaling pathway did not detect a significant association with bipolar disorder (P = 0.08). The weighted gene-set approach based on intracellular calcium ion concentration did not detect a significant relationship with schizophrenia (P = 0.09; n = 65,967) or major depression disorder (P = 0.30; n = 500,199). CONCLUSIONS: Together, these findings show how incorporating math biology into gene-set analyses might help to identify biological functions that underlie certain polygenic disorders.
Key Words: bipolar disorder; calcium signaling; gene ontology; gene-set analysis; genetic enrichment; mathematical biology
MESH Headings: Biology; Bipolar Disorder/genetics; Genetic Predisposition to Disease; Genome-Wide Association Study; Humans; Multifactorial Inheritance
Grants: K01 MH112876 NIMH NIH HHS; National Science Foundation; National Institute of Mental Health
31	Kessler MD, Loesch DP, Perry JA, Heard-Costa NL, Taliun D, Cade BE, Wang H, Daya M, Ziniti J, Datta S, Celedón JC, Soto-Quiros ME, Avila L, Weiss ST, Barnes K, Redline SS, Vasan RS, Johnson AD, Mathias RA, Hernandez R, Wilson JG, Nickerson DA, Abecasis G, Browning SR, Zöllner S, O'Connell JR, Mitchell BD, O'Connor TD. De novo mutations across 1,465 diverse genomes reveal mutational insights and reductions in the Amish founder population. Proc Natl Acad Sci U S A 2020;117:2560-2569. [PMID: 31964835 PMCID: PMC7007577 DOI: 10.1073/pnas.1902766117] [Citation(s) in RCA: 49] [Impact Index Per Article: 12.3] [Indexed: 12/19/2022]
Abstract: De novo mutations (DNMs), or mutations that appear in an individual despite not being seen in their parents, are an important source of genetic variation whose impact is relevant to studies of human evolution, genetics, and disease. Utilizing high-coverage whole-genome sequencing data as part of the Trans-Omics for Precision Medicine (TOPMed) Program, we called 93,325 single-nucleotide DNMs across 1,465 trios from an array of diverse human populations, and used them to directly estimate and analyze DNM counts, rates, and spectra. We find a significant positive correlation between local recombination rate and local DNM rate, and that DNM rate explains a substantial portion (8.98 to 34.92%, depending on the model) of the genome-wide variation in population-level genetic variation from 41K unrelated TOPMed samples. Genome-wide heterozygosity does correlate with DNM rate, but only explains <1% of variation. While we are underpowered to see small differences, we do not find significant differences in DNM rate between individuals of European, African, and Latino ancestry, nor across ancestrally distinct segments within admixed individuals. However, we did find significantly fewer DNMs in Amish individuals, even when compared with other Europeans, and even after accounting for parental age and sequencing center. Specifically, we found significant reductions in the number of C→A and T→C mutations in the Amish, which seem to underpin their overall reduction in DNMs. Finally, we calculated near-zero estimates of narrow sense heritability (h²), which suggest that variation in DNM rate is significantly shaped by nonadditive genetic effects and the environment.
Key Words: Amish; de novo mutations; diversity; mutation rate; recombination
MESH Headings: Adult; Amish/genetics; Cohort Studies; DNA Mutational Analysis; Female; Genetics, Population; Genome, Human; Heterozygote; Humans; Male; Mutation; Pedigree; Whole Genome Sequencing; Young Adult
Grants: R35 HG010692 NHGRI NIH HHS; R01 HL120393 NHLBI NIH HHS; U24 AG021886 NIA NIH HHS; HHSN268201500001C NHLBI NIH HHS; R01 HL113338 NHLBI NIH HHS; R01 AI079139 NIAID NIH HHS; U01 HL137181 NHLBI NIH HHS; R37 HL066289 NHLBI NIH HHS; K08 HL141601 NHLBI NIH HHS; HHSN268201500001I NHLBI NIH HHS; R01 HG005701 NHGRI NIH HHS; T32 CA154274 NCI NIH HHS; P30 DK020595 NIDDK NIH HHS; K01 HL135405 NHLBI NIH HHS; P30 DK040561 NIDDK NIH HHS; U01 HL137183 NHLBI NIH HHS; K01 AG059898 NIA NIH HHS; R01 HL121007 NHLBI NIH HHS; T32 HL007698 NHLBI NIH HHS; R01 HL092577 NHLBI NIH HHS; R01 DK113003 NIDDK NIH HHS; OT3 OD025459 NIH HHS; P50 HL118006 NHLBI NIH HHS; R01 HL066216 NHLBI NIH HHS; T32 HG000040 NHGRI NIH HHS; R01 HL138737 NHLBI NIH HHS; U24 AG056270 NIA NIH HHS; R01 HL141845 NHLBI NIH HHS; R01 AR072199 NIAMS NIH HHS; R35 HL135818 NHLBI NIH HHS; R01 HL098433 NHLBI NIH HHS; R01 HL104608 NHLBI NIH HHS; R01 HL148239 NHLBI NIH HHS; U01 HL072515 NHLBI NIH HHS; R01 HL117626 NHLBI NIH HHS; P20 GM121334 NIGMS NIH HHS; R01 HL135129 NHLBI NIH HHS; P01 HL132825 NHLBI NIH HHS; R01 HL118267 NHLBI NIH HHS; R01 AG018728 NIA NIH HHS
32	Narisu N, Rothwell R, Vrtačnik P, Rodríguez S, Didion J, Zöllner S, Erdos MR, Collins FS, Eriksson M. Analysis of somatic mutations identifies signs of selection during in vitro aging of primary dermal fibroblasts. Aging Cell 2019;18:e13010. [PMID: 31385397 PMCID: PMC6826141 DOI: 10.1111/acel.13010] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 09/17/2018] [Revised: 06/20/2019] [Accepted: 06/30/2019] [Indexed: 12/13/2022]
Abstract: Somatic mutations are critical for cancer development and may play a role in age-related functional decline. Here, we used deep sequencing to analyze the prevalence of somatic mutations during in vitro cell aging. Primary dermal fibroblasts from healthy subjects of young and advanced age, from Hutchinson-Gilford progeria syndrome and from xeroderma pigmentosum complementation groups A and C, were first restricted in number and then expanded in vitro. DNA was obtained from cells pre- and post-expansion and sequenced at high depth (1656× mean coverage), over a cumulative 290 kb target region, including the exons of 44 aging-related genes. Allele frequencies of 58 somatic mutations differed between the pre- and post-cell culture expansion passages. Mathematical modeling revealed that the frequency change of three of the 58 mutations was unlikely to be explained by genetic drift alone, indicative of positive selection. Two of these three mutations, CDKN2A c.53C>T (T18M) and ERCC8 c.772T>A, were identified in cells from a patient with XPA. The allele frequency of the CDKN2A mutation increased from 0% to 55.3% with increasing cell culture passage. The third mutation, BRCA2 c.6222C>T (H2074H), was identified in a sample from a healthy individual of advanced age. However, further validation of the three mutations suggests that other unmeasured variants probably provide the selective advantage in these cells. Our results reinforce the notions that somatic mutations occur during aging and that some are under positive selection, supporting the model of increased tissue heterogeneity with increased age.
Key Words: aging cell; cell mosaicism; genome instability; molecular biology of aging; positive selection; somatic mutation; tissue heterogeneity
MESH Headings: Adolescent; Aged, 80 and over; Cells, Cultured; Cellular Senescence/genetics; Child; Child, Preschool; DNA/genetics; Female; Fibroblasts/cytology; Fibroblasts/metabolism; Humans; Male; Mutation; Sequence Analysis, RNA; Skin/cytology; Skin/metabolism
Grants: ZIA HG000024 NHGRI NIH HHS
33	Kowalski MH, Qian H, Hou Z, Rosen JD, Tapia AL, Shan Y, Jain D, Argos M, Arnett DK, Avery C, Barnes KC, Becker LC, Bien SA, Bis JC, Blangero J, Boerwinkle E, Bowden DW, Buyske S, Cai J, Cho MH, Choi SH, Choquet H, Cupples LA, Cushman M, Daya M, de Vries PS, Ellinor PT, Faraday N, Fornage M, Gabriel S, Ganesh SK, Graff M, Gupta N, He J, Heckbert SR, Hidalgo B, Hodonsky CJ, Irvin MR, Johnson AD, Jorgenson E, Kaplan R, Kardia SLR, Kelly TN, Kooperberg C, Lasky-Su JA, Loos RJF, Lubitz SA, Mathias RA, McHugh CP, Montgomery C, Moon JY, Morrison AC, Palmer ND, Pankratz N, Papanicolaou GJ, Peralta JM, Peyser PA, Rich SS, Rotter JI, Silverman EK, Smith JA, Smith NL, Taylor KD, Thornton TA, Tiwari HK, Tracy RP, Wang T, Weiss ST, Weng LC, Wiggins KL, Wilson JG, Yanek LR, Zöllner S, North KE, Auer PL, Raffield LM, Reiner AP, Li Y. Use of >100,000 NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium whole genome sequences improves imputation quality and detection of rare variant associations in admixed African and Hispanic/Latino populations. PLoS Genet 2019;15:e1008500. [PMID: 31869403 PMCID: PMC6953885 DOI: 10.1371/journal.pgen.1008500] [Citation(s) in RCA: 152] [Impact Index Per Article: 30.4] [Received: 06/06/2019] [Revised: 01/10/2020] [Accepted: 10/30/2019] [Indexed: 01/10/2023]
Abstract: Most genome-wide association and fine-mapping studies to date have been conducted in individuals of European descent, and genetic studies of populations of Hispanic/Latino and African ancestry are limited. In addition, these populations have more complex linkage disequilibrium structure. In order to better define the genetic architecture of these understudied populations, we leveraged >100,000 phased sequences available from deep-coverage whole genome sequencing through the multi-ethnic NHLBI Trans-Omics for Precision Medicine (TOPMed) program to impute genotypes into admixed African and Hispanic/Latino samples with genome-wide genotyping array data. We demonstrated that using TOPMed sequencing data as the imputation reference panel improves genotype imputation quality in these populations, which subsequently enhanced gene-mapping power for complex traits. For rare variants with minor allele frequency (MAF) < 0.5%, we observed a 2.3- to 6.1-fold increase in the number of well-imputed variants, with 11-34% improvement in average imputation quality, compared to the state-of-the-art 1000 Genomes Project Phase 3 and Haplotype Reference Consortium reference panels. Impressively, even for extremely rare variants with minor allele count <10 (including singletons) in the imputation target samples, average information content rescued was >86%. Subsequent association analyses of TOPMed reference panel-imputed genotype data with hematological traits (hemoglobin (HGB), hematocrit (HCT), and white blood cell count (WBC)) in ~21,600 African-ancestry and ~21,700 Hispanic/Latino individuals identified associations with two rare variants in the HBB gene (rs33930165 with higher WBC [p = 8.8 × 10⁻¹⁵] in African populations, rs11549407 with lower HGB [p = 1.5 × 10⁻¹²] and HCT [p = 8.8 × 10⁻¹⁰] in Hispanics/Latinos). By comparison, neither variant would have been genome-wide significant if either 1000 Genomes Project Phase 3 or Haplotype Reference Consortium reference panels had been used for imputation. Our findings highlight the utility of the TOPMed imputation reference panel for identification of novel rare variant associations not previously detected in similarly sized genome-wide studies of under-represented African and Hispanic/Latino populations.
Collapse Key Words Collapse MESH Headings Adult Black or African American/genetics Aged Aged, 80 and over Computational Biology/methods Databases, Genetic Female Gene Frequency Genetic Predisposition to Disease Genetics, Population Genome-Wide Association Study Genotyping Techniques Hispanic or Latino/genetics Humans Linkage Disequilibrium Male Middle Aged Precision Medicine/methods United States Whole Genome Sequencing/methods beta-Globins/genetics Collapse Grants R01 HL118356 NHLBI NIH HHS UL1 RR033176 NCRR NIH HHS R01 HL112064 NHLBI NIH HHS HHSN268201100037C NHLBI NIH HHS U01 HG007417 NHGRI NIH HHS HHSN268201100001I NHLBI NIH HHS HHSN268201500003C NHLBI NIH HHS R01 HL113323 NHLBI NIH HHS R01 HL139731 NHLBI NIH HHS HHSN268201300026C NHLBI NIH HHS RC2 AG036607 NIA NIH HHS HHSN268201800012C NHLBI NIH HHS U01 HG007419 NHGRI NIH HHS R01 HL104135 NHLBI NIH HHS N01HC95160 NHLBI NIH HHS R01 HL071251 NHLBI NIH HHS R01 HL123915 NHLBI NIH HHS R01 HL120393 NHLBI NIH HHS R01 HL087698 NHLBI NIH HHS K24 HL105780 NHLBI NIH HHS U54 HG003067 NHGRI NIH HHS R01 HL121007 NHLBI NIH HHS R01 HL071259 NHLBI NIH HHS R01 HG006703 NHGRI NIH HHS N01HC95163 NHLBI NIH HHS HHSN268201500001C NHLBI NIH HHS UL1 TR001079 NCATS NIH HHS R01 HL141826 NHLBI NIH HHS HHSN268201100004I NHLBI NIH HHS U01 HG007416 NHGRI NIH HHS T32 ES007018 NIEHS NIH HHS R01 AR048797 NIAMS NIH HHS R01 HL092577 NHLBI NIH HHS R21 HL091397 NHLBI NIH HHS N01HC95169 NHLBI NIH HHS U01 HL089897 NHLBI NIH HHS HHSN268201100046C NHLBI NIH HHS R01 HL129132 NHLBI NIH HHS R01 HL071250 NHLBI NIH HHS R01 HL142302 NHLBI NIH HHS N01HC65236 NHLBI NIH HHS R01 EY027004 NEI NIH HHS U01 HG007376 NHGRI NIH HHS N01HC65235 NHLBI NIH HHS HHSN268201100003C WHI NIH HHS R01 NS058700 NINDS NIH HHS N01HC95164 NHLBI NIH HHS HHSN268201300025C NHLBI NIH HHS U54 HG003273 NHGRI NIH HHS F32 HL085989 NHLBI NIH HHS HHSN268201800014C NHLBI NIH HHS N01HC95162 NHLBI NIH HHS R01 HL128914 NHLBI NIH HHS N01HC65234 NHLBI NIH HHS R01 HL119443 NHLBI NIH HHS N01HC95168 
NHLBI NIH HHS HHSN268201200008C NHLBI NIH HHS R37 HL066289 NHLBI NIH HHS U01 HL089856 NHLBI NIH HHS T32 HL129982 NHLBI NIH HHS R01 HL067348 NHLBI NIH HHS R01 HL113326 NHLBI NIH HHS HHSN268201300027C NHLBI NIH HHS N01HC65233 NHLBI NIH HHS HHSN268201700002C NHLBI NIH HHS P30 DK063491 NIDDK NIH HHS R01 HL071051 NHLBI NIH HHS R01 DK101855 NIDDK NIH HHS HHSN268201700001I NHLBI NIH HHS HHSN268201800013I NIMHD NIH HHS HHSN268201300028C NHLBI NIH HHS N01HC65237 NHLBI NIH HHS 18SFRN34110082 American Heart Association-American Stroke Association HHSN271201100004C NIA NIH HHS HHSN268201700004I NHLBI NIH HHS N01HC95165 NHLBI NIH HHS HHSN268200900041C NHLBI NIH HHS HHSN268201500015C NHLBI NIH HHS N01HC95159 NHLBI NIH HHS R56 HG010297 NHGRI NIH HHS HHSN268201500001I NHLBI NIH HHS R01 DK116738 NIDDK NIH HHS M01 RR000052 NCRR NIH HHS N01HC95161 NHLBI NIH HHS HHSN268201100002C WHI NIH HHS UL1 TR001420 NCATS NIH HHS R01 HL104608 NHLBI NIH HHS HHSN268201800011C NHLBI NIH HHS U01 HL072518 NHLBI NIH HHS HHSN268201500014C NHLBI NIH HHS M01 RR007122 NCRR NIH HHS R01 NS075107 NINDS NIH HHS R01 HL146500 NHLBI NIH HHS HHSN268201500003I NHLBI NIH HHS R01 HL087263 NHLBI NIH HHS HHSN268201700005C NHLBI NIH HHS HHSN268201700001C NHLBI NIH HHS U01 HL072507 NHLBI NIH HHS HHSN268201700003C NHLBI NIH HHS R01 DK071891 NIDDK NIH HHS N01HC95167 NHLBI NIH HHS K01 HL130609 NHLBI NIH HHS N01HC25195 NHLBI NIH HHS R01 HL085571 NHLBI NIH HHS HHSN268201800015I NHLBI NIH HHS R01 HL071205 NHLBI NIH HHS HHSN268201700004C NHLBI NIH HHS UL1 TR000040 NCATS NIH HHS T32 HL007284 NHLBI NIH HHS HHSN268201100003I NHLBI NIH HHS HHSN268201100002I NHLBI NIH HHS HHSN268201700002I NHLBI NIH HHS HHSN268201700005I NHLBI NIH HHS R01 HL117626 NHLBI NIH HHS N01HC95166 NHLBI NIH HHS HHSN268201300029C NHLBI NIH HHS UL1 TR001881 NCATS NIH HHS R01 HL090682 NHLBI NIH HHS P01 HL132825 NHLBI NIH HHS HHSN268201700003I NHLBI NIH HHS HHSN268201200008I NHLBI NIH HHS R01 HL071258 NHLBI NIH HHS HHSN268201100001C WHI NIH HHS R01 HL055673 
NHLBI NIH HHS HHSN268201100004C WHI NIH HHS R01 HL092301 NHLBI NIH HHS
34	Stahl EA, Breen G, Forstner AJ, McQuillin A, Ripke S, Trubetskoy V, Mattheisen M, Wang Y, Coleman JRI, Gaspar HA, de Leeuw CA, Steinberg S, Pavlides JMW, Trzaskowski M, Byrne EM, Pers TH, Holmans PA, Richards AL, Abbott L, Agerbo E, Akil H, Albani D, Alliey-Rodriguez N, Als TD, Anjorin A, Antilla V, Awasthi S, Badner JA, Bækvad-Hansen M, Barchas JD, Bass N, Bauer M, Belliveau R, Bergen SE, Pedersen CB, Bøen E, Boks MP, Boocock J, Budde M, Bunney W, Burmeister M, Bybjerg-Grauholm J, Byerley W, Casas M, Cerrato F, Cervantes P, Chambert K, Charney AW, Chen D, Churchhouse C, Clarke TK, Coryell W, Craig DW, Cruceanu C, Curtis D, Czerski PM, Dale AM, de Jong S, Degenhardt F, Del-Favero J, DePaulo JR, Djurovic S, Dobbyn AL, Dumont A, Elvsåshagen T, Escott-Price V, Fan CC, Fischer SB, Flickinger M, Foroud TM, Forty L, Frank J, Fraser C, Freimer NB, Frisén L, Gade K, Gage D, Garnham J, Giambartolomei C, Pedersen MG, Goldstein J, Gordon SD, Gordon-Smith K, Green EK, Green MJ, Greenwood TA, Grove J, Guan W, Guzman-Parra J, Hamshere ML, Hautzinger M, Heilbronner U, Herms S, Hipolito M, Hoffmann P, Holland D, Huckins L, Jamain S, Johnson JS, Juréus A, Kandaswamy R, Karlsson R, Kennedy JL, Kittel-Schneider S, Knowles JA, Kogevinas M, Koller AC, Kupka R, Lavebratt C, Lawrence J, Lawson WB, Leber M, Lee PH, Levy SE, Li JZ, Liu C, Lucae S, Maaser A, MacIntyre DJ, Mahon PB, Maier W, Martinsson L, McCarroll S, McGuffin P, McInnis MG, McKay JD, Medeiros H, Medland SE, Meng F, Milani L, Montgomery GW, Morris DW, Mühleisen TW, Mullins N, Nguyen H, Nievergelt CM, Adolfsson AN, Nwulia EA, O'Donovan C, Loohuis LMO, Ori APS, Oruc L, Ösby U, Perlis RH, Perry A, Pfennig A, Potash JB, Purcell SM, Regeer EJ, Reif A, Reinbold CS, Rice JP, Rivas F, Rivera M, Roussos P, Ruderfer DM, Ryu E, Sánchez-Mora C, Schatzberg AF, Scheftner WA, Schork NJ, Shannon Weickert C, Shehktman T, Shilling PD, Sigurdsson E, Slaney C, Smeland OB, Sobell JL, Søholm Hansen C, Spijker AT, St Clair D, Steffens M, 
Strauss JS, Streit F, Strohmaier J, Szelinger S, Thompson RC, Thorgeirsson TE, Treutlein J, Vedder H, Wang W, Watson SJ, Weickert TW, Witt SH, Xi S, Xu W, Young AH, Zandi P, Zhang P, Zöllner S, Adolfsson R, Agartz I, Alda M, Backlund L, Baune BT, Bellivier F, Berrettini WH, Biernacka JM, Blackwood DHR, Boehnke M, Børglum AD, Corvin A, Craddock N, Daly MJ, Dannlowski U, Esko T, Etain B, Frye M, Fullerton JM, Gershon ES, Gill M, Goes F, Grigoroiu-Serbanescu M, Hauser J, Hougaard DM, Hultman CM, Jones I, Jones LA, Kahn RS, Kirov G, Landén M, Leboyer M, Lewis CM, Li QS, Lissowska J, Martin NG, Mayoral F, McElroy SL, McIntosh AM, McMahon FJ, Melle I, Metspalu A, Mitchell PB, Morken G, Mors O, Mortensen PB, Müller-Myhsok B, Myers RM, Neale BM, Nimgaonkar V, Nordentoft M, Nöthen MM, O'Donovan MC, Oedegaard KJ, Owen MJ, Paciga SA, Pato C, Pato MT, Posthuma D, Ramos-Quiroga JA, Ribasés M, Rietschel M, Rouleau GA, Schalling M, Schofield PR, Schulze TG, Serretti A, Smoller JW, Stefansson H, Stefansson K, Stordal E, Sullivan PF, Turecki G, Vaaler AE, Vieta E, Vincent JB, Werge T, Nurnberger JI, Wray NR, Di Florio A, Edenberg HJ, Cichon S, Ophoff RA, Scott LJ, Andreassen OA, Kelsoe J, Sklar P. Genome-wide association study identifies 30 loci associated with bipolar disorder. Nat Genet 2019;51:793-803. [PMID: 31043756 PMCID: PMC6956732 DOI: 10.1038/s41588-019-0397-8] [Citation(s) in RCA: 901] [Impact Index Per Article: 180.2] [Received: 03/11/2018] [Accepted: 03/18/2019] [Indexed: 12/18/2022]
Abstract: Bipolar disorder is a highly heritable psychiatric disorder. We performed a genome-wide association study (GWAS) including 20,352 cases and 31,358 controls of European descent, with follow-up analysis of 822 variants with P < 1 × 10⁻⁴ in an additional 9,412 cases and 137,760 controls.
Eight of the 19 variants that were genome-wide significant (P < 5 × 10⁻⁸) in the discovery GWAS were not genome-wide significant in the combined analysis, consistent with small effect sizes and limited power but also with genetic heterogeneity. In the combined analysis, 30 loci were genome-wide significant, including 20 newly identified loci. The significant loci contain genes encoding ion channels, neurotransmitter transporters and synaptic components. Pathway analysis revealed nine significantly enriched gene sets, including regulation of insulin secretion and endocannabinoid signaling. Bipolar I disorder is strongly genetically correlated with schizophrenia, driven by psychosis, whereas bipolar II disorder is more strongly correlated with major depressive disorder. These findings address key clinical questions and provide potential biological mechanisms for bipolar disorder.
MeSH Headings: Bipolar Disorder/classification; Bipolar Disorder/genetics; Case-Control Studies; Depressive Disorder, Major/genetics; Female; Genetic Loci; Genetic Predisposition to Disease; Genome-Wide Association Study; Humans; Male; Polymorphism, Single Nucleotide; Psychotic Disorders/genetics; Schizophrenia/genetics; Systems Biology.
Grants: R01 MH104964 NIMH NIH HHS; MR/L010305/1 Medical Research Council; G1000708 Medical Research Council; U01 MH109536 NIMH NIH HHS; U01 MH109514 NIMH NIH HHS; R01 MH085548 NIMH NIH HHS; R00 MH101367 NIMH NIH HHS; R01 MH123451 NIMH NIH HHS; Wellcome Trust; MR/L023784/2 Medical Research Council; 001 World Health Organization; R01 MH119243 NIMH NIH HHS.
35	Budde M, Friedrichs S, Alliey-Rodriguez N, Ament S, Badner JA, Berrettini WH, Bloss CS, Byerley W, Cichon S, Comes AL, Coryell W, Craig DW, Degenhardt F, Edenberg HJ, Foroud T, Forstner AJ, Frank J, Gershon ES, Goes FS, Greenwood TA, Guo Y, Hipolito M, Hood L, Keating BJ, Koller DL, Lawson WB, Liu C, Mahon PB, McInnis MG, McMahon FJ, Meier SM, Mühleisen TW, Murray SS, Nievergelt CM, Nurnberger JI, Nwulia EA, Potash JB, Quarless D, Rice J, Roach JC, Scheftner WA, Schork NJ, Shekhtman T, Shilling PD, Smith EN, Streit F, Strohmaier J, Szelinger S, Treutlein J, Witt SH, Zandi PP, Zhang P, Zöllner S, Bickeböller H, Falkai PG, Kelsoe JR, Nöthen MM, Rietschel M, Schulze TG, Malzahn D. Efficient region-based test strategy uncovers genetic risk factors for functional outcome in bipolar disorder. Eur Neuropsychopharmacol 2019;29:156-170. [PMID: 30503783 DOI: 10.1016/j.euroneuro.2018.10.005] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 07/12/2018] [Revised: 10/16/2018] [Accepted: 10/23/2018] [Indexed: 11/21/2022]
Abstract: Genome-wide association studies of case-control status have advanced the understanding of the genetic basis of psychiatric disorders. Further progress may be gained by increasing sample size but also by new analysis strategies that advance the exploitation of existing data, especially for clinically important quantitative phenotypes. The functionally-informed efficient region-based test strategy (FIERS) introduced herein uses prior knowledge on biological function and dependence of genotypes within a powerful statistical framework with improved sensitivity and specificity for detecting consistent genetic effects across studies.
As proof of concept, FIERS was used for the first genome-wide single nucleotide polymorphism (SNP)-based investigation on bipolar disorder (BD) that focuses on an important aspect of disease course, the functional outcome. FIERS identified a significantly associated locus on chromosome 15 (hg38: chr15:48965004 - 49464789 bp) with consistent effect strength between two independent studies (GAIN/TGen: European Americans, BOMA: Germans; n = 1592 BD patients in total). Protective and risk haplotypes were found on the most strongly associated SNPs. They contain a CTCF binding site (rs586758); CTCF sites are known to regulate sets of genes within a chromatin domain. The rs586758 - rs2086256 - rs1904317 haplotype is located in the promoter flanking region of the COPS2 gene, close to microRNA4716, and the EID1, SHC4, DTWD1 genes as plausible biological candidates. While implication with BD is novel, COPS2, EID1, and SHC4 are known to be relevant for neuronal differentiation and function and DTWD1 for psychopharmacological side effects. The test strategy FIERS that enabled this discovery is equally applicable for tag SNPs and sequence data. 
Key Words: Functional annotation; Global Assessment of Functioning; Hypothesis-driven GWAS; Kernel score test; Linkage disequilibrium; Psychiatric disorder.
MeSH Headings: Adolescent; Adult; Aged; Bipolar Disorder/diagnosis; Bipolar Disorder/genetics; Bipolar Disorder/physiopathology; Bipolar Disorder/psychology; Case-Control Studies; Female; Genetic Predisposition to Disease/genetics; Genome-Wide Association Study; Genotype; Haplotypes; Humans; Linkage Disequilibrium/genetics; Male; Middle Aged; Models, Statistical; Polymorphism, Single Nucleotide/genetics; Prognosis; Psychiatric Status Rating Scales; White People/genetics; Young Adult.
Grants: R01 MH059535 NIMH NIH HHS; Z01 MH002810 NIMH NIH HHS; R01 MH059545 NIMH NIH HHS; R01 MH059567 NIMH NIH HHS; R01 MH059548 NIMH NIH HHS; R01 MH059534 NIMH NIH HHS; R01 MH059533 NIMH NIH HHS; R01 MH059556 NIMH NIH HHS; R01 MH059553 NIMH NIH HHS; R01 MH060068 NIMH NIH HHS.
36	Carlson J, Li JZ, Zöllner S. Helmsman: fast and efficient mutation signature analysis for massive sequencing datasets. BMC Genomics 2018;19:845. [PMID: 30486787 PMCID: PMC6263557 DOI: 10.1186/s12864-018-5264-y] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 09/07/2018] [Accepted: 11/19/2018] [Indexed: 12/14/2022]
Abstract: Background: The spectrum of somatic single-nucleotide variants in cancer genomes often reflects the signatures of multiple distinct mutational processes, which can provide clinically actionable insights into cancer etiology. Existing software tools for identifying and evaluating these mutational signatures do not scale to analyze large datasets containing thousands of individuals or millions of variants. Results: We introduce Helmsman, a program designed to perform mutation signature analysis on arbitrarily large sequencing datasets. Helmsman is up to 300 times faster than existing software. Helmsman’s memory usage is independent of the number of variants, resulting in a small enough memory footprint to analyze datasets that would otherwise exceed the memory limitations of other programs. Conclusions: Helmsman is a computationally efficient tool that enables users to evaluate mutational signatures in massive sequencing datasets that are otherwise intractable with existing software. Helmsman is freely available at https://github.com/carjed/helmsman.
Key Words: Cancer genomics; Mutational signatures; Python; Single nucleotide variants.
37	Breuer R, Mattheisen M, Frank J, Krumm B, Treutlein J, Kassem L, Strohmaier J, Herms S, Mühleisen TW, Degenhardt F, Cichon S, Nöthen MM, Karypis G, Kelsoe J, Greenwood T, Nievergelt C, Shilling P, Shekhtman T, Edenberg H, Craig D, Szelinger S, Nurnberger J, Gershon E, Alliey-Rodriguez N, Zandi P, Goes F, Schork N, Smith E, Koller D, Zhang P, Badner J, Berrettini W, Bloss C, Byerley W, Coryell W, Foroud T, Guo Y, Hipolito M, Keating B, Lawson W, Liu C, Mahon P, McInnis M, Murray S, Nwulia E, Potash J, Rice J, Scheftner W, Zöllner S, McMahon FJ, Rietschel M, Schulze TG. Detecting significant genotype-phenotype association rules in bipolar disorder: market research meets complex genetics. Int J Bipolar Disord 2018;6:24. [PMID: 30415424 PMCID: PMC6230336 DOI: 10.1186/s40345-018-0132-x] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 03/19/2018] [Accepted: 08/22/2018] [Indexed: 12/21/2022]
Abstract: Background: Disentangling the etiology of common, complex diseases is a major challenge in genetic research. For bipolar disorder (BD), several genome-wide association studies (GWAS) have been performed. Similar to other complex disorders, major breakthroughs in explaining the high heritability of BD through GWAS have remained elusive. To overcome this dilemma, genetic research into BD has embraced a variety of strategies such as the formation of large consortia to increase sample size and sequencing approaches. Here we advocate a complementary approach making use of already existing GWAS data: a novel data mining procedure to identify yet undetected genotype–phenotype relationships.
We adapted association rule mining, a data mining technique traditionally used in retail market research, to identify frequent and characteristic genotype patterns showing strong associations to phenotype clusters. We applied this strategy to three independent GWAS datasets from 2835 phenotypically characterized patients with BD. In a discovery step, 20,882 candidate association rules were extracted. Results: Two of these rules (one associated with eating disorder and the other with anxiety) remained significant in an independent dataset after robust correction for multiple testing. Both showed considerable effect sizes (odds ratio ~ 3.4 and 3.0, respectively) and support previously reported molecular biological findings. Conclusion: Our approach detected novel specific genotype–phenotype relationships in BD that were missed by standard analyses like GWAS. While we developed and applied our method within the context of BD gene discovery, it may facilitate identifying highly specific genotype–phenotype relationships in subsets of genome-wide data sets of other complex phenotypes with similar epidemiological properties and challenges to gene discovery efforts.
Key Words: Bipolar disorder; Data mining; Genotype–phenotype patterns; Rule discovery; Subphenotypes.
38	Carlson J, Locke AE, Flickinger M, Zawistowski M, Levy S, Myers RM, Boehnke M, Kang HM, Scott LJ, Li JZ, Zöllner S. Extremely rare variants reveal patterns of germline mutation rate heterogeneity in humans. Nat Commun 2018;9:3753. [PMID: 30218074 PMCID: PMC6138700 DOI: 10.1038/s41467-018-05936-5] [Citation(s) in RCA: 80] [Impact Index Per Article: 13.3] [Received: 10/04/2017] [Accepted: 07/30/2018] [Indexed: 12/30/2022]
Abstract: A detailed understanding of the genome-wide variability of single-nucleotide germline mutation rates is essential to studying human genome evolution. Here, we use ~36 million singleton variants from 3560 whole-genome sequences to infer fine-scale patterns of mutation rate heterogeneity. Mutability is jointly affected by adjacent nucleotide context and diverse genomic features of the surrounding region, including histone modifications, replication timing, and recombination rate, sometimes suggesting specific mutagenic mechanisms. Remarkably, GC content, DNase hypersensitivity, CpG islands, and H3K36 trimethylation are associated with both increased and decreased mutation rates depending on nucleotide context. We validate these estimated effects in an independent dataset of ~46,000 de novo mutations, and confirm our estimates are more accurate than previously published results based on ancestrally older variants without considering genomic features. Our results thus provide the most refined portrait to date of the factors contributing to genome-wide variability of the human germline mutation rate.
MeSH Headings: Base Composition; CpG Islands; Cytosine; DNA Methylation; Deoxyribonucleases; Evolution, Molecular; Genetic Variation; Genome, Human; Germ-Line Mutation/genetics; Guanine; Histone Code; Humans; Mutation Rate; Polymorphism, Single Nucleotide.
Grants: U01MH105653 U.S. Department of Health & Human Services | National Institutes of Health (NIH); R01 GM118928 NIGMS NIH HHS; R01MH094145 U.S. Department of Health & Human Services | National Institutes of Health (NIH); T32 HG000040 NHGRI NIH HHS; R01 LM012848 NLM NIH HHS; R01 MH085548 NIMH NIH HHS; R01 DA043501 NIDA NIH HHS; P30 DK020572 NIDDK NIH HHS; T32HG00040 U.S. Department of Health & Human Services | National Institutes of Health (NIH); R01 MH094145 NIMH NIH HHS; U01 MH105653 NIMH NIH HHS; R01GM118928 U.S. Department of Health & Human Services | National Institutes of Health (NIH).
39	Reppell M, Zöllner S. An efficient algorithm for generating the internal branches of a Kingman coalescent. Theor Popul Biol 2018;122:57-66. [PMID: 28709926 PMCID: PMC5764821 DOI: 10.1016/j.tpb.2017.05.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 12/22/2016] [Revised: 05/19/2017] [Accepted: 05/26/2017] [Indexed: 01/16/2023]
Abstract: Coalescent simulations are a widely used approach for simulating sample genealogies, but can become computationally burdensome in large samples. Methods exist to analytically calculate a sample's expected frequency spectrum without simulating full genealogies. However, statistics that rely on the distribution of the length of internal coalescent branches, such as the probability that two mutations of equal size arose on the same genealogical branch, have previously required full coalescent simulations to estimate. Here, we present a sampling method capable of efficiently generating limited portions of sample genealogies using a series of analytic equations that give probabilities for the number, start, and end of internal branches conditional on the number of final samples they subtend. These equations are independent of the coalescent waiting times and need only be calculated a single time, lending themselves to efficient computation. We compare our method with full coalescent simulations to show the resulting distribution of branch lengths and summary statistics are equivalent, but that for many conditions our method is at least 10 times faster.
Key Words: Coalescent; Coalescent simulations; Genealogical topology.
MeSH Headings: Algorithms; Computer Simulation; Genealogy and Heraldry; Genetics, Population; Humans; Models, Genetic; Mutation; Pedigree; Probability.
Grants: R01 GM108805 NIGMS NIH HHS; R01 HG000376 NHGRI NIH HHS; T32 HG000040 NHGRI NIH HHS; R56 HG000376 NHGRI NIH HHS; R01 HG005855 NHGRI NIH HHS.
40	Prossin AR, Chandler M, Ryan KA, Saunders EF, Kamali M, Papadopoulos V, Zöllner S, Dantzer R, McInnis MG. Functional TSPO polymorphism predicts variance in the diurnal cortisol rhythm in bipolar disorder. Psychoneuroendocrinology 2018;89:194-202. [PMID: 29414032 PMCID: PMC6048960 DOI: 10.1016/j.psyneuen.2018.01.013] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 11/13/2017] [Revised: 01/11/2018] [Accepted: 01/17/2018] [Indexed: 12/23/2022]
Abstract: INTRODUCTION: Psychosocial stress contributes to onset/exacerbation of mood episodes and alcohol use, suggesting dysregulated diurnal cortisol rhythms underlie episodic exacerbations in Bipolar Disorder (BD). However, mechanisms underlying dysregulated HPA rhythms in BD and alcohol use disorders (AUD) are understudied. Knowledge of associated variance factors has great clinical translational potential by facilitating development of strategies to reduce stress-related relapse in BD and AUD. Evidence suggests structural changes to mitochondrial translocator protein (TSPO) (a regulator of steroid synthesis) due to the single nucleotide polymorphism rs6971 may explain much of this variance. However, whether rs6971 is associated with abnormal HPA rhythms and clinical exacerbation in humans is unknown. METHODS: To show this common TSPO polymorphism impacts HPA rhythms in BD, we tested whether rs6971 (dichotomized: presence/absence of polymorphism) predicted variance in diurnal cortisol rhythm (saliva: morning and evening for 3 days) in 107 BD (50 with and 57 without AUD) and 28 healthy volunteers of similar age and ethno-demographic distribution. RESULTS: Repeated measures ANOVA confirmed effects of BD (F(5,525) = 3.0, p = 0.010) and AUD (F(5,525) = 2.9, p = 0.012), but not TSPO polymorphism (p > 0.05).
Interactions were confirmed for TSPO × BD (F(5,525) = 3.9, p = 0.002) and for TSPO × AUD (F(5,525) = 2.8, p = 0.017). DISCUSSION: We identified differences in diurnal cortisol rhythm depending on presence/absence of common TSPO polymorphism in BD volunteers with or without AUD and healthy volunteers. These results have wide ranging implications but further validation is needed prior to optimal clinical translation.
Key Words: Alcohol use disorder; Biomarker; Bipolar disorder; Cortisol; Diurnal rhythm; Genetics; HPA axis; Immune; Precision medicine; Stress; TSPO; Variance factor; rs6971.
MeSH Headings: Adult; Alcoholism/genetics; Alcoholism/metabolism; Alleles; Bipolar Disorder/genetics; Circadian Rhythm/physiology; Female; Gene Frequency/genetics; Humans; Hydrocortisone/metabolism; Hypothalamo-Hypophyseal System/physiopathology; Male; Middle Aged; Pituitary-Adrenal System/physiopathology; Polymorphism, Single Nucleotide/genetics; Receptors, GABA/genetics; Receptors, GABA/metabolism; Saliva/chemistry.
Grants: K99 DA033454 NIDA NIH HHS; KL2 TR000434 NCATS NIH HHS; KL2 TR002241 NCATS NIH HHS; R00 DA033454 NIDA NIH HHS.
41	Boyce M, Warrington S, Cortezi B, Zöllner S, Vauléon S, Swinkels DW, Summo L, Schwoebel F, Riecke K. Safety, pharmacokinetics and pharmacodynamics of the anti-hepcidin Spiegelmer lexaptepid pegol in healthy subjects. Br J Pharmacol 2016;173:1580-8. [PMID: 26773325 PMCID: PMC4842915 DOI: 10.1111/bph.13433] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.0] [Received: 06/04/2015] [Revised: 01/11/2016] [Accepted: 01/11/2016] [Indexed: 01/16/2023]
Abstract: BACKGROUND AND PURPOSE: Anaemia of chronic disease is characterized by impaired erythropoiesis due to functional iron deficiency, often caused by excessive hepcidin. Lexaptepid pegol, a pegylated structured l-oligoribonucleotide, binds and inactivates hepcidin. EXPERIMENTAL APPROACH: We conducted a placebo-controlled study on the safety, pharmacokinetics and pharmacodynamics of lexaptepid after single and repeated i.v. and s.c. administration to 64 healthy subjects at doses from 0.3 to 4.8 mg·kg⁻¹. KEY RESULTS: After treatment with lexaptepid, serum iron concentration and transferrin increased dose-dependently. Iron increased from approximately 20 μmol·L⁻¹ at baseline by 67% at 8 h after i.v. infusion of 1.2 mg·kg⁻¹ lexaptepid. The pharmacokinetics showed dose-proportional increases in peak plasma concentrations and moderately over-proportional increases in systemic exposure. Lexaptepid had no effect on hepcidin production or anti-drug antibodies. Treatment with lexaptepid was generally safe and well tolerated, with mild and transient transaminase increases at doses ≥2.4 mg·kg⁻¹ and with local injection site reactions after s.c. but not after i.v. administration. CONCLUSIONS AND IMPLICATIONS: Lexaptepid pegol inhibited hepcidin and dose-dependently raised serum iron and transferrin saturation.
The compound is being further developed to treat anaemia of chronic disease.
MeSH Headings: Dose-Response Relationship, Drug; Double-Blind Method; Drug Monitoring; Female; Healthy Volunteers; Hepcidins/antagonists & inhibitors; Humans; Iron/blood; Male; Oligoribonucleotides/administration & dosage; Oligoribonucleotides/adverse effects; Oligoribonucleotides/pharmacokinetics; Structure-Activity Relationship; Transferrin/analysis.
42	Li M, Rothwell R, Vermaat M, Wachsmuth M, Schröder R, Laros JFJ, van Oven M, de Bakker PIW, Bovenberg JA, van Duijn CM, van Ommen GJB, Slagboom PE, Swertz MA, Wijmenga C, Kayser M, Boomsma DI, Zöllner S, de Knijff P, Stoneking M. Transmission of human mtDNA heteroplasmy in the Genome of the Netherlands families: support for a variable-size bottleneck. Genome Res 2016;26:417-26. [PMID: 26916109 PMCID: PMC4817766 DOI: 10.1101/gr.203216.115] [Citation(s) in RCA: 62] [Impact Index Per Article: 7.8] [Received: 12/11/2015] [Accepted: 01/21/2016] [Indexed: 12/17/2022]
Abstract: Although previous studies have documented a bottleneck in the transmission of mtDNA genomes from mothers to offspring, several aspects remain unclear, including the size and nature of the bottleneck. Here, we analyze the dynamics of mtDNA heteroplasmy transmission in the Genomes of the Netherlands (GoNL) data, which consists of complete mtDNA genome sequences from 228 trios, eight dizygotic (DZ) twin quartets, and 10 monozygotic (MZ) twin quartets. Using a minor allele frequency (MAF) threshold of 2%, we identified 189 heteroplasmies in the trio mothers, of which 59% were transmitted to offspring, and 159 heteroplasmies in the trio offspring, of which 70% were inherited from the mothers. MZ twin pairs exhibited greater similarity in MAF at heteroplasmic sites than DZ twin pairs, suggesting that the heteroplasmy MAF in the oocyte is the major determinant of the heteroplasmy MAF in the offspring. We used a likelihood method to estimate the effective number of mtDNA genomes transmitted to offspring under different bottleneck models; a variable bottleneck size model provided the best fit to the data, with an estimated mean of nine individual mtDNA genomes transmitted.
We also found evidence for negative selection during transmission against novel heteroplasmies (in which the minor allele has never been observed in polymorphism data). These novel heteroplasmies are enhanced for tRNA and rRNA genes, and mutations associated with mtDNA diseases frequently occur in these genes. Our results thus suggest that the female germ line is able to recognize and select against deleterious heteroplasmies.
43	Tang CS, Zhang H, Cheung CYY, Xu M, Ho JCY, Zhou W, Cherny SS, Zhang Y, Holmen O, Au KW, Yu H, Xu L, Jia J, Porsch RM, Sun L, Xu W, Zheng H, Wong LY, Mu Y, Dou J, Fong CHY, Wang S, Hong X, Dong L, Liao Y, Wang J, Lam LSM, Su X, Yan H, Yang ML, Chen J, Siu CW, Xie G, Woo YC, Wu Y, Tan KCB, Hveem K, Cheung BMY, Zöllner S, Xu A, Eugene Chen Y, Jiang CQ, Zhang Y, Lam TH, Ganesh SK, Huo Y, Sham PC, Lam KSL, Willer CJ, Tse HF, Gao W. Exome-wide association analysis reveals novel coding sequence variants associated with lipid traits in Chinese. Nat Commun 2015;6:10206. [PMID: 26690388 PMCID: PMC4703860 DOI: 10.1038/ncomms10206] [Citation(s) in RCA: 70] [Impact Index Per Article: 7.8] [Received: 08/26/2015] [Accepted: 11/13/2015] [Indexed: 12/19/2022]
Abstract: Blood lipids are important risk factors for coronary artery disease (CAD). Here we perform an exome-wide association study by genotyping 12,685 Chinese, using a custom Illumina HumanExome BeadChip, to identify additional loci influencing lipid levels. Single-variant association analysis on 65,671 single nucleotide polymorphisms reveals 19 loci associated with lipids at exome-wide significance (P < 2.69 × 10⁻⁷), including three Asian-specific coding variants in known genes (CETP p.Asp459Gly, PCSK9 p.Arg93Cys and LDLR p.Arg257Trp). Furthermore, missense variants at two novel loci (PNPLA3 p.Ile148Met and PKD1L3 p.Thr429Ser) also influence levels of triglycerides and low-density lipoprotein cholesterol, respectively. Another novel gene, TEAD2, is found to be associated with high-density lipoprotein cholesterol through gene-based association analysis. Most of these newly identified coding variants show suggestive association (P < 0.05) with CAD.
These findings demonstrate that exome-wide genotyping on samples of non-European ancestry can identify additional population-specific possible causal variants, shedding light on novel lipid biology and CAD. An important risk factor for coronary artery disease is the level of blood lipids. Here the authors conduct an exome-wide association study in Chinese cohorts and identify three novel loci associated with lipid levels as well as three Asian-specific variants in known loci.
44	Lo Y, Zhang L, Foxman B, Zöllner S. Whole-genome sequencing of uropathogenic Escherichia coli reveals long evolutionary history of diversity and virulence. Infect Genet Evol 2015;34:244-50. [PMID: 26112070 DOI: 10.1016/j.meegid.2015.06.023] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 03/04/2015] [Revised: 06/17/2015] [Accepted: 06/20/2015] [Indexed: 01/07/2023]
Abstract: Uropathogenic Escherichia coli (UPEC) are phenotypically and genotypically very diverse. This diversity makes it challenging to understand the evolution of UPEC adaptations responsible for causing urinary tract infections (UTI). To gain insight into the relationship between evolutionary divergence and adaptive paths to uropathogenicity, we sequenced at deep coverage (190×) the genomes of 19 E. coli strains from urinary tract infection patients from the same geographic area. Our sample consisted of 14 UPEC isolates and 5 non-UTI-causing (commensal) rectal E. coli isolates. After identifying strain variants using de novo assembly-based methods, we clustered the strains based on pairwise sequence differences using a neighbor-joining algorithm. We examined evolutionary signals on the whole-genome phylogeny and contrasted these signals with those found on gene trees constructed based on specific uropathogenic virulence factors. The whole-genome phylogeny showed that the divergence between UPEC and commensal E. coli strains without known UPEC virulence factors happened over 32 million generations ago. Pairwise diversity between any two strains was also high, suggesting multiple genetic origins of uropathogenic strains in a small geographic region. Contrasting the whole-genome phylogeny with three gene trees constructed from common uropathogenic virulence factors, we detected no selective advantage of these virulence genes over other genomic regions.
These results suggest that UPEC acquired uropathogenicity a long time ago and used it opportunistically to cause extraintestinal infections.
Key Words: Next-generation sequencing; Phylogeny; Uropathogen.
45	Lin KH, Zöllner S. Robust and Powerful Affected Sibpair Test for Rare Variant Association. Genet Epidemiol 2015;39:325-33. [PMID: 25966809 DOI: 10.1002/gepi.21903] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 11/05/2014] [Revised: 03/25/2015] [Accepted: 04/01/2015] [Indexed: 11/09/2022]
Abstract: Advances in DNA sequencing technology facilitate investigating the impact of rare variants on complex diseases. However, using a conventional case-control design, large samples are needed to capture enough rare variants to achieve sufficient power for testing the association between suspected loci and complex diseases. In such large samples, population stratification may easily cause spurious signals. One approach to overcome stratification is to use a family-based design. For rare variants, this strategy is especially appropriate, as power can be increased considerably by analyzing cases with affected relatives. We propose a novel framework for association testing in affected sibpairs by comparing the allele count of rare variants on chromosome regions shared identical by descent to the allele count of rare variants on nonshared chromosome regions, referred to as test for rare variant association with family-based internal control (TRAFIC). This design is generally robust to population stratification as cases and controls are matched within each sibpair. We evaluate the power analytically using a general model for effect size of rare variants. For the same number of genotyped people, TRAFIC shows superior power over the conventional case-control study for variants with summed risk allele frequency f < 0.05; this power advantage is even more substantial when considering allelic heterogeneity. For complex models of gene-gene interaction, this power advantage depends on the direction of interaction and overall heritability.
In sum, we introduce a new method for analyzing rare variants in affected sibpairs that is robust to population stratification, and provide freely available software. Key Words: association test; dichotomous traits; family studies; rare variants; sequencing
46	Lo Y, Kang HM, Nelson MR, Othman MI, Chissoe SL, Ehm MG, Abecasis GR, Zöllner S. Comparing variant calling algorithms for target-exon sequencing in a large sample. BMC Bioinformatics 2015;16:75. [PMID: 25884587 PMCID: PMC4359451 DOI: 10.1186/s12859-015-0489-0] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 09/09/2014] [Accepted: 02/03/2015] [Indexed: 12/30/2022] Abstract: BACKGROUND: Sequencing studies of exonic regions aim to identify rare variants contributing to complex traits. With high coverage and large sample size, these studies tend to apply simple variant calling algorithms. However, coverage is often heterogeneous; sites with insufficient coverage may benefit from sophisticated calling algorithms used in low-coverage sequencing studies. We evaluate the potential benefits of different calling strategies by performing a comparative analysis of variant calling methods on exonic data from 202 genes sequenced at 24x in 7,842 individuals. We call variants using individual-based, population-based and linkage disequilibrium (LD)-aware methods with stringent quality control. We measure genotype accuracy by the concordance with on-target GWAS genotypes and between 80 pairs of sequencing replicates. We validate selected singleton variants using capillary sequencing. RESULTS: Using these calling methods, we detected over 27,500 variants at the targeted exons; >57% were singletons. The singletons identified by individual-based analyses were of the highest quality. However, individual-based analyses generated more missing genotypes (4.72%) than population-based (0.47%) and LD-aware (0.17%) analyses. Moreover, individual-based genotypes were the least concordant with array-based genotypes and replicates. 
Population-based genotypes were less concordant than genotypes from LD-aware analyses with extended haplotypes. We reanalyzed the same dataset with a second set of callers and showed again that the individual-based caller identified more high-quality singletons than the population-based caller. We also replicated this result in a second dataset of 57 genes sequenced at 127.5x in 3,124 individuals. CONCLUSIONS: We recommend population-based analyses for high quality variant calls with few missing genotypes. With extended haplotypes, LD-aware methods generate the most accurate and complete genotypes. In addition, individual-based analyses should complement the above methods to obtain the most singleton variants.
47	Maier R, Moser G, Chen GB, Ripke S, Coryell W, Potash JB, Scheftner WA, Shi J, Weissman MM, Hultman CM, Landén M, Levinson DF, Kendler KS, Smoller JW, Wray NR, Lee SH, Absher D, Agartz I, Akil H, Amin F, Andreassen O, Anjorin A, Anney R, Arking D, Asherson P, Azevedo M, Backlund L, Badner J, Bailey A, Banaschewski T, Barchas J, Barnes M, Barrett T, Bass N, Battaglia A, Bauer M, Bayés M, Bellivier F, Bergen S, Berrettini W, Betancur C, Bettecken T, Biederman J, Binder E, Black D, Blackwood D, Bloss C, Boehnke M, Boomsma D, Breen G, Breuer R, Bruggeman R, Buccola N, Buitelaar J, Bunney W, Buxbaum J, Byerley W, Caesar S, Cahn W, Cantor R, Casas M, Chakravarti A, Chambert K, Choudhury K, Cichon S, Cloninger C, Collier D, Cook E, Coon H, Cormand B, Cormican P, Corvin A, Coryell W, Craddock N, Craig D, Craig I, Crosbie J, Cuccaro M, Curtis D, Czamara D, Daly M, Datta S, Dawson G, Day R, De Geus E, Degenhardt F, Devlin B, Djurovic S, Donohoe G, Doyle A, Duan J, Dudbridge F, Duketis E, Ebstein R, Edenberg H, Elia J, Ennis S, Etain B, Fanous A, Faraone S, Farmer A, Ferrier I, Flickinger M, Fombonne E, Foroud T, Frank J, Franke B, Fraser C, Freedman R, Freimer N, Freitag C, Friedl M, Frisén L, Gallagher L, Gejman P, Georgieva L, Gershon E, Geschwind D, Giegling I, Gill M, Gordon S, Gordon-Smith K, Green E, Greenwood T, Grice D, Gross M, Grozeva D, Guan W, Gurling H, De Haan L, Haines J, Hakonarson H, Hallmayer J, Hamilton S, Hamshere M, Hansen T, Hartmann A, Hautzinger M, Heath A, Henders A, Herms S, Hickie I, Hipolito M, Hoefels S, Holmans P, Holsboer F, Hoogendijk W, Hottenga JJ, Hultman C, Hus V, Ingason A, Ising M, Jamain S, Jones I, Jones L, Kähler A, Kahn R, Kandaswamy R, Keller M, Kelsoe J, Kendler K, Kennedy J, Kenny E, Kent L, Kim Y, Kirov G, Klauck S, Klei L, Knowles J, Kohli M, Koller D, Konte B, Korszun A, Krabbendam L, Krasucki R, Kuntsi J, Kwan P, Landén M, Långström N, Lathrop M, Lawrence J, Lawson W, Leboyer M, Ledbetter D, Lee P, Lencz T, Lesch KP, 
Levinson D, Lewis C, Li J, Lichtenstein P, Lieberman J, Lin DY, Linszen D, Liu C, Lohoff F, Loo S, Lord C, Lowe J, Lucae S, MacIntyre D, Madden P, Maestrini E, Magnusson P, Mahon P, Maier W, Malhotra A, Mane S, Martin C, Martin N, Mattheisen M, Matthews K, Mattingsdal M, McCarroll S, McGhee K, McGough J, McGrath P, McGuffin P, McInnis M, McIntosh A, McKinney R, McLean A, McMahon F, McMahon W, McQuillin A, Medeiros H, Medland S, Meier S, Melle I, Meng F, Meyer J, Middeldorp C, Middleton L, Milanova V, Miranda A, Monaco A, Montgomery G, Moran J, Moreno-De-Luca D, Morken G, Morris D, Morrow E, Moskvina V, Mowry B, Muglia P, Mühleisen T, Müller-Myhsok B, Murtha M, Myers R, Myin-Germeys I, Neale B, Nelson S, Nievergelt C, Nikolov I, Nimgaonkar V, Nolen W, Nöthen M, Nurnberger J, Nwulia E, Nyholt D, O’Donovan M, O’Dushlaine C, Oades R, Olincy A, Oliveira G, Olsen L, Ophoff R, Osby U, Owen M, Palotie A, Parr J, Paterson A, Pato C, Pato M, Penninx B, Pergadia M, Pericak-Vance M, Perlis R, Pickard B, Pimm J, Piven J, Posthuma D, Potash J, Poustka F, Propping P, Purcell S, Puri V, Quested D, Quinn E, Ramos-Quiroga J, Rasmussen H, Raychaudhuri S, Rehnström K, Reif A, Ribasés M, Rice J, Rietschel M, Ripke S, Roeder K, Roeyers H, Rossin L, Rothenberger A, Rouleau G, Ruderfer D, Rujescu D, Sanders A, Sanders S, Santangelo S, Schachar R, Schalling M, Schatzberg A, Scheftner W, Schellenberg G, Scherer S, Schork N, Schulze T, Schumacher J, Schwarz M, Scolnick E, Scott L, Sergeant J, Shi J, Shilling P, Shyn S, Silverman J, Sklar P, Slager S, Smalley S, Smit J, Smith E, Smoller J, Sonuga-Barke E, St Clair D, State M, Steffens M, Steinhausen HC, Strauss J, Strohmaier J, Stroup T, Sullivan P, Sutcliffe J, Szatmari P, Szelinger S, Thapar A, Thirumalai S, Thompson R, Todorov A, Tozzi F, Treutlein J, Tzeng JY, Uhr M, van den Oord E, Van Grootheest G, Van Os J, Vicente A, Vieland V, Vincent J, Visscher P, Walsh C, Wassink T, Watson S, Weiss L, Weissman M, Werge T, Wienker T, Wiersma D, 
Wijsman E, Willemsen G, Williams N, Willsey A, Witt S, Wray N, Xu W, Young A, Yu T, Zammit S, Zandi P, Zhang P, Zitman F, Zöllner S. Joint analysis of psychiatric disorders increases accuracy of risk prediction for schizophrenia, bipolar disorder, and major depressive disorder. Am J Hum Genet 2015;96:283-94. [PMID: 25640677 PMCID: PMC4320268 DOI: 10.1016/j.ajhg.2014.12.006] [Citation(s) in RCA: 163] [Impact Index Per Article: 18.1] [Received: 10/23/2014] [Accepted: 12/08/2014] [Indexed: 12/11/2022] Abstract: Genetic risk prediction has several potential applications in medical research and clinical practice and could be used, for example, to stratify a heterogeneous population of patients by their predicted genetic risk. However, for polygenic traits, such as psychiatric disorders, the accuracy of risk prediction is low. Here we use a multivariate linear mixed model and apply multi-trait genomic best linear unbiased prediction for genetic risk prediction. This method exploits correlations between disorders and simultaneously evaluates individual risk for each disorder. We show that the multivariate approach significantly increases the prediction accuracy for schizophrenia, bipolar disorder, and major depressive disorder in the discovery as well as in independent validation datasets. By grouping SNPs based on genome annotation and fitting multiple random effects, we show that the prediction accuracy could be further improved. The gain in prediction accuracy of the multivariate approach is equivalent to an increase in sample size of 34% for schizophrenia, 68% for bipolar disorder, and 76% for major depressive disorders using single trait models. 
Because our approach can be readily applied to any number of GWAS datasets of correlated traits, it is a flexible and powerful tool to maximize prediction accuracy. With current sample size, risk predictors are not useful in a clinical setting but already are a valuable research tool, for example in experimental designs comparing cases with high and low polygenic risk. MeSH Headings: Bipolar Disorder/genetics; Depressive Disorder, Major/genetics; Genetic Testing/methods; Genetics, Medical/methods; Humans; Linear Models; Mental Disorders/genetics; Multifactorial Inheritance/genetics; Multivariate Analysis; Polymorphism, Single Nucleotide/genetics; Risk Assessment/methods; Schizophrenia/genetics. Grants: D43 TW009114 FIC NIH HHS; R01 MH060912 NIMH NIH HHS; P01 GM099568 NIGMS NIH HHS; MR/L010305/1 Medical Research Council; G1000708 Medical Research Council; R01 MH059541 NIMH NIH HHS; R01 MH085548 NIMH NIH HHS; UL1 TR000142 NCATS NIH HHS; R01 MH061686 NIMH NIH HHS; R01 MH090553 NIMH NIH HHS; R01 MH059552 NIMH NIH HHS; U01 MH085520 NIMH NIH HHS; R01 MH077139 NIMH NIH HHS; 104036 Wellcome Trust; G0800509 Medical Research Council; G0300189 Medical Research Council; R01 MH063480 NIMH NIH HHS; R01 MH094293 NIMH NIH HHS; R01 MH059542 NIMH NIH HHS; R25 MH060482 NIMH NIH HHS; R01 MH075131 NIMH NIH HHS
48	Wang C, Zhan X, Bragg-Gresham J, Kang HM, Stambolian D, Chew EY, Branham KE, Heckenlively J, Fulton R, Wilson RK, Mardis ER, Lin X, Swaroop A, Zöllner S, Abecasis GR. Ancestry estimation and control of population stratification for sequence-based association studies. Nat Genet 2014;46:409-15. [PMID: 24633160 PMCID: PMC4084909 DOI: 10.1038/ng.2924] [Citation(s) in RCA: 105] [Impact Index Per Article: 10.5] [Received: 08/08/2013] [Accepted: 02/21/2014] [Indexed: 12/15/2022] Abstract: Estimating individual ancestry is important in genetic association studies where population structure leads to false positive signals, although assigning ancestry remains challenging with targeted sequence data. We propose a new method for the accurate estimation of individual genetic ancestry, based on direct analysis of off-target sequence reads, and implement our method in the publicly available LASER software. We validate the method using simulated and empirical data and show that the method can accurately infer worldwide continental ancestry when used with sequencing data sets with whole-genome shotgun coverage as low as 0.001×. For estimates of fine-scale ancestry within Europe, the method performs well with coverage of 0.1×. On an even finer scale, the method improves discrimination between exome-sequenced study participants originating from different provinces within Finland. Finally, we show that our method can be used to improve case-control matching in genetic association studies and to reduce the risk of spurious findings due to population structure. 
MeSH Headings: Base Sequence; Computer Simulation; Genetic Association Studies/methods; Genetics, Population/methods; Models, Genetic; Molecular Sequence Data; Polymorphism, Single Nucleotide/genetics; Principal Component Analysis; Sequence Analysis, DNA; Software. Grants: EY022005 NEI NIH HHS; R35 CA197449 NCI NIH HHS; HG003079 NHGRI NIH HHS; R56 HG000376 NHGRI NIH HHS; R29 CA076404 NCI NIH HHS; HG005855 NHGRI NIH HHS; R01 CA076404 NCI NIH HHS; Intramural NIH HHS; HG007022 NHGRI NIH HHS; P01 CA134294 NCI NIH HHS; R01 HG007022 NHGRI NIH HHS; HG000376 NHGRI NIH HHS; R01 EY022005 NEI NIH HHS; R56 DK062370 NIDDK NIH HHS; R37 CA076404 NCI NIH HHS; RC2 HG005552 NHGRI NIH HHS; U01 HG006513 NHGRI NIH HHS; CA134294 NCI NIH HHS; R01 HG000376 NHGRI NIH HHS; R01 DK062370 NIDDK NIH HHS; R01 HG005855 NHGRI NIH HHS; CA076404 NCI NIH HHS; HG006513 NHGRI NIH HHS; U01 DK062370 NIDDK NIH HHS; U54 HG003079 NHGRI NIH HHS; DK062370 NIDDK NIH HHS; HG005552 NHGRI NIH HHS
49	Moroi SE, Raoof DA, Reed DM, Zöllner S, Qin Z, Richards JE. Progress toward personalized medicine for glaucoma. Expert Review of Ophthalmology 2014;4:145-161. [PMID: 23914252 DOI: 10.1586/eop.09.6] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Indexed: 11/08/2022] Abstract: How will you respond when a patient asks, "Doctor, what can I do to prevent myself from going blind from glaucoma like mom?" There is optimism that genetic profiling will help target patients to individualized treatments based on validated disease risk alleles, validated pharmacogenetic markers and behavioral modification. Personalized medicine will become a reality through identification of disease and pharmacogenetic markers, followed by careful study of how to employ this information in order to improve treatment outcomes. With advances in genomic technologies, research has shifted from the simple monogenic disease model to a complex multigenic and environmental disease model to answer these questions. Our challenges lie in developing risk models that incorporate gene-gene interactions, gene copy-number variations, environmental interactions, treatment effects and clinical covariates. Key Words: aqueous humor dynamics; genetics; glaucoma; intraocular pressure; personalized medicine; pharmacogenetics; pharmacogenomics
50	Zawistowski M, Reppell M, Wegmann D, St Jean PL, Ehm MG, Nelson MR, Novembre J, Zöllner S. Analysis of rare variant population structure in Europeans explains differential stratification of gene-based tests. Eur J Hum Genet 2014;22:1137-44. [PMID: 24398795 DOI: 10.1038/ejhg.2013.297] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 08/30/2013] [Revised: 11/27/2013] [Accepted: 11/28/2013] [Indexed: 11/09/2022] Abstract: There is substantial interest in the role of rare genetic variants in the etiology of complex human diseases. Several gene-based tests have been developed to simultaneously analyze multiple rare variants for association with phenotypic traits. The tests can largely be partitioned into two classes, 'burden' tests and 'joint' tests, based on how they accumulate evidence of association across sites. We used the empirical joint site frequency spectra of rare, nonsynonymous variation from a large multi-population sequencing study to explore the effect of realistic rare variant population structure on gene-based tests. We observed an important difference between the two test classes: their susceptibility to population stratification. Focusing on European samples, we found that joint tests, which allow variants to have opposite directions of effect, consistently showed higher levels of P-value inflation than burden tests. We determined that the differential stratification was caused by two specific patterns in the interpopulation distribution of rare variants, each correlating with inflation in one of the test classes. The pattern that inflates joint tests is more prevalent in real data, explaining the higher levels of inflation in these tests. Furthermore, we show that the different sources of inflation between tests lead to heterogeneous responses to genomic control correction and the number of variants analyzed. 
Our results indicate that care must be taken when interpreting joint and burden analyses of the same set of rare variants, in particular, to avoid mistaking inflated P-values in joint tests for stronger signals of true associations.