1
Elston RC. An Accidental Genetic Epidemiologist. Annu Rev Genomics Hum Genet 2020; 21:15-36. [DOI: 10.1146/annurev-genom-103119-125052] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Abstract
I briefly describe my early life and how, through a series of serendipitous events, I became a genetic epidemiologist. I discuss how the Elston–Stewart algorithm was discovered and its contribution to segregation, linkage, and association analysis. New linkage findings and paternity testing resulted from having a genotyping lab. The different meanings of interaction—statistical and biological—are clarified. The computer package S.A.G.E. (Statistical Analysis for Genetic Epidemiology), based on extensive method development over two decades, was conceived in 1986, flourished for 20 years, and is now freely available for use and further development. Finally, I describe methods to estimate and test hypotheses about familial correlations, and point out that the liability model often used to estimate disease heritability estimates the heritability of that liability, rather than of the disease itself, and so can be highly dependent on the assumed distribution of that liability.
Affiliation(s)
- Robert C. Elston
- Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, Ohio 44106, USA
2
Amorim A, Pinto N. Big data in forensic genetics. Forensic Sci Int Genet 2018; 37:102-105. [PMID: 30142461 DOI: 10.1016/j.fsigen.2018.08.001] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 02/28/2018] [Revised: 07/23/2018] [Accepted: 08/01/2018] [Indexed: 12/16/2022]
Abstract
The potential and difficulties of applying genome-wide data in forensics are analyzed. We argue that, besides statistical, computational, ethical, economic and technical validation problems, the state of the art of population genetics theory is insufficient to deal with the forensic use of this type of data. In order to keep the current standards of quantifying and reporting genetic evidence, namely in kinship analyses and identification, the theoretical framework must improve substantially, since obtaining genome-wide results provides experts with data whose evidentiary value they cannot quantify. Therefore, until a satisfactory, generalized theoretical and biostatistical model is achieved, it may well be wiser to improve the already established approaches based on a limited, pre-defined number of validated genetic markers, amenable to consensual handling and reporting. Whole genome population analyses will prove extremely useful in selecting the best suited and most efficient of those markers.
Affiliation(s)
- António Amorim
- Instituto de Patologia e Imunologia Molecular da Universidade do Porto (IPATIMUP), Porto, Portugal; Instituto de Investigação e Inovação em Saúde (i3s), Universidade do Porto, Porto, Portugal; Faculdade de Ciências, Universidade do Porto, Porto, Portugal
- Nadia Pinto
- Instituto de Patologia e Imunologia Molecular da Universidade do Porto (IPATIMUP), Porto, Portugal; Instituto de Investigação e Inovação em Saúde (i3s), Universidade do Porto, Porto, Portugal; CMUP, Centro de Matemática da Universidade do Porto, Porto, Portugal.
3
Fang H, Wu Y, Narzisi G, O'Rawe JA, Barrón LTJ, Rosenbaum J, Ronemus M, Iossifov I, Schatz MC, Lyon GJ. Reducing INDEL calling errors in whole genome and exome sequencing data. Genome Med 2014; 6:89. [PMID: 25426171 PMCID: PMC4240813 DOI: 10.1186/s13073-014-0089-z] [Citation(s) in RCA: 120] [Impact Index Per Article: 12.0] [Received: 06/14/2014] [Accepted: 10/16/2014] [Indexed: 12/30/2022] [Open Access]
Abstract
Background: INDELs, especially those disrupting protein-coding regions of the genome, have been strongly associated with human diseases. However, there are still many errors in INDEL variant calling, driven by library preparation, sequencing biases, and algorithm artifacts.
Methods: We characterized whole genome sequencing (WGS), whole exome sequencing (WES), and PCR-free sequencing data from the same samples to investigate the sources of INDEL errors. We also developed a classification scheme based on coverage and composition to rank high- and low-quality INDEL calls. We performed a large-scale validation experiment on 600 loci and found that high-quality INDELs have a substantially lower error rate than low-quality INDELs (7% vs. 51%).
Results: Simulation and experimental data show that assembly-based callers are significantly more sensitive and robust for detecting large INDELs (>5 bp) than alignment-based callers, consistent with published data. The concordance of INDEL detection between WGS and WES is low (53%), and WGS data uniquely identifies 10.8-fold more high-quality INDELs. The validation rate for WGS-specific INDELs is also much higher than that for WES-specific INDELs (84% vs. 57%), and WES misses many large INDELs. In addition, the concordance for INDEL detection between standard WGS and PCR-free sequencing is 71%, and standard WGS data uniquely identifies 6.3-fold more low-quality INDELs. Furthermore, accurate detection with Scalpel of heterozygous INDELs requires 1.2-fold higher coverage than that for homozygous INDELs. Lastly, homopolymer A/T INDELs are a major source of low-quality INDEL calls, and they are highly enriched in the WES data.
Conclusions: Overall, we show that the accuracy of INDEL detection with WGS is much greater than with WES, even in the targeted region. We calculated that 60X WGS depth of coverage from the HiSeq platform is needed to recover 95% of INDELs detected by Scalpel. While this is higher than current sequencing practice, the deeper coverage may save total project costs because of the greater accuracy and sensitivity. Finally, we investigate sources of INDEL errors (for example, capture deficiency, PCR amplification, homopolymers) with various data that will serve as a guideline to effectively reduce INDEL errors in genome sequencing.
Affiliation(s)
- Han Fang
- Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, One Bungtown Road, Cold Spring Harbor, NY, USA; Stony Brook University, 100 Nicolls Rd, Stony Brook, NY, USA; Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, One Bungtown Road, Cold Spring Harbor, NY, USA
- Yiyang Wu
- Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, One Bungtown Road, Cold Spring Harbor, NY, USA; Stony Brook University, 100 Nicolls Rd, Stony Brook, NY, USA
- Giuseppe Narzisi
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, One Bungtown Road, Cold Spring Harbor, NY, USA; New York Genome Center, New York, NY, USA
- Jason A O'Rawe
- Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, One Bungtown Road, Cold Spring Harbor, NY, USA; Stony Brook University, 100 Nicolls Rd, Stony Brook, NY, USA
- Laura T Jimenez Barrón
- Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, One Bungtown Road, Cold Spring Harbor, NY, USA; Centro de Ciencias Genomicas, Universidad Nacional Autonoma de Mexico, Cuernavaca, Morelos, Mexico
- Julie Rosenbaum
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, One Bungtown Road, Cold Spring Harbor, NY, USA
- Michael Ronemus
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, One Bungtown Road, Cold Spring Harbor, NY, USA
- Ivan Iossifov
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, One Bungtown Road, Cold Spring Harbor, NY, USA
- Michael C Schatz
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, One Bungtown Road, Cold Spring Harbor, NY, USA
- Gholson J Lyon
- Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, One Bungtown Road, Cold Spring Harbor, NY, USA; Stony Brook University, 100 Nicolls Rd, Stony Brook, NY, USA
5
Schrodi SJ, Mukherjee S, Shan Y, Tromp G, Sninsky JJ, Callear AP, Carter TC, Ye Z, Haines JL, Brilliant MH, Crane PK, Smelser DT, Elston RC, Weeks DE. Genetic-based prediction of disease traits: prediction is very difficult, especially about the future. Front Genet 2014; 5:162. [PMID: 24917882 PMCID: PMC4040440 DOI: 10.3389/fgene.2014.00162] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.7] [Received: 02/13/2014] [Accepted: 05/15/2014] [Indexed: 01/08/2023] [Open Access]
Abstract
Translation of results from genetic findings to inform medical practice is a highly anticipated goal of human genetics. The aim of this paper is to review and discuss the role of genetics in medically-relevant prediction. Germline genetics presages disease onset and therefore can contribute prognostic signals that augment laboratory tests and clinical features. As such, the impact of genetic-based predictive models on clinical decisions and therapy choice could be profound. However, given that (i) medical traits result from a complex interplay between genetic and environmental factors, (ii) the underlying genetic architectures for susceptibility to common diseases are not well-understood, and (iii) replicable susceptibility alleles, in combination, account for only a moderate amount of disease heritability, there are substantial challenges to constructing and implementing genetic risk prediction models with high utility. In spite of these challenges, concerted progress has continued in this area with an ongoing accumulation of studies that identify disease predisposing genotypes. Several statistical approaches with the aim of predicting disease have been published. Here we summarize the current state of disease susceptibility mapping and pharmacogenetics efforts for risk prediction, describe methods used to construct and evaluate genetic-based predictive models, and discuss applications.
Affiliation(s)
- Steven J Schrodi
- Center for Human Genetics, Marshfield Clinic Research Foundation, Marshfield, WI, USA
- Shubhabrata Mukherjee
- Department of Medicine, School of Medicine, University of Washington, Seattle, WA, USA
- Ying Shan
- Departments of Human Genetics and Biostatistics, Graduate School of Public Health, University of Pittsburgh, PA, USA
- Gerard Tromp
- Sigfried and Janet Weis Center for Research, Geisinger Health System, Danville, PA, USA
- John J Sninsky
- Subsidiary of Quest Diagnostics, Discovery Research, Celera Corporation, Alameda, CA, USA
- Amy P Callear
- Center for Human Genetics, Marshfield Clinic Research Foundation, Marshfield, WI, USA; Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA, USA
- Tonia C Carter
- Center for Human Genetics, Marshfield Clinic Research Foundation, Marshfield, WI, USA
- Zhan Ye
- Biomedical Informatics Research Center, Marshfield Clinic Research Foundation, Marshfield, WI, USA
- Jonathan L Haines
- Department of Epidemiology and Biostatistics, Case Western Reserve School of Medicine, Cleveland, OH, USA
- Murray H Brilliant
- Center for Human Genetics, Marshfield Clinic Research Foundation, Marshfield, WI, USA
- Paul K Crane
- Department of Medicine, School of Medicine, University of Washington, Seattle, WA, USA
- Diane T Smelser
- Sigfried and Janet Weis Center for Research, Geisinger Health System, Danville, PA, USA
- Robert C Elston
- Department of Epidemiology and Biostatistics, Case Western Reserve School of Medicine, Cleveland, OH, USA
- Daniel E Weeks
- Departments of Human Genetics and Biostatistics, Graduate School of Public Health, University of Pittsburgh, PA, USA