501
Denny JC, Crawford DC, Ritchie MD, Bielinski SJ, Basford MA, Bradford Y, Chai HS, Bastarache L, Zuvich R, Peissig P, Carrell D, Ramirez AH, Pathak J, Wilke RA, Rasmussen L, Wang X, Pacheco JA, Kho AN, Hayes MG, Weston N, Matsumoto M, Kopp PA, Newton KM, Jarvik GP, Li R, Manolio TA, Kullo IJ, Chute CG, Chisholm RL, Larson EB, McCarty CA, Masys DR, Roden DM, de Andrade M. Variants near FOXE1 are associated with hypothyroidism and other thyroid conditions: using electronic medical records for genome- and phenome-wide studies. Am J Hum Genet 2011; 89:529-42. [PMID: 21981779 PMCID: PMC3188836 DOI: 10.1016/j.ajhg.2011.09.008] [Citation(s) in RCA: 193] [Impact Index Per Article: 14.8] [Received: 07/28/2011] [Revised: 09/15/2011] [Accepted: 09/15/2011] [Indexed: 12/20/2022] [Open Access]
Abstract
We repurposed existing genotypes in DNA biobanks across the Electronic Medical Records and Genomics network to perform a genome-wide association study for primary hypothyroidism, the most common thyroid disease. Electronic selection algorithms incorporating billing codes, laboratory values, text queries, and medication records identified 1317 cases and 5053 controls of European ancestry within five electronic medical records (EMRs); the algorithms' positive predictive values were 92.4% and 98.5% for cases and controls, respectively. Four single-nucleotide polymorphisms (SNPs) in linkage disequilibrium at 9q22 near FOXE1 were associated with hypothyroidism at genome-wide significance, the strongest being rs7850258 (odds ratio [OR] = 0.74, p = 3.96 × 10⁻⁹). This association was replicated in a set of 263 cases and 1616 controls (OR = 0.60, p = 5.7 × 10⁻⁶). A phenome-wide association study (PheWAS) that was performed on this locus with 13,617 individuals and more than 200,000 patient-years of billing data identified associations with additional phenotypes: thyroiditis (OR = 0.58, p = 1.4 × 10⁻⁵), nodular (OR = 0.76, p = 3.1 × 10⁻⁵) and multinodular (OR = 0.69, p = 3.9 × 10⁻⁵) goiters, and thyrotoxicosis (OR = 0.76, p = 1.5 × 10⁻³), but not Graves disease (OR = 1.03, p = 0.82). Thyroid cancer, previously associated with this locus, was not significantly associated in the PheWAS (OR = 1.29, p = 0.09). The strongest association in the PheWAS was hypothyroidism (OR = 0.76, p = 2.7 × 10⁻¹³), which had an odds ratio that was nearly identical to that of the curated case-control population in the primary analysis, providing further validation of the PheWAS method. Our findings indicate that EMR-linked genomic data could allow discovery of genes associated with many diseases without additional genotyping cost.
Affiliation(s)
- Joshua C Denny
- Department of Biomedical Informatics, Vanderbilt University, Nashville, TN 37232, USA.
502
Malin B, Loukides G, Benitez K, Clayton EW. Identifiability in biobanks: models, measures, and mitigation strategies. Hum Genet 2011; 130:383-92. [PMID: 21739176 PMCID: PMC3621020 DOI: 10.1007/s00439-011-1042-5] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.3] [Received: 04/25/2011] [Accepted: 06/12/2011] [Indexed: 12/29/2022]
Abstract
The collection and sharing of person-specific biospecimens has raised significant questions regarding privacy. In particular, the question of identifiability, or the degree to which materials stored in biobanks can be linked to the name of the individuals from which they were derived, is under scrutiny. The goal of this paper is to review the extent to which biospecimens and affiliated data can be designated as identifiable. To achieve this goal, we summarize recent research in identifiability assessment for DNA sequence data, as well as associated demographic and clinical data, shared via biobanks. We demonstrate the variability of the degree of risk, the factors that contribute to this variation, and potential ways to mitigate and manage such risk. Finally, we discuss the policy implications of these findings, particularly as they pertain to biobank security and access policies. We situate our review in the context of real data sharing scenarios and biorepositories.
Affiliation(s)
- Bradley Malin
- Department of Biomedical Informatics, School of Medicine, Vanderbilt University, 2525 West End Avenue, Suite 600, Nashville, TN 37203, USA. Department of Electrical Engineering and Computer Science, School of Engineering, Vanderbilt University, Nashville, USA
- Grigorios Loukides
- Department of Biomedical Informatics, School of Medicine, Vanderbilt University, 2525 West End Avenue, Suite 600, Nashville, TN 37203, USA
- Kathleen Benitez
- Department of Biomedical Informatics, School of Medicine, Vanderbilt University, 2525 West End Avenue, Suite 600, Nashville, TN 37203, USA
- Ellen Wright Clayton
- Department of Pediatrics, School of Medicine, Vanderbilt, USA. Center for Biomedical Ethics and Society, School of Medicine, Vanderbilt University, 2525 West End Avenue, Suite 400, Nashville, TN 37203, USA. School of Law, Vanderbilt University, Nashville, USA
503
Nadkarni PM, Kemp R, Parikh CR. Leveraging a clinical research information system to assist biospecimen data and workflow management: a hybrid approach. J Clin Bioinforma 2011; 1:22. [PMID: 21884570 PMCID: PMC3174108 DOI: 10.1186/2043-9113-1-22] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 05/27/2011] [Accepted: 08/25/2011] [Indexed: 01/30/2023] [Open Access]
Abstract
BACKGROUND: Large multi-center clinical studies often involve the collection and analysis of biological samples. It is necessary to ensure timely, complete and accurate recording of analytical results and associated phenotypic and clinical information. The TRIBE-AKI Consortium (http://www.yale.edu/tribeaki) supports a network of multiple related studies and a sample biorepository, thus allowing researchers to take advantage of a larger specimen collection than they might have at an individual institution. DESCRIPTION: We describe a biospecimen data management system (BDMS) that supports TRIBE-AKI and is intended for multi-center collaborative clinical studies that involve shipment of biospecimens between sites. This system works in conjunction with a clinical research information system (CRIS) that stores the clinical data associated with the biospecimens, along with other patient-related parameters. Inter-operation between the two systems is mediated by an interactively invoked suite of Web Services, as well as by batch code. We discuss various challenges involved in integration. CONCLUSIONS: Our experience indicates that an approach that emphasizes inter-operability is reasonably optimal in allowing each system to be utilized for the tasks for which it is best suited.
Affiliation(s)
- Rowena Kemp
- Yale University School of Medicine, New Haven, CT, USA
- Chirag R Parikh
- Yale University School of Medicine, New Haven, CT, USA
- Clinical Epidemiology Research Center, VAMC, West Haven, CT, USA
504
Pirmohamed M. Pharmacogenetics: past, present and future. Drug Discov Today 2011; 16:852-61. [PMID: 21884816 DOI: 10.1016/j.drudis.2011.08.006] [Citation(s) in RCA: 64] [Impact Index Per Article: 4.9] [Received: 06/12/2011] [Accepted: 08/16/2011] [Indexed: 12/15/2022]
Abstract
The subject area of pharmacogenetics, also known as pharmacogenomics, has a long history. Research in this area has led to fundamental discoveries, which have helped our understanding of the reasons why individuals differ in the way they handle drugs, and ultimately in the way they respond to drugs, either in terms of efficacy or toxicity. However, not much of this knowledge has been translated into clinical practice; most drug-gene associations that have some evidence of clinical validity have not progressed to clinical settings. Advances in genomics since 2000, including the ready availability of data on the variability of the human genome, have provided us with unprecedented opportunities to understand variability in drug responses, and the opportunity to incorporate this into patient care. This is only likely to occur with a systematic approach that evaluates and overcomes the different translational gaps in taking a biomarker from discovery to clinical practice. In this article, I explore the history of pharmacogenetics, appraise the current state of research in this area, and finish with suggestions for progressing the field in the future.
Affiliation(s)
- Munir Pirmohamed
- The Wolfson Centre for Personalised Medicine, Department of Pharmacology, University of Liverpool, 1-5 Brownlow Street, Liverpool L69 3GL, UK.
505
Torkamani A, Scott-Van Zeeland AA, Topol EJ, Schork NJ. Annotating individual human genomes. Genomics 2011; 98:233-41. [PMID: 21839162 DOI: 10.1016/j.ygeno.2011.07.006] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Received: 05/25/2011] [Accepted: 07/26/2011] [Indexed: 02/03/2023]
Abstract
Advances in DNA sequencing technologies have made it possible to rapidly, accurately and affordably sequence entire individual human genomes. As impressive as this ability seems, however, it will not likely amount to much if one cannot extract meaningful information from individual sequence data. Annotating variations within individual genomes and providing information about their biological or phenotypic impact will thus be crucially important in moving individual sequencing projects forward, especially in the context of the clinical use of sequence information. In this paper we consider the various ways in which one might annotate individual sequence variations and point out limitations in the available methods for doing so. It is arguable that, in the foreseeable future, DNA sequencing of individual genomes will become routine for clinical, research, forensic, and personal purposes. We therefore also consider directions and areas for further research in annotating genomic variants.
506
Kullo I, Ding K, Shameer K, McCarty C, Jarvik G, Denny J, Ritchie M, Ye Z, Crosslin D, Chisholm R, Manolio T, Chute C. Complement receptor 1 gene variants are associated with erythrocyte sedimentation rate. Am J Hum Genet 2011; 89:131-8. [PMID: 21700265 DOI: 10.1016/j.ajhg.2011.05.019] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.8] [Received: 03/27/2011] [Revised: 05/16/2011] [Accepted: 05/23/2011] [Indexed: 10/18/2022] [Open Access]
Abstract
The erythrocyte sedimentation rate (ESR), a commonly performed test of the acute phase response, is the rate at which erythrocytes sediment in vitro in 1 hr. The molecular basis of erythrocyte sedimentation is unknown. To identify genetic variants associated with ESR, we carried out a genome-wide association study of 7607 patients in the Electronic Medical Records and Genomics (eMERGE) network. The discovery cohort consisted of 1979 individuals from the Mayo Clinic, and the replication cohort consisted of 5628 individuals from the remaining four eMERGE sites. A nonsynonymous SNP, rs6691117 (Val→Ile), in the complement receptor 1 gene (CR1) was associated with ESR (discovery cohort p = 7 × 10⁻¹², replication cohort p = 3 × 10⁻¹⁴, combined cohort p = 9 × 10⁻²⁴). We imputed 61 SNPs in CR1, and a "possibly damaging" SNP (rs2274567, His→Arg) in linkage disequilibrium (r² = 0.74) with rs6691117 was also associated with ESR (discovery p = 5 × 10⁻¹¹, replication p = 7 × 10⁻¹⁷, and combined cohort p = 2 × 10⁻²⁵). The two nonsynonymous SNPs in CR1 are near the C3b/C4b binding site, suggesting a possible mechanism by which the variants may influence ESR. In conclusion, genetic variation in CR1, which encodes a protein that clears complement-tagged inflammatory particles from the circulation, influences interindividual variation in ESR, highlighting an association between the innate immunity pathway and erythrocyte interactions.
507
Vachon CM. Genome-wide association studies go green: novel and cost-effective opportunities for identifying genetic associations. Mayo Clin Proc 2011; 86:597-9. [PMID: 21719615 PMCID: PMC3127553 DOI: 10.4065/mcp.2011.0337] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/23/2022]
508
Bielinski SJ, Chai HS, Pathak J, Talwalkar JA, Limburg PJ, Gullerud RE, Sicotte H, Klee EW, Ross JL, Kocher JPA, Kullo IJ, Heit JA, Petersen GM, de Andrade M, Chute CG. Mayo Genome Consortia: a genotype-phenotype resource for genome-wide association studies with an application to the analysis of circulating bilirubin levels. Mayo Clin Proc 2011; 86:606-14. [PMID: 21646302 PMCID: PMC3127556 DOI: 10.4065/mcp.2011.0178] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.0] [Indexed: 12/20/2022]
Abstract
OBJECTIVE: To create a cohort for cost-effective genetic research, the Mayo Genome Consortia (MayoGC) has been assembled with participants from research studies across Mayo Clinic with high-throughput genetic data and electronic medical record (EMR) data for phenotype extraction. PARTICIPANTS AND METHODS: Eligible participants include those who gave general research consent in the contributing studies to share high-throughput genotyping data with other investigators. Herein, we describe the design of the MayoGC, including the current participating cohorts, expansion efforts, data processing, and study management and organization. A genome-wide association study to identify genetic variants associated with total bilirubin levels was conducted to test the genetic research capability of the MayoGC. RESULTS: Genome-wide significant results were observed on 2q37 (top single nucleotide polymorphism, rs4148325; P = 5.0 × 10⁻⁶²) and 12p12 (top single nucleotide polymorphism, rs4363657; P = 5.1 × 10⁻⁸) corresponding to a gene cluster of uridine 5'-diphospho-glucuronosyltransferases (the UGT1A cluster) and solute carrier organic anion transporter family, member 1B1 (SLCO1B1), respectively. CONCLUSION: Genome-wide association studies have identified genetic variants associated with numerous phenotypes but have been historically limited by inadequate sample size due to costly genotyping and phenotyping. Large consortia with harmonized genotype data have been assembled to attain sufficient statistical power, but phenotyping remains a rate-limiting factor in gene discovery research efforts. The EMR consists of an abundance of phenotype data that can be extracted in a relatively quick and systematic manner. The MayoGC provides a model of a unique collaborative effort in the environment of a common EMR for the investigation of genetic determinants of diseases.
Affiliation(s)
- Suzette J Bielinski
- Division of Epidemiology, Mayo Clinic, 200 First Street SW, Rochester, MN 55905, USA.
509
Wallace SE, Kent A. Population biobanks and returning individual research results: mission impossible or new directions? Hum Genet 2011; 130:393-401. [PMID: 21643981 DOI: 10.1007/s00439-011-1021-x] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.2] [Received: 04/27/2011] [Accepted: 05/25/2011] [Indexed: 02/07/2023]
Abstract
Historically, large-scale longitudinal genomic research studies have not returned individual research results to their participants, as these studies are not intended to find clinically significant information for individuals, but to produce 'generalisable' knowledge for future research. However, this stance is now changing. Commentators now argue that there is an ethical imperative to return clinically significant results, and individuals are now expressing a desire to have them. This shift reflects societal changes, such as the rise of social networking and an increased desire to participate in medical decision-making, as well as a greater awareness of genetic information and the increasing ability of clinicians to use this information in health care treatment. This paper discusses the changes that have prompted genomic research studies to reconsider their position and presents examples of projects that are actively engaged in returning individual research results.
Affiliation(s)
- Susan E Wallace
- Department of Health Sciences, University of Leicester, 212a Adrian Building, University Road, Leicester LE1 7RH, UK.
510
McGuire AL, Basford M, Dressler LG, Fullerton SM, Koenig BA, Li R, McCarty CA, Ramos E, Smith ME, Somkin CP, Waudby C, Wolf WA, Clayton EW. Ethical and practical challenges of sharing data from genome-wide association studies: the eMERGE Consortium experience. Genome Res 2011; 21:1001-7. [PMID: 21632745 DOI: 10.1101/gr.120329.111] [Citation(s) in RCA: 61] [Impact Index Per Article: 4.7] [Indexed: 11/24/2022]
Abstract
In 2007, the National Human Genome Research Institute (NHGRI) established the Electronic MEdical Records and GEnomics (eMERGE) Consortium (www.gwas.net) to develop, disseminate, and apply approaches to research that combine DNA biorepositories with electronic medical record (EMR) systems for large-scale, high-throughput genetic research. One of the major ethical and administrative challenges for the eMERGE Consortium has been complying with existing data-sharing policies. This paper discusses the challenges of sharing genomic data linked to health information in the electronic medical record (EMR) and explores the issues as they relate to sharing both within a large consortium and in compliance with the National Institutes of Health (NIH) data-sharing policy. We use the eMERGE Consortium experience to explore data-sharing challenges from the perspective of multiple stakeholders (i.e., research participants, investigators, and research institutions), provide recommendations for researchers and institutions, and call for clearer guidance from the NIH regarding ethical implementation of its data-sharing policy.
Affiliation(s)
- Amy L McGuire
- Center for Medical Ethics and Health Policy, Baylor College of Medicine, Houston, TX 77030, USA.
511
Turner SD, Berg RL, Linneman JG, Peissig PL, Crawford DC, Denny JC, Roden DM, McCarty CA, Ritchie MD, Wilke RA. Knowledge-driven multi-locus analysis reveals gene-gene interactions influencing HDL cholesterol level in two independent EMR-linked biobanks. PLoS One 2011; 6:e19586. [PMID: 21589926 PMCID: PMC3092760 DOI: 10.1371/journal.pone.0019586] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.1] [Received: 12/09/2010] [Accepted: 04/01/2011] [Indexed: 11/18/2022] [Open Access]
Abstract
Genome-wide association studies (GWAS) are routinely being used to examine the genetic contribution to complex human traits, such as high-density lipoprotein cholesterol (HDL-C). Although HDL-C levels are highly heritable (h² ≈ 0.7), the genetic determinants identified through GWAS contribute to a small fraction of the variance in this trait. Reasons for this discrepancy may include rare variants, structural variants, gene-environment (GxE) interactions, and gene-gene (GxG) interactions. Clinical practice-based biobanks now allow investigators to address these challenges by conducting GWAS in the context of comprehensive electronic medical records (EMRs). Here we apply an EMR-based phenotyping approach, within the context of routine care, to replicate several known associations between HDL-C and previously characterized genetic variants: CETP (rs3764261, p = 1.22e-25), LIPC (rs11855284, p = 3.92e-14), LPL (rs12678919, p = 1.99e-7), and the APOA1/C3/A4/A5 locus (rs964184, p = 1.06e-5), all adjusted for age, gender, body mass index (BMI), and smoking status. By using a novel approach which censors data based on relevant co-morbidities and lipid modifying medications to construct a more rigorous HDL-C phenotype, we identified an association between HDL-C and TRIB1, a gene which previously resisted identification in studies with larger sample sizes. Through the application of additional analytical strategies incorporating biological knowledge, we further identified 11 significant GxG interaction models in our discovery cohort, 8 of which show evidence of replication in a second biobank cohort. The strongest predictive model included a pairwise interaction between LPL (which modulates the incorporation of triglyceride into HDL) and ABCA1 (which modulates the incorporation of free cholesterol into HDL). These results demonstrate that gene-gene interactions modulate complex human traits, including HDL cholesterol.
Affiliation(s)
- Stephen D. Turner
- Department of Molecular Physiology and Biophysics, Center for Human Genetics Research, Vanderbilt University School of Medicine, Nashville, Tennessee, United States of America
- Richard L. Berg
- Biomedical Informatics Research Center, Marshfield Clinic Research Foundation, Marshfield, Wisconsin, United States of America
- James G. Linneman
- Biomedical Informatics Research Center, Marshfield Clinic Research Foundation, Marshfield, Wisconsin, United States of America
- Peggy L. Peissig
- Biomedical Informatics Research Center, Marshfield Clinic Research Foundation, Marshfield, Wisconsin, United States of America
- Dana C. Crawford
- Department of Molecular Physiology and Biophysics, Center for Human Genetics Research, Vanderbilt University School of Medicine, Nashville, Tennessee, United States of America
- Joshua C. Denny
- Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, Tennessee, United States of America
- Dan M. Roden
- Division of Clinical Pharmacology, Department of Medicine, Vanderbilt University School of Medicine, Nashville, Tennessee, United States of America
- Department of Pharmacology, Vanderbilt University School of Medicine, Nashville, Tennessee, United States of America
- Catherine A. McCarty
- Center for Human Genetics, Marshfield Clinic Research Foundation, Marshfield, Wisconsin, United States of America
- Marylyn D. Ritchie
- Department of Molecular Physiology and Biophysics, Center for Human Genetics Research, Vanderbilt University School of Medicine, Nashville, Tennessee, United States of America
- Russell A. Wilke
- Division of Clinical Pharmacology, Department of Medicine, Vanderbilt University School of Medicine, Nashville, Tennessee, United States of America
512
Sarkar IN, Butte AJ, Lussier YA, Tarczy-Hornoch P, Ohno-Machado L. Translational bioinformatics: linking knowledge across biological and clinical realms. J Am Med Inform Assoc 2011; 18:354-7. [PMID: 21561873 PMCID: PMC3128415 DOI: 10.1136/amiajnl-2011-000245] [Citation(s) in RCA: 55] [Impact Index Per Article: 4.2] [Indexed: 11/30/2022] [Open Access]
Abstract
Nearly a decade since the completion of the first draft of the human genome, the biomedical community is positioned to usher in a new era of scientific inquiry that links fundamental biological insights with clinical knowledge. Accordingly, holistic approaches are needed to develop and assess hypotheses that incorporate genotypic, phenotypic, and environmental knowledge. This perspective presents translational bioinformatics as a discipline that builds on the successes of bioinformatics and health informatics for the study of complex diseases. The early successes of translational bioinformatics are indicative of the potential to achieve the promise of the Human Genome Project for gaining deeper insights to the genetic underpinnings of disease and progress toward the development of a new generation of therapies.
Affiliation(s)
- Indra Neil Sarkar
- Center for Clinical and Translational Science, University of Vermont, Burlington, Vermont 05405, USA.
513
Radha V, Kanthimathi S, Mohan V. Genetics of Type 2 diabetes in Asian Indians. Diabetes Manag 2011. [DOI: 10.2217/dmt.11.14] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 12/06/2022]
514
El Emam K. Methods for the de-identification of electronic health records for genomic research. Genome Med 2011; 3:25. [PMID: 21542889 PMCID: PMC3129641 DOI: 10.1186/gm239] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.2] [Indexed: 12/21/2022] [Open Access]
Abstract
Electronic health records are increasingly being linked to DNA repositories and used as a source of clinical information for genomic research. Privacy legislation in many jurisdictions, and most research ethics boards, require that either personal health information is de-identified or that patient consent or authorization is sought before the data are disclosed for secondary purposes. Here, I discuss how de-identification has been applied in current genomic research projects. Recent metrics and methods that can be used to ensure that the risk of re-identification is low and that disclosures are compliant with privacy legislation and regulations (such as the Health Insurance Portability and Accountability Act Privacy Rule) are reviewed. Although these methods can protect against the known approaches for re-identification, residual risks and specific challenges for genomic research are also discussed.
Affiliation(s)
- Khaled El Emam
- Children's Hospital of Eastern Ontario Research Institute, 401 Smyth Road, Ottawa, Ontario K1J 8L1, Canada.
515
Turner S, Armstrong LL, Bradford Y, Carlson CS, Crawford DC, Crenshaw AT, de Andrade M, Doheny KF, Haines JL, Hayes G, Jarvik G, Jiang L, Kullo IJ, Li R, Ling H, Manolio TA, Matsumoto M, McCarty CA, McDavid AN, Mirel DB, Paschall JE, Pugh EW, Rasmussen LV, Wilke RA, Zuvich RL, Ritchie MD. Quality control procedures for genome-wide association studies. Curr Protoc Hum Genet 2011; Chapter 1:Unit1.19. [PMID: 21234875 DOI: 10.1002/0471142905.hg0119s68] [Citation(s) in RCA: 201] [Impact Index Per Article: 15.5] [Indexed: 12/30/2022]
Abstract
Genome-wide association studies (GWAS) are being conducted at an unprecedented rate in population-based cohorts and have increased our understanding of the pathophysiology of complex disease. Regardless of context, the practical utility of this information will ultimately depend upon the quality of the original data. Quality control (QC) procedures for GWAS are computationally intensive, operationally challenging, and constantly evolving. Here we enumerate some of the challenges in QC of GWAS data and describe the approaches that the electronic MEdical Records and Genomics (eMERGE) network is using for quality assurance in GWAS data, thereby minimizing potential bias and error in GWAS results. We discuss common issues associated with QC of GWAS data, including data file formats, software packages for data manipulation and analysis, sex chromosome anomalies, sample identity, sample relatedness, population substructure, batch effects, and marker quality. We propose best practices and discuss areas of ongoing and future research.
Affiliation(s)
- Stephen Turner
- Center for Human Genetics Research, Department of Molecular Physiology & Biophysics, Vanderbilt University, Nashville, Tennessee, USA
516
Lemke AA, Wu JT, Waudby C, Pulley J, Somkin CP, Trinidad SB. Community engagement in biobanking: Experiences from the eMERGE Network. Genom Soc Policy 2010; 6:50. [PMID: 22962560 PMCID: PMC3434453 DOI: 10.1186/1746-5354-6-3-50] [Citation(s) in RCA: 47] [Impact Index Per Article: 3.4] [Indexed: 01/28/2023] [Open Access]
Abstract
Advances in genomic technologies and the promise of "personalised medicine" have spurred the interest of researchers, healthcare systems, and the general public. However, the success of population-based genetic studies depends on the willingness of large numbers of individuals and diverse communities to grant researchers access to detailed medical and genetic information. Certain features of this kind of research - such as the establishment of biobanks and prospective data collection from participants' electronic medical records - make the potential risks and benefits to participants difficult to specify in advance. Therefore, community input into biobank processes is essential. In this report, we describe community engagement efforts undertaken by six United States biobanks, various outcomes from these engagements, and lessons learned. Our aim is to provide useful insights and potential strategies for the various disciplines that work with communities involved in biobank-based genomic research.
Affiliation(s)
- Amy A Lemke
- Genomics and Social Science Research, Madison, WI; Institute of Medicine, Washington DC; Center for Human Genetics, Marshfield Clinic; Medical Education and Administration, Vanderbilt University; Kaiser Permanente Division of Research, Oakland CA; Department of Bioethics and Humanities, University of Washington
517
Abstract
There are several features of genetic and genomic research that challenge established norms of informed consent. In this paper, we discuss these challenges, explore specific elements of informed consent for genetic and genomic research conducted in the United States, and consider alternative consent models that have been proposed. All of these models attempt to balance the obligation to respect and protect research participants with the larger social interest in advancing beneficial research as quickly as possible.
Affiliation(s)
- Amy L McGuire
- Center for Medical Ethics and Health Policy, Baylor College of Medicine, Houston, TX 77030, USA.