1
Wang L, Zhang X, Meng X, Koskeridis F, Georgiou A, Yu L, Campbell H, Theodoratou E, Li X. Methodology in phenome-wide association studies: a systematic review. J Med Genet 2021; 58:720-728. [PMID: 34272311 DOI: 10.1136/jmedgenet-2021-107696] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 01/06/2021] [Accepted: 05/27/2021] [Indexed: 11/04/2022]
Abstract
Phenome-wide association study (PheWAS) has been increasingly used to identify novel genetic associations across a wide spectrum of phenotypes. This systematic review aims to summarise the PheWAS methodology, discuss the advantages and challenges of PheWAS, and provide potential implications for future PheWAS studies. Medical Literature Analysis and Retrieval System Online (MEDLINE) and Excerpta Medica Database (EMBASE) databases were searched to identify all published PheWAS studies up until 24 April 2021. The PheWAS methodology, including how to perform PheWAS analysis and which software/tools could be used, was summarised based on the extracted information. A total of 1035 studies were identified and 195 eligible articles were finally included. Among them, 137 (77.0%) contained 10 000 or more study participants, 164 (92.1%) defined the phenome based on electronic medical records data, 140 (78.7%) used genetic variants as predictors, and 73 (41.0%) conducted replication analysis to validate PheWAS findings, almost all of which (94.5%) obtained consistent results. The methodology applied in these PheWAS studies was dissected into several critical steps, including quality control of the phenome, selecting predictors, phenotyping, statistical analysis, interpretation and visualisation of PheWAS results, and the workflow for performing a PheWAS was established with detailed instructions on each step. This study provides a comprehensive overview of PheWAS methodology to help practitioners achieve a better understanding of the PheWAS design, to detect understudied or overstudied outcomes, and to direct their research by applying the most appropriate software and online tools for their study data structure.
Affiliation(s)
- Lijuan Wang: School of Public Health and the Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Xiaomeng Zhang: Centre for Global Health, The University of Edinburgh Usher Institute of Population Health Sciences and Informatics, Edinburgh, UK
- Xiangrui Meng: Vanke School of Public Health, Tsinghua University, Beijing, China
- Fotios Koskeridis: Department of Hygiene and Epidemiology, University of Ioannina, Ioannina, Epirus, Greece
- Andrea Georgiou: Department of Hygiene and Epidemiology, University of Ioannina, Ioannina, Epirus, Greece
- Lili Yu: School of Public Health and the Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Harry Campbell: Centre for Global Health, The University of Edinburgh Usher Institute of Population Health Sciences and Informatics, Edinburgh, UK
- Evropi Theodoratou: Centre for Global Health, The University of Edinburgh Usher Institute of Population Health Sciences and Informatics, Edinburgh, UK; Cancer Research UK Edinburgh Centre, The University of Edinburgh MRC Institute of Genetics and Molecular Medicine, Edinburgh, UK
- Xue Li: School of Public Health and the Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
2
Katsevich E, Sabatti C. Multilayer knockoff filter: controlled variable selection at multiple resolutions. Ann Appl Stat 2019; 13:1-33. [PMID: 31687060 PMCID: PMC6827557 DOI: 10.1214/18-aoas1185] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Indexed: 11/19/2022]
Abstract
We tackle the problem of selecting from among a large number of variables those that are "important" for an outcome. We consider situations where groups of variables are also of interest. For example, each variable might be a genetic polymorphism, and we might want to study how a trait depends on variability in genes, segments of DNA that typically contain multiple such polymorphisms. In this context, to discover that a variable is relevant for the outcome implies discovering that the larger entity it represents is also important. To guarantee meaningful results with high chance of replicability, we suggest controlling the rate of false discoveries for findings at the level of individual variables and at the level of groups. Building on the knockoff construction of Barber and Candès [Ann. Statist. 43 (2015) 2055-2085] and the multilayer testing framework of Barber and Ramdas [J. Roy. Statist. Soc. Ser. B 79 (2017) 1247-1268], we introduce the multilayer knockoff filter (MKF). We prove that MKF simultaneously controls the FDR at each resolution and use simulations to show that it incurs little power loss compared to methods that provide guarantees only for the discoveries of individual variables. We apply MKF to analyze a genetic dataset and find that it successfully reduces the number of false gene discoveries without a significant reduction in power.
Affiliation(s)
- Eugene Katsevich: Department of Statistics, Stanford University, 390 Serra Mall, Stanford, California 94305
- Chiara Sabatti: Department of Statistics, Stanford University, 390 Serra Mall, Stanford, California 94305
4
Unravelling the human genome-phenome relationship using phenome-wide association studies. Nat Rev Genet 2016; 17:129-45. [PMID: 26875678 DOI: 10.1038/nrg.2015.36] [Citation(s) in RCA: 168] [Impact Index Per Article: 21.0] [Indexed: 12/24/2022]
Abstract
Advances in genotyping technology have, over the past decade, enabled the focused search for common genetic variation associated with human diseases and traits. With the recently increased availability of detailed phenotypic data from electronic health records and epidemiological studies, the impact of one or more genetic variants on the phenome is starting to be characterized both in clinical and population-based settings using phenome-wide association studies (PheWAS). These studies reveal a number of challenges that will need to be overcome to unlock the full potential of PheWAS for the characterization of the complex human genome-phenome relationship.
5
Monte AA, Brocker C, Nebert DW, Gonzalez FJ, Thompson DC, Vasiliou V. Improved drug therapy: triangulating phenomics with genomics and metabolomics. Hum Genomics 2014; 8:16. [PMID: 25181945 PMCID: PMC4445687 DOI: 10.1186/s40246-014-0016-9] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Received: 05/14/2014] [Accepted: 08/05/2014] [Indexed: 12/23/2022]
Abstract
Embracing the complexity of biological systems has a greater likelihood to improve prediction of clinical drug response. Here we discuss limitations of a singular focus on genomics, epigenomics, proteomics, transcriptomics, metabolomics, or phenomics, highlighting the strengths and weaknesses of each individual technique. In contrast, 'systems biology' is proposed to allow clinicians and scientists to extract benefits from each technique, while limiting associated weaknesses by supplementing with other techniques when appropriate. Perfect predictive modeling is not possible, whereas modeling of intertwined phenomic responses using genomic stratification with metabolomic modifications may greatly improve predictive values for drug therapy. We thus propose a novel integrated approach to personalized medicine that begins with phenomic data, is stratified by genomics, and ultimately refined by metabolomic pathway data. Whereas perfect prediction of efficacy and safety of drug therapy is not possible, improvements can be achieved by embracing the complexity of the biological system. Starting with phenomics, the combination of linking metabolomics to identify common biologic pathways and then stratifying by genomic architecture might increase predictive values. This systems biology approach has the potential, in specific subsets of patients, to avoid drug therapy that will be either ineffective or unsafe.
Affiliation(s)
- Andrew A Monte: University of Colorado Department of Emergency Medicine, Leprino Building, 7th Floor Campus Box B-215, 12401 E. 17th Avenue, Aurora, CO, 80045, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences, Aurora, CO, 80045, USA; Rocky Mountain Poison & Drug Center, Denver, CO, 80204, USA
- Chad Brocker: Laboratory of Metabolism, Center for Cancer Research, National Institute of Cancer, Bethesda, MD, 20892, USA
- Daniel W Nebert: Division of Human Genetics, Department of Pediatrics and Molecular Developmental Biology, University of Cincinnati Medical Center, Cincinnati, OH, 45220, USA; Department of Environmental Health and Center for Environmental Genetics, University of Cincinnati Medical Center, Cincinnati, OH, 45220, USA
- Frank J Gonzalez: Laboratory of Metabolism, Center for Cancer Research, National Institute of Cancer, Bethesda, MD, 20892, USA
- David C Thompson: Skaggs School of Pharmacy and Pharmaceutical Sciences, Aurora, CO, 80045, USA
- Vasilis Vasiliou: Skaggs School of Pharmacy and Pharmaceutical Sciences, Aurora, CO, 80045, USA