1
van der Es T, Soheili-Nezhad S, Roth Mota N, Franke B, Buitelaar J, Sprooten E. Exploring the genetic architecture of brain structure and ADHD using polygenic neuroimaging-derived scores. Am J Med Genet B Neuropsychiatr Genet 2024:e32987. [PMID: 39016115 DOI: 10.1002/ajmg.b.32987] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2023] [Revised: 04/24/2024] [Accepted: 05/11/2024] [Indexed: 07/18/2024]
Abstract
Genome-wide association studies (GWAS) have provided valuable insights into the genetic basis of neuropsychiatric disorders and highlighted their complexity. Careful consideration of the polygenicity and complex genetic architecture could aid in the understanding of the underlying brain mechanisms. We introduce an innovative approach to polygenic scoring, utilizing imaging-derived phenotypes (IDPs) to predict a clinical phenotype. We leveraged IDP GWAS data from the UK Biobank, to create polygenic imaging-derived scores (PIDSs). As a proof-of-concept, we assessed genetic variations in brain structure between individuals with ADHD and unaffected controls across three NeuroIMAGE waves (n = 954). Out of the 94 PIDS, 72 exhibited significant associations with their corresponding IDPs in an independent sample. Notably, several global measures, including cerebellum white matter, cerebellum cortex, and cerebral white matter, displayed substantial variance explained for their respective IDPs, ranging from 3% to 5.7%. Conversely, the associations between each IDP and the clinical ADHD phenotype were relatively weak. These findings highlight the growing power of GWAS in structural neuroimaging traits, enabling the construction of polygenic scores that accurately reflect the underlying polygenic architecture. However, to establish robust connections between PIDS and behavioral or clinical traits such as ADHD, larger samples are needed. Our novel approach to polygenic risk scoring offers a valuable tool for researchers in the field of psychiatric genetics.
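The scoring idea in this abstract can be illustrated with a minimal sketch. It assumes a polygenic score is the weighted sum of each individual's allele dosages, with weights taken from GWAS effect sizes, and that "variance explained" is the squared correlation between score and phenotype. The function names, toy dosages, and weights below are hypothetical and are not taken from the paper, which may use a different scoring method.

```python
import numpy as np

def polygenic_score(dosages, weights):
    """Weighted sum of allele dosages (0, 1, or 2) for each individual."""
    dosages = np.asarray(dosages, dtype=float)   # shape: (individuals, variants)
    weights = np.asarray(weights, dtype=float)   # shape: (variants,)
    return dosages @ weights

def variance_explained(score, phenotype):
    """Squared Pearson correlation between a score and a phenotype."""
    r = np.corrcoef(score, phenotype)[0, 1]
    return r ** 2

# Toy data: 4 individuals, 3 variants (made-up values for illustration only).
dosages = np.array([[0, 1, 2],
                    [1, 1, 0],
                    [2, 0, 1],
                    [0, 2, 2]])
weights = np.array([0.10, -0.05, 0.20])  # hypothetical GWAS effect sizes

scores = polygenic_score(dosages, weights)
print(scores)
```

In practice the weights would come from IDP GWAS summary statistics and the score would be tested against the measured IDP in an independent sample.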
Affiliation(s)
- Tim van der Es
- Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
- Genome Institute of Singapore, A*STAR, Singapore, Singapore
- Nina Roth Mota
- Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen, The Netherlands
- Barbara Franke
- Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen, The Netherlands
- Department of Cognitive Neuroscience, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen, The Netherlands
- Jan Buitelaar
- Department of Cognitive Neuroscience, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen, The Netherlands
- Emma Sprooten
- Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen, The Netherlands
- Department of Cognitive Neuroscience, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen, The Netherlands
2
Vitale D, Koretsky M, Kuznetsov N, Hong S, Martin J, James M, Makarious MB, Leonard H, Iwaki H, Faghri F, Blauwendraat C, Singleton AB, Song Y, Levine K, Sreelatha AAK, Fang ZH, Nalls M. GenoTools: An Open-Source Python Package for Efficient Genotype Data Quality Control and Analysis. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.03.26.586362. [PMID: 38585876 PMCID: PMC10996710 DOI: 10.1101/2024.03.26.586362] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/09/2024]
Abstract
GenoTools, a Python package, streamlines population genetics research by integrating ancestry estimation, quality control (QC), and genome-wide association studies (GWAS) capabilities into efficient pipelines. By tracking samples, variants, and quality-specific measures throughout fully customizable pipelines, users can easily manage genetics data for large and small studies. GenoTools' "Ancestry" module renders highly accurate predictions, allowing for high-quality ancestry-specific studies, and enables custom ancestry model training and serialization, specified to the user's genotyping or sequencing platform. As the genotype processing engine that powers several large initiatives, including the NIH's Center for Alzheimer's and Related Dementias (CARD) and the Global Parkinson's Genetics Program (GP2), GenoTools was used to process and analyze the UK Biobank and major Alzheimer's Disease (AD) and Parkinson's Disease (PD) datasets with over 400,000 genotypes from arrays and 5000 sequences and has led to novel discoveries in diverse populations. It has provided replicable ancestry predictions, implemented rigorous QC, and conducted genetic ancestry-specific GWAS to identify systematic errors or biases through a single command. GenoTools is a customizable tool that enables users to efficiently analyze and scale genotype data with reproducible and scalable ancestry, QC, and GWAS pipelines.
3
Kasai F, Fukushima M, Miyagi Y, Nakamura Y. Genetic diversity among the present Japanese population: evidence from genotyping of human cell lines established in Japan. Hum Cell 2024; 37:944-950. [PMID: 38639832 PMCID: PMC11194210 DOI: 10.1007/s13577-024-01055-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2023] [Accepted: 03/12/2024] [Indexed: 04/20/2024]
Abstract
Japan is often assumed to have a highly homogeneous ethnic population, because it is an island country. This is evident in human cell lines collected from cell banks; however, these genotypes have not been thoroughly characterized. To examine the population genotypes of human cell lines established in Japan, we conducted SNP genotyping on 57 noncancerous cell lines and 43 lung cancer cell lines. Analysis of biogeographic ancestry revealed that 58 cell lines had non-admixed Japanese genotypes, 21 cell lines had an admixture of Japanese and East Asian genotypes, and the remaining 21 cell lines had East Asian genotypes. The proportion of non-admixed Japanese genotypes was similar between lung cancer and noncancerous cell lines, suggesting that patients in Japan may not exclusively have Japanese genotypes. This could influence the incidence of inherited diseases and should be taken into account in personalized medicine tailored to genetic background. The genetic makeup of the present-day Japanese population cannot be fully explained by the ancestral Jomon and Yayoi lineages. Instead, it is necessary to consider a certain level of genetic admixture between Japanese and neighboring Asian populations. Our study revealed genetic variation among human cell lines derived from Japanese individuals, reflecting the diversity present within the Japanese population.
Affiliation(s)
- Fumio Kasai
- Cell Engineering Division, BioResource Research Center, RIKEN Cell Bank, Tsukuba, Japan
- Makoto Fukushima
- Cell Engineering Division, BioResource Research Center, RIKEN Cell Bank, Tsukuba, Japan
- Yohei Miyagi
- Molecular Pathology and Genetics Division, Kanagawa Cancer Center Research Institute, Yokohama, Japan
- Yukio Nakamura
- Cell Engineering Division, BioResource Research Center, RIKEN Cell Bank, Tsukuba, Japan
4
Pikalyova K, Orlov A, Horvath D, Marcou G, Varnek A. Predicting S. aureus antimicrobial resistance with interpretable genomic space maps. Mol Inform 2024; 43:e202300263. [PMID: 38386182 DOI: 10.1002/minf.202300263] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2023] [Revised: 01/15/2024] [Accepted: 02/08/2024] [Indexed: 02/23/2024]
Abstract
Increasing antimicrobial resistance (AMR) represents a global healthcare threat. To decrease the spread of AMR and associated mortality, methods for rapid selection of optimal antibiotic treatment are urgently needed. Machine learning (ML) models based on genomic data to predict resistant phenotypes can serve as a fast screening tool prior to phenotypic testing. Nonetheless, many existing ML methods lack interpretability. Therefore, we present a methodology for visualization of sequence space and AMR prediction based on generative topographic mapping (GTM), a non-linear dimensionality reduction method. This approach, applied to AMR data of >5000 S. aureus isolates retrieved from the PATRIC database, yielded GTM models with reasonable accuracy for all drugs (balanced accuracy values ≥0.75). The Generative Topographic Maps (GTMs) represent data in the form of illustrative maps of the genomic space and allow for antibiotic-wise comparison of resistant phenotypes. The maps were also found to be useful for the analysis of genetic determinants responsible for drug resistance. Overall, the GTM-based methodology is a useful tool for both the illustrative exploration of the genomic sequence space and AMR prediction.
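The balanced accuracy metric quoted in this abstract can be illustrated with a short sketch. It assumes the usual binary definition, the mean of sensitivity and specificity, which is useful when resistant and susceptible isolates are unevenly represented. The label encoding (1 = resistant, 0 = susceptible) and the toy labels are hypothetical and are not taken from the paper.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 or 0).

    Assumes both classes occur at least once in y_true.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2

# Toy example: 2 resistant and 6 susceptible isolates (made-up labels).
y_true = [1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 1]

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
ba = balanced_accuracy(y_true, y_pred)
print(plain_accuracy, ba)
```

Here plain accuracy is 0.75 while balanced accuracy is lower (about 0.67), because half of the minority class was missed.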
Affiliation(s)
- Karina Pikalyova
- Laboratoire de Chémoinformatique, UMR 7140, Université de Strasbourg, 1 rue Blaise Pascal, Strasbourg, 67000, France
- Alexey Orlov
- Laboratoire de Chémoinformatique, UMR 7140, Université de Strasbourg, 1 rue Blaise Pascal, Strasbourg, 67000, France
- Dragos Horvath
- Laboratoire de Chémoinformatique, UMR 7140, Université de Strasbourg, 1 rue Blaise Pascal, Strasbourg, 67000, France
- Gilles Marcou
- Laboratoire de Chémoinformatique, UMR 7140, Université de Strasbourg, 1 rue Blaise Pascal, Strasbourg, 67000, France
- Alexandre Varnek
- Laboratoire de Chémoinformatique, UMR 7140, Université de Strasbourg, 1 rue Blaise Pascal, Strasbourg, 67000, France
5
Yang J, Li G, Chen S, Su X, Xu D, Zhai Y, Liu Y, Hu G, Guo C, Yang HB, Occhipinti LG, Hu FX. Machine Learning-Assistant Colorimetric Sensor Arrays for Intelligent and Rapid Diagnosis of Urinary Tract Infection. ACS Sens 2024; 9:1945-1956. [PMID: 38530950 DOI: 10.1021/acssensors.3c02687] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/28/2024]
Abstract
Urinary tract infections (UTIs), which can lead to pyelonephritis, urosepsis, and even death, are among the most prevalent infectious diseases worldwide, with a notable increase in treatment costs due to the emergence of drug-resistant pathogens. Current diagnostic strategies for UTIs, such as urine culture and flow cytometry, require time-consuming protocols and expensive equipment. We present here a machine learning-assisted colorimetric sensor array based on recognition of ligand-functionalized Fe single-atom nanozymes (SANs) for the identification of microorganisms at the order, genus, and species levels. Colorimetric sensor arrays are built from the SAN Fe1-NC functionalized with four types of recognition ligands, generating unique microbial identification fingerprints. By integrating the colorimetric sensor arrays with a trained computational classification model, the platform can identify more than 10 microorganisms in UTI urine samples within 1 h. Diagnostic accuracy of up to 97% was achieved in 60 UTI clinical samples, holding great potential for translation into clinical practice applications.
Affiliation(s)
- Jianyu Yang
- School of Materials Science and Engineering, Suzhou University of Science and Technology, Suzhou 215009, China
- Ge Li
- School of Materials Science and Engineering, Suzhou University of Science and Technology, Suzhou 215009, China
- Shihong Chen
- School of Chemistry and Chemical Engineering, Southwest University, Chongqing 400715, China
- Xiaozhi Su
- Shanghai Synchrotron Radiation Facility, Shanghai Advanced Research Institute, Chinese Academy of Sciences, Shanghai 201204, China
- Dong Xu
- Department of Diagnostic Ultrasound Imaging & Interventional Therapy, Zhejiang Cancer Hospital, Hangzhou Institute of Medicine, Chinese Academy of Sciences, Hangzhou, Zhejiang 310022, China
- Wenling Big Data and Artificial Intelligence Institute in Medicine, Taizhou, Zhejiang 317502, China
- Key Laboratory of Head & Neck Cancer Translational Research of Zhejiang Province, Hangzhou, Zhejiang 310022, China
- Taizhou Key Laboratory of Minimally Invasive Interventional Therapy & Artificial Intelligence, Taizhou Campus of Zhejiang Cancer Hospital, Taizhou, Zhejiang 317502, China
- Yueming Zhai
- The Institute for Advanced Studies, Wuhan University, Wuhan, Hubei 430072, China
- Yuhang Liu
- School of Materials Science and Engineering, Suzhou University of Science and Technology, Suzhou 215009, China
- Guangxuan Hu
- School of Materials Science and Engineering, Suzhou University of Science and Technology, Suzhou 215009, China
- Chunxian Guo
- School of Materials Science and Engineering, Suzhou University of Science and Technology, Suzhou 215009, China
- Hong Bin Yang
- School of Materials Science and Engineering, Suzhou University of Science and Technology, Suzhou 215009, China
- Luigi G Occhipinti
- Department of Engineering, University of Cambridge, 9 J J Thomson Avenue, Cambridge CB3 0FA, U.K.
- Fang Xin Hu
- School of Materials Science and Engineering, Suzhou University of Science and Technology, Suzhou 215009, China
6
Liu X, Koyama S, Tomizuka K, Takata S, Ishikawa Y, Ito S, Kosugi S, Suzuki K, Hikino K, Koido M, Koike Y, Horikoshi M, Gakuhari T, Ikegawa S, Matsuda K, Momozawa Y, Ito K, Kamatani Y, Terao C. Decoding triancestral origins, archaic introgression, and natural selection in the Japanese population by whole-genome sequencing. SCIENCE ADVANCES 2024; 10:eadi8419. [PMID: 38630824 PMCID: PMC11023554 DOI: 10.1126/sciadv.adi8419] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/21/2023] [Accepted: 03/07/2024] [Indexed: 04/19/2024]
Abstract
We generated Japanese Encyclopedia of Whole-Genome/Exome Sequencing Library (JEWEL), a high-depth whole-genome sequencing dataset comprising 3256 individuals from across Japan. Analysis of JEWEL revealed genetic characteristics of the Japanese population that were not discernible using microarray data. First, rare variant-based analysis revealed an unprecedented fine-scale genetic structure. Together with population genetics analysis, the present-day Japanese can be decomposed into three ancestral components. Second, we identified unreported loss-of-function (LoF) variants and observed that for specific genes, LoF variants appeared to be restricted to a more limited set of transcripts than would be expected by chance, with PTPRD as a notable example. Third, we identified 44 archaic segments linked to complex traits, including a Denisovan-derived segment at NKX6-1 associated with type 2 diabetes. Most of these segments are specific to East Asians. Fourth, we identified candidate genetic loci under recent natural selection. Overall, our work provided insights into genetic characteristics of the Japanese population.
Affiliation(s)
- Xiaoxi Liu
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Clinical Research Center, Shizuoka General Hospital, Shizuoka, Japan
- Satoshi Koyama
- Laboratory for Cardiovascular Genomics and Informatics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Medical and Population Genetics and Cardiovascular Disease Initiative, Broad Institute of Harvard and MIT, Boston, MA, USA
- Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA
- Kohei Tomizuka
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Sadaaki Takata
- Laboratory for Genotyping Development, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Yuki Ishikawa
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Shuji Ito
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Laboratory for Bone and Joint Diseases, RIKEN Center for Medical Sciences, Tokyo, Japan
- Department of Orthopedic Surgery, Faculty of Medicine, Shimane University, Izumo, Japan
- Shunichi Kosugi
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Kunihiko Suzuki
- Laboratory for Genotyping Development, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Keiko Hikino
- Laboratory for Pharmacogenomics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Masaru Koido
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Laboratory of Complex Trait Genomics, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan
- Yoshinao Koike
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Laboratory for Bone and Joint Diseases, RIKEN Center for Medical Sciences, Tokyo, Japan
- Department of Orthopedic Surgery, Hokkaido University Graduate School of Medicine, Sapporo, Japan
- Momoko Horikoshi
- Laboratory for Genomics of Diabetes and Metabolism, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Takashi Gakuhari
- Institute for the Study of Ancient Civilizations and Cultural Resources, College of Human and Social Sciences, Kanazawa University, Kanazawa, Japan
- Shiro Ikegawa
- Laboratory for Bone and Joint Diseases, RIKEN Center for Medical Sciences, Tokyo, Japan
- Koichi Matsuda
- Laboratory of Genome Technology, Human Genome Center, Institute of Medical Science, The University of Tokyo, Tokyo, Japan
- Laboratory of Clinical Genome Sequencing, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan
- Yukihide Momozawa
- Laboratory for Genotyping Development, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Kaoru Ito
- Laboratory for Cardiovascular Genomics and Informatics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Yoichiro Kamatani
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Laboratory of Complex Trait Genomics, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan
- Chikashi Terao
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Clinical Research Center, Shizuoka General Hospital, Shizuoka, Japan
- The Department of Applied Genetics, The School of Pharmaceutical Sciences, University of Shizuoka, Shizuoka, Japan
7
Xiang R, Kelemen M, Xu Y, Harris LW, Parkinson H, Inouye M, Lambert SA. Recent advances in polygenic scores: translation, equitability, methods and FAIR tools. Genome Med 2024; 16:33. [PMID: 38373998 PMCID: PMC10875792 DOI: 10.1186/s13073-024-01304-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2023] [Accepted: 02/07/2024] [Indexed: 02/21/2024] Open
Abstract
Polygenic scores (PGS) can be used for risk stratification by quantifying individuals' genetic predisposition to disease, and many potentially clinically useful applications have been proposed. Here, we review the latest potential benefits of PGS in the clinic and challenges to implementation. PGS could augment risk stratification through combined use with traditional risk factors (demographics, disease-specific risk factors, family history, etc.), to support diagnostic pathways, to predict groups with therapeutic benefits, and to increase the efficiency of clinical trials. However, there exist challenges to maximizing the clinical utility of PGS, including FAIR (Findable, Accessible, Interoperable, and Reusable) use and standardized sharing of the genomic data needed to develop and recalculate PGS, the equitable performance of PGS across populations and ancestries, the generation of robust and reproducible PGS calculations, and the responsible communication and interpretation of results. We outline how these challenges may be overcome analytically and with more diverse data as well as highlight sustained community efforts to achieve equitable, impactful, and responsible use of PGS in healthcare.
Affiliation(s)
- Ruidong Xiang
- Cambridge Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, Melbourne, VIC, Australia
- Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Martin Kelemen
- Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK
- Yu Xu
- Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK
- Health Data Research UK Cambridge, Wellcome Genome Campus and University of Cambridge, Cambridge, UK
- Laura W Harris
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Helen Parkinson
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Michael Inouye
- Cambridge Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, Melbourne, VIC, Australia
- Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK
- Health Data Research UK Cambridge, Wellcome Genome Campus and University of Cambridge, Cambridge, UK
- British Heart Foundation Centre of Research Excellence, University of Cambridge, Cambridge, UK
- Samuel A Lambert
- Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK
- Health Data Research UK Cambridge, Wellcome Genome Campus and University of Cambridge, Cambridge, UK
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
8
Kishi T, Ikuta T, Sakuma K, Hatano M, Matsuda Y, Esumi S, Miyake N, Miura I, Kato M, Iwata N. Safety profile of antidepressant for Japanese adults with major depressive disorder: A systematic review and network meta-analysis. Psychiatry Clin Neurosci 2024; 78:142-144. [PMID: 37984427 DOI: 10.1111/pcn.13622] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2023] [Revised: 10/17/2023] [Accepted: 11/13/2023] [Indexed: 11/22/2023]
Affiliation(s)
- Taro Kishi
- Department of Psychiatry, Fujita Health University School of Medicine, Toyoake, Japan
- Toshikazu Ikuta
- Department of Communication Sciences and Disorders, School of Applied Sciences, University of Mississippi, University, Mississippi, USA
- Kenji Sakuma
- Department of Psychiatry, Fujita Health University School of Medicine, Toyoake, Japan
- Masakazu Hatano
- Department of Pharmacotherapeutics and Informatics, Fujita Health University School of Medicine, Toyoake, Japan
- Yuki Matsuda
- Department of Psychiatry, Jikei University School of Medicine, Tokyo, Japan
- Satoru Esumi
- Faculty of Pharmaceutical Sciences, Kobe Gakuin University, Kobe, Japan
- Nobumi Miyake
- Department of Neuropsychiatry, St. Marianna University School of Medicine, Kawasaki, Japan
- Itaru Miura
- Department of Neuropsychiatry, Fukushima Medical University School of Medicine, Fukushima, Japan
- Masaki Kato
- Department of Neuropsychiatry, Kansai Medical University, Osaka, Japan
- Nakao Iwata
- Department of Psychiatry, Fujita Health University School of Medicine, Toyoake, Japan
9
Alamad B, Elliott K, Knight JC. Cross-population applications of genomics to understand the risk of multifactorial traits involving inflammation and immunity. CAMBRIDGE PRISMS. PRECISION MEDICINE 2024; 2:e3. [PMID: 38549844 PMCID: PMC10953767 DOI: 10.1017/pcm.2023.25] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2023] [Revised: 11/15/2023] [Accepted: 12/18/2023] [Indexed: 04/26/2024]
Abstract
The interplay between genetic and environmental factors plays a significant role in interindividual variation in immune and inflammatory responses. The availability of high-throughput low-cost genotyping and next-generation sequencing has revolutionized our ability to identify human genetic variation and understand how this varies within and between populations, and the relationship with disease. In this review, we explore the potential of genomics for patient benefit, specifically in the diagnosis, prognosis and treatment of inflammatory and immune-related diseases. We summarize the knowledge arising from genetic and functional genomic approaches, and the opportunity for personalized medicine. The review covers applications in infectious diseases, rare immunodeficiencies and autoimmune diseases, illustrating advances in diagnosis and understanding risk including use of polygenic risk scores. We further explore the application for patient stratification and drug target prioritization. The review highlights a key challenge to the field arising from the lack of sufficient representation of genetically diverse populations in genomic studies. This currently limits the clinical utility of genetic-based diagnostic and risk-based applications in non-Caucasian populations. We highlight current genome projects, initiatives and biobanks from diverse populations and how this is being used to improve healthcare globally by improving our understanding of genetic susceptibility to diseases and regional pathogens such as malaria and tuberculosis. Future directions and opportunities for personalized medicine and wider application of genomics in health care are described, for the benefit of individual patients and populations worldwide.
Affiliation(s)
- Bana Alamad
- Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Kate Elliott
- Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Julian C. Knight
- Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Chinese Academy of Medical Science Oxford Institute, Nuffield Department of Medicine, University of Oxford, Oxford, UK
10
Sun X, Guo J, Li R, Zhang H, Zhang Y, Liu GE, Emu Q, Zhang H. Whole-Genome Resequencing Reveals Genetic Diversity and Wool Trait-Related Genes in Liangshan Semi-Fine-Wool Sheep. Animals (Basel) 2024; 14:444. [PMID: 38338087 PMCID: PMC10854784 DOI: 10.3390/ani14030444] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2023] [Revised: 01/12/2024] [Accepted: 01/25/2024] [Indexed: 02/12/2024] Open
Abstract
Understanding the genetic makeup of local sheep breeds is essential for their scientific conservation and sustainable utilization. The Liangshan semi-fine-wool sheep (LSS), a Chinese semi-fine-wool breed renowned for its soft wool, was analyzed using whole-genome sequencing data including 35 LSS, 84 sheep from other domestic breeds, and 20 Asiatic mouflons. We investigated the genetic composition of LSS by conducting analyses of the population structure, runs of homozygosity, genomic inbreeding coefficients, and selection signature. Our findings indicated that LSS shares greater genetic similarity with Border Leicester and Romney sheep than with Tibetan (TIB), Yunnan (YNS), and Chinese Merino sheep. Genomic analysis indicated low to moderate inbreeding coefficients, ranging from 0.014 to 0.154. In identifying selection signals across the LSS genome, we pinpointed 195 candidate regions housing 74 annotated genes (e.g., IRF2BP2, BVES, and ALOX5). We also found overlaps between the candidate regions and several known quantitative trait loci related to wool traits, such as the wool staple length and wool fiber diameter. A selective sweep region, marked by the highest value of cross-population extended haplotype homozygosity, encompassed IRF2BP2, an influential candidate gene affecting fleece fiber traits. Furthermore, notable differences in genotype frequency at a mutation site (c.1051+46T>C, Chr25: 6,784,190 bp) within IRF2BP2 were observed between LSS and TIB and YNS sheep (Fisher's exact test, p < 2.2 × 10⁻¹⁶). Taken together, these findings offer insights crucial for the conservation and breeding enhancement of LSS.
Affiliation(s)
- Xueliang Sun
- Key Laboratory of Livestock and Poultry Multi-Omics, Ministry of Agriculture and Rural Affairs, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu 611130, China
- Farm Animal Genetic Resources Exploration and Innovation Key Laboratory of Sichuan Province, Sichuan Agricultural University, Chengdu 611130, China
- Jiazhong Guo
- Key Laboratory of Livestock and Poultry Multi-Omics, Ministry of Agriculture and Rural Affairs, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu 611130, China
- Farm Animal Genetic Resources Exploration and Innovation Key Laboratory of Sichuan Province, Sichuan Agricultural University, Chengdu 611130, China
- Ran Li
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling 712100, China
| | - Huanhuan Zhang
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling 712100, China
| | - Yifei Zhang
- Key Laboratory of Livestock and Poultry Multi-Omics, Ministry of Agriculture and Rural Affairs, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu 611130, China; (X.S.); (J.G.)
- Farm Animal Genetic Resources Exploration and Innovation Key Laboratory of Sichuan Province, Sichuan Agricultural University, Chengdu 611130, China
| | - George E. Liu
- Animal Genomics and Improvement Laboratory, Beltsville Agricultural Research Center, Agricultural Research Service, USDA, Beltsville, MD 20705, USA
| | - Quzhe Emu
- Animal Genetics and Breeding Key Laboratory of Sichuan Province, Sichuan Animal Science Academy, No. 7, Niusha Road, Chengdu 610066, China
| | - Hongping Zhang
- Key Laboratory of Livestock and Poultry Multi-Omics, Ministry of Agriculture and Rural Affairs, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu 611130, China; (X.S.); (J.G.)
- Farm Animal Genetic Resources Exploration and Innovation Key Laboratory of Sichuan Province, Sichuan Agricultural University, Chengdu 611130, China
|
11
|
Douillard V, Dos Santos Brito Silva N, Bourguiba-Hachemi S, Naslavsky MS, Scliar MO, Duarte YAO, Zatz M, Passos-Bueno MR, Limou S, Gourraud PA, Launay É, Castelli EC, Vince N. Optimal population-specific HLA imputation with dimension reduction. HLA 2024; 103:e15282. [PMID: 37950640 DOI: 10.1111/tan.15282] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/19/2023] [Revised: 08/29/2023] [Accepted: 10/14/2023] [Indexed: 11/13/2023]
Abstract
Human genomics has quickly evolved, powering genome-wide association studies (GWASs). SNP-based GWASs cannot capture the intense polymorphism of HLA genes, which are highly associated with disease susceptibility. There are methods to statistically impute HLA genotypes from SNP genotype data, but the lack of diversity in reference panels hinders their performance. We evaluated the accuracy of the 1000 Genomes data as a reference panel for imputing HLA from admixed individuals of African and European ancestries, focusing on (a) the full dataset, (b) 10 replications from 6 populations, and (c) 19 conditions for the custom reference panels. The full dataset outperformed smaller models, with a good F1-score of 0.66 for HLA-B. However, custom models outperformed the multiethnic or population models of similar size (F1-scores up to 0.53, against up to 0.42). We demonstrated the importance of using genetically specific models for imputing HLA in populations that are currently underrepresented in public datasets, opening the door to HLA imputation for every genetic population.
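To make the F1-scores quoted above concrete, here is a minimal sketch of scoring imputed HLA alleles against typed alleles. It assumes a macro-average over allele labels; the abstract does not say how the authors averaged the score, and the function name and allele calls below are hypothetical.

```python
from collections import Counter


def macro_f1(true_alleles, imputed_alleles):
    """Macro-averaged F1 over allele labels (assumed averaging scheme)."""
    if len(true_alleles) != len(imputed_alleles):
        raise ValueError("inputs must have the same length")
    if not true_alleles:
        raise ValueError("inputs must not be empty")
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, guess in zip(true_alleles, imputed_alleles):
        if truth == guess:
            tp[truth] += 1
        else:
            fn[truth] += 1  # the true allele was missed
            fp[guess] += 1  # the imputed allele was a false call
    labels = set(true_alleles) | set(imputed_alleles)
    # Per-allele F1 = 2*TP / (2*TP + FP + FN), then an unweighted mean.
    scores = [
        2 * tp[label] / (2 * tp[label] + fp[label] + fn[label])
        for label in labels
    ]
    return sum(scores) / len(scores)


# Hypothetical typed vs. imputed HLA-B calls for four chromosomes.
typed = ["B*44:03", "B*44:03", "B*07:02", "B*07:02"]
imputed = ["B*44:03", "B*07:02", "B*07:02", "B*07:02"]
print(macro_f1(typed, imputed))
```

A macro-average weights rare and common alleles equally, which is one reason scores can drop when a reference panel lacks alleles that are frequent in the target population.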
Affiliation(s)
- Venceslas Douillard
- Nantes Université, INSERM, Ecole Centrale Nantes, Center for Research in Transplantation and Translational Immunology, Nantes, France
| | - Nayane Dos Santos Brito Silva
- Nantes Université, INSERM, Ecole Centrale Nantes, Center for Research in Transplantation and Translational Immunology, Nantes, France
- São Paulo State University, Molecular Genetics and Bioinformatics Laboratory, School of Medicine, Botucatu, Brazil
| | - Sonia Bourguiba-Hachemi
- Nantes Université, INSERM, Ecole Centrale Nantes, Center for Research in Transplantation and Translational Immunology, Nantes, France
| | - Michel S Naslavsky
- Human Genome and Stem Cell Research Center, University of São Paulo, São Paulo, Brazil
- Department of Genetics and Evolutionary Biology, Biosciences Institute, University of São Paulo, São Paulo, Brazil
- Hospital Israelita Albert Einstein, São Paulo, Brazil
| | - Marilia O Scliar
- Human Genome and Stem Cell Research Center, University of São Paulo, São Paulo, Brazil
| | - Yeda A O Duarte
- Medical-Surgical Nursing Department, School of Nursing, University of São Paulo, São Paulo, Brazil
- Epidemiology Department, Public Health School, University of São Paulo, São Paulo, Brazil
| | - Mayana Zatz
- Human Genome and Stem Cell Research Center, University of São Paulo, São Paulo, Brazil
- Department of Genetics and Evolutionary Biology, Biosciences Institute, University of São Paulo, São Paulo, Brazil
| | - Maria Rita Passos-Bueno
- Human Genome and Stem Cell Research Center, University of São Paulo, São Paulo, Brazil
- Department of Genetics and Evolutionary Biology, Biosciences Institute, University of São Paulo, São Paulo, Brazil
| | - Sophie Limou
- Nantes Université, INSERM, Ecole Centrale Nantes, Center for Research in Transplantation and Translational Immunology, Nantes, France
| | - Pierre-Antoine Gourraud
- Nantes Université, INSERM, Ecole Centrale Nantes, Center for Research in Transplantation and Translational Immunology, Nantes, France
| | - Élise Launay
- Nantes Université, INSERM, Ecole Centrale Nantes, Center for Research in Transplantation and Translational Immunology, Nantes, France
- Department of Pediatrics and Pediatric Emergency, Hôpital Femme Enfant Adolescent, CHU de Nantes, Nantes, France
| | - Erick C Castelli
- São Paulo State University, Molecular Genetics and Bioinformatics Laboratory, School of Medicine, Botucatu, Brazil
| | - Nicolas Vince
- Nantes Université, INSERM, Ecole Centrale Nantes, Center for Research in Transplantation and Translational Immunology, Nantes, France
|
12
|
Seifert F, Eisenblätter R, Beckmann J, Schürmann P, Hanel P, Jentschke M, Böhmer G, Strauß HG, Hirchenhain C, Schmidmayr M, Müller F, Fasching P, Luyten A, Häfner N, Dürst M, Runnebaum IB, Hillemanns P, Dörk T, Ramachandran D. Association of two genomic variants with HPV type-specific risk of cervical cancer. Tumour Virus Res 2023; 16:200269. [PMID: 37499979 PMCID: PMC10415783 DOI: 10.1016/j.tvr.2023.200269] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2023] [Revised: 07/22/2023] [Accepted: 07/24/2023] [Indexed: 07/29/2023] Open
Abstract
PROBLEM: Human papillomavirus infection is integral to developing invasive cervical cancer in the majority of patients. In a recent genome-wide association study, rs9357152 and rs4243652 were associated with seropositivity for HPV16 and HPV18, respectively. It is unknown whether these variants are also associated with cervical cancer triggered by either HPV16 or HPV18. METHODS: We investigated whether the two HPV susceptibility variants show association with type-specific cervical cancer in a genetic case-control study with cases stratified by HPV16 or HPV18 status. We further tested whether rs9357152 modulates gene expression of any of 36 genes at the human leukocyte antigen locus in 256 cervical tissues. RESULTS: rs9357152 was associated with invasive HPV16-positive cervical cancer (OR 1.33, 95% CI 1.03-1.70, p = 0.03), and rs4243652 was associated with HPV18-positive adenocarcinomas (OR 2.96, 95% CI 1.18-7.41, p = 0.02). These associations remained borderline significant after testing against different sets of controls. rs9357152 was found to be an eQTL for HLA-DRB1 in HPV-positive cervical tissues (pANOVA = 0.0009), with the risk allele lowering mRNA levels. CONCLUSIONS: We find evidence that HPV seropositivity variants on chromosomes 6 and 14 may modulate type-specific cervical cancer risk. rs9357152 may exert its effect through regulating HLA-DRB1 induction in the presence of HPV. Because of multiple testing, these results need to be confirmed in larger studies.
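For readers less familiar with the "OR, 95% CI" notation in the results, the sketch below computes an odds ratio and a Wald confidence interval from a 2x2 case-control table. This is a textbook calculation, not necessarily the model the authors fitted (the abstract does not describe it), and the counts are hypothetical rather than taken from the study.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: cases carrying the allele, b: cases not carrying it,
    c: controls carrying it,     d: controls not carrying it.
    z=1.96 corresponds to a 95% interval.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Wald interval")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only.
or_value, ci_low, ci_high = odds_ratio_wald_ci(20, 10, 10, 20)
print(or_value, ci_low, ci_high)
```

An interval whose lower bound sits just above 1, as in the reported HPV16 result (1.03-1.70), indicates an association that is nominally significant but sensitive to sample size and multiple testing.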
Affiliation(s)
- Finja Seifert
- Gynaecology Research Unit, Comprehensive Cancer Center, Hannover Medical School, D-30625, Hannover, Germany
| | - Rieke Eisenblätter
- Gynaecology Research Unit, Comprehensive Cancer Center, Hannover Medical School, D-30625, Hannover, Germany
| | - Julia Beckmann
- Gynaecology Research Unit, Comprehensive Cancer Center, Hannover Medical School, D-30625, Hannover, Germany
| | - Peter Schürmann
- Gynaecology Research Unit, Comprehensive Cancer Center, Hannover Medical School, D-30625, Hannover, Germany
| | - Patricia Hanel
- Gynaecology Research Unit, Comprehensive Cancer Center, Hannover Medical School, D-30625, Hannover, Germany
| | - Matthias Jentschke
- Clinics of Gynaecology and Obstetrics, Hannover Medical School, D-30625, Hannover, Germany
| | | | - Hans-Georg Strauß
- Department of Gynaecology, University Clinics, Martin-Luther University, Halle-Wittenberg, Germany
| | - Christine Hirchenhain
- Department of Gynaecology, Clinics Carl Gustav Carus, University of Dresden, Dresden, Germany
| | - Monika Schmidmayr
- Department of Gynaecology, Technische Universität München, Munich, Germany
| | - Florian Müller
- Martin-Luther Hospital, Charite University, Berlin, Germany
| | - Peter Fasching
- Department of Gynaecology and Obstetrics, Erlangen University Hospital, Friedrich-Alexander University of Erlangen-Nuremberg, Erlangen, Germany
| | - Alexander Luyten
- Dysplasia Unit, Department of Gynaecology and Obstetrics, Mare Klinikum, Kronshagen, Germany; Department of Gynaecology, Wolfsburg Hospital, Wolfsburg, Germany
| | - Norman Häfner
- Department of Gynaecology, Jena University Hospital, Friedrich-Schiller-University Jena, Jena, Germany
| | - Matthias Dürst
- Department of Gynaecology, Jena University Hospital, Friedrich-Schiller-University Jena, Jena, Germany
| | - Ingo B Runnebaum
- Department of Gynaecology, Jena University Hospital, Friedrich-Schiller-University Jena, Jena, Germany
| | - Peter Hillemanns
- Clinics of Gynaecology and Obstetrics, Hannover Medical School, D-30625, Hannover, Germany
| | - Thilo Dörk
- Gynaecology Research Unit, Comprehensive Cancer Center, Hannover Medical School, D-30625, Hannover, Germany
| | - Dhanya Ramachandran
- Gynaecology Research Unit, Comprehensive Cancer Center, Hannover Medical School, D-30625, Hannover, Germany.
|
13
|
Liu X, Matsunami M, Horikoshi M, Ito S, Ishikawa Y, Suzuki K, Momozawa Y, Niida S, Kimura R, Ozaki K, Maeda S, Imamura M, Terao C. Natural Selection Signatures in the Hondo and Ryukyu Japanese Subpopulations. Mol Biol Evol 2023; 40:msad231. [PMID: 37903429 PMCID: PMC10615566 DOI: 10.1093/molbev/msad231] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2023] [Revised: 09/20/2023] [Accepted: 10/06/2023] [Indexed: 11/01/2023] Open
Abstract
Natural selection signatures across Japanese subpopulations are under-explored. Here we conducted genome-wide selection scans with 622,926 single nucleotide polymorphisms for 20,366 Japanese individuals, who were recruited from the main islands of the Japanese Archipelago (Hondo) and the Ryukyu Archipelago (Ryukyu), representing two major Japanese subpopulations. The integrated haplotype score (iHS) analysis identified several signals in one or both subpopulations. We found a novel candidate locus at IKZF2, especially in Ryukyu. Significant signals were observed in the major histocompatibility complex region in both subpopulations. The lead variants differed and demonstrated substantial allele frequency differences between Hondo and Ryukyu. The lead variant in Hondo tags HLA-A*33:03-C*14:03-B*44:03-DRB1*13:02-DQB1*06:04-DPB1*04:01, a haplotype specific to Japanese and Korean populations. In Ryukyu, the lead variant tags DRB1*15:01-DQB1*06:02, which has been recognized as a genetic risk factor for narcolepsy but is reported to confer protective effects against type 1 diabetes and human T lymphotropic virus type 1-associated myelopathy/tropical spastic paraparesis. The FastSMC analysis identified 8 loci potentially affected by selection within the past 20-150 generations, including 2 novel candidate loci. The analysis also showed differences between Hondo and Ryukyu in selection patterns of ALDH2, a gene recognized to be specifically targeted by selection in East Asians. In summary, our study provides insights into selection signatures within the Japanese population and nominates potential sources of selection pressure.
Affiliation(s)
- Xiaoxi Liu
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Clinical Research Center, Shizuoka General Hospital, Shizuoka, Japan
| | - Masatoshi Matsunami
- Department of Advanced Genomic and Laboratory Medicine, Graduate School of Medicine, University of the Ryukyus, Nishihara-Cho, Japan
| | - Momoko Horikoshi
- Laboratory for Genomics of Diabetes and Metabolism, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
| | - Shuji Ito
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
| | - Yuki Ishikawa
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
| | - Kunihiko Suzuki
- Laboratory for Genotyping Development, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
| | - Yukihide Momozawa
- Laboratory for Genotyping Development, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
| | - Shumpei Niida
- Core Facility Administration, Research Institute, National Center for Geriatrics and Gerontology, Obu, Japan
| | - Ryosuke Kimura
- Department of Human Biology and Anatomy, Graduate School of Medicine, University of the Ryukyus, Nishihara-Cho, Japan
| | - Kouichi Ozaki
- Medical Genome Center, Research Institute, National Center for Geriatrics and Gerontology, Obu, Japan
| | - Shiro Maeda
- Department of Advanced Genomic and Laboratory Medicine, Graduate School of Medicine, University of the Ryukyus, Nishihara-Cho, Japan
- Division of Clinical Laboratory and Blood Transfusion, University of the Ryukyus Hospital, Okinawa, Japan
| | - Minako Imamura
- Department of Advanced Genomic and Laboratory Medicine, Graduate School of Medicine, University of the Ryukyus, Nishihara-Cho, Japan
- Division of Clinical Laboratory and Blood Transfusion, University of the Ryukyus Hospital, Okinawa, Japan
| | - Chikashi Terao
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Clinical Research Center, Shizuoka General Hospital, Shizuoka, Japan
- The Department of Applied Genetics, School of Pharmaceutical Sciences, University of Shizuoka, Shizuoka, Japan
|
14
|
Li W, Mirone J, Prasad A, Miolane N, Legrand C, Dao Duc K. Orthogonal outlier detection and dimension estimation for improved MDS embedding of biological datasets. Frontiers in Bioinformatics 2023; 3:1211819. [PMID: 37637212 PMCID: PMC10448701 DOI: 10.3389/fbinf.2023.1211819] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2023] [Accepted: 07/26/2023] [Indexed: 08/29/2023] Open
Abstract
Conventional dimensionality reduction methods like Multidimensional Scaling (MDS) are sensitive to the presence of orthogonal outliers, leading to significant defects in the embedding. We introduce a robust MDS method, called DeCOr-MDS (Detection and Correction of Orthogonal outliers using MDS), based on the geometry and statistics of simplices formed by data points, that detects orthogonal outliers and subsequently reduces dimensionality. We validate our method using synthetic datasets, and further show how it can be applied to a variety of large real biological datasets, including cancer image cell data, human microbiome project data and single cell RNA sequencing data, to address the task of data cleaning and visualization.
Affiliation(s)
- Wanxin Li
- Department of Computer Science, University of British Columbia, Vancouver, BC, Canada
| | - Jules Mirone
- Department of Mathematics, University of British Columbia, Vancouver, BC, Canada
- Centre de Mathématiques Appliquées, Ecole Polytechnique, Palaiseau, France
| | - Ashok Prasad
- Department of Chemical and Biological Engineering, School of Biomedical Engineering, Colorado State University, Fort Collins, CO, United States
| | - Nina Miolane
- Department of Electrical and Computer Engineering, University of California, Santa Barbara, Santa Barbara, CA, United States
| | - Carine Legrand
- Université Paris Cité, Génomes, biologie cellulaire et thérapeutique U944, INSERM, CNRS, Paris, France
| | - Khanh Dao Duc
- Department of Computer Science, University of British Columbia, Vancouver, BC, Canada
- Department of Mathematics, University of British Columbia, Vancouver, BC, Canada
|
15
|
Moon J, Posada-Quintero HF, Chon KH. Genetic data visualization using literature text-based neural networks: Examples associated with myocardial infarction. Neural Netw 2023; 165:562-595. [PMID: 37364469 DOI: 10.1016/j.neunet.2023.05.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2022] [Revised: 04/11/2023] [Accepted: 05/09/2023] [Indexed: 06/28/2023]
Abstract
Data visualization is critical to unraveling hidden information from complex and high-dimensional data. Interpretable visualization methods are critical, especially in the biological and medical fields; however, there are few effective visualization methods for large genetic data. Current visualization methods are limited to lower-dimensional data, and their performance suffers if there is missing data. In this study, we propose a literature-based visualization method to reduce high-dimensional data without compromising the dynamics of the single nucleotide polymorphisms (SNPs) and textual interpretability. Our method is innovative because it is shown to (1) preserve both global and local structures of SNPs while reducing the dimension of the data using literature text representations, and (2) enable interpretable visualizations using textual information. For performance evaluation, we applied the proposed approach to classifying several categories, including race, myocardial infarction event age group, and sex, using several machine learning models on the literature-derived SNP data. We used visualization approaches to examine clustering of the data, as well as quantitative performance metrics for the classification of the risk factors examined above. Our method outperformed all popular dimensionality reduction and visualization methods for both classification and visualization, and it is robust against missing and higher-dimensional data. Moreover, we found it feasible to incorporate both genetic and other risk information obtained from literature with our method.
Affiliation(s)
- Jihye Moon
- Department of Biomedical Engineering, University of Connecticut, Storrs, CT 06269, USA.
| | | | - Ki H Chon
- Department of Biomedical Engineering, University of Connecticut, Storrs, CT 06269, USA.
|
16
|
Cooke NP, Mattiangeli V, Cassidy LM, Okazaki K, Kasai K, Bradley DG, Gakuhari T, Nakagome S. Genomic insights into a tripartite ancestry in the Southern Ryukyu Islands. Evolutionary Human Sciences 2023; 5:e23. [PMID: 37587935 PMCID: PMC10426068 DOI: 10.1017/ehs.2023.18] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2022] [Revised: 06/21/2023] [Accepted: 06/22/2023] [Indexed: 08/18/2023] Open
Abstract
A tripartite structure for the genetic origin of Japanese populations states that present-day populations are descended from three main ancestral sources: (1) the indigenous Jomon hunter-gatherers; (2) a Northeast Asian component that arrived during the agrarian Yayoi period; and (3) a major influx of East Asian ancestry in the imperial Kofun period. However, the genetic heterogeneity observed in different regions of the Japanese archipelago highlights the need to assess the applicability and suitability of this model. Here, we analyse historic genomes from the southern Ryukyu Islands, which have unique cultural and historical backgrounds compared with other parts of Japan. Our analysis supports the tripartite structure as the best fit in this region, with significantly higher estimated proportions of Jomon ancestry than in mainland Japanese. Unlike the main islands, where each continental ancestry was brought directly by immigrants from the continent, people who already carried the tripartite ancestry migrated to the southern Ryukyu Islands and admixed with the prehistoric people around the eleventh century AD, coinciding with the emergence of the Gusuku period. These results reaffirm the tripartite model in the southernmost extremes of the Japanese archipelago and show variability in how the structure emerged in diverse geographic regions.
Affiliation(s)
- Niall P. Cooke
- School of Medicine, Trinity College Dublin, Dublin, Ireland
| | | | - Lara M. Cassidy
- Smurfit Institute of Genetics, Trinity College Dublin, Dublin, Ireland
| | - Kenji Okazaki
- Department of Anatomy, Faculty of Medicine, Tottori University, Japan
| | - Kenji Kasai
- Toyama Prefectural Center for Archaeological Operations, Toyama, Japan
| | - Daniel G. Bradley
- Smurfit Institute of Genetics, Trinity College Dublin, Dublin, Ireland
| | - Takashi Gakuhari
- Institute for the Study of Ancient Civilizations and Cultural Resources, College of Human and Social Sciences, Kanazawa University, Kanazawa, Japan
| | - Shigeki Nakagome
- School of Medicine, Trinity College Dublin, Dublin, Ireland
- Institute for the Study of Ancient Civilizations and Cultural Resources, College of Human and Social Sciences, Kanazawa University, Kanazawa, Japan
|
17
|
Abu-El-Haija A, Reddi HV, Wand H, Rose NC, Mori M, Qian E, Murray MF. The clinical application of polygenic risk scores: A points to consider statement of the American College of Medical Genetics and Genomics (ACMG). Genet Med 2023; 25:100803. [PMID: 36920474 DOI: 10.1016/j.gim.2023.100803] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Received: 01/26/2023] [Accepted: 01/27/2023] [Indexed: 03/16/2023] Open
Affiliation(s)
- Aya Abu-El-Haija
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA; Harvard Medical School, Boston, MA
| | - Honey V Reddi
- Department of Pathology & Laboratory Medicine, Medical College of Wisconsin, Milwaukee, WI
| | - Hannah Wand
- Division of Cardiovascular Medicine, Department of Medicine, Stanford Medicine, Stanford, CA
| | - Nancy C Rose
- Division of Maternal-Fetal Medicine, Department of Obstetrics and Gynecology, School of Medicine, University of Utah Health, Salt Lake City, UT
| | - Mari Mori
- Department of Pediatrics, The Ohio State University College of Medicine, Columbus, OH; Genetic and Genomic Medicine, Nationwide Children's Hospital, Columbus, OH
| | - Emily Qian
- Department of Genetics, Yale University, New Haven, CT
|
18
|
Katsushika S, Kodera S, Sawano S, Shinohara H, Setoguchi N, Tanabe K, Higashikuni Y, Takeda N, Fujiu K, Daimon M, Akazawa H, Morita H, Komuro I. An explainable artificial intelligence-enabled electrocardiogram analysis model for the classification of reduced left ventricular function. European Heart Journal. Digital Health 2023; 4:254-264. [PMID: 37265859 PMCID: PMC10232279 DOI: 10.1093/ehjdh/ztad027] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/21/2022] [Revised: 03/27/2023] [Accepted: 04/18/2023] [Indexed: 06/03/2023]
Abstract
Aims: The black box nature of artificial intelligence (AI) hinders the development of interpretable AI models that are applicable in clinical practice. We aimed to develop an AI model with decision interpretability for classifying patients with reduced left ventricular ejection fraction (LVEF) from 12-lead electrocardiograms (ECG). Methods and results: We acquired paired ECG and echocardiography datasets from the central and co-operative institutions. For the central institution dataset, a random forest model was trained to identify patients with reduced LVEF among 29 907 ECGs. Shapley additive explanations were applied to 7196 ECGs. To extract the model's decision criteria, the calculated Shapley additive explanations values were clustered for 192 non-paced rhythm patients in whom reduced LVEF was predicted. Although the extracted criteria were different for each cluster, these criteria generally comprised a combination of six ECG findings: negative T-wave inversion in I/V5-6 leads, low voltage in I/II/V4-6 leads, Q wave in V3-6 leads, ventricular activation time prolongation in I/V5-6 leads, S-wave prolongation in V2-3 leads, and corrected QT interval prolongation. Similarly, for the co-operative institution dataset, the extracted criteria comprised a combination of the same six ECG findings. Furthermore, the accuracy of seven cardiologists' ECG readings improved significantly after watching a video explaining the interpretation of these criteria (before, 62.9% ± 3.9% vs. after, 73.9% ± 2.4%; P = 0.02). Conclusion: We visually interpreted the model's decision criteria to evaluate its validity, thereby developing a model that provided the decision interpretability required for clinical application.
Affiliation(s)
- Susumu Katsushika
- Department of Cardiovascular Medicine, The University of Tokyo Hospital, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8655, Japan
| | | | - Shinnosuke Sawano
- Department of Cardiovascular Medicine, The University of Tokyo Hospital, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8655, Japan
| | - Hiroki Shinohara
- Department of Cardiovascular Medicine, The University of Tokyo Hospital, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8655, Japan
| | - Naoto Setoguchi
- Department of Cardiovascular Medicine, Mitsui Memorial Hospital, 1 Kanda-Izumi-cho, Chiyoda-ku, Tokyo 101-8643, Japan
| | - Kengo Tanabe
- Department of Cardiovascular Medicine, Mitsui Memorial Hospital, 1 Kanda-Izumi-cho, Chiyoda-ku, Tokyo 101-8643, Japan
| | - Yasutomi Higashikuni
- Department of Cardiovascular Medicine, The University of Tokyo Hospital, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8655, Japan
| | - Norifumi Takeda
- Department of Cardiovascular Medicine, The University of Tokyo Hospital, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8655, Japan
| | - Katsuhito Fujiu
- Department of Advanced Cardiology, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8655, Japan
| | - Masao Daimon
- Department of Clinical Laboratory, The University of Tokyo Hospital, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8655, Japan
| | - Hiroshi Akazawa
- Department of Cardiovascular Medicine, The University of Tokyo Hospital, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8655, Japan
| | - Hiroyuki Morita
- Department of Cardiovascular Medicine, The University of Tokyo Hospital, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8655, Japan
|
19
|
Yang Y, Tuong ZK, Yu D. Dimensionality reduction under scrutiny. Nature Computational Science 2023; 3:8-9. [PMID: 38177957 DOI: 10.1038/s43588-022-00383-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/06/2024]
Affiliation(s)
- Yang Yang
- Frazer Institute, The University of Queensland, Brisbane, Australia
| | - Zewen K Tuong
- Molecular Immunity Unit, Department of Medicine, University of Cambridge, Cambridge, UK
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK
| | - Di Yu
- Frazer Institute, The University of Queensland, Brisbane, Australia.
- Ian Frazer Centre for Children's Immunotherapy Research, Child Health Research Centre, Faculty of Medicine, The University of Queensland, Brisbane, Australia.
|
20
|
Genetic footprints of assortative mating in the Japanese population. Nat Hum Behav 2023; 7:65-73. [PMID: 36138222 PMCID: PMC9883156 DOI: 10.1038/s41562-022-01438-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 10/12/2021] [Accepted: 07/20/2022] [Indexed: 02/03/2023]
Abstract
Assortative mating (AM) is a pattern characterized by phenotypic similarities between mating partners. Detecting evidence of AM has been challenging due to the lack of large-scale datasets that include phenotypic data on both partners, especially in populations of non-European ancestries. Gametic phase disequilibrium between trait-associated alleles is a signature of parental AM on a polygenic trait, which can be detected even without partner data. Here, using polygenic scores for 81 traits in the Japanese population, derived from BioBank Japan Project genome-wide association study data (n = 172,270), we found evidence of AM on the liability to type 2 diabetes and coronary artery disease, as well as on dietary habits. In a cross-population comparison using United Kingdom Biobank data (n = 337,139), we found shared but heterogeneous impacts of AM between populations.
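One way to look for gametic phase disequilibrium without partner data is to compute a polygenic score separately from variants on odd- and even-numbered chromosomes and correlate the two across individuals; under random mating the two halves should be close to uncorrelated. The sketch below illustrates that idea on toy data. Whether this matches the authors' exact procedure is an assumption, since the abstract does not spell it out, and the weights, dosages, and function names are hypothetical.

```python
import math


def polygenic_score(dosages, weights):
    """Weighted sum of allele dosages (0, 1, or 2) using per-variant effect sizes."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must align")
    return sum(d * w for d, w in zip(dosages, weights))


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy data: each person has dosages for variants on odd- and
# even-numbered chromosomes; weights stand in for GWAS effect sizes.
odd_weights, even_weights = [0.2, -0.1], [0.3, 0.1]
people = [
    ([0, 1], [0, 0]),
    ([1, 0], [1, 1]),
    ([2, 0], [2, 1]),
    ([1, 2], [0, 1]),
]
odd_scores = [polygenic_score(odd, odd_weights) for odd, _ in people]
even_scores = [polygenic_score(even, even_weights) for _, even in people]
print(pearson(odd_scores, even_scores))
```

A positive correlation between the two half-genome scores is the pattern expected under assortative mating, although in real data population structure has to be controlled for before drawing that conclusion.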
|
21
|
Lam M, Chen CY, Hill WD, Xia C, Tian R, Levey DF, Gelernter J, Stein MB, Hatoum AS, Huang H, Malhotra AK, Runz H, Ge T, Lencz T. Collective genomic segments with differential pleiotropic patterns between cognitive dimensions and psychopathology. Nat Commun 2022; 13:6868. [PMID: 36369282 PMCID: PMC9652380 DOI: 10.1038/s41467-022-34418-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 08/02/2021] [Accepted: 10/24/2022] [Indexed: 11/13/2022] Open
Abstract
Cognitive deficits are known to be related to most forms of psychopathology. Here, we perform local genetic correlation analysis as a means of identifying independent segments of the genome that show biologically interpretable pleiotropic associations between cognitive dimensions and psychopathology. We identify collective segments of the genome, which we call "meta-loci", showing differential pleiotropic patterns for psychopathology relative to either cognitive task performance (CTP) or performance on a non-cognitive factor (NCF) derived from educational attainment. We observe that neurodevelopmental gene sets expressed during the prenatal-early childhood period predominate in CTP-relevant meta-loci, while post-natal gene sets are more involved in NCF-relevant meta-loci. Further, we demonstrate that neurodevelopmental gene sets are dissociable across CTP meta-loci with respect to their spatial distribution across the brain. Additionally, we find that GABA-ergic, cholinergic, and glutamatergic genes drive pleiotropic relationships within dissociable meta-loci.
Affiliation(s)
- Max Lam
- Division of Psychiatry Research, The Zucker Hillside Hospital, Northwell, Glen Oaks, NY, USA
- Institute of Behavioral Science, Feinstein Institutes for Medical Research, Manhasset, NY, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Analytical and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Institute of Mental Health, Singapore, Singapore
| | - Chia-Yen Chen
- Translational Biology, Research and Development, Biogen Inc, Cambridge, MA, USA
| | - W David Hill
- Lothian Birth Cohorts group, Department of Psychology, University of Edinburgh, Edinburgh, UK
| | - Charley Xia
- Lothian Birth Cohorts group, Department of Psychology, University of Edinburgh, Edinburgh, UK
| | - Ruoyu Tian
- Computational Biology and Human Genetics, Dewpoint Therapeutics, Boston, MA, USA
| | - Daniel F Levey
- Department of Psychiatry, Yale University School of Medicine, New Haven, CT, USA
- VA Connecticut Healthcare System, West Haven, CT, USA
| | - Joel Gelernter
- Department of Psychiatry, Yale University School of Medicine, New Haven, CT, USA
- VA Connecticut Healthcare System, West Haven, CT, USA
- Department of Genetics, Yale University School of Medicine, New Haven, CT, USA
- Department of Neuroscience, Yale University School of Medicine, New Haven, CT, USA
| | - Murray B Stein
- VA San Diego Healthcare System, San Diego, CA, USA
- Department of Psychiatry, University of California, San Diego, La Jolla, CA, USA
- Herbert Wertheim School of Public Health and Human Longevity Science, University of California San Diego, La Jolla, CA, USA
| | - Alexander S Hatoum
- Department of Psychiatry, Washington University in St. Louis Medical School, St. Louis, MO, USA
| | - Hailiang Huang
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Analytical and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
| | - Anil K Malhotra
- Division of Psychiatry Research, The Zucker Hillside Hospital, Northwell, Glen Oaks, NY, USA
- Institute of Behavioral Science, Feinstein Institutes for Medical Research, Manhasset, NY, USA
- Department of Psychiatry, Zucker School of Medicine at Hofstra/Northwell, Hempstead, NY, USA
- Department of Molecular Medicine, Zucker School of Medicine at Hofstra/Northwell, Hempstead, NY, USA
| | - Heiko Runz
- Translational Biology, Research and Development, Biogen Inc, Cambridge, MA, USA
| | - Tian Ge
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Department of Psychiatry, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
| | - Todd Lencz
- Division of Psychiatry Research, The Zucker Hillside Hospital, Northwell, Glen Oaks, NY, USA.
- Institute of Behavioral Science, Feinstein Institutes for Medical Research, Manhasset, NY, USA.
- Department of Psychiatry, Zucker School of Medicine at Hofstra/Northwell, Hempstead, NY, USA.
- Department of Molecular Medicine, Zucker School of Medicine at Hofstra/Northwell, Hempstead, NY, USA.
| |
|
22
|
DOCK2 is involved in the host genetics and biology of severe COVID-19. Nature 2022; 609:754-760. [PMID: 35940203 PMCID: PMC9492544 DOI: 10.1038/s41586-022-05163-5] [Citation(s) in RCA: 34] [Impact Index Per Article: 17.0] [Received: 10/26/2021] [Accepted: 07/28/2022] [Indexed: 12/12/2022]
Abstract
Identifying the host genetic factors underlying severe COVID-19 is an emerging challenge. Here we conducted a genome-wide association study (GWAS) involving 2,393 cases of COVID-19 in a cohort of Japanese individuals collected during the initial waves of the pandemic, with 3,289 unaffected controls. We identified a variant on chromosome 5 at 5q35 (rs60200309-A), close to the dedicator of cytokinesis 2 gene (DOCK2), which was associated with severe COVID-19 in patients less than 65 years of age. This risk allele was prevalent in East Asian individuals but rare in Europeans, highlighting the value of genome-wide association studies in non-European populations. RNA-sequencing analysis of 473 bulk peripheral blood samples identified decreased expression of DOCK2 associated with the risk allele in these younger patients. DOCK2 expression was suppressed in patients with severe cases of COVID-19. Single-cell RNA-sequencing analysis (n = 61 individuals) identified cell-type-specific downregulation of DOCK2 and a COVID-19-specific decreasing effect of the risk allele on DOCK2 expression in non-classical monocytes. Immunohistochemistry of lung specimens from patients with severe COVID-19 pneumonia showed suppressed DOCK2 expression. Moreover, inhibition of DOCK2 function with CPYPP increased the severity of pneumonia in a Syrian hamster model of SARS-CoV-2 infection, characterized by weight loss, lung oedema, enhanced viral loads, impaired macrophage recruitment and dysregulated type I interferon responses. We conclude that DOCK2 has an important role in the host immune response to SARS-CoV-2 infection and the development of severe COVID-19, and could be further explored as a potential biomarker and/or therapeutic target. A genome-wide association study highlights a variant in DOCK2, which is common in East Asian populations but rare in Europeans, as a host genetic risk factor for severe COVID-19.
|
23
|
Wang Y, Tsuo K, Kanai M, Neale BM, Martin AR. Challenges and Opportunities for Developing More Generalizable Polygenic Risk Scores. Annu Rev Biomed Data Sci 2022; 5:293-320. [PMID: 35576555 PMCID: PMC9828290 DOI: 10.1146/annurev-biodatasci-111721-074830] [Citation(s) in RCA: 47] [Impact Index Per Article: 23.5] [Indexed: 01/11/2023]
Abstract
Polygenic risk scores (PRS) estimate an individual's genetic likelihood of complex traits and diseases by aggregating information across multiple genetic variants identified from genome-wide association studies. PRS can predict a broad spectrum of diseases and have therefore been widely used in research settings. Some work has investigated their potential applications as biomarkers in preventative medicine, but significant work is still needed to definitively establish and communicate absolute risk to patients for genetic and modifiable risk factors across demographic groups. However, the biggest limitation of PRS currently is that they show poor generalizability across diverse ancestries and cohorts. Major efforts are underway through methodological development and data generation initiatives to improve their generalizability. This review aims to comprehensively discuss current progress on the development of PRS, the factors that affect their generalizability, and promising areas for improving their accuracy, portability, and implementation.
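As a rough illustration of the aggregation this abstract describes, a polygenic score is commonly computed as a weighted sum of allele dosages, with per-variant weights taken from GWAS effect estimates. The sketch below is not taken from the review: the function name, variant IDs, weights, and dosages are all hypothetical, and real pipelines add steps such as variant selection, shrinkage, and ancestry adjustment that are omitted here.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of allele dosages over variants present in both inputs.

    dosages: dict mapping variant ID -> effect-allele count (0, 1, or 2)
    weights: dict mapping variant ID -> per-allele effect size (e.g. a GWAS beta)
    """
    shared = dosages.keys() & weights.keys()
    return sum(weights[v] * dosages[v] for v in shared)


# Hypothetical effect sizes and one individual's genotype (illustrative only).
example_weights = {"rs_a": 0.10, "rs_b": -0.05, "rs_c": 0.20}
example_person = {"rs_a": 2, "rs_b": 1, "rs_c": 0}

score = polygenic_score(example_person, example_weights)
print(round(score, 3))  # approximately 0.15
```

Because the weights are estimated in a specific discovery cohort, the same sum can carry different predictive value in a cohort of different ancestry, which is the generalizability problem the review addresses.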
Affiliation(s)
- Ying Wang
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts, USA,Stanley Center for Psychiatric Research and Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA
| | - Kristin Tsuo
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts, USA,Stanley Center for Psychiatric Research and Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA,Biological and Biomedical Sciences, Harvard Medical School, Boston, Massachusetts, USA
| | - Masahiro Kanai
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts, USA,Stanley Center for Psychiatric Research and Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA,Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA,Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita, Japan
| | - Benjamin M. Neale
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts, USA,Stanley Center for Psychiatric Research and Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA
| | - Alicia R. Martin
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts, USA,Stanley Center for Psychiatric Research and Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA
| |
|
24
|
Greenwood D, Taverner T, Adderley NJ, Price MJ, Gokhale K, Sainsbury C, Gallier S, Welch C, Sapey E, Murray D, Fanning H, Ball S, Nirantharakumar K, Croft W, Moss P. Machine learning of COVID-19 clinical data identifies population structures with therapeutic potential. iScience 2022; 25:104480. [PMID: 35665240 PMCID: PMC9153184 DOI: 10.1016/j.isci.2022.104480] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/18/2021] [Revised: 03/07/2022] [Accepted: 05/20/2022] [Indexed: 11/29/2022]
Abstract
Clinical outcomes for patients with COVID-19 are heterogeneous and there is interest in defining subgroups for prognostic modeling and development of treatment algorithms. We obtained 28 demographic and laboratory variables in patients admitted to hospital with COVID-19. These comprised a training cohort (n = 6099) and two validation cohorts during the first and second waves of the pandemic (n = 996; n = 1011). Uniform manifold approximation and projection (UMAP) dimension reduction and Gaussian mixture model (GMM) analysis were used to define patient clusters. Twenty-nine clusters were defined in the training cohort and associated with markedly different mortality rates, which were predictive within confirmation datasets. Deconvolution of clinical features within clusters identified unexpected relationships between variables. Integration of large datasets using UMAP-assisted clustering can therefore identify patient subgroups with prognostic information and uncover unexpected interactions between clinical variables. This application of machine learning represents a powerful approach for delineating disease pathogenesis and potential therapeutic interventions.
Affiliation(s)
- David Greenwood
- Institute of Immunology and Immunotherapy, University of Birmingham, Birmingham, UK
- The Centre for Computational Biology, University of Birmingham, Birmingham, UK
| | - Thomas Taverner
- Institute of Applied Health Research, University of Birmingham, Birmingham, UK
| | - Nicola J. Adderley
- Institute of Applied Health Research, University of Birmingham, Birmingham, UK
| | - Malcolm James Price
- Institute of Applied Health Research, University of Birmingham, Birmingham, UK
- NIHR Birmingham Biomedical Research Centre, University Hospitals Birmingham NHS Foundation Trust and University of Birmingham, Birmingham, UK
| | - Krishna Gokhale
- Institute of Applied Health Research, University of Birmingham, Birmingham, UK
| | | | - Suzy Gallier
- University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
- Institute of Inflammation and Ageing, University of Birmingham, Birmingham, UK
| | - Carly Welch
- University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
- Institute of Inflammation and Ageing, University of Birmingham, Birmingham, UK
| | - Elizabeth Sapey
- University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
- Institute of Inflammation and Ageing, University of Birmingham, Birmingham, UK
- Health Data Research, London, UK
| | - Duncan Murray
- Institute of Immunology and Immunotherapy, University of Birmingham, Birmingham, UK
- University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
| | - Hilary Fanning
- University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
| | - Simon Ball
- Institute of Immunology and Immunotherapy, University of Birmingham, Birmingham, UK
- University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
- Health Data Research, London, UK
| | | | - Wayne Croft
- Institute of Immunology and Immunotherapy, University of Birmingham, Birmingham, UK
- The Centre for Computational Biology, University of Birmingham, Birmingham, UK
| | - Paul Moss
- Institute of Immunology and Immunotherapy, University of Birmingham, Birmingham, UK
- University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
- Corresponding author
| |
|
25
|
Wei Y, Wu J, Wu Y, Liu H, Meng F, Liu Q, Midgley AC, Zhang X, Qi T, Kang H, Chen R, Kong D, Zhuang J, Yan X, Huang X. Prediction and Design of Nanozymes using Explainable Machine Learning. Adv Mater 2022; 34:e2201736. [PMID: 35487518 DOI: 10.1002/adma.202201736] [Citation(s) in RCA: 34] [Impact Index Per Article: 17.0] [Received: 02/22/2022] [Revised: 04/20/2022] [Indexed: 06/14/2023]
Abstract
Many nanomaterials, termed nanozymes, have been found to possess enzyme-like catalytic activity. A variety of internal and external factors influence the catalytic activity of nanozymes. However, there is a lack of essential methodologies to uncover the hidden mechanisms between nanozyme features and enzyme-like activity. Here, a data-driven approach is demonstrated that utilizes machine-learning algorithms to understand particle-property relationships, allowing for classification and quantitative predictions of enzyme-like activity exhibited by nanozymes. High consistency between predicted outputs and the observations is confirmed by accuracy (90.6%) and R² (up to 0.80). Furthermore, sensitivity analysis of the models reveals the central roles of transition metals in determining nanozyme activity. As an example, the models are successfully applied to predict or design desirable nanozymes by uncovering the hidden relationship between different periods of transition metals and their enzyme-like performance. This study offers a promising strategy to develop nanozymes with desirable catalytic activity and demonstrates the potential of machine learning within the field of material science.
Affiliation(s)
- Yonghua Wei
- State Key Laboratory of Medicinal Chemical Biology, Key Laboratory of Bioactive Materials for the Ministry of Education, College of Life Sciences, Frontiers Science Center for Cell Responses, Nankai University, Tianjin, 300071, China
- School of Medicine, Nankai University, Tianjin, 300071, China
| | - Jin Wu
- State Key Laboratory of Medicinal Chemical Biology, Key Laboratory of Bioactive Materials for the Ministry of Education, College of Life Sciences, Frontiers Science Center for Cell Responses, Nankai University, Tianjin, 300071, China
| | - Yixuan Wu
- State Key Laboratory of Medicinal Chemical Biology, Key Laboratory of Bioactive Materials for the Ministry of Education, College of Life Sciences, Frontiers Science Center for Cell Responses, Nankai University, Tianjin, 300071, China
| | - Hongjiang Liu
- State Key Laboratory of Medicinal Chemical Biology, Key Laboratory of Bioactive Materials for the Ministry of Education, College of Life Sciences, Frontiers Science Center for Cell Responses, Nankai University, Tianjin, 300071, China
| | - Fanqiang Meng
- College of Information Science and Engineering, China University of Petroleum, Beijing, 102249, China
| | - Qiqi Liu
- State Key Laboratory of Medicinal Chemical Biology, Key Laboratory of Bioactive Materials for the Ministry of Education, College of Life Sciences, Frontiers Science Center for Cell Responses, Nankai University, Tianjin, 300071, China
| | - Adam C Midgley
- State Key Laboratory of Medicinal Chemical Biology, Key Laboratory of Bioactive Materials for the Ministry of Education, College of Life Sciences, Frontiers Science Center for Cell Responses, Nankai University, Tianjin, 300071, China
| | - Xiangyun Zhang
- State Key Laboratory of Medicinal Chemical Biology, Key Laboratory of Bioactive Materials for the Ministry of Education, College of Life Sciences, Frontiers Science Center for Cell Responses, Nankai University, Tianjin, 300071, China
| | - Tianyi Qi
- State Key Laboratory of Medicinal Chemical Biology, Key Laboratory of Bioactive Materials for the Ministry of Education, College of Life Sciences, Frontiers Science Center for Cell Responses, Nankai University, Tianjin, 300071, China
| | - Helong Kang
- State Key Laboratory of Medicinal Chemical Biology, Key Laboratory of Bioactive Materials for the Ministry of Education, College of Life Sciences, Frontiers Science Center for Cell Responses, Nankai University, Tianjin, 300071, China
| | - Rui Chen
- School of Materials Science and Engineering, Nankai University, Tianjin, 300350, China
| | - Deling Kong
- State Key Laboratory of Medicinal Chemical Biology, Key Laboratory of Bioactive Materials for the Ministry of Education, College of Life Sciences, Frontiers Science Center for Cell Responses, Nankai University, Tianjin, 300071, China
| | - Jie Zhuang
- School of Medicine, Nankai University, Tianjin, 300071, China
- Joint Laboratory of Nanozymes, College of Life Sciences, Nankai University, Tianjin, 300071, China
| | - Xiyun Yan
- State Key Laboratory of Medicinal Chemical Biology, Key Laboratory of Bioactive Materials for the Ministry of Education, College of Life Sciences, Frontiers Science Center for Cell Responses, Nankai University, Tianjin, 300071, China
- Joint Laboratory of Nanozymes, College of Life Sciences, Nankai University, Tianjin, 300071, China
- CAS Engineering Laboratory for Nanozymes, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
| | - Xinglu Huang
- State Key Laboratory of Medicinal Chemical Biology, Key Laboratory of Bioactive Materials for the Ministry of Education, College of Life Sciences, Frontiers Science Center for Cell Responses, Nankai University, Tianjin, 300071, China
- Joint Laboratory of Nanozymes, College of Life Sciences, Nankai University, Tianjin, 300071, China
| |
|
26
|
Sikaroudi M, Rahnamayan S, Tizhoosh HR. Hospital-Agnostic Image Representation Learning in Digital Pathology. Annu Int Conf IEEE Eng Med Biol Soc 2022; 2022:3055-3058. [PMID: 36086646 DOI: 10.1109/embc48229.2022.9871198] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/15/2023]
Abstract
Whole Slide Images (WSIs) in digital pathology are used to diagnose cancer subtypes. The difference in procedures to acquire WSIs at various trial sites gives rise to variability in the histopathology images, thus making consistent diagnosis challenging. These differences may stem from variability in image acquisition through multi-vendor scanners, variable acquisition parameters, and differences in staining procedure; in addition, patient demographics may bias the glass slide batches before image acquisition. These variabilities are assumed to cause a domain shift in the images of different hospitals. It is crucial to overcome this domain shift because an ideal machine-learning model must be able to work on diverse sources of images, independent of the acquisition center. A domain generalization technique is leveraged in this study to improve the generalization capability of a Deep Neural Network (DNN) to an unseen histopathology image set (i.e., from an unseen hospital/trial site) in the presence of domain shift. According to experimental results, the conventional supervised-learning regime generalizes poorly to data collected from different hospitals. However, the proposed hospital-agnostic learning can improve generalization, as shown by visualization of the low-dimensional latent space representation and by classification accuracy results.
|
27
|
Pirruccello JP, Di Achille P, Nauffal V, Nekoui M, Friedman SF, Klarqvist MDR, Chaffin MD, Weng LC, Cunningham JW, Khurshid S, Roselli C, Lin H, Koyama S, Ito K, Kamatani Y, Komuro I, Jurgens SJ, Benjamin EJ, Batra P, Natarajan P, Ng K, Hoffmann U, Lubitz SA, Ho JE, Lindsay ME, Philippakis AA, Ellinor PT. Genetic analysis of right heart structure and function in 40,000 people. Nat Genet 2022; 54:792-803. [PMID: 35697867 PMCID: PMC10313645 DOI: 10.1038/s41588-022-01090-3] [Citation(s) in RCA: 31] [Impact Index Per Article: 15.5] [Received: 02/05/2021] [Accepted: 04/26/2022] [Indexed: 01/29/2023]
Abstract
Congenital heart diseases often involve maldevelopment of the evolutionarily recent right heart chamber. To gain insight into right heart structure and function, we fine-tuned deep learning models to recognize the right atrium, right ventricle and pulmonary artery, measuring right heart structures in 40,000 individuals from the UK Biobank with magnetic resonance imaging. Genome-wide association studies identified 130 distinct loci associated with at least one right heart measurement, of which 72 were not associated with left heart structures. Loci were found near genes previously linked with congenital heart disease, including NKX2-5, TBX5/TBX3, WNT9B and GATA4. A genome-wide polygenic predictor of right ventricular ejection fraction was associated with incident dilated cardiomyopathy (hazard ratio, 1.33 per standard deviation; P = 7.1 × 10⁻¹³) and remained significant after accounting for a left ventricular polygenic score. Harnessing deep learning to perform large-scale cardiac phenotyping, our results yield insights into the genetic determinants of right heart structure and function.
Affiliation(s)
- James P Pirruccello
- Cardiology Division, Massachusetts General Hospital, Boston, MA, USA
- Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA
- Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Harvard Medical School, Boston, MA, USA
| | - Paolo Di Achille
- Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Data Sciences Platform, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Victor Nauffal
- Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Cardiovascular Medicine, Brigham and Women's Hospital, Boston, MA, USA
| | - Mahan Nekoui
- Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Harvard Medical School, Boston, MA, USA
| | - Samuel F Friedman
- Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Data Sciences Platform, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Marcus D R Klarqvist
- Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Data Sciences Platform, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Mark D Chaffin
- Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Lu-Chen Weng
- Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Jonathan W Cunningham
- Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Cardiovascular Medicine, Brigham and Women's Hospital, Boston, MA, USA
| | - Shaan Khurshid
- Cardiology Division, Massachusetts General Hospital, Boston, MA, USA
- Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA
- Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Harvard Medical School, Boston, MA, USA
| | - Carolina Roselli
- Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
| | - Honghuang Lin
- Framingham Heart Study, Boston University and National Heart, Lung, and Blood Institute, Framingham, MA, USA
- Division of Clinical Informatics, Department of Medicine, University of Massachusetts Chan Medical School, Worcester, MA, USA
| | - Satoshi Koyama
- Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA
- Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Laboratory for Cardiovascular Genomics and Informatics, RIKEN Center for Integrative Medical Sciences, Kanagawa, Japan
| | - Kaoru Ito
- Laboratory for Cardiovascular Genomics and Informatics, RIKEN Center for Integrative Medical Sciences, Kanagawa, Japan
| | - Yoichiro Kamatani
- Laboratory of Complex Trait Genomics, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Kanagawa, Japan
| | - Issei Komuro
- Department of Cardiovascular Medicine, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan
| | - Sean J Jurgens
- Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Experimental Cardiology, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands
| | - Emelia J Benjamin
- Framingham Heart Study, Boston University and National Heart, Lung, and Blood Institute, Framingham, MA, USA
- Department of Medicine, Cardiology and Preventive Medicine Sections, Boston University School of Medicine, Boston, MA, USA
- Epidemiology Department, Boston University School of Public Health, Boston, MA, USA
| | - Puneet Batra
- Data Sciences Platform, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Pradeep Natarajan
- Cardiology Division, Massachusetts General Hospital, Boston, MA, USA
- Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA
- Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Harvard Medical School, Boston, MA, USA
| | | | - Udo Hoffmann
- Department of Radiology, Harvard Medical School, Boston, MA, USA
- Cardiovascular Imaging Research Center, Massachusetts General Hospital, Boston, MA, USA
| | - Steven A Lubitz
- Cardiology Division, Massachusetts General Hospital, Boston, MA, USA
- Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA
- Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Harvard Medical School, Boston, MA, USA
- Demoulas Center for Cardiac Arrhythmias, Massachusetts General Hospital, Boston, MA, USA
| | - Jennifer E Ho
- Harvard Medical School, Boston, MA, USA
- CardioVascular Institute and Division of Cardiology, Department of Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA
| | - Mark E Lindsay
- Cardiology Division, Massachusetts General Hospital, Boston, MA, USA
- Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA
- Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Harvard Medical School, Boston, MA, USA
- Thoracic Aortic Center, Massachusetts General Hospital, Boston, MA, USA
| | | | - Patrick T Ellinor
- Cardiology Division, Massachusetts General Hospital, Boston, MA, USA.
- Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA.
- Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Harvard Medical School, Boston, MA, USA.
- Demoulas Center for Cardiac Arrhythmias, Massachusetts General Hospital, Boston, MA, USA.
| |
|
28
|
Bois A, Tervil B, Moreau A, Vienne-Jumeau A, Ricard D, Oudre L. A topological data analysis-based method for gait signals with an application to the study of multiple sclerosis. PLoS One 2022; 17:e0268475. [PMID: 35560328 PMCID: PMC9106173 DOI: 10.1371/journal.pone.0268475] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2022] [Accepted: 04/30/2022] [Indexed: 11/30/2022]
Abstract
In the past few years, light, affordable wearable inertial measurement units have given clinicians and researchers the possibility to quantitatively study motor degeneracy by comparing gait trials from patients and/or healthy subjects. To do so, standard gait features can be used, but they fail to detect subtle changes in several pathologies, including multiple sclerosis. Multiple sclerosis is a demyelinating disease of the central nervous system whose symptoms include lower limb impairment, which is why gait trials are commonly used by clinicians for their patients’ follow-up. This article describes a method to compare pairs of gait signals, visualize the results and interpret them, based on topological data analysis techniques. Our method is non-parametric and requires no data other than gait signals acquired with inertial measurement units. We introduce tools from topological data analysis (sublevel sets, persistence barcodes) in a practical way to make it as accessible as possible in order to encourage its use by clinicians. We apply our method to study a cohort of patients suffering from progressive multiple sclerosis and healthy subjects. We show that it can help estimate the severity of the disease and also be used for longitudinal follow-up to detect an evolution of the disease or other phenomena such as asymmetry or outliers.
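To make the two tools named in this abstract concrete, the sketch below computes the 0-dimensional sublevel-set persistence barcode of a one-dimensional signal. It is a generic textbook construction (union-find with the elder rule), not the authors' pipeline; the function name and the toy signal are assumptions made for illustration. Each bar (birth, death) records a local minimum that appears at height `birth` and merges into an older component at height `death`.

```python
import math


def sublevel_persistence(signal):
    """Return sorted (birth, death) bars of the sublevel-set filtration of a 1D signal."""
    n = len(signal)
    parent = [-1] * n          # -1 marks samples not yet in the sublevel set
    birth = {}                 # root index -> birth value of its component

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    bars = []
    # Sweep the threshold upward: samples enter in order of increasing value.
    for i in sorted(range(n), key=lambda k: signal[k]):
        parent[i] = i
        birth[i] = signal[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] != -1:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                # Elder rule: the component born later dies at the merge height.
                if birth[ri] < birth[rj]:
                    ri, rj = rj, ri
                if birth[ri] < signal[i]:      # skip zero-length bars
                    bars.append((birth[ri], signal[i]))
                parent[ri] = rj
    # Components that never merge (the global minimum) live forever.
    for root in {find(i) for i in range(n)}:
        bars.append((birth[root], math.inf))
    return sorted(bars)


# Toy signal with three local minima (values 0, 1 and 0.5).
toy_bars = sublevel_persistence([0, 2, 1, 3, 0.5, 4])
print(toy_bars)  # [(0, inf), (0.5, 3), (1, 2)]
```

Long bars correspond to prominent oscillations and short bars to small fluctuations, which is why barcodes of two gait signals can be compared without choosing hand-crafted gait features.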
Affiliation(s)
- Alexandre Bois
- Université Paris-Saclay, ENS Paris-Saclay, CNRS, Centre Borelli, Gif-sur-Yvette, France
- Université de Paris, CNRS, Centre Borelli, Paris, France
- Corresponding author
| | - Brian Tervil
- Université Paris-Saclay, ENS Paris-Saclay, CNRS, Centre Borelli, Gif-sur-Yvette, France
- Université de Paris, CNRS, Centre Borelli, Paris, France
| | - Albane Moreau
- Service de Neurologie, Service de Santé des Armées, Hôpital d’Instruction des Armées Percy, Clamart, France
| | - Aliénor Vienne-Jumeau
- Université Paris-Saclay, ENS Paris-Saclay, CNRS, Centre Borelli, Gif-sur-Yvette, France
- Université de Paris, CNRS, Centre Borelli, Paris, France
- Service de Neurologie, Service de Santé des Armées, Hôpital d’Instruction des Armées Percy, Clamart, France
| | - Damien Ricard
- Université Paris-Saclay, ENS Paris-Saclay, CNRS, Centre Borelli, Gif-sur-Yvette, France
- Université de Paris, CNRS, Centre Borelli, Paris, France
- Service de Neurologie, Service de Santé des Armées, Hôpital d’Instruction des Armées Percy, Clamart, France
- Ecole du Val-de-Grâce, Ecole de Santé des Armées, Paris, France
| | - Laurent Oudre
- Université Paris-Saclay, ENS Paris-Saclay, CNRS, Centre Borelli, Gif-sur-Yvette, France
- Université de Paris, CNRS, Centre Borelli, Paris, France
| |
|
29
|
Lu TP, Kamatani Y, Belbin G, Park T, Hsiao CK. Editorial: Current Status and Future Challenges of Biobank Data Analysis. Front Genet 2022; 13:882611. [PMID: 35495141 PMCID: PMC9047950 DOI: 10.3389/fgene.2022.882611] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2022] [Accepted: 03/24/2022] [Indexed: 11/23/2022]
Affiliation(s)
- Tzu-Pin Lu
- Department of Public Health, College of Public Health, Institute of Epidemiology and Preventive Medicine, National Taiwan University, Taipei, Taiwan
| | - Yoichiro Kamatani
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan
| | - Gillian Belbin
- Institute of Genomic Health, Icahn School of Medicine at Mount Sinai, New York, NY, United States
| | - Taesung Park
- Department of Statistics, Seoul National University, Seoul, South Korea
| | - Chuhsing Kate Hsiao
- Department of Public Health, College of Public Health, Institute of Epidemiology and Preventive Medicine, National Taiwan University, Taipei, Taiwan
- *Correspondence: Chuhsing Kate Hsiao,
|
30
|
Mars N, Kerminen S, Feng YCA, Kanai M, Läll K, Thomas LF, Skogholt AH, della Briotta Parolo P, Neale BM, Smoller JW, Gabrielsen ME, Hveem K, Mägi R, Matsuda K, Okada Y, Pirinen M, Palotie A, Ganna A, Martin AR, Ripatti S. Genome-wide risk prediction of common diseases across ancestries in one million people. CELL GENOMICS 2022; 2:None. [PMID: 35591975 PMCID: PMC9010308 DOI: 10.1016/j.xgen.2022.100118] [Citation(s) in RCA: 26] [Impact Index Per Article: 13.0] [Received: 01/21/2021] [Revised: 08/24/2021] [Accepted: 03/18/2022] [Indexed: 12/14/2022]
Abstract
Polygenic risk scores (PRS) measure genetic disease susceptibility by combining risk effects across the genome. For coronary artery disease (CAD), type 2 diabetes (T2D), and breast and prostate cancer, we performed cross-ancestry evaluation of genome-wide PRSs in six biobanks in Europe, the United States, and Asia. We studied transferability of these highly polygenic, genome-wide PRSs across global ancestries, within European populations with different health-care systems, and local population substructures in a population isolate. All four PRSs had similar accuracy across European and Asian populations, with poorer transferability in the smaller group of individuals of African ancestry. The PRSs had highly similar effect sizes in different populations of European ancestry, and in early- and late-settlement regions with different recent population bottlenecks in Finland. Comparing genome-wide PRSs to PRSs containing a smaller number of variants, the highly polygenic, genome-wide PRSs generally displayed higher effect sizes and better transferability across global ancestries. Our findings indicate that in the populations investigated, the current genome-wide polygenic scores for common diseases have potential for clinical utility within different health-care settings for individuals of European ancestry, but that the utility in individuals of African ancestry is currently much lower.
Affiliation(s)
- Nina Mars
- Institute for Molecular Medicine Finland, FIMM, HiLIFE, University of Helsinki, Biomedicum 2U, Tukholmankatu 8, 00290 Helsinki, Finland
| | - Sini Kerminen
- Institute for Molecular Medicine Finland, FIMM, HiLIFE, University of Helsinki, Biomedicum 2U, Tukholmankatu 8, 00290 Helsinki, Finland
| | - Yen-Chen A. Feng
- Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA,Analytic and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital, Boston, MA, USA,Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA,Institute of Epidemiology and Preventive Medicine, College of Public Health, National Taiwan University, Taipei, Taiwan
| | - Masahiro Kanai
- Analytic and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital, Boston, MA, USA,Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA,Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Kristi Läll
- Estonian Genome Centre, Institute of Genomics, University of Tartu, Tartu, Estonia
| | - Laurent F. Thomas
- Department of Clinical and Molecular Medicine, Norwegian University of Science and Technology, Trondheim, Norway,K. G. Jebsen Center for Genetic Epidemiology, Department of Public Health and Nursing, Faculty of Medicine and Health, Norwegian University of Science and Technology, Trondheim, Norway,BioCore - Bioinformatics Core Facility, Norwegian University of Science and Technology, Trondheim, Norway
| | - Anne Heidi Skogholt
- K. G. Jebsen Center for Genetic Epidemiology, Department of Public Health and Nursing, Faculty of Medicine and Health, Norwegian University of Science and Technology, Trondheim, Norway
| | - Pietro della Briotta Parolo
- Institute for Molecular Medicine Finland, FIMM, HiLIFE, University of Helsinki, Biomedicum 2U, Tukholmankatu 8, 00290 Helsinki, Finland
| | - Benjamin M. Neale
- Analytic and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital, Boston, MA, USA,Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA,Harvard Medical School, Boston, MA, USA
| | - Jordan W. Smoller
- Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA,Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA,Harvard Medical School, Boston, MA, USA
| | - Maiken E. Gabrielsen
- K. G. Jebsen Center for Genetic Epidemiology, Department of Public Health and Nursing, Faculty of Medicine and Health, Norwegian University of Science and Technology, Trondheim, Norway,HUNT Research Center, Department of Public Health and Nursing, Faculty of Medicine and Health Sciences, Norwegian University of Science and Technology, Trondheim, Norway
| | - Kristian Hveem
- K. G. Jebsen Center for Genetic Epidemiology, Department of Public Health and Nursing, Faculty of Medicine and Health, Norwegian University of Science and Technology, Trondheim, Norway
| | - Reedik Mägi
- Estonian Genome Centre, Institute of Genomics, University of Tartu, Tartu, Estonia
| | - Koichi Matsuda
- Department of Computational Biology and Medical Sciences, Graduate school of Frontier Sciences, the University of Tokyo, Tokyo, Japan
| | - Yukinori Okada
- Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita, Japan,Laboratory of Statistical Immunology, Immunology Frontier Research Center (WPI-IFReC), Osaka University, Suita, Japan,Integrated Frontier Research for Medical Science Division, Institute for Open and Transdisciplinary Research Initiatives, Osaka University, Suita, Japan,Laboratory for Systems Genetics, RIKEN Center for Integrative Medical Sciences, Kanagawa, Japan
| | - Matti Pirinen
- Institute for Molecular Medicine Finland, FIMM, HiLIFE, University of Helsinki, Biomedicum 2U, Tukholmankatu 8, 00290 Helsinki, Finland,Department of Public Health, University of Helsinki, Helsinki, Finland,Department of Mathematics and Statistics, University of Helsinki, Helsinki, Finland
| | - Aarno Palotie
- Institute for Molecular Medicine Finland, FIMM, HiLIFE, University of Helsinki, Biomedicum 2U, Tukholmankatu 8, 00290 Helsinki, Finland,Analytic and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital, Boston, MA, USA,Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Andrea Ganna
- Institute for Molecular Medicine Finland, FIMM, HiLIFE, University of Helsinki, Biomedicum 2U, Tukholmankatu 8, 00290 Helsinki, Finland,Analytic and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital, Boston, MA, USA,Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Alicia R. Martin
- Analytic and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital, Boston, MA, USA,Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA,Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Samuli Ripatti
- Institute for Molecular Medicine Finland, FIMM, HiLIFE, University of Helsinki, Biomedicum 2U, Tukholmankatu 8, 00290 Helsinki, Finland,Department of Public Health, University of Helsinki, Helsinki, Finland,Broad Institute of MIT and Harvard, Cambridge, MA, USA,Corresponding author
|
31
|
Ito M, Yamauchi A, Urano M, Kato T, Matsuo M, Nakashima K, Saito K. Epidemiological investigation of spinal muscular atrophy in Japan. Brain Dev 2022; 44:2-16. [PMID: 34452804 DOI: 10.1016/j.braindev.2021.08.002] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 02/12/2021] [Revised: 08/05/2021] [Accepted: 08/05/2021] [Indexed: 11/19/2022]
Abstract
BACKGROUND: International reporting of epidemiological surveys of spinal muscular atrophy (SMA) in Japan has been limited to Shikoku, even as the epidemiology of the disease in countries worldwide has become clearer. Treatments for 5q-SMA have been developed, and epidemiological studies are needed. PURPOSE: This study aimed to conduct a nationwide epidemiological survey to clarify the actual situation of SMA in Japan. METHOD: Patients with all clinical types of SMA, including neonates and adults, were selected from 1,005 medical facilities in Japan. RESULTS: As of December 2017, the actual number of reported patients with SMA was 658 and the genetic testing rate was 79.5%. The estimated number of patients was 1,478 (95% confidence interval (CI), 1,122-1,834), with a prevalence of 1.17 (95% CI, 0.89-1.45) per 100,000 people and an incidence of 0.51 (95% CI, 0.32-0.71) per 10,000 live births. Incidence rates of 5q-SMA by clinical type were 0.27 (95% CI, 0.17-0.38) and 0.08 (95% CI, 0.04-0.11) per 10,000 live births for types 1 and 2, respectively, in cases with a definitive diagnosis by genetic testing. We found that 363 cases (82.7%) occurred before 2 years of age and 88 (20.0%) occurred at 2 months of age or younger. CONCLUSION: This study clarifies the prevalence and incidence of SMA in Japan. As infantile onset accounts for most cases of SMA, newborn screening and subsequent treatment are important to save lives.
Affiliation(s)
- Mayuri Ito
- Institute of Medical Genetics, Tokyo Women's Medical University, Japan
| | - Akemi Yamauchi
- Institute of Medical Genetics, Tokyo Women's Medical University, Japan
| | - Mari Urano
- Institute of Medical Genetics, Tokyo Women's Medical University, Japan
| | - Tamaki Kato
- Institute of Medical Genetics, Tokyo Women's Medical University, Japan
| | - Mari Matsuo
- Institute of Medical Genetics, Tokyo Women's Medical University, Japan
| | - Kenji Nakashima
- National Hospital Organization, Matsue Medical Center, Japan
| | - Kayoko Saito
- Institute of Medical Genetics, Tokyo Women's Medical University, Japan.
|
32
|
Pärna K, Nolte IM, Snieder H, Fischer K, Marnetto D, Pagani L. A Principal Component Informed Approach to Address Polygenic Risk Score Transferability Across European Cohorts. Front Genet 2022; 13:899523. [PMID: 35923706 PMCID: PMC9340200 DOI: 10.3389/fgene.2022.899523] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2022] [Accepted: 05/26/2022] [Indexed: 11/16/2022]
Abstract
One important confounder in genome-wide association studies (GWASs) is population genetic structure, which may generate spurious associations if not properly accounted for. This may ultimately result in a biased polygenic risk score (PRS) prediction, especially when applied to another population. To explore this matter, we focused on principal component analysis (PCA) and asked whether a population genetics informed strategy focused on PCs derived from an external reference population helps in mitigating this PRS transferability issue. Throughout the study, we used two complex model traits, height and body mass index, and samples from UK and Estonian Biobanks. We aimed to investigate 1) whether using a reference population (1000G) for computation of the PCs adjusted for in the discovery cohort improves the resulting PRS performance in a target set from another population and 2) whether adjusting the validation model for PCs is required at all. Our results showed that any other set of PCs performed worse than the one computed on samples from the same population as the discovery dataset. Furthermore, we show that PC correction in GWAS cannot prevent residual population structure information in the PRS, also for non-structured traits. Therefore, we confirm the utility of PC correction in the validation model when the investigated trait shows an actual correlation with population genetic structure, to account for the residual confounding effect when evaluating the predictive value of PRS.
Affiliation(s)
- Katri Pärna
- Institute of Genomics, University of Tartu, Tartu, Estonia.,Department of Epidemiology, University of Groningen, Groningen, Netherlands
| | - Ilja M Nolte
- Department of Epidemiology, University of Groningen, Groningen, Netherlands
| | - Harold Snieder
- Department of Epidemiology, University of Groningen, Groningen, Netherlands
| | - Krista Fischer
- Institute of Genomics, University of Tartu, Tartu, Estonia.,Institute of Mathematics and Statistics, University of Tartu, Tartu, Estonia
| | - Davide Marnetto
- Institute of Genomics, University of Tartu, Tartu, Estonia.,Department of Neurosciences "Rita Levi Montalcini", University of Turin, Torino, Italy
| | - Luca Pagani
- Institute of Genomics, University of Tartu, Tartu, Estonia.,Department of Biology, University of Padova, Padova, Italy
|
33
|
Sohail M, Izarraras-Gomez A, Ortega-Del Vecchyo D. Populations, Traits, and Their Spatial Structure in Humans. Genome Biol Evol 2021; 13:evab272. [PMID: 34894236 PMCID: PMC8715524 DOI: 10.1093/gbe/evab272] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/06/2021] [Indexed: 11/16/2022]
Abstract
The spatial distribution of genetic variants is jointly determined by geography, past demographic processes, natural selection, and its interplay with environmental variation. A fraction of these genetic variants are "causal alleles" that affect the manifestation of a complex trait. The effect exerted by these causal alleles on complex traits can be independent of or dependent on the environment. Understanding the evolutionary processes that shape the spatial structure of causal alleles is key to comprehending the spatial distribution of complex traits. Natural selection, past population size changes, range expansions, consanguinity, assortative mating, archaic introgression, admixture, and the environment can alter the frequencies, effect sizes, and heterozygosities of causal alleles. This provides a genetic axis along which complex traits can vary. However, complex traits also vary along biogeographical and sociocultural axes which are often correlated with genetic axes in complex ways. The purpose of this review is to consider these genetic and environmental axes in concert and examine the ways they can help us decipher the variation in complex traits that is visible in humans today. This initiative necessarily implies a discussion of populations, traits, the ability to infer and interpret "genetic" components of complex traits, and how these have been impacted by adaptive events. In this review, we provide a history-aware discussion on these topics using both the recent and more distant past of our academic discipline and its relevant contexts.
Affiliation(s)
- Mashaal Sohail
- Department of Human Genetics, University of Chicago, USA
- Centro de Ciencias Genómicas (CCG), Universidad Nacional Autónoma de México (UNAM), Cuernavaca, Morelos, México
| | - Alan Izarraras-Gomez
- Laboratorio Internacional de Investigación sobre el Genoma Humano (LIIGH), Universidad Nacional Autónoma de México (UNAM), Juriquilla, Querétaro, México
| | - Diego Ortega-Del Vecchyo
- Laboratorio Internacional de Investigación sobre el Genoma Humano (LIIGH), Universidad Nacional Autónoma de México (UNAM), Juriquilla, Querétaro, México
|
34
|
Responsible use of polygenic risk scores in the clinic: potential benefits, risks and gaps. Nat Med 2021; 27:1876-1884. [PMID: 34782789 DOI: 10.1038/s41591-021-01549-6] [Citation(s) in RCA: 166] [Impact Index Per Article: 55.3] [Received: 04/22/2021] [Accepted: 09/22/2021] [Indexed: 01/24/2023]
Abstract
Polygenic risk scores (PRSs) aggregate the many small effects of alleles across the human genome to estimate the risk of a disease or disease-related trait for an individual. The potential benefits of PRSs include cost-effective enhancement of primary disease prevention, more refined diagnoses and improved precision when prescribing medicines. However, these must be weighed against the potential risks, such as uncertainties and biases in PRS performance, as well as potential misunderstanding and misuse of these within medical practice and in wider society. By addressing key issues including gaps in best practices, risk communication and regulatory frameworks, PRSs can be used responsibly to improve human health. Here, the International Common Disease Alliance's PRS Task Force, a multidisciplinary group comprising expertise in genetics, law, ethics, behavioral science and more, highlights recent research to provide a comprehensive summary of the state of polygenic score research, as well as the needs and challenges as PRSs move closer to widespread use in the clinic.
|
35
|
Ramachandran D, Dörk T. Genomic Risk Factors for Cervical Cancer. Cancers (Basel) 2021; 13:5137. [PMID: 34680286 PMCID: PMC8533931 DOI: 10.3390/cancers13205137] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 09/21/2021] [Revised: 10/04/2021] [Accepted: 10/11/2021] [Indexed: 12/28/2022]
Abstract
Cervical cancer is the fourth most common cancer amongst women worldwide. Infection by high-risk human papilloma virus is necessary in most cases, but not sufficient to develop invasive cervical cancer. Despite a predicted genetic heritability in the range of other gynaecological cancers, only a few genomic susceptibility loci have been identified thus far. Various case-control association studies have found corroborative evidence for several independent risk variants at the 6p21.3 locus (HLA), while many reports of associations with variants outside the HLA region remain to be validated in other cohorts. Here, we review cervical cancer susceptibility variants arising from recent genome-wide association studies and meta-analyses in large cohorts and propose 2q14 (PAX8), 17q12 (GSDMB), and 5p15.33 (CLPTM1L) as consistently replicated non-HLA cervical cancer susceptibility loci. We further discuss the available evidence for these loci, knowledge gaps, future perspectives, and the potential impact of these findings on precision medicine strategies to combat cervical cancer.
Affiliation(s)
| | - Thilo Dörk
- Gynaecology Research Unit, Department of Gynaecology and Obstetrics, Comprehensive Cancer Center, Hannover Medical School, D-30625 Hannover, Germany;
|
36
|
Matsunami M, Koganebuchi K, Imamura M, Ishida H, Kimura R, Maeda S. Fine-Scale Genetic Structure and Demographic History in the Miyako Islands of the Ryukyu Archipelago. Mol Biol Evol 2021; 38:2045-2056. [PMID: 33432348 PMCID: PMC8097307 DOI: 10.1093/molbev/msab005] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 12/21/2022]
Abstract
The Ryukyu Archipelago is located in the southwest of the Japanese islands and is composed of dozens of islands, grouped into the Miyako Islands, Yaeyama Islands, and Okinawa Islands. Based on the results of principal component analysis on genome-wide single-nucleotide polymorphisms, genetic differentiation was observed among the island groups of the Ryukyu Archipelago. However, a detailed population structure analysis of the Ryukyu Archipelago has not yet been completed. We obtained genomic DNA samples from 1,240 individuals living in the Miyako Islands, and we genotyped 665,326 single-nucleotide polymorphisms to infer population history within the Miyako Islands, including Miyakojima, Irabu, and Ikema islands. The haplotype-based analysis showed that populations in the Miyako Islands were divided into three subpopulations located on Miyakojima northeast, Miyakojima southwest, and Irabu/Ikema. The results of haplotype sharing and the D statistics analyses showed that the Irabu/Ikema subpopulation received gene flows different from those of the Miyakojima subpopulations, which may be related to the historically attested immigration during the Gusuku period (900 − 500 BP). A coalescent-based demographic inference suggests that the Irabu/Ikema population first split away from the ancestral Ryukyu population about 41 generations ago, followed by a split of the Miyako southwest population from the ancestral Ryukyu population (about 16 generations ago), and the differentiation of the ancestral Ryukyu population into two populations (Miyako northeast and Okinawajima populations) about seven generations ago. Such genetic information is useful for explaining the population history of modern Miyako people and must be taken into account when performing disease association studies.
Affiliation(s)
- Masatoshi Matsunami
- Department of Advanced Genomic and Laboratory Medicine, Graduate School of Medicine, University of the Ryukyus, Nishihara-Cho, Japan
| | - Kae Koganebuchi
- Advanced Medical Research Center, Faculty of Medicine, University of the Ryukyus, Nishihara-Cho, Japan
| | - Minako Imamura
- Department of Advanced Genomic and Laboratory Medicine, Graduate School of Medicine, University of the Ryukyus, Nishihara-Cho, Japan.,Division of Clinical Laboratory and Blood Transfusion, University of the Ryukyus Hospital, Nishihara-Cho, Japan
| | - Hajime Ishida
- Department of Human Biology and Anatomy, Graduate School of Medicine, University of the Ryukyus, Nishihara-Cho, Japan
| | - Ryosuke Kimura
- Department of Human Biology and Anatomy, Graduate School of Medicine, University of the Ryukyus, Nishihara-Cho, Japan
| | - Shiro Maeda
- Department of Advanced Genomic and Laboratory Medicine, Graduate School of Medicine, University of the Ryukyus, Nishihara-Cho, Japan.,Division of Clinical Laboratory and Blood Transfusion, University of the Ryukyus Hospital, Nishihara-Cho, Japan
|
37
|
Yang Y, Sun H, Zhang Y, Zhang T, Gong J, Wei Y, Duan YG, Shu M, Yang Y, Wu D, Yu D. Dimensionality reduction by UMAP reinforces sample heterogeneity analysis in bulk transcriptomic data. Cell Rep 2021; 36:109442. [PMID: 34320340 DOI: 10.1016/j.celrep.2021.109442] [Citation(s) in RCA: 47] [Impact Index Per Article: 15.7] [Received: 03/13/2021] [Revised: 06/01/2021] [Accepted: 07/01/2021] [Indexed: 12/13/2022]
Abstract
Transcriptomic analysis plays a key role in biomedical research. Linear dimensionality reduction methods, especially principal-component analysis (PCA), are widely used in detecting sample-to-sample heterogeneity, while recently developed non-linear methods, such as t-distributed stochastic neighbor embedding (t-SNE) and uniform manifold approximation and projection (UMAP), can efficiently cluster heterogeneous samples in single-cell RNA sequencing analysis. Yet the application of t-SNE and UMAP to bulk transcriptomic analysis, and their comparison with conventional methods, have not been reported. We compare four major dimensionality reduction methods (PCA, multidimensional scaling [MDS], t-SNE, and UMAP) in analyzing 71 large bulk transcriptomic datasets. UMAP is superior to PCA and MDS, and shows some advantages over t-SNE, in differentiating batch effects, identifying pre-defined biological groups, and revealing in-depth clusters in two-dimensional space. Importantly, UMAP generates sample clusters uncovering biological features and clinical meaning. We recommend deploying UMAP in visualizing and analyzing sizable bulk transcriptomic datasets to reinforce sample heterogeneity analysis.
Affiliation(s)
- Yang Yang
- The University of Queensland Diamantina Institute, Faculty of Medicine, The University of Queensland, Translational Research Institute, Brisbane, QLD, Australia; Shandong Artificial Intelligence Institute, Qilu University of Technology (Shandong Academy of Sciences), Jinan, China
| | - Hongjian Sun
- Shandong Artificial Intelligence Institute, Qilu University of Technology (Shandong Academy of Sciences), Jinan, China; School of Microelectronics, Shandong University, Jinan, China
| | - Yu Zhang
- Laboratory of Immunology for Environment and Health, School of Pharmaceutical Sciences, Qilu University of Technology (Shandong Academy of Sciences), Jinan, China
| | - Tiefu Zhang
- University of Electronic Science and Technology of China, Chengdu, China
| | - Jialei Gong
- Shenzhen Key Laboratory of Fertility Regulation, Center of Assisted Reproduction and Embryology, University of Hong Kong, Shenzhen Hospital, Shenzhen, China
| | - Yunbo Wei
- Laboratory of Immunology for Environment and Health, School of Pharmaceutical Sciences, Qilu University of Technology (Shandong Academy of Sciences), Jinan, China
| | - Yong-Gang Duan
- Shenzhen Key Laboratory of Fertility Regulation, Center of Assisted Reproduction and Embryology, University of Hong Kong, Shenzhen Hospital, Shenzhen, China
| | - Minglei Shu
- Shandong Artificial Intelligence Institute, Qilu University of Technology (Shandong Academy of Sciences), Jinan, China
| | - Yuchen Yang
- Department of Pathology and Laboratory Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA; McAllister Heart Institute, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Di Wu
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA; Division of Oral and Craniofacial Health Science, Adams School of Dentistry, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA; Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC, USA.
| | - Di Yu
- The University of Queensland Diamantina Institute, Faculty of Medicine, The University of Queensland, Translational Research Institute, Brisbane, QLD, Australia; Shandong Artificial Intelligence Institute, Qilu University of Technology (Shandong Academy of Sciences), Jinan, China; Laboratory of Immunology for Environment and Health, School of Pharmaceutical Sciences, Qilu University of Technology (Shandong Academy of Sciences), Jinan, China.
|
38
|
Konuma T, Okada Y. Statistical genetics and polygenic risk score for precision medicine. Inflamm Regen 2021; 41:18. [PMID: 34140035 PMCID: PMC8212479 DOI: 10.1186/s41232-021-00172-9] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 04/28/2021] [Accepted: 06/09/2021] [Indexed: 12/27/2022]
Abstract
The prediction of disease risks is an essential part of personalized medicine, which includes early disease detection, prevention, and intervention. The polygenic risk score (PRS) has become the standard for quantifying genetic liability in predicting disease risks. PRS utilizes single-nucleotide polymorphisms (SNPs) with genetic risks elucidated by genome-wide association studies (GWASs) and is calculated as weighted sum scores of these SNPs with genetic risks using their effect sizes from GWASs as their weights. The utilities of PRS have been explored in many common diseases, such as cancer, coronary artery disease, obesity, and diabetes, and in various non-disease traits, such as clinical biomarkers. These applications demonstrated that PRS could identify a high-risk subgroup of these diseases as a predictive biomarker and provide information on modifiable risk factors driving health outcomes. On the other hand, there are several limitations to implementing PRSs in clinical practice, such as biased sensitivity for the ethnic background of PRS calculation and geographical differences even in the same population groups. Also, it remains unclear which method is the most suitable for the prediction with high accuracy among numerous PRS methods developed so far. Although further improvements of its comprehensiveness and generalizability will be needed for its clinical implementation in the future, PRS will be a powerful tool for therapeutic interventions and lifestyle recommendations in a wide range of diseases. Thus, it may ultimately improve the health of an entire population in the future.
Affiliation(s)
- Takahiro Konuma
- Department of Statistical Genetics, Osaka University Graduate School of Medicine, 2-2 Yamadaoka, Suita, 565-0871, Japan.,Central Pharmaceutical Research Institute, Japan Tobacco Inc., Takatsuki, 569-1125, Japan
| | - Yukinori Okada
- Department of Statistical Genetics, Osaka University Graduate School of Medicine, 2-2 Yamadaoka, Suita, 565-0871, Japan. .,Laboratory of Statistical Immunology, Immunology Frontier Research Center (WPI-IFReC), Osaka University, Suita, 565-0871, Japan. .,Integrated Frontier Research for Medical Science Division, Institute for Open and Transdisciplinary Research Initiatives, Osaka University, Suita, 565-0871, Japan.
|
39
|
Tills O, Spicer JI, Ibbini Z, Rundle SD. Spectral phenotyping of embryonic development reveals integrative thermodynamic responses. BMC Bioinformatics 2021; 22:232. [PMID: 33957860 PMCID: PMC8101172 DOI: 10.1186/s12859-021-04152-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2021] [Accepted: 04/21/2021] [Indexed: 11/26/2022]
Abstract
Background: Energy proxy traits (EPTs) are a novel approach to high dimensional organismal phenotyping that quantifies the spectrum of energy levels within different temporal frequencies associated with mean pixel value fluctuations from video. They offer significant potential in addressing the phenotyping bottleneck in biology and are effective at identifying lethal endpoints and measuring specific functional traits, but the extent to which they might contribute additional understanding of the phenotype remains unknown. Consequently, here we test the biological significance of EPTs and their responses relative to fundamental thermodynamic principles. We achieve this using the entire embryonic development of Radix balthica, a freshwater pond snail, at different temperatures (20, 25 & 30 °C) and comparing responses against predictions from Arrhenius’ equation (Q10 = 2). Results: We find that EPTs are thermally sensitive and their spectra of frequency response enable effective high-dimensional treatment clustering throughout organismal development. Temperature-specific deviations in EPTs from thermodynamic predictions were evident and indicative of physiological mitigation, although they differed markedly in their responses from manual measures. The EPT spectrum was effective in capturing aspects of the phenotype predictive of biological outcomes, suggesting that EPTs themselves may reflect levels of energy turnover. Conclusions: Whole-organismal biology is incredibly complex, and this contributes to the challenge of developing universal phenotyping approaches. Here, we demonstrate the biological relevance of a new holistic approach to phenotyping that is not constrained by preconceived notions of biological importance. Furthermore, we find that EPTs are an effective approach to measuring even the most dynamic life history stages. Supplementary information: The online version contains supplementary material available at 10.1186/s12859-021-04152-1.
Affiliation(s)
- Oliver Tills: Marine Biology and Ecology Research Centre, School of Biological and Marine Sciences, University of Plymouth, Drake Circus, Plymouth, PL4 8AA, Devon, UK
- John I Spicer: Marine Biology and Ecology Research Centre, School of Biological and Marine Sciences, University of Plymouth, Drake Circus, Plymouth, PL4 8AA, Devon, UK
- Ziad Ibbini: Marine Biology and Ecology Research Centre, School of Biological and Marine Sciences, University of Plymouth, Drake Circus, Plymouth, PL4 8AA, Devon, UK
- Simon D Rundle: Marine Biology and Ecology Research Centre, School of Biological and Marine Sciences, University of Plymouth, Drake Circus, Plymouth, PL4 8AA, Devon, UK
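The Tills et al. abstract above compares responses at 20, 25 and 30 °C against a thermodynamic prediction with Q10 = 2. As a rough illustration of what such a prediction looks like, the sketch below scales a rate with the standard Q10 temperature coefficient. It is not the authors' code: the function name, the 20 °C reference temperature, and the reference rate of 1.0 are assumptions made for the example.

```python
def q10_scaled_rate(rate_ref, temp_ref_c, temp_c, q10=2.0):
    """Scale a rate measured at temp_ref_c to temp_c using the Q10 coefficient.

    Q10 is the factor by which the rate changes for every 10 degree C increase.
    """
    return rate_ref * q10 ** ((temp_c - temp_ref_c) / 10.0)


if __name__ == "__main__":
    # Assumed reference: a rate of 1.0 (arbitrary units) at 20 degrees C.
    for temp in (20, 25, 30):
        predicted = q10_scaled_rate(1.0, 20, temp)
        print(f"{temp} C: predicted relative rate = {predicted:.3f}")
```

Under these assumptions the predicted rate doubles between 20 and 30 °C, so an observed response that rises by less than this would be read as a deviation from the thermodynamic expectation.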
|
40
|
Vermeulen M, Smith K, Eremin K, Rayner G, Walton M. Application of Uniform Manifold Approximation and Projection (UMAP) in spectral imaging of artworks. Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy 2021; 252:119547. [PMID: 33588368 DOI: 10.1016/j.saa.2021.119547] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 11/18/2020] [Revised: 01/22/2021] [Accepted: 01/24/2021] [Indexed: 05/20/2023]
Abstract
This study assesses the potential of Uniform Manifold Approximation and Projection (UMAP) as an alternative tool to t-distributed Stochastic Neighbor Embedding (t-SNE) for the reduction and visualization of visible spectral images of works of art. We investigate the influence of UMAP parameters, such as correlation distance, minimum embedding distance, and number of embedding neighbors, on the reduction and visualization of spectral images collected from Poèmes Barbares (1896), a major work by the French artist Paul Gauguin in the collection of the Harvard Art Museums. The use of a cosine distance metric and a number of neighbors equal to 10 preserves both the local and global structure of the Gauguin dataset in a reduced two-dimensional embedding space, thus yielding simple and clear groupings of the pigments used by the artist. The centroids of these groups were identified by locating the densest regions within the UMAP embedding through a 2D histogram peak-finding algorithm. These centroids were subsequently fit to the dataset by non-negative least squares, thus forming maps of pigments distributed across the work of art studied. All findings were correlated to macro XRF imaging analyses carried out on the same painting. The described procedure for reduction and visualization of spectral images of a work of art is quick and easy to implement, and the software is open source, thus promising an improved strategy for interrogating reflectance images from complex works of art.
Affiliation(s)
- Marc Vermeulen: Northwestern University / Art Institute of Chicago Center for Scientific Studies in the Arts (NU-ACCESS), 2145 Sheridan Road, Evanston, IL, United States
- Kate Smith: Harvard Art Museums, Straus Center for Conservation and Technical Studies, 32 Quincy St, Cambridge, MA, United States
- Katherine Eremin: Harvard Art Museums, Straus Center for Conservation and Technical Studies, 32 Quincy St, Cambridge, MA, United States
- Georgina Rayner: Harvard Art Museums, Straus Center for Conservation and Technical Studies, 32 Quincy St, Cambridge, MA, United States
- Marc Walton: Northwestern University / Art Institute of Chicago Center for Scientific Studies in the Arts (NU-ACCESS), 2145 Sheridan Road, Evanston, IL, United States
|
41
|
Geographic variation in the polygenic score of height in Japan. Hum Genet 2021; 140:1097-1108. [PMID: 33900438 DOI: 10.1007/s00439-021-02281-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/15/2020] [Accepted: 04/12/2021] [Indexed: 10/21/2022]
Abstract
A geographical gradient of height has existed in Japan for approximately 100 years. People in northern Japan tend to be taller than those in southern Japan. The differences in annual temperature and day length between the northern and southern prefectures of Japan have been suggested as possible causes of the height gradient. Although height is well known to be a polygenic trait with high heritability, the genetic contributions to the gradient have not yet been explored. A polygenic score (PS) is calculated by aggregating the effects of genetic variants identified by genome-wide association studies (GWASs) to predict the traits of individual subjects. Here, we calculated the PS of height for 10,840 Japanese individuals from all 47 prefectures in Japan. The median height PS for each prefecture was significantly correlated with the mean height of females and males obtained from another independent Japanese nationwide height dataset, suggesting a genetic contribution to the observed height gradient. We also found that individuals and prefectures genetically closer to continental East Asian ancestry tended to have a higher PS; modern Japanese people are considered to have originated as a result of admixture between indigenous Jomon people and immigrants from continental East Asia. Another PS analysis, based on a GWAS using only mainland Japanese individuals, was conducted to evaluate the effect of population stratification on the PS. The result also supported a genetic contribution to height, and indicated that the PS might be affected by a bias due to population stratification even in a relatively homogeneous population like the Japanese.
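The abstract above describes a polygenic score as an aggregation of GWAS variant effects per individual, summarised as a median per prefecture. A minimal sketch of that idea follows. It is an illustration only, not the study's pipeline: the variant identifiers, effect sizes, group labels, and the choice to treat a missing genotype as a dosage of 0 are all assumptions made for the example.

```python
from statistics import median

# Hypothetical per-variant effect sizes (betas), standing in for GWAS summary statistics.
EFFECT_SIZES = {"variant_a": 0.5, "variant_b": -0.25, "variant_c": 0.125}


def polygenic_score(dosages, effect_sizes=EFFECT_SIZES):
    """Sum of effect size times allele dosage (0, 1 or 2) over all scored variants.

    Simplifying assumption: a variant missing from `dosages` contributes 0.
    """
    return sum(beta * dosages.get(variant, 0) for variant, beta in effect_sizes.items())


def median_score_by_group(records):
    """Group (label, dosages) records by label and return the median score per label."""
    scores = {}
    for group, dosages in records:
        scores.setdefault(group, []).append(polygenic_score(dosages))
    return {group: median(values) for group, values in scores.items()}


if __name__ == "__main__":
    # Made-up individuals with made-up group labels.
    individuals = [
        ("north", {"variant_a": 2, "variant_b": 0, "variant_c": 1}),
        ("north", {"variant_a": 1, "variant_b": 1, "variant_c": 0}),
        ("south", {"variant_a": 0, "variant_b": 2, "variant_c": 2}),
    ]
    print(median_score_by_group(individuals))
```

Real analyses typically also handle strand alignment, linkage disequilibrium, and ancestry adjustment, none of which is attempted here.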
|
42
|
Osada N, Kawai Y. Exploring models of human migration to the Japanese archipelago using genome-wide genetic data. Anthropol Sci 2021. [DOI: 10.1537/ase.201215] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/01/2022]
Affiliation(s)
- Naoki Osada: Faculty of Information Science and Technology, Hokkaido University, Sapporo
- Yosuke Kawai: Genome Medical Science Project, National Center for Global Health and Medicine, Tokyo
|
43
|
Kamatani Y, Nakamura Y. Genetic variations in medical research in the past, at present and in the future. Proceedings of the Japan Academy, Series B, Physical and Biological Sciences 2021; 97:324-335. [PMID: 34121043 PMCID: PMC8403528 DOI: 10.2183/pjab.97.018] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/26/2021] [Accepted: 04/14/2021] [Indexed: 06/12/2023]
Abstract
Just as we look different from one another, our genomic sequences vary enormously. These differences in our genomes, known as genetic variations, have played very significant roles in medical research and have contributed to the improvement of medical management over the last two to three decades. Genetic variations include germline variations, somatic mutations, and diversity in the receptor genes of rearranged immune cells (T cells and B cells). Germline variants are in some cases causative of genetic diseases, are associated with the risk of various diseases, and also affect drug efficacy or adverse events. Some somatic mutations are causative of tumor development. Recent DNA sequencing technologies allow us to perform single-cell analysis or detailed repertoire analysis of B and T cells. It is critically important to investigate temporal changes in the immune environment in various anatomical regions over the next one to two decades. In this review article, we introduce the roles of genetic variations in medical fields in the past, at present and in the future.
Affiliation(s)
- Yoichiro Kamatani: Laboratory of Complex Trait Genomics, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan
- Yusuke Nakamura: Cancer Precision Medicine Center, Japanese Foundation for Cancer Research, Tokyo, Japan
|
44
|
Systematic Identification of Key Functional Modules and Genes in Gastric Cancer. BioMed Research International 2020; 2020:8853348. [PMID: 33282955 PMCID: PMC7685902 DOI: 10.1155/2020/8853348] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/03/2020] [Revised: 10/14/2020] [Accepted: 10/28/2020] [Indexed: 11/24/2022]
Abstract
Gastric cancer (GC) is associated with high incidence and mortality rates worldwide. Differentially expressed gene (DEG) analysis and weighted gene coexpression network analysis (WGCNA) are important bioinformatic methods for screening core genes. In our study, DEG analysis and WGCNA were combined to screen the hub genes, and pathway enrichment analyses were performed on the DEGs. SBNO2 was identified as the hub gene based on the intersection between the DEGs and the purple module in WGCNA. The expression and prognostic value of SBNO2 were verified in UALCAN, GEPIA2, Human Cancer Metastasis Database, Kaplan–Meier plotter, and TIMER. We identified 1974 DEGs, and 28 modules were uncovered via WGCNA. The purple module was identified as the hub module in WGCNA. SBNO2 was identified as the hub gene, which was upregulated in tumour tissues. Moreover, patients with GC and higher SBNO2 expression had worse prognoses. In addition, SBNO2 was suggested to play an important role in immune cell infiltration. In summary, based on DEGs and key modules related to GC, we identified SBNO2 as a hub gene, thereby offering novel insights into the development and treatment of GC.
|
45
|
López-Cortés XA, Matamala F, Maldonado C, Mora-Poblete F, Scapim CA. A Deep Learning Approach to Population Structure Inference in Inbred Lines of Maize. Front Genet 2020; 11:543459. [PMID: 33329691 PMCID: PMC7732446 DOI: 10.3389/fgene.2020.543459] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 03/17/2020] [Accepted: 10/19/2020] [Indexed: 11/16/2022]
Abstract
Analysis of population genetic variation and structure is a common practice for genome-wide studies, including association mapping, ecology, and evolution studies in several crop species. In this study, two machine learning (ML) clustering methods, K-means (KM) and hierarchical clustering (HC), were combined with non-linear and linear dimensionality reduction techniques, deep autoencoder (DeepAE) and principal component analysis (PCA) respectively, to infer population structure and individual assignment of maize inbred lines, i.e., dent field corn (n = 97) and popcorn (n = 86). The results revealed that the HC method in combination with DeepAE-based data preprocessing (DeepAE-HC) was the most effective method for assigning individuals to clusters (96% correct individual assignments), whereas DeepAE-KM, PCA-HC, and PCA-KM correctly assigned 92, 89, and 81% of the lines, respectively. These findings were consistent with both the Silhouette Coefficient (SC) and Davies-Bouldin validation indexes. Notably, DeepAE-HC also had better accuracy than the Bayesian clustering method implemented in InStruct. The results of this study showed that deep learning (DL)-based dimensionality reduction combined with ML clustering methods is a useful tool to determine genetically differentiated groups and to assign individuals to subpopulations in genome-wide studies without having to consider previous genetic assumptions.
Affiliation(s)
- Felipe Matamala: Department of Computer Sciences and Industries, Catholic University of the Maule, Talca, Chile
- Carlos Maldonado: Instituto de Ciencias Agroalimentarias, Animales y Ambientales, Universidad de O’Higgins, San Fernando, Chile
|
46
|
Diaz-Papkovich A, Anderson-Trocmé L, Gravel S. A review of UMAP in population genetics. J Hum Genet 2020; 66:85-91. [PMID: 33057159 DOI: 10.1038/s10038-020-00851-4] [Citation(s) in RCA: 42] [Impact Index Per Article: 10.5] [Received: 08/13/2020] [Revised: 09/10/2020] [Accepted: 09/14/2020] [Indexed: 01/25/2023]
Abstract
Uniform manifold approximation and projection (UMAP) has been rapidly adopted by the population genetics community to study population structure. It has become common in visualizing the ancestral composition of human genetic datasets, as well as searching for unique clusters of data, and for identifying geographic patterns. Here we give an overview of applications of UMAP in population genetics, provide recommendations for best practices, and offer insights on optimal uses for the technique.
Affiliation(s)
- Alex Diaz-Papkovich: Quantitative Life Sciences Program, McGill University, Montreal, QC, Canada; Department of Human Genetics, McGill University, Montreal, QC, Canada
- Simon Gravel: Department of Human Genetics, McGill University, Montreal, QC, Canada
|
47
|
Narita A, Ueki M, Tamiya G. Artificial intelligence powered statistical genetics in biobanks. J Hum Genet 2020; 66:61-65. [PMID: 32782383 DOI: 10.1038/s10038-020-0822-y] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 06/22/2020] [Revised: 07/15/2020] [Accepted: 07/26/2020] [Indexed: 12/19/2022]
Abstract
Large-scale, sometimes nationwide, prospective genomic cohorts that biobank rich biological specimens such as blood, urine and tissues have been established in several countries and have released vast amounts of data. These genetic and epidemiological resources are expected to allow investigators to disentangle the genetic and environmental components that confer common complex diseases. There are, however, two major challenges to statistical genetics for this goal: small sample size with high dimensionality, and multilayered, heterogeneous endophenotypes. Rather counterintuitively, biobank data generally have a small sample size relative to their data dimensionality, which consists of genomic variation, lifestyle questionnaires, and sometimes their interactions. This is a widely acknowledged difficulty in data analysis, the so-called "p ≫ n problem" in statistics or "curse of dimensionality" in the machine-learning field. On the other hand, we have too many measurements of individual health status, which are endophenotypes, such as health check-up data, images, and psychological test scores, in addition to metabolomics and proteomics data. These endophenotypes are rich but not very tractable because they worsen the dimensionality problem and are substantially correlated, sometimes with confusing causal relationships among them. We have tried to overcome the problems inherent to biobank data using statistical machine-learning and deep-learning technologies.
Affiliation(s)
- Akira Narita: Tohoku Medical Megabank Organization, Tohoku University, Sendai, Japan
- Masao Ueki: RIKEN Center for Advanced Intelligence Project, Tokyo, Japan
- Gen Tamiya: Tohoku Medical Megabank Organization, Tohoku University, Sendai, Japan; RIKEN Center for Advanced Intelligence Project, Tokyo, Japan
|