1
Claudio Quiros A, Coudray N, Yeaton A, Yang X, Liu B, Le H, Chiriboga L, Karimkhan A, Narula N, Moore DA, Park CY, Pass H, Moreira AL, Le Quesne J, Tsirigos A, Yuan K. Mapping the landscape of histomorphological cancer phenotypes using self-supervised learning on unannotated pathology slides. Nat Commun 2024; 15:4596. [PMID: 38862472 DOI: 10.1038/s41467-024-48666-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Accepted: 05/08/2024] [Indexed: 06/13/2024] Open
Abstract
Cancer diagnosis and management depend upon the extraction of complex information from microscopy images by pathologists, which requires time-consuming expert interpretation prone to human bias. Supervised deep learning approaches have proven powerful, but are inherently limited by the cost and quality of annotations used for training. Therefore, we present Histomorphological Phenotype Learning, a self-supervised methodology requiring no labels and operating via the automatic discovery of discriminatory features in image tiles. Tiles are grouped into morphologically similar clusters which constitute an atlas of histomorphological phenotypes (HP-Atlas), revealing trajectories from benign to malignant tissue via inflammatory and reactive phenotypes. These clusters have distinct features which can be identified using orthogonal methods, linking histologic, molecular and clinical phenotypes. Applied to lung cancer, we show that they align closely with patient survival, with histopathologically recognised tumor types and growth patterns, and with transcriptomic measures of immunophenotype. These properties are maintained in a multi-cancer study.
Affiliation(s)
- Adalberto Claudio Quiros
  - School of Computing Science, University of Glasgow, Glasgow, Scotland, UK
  - School of Cancer Sciences, University of Glasgow, Glasgow, Scotland, UK
- Nicolas Coudray
  - Applied Bioinformatics Laboratories, NYU Grossman School of Medicine, New York, NY, USA
  - Department of Cell Biology, NYU Grossman School of Medicine, New York, NY, USA
  - Department of Medicine, Division of Precision Medicine, NYU Grossman School of Medicine, New York, USA
- Anna Yeaton
  - Department of Pathology, NYU Grossman School of Medicine, New York, NY, USA
- Xinyu Yang
  - School of Computing Science, University of Glasgow, Glasgow, Scotland, UK
- Bojing Liu
  - Department of Pathology, NYU Grossman School of Medicine, New York, NY, USA
  - Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Solna, Sweden
- Hortense Le
  - Department of Medicine, Division of Precision Medicine, NYU Grossman School of Medicine, New York, USA
  - Department of Pathology, NYU Grossman School of Medicine, New York, NY, USA
- Luis Chiriboga
  - Department of Pathology, NYU Grossman School of Medicine, New York, NY, USA
- Afreen Karimkhan
  - Department of Pathology, NYU Grossman School of Medicine, New York, NY, USA
- Navneet Narula
  - Department of Pathology, NYU Grossman School of Medicine, New York, NY, USA
- David A Moore
  - Department of Cellular Pathology, University College London Hospital, London, UK
  - Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
- Christopher Y Park
  - Department of Medicine, Division of Precision Medicine, NYU Grossman School of Medicine, New York, USA
- Harvey Pass
  - Department of Cardiothoracic Surgery, NYU Grossman School of Medicine, New York, NY, USA
- Andre L Moreira
  - Department of Pathology, NYU Grossman School of Medicine, New York, NY, USA
- John Le Quesne
  - School of Cancer Sciences, University of Glasgow, Glasgow, Scotland, UK
  - Cancer Research UK Scotland Institute, Glasgow, Scotland, UK
  - Queen Elizabeth University Hospital, Greater Glasgow and Clyde NHS Trust, Glasgow, Scotland, UK
- Aristotelis Tsirigos
  - Applied Bioinformatics Laboratories, NYU Grossman School of Medicine, New York, NY, USA
  - Department of Medicine, Division of Precision Medicine, NYU Grossman School of Medicine, New York, USA
  - Department of Pathology, NYU Grossman School of Medicine, New York, NY, USA
- Ke Yuan
  - School of Computing Science, University of Glasgow, Glasgow, Scotland, UK
  - School of Cancer Sciences, University of Glasgow, Glasgow, Scotland, UK
  - Cancer Research UK Scotland Institute, Glasgow, Scotland, UK
2
van der Zalm AP, Dings MPG, Manoukian P, Boersma H, Janssen R, Bailey P, Koster J, Zwijnenburg D, Volckmann R, Bootsma S, Waasdorp C, van Mourik M, Blangé D, van den Ende T, Oyarce CI, Derks S, Creemers A, Ebbing EA, Hooijer GK, Meijer SL, van Berge Henegouwen MI, Medema JP, van Laarhoven HWM, Bijlsma MF. The pluripotency factor NANOG contributes to mesenchymal plasticity and is predictive for outcome in esophageal adenocarcinoma. Communications Medicine 2024; 4:89. [PMID: 38760583 PMCID: PMC11101480 DOI: 10.1038/s43856-024-00512-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/01/2023] [Accepted: 04/25/2024] [Indexed: 05/19/2024] Open
Abstract
BACKGROUND Despite the advent of neoadjuvant chemoradiotherapy (CRT), overall survival rates of esophageal adenocarcinoma (EAC) remain low. A readily induced mesenchymal transition of EAC cells contributes to resistance to CRT. METHODS In this study, we aimed to chart the heterogeneity in cell state transition after CRT and to identify its underpinnings. A panel of 12 esophageal cultures were treated with CRT and ranked by their relative epithelial-mesenchymal plasticity. RNA-sequencing was performed on 100 pre-treatment biopsies. After RNA-sequencing, Ridge regression analysis was applied to correlate gene expression to ranked plasticity, and models were developed to predict mesenchymal transitions in patients. Plasticity score predictions of the three highest significant predictive models were projected on the pre-treatment biopsies and related to clinical outcome data. Motif enrichment analysis of the genes associated with all three models was performed. RESULTS This study reveals NANOG as the key associated transcription factor predicting mesenchymal plasticity in EAC. Expression of NANOG in pre-treatment biopsies is highly associated with poor response to neoadjuvant chemoradiation, the occurrence of recurrences, and median overall survival difference in EAC patients (>48 months). Perturbation of NANOG reduces plasticity and resensitizes cell lines, organoid cultures, and patient-derived in vivo grafts. CONCLUSIONS In conclusion, NANOG is a key transcription factor in mesenchymal plasticity in EAC and a promising predictive marker for outcome.
Affiliation(s)
- Amber P van der Zalm
  - Amsterdam UMC location University of Amsterdam, Center for Experimental and Molecular Medicine, Laboratory of Experimental Oncology and Radiobiology, Amsterdam, The Netherlands
  - Cancer Center Amsterdam, Cancer Biology, Amsterdam, The Netherlands
  - Amsterdam UMC location University of Amsterdam, Department of Medical Oncology, Amsterdam, the Netherlands
- Mark P G Dings
  - Amsterdam UMC location University of Amsterdam, Center for Experimental and Molecular Medicine, Laboratory of Experimental Oncology and Radiobiology, Amsterdam, The Netherlands
  - Amsterdam UMC location University of Amsterdam, Department of Medical Oncology, Amsterdam, the Netherlands
  - Oncode Institute, Amsterdam, Netherlands
- Paul Manoukian
  - Amsterdam UMC location University of Amsterdam, Center for Experimental and Molecular Medicine, Laboratory of Experimental Oncology and Radiobiology, Amsterdam, The Netherlands
  - Amsterdam UMC location University of Amsterdam, Department of Medical Oncology, Amsterdam, the Netherlands
  - Oncode Institute, Amsterdam, Netherlands
- Hannah Boersma
  - Amsterdam UMC location University of Amsterdam, Center for Experimental and Molecular Medicine, Laboratory of Experimental Oncology and Radiobiology, Amsterdam, The Netherlands
- Reimer Janssen
  - Amsterdam UMC location University of Amsterdam, Center for Experimental and Molecular Medicine, Laboratory of Experimental Oncology and Radiobiology, Amsterdam, The Netherlands
- Peter Bailey
  - School of Cancer Sciences, University of Glasgow, Glasgow, UK
- Jan Koster
  - Amsterdam UMC location University of Amsterdam, Center for Experimental and Molecular Medicine, Laboratory of Experimental Oncology and Radiobiology, Amsterdam, The Netherlands
  - Cancer Center Amsterdam, Cancer Biology, Amsterdam, The Netherlands
- Danny Zwijnenburg
  - Amsterdam UMC location University of Amsterdam, Center for Experimental and Molecular Medicine, Laboratory of Experimental Oncology and Radiobiology, Amsterdam, The Netherlands
  - Cancer Center Amsterdam, Cancer Biology, Amsterdam, The Netherlands
- Richard Volckmann
  - Amsterdam UMC location University of Amsterdam, Center for Experimental and Molecular Medicine, Laboratory of Experimental Oncology and Radiobiology, Amsterdam, The Netherlands
  - Cancer Center Amsterdam, Cancer Biology, Amsterdam, The Netherlands
- Sanne Bootsma
  - Amsterdam UMC location University of Amsterdam, Center for Experimental and Molecular Medicine, Laboratory of Experimental Oncology and Radiobiology, Amsterdam, The Netherlands
  - Amsterdam UMC location University of Amsterdam, Department of Medical Oncology, Amsterdam, the Netherlands
  - Oncode Institute, Amsterdam, Netherlands
- Cynthia Waasdorp
  - Amsterdam UMC location University of Amsterdam, Center for Experimental and Molecular Medicine, Laboratory of Experimental Oncology and Radiobiology, Amsterdam, The Netherlands
  - Amsterdam UMC location University of Amsterdam, Department of Medical Oncology, Amsterdam, the Netherlands
  - Oncode Institute, Amsterdam, Netherlands
- Monique van Mourik
  - Cancer Center Amsterdam, Cancer Biology, Amsterdam, The Netherlands
  - Amsterdam UMC location University of Amsterdam, Department of Medical Oncology, Amsterdam, the Netherlands
- Dionne Blangé
  - Amsterdam UMC location University of Amsterdam, Center for Experimental and Molecular Medicine, Laboratory of Experimental Oncology and Radiobiology, Amsterdam, The Netherlands
  - Cancer Center Amsterdam, Cancer Biology, Amsterdam, The Netherlands
  - Amsterdam UMC location University of Amsterdam, Department of Medical Oncology, Amsterdam, the Netherlands
- Tom van den Ende
  - Amsterdam UMC location University of Amsterdam, Center for Experimental and Molecular Medicine, Laboratory of Experimental Oncology and Radiobiology, Amsterdam, The Netherlands
  - Cancer Center Amsterdam, Cancer Biology, Amsterdam, The Netherlands
  - Amsterdam UMC location University of Amsterdam, Department of Medical Oncology, Amsterdam, the Netherlands
- César I Oyarce
  - Amsterdam UMC location University of Amsterdam, Center for Experimental and Molecular Medicine, Laboratory of Experimental Oncology and Radiobiology, Amsterdam, The Netherlands
  - Cancer Center Amsterdam, Cancer Biology, Amsterdam, The Netherlands
- Sarah Derks
  - Oncode Institute, Amsterdam, Netherlands
  - Amsterdam UMC, Vrije Universiteit Amsterdam, Department of Medical Oncology, Cancer Center Amsterdam, Amsterdam, the Netherlands
- Aafke Creemers
  - Amsterdam UMC location University of Amsterdam, Center for Experimental and Molecular Medicine, Laboratory of Experimental Oncology and Radiobiology, Amsterdam, The Netherlands
- Eva A Ebbing
  - Amsterdam UMC location University of Amsterdam, Center for Experimental and Molecular Medicine, Laboratory of Experimental Oncology and Radiobiology, Amsterdam, The Netherlands
- Gerrit K Hooijer
  - Amsterdam UMC location University of Amsterdam, Department of Pathology, Amsterdam, the Netherlands
- Sybren L Meijer
  - Amsterdam UMC location University of Amsterdam, Department of Pathology, Amsterdam, the Netherlands
- Mark I van Berge Henegouwen
  - Cancer Center Amsterdam, Cancer Biology, Amsterdam, The Netherlands
  - Amsterdam UMC location University of Amsterdam, Department of Surgery, Amsterdam, the Netherlands
- Jan Paul Medema
  - Amsterdam UMC location University of Amsterdam, Center for Experimental and Molecular Medicine, Laboratory of Experimental Oncology and Radiobiology, Amsterdam, The Netherlands
  - Amsterdam UMC location University of Amsterdam, Department of Medical Oncology, Amsterdam, the Netherlands
  - Oncode Institute, Amsterdam, Netherlands
- Hanneke W M van Laarhoven
  - Cancer Center Amsterdam, Cancer Biology, Amsterdam, The Netherlands
  - Amsterdam UMC location University of Amsterdam, Department of Medical Oncology, Amsterdam, the Netherlands
- Maarten F Bijlsma
  - Amsterdam UMC location University of Amsterdam, Center for Experimental and Molecular Medicine, Laboratory of Experimental Oncology and Radiobiology, Amsterdam, The Netherlands
  - Amsterdam UMC location University of Amsterdam, Department of Medical Oncology, Amsterdam, the Netherlands
  - Oncode Institute, Amsterdam, Netherlands
3
Stanton AM, Boyd RL, O’Cleirigh C, Olivier S, Dolotina B, Gunda R, Koole O, Gareta D, Modise TH, Reynolds Z, Khoza T, Herbst K, Ndung’u T, Hanekom WA, Wong EB, Pillay D, Siedner MJ. HIV, multimorbidity, and health-related quality of life in rural KwaZulu-Natal, South Africa: A population-based study. PLoS One 2024; 19:e0293963. [PMID: 38381724 PMCID: PMC10880982 DOI: 10.1371/journal.pone.0293963] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2022] [Accepted: 10/23/2023] [Indexed: 02/23/2024] Open
Abstract
Health-related quality of life (HRQoL) assesses the perceived impact of health status across life domains. Although research has explored the relationship between specific conditions, including HIV, and HRQoL in low-resource settings, less attention has been paid to the association between multimorbidity and HRQoL. In a secondary analysis of cross-sectional data from the Vukuzazi ("Wake up and know ourselves" in isiZulu) study, which identified the prevalence and overlap of non-communicable and infectious diseases in the uMkhanyakude district of KwaZulu-Natal, we (1) evaluated the impact of multimorbidity on HRQoL; (2) determined the relative associations among infectious diseases, non-communicable diseases (NCDs), and HRQoL; and (3) examined the effects of controlled versus non-controlled disease on HRQoL. HRQoL was measured using the EQ-5D-3L, which assesses overall perceived health, five specific domains (mobility, self-care, usual activities, pain/discomfort, and anxiety/depression), and three levels of problems (no problems, some problems, and extreme problems). Six diseases and disease states were included in this analysis: HIV, diabetes, stroke, heart attack, high blood pressure, and TB. After examining the degree to which number of conditions affects HRQoL, we estimated the effect of joint associations among combinations of diseases, each HRQoL domain, and overall health. Then, in one set of ridge regression models, we assessed the relative impact of HIV, diabetes, stroke, heart attack, high blood pressure, and tuberculosis on the HRQoL domains; in a second set of models, the contribution of treatment (controlled vs. uncontrolled disease) was added. A total of 14,008 individuals were included in this analysis. Having more conditions adversely affected perceived health (r = -0.060, p<0.001, 95% CI: -0.073 to -0.046) and all HRQoL domains. Infectious conditions were related to better perceived health (r = 0.051, p<0.001, 95% CI: 0.037 to 0.064) and better HRQoL, whereas non-communicable diseases (NCDs) were associated with worse perceived health (r = -0.124, p<0.001, 95% CI: -0.137 to -0.110) and lower HRQoL. Particular combinations of NCDs were detrimental to perceived health, whereas HIV, which was characterized by access to care and suppressed viral load in the large majority of those affected, was counterintuitively associated with better perceived health. With respect to disease control, unique combinations of uncontrolled NCDs were significantly related to worse perceived health, and controlled HIV was associated with better perceived health. The presence of controlled and uncontrolled NCDs was associated with poor perceived health and worse HRQoL, whereas the presence of controlled HIV was associated with improved HRQoL. HIV disease control may be critical for HRQoL among people with HIV, and incorporating NCD prevention and attention to multimorbidity into healthcare strategies may improve HRQoL.
Affiliation(s)
- Amelia M. Stanton
  - Department of Psychological and Brain Sciences, Boston University, Boston, Massachusetts, United States of America
  - Department of Psychiatry, Massachusetts General Hospital, Boston, Massachusetts, United States of America
  - The Fenway Institute, Fenway Health, Boston, Massachusetts, United States of America
- Ryan L. Boyd
  - The Obelus Institute, Washington, DC, United States of America
- Conall O’Cleirigh
  - Department of Psychiatry, Massachusetts General Hospital, Boston, Massachusetts, United States of America
  - The Fenway Institute, Fenway Health, Boston, Massachusetts, United States of America
  - Harvard Medical School, Boston, Massachusetts, United States of America
- Stephen Olivier
  - Africa Health Research Institute, KwaZulu-Natal, South Africa
- Brett Dolotina
  - Department of Epidemiology, Mailman School of Public Health, Columbia University, New York, New York, United States of America
- Resign Gunda
  - Africa Health Research Institute, KwaZulu-Natal, South Africa
  - Division of Infection and Immunity, University College London, London, United Kingdom
  - School of Nursing and Public Health, College of Health Sciences, University of KwaZulu-Natal, KwaZulu-Natal, South Africa
- Olivier Koole
  - Africa Health Research Institute, KwaZulu-Natal, South Africa
  - Department of Clinical Research, London School of Hygiene and Tropical Medicine, London, United Kingdom
- Dickman Gareta
  - Africa Health Research Institute, KwaZulu-Natal, South Africa
- Zahra Reynolds
  - Division of Infectious Diseases, Massachusetts General Hospital, Boston, Massachusetts, United States of America
- Thandeka Khoza
  - Africa Health Research Institute, KwaZulu-Natal, South Africa
- Kobus Herbst
  - Africa Health Research Institute, KwaZulu-Natal, South Africa
  - DSI-MRC South African Population Research Infrastructure Network (SAPRIN), South African Medical Research Council, Durban, South Africa
- Thumbi Ndung’u
  - Africa Health Research Institute, KwaZulu-Natal, South Africa
- Willem A. Hanekom
  - Africa Health Research Institute, KwaZulu-Natal, South Africa
  - Division of Infection and Immunity, University College London, London, United Kingdom
- Emily B. Wong
  - Africa Health Research Institute, KwaZulu-Natal, South Africa
  - Division of Infection and Immunity, University College London, London, United Kingdom
  - Division of Infectious Diseases, Massachusetts General Hospital, Boston, Massachusetts, United States of America
- Deenan Pillay
  - Africa Health Research Institute, KwaZulu-Natal, South Africa
  - Division of Infection and Immunity, University College London, London, United Kingdom
- Mark J. Siedner
  - Africa Health Research Institute, KwaZulu-Natal, South Africa
  - Division of Infectious Diseases, Massachusetts General Hospital, Boston, Massachusetts, United States of America
4
Jiang Y, Gong G. Common and distinct patterns underlying different linguistic tasks: multivariate disconnectome symptom mapping in poststroke patients. Cereb Cortex 2024; 34:bhae008. [PMID: 38265297 DOI: 10.1093/cercor/bhae008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/14/2023] [Revised: 01/04/2023] [Accepted: 01/05/2023] [Indexed: 01/25/2024] Open
Abstract
Numerous studies have been devoted to neural mechanisms of a variety of linguistic tasks (e.g. speech comprehension and production). To date, however, whether and how the neural patterns underlying different linguistic tasks are similar or differ remains elusive. In this study, we compared the neural patterns underlying 3 linguistic tasks mainly concerning speech comprehension and production. To address this, multivariate regression approaches with lesion/disconnection symptom mapping were applied to data from 216 stroke patients with damage to the left hemisphere. The results showed that lesion/disconnection patterns could predict both poststroke scores of speech comprehension and production tasks; these patterns exhibited shared regions on the temporal pole of the left hemisphere as well as unique regions contributing to the prediction for each domain. Lower scores in speech comprehension tasks were associated with lesions/abnormalities in the superior temporal gyrus and middle temporal gyrus, while lower scores in speech production tasks were associated with lesions/abnormalities in the left inferior parietal lobe and frontal lobe. These results suggested an important role of the ventral and dorsal stream pathways in speech comprehension and production (i.e. supporting the dual stream model) and highlighted the applicability of the novel multivariate disconnectome-based symptom mapping in cognitive neuroscience research.
Affiliation(s)
- Yaya Jiang
  - State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing 100875, China
- Gaolang Gong
  - State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing 100875, China
  - Beijing Key Laboratory of Brain Imaging and Connectomics, Beijing Normal University, Beijing 100875, China
  - Chinese Institute for Brain Research, Beijing 102206, China
5
Lu C, Donners MMPC, Karel J, de Boer H, van Zonneveld AJ, den Ruijter H, Jukema JW, Kraaijeveld A, Kuiper J, Pasterkamp G, Cavill R, Perales-Patón J, Ferrannini E, Goossens P, Biessen EAL. Sex-specific differences in cytokine signaling pathways in circulating monocytes of cardiovascular disease patients. Atherosclerosis 2023; 384:117123. [PMID: 37127497 DOI: 10.1016/j.atherosclerosis.2023.04.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2022] [Revised: 03/14/2023] [Accepted: 04/14/2023] [Indexed: 05/03/2023]
Abstract
BACKGROUND AND AIMS This study aims to identify sex-specific transcriptional differences and signaling pathways in circulating monocytes contributing to cardiovascular disease. METHODS AND RESULTS We generated sex-biased gene expression signatures by comparing male versus female monocytes of coronary artery disease (CAD) patients (n = 450) from the Center for Translational Molecular Medicine-Circulating Cells Cohort. Gene set enrichment analysis demonstrated that monocytes from female CAD patients carry stronger chemotaxis and migratory signature than those from males. We then inferred cytokine signaling activities based on CytoSig database of 51 cytokine and growth factor regulation profiles. Monocytes from females feature a higher activation level of EGF, IFN1, VEGF, GM-CSF, and CD40L pathways, whereas IL-4, INS, and HMGB1 signaling was seen to be more activated in males. These sex differences were not observed in healthy subjects, as shown for an independent monocyte cohort of healthy subjects (GSE56034, n = 485). More pronounced GM-CSF signaling in monocytes of female CAD patients was confirmed by the significant enrichment of GM-CSF-activated monocyte signature in females. As we show these effects were not due to increased plasma levels of the corresponding ligands, sex-intrinsic differences in monocyte signaling regulation are suggested. Consistently, regulatory network analysis revealed jun-B as a shared transcription factor activated in all female-specific pathways except IFN1 but suppressed in male-activated IL-4. CONCLUSIONS We observed overt CAD-specific sex differences in monocyte transcriptional profiles and cytokine- or growth factor-induced responses, which provide insights into underlying mechanisms of sex differences in CVD.
Affiliation(s)
- Chang Lu
  - Department of Pathology, Cardiovascular Research Institute Maastricht (CARIM), Maastricht UMC+, Maastricht University, Maastricht, the Netherlands
- Marjo M P C Donners
  - Department of Pathology, Cardiovascular Research Institute Maastricht (CARIM), Maastricht UMC+, Maastricht University, Maastricht, the Netherlands
- Joël Karel
  - Department of Advanced Computing Sciences, Maastricht University, Maastricht, the Netherlands
- Hetty de Boer
  - Department of Internal Medicine (Nephrology), Leiden UMC, Leiden, the Netherlands
- Hester den Ruijter
  - Laboratory for Experimental Cardiology, Department of Cardiology, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands
- J Wouter Jukema
  - Department of Cardiology, Leiden University Medical Center, Leiden, the Netherlands
  - Netherlands Heart Institute, Utrecht, the Netherlands
- Adriaan Kraaijeveld
  - Department of Cardiology, University Medical Center Utrecht, Utrecht, the Netherlands
- Johan Kuiper
  - Division of BioTherapeutics, Leiden Academic Centre for Drug Research, Leiden University, Leiden, the Netherlands
- Rachel Cavill
  - Department of Advanced Computing Sciences, Maastricht University, Maastricht, the Netherlands
- Javier Perales-Patón
  - Institute for Computational Biomedicine, Faculty of Medicine, Heidelberg University and Heidelberg University Hospital, Heidelberg, Germany
  - Institute of Experimental Medicine and Systems Biology, RWTH Aachen University, Aachen, Germany
  - Joint Research Centre for Computational Biomedicine (JRC COMBINE), Faculty of Medicine, RWTH Aachen University, Aachen, Germany
- Ele Ferrannini
  - Consiglio Nazionale Delle Ricerche (CNR) Institute of Clinical Physiology, Pisa, Italy
- Pieter Goossens
  - Department of Pathology, Cardiovascular Research Institute Maastricht (CARIM), Maastricht UMC+, Maastricht University, Maastricht, the Netherlands
- Erik A L Biessen
  - Department of Pathology, Cardiovascular Research Institute Maastricht (CARIM), Maastricht UMC+, Maastricht University, Maastricht, the Netherlands
  - Institute for Molecular Cardiovascular Research, RWTH Aachen University, Aachen, 52074, Germany
6
Migdał M, Arakawa T, Takizawa S, Furuno M, Suzuki H, Arner E, Winata CL, Kaczkowski B. xcore: an R package for inference of gene expression regulators. BMC Bioinformatics 2023; 24:14. [PMID: 36631751 PMCID: PMC9832628 DOI: 10.1186/s12859-022-05084-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2022] [Accepted: 11/25/2022] [Indexed: 01/13/2023] Open
Abstract
BACKGROUND Elucidating the Transcription Factors (TFs) that drive the gene expression changes in a given experiment is a common question asked by researchers. The existing methods rely on the predicted Transcription Factor Binding Site (TFBS) to model the changes in the motif activity. Such methods only work for TFs that have a motif and assume the TF binding profile is the same in all cell types. RESULTS Given the wealth of the ChIP-seq data available for a wide range of the TFs in various cell types, we propose that gene expression modeling can be done using ChIP-seq "signatures" directly, effectively skipping the motif finding and TFBS prediction steps. We present xcore, an R package that allows TF activity modeling based on ChIP-seq signatures and the user's gene expression data. We also provide xcoredata, a companion data package that provides a collection of preprocessed ChIP-seq signatures. We demonstrate that xcore leads to biologically relevant predictions using transforming growth factor beta induced epithelial-mesenchymal transition time-courses, rinderpest infection time-courses, and an embryonic stem cells differentiated to cardiomyocytes time-course profiled with Cap Analysis Gene Expression. CONCLUSIONS xcore provides a simple analytical framework for gene expression modeling using linear models that can be easily incorporated into differential expression analysis pipelines. Taking advantage of public ChIP-seq databases, xcore can identify meaningful molecular signatures and relevant ChIP-seq experiments.
Affiliation(s)
- Maciej Migdał
  - Laboratory of Zebrafish Developmental Genomics, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Takahiro Arakawa
  - RIKEN Center for Integrative Medical Sciences, Yokohama, 230-0045 Japan
- Satoshi Takizawa
  - RIKEN Center for Integrative Medical Sciences, Yokohama, 230-0045 Japan
- Masaaki Furuno
  - RIKEN Center for Integrative Medical Sciences, Yokohama, 230-0045 Japan
- Harukazu Suzuki
  - RIKEN Center for Integrative Medical Sciences, Yokohama, 230-0045 Japan
- Erik Arner
  - RIKEN Center for Integrative Medical Sciences, Yokohama, 230-0045 Japan
  - Present Address: GSK, Gunnels Wood Rd, Stevenage, SG1 2NY UK
- Cecilia Lanny Winata
  - Laboratory of Zebrafish Developmental Genomics, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Bogumił Kaczkowski
  - RIKEN Center for Integrative Medical Sciences, Yokohama, 230-0045 Japan
  - Present Address: Data Sciences and Quantitative Biology, Discovery Sciences, AstraZeneca R&D, Cambridge, UK
7
Wiese AD, Lim SL, Filion DL, Kang SS. Intolerance of uncertainty and neural measures of anticipation and reactivity for affective stimuli. Int J Psychophysiol 2023; 183:138-147. [PMID: 36423712 DOI: 10.1016/j.ijpsycho.2022.11.010] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/31/2022] [Revised: 10/31/2022] [Accepted: 11/16/2022] [Indexed: 11/23/2022]
Abstract
Intolerance of uncertainty (IU) is a transdiagnostic construct referring to the aversive interpretation of contexts characterized by uncertainty. Indeed, there is a growing body of research examining individual differences in IU and how these are associated with emotional anticipation and reactivity during periods of certainty and uncertainty; however, how these associations are reflected via neurophysiological indices remains understudied and poorly understood. The present study examined the relationship between self-reported IU and neurophysiological measures of emotional anticipation and reactivity, namely stimulus preceding negativity (SPN) and late positive potential (LPP), and self-report measures of emotional experiences. These measures were captured during an S1-S2 picture viewing task in which participants were presented with cues (S1) that either indicated the affective valence of the upcoming picture (S2) or provided no information about the valence. Findings here provide evidence for significant associations between SPN amplitude and IU scores during uncertain and certain-positive cueing conditions, and significant associations between LPP amplitude and IU scores during both certain- and uncertain-negative picture viewing conditions that appear driven by prospective IU sub-scores. These positive associations between IU and SPN amplitude are suggestive of heightened emotional anticipation following S1 cues, while positive associations between IU and LPP are suggestive of heightened emotional reactivity following S2 images. These findings and their potential implications are discussed in detail relative to the existing IU literature.
Collapse
Affiliation(s)
- Andrew D Wiese
- Menninger Department of Psychiatry and Behavioral Sciences, Baylor College of Medicine, United States of America
- Seung-Lark Lim
- Department of Psychology, University of Missouri - Kansas City, United States of America
- Diane L Filion
- Department of Psychology, University of Missouri - Kansas City, United States of America
- Seung Suk Kang
- Department of Biomedical Sciences, University of Missouri - Kansas City, United States of America.
|
8
|
Sundaram M, Schmidt JP, Han BA, Drake JM, Stephens PR. Traits, phylogeny and host cell receptors predict Ebolavirus host status among African mammals. PLoS Negl Trop Dis 2022; 16:e0010993. [PMID: 36542657 PMCID: PMC9815631 DOI: 10.1371/journal.pntd.0010993] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2022] [Revised: 01/05/2023] [Accepted: 11/28/2022] [Indexed: 12/24/2022] Open
Abstract
We explore how animal host traits, phylogenetic identity and cell receptor sequences relate to infection status and mortality from ebolaviruses. We gathered exhaustive databases of mortality from Ebolavirus after exposure and infection status based on PCR and antibody tests. We performed ridge regressions predicting mortality and infection as a function of traits, phylogenetic eigenvectors and, separately, host receptor sequences. We found that mortality from Ebolavirus was strongly associated with life history characteristics and phylogeny. In contrast, infection status was related not just to life history and phylogeny but also to fruit consumption, which suggests that geographic overlap of frugivorous mammals can lead to spread of the virus in the wild. Niemann Pick C1 (NPC1) receptor sequences predicted infection statuses of bats included in our study with very high accuracy, suggesting that characterizing NPC1 in additional species is a promising avenue for future work. We combine the predictions from our mortality and infection status models to differentiate between species that are infected and also die from Ebolavirus versus species that are infected but tolerate the virus (possible reservoirs of Ebolavirus). We therefore present the first comprehensive estimates of Ebolavirus reservoir statuses for all known terrestrial mammals in Africa.
Affiliation(s)
- Mekala Sundaram
- Department of Integrative Biology, Oklahoma State University, Stillwater, Oklahoma, United States of America
- John Paul Schmidt
- Odum School of Ecology, University of Georgia, Athens, Georgia, United States of America
- Barbara A. Han
- Cary Institute of Ecosystems Studies, Millbrook, New York, United States of America
- John M. Drake
- Odum School of Ecology, University of Georgia, Athens, Georgia, United States of America
- Center for the Ecology of Infectious Diseases, University of Georgia, Athens, Georgia, United States of America
- Patrick R. Stephens
- Department of Integrative Biology, Oklahoma State University, Stillwater, Oklahoma, United States of America
|
9
|
Gauran II, Xue G, Chen C, Ombao H, Yu Z. Ridge Penalization in High-Dimensional Testing With Applications to Imaging Genetics. Front Neurosci 2022; 16:836100. [PMID: 35401090 PMCID: PMC8987922 DOI: 10.3389/fnins.2022.836100] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2021] [Accepted: 02/24/2022] [Indexed: 11/13/2022] Open
Abstract
High-dimensionality is ubiquitous in various scientific fields such as imaging genetics, where a deluge of functional and structural data on brain-relevant genetic polymorphisms are investigated. It is crucial to identify which genetic variations are consequential in identifying neurological features of brain connectivity compared to merely random noise. Statistical inference in high-dimensional settings poses multiple challenges involving analytical and computational complexity. A widely implemented strategy in addressing inference goals is penalized inference. In particular, the role of the ridge penalty in high-dimensional prediction and estimation has been actively studied in the past several years. This study focuses on ridge-penalized tests in high-dimensional hypothesis testing problems by proposing and examining a class of methods for choosing the optimal ridge penalty. We present our findings on strategies to improve the statistical power of ridge-penalized tests and what determines the optimal ridge penalty for hypothesis testing. The application of our work to an imaging genetics study and biological research will be presented.
Affiliation(s)
- Iris Ivy Gauran
- Biostatistics Group, Computer, Electrical, Mathematical Sciences, and Engineering Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Gui Xue
- Center for Brain and Learning Science, Beijing Normal University, Beijing, China
- Chuansheng Chen
- Department of Psychological Science, University of California, Irvine, Irvine, CA, United States
- Hernando Ombao
- Biostatistics Group, Computer, Electrical, Mathematical Sciences, and Engineering Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Zhaoxia Yu
- Department of Statistics, University of California, Irvine, Irvine, CA, United States
|
10
|
Hertzberg VS, Singh H, Fournier CN, Moustafa A, Polak M, Kuelbs CA, Torralba MG, Tansey MG, Nelson KE, Glass JD. Gut microbiome differences between amyotrophic lateral sclerosis patients and spouse controls. Amyotroph Lateral Scler Frontotemporal Degener 2022; 23:91-99. [PMID: 33818222 PMCID: PMC10676149 DOI: 10.1080/21678421.2021.1904994] [Citation(s) in RCA: 31] [Impact Index Per Article: 15.5] [Received: 12/07/2020] [Revised: 02/08/2021] [Accepted: 03/08/2021] [Indexed: 02/06/2023]
Abstract
Objective: Amyotrophic lateral sclerosis (ALS) is a progressive neurodegenerative disease that is incurable and ultimately fatal. Few therapeutic options are available to patients. In this study, we explored differences in microbiome composition associated with ALS. Methods: We compared the gut microbiome and inflammatory marker profiles of ALS patients (n = 10) to those of their spouses (n = 10). Gut microbiome profiles were determined by 16S rRNA gene sequencing. Results: The gut microbial communities of the ALS patients were more diverse and were deficient in Prevotella spp. compared with those of their spouses. In contrast, healthy couples (n = 10 couples of the opposite sex) recruited from the same geographic region as the patient population did not exhibit these differences. Stool and plasma inflammatory markers were similar between ALS patients and their spouses. Predictive analysis of microbial enzymes revealed that ALS patients had decreased activity in several metabolic pathways, including carbon metabolism, butyrate metabolism, and systems involving histidine kinase and response regulators. Conclusions: ALS patients exhibit differences in their gut microbial communities compared with spouse controls. Our findings suggest that modifying the gut microbiome, such as via amelioration of Prevotella spp. deficiency, and/or altering butyrate metabolism may have translational value for ALS treatment.
Affiliation(s)
- Vicki S Hertzberg
- Nell Hodgson Woodruff School of Nursing, Emory University, Atlanta, GA, USA
- Christina N Fournier
- Department of Neurology, Emory University, Atlanta, GA, USA
- Department of Neurology, Department of Veterans Affairs, Atlanta, GA, USA
- Ahmed Moustafa
- Department of Biology, The American University in Cairo, Cairo, Egypt
- Meraida Polak
- Department of Neurology, Emory University, Atlanta, GA, USA
- Malú G Tansey
- Department of Physiology, Emory University, Atlanta, GA, USA
- Karen E Nelson
- J. Craig Venter Institute, Rockville, MD, USA
- J. Craig Venter Institute, La Jolla, CA, USA
|
11
|
Karanth S, Tanui CK, Meng J, Pradhan AK. Exploring the predictive capability of advanced machine learning in identifying severe disease phenotype in Salmonella enterica. Food Res Int 2022; 151:110817. [PMID: 34980422 DOI: 10.1016/j.foodres.2021.110817] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 09/17/2021] [Revised: 11/12/2021] [Accepted: 11/17/2021] [Indexed: 11/26/2022]
Abstract
The past few years have seen a significant increase in the availability of whole genome sequencing (WGS) information, allowing for its incorporation in predictive modeling for foodborne pathogens to account for inter- and intra-species differences in their virulence. However, this is hindered by the inability of traditional statistical methods to analyze such large amounts of data relative to the number of observations/isolates. In this study, we explored the applicability of machine learning (ML) models to predict the disease outcome, while identifying features that exert a significant effect on the prediction. This study was conducted on Salmonella enterica, a major foodborne pathogen with considerable inter- and intra-serovar variation. WGS data of isolates obtained from various sources (i.e., human, chicken, and swine) were used as input in four machine learning models (logistic regression with ridge, random forest, support vector machine, and AdaBoost) to classify isolates based on disease severity (extraintestinal vs. gastrointestinal) in the host. The predictive performances of all models were tested with and without Elastic Net regularization to combat dimensionality issues. The Elastic Net-regularized logistic regression model showed the best area under the receiver operating characteristic curve (AUC-ROC; 0.86) and outcome prediction accuracy (0.76). Additionally, genes coding for transcriptional regulation; acidic, oxidative, and anaerobic stress response; and antibiotic resistance were found to be significant predictors of disease severity. These genes, which were significantly associated with each outcome, could possibly be input in amended, gene-expression-specific predictive models to estimate the virulence pattern-specific effect of Salmonella and other foodborne pathogens on human health.
Affiliation(s)
- Shraddha Karanth
- Department of Nutrition and Food Science, University of Maryland, College Park, MD 20742, USA
- Collins K Tanui
- Department of Nutrition and Food Science, University of Maryland, College Park, MD 20742, USA; Center for Food Safety and Security Systems, University of Maryland, College Park, MD 20742, USA
- Jianghong Meng
- Department of Nutrition and Food Science, University of Maryland, College Park, MD 20742, USA; Center for Food Safety and Security Systems, University of Maryland, College Park, MD 20742, USA; Joint Institute for Food Safety and Applied Nutrition, University of Maryland, College Park, MD 20742, USA
- Abani K Pradhan
- Department of Nutrition and Food Science, University of Maryland, College Park, MD 20742, USA; Center for Food Safety and Security Systems, University of Maryland, College Park, MD 20742, USA.
|
12
|
Veerman JR, Leday GGR, van de Wiel MA. Estimation of variance components, heritability and the ridge penalty in high-dimensional generalized linear models. COMMUN STAT-SIMUL C 2022. [DOI: 10.1080/03610918.2019.1646760] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 10/26/2022]
Affiliation(s)
- Jurre R. Veerman
- Department of Epidemiology & Biostatistics, Amsterdam Public Health research institute, Amsterdam University medical centers, Amsterdam, The Netherlands
- Mathematical Institute, Leiden University, Leiden, the Netherlands
- Mark A. van de Wiel
- Department of Epidemiology & Biostatistics, Amsterdam Public Health research institute, Amsterdam University medical centers, Amsterdam, The Netherlands
- MRC Biostatistics Unit, Cambridge University, Cambridge, UK
|
13
|
|
14
|
Abstract
We introduce a supervised machine learning approach with sparsity constraints for phylogenomics, referred to as evolutionary sparse learning (ESL). ESL builds models with genomic loci—such as genes, proteins, genomic segments, and positions—as parameters. Using the Least Absolute Shrinkage and Selection Operator, ESL selects only the most important genomic loci to explain a given phylogenetic hypothesis or presence/absence of a trait. ESL models do not directly involve conventional parameters such as rates of substitutions between nucleotides, rate variation among positions, and phylogeny branch lengths. Instead, ESL directly employs the concordance of variation across sequences in an alignment with the evolutionary hypothesis of interest. ESL provides a natural way to combine different molecular and nonmolecular data types and incorporate biological and functional annotations of genomic loci in model building. We propose positional, gene, function, and hypothesis sparsity scores, illustrate their use through an example, and suggest several applications of ESL. The ESL framework has the potential to drive the development of a new class of computational methods that will complement traditional approaches in evolutionary genomics, particularly for identifying influential loci and sequences given a phylogeny and building models to test hypotheses. ESL’s fast computational times and small memory footprint will also help democratize big data analytics and improve scientific rigor in phylogenomics.
Affiliation(s)
- Sudhir Kumar
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA; Department of Biology, Temple University, Philadelphia, PA; Center for Excellence in Genome Medicine and Research, King Abdulaziz University, Jeddah, Saudi Arabia
- Sudip Sharma
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA; Department of Biology, Temple University, Philadelphia, PA
|
15
|
Werfel S, Jakob CEM, Borgmann S, Schneider J, Spinner C, Schons M, Hower M, Wille K, Haselberger M, Heuzeroth H, Rüthrich MM, Dolff S, Kessel J, Heemann U, Vehreschild JJ, Rieg S, Schmaderer C. Development and validation of a simplified risk score for the prediction of critical COVID-19 illness in newly diagnosed patients. J Med Virol 2021; 93:6703-6713. [PMID: 34331717 PMCID: PMC8426905 DOI: 10.1002/jmv.27252] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 03/19/2021] [Revised: 05/13/2021] [Accepted: 07/29/2021] [Indexed: 11/10/2022]
Abstract
Scores to identify patients at high risk of progression of coronavirus disease (COVID-19), caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), may become instrumental for clinical decision-making and patient management. We used patient data from the multicentre Lean European Open Survey on SARS-CoV-2-Infected Patients (LEOSS) and applied variable selection to develop a simplified scoring system to identify patients at increased risk of critical illness or death. A total of 1946 patients who tested positive for SARS-CoV-2 were included in the initial analysis and assigned to derivation and validation cohorts (n = 1297 and n = 649, respectively). Stability selection from over 100 baseline predictors for the combined endpoint of progression to the critical phase or COVID-19-related death enabled the development of a simplified score consisting of five predictors: C-reactive protein (CRP), age, clinical disease phase (uncomplicated vs. complicated), serum urea, and D-dimer (abbreviated as CAPS-D score). This score yielded an area under the curve (AUC) of 0.81 (95% confidence interval [CI]: 0.77-0.85) in the validation cohort for predicting the combined endpoint within 7 days of diagnosis and 0.81 (95% CI: 0.77-0.85) during full follow-up. We used an additional prospective cohort of 682 patients, diagnosed largely after the "first wave" of the pandemic to validate the predictive accuracy of the score and observed similar results (AUC for the event within 7 days: 0.83 [95% CI: 0.78-0.87]; for full follow-up: 0.82 [95% CI: 0.78-0.86]). An easily applicable score to calculate the risk of COVID-19 progression to critical illness or death was thus established and validated.
Affiliation(s)
- Stanislas Werfel
- Department of Nephrology, School of Medicine, Technical University of Munich, Klinikum rechts der Isar, Munich, Germany
- Carolin E M Jakob
- Department I for Internal Medicine, University Hospital of Cologne, University of Cologne, Cologne, Germany; German Centre for Infection Research (DZIF), Partner Site Bonn-Cologne, Cologne, Germany
- Stefan Borgmann
- Department of Infectious Diseases and Infection Control, Ingolstadt Hospital, Ingolstadt, Germany
- Jochen Schneider
- Department of Internal Medicine II, School of Medicine, Technical University of Munich, University Hospital rechts der Isar, Munich, Germany; German Centre for Infection Research (DZIF), Partner Site Munich, Munich, Germany
- Christoph Spinner
- Department of Internal Medicine II, School of Medicine, Technical University of Munich, University Hospital rechts der Isar, Munich, Germany; German Centre for Infection Research (DZIF), Partner Site Munich, Munich, Germany
- Maximilian Schons
- Department I for Internal Medicine, University Hospital of Cologne, University of Cologne, Cologne, Germany
- Martin Hower
- Department of Pneumology, Infectious Diseases and Internal Medicine, Klinikum Dortmund gGmbH, Dortmund, Germany
- Kai Wille
- University Clinic for Haematology, Oncology, Haemostaseology and Palliative Care, Johannes Wesling Medical Centre Minden UKRUB, University of Bochum, Minden, Germany
- Hanno Heuzeroth
- Department of Emergency and Intensive Care Medicine, Klinikum Ernst von Bergmann, Potsdam, Germany
- Maria M Rüthrich
- Department of Internal Medicine II, Hematology and Medical Oncology, University Hospital Jena, Jena, Germany
- Sebastian Dolff
- Department of Infectious Diseases, University Hospital Essen, University Duisburg-Essen, Essen, Germany
- Johanna Kessel
- Department of Internal Medicine, Hematology and Oncology, Goethe University Frankfurt, Frankfurt, Germany
- Uwe Heemann
- Department of Nephrology, School of Medicine, Technical University of Munich, Klinikum rechts der Isar, Munich, Germany
- Jörg J Vehreschild
- Department I for Internal Medicine, University Hospital of Cologne, University of Cologne, Cologne, Germany; German Centre for Infection Research (DZIF), Partner Site Bonn-Cologne, Cologne, Germany; Department of Internal Medicine, Hematology and Oncology, Goethe University Frankfurt, Frankfurt, Germany
- Siegbert Rieg
- Department of Medicine II, University of Freiburg, Freiburg, Germany
- Christoph Schmaderer
- Department of Nephrology, School of Medicine, Technical University of Munich, Klinikum rechts der Isar, Munich, Germany
|
16
|
Khalili S, Faradmal J, Mahjub H, Moeini B, Ezzati-Rastegar K. Overcoming the problems caused by collinearity in mixed-effects logistic model: determining the contribution of various types of violence on depression in pregnant women. BMC Med Res Methodol 2021; 21:154. [PMID: 34320952 PMCID: PMC8317320 DOI: 10.1186/s12874-021-01325-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/24/2021] [Accepted: 05/21/2021] [Indexed: 11/19/2022] Open
Abstract
BACKGROUND: Collinearity is a common and problematic phenomenon in public health studies. It inflates the variance of estimators and reduces test power, and it can occur in any model. In this study, a new ridge mixed-effects logistic model (RMELM) is proposed to overcome the consequences of collinearity in correlated binary responses. METHODS: Parameters were estimated through penalized log-likelihood by combining the expectation maximization (EM) algorithm, gradient ascent, and Fisher-scoring methods. A simulation study was performed to compare the new model with the mixed-effects logistic model (MELM). Mean square error, relative bias, empirical power, and variance of random effects were used to evaluate RMELM. In addition, the contributions of various types of violence and of an intervention to depression among pregnant women experiencing intimate partner violence (IPV) were analyzed with the new and previous models. RESULTS: The simulation study showed that the mean square errors of fixed effects were lower and empirical power was higher for RMELM than for MELM. Inflation in the variance of estimators due to collinearity was clearly shown for the MELM in the IPV data, and RMELM adjusted the variances. CONCLUSIONS: According to the simulation results and the analysis of the IPV data, this new estimator is appropriate for dealing with collinearity problems in the modelling of correlated binary responses.
Affiliation(s)
- Sanaz Khalili
- Department of Biostatistics School of Public Health, Hamadan University of Medical Sciences, Hamadan, Iran
- Javad Faradmal
- Department of Biostatistics School of Public Health, Modeling of Noncommunicable Diseases Research Center, Hamadan University of Medical Sciences, Hamadan, Iran.
- Hossein Mahjub
- Department of Biostatistics School of Public Health, Modeling of Noncommunicable Diseases Research Center, Hamadan University of Medical Sciences, Hamadan, Iran
- Babak Moeini
- Social Determinants of Health Research Center, Hamadan University of Medical Sciences, Hamadan, Iran
- Khadijeh Ezzati-Rastegar
- Health Education and Promotion, Department of Public Health, Hamadan University of Medical Sciences, Hamadan, Iran
|
17
|
Pluta D, Shen T, Xue G, Chen C, Ombao H, Yu Z. Ridge-penalized adaptive Mantel test and its application in imaging genetics. Stat Med 2021; 40:5313-5332. [PMID: 34216035 DOI: 10.1002/sim.9127] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/20/2020] [Revised: 06/01/2021] [Accepted: 06/16/2021] [Indexed: 01/23/2023]
Abstract
We propose a ridge-penalized adaptive Mantel test (AdaMant) for evaluating the association of two high-dimensional sets of features. By introducing a ridge penalty, AdaMant tests the association across many metrics simultaneously. We demonstrate how ridge penalization bridges Euclidean and Mahalanobis distances and their corresponding linear models from the perspective of association measurement and testing. This result is not only theoretically interesting but also has important implications in penalized hypothesis testing, especially in high-dimensional settings such as imaging genetics. Applying the proposed method to an imaging genetic study of visual working memory in healthy adults, we identified interesting associations of brain connectivity (measured by electroencephalogram coherence) with selected genetic features.
Affiliation(s)
- Dustin Pluta
- Department of Statistics, University of California, Irvine, Irvine, California, USA
- Tong Shen
- Department of Statistics, University of California, Irvine, Irvine, California, USA
- Gui Xue
- Center for Brain and Learning Science, Beijing Normal University, Beijing, China
- Chuansheng Chen
- Department of Psychology and Social Behavior, University of California, Irvine, Irvine, California, USA
- Hernando Ombao
- Statistics Program, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Zhaoxia Yu
- Department of Statistics, University of California, Irvine, Irvine, California, USA
|
18
|
Enderle I, Costet N, Cognez N, Zaros C, Caudeville J, Garlantezec R, Chevrier C, Nougadere A, De Lauzon-Guillain B, Le Lous M, Beranger R. Prenatal exposure to pesticides and risk of preeclampsia among pregnant women: Results from the ELFE cohort. ENVIRONMENTAL RESEARCH 2021; 197:111048. [PMID: 33766571 DOI: 10.1016/j.envres.2021.111048] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2020] [Revised: 03/15/2021] [Accepted: 03/16/2021] [Indexed: 06/12/2023]
Abstract
BACKGROUND: Preeclampsia is a pregnancy-specific syndrome caused by abnormal placentation. Although environmental chemicals, including some pesticides, are suspected of impairing placentation and promoting preeclampsia, their relationship with preeclampsia has been insufficiently explored. OBJECTIVES: We aimed to investigate the relation between non-occupational exposure to pesticides during pregnancy and the risk of preeclampsia. METHODS: The study cohort comprised 195 women with and 17,181 without preeclampsia from the ELFE birth cohort. We used toxicogenomic approaches to select 41 pesticides of interest for their possible influence on preeclampsia. We assessed household pesticide use (self-reported data), environmental exposure to agricultural pesticides (geographic information systems), and dietary exposure (food-frequency questionnaire with data from monitoring pesticide residues in food and water). Dietary exposures to pesticides were grouped into clusters of similar exposures to resolve collinearity issues. For each exposure source, pesticides were mutually adjusted, and odds ratios estimated with logistic regression models. RESULTS: The quantity of prochloraz applied within a kilometer of the women's homes was higher in women with than without preeclampsia (fourth quartile vs. others; adjusted odds ratio [aOR] = 1.54; 95%CI: 1.02, 2.35), especially when preeclampsia was diagnosed before 34 weeks of gestation (aOR = 2.25; 95%CI: 1.01, 5.06). The reverse was observed with nearby cypermethrin application (aOR = 0.59, 95%CI: 0.36, 0.96). In sensitivity analyses, women with preeclampsia receiving antihypertensive treatment had a significantly higher probability of using herbicides at home during pregnancy than women without preeclampsia (aOR = 2.20; 95%CI: 1.23, 3.93). No statistically significant association was found between dietary exposure to pesticide residues and preeclampsia. DISCUSSION: While most of the associations examined remained statistically non-significant, our results suggest a possible influence of residential exposures to prochloraz and some herbicides on preeclampsia. These estimations are supported by toxicological and mechanistic data.
Affiliation(s)
- Isabelle Enderle
- CHU Rennes, Univ Rennes, Inserm, EHESP, Irset (Institut de Recherche en Santé, Environnement et Travail) - UMR_S 1085, F-35000, Rennes, France; Department of Obstetrics and Gynecology and Reproductive Medicine, Anne de Bretagne University Hospital, Rennes, France.
- Nathalie Costet
- Univ Rennes, Inserm, EHESP, Irset - UMR_S 1085, F-35000, Rennes, France
- Noriane Cognez
- Univ Rennes, Inserm, EHESP, Irset - UMR_S 1085, F-35000, Rennes, France
- Cécile Zaros
- French Institute for Demographic Studies (Ined), French Institute for Medical Research and Health (Inserm), French Blood Agency, ELFE Joint Unit, F-75020, Paris, France
- Julien Caudeville
- INERIS (French National Institute for Industrial Environment and Risks), 60550, Verneuil-en-Halatte, France
- Ronan Garlantezec
- CHU Rennes, Univ Rennes, Inserm, EHESP, Irset (Institut de Recherche en Santé, Environnement et Travail) - UMR_S 1085, F-35000, Rennes, France
- Cécile Chevrier
- Univ Rennes, Inserm, EHESP, Irset - UMR_S 1085, F-35000, Rennes, France
- Alexandre Nougadere
- ANSES, Risk Assessment Department, 14 Rue Pierre et Marie Curie, F-94701, Maisons-Alfort, France
- Maela Le Lous
- Department of Obstetrics and Gynecology and Reproductive Medicine, Anne de Bretagne University Hospital, Rennes, France
- Rémi Beranger
- CHU Rennes, Univ Rennes, Inserm, EHESP, Irset (Institut de Recherche en Santé, Environnement et Travail) - UMR_S 1085, F-35000, Rennes, France; Department of Obstetrics and Gynecology and Reproductive Medicine, Anne de Bretagne University Hospital, Rennes, France
|
19
|
Chang SM, Yang M, Lu W, Huang YJ, Huang Y, Hung H, Miecznikowski JC, Lu TP, Tzeng JY. Gene-Set Integrative Analysis of Multi-Omics Data Using Tensor-based Association Test. Bioinformatics 2021; 37:2259-2265. [PMID: 33674827 PMCID: PMC8388036 DOI: 10.1093/bioinformatics/btab125] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/15/2020] [Revised: 12/30/2020] [Accepted: 02/24/2021] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION: Facilitated by technological advances and the decrease in costs, it is feasible to gather subject data from several omics platforms. Each platform assesses different molecular events, and the challenge lies in efficiently analyzing these data to discover novel disease genes or mechanisms. A common strategy is to regress the outcomes on all omics variables in a gene set. However, this approach suffers from problems associated with high-dimensional inference. RESULTS: We introduce a tensor-based framework for variable-wise inference in multi-omics analysis. By accounting for the matrix structure of an individual's multi-omics data, the proposed tensor methods incorporate the relationship among omics effects, reduce the number of parameters, and boost the modeling efficiency. We derive the variable-specific tensor test and enhance computational efficiency of tensor modeling. Using simulations and data applications on the Cancer Cell Line Encyclopedia (CCLE), we demonstrate our method performs favorably over baseline methods and will be useful for gaining biological insights in multi-omics analysis. AVAILABILITY AND IMPLEMENTATION: R function and instruction are available from the authors' website: https://www4.stat.ncsu.edu/~jytzeng/Software/TR.omics/TRinstruction.pdf. SUPPLEMENTARY INFORMATION: Supplementary materials are available at Bioinformatics online.
Affiliation(s)
- Sheng-Mao Chang
- Department of Statistics, National Cheng Kung University, Tainan, Taiwan
| | - Meng Yang
- Department of Statistics, North Carolina State University, Raleigh NC, 27695, USA
| | - Wenbin Lu
- Department of Statistics, North Carolina State University, Raleigh NC, 27695, USA
| | - Yu-Jyun Huang
- Institute of Epidemiology and Preventive Medicine, National Taiwan University, Taipei, Taiwan
| | - Yueyang Huang
- Bioinformatics Research Center, North Carolina State University, Raleigh NC, 27695, USA
| | - Hung Hung
- Institute of Epidemiology and Preventive Medicine, National Taiwan University, Taipei, Taiwan
| | | | - Tzu-Pin Lu
- Institute of Epidemiology and Preventive Medicine, National Taiwan University, Taipei, Taiwan
| | - Jung-Ying Tzeng
- Department of Statistics, National Cheng Kung University, Tainan, Taiwan.,Department of Statistics, North Carolina State University, Raleigh NC, 27695, USA.,Institute of Epidemiology and Preventive Medicine, National Taiwan University, Taipei, Taiwan.,Bioinformatics Research Center, North Carolina State University, Raleigh NC, 27695, USA
| |
|
20
|
Development and validation of a meal quality index with applications to NHANES 2005-2014. PLoS One 2020; 15:e0244391. [PMID: 33351843 PMCID: PMC7755194 DOI: 10.1371/journal.pone.0244391] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/28/2020] [Accepted: 12/09/2020] [Indexed: 11/19/2022] Open
Abstract
The Meal Balance Index (MBI) assesses the nutritional quality and balance of meals. It is a score between 0 and 100 that takes into account both shortfall and excess nutrients, adjusted for the energy content of the meal. In the present study the score was applied to 147849 meals reported in the National Health and Nutrition Examination Surveys (NHANES) 2005-2014 in order to evaluate its validity and compare against exemplary meals designed as part of 24h diets that meet US dietary guidelines. Meals from exemplary menu plans developed by nutrition experts scored on average 76±14 (mean ± standard deviation) whereas those of NHANES participants scored 45±14. Scores of breakfast, lunch, dinner, snack, considered jointly as independent variables, were moderately but positively and significantly associated with the Healthy Eating Index (Pearson correlation 0.6). MBI scores were significantly associated with the density of positive micronutrients (e.g. Vit A, Vit C) and favorable food groups (e.g. fruits, whole grains) not directly included in the MBI algorithm. The MBI is a valid tool to assess the nutritional quality of meals reported in the US population and if applied to culinary recipe websites could potentially help users to understand which meals are nutritionally balanced. Choice of more balanced individual meals can guide healthier cooking and eating.
|
21
|
Mendelian randomization while jointly modeling cis genetics identifies causal relationships between gene expression and lipids. Nat Commun 2020; 11:4930. [PMID: 33004804 PMCID: PMC7530717 DOI: 10.1038/s41467-020-18716-x] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 07/09/2019] [Accepted: 09/08/2020] [Indexed: 12/11/2022] Open
Abstract
Inference of causality between gene expression and complex traits using Mendelian randomization (MR) is confounded by pleiotropy and linkage disequilibrium (LD) of gene-expression quantitative trait loci (eQTL). Here, we propose an MR method, MR-link, that accounts for unobserved pleiotropy and LD by leveraging information from individual-level data, even when only one eQTL variant is present. In simulations, MR-link shows false-positive rates close to expectation (median 0.05) and high power (up to 0.89), outperforming all other tested MR methods and coloc. Application of MR-link to low-density lipoprotein cholesterol (LDL-C) measurements in 12,449 individuals with expression and protein QTL summary statistics from blood and liver identifies 25 genes causally linked to LDL-C. These include the known SORT1 and ApoE genes as well as PVRL2, located in the APOE locus, for which a causal role in liver was not known. Our results showcase the strength of MR-link for transcriptome-wide causal inferences. Mendelian randomization is a useful tool to infer causal relationships between traits, but can be confounded by the presence of pleiotropy. Here, the authors have developed MR-link, a Mendelian randomization method which accounts for unobserved pleiotropy and linkage disequilibrium between instrumental variables.
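For orientation, the classical single-instrument Mendelian randomization estimate (the Wald ratio) can be sketched as below. This is not MR-link and does not correct for pleiotropy or LD; it only illustrates the baseline quantity that such methods refine. The function name, argument names and the delta-method standard error (which ignores any covariance between the two association estimates) are assumptions made for this illustration.

```python
import math


def wald_ratio(beta_exposure, se_exposure, beta_outcome, se_outcome):
    """Single-variant MR estimate of the effect of an exposure on an outcome.

    beta_exposure / se_exposure: variant-exposure association (e.g. an eQTL effect).
    beta_outcome / se_outcome:   variant-outcome association (e.g. on LDL-C).
    Assumes the two associations are estimated independently.
    """
    estimate = beta_outcome / beta_exposure
    # Delta-method variance of a ratio, without a covariance term.
    variance = (se_outcome ** 2) / beta_exposure ** 2 \
        + (beta_outcome ** 2) * (se_exposure ** 2) / beta_exposure ** 4
    se = math.sqrt(variance)
    z = estimate / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return estimate, se, p_value


# Hypothetical summary statistics, for illustration only.
est, se, p = wald_ratio(0.5, 0.05, 0.1, 0.02)
print(f"causal estimate = {est:.3f}, SE = {se:.4f}, p = {p:.2e}")
```

With a single variant the estimate is only as trustworthy as the instrument: any pleiotropic effect, or LD with another causal variant, biases the ratio directly, which is the problem the abstract says MR-link is designed to address.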
|
22
|
Wittenburg D, Bonk S, Doschoris M, Reyer H. Design of experiments for fine-mapping quantitative trait loci in livestock populations. BMC Genet 2020; 21:66. [PMID: 32600319 PMCID: PMC7324978 DOI: 10.1186/s12863-020-00871-1] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 02/21/2020] [Accepted: 06/09/2020] [Indexed: 11/12/2022] Open
Abstract
BACKGROUND Single nucleotide polymorphisms (SNPs) which capture a significant impact on a trait can be identified with genome-wide association studies. High linkage disequilibrium (LD) among SNPs makes it difficult to identify causative variants correctly. Thus, often target regions instead of single SNPs are reported. Sample size has not only a crucial impact on the precision of parameter estimates, it also ensures that a desired level of statistical power can be reached. We study the design of experiments for fine-mapping of signals of a quantitative trait locus in such a target region. METHODS A multi-locus model makes it possible to identify causative variants simultaneously, to state their positions more precisely and to account for existing dependencies. Based on the commonly applied SNP-BLUP approach, we determine the z-score statistic for locally testing non-zero SNP effects and investigate its distribution under the alternative hypothesis. This quantity employs the theoretical instead of observed dependence between SNPs; it can be set up as a function of paternal and maternal LD for any given population structure. RESULTS We simulated multiple paternal half-sib families and considered a target region of 1 Mbp. A bimodal distribution of estimated sample size was observed, particularly if more than two causative variants were assumed. The median of estimates constituted the final proposal of optimal sample size; it was consistently less than sample size estimated from single-SNP investigation which was used as a baseline approach. The second mode pointed to inflated sample sizes and could be explained by blocks of varying linkage phases leading to negative correlations between SNPs. Optimal sample size increased almost linearly with number of signals to be identified but depended much more strongly on the assumed heritability. For instance, three times as many samples were required if heritability was 0.1 compared to 0.3. An R package is provided that comprises all required tools. CONCLUSIONS Our approach incorporates information about the population structure into the design of experiments. Compared to a conventional method, this leads to a reduced estimate of sample size enabling the resource-saving design of future experiments for fine-mapping of candidate variants.
Affiliation(s)
- Dörte Wittenburg
- Leibniz Institute for Farm Animal Biology, Institute of Genetics and Biometry, Dummerstorf, 18196 Germany
| | - Sarah Bonk
- University Medicine Greifswald, Department of Psychiatry and Psychotherapy, Greifswald, 17475 Germany
| | - Michael Doschoris
- Leibniz Institute for Farm Animal Biology, Institute of Genetics and Biometry, Dummerstorf, 18196 Germany
| | - Henry Reyer
- Leibniz Institute for Farm Animal Biology, Institute of Genome Biology, Dummerstorf, 18196 Germany
| |
|
23
|
On Some Test Statistics for Testing the Regression Coefficients in Presence of Multicollinearity: A Simulation Study. STATS 2020. [DOI: 10.3390/stats3010005] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 11/17/2022] Open
Abstract
Ridge regression is a popular method to solve the multicollinearity problem for both linear and non-linear regression models. This paper studied forty different ridge regression t-type tests of the individual coefficients of a linear regression model. A simulation study was conducted to evaluate the performance of the proposed tests with respect to their empirical sizes and powers under different settings. Our simulation results demonstrated that many of the proposed tests have type I error rates close to the 5% nominal level and, among those, all tests except one have considerable gain in powers over the standard ordinary least squares (OLS) t-type test. It was observed from our simulation results that seven tests based on some ridge estimators performed better than the rest in terms of achieving higher power gains while maintaining a 5% nominal size.
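A ridge t-type test of an individual coefficient can be sketched as follows. The abstract does not specify which forty estimators of the ridge parameter were compared, so the value of `k` is simply passed in, and the function name, the standardisation step and the residual degrees of freedom (n - p - 1) are assumptions made for this illustration rather than the paper's definitions. The statistic divides each ridge coefficient by the square root of the corresponding diagonal element of its estimated variance, σ²(X'X + kI)⁻¹X'X(X'X + kI)⁻¹.

```python
import numpy as np


def ridge_t_statistics(X, y, k):
    """Ridge coefficients and t-type statistics for a ridge parameter k >= 0.

    X is an (n, p) predictor matrix and y a length-n response.
    Predictors are standardised and y is centred, so no intercept is fitted.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    yc = y - y.mean()

    XtX = Xs.T @ Xs
    A = np.linalg.inv(XtX + k * np.eye(p))      # (X'X + kI)^{-1}
    beta = A @ Xs.T @ yc                        # ridge estimate
    resid = yc - Xs @ beta
    sigma2 = resid @ resid / (n - p - 1)        # one simple variance estimate (assumed df)
    cov = sigma2 * A @ XtX @ A                  # estimated variance of the ridge estimate
    t = beta / np.sqrt(np.diag(cov))
    return beta, t


# Illustrative data: two nearly collinear predictors and one independent predictor.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + 0.05 * rng.normal(size=200)
x3 = rng.normal(size=200)
X = np.column_stack([x1, x2, x3])
y = 2.0 * x3 + rng.normal(size=200)
beta, t = ridge_t_statistics(X, y, k=1.0)
print("ridge coefficients:", beta)
print("t-type statistics: ", t)
```

Setting `k = 0` recovers the ordinary least squares coefficients on the standardised predictors, which gives a direct baseline for comparing how much a given `k` stabilises the statistics for the collinear pair.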
|
24
|
Shigemizu D, Akiyama S, Asanomi Y, Boroevich KA, Sharma A, Tsunoda T, Sakurai T, Ozaki K, Ochiya T, Niida S. A comparison of machine learning classifiers for dementia with Lewy bodies using miRNA expression data. BMC Med Genomics 2019; 12:150. [PMID: 31666070 PMCID: PMC6822471 DOI: 10.1186/s12920-019-0607-3] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 05/06/2019] [Accepted: 10/18/2019] [Indexed: 12/21/2022] Open
Abstract
Background Dementia with Lewy bodies (DLB) is the second most common subtype of neurodegenerative dementia in humans following Alzheimer’s disease (AD). Present clinical diagnosis of DLB has high specificity and low sensitivity and finding potential biomarkers of prodromal DLB is still challenging. MicroRNAs (miRNAs) have recently received a lot of attention as a source of novel biomarkers. Methods In this study, using serum miRNA expression of 478 Japanese individuals, we investigated potential miRNA biomarkers and constructed an optimal risk prediction model based on several machine learning methods: penalized regression, random forest, support vector machine, and gradient boosting decision tree. Results The final risk prediction model, constructed via a gradient boosting decision tree using 180 miRNAs and two clinical features, achieved an accuracy of 0.829 on an independent test set. We further predicted candidate target genes from the miRNAs. Gene set enrichment analysis of the miRNA target genes revealed 6 functional genes included in the DHA signaling pathway associated with DLB pathology. Two of them were further supported by gene-based association studies using a large number of single nucleotide polymorphism markers (BCL2L1: P = 0.012, PIK3R2: P = 0.021). Conclusions Our proposed prediction model provides an effective tool for DLB classification. Also, a gene-based association test of rare variants revealed that BCL2L1 and PIK3R2 were statistically significantly associated with DLB.
Affiliation(s)
- Daichi Shigemizu
- Laboratory Chief, Division of Genomic Medicine, Medical Genome Center, National Center for Geriatrics and Gerontology, 7-430 Morioka-cho, Obu, Aichi, 474-8511, Japan.,Department of Medical Science Mathematics, Medical Research Institute, Tokyo Medical and Dental University (TMDU), Tokyo, 113-8510, Japan.,RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, 230-0045, Japan.,CREST, JST, Tokyo, 113-8510, Japan
| | - Shintaro Akiyama
- Laboratory Chief, Division of Genomic Medicine, Medical Genome Center, National Center for Geriatrics and Gerontology, 7-430 Morioka-cho, Obu, Aichi, 474-8511, Japan
| | - Yuya Asanomi
- Laboratory Chief, Division of Genomic Medicine, Medical Genome Center, National Center for Geriatrics and Gerontology, 7-430 Morioka-cho, Obu, Aichi, 474-8511, Japan
| | - Keith A Boroevich
- RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, 230-0045, Japan
| | - Alok Sharma
- RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, 230-0045, Japan.,CREST, JST, Tokyo, 113-8510, Japan.,School of Engineering & Physics, University of the South Pacific, Suva, Fiji.,Institute for Integrated and Intelligent Systems, Griffith University, QLD, Brisbane, 4111, Australia
| | - Tatsuhiko Tsunoda
- Department of Medical Science Mathematics, Medical Research Institute, Tokyo Medical and Dental University (TMDU), Tokyo, 113-8510, Japan.,RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, 230-0045, Japan.,CREST, JST, Tokyo, 113-8510, Japan
| | - Takashi Sakurai
- The Center for Comprehensive Care and Research on Memory Disorders, National Center for Geriatrics and Gerontology, Obu, Aichi, 474-8511, Japan.,Department of Cognitive and Behavioral Science, Nagoya University Graduate School of Medicine, Nagoya, Aichi, 466-8550, Japan
| | - Kouichi Ozaki
- Laboratory Chief, Division of Genomic Medicine, Medical Genome Center, National Center for Geriatrics and Gerontology, 7-430 Morioka-cho, Obu, Aichi, 474-8511, Japan.,RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, 230-0045, Japan
| | - Takahiro Ochiya
- Division of Molecular and Cellular Medicine, Fundamental Innovative Oncology Core Center, National Cancer Center Research Institute, Tokyo, 104-0045, Japan.,Institute of Medical Science, Tokyo Medical University, Tokyo, 160-8402, Japan
| | - Shumpei Niida
- Laboratory Chief, Division of Genomic Medicine, Medical Genome Center, National Center for Geriatrics and Gerontology, 7-430 Morioka-cho, Obu, Aichi, 474-8511, Japan
| |
|
25
|
Majid A, Aslam M, Altaf S, Amanullah M. Addressing the distributed lag models with heteroscedastic errors. COMMUN STAT-SIMUL C 2019. [DOI: 10.1080/03610918.2019.1643884] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 10/26/2022]
Affiliation(s)
- Abdul Majid
- Pakistan Bureau of Statistics, Regional Office, Multan, Pakistan
| | - Muhammad Aslam
- Department of Statistics, Bahauddin Zakariya University, Multan, Pakistan
| | - Saima Altaf
- Department of Statistics, Bahauddin Zakariya University, Multan, Pakistan
| | - Muhammad Amanullah
- Department of Statistics, Bahauddin Zakariya University, Multan, Pakistan
| |
|
26
|
Dysfunctional Neural Processes Underlying Context Processing Deficits in Schizophrenia. BIOLOGICAL PSYCHIATRY: COGNITIVE NEUROSCIENCE AND NEUROIMAGING 2019; 4:644-654. [PMID: 31147272 DOI: 10.1016/j.bpsc.2019.03.012] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 03/01/2019] [Accepted: 03/20/2019] [Indexed: 12/21/2022]
Abstract
BACKGROUND People with schizophrenia (PSZ) have profound deficits in context processing, an executive process that guides adaptive behaviors according to goals and stored contextual information. Although various neural processes are involved in context processing and are affected in PSZ, the core underlying neural dysfunction is unclear. METHODS To determine the relative importance of neural dysfunctions within prefrontal cognitive control, sensory activity, and motor activity to context processing deficits in PSZ, we examined event-related potentials (ERPs) in 60 PSZ and 51 healthy control subjects during an optimal context processing task. We also analyzed the Ex-Gaussian reaction time distribution to examine abnormalities in motor control variability in PSZ. RESULTS Compared with healthy control subjects, PSZ had lower response accuracy and greater variability in their normal reaction times during high context processing demands. Latencies of normal and slow responses were generally increased in PSZ. High context processing-related reductions in frontal ERPs were indicative of specific deficits in proactive and reactive cognitive controls in PSZ, while ERPs associated with visual and motor processes were reduced regardless of context processing demands, indicating generalized visuomotor deficits. In contrast to previous studies, we found that diminished frontal responses reflective of proactive control of the contextual cue, rather than visual responses of cue encoding, predicted response accuracy deficits in PSZ. In addition, probe-related ERP components of motor preparation, prefrontal reactive control, and frontomotor interaction predicted Ex-Gaussian indices of reaction time instability in PSZ. CONCLUSIONS Prefrontal proactive and reactive control deficits associated with failures in using mental representation likely underlie context processing deficits in PSZ.
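The Ex-Gaussian model treats a reaction time as the sum of a normal component (mean μ, standard deviation σ) and an exponential component (mean τ), so that μ and σ describe the bulk of "normal" responses and τ the slow tail. The abstract does not say how the distribution was fitted; the sketch below uses a simple method-of-moments estimator as a stand-in (maximum likelihood is the more common choice in practice). The function name and the simulated parameter values are hypothetical.

```python
import random
import statistics


def exgauss_moments(rts):
    """Method-of-moments estimates (mu, sigma, tau) of an Ex-Gaussian distribution.

    Uses the identities: mean = mu + tau, variance = sigma^2 + tau^2,
    third central moment = 2 * tau^3.
    """
    n = len(rts)
    mean = statistics.fmean(rts)
    var = statistics.pvariance(rts, mu=mean)
    m3 = sum((x - mean) ** 3 for x in rts) / n      # third central moment
    tau = (max(m3, 0.0) / 2.0) ** (1.0 / 3.0)       # no right skew -> tau = 0
    tau = min(tau, var ** 0.5)                      # keep sigma^2 non-negative
    sigma = max(var - tau ** 2, 0.0) ** 0.5
    mu = mean - tau
    return mu, sigma, tau


# Simulated reaction times in ms (illustrative values, not study data).
rng = random.Random(0)
rts = [rng.gauss(400.0, 40.0) + rng.expovariate(1.0 / 100.0) for _ in range(100_000)]
mu, sigma, tau = exgauss_moments(rts)
print(f"mu = {mu:.1f} ms, sigma = {sigma:.1f} ms, tau = {tau:.1f} ms")
```

Moment estimators are sensitive to outliers because they depend on the third central moment, so for the sample sizes typical of single-subject reaction time data a likelihood-based fit is usually preferable.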
|
27
|
Keaton SA, Madaj ZB, Heilman P, Smart L, Grit J, Gibbons R, Postolache T, Roaten K, Achtyes E, Brundin L. An inflammatory profile linked to increased suicide risk. J Affect Disord 2019; 247:57-65. [PMID: 30654266 PMCID: PMC6860980 DOI: 10.1016/j.jad.2018.12.100] [Citation(s) in RCA: 63] [Impact Index Per Article: 12.6] [Received: 09/14/2018] [Revised: 11/25/2018] [Accepted: 12/24/2018] [Indexed: 12/19/2022]
Abstract
BACKGROUND Suicide risk assessments are often challenging for clinicians, and therefore, biological markers are warranted as guiding tools in these assessments. Suicidal patients display increased cytokine levels in peripheral blood, although the composite inflammatory profile in the subjects is still unknown. It is also not yet established whether certain inflammatory changes are specific to suicidal subjects. To address this, we measured 45 immunobiological factors in peripheral blood and identified the biological profiles associated with cross-diagnostic suicide risk and depression, respectively. METHODS Sixty-six women with mood and anxiety disorders underwent computerized adaptive testing for mental health, assessing depression and suicide risk. Weighted correlation network analysis was used to uncover system level associations between suicide risk, depression, and the immunobiological factors in plasma. Secondary regression models were used to establish the sensitivity of the results to potential confounders, including age, body mass index (BMI), treatment and symptoms of depression and anxiety. RESULTS The biological profile of patients assessed to be at increased suicide risk differed from that associated with depression. At the system level, a biological cluster containing increased levels of interleukin-6, lymphocytes, monocytes, white blood cell count and polymorphonuclear leukocyte count significantly impacted suicide risk, with the latter two inferring the strongest influence. The cytokine interleukin-8 was independently and negatively associated with increased suicide risk. The results remained after adjusting for confounders. LIMITATIONS This study is cross-sectional and not designed to prove causality. DISCUSSION A unique immunobiological profile was linked to increased suicide risk. The profile was different from that observed in patients with depressive symptoms, and indicates that granulocyte mediated biological mechanisms could be activated in patients at risk for suicide.
Affiliation(s)
- Sarah A Keaton
- Department of Physiology, Michigan State University, East Lansing, MI, USA,Center for Neurodegenerative Science, Van Andel Research Institute, Grand Rapids, MI, USA
| | - Zachary B Madaj
- Bioinformatics and Biostatistics Core, Van Andel Research Institute, Grand Rapids, MI, USA
| | - Patrick Heilman
- Center for Neurodegenerative Science, Van Andel Research Institute, Grand Rapids, MI, USA
| | - LeAnn Smart
- Pine Rest Christian Mental Health Services, Grand Rapids, MI, USA
| | - Jamie Grit
- Center for Cancer and Cell Biology, Van Andel Research Institute, Grand Rapids, MI, USA
| | - Robert Gibbons
- Center for Health Statistics, Departments of Medicine and Public Health Sciences, University of Chicago, Illinois, USA
| | - Teodor Postolache
- Department of Psychiatry, University of Maryland-Baltimore School of Medicine, Baltimore, MD, USA,Rocky Mountain Mirecc, Denver, CO, USA
| | - Kimberly Roaten
- Department of Psychiatry, University of Texas Southwestern, Dallas, TX, USA
| | - Eric Achtyes
- Pine Rest Christian Mental Health Services, Grand Rapids, MI, USA,Division of Psychiatry & Behavioral Medicine, Michigan State University College of Human Medicine, Grand Rapids, Michigan, USA
| | - Lena Brundin
- Center for Neurodegenerative Science, Van Andel Research Institute, Grand Rapids, MI, USA.
| |
|
28
|
Joint Analysis of Multiple Phenotypes in Association Studies based on Cross-Validation Prediction Error. Sci Rep 2019; 9:1073. [PMID: 30705317 PMCID: PMC6355816 DOI: 10.1038/s41598-018-37538-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 04/05/2018] [Accepted: 11/19/2018] [Indexed: 01/28/2023] Open
Abstract
In genome-wide association studies (GWAS), joint analysis of multiple phenotypes could have increased statistical power over analyzing each phenotype individually to identify genetic variants that are associated with complex diseases. With this motivation, several statistical methods that jointly analyze multiple phenotypes have been developed, such as O’Brien’s method, Trait-based Association Test that uses Extended Simes procedure (TATES), multivariate analysis of variance (MANOVA), and joint model of multiple phenotypes (MultiPhen). However, the performance of these methods under a wide range of scenarios is not consistent: one test may be powerful in some situations, but not in the others. Thus, one challenge in joint analysis of multiple phenotypes is to construct a test that could maintain good performance across different scenarios. In this article, we develop a novel statistical method to test associations between a genetic variant and Multiple Phenotypes based on cross-validation Prediction Error (MultP-PE). Extensive simulations are conducted to evaluate the type I error rates and to compare the power performance of MultP-PE with various existing methods. The simulation studies show that MultP-PE controls type I error rates very well and has consistently higher power than the tests we compared in all simulation scenarios. We conclude with the recommendation for the use of MultP-PE for its good performance in association studies with multiple phenotypes.
|
29
|
Rasigade JP, Leclère A, Alla F, Tessier A, Bes M, Lechiche C, Vernet-Garnier V, Laouénan C, Vandenesch F, Leport C. Staphylococcus aureus CC30 Lineage and Absence of sed, j, r-Harboring Plasmid Predict Embolism in Infective Endocarditis. Front Cell Infect Microbiol 2018; 8:187. [PMID: 29938201 PMCID: PMC6003251 DOI: 10.3389/fcimb.2018.00187] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 12/18/2017] [Accepted: 05/14/2018] [Indexed: 12/28/2022] Open
Abstract
Staphylococcus aureus induces severe infective endocarditis (IE) where embolic complications are a major cause of death. Risk factors for embolism have been reported such as a younger age or larger IE vegetations, while methicillin resistance conferred by the mecA gene appeared as a protective factor. It is unclear, however, whether embolism is influenced by other S. aureus characteristics such as clonal complex (CC) or virulence pattern. We examined clinical and microbiological predictors of embolism in a prospective multicentric cohort of 98 French patients with monomicrobial S. aureus IE. The genomic contents of causative isolates were characterized using DNA array. To preserve statistical power, genotypic predictors were restricted to CC, secreted virulence factors and virulence regulators. Multivariate regularized logistic regression identified three independent predictors of embolism. Patients at higher risk were younger than the cohort median age of 62.5 y (adjusted odds ratio [OR] 0.14; 95% confidence interval [CI] 0.05-0.36). S. aureus characteristics predicting embolism were a CC30 genetic background (adjusted OR 9.734; 95% CI 1.53-192.8) and the absence of pIB485-like plasmid-borne enterotoxin-encoding genes sed, sej, and ser (sedjr; adjusted OR 0.07; 95% CI 0.004-0.457). CC30 S. aureus has been repeatedly reported to exhibit enhanced fitness in bloodstream infections, which might impact its ability to cause embolism. sedjr-encoded enterotoxins, whose superantigenic activity is unlikely to protect against embolism, possibly acted as a proxy to other genes of the pIB485-like plasmid found in genetically unrelated isolates from mostly embolism-free patients. mecA did not independently predict embolism but was strongly associated with sedjr. This mecA-sedjr association might have driven previous reports of a negative association of mecA and embolism. Collectively, our results suggest that the influence of S. aureus genotypic features on the risk of embolism may be stronger than previously suspected and independent of clinical risk factors.
Affiliation(s)
- Jean-Philippe Rasigade
- CIRI, Centre International de Recherche en Infectiologie, Inserm U1111, Université Claude Bernard Lyon 1, CNRS UMR5308, ENS de Lyon, Lyon, France.,Centre National de Référence des Staphylocoques, Hospices Civils de Lyon, Lyon, France
| | - Amélie Leclère
- UMR-1137, Université Paris Diderot, Sorbonne Paris Cité, Paris, France.,Institut National de la Santé et de la Recherche Médicale, UMR-1137, Paris, France
| | - François Alla
- CIC-1433 Epidémiologie Clinique, Institut National de la Santé et de la Recherche Médicale, Centre Hospitalier Universitaire de Nancy, Nancy, France.,EA4360, Apemac, Université de Lorraine, Nancy, France
| | - Adrien Tessier
- UMR-1137, Université Paris Diderot, Sorbonne Paris Cité, Paris, France.,Institut National de la Santé et de la Recherche Médicale, UMR-1137, Paris, France
| | - Michèle Bes
- CIRI, Centre International de Recherche en Infectiologie, Inserm U1111, Université Claude Bernard Lyon 1, CNRS UMR5308, ENS de Lyon, Lyon, France.,Centre National de Référence des Staphylocoques, Hospices Civils de Lyon, Lyon, France
| | - Catherine Lechiche
- Service de Maladies Infectieuses et Tropicales Centre Hospitalier Universitaire de Nîmes Caremeau, Nîmes, France
| | - Véronique Vernet-Garnier
- Faculté de Médecine EA 4687 Université de Reims Champagne Ardenne, Reims, France.,Laboratoire de Bactériologie, Centre Hospitalier Universitaire de Reims Robert Debré, Reims, France
| | - Cédric Laouénan
- UMR-1137, Université Paris Diderot, Sorbonne Paris Cité, Paris, France.,Institut National de la Santé et de la Recherche Médicale, UMR-1137, Paris, France.,Service de Biostatistiques, Hôpital Bichat, AP-HP, Paris, France
| | - François Vandenesch
- CIRI, Centre International de Recherche en Infectiologie, Inserm U1111, Université Claude Bernard Lyon 1, CNRS UMR5308, ENS de Lyon, Lyon, France.,Centre National de Référence des Staphylocoques, Hospices Civils de Lyon, Lyon, France
| | - Catherine Leport
- UMR-1137, Université Paris Diderot, Sorbonne Paris Cité, Paris, France.,Institut National de la Santé et de la Recherche Médicale, UMR-1137, Paris, France.,Unité de Coordination du Risque Épidémique et Biologique, AP-HP, Paris, France
| |
|
30
|
Abstract
Functional near-infrared spectroscopy (fNIRS) is a noninvasive neuroimaging technique that uses low levels of light (650-900 nm) to measure changes in cerebral blood volume and oxygenation. Over the last several decades, this technique has been utilized in a growing number of functional and resting-state brain studies. The lower operation cost, portability, and versatility of this method make it an alternative to methods such as functional magnetic resonance imaging for studies in pediatric and special populations and for studies without the confining limitations of a supine and motionless acquisition setup. However, the analysis of fNIRS data poses several challenges stemming from the unique physics of the technique, the unique statistical properties of data, and the growing diversity of non-traditional experimental designs being utilized in studies due to the flexibility of this technology. For these reasons, specific analysis methods for this technology must be developed. In this paper, we introduce the NIRS Brain AnalyzIR toolbox as an open-source Matlab-based analysis package for fNIRS data management, pre-processing, and first- and second-level (i.e., single subject and group-level) statistical analysis. Here, we describe the basic architectural format of this toolbox, which is based on the object-oriented programming paradigm. We also detail the algorithms for several of the major components of the toolbox including statistical analysis, probe registration, image reconstruction, and region-of-interest based statistics.
Affiliation(s)
- Hendrik Santosa
- Department of Radiology, University of Pittsburgh, Pittsburgh, PA 15213-2536, USA
| | - Xuetong Zhai
- Department of Bioengineering, University of Pittsburgh, Pittsburgh, PA 15213-2536, USA
| | - Frank Fishburn
- Department of Psychiatry, University of Pittsburgh, Pittsburgh, PA 15213-2536, USA
| | - Theodore Huppert
- Departments of Radiology and Bioengineering, University of Pittsburgh, Clinical Science Translational Institute, and Center for the Neural Basis of Cognition, Pittsburgh, PA 15213-2536, USA
| |
|
31
|
|
32
|
Davidov O, Jelsema CM, Peddada S. Testing for inequality constraints in singular models by trimming or winsorizing the variance matrix. J Am Stat Assoc 2018; 113:906-918. [PMID: 33093735 DOI: 10.1080/01621459.2017.1301258] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 10/19/2022]
Abstract
There are many applications in which a statistic follows, at least asymptotically, a normal distribution with a singular or nearly singular variance matrix. A classic example occurs in linear regression models under multicollinearity but there are many more such examples. There is well-developed theory for testing linear equality constraints when the alternative is two-sided and the variance matrix is either singular or non-singular. In recent years there is considerable, and growing, interest in developing methods for situations in which the estimated variance matrix is nearly singular. However, there is no corresponding methodology for addressing one-sided, i.e., constrained or ordered alternatives. In this paper we develop a unified framework for analyzing such problems. Our approach may be viewed as the trimming or winsorizing of the eigenvalues of the corresponding variance matrix. The proposed methodology is applicable to a wide range of scientific problems and to a variety of statistical models in which inequality constraints arise. We illustrate the methodology using data from a gene expression microarray experiment obtained from the NIEHS' Fibroid Growth Study.
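The idea of trimming or winsorizing the eigenvalues of a nearly singular variance matrix can be sketched as below. This is one plausible reading of the abstract, not the authors' procedure: here "winsorizing" lifts every eigenvalue below a threshold up to that threshold, and "trimming" sets those eigenvalues to zero so that the corresponding directions are dropped. The function name, the `threshold` argument and the example matrix are hypothetical, and how the threshold should be chosen is not addressed.

```python
import numpy as np


def regularise_variance(S, threshold, mode="winsorize"):
    """Return a variance matrix whose small eigenvalues are winsorized or trimmed.

    S         : symmetric (nearly) singular variance matrix.
    threshold : eigenvalues below this value are treated as unreliable.
    mode      : "winsorize" replaces them by the threshold,
                "trim" replaces them by zero (reduced-rank matrix).
    """
    S = np.asarray(S, dtype=float)
    vals, vecs = np.linalg.eigh((S + S.T) / 2.0)   # symmetric eigendecomposition
    small = vals < threshold
    if mode == "winsorize":
        new_vals = np.where(small, threshold, vals)
    elif mode == "trim":
        new_vals = np.where(small, 0.0, vals)
    else:
        raise ValueError("mode must be 'winsorize' or 'trim'")
    return (vecs * new_vals) @ vecs.T              # V diag(new_vals) V'


# Two almost perfectly correlated estimates: eigenvalues 1.999 and 0.001.
S = np.array([[1.0, 0.999],
              [0.999, 1.0]])
print(np.linalg.eigvalsh(regularise_variance(S, 0.1, "winsorize")))
print(np.linalg.eigvalsh(regularise_variance(S, 0.1, "trim")))
```

The winsorized matrix stays invertible, so standard quadratic-form statistics can be computed directly, whereas the trimmed matrix requires a generalized inverse restricted to the retained eigen-directions.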
Collapse
Affiliation(s)
- Ori Davidov
- Department of Statistics, University of Haifa, Mount Carmel, Haifa 31905 Israel
| | - Casey M Jelsema
- Department of Statistics, West Virginia University, Morgantown, WV, 26506
| | - Shyamal Peddada
- Biostatistics and Computational Biology Branch, National Institute of Environmental Health Sciences, Alexander Drive, RTP, NC 27709
| |
|
33
|
Tu-Chan AP, Natraj N, Godlove J, Abrams G, Ganguly K. Effects of somatosensory electrical stimulation on motor function and cortical oscillations. J Neuroeng Rehabil 2017; 14:113. [PMID: 29132379 PMCID: PMC5683582 DOI: 10.1186/s12984-017-0323-1] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 06/06/2017] [Accepted: 10/30/2017] [Indexed: 01/11/2023] Open
Abstract
Background Few patients recover full hand dexterity after an acquired brain injury such as stroke. Repetitive somatosensory electrical stimulation (SES) is a promising method to promote recovery of hand function. However, studies using SES have largely focused on gross motor function; it remains unclear if it can modulate distal hand functions such as finger individuation. Objective The specific goal of this study was to monitor the effects of SES on individuation as well as on cortical oscillations measured using EEG, with the additional goal of identifying neurophysiological biomarkers. Methods Eight participants with a history of acquired brain injury and distal upper limb motor impairments received a single two-hour session of SES using transcutaneous electrical nerve stimulation. Pre- and post-intervention assessments consisted of the Action Research Arm Test (ARAT), finger fractionation, pinch force, and the modified Ashworth scale (MAS), along with resting-state EEG monitoring. Results SES was associated with significant improvements in ARAT, MAS and finger fractionation. Moreover, SES was associated with a decrease in low frequency (0.9-4 Hz delta) ipsilesional parietomotor EEG power. Interestingly, changes in ipsilesional motor theta (4.8–7.9 Hz) and alpha (8.8–11.7 Hz) power were significantly correlated with finger fractionation improvements when using a multivariate model. Conclusions We show the positive effects of SES on finger individuation and identify cortical oscillations that may be important electrophysiological biomarkers of individual responsiveness to SES. These biomarkers can be potential targets when customizing SES parameters to individuals with hand dexterity deficits. Trial registration: NCT03176550; retrospectively registered. Electronic supplementary material The online version of this article (10.1186/s12984-017-0323-1) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Adelyn P Tu-Chan
- Department of Neurology, University of California, San Francisco, USA; Neurology & Rehabilitation Service, San Francisco VA Medical Center, 1700 Owens Street, San Francisco, California, 94158, USA
- Nikhilesh Natraj
- Department of Neurology, University of California, San Francisco, USA; Neurology & Rehabilitation Service, San Francisco VA Medical Center, 1700 Owens Street, San Francisco, California, 94158, USA
- Jason Godlove
- Department of Neurology, University of California, San Francisco, USA; Neurology & Rehabilitation Service, San Francisco VA Medical Center, 1700 Owens Street, San Francisco, California, 94158, USA
- Gary Abrams
- Department of Neurology, University of California, San Francisco, USA; Neurology & Rehabilitation Service, San Francisco VA Medical Center, 1700 Owens Street, San Francisco, California, 94158, USA
- Karunesh Ganguly
- Department of Neurology, University of California, San Francisco, USA; Neurology & Rehabilitation Service, San Francisco VA Medical Center, 1700 Owens Street, San Francisco, California, 94158, USA
|
34
|
Yang SP, Emura T. A Bayesian approach with generalized ridge estimation for high-dimensional regression and testing. COMMUN STAT-SIMUL C 2017. [DOI: 10.1080/03610918.2016.1193195] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
Affiliation(s)
- Szu-Peng Yang
- Graduate Institute of Statistics, National Central University, Taiwan
- Takeshi Emura
- Graduate Institute of Statistics, National Central University, Taiwan
|
35
|
Yu W, Lee S, Park T. A unified model based multifactor dimensionality reduction framework for detecting gene-gene interactions. Bioinformatics 2017; 32:i605-i610. [PMID: 27587680 DOI: 10.1093/bioinformatics/btw424] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION: Gene-gene interaction (GGI) is one of the most popular approaches for finding and explaining the missing heritability of common complex traits in genome-wide association studies. The multifactor dimensionality reduction (MDR) method has been widely studied for detecting GGI effects. However, existing MDR-based approaches have several disadvantages, such as the lack of an efficient way of evaluating the significance of multi-locus models and the high computational burden due to intensive permutation. Furthermore, the MDR method does not distinguish marginal effects from pure interaction effects. METHODS: We propose a two-step unified model based MDR approach (UM-MDR), in which the significance of a multi-locus model, even a high-order model, can be easily obtained through a regression framework with a semi-parametric correction procedure for controlling Type I error rates. In comparison to the conventional permutation approach, the proposed semi-parametric correction procedure avoids heavy computation when assessing the significance of a multi-locus model. The proposed UM-MDR approach is flexible in the sense that it is able to incorporate different types of traits and evaluate the significance of the existing MDR extensions. RESULTS: Simulation studies and the analysis of a real example are provided to demonstrate the utility of the proposed method. UM-MDR can achieve at least the same power as MDR for most scenarios, and it outperforms MDR especially when there are some single nucleotide polymorphisms that only have marginal effects, which mask the detection of causal epistasis for the existing MDR approaches. CONCLUSIONS: UM-MDR provides a very good supplement to the existing MDR method due to its efficiency in achieving significance for every multi-locus model, its power and its flexibility in handling different types of traits.
AVAILABILITY AND IMPLEMENTATION: An R package "umMDR" and other source codes are freely available at http://statgen.snu.ac.kr/software/umMDR/ CONTACT: tspark@stats.snu.ac.kr SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Wenbao Yu
- Department of Statistics, Seoul National University, Shilim-Dong, Kwanak-Gu, Seoul 151-742, Korea
- Seungyeoun Lee
- Department of Mathematics and Statistics, Sejong University, Seoul 143-747, Korea
- Taesung Park
- Department of Statistics, Seoul National University, Shilim-Dong, Kwanak-Gu, Seoul 151-742, Korea
|
36
|
Salehe BR, Jones CI, Di Fatta G, McGuffin LJ. RAPIDSNPs: A new computational pipeline for rapidly identifying key genetic variants reveals previously unidentified SNPs that are significantly associated with individual platelet responses. PLoS One 2017; 12:e0175957. [PMID: 28441463 PMCID: PMC5404774 DOI: 10.1371/journal.pone.0175957] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 09/14/2016] [Accepted: 04/03/2017] [Indexed: 01/14/2023] Open
Abstract
Advances in omics technologies have led to the discovery of genetic markers, or single nucleotide polymorphisms (SNPs), that are associated with particular diseases or complex traits. Although there have been significant improvements in the approaches used to analyse associations of SNPs with disease, further optimised and rapid techniques are needed to keep up with the rate of SNP discovery, which has exacerbated the 'missing heritability' problem. Here, we have devised a novel, integrated, heuristic-based, hybrid analytical computational pipeline, for rapidly detecting novel or key genetic variants that are associated with diseases or complex traits. Our pipeline is particularly useful in genetic association studies where the genotyped SNP data are highly dimensional, and the complex trait phenotype involved is continuous. In particular, the pipeline is more efficient for investigating small sets of genotyped SNPs defined in high dimensional spaces that may be associated with continuous phenotypes, rather than for the investigation of whole genome variants. The pipeline, which employs a consensus approach based on the random forest, was able to rapidly identify previously unseen key SNPs, that are significantly associated with the platelet response phenotype, which was used as our complex trait case study. Several of these SNPs, such as rs6141803 of COMMD7 and rs41316468 in PKT2B, have independently confirmed associations with cardiovascular diseases (CVDs) according to other unrelated studies, suggesting that our pipeline is robust in identifying key genetic variants. Our new pipeline provides an important step towards addressing the problem of 'missing heritability' through enhanced detection of key genetic variants (SNPs) that are associated with continuous complex traits/disease phenotypes.
Affiliation(s)
- Chris Ian Jones
- School of Biological Sciences, University of Reading, Reading, United Kingdom
- Giuseppe Di Fatta
- Department of Computer Science, University of Reading, Reading, United Kingdom
- Liam James McGuffin
- School of Biological Sciences, University of Reading, Reading, United Kingdom
|
37
|
Yang X, Wang S, Zhang S, Sha Q. Detecting association of rare and common variants based on cross-validation prediction error. Genet Epidemiol 2017; 41:233-243. [PMID: 28176359 DOI: 10.1002/gepi.22034] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 03/15/2016] [Revised: 11/22/2016] [Accepted: 11/26/2016] [Indexed: 12/13/2022]
Abstract
Despite the extensive discovery of disease-associated common variants, much of the genetic contribution to complex traits remains unexplained. Rare variants may explain additional disease risk or trait variability. Although sequencing technology provides a supreme opportunity to investigate the roles of rare variants in complex diseases, detection of these variants in sequencing-based association studies presents substantial challenges. In this article, we propose novel statistical tests to test the association between rare and common variants in a genomic region and a complex trait of interest based on cross-validation prediction error (PE). We first propose a PE method based on Ridge regression. Based on PE, we also propose another two tests PE-WS and PE-TOW by testing a weighted combination of variants with two different weighting schemes. PE-WS is the PE version of the test based on the weighted sum statistic (WS) and PE-TOW is the PE version of the test based on the optimally weighted combination of variants (TOW). Using extensive simulation studies, we are able to show that (1) PE-TOW and PE-WS are consistently more powerful than TOW and WS, respectively, and (2) PE is the most powerful test when causal variants contain both common and rare variants.
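The cross-validation prediction error (PE) idea in the abstract above can be illustrated roughly as follows. This is a minimal sketch, not the authors' test or code: the function names (`solve`, `ridge_fit`, `cv_prediction_error`), the penalty value `lam`, the interleaved fold assignment and the toy genotype data are all assumptions made for illustration, and the step that converts PE into a p-value is left out.

```python
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def ridge_fit(xs, ys, lam):
    """Return (intercept, coefficients) for ridge regression with penalty lam > 0."""
    n, p = len(xs), len(xs[0])
    x_mean = [sum(row[j] for row in xs) / n for j in range(p)]
    y_mean = sum(ys) / n
    xc = [[row[j] - x_mean[j] for j in range(p)] for row in xs]
    yc = [y - y_mean for y in ys]
    # Normal equations on centred data: (X'X + lam * I) beta = X'y.
    xtx = [[sum(r[i] * r[j] for r in xc) + (lam if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    xty = [sum(r[j] * y for r, y in zip(xc, yc)) for j in range(p)]
    beta = solve(xtx, xty)
    intercept = y_mean - sum(b * mu for b, mu in zip(beta, x_mean))
    return intercept, beta


def cv_prediction_error(xs, ys, lam=1.0, k=5):
    """Mean squared prediction error over k interleaved folds."""
    n = len(ys)
    total = 0.0
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        test = [i for i in range(n) if i % k == fold]
        b0, beta = ridge_fit([xs[i] for i in train], [ys[i] for i in train], lam)
        for i in test:
            pred = b0 + sum(b * v for b, v in zip(beta, xs[i]))
            total += (ys[i] - pred) ** 2
    return total / n


# Toy data (hypothetical): 60 subjects, 4 variants coded 0/1/2, one causal variant.
rng = random.Random(0)
genotypes = [[rng.choice([0, 1, 2]) for _ in range(4)] for _ in range(60)]
trait = [0.8 * g[0] + rng.gauss(0, 0.5) for g in genotypes]
shuffled = trait[:]
rng.shuffle(shuffled)  # breaks the genotype-trait link
print(cv_prediction_error(genotypes, trait), cv_prediction_error(genotypes, shuffled))
```

The intuition is that a region carrying associated variants should predict held-out trait values better than the same region paired with permuted trait values, so a smaller PE is evidence of association.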
Affiliation(s)
- Xinlan Yang
- Department of Mathematical Sciences, Michigan Technological University, Houghton, MI, USA
- Shuanglin Zhang
- Department of Mathematical Sciences, Michigan Technological University, Houghton, MI, USA
- Qiuying Sha
- Department of Mathematical Sciences, Michigan Technological University, Houghton, MI, USA
|
38
|
Foveau B, Albrecht S, Bennett DA, Correa JA, LeBlanc AC. Increased Caspase-6 activity in the human anterior olfactory nuclei of the olfactory bulb is associated with cognitive impairment. Acta Neuropathol Commun 2016; 4:127. [PMID: 27931265 PMCID: PMC5146837 DOI: 10.1186/s40478-016-0400-x] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 11/08/2016] [Accepted: 12/01/2016] [Indexed: 02/06/2023] Open
Abstract
Abnormally elevated hippocampal Caspase-6 (Casp6) activity is intimately associated with age-related cognitive impairment in humans and in mice. In humans, these high levels of Casp6 activity are initially localized in the entorhinal cortex, the area of the brain first affected by the formation of neurofibrillary tangles, according to Braak staging. The reason for the high vulnerability of entorhinal cortex neurons to neurofibrillary tangle pathology and Casp6 activity is unknown. Casp6 activity is involved in axonal degeneration, therefore, one possibility to explain increased vulnerability of the entorhinal cortex neurons would be that the afferent neurons of the olfactory bulb, some of which project their axons to the entorhinal cortex, are equally degenerating. To examine this possibility, we examined the presence of Casp6 activity, neurofibrillary tangle formation and amyloid deposition by immunohistochemistry with neoepitope antisera against the p20 subunit of active Casp6 and Tau cleaved by Casp6 (Tau∆Casp6), phosphorylated Tau paired helical filament (PHF-1) antibodies and anti-β-amyloid antiserum, respectively, in brains from individuals with no or mild cognitive impairment and Alzheimer disease (AD) dementia. Co-localization of Casp6 activity, PHF-1 and β-amyloid was detected mostly in the anterior olfactory nucleus (AON) of the olfactory bulb. The levels of active Casp6 in the AON, which were the highest in the AD brains, correlated with PHF-1 levels, but not with β-amyloid levels. AON Tau∆Casp6 levels correlated with entorhinal cortex Casp6 activity and PHF-1 levels. Multiple regression analyses demonstrated that AON Casp6 activity was associated with lower global cognitive function, mini mental state exam, episodic memory and semantic memory scores. 
These results suggest that AON Casp6 activity could lead to Casp6-mediated degeneration in the entorhinal cortex, but cannot exclude the possibilities that entorhinal cortex degeneration signals degeneration in the AON or that the pathologies occur in both regions independently. Nevertheless, AON Casp6 activity reflects that of the entorhinal cortex.
Affiliation(s)
- Benedicte Foveau
- Bloomfield Center for Research in Aging, Lady Davis Institute for Medical Research, Jewish General Hospital, 3755 ch. Côte Ste-Catherine, Montreal, QC Canada
- Steffen Albrecht
- Department of Pathology, Montreal Children’s Hospital and McGill University, Montreal, QC Canada
- David A. Bennett
- Rush Alzheimer’s Disease Center, Rush University Medical Center, Chicago, IL USA
- José A. Correa
- Department of Mathematics and Statistics, McGill University, Montreal, QC Canada
- Andrea C. LeBlanc
- Bloomfield Center for Research in Aging, Lady Davis Institute for Medical Research, Jewish General Hospital, 3755 ch. Côte Ste-Catherine, Montreal, QC Canada; Department of Neurology and Neurosurgery, McGill University, Montreal, QC Canada
|
39
|
Covariance Association Test (CVAT) Identifies Genetic Markers Associated with Schizophrenia in Functionally Associated Biological Processes. Genetics 2016; 203:1901-13. [PMID: 27317683 DOI: 10.1534/genetics.116.189498] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Received: 03/19/2016] [Accepted: 06/09/2016] [Indexed: 12/12/2022] Open
Abstract
Schizophrenia is a psychiatric disorder with large personal and social costs, and understanding the genetic etiology is important. Such knowledge can be obtained by testing the association between a disease phenotype and individual genetic markers; however, such single-marker methods have limited power to detect genetic markers with small effects. Instead, aggregating genetic markers based on biological information might increase the power to identify sets of genetic markers of etiological significance. Several set test methods have been proposed: Here we propose a new set test derived from genomic best linear unbiased prediction (GBLUP), the covariance association test (CVAT). We compared the performance of CVAT to other commonly used set tests. The comparison was conducted using a simulated study population having the same genetic parameters as for schizophrenia. We found that CVAT was among the top performers. When extending CVAT to utilize a mixture of SNP effects, we found an increase in power to detect the causal sets. Applying the methods to a Danish schizophrenia case-control data set, we found genomic evidence for association of schizophrenia with vitamin A metabolism and immunological responses, which previously have been implicated with schizophrenia based on experimental and observational studies.
|
40
|
Sarup P, Jensen J, Ostersen T, Henryon M, Sørensen P. Increased prediction accuracy using a genomic feature model including prior information on quantitative trait locus regions in purebred Danish Duroc pigs. BMC Genet 2016; 17:11. [PMID: 26728402 PMCID: PMC4700613 DOI: 10.1186/s12863-015-0322-9] [Citation(s) in RCA: 50] [Impact Index Per Article: 6.3] [Received: 09/08/2015] [Accepted: 12/20/2015] [Indexed: 12/31/2022] Open
Abstract
Background: In animal breeding, genetic variance for complex traits is often estimated using linear mixed models that incorporate information from single nucleotide polymorphism (SNP) markers using a realized genomic relationship matrix. In such models, individual genetic markers are weighted equally and genomic variation is treated as a “black box.” This approach is useful for selecting animals with high genetic potential, but it does not generate or utilise knowledge of the biological mechanisms underlying trait variation. Here we propose a linear mixed-model approach that can evaluate the collective effects of sets of SNPs and thereby open the “black box.” The described genomic feature best linear unbiased prediction (GFBLUP) model has two components that are defined by genomic features. Results: We analysed data on average daily gain, feed efficiency, and lean meat percentage from 3,085 Duroc boars, along with genotypes from a 60 K SNP chip. In addition, information on known quantitative trait loci (QTL) from the animal QTL database was integrated in the GFBLUP as a genomic feature. Our results showed that the most significant QTL categories were indeed biologically meaningful. Additionally, for high heritability traits, prediction accuracy was improved by the incorporation of biological knowledge in prediction models. A simulation study using the real genotypes and simulated phenotypes demonstrated challenges regarding detection of causal variants in low to medium heritability traits. Conclusions: The GFBLUP model showed increased predictive ability when enough causal variants were included in the genomic feature to explain over 10% of the genomic variance, and when dilution by non-causal markers was minimal. In the observed data set, predictive ability was increased by the inclusion of prior QTL information obtained outside the training data set, but only for the trait with highest heritability.
Affiliation(s)
- Pernille Sarup
- Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, Blichers Allé 20, 8830, Tjele, Denmark.
- Just Jensen
- Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, Blichers Allé 20, 8830, Tjele, Denmark.
- Tage Ostersen
- SEGES Danish Pig Research Centre, Axeltorv 3, 1609, Copenhagen V, Denmark.
- Mark Henryon
- SEGES Danish Pig Research Centre, Axeltorv 3, 1609, Copenhagen V, Denmark.
- Peter Sørensen
- Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, Blichers Allé 20, 8830, Tjele, Denmark.
|
41
|
Exploiting Linkage Disequilibrium for Ultrahigh-Dimensional Genome-Wide Data with an Integrated Statistical Approach. Genetics 2015; 202:411-26. [PMID: 26661113 DOI: 10.1534/genetics.115.179507] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 06/15/2015] [Accepted: 11/19/2015] [Indexed: 01/08/2023] Open
Abstract
Genome-wide data with millions of single-nucleotide polymorphisms (SNPs) can be highly correlated due to linkage disequilibrium (LD). The ultrahigh dimensionality of big data brings unprecedented challenges to statistical modeling such as noise accumulation, the curse of dimensionality, computational burden, spurious correlations, and a processing and storing bottleneck. The traditional statistical approaches lose their power due to $n \ll p$ (n is the number of observations and p is the number of SNPs) and the complex correlation structure among SNPs. In this article, we propose an integrated distance correlation ridge regression (DCRR) approach to accommodate the ultrahigh dimensionality, joint polygenic effects of multiple loci, and the complex LD structures. Initially, a distance correlation (DC) screening approach is used to extensively remove noise, after which LD structure is addressed using a ridge penalized multiple logistic regression (LRR) model. The false discovery rate, true positive discovery rate, and computational cost were simultaneously assessed through a large number of simulations. A binary trait of Arabidopsis thaliana, the hypersensitive response to the bacterial elicitor AvrRpm1, was analyzed in 84 inbred lines (28 susceptible and 56 resistant) with 216,130 SNPs. Compared to previous SNP discovery methods implemented on the same data set, the DCRR approach successfully detected the causative SNP while dramatically reducing spurious associations and computational time.
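The distance correlation screening step mentioned in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: it uses the standard sample distance correlation of Székely and colleagues, the function names (`centered_distances`, `distance_correlation`, `dc_screen`), the SNP labels and the toy data are hypothetical, and the subsequent ridge-penalized logistic regression step is not shown.

```python
import math


def centered_distances(v):
    """Double-centred matrix of pairwise absolute differences."""
    n = len(v)
    d = [[abs(v[i] - v[j]) for j in range(n)] for i in range(n)]
    row = [sum(r) / n for r in d]  # the matrix is symmetric: row means = column means
    grand = sum(row) / n
    return [[d[i][j] - row[i] - row[j] + grand for j in range(n)] for i in range(n)]


def distance_correlation(x, y):
    """Sample distance correlation between two equal-length numeric sequences."""
    n = len(x)
    a, b = centered_distances(x), centered_distances(y)
    dcov2 = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n ** 2
    dvar_x = sum(a[i][j] ** 2 for i in range(n) for j in range(n)) / n ** 2
    dvar_y = sum(b[i][j] ** 2 for i in range(n) for j in range(n)) / n ** 2
    denom = math.sqrt(dvar_x * dvar_y)
    if denom == 0:
        return 0.0  # a constant variable carries no association signal
    return math.sqrt(max(dcov2, 0.0) / denom)


def dc_screen(snps, trait, keep):
    """Rank SNPs by distance correlation with the trait and keep the top `keep`."""
    scores = {name: distance_correlation(g, trait) for name, g in snps.items()}
    return sorted(scores, key=scores.get, reverse=True)[:keep]


# Toy data (hypothetical labels): binary trait in 8 lines, three candidate SNPs.
trait = [0, 1, 1, 0, 1, 0, 0, 1]
snps = {
    "snp_a": [0, 1, 1, 0, 1, 0, 0, 1],  # tracks the trait exactly
    "snp_b": [0, 0, 1, 1, 0, 0, 1, 1],  # unrelated to the trait
    "snp_c": [1, 1, 1, 1, 1, 1, 1, 1],  # monomorphic
}
selected = dc_screen(snps, trait, keep=1)
print(selected)
```

Because each SNP is scored independently, the screening cost grows linearly with the number of SNPs, which is what makes it usable as a first pass before fitting a joint penalized model on the survivors.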
|
42
|
Dumancas GG, Ramasahayam S, Bello G, Hughes J, Kramer R. Chemometric regression techniques as emerging, powerful tools in genetic association studies. Trends Analyt Chem 2015. [DOI: 10.1016/j.trac.2015.05.007] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Indexed: 12/23/2022]
|
43
|
van de Wiel MA, Lien TG, Verlaat W, van Wieringen WN, Wilting SM. Better prediction by use of co-data: adaptive group-regularized ridge regression. Stat Med 2015; 35:368-81. [PMID: 26365903 DOI: 10.1002/sim.6732] [Citation(s) in RCA: 58] [Impact Index Per Article: 6.4] [Received: 11/17/2014] [Revised: 05/22/2015] [Accepted: 08/22/2015] [Indexed: 12/23/2022]
Abstract
For many high-dimensional studies, additional information on the variables, like (genomic) annotation or external p-values, is available. In the context of binary and continuous prediction, we develop a method for adaptive group-regularized (logistic) ridge regression, which makes structural use of such 'co-data'. Here, 'groups' refer to a partition of the variables according to the co-data. We derive empirical Bayes estimates of group-specific penalties, which possess several nice properties: (i) They are analytical. (ii) They adapt to the informativeness of the co-data for the data at hand. (iii) Only one global penalty parameter requires tuning by cross-validation. In addition, the method allows use of multiple types of co-data at little extra computational effort. We show that the group-specific penalties may lead to a larger distinction between 'near-zero' and relatively large regression parameters, which facilitates post hoc variable selection. The method, termed GRridge, is implemented in an easy-to-use R-package. It is demonstrated on two cancer genomics studies, which both concern the discrimination of precancerous cervical lesions from normal cervix tissues using methylation microarray data. For both examples, GRridge clearly improves the predictive performances of ordinary logistic ridge regression and the group lasso. In addition, we show that for the second study, the relatively good predictive performance is maintained when selecting only 42 variables.
Affiliation(s)
- Mark A van de Wiel
- Department of Epidemiology and Biostatistics, VU University Medical Center, Amsterdam, The Netherlands
- Department of Mathematics, VU University, Amsterdam, The Netherlands
- Tonje G Lien
- Department of Mathematics, University of Oslo, Oslo, Norway
- Wina Verlaat
- Department of Pathology, VU University Medical Center, Amsterdam, The Netherlands
- Wessel N van Wieringen
- Department of Epidemiology and Biostatistics, VU University Medical Center, Amsterdam, The Netherlands
- Department of Mathematics, VU University, Amsterdam, The Netherlands
- Saskia M Wilting
- Department of Pathology, VU University Medical Center, Amsterdam, The Netherlands
|
44
|
Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 2015; 15:550. [PMID: 25516281 PMCID: PMC4302049 DOI: 10.1186/s13059-014-0550-8] [Citation(s) in RCA: 47600] [Impact Index Per Article: 5288.9] [Received: 05/27/2014] [Indexed: 12/12/2022] Open
Abstract
In comparative high-throughput sequencing assays, a fundamental task is the analysis of count data, such as read counts per gene in RNA-seq, for evidence of systematic changes across experimental conditions. Small replicate numbers, discreteness, large dynamic range and the presence of outliers require a suitable statistical approach. We present DESeq2, a method for differential analysis of count data, using shrinkage estimation for dispersions and fold changes to improve stability and interpretability of estimates. This enables a more quantitative analysis focused on the strength rather than the mere presence of differential expression. The DESeq2 package is available at http://www.bioconductor.org/packages/release/bioc/html/DESeq2.html.
|
45
|
Tessier A, Bertrand J, Chenel M, Comets E. Comparison of Nonlinear Mixed Effects Models and Noncompartmental Approaches in Detecting Pharmacogenetic Covariates. AAPS JOURNAL 2015; 17:597-608. [PMID: 25693489 DOI: 10.1208/s12248-015-9726-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 08/01/2014] [Accepted: 01/28/2015] [Indexed: 11/30/2022]
Abstract
Genetic data is now collected in many clinical trials, especially in population pharmacokinetic studies. There is no consensus on methods to test the association between pharmacokinetics and genetic covariates. We performed a simulation study inspired by real clinical trials, using the pharmacokinetics (PK) of a compound under development having a nonlinear bioavailability along with genotypes for 176 single nucleotide polymorphisms (SNPs). Scenarios included 78 subjects extensively sampled (16 observations per subject) to simulate a phase I study, or 384 subjects with the same rich design. Under the alternative hypothesis (H1), six SNPs were drawn randomly to affect the log-clearance under an additive linear model. For each scenario, 200 PK data sets were simulated under the null hypothesis (no gene effect) and H1. We compared 16 combinations of four association tests, a stepwise procedure and three penalised regressions (ridge regression, Lasso, HyperLasso), applied to four pharmacokinetic phenotypes, two observed concentrations, area under the curve estimated by noncompartmental analysis and model-based clearance. The different combinations were compared in terms of true and false positives and probability to detect the genetic effects. In presence of nonlinearity and/or variability in bioavailability, model-based phenotype allowed a higher probability to detect the SNPs than other phenotypes. In a realistic setting with a limited number of subjects, all methods showed a low ability to detect genetic effects. Ridge regression had the best probability to detect SNPs, but also a higher number of false positives. No association test showed a much higher power than the others.
Affiliation(s)
- Adrien Tessier
- INSERM, IAME, UMR 1137, Faculté de médecine Paris Diderot Paris 7 - site Bichat, 16 rue Henri Huchard, 75018, Paris, France
|
46
|
Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 2014. [PMID: 25516281 DOI: 10.1186/s13059-014-0550-8] [Citation(s) in RCA: 283] [Impact Index Per Article: 28.3] [Indexed: 12/31/2022] Open
|
47
|
Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 2014. [PMID: 25516281 DOI: 10.1186/s13059-014-0550-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/10/2022] Open
|
48
|
Aslam M. Using Heteroscedasticity-Consistent Standard Errors for the Linear Regression Model with Correlated Regressors. COMMUN STAT-SIMUL C 2014. [DOI: 10.1080/03610918.2012.750354] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 10/26/2022]
|
49
|
Liu J, Calhoun VD. A review of multivariate analyses in imaging genetics. Front Neuroinform 2014; 8:29. [PMID: 24723883 PMCID: PMC3972473 DOI: 10.3389/fninf.2014.00029] [Citation(s) in RCA: 69] [Impact Index Per Article: 6.9] [Received: 10/31/2013] [Accepted: 03/04/2014] [Indexed: 12/13/2022] Open
Abstract
Recent advances in neuroimaging technology and molecular genetics provide the unique opportunity to investigate genetic influence on the variation of brain attributes. Since the year 2000, when the initial publication on brain imaging and genetics was released, imaging genetics has been a rapidly growing research approach with increasing publications every year. Several reviews have been offered to the research community focusing on various study designs. In addition to study design, analytic tools and their proper implementation are also critical to the success of a study. In this review, we survey recent publications using data from neuroimaging and genetics, focusing on methods capturing multivariate effects accommodating the large number of variables from both imaging data and genetic data. We group the analyses of genetic or genomic data into either a priori driven or data driven approach, including gene-set enrichment analysis, multifactor dimensionality reduction, principal component analysis, independent component analysis (ICA), and clustering. For the analyses of imaging data, ICA and extensions of ICA are the most widely used multivariate methods. Given detailed reviews of multivariate analyses of imaging data available elsewhere, we provide a brief summary here that includes a recently proposed method known as independent vector analysis. Finally, we review methods focused on bridging the imaging and genetic data by establishing multivariate and multiple genotype-phenotype-associations, including sparse partial least squares, sparse canonical correlation analysis, sparse reduced rank regression and parallel ICA. These methods are designed to extract latent variables from both genetic and imaging data, which become new genotypes and phenotypes, and the links between the new genotype-phenotype pairs are maximized using different cost functions. The relationship between these methods along with their assumptions, advantages, and limitations are discussed.
Affiliation(s)
- Jingyu Liu
- The Mind Research Network and Lovelace Biomedical and Environmental Research Institute, Albuquerque, NM, USA
- Department of Electrical and Computer Engineering, University of New Mexico, Albuquerque, NM, USA
- Vince D. Calhoun
- The Mind Research Network and Lovelace Biomedical and Environmental Research Institute, Albuquerque, NM, USA
- Department of Electrical and Computer Engineering, University of New Mexico, Albuquerque, NM, USA
|
50
|
Shigemizu D, Abe T, Morizono T, Johnson TA, Boroevich KA, Hirakawa Y, Ninomiya T, Kiyohara Y, Kubo M, Nakamura Y, Maeda S, Tsunoda T. The construction of risk prediction models using GWAS data and its application to a type 2 diabetes prospective cohort. PLoS One 2014; 9:e92549. [PMID: 24651836 PMCID: PMC3961382 DOI: 10.1371/journal.pone.0092549] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.9] [Received: 06/13/2013] [Accepted: 02/24/2014] [Indexed: 02/07/2023] Open
Abstract
Recent genome-wide association studies (GWAS) have identified several novel single nucleotide polymorphisms (SNPs) associated with type 2 diabetes (T2D). Various models using clinical and/or genetic risk factors have been developed for T2D risk prediction. However, analysis considering algorithms for genetic risk factor detection and regression methods for model construction in combination with interactions of risk factors has not been investigated. Here, using genotype data of 7,360 Japanese individuals, we investigated risk prediction models, considering the algorithms, regression methods and interactions. The best model identified was based on a Bayes factor approach and the lasso method. Using nine SNPs and clinical factors, this method achieved an area under a receiver operating characteristic curve (AUC) of 0.8057 on an independent test set. With the addition of a pair of interaction factors, the model was further improved (p-value 0.0011, AUC 0.8085). Application of our model to prospective cohort data showed significantly better outcome in disease-free survival, according to the log-rank trend test comparing Kaplan-Meier survival curves (p-value 2.09 × 10^-11). While the major contribution was from clinical factors rather than the genetic factors, consideration of genetic risk factors contributed to an observable, though small, increase in predictive ability. This is the first report to apply risk prediction models constructed from GWAS data to a T2D prospective cohort. Our study shows our model to be effective in prospective prediction and to have the potential to contribute to practical clinical use in T2D.
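The AUC values quoted in this abstract can be read as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control. The following is a minimal sketch of that calculation, not the authors' code: the function name `roc_auc` and the toy labels and scores are illustrative assumptions and are unrelated to the study data.

```python
def roc_auc(labels, scores):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    labels: iterable of 0/1 outcomes (1 = case, 0 = control).
    scores: predicted risks, higher meaning more likely to be a case.
    Ties between a case and a control count as half a win.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")

    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical toy data: three cases and three controls.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"toy AUC = {toy_auc:.4f}")
```

In the toy data, eight of the nine case-control pairs are ranked correctly, so the estimate is 8/9. The pairwise loop is quadratic in sample size; for a cohort of several thousand individuals a rank-based implementation would be preferable.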
Affiliation(s)
- Daichi Shigemizu
- Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Testuo Abe
- Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Takashi Morizono
- Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Todd A. Johnson
- Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Keith A. Boroevich
- Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Yoichiro Hirakawa
- Department of Environmental Medicine, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Japan
- Toshiharu Ninomiya
- Department of Environmental Medicine, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Japan
- Department of Medicine and Clinical Science, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Japan
- Yutaka Kiyohara
- Department of Environmental Medicine, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Japan
- Michiaki Kubo
- Laboratory for Genotyping Development, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Yusuke Nakamura
- Human Genome Center, Institute of Medical Science, The University of Tokyo, Tokyo, Japan
- Shiro Maeda
- Laboratory for Endocrinology, Metabolism and Kidney Diseases, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Tatsuhiko Tsunoda
- Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
|