1
Long L, He H, Shen Q, Peng H, Zhou X, Wang H, Zhang S, Qin S, Lu Z, Zhu Y, Tian J, Chang J, Miao X, Shen N, Zhong R. Birthweight, genetic risk, and gastrointestinal cancer incidence: a prospective cohort study. Ann Med 2023; 55:62-71. [PMID: 36503347 PMCID: PMC9754019 DOI: 10.1080/07853890.2022.2146743] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/14/2022]
Abstract
BACKGROUND Epidemiologic studies investigating the association of birthweight and genetic factors with gastrointestinal cancer remain scarce. This study aimed to prospectively assess the interactions and joint effects of birthweight and genetic risk levels on gastrointestinal cancer incidence in adulthood. METHODS A total of 254,997 participants from the UK Biobank study were included. We used multivariate restricted cubic splines and Cox regression models to estimate the hazard ratios (HRs) and 95% confidence intervals (CIs) for the association between birthweight and gastrointestinal cancer risk, then constructed a polygenic risk score (PRS) to assess its interaction and joint effect with birthweight on the development of gastrointestinal cancer. RESULTS We documented 2512 incident cases during a median follow-up of 8.88 years. Compared with participants reporting a normal birthweight (2.5-4.5 kg), the multivariable-adjusted HR of gastrointestinal cancer incidence for participants with high birthweight (≥4.5 kg) was 1.17 (95% CI: 1.01-1.36). The association was most pronounced for pancreatic cancer, with an HR of 1.82 (95% CI: 1.26-2.64). No statistically significant association was observed between low birthweight and gastrointestinal cancers. Participants with high birthweight and high PRS had the highest risk of gastrointestinal cancer (HR: 2.95, 95% CI: 2.19-3.96). CONCLUSION Our findings highlight that high birthweight is associated with a higher incidence of gastrointestinal cancer, especially pancreatic cancer. Benefits would be obtained from birthweight control, particularly for individuals with a high genetic risk.
KEY MESSAGES
- Epidemiologic studies investigating the association of birthweight and genetic factors with gastrointestinal cancer remain scarce.
- This cohort study of 254,997 adults in the United Kingdom found an association of high birthweight with the incidence of gastrointestinal cancer, especially pancreatic cancer, and also found that participants with high birthweight and a high polygenic risk score had the highest risk of gastrointestinal cancer.
- Our data suggest a possible effect of in utero or early life exposures on adulthood gastrointestinal cancer, especially for those with a high genetic risk.
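As a rough reading aid for the hazard ratios quoted in this abstract, the sketch below recovers the standard error of log(HR) and an approximate Wald z statistic from each reported 95% CI. It assumes the intervals are symmetric on the log scale (the usual convention for Cox models, though the abstract does not state it); the function names `se_from_ci` and `wald_z` are illustrative and not from the paper.

```python
import math

def se_from_ci(lower, upper, z=1.96):
    """Standard error of log(HR), assuming a 95% CI symmetric on the log scale."""
    return (math.log(upper) - math.log(lower)) / (2 * z)

def wald_z(hr, lower, upper):
    """Approximate Wald z statistic for the null hypothesis HR = 1."""
    return math.log(hr) / se_from_ci(lower, upper)

# Hazard ratios and CIs as reported in the abstract.
reported = [
    ("high birthweight, all GI cancer", 1.17, 1.01, 1.36),
    ("high birthweight, pancreatic cancer", 1.82, 1.26, 2.64),
    ("high birthweight + high PRS", 2.95, 2.19, 3.96),
]

for label, hr, lo, hi in reported:
    print(f"{label}: SE(log HR) = {se_from_ci(lo, hi):.3f}, z = {wald_z(hr, lo, hi):.2f}")
```

A z value only slightly above 1.96 for the overall association is consistent with the lower CI bound (1.01) sitting just above 1.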
Affiliation(s)
- Lu Long
  - Department of Epidemiology and Biostatistics, Ministry of Education Key Lab of Environment and Health, School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
  - Department of Epidemiology and Biostatistics, West China School of Public Health and West China Fourth Hospital, Sichuan University, Chengdu, China
- Heng He
  - Department of Epidemiology and Health Statistics, School of Public Health, Fujian Medical University, Fuzhou, China
- Qian Shen
  - Department of Epidemiology and Biostatistics, Ministry of Education Key Lab of Environment and Health, School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Hongxia Peng
  - Department of Epidemiology and Biostatistics, West China School of Public Health and West China Fourth Hospital, Sichuan University, Chengdu, China
- Xiaorui Zhou
  - Department of Epidemiology and Biostatistics, West China School of Public Health and West China Fourth Hospital, Sichuan University, Chengdu, China
- Haoxue Wang
  - Department of Epidemiology and Biostatistics, Ministry of Education Key Lab of Environment and Health, School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Shanshan Zhang
  - Department of Epidemiology and Biostatistics, Ministry of Education Key Lab of Environment and Health, School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Shifan Qin
  - Department of Epidemiology and Biostatistics, Ministry of Education Key Lab of Environment and Health, School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Zequn Lu
  - Department of Epidemiology and Biostatistics, Ministry of Education Key Lab of Environment and Health, School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Ying Zhu
  - School of Public Health, Wuhan University, Wuhan, China
- Jianbo Tian
  - School of Public Health, Wuhan University, Wuhan, China
- Jiang Chang
  - Department of Epidemiology and Biostatistics, Ministry of Education Key Lab of Environment and Health, School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Xiaoping Miao
  - School of Public Health, Wuhan University, Wuhan, China
- Na Shen
  - Department of Laboratory Medicine, Tongji Hospital, Tongji Medical College, HUST, Wuhan, China
  - Contact: Na Shen, Department of Laboratory Medicine, Tongji Hospital, Tongji Medical College, HUST, Wuhan, 430030, China
- Rong Zhong
  - Department of Epidemiology and Biostatistics, Ministry of Education Key Lab of Environment and Health, School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
  - Contact: Rong Zhong, Department of Epidemiology and Biostatistics, Ministry of Education Key Lab of Environment and Health, School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430030, China

2
Christiansen CE, Arathimos R, Pain O, Molokhia M, Bell JT, Lewis CM. Stratified genome-wide association analysis of type 2 diabetes reveals subgroups with genetic and environmental heterogeneity. Hum Mol Genet 2023; 32:2638-2645. [PMID: 37364045 PMCID: PMC10407708 DOI: 10.1093/hmg/ddad093] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2022] [Revised: 04/18/2023] [Accepted: 05/31/2023] [Indexed: 06/28/2023]
Abstract
Type 2 diabetes (T2D) is a heterogeneous illness caused by genetic and environmental factors. Previous genome-wide association studies (GWAS) have identified many genetic variants associated with T2D and found evidence of differing genetic profiles by age-at-onset. This study seeks to explore further the genetic and environmental drivers of T2D by analyzing subgroups on the basis of age-at-onset of diabetes and body mass index (BMI). In the UK Biobank, 36 494 T2D cases were stratified into three subgroups, and GWAS was performed for all T2D cases and for each subgroup relative to 421 021 controls. Altogether, 18 single nucleotide polymorphisms (SNPs) were significantly associated with T2D genome-wide in one or more subgroups and also showed evidence of heterogeneity between the subgroups (Cochran's Q P < 0.01), with two SNPs remaining significant after multiple testing correction (in CDKN2B and CYTIP). Combined risk scores, on the basis of genetic profile, BMI and age, resulted in excellent diabetes prediction [area under the ROC curve (AUC) = 0.92]. A modest improvement in prediction (AUC = 0.93) was seen when the contribution of genetic and environmental factors was evaluated separately for each subgroup. The increasing sample sizes of genetic studies enable us to stratify disease cases into subgroups that have sufficient power to highlight areas of genetic heterogeneity. Despite some evidence that optimizing combined risk scores by subgroup improves prediction, larger sample sizes are likely needed for prediction when using a stratification approach.
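The heterogeneity test mentioned in this abstract can be illustrated with a minimal sketch of Cochran's Q across three subgroup effect estimates. The effect sizes and standard errors below are made-up illustrative values, not data from the study, and the function names are hypothetical. With three subgroups, Q is referred to a chi-square distribution with 2 degrees of freedom, whose survival function has the closed form exp(-Q/2).

```python
import math

def cochran_q(betas, ses):
    """Return (Q, pooled_beta) using inverse-variance weights."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    return q, pooled

def q_pvalue_three_groups(q):
    """P-value for Q with exactly three subgroups (chi-square, 2 df)."""
    return math.exp(-q / 2.0)

# Hypothetical per-subgroup log odds ratios for one SNP and their standard errors.
betas = [0.10, 0.20, 0.30]
ses = [0.05, 0.05, 0.05]

q, pooled = cochran_q(betas, ses)
p = q_pvalue_three_groups(q)
print(f"pooled beta = {pooled:.3f}, Q = {q:.2f}, P = {p:.4f}")
```

The closed-form p-value applies only to the three-subgroup case; a general implementation would use a chi-square survival function with k - 1 degrees of freedom.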
Affiliation(s)
- Colette E Christiansen
  - Department of Twin Research and Genetic Epidemiology, King’s College London, London, SE1 7EH, UK
  - School of Mathematics and Statistics, The Open University, Milton Keynes, MK7 6AA, UK
- Ryan Arathimos
  - Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, SE5 8AF, UK
  - NIHR Maudsley Biomedical Research Centre, South London and Maudsley NHS Trust UK, London, SE5 8AF, UK
- Oliver Pain
  - Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, SE5 8AF, UK
  - NIHR Maudsley Biomedical Research Centre, South London and Maudsley NHS Trust UK, London, SE5 8AF, UK
- Mariam Molokhia
  - School of Population Health and Environmental Sciences, King’s College London, London, SE1 1UL, UK
- Jordana T Bell
  - Department of Twin Research and Genetic Epidemiology, King’s College London, London, SE1 7EH, UK
- Cathryn M Lewis
  - Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, SE5 8AF, UK
  - NIHR Maudsley Biomedical Research Centre, South London and Maudsley NHS Trust UK, London, SE5 8AF, UK
  - Department of Medical and Molecular Genetics, Faculty of Life Sciences & Medicine, King’s College London, London, SE1 9RT, UK

3
The necessity of incorporating non-genetic risk factors into polygenic risk score models. Sci Rep 2023; 13:1351. [PMID: 36807592 PMCID: PMC9941118 DOI: 10.1038/s41598-023-27637-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2022] [Accepted: 01/05/2023] [Indexed: 02/22/2023]
Abstract
The growing public interest in genetic risk scores for various health conditions can be harnessed to inspire preventive health action. However, currently available commercial genetic risk scores can be misleading because they do not consider other, easily attainable risk factors, such as sex, BMI, age, smoking habits, parental disease status and physical activity. Recent scientific literature shows that adding these factors can significantly improve polygenic score (PGS) based predictions. However, implementing existing PGS-based models that also consider these factors requires reference data based on a specific genotyping chip, which is not always available. In this paper, we offer a method naïve to the genotyping chip used. We train these models using the UK Biobank data and test them externally in the Lifelines cohort. We show improved performance at identifying the 10% most at-risk individuals for type 2 diabetes (T2D) and coronary artery disease (CAD) by including common risk factors. Incidence in the highest risk group increases from 3.0- and 4.0-fold to 5.8-fold for T2D, when comparing the genetics-based model, common risk factor-based model and combined model, respectively. Similarly, we observe an increase from 2.4- and 3.0-fold to 4.7-fold risk for CAD. As such, we conclude that it is paramount that these additional variables are considered when reporting risk, unlike current practice with currently available genetic tests.
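The "fold" figures above compare incidence among the 10% highest-scoring individuals against a baseline. A minimal sketch of that calculation follows, assuming the baseline is the overall cohort incidence (the abstract does not specify the reference group); the function name `top_fraction_fold` and the toy data are hypothetical.

```python
def top_fraction_fold(scores, outcomes, fraction=0.10):
    """Incidence in the top `fraction` of scores divided by overall incidence.

    `outcomes` holds 1 for individuals who developed the disease, else 0.
    Assumption: the reference is the whole cohort, not the remaining 90%.
    """
    if len(scores) != len(outcomes) or not scores:
        raise ValueError("scores and outcomes must be non-empty and equal length")
    overall = sum(outcomes) / len(outcomes)
    if overall == 0:
        raise ValueError("no cases in cohort; fold increase is undefined")
    ranked = sorted(zip(scores, outcomes), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, int(round(len(ranked) * fraction)))
    top_incidence = sum(outcome for _, outcome in ranked[:n_top]) / n_top
    return top_incidence / overall

# Toy cohort of 20 people: risk scores 0..19, four cases in total,
# two of which fall in the top 10% (scores 18 and 19).
scores = list(range(20))
outcomes = [0] * 20
for case_index in (0, 5, 18, 19):
    outcomes[case_index] = 1

print(top_fraction_fold(scores, outcomes))
```

Running the same function on scores from a genetics-only, risk-factor-only, and combined model would give three fold values comparable in spirit to those quoted in the abstract.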

4
Brigante G, Lazzaretti C, Paradiso E, Nuzzo F, Sitti M, Tüttelmann F, Moretti G, Silvestri R, Gemignani F, Försti A, Hemminki K, Elisei R, Romei C, Zizzi EA, Deriu MA, Simoni M, Landi S, Casarini L. Genetic signature of differentiated thyroid carcinoma susceptibility: a machine learning approach. Eur Thyroid J 2022; 11:e220058. [PMID: 35976137 PMCID: PMC9513665 DOI: 10.1530/etj-22-0058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2022] [Accepted: 08/17/2022] [Indexed: 11/30/2022]
Abstract
To identify a distinctive genetic combination predisposing to differentiated thyroid carcinoma (DTC), we selected a set of single nucleotide polymorphisms (SNPs) associated with DTC risk, considering polygenic risk score (PRS), Bayesian statistics and a machine learning (ML) classifier to describe cases and controls in three different datasets. Dataset 1 (649 DTC, 431 controls) had been previously genotyped in a genome-wide association study (GWAS) on Italian DTC. Dataset 2 (234 DTC, 101 controls) and dataset 3 (404 DTC, 392 controls) were genotyped. Associations of 171 SNPs reported to predispose to DTC in candidate studies were extracted from the GWAS of dataset 1, followed by replication of SNPs associated with DTC risk (P < 0.05) in dataset 2. The reliability of the identified SNPs was confirmed by PRS and Bayesian statistics after merging the three datasets. SNPs were used to describe the case/control state of individuals by an ML classifier. Starting from 171 SNPs associated with DTC, 15 were positive in both datasets 1 and 2. Using these markers, PRS revealed that individuals in the fifth quintile had a seven-fold higher risk of DTC than those in the first. Bayesian inference confirmed that the selected 15 SNPs differentiate cases from controls. Results were corroborated by ML, finding a maximum AUC of about 0.7. A restricted selection of only 15 DTC-associated SNPs is able to describe the inner genetic structure of Italian individuals, and ML allows a fair prediction of case or control status based solely on the individual genetic background.
Affiliation(s)
- Giulia Brigante
  - Unit of Endocrinology, Department of Biomedical, Metabolic and Neural Sciences, University of Modena and Reggio Emilia, Modena, Italy
  - Unit of Endocrinology, Department of Medical Specialties, Azienda Ospedaliero-Universitaria, Modena, Italy
- Clara Lazzaretti
  - Unit of Endocrinology, Department of Biomedical, Metabolic and Neural Sciences, University of Modena and Reggio Emilia, Modena, Italy
- Elia Paradiso
  - Unit of Endocrinology, Department of Biomedical, Metabolic and Neural Sciences, University of Modena and Reggio Emilia, Modena, Italy
- Federico Nuzzo
  - Unit of Endocrinology, Department of Biomedical, Metabolic and Neural Sciences, University of Modena and Reggio Emilia, Modena, Italy
- Martina Sitti
  - Unit of Endocrinology, Department of Biomedical, Metabolic and Neural Sciences, University of Modena and Reggio Emilia, Modena, Italy
- Frank Tüttelmann
  - Institute of Reproductive Genetics, University of Münster, Münster, Germany
- Asta Försti
  - Hopp Children’s Cancer Center (KiTZ), Heidelberg, Germany
  - Division of Pediatric Neurooncology, German Cancer Research Center (DKFZ), German Cancer Consortium (DKTK), Heidelberg, Germany
- Kari Hemminki
  - Biomedical Center, Faculty of Medicine and Biomedical Center in Pilsen, Charles University in Prague, Pilsen, Czech Republic
  - Division of Cancer Epidemiology, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Rossella Elisei
  - Department of Endocrinology, University Hospital, Pisa, Italy
- Cristina Romei
  - Department of Endocrinology, University Hospital, Pisa, Italy
- Eric Adriano Zizzi
  - Polito Med Lab, Department of Mechanical and Aerospace Engineering, Politecnico di Torino, Italy
- Marco Agostino Deriu
  - Polito Med Lab, Department of Mechanical and Aerospace Engineering, Politecnico di Torino, Italy
- Manuela Simoni
  - Unit of Endocrinology, Department of Biomedical, Metabolic and Neural Sciences, University of Modena and Reggio Emilia, Modena, Italy
  - Unit of Endocrinology, Department of Medical Specialties, Azienda Ospedaliero-Universitaria, Modena, Italy
  - Center for Genomic Research, University of Modena and Reggio Emilia, Modena, Italy
- Stefano Landi
  - Department of Biology, University of Pisa, Pisa, Italy
- Livio Casarini
  - Unit of Endocrinology, Department of Biomedical, Metabolic and Neural Sciences, University of Modena and Reggio Emilia, Modena, Italy
  - Center for Genomic Research, University of Modena and Reggio Emilia, Modena, Italy

5
Blass I, Sahar T, Shraibman A, Ofer D, Rappoport N, Linial M. Revisiting the Risk Factors for Endometriosis: A Machine Learning Approach. J Pers Med 2022; 12:1114. [PMID: 35887611 PMCID: PMC9317820 DOI: 10.3390/jpm12071114] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 03/20/2022] [Revised: 05/16/2022] [Accepted: 07/05/2022] [Indexed: 11/30/2022]
Abstract
Endometriosis is a condition characterized by implants of endometrial tissue at extrauterine sites, mostly within the pelvic peritoneum. Endometriosis is under-diagnosed, and its prevalence is estimated at 5-10% of all women of reproductive age. The goal of this study was to develop a model for endometriosis based on the UK Biobank (UKB) and re-assess the contribution of known risk factors to endometriosis. We partitioned the data into those diagnosed with endometriosis (5924; ICD-10: N80) and a control group (142,723). We included over 1000 variables from the UKB covering personal information about female health, lifestyle, self-reported data, genetic variants, and medical history prior to endometriosis diagnosis. We applied machine learning algorithms to train an endometriosis prediction model. The optimal prediction was achieved with the gradient boosting algorithm CatBoost for the data-combined model, with an area under the ROC curve (ROC-AUC) of 0.81. The same results were obtained for women from a mixed ethnicity population of the UKB (7112; ICD-10: N80). We discovered that, prior to being diagnosed with endometriosis, affected women had significantly more ICD-10 diagnoses than the average unaffected woman. We used SHAP, an explainable AI tool, to estimate the marginal impact of a feature, given all other features. The informative features ranked by SHAP values included irritable bowel syndrome (IBS) and the length of the menstrual cycle. We conclude that the rich population-based retrospective data from the UKB are valuable for developing unified machine learning endometriosis models despite the limitations of missing data, noisy medical input, and participant age. The informative features of the model may improve clinical utility for endometriosis diagnosis.
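For readers unfamiliar with the ROC-AUC figure quoted here, the sketch below computes it directly from its rank interpretation: the probability that a randomly chosen case receives a higher predicted score than a randomly chosen control, with ties counted as one half. This is a generic illustration on toy data, not the study's CatBoost pipeline, and `roc_auc` is a hypothetical helper name.

```python
def roc_auc(labels, scores):
    """ROC-AUC via pairwise comparison of case and control scores.

    labels: 1 for cases, 0 for controls. Quadratic in sample size,
    so suitable only for small illustrative inputs.
    """
    case_scores = [s for y, s in zip(labels, scores) if y == 1]
    control_scores = [s for y, s in zip(labels, scores) if y == 0]
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# Toy example: two cases and two controls with model-predicted risk scores.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
print(roc_auc(labels, scores))  # 3 of 4 case-control pairs are ranked correctly
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives a scale for the 0.81 reported above.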
Affiliation(s)
- Ido Blass
  - The Rachel and Selim Benin School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem 91904, Israel
- Tali Sahar
  - Alan Edwards Pain Management Unit, McGill University Health Centre, Montreal, QC H3G 1A4, Canada
- Adi Shraibman
  - Department of Computer Science, The Academic College of Tel Aviv-Yaffo, Tel Aviv 69978, Israel
- Dan Ofer
  - Department of Software and Information Systems Engineering, Faculty of Engineering Sciences, Ben-Gurion University of the Negev, Be’er Sheva 84105, Israel
- Nadav Rappoport
  - Department of Biological Chemistry, Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem 91904, Israel
- Michal Linial
  - Department of Software and Information Systems Engineering, Faculty of Engineering Sciences, Ben-Gurion University of the Negev, Be’er Sheva 84105, Israel

6
Abstract
Genetic studies of human traits have revolutionized our understanding of the variation between individuals, and yet, the genetics of most traits is still poorly understood. In this review, we highlight the major open problems that need to be solved, and by discussing these challenges provide a primer to the field. We cover general issues such as population structure, epistasis and gene-environment interactions, data-related issues such as ancestry diversity and rare genetic variants, and specific challenges related to heritability estimates, genetic association studies, and polygenic risk scores. We emphasize the interconnectedness of these problems and suggest promising avenues to address them.
Affiliation(s)
- Nadav Brandes
  - School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel
- Omer Weissbrod
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Michal Linial
  - Department of Biological Chemistry, The Alexander Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel

7
Drapkina OM, Kontsevaya AV, Kalinina AM, Avdeev SM, Agaltsov MV, Alexandrova LM, Antsiferova AA, Aronov DM, Akhmedzhanov NM, Balanova YA, Balakhonova TV, Berns SA, Bochkarev MV, Bochkareva EV, Bubnova MV, Budnevsky AV, Gambaryan MG, Gorbunov VM, Gorny BE, Gorshkov AY, Gumanova NG, Dadaeva VA, Drozdova LY, Egorov VA, Eliashevich SO, Ershova AI, Ivanova ES, Imaeva AE, Ipatov PV, Kaprin AD, Karamnova NS, Kobalava ZD, Konradi AO, Kopylova OV, Korostovtseva LS, Kotova MB, Kulikova MS, Lavrenova EA, Lischenko OV, Lopatina MV, Lukina YV, Lukyanov MM, Mayev IV, Mamedov MN, Markelova SV, Martsevich SY, Metelskaya VA, Meshkov AN, Milushkina OY, Mukaneeva DK, Myrzamatova AO, Nebieridze DV, Orlov DO, Poddubskaya EA, Popovich MV, Popovkina OE, Potievskaya VI, Prozorova GG, Rakovskaya YS, Rotar OP, Rybakov IA, Sviryaev YV, Skripnikova IA, Skoblina NA, Smirnova MI, Starinsky VV, Tolpygina SN, Usova EV, Khailova ZV, Shalnova SA, Shepel RN, Shishkova VN, Yavelov IS. 2022 Prevention of chronic non-communicable diseases in the Russian Federation. National guidelines. Cardiovascular Therapy and Prevention 2022. [DOI: 10.15829/1728-8800-2022-3235] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Indexed: 12/24/2022]

8
Ershova AI, Ivanova AA, Kiseleva AV, Sotnikova EA, Meshkov AN, Drapkina OM. From biobanking to personalized prevention of obesity, diabetes and metabolic syndrome. Cardiovascular Therapy and Prevention 2022. [DOI: 10.15829/1728-8800-2021-3123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/01/2022]
Abstract
The growing prevalence of metabolic disorders creates an increasing demand for novel approaches to their prevention and therapy. Novel genetic diagnostic technologies are developed every year, making it possible to identify people who are at the highest genetic risk of diabetes, non-alcoholic fatty liver disease, and metabolic syndrome. Early intervention strategies can be used to prevent metabolic disorders in this group of people. Genetic risk scores (GRSs) are a powerful tool to identify people with a high genetic risk. Millions of genetic variants are analyzed in genome-wide association studies in order to combine them into GRSs. It has become possible to store and process such huge amounts of data with the help of biobanks, where biological samples are stored according to international standards. Genetic studies include more and more people every year, which increases the predictive power of GRSs. It has already been demonstrated that the use of GRSs makes future preventive measures more effective. In the near future, GRSs are likely to become part of clinical guidelines so that they can be widely used to identify people at high risk for metabolic syndrome and its components.
Affiliation(s)
- A. I. Ershova
  - National Medical Research Center for Therapy and Preventive Medicine
- A. A. Ivanova
  - National Medical Research Center for Therapy and Preventive Medicine
- A. V. Kiseleva
  - National Medical Research Center for Therapy and Preventive Medicine
- E. A. Sotnikova
  - National Medical Research Center for Therapy and Preventive Medicine
- A. N. Meshkov
  - National Medical Research Center for Therapy and Preventive Medicine; Pirogov Russian National Research Medical University
- O. M. Drapkina
  - National Medical Research Center for Therapy and Preventive Medicine

9
Di Y, Wang J, Liu X, Zhu T. Combining Polygenic Risk Score and Voice Features to Detect Major Depressive Disorders. Front Genet 2021; 12:761141. [PMID: 34987547 PMCID: PMC8721147 DOI: 10.3389/fgene.2021.761141] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/19/2021] [Accepted: 11/12/2021] [Indexed: 11/29/2022]
Abstract
Background: The application of polygenic risk scores (PRSs) in major depressive disorder (MDD) detection is constrained by their simplicity and uncertainty. One promising way to further extend their usability is fusion with other biomarkers. This study constructed an MDD biomarker by combining the PRS and voice features and evaluated its detection ability on large clinical samples. Methods: We collected genome-wide sequences and utterances edited from clinical interview speech records from 3,580 women with recurrent MDD and 4,016 healthy people. Then, we constructed the PRS as a gene biomarker by p value-based clumping and thresholding and extracted voice features using the i-vector method. Using logistic regression, we compared the ability of gene or voice biomarkers alone with the ability of both in combination for MDD detection. We also tested additional machine learning models to further improve the detection capability. Results: With a p-value threshold of 0.005, the combined biomarker improved the area under the receiver operating characteristic curve (AUC) by 9.09% compared to that of genes only and 6.73% compared to that of voice only. A multilayer perceptron further increased the AUC by 3.6% compared to logistic regression, while support vector machine and random forests showed no better performance. Conclusion: The addition of voice biomarkers to genes can effectively improve the ability to detect MDD. The combination of PRS and voice biomarkers in MDD detection is feasible. This study provides a foundation for exploring the clinical application of genetic and voice biomarkers in the diagnosis of MDD.
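The thresholding half of "clumping and thresholding" can be sketched in a few lines: keep only variants whose GWAS p-value falls below the chosen threshold, then sum effect size times allele dosage over the kept variants. The sketch below omits the clumping step (pruning correlated variants by linkage disequilibrium) and uses made-up values; `polygenic_risk_score` is a hypothetical helper, not the authors' code.

```python
def polygenic_risk_score(dosages, effects, pvalues, p_threshold=0.005):
    """Weighted allele sum over variants passing the p-value threshold.

    dosages: effect-allele counts for one individual (0, 1, or 2 per variant)
    effects: per-variant GWAS effect sizes (e.g. log odds ratios)
    pvalues: per-variant GWAS p-values
    Assumption: variants are already clumped, i.e. approximately independent.
    """
    if not (len(dosages) == len(effects) == len(pvalues)):
        raise ValueError("inputs must have equal length")
    return sum(
        dosage * effect
        for dosage, effect, pvalue in zip(dosages, effects, pvalues)
        if pvalue < p_threshold
    )

# Illustrative values for four variants; the last one fails the 0.005 threshold.
dosages = [2, 1, 0, 1]
effects = [0.1, 0.2, 0.3, 0.5]
pvalues = [0.001, 0.004, 0.0001, 0.2]

print(round(polygenic_risk_score(dosages, effects, pvalues), 3))
```

In a combined model of the kind the abstract describes, a score like this would be one input feature alongside the voice features in a logistic regression.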
Affiliation(s)
- Yazheng Di
  - Key Laboratory of Behavioral Science, Institute of Psychology, Chinese Academy of Sciences, Beijing, China
  - Department of Psychology, University of Chinese Academy of Sciences, Beijing, China
- Jingying Wang
  - School of Optometry, Faculty of Health and Social Sciences, Hong Kong Polytechnic University, Hong Kong, China
- Xiaoqian Liu
  - Key Laboratory of Behavioral Science, Institute of Psychology, Chinese Academy of Sciences, Beijing, China
  - Department of Psychology, University of Chinese Academy of Sciences, Beijing, China
- Tingshao Zhu
  - Key Laboratory of Behavioral Science, Institute of Psychology, Chinese Academy of Sciences, Beijing, China
  - Department of Psychology, University of Chinese Academy of Sciences, Beijing, China
  - *Correspondence: Tingshao Zhu,