1
He X, Xue N, Liu X, Tang X, Peng S, Qu Y, Jiang L, Xu Q, Liu W, Chen S. A novel clinical model for predicting malignancy of solitary pulmonary nodules: a multicenter study in chinese population. Cancer Cell Int 2021; 21:115. [PMID: 33596917 PMCID: PMC7890629 DOI: 10.1186/s12935-021-01810-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 12/03/2020] [Revised: 01/25/2021] [Accepted: 02/03/2021] [Indexed: 12/26/2022]
Abstract
Background This study aimed to establish and validate a novel clinical model to differentiate between benign and malignant solitary pulmonary nodules (SPNs). Methods Records from 295 patients with SPNs at Sun Yat-sen University Cancer Center were retrospectively reviewed. The novel prediction model was established using LASSO logistic regression analysis by integrating clinical features, radiologic characteristics, and laboratory test data; model calibration was assessed using the Hosmer-Lemeshow (HL) test. Subsequently, the model was compared with the PKUPH, Shanghai, and Mayo models on the same data using the receiver operating characteristic (ROC) curve, decision curve analysis (DCA), the net reclassification improvement (NRI) index, and the integrated discrimination improvement (IDI) index. Another 101 patients with SPNs from Henan Tumor Hospital formed the external validation cohort. Results A total of 11 variables were selected and combined to generate the new prediction model. The model showed good calibration in the HL test (P = 0.964). The AUC of our model was 0.768, higher than those of the other three reported models. DCA also showed that our model was superior to the other three reported models. Our model had a sensitivity of 78.84% and a specificity of 61.32%. Compared with the PKUPH, Shanghai, and Mayo models, the NRI of our model increased by 0.177, 0.127, and 0.396, respectively, and the IDI changed by −0.019, −0.076, and 0.112, respectively. Furthermore, the model was significantly and positively correlated with the PKUPH, Shanghai, and Mayo models. Conclusions The novel model in our study had high clinical value in the diagnosis of malignant SPNs (MSPNs).
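The sensitivity, specificity, and NRI figures reported above can be illustrated with a minimal sketch. It assumes binary predictions at a single cut-off (two risk categories), in which case the categorical NRI reduces to the change in sensitivity plus the change in specificity. The function names and the toy data are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, 1 = malignant."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def two_category_nri(y_true, old_pred, new_pred):
    """NRI of a new binary classifier over an old one.

    With two risk categories, the net upward reclassification among events
    equals the gain in sensitivity, and the net downward reclassification
    among non-events equals the gain in specificity.
    """
    sens_old, spec_old = sensitivity_specificity(y_true, old_pred)
    sens_new, spec_new = sensitivity_specificity(y_true, new_pred)
    return (sens_new - sens_old) + (spec_new - spec_old)


# Toy data (hypothetical): 4 malignant and 4 benign nodules.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
old_model = [1, 1, 0, 0, 0, 0, 1, 1]
new_model = [1, 1, 1, 0, 0, 0, 0, 1]
nri = two_category_nri(y_true, old_model, new_model)
```

A positive value means the new model moves more patients into the correct category than it moves out of it.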
Affiliation(s)
- Xia He
- State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Guangdong Key Laboratory of Nasopharyngeal Carcinoma Diagnosis and Therapy, Sun Yat-sen University Cancer Center, 651 Dongfeng Road East, 510060, Guangzhou, People's Republic of China
- Ning Xue
- Department of Clinical Laboratory, Affiliated Cancer Hospital of Zhengzhou University, Zhengzhou Key Laboratory of Digestive Tumor Markers, Henan, 450008, Zhengzhou, People's Republic of China
- Xiaohua Liu
- State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Guangdong Key Laboratory of Nasopharyngeal Carcinoma Diagnosis and Therapy, Sun Yat-sen University Cancer Center, 651 Dongfeng Road East, 510060, Guangzhou, People's Republic of China
- Xuemiao Tang
- State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Guangdong Key Laboratory of Nasopharyngeal Carcinoma Diagnosis and Therapy, Sun Yat-sen University Cancer Center, 651 Dongfeng Road East, 510060, Guangzhou, People's Republic of China
- Songguo Peng
- State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Guangdong Key Laboratory of Nasopharyngeal Carcinoma Diagnosis and Therapy, Sun Yat-sen University Cancer Center, 651 Dongfeng Road East, 510060, Guangzhou, People's Republic of China
- Yuanye Qu
- Department of Clinical Laboratory, Affiliated Cancer Hospital of Zhengzhou University, Zhengzhou Key Laboratory of Digestive Tumor Markers, Henan, 450008, Zhengzhou, People's Republic of China
- Lina Jiang
- Department of Radiology, Affiliated Tumor Hospital of Zhengzhou University, Henan, 450008, Zhengzhou, People's Republic of China
- Qingxia Xu
- Department of Clinical Laboratory, Affiliated Cancer Hospital of Zhengzhou University, Zhengzhou Key Laboratory of Digestive Tumor Markers, Henan, 450008, Zhengzhou, People's Republic of China
- Wanli Liu
- State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Guangdong Key Laboratory of Nasopharyngeal Carcinoma Diagnosis and Therapy, Sun Yat-sen University Cancer Center, 651 Dongfeng Road East, 510060, Guangzhou, People's Republic of China
- Shulin Chen
- State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Guangdong Key Laboratory of Nasopharyngeal Carcinoma Diagnosis and Therapy, Sun Yat-sen University Cancer Center, 651 Dongfeng Road East, 510060, Guangzhou, People's Republic of China; Research Center for Translational Medicine, the First Affiliated Hospital, Sun Yat-sen University, 58 Zhongshan Road 2, Guangdong, 510080, Guangzhou, People's Republic of China.
2
Chen S, Huang H, Liu Y, Lai C, Peng S, Zhou L, Chen H, Xu Y, He X. A multi-parametric prognostic model based on clinical features and serological markers predicts overall survival in non-small cell lung cancer patients with chronic hepatitis B viral infection. Cancer Cell Int 2020; 20:555. [PMID: 33292228 PMCID: PMC7678183 DOI: 10.1186/s12935-020-01635-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 08/20/2020] [Accepted: 10/29/2020] [Indexed: 02/05/2023]
Abstract
BACKGROUND To establish and validate a multi-parametric prognostic model based on clinical features and serological markers to estimate the overall survival (OS) in non-small cell lung cancer (NSCLC) patients with chronic hepatitis B viral (HBV) infection. METHODS The prognostic model was established by using Lasso regression analysis in the training cohort. The incremental predictive value of the model compared with traditional TNM staging and clinical treatment for individualized survival was evaluated by the concordance index (C-index), time-dependent ROC (tdROC) curve, and decision curve analysis (DCA). A nomogram for OS based on the prognostic model risk score was built by combining the score with TNM staging and clinical treatment. Patients were divided into high-risk and low-risk subgroups according to the model risk score. The difference in survival between subgroups was analyzed using Kaplan-Meier survival analysis, and correlations between the prognostic model, TNM staging, and clinical treatment were analyzed. RESULTS The C-index of the model for OS was 0.769 in the training cohort and 0.676 in the validation cohort, which was higher than that of TNM staging and clinical treatment. The tdROC curve and DCA showed that the model had good predictive accuracy and discriminatory power compared with TNM staging and clinical treatment. The nomogram based on the prognostic model risk score showed some net clinical benefit. When patients were divided into low-risk and high-risk subgroups according to the model risk score, the difference in OS rates between the subgroups was significant. Furthermore, the model showed a positive correlation with TNM staging and clinical treatment. CONCLUSIONS The prognostic model showed good performance compared with traditional TNM staging and clinical treatment for estimating the OS in NSCLC (HBV+) patients.
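The C-index used above to compare the model with TNM staging can be sketched as follows. This is a simplified version of Harrell's concordance index that ignores pairs with tied survival times; the function name and the toy data are hypothetical and are not drawn from the study.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index: share of comparable pairs that are ordered correctly.

    A pair (i, j) is comparable when subject i has the shorter follow-up time
    and an observed event (events[i] == 1). The pair is concordant when the
    subject who died earlier also has the higher risk score. Ties in risk
    score count as half. Pairs with tied times are skipped in this sketch.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i] == 1:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable


# Toy data (hypothetical): follow-up in months, 1 = death observed, 0 = censored.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
risk = [0.9, 0.4, 0.6, 0.1]
c_index = concordance_index(times, events, risk)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the reported 0.769 and 0.676 should be read.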
Affiliation(s)
- Shulin Chen
- State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Guangdong Key Laboratory of Nasopharyngeal Carcinoma Diagnosis and Therapy, Sun Yat-Sen University Cancer Center, 651 Dongfeng Road East, Guangzhou, 510060, People's Republic of China
- Hanqing Huang
- Department of Thoracic Surgery, Maoming People's Hospital, Maoming, 525000, Guangdong, People's Republic of China
- Yijun Liu
- State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Guangdong Key Laboratory of Nasopharyngeal Carcinoma Diagnosis and Therapy, Sun Yat-Sen University Cancer Center, 651 Dongfeng Road East, Guangzhou, 510060, People's Republic of China
- Changchun Lai
- Department of Clinical Laboratory, Maoming People's Hospital, Maoming, 525000, Guangdong, People's Republic of China
- Songguo Peng
- State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Guangdong Key Laboratory of Nasopharyngeal Carcinoma Diagnosis and Therapy, Sun Yat-Sen University Cancer Center, 651 Dongfeng Road East, Guangzhou, 510060, People's Republic of China
- Lei Zhou
- Department of Pathology Laboratory, Maoming People's Hospital, Maoming, 525000, Guangdong, People's Republic of China
- Hao Chen
- State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Guangdong Key Laboratory of Nasopharyngeal Carcinoma Diagnosis and Therapy, Sun Yat-Sen University Cancer Center, 651 Dongfeng Road East, Guangzhou, 510060, People's Republic of China
- Yiwei Xu
- Department of Clinical Laboratory, The Cancer Hospital of Shantou University Medical College, Precision Medicine Research Center, Shantou University Medical College, Shantou, 515041, People's Republic of China
- Xia He
- State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Guangdong Key Laboratory of Nasopharyngeal Carcinoma Diagnosis and Therapy, Sun Yat-Sen University Cancer Center, 651 Dongfeng Road East, Guangzhou, 510060, People's Republic of China.
3
Ni M, Wang L, Yu H, Wen X, Yang Y, Liu G, Hu Y, Li Z. Radiomics Approaches for Predicting Liver Fibrosis With Nonenhanced T1-Weighted Imaging: Comparison of Different Radiomics Models. J Magn Reson Imaging 2020; 53:1080-1089. [PMID: 33043991 DOI: 10.1002/jmri.27391] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 07/13/2020] [Revised: 09/24/2020] [Accepted: 09/25/2020] [Indexed: 12/17/2022]
Abstract
BACKGROUND Liver fibrosis is a common process resulting from various etiologies. Sustained progression of liver fibrosis leads to cirrhosis and even hepatocellular carcinoma. Thus, noninvasive staging of liver fibrosis is of clinical importance. Radiomics is an emerging approach for staging liver fibrosis. However, the feature selection methods and classifier models are complicated, and diagnostic performance may differ between radiomics models. PURPOSE To identify the optimal feature selection and classifier methods for predicting liver fibrosis by using nonenhanced T1-weighted imaging. STUDY TYPE Prospective. ANIMAL MODEL Wistar rats, total 97. FIELD STRENGTH/SEQUENCE 3T, 3D T1-weighted images with fast-spoiled gradient echo (FSPGR). ASSESSMENT Liver fibrosis was induced in rats via subcutaneous injection of a carbon tetrachloride mixture. Rats in the control group were injected with saline. Segmentation and feature extraction were performed with 3D Slicer and the Imaging Biomarker Explorer (IBEX) software package. Data preprocessing, feature selection, model building, and model comparative evaluation were conducted with Python. The liver fibrosis stage was determined by pathological examination. STATISTICAL TESTS Receiver operating characteristic curve, fuzzy comprehensive evaluation. RESULTS For discriminating between F0 and F1-2, F0 and F3-4, F0 and F1-4, F0-1 and F2-4, F0-2 and F3-4, and F0-3 and F4, the accuracies of 12 radiomics models were 77.27-90.91%, 73.33-86.67%, 80.56-91.67%, 74.07-88.89%, 76.47-88.24%, and 79.49-92.31%, respectively. The AUCs of the radiomics models were 0.86-0.97, 0.85-0.95, 0.89-0.97, 0.81-0.96, 0.82-0.93, and 0.85-0.96, respectively. The least absolute shrinkage and selection operator / support vector machine (LASSO-SVM) model had high AUCs of 0.93-0.97. For discriminating between F0 and F1-2, F0 and F3-4, F0 and F1-4, F0-1 and F2-4, and F0-2 and F3-4, the fuzzy comprehensive evaluation showed that the LASSO-SVM model had a high fuzzy score/order of 0.087-0.091/1. DATA CONCLUSION LASSO-SVM appears to be the optimal model for predicting liver fibrosis by using nonenhanced T1-weighted imaging in a rodent model of liver fibrosis. LEVEL OF EVIDENCE 2. TECHNICAL EFFICACY STAGE 2.
Affiliation(s)
- Ming Ni
- Department of Radiology, The Affiliated Hospital of Qingdao University, Qingdao, China
- Lili Wang
- Department of Pathology, The Affiliated Hospital of Qingdao University, Qingdao, China
- Haiyang Yu
- Department of Radiology, The Affiliated Hospital of Qingdao University, Qingdao, China
- Xiaoyi Wen
- Department of Statistics and Actuarial Science, University of Hong Kong, Hong Kong, China
- Yinghua Yang
- College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao, China
- Guangzhen Liu
- Department of Radiology, The Affiliated Hospital of Qingdao University, Qingdao, China
- Yabin Hu
- Department of Radiology, The Affiliated Hospital of Qingdao University, Qingdao, China
- Zhiming Li
- Department of Radiology, The Affiliated Hospital of Qingdao University, Qingdao, China
4
Zhang D, Li Z, Zhang R, Yang X, Zhang D, Li Q, Wang C, Yang X, Xiong Y. Identification of differentially expressed and methylated genes associated with rheumatoid arthritis based on network. Autoimmunity 2020; 53:303-313. [DOI: 10.1080/08916934.2020.1786069] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 12/16/2022]
Affiliation(s)
- Di Zhang
- Institute of Endemic Diseases and Key Laboratory of Trace Elements and Endemic Diseases, National Health Commission of the People’s Republic of China, School of Public Health, Xi’an Jiaotong University Health Science Center, Xi’an, Shaanxi, P.R. China
- ZhaoFang Li
- Institute of Endemic Diseases and Key Laboratory of Trace Elements and Endemic Diseases, National Health Commission of the People’s Republic of China, School of Public Health, Xi’an Jiaotong University Health Science Center, Xi’an, Shaanxi, P.R. China
- RongQiang Zhang
- Shaanxi University of Chinese Medicine, Xianyang, Shaanxi, P.R. China
- XiaoLi Yang
- Institute of Endemic Diseases and Key Laboratory of Trace Elements and Endemic Diseases, National Health Commission of the People’s Republic of China, School of Public Health, Xi’an Jiaotong University Health Science Center, Xi’an, Shaanxi, P.R. China
- DanDan Zhang
- Institute of Endemic Diseases and Key Laboratory of Trace Elements and Endemic Diseases, National Health Commission of the People’s Republic of China, School of Public Health, Xi’an Jiaotong University Health Science Center, Xi’an, Shaanxi, P.R. China
- Qiang Li
- Institute of Endemic Diseases and Key Laboratory of Trace Elements and Endemic Diseases, National Health Commission of the People’s Republic of China, School of Public Health, Xi’an Jiaotong University Health Science Center, Xi’an, Shaanxi, P.R. China
- Chen Wang
- Institute of Endemic Diseases and Key Laboratory of Trace Elements and Endemic Diseases, National Health Commission of the People’s Republic of China, School of Public Health, Xi’an Jiaotong University Health Science Center, Xi’an, Shaanxi, P.R. China
- Xuena Yang
- Institute of Endemic Diseases and Key Laboratory of Trace Elements and Endemic Diseases, National Health Commission of the People’s Republic of China, School of Public Health, Xi’an Jiaotong University Health Science Center, Xi’an, Shaanxi, P.R. China
- YongMin Xiong
- Institute of Endemic Diseases and Key Laboratory of Trace Elements and Endemic Diseases, National Health Commission of the People’s Republic of China, School of Public Health, Xi’an Jiaotong University Health Science Center, Xi’an, Shaanxi, P.R. China
5
Bainter SA, McCaulley TG, Wager T, Losin ER. Improving Practices for Selecting a Subset of Important Predictors in Psychology: An Application to Predicting Pain. ADVANCES IN METHODS AND PRACTICES IN PSYCHOLOGICAL SCIENCE 2020; 3:66-80. [PMID: 34327305 PMCID: PMC8317830 DOI: 10.1177/2515245919885617] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 12/30/2022]
Abstract
Frequently, researchers in psychology are faced with the challenge of narrowing down a large set of predictors to a smaller subset. There are a variety of ways to do this, but commonly it is done by choosing predictors with the strongest bivariate correlations with the outcome. However, when predictors are correlated, bivariate relationships may not translate into multivariate relationships. Further, any attempts to control for multiple testing are likely to result in extremely low power. Here we introduce a Bayesian variable-selection procedure frequently used in other disciplines, stochastic search variable selection (SSVS). We apply this technique to choosing the best set of predictors of the perceived unpleasantness of an experimental pain stimulus from among a large group of sociocultural, psychological, and neurobiological (functional MRI) individual-difference measures. Using SSVS provides information about which variables predict the outcome, controlling for uncertainty in the other variables of the model. This approach yields new, useful information to guide the choice of relevant predictors. We have provided Web-based open-source software for performing SSVS and visualizing the results.
Affiliation(s)
- Tor Wager
- Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH
6
Tse LA, Dai J, Chen M, Liu Y, Zhang H, Wong TW, Leung CC, Kromhout H, Meijer E, Liu S, Wang F, Yu ITS, Shen H, Chen W. Prediction models and risk assessment for silicosis using a retrospective cohort study among workers exposed to silica in China. Sci Rep 2015; 5:11059. [PMID: 26090590 PMCID: PMC4473532 DOI: 10.1038/srep11059] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Received: 11/19/2014] [Accepted: 05/15/2015] [Indexed: 11/09/2022]
Abstract
This study aims to develop a prognostic risk prediction model for the development of silicosis among workers exposed to silica dust in China. The prediction model was developed using a retrospective cohort of 3,492 workers exposed to silica in an iron ore mine, with 33 years of follow-up. We developed a risk score system using a linear combination of the predictors weighted by the LASSO penalized Cox regression coefficients. The model's predictive accuracy was evaluated using time-dependent ROC curves. Six predictors were selected into the final prediction model (age at entry of the cohort, mean concentration of respirable silica, net years of dust exposure, smoking, illiteracy, and no. of jobs). We classified workers into three risk groups according to the quartiles (Q1, Q3) of the risk score; 203 (23.28%) incident silicosis cases were derived from the high risk group (risk score ≥ 5.91), whilst only 4 (0.46%) cases were from the low risk group (risk score < 3.97). The score system was regarded as accurate given the range of AUCs (83-96%). This study developed a unique score system with good internal validity, which provides scientific guidance to clinicians for identifying high-risk workers and thus has important cost-efficiency implications.
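The scoring scheme described above (a coefficient-weighted linear combination, then three groups split at Q1 and Q3) can be sketched as follows. The coefficient values and predictor names are hypothetical placeholders, not the published LASSO Cox estimates; only the cut-offs 3.97 and 5.91 come from the abstract.

```python
import statistics

# Hypothetical coefficients for illustration only; not the published values.
COEFS = {"age_at_entry": 0.05, "exposure_years": 0.1}


def risk_score(predictors, coefs):
    """Linear combination of predictor values weighted by model coefficients."""
    return sum(coefs[name] * predictors[name] for name in coefs)


def quartile_cutoffs(scores):
    """Return (Q1, Q3) of the cohort's risk scores."""
    q1, _median, q3 = statistics.quantiles(scores, n=4)
    return q1, q3


def risk_group(score, q1, q3):
    """Low below Q1, high at or above Q3, medium in between."""
    if score < q1:
        return "low"
    if score >= q3:
        return "high"
    return "medium"


# A hypothetical worker scored with the placeholder coefficients, then
# classified with the cut-offs reported in the abstract.
worker = {"age_at_entry": 40, "exposure_years": 20}
score = risk_score(worker, COEFS)
group = risk_group(score, q1=3.97, q3=5.91)
```

In a new cohort the cut-offs would come from `quartile_cutoffs` applied to that cohort's own scores rather than from fixed constants.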
Affiliation(s)
- Lap Ah Tse
- Division of Occupational and Environmental Health, JC School of Public Health and Primary Care, the Chinese University of Hong Kong, HKSAR, China
- Juncheng Dai
- [1] Division of Occupational and Environmental Health, JC School of Public Health and Primary Care, the Chinese University of Hong Kong, HKSAR, China [2] Department of Epidemiology and Biostatistics, Collaborative Innovation Center of Cancer Medicine, School of Public Health, Nanjing Medical University, Nanjing, China
- Minghui Chen
- Division of Occupational and Environmental Health, JC School of Public Health and Primary Care, the Chinese University of Hong Kong, HKSAR, China
- Yuewei Liu
- Department of Occupational & Environmental Health and MOE Key lab of Environmental and Health, School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Hao Zhang
- Department of Occupational & Environmental Health and MOE Key lab of Environmental and Health, School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Tze Wai Wong
- Division of Occupational and Environmental Health, JC School of Public Health and Primary Care, the Chinese University of Hong Kong, HKSAR, China
- Chi Chiu Leung
- Pneumoconiosis Clinic, Department of Health, HKSAR, China
- Hans Kromhout
- Institute for Risk Assessment Sciences, Utrecht University, Netherlands
- Evert Meijer
- Pneumoconiosis Clinic, Department of Health, HKSAR, China
- Su Liu
- Division of Occupational and Environmental Health, JC School of Public Health and Primary Care, the Chinese University of Hong Kong, HKSAR, China
- Feng Wang
- Division of Occupational and Environmental Health, JC School of Public Health and Primary Care, the Chinese University of Hong Kong, HKSAR, China
- Ignatius Tak-sun Yu
- [1] Division of Occupational and Environmental Health, JC School of Public Health and Primary Care, the Chinese University of Hong Kong, HKSAR, China [2] Hong Kong Academy of Occupational and Environmental Health
- Hongbing Shen
- Department of Epidemiology and Biostatistics, Collaborative Innovation Center of Cancer Medicine, School of Public Health, Nanjing Medical University, Nanjing, China
- Weihong Chen
- Department of Occupational & Environmental Health and MOE Key lab of Environmental and Health, School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
7
Stingo FC, Swartz MD, Vannucci M. A Bayesian approach to identify genes and gene-level SNP aggregates in a genetic analysis of cancer data. STATISTICS AND ITS INTERFACE 2015; 8:137-151. [PMID: 28989562 PMCID: PMC5630184 DOI: 10.4310/sii.2015.v8.n2.a2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 06/07/2023]
Abstract
Complex diseases, such as cancer, arise from complex etiologies consisting of multiple single-nucleotide polymorphisms (SNPs), each contributing a small amount to the overall risk of disease. Thus, many researchers have gone beyond single-SNP analysis methods, focusing instead on groups of SNPs, for example by analysing haplotypes. More recently, pathway-based methods have been proposed that use prior biological knowledge on gene function to achieve a more powerful analysis of genome-wide association studies (GWAS) data. In this paper we propose a novel Bayesian modeling framework to identify molecular biomarkers for disease prediction. Our method combines pathway-based approaches with multiple SNP analyses of a specified region of interest. The model's development is motivated by SNP data from a lung cancer study. In our approach we define gene-level scores based on SNP allele frequencies and use a linear modeling setting to study the scores' association with the observed phenotype. The basic idea behind the definition of gene-level scores is to weigh the SNPs within the gene according to their rarity, based on genotype frequencies expected under the Hardy-Weinberg equilibrium law. This results in scores giving more importance to the unusually low frequencies, i.e. to SNPs that might indicate peculiar genetic differences between subjects belonging to different groups. An additional feature of our approach is that we incorporate information on SNP-to-SNP associations into the model. In particular, we use network priors that model the linkage disequilibrium between SNPs. For posterior inference, we design a stochastic search method that identifies significant biomarkers (genes and SNPs) for disease prediction. We assess performance on simulated data and compare results to existing approaches. We then show the ability of the proposed methodology to detect relevant genes and associated SNPs in a lung cancer dataset.
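The idea of weighting genotypes by their rarity under Hardy-Weinberg equilibrium can be sketched as follows. The expected genotype frequencies (p², 2pq, q²) are the standard Hardy-Weinberg result; the negative-log weight and the summed gene score are assumptions chosen for illustration and are not the scoring formula of the paper, which the abstract does not state.

```python
import math


def hwe_genotype_frequencies(p):
    """Expected frequencies of genotypes AA, Aa, aa for allele-A frequency p."""
    q = 1.0 - p
    return {"AA": p * p, "Aa": 2.0 * p * q, "aa": q * q}


def rarity_weight(freq):
    """Hypothetical weight: the rarer the genotype, the larger the weight.

    Requires 0 < freq <= 1. This is an illustrative choice, not the paper's.
    """
    return -math.log(freq)


def gene_score(genotypes, allele_freqs):
    """Sum the rarity weights of one subject's genotypes across a gene's SNPs."""
    return sum(
        rarity_weight(hwe_genotype_frequencies(p)[g])
        for g, p in zip(genotypes, allele_freqs)
    )


# Toy example: two SNPs in one gene, each with major-allele frequency 0.9.
freqs = hwe_genotype_frequencies(0.9)
common_carrier = gene_score(["AA", "AA"], [0.9, 0.9])
rare_carrier = gene_score(["aa", "AA"], [0.9, 0.9])
```

Under this weighting, a subject carrying the rare homozygous genotype receives a higher gene-level score than one carrying only common genotypes, which mirrors the emphasis on unusually low frequencies described in the abstract.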
Affiliation(s)
- Francesco C Stingo
- Department of Biostatistics, MD Anderson Cancer Center, 1400 Pressler St. Houston, TX 77030, USA
- Michael D Swartz
- Department of Biostatistics, UT School of Public Health, 1200 Pressler St. Houston, TX 77030, USA
- Marina Vannucci
- Department of Statistics, MS 138, Rice University, 6100 Main St. Houston, TX 77251-1892 USA
8
Zhang X, Xue F, Liu H, Zhu D, Peng B, Wiemels JL, Yang X. Integrative Bayesian variable selection with gene-based informative priors for genome-wide association studies. BMC Genet 2014; 15:130. [PMID: 25491445 PMCID: PMC4275962 DOI: 10.1186/s12863-014-0130-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 05/20/2014] [Accepted: 11/17/2014] [Indexed: 11/10/2022]
Abstract
Background Genome-wide Association Studies (GWAS) are typically designed to identify phenotype-associated single nucleotide polymorphisms (SNPs) individually using univariate analysis methods. Though providing valuable insights into genetic risks of common diseases, the genetic variants identified by GWAS generally account for only a small proportion of the total heritability for complex diseases. To address this “missing heritability” problem, we implemented a strategy called integrative Bayesian Variable Selection (iBVS), which is based on a hierarchical model that incorporates an informative prior by considering the gene interrelationship as a network. It was applied here to both simulated and real data sets. Results Simulation studies indicated that the iBVS method performed best, with the highest AUC in both variable selection and outcome prediction, when compared with Stepwise and LASSO based strategies. In an analysis of a leprosy case–control study, iBVS selected 94 SNPs as predictors, while LASSO selected 100 SNPs. The Stepwise regression yielded a more parsimonious model with only 3 SNPs. The prediction results demonstrated that the iBVS method performed comparably to LASSO and better than Stepwise strategies. Conclusions The proposed iBVS strategy is a novel and valid method for Genome-wide Association Studies, with the additional advantage that, unlike LASSO and other penalized regression methods, it produces more interpretable posterior probabilities for each variable.
Affiliation(s)
- Xiaoshuai Zhang
- School of Public Health, Shandong University, Jinan, Shandong, 250012, China.
- Fuzhong Xue
- School of Public Health, Shandong University, Jinan, Shandong, 250012, China.
- Hong Liu
- Shandong Provincial Institute of Dermatology and Venereology, Shandong Academy of Medical Science, Jinan, Shandong, 250022, China.
- Dianwen Zhu
- Bayessoft, Inc., 2221 Caravaggio Drive, Davis, CA, 95618, USA.
- Bin Peng
- School of Public Health, Chongqing Medical University, Chongqing, 400016, China.
- Joseph L Wiemels
- Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA, 94158, USA.
- Xiaowei Yang
- Bayessoft, Inc., 2221 Caravaggio Drive, Davis, CA, 95618, USA.
9
Swartz MD, Peterson CB, Lupo PJ, Wu X, Forman MR, Spitz MR, Hernandez LM, Vannucci M, Shete S. Investigating multiple candidate genes and nutrients in the folate metabolism pathway to detect genetic and nutritional risk factors for lung cancer. PLoS One 2013; 8:e53475. [PMID: 23372658 PMCID: PMC3553105 DOI: 10.1371/journal.pone.0053475] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.5] [Received: 07/23/2012] [Accepted: 11/28/2012] [Indexed: 11/18/2022]
Abstract
PURPOSE Folate metabolism, with its importance to DNA repair, provides a promising region for genetic investigation of lung cancer risk. This project investigates genes (MTHFR, MTR, MTRR, CBS, SHMT1, TYMS), folate metabolism related nutrients (B vitamins, methionine, choline, and betaine) and their gene-nutrient interactions. METHODS We analyzed 115 tag single nucleotide polymorphisms (SNPs) and 15 nutrients from 1239 and 1692 non-Hispanic white, histologically-confirmed lung cancer cases and controls, respectively, using stochastic search variable selection (a Bayesian model averaging approach). Analyses were stratified by current, former, and never smoking status. RESULTS Rs6893114 in MTRR (odds ratio [OR] = 2.10; 95% credible interval [CI]: 1.20-3.48) and alcohol (drinkers vs. non-drinkers, OR = 0.48; 95% CI: 0.26-0.84) were associated with lung cancer risk in current smokers. Rs13170530 in MTRR (OR = 1.70; 95% CI: 1.10-2.87) and two SNP*nutrient interactions [betaine*rs2658161 (OR = 0.42; 95% CI: 0.19-0.88) and betaine*rs16948305 (OR = 0.54; 95% CI: 0.30-0.91)] were associated with lung cancer risk in former smokers. SNPs in MTRR (rs13162612; OR = 0.25; 95% CI: 0.11-0.58; rs10512948; OR = 0.61; 95% CI: 0.41-0.90; rs2924471; OR = 3.31; 95% CI: 1.66-6.59), and MTHFR (rs9651118; OR = 0.63; 95% CI: 0.43-0.95) and three SNP*nutrient interactions (choline*rs10475407; OR = 1.62; 95% CI: 1.11-2.42; choline*rs11134290; OR = 0.51; 95% CI: 0.27-0.92; and riboflavin*rs8767412; OR = 0.40; 95% CI: 0.15-0.95) were associated with lung cancer risk in never smokers. CONCLUSIONS This study identified possible nutrient and genetic factors related to folate metabolism associated with lung cancer risk, which could potentially lead to nutritional interventions tailored by smoking status to reduce lung cancer risk.
Affiliation(s)
- Michael D Swartz
- Division of Biostatistics, University of Texas School of Public Health, Houston, Texas, United States of America.
10
Identifying highly conserved and highly differentiated gene ontology categories in human populations. PLoS One 2011; 6:e27871. [PMID: 22140477 PMCID: PMC3227580 DOI: 10.1371/journal.pone.0027871] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2010] [Accepted: 10/27/2011] [Indexed: 11/19/2022]
Abstract
Detecting and interpreting certain system-level characteristics associated with human population genetic differences is a challenge for human geneticists. In this study, we conducted a population genetic study using the HapMap genotype data to identify Gene Ontology (GO) categories associated with high or low genetic difference among 11 HapMap populations. Initially, the genetic differences in each gene region among these populations were measured using allele frequency, linkage disequilibrium (LD) pattern, and transferability of tagSNPs. The associations between each GO term and these genetic differences were then identified. The results showed that cellular process, catalytic activity, binding, and some of their sub-terms were associated with high levels of genetic difference, and genes involved in these functional categories displayed, on average, high genetic diversity among different populations. By contrast, multicellular organismal processes, molecular transducer activity, and some of their sub-terms were associated with low levels of genetic difference. In particular, the neurological system process under the multicellular organismal process category had low levels of genetic difference; the neurological function also showed high evolutionary conservation between species in some previous studies. These results may provide new insight into human evolutionary history at the system level.
11
Abstract
Recent advances in next-generation sequencing technologies have made it possible to generate large amounts of sequence data with rare variants in a cost-effective way. Statistical methods that test variants individually are underpowered to detect rare variants, so it is desirable to perform association analysis of rare variants by combining the information from all variants. In this study, we use a Bayesian regression method to model all variants simultaneously to identify rare variants in a data set from Genetic Analysis Workshop 17. We studied the association between the quantitative risk traits Q1, Q2, and Q4 and the single-nucleotide polymorphisms and identified several positive single-nucleotide polymorphisms for traits Q1 and Q2. However, the model also generated several apparent false positives and missed many true positives, suggesting that there is room for improvement in this model.
Affiliation(s)
- Aimin Yan
- Department of Biostatistics, Harvard School of Public Health, 677 Huntington Avenue, Boston, MA 02115, USA.
12
Ghosh S. Genome-wide association analyses of quantitative traits: the GAW16 experience. Genet Epidemiol 2010; 33 Suppl 1:S13-8. [PMID: 19924711 DOI: 10.1002/gepi.20466] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/31/2022]
Abstract
The group that formed on the theme of genome-wide association analyses of quantitative traits (Group 2) in the Genetic Analysis Workshop 16 comprised eight sets of investigators. Three data sets were available: one on autoantibodies related to rheumatoid arthritis provided by the North American Rheumatoid Arthritis Consortium; the second on anthropometric, lipid, and biochemical measures provided by the Framingham Heart Study (FHS); and the third a simulated data set modeled after FHS. The different investigators in the group addressed a large set of statistical challenges and applied a wide spectrum of association methods in analyzing quantitative traits at the genome-wide level. While some previously reported genes were validated, some novel chromosomal regions provided significant evidence of association in multiple contributions in the group. In this report, we discuss the different strategies explored by the different investigators with the common goal of improving the power to detect association.
Affiliation(s)
- Saurabh Ghosh
- Human Genetics Unit, Indian Statistical Institute, Kolkata, India.