Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar


Total Articles: 154 (from Reference Citation Analysis)
Article PDFs: 44
Cited by > 0: 118
Searched Name: Lasso

Citation Analysis

Svensson T, Svensson AK, Kitlinski M, Engström G, Nilsson J, Orho-Melander M, Nilsson PM, Melander O. Very short sleep duration reveals a proteomic fingerprint that is selectively associated with incident diabetes mellitus but not with incident coronary heart disease: a cohort study. BMC Med 2024;22:173. [PMID: 38649900 PMCID: PMC11035142 DOI: 10.1186/s12916-024-03392-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Accepted: 04/15/2024] [Indexed: 04/25/2024]

Abstract

BACKGROUND

The molecular pathways linking short and long sleep duration with incident diabetes mellitus (iDM) and incident coronary heart disease (iCHD) are not known. We aimed to identify circulating protein patterns associated with sleep duration and test their impact on incident cardiometabolic disease.

METHODS

We assessed sleep duration and measured 78 plasma proteins among 3336 participants aged 46-68 years, free from DM and CHD at baseline, and identified cases of iDM and iCHD using national registers. Incident events occurring in the first 3 years of follow-up were excluded from analyses. Tenfold cross-fit partialing-out lasso logistic regression adjusted for age and sex was used to identify proteins that significantly predicted sleep duration quintiles when compared with the referent quintile 3 (Q3). Predictive proteins were weighted and combined into proteomic scores (PS) for sleep duration Q1, Q2, Q4, and Q5. Combinations of PS were included in a linear regression model to identify the best predictors of habitual sleep duration. Cox proportional hazards regression models with sleep duration quintiles and sleep-predictive PS as the main exposures were related to iDM and iCHD after adjustment for known covariates.

RESULTS

Sixteen unique proteomic markers, predominantly reflecting inflammation and apoptosis, predicted sleep duration quintiles. The combination of PSQ1 and PSQ5 best predicted sleep duration. Mean follow-up times for iDM (n = 522) and iCHD (n = 411) were 21.8 and 22.4 years, respectively. Compared with sleep duration Q3, all sleep duration quintiles were positively and significantly associated with iDM. Only sleep duration Q1 was positively and significantly associated with iCHD. Inclusion of PSQ1 and PSQ5 abrogated the association between sleep duration Q1 and iDM. Moreover, PSQ1 was significantly associated with iDM (HR = 1.27, 95% CI: 1.06-1.53). PSQ1 and PSQ5 were not associated with iCHD and did not markedly attenuate the association between sleep duration Q1 with iCHD.

CONCLUSIONS

We here identify plasma proteomic fingerprints of sleep duration and suggest that PSQ1 could explain the association between very short sleep duration and incident DM.
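The selection-then-scoring idea described in the methods can be illustrated with a minimal sketch. This is not the authors' procedure: the study used tenfold cross-fit partialing-out lasso logistic regression, whereas the sketch below swaps in a plain linear lasso solved by cyclic coordinate descent on toy data, only to show how the L1 penalty zeroes out uninformative predictors and how the surviving coefficients can weight a score. All function names and values are hypothetical.

```python
def soft_threshold(z, lam):
    """Soft-thresholding operator used in each lasso coordinate update."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimise (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0   # correlation of feature j with the partial residual
            norm = 0.0  # squared norm of feature j
            for i in range(n):
                pred_others = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_others)
                norm += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (norm / n) if norm else 0.0
    return beta


def weighted_score(row, beta):
    """Combine predictors into one score, weighted by the lasso coefficients."""
    return sum(x * b for x, b in zip(row, beta))


# Toy data (not study data): feature 0 drives y, feature 1 is unrelated.
X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y = [2.0, -2.0, 2.0, -2.0]
beta = lasso_coordinate_descent(X, y, lam=0.5)
scores = [weighted_score(row, beta) for row in X]
```

In this toy case the informative coefficient is shrunk towards zero and the unrelated one is dropped entirely, so the score depends only on the selected predictor.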


Wang S, Li W, Zeng N, Xu J, Yang Y, Deng X, Chen Z, Duan W, Liu Y, Guo Y, Chen R, Kang Y. Acute exacerbation prediction of COPD based on Auto-metric graph neural network with inspiratory and expiratory chest CT images. Heliyon 2024;10:e28724. [PMID: 38601695 PMCID: PMC11004525 DOI: 10.1016/j.heliyon.2024.e28724] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2023] [Revised: 03/16/2024] [Accepted: 03/22/2024] [Indexed: 04/12/2024]

Affiliation(s)

Shicong Wang: College of Health Science and Environmental Engineering, Shenzhen Technology University, Shenzhen 518118, China; School of Applied Technology, Shenzhen University, Shenzhen 518060, China
Wei Li: College of Health Science and Environmental Engineering, Shenzhen Technology University, Shenzhen 518118, China
Nanrong Zeng: College of Health Science and Environmental Engineering, Shenzhen Technology University, Shenzhen 518118, China; School of Applied Technology, Shenzhen University, Shenzhen 518060, China
Jiaxuan Xu: The First Affiliated Hospital of Guangzhou Medical University, State Key Laboratory of Respiratory Disease, National Clinical Research Center for Respiratory Disease, Guangzhou Institute of Respiratory Health, The National Center for Respiratory Medicine, Guangzhou 510120, China
Yingjian Yang: College of Health Science and Environmental Engineering, Shenzhen Technology University, Shenzhen 518118, China; College of Medicine and Biological Information Engineering, Northeastern University, Shenyang 110169, China
Xingguang Deng: College of Health Science and Environmental Engineering, Shenzhen Technology University, Shenzhen 518118, China; College of Medicine and Biological Information Engineering, Northeastern University, Shenyang 110169, China
Ziran Chen: College of Health Science and Environmental Engineering, Shenzhen Technology University, Shenzhen 518118, China
Wenxin Duan: College of Health Science and Environmental Engineering, Shenzhen Technology University, Shenzhen 518118, China; School of Applied Technology, Shenzhen University, Shenzhen 518060, China
Yang Liu: College of Health Science and Environmental Engineering, Shenzhen Technology University, Shenzhen 518118, China
Yingwei Guo: College of Health Science and Environmental Engineering, Shenzhen Technology University, Shenzhen 518118, China; College of Medicine and Biological Information Engineering, Northeastern University, Shenyang 110169, China
Rongchang Chen: The First Affiliated Hospital of Guangzhou Medical University, State Key Laboratory of Respiratory Disease, National Clinical Research Center for Respiratory Disease, Guangzhou Institute of Respiratory Health, The National Center for Respiratory Medicine, Guangzhou 510120, China; Department of Respiratory and Critical Care Medicine, First Affiliated Hospital of Southern University of Science and Technology, Second Clinical Medical College of Jinan University, Shenzhen People's Hospital, Shenzhen Institute of Respiratory Diseases, Shenzhen 518001, China
Yan Kang: College of Health Science and Environmental Engineering, Shenzhen Technology University, Shenzhen 518118, China; School of Applied Technology, Shenzhen University, Shenzhen 518060, China; College of Medicine and Biological Information Engineering, Northeastern University, Shenyang 110169, China; Engineering Research Centre of Medical Imaging and Intelligent Analysis, Ministry of Education, Shenyang 110169, China


Sawant PA, Hiralkar SS, Hulsurkar YP, Phutane MS, Mahajan US, Kudale AM. Predicting over-the-counter antibiotic use in rural Pune, India, using machine learning methods. Epidemiol Health 2024:e2024044. [PMID: 38637971 DOI: 10.4178/epih.e2024044] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2023] [Accepted: 03/25/2024] [Indexed: 04/20/2024]

Dai B, Breheny P. Cross-validation approaches for penalized Cox regression. Stat Methods Med Res 2024;33:702-715. [PMID: 38445300 DOI: 10.1177/09622802241233770] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/07/2024]

Wyss R, van der Laan M, Gruber S, Shi X, Lee H, Dutcher SK, Nelson JC, Toh S, Russo M, Wang SV, Desai RJ, Lin KJ. Targeted Learning with an Undersmoothed Lasso Propensity Score Model for Large-Scale Covariate Adjustment in Healthcare Database Studies. Am J Epidemiol 2024:kwae023. [PMID: 38517025 DOI: 10.1093/aje/kwae023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2023] [Revised: 02/13/2024] [Accepted: 03/18/2024] [Indexed: 03/23/2024]

Chen Q, Zhou T, Zhang C, Zhong X. Exploring relevant factors of cognitive impairment in the elderly Chinese population using Lasso regression and Bayesian networks. Heliyon 2024;10:e27069. [PMID: 38449590 PMCID: PMC10915566 DOI: 10.1016/j.heliyon.2024.e27069] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2023] [Revised: 02/12/2024] [Accepted: 02/23/2024] [Indexed: 03/08/2024]

Allwright M, Guennewig B, Hoffmann AE, Rohleder C, Jieu B, Chung LH, Jiang YC, Lemos Wimmer BF, Qi Y, Don AS, Leweke FM, Couttas TA. ReTimeML: a retention time predictor that supports the LC-MS/MS analysis of sphingolipids. Sci Rep 2024;14:4375. [PMID: 38388524 PMCID: PMC10883992 DOI: 10.1038/s41598-024-53860-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2023] [Accepted: 02/06/2024] [Indexed: 02/24/2024]

Hanke M, Dijkstra L, Foraita R, Didelez V. Variable selection in linear regression models: Choosing the best subset is not always the best choice. Biom J 2024;66:e2200209. [PMID: 37643390 DOI: 10.1002/bimj.202200209] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2022] [Revised: 06/19/2023] [Accepted: 06/22/2023] [Indexed: 08/31/2023]

Wang J, Xu Y, Liu L, Wu W, Shen C, Huang H, Zhen Z, Meng J, Li C, Qu Z, He Q, Tian Y. Comparison of LASSO and random forest models for predicting the risk of premature coronary artery disease. BMC Med Inform Decis Mak 2023;23:297. [PMID: 38124036 PMCID: PMC10734117 DOI: 10.1186/s12911-023-02407-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2023] [Accepted: 12/14/2023] [Indexed: 12/23/2023]

Tanigawa Y, Kellis M. Power of inclusion: Enhancing polygenic prediction with admixed individuals. Am J Hum Genet 2023;110:1888-1902. [PMID: 37890495 PMCID: PMC10645553 DOI: 10.1016/j.ajhg.2023.09.013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2023] [Revised: 09/22/2023] [Accepted: 09/22/2023] [Indexed: 10/29/2023]

Abstract

Admixed individuals offer unique opportunities for addressing limited transferability in polygenic scores (PGSs), given the substantial trans-ancestry genetic correlation in many complex traits. However, they are rarely considered in PGS training, given the challenges in representing ancestry-matched linkage-disequilibrium reference panels for admixed individuals. Here we present inclusive PGS (iPGS), which captures ancestry-shared genetic effects by finding the exact solution for penalized regression on individual-level data and is thus naturally applicable to admixed individuals. We validate our approach in a simulation study across 33 configurations with varying heritability, polygenicity, and ancestry composition in the training set. When iPGS is applied to n = 237,055 ancestry-diverse individuals in the UK Biobank, it shows the greatest improvements in Africans by 48.9% on average across 60 quantitative traits and up to 50-fold improvements for some traits (neutrophil count, R² = 0.058) over the baseline model trained on the same number of European individuals. When we allowed iPGS to use n = 284,661 individuals, we observed an average improvement of 60.8% for African, 11.6% for South Asian, 7.3% for non-British White, 4.8% for White British, and 17.8% for the other individuals. We further developed iPGS+refit to jointly model the ancestry-shared and -dependent genetic effects when heterogeneous genetic associations were present. For neutrophil count, for example, iPGS+refit showed the highest predictive performance in the African group (R² = 0.115), which exceeds the best predictive performance for the White British group (R² = 0.090 in the iPGS model), even though only 1.49% of individuals used in the iPGS training are of African ancestry. Our results indicate the power of including diverse individuals for developing more equitable PGS models.


John M, Lencz T. Potential application of elastic nets for shared polygenicity detection with adapted threshold selection. Int J Biostat 2023;19:417-438. [PMID: 36327464 PMCID: PMC10154439 DOI: 10.1515/ijb-2020-0108] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/28/2020] [Accepted: 10/05/2022] [Indexed: 11/06/2022]

Park S, Lee ER, Hong HG. Varying-coefficients for regional quantile via KNN-based LASSO with applications to health outcome study. Stat Med 2023;42:3903-3918. [PMID: 37365909 DOI: 10.1002/sim.9839] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2022] [Revised: 06/04/2023] [Accepted: 06/18/2023] [Indexed: 06/28/2023]

Orwa J, Oduor P, Okelloh D, Gethi D, Agaya J, Okumu A, Wandiga S. Comparison of logistic regression with regularized machine learning methods for the prediction of tuberculosis disease in people living with HIV: cross-sectional hospital-based study in Kisumu County, Kenya. Res Sq 2023:rs.3.rs-3354948. [PMID: 37790564 PMCID: PMC10543507 DOI: 10.21203/rs.3.rs-3354948/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/05/2023]

Laufer B, Docherty PD, Murray R, Krueger-Ziolek S, Jalal NA, Hoeflinger F, Rupitsch SJ, Reindl L, Moeller K. Sensor Selection for Tidal Volume Determination via Linear Regression-Impact of Lasso versus Ridge Regression. Sensors (Basel) 2023;23:7407. [PMID: 37687863 PMCID: PMC10490437 DOI: 10.3390/s23177407] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2023] [Revised: 08/18/2023] [Accepted: 08/23/2023] [Indexed: 09/10/2023]

Raubitzek S, Mallinger K. On the Applicability of Quantum Machine Learning. Entropy (Basel) 2023;25:992. [PMID: 37509939 PMCID: PMC10377777 DOI: 10.3390/e25070992] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2023] [Revised: 06/22/2023] [Accepted: 06/26/2023] [Indexed: 07/30/2023]

Abstract

In this article, we investigate the applicability of quantum machine learning for classification tasks using two quantum classifiers from the Qiskit Python environment: the variational quantum circuit (VQC) and the quantum kernel estimator (QKE). We provide a first evaluation of the performance of these classifiers when using a hyperparameter search on six widely known and publicly available benchmark datasets and analyze how their performance varies with the number of samples on two artificially generated test classification datasets. As quantum machine learning is based on unitary transformations, this paper explores data structures and application fields that could be particularly suitable for quantum advantages. To this end, this paper introduces a novel dataset based on concepts from quantum mechanics using the exponential map of a Lie algebra. This dataset will be made publicly available and makes a novel contribution to the empirical evaluation of quantum supremacy. We further compared the performance of VQC and QKE on six widely applicable datasets to contextualize our results. Our results demonstrate that the VQC and QKE perform better than basic machine learning algorithms, such as advanced linear regression models (Ridge and Lasso). They do not match the accuracy and runtime performance of sophisticated modern boosting classifiers such as XGBoost, LightGBM, or CatBoost. Therefore, we conclude that while quantum machine learning algorithms have the potential to surpass classical machine learning methods in the future, especially when physical quantum infrastructure becomes widely available, they currently lag behind classical approaches. Our investigations also show that classical machine learning approaches have superior performance classifying datasets based on group structures, compared to quantum approaches that particularly use unitary processes.
Furthermore, our findings highlight the significant impact of different quantum simulators, feature maps, and quantum circuits on the performance of the employed quantum estimators. This observation emphasizes the need for researchers to provide detailed explanations of their hyperparameter choices for quantum machine learning algorithms, as this aspect is currently overlooked in many studies within the field. To facilitate further research in this area and ensure the transparency of our study, we have made the complete code available in a linked GitHub repository.


Hou Y, Zhang A, Lv R, Zhang Y, Ma J, Li T. Machine learning algorithm inversion experiment and pollution analysis of water quality parameters in urban small and medium-sized rivers based on UAV multispectral data. Environ Sci Pollut Res Int 2023:10.1007/s11356-023-27963-6. [PMID: 37278900 DOI: 10.1007/s11356-023-27963-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/13/2022] [Accepted: 05/24/2023] [Indexed: 06/07/2023]

Abstract

To examine the applicability of UAV multispectral images to urban river monitoring, this paper takes the Fuyang River in the urban area of Handan Municipality as its study object. Orthogonal image data of the river in different seasons were acquired by unmanned aerial vehicles (UAVs) equipped with multispectral sensors, and water samples were collected at the same time for detection of physical and chemical indexes. Based on the image data, a total of 51 modeling spectral indexes were obtained by constructing three forms of band combinations, namely the difference index (DI), ratio index (RI), and normalization index (NDI), and combining them with six single-band spectral values. Using the partial least squares (PLS), random forest (RF), and lasso prediction models, fitting models were constructed for six water quality parameters: turbidity (Turb), suspended substance (SS), chemical oxygen demand (COD), ammonia nitrogen (NH₄-N), total nitrogen (TN), and total phosphorus (TP). After verifying the results and evaluating the accuracy, the following conclusions were drawn: (1) The seasonal pattern of inversion accuracy is generally the same for the three types of models: summer is better than spring, and winter is the worst. (2) The water quality parameter inversion models based on the two machine learning algorithms have more prominent advantages than PLS. The RF model performs well in the inversion accuracy and generalization ability of water quality parameters in different seasons. (3) The prediction accuracy and stability of the model are positively correlated to a certain extent with the size of the standard deviation of sample values. In sum, by using the multispectral image data acquired by UAV and adopting prediction models built upon machine learning algorithms, water quality parameters in different seasons can be predicted to varying degrees.
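The count of 51 spectral indexes can be reproduced under one plausible reading of the abstract, sketched below. The abstract does not give the formulas or say whether band pairs are ordered, so the conventional definitions (DI = a - b, RI = a / b, NDI = (a - b) / (a + b)) over unordered pairs of six bands are assumptions here, and the band names and reflectance values are hypothetical. Under that reading, 15 pairs times 3 index forms plus 6 single bands gives 51 features.

```python
from itertools import combinations


def spectral_indices(bands):
    """Build single-band and pairwise index features for one sample.

    bands: dict mapping a band name to its reflectance value.
    """
    features = dict(bands)  # the six single-band values themselves
    for a, b in combinations(sorted(bands), 2):
        va, vb = bands[a], bands[b]
        features[f"DI({a},{b})"] = va - vb                  # difference index
        features[f"RI({a},{b})"] = va / vb                  # ratio index
        features[f"NDI({a},{b})"] = (va - vb) / (va + vb)   # normalised index
    return features


# Hypothetical reflectance values for six bands of one water pixel.
sample = {"b1": 0.04, "b2": 0.06, "b3": 0.05, "b4": 0.09, "b5": 0.12, "b6": 0.15}
features = spectral_indices(sample)
print(len(features))
```

Each sample's feature dictionary could then be passed to whichever regression model (PLS, RF, or lasso) is being compared.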


Pellikka P, Luotamo M, Sädekoski N, Hietanen J, Vuorinne I, Räsänen M, Heiskanen J, Siljander M, Karhu K, Klami A. Tropical altitudinal gradient soil organic carbon and nitrogen estimation using Specim IQ portable imaging spectrometer. Sci Total Environ 2023;883:163677. [PMID: 37105488 DOI: 10.1016/j.scitotenv.2023.163677] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/07/2023] [Revised: 03/25/2023] [Accepted: 04/19/2023] [Indexed: 05/03/2023]

Belhechmi S, Le Teuff G, De Bin R, Rotolo F, Michiels S. Favoring the hierarchical constraint in penalized survival models for randomized trials in precision medicine. BMC Bioinformatics 2023;24:96. [PMID: 36927444 PMCID: PMC10022294 DOI: 10.1186/s12859-023-05162-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2022] [Accepted: 01/27/2023] [Indexed: 03/18/2023]

Abstract

BACKGROUND

Biomarker-treatment interactions are commonly investigated in randomized clinical trials (RCTs) to improve the precision of medicine. The hierarchical interaction constraint states that an interaction should only be in a model if its main effects are also in the model. However, this constraint is not guaranteed in the standard penalized statistical approaches. We aimed to find a compromise for high-dimensional data between the need for sparse model selection and the need for the hierarchical constraint.

RESULTS

To favor the property of the hierarchical interaction constraint, we proposed to create groups composed of the biomarker main effect and its interaction with treatment and to perform the bi-level selection on these groups. We proposed two weighting approaches (Single Wald (SW) and likelihood ratio test (LRT)) for the adaptive lasso method. The selection performance of these two approaches is compared to alternative lasso extensions (adaptive lasso with ridge-based weights, composite Minimax Concave Penalty, group exponential lasso and Sparse Group Lasso) through a simulation study. A RCT (NSABP B-31) randomizing 1574 patients (431 events) with early breast cancer aiming to evaluate the effect of adjuvant trastuzumab on distant-recurrence free survival with expression data from 462 genes measured in the tumour will serve for illustration. The simulation study illustrates that the adaptive lasso LRT and SW, and the group exponential lasso favored the hierarchical interaction constraint. Overall, in the alternative scenarios, they had the best balance of false discovery and false negative rates for the main effects of the selected interactions. For NSABP B-31, 12 gene-treatment interactions were identified more than 20% by the different methods. Among them, the adaptive lasso (SW) approach offered the best trade-off between a high number of selected gene-treatment interactions and a high proportion of selection of both the gene-treatment interaction and its main effect.

CONCLUSIONS

Adaptive lasso with Single Wald and likelihood ratio test weighting and the group exponential lasso approaches outperformed their competitors in favoring the hierarchical constraint of the biomarker-treatment interaction. However, the performance of the methods tends to decrease in the presence of prognostic biomarkers.
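The hierarchical interaction constraint stated in the background can be checked mechanically on any selected model. The sketch below is an illustration, not the authors' code: it assumes the treatment main effect is always in the model, so only the biomarker main effect needs checking, and the term encoding and gene names are hypothetical.

```python
def hierarchy_violations(selected_terms):
    """Return biomarkers whose interaction was selected without their main effect.

    selected_terms: iterable of (kind, biomarker) tuples, where kind is
    "main" or "interaction" (interaction of that biomarker with treatment).
    """
    terms = list(selected_terms)
    mains = {name for kind, name in terms if kind == "main"}
    return sorted(
        name for kind, name in terms
        if kind == "interaction" and name not in mains
    )


# Hypothetical selection result from a penalized survival model.
selected = [
    ("main", "GENE_A"),
    ("interaction", "GENE_A"),   # respects the constraint
    ("interaction", "GENE_B"),   # violates it: no GENE_B main effect
]
violations = hierarchy_violations(selected)
print(violations)
```

The proportion of selected interactions that pass this check is one way to compare how strongly different penalties favor the constraint.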


Xiao Z, Xingjie S, Yiming L, Xu L, Ma S. A General Framework for Identifying Hierarchical Interactions and Its Application to Genomics Data. J Comput Graph Stat 2023;32:873-883. [PMID: 38009111 PMCID: PMC10671243 DOI: 10.1080/10618600.2022.2152034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2021] [Accepted: 11/08/2022] [Indexed: 12/03/2022]

Luo S, Zhang W, Mao R, Huang X, Liu F, Liao Q, Sun D, Chen H, Zhang J, Tian F. Establishment and verification of a nomogram model for predicting the risk of post-stroke depression. PeerJ 2023;11:e14822. [PMID: 36751635 PMCID: PMC9899426 DOI: 10.7717/peerj.14822] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2022] [Accepted: 01/06/2023] [Indexed: 02/05/2023]

Abstract

Objective

The purpose of this study was to establish a nomogram predictive model of clinical risk factors for post-stroke depression (PSD).

Patients and Methods

We used the data of 202 stroke patients collected from Xuanwu Hospital from October 2018 to September 2020 as training data to develop a predictive model. Nineteen clinical factors were selected to evaluate their risk. Least absolute shrinkage and selection operator (LASSO) regression was used to select the best patient attributes, yielding seven predictive factors; multi-factor logistic regression analysis was then carried out to determine six predictive factors and establish a nomogram prediction model. The C-index, calibration chart, and decision curve analyses were used to evaluate the predictive ability, accuracy, and clinical practicability of the prediction model. We then used the data of 156 stroke patients collected by Xiangya Hospital from June 2019 to September 2020 for external verification.

Results

The selected predictors included work style, number of children, time from onset to hospitalization, history of hyperlipidemia, stroke area, and the National Institutes of Health Stroke Scale (NIHSS) score. The model showed good prediction ability, with a C-index of 0.773 (95% confidence interval: [0.696-0.850]). It reached a C-index of 0.71 in bootstrap verification and 0.702 (95% confidence interval: [0.616-0.788]) in external verification. Decision curve analyses further showed that the nomogram of post-stroke depression has high clinical usefulness when the threshold probability was 6%.

Conclusion

This novel nomogram, which combines patients' work style, number of children, time from onset to hospitalization, history of hyperlipidemia, stroke area, and NIHSS score, can help clinicians to assess the risk of depression in patients with acute stroke much earlier in the timeline of the disease, and to implement early intervention treatment so as to reduce the incidence of PSD.
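For readers unfamiliar with the C-index reported above, the following minimal sketch shows how it can be computed for a binary outcome, where it equals the proportion of case/non-case pairs in which the case received the higher predicted risk (ties count one half). The function name and the toy values are hypothetical and are not data from the study.

```python
def c_index(risk, outcome):
    """Concordance index for a binary outcome (1 = event, 0 = no event)."""
    concordant = 0
    ties = 0
    pairs = 0
    n = len(risk)
    for i in range(n):
        for j in range(i + 1, n):
            if outcome[i] == outcome[j]:
                continue  # only pairs with different outcomes are comparable
            pairs += 1
            # Identify which member of the pair had the event.
            case, control = (i, j) if outcome[i] > outcome[j] else (j, i)
            if risk[case] > risk[control]:
                concordant += 1
            elif risk[case] == risk[control]:
                ties += 1
    return (concordant + 0.5 * ties) / pairs


# Toy predicted risks and observed outcomes for four patients.
toy_risk = [0.9, 0.8, 0.3, 0.2]
toy_outcome = [1, 0, 1, 0]
toy_c = c_index(toy_risk, toy_outcome)
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect ranking, which gives context for the reported values of roughly 0.70 to 0.77.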


Tang Q, Pan D, Xu C, Chen L. Identification of molecular subtypes based on chromatin regulator and tumor microenvironment infiltration characterization in papillary renal cell carcinoma. J Cancer Res Clin Oncol 2023;149:231-45. [PMID: 36404389 DOI: 10.1007/s00432-022-04482-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2022] [Accepted: 11/14/2022] [Indexed: 11/21/2022]

Abstract

BACKGROUND

Papillary renal cell carcinoma (pRCC) is the second most common histological type of renal cell carcinoma. The prognosis of local pRCC is better than that of ccRCC, but the situation changes greatly once pRCC metastasizes. Chromatin regulators (CRs) are indispensable in epigenetic regulation, and their abnormal expression in tumors leads to the occurrence and development of tumors. However, the role of CRs in pRCC has not been studied yet.

MATERIALS AND METHODS

291 samples were obtained from the TCGA-KIRP cohort. Unsupervised clustering analysis was utilized to divide the patients with pRCC into two subtypes. Lasso Cox regression analysis was performed to construct a CRs_score model for predicting OS. The unique characteristics of different molecular subtypes were determined by TME cell infiltration analysis, GO and KEGG analysis and drug sensitivity analysis. We also carried out drug sensitivity experiments in vitro to verify the effect of signature genes on drug sensitivity to sunitinib.

RESULTS

We described the transcriptional and genetic alteration of 19 prognosis-related CRs genes in 291 cases of TCGA-KIRP cohort. We identified two distinct molecular subtypes, which have significant differences in prognosis, clinicopathological features and tumor immune microenvironment (TME). Then, four signature genes were selected by lasso regression analysis to construct a CRs_score for predicting OS, and its predictive ability for patients with pRCC was verified. A nomogram was established to improve the clinical applicability of CRs_score. We found that there was a significant difference in the proportion of immune cell infiltration between high- and low-CRs_score. In addition, CRs_score was significantly correlated with chemosensitivity. Finally, we found that SK-RC-39 cell lines were more sensitive to sunitinib after knocking down the signature gene CDCA3, PDIA4, or SUCNR1.

CONCLUSIONS

Our comprehensive analysis of CRs gene in pRCC showed that CRs gene plays a potential role in TME, prognosis and drug resistance in pRCC. These findings may lay a foundation for further study of the regulatory role of CRs gene in pRCC, and provide a new method for evaluating prognosis and developing more effective targeted therapy.
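A signature score of the kind described in the results is typically a coefficient-weighted sum of gene expression values, followed by a split into high and low groups. The sketch below illustrates that pattern only; the abstract does not report the coefficients, the four gene identities in full, or the cut-off rule, so the gene names, coefficient values, and median split used here are all assumptions.

```python
from statistics import median


def risk_scores(expression, coefficients):
    """Weighted sum of signature-gene expression for each sample."""
    return [
        sum(coefficients[gene] * sample[gene] for gene in coefficients)
        for sample in expression
    ]


def split_by_median(scores):
    """Label each sample 'high' or 'low' relative to the median score."""
    cut = median(scores)
    return ["high" if s > cut else "low" for s in scores]


# Hypothetical coefficients and expression values (not from the study).
coefficients = {"g1": 0.5, "g2": -0.2}
expression = [
    {"g1": 2.0, "g2": 1.0},
    {"g1": 1.0, "g2": 3.0},
    {"g1": 4.0, "g2": 0.0},
    {"g1": 0.0, "g2": 0.0},
]
scores = risk_scores(expression, coefficients)
groups = split_by_median(scores)
```

The resulting groups are what would then be compared for survival, immune infiltration, or drug sensitivity.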


Dagdoug M, Goga C, Haziza D. Model-assisted estimation in high-dimensional settings for survey data. J Appl Stat 2023;50:761-785. [PMID: 36819070 PMCID: PMC9930821 DOI: 10.1080/02664763.2022.2047905] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 10/18/2022]

Yao S, Wang X. Statistical and Machine Learning Methods for Discovering Prognostic Biomarkers for Survival Outcomes. Methods Mol Biol 2023;2629:11-21. [PMID: 36929071 DOI: 10.1007/978-1-0716-2986-4_2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/17/2023]

Chen H, Huang L, Jiang X, Wang Y, Bian Y, Ma S, Liu X. Establishment and analysis of a disease risk prediction model for the systemic lupus erythematosus with random forest. Front Immunol 2022;13:1025688. [PMID: 36405750 PMCID: PMC9667742 DOI: 10.3389/fimmu.2022.1025688] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 08/23/2022] [Accepted: 10/17/2022] [Indexed: 09/25/2023] Open

Abstract

Systemic lupus erythematosus (SLE) is a latent, insidious autoimmune disease. Building on the development of gene sequencing in recent years, our study aims to develop a gene-based predictive model to explore the identification of SLE at the genetic level. First, gene expression datasets of SLE whole blood samples were collected from the Gene Expression Omnibus (GEO) database. After the datasets were merged, they were divided into training and validation datasets in a ratio of 7:3; the training dataset contained 334 SLE samples and 71 healthy samples, and the validation dataset contained 143 SLE samples and 30 healthy samples. The training dataset was used to build the disease risk prediction model, and the validation dataset was used to verify the model's identification ability. We first analyzed differentially expressed genes (DEGs) and then used Lasso and random forest (RF) to screen out six key genes (OAS3, USP18, RTP4, SPATS2L, IFI27 and OAS1), which are essential to distinguish SLE from healthy samples. After incorporating the six key genes into the RF model and performing five iterations of 10-fold cross-validation, we determined the RF model with the optimal mtry. The mean values of area under the curve (AUC) and accuracy of the models were over 0.95. The validation dataset was then used to evaluate the AUC performance, and our model had an AUC of 0.948. An external validation dataset (GSE99967) was used to assess the model's performance and yielded an AUC of 0.810, an accuracy of 0.836, and a sensitivity of 0.921. The external validation dataset (GSE185047), consisting entirely of SLE patients, yielded an SLE sensitivity of up to 0.954. The final high-throughput RF model had a mean AUC over 0.9, again showing good results.
In conclusion, we identified key genetic biomarkers and successfully developed a novel disease risk prediction model for SLE that can be used as a new SLE disease risk prediction aid and contribute to the identification of SLE.
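The evaluation metrics quoted in this abstract (AUC, accuracy, sensitivity) can be illustrated with a minimal sketch. The labels and scores below are toy values invented for illustration, the 0.5 decision threshold is an assumption, and the function names are hypothetical; none of this reproduces the authors' pipeline or data.

```python
def auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # Count positive/negative pairs that are ranked correctly; ties count one half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy_and_sensitivity(labels, scores, threshold=0.5):
    """Accuracy and sensitivity (true-positive rate) at an assumed threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    n_pos = sum(1 for y in labels if y == 1)
    return (tp + tn) / len(labels), tp / n_pos


# Toy data (hypothetical): 1 = SLE sample, 0 = healthy sample.
labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.1]

model_auc = auc(labels, scores)
acc, sens = accuracy_and_sensitivity(labels, scores)
print(model_auc, acc, sens)
```

Note that a dataset containing only SLE patients, such as the second external dataset described above, has no negative class, so only sensitivity can be computed on it.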

Breeur M, Ferrari P, Dossus L, Jenab M, Johansson M, Rinaldi S, Travis RC, His M, Key TJ, Schmidt JA, Overvad K, Tjønneland A, Kyrø C, Rothwell JA, Laouali N, Severi G, Kaaks R, Katzke V, Schulze MB, Eichelmann F, Palli D, Grioni S, Panico S, Tumino R, Sacerdote C, Bueno-de-Mesquita B, Olsen KS, Sandanger TM, Nøst TH, Quirós JR, Bonet C, Barranco MR, Chirlaque MD, Ardanaz E, Sandsveden M, Manjer J, Vidman L, Rentoft M, Muller D, Tsilidis K, Heath AK, Keun H, Adamski J, Keski-Rahkonen P, Scalbert A, Gunter MJ, Viallon V. Pan-cancer analysis of pre-diagnostic blood metabolite concentrations in the European Prospective Investigation into Cancer and Nutrition. BMC Med 2022;20:351. [PMID: 36258205 PMCID: PMC9580145 DOI: 10.1186/s12916-022-02553-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2022] [Accepted: 09/05/2022] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Epidemiological studies of associations between metabolites and cancer risk have typically focused on specific cancer types separately. Here, we designed a multivariate pan-cancer analysis to identify metabolites potentially associated with multiple cancer types, while also allowing the investigation of cancer type-specific associations.

METHODS

We analysed targeted metabolomics data available for 5828 matched case-control pairs from cancer-specific case-control studies on breast, colorectal, endometrial, gallbladder, kidney, localized and advanced prostate cancer, and hepatocellular carcinoma nested within the European Prospective Investigation into Cancer and Nutrition (EPIC) cohort. From pre-diagnostic blood levels of an initial set of 117 metabolites, 33 cluster representatives of strongly correlated metabolites and 17 single metabolites were derived by hierarchical clustering. The mutually adjusted associations of the resulting 50 metabolites with cancer risk were examined in penalized conditional logistic regression models adjusted for body mass index, using the data-shared lasso penalty.

RESULTS

Out of the 50 studied metabolites, (i) six were inversely associated with the risk of most cancer types: glutamine, butyrylcarnitine, lysophosphatidylcholine a C18:2, and three clusters of phosphatidylcholines (PCs); (ii) three were positively associated with most cancer types: proline, decanoylcarnitine, and one cluster of PCs; and (iii) 10 were specifically associated with particular cancer types, including histidine that was inversely associated with colorectal cancer risk and one cluster of sphingomyelins that was inversely associated with risk of hepatocellular carcinoma and positively with endometrial cancer risk.

CONCLUSIONS

These results could provide novel insights for the identification of pathways for cancer development, in particular those shared across different cancer types.

Affiliation(s)

Marie Breeur Nutrition and Metabolism Branch, International Agency for Research on Cancer, NME Branch, 69372 CEDEX 08, Lyon, France
Pietro Ferrari Nutrition and Metabolism Branch, International Agency for Research on Cancer, NME Branch, 69372 CEDEX 08, Lyon, France
Laure Dossus Nutrition and Metabolism Branch, International Agency for Research on Cancer, NME Branch, 69372 CEDEX 08, Lyon, France
Mazda Jenab Nutrition and Metabolism Branch, International Agency for Research on Cancer, NME Branch, 69372 CEDEX 08, Lyon, France
Mattias Johansson Genetics Branch, International Agency for Research on Cancer, 69372 CEDEX 08, Lyon, France
Sabina Rinaldi Nutrition and Metabolism Branch, International Agency for Research on Cancer, NME Branch, 69372 CEDEX 08, Lyon, France
Ruth C Travis Cancer Epidemiology Unit, Nuffield Department of Population Health, University of Oxford, Oxford, OX3 7LF, UK
Mathilde His Nutrition and Metabolism Branch, International Agency for Research on Cancer, NME Branch, 69372 CEDEX 08, Lyon, France
Tim J Key Cancer Epidemiology Unit, Nuffield Department of Population Health, University of Oxford, Oxford, OX3 7LF, UK
Julie A Schmidt Cancer Epidemiology Unit, Nuffield Department of Population Health, University of Oxford, Oxford, OX3 7LF, UK Department of Clinical Epidemiology, Department of Clinical Medicine, Aarhus University Hospital and Aarhus University, DK-8200, Aarhus N, Denmark
Kim Overvad Department of Public Health, Aarhus University, DK-8000, Aarhus C, Denmark
Anne Tjønneland Danish Cancer Society Research Center Diet, Genes and Environment Nutrition and Biomarkers, DK-2100, Copenhagen, Denmark
Cecilie Kyrø Danish Cancer Society Research Center Diet, Genes and Environment Nutrition and Biomarkers, DK-2100, Copenhagen, Denmark
Joseph A Rothwell Université Paris-Saclay, UVSQ, Inserm, CESP U1018, "Exposome and Heredity" team, Gustave Roussy, 94800, Villejuif, France
Nasser Laouali Université Paris-Saclay, UVSQ, Inserm, CESP U1018, "Exposome and Heredity" team, Gustave Roussy, 94800, Villejuif, France
Gianluca Severi Université Paris-Saclay, UVSQ, Inserm, CESP U1018, "Exposome and Heredity" team, Gustave Roussy, 94800, Villejuif, France
Rudolf Kaaks Division of Cancer Epidemiology, German Cancer Research Center (DKFZ), 69120, Heidelberg, Germany
Verena Katzke Division of Cancer Epidemiology, German Cancer Research Center (DKFZ), 69120, Heidelberg, Germany
Matthias B Schulze Department of Molecular Epidemiology, German Institute of Human Nutrition, 14558, Nuthetal, Germany
Fabian Eichelmann Department of Molecular Epidemiology, German Institute of Human Nutrition, 14558, Nuthetal, Germany German Center for Diabetes Research (DZD), 85764, Neuherberg, Germany
Domenico Palli Institute of Cancer Research, Prevention and Clinical Network (ISPRO), 50139, Florence, Italy
Sara Grioni Epidemiology and Prevention Unit, Fondazione IRCCS Istituto Nazionale dei Tumori di Milano, 20133, Milan, Italy
Salvatore Panico Dipartimento di Medicina Clinica e Chirurgia, Federico II University, 80131, Naples, Italy
Rosario Tumino Hyblean Association for Epidemiological Research, AIRE-ONLUS, 97100, Ragusa, Italy
Carlotta Sacerdote Unit of Cancer Epidemiology Città della Salute e della Scienza University-Hospital, 10126, Turin, Italy
Bas Bueno-de-Mesquita Centre for Nutrition, Prevention and Health Services, National Institute for Public Health and the Environment (RIVM), PO Box 1, 3720, BA, Bilthoven, The Netherlands
Karina Standahl Olsen Department of Community Medicine, UiT The Arctic University of Norway, N-9037, Tromsø, Norway
Torkjel Manning Sandanger Department of Community Medicine, UiT The Arctic University of Norway, N-9037, Tromsø, Norway
Therese Haugdahl Nøst Department of Community Medicine, UiT The Arctic University of Norway, N-9037, Tromsø, Norway
J Ramón Quirós Public Health Directorate, 33006, Oviedo, Asturias, Spain
Catalina Bonet Unit of Nutrition and Cancer, Cancer Epidemiology Research Program, Catalan Institute of Oncology (ICO), Bellvitge Biomedical Research Institute (IDIBELL), L'Hospitalet de Llobregat, 08908, Barcelona, Spain
Miguel Rodríguez Barranco Escuela Andaluza de Salud Pública (EASP), 18011, Granada, Spain Instituto de Investigación Biosanitaria ibs. GRANADA, 18012, Granada, Spain Centro de Investigación Biomédica en Red de Epidemiología y Salud Pública (CIBERESP), 28029, Madrid, Spain
María-Dolores Chirlaque Centro de Investigación Biomédica en Red de Epidemiología y Salud Pública (CIBERESP), 28029, Madrid, Spain Department of Epidemiology, Regional Health Council, IMIB-Arrixaca, Murcia University, 30003, Murcia, Spain
Eva Ardanaz Centro de Investigación Biomédica en Red de Epidemiología y Salud Pública (CIBERESP), 28029, Madrid, Spain Navarra Public Health Institute, 31003, Pamplona, Spain IdiSNA, Navarra Institute for Health Research, 31008, Pamplona, Spain
Malte Sandsveden Department of Clinical Sciences Malmö Lund University, SE-214 28, Malmö, Sweden
Jonas Manjer Department of Surgery, Skåne University Hospital Malmö, Lund University, SE-214 28, Malmö, Sweden
Linda Vidman Department of Radiation Sciences, Oncology Umeå University, SE-901 87, Umeå, Sweden
Matilda Rentoft Department of Radiation Sciences, Oncology Umeå University, SE-901 87, Umeå, Sweden
David Muller Department of Epidemiology and Biostatistics, School of Public Health, Imperial College London, London, W2 1PG, UK
Kostas Tsilidis Department of Epidemiology and Biostatistics, School of Public Health, Imperial College London, London, W2 1PG, UK
Alicia K Heath Department of Epidemiology and Biostatistics, School of Public Health, Imperial College London, London, W2 1PG, UK
Hector Keun Department of Surgery and Cancer, Cancer Metabolism and Systems Toxicology Group, Division of Cancer, Imperial College London, London, SW7 2AZ, UK
Jerzy Adamski Institute of Experimental Genetics, Helmholtz Zentrum München, German Research Center for Environmental Health, 85764, Neuherberg, Germany Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, 117597, Singapore Institute of Biochemistry, Faculty of Medicine, University of Ljubljana, 1000, Ljubljana, Slovenia
Pekka Keski-Rahkonen Nutrition and Metabolism Branch, International Agency for Research on Cancer, NME Branch, 69372 CEDEX 08, Lyon, France
Augustin Scalbert Nutrition and Metabolism Branch, International Agency for Research on Cancer, NME Branch, 69372 CEDEX 08, Lyon, France
Marc J Gunter Nutrition and Metabolism Branch, International Agency for Research on Cancer, NME Branch, 69372 CEDEX 08, Lyon, France
Vivian Viallon Nutrition and Metabolism Branch, International Agency for Research on Cancer, NME Branch, 69372 CEDEX 08, Lyon, France.

Jardillier R, Koca D, Chatelain F, Guyon L. Prognosis of lasso-like penalized Cox models with tumor profiling improves prediction over clinical data alone and benefits from bi-dimensional pre-screening. BMC Cancer 2022;22:1045. [PMID: 36199072 PMCID: PMC9533541 DOI: 10.1186/s12885-022-10117-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2022] [Accepted: 09/14/2022] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Prediction of patient survival from tumor molecular '-omics' data is a key step toward personalized medicine. Cox models performed on RNA profiling datasets are popular for clinical outcome predictions. But these models are applied in the context of "high dimension", as the number p of covariates (gene expressions) greatly exceeds the number n of patients and e of events. Thus, pre-screening together with penalization methods are widely used for dimensional reduction.

METHODS

In the present paper, (i) we benchmark the performance of the lasso penalization and three variants (i.e., ridge, elastic net, adaptive elastic net) on 16 cancers from TCGA after pre-screening, (ii) we propose a bi-dimensional pre-screening procedure based on both gene variability and p-values from single variable Cox models to predict survival, and (iii) we compare our results with iterative sure independence screening (ISIS).

RESULTS

First, we show that integration of mRNA-seq data with clinical data improves predictions over clinical data alone. Second, our bi-dimensional pre-screening procedure yields only moderate improvements in the C-index and/or the integrated Brier score, while excluding genes that are irrelevant for prediction. We demonstrate that the different penalization methods reached comparable prediction performances, with slight differences among datasets. Finally, we provide advice in the case of multi-omics data integration.

CONCLUSIONS

Tumor profiles convey more prognostic information than clinical variables such as stage for many cancer subtypes. Lasso and Ridge penalizations perform similarly to Elastic Net penalizations for Cox models in high dimension. Pre-screening of the top 200 genes in terms of single-variable Cox model p-values is a practical way to reduce dimension, which may be particularly useful when integrating multi-omics.
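As a rough illustration of pre-screening on two dimensions (gene variability and single-variable p-values), the sketch below keeps genes whose variance passes a cutoff and then retains the top-ranked genes by p-value. The exact rule used in the paper is not given in the abstract, so the combination rule, the `prescreen` function, the variance cutoff, the gene names, and the p-values are all assumptions made for illustration. In practice the p-values would come from fitting one Cox model per gene, which is not shown here.

```python
from statistics import pvariance


def prescreen(expression, p_values, min_variance, top_k):
    """Keep sufficiently variable genes, then the top_k smallest p-values.

    expression: dict mapping gene name -> list of expression values
    p_values:   dict mapping gene name -> single-variable model p-value
                (assumed to be precomputed elsewhere)
    """
    variable = [g for g, values in expression.items()
                if pvariance(values) >= min_variance]
    ranked = sorted(variable, key=lambda g: p_values[g])
    return ranked[:top_k]


# Toy inputs (hypothetical values, three patients per gene).
expression = {
    "geneA": [1.0, 5.0, 9.0],
    "geneB": [2.0, 2.1, 1.9],   # nearly constant, removed by the variance filter
    "geneC": [0.0, 4.0, 8.0],
    "geneD": [3.0, 7.0, 5.0],
}
p_values = {"geneA": 0.03, "geneB": 0.001, "geneC": 0.20, "geneD": 0.01}

selected = prescreen(expression, p_values, min_variance=1.0, top_k=2)
print(selected)
```

With `top_k=200` and no variance filter, the same function would correspond to the simpler "top 200 genes by p-value" rule mentioned in the conclusions.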

Chu B, Qureshi S. Comparing Out-of-Sample Performance of Machine Learning Methods to Forecast U.S. GDP Growth. Comput Econ 2022;62:1-43. [PMID: 36157276 PMCID: PMC9483293 DOI: 10.1007/s10614-022-10312-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 08/04/2022] [Indexed: 06/16/2023]

Karmokar J, Islam MA, Uddin M, Hassan MR, Yousuf MSI. An assessment of meteorological parameters effects on COVID-19 pandemic in Bangladesh using machine learning models. Environ Sci Pollut Res Int 2022;29:67103-67114. [PMID: 35522407 PMCID: PMC9073515 DOI: 10.1007/s11356-022-20196-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/29/2021] [Accepted: 04/07/2022] [Indexed: 06/14/2023]

Hou Y, Zhang A, Lv R, Zhao S, Ma J, Zhang H, Li Z. A study on water quality parameters estimation for urban rivers based on ground hyperspectral remote sensing technology. Environ Sci Pollut Res Int 2022;29:63640-63654. [PMID: 35460477 DOI: 10.1007/s11356-022-20293-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/24/2021] [Accepted: 04/12/2022] [Indexed: 06/14/2023]

Abstract

The purpose of this research is to seek a better inversion algorithm. On this basis, it explores the feasibility of using hyperspectral monitoring technology instead of laboratory physical and chemical index tests and evaluates how well the inversion model predicts water quality change, with the aim of providing more convenient, more economical, and more extensive methods for monitoring the water quality of urban rivers. This paper takes the water samples collected in Fuyang River in downtown Handan as the research object and obtains original spectral data of the samples with the ASD FieldSpec 4 field hyperspectral spectrometer. After smoothing-filter pretreatment by the Savitzky-Golay (SG) method and specified mathematical transformations, the modeling spectral indicators of the various water quality parameters are selected and determined by calculating the maximum mean of absolute values for correlation coefficients of various spectral indicators and measured values in the wavelength range from 400 to 950 nm. By introducing partial least squares (PLS), random forest (RF), and Lasso (least absolute shrinkage and selection operator), fitting models were constructed for six water quality parameters: turbidity (Turb), suspended substance (SS), chemical oxygen demand (COD), NH₄-N, total nitrogen (TN), and total phosphorus (TP), which were also tested and evaluated with hyperspectral data. The results show that different spectral transformation methods highlight different information inversion effects. The first derivative of the reciprocal logarithm of spectral data after SG smoothing has a good modeling effect on four water quality parameters, namely Turb, COD, NH₄-N, and TP; and the first derivative of smoothed spectral data has a good modeling effect on the two water quality parameters SS and TN.
Among the three models, the PLS model has a good prediction effect, with the [Formula: see text] for COD, TN, and TP ranging from 0.74 to 0.80, while that for Turb and SS shows a relatively poorer prediction effect, followed by an even worse effect on NH₄-N. The two machine learning algorithms, RF and Lasso, each obtained the best prediction models for different water quality parameters. The Lasso model has a [Formula: see text] value above 0.8 for water body organic pollutants COD, TN, and TP, and the decrease value for [Formula: see text] and [Formula: see text] is below 0.1, which indicates that the model has high prediction accuracy and strong generalization ability, but the results of SS and NH₄-N do not meet the expected accuracy. In the inversion model of RF for COD, [Formula: see text] is higher than [Formula: see text], which shows excellent performance, and has certain prediction ability for SS and NH₄-N. The RF model and Lasso model complement each other effectively in applicability and prediction accuracy. Compared with the traditional regression model PLS, machine learning has obvious overall advantages, making it more suitable for classified inversion prediction of urban river water quality parameters.
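For readers unfamiliar with the Lasso named in this abstract, the following is a minimal, dependency-free sketch of the estimator fitted by cyclic coordinate descent with soft-thresholding. It minimizes (1/(2n))·||y − Xβ||² + λ·||β||₁ and assumes centered data with no intercept and no all-zero columns. The function names and the toy data are hypothetical; this is not the authors' implementation and does not use their spectral data.

```python
def soft_threshold(value, lam):
    """Shrink value toward zero by lam; values within [-lam, lam] become 0."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Cyclic coordinate descent for the Lasso (no intercept, centered data assumed)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (z / n)
    return beta


# Toy data (hypothetical): y depends only on the first of two orthogonal features.
X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y = [2.0, -2.0, 2.0, -2.0]

beta_small = lasso_coordinate_descent(X, y, lam=0.5)  # shrinks the true coefficient 2 to 1.5
beta_large = lasso_coordinate_descent(X, y, lam=3.0)  # penalty large enough to zero everything
print(beta_small, beta_large)
```

The second feature receives a coefficient of exactly zero, which is the variable-selection behavior that distinguishes the Lasso from ridge-type shrinkage.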

Jiménez S, Angeles-Valdez D, Rodríguez-Delgado A, Fresán A, Miranda E, Alcalá-Lozano R, Duque-Alarcón X, Arango de Montis I, Garza-Villarreal EA. Machine learning detects predictors of symptom severity and impulsivity after dialectical behavior therapy skills training group in borderline personality disorder. J Psychiatr Res 2022;151:42-49. [PMID: 35447506 DOI: 10.1016/j.jpsychires.2022.03.063] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2021] [Revised: 12/08/2021] [Accepted: 03/31/2022] [Indexed: 10/18/2022]

Tian Y, Feng Y. Transfer Learning under High-dimensional Generalized Linear Models. J Am Stat Assoc 2022;118:2684-2697. [PMID: 38562655 PMCID: PMC10982637 DOI: 10.1080/01621459.2022.2071278] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/29/2021] [Accepted: 04/20/2022] [Indexed: 10/18/2022]

Kamenetsky ME, Trentham-Dietz A, Newcomb P, Zhu J, Gangnon RE. A Flexible Method for Identifying Spatial Clusters of Breast Cancer Using Individual-Level Data. Ann Epidemiol 2022;73:9-16. [PMID: 35772615 DOI: 10.1016/j.annepidem.2022.06.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/04/2022] [Revised: 05/16/2022] [Accepted: 06/10/2022] [Indexed: 11/22/2022]

Ivansic D, Palm J, Pantev C, Brüggemann P, Mazurek B, Guntinas-Lichius O, Dobel C. Prediction of treatment outcome in patients suffering from chronic tinnitus - from individual characteristics to early and long-term change. J Psychosom Res 2022;157:110794. [PMID: 35339906 DOI: 10.1016/j.jpsychores.2022.110794] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/31/2021] [Revised: 01/07/2022] [Accepted: 03/19/2022] [Indexed: 11/23/2022]

Wang JH, Wang KH, Chen YH. Overlapping group screening for detection of gene-environment interactions with application to TCGA high-dimensional survival genomic data. BMC Bioinformatics 2022;23:202. [PMID: 35637439 PMCID: PMC9150322 DOI: 10.1186/s12859-022-04750-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2022] [Accepted: 05/25/2022] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

In the context of biomedical and epidemiological research, gene-environment (G-E) interaction is of great significance to the etiology and progression of many complex diseases. In high-dimensional genetic data, two general models, marginal and joint models, are proposed to identify important interaction factors. Most existing approaches for identifying G-E interactions are limited owing to the lack of robustness to outliers/contamination in response and predictor data. In particular, right-censored survival outcomes make the associated feature screening even more challenging. In this article, we utilize the overlapping group screening (OGS) approach to select important G-E interactions related to clinical survival outcomes by incorporating the gene pathway information under a joint modeling framework.

RESULTS

Simulation studies under various scenarios are carried out to compare the performances of our proposed method with some commonly used methods. In the real data applications, we use our proposed method to identify G-E interactions related to the clinical survival outcomes of patients with head and neck squamous cell carcinoma, and esophageal carcinoma in The Cancer Genome Atlas clinical survival genetic data, and further establish corresponding survival prediction models. Both simulation and real data studies show that our method performs well and outperforms existing methods in the G-E interaction selection, effect estimation, and survival prediction accuracy.

CONCLUSIONS

The OGS approach is useful for selecting important environmental factors, genes and G-E interactions in the ultra-high dimensional feature space. The prediction ability of OGS with the Lasso penalty is better than existing methods. The same idea of the OGS approach can apply to other outcome models, such as the proportional odds survival time model, the logistic regression model for binary outcomes, and the multinomial logistic regression model for multi-class outcomes.

Li J, Yu G, Li Q, Liu Y. Sample-wise Combined Missing Effect Model with Penalization. J Comput Graph Stat 2022;32:263-274. [PMID: 37274355 PMCID: PMC10237115 DOI: 10.1080/10618600.2022.2070172] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/27/2020] [Accepted: 04/11/2022] [Indexed: 10/18/2022]

Xie T, Zhang N, Mao Y, Zhu B. How to predict the electronic health literacy of Chinese primary and secondary school students?: establishment of a model and web nomograms. BMC Public Health 2022;22:1048. [PMID: 35614408 DOI: 10.1186/s12889-022-13421-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/08/2021] [Accepted: 05/12/2022] [Indexed: 12/23/2022] Open

Abstract

Background

The internet has become an important resource for the public to obtain health information. Therefore, the ability to obtain and use such resources has become important for health literacy. This study aimed to establish a prediction model of Chinese students’ electronic health literacy (EHL) to guide government policymaking and parental interventions, identify the predictors of EHL in Chinese students using random forests, and establish a corresponding prediction model to help policymakers and parents determine whether primary and secondary school students have high EHL.

Methods

This is a cross-sectional study. From June to August 2021, a cluster sample survey was conducted with 1,300 students from seven primary and secondary schools in Shaanxi Province, China. We evaluated 1,235 primary and secondary school students using the e-health literacy scale. The data were divided into training and testing datasets in a 70:30 ratio for further analysis using random forest. The predictive accuracy of the score was measured using the area under the receiver operating characteristic curve. We also used decision curve analysis to determine the usefulness of the prediction model by quantifying the net benefits at different threshold probabilities in the validation dataset.

Results

We found that 33.6% of students had high EHL. The univariate analysis showed that age (P < 0.001), grade (P < 0.001), employment status (P < 0.001), household location (P < 0.001), parental phubbing behavior (P < 0.001), and general self-efficacy (P < 0.001) were significantly associated with EHL. A random forest classification model was developed with the training dataset (872 students), and seven variables were confirmed as important: age, grade, employment status, father education level, game time, parental phubbing behavior, and general self-efficacy. The validation of the model showed good discrimination, with an area under the curve of 0.975 in the training dataset and 0.738 in the testing dataset. The model was translated into an online risk calculator, which is freely available (https://xietao.shinyapps.io/DynNomapp/).

Conclusions

In this study, an intuitive tool to predict the EHL of Chinese primary and secondary school students was developed and validated.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12889-022-13421-4.

Sajal IH, Chowdhury M, Wang T, Euhus D, Choudhary PK, Biswas S. CBCRisk-Black: a personalized contralateral breast cancer risk prediction model for black women. Breast Cancer Res Treat 2022. [PMID: 35562619 DOI: 10.1007/s10549-022-06612-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/02/2022] [Accepted: 04/18/2022] [Indexed: 11/02/2022]

Liang B, Wei R, Zhang J, Li Y, Yang T, Xu S, Zhang K, Xia W, Guo B, Liu B, Zhou F, Wu Q, Dai J. Applying pytorch toolkit to plan optimization for circular cone based robotic radiotherapy. Radiat Oncol 2022;17:82. [PMID: 35443714 PMCID: PMC9022303 DOI: 10.1186/s13014-022-02045-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/17/2021] [Accepted: 03/31/2022] [Indexed: 11/25/2022] Open

Affiliation(s)

Bin Liang Department of Radiation Oncology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Chaoyang Dist, 17 Panjianyuannanli Rd., Beijing, 100021, China
Ran Wei Department of Radiation Oncology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Chaoyang Dist, 17 Panjianyuannanli Rd., Beijing, 100021, China
Jianghu Zhang Department of Radiation Oncology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Chaoyang Dist, 17 Panjianyuannanli Rd., Beijing, 100021, China
Yongbao Li Sun Yat-Sen University Cancer Center, State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Guangdong Key Laboratory of Nasopharyngeal Carcinoma Diagnosis and Therapy, Guangzhou, 510060, Guangdong, China
Tao Yang Department of Radiation Oncology, PLA General Hospital, Beijing, 100853, China
Shouping Xu Department of Radiation Oncology, PLA General Hospital, Beijing, 100853, China
Ke Zhang Department of Radiation Oncology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Chaoyang Dist, 17 Panjianyuannanli Rd., Beijing, 100021, China
Wenlong Xia Department of Radiation Oncology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Chaoyang Dist, 17 Panjianyuannanli Rd., Beijing, 100021, China
Bin Guo Image Processing Center, Beihang University, Beijing, 100191, China
Bo Liu Image Processing Center, Beihang University, Beijing, 100191, China; Beijing Advanced Innovation Center for Biomedical Engineering, Beihang University, Beijing, 100083, China
Fugen Zhou Image Processing Center, Beihang University, Beijing, 100191, China; Beijing Advanced Innovation Center for Biomedical Engineering, Beihang University, Beijing, 100083, China
Qiuwen Wu Division of Radiation Physics, Department of Radiation Oncology, Duke University Medical Center, Box 3295, Durham, NC, 27710, USA.
Jianrong Dai Department of Radiation Oncology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Chaoyang Dist, 17 Panjianyuannanli Rd., Beijing, 100021, China.


Buchaillot ML, Soba D, Shu T, Liu J, Aranjuelo I, Araus JL, Runion GB, Prior SA, Kefauver SC, Sanz-Saez A. Estimating peanut and soybean photosynthetic traits using leaf spectral reflectance and advance regression models. Planta 2022;255:93. [PMID: 35325309 PMCID: PMC8948130 DOI: 10.1007/s00425-022-03867-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2021] [Accepted: 03/03/2022] [Indexed: 06/14/2023]

Zhang J, Fuhrer T, Ye H, Kwan B, Montemayor D, Tumova J, Darshi M, Afshinnia F, Scialla JJ, Anderson A, Porter AC, Taliercio JJ, Rincon-Choles H, Rao P, Xie D, Feldman H, Sauer U, Sharma K, Natarajan L. High-Throughput Metabolomics and Diabetic Kidney Disease Progression: Evidence from the Chronic Renal Insufficiency (CRIC) Study. Am J Nephrol 2022;53:215-225. [PMID: 35196658 PMCID: PMC9116599 DOI: 10.1159/000521940] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/22/2021] [Accepted: 12/30/2021] [Indexed: 01/14/2023]

Abstract

INTRODUCTION

Metabolomics could offer novel prognostic biomarkers and elucidate mechanisms of diabetic kidney disease (DKD) progression. Via metabolomic analysis of urine samples from 995 CRIC participants with diabetes and state-of-the-art statistical modeling, we aimed to identify metabolites prognostic to DKD progression.

METHODS

Urine samples (N = 995) were assayed for relative metabolite abundance by untargeted flow-injection mass spectrometry, and stringent statistical criteria were used to eliminate noisy compounds, resulting in 698 annotated metabolite ions. Utilizing the 698 metabolites' ion abundance along with clinical data (demographics, blood pressure, HbA1c, eGFR, and albuminuria), we developed univariate and multivariate models for the eGFR slope using penalized (lasso) and random forest models. Final models were tested on time-to-ESKD (end-stage kidney disease) via cross-validated C-statistics. We also conducted pathway enrichment analysis and a targeted analysis of a subset of metabolites.

RESULTS

Six eGFR slope models selected 9-30 variables. In the adjusted ESKD model with highest C-statistic, valine (or betaine) and 3-(4-methyl-3-pentenyl)thiophene were associated (p < 0.05) with 44% and 65% higher hazard of ESKD per doubling of metabolite abundance, respectively. Also, 13 (of 15) prognostic amino acids, including valine and betaine, were confirmed in the targeted analysis. Enrichment analysis revealed pathways implicated in kidney and cardiometabolic disease.

CONCLUSIONS

Using the diverse CRIC sample, a high-throughput untargeted assay, followed by targeted analysis, and rigorous statistical analysis to reduce false discovery, we identified several novel metabolites implicated in DKD progression. If replicated in independent cohorts, our findings could inform risk stratification and treatment strategies for patients with DKD.
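Editorial illustration (not part of the abstract): the phrase "higher hazard of ESKD per doubling of metabolite abundance" can be unpacked with a little arithmetic. The sketch below assumes, which the abstract does not state, that abundance enters a proportional hazards model linearly on a log2 scale, so that the model coefficient is the log hazard ratio per doubling. The function names are hypothetical.

```python
import math


def log_hr_per_doubling(percent_higher_hazard):
    """Log hazard ratio per doubling implied by an 'X% higher hazard' statement."""
    return math.log(1.0 + percent_higher_hazard / 100.0)


def hazard_ratio_for_fold_change(percent_higher_hazard, fold_change):
    """Hazard ratio for a given fold change in abundance.

    Assumes the log hazard is linear in log2(abundance); this is an
    illustrative assumption, not a detail taken from the abstract.
    """
    beta = log_hr_per_doubling(percent_higher_hazard)
    return math.exp(beta * math.log2(fold_change))


if __name__ == "__main__":
    # 44% and 65% are the per-doubling figures quoted in the RESULTS paragraph.
    for label, pct in [("valine (or betaine)", 44), ("thiophene compound", 65)]:
        print(label,
              "x2:", round(hazard_ratio_for_fold_change(pct, 2), 4),
              "x4:", round(hazard_ratio_for_fold_change(pct, 4), 4))
```

Under this assumption a fourfold increase in abundance corresponds to squaring the per-doubling hazard ratio (for example, 1.44 squared is about 2.07).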


Affiliation(s)

Jing Zhang Moores Cancer Center, University of California, San Diego, California, USA
Tobias Fuhrer Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
Hongping Ye Department of Medicine, Center for Renal Precision Medicine, University of Texas Health Science Center at San Antonio, San Antonio, Texas, USA
Brian Kwan Moores Cancer Center, University of California, San Diego, California, USA; Division of Biostatistics and Bioinformatics, Herbert Wertheim School of Public Health and Human Longevity Science, University of California, San Diego, California, USA
Daniel Montemayor Department of Medicine, Center for Renal Precision Medicine, University of Texas Health Science Center at San Antonio, San Antonio, Texas, USA
Jana Tumova Department of Medicine, Center for Renal Precision Medicine, University of Texas Health Science Center at San Antonio, San Antonio, Texas, USA
Manjula Darshi Department of Medicine, Center for Renal Precision Medicine, University of Texas Health Science Center at San Antonio, San Antonio, Texas, USA
Farsad Afshinnia Division of Nephrology, Department of Internal Medicine, University of Michigan, Medical School, Ann Arbor, Michigan, USA
Julia J. Scialla Departments of Medicine and Public Health Sciences, University of Virginia School of Medicine, Charlottesville, Virginia, USA
Amanda Anderson Department of Epidemiology, Tulane University School of Public Health and Tropical Medicine, New Orleans, Louisiana, USA; Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
Anna C. Porter Jesse Brown VA Medical Center, University of Illinois at Chicago, Chicago, Illinois, USA
Jonathan J. Taliercio Cleveland Clinic Foundation, Glickman Urological & Kidney Institute, Department of Nephrology, Cleveland, Ohio, USA
Hernan Rincon-Choles Cleveland Clinic Foundation, Glickman Urological & Kidney Institute, Department of Nephrology, Cleveland, Ohio, USA
Panduranga Rao Division of Nephrology, Department of Internal Medicine, University of Michigan, Medical School, Ann Arbor, Michigan, USA
Dawei Xie Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA; Center for Clinical Epidemiology and Biostatistics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania, USA
Harold Feldman Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA; Center for Clinical Epidemiology and Biostatistics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania, USA
Uwe Sauer Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
Kumar Sharma Department of Medicine, Center for Renal Precision Medicine, University of Texas Health Science Center at San Antonio, San Antonio, Texas, USA
Loki Natarajan Moores Cancer Center, University of California, San Diego, California, USA; Division of Biostatistics and Bioinformatics, Herbert Wertheim School of Public Health and Human Longevity Science, University of California, San Diego, California, USA


Cui J, Lu J, Weng Y, Yi GY, He W. COVID-19 impact on mental health. BMC Med Res Methodol 2022;22:15. [PMID: 35026998 PMCID: PMC8758244 DOI: 10.1186/s12874-021-01411-w] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 04/03/2021] [Accepted: 09/20/2021] [Indexed: 12/04/2022] Open

Abstract

Background

The coronavirus disease 2019 (COVID-19) pandemic has had a significant influence on public mental health. Current efforts focus on alleviating the impacts of the disease on public health and the economy, while the psychological effects of COVID-19 have been relatively ignored. In this research, we explore a quantitative characterization of the pandemic's impact on public mental health by studying an online survey dataset from the United States.

Methods

The analyses are based on a large-scale online mental health-related survey study in the United States, conducted over 12 consecutive weeks from April 23, 2020 to July 21, 2020. We examine the risk factors that have a significant impact on mental health and how their estimated effects change over time. We employ the multiple imputation by chained equations (MICE) method to handle missing values and use logistic regression with the least absolute shrinkage and selection operator (Lasso) to identify risk factors for mental health.

Results

Our analysis shows that risk predictors for an individual experiencing mental health issues include the pandemic situation of the State where the individual resides, age, gender, race, marital status, health conditions, the number of household members, employment status, the level of confidence in future food affordability, availability of health insurance, mortgage status, and information on kids' school enrollment. The effects of most of the predictors seem to change over time, though the degree varies across risk factors. The effects of some risk factors, such as State and gender, show noticeable change over time, whereas the effect of age appears unchanged over time.

Conclusions

The analysis results unveil evidence-based findings to identify the groups who are psychologically vulnerable to the COVID-19 pandemic. This study provides helpful evidence for assisting healthcare providers and policymakers to take steps for mitigating the pandemic effects on public mental health, especially in boosting public health care, improving public confidence in future food conditions, and creating more job opportunities.

Trial registration

This article does not report the results of a health care intervention on human participants.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12874-021-01411-w.


Hamidi F, Gilani N, Belaghi RA, Sarbakhsh P, Edgünlü T, Santaguida P. Exploration of Potential miRNA Biomarkers and Prediction for Ovarian Cancer Using Artificial Intelligence. Front Genet 2021;12:724785. [PMID: 34899827 PMCID: PMC8656459 DOI: 10.3389/fgene.2021.724785] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 06/14/2021] [Accepted: 10/07/2021] [Indexed: 12/20/2022] Open

Zheng E, Zhang J, Wang Q, Qiao H. Continuous Multi-DoF Wrist Kinematics Estimation Based on a Human-Machine Interface With Electrical-Impedance-Tomography. Front Neurorobot 2021;15:734525. [PMID: 34658831 PMCID: PMC8515921 DOI: 10.3389/fnbot.2021.734525] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2021] [Accepted: 08/16/2021] [Indexed: 11/21/2022] Open

Ballout N, Garcia C, Viallon V. Sparse estimation for case-control studies with multiple disease subtypes. Biostatistics 2021;22:738-755. [PMID: 31977036 DOI: 10.1093/biostatistics/kxz063] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 02/06/2019] [Revised: 12/13/2019] [Accepted: 12/16/2019] [Indexed: 11/15/2022] Open

Satheeshkumar PS, El-Dallal M, Mohan MP. Feature selection and predicting chemotherapy-induced ulcerative mucositis using machine learning methods. Int J Med Inform 2021;154:104563. [PMID: 34479094 DOI: 10.1016/j.ijmedinf.2021.104563] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/26/2021] [Revised: 08/17/2021] [Accepted: 08/24/2021] [Indexed: 11/28/2022]

Abstract

OBJECTIVE

Ulcerative mucositis (UM) is a devastating complication of most cancer therapies, and its risk factors are poorly recognized. Because risk prediction is vital for adverse events, we used machine learning (ML) approaches to predict chemotherapy-induced UM.

METHODS

We used the 2017 National Inpatient Sample database to identify discharges with antineoplastic chemotherapy-induced UM among those who received chemotherapy as part of their cancer treatment. We used forward selection and backward elimination for feature selection; lasso and the Gradient Boosting Method were used to build our linear and non-linear models, respectively.

RESULTS

In 2017, there were 253 (unweighted) chemotherapy-induced UM patient discharges among 21,626 (unweighted) adult patients who received antineoplastic chemotherapy as part of their cancer treatment. The linear (lasso) model achieved an AUC (C-statistic) of 0.75 in both the test and training datasets; the Gradient Boosting Method (GBM) model achieved an AUC of 0.76 in the training dataset and 0.79 in the test dataset. Feature selection by stepwise forward selection and backward elimination identified the following variables of importance: antineoplastic chemotherapy-induced pancytopenia, agranulocytosis due to cancer chemotherapy, fluid and electrolyte imbalance, age, anemia due to chemotherapy, median household income, and depression. The most important variables from GBM, in order of importance, were antineoplastic chemotherapy-induced pancytopenia > co-morbidity score > agranulocytosis due to cancer chemotherapy > age > fluid and electrolyte imbalance. Further, when the analysis was restricted to females only, the ML models performed better than in the unstratified analysis.

CONCLUSION

Our study showed that ML methods performed well in predicting chemotherapy-induced UM. Predictors identified through the ML approach matched the clinically meaningful and previously discussed predictors of chemotherapy-induced UM.


Escribe C, Lu T, Keller-Baruch J, Forgetta V, Xiao B, Richards JB, Bhatnagar S, Oualkacha K, Greenwood CMT. Block coordinate descent algorithm improves variable selection and estimation in error-in-variables regression. Genet Epidemiol 2021;45:874-890. [PMID: 34468045 PMCID: PMC9292988 DOI: 10.1002/gepi.22430] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/22/2021] [Revised: 07/19/2021] [Accepted: 08/12/2021] [Indexed: 11/13/2022]

Affiliation(s)

Célia Escribe Lady Davis Institute for Medical Research, Jewish General Hospital, Montreal, Québec, Canada; Operations Research Center, Massachusetts Institute of Technology, Cambridge, MA, United States
Tianyuan Lu Lady Davis Institute for Medical Research, Jewish General Hospital, Montreal, Québec, Canada; Quantitative Life Sciences Program, McGill University, Montreal, Québec, Canada
Julyan Keller-Baruch Lady Davis Institute for Medical Research, Jewish General Hospital, Montreal, Québec, Canada; Department of Human Genetics, McGill University, Montreal, Québec, Canada
Vincenzo Forgetta Lady Davis Institute for Medical Research, Jewish General Hospital, Montreal, Québec, Canada
Bowei Xiao Lady Davis Institute for Medical Research, Jewish General Hospital, Montreal, Québec, Canada; Quantitative Life Sciences Program, McGill University, Montreal, Québec, Canada
J Brent Richards Lady Davis Institute for Medical Research, Jewish General Hospital, Montreal, Québec, Canada; Department of Human Genetics, McGill University, Montreal, Québec, Canada; Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montreal, Québec, Canada; Department of Twin Research and Genetic Epidemiology, King's College London, London, United Kingdom
Sahir Bhatnagar Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montreal, Québec, Canada; Department of Diagnostic Radiology, McGill University, Montreal, Québec, Canada
Karim Oualkacha Département de Mathématiques, Université du Québec à Montréal, Montreal, Québec, Canada
Celia M T Greenwood Lady Davis Institute for Medical Research, Jewish General Hospital, Montreal, Québec, Canada; Department of Human Genetics, McGill University, Montreal, Québec, Canada; Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montreal, Québec, Canada; Gerald Bronfman Department of Oncology, McGill University, Montreal, Québec, Canada


Mohr H, Ruge H. Fast Estimation of L1-Regularized Linear Models in the Mass-Univariate Setting. Neuroinformatics 2021;19:385-92. [PMID: 32935193 DOI: 10.1007/s12021-020-09489-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/27/2022]

Abstract

In certain modeling approaches, activation analyses of task-based fMRI data can involve a relatively large number of predictors. For example, in the encoding model approach, complex stimuli are represented in a high-dimensional feature space, resulting in design matrices with many predictors. Similarly, single-trial models and finite impulse response models may also encompass a large number of predictors. In settings where only few of those predictors are expected to be informative, a sparse model fit can be obtained via L1-regularization. However, estimating L1-regularized models requires an iterative fitting procedure, which considerably increases computation time compared to estimating unregularized or L2-regularized models, and complicates the application of L1-regularization on whole-brain data and large sample sizes. Here we provide several functions for estimating L1-regularized models that are optimized for the mass-univariate analysis approach. The package includes a parallel implementation of the coordinate descent algorithm for CPU-only systems and two implementations of the alternating direction method of multipliers algorithm requiring a GPU device. While the core algorithms are implemented in C++/CUDA, data input/output and parameter settings can be conveniently handled via Matlab. The CPU-based implementation is highly memory-efficient and provides considerable speed-up compared to the standard implementation not optimized for the mass-univariate approach. Further acceleration can be achieved on systems equipped with a CUDA-enabled GPU. Using the fastest GPU-based implementation, computation time for whole-brain estimates can be reduced from 9 h to 5 min in an exemplary data setting. Overall, the provided package facilitates the use of L1-regularization for fMRI activation analyses and enables an efficient employment of L1-regularization on whole-brain data and large sample sizes.
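Editorial illustration (not part of the abstract): the coordinate descent algorithm mentioned above can be sketched in a few lines. This is a minimal pure-Python sketch of cyclic coordinate descent for the lasso objective (1/(2n))·||y − Xb||² + λ·||b||₁, applied once per response column in the mass-univariate sense. It is not the authors' C++/CUDA package; the function names, the objective scaling, and the stopping rule are assumptions chosen for illustration.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator S(z, t) = sign(z) * max(|z| - t, 0)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=500, tol=1e-10):
    """Minimise (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent.

    X is a list of n rows (each a list of p floats); y is a list of n floats.
    """
    n, p = len(X), len(X[0])
    b = [0.0] * p
    resid = list(y)  # residual y - X b, kept up to date after every update
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        max_change = 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero predictor keeps a zero coefficient
            # Correlation of predictor j with the partial residual
            # (the residual with predictor j's own contribution added back).
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * b[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - b[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                b[j] = new_bj
                max_change = max(max_change, abs(delta))
        if max_change < tol:
            break  # no coefficient moved appreciably in a full sweep
    return b


def fit_mass_univariate(X, response_columns, lam):
    """Fit one lasso model per response (e.g. per voxel), sharing the design X."""
    return [lasso_coordinate_descent(X, y, lam) for y in response_columns]


if __name__ == "__main__":
    # Tiny orthogonal design, for which the lasso solution has a closed form.
    X = [[1, 1], [1, -1], [-1, 1], [-1, -1]]
    print(lasso_coordinate_descent(X, [3, 1, -1, -3], lam=0.5))  # -> [1.5, 0.5]
```

Each response is fitted independently against the same design matrix, which is why this setting parallelises well across responses; the sketch loops over them sequentially for clarity.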


Stidham RW, Liu Y, Enchakalody B, Van T, Krishnamurthy V, Su GL, Zhu J, Waljee AK. The Use of Readily Available Longitudinal Data to Predict the Likelihood of Surgery in Crohn Disease. Inflamm Bowel Dis 2021;27:1328-1334. [PMID: 33769477 PMCID: PMC8314116 DOI: 10.1093/ibd/izab035] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 04/02/2020] [Indexed: 12/11/2022]

Abstract

BACKGROUND

Although imaging, endoscopy, and inflammatory biomarkers are associated with future Crohn disease (CD) outcomes, common laboratory studies may also provide prognostic opportunities. We evaluated machine learning models incorporating routinely collected laboratory studies to predict surgical outcomes in U.S. Veterans with CD.

METHODS

Adults with CD from a Veterans Health Administration, Veterans Integrated Service Networks (VISN) 10 cohort examined between 2001 and 2015 were used for analysis. Patient demographics, medication use, and longitudinal laboratory values were used to model future surgical outcomes within 1 year. Specifically, data at the time of prediction combined with historical laboratory data characteristics, described as slope, distribution statistics, fluctuation, and linear trend of laboratory values, were considered and principal component analysis transformations were performed to reduce the dimensionality. Lasso regularized logistic regression was used to select features and construct prediction models, with performance assessed by area under the receiver operating characteristic using 10-fold cross-validation.

RESULTS

We included 4950 observations from 2809 unique patients, among whom 256 had surgery, for modeling. Our optimized model achieved a mean area under the receiver operating characteristic of 0.78 (SD, 0.002). Anti-tumor necrosis factor use was associated with a lower probability of surgery within 1 year and was the most influential predictor in the model, and corticosteroid use was associated with a higher probability of surgery. Among the laboratory variables, high platelet counts, high mean cell hemoglobin concentrations, low albumin levels, and low blood urea nitrogen values were identified as having an elevated influence and association with future surgery.

CONCLUSIONS

Machine learning methods that incorporate current and historical data can predict the future risk of CD surgery.
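Editorial illustration (not part of the abstract): the performance measure reported above, the area under the receiver operating characteristic curve summarised over cross-validation folds, can be computed as follows. This is a minimal sketch using only the Python standard library; it takes the AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. Whether the reported SD was taken across folds or across repeated runs is not stated in the abstract, so the fold-level summary here is an assumption, and the function names are hypothetical.

```python
from statistics import mean, stdev


def roc_auc(labels, scores):
    """AUC as P(score of a positive > score of a negative), ties counted as 1/2."""
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def summarize_folds(fold_results):
    """Mean and sample SD of per-fold AUCs.

    fold_results is a list of (labels, scores) pairs, one per held-out fold.
    """
    aucs = [roc_auc(labels, scores) for labels, scores in fold_results]
    return mean(aucs), stdev(aucs)


if __name__ == "__main__":
    # Two toy held-out folds with made-up labels and predicted scores.
    folds = [
        ([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]),
        ([0, 1, 0, 1], [0.2, 0.9, 0.3, 0.6]),
    ]
    print(summarize_folds(folds))
```

The pairwise definition is quadratic in the number of cases, which is fine for a sketch; rank-based formulas give the same value more efficiently on large samples.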


Du Y, Chen H, Varadhan R. Lasso estimation of hierarchical interactions for analyzing heterogeneity of treatment effect. Stat Med 2021;40:5417-5433. [PMID: 34240443 DOI: 10.1002/sim.9132] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/27/2019] [Revised: 12/14/2020] [Accepted: 12/30/2020] [Indexed: 11/12/2022]

Rafique R, Islam SR, Kazi JU. Machine learning in the prediction of cancer therapy. Comput Struct Biotechnol J 2021;19:4003-4017. [PMID: 34377366 PMCID: PMC8321893 DOI: 10.1016/j.csbj.2021.07.003] [Citation(s) in RCA: 42] [Impact Index Per Article: 14.0] [Received: 03/14/2021] [Revised: 07/06/2021] [Accepted: 07/07/2021] [Indexed: 12/15/2022] Open