Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Das P, Roychowdhury A, Das S, Roychoudhury S, Tripathy S. sigFeature: Novel Significant Feature Selection Method for Classification of Gene Expression Data Using Support Vector Machine and t Statistic. Front Genet 2020;11:247. [PMID: 32346383 PMCID: PMC7169426 DOI: 10.3389/fgene.2020.00247] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 08/26/2019] [Accepted: 03/02/2020] [Indexed: 11/26/2022] Open

Cited by Other Article(s)

Pradhan UK, Mahapatra A, Naha S, Gupta A, Parsad R, Gahlaut V, Rath SN, Meher PK. ASPTF: A computational tool to predict abiotic stress-responsive transcription factors in plants by employing machine learning algorithms. Biochim Biophys Acta Gen Subj 2024;1868:130597. [PMID: 38490467 DOI: 10.1016/j.bbagen.2024.130597] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Revised: 02/26/2024] [Accepted: 03/10/2024] [Indexed: 03/17/2024]

Powell RT, Rinkenbaugh AL, Guo L, Cai S, Shao J, Zhou X, Zhang X, Jeter-Jones S, Fu C, Qi Y, Baameur Hancock F, White JB, Stephan C, Davies PJ, Moulder S, Symmans WF, Chang JT, Piwnica-Worms H. Targeting neddylation and sumoylation in chemoresistant triple negative breast cancer. NPJ Breast Cancer 2024;10:37. [PMID: 38802426 PMCID: PMC11130334 DOI: 10.1038/s41523-024-00644-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2023] [Accepted: 05/09/2024] [Indexed: 05/29/2024] Open

Affiliation(s)

Reid T Powell Center for Translational Cancer Research, Institute of Bioscience and Technology Texas A&M Health Science Center, Houston, TX, USA
Amanda L Rinkenbaugh Department of Experimental Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
Lei Guo Center for Translational Cancer Research, Institute of Bioscience and Technology Texas A&M Health Science Center, Houston, TX, USA
Shirong Cai Department of Experimental Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
Jiansu Shao Department of Experimental Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
Xinhui Zhou Department of Experimental Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
Xiaomei Zhang Department of Experimental Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
Sabrina Jeter-Jones Department of Experimental Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
Chunxiao Fu Department of Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
Yuan Qi Department of Experimental Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
Faiza Baameur Hancock Department of Experimental Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
Jason B White Department of Breast Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
Clifford Stephan Center for Translational Cancer Research, Institute of Bioscience and Technology Texas A&M Health Science Center, Houston, TX, USA
Peter J Davies Center for Translational Cancer Research, Institute of Bioscience and Technology Texas A&M Health Science Center, Houston, TX, USA
Stacy Moulder Department of Breast Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA Eli Lilly and Company, Indianapolis, IN, USA
W Fraser Symmans Department of Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
Jeffrey T Chang Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA Department of Integrative Biology and Pharmacology, The University of Texas Health Science Center at Houston, Houston, TX, USA
Helen Piwnica-Worms Department of Experimental Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA.

Lin S, Wei C, Wei Y, Fan J. Construction and verification of an endoplasmic reticulum stress-related prognostic model for endometrial cancer based on WGCNA and machine learning algorithms. Front Oncol 2024;14:1362891. [PMID: 38725627 PMCID: PMC11079237 DOI: 10.3389/fonc.2024.1362891] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2024] [Accepted: 04/11/2024] [Indexed: 05/12/2024] Open

Abstract

Background

Endoplasmic reticulum (ER) stress arises from the accumulation of misfolded or unfolded proteins within the cell and is intricately linked to the initiation and progression of various tumors and their therapeutic strategies. However, the precise role of ER stress in uterine corpus endometrial cancer (UCEC) remains unclear.

Methods

Data on patients with UCEC and control subjects were obtained from The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO) databases. Using differential expression analysis and Weighted Gene Co-expression Network Analysis (WGCNA), we identified pivotal differentially expressed ER stress-related genes (DEERGs). Further validation of the significance of these genes in UCEC was achieved through consensus clustering and bioinformatic analyses. Using Cox regression analysis and several machine learning algorithms (least absolute shrinkage and selection operator [LASSO], eXtreme Gradient Boosting [XGBoost], support vector machine recursive feature elimination [SVM-RFE], and Random Forest), hub DEERGs associated with patient prognosis were effectively identified. Based on the four identified hub genes, a prognostic model and nomogram were constructed. Additionally, a drug sensitivity analysis and in vitro validation experiments were performed.

Results

A total of 94 DEERGs were identified in patients with UCEC and healthy controls. Consensus clustering analysis revealed significant differences in prognosis, typical immune checkpoints, and tumor microenvironments between the subtypes. Using Cox regression analysis and machine learning, four hub DEERGs, MYBL2, RADX, RUSC2, and CYP46A1, were identified to construct a prognostic model. The reliability of the model was validated using receiver operating characteristic (ROC) curves. Decision curve analysis (DCA) demonstrated the superior predictive ability of the nomogram in terms of 3- and 5-year survival, compared with that of other clinical indicators. Drug sensitivity analysis revealed increased sensitivity to dactinomycin, docetaxel, selumetinib, and trametinib in the low-risk group. The expressions of RADX, RUSC2, and CYP46A1 were downregulated, whereas that of MYBL2 was upregulated in UCEC tissues, as demonstrated by reverse transcription-quantitative polymerase chain reaction (RT-qPCR) and immunofluorescence assays.

Conclusion

This study developed a stable and accurate prognostic model based on multiple bioinformatics analyses, which can be used to assess the prognosis of UCEC. This model may contribute to future research on the risk stratification of patients with UCEC and the formulation of novel treatment strategies.

Wang SS, Hall ML, Lee E, Kim SC, Ramesh N, Lee SH, Jang JY, Bold RJ, Ku JL, Hwang CI. Whole-genome bisulfite sequencing identifies stage- and subtype-specific DNA methylation signatures in pancreatic cancer. iScience 2024;27:109414. [PMID: 38532888 PMCID: PMC10963232 DOI: 10.1016/j.isci.2024.109414] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2023] [Revised: 02/03/2024] [Accepted: 02/29/2024] [Indexed: 03/28/2024] Open

Meher PK, Sahu TK, Gupta A, Kumar A, Rustgi S. ASRpro: A machine-learning computational model for identifying proteins associated with multiple abiotic stress in plants. Plant Genome 2024;17:e20259. [PMID: 36098562 DOI: 10.1002/tpg2.20259] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2022] [Accepted: 08/10/2022] [Indexed: 06/15/2023]

Abstract

One of the thrust areas of research in plant breeding is to develop crop cultivars with enhanced tolerance to abiotic stresses. Thus, identifying abiotic stress-responsive genes (SRGs) and proteins is important for plant breeding research. However, identifying such genes via established genetic approaches is laborious and resource intensive. Although transcriptome profiling has remained a reliable method of SRG identification, it is species specific. Additionally, identifying multistress-responsive genes using gene expression studies is cumbersome. These limitations underscore the need for a computational method for identifying the genes associated with different abiotic stresses. In this work, we aimed to develop a computational model for identifying genes responsive to six abiotic stresses: cold, drought, heat, light, oxidative, and salt. The predictions were performed using support vector machine (SVM), random forest, adaptive boosting (ADB), and extreme gradient boosting (XGB), where the autocross covariance (ACC) and K-mer compositional features were used as input. With ACC, K-mer, and ACC + K-mer compositional features, overall accuracies of ∼60-77, ∼75-86, and ∼61-78%, respectively, were obtained using the SVM algorithm with fivefold cross-validation. The SVM also achieved higher accuracy than the other three algorithms. The proposed model was also assessed with an independent dataset and obtained an accuracy consistent with cross-validation. The proposed model is the first of its kind and is expected to serve the requirement of experimental biologists; however, the prediction accuracy was modest. Given its importance for the research community, the online prediction application, ASRpro, is made freely available (https://iasri-sg.icar.gov.in/asrpro/) for predicting abiotic SRGs and proteins.

Huang P, Song Y, Yang Y, Bai F, Li N, Liu D, Li C, Li X, Gou W, Zong L. Identification and verification of diagnostic biomarkers based on mitochondria-related genes related to immune microenvironment for preeclampsia using machine learning algorithms. Front Immunol 2024;14:1304165. [PMID: 38259465 PMCID: PMC10800455 DOI: 10.3389/fimmu.2023.1304165] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2023] [Accepted: 12/14/2023] [Indexed: 01/24/2024] Open

Mouat JS, Li S, Myint SS, Laufer BI, Lupo PJ, Schraw JM, Woodhouse JP, de Smith AJ, LaSalle JM. Epigenomic signature of major congenital heart defects in newborns with Down syndrome. Hum Genomics 2023;17:92. [PMID: 37803336 PMCID: PMC10559462 DOI: 10.1186/s40246-023-00540-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2023] [Accepted: 10/02/2023] [Indexed: 10/08/2023] Open

Abstract

BACKGROUND

Congenital heart defects (CHDs) affect approximately half of individuals with Down syndrome (DS), but the molecular reasons for incomplete penetrance are unknown. Previous studies have largely focused on identifying genetic risk factors associated with CHDs in individuals with DS, but comprehensive studies of the contribution of epigenetic marks are lacking. We aimed to identify and characterize DNA methylation differences from newborn dried blood spots (NDBS) of DS individuals with major CHDs compared to DS individuals without CHDs.

METHODS

We used the Illumina EPIC array and whole-genome bisulfite sequencing (WGBS) to quantitate DNA methylation for 86 NDBS samples from the California Biobank Program: (1) 45 DS-CHD (27 female, 18 male) and (2) 41 DS non-CHD (27 female, 14 male). We analyzed global CpG methylation and identified differentially methylated regions (DMRs) in DS-CHD versus DS non-CHD comparisons (both sex-combined and sex-stratified) corrected for sex, age of blood collection, and cell-type proportions. CHD DMRs were analyzed for enrichment in CpG and genic contexts, chromatin states, and histone modifications by genomic coordinates and for gene ontology enrichment by gene mapping. DMRs were also tested in a replication dataset and compared to methylation levels in DS versus typical development (TD) WGBS NDBS samples.

RESULTS

We found global CpG hypomethylation in DS-CHD males compared to DS non-CHD males, which was attributable to elevated levels of nucleated red blood cells and not seen in females. At a regional level, we identified 58, 341, and 3938 CHD-associated DMRs in the Sex Combined, Females Only, and Males Only groups, respectively, and used machine learning algorithms to select 19 Males Only loci that could distinguish CHD from non-CHD. DMRs in all comparisons were enriched for gene exons, CpG islands, and bivalent chromatin and mapped to genes enriched for terms related to cardiac and immune functions. Lastly, a greater percentage of CHD-associated DMRs than background regions were differentially methylated in DS versus TD samples.

CONCLUSIONS

A sex-specific signature of DNA methylation was detected in NDBS of DS-CHD compared to DS non-CHD individuals. This supports the hypothesis that epigenetics can reflect the variability of phenotypes in DS, particularly CHDs.

Affiliation(s)

Julia S Mouat Department of Medical Microbiology and Immunology, School of Medicine, University of California, Davis, CA, USA Perinatal Origins of Disparities Center, University of California, Davis, CA, USA Genome Center, University of California, Davis, CA, USA MIND Institute, University of California, Davis, CA, USA
Shaobo Li Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
Swe Swe Myint Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
Benjamin I Laufer Department of Medical Microbiology and Immunology, School of Medicine, University of California, Davis, CA, USA Perinatal Origins of Disparities Center, University of California, Davis, CA, USA Genome Center, University of California, Davis, CA, USA MIND Institute, University of California, Davis, CA, USA
Philip J Lupo Division of Hematology-Oncology, Department of Pediatrics, Baylor College of Medicine, Houston, TX, USA
Jeremy M Schraw Division of Hematology-Oncology, Department of Pediatrics, Baylor College of Medicine, Houston, TX, USA
John P Woodhouse Division of Hematology-Oncology, Department of Pediatrics, Baylor College of Medicine, Houston, TX, USA
Adam J de Smith Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
Janine M LaSalle Department of Medical Microbiology and Immunology, School of Medicine, University of California, Davis, CA, USA. Perinatal Origins of Disparities Center, University of California, Davis, CA, USA. Genome Center, University of California, Davis, CA, USA. MIND Institute, University of California, Davis, CA, USA.

Morabito F, Adornetto C, Monti P, Amaro A, Reggiani F, Colombo M, Rodriguez-Aldana Y, Tripepi G, D’Arrigo G, Vener C, Torricelli F, Rossi T, Neri A, Ferrarini M, Cutrona G, Gentile M, Greco G. Genes selection using deep learning and explainable artificial intelligence for chronic lymphocytic leukemia predicting the need and time to therapy. Front Oncol 2023;13:1198992. [PMID: 37719021 PMCID: PMC10501728 DOI: 10.3389/fonc.2023.1198992] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/02/2023] [Accepted: 07/31/2023] [Indexed: 09/19/2023] Open

Abstract

Analyzing gene expression profiles (GEP) through artificial intelligence provides meaningful insight into cancer disease. This study introduces DeepSHAP Autoencoder Filter for Genes Selection (DSAF-GS), a novel deep learning and explainable artificial intelligence-based approach for feature selection in genomics-scale data. DSAF-GS exploits the autoencoder's reconstruction capabilities without changing the original feature space, enhancing the interpretation of the results. Explainable artificial intelligence is then used to select the informative genes for chronic lymphocytic leukemia (CLL) prognosis of 217 cases from a GEP database comprising roughly 20,000 genes. The model for prognosis prediction achieved an accuracy of 86.4%, a sensitivity of 85.0%, and a specificity of 87.5%. According to the proposed approach, predictions were strongly influenced by CEACAM19 and PIGP, moderately influenced by MKL1 and GNE, and poorly influenced by other genes. The 10 most influential genes were selected for further analysis. Among them, FADD, FIBP, GNE, IGF1R, MKL1, PIGP, and SLC39A6 were identified in the Reactome pathway database as involved in signal transduction, transcription, protein metabolism, immune system, cell cycle, and apoptosis. Moreover, according to the network model of the 3D protein-protein interaction (PPI) explored using the NetworkAnalyst tool, FADD, FIBP, IGF1R, QTRT1, GNE, SLC39A6, and MKL1 appear coupled into a complex network. Finally, all 10 selected genes showed a predictive power on time to first treatment (TTFT) in univariate analyses on a basic prognostic model including IGHV mutational status, del(11q) and del(17p), NOTCH1 mutations, β2-microglobulin, Rai stage, and B-lymphocytosis known to predict TTFT in CLL. However, only IGF1R (hazard ratio [HR] 1.41, 95% CI 1.08-1.84, P=0.013), COL28A1 (HR 0.32, 95% CI 0.10-0.97, P=0.045), and QTRT1 (HR 7.73, 95% CI 2.48-24.04, P<0.001) genes were significantly associated with TTFT in multivariable analyses when combined with the prognostic factors of the basic model, ultimately increasing the Harrell's c-index and the explained variation to 78.6% (versus 76.5% of the basic prognostic model) and 52.6% (versus 42.2% of the basic prognostic model), respectively. Also, the goodness of model fit was enhanced (χ2 = 20.1, P=0.002), indicating its improved performance above the basic prognostic model. In conclusion, DSAF-GS identified a group of significant genes for CLL prognosis, suggesting future directions for bio-molecular research.
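
The accuracy, sensitivity, and specificity quoted in this abstract follow the standard confusion-matrix definitions. The Python sketch below shows those definitions only; the function name and the example counts are hypothetical, chosen because they happen to reproduce the quoted percentages, and are not taken from the article.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return accuracy, sensitivity, and specificity from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,   # correct predictions over all cases
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
    }


# Hypothetical counts for illustration only (not reported in the abstract).
metrics = classification_metrics(tp=17, fp=3, tn=21, fn=3)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```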

Affiliation(s)

Fortunato Morabito Biotechnology Research Unit, ‘A. Sforza’ Foundation, Cosenza, Italy
Carlo Adornetto Department of Mathematics and Computer Science, University of Calabria, Cosenza, Italy
Paola Monti Mutagenesis and Cancer Prevention Unit, Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS) Ospedale Policlinico San Martino, Genoa, Italy
Adriana Amaro Tumor Epigenetics Unit, Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS) Ospedale Policlinico San Martino, Genoa, Italy
Francesco Reggiani Tumor Epigenetics Unit, Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS) Ospedale Policlinico San Martino, Genoa, Italy
Monica Colombo Molecular Pathology Unit, Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS) Ospedale Policlinico San Martino, Genoa, Italy
Yissel Rodriguez-Aldana Department of Mathematics and Computer Science, University of Calabria, Cosenza, Italy
Giovanni Tripepi Consiglio Nazionale delle Ricerche, Istituto di Fisiologia Clinica del Consiglio Nazionale delle Ricerche (CNR), Reggio Calabria, Italy
Graziella D’Arrigo Consiglio Nazionale delle Ricerche, Istituto di Fisiologia Clinica del Consiglio Nazionale delle Ricerche (CNR), Reggio Calabria, Italy
Claudia Vener Department of Oncology and Hemato-Oncology, University of Milan, Milan, Italy
Federica Torricelli Laboratory of Translational Research, Azienda Unità Sanitaria Locale - Istituto di Ricovero e Cura a Carattere Scientifico (USL-IRCCS) of Reggio Emilia, Reggio Emilia, Italy
Teresa Rossi Laboratory of Translational Research, Azienda Unità Sanitaria Locale - Istituto di Ricovero e Cura a Carattere Scientifico (USL-IRCCS) of Reggio Emilia, Reggio Emilia, Italy
Antonino Neri Scientific Directorate, Azienda Unità Sanitaria Locale - Istituto di Ricovero e Cura a Carattere Scientifico (USL-IRCCS) of Reggio Emilia, Reggio Emilia, Italy
Manlio Ferrarini Unità Operativa (UO) Molecular Pathology, Ospedale Policlinico San Martino Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS), Genoa, Italy
Giovanna Cutrona Molecular Pathology Unit, Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS) Ospedale Policlinico San Martino, Genoa, Italy
Massimo Gentile Hematology Unit, Department of Onco-Hematology, Azienda Ospedaliera (A.O.) of Cosenza, Cosenza, Italy Department of Pharmacy and Health and Nutritional Sciences, University of Calabria, Cosenza, Italy
Gianluigi Greco Department of Mathematics and Computer Science, University of Calabria, Cosenza, Italy

Mouat JS, Li S, Myint SS, Laufer BI, Lupo PJ, Schraw JM, Woodhouse JP, de Smith AJ, LaSalle JM. Epigenomic signature of major congenital heart defects in newborns with Down syndrome. medRxiv 2023:2023.05.02.23289417. [PMID: 37205408 PMCID: PMC10187438 DOI: 10.1101/2023.05.02.23289417] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/21/2023]

Abstract

Background

Congenital heart defects (CHDs) affect approximately half of individuals with Down syndrome (DS) but the molecular reasons for incomplete penetrance are unknown. Previous studies have largely focused on identifying genetic risk factors associated with CHDs in individuals with DS, but comprehensive studies of the contribution of epigenetic marks are lacking. We aimed to identify and characterize DNA methylation differences from newborn dried blood spots (NDBS) of DS individuals with major CHDs compared to DS individuals without CHDs.

Methods

We used the Illumina EPIC array and whole-genome bisulfite sequencing (WGBS) to quantitate DNA methylation for 86 NDBS samples from the California Biobank Program: 1) 45 DS-CHD (27 female, 18 male) and 2) 41 DS non-CHD (27 female, 14 male). We analyzed global CpG methylation and identified differentially methylated regions (DMRs) in DS-CHD vs DS non-CHD comparisons (both sex-combined and sex-stratified) corrected for sex, age of blood collection, and cell type proportions. CHD DMRs were analyzed for enrichment in CpG and genic contexts, chromatin states, and histone modifications by genomic coordinates and for gene ontology enrichment by gene mapping. DMRs were also tested in a replication dataset and compared to methylation levels in DS vs typical development (TD) WGBS NDBS samples.

Results

We found global CpG hypomethylation in DS-CHD males compared to DS non-CHD males, which was attributable to elevated levels of nucleated red blood cells and not seen in females. At a regional level, we identified 58, 341, and 3,938 CHD-associated DMRs in the Sex Combined, Females Only, and Males Only groups, respectively, and used machine learning algorithms to select 19 Males Only loci that could distinguish CHD from non-CHD. DMRs in all comparisons were enriched for gene exons, CpG islands, and bivalent chromatin and mapped to genes enriched for terms related to cardiac and immune functions. Lastly, a greater percentage of CHD-associated DMRs than background regions were differentially methylated in DS vs TD samples.

Conclusions

Huang AA, Huang SY. Computation of the distribution of model accuracy statistics in machine learning: Comparison between analytically derived distributions and simulation-based methods. Health Sci Rep 2023;6:e1214. [PMID: 37091362 PMCID: PMC10119581 DOI: 10.1002/hsr2.1214] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 01/24/2023] [Revised: 03/16/2023] [Accepted: 03/20/2023] [Indexed: 04/25/2023] Open

Sahoo B, Pinnix Z, Sims S, Zelikovsky A. Identifying Biomarkers Using Support Vector Machine to Understand the Racial Disparity in Triple-Negative Breast Cancer. J Comput Biol 2023;30:502-517. [PMID: 36716280 PMCID: PMC10325814 DOI: 10.1089/cmb.2022.0422] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/01/2023] Open

Pradhan UK, Meher PK, Naha S, Rao AR, Gupta A. ASLncR: a novel computational tool for prediction of abiotic stress-responsive long non-coding RNAs in plants. Funct Integr Genomics 2023;23:113. [PMID: 37000299 DOI: 10.1007/s10142-023-01040-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2022] [Revised: 03/23/2023] [Accepted: 03/24/2023] [Indexed: 04/01/2023]

Abstract

Abiotic stresses are detrimental to plant growth and development and have a major negative impact on crop yields. A growing body of evidence indicates that a large number of long non-coding RNAs (lncRNAs) are key to many abiotic stress responses. Thus, identifying abiotic stress-responsive lncRNAs is essential in crop breeding programs in order to develop crop cultivars resistant to abiotic stresses. In this study, we have developed the first machine learning-based computational model for predicting abiotic stress-responsive lncRNAs. The lncRNA sequences that were responsive and non-responsive to abiotic stresses served as the two classes of the dataset for binary classification using the machine learning algorithms. The training dataset was created using 263 stress-responsive and 263 non-stress-responsive sequences, whereas the independent test set consisted of 101 sequences from both classes. As machine learning models can accept only numeric data, Kmer features of sizes 1 to 6 were used to represent lncRNAs in numeric form. To select important features, four different feature selection strategies were utilized. Among the seven learning algorithms, the support vector machine (SVM) achieved the highest cross-validation accuracy with the selected feature sets. The observed 5-fold cross-validation accuracy, AU-ROC, and AU-PRC were found to be 68.84, 72.78, and 75.86%, respectively. Furthermore, the robustness of the developed model (SVM with the selected features) was evaluated using an independent test dataset, where the overall accuracy, AU-ROC, and AU-PRC were found to be 76.23, 87.71, and 88.49%, respectively. The developed computational approach was also implemented in an online prediction tool, ASLncR, accessible at https://iasri-sg.icar.gov.in/aslncr/. The proposed computational model and the developed prediction tool are believed to supplement the existing effort for the identification of abiotic stress-responsive lncRNAs in plants.
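
As a rough illustration of the Kmer encoding mentioned in this abstract, the Python sketch below maps a sequence to a numeric vector of k-mer frequencies for k = 1 to 6. The function names, the RNA alphabet, and the use of relative frequencies are assumptions made for illustration; the abstract does not describe the authors' implementation or normalization.

```python
from itertools import product

ALPHABET = "ACGU"  # assumption: RNA alphabet; pass "ACGT" for DNA-coded sequences


def kmer_frequencies(seq, k, alphabet=ALPHABET):
    """Return relative frequencies of all len(alphabet)**k k-mers in a fixed order."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(max(len(seq) - k + 1, 0)):
        word = seq[i:i + k]
        if word in counts:  # windows containing other symbols (e.g. N) are skipped
            counts[word] += 1
    total = sum(counts.values())
    return [counts[w] / total if total else 0.0 for w in kmers]


def kmer_feature_vector(seq, k_min=1, k_max=6):
    """Concatenate k-mer frequency vectors for every k from k_min to k_max."""
    features = []
    for k in range(k_min, k_max + 1):
        features.extend(kmer_frequencies(seq, k))
    return features


# Hypothetical toy sequence, not taken from the study's dataset.
vector = kmer_feature_vector("AUGGCUACGUAGCUAGGCUA")
print(len(vector))  # 4 + 16 + 64 + 256 + 1024 + 4096 features
```

With a four-letter alphabet, sizes 1 to 6 give several thousand features per sequence, which is consistent with the abstract's mention of a feature selection step before training.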

Pradhan UK, Meher PK, Naha S, Rao AR, Kumar U, Pal S, Gupta A. ASmiR: a machine learning framework for prediction of abiotic stress-specific miRNAs in plants. Funct Integr Genomics 2023;23:92. [PMID: 36939943 DOI: 10.1007/s10142-023-01014-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2022] [Revised: 01/18/2023] [Accepted: 03/06/2023] [Indexed: 03/21/2023]

Bockorny B, Muthuswamy L, Huang L, Hadisurya M, Lim CM, Tsai LL, Gill RR, Wei JL, Bullock AJ, Grossman JE, Besaw RJ, Narasimhan S, Tao WA, Perea S, Sawhney MS, Freedman SD, Hidalgo M, Iliuk A, Muthuswamy SK. A Large-Scale Proteomics Resource of Circulating Extracellular Vesicles for Biomarker Discovery in Pancreatic Cancer. medRxiv 2023:2023.03.13.23287216. [PMID: 36993200 PMCID: PMC10055460 DOI: 10.1101/2023.03.13.23287216] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/31/2023]

Affiliation(s)

Bruno Bockorny Division of Medical Oncology, Beth Israel Deaconess Medical Center, Boston, MA, USA Harvard Medical School, Boston, MA, USA
Lakshmi Muthuswamy Blueprint Medicines, Cambridge, MA, USA
Ling Huang Henry Ford Cancer Institute, Detroit, MI, USA
Marco Hadisurya Department of Biochemistry, Purdue University, West Lafayette, IN, USA
Christine Maria Lim Nanyang Technological University, Singapore
Leo L. Tsai Harvard Medical School, Boston, MA, USA Department of Radiology, Beth Israel Deaconess Medical Center, Boston, MA, USA
Ritu R. Gill Harvard Medical School, Boston, MA, USA Department of Radiology, Beth Israel Deaconess Medical Center, Boston, MA, USA
Jesse L. Wei Harvard Medical School, Boston, MA, USA Department of Radiology, Beth Israel Deaconess Medical Center, Boston, MA, USA
Andrea J. Bullock Division of Medical Oncology, Beth Israel Deaconess Medical Center, Boston, MA, USA Harvard Medical School, Boston, MA, USA
Joseph E. Grossman Agenus Inc, Lexington, MA, USA
Robert J. Besaw Division of Medical Oncology, Beth Israel Deaconess Medical Center, Boston, MA, USA
Supraja Narasimhan Deciphera Pharmaceuticals, Waltham, MA, USA
W. Andy Tao Department of Biochemistry, Purdue University, West Lafayette, IN, USA
Sofia Perea Division of Medical Oncology, Beth Israel Deaconess Medical Center, Boston, MA, USA
Mandeep S. Sawhney Harvard Medical School, Boston, MA, USA Division of Gastroenterology, Beth Israel Deaconess Medical Center, Boston, MA
Steven D. Freedman Harvard Medical School, Boston, MA, USA Division of Gastroenterology, Beth Israel Deaconess Medical Center, Boston, MA
Manuel Hidalgo Division of Hematology-Oncology, Weill Cornell Medical College, New York, NY, USA New York-Presbyterian Hospital, New York, NY, USA
Anton Iliuk Tymora Analytical Operations, West Lafayette, IN, USA
Senthil K. Muthuswamy National Cancer Institute, NIH, Bethesda, MD, USA

Zhao T, Zhu G, Dubey HV, Flaherty P. Identification of significant gene expression changes in multiple perturbation experiments using knockoffs. Brief Bioinform 2023;24:bbad084. [PMID: 36892174 PMCID: PMC10025447 DOI: 10.1093/bib/bbad084] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2022] [Revised: 01/20/2023] [Accepted: 02/13/2023] [Indexed: 03/10/2023] Open

Pradhan UK, Meher PK, Naha S, Pal S, Gupta A, Parsad R. PlDBPred: a novel computational model for discovery of DNA binding proteins in plants. Brief Bioinform 2023;24:6840070. [PMID: 36416116 DOI: 10.1093/bib/bbac483] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/26/2022] [Revised: 10/10/2022] [Accepted: 10/11/2022] [Indexed: 11/24/2022] Open

Hamraz M, Ali A, Mashwani WK, Aldahmani S, Khan Z. Feature selection for high dimensional microarray gene expression data via weighted signal to noise ratio. PLoS One 2023;18:e0284619. [PMID: 37098036 PMCID: PMC10128961 DOI: 10.1371/journal.pone.0284619] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2022] [Accepted: 04/04/2023] [Indexed: 04/26/2023] Open

Zarei Ghobadi M, Emamzadeh R, Teymoori-Rad M, Afsaneh E. Exploration of blood-derived coding and non-coding RNA diagnostic immunological panels for COVID-19 through a co-expressed-based machine learning procedure. Front Immunol 2022;13:1001070. [DOI: 10.3389/fimmu.2022.1001070] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/22/2022] [Accepted: 10/17/2022] [Indexed: 11/06/2022] Open

Chen Z, Shi J, Zhang Y, Zhang J, Li S, Guan L, Jia G. Screening of Serum Biomarkers of Coal Workers' Pneumoconiosis by Metabolomics Combined with Machine Learning Strategy. Int J Environ Res Public Health 2022;19:ijerph19127051. [PMID: 35742299 PMCID: PMC9222502 DOI: 10.3390/ijerph19127051] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 05/05/2022] [Revised: 06/04/2022] [Accepted: 06/07/2022] [Indexed: 12/03/2022]

Ma L, Gong J, Zhao M, Kong X, Gao P, Jiang Y, Liu Y, Feng X, Si S, Cao Y. A Novel Stool Methylation Test for the Non-Invasive Screening of Gastric and Colorectal Cancer. Front Oncol 2022;12:860701. [PMID: 35419280 PMCID: PMC8995552 DOI: 10.3389/fonc.2022.860701] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 01/23/2022] [Accepted: 03/02/2022] [Indexed: 11/13/2022] Open

Tan MS, Cheah PL, Chin AV, Looi LM, Chang SW. A review on omics-based biomarkers discovery for Alzheimer's disease from the bioinformatics perspectives: Statistical approach vs machine learning approach. Comput Biol Med 2021;139:104947. [PMID: 34678481 DOI: 10.1016/j.compbiomed.2021.104947] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 06/21/2021] [Revised: 10/12/2021] [Accepted: 10/12/2021] [Indexed: 12/26/2022]

Zhang Y, Wei X, Cao C, Yu F, Li W, Zhao G, Wei H, Zhang F, Meng P, Sun S, Lammi MJ, Guo X. Identifying discriminative features for diagnosis of Kashin-Beck disease among adolescents. BMC Musculoskelet Disord 2021;22:801. [PMID: 34537022 PMCID: PMC8449456 DOI: 10.1186/s12891-021-04514-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/05/2021] [Accepted: 07/07/2021] [Indexed: 11/23/2022] Open

Affiliation(s)

Yanan Zhang School of Public Health, Xi'an Jiaotong University, Key Laboratory of Trace Elements and Endemic Diseases, National Health Commission of the People's Republic of China, Xi'an, Shaanxi, P.R. China
Xiaoli Wei School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an, Shaanxi, P.R. China
Chunxia Cao Institute of Disaster Medicine, Tianjin University, Tianjin, P.R. China
Fangfang Yu Department of Health Statistics, College of Public Health, Zhengzhou University, Zhengzhou, P.R. China
Wenrong Li School of Public Health, Xi'an Jiaotong University, Key Laboratory of Trace Elements and Endemic Diseases, National Health Commission of the People's Republic of China, Xi'an, Shaanxi, P.R. China Department of Medical Imaging, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, Shaanxi, P.R. China
Guanghui Zhao Xi'an Honghui Hospital, Health Science Center of Xi'an Jiaotong University, Xi'an, Shaanxi, P.R. China
Haiyan Wei School of Public Health, Xi'an Jiaotong University, Key Laboratory of Trace Elements and Endemic Diseases, National Health Commission of the People's Republic of China, Xi'an, Shaanxi, P.R. China
Feng'e Zhang School of Public Health, Xi'an Jiaotong University, Key Laboratory of Trace Elements and Endemic Diseases, National Health Commission of the People's Republic of China, Xi'an, Shaanxi, P.R. China
Peilin Meng School of Public Health, Xi'an Jiaotong University, Key Laboratory of Trace Elements and Endemic Diseases, National Health Commission of the People's Republic of China, Xi'an, Shaanxi, P.R. China
Shiquan Sun School of Public Health, Xi'an Jiaotong University, Key Laboratory of Trace Elements and Endemic Diseases, National Health Commission of the People's Republic of China, Xi'an, Shaanxi, P.R. China
Mikko Juhani Lammi School of Public Health, Xi'an Jiaotong University, Key Laboratory of Trace Elements and Endemic Diseases, National Health Commission of the People's Republic of China, Xi'an, Shaanxi, P.R. China Department of Integrative Medical Biology, University of Umeå, 90187, Umeå, Sweden
Xiong Guo School of Public Health, Xi'an Jiaotong University, Key Laboratory of Trace Elements and Endemic Diseases, National Health Commission of the People's Republic of China, Xi'an, Shaanxi, P.R. China

Greco FA, McKee AC, Kowall NW, Hanlon EB. Near-Infrared Optical Spectroscopy In Vivo Distinguishes Subjects with Alzheimer's Disease from Age-Matched Controls. J Alzheimers Dis 2021;82:791-802. [PMID: 34092628 PMCID: PMC8385529 DOI: 10.3233/jad-201021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022]

Chen Z, Han S, Zhang J, Zheng P, Liu X, Zhang Y, Jia G. Metabolomics screening of serum biomarkers for occupational exposure of titanium dioxide nanoparticles. Nanotoxicology 2021;15:832-849. [PMID: 33961536 DOI: 10.1080/17435390.2021.1921872] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/25/2022]

Integrated meta-analysis and machine learning approach identifies acyl-CoA thioesterase with other novel genes responsible for biofilm development in Staphylococcus aureus. Infect Genet Evol 2021;88:104702. [PMID: 33388440 DOI: 10.1016/j.meegid.2020.104702] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/23/2020] [Revised: 12/24/2020] [Accepted: 12/29/2020] [Indexed: 02/08/2023]

Abstract

Biofilm forming Staphylococcus aureus is a major threat to the health-care industry. It is important to understand the differences between planktonic and biofilm growth forms in the pathogen since conventional treatments targeting the planktonic forms are not effective against biofilms. The current study conducts a meta-analysis of three public transcriptomic profiles to examine the differences in gene expression between the planktonic and biofilm states of S. aureus using random-effects modeling. Mean effect sizes were calculated for 2847 genes among which 726 differentially expressed genes were taken for further analysis. Major genes that are discriminatory between the two conditions were mined using supervised learning techniques and validated by high-accuracy classifiers. Ten different feature selection algorithms were applied and used to rank the most important genes in S. aureus biofilms. Finally, an optimal set of 36 genes are presented as candidate genes in biofilm formation or development while throwing light on the novel roles of an acyl-CoA thioesterase enzyme and 10 hypothetical proteins in biofilms. The relevance of the identified gene set was further validated by building five different classification models using SVM, RF, kNN, NB and DT algorithms that were compared with models built from other relevant gene sets and by reviewing the functional role of 25 previously known genes in biofilm development. The study combines meta-analysis of differential expression with supervised machine learning strategies and feature selection for the first time to identify and validate a discriminatory set of genes important in biofilms of S. aureus. The functional roles of the identified genes predicted to be important in biofilms are further scrutinized and can be considered as a signature target list to develop anti-biofilm therapeutics in S. aureus.
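
The abstract above mentions mean effect sizes obtained by random-effects modeling but does not say which estimator or software was used. As a rough illustration only, the sketch below pools per-study effect sizes for a single gene with the DerSimonian-Laird estimator, which is one common choice and is an assumption here, not the authors' stated method. The function name `random_effects_pool` and the input values are hypothetical.

```python
import math

def random_effects_pool(effects, variances):
    """Pool per-study effect sizes with the DerSimonian-Laird estimator.

    effects   -- one effect size per study (e.g. a standardized mean difference)
    variances -- the within-study variance of each effect size
    Returns (pooled_effect, standard_error, tau_squared).
    """
    k = len(effects)
    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q measures heterogeneity between studies.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2

# Hypothetical effect sizes for one gene across three studies.
pooled, se, tau2 = random_effects_pool([0.8, 1.1, 0.5], [0.04, 0.09, 0.06])
print(round(pooled, 3), round(se, 3), round(tau2, 3))
```

In a setting like the one described, a function of this kind would be applied gene by gene, and genes whose pooled effect differs clearly from zero would be carried forward to feature selection.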

Hamraz M, Gul N, Raza M, Khan DM, Khalil U, Zubair S, Khan Z. Robust proportional overlapping analysis for feature selection in binary classification within functional genomic experiments. PeerJ Comput Sci 2021;7:e562. [PMID: 34141889 PMCID: PMC8176540 DOI: 10.7717/peerj-cs.562] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/10/2020] [Accepted: 05/04/2021] [Indexed: 05/10/2023]

Yousef M, Kumar A, Bakir-Gungor B. Application of Biological Domain Knowledge Based Feature Selection on Gene Expression Data. Entropy (Basel) 2020;23:E2. [PMID: 33374969 PMCID: PMC7821996 DOI: 10.3390/e23010002] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 11/27/2020] [Revised: 12/14/2020] [Accepted: 12/16/2020] [Indexed: 12/19/2022]

Abstract

In the last two decades, there have been massive advancements in high throughput technologies, which resulted in the exponential growth of public repositories of gene expression datasets for various phenotypes. It is possible to unravel biomarkers by comparing the gene expression levels under different conditions, such as disease vs. control, treated vs. not treated, drug A vs. drug B, etc. This problem refers to a well-studied problem in the machine learning domain, i.e., the feature selection problem. In biological data analysis, most of the computational feature selection methodologies were taken from other fields, without considering the nature of the biological data. Thus, integrative approaches that utilize the biological knowledge while performing feature selection are necessary for this kind of data. The main idea behind the integrative gene selection process is to generate a ranked list of genes considering both the statistical metrics that are applied to the gene expression data, and the biological background information which is provided as external datasets. One of the main goals of this review is to explore the existing methods that integrate different types of information in order to improve the identification of the biomolecular signatures of diseases and the discovery of new potential targets for treatment. These integrative approaches are expected to aid the prediction, diagnosis, and treatment of diseases, as well as to enlighten us on disease state dynamics, mechanisms of their onset and progression. The integration of various types of biological information will necessitate the development of novel techniques for integration and data analysis. Another aim of this review is to boost the bioinformatics community to develop new approaches for searching and determining significant groups/clusters of features based on one or more biological grouping functions.

Laufer BI, Hwang H, Jianu JM, Mordaunt CE, Korf IF, Hertz-Picciotto I, LaSalle JM. Low-pass whole genome bisulfite sequencing of neonatal dried blood spots identifies a role for RUNX1 in Down syndrome DNA methylation profiles. Hum Mol Genet 2020;29:3465-3476. [PMID: 33001180 PMCID: PMC7788293 DOI: 10.1093/hmg/ddaa218] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 07/13/2020] [Revised: 09/16/2020] [Accepted: 09/25/2020] [Indexed: 12/17/2022] Open