Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Libbrecht MW, Noble WS. Machine learning applications in genetics and genomics. Nat Rev Genet 2015;16:321-32. [PMID: 25948244 PMCID: PMC5204302 DOI: 10.1038/nrg3920] [Citation(s) in RCA: 833] [Impact Index Per Article: 92.6] [Indexed: 01/18/2023]

Cited by Other Article(s)

Barash M, McNevin D, Fedorenko V, Giverts P. Machine learning applications in forensic DNA profiling: A critical review. Forensic Sci Int Genet 2024;69:102994. [PMID: 38086200 DOI: 10.1016/j.fsigen.2023.102994] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/29/2023] [Revised: 11/06/2023] [Accepted: 11/26/2023] [Indexed: 01/29/2024]

Chen Y, Mancini M, Zhu X, Akata Z. Semi-Supervised and Unsupervised Deep Visual Learning: A Survey. IEEE Transactions on Pattern Analysis and Machine Intelligence 2024;46:1327-1347. [PMID: 36006881 DOI: 10.1109/tpami.2022.3201576] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/15/2023]

Fang C, Dziedzic A, Zhang L, Oliva L, Verma A, Razak F, Papernot N, Wang B. Decentralised, collaborative, and privacy-preserving machine learning for multi-hospital data. EBioMedicine 2024;101:105006. [PMID: 38377795 PMCID: PMC10884342 DOI: 10.1016/j.ebiom.2024.105006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/15/2023] [Revised: 01/26/2024] [Accepted: 01/28/2024] [Indexed: 02/22/2024] Open

Abstract

BACKGROUND

Machine Learning (ML) has demonstrated great potential in medical data analysis. Large datasets collected from diverse sources and settings are essential for ML models in healthcare to achieve better accuracy and generalizability. Sharing data across different healthcare institutions or jurisdictions is challenging because of complex and varying privacy and regulatory requirements. Hence, it is hard but crucial to allow multiple parties to collaboratively train an ML model that leverages the private datasets available at each party, without directly sharing those datasets or compromising their privacy through collaboration.

METHODS

In this paper, we address this challenge by proposing Decentralized, Collaborative, and Privacy-preserving ML for Multi-Hospital Data (DeCaPH). This framework offers the following key benefits: (1) it allows different parties to collaboratively train an ML model without transferring their private datasets (i.e., no data centralization); (2) it safeguards patients' privacy by limiting the potential privacy leakage arising from any contents shared across the parties during the training process; and (3) it facilitates the ML model training without relying on a centralized party/server.

FINDINGS

We demonstrate the generalizability and power of DeCaPH on three distinct tasks using real-world distributed medical datasets: patient mortality prediction using electronic health records, cell-type classification using single-cell human genomes, and pathology identification using chest radiology images. The ML models trained with the DeCaPH framework show a drop of less than 3.2% in model performance compared with those trained by the non-privacy-preserving collaborative framework. Meanwhile, the average vulnerability to privacy attacks of the models trained with DeCaPH decreased by up to 16%. In addition, models trained with the DeCaPH framework outperform models trained solely on the private datasets of individual parties without collaboration by up to 70%, and outperform models trained with the previous privacy-preserving collaborative training framework under the same privacy guarantee by up to 18.2%.

INTERPRETATION

We demonstrate that the ML models trained with the DeCaPH framework have an improved utility-privacy trade-off, showing that DeCaPH enables the models to perform well while preserving the privacy of the training data points. In addition, the ML models trained with the DeCaPH framework generally outperform those trained solely with the private datasets from individual parties, showing that DeCaPH enhances model generalizability.

FUNDING

This work was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC, RGPIN-2020-06189 and DGECR-2020-00294), Canadian Institute for Advanced Research (CIFAR) AI Catalyst Grants, CIFAR AI Chair programs, Temerty Professor of AI Research and Education in Medicine, University of Toronto, Amazon, Apple, DARPA through the GARD project, Intel, Meta, the Ontario Early Researcher Award, and the Sloan Foundation. Resources used in preparing this research were provided, in part, by the Province of Ontario, the Government of Canada through CIFAR, and companies sponsoring the Vector Institute.

Bai Y, Lin H, Wang C, Wang Q, Qu J. Digitalizing river aquatic ecosystems. J Environ Sci (China) 2024;137:677-680. [PMID: 37980050 DOI: 10.1016/j.jes.2023.03.012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2022] [Revised: 03/07/2023] [Accepted: 03/08/2023] [Indexed: 11/20/2023]

Hazan JM, Amador R, Ali-Nasser T, Lahav T, Shotan SR, Steinberg M, Cohen Z, Aran D, Meiri D, Assaraf YG, Guigó R, Bester AC. Integration of transcription regulation and functional genomic data reveals lncRNA SNHG6's role in hematopoietic differentiation and leukemia. J Biomed Sci 2024;31:27. [PMID: 38419051 PMCID: PMC10900714 DOI: 10.1186/s12929-024-01015-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2023] [Accepted: 02/22/2024] [Indexed: 03/02/2024] Open

Abstract

BACKGROUND

Long non-coding RNAs (lncRNAs) are pivotal players in cellular processes, and their unique cell-type specific expression patterns render them attractive biomarkers and therapeutic targets. Yet, the functional roles of most lncRNAs remain enigmatic. To address the need to identify new druggable lncRNAs, we developed a comprehensive approach integrating transcription factor binding data with other genetic features to generate a machine learning model, which we have called INFLAMeR (Identifying Novel Functional LncRNAs with Advanced Machine Learning Resources).

METHODS

INFLAMeR was trained on high-throughput CRISPR interference (CRISPRi) screens across seven cell lines, and the algorithm was based on 71 genetic features. To validate the predictions, we selected candidate lncRNAs in the human K562 leukemia cell line and determined the impact of their knockdown (KD) on cell proliferation and chemotherapeutic drug response. We further performed transcriptomic analysis for candidate genes. Based on these findings, we assessed the lncRNA small nucleolar RNA host gene 6 (SNHG6) for its role in myeloid differentiation. Finally, we established a mouse K562 leukemia xenograft model to determine whether SNHG6 KD attenuates tumor growth in vivo.

RESULTS

The INFLAMeR model successfully reconstituted CRISPRi screening data and predicted functional lncRNAs that were previously overlooked. Intensive cell-based and transcriptomic validation of nearly fifty genes in K562 revealed cell type-specific functionality for 85% of the predicted lncRNAs. In this respect, our cell-based and transcriptomic analyses predicted a role for SNHG6 in hematopoiesis and leukemia. Consistent with its predicted role in hematopoietic differentiation, SNHG6 transcription is regulated by hematopoiesis-associated transcription factors. SNHG6 KD reduced the proliferation of leukemia cells and sensitized them to differentiation. Treatment of K562 leukemic cells with hemin and PMA, respectively, demonstrated that SNHG6 inhibits red blood cell differentiation but strongly promotes megakaryocyte differentiation. Using a xenograft mouse model, we demonstrate that SNHG6 KD attenuated tumor growth in vivo.

CONCLUSIONS

Our approach not only improved the identification and characterization of functional lncRNAs through genomic approaches in a cell type-specific manner, but also identified new lncRNAs with roles in hematopoiesis and leukemia. Such approaches can be readily applied to identify novel targets for precision medicine.

Rahit KMTH, Avramovic V, Chong JX, Tarailo-Graovac M. GPAD: a natural language processing-based application to extract the gene-disease association discovery information from OMIM. BMC Bioinformatics 2024;25:84. [PMID: 38413851 PMCID: PMC10898068 DOI: 10.1186/s12859-024-05693-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2023] [Accepted: 02/09/2024] [Indexed: 02/29/2024] Open

Abstract

BACKGROUND

Thousands of genes have been associated with different Mendelian conditions. One of the valuable sources to track these gene-disease associations (GDAs) is the Online Mendelian Inheritance in Man (OMIM) database. However, most of the information in OMIM is textual, and heterogeneous (e.g. summarized by different experts), which complicates automated reading and understanding of the data. Here, we used Natural Language Processing (NLP) to make a tool (Gene-Phenotype Association Discovery (GPAD)) that could syntactically process OMIM text and extract the data of interest.

RESULTS

GPAD applies a series of language-based techniques to the text obtained from the OMIM API to extract GDA discovery-related information. GPAD can report when a particular gene was associated with a specific phenotype, as well as the type of validation (whether through model organisms or cohort-based patient-matching approaches) for such an association. GPAD-extracted data were validated against published reports and compared with a large language model. Using GPAD's extracted data, we analysed trends in GDA discoveries, noting a significant increase in their rate after the introduction of exome sequencing, rising from an average of about 150 to 250 discoveries each year. Contrary to hopes of resolving most GDAs for Mendelian disorders by now, our data indicate a substantial decline in discovery rates over the past five years (2017-2022). This decline appears to be linked to the increasing necessity for larger cohorts to substantiate GDAs. The rising use of zebrafish and Drosophila as model organisms in providing evidential support for GDAs is also observed.

CONCLUSIONS

GPAD's real-time analyzing capacity offers an up-to-date view of GDA discovery and could help in planning and managing the research strategies. In future, this solution can be extended or modified to capture other information in OMIM and scientific literature.

Li G, Li C, Wang C, Wang Z. Suboptimal capability of individual machine learning algorithms in modeling small-scale imbalanced clinical data of local hospital. PLoS One 2024;19:e0298328. [PMID: 38394317 PMCID: PMC10890755 DOI: 10.1371/journal.pone.0298328] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2023] [Accepted: 01/22/2024] [Indexed: 02/25/2024] Open

Fong WJ, Tan HM, Garg R, Teh AL, Pan H, Gupta V, Krishna B, Chen ZH, Purwanto NY, Yap F, Tan KH, Chan KYJ, Chan SY, Goh N, Rane N, Tan ESE, Jiang Y, Han M, Meaney M, Wang D, Keppo J, Tan GCY. Comparing feature selection and machine learning approaches for predicting CYP2D6 methylation from genetic variation. Front Neuroinform 2024;17:1244336. [PMID: 38449836 PMCID: PMC10915285 DOI: 10.3389/fninf.2023.1244336] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2023] [Accepted: 10/18/2023] [Indexed: 03/08/2024] Open

Abstract

Introduction

Pharmacogenetics currently supports clinical decision-making on the basis of a limited number of variants in a few genes and may benefit paediatric prescribing where there is a need for more precise dosing. Integrating genomic information such as methylation into pharmacogenetic models holds the potential to improve their accuracy and consequently prescribing decisions. Cytochrome P450 2D6 (CYP2D6) is a highly polymorphic gene conventionally associated with the metabolism of commonly used drugs and endogenous substrates. We thus sought to predict epigenetic loci from single nucleotide polymorphisms (SNPs) related to CYP2D6 in children from the GUSTO cohort.

Methods

Buffy coat DNA methylation was quantified using the Illumina Infinium Methylation EPIC beadchip. CpG sites associated with CYP2D6 were used as outcome variables in Linear Regression, Elastic Net and XGBoost models. We compared feature selection of SNPs from GWAS mQTLs, GTEx eQTLs and SNPs within 2 MB of the CYP2D6 gene and the impact of adding demographic data. The samples were split into training (75%) and test (25%) sets for validation. In the Elastic Net and XGBoost models, the optimal hyperparameter search was done using 10-fold cross-validation. Root Mean Square Error and R-squared values were obtained to investigate each model's performance. When GWAS was performed to determine SNPs associated with CpG sites, a total of 15 SNPs were identified, several of which appeared to influence multiple CpG sites.
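As an illustration of the evaluation protocol named in the Methods paragraph above (a 75%/25% split scored by Root Mean Square Error and R-squared), the following is a minimal sketch and not the authors' code. The data are synthetic, a one-feature least-squares fit stands in for the Linear Regression, Elastic Net and XGBoost models, and all function names are hypothetical.

```python
import math
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle rows and return (train, test) using the given test fraction."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def fit_simple_linear(train):
    """Ordinary least squares for y = a + b*x on (x, y) pairs; returns (a, b)."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in train)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Synthetic stand-in data (an assumption): genotype dosage 0/1/2 versus a
# noisy methylation-like value. No real cohort data are used here.
rng = random.Random(42)
data = [(g, 0.3 + 0.1 * g + rng.gauss(0, 0.02))
        for g in (rng.choice([0, 1, 2]) for _ in range(200))]

train, test = train_test_split(data, test_fraction=0.25)
intercept, slope = fit_simple_linear(train)
y_true = [y for _, y in test]
y_pred = [intercept + slope * x for x, _ in test]
print(f"RMSE={rmse(y_true, y_pred):.4f}  R2={r_squared(y_true, y_pred):.4f}")
```

Both metrics are computed on the held-out 25% only, so they estimate performance on samples the model did not see during fitting.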

Results

Overall, Elastic Net models of genetic features appeared to perform marginally better than heritability estimates and substantially better than Linear Regression and XGBoost models. The addition of nongenetic features appeared to improve performance for some but not all feature sets and probes. The best feature set and Machine Learning (ML) approach differed substantially between CpG sites and a number of top variables were identified for each model.

Discussion

The development of SNP-based prediction models for CYP2D6 CpG methylation in Singaporean children of varying ethnicities in this study has clinical application. With further validation, they may add to the set of tools available to improve precision medicine and pharmacogenetics-based dosing.

Affiliation(s)

Wei Jing Fong: Computational Biology, National University of Singapore, Singapore, Singapore
Hong Ming Tan: Computational Biology, National University of Singapore, Singapore, Singapore
Rishabh Garg: Computational Biology, National University of Singapore, Singapore, Singapore
Ai Ling Teh: Singapore Institute for Clinical Sciences (SICS), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore; Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
Hong Pan: Singapore Institute for Clinical Sciences (SICS), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
Varsha Gupta: Singapore Institute for Clinical Sciences (SICS), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore; Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
Bernadus Krishna: Computational Biology, National University of Singapore, Singapore, Singapore
Zou Hui Chen: Computational Biology, National University of Singapore, Singapore, Singapore
Natania Yovela Purwanto: Computational Biology, National University of Singapore, Singapore, Singapore
Fabian Yap: KK Women's and Children's Hospital, Singapore, Singapore
Kok Hian Tan: KK Women's and Children's Hospital, Singapore, Singapore; Duke NUS Medical School, Singapore, Singapore
Kok Yen Jerry Chan: KK Women's and Children's Hospital, Singapore, Singapore; Duke NUS Medical School, Singapore, Singapore
Shiao-Yng Chan: Singapore Institute for Clinical Sciences (SICS), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore; National University Hospital, Singapore, Singapore
Nicole Goh: Yale-NUS College, Singapore, Singapore
Nikita Rane: Institute of Mental Health, Singapore, Singapore
Ethel Siew Ee Tan: Institute of Mental Health, Singapore, Singapore
Yuheng Jiang: Institute of Mental Health, Singapore, Singapore
Mei Han: Computational Biology, National University of Singapore, Singapore, Singapore
Michael Meaney: Singapore Institute for Clinical Sciences (SICS), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
Dennis Wang: Singapore Institute for Clinical Sciences (SICS), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore; National Heart and Lung Institute, Imperial College London, London, United Kingdom
Jussi Keppo: Computational Biology, National University of Singapore, Singapore, Singapore
Geoffrey Chern-Yee Tan: Computational Biology, National University of Singapore, Singapore, Singapore; Institute of Mental Health, Singapore, Singapore

Lv Q, Liu Y, Sun Y, Wu M. Insight into deep learning for glioma IDH medical image analysis: A systematic review. Medicine (Baltimore) 2024;103:e37150. [PMID: 38363910 PMCID: PMC10869095 DOI: 10.1097/md.0000000000037150] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2023] [Accepted: 01/11/2024] [Indexed: 02/18/2024] Open

Shahjahan, Dey JK, Dey SK. Translational bioinformatics approach to combat cardiovascular disease and cancers. Advances in Protein Chemistry and Structural Biology 2024;139:221-261. [PMID: 38448136 DOI: 10.1016/bs.apcsb.2023.11.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/08/2024]

Abstract

Bioinformatics is an interdisciplinary science that draws on biology, chemistry, physics, statistics, mathematics, and computer science to answer complicated physiological problems. The key aim of bioinformatics is to store, analyze, organize, and retrieve essential information about the genome, proteome, transcriptome, and metabolome, as well as organisms, in order to investigate a biological system along with its dynamics, if any. The outcome of bioinformatics depends on the type, quantity, and quality of the raw data provided and on the algorithm employed to analyze them. Despite the availability of several approved medicines, cardiovascular disorders (CVDs) and cancers are the two leading causes of human deaths. Understanding the unknown aspects of both these non-communicable disorders is essential to discover new pathways, find new drug targets, and eventually develop newer drugs to combat them successfully. Since all these goals involve complex investigation and handling of various types of macromolecules and small molecules of the human body, bioinformatics plays a key role in such processes. Results from such investigations have direct human application, and thus we call this field translational bioinformatics. The current book chapter thus deals with the diverse scope and applications of translational bioinformatics in finding cures, making diagnoses, and understanding the mechanisms of CVDs and cancers. Developing complex algorithms, whether short or long, to address such problems is very common in translational bioinformatics. Structure-based drug discovery and AI-guided invention of novel antibodies, achieved with very high accuracy and speed and with considerably low investment, are some of the astonishing features of translational bioinformatics and its applications in the fields of CVDs and cancers.

Hassan J, Saeed SM, Deka L, Uddin MJ, Das DB. Applications of Machine Learning (ML) and Mathematical Modeling (MM) in Healthcare with Special Focus on Cancer Prognosis and Anticancer Therapy: Current Status and Challenges. Pharmaceutics 2024;16:260. [PMID: 38399314 PMCID: PMC10892549 DOI: 10.3390/pharmaceutics16020260] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2023] [Revised: 01/29/2024] [Accepted: 02/07/2024] [Indexed: 02/25/2024] Open

Wong EY, Chu TN, Ladi-Seyedian SS. Genomics and Artificial Intelligence: Prostate Cancer. Urol Clin North Am 2024;51:27-33. [PMID: 37945100 DOI: 10.1016/j.ucl.2023.06.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2023]

Gunter NB, Gebre RK, Graff-Radford J, Heckman MG, Jack CR, Lowe VJ, Knopman DS, Petersen RC, Ross OA, Vemuri P, Ramanan VK. Machine Learning Models of Polygenic Risk for Enhanced Prediction of Alzheimer Disease Endophenotypes. Neurol Genet 2024;10:e200120. [PMID: 38250184 PMCID: PMC10798228 DOI: 10.1212/nxg.0000000000200120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2023] [Accepted: 11/01/2023] [Indexed: 01/23/2024]

Abstract

Background and Objectives

Alzheimer disease (AD) has a polygenic architecture, for which genome-wide association studies (GWAS) have helped elucidate sequence variants (SVs) influencing susceptibility. Polygenic risk score (PRS) approaches show promise for generating summary measures of inherited risk for clinical AD based on the effects of APOE and other GWAS hits. However, existing PRS approaches, based on traditional regression models, explain only modest variation in AD dementia risk and AD-related endophenotypes. We hypothesized that machine learning (ML) models of polygenic risk (ML-PRS) could outperform standard regression-based PRS methods and therefore have the potential for greater clinical utility.

Methods

We analyzed combined data from the Mayo Clinic Study of Aging (n = 1,791) and the Alzheimer's Disease Neuroimaging Initiative (n = 864). An AD PRS was computed for each participant using the top common SVs obtained from a large AD dementia GWAS. In parallel, ML models were trained using those SV genotypes, with amyloid PET burden as the primary outcome. Secondary outcomes included amyloid PET positivity and clinical diagnosis (cognitively unimpaired vs impaired). We compared performance between ML-PRS and standard PRS across 100 training sessions with different data splits. In each session, data were split into 80% training and 20% testing, and then five-fold cross-validation was used within the training set to ensure the best model was produced for testing. We also applied permutation importance techniques to assess which genetic factors contributed most to outcome prediction.
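The resampling scheme described above (100 sessions, each with a fresh 80%/20% split and five-fold cross-validation inside the training portion to choose the model that is then tested) can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's pipeline: the data are synthetic, a one-feature ridge regression with a small grid of penalties stands in for the ML models, and every name below is hypothetical.

```python
import random

def k_fold_indices(n, k):
    """Yield (train_idx, val_idx) pairs for k contiguous folds over range(n)."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, val
        start += size

def fit_ridge_1d(points, lam):
    """One-feature ridge regression on (x, y) pairs; returns (intercept, slope)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    slope = sxy / (sxx + lam)
    return my - slope * mx, slope

def mse(points, model):
    """Mean squared error of a fitted (intercept, slope) model on points."""
    a, b = model
    return sum((y - (a + b * x)) ** 2 for x, y in points) / len(points)

def select_lambda(train, lambdas, k=5):
    """Pick the penalty with the lowest mean validation MSE across k folds."""
    def cv_error(lam):
        errs = []
        for tr_idx, va_idx in k_fold_indices(len(train), k):
            model = fit_ridge_1d([train[i] for i in tr_idx], lam)
            errs.append(mse([train[i] for i in va_idx], model))
        return sum(errs) / len(errs)
    return min(lambdas, key=cv_error)

def repeated_evaluation(data, lambdas, n_sessions=100, test_fraction=0.2, seed=0):
    """Return one held-out test MSE per session, each from a fresh random split."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_sessions):
        shuffled = data[:]
        rng.shuffle(shuffled)
        n_test = round(len(shuffled) * test_fraction)
        test, train = shuffled[:n_test], shuffled[n_test:]
        best = select_lambda(train, lambdas)          # tuning sees training data only
        scores.append(mse(test, fit_ridge_1d(train, best)))
    return scores

# Synthetic data (an assumption): a linear signal with Gaussian noise.
rng = random.Random(1)
data = [(x, 2.0 * x + rng.gauss(0, 0.5))
        for x in (rng.uniform(-1, 1) for _ in range(100))]
scores = repeated_evaluation(data, lambdas=[0.0, 1.0, 10.0], n_sessions=100)
print(f"mean test MSE over {len(scores)} sessions: {sum(scores) / len(scores):.3f}")
```

Keeping the hyperparameter search inside the training portion means the 20% test set plays no part in model selection, and repeating the split gives a distribution of test scores rather than a single estimate.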

Results

ML-PRS models outperformed the AD PRS (r² = 0.28 vs r² = 0.24 in test set) in explaining variation in amyloid PET burden. Among ML approaches, methods accounting for nonlinear genetic influences were superior to linear methods. ML-PRS models were also more accurate when predicting amyloid PET positivity (area under the curve [AUC] = 0.80 vs AUC = 0.63) and the presence of cognitive impairment (AUC = 0.75 vs AUC = 0.54) compared with the standard PRS.
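For readers less familiar with the AUC values quoted above, the area under the ROC curve equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it directly from that definition. The function name and the toy labels and scores are illustrative assumptions and are unrelated to the study data.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: three of the four positive/negative pairs are ordered correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported 0.80 vs 0.63 and 0.75 vs 0.54 comparisons should be read.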

Discussion

We found that ML-PRS approaches improved upon standard PRS for prediction of AD endophenotypes, partly related to improved accounting for nonlinear effects of genetic susceptibility alleles. Further adaptations of the ML-PRS framework could help to close the gap of remaining unexplained heritability for AD and therefore facilitate more accurate presymptomatic and early-stage risk stratification for clinical decision-making.

Affiliation(s)

Nathaniel B Gunter, Robel K Gebre, Jonathan Graff-Radford, Michael G Heckman, Clifford R Jack, Val J Lowe, David S Knopman, Ronald C Petersen, Owen A Ross, Prashanthi Vemuri, Vijay K Ramanan: From the Departments of Radiology (N.B.G., R.K.G., C.R.J., V.J.L., P.V.), Neurology (J.G.-R., D.S.K., R.C.P., V.K.R.), and Quantitative Health Sciences (R.C.P.), Mayo Clinic Rochester, MN; and Departments of Quantitative Health Sciences (M.G.H.), Neuroscience (O.A.R.), and Clinical Genomics (O.A.R.), Mayo Clinic Florida, Jacksonville

Cho H, She J, De Marchi D, El-Zaatari H, Barnes EL, Kahkoska AR, Kosorok MR, Virkud AV. Machine Learning and Health Science Research: Tutorial. J Med Internet Res 2024;26:e50890. [PMID: 38289657 PMCID: PMC10865203 DOI: 10.2196/50890] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2023] [Revised: 11/30/2023] [Accepted: 12/21/2023] [Indexed: 02/01/2024] Open

Lebatteux D, Soudeyns H, Boucoiran I, Gantt S, Diallo AB. Machine learning-based approach KEVOLVE efficiently identifies SARS-CoV-2 variant-specific genomic signatures. PLoS One 2024;19:e0296627. [PMID: 38241279 PMCID: PMC10798494 DOI: 10.1371/journal.pone.0296627] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2022] [Accepted: 12/07/2023] [Indexed: 01/21/2024] Open

Yu X, Zhao H, Wang R, Chen Y, Ouyang X, Li W, Sun Y, Peng A. Cancer epigenetics: from laboratory studies and clinical trials to precision medicine. Cell Death Discov 2024;10:28. [PMID: 38225241 PMCID: PMC10789753 DOI: 10.1038/s41420-024-01803-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2023] [Revised: 12/23/2023] [Accepted: 01/04/2024] [Indexed: 01/17/2024] Open

Sen SK, Green ED, Hutter CM, Craven M, Ideker T, Di Francesco V. Opportunities for basic, clinical, and bioethics research at the intersection of machine learning and genomics. Cell Genomics 2024;4:100466. [PMID: 38190108 PMCID: PMC10794834 DOI: 10.1016/j.xgen.2023.100466] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2022] [Revised: 07/14/2023] [Accepted: 11/20/2023] [Indexed: 01/09/2024]

Barnett EJ, Onete DG, Salekin A, Faraone SV. Genomic Machine Learning Meta-regression: Insights on Associations of Study Features With Reported Model Performance. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2024;21:169-177. [PMID: 38109236 DOI: 10.1109/tcbb.2023.3343808] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/20/2023]

Li J, Varghese RS, Ressom HW. RNA-Seq Data Analysis. Methods Mol Biol 2024;2822:263-290. [PMID: 38907924 DOI: 10.1007/978-1-0716-3918-4_18] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/24/2024]

Abstract

RNA-Seq data analysis stands as a vital part of genomics research, turning vast and complex datasets into meaningful biological insights. It is a field marked by rapid evolution and ongoing innovation, necessitating a thorough understanding for anyone seeking to unlock the potential of RNA-Seq data. In this chapter, we describe the intricate landscape of RNA-seq data analysis, elucidating a comprehensive pipeline that navigates through the entirety of this complex process. Beginning with quality control, the chapter underscores the paramount importance of ensuring the integrity of RNA-seq data, as it lays the groundwork for subsequent analyses. Preprocessing is then addressed, where the raw sequence data undergoes necessary modifications and enhancements, setting the stage for the alignment phase. This phase involves mapping the processed sequences to a reference genome, a step pivotal for decoding the origins and functions of these sequences. Venturing into the heart of RNA-seq analysis, the chapter then explores differential expression analysis, the process of identifying genes that exhibit varying expression levels across different conditions or sample groups. Recognizing the biological context of these differentially expressed genes is pivotal; hence, the chapter transitions into functional analysis. Here, methods and tools like Gene Ontology and pathway analyses help contextualize the roles and interactions of the identified genes within broader biological frameworks. However, the chapter does not stop at conventional analysis methods. Embracing the evolving paradigms of data science, it delves into machine learning applications for RNA-seq data, introducing advanced techniques in dimension reduction and both unsupervised and supervised learning. These approaches allow for patterns and relationships to be discerned in the data that might be imperceptible through traditional methods.

Jia K, Wang Y, Cao Q, Wang Y. Extensive prediction of drug response in mutation-subtype-specific LUAD with machine learning approach. Oncol Res 2023;32:409-419. [PMID: 38186568 PMCID: PMC10765129 DOI: 10.32604/or.2023.042863] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/14/2023] [Accepted: 09/25/2023] [Indexed: 01/09/2024] Open

Abstract

Background

Lung cancer is the most prevalent cancer diagnosis and the leading cause of cancer death worldwide. Therapeutic failure in lung adenocarcinoma (LUAD) is heavily influenced by drug resistance. This challenge stems from the diverse cell populations within the tumor, each having unique genetic, epigenetic, and phenotypic profiles. Such variations lead to varied therapeutic responses, thereby contributing to tumor relapse and disease progression.

Methods

The Genomics of Drug Sensitivity in Cancer (GDSC) database was used in this investigation to obtain the mRNA expression dataset, genomic mutation profile, and drug sensitivity information of NSCLC. Machine Learning (ML) methods, including Random Forest (RF), Artificial Neural Network (ANN), and Support Vector Machine (SVM), were used to predict the response status of each compound based on the mRNA and mutation characteristics determined using statistical methods. The most suitable method for each drug was proposed by comparing the prediction accuracy of different ML methods, and the selected mRNA and mutation characteristics were identified as molecular features for the drug-responsive cancer subtype. Finally, the prognostic influence of the molecular features on the mutational subtype of LUAD was assessed in publicly available datasets.

Results

Our analyses yielded 1,564 gene features and 45 mutational features for 46 drugs. Applying the ML approach to predict the drug response for each medication revealed outstanding performance for SVM in predicting Afuresertib drug response (area under the curve [AUC] 0.875) using CIT, GAS2L3, STAG3L3, ATP2B4-mut, and IL15RA-mut as molecular features. Furthermore, the ANN algorithm using 9 mRNA characteristics demonstrated the highest prediction performance (AUC 0.780) for Gefitinib with CCL23-mut.

Conclusion

This work extensively investigated the mRNA and mutation signatures associated with drug response in LUAD using a machine-learning approach and proposed a priority algorithm to predict drug response for different drugs.

Li R, Chen S, Matsumoto H, Gouda M, Gafforov Y, Wang M, Liu Y. Predicting rice diseases using advanced technologies at different scales: present status and future perspectives. ABIOTECH 2023;4:359-371. [PMID: 38106429 PMCID: PMC10721578 DOI: 10.1007/s42994-023-00126-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 09/27/2023] [Accepted: 10/30/2023] [Indexed: 12/19/2023]

Chao H, Zhang S, Hu Y, Ni Q, Xin S, Zhao L, Ivanisenko VA, Orlov YL, Chen M. Integrating omics databases for enhanced crop breeding. J Integr Bioinform 2023;20:jib-2023-0012. [PMID: 37486120 PMCID: PMC10777369 DOI: 10.1515/jib-2023-0012] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/27/2023] [Accepted: 06/12/2023] [Indexed: 07/25/2023] Open

Gao S, Chen S, Yang M, Wu J, Chen S, Li H. Mining salt stress-related genes in Spartina alterniflora via analyzing co-evolution signal across 365 plant species using phylogenetic profiling. ABIOTECH 2023;4:291-302. [PMID: 38106430 PMCID: PMC10721760 DOI: 10.1007/s42994-023-00125-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 08/14/2023] [Accepted: 10/23/2023] [Indexed: 12/19/2023]

Lyu K, Xiao J, Lyu S, Liu R. Comparative Analysis of Transposable Elements in Strawberry Genomes of Different Ploidy Levels. Int J Mol Sci 2023;24:16935. [PMID: 38069258 PMCID: PMC10706760 DOI: 10.3390/ijms242316935] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/27/2023] [Revised: 11/25/2023] [Accepted: 11/27/2023] [Indexed: 12/18/2023] Open

Toussaint PA, Leiser F, Thiebes S, Schlesner M, Brors B, Sunyaev A. Explainable artificial intelligence for omics data: a systematic mapping study. Brief Bioinform 2023;25:bbad453. [PMID: 38113073 PMCID: PMC10729786 DOI: 10.1093/bib/bbad453] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/23/2022] [Revised: 07/28/2023] [Accepted: 11/08/2023] [Indexed: 12/21/2023] Open

Sun Y, Zhao Z, Tong H, Sun B, Liu Y, Ren N, You S. Machine Learning Models for Inverse Design of the Electrochemical Oxidation Process for Water Purification. Environ Sci Technol 2023;57:17990-18000. [PMID: 37189261 DOI: 10.1021/acs.est.2c08771] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/17/2023]

Arjmandi M, Fattahi M, Motevassel M, Rezaveisi H. Evaluating algorithms of decision tree, support vector machine and regression for anode side catalyst data in proton exchange membrane water electrolysis. Sci Rep 2023;13:20309. [PMID: 37985795 PMCID: PMC10662483 DOI: 10.1038/s41598-023-47174-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/18/2023] [Accepted: 11/09/2023] [Indexed: 11/22/2023] Open

Yue T, Wang Y, Zhang L, Gu C, Xue H, Wang W, Lyu Q, Dun Y. Deep Learning for Genomics: From Early Neural Nets to Modern Large Language Models. Int J Mol Sci 2023;24:15858. [PMID: 37958843 PMCID: PMC10649223 DOI: 10.3390/ijms242115858] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/30/2023] [Revised: 10/24/2023] [Accepted: 10/30/2023] [Indexed: 11/15/2023] Open

Rodríguez-López M, Bordin N, Lees J, Scholes H, Hassan S, Saintain Q, Kamrad S, Orengo C, Bähler J. Broad functional profiling of fission yeast proteins using phenomics and machine learning. eLife 2023;12:RP88229. [PMID: 37787768 PMCID: PMC10547477 DOI: 10.7554/elife.88229] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/04/2023] Open

Wu K, Xu C, Li T, Ma H, Gong J, Li X, Sun X, Hu X. Application of Nanotechnology in Plant Genetic Engineering. Int J Mol Sci 2023;24:14836. [PMID: 37834283 PMCID: PMC10573821 DOI: 10.3390/ijms241914836] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/29/2023] [Revised: 09/20/2023] [Accepted: 09/28/2023] [Indexed: 10/15/2023] Open

Affiliation(s)

Kexin Wu Collaborative Innovation Center for Efficient and Green Production of Agriculture in Mountainous Areas of Zhejiang Province, College of Horticulture Science, Zhejiang A&F University, Hangzhou 311300, China; Key Laboratory of Quality and Safety Control for Subtropical Fruit and Vegetable, Ministry of Agriculture and Rural Affairs, Hangzhou 311300, China
Changbin Xu Collaborative Innovation Center for Efficient and Green Production of Agriculture in Mountainous Areas of Zhejiang Province, College of Horticulture Science, Zhejiang A&F University, Hangzhou 311300, China; Key Laboratory of Quality and Safety Control for Subtropical Fruit and Vegetable, Ministry of Agriculture and Rural Affairs, Hangzhou 311300, China
Tong Li Collaborative Innovation Center for Efficient and Green Production of Agriculture in Mountainous Areas of Zhejiang Province, College of Horticulture Science, Zhejiang A&F University, Hangzhou 311300, China; Key Laboratory of Quality and Safety Control for Subtropical Fruit and Vegetable, Ministry of Agriculture and Rural Affairs, Hangzhou 311300, China
Haijie Ma Collaborative Innovation Center for Efficient and Green Production of Agriculture in Mountainous Areas of Zhejiang Province, College of Horticulture Science, Zhejiang A&F University, Hangzhou 311300, China; Key Laboratory of Quality and Safety Control for Subtropical Fruit and Vegetable, Ministry of Agriculture and Rural Affairs, Hangzhou 311300, China
Jinli Gong Collaborative Innovation Center for Efficient and Green Production of Agriculture in Mountainous Areas of Zhejiang Province, College of Horticulture Science, Zhejiang A&F University, Hangzhou 311300, China; Key Laboratory of Quality and Safety Control for Subtropical Fruit and Vegetable, Ministry of Agriculture and Rural Affairs, Hangzhou 311300, China
Xiaolong Li Collaborative Innovation Center for Efficient and Green Production of Agriculture in Mountainous Areas of Zhejiang Province, College of Horticulture Science, Zhejiang A&F University, Hangzhou 311300, China; Key Laboratory of Quality and Safety Control for Subtropical Fruit and Vegetable, Ministry of Agriculture and Rural Affairs, Hangzhou 311300, China
Xuepeng Sun Collaborative Innovation Center for Efficient and Green Production of Agriculture in Mountainous Areas of Zhejiang Province, College of Horticulture Science, Zhejiang A&F University, Hangzhou 311300, China; Key Laboratory of Quality and Safety Control for Subtropical Fruit and Vegetable, Ministry of Agriculture and Rural Affairs, Hangzhou 311300, China
Xiaoli Hu Collaborative Innovation Center for Efficient and Green Production of Agriculture in Mountainous Areas of Zhejiang Province, College of Horticulture Science, Zhejiang A&F University, Hangzhou 311300, China; Key Laboratory of Quality and Safety Control for Subtropical Fruit and Vegetable, Ministry of Agriculture and Rural Affairs, Hangzhou 311300, China

Chalka A, Dallman TJ, Vohra P, Stevens MP, Gally DL. The advantage of intergenic regions as genomic features for machine-learning-based host attribution of Salmonella Typhimurium from the USA. Microb Genom 2023;9:001116. [PMID: 37843883 PMCID: PMC10634445 DOI: 10.1099/mgen.0.001116] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/08/2023] [Accepted: 10/02/2023] [Indexed: 10/17/2023] Open

Abstract

Salmonella enterica is a taxonomically diverse pathogen with over 2600 serovars associated with a wide variety of animal hosts including humans, other mammals, birds and reptiles. Some serovars are host-specific or host-restricted and cause disease in distinct host species, while others, such as serovar S. Typhimurium (STm), are generalists and have the potential to colonize a wide variety of species. However, even within generalist serovars such as STm it is becoming clear that pathovariants exist that differ in tropism and virulence. Identifying the genetic factors underlying host specificity is complex, but the availability of thousands of genome sequences and advances in machine learning have made it possible to build specific host prediction models to aid outbreak control and predict the human pathogenic potential of isolates from animals and other reservoirs. We have advanced this area by building host-association prediction models trained on a wide range of genomic features and compared them with predictions based on nearest-neighbour phylogeny. SNPs, protein variants (PVs), antimicrobial resistance (AMR) profiles and intergenic regions (IGRs) were extracted from 3883 high-quality STm assemblies collected from humans, swine, bovine and poultry in the USA, and used to construct Random Forest (RF) machine learning models. An additional 244 recent STm assemblies from farm animals were used as a test set for further validation. The models based on PVs and IGRs had the best performance in terms of predicting the host of origin of isolates and outperformed nearest-neighbour phylogenetic host prediction as well as models based on SNPs or AMR data. However, the models did not yield reliable predictions when tested with isolates that were phylogenetically distinct from the training set. The IGR and PV models were often able to differentiate human isolates in clusters where the majority of isolates were from a single animal source. Notably, IGRs were the feature with the best performance across multiple models which may be due to IGRs acting as both a representation of their flanking genes, equivalent to PVs, while also capturing genomic regulatory variation, such as altered promoter regions. The IGR and PV models predict that ~45 % of the human infections with STm in the USA originate from bovine, ~40 % from poultry and ~14.5 % from swine, although sequences of isolates from other sources were not used for training. In summary, the research demonstrates a significant gain in accuracy for models with IGRs and PVs as features compared to SNP-based and core genome phylogeny predictions when applied within the existing population structure. This article contains data hosted by Microreact.

Mizrahi L, Choudhary A, Ofer P, Goldberg G, Milanesi E, Kelsoe JR, Gurwitz D, Alda M, Gage FH, Stern S. Immunoglobulin genes expressed in lymphoblastoid cell lines discern and predict lithium response in bipolar disorder patients. Mol Psychiatry 2023;28:4280-4293. [PMID: 37488168 PMCID: PMC10827667 DOI: 10.1038/s41380-023-02183-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 09/13/2022] [Revised: 07/03/2023] [Accepted: 07/06/2023] [Indexed: 07/26/2023]

He M, Tang B, Xiao Y, Tang S. Transmission dynamics informed neural network with application to COVID-19 infections. Comput Biol Med 2023;165:107431. [PMID: 37696183 DOI: 10.1016/j.compbiomed.2023.107431] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/19/2023] [Revised: 07/26/2023] [Accepted: 08/28/2023] [Indexed: 09/13/2023]

Abstract

Since the end of 2019, COVID-19 has surged repeatedly, with most countries/territories experiencing multiple waves, and mechanism-based epidemic models have played important roles in understanding the transmission mechanism of multiple epidemic waves. However, capturing temporal changes in the transmissibility of COVID-19 during the multiple waves remains an ill-posed problem for traditional mechanism-based epidemic compartment models, because the transmission rate is usually assumed to be a specific piecewise function and more parameters are added to the model once multiple epidemic waves are involved, which poses a huge challenge to parameter estimation. Meanwhile, data-driven deep neural networks fail to discover the driving factors of repeated outbreaks and lack interpretability. In this study, aiming to develop a data-driven method that projects time-dependent parameters while retaining the advantages of mechanism-based models, we propose a transmission dynamics informed neural network (TDINN) by encoding the SEIRD compartment model into deep neural networks. We show that the proposed TDINN algorithm performs very well when fitting the COVID-19 epidemic data with multiple waves, where the epidemics in the United States, Italy, South Africa, and Kenya, and several outbreaks of the Omicron variant in China are taken as examples. In addition, the numerical simulation shows that the trained TDINN can also perform as a predictive model to capture the future development of the COVID-19 epidemic. We find that the transmission rate inferred by the TDINN frequently fluctuates, and a feedback loop between the epidemic shifting and the changes of transmissibility drives the occurrence of multiple waves. We observe a long response delay to the implementation of control interventions in the four countries, while the decline of the transmission rate in the outbreaks in China usually happens once control interventions are implemented. Further simulation shows that a 17-day delay in the response to the implementation of control interventions leads to a roughly four-fold increase in daily reported cases in one epidemic wave in Italy, which suggests that a rapid response to policies that strengthen control interventions can be effective in flattening the epidemic curve or avoiding subsequent epidemic waves. We observe that the transmission rate in the outbreaks in China is already decreasing before control interventions are enhanced, providing evidence that the growth of an epidemic can drive self-conscious behavioural changes to protect against infections.

Nazir A, Memon Z, Sadiq T, Rahman H, Khan IU. A Novel Feature-Selection Algorithm in IoT Networks for Intrusion Detection. Sensors (Basel) 2023;23:8153. [PMID: 37836983 PMCID: PMC10575335 DOI: 10.3390/s23198153] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/21/2023] [Revised: 09/19/2023] [Accepted: 09/25/2023] [Indexed: 10/15/2023]

Rehman S, Ahmad Z, Ramakrishnan M, Kalendar R, Zhuge Q. Regulation of plant epigenetic memory in response to cold and heat stress: towards climate resilient agriculture. Funct Integr Genomics 2023;23:298. [PMID: 37700098 DOI: 10.1007/s10142-023-01219-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/03/2023] [Revised: 08/18/2023] [Accepted: 08/23/2023] [Indexed: 09/14/2023]

Chang Q, Yan Z, Zhou M, Qu H, He X, Zhang H, Baskaran L, Al'Aref S, Li H, Zhang S, Metaxas DN. Mining multi-center heterogeneous medical data with distributed synthetic learning. Nat Commun 2023;14:5510. [PMID: 37679325 PMCID: PMC10484909 DOI: 10.1038/s41467-023-40687-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/30/2022] [Accepted: 08/03/2023] [Indexed: 09/09/2023] Open

Lakiotaki K, Papadovasilakis Z, Lagani V, Fafalios S, Charonyktakis P, Tsagris M, Tsamardinos I. Automated machine learning for genome wide association studies. Bioinformatics 2023;39:btad545. [PMID: 37672022 PMCID: PMC10562960 DOI: 10.1093/bioinformatics/btad545] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/23/2023] [Revised: 06/29/2023] [Accepted: 09/05/2023] [Indexed: 09/07/2023] Open

Bustos-Aibar M, Aguilera CM, Alcalá-Fdez J, Ruiz-Ojeda FJ, Plaza-Díaz J, Plaza-Florido A, Tofe I, Gil-Campos M, Gacto MJ, Anguita-Ruiz A. Shared gene expression signatures between visceral adipose and skeletal muscle tissues are associated with cardiometabolic traits in children with obesity. Comput Biol Med 2023;163:107085. [PMID: 37399741 DOI: 10.1016/j.compbiomed.2023.107085] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/25/2022] [Revised: 04/28/2023] [Accepted: 05/27/2023] [Indexed: 07/05/2023]

Abstract

Obesity in children is related to the development of cardiometabolic complications later in life, where molecular changes of visceral adipose tissue (VAT) and skeletal muscle tissue (SMT) have been proven to be fundamental. The aim of this study is to unveil the gene expression architecture of both tissues in a cohort of Spanish boys with obesity, using a clustering method known as weighted gene co-expression network analysis. For this purpose, we have followed a multi-objective analytic pipeline consisting of three main approaches: identification of gene co-expression clusters associated with childhood obesity, individually in VAT and SMT (intra-tissue, approach I); identification of gene co-expression clusters associated with obesity-metabolic alterations, individually in VAT and SMT (intra-tissue, approach II); and identification of gene co-expression clusters associated with obesity-metabolic alterations simultaneously in VAT and SMT (inter-tissue, approach III). In both tissues, we identified independent and inter-tissue gene co-expression signatures associated with obesity and cardiovascular risk, some of which exceeded multiple-test correction filters. In these signatures, we could identify some central hub genes (e.g., NDUFB8, GUCY1B1, KCNMA1, NPR2, PPP3CC) participating in relevant metabolic pathways exceeding multiple-testing correction filters. We identified the central hub genes PIK3R2, PPP3C and PTPN5 associated with MAPK signaling and insulin resistance terms. This is the first time that these genes have been associated with childhood obesity in both tissues. Therefore, they could be potential novel molecular targets for drugs and health interventions, opening new lines of research on personalized care in this pathology. This work generates interesting hypotheses about the transcriptomic alterations underlying metabolic health alterations in obesity in the pediatric population.

Affiliation(s)

Mireia Bustos-Aibar Department of Biochemistry and Molecular Biology II, School of Pharmacy, University of Granada, 18071, Granada, Spain.
Concepción M Aguilera Department of Biochemistry and Molecular Biology II, School of Pharmacy, University of Granada, 18071, Granada, Spain; Biomedical Research Networking Center for Physiopathology of Obesity and Nutrition, Carlos III Health Institute, 28029, Madrid, Spain.
Jesús Alcalá-Fdez Department of Computer Science and Artificial Intelligence, Andalusian Research Institute in Data Science and Computational Intelligence (DaSCI), University of Granada, 18071, Granada, Spain.
Francisco J Ruiz-Ojeda Department of Biochemistry and Molecular Biology II, School of Pharmacy, University of Granada, 18071, Granada, Spain; RG Adipocytes and Metabolism, Institute for Diabetes and Obesity, Helmholtz Diabetes Center at the Helmholtz Zentrum München, Neuherberg, 85764, Munich, Germany.
Julio Plaza-Díaz Department of Biochemistry and Molecular Biology II, School of Pharmacy, University of Granada, 18071, Granada, Spain; Children's Hospital of Eastern Ontario Research Institute, Ottawa, ON K1H 8L1, Ontario, Canada.
Abel Plaza-Florido PROmoting FITness and Health through physical activity research group, Sport and Health University Research Institute, Department of Physical Education and Sports, University of Granada, 18071, Granada, Spain; Pediatric Exercise and Genomics Research Center, Department of Pediatrics, School of Medicine, University of California at Irvine, Irvine, 92617, CA, United States.
Inés Tofe Biomedical Research Networking Center for Physiopathology of Obesity and Nutrition, Carlos III Health Institute, 28029, Madrid, Spain; University Clinical Hospital, Institute Maimónides of Biomedicine Investigation of Córdoba, University of Córdoba, 14004, Córdoba, Spain.
Mercedes Gil-Campos Biomedical Research Networking Center for Physiopathology of Obesity and Nutrition, Carlos III Health Institute, 28029, Madrid, Spain; University Clinical Hospital, Institute Maimónides of Biomedicine Investigation of Córdoba, University of Córdoba, 14004, Córdoba, Spain.
María J Gacto Department of Software Engineering, University of Granada, 18071, Granada, Spain.
Augusto Anguita-Ruiz Department of Biochemistry and Molecular Biology II, School of Pharmacy, University of Granada, 18071, Granada, Spain; Barcelona Institute for Global Health, ISGlobal, 08003, Barcelona, Spain.

Prusokiene A, Prusokas A, Retkute R. Machine learning based lineage tree reconstruction improved with knowledge of higher level relationships between cells and genomic barcodes. NAR Genom Bioinform 2023;5:lqad077. [PMID: 37608801 PMCID: PMC10440785 DOI: 10.1093/nargab/lqad077] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/24/2023] [Revised: 06/26/2023] [Accepted: 08/11/2023] [Indexed: 08/24/2023] Open

Belova T, Biondi N, Hsieh PH, Lutsik P, Chudasama P, Kuijjer M. Heterogeneity in the gene regulatory landscape of leiomyosarcoma. NAR Cancer 2023;5:zcad037. [PMID: 37492373 PMCID: PMC10365024 DOI: 10.1093/narcan/zcad037] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/09/2023] [Revised: 07/06/2023] [Accepted: 07/18/2023] [Indexed: 07/27/2023] Open

Aradhya S, Facio FM, Metz H, Manders T, Colavin A, Kobayashi Y, Nykamp K, Johnson B, Nussbaum RL. Applications of artificial intelligence in clinical laboratory genomics. Am J Med Genet C Semin Med Genet 2023;193:e32057. [PMID: 37507620 DOI: 10.1002/ajmg.c.32057] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/15/2023] [Revised: 07/13/2023] [Accepted: 07/19/2023] [Indexed: 07/30/2023]

Zhuravleva SI, Zadorozhny AD, Shilov BV, Lagunin AA. Prediction of Amino Acid Substitutions in ABL1 Protein Leading to Tumor Drug Resistance Based on "Structure-Property" Relationship Classification Models. Life (Basel) 2023;13:1807. [PMID: 37763211 PMCID: PMC10532460 DOI: 10.3390/life13091807] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/21/2023] [Revised: 08/15/2023] [Accepted: 08/21/2023] [Indexed: 09/29/2023] Open

Nguyen AH, Wang Z. Time-Distributed Framework for 3D Reconstruction Integrating Fringe Projection with Deep Learning. Sensors (Basel) 2023;23:7284. [PMID: 37631820 PMCID: PMC10458373 DOI: 10.3390/s23167284] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/21/2023] [Revised: 08/07/2023] [Accepted: 08/18/2023] [Indexed: 08/27/2023]

Liu C, Wu F, Jiang X, Hu Y, Shao K, Tang X, Qin B, Gao G. Climate Change Causes Salinity To Become Determinant in Shaping the Microeukaryotic Spatial Distribution among the Lakes of the Inner Mongolia-Xinjiang Plateau. Microbiol Spectr 2023;11:e0317822. [PMID: 37306569 PMCID: PMC10434070 DOI: 10.1128/spectrum.03178-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/12/2022] [Accepted: 05/06/2023] [Indexed: 06/13/2023] Open

Abstract

Climate change greatly affects lake microorganisms in arid and semiarid zones, which alters ecosystem functions and the ecological security of lakes. However, the responses of lake microorganisms, especially microeukaryotes, to climate change are poorly understood. Here, using 18S ribosomal RNA (rRNA) high-throughput sequencing, we investigated the distribution patterns of microeukaryotic communities and whether and how climate change directly or indirectly affected the microeukaryotic communities on the Inner Mongolia-Xinjiang Plateau. Our results showed that climate change, as the main driving force of lake change, drives salinity to become a determinant of the microeukaryotic community among the lakes of the Inner Mongolia-Xinjiang Plateau. Salinity shapes the diversity and trophic level of the microeukaryotic community and further affects lake carbon cycling. Co-occurrence network analysis further revealed that increasing salinity reduced the complexity but improved the stability of microeukaryotic communities and changed ecological relationships. Meanwhile, increasing salinity enhanced the importance of deterministic processes in microeukaryotic community assembly, and the dominance of stochastic processes in freshwater lakes transformed into deterministic processes in salt lakes. Furthermore, we established lake biomonitoring and climate sentinel models by integrating microeukaryotic information, which would provide substantial improvements to our predictive ability of lake responses to climate change. IMPORTANCE Our findings have important implications for understanding the distribution patterns and the driving mechanisms of microeukaryotic communities among the lakes of the Inner Mongolia-Xinjiang Plateau and whether and how climate change directly or indirectly affects microeukaryotic communities. Our study also establishes the groundwork to use the lake microbiome for the assessment of aquatic ecological health and climate change, which is critical for ecosystem management and for projecting the ecological consequences of future climate warming.

Kabir M, Stuart HM, Lopes FM, Fotiou E, Keavney B, Doig AJ, Woolf AS, Hentges KE. Predicting congenital renal tract malformation genes using machine learning. Sci Rep 2023;13:13204. [PMID: 37580336 PMCID: PMC10425350 DOI: 10.1038/s41598-023-38110-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/03/2023] [Accepted: 07/03/2023] [Indexed: 08/16/2023] Open

Affiliation(s)

Mitra Kabir Division of Evolution, Infection and Genomics, Faculty of Biology, Medicine and Health, Manchester Academic Health Science Centre, The University of Manchester, Oxford Road, Manchester, M13 9PT, UK
Helen M Stuart Division of Evolution, Infection and Genomics, Faculty of Biology, Medicine and Health, Manchester Academic Health Science Centre, The University of Manchester, Oxford Road, Manchester, M13 9PT, UK; Manchester Centre for Genomic Medicine, St. Mary's Hospital, Health Innovation Manchester, Manchester University Foundation NHS Trust, Manchester, M13 9WL, UK
Filipa M Lopes Division of Cell Matrix Biology and Regenerative Medicine, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, M13 9PL, UK
Elisavet Fotiou Division of Cardiovascular Sciences, School of Medical Sciences, Faculty of Biology, Medicine, and Health, The University of Manchester, Manchester, M13 9PL, UK; C.B.B Lifeline Biotech Ltd, 5 Propontidos Street, Strovolos, 2033, Nicosia, Cyprus
Bernard Keavney Division of Cardiovascular Sciences, School of Medical Sciences, Faculty of Biology, Medicine, and Health, The University of Manchester, Manchester, M13 9PL, UK; Manchester Heart Institute, Manchester University NHS Foundation Trust, Manchester Academic Health Science Centre, Manchester, M13 9WL, UK
Andrew J Doig Division of Neuroscience, School of Biological Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Stopford Building, Manchester, M13 9BL, UK
Adrian S Woolf Division of Cell Matrix Biology and Regenerative Medicine, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, M13 9PL, UK; Department of Nephrology, Royal Manchester Children's Hospital, Manchester Academic Health Science Centre, Manchester, M13 9WL, UK
Kathryn E Hentges Division of Evolution, Infection and Genomics, Faculty of Biology, Medicine and Health, Manchester Academic Health Science Centre, The University of Manchester, Oxford Road, Manchester, M13 9PT, UK.

Komuro J, Kusumoto D, Hashimoto H, Yuasa S. Machine learning in cardiology: Clinical application and basic research. J Cardiol 2023;82:128-133. [PMID: 37141938 DOI: 10.1016/j.jjcc.2023.04.020] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 02/08/2023] [Revised: 04/23/2023] [Accepted: 04/28/2023] [Indexed: 05/06/2023]

Ong J, Waisberg E, Masalkhi M, Kamran SA, Lowry K, Sarker P, Zaman N, Paladugu P, Tavakkoli A, Lee AG. Artificial Intelligence Frameworks to Detect and Investigate the Pathophysiology of Spaceflight Associated Neuro-Ocular Syndrome (SANS). Brain Sci 2023;13:1148. [PMID: 37626504 PMCID: PMC10452366 DOI: 10.3390/brainsci13081148] [Citation(s) in RCA: 17] [Impact Index Per Article: 17.0] [Received: 06/21/2023] [Revised: 07/24/2023] [Accepted: 07/28/2023] [Indexed: 08/27/2023] Open

McDonnell KJ. Leveraging the Academic Artificial Intelligence Silecosystem to Advance the Community Oncology Enterprise. J Clin Med 2023;12:4830. [PMID: 37510945 PMCID: PMC10381436 DOI: 10.3390/jcm12144830] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2023] [Revised: 07/05/2023] [Accepted: 07/07/2023] [Indexed: 07/30/2023] Open

Su YY, Liu YL, Huang HC, Lin CC. Ensemble learning model for identifying the hallmark genes of NFκB/TNF signaling pathway in cancers. J Transl Med 2023;21:485. [PMID: 37475016 PMCID: PMC10357720 DOI: 10.1186/s12967-023-04355-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2023] [Accepted: 07/13/2023] [Indexed: 07/22/2023] Open

Abstract

BACKGROUND

The nuclear factor kappa B (NFκB) regulatory pathways downstream of tumor necrosis factor (TNF) play a critical role in carcinogenesis. However, the widespread influence of NFκB in cells can result in off-target effects, making it a challenging therapeutic target. Ensemble learning is a machine learning technique where multiple models are combined to improve the performance and robustness of the prediction. Accordingly, an ensemble learning model could uncover more precise targets within the NFκB/TNF signaling pathway for cancer therapy.

METHODS

In this study, we trained an ensemble learning model on the transcriptome profiles from 16 cancer types in the TCGA database to identify a robust set of genes that are consistently associated with the NFκB/TNF pathway in cancer. Our model uses cancer patients as features to predict the genes involved in the NFκB/TNF signaling pathway and can be adapted to predict the genes for different cancer types by switching the cancer type of patients. We also performed functional analysis, survival analysis, and a case study of triple-negative breast cancer to demonstrate our model's potential in translational cancer medicine.

RESULTS

Our model accurately identified genes regulated by NFκB in response to TNF in cancer patients. The downstream analysis showed that the identified genes are typically involved in the canonical NFκB-regulated pathways, particularly in adaptive immunity, anti-apoptosis, and cellular response to cytokine stimuli. These genes were found to have oncogenic properties and detrimental effects on patient survival. Our model also could distinguish patients with a specific cancer subtype, triple-negative breast cancer (TNBC), which is known to be influenced by NFκB-regulated pathways downstream of TNF. Furthermore, a functional module known as mononuclear cell differentiation was identified that accurately predicts TNBC patients and poor short-term survival in non-TNBC patients, providing a potential avenue for developing precision medicine for cancer subtypes.

CONCLUSIONS

In conclusion, our approach enables the discovery of genes in NFκB-regulated pathways in response to TNF and their relevance to carcinogenesis. We successfully categorized these genes into functional groups, providing valuable insights for discovering more precise and targeted cancer therapeutics.

100

Chen H, Liu Y, Balabani S, Hirayama R, Huang J. Machine Learning in Predicting Printable Biomaterial Formulations for Direct Ink Writing. RESEARCH (WASHINGTON, D.C.) 2023;6:0197. [PMID: 37469394 PMCID: PMC10353544 DOI: 10.34133/research.0197] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 04/06/2023] [Accepted: 06/29/2023] [Indexed: 07/21/2023]