Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Li Y, Wu FX, Ngom A. A review on machine learning principles for multi-view biological data integration. Brief Bioinform 2019;19:325-340. [PMID: 28011753 DOI: 10.1093/bib/bbw113] [Citation(s) in RCA: 126] [Impact Index Per Article: 25.2] [Received: 07/26/2016] [Indexed: 01/08/2023] Open

Cited by Other Article(s)

Zhao YX, Yu CQ, Li LP, Wang DW, Song HF, Wei Y. BJLD-CMI: a predictive circRNA-miRNA interactions model combining multi-angle feature information. Front Genet 2024;15:1399810. [PMID: 38798699 PMCID: PMC11116695 DOI: 10.3389/fgene.2024.1399810] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2024] [Accepted: 04/03/2024] [Indexed: 05/29/2024] Open

Cho H, She J, De Marchi D, El-Zaatari H, Barnes EL, Kahkoska AR, Kosorok MR, Virkud AV. Machine Learning and Health Science Research: Tutorial. J Med Internet Res 2024;26:e50890. [PMID: 38289657 PMCID: PMC10865203 DOI: 10.2196/50890] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2023] [Revised: 11/30/2023] [Accepted: 12/21/2023] [Indexed: 02/01/2024] Open

Wieder C, Cooke J, Frainay C, Poupin N, Bowler R, Jourdan F, Kechris KJ, Lai RP, Ebbels T. PathIntegrate: Multivariate modelling approaches for pathway-based multi-omics data integration. bioRxiv 2024:2024.01.09.574780. [PMID: 38260498 PMCID: PMC10802464 DOI: 10.1101/2024.01.09.574780] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/24/2024]

Rahman A, Debnath T, Kundu D, Khan MSI, Aishi AA, Sazzad S, Sayduzzaman M, Band SS. Machine learning and deep learning-based approach in smart healthcare: Recent advances, applications, challenges and opportunities. AIMS Public Health 2024;11:58-109. [PMID: 38617415 PMCID: PMC11007421 DOI: 10.3934/publichealth.2024004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/19/2023] [Accepted: 12/18/2023] [Indexed: 04/16/2024] Open

Yue T, Wang Y, Zhang L, Gu C, Xue H, Wang W, Lyu Q, Dun Y. Deep Learning for Genomics: From Early Neural Nets to Modern Large Language Models. Int J Mol Sci 2023;24:15858. [PMID: 37958843 PMCID: PMC10649223 DOI: 10.3390/ijms242115858] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2023] [Revised: 10/24/2023] [Accepted: 10/30/2023] [Indexed: 11/15/2023] Open

Chen X, Feng B, Xu K, Chen Y, Duan X, Jin Z, Li K, Li R, Long W, Liu X. Development and validation of a deep learning radiomics nomogram for preoperatively differentiating thymic epithelial tumor histologic subtypes. Eur Radiol 2023;33:6804-6816. [PMID: 37148352 DOI: 10.1007/s00330-023-09690-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/03/2022] [Revised: 02/20/2023] [Accepted: 02/27/2023] [Indexed: 05/08/2023]

Abstract

OBJECTIVES

To use contrast-enhanced computed tomography (CECT) and deep learning technology to develop a deep learning radiomics nomogram (DLRN) that preoperatively predicts the risk status of patients with thymic epithelial tumors (TETs).

METHODS

Between October 2008 and May 2020, 257 consecutive patients with surgically and pathologically confirmed TETs were enrolled from three medical centers. We extracted deep learning features from all lesions using a transformer-based convolutional neural network and created a deep learning signature (DLS) using least absolute shrinkage and selection operator (LASSO) regression. The predictive capability of a DLRN incorporating clinical characteristics, subjective CT findings and DLS was evaluated by the area under the curve (AUC) of a receiver operating characteristic curve.

RESULTS

To construct a DLS, 25 deep learning features with non-zero coefficients were selected from 116 low-risk TETs (subtypes A, AB, and B1) and 141 high-risk TETs (subtypes B2, B3, and C). The combination of subjective CT features such as infiltration and DLS demonstrated the best performance in differentiating TET risk status. The AUCs in the training, internal validation, external validation 1 and 2 cohorts were 0.959 (95% confidence interval [CI]: 0.924-0.993), 0.868 (95% CI: 0.765-0.970), 0.846 (95% CI: 0.750-0.942), and 0.846 (95% CI: 0.735-0.957), respectively. The DeLong test and decision curve analysis revealed that the DLRN was the most predictive and clinically useful model.

CONCLUSIONS

The DLRN, comprising the CECT-derived DLS and subjective CT findings, showed high performance in predicting the risk status of patients with TETs.

CLINICAL RELEVANCE STATEMENT

Accurate risk status assessment of thymic epithelial tumors (TETs) may aid in determining whether preoperative neoadjuvant treatment is necessary. A deep learning radiomics nomogram incorporating contrast-enhanced CT-based deep learning features, clinical characteristics, and subjective CT findings has the potential to predict the histologic subtypes of TETs, which can facilitate decision-making and personalized therapy in clinical practice.

KEY POINTS

• A non-invasive diagnostic method that can predict the pathological risk status may be useful for pretreatment stratification and prognostic evaluation in TET patients.
• DLRN demonstrated superior performance in differentiating the risk status of TETs when compared to the deep learning signature, radiomics signature, or clinical model.
• The DeLong test and decision curve analysis revealed that the DLRN was the most predictive and clinically useful in differentiating the risk status of TETs.
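
The abstract above reports that features with non-zero coefficients were retained to build the signature. As a rough illustration of why LASSO-style regression performs this kind of selection, the following is a minimal sketch and not the authors' pipeline: it assumes centred data, no intercept, and a hand-written cyclic coordinate descent solver. The function names, the toy matrix `X`, the response `y`, and the penalty value are all hypothetical.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; anything within [-penalty, penalty] becomes 0.0."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(X, y, penalty, sweeps=100):
    """Minimise (1/2n) * ||y - Xb||^2 + penalty * ||b||_1 by cyclic coordinate descent.

    X is a list of rows. No intercept is fitted, so the data should be centred first.
    """
    n, p = len(X), len(X[0])
    coef = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            rho = 0.0   # correlation of feature j with the residual that ignores feature j
            norm = 0.0  # squared norm of feature j
            for i in range(n):
                partial = sum(X[i][k] * coef[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
                norm += X[i][j] ** 2
            if norm == 0.0:
                coef[j] = 0.0  # an all-zero feature carries no information
            else:
                coef[j] = soft_threshold(rho / n, penalty) / (norm / n)
    return coef


# Hypothetical toy data: y = 2.0 * feature_0 + 0.1 * feature_1, with orthogonal features.
X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y = [2.1, -1.9, 1.9, -2.1]
coef = lasso_coordinate_descent(X, y, penalty=0.5)
selected = [j for j, b in enumerate(coef) if b != 0.0]
print(coef, selected)
```

With this toy data the weakly informative second feature is shrunk exactly to zero and only the first feature is kept, which is the behaviour that lets an L1 penalty act as a feature selector.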

Chafai N, Hayah I, Houaga I, Badaoui B. A review of machine learning models applied to genomic prediction in animal breeding. Front Genet 2023;14:1150596. [PMID: 37745853 PMCID: PMC10516561 DOI: 10.3389/fgene.2023.1150596] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2023] [Accepted: 08/22/2023] [Indexed: 09/26/2023] Open

Abstract

The advent of modern genotyping technologies has revolutionized genomic selection in animal breeding. Large marker datasets have shown several drawbacks for traditional genomic prediction methods in terms of flexibility, accuracy, and computational power. Recently, the application of machine learning models in animal breeding has gained a lot of interest due to their tremendous flexibility and their ability to capture patterns in large noisy datasets. Here, we present a general overview of a handful of machine learning algorithms and their application in genomic prediction to provide a meta-picture of their performance in genomic estimated breeding values estimation, genotype imputation, and feature selection. Finally, we discuss a potential adoption of machine learning models in genomic prediction in developing countries. The results of the reviewed studies showed that machine learning models have indeed performed well in fitting large noisy data sets and modeling minor nonadditive effects in some of the studies. However, sometimes conventional methods outperformed machine learning models, which confirms that there's no universal method for genomic prediction. In summary, machine learning models have great potential for extracting patterns from single nucleotide polymorphism datasets. Nonetheless, the level of their adoption in animal breeding is still low due to data limitations, complex genetic interactions, a lack of standardization and reproducibility, and the lack of interpretability of machine learning models when trained with biological data. Consequently, there is no remarkable outperformance of machine learning methods compared to traditional methods in genomic prediction. Therefore, more research should be conducted to discover new insights that could enhance livestock breeding programs.

Morabito F, Adornetto C, Monti P, Amaro A, Reggiani F, Colombo M, Rodriguez-Aldana Y, Tripepi G, D’Arrigo G, Vener C, Torricelli F, Rossi T, Neri A, Ferrarini M, Cutrona G, Gentile M, Greco G. Genes selection using deep learning and explainable artificial intelligence for chronic lymphocytic leukemia predicting the need and time to therapy. Front Oncol 2023;13:1198992. [PMID: 37719021 PMCID: PMC10501728 DOI: 10.3389/fonc.2023.1198992] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/02/2023] [Accepted: 07/31/2023] [Indexed: 09/19/2023] Open

Abstract

Analyzing gene expression profiles (GEP) through artificial intelligence provides meaningful insight into cancer disease. This study introduces DeepSHAP Autoencoder Filter for Genes Selection (DSAF-GS), a novel deep learning and explainable artificial intelligence-based approach for feature selection in genomics-scale data. DSAF-GS exploits the autoencoder's reconstruction capabilities without changing the original feature space, enhancing the interpretation of the results. Explainable artificial intelligence is then used to select the informative genes for chronic lymphocytic leukemia prognosis of 217 cases from a GEP database comprising roughly 20,000 genes. The model for prognosis prediction achieved an accuracy of 86.4%, a sensitivity of 85.0%, and a specificity of 87.5%. According to the proposed approach, predictions were strongly influenced by CEACAM19 and PIGP, moderately influenced by MKL1 and GNE, and poorly influenced by other genes. The 10 most influential genes were selected for further analysis. Among them, FADD, FIBP, GNE, IGF1R, MKL1, PIGP, and SLC39A6 were identified in the Reactome pathway database as involved in signal transduction, transcription, protein metabolism, immune system, cell cycle, and apoptosis. Moreover, according to the network model of the 3D protein-protein interaction (PPI) explored using the NetworkAnalyst tool, FADD, FIBP, IGF1R, QTRT1, GNE, SLC39A6, and MKL1 appear coupled into a complex network. Finally, all 10 selected genes showed a predictive power on time to first treatment (TTFT) in univariate analyses on a basic prognostic model including IGHV mutational status, del(11q) and del(17p), NOTCH1 mutations, β2-microglobulin, Rai stage, and B-lymphocytosis known to predict TTFT in CLL.
However, only the IGF1R [hazard ratio (HR) 1.41, 95% CI 1.08-1.84, P=0.013], COL28A1 (HR 0.32, 95% CI 0.10-0.97, P=0.045), and QTRT1 (HR 7.73, 95% CI 2.48-24.04, P<0.001) genes were significantly associated with TTFT in multivariable analyses when combined with the prognostic factors of the basic model, ultimately increasing the Harrell's c-index and the explained variation to 78.6% (versus 76.5% of the basic prognostic model) and 52.6% (versus 42.2% of the basic prognostic model), respectively. Also, the goodness of model fit was enhanced (χ² = 20.1, P=0.002), indicating its improved performance over the basic prognostic model. In conclusion, DSAF-GS identified a group of significant genes for CLL prognosis, suggesting future directions for bio-molecular research.

Affiliation(s)

Fortunato Morabito Biotechnology Research Unit, ‘A. Sforza’ Foundation, Cosenza, Italy
Carlo Adornetto Department of Mathematics and Computer Science, University of Calabria, Cosenza, Italy
Paola Monti Mutagenesis and Cancer Prevention Unit, Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS) Ospedale Policlinico San Martino, Genoa, Italy
Adriana Amaro Tumor Epigenetics Unit, Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS) Ospedale Policlinico San Martino, Genoa, Italy
Francesco Reggiani Tumor Epigenetics Unit, Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS) Ospedale Policlinico San Martino, Genoa, Italy
Monica Colombo Molecular Pathology Unit, Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS) Ospedale Policlinico San Martino, Genoa, Italy
Yissel Rodriguez-Aldana Department of Mathematics and Computer Science, University of Calabria, Cosenza, Italy
Giovanni Tripepi Consiglio Nazionale delle Ricerche, Istituto di Fisiologia Clinica del Consiglio Nazionale delle Ricerche (CNR), Reggio Calabria, Italy
Graziella D’Arrigo Consiglio Nazionale delle Ricerche, Istituto di Fisiologia Clinica del Consiglio Nazionale delle Ricerche (CNR), Reggio Calabria, Italy
Claudia Vener Department of Oncology and Hemato-Oncology, University of Milan, Milan, Italy
Federica Torricelli Laboratory of Translational Research, Azienda Unità Sanitaria Locale - Istituto di Ricovero e Cura a Carattere Scientifico (USL-IRCCS) of Reggio Emilia, Reggio Emilia, Italy
Teresa Rossi Laboratory of Translational Research, Azienda Unità Sanitaria Locale - Istituto di Ricovero e Cura a Carattere Scientifico (USL-IRCCS) of Reggio Emilia, Reggio Emilia, Italy
Antonino Neri Scientific Directorate, Azienda Unità Sanitaria Locale - Istituto di Ricovero e Cura a Carattere Scientifico (USL-IRCCS) of Reggio Emilia, Reggio Emilia, Italy
Manlio Ferrarini Unità Operativa (UO) Molecular Pathology, Ospedale Policlinico San Martino Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS), Genoa, Italy
Giovanna Cutrona Molecular Pathology Unit, Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS) Ospedale Policlinico San Martino, Genoa, Italy
Massimo Gentile Hematology Unit, Department of Onco-Hematology, Azienda Ospedaliera (A.O.) of Cosenza, Cosenza, Italy; Department of Pharmacy and Health and Nutritional Sciences, University of Calabria, Cosenza, Italy
Gianluigi Greco Department of Mathematics and Computer Science, University of Calabria, Cosenza, Italy

Curti PDF, Selli A, Pinto DL, Merlos-Ruiz A, Balieiro JCDC, Ventura RV. Applications of livestock monitoring devices and machine learning algorithms in animal production and reproduction: an overview. Anim Reprod 2023;20:e20230077. [PMID: 37700909 PMCID: PMC10494883 DOI: 10.1590/1984-3143-ar2023-0077] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2023] [Accepted: 07/10/2023] [Indexed: 09/14/2023] Open

Abstract

Some sectors of animal production and reproduction have shown great technological advances due to the development of research areas such as Precision Livestock Farming (PLF). PLF is an innovative approach that allows animals to be monitored, through the adoption of cutting-edge technologies that continuously collect real-time data by combining the use of sensors with advanced algorithms to provide decision tools for farmers. Artificial Intelligence (AI) is a field that merges computer science and large datasets to create expert systems that are able to generate predictions and classifications similarly to human intelligence. In a simplified manner, Machine Learning (ML) is a branch of AI, and can be considered as a broader field that encompasses Deep Learning (DL, a Neural Network formed by at least three layers), generating a hierarchy of subsets formed by AI, ML and DL, respectively. Both ML and DL provide innovative methods for analyzing data, especially beneficial for large datasets commonly found in livestock-related activities. These approaches enable the extraction of valuable insights to address issues related to behavior, health, reproduction, production, and the environment, facilitating informed decision-making. In order to create the referred technologies, studies generally go through five steps involving data processing: acquisition, transferring, storage, analysis and delivery of results. Although the data collection and analysis steps are usually thoroughly reported by the scientific community, a good execution of each step is essential to achieve good and credible results, which impacts the degree of acceptance of the proposed technologies in real life practical circumstances. 
In this context, the present work aims to provide an overview of the current implementations of ML/DL in livestock reproduction and production, as well as to identify potential challenges and critical points in each of the five steps mentioned, which can affect results and application of AI techniques by farmers in practical situations.

Gupta NS, Kumar P. Perspective of artificial intelligence in healthcare data management: A journey towards precision medicine. Comput Biol Med 2023;162:107051. [PMID: 37271113 DOI: 10.1016/j.compbiomed.2023.107051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/11/2023] [Revised: 05/06/2023] [Accepted: 05/20/2023] [Indexed: 06/06/2023]

Kuzudisli C, Bakir-Gungor B, Bulut N, Qaqish B, Yousef M. Review of feature selection approaches based on grouping of features. PeerJ 2023;11:e15666. [PMID: 37483989 PMCID: PMC10358338 DOI: 10.7717/peerj.15666] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2022] [Accepted: 06/08/2023] [Indexed: 07/25/2023] Open

Li C, Dubbelaar ML, Zhang X, Zheng JC. Editorial: Understanding the heterogeneity and spatial brain environment of neurodegenerative diseases through conventional and future methods. Front Cell Neurosci 2023;17:1211273. [PMID: 37287510 PMCID: PMC10242171 DOI: 10.3389/fncel.2023.1211273] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/24/2023] [Accepted: 05/02/2023] [Indexed: 06/09/2023] Open

Hauptmann T, Kramer S. A fair experimental comparison of neural network architectures for latent representations of multi-omics for drug response prediction. BMC Bioinformatics 2023;24:45. [PMID: 36788531 PMCID: PMC9926634 DOI: 10.1186/s12859-023-05166-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2022] [Accepted: 01/31/2023] [Indexed: 02/16/2023] Open

Abstract

BACKGROUND

Recent years have seen a surge of novel neural network architectures for the integration of multi-omics data for prediction. Most of the architectures include either encoders alone or encoders and decoders, i.e., autoencoders of various sorts, to transform multi-omics data into latent representations. One important parameter is the depth of integration: the point at which the latent representations are computed or merged, which can be either early, intermediate, or late. The literature on integration methods is growing steadily; however, close to nothing is known about the relative performance of these methods under fair experimental conditions and under consideration of different use cases.

RESULTS

We developed a comparison framework that trains and optimizes multi-omics integration methods under equal conditions. We incorporated early integration, PCA and four recently published deep learning methods: MOLI, Super.FELT, OmiEmbed, and MOMA. Further, we devised a novel method, Omics Stacking, that combines the advantages of intermediate and late integration. Experiments were conducted on a public drug response data set with multiple omics data (somatic point mutations, somatic copy number profiles and gene expression profiles) that was obtained from cell lines, patient-derived xenografts, and patient samples. Our experiments confirmed that early integration has the lowest predictive performance. Overall, architectures that integrate triplet loss achieved the best results. Statistically significant differences were rarely observed overall; however, in terms of the average ranks of methods, Super.FELT consistently performed best in a cross-validation setting and Omics Stacking performed best in an external test set setting.

CONCLUSIONS

We recommend that researchers follow fair comparison protocols, as suggested in the paper. When faced with a new data set, Super.FELT is a good option in the cross-validation setting, as is Omics Stacking in the external test set setting. Statistically significant differences are hardly observable, despite trends in the algorithms' rankings. Future work on refined methods for transfer learning tailored for this domain may improve the situation for external test sets. The source code of all experiments is available under https://github.com/kramerlab/Multi-Omics_analysis.
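
The abstract above distinguishes early from late integration by the point at which the views are merged. The difference can be sketched in a few lines. This is an illustrative toy only: the nearest-centroid classifier and the majority vote are stand-ins chosen for brevity, not any of the methods compared in the paper, intermediate integration is not shown, and every name and data value below is hypothetical.

```python
from collections import Counter


def centroid(rows):
    """Component-wise mean of a list of equal-length feature rows."""
    return [sum(column) / len(rows) for column in zip(*rows)]


def fit_centroids(rows, labels):
    """Return {label: centroid} for a nearest-centroid classifier."""
    return {
        label: centroid([r for r, l in zip(rows, labels) if l == label])
        for label in sorted(set(labels))
    }


def nearest(centroids, row):
    """Label whose centroid has the smallest squared Euclidean distance to row."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(center, row))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def predict_early(views, labels, new_views):
    """Early integration: concatenate all views per sample, then fit one model."""
    rows = [sum(parts, []) for parts in zip(*views)]
    return nearest(fit_centroids(rows, labels), sum(new_views, []))


def predict_late(views, labels, new_views):
    """Late integration: fit one model per view, then take a majority vote."""
    votes = [
        nearest(fit_centroids(rows, labels), new_row)
        for rows, new_row in zip(views, new_views)
    ]
    # On a tie, Counter keeps the label that was voted for first.
    return Counter(votes).most_common(1)[0][0]


# Hypothetical toy data: three views measured on the same four samples.
expression = [[10.0, 12.0], [11.0, 11.0], [30.0, 31.0], [32.0, 29.0]]
copy_number = [[2.0], [2.0], [4.0], [4.0]]
mutation = [[0.0], [0.0], [1.0], [1.0]]
labels = ["resistant", "resistant", "sensitive", "sensitive"]

views = [expression, copy_number, mutation]
new_sample = [[12.0, 12.0], [3.8], [1.0]]  # one row per view

early = predict_early(views, labels, new_sample)
late = predict_late(views, labels, new_sample)
print(early, late)
```

In this toy case the unscaled expression view dominates the concatenated distance, so the early model follows it, while the late model lets the other two views outvote it. The example shows only how the two strategies differ mechanically; it says nothing about which performs better on real data.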

Flores JE, Claborne DM, Weller ZD, Webb-Robertson BJM, Waters KM, Bramer LM. Missing data in multi-omics integration: Recent advances through artificial intelligence. Front Artif Intell 2023;6:1098308. [PMID: 36844425 PMCID: PMC9949722 DOI: 10.3389/frai.2023.1098308] [Citation(s) in RCA: 16] [Impact Index Per Article: 16.0] [Received: 11/14/2022] [Accepted: 01/23/2023] [Indexed: 02/11/2023] Open

Katta MR, Kalluru PKR, Bavishi DA, Hameed M, Valisekka SS. Artificial intelligence in pancreatic cancer: diagnosis, limitations, and the future prospects-a narrative review. J Cancer Res Clin Oncol 2023:10.1007/s00432-023-04625-1. [PMID: 36739356 DOI: 10.1007/s00432-023-04625-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2022] [Accepted: 01/27/2023] [Indexed: 02/06/2023]

Lorefice L, Pitzalis M, Murgia F, Fenu G, Atzori L, Cocco E. Omics approaches to understanding the efficacy and safety of disease-modifying treatments in multiple sclerosis. Front Genet 2023;14:1076421. [PMID: 36793897 PMCID: PMC9922720 DOI: 10.3389/fgene.2023.1076421] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 10/21/2022] [Accepted: 01/09/2023] [Indexed: 02/03/2023] Open

Devarajan AK, Truu M, Gopalasubramaniam SK, Muthukrishanan G, Truu J. Application of data integration for rice bacterial strain selection by combining their osmotic stress response and plant growth-promoting traits. Front Microbiol 2022;13:1058772. [PMID: 36590400 PMCID: PMC9797599 DOI: 10.3389/fmicb.2022.1058772] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2022] [Accepted: 11/29/2022] [Indexed: 12/23/2022] Open

Abstract

Agricultural application of plant-beneficial bacteria to improve crop yield and alleviate the stress caused by environmental conditions, pests, and pathogens is gaining popularity. However, before using these bacterial strains in plant experiments, their environmental stress responses and plant health improvement potential should be examined. In this study, we explored the applicability of three unsupervised machine learning-based data integration methods, including principal component analysis (PCA) of concatenated data, multiple co-inertia analysis (MCIA), and multiple kernel learning (MKL), to select osmotic stress-tolerant plant growth-promoting (PGP) bacterial strains isolated from the rice phyllosphere. The studied datasets consisted of direct and indirect PGP activity measurements and osmotic stress responses of eight bacterial strains previously isolated from the phyllosphere of a drought-tolerant rice cultivar. The production of phytohormones, such as indole-acetic acid (IAA), gibberellic acid (GA), abscisic acid (ABA), and cytokinin, was used to measure direct PGP traits, whereas the production of hydrogen cyanide and siderophore and antagonistic activity against the foliar pathogens Pyricularia oryzae and Helminthosporium oryzae were evaluated as measures of indirect PGP activity. The strains were subjected to a range of osmotic stress levels by adding PEG 6000 (0, 11, 21, and 32.6%) to their growth medium. The results of the osmotic stress response experiments showed that all bacterial strains accumulated endogenous proline and glycine betaine (GB) and exhibited an increase in growth, when osmotic stress levels were increased to a specific degree, while the production of IAA and GA considerably decreased. The three applied data integration methods did not provide a similar grouping of the strains. Especially deviant was the ordination of microbial strains based on the PCA of concatenated data.
However, all three data integration methods indicated that the strains Bacillus altitudinis PB46 and B. megaterium PB50 shared high similarity in PGP traits and osmotic stress response. Overall, our results indicate that data integration methods complement the single-table data analysis approach and improve the selection process for PGP microbial strains.

Taguchi YH, Turki T. A tensor decomposition-based integrated analysis applicable to multiple gene expression profiles without sample matching. Sci Rep 2022;12:21242. [PMID: 36481877 PMCID: PMC9732005 DOI: 10.1038/s41598-022-25524-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2021] [Accepted: 11/30/2022] [Indexed: 12/13/2022] Open

Zhang Y, Deng Y, Zhou Z, Zhang X, Jiao P, Zhao Z. Multimodal learning for fetal distress diagnosis using a multimodal medical information fusion framework. Front Physiol 2022;13:1021400. [DOI: 10.3389/fphys.2022.1021400] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2022] [Accepted: 10/25/2022] [Indexed: 11/09/2022] Open

Abstract

Cardiotocography (CTG) monitoring is an important medical diagnostic tool for fetal well-being evaluation in late pregnancy. In this regard, intelligent CTG classification based on Fetal Heart Rate (FHR) signals is a challenging research area that can assist obstetricians in making clinical decisions, thereby improving the efficiency and accuracy of pregnancy management. Most existing methods focus on one specific modality, that is, they only detect one type of modality and inevitably have limitations such as incomplete or redundant source domain feature extraction, and poor repeatability. This study focuses on modeling multimodal learning for Fetal Distress Diagnosis (FDD); however, three major challenges exist: unaligned multimodalities; failure to learn and fuse the causality and inclusion between multimodal biomedical data; and modality sensitivity, that is, difficulty in implementing a task in the absence of modalities. To address these three issues, we propose a Multimodal Medical Information Fusion framework named MMIF, where the Category Constrained-Parallel ViT model (CCPViT) was first proposed to explore multimodal learning tasks and address the misalignment between multimodalities. Based on CCPViT, a cross-attention-based image-text joint component is introduced to establish a Multimodal Representation Alignment Network model (MRAN), explore the deep-level interactive representation between cross-modal data, and assist multimodal learning. Furthermore, we designed a simple-structured FDD test model based on the highly modal-aligned MMIF, realizing task delegation from multimodal model training (image and text) to unimodal pathological diagnosis (image). Extensive experiments, including model parameter sensitivity analysis, cross-modal alignment assessment, and pathological diagnostic accuracy evaluation, were conducted to show our models’ superior performance and effectiveness.

Zhanpeng H, Jiekang W. A Multiview Clustering Method With Low-Rank and Sparsity Constraints for Cancer Subtyping. IEEE/ACM Trans Comput Biol Bioinform 2022;19:3213-3223. [PMID: 34705654 DOI: 10.1109/tcbb.2021.3122917] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 06/13/2023]

IoMT-Based Mitochondrial and Multifactorial Genetic Inheritance Disorder Prediction Using Machine Learning. Comput Intell Neurosci 2022;2022:2650742. [PMID: 35909844 PMCID: PMC9334098 DOI: 10.1155/2022/2650742] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2022] [Accepted: 07/04/2022] [Indexed: 11/18/2022]

Feng H, Xiang Y, Wang X, Xue W, Yue Z. MTAGCN: predicting miRNA-target associations in Camellia sinensis var. assamica through graph convolution neural network. BMC Bioinformatics 2022;23:271. [PMID: 35820798 PMCID: PMC9275082 DOI: 10.1186/s12859-022-04819-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2021] [Accepted: 07/01/2022] [Indexed: 11/10/2022] Open

Combining Molecular, Imaging, and Clinical Data Analysis for Predicting Cancer Prognosis. Cancers (Basel) 2022;14:cancers14133215. [PMID: 35804988 PMCID: PMC9265023 DOI: 10.3390/cancers14133215] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 05/26/2022] [Revised: 06/24/2022] [Accepted: 06/27/2022] [Indexed: 02/04/2023] Open

Abstract

Simple Summary

The rise of Big Data, the widespread use of Machine Learning, and the cheapening of omics techniques have allowed for the creation of more sophisticated and accurate models in biomedical research. This article presents the state-of-the-art predictive models of cancer prognosis that use multimodal data, considering clinical, molecular (omics and non-omics), and image data. The subject of study, the data modalities used, the data processing and modelling methods applied, the validation strategies involved, the integration strategies encompassed, and the evolution of prognostic predictive models are discussed. Finally, we discuss challenges and opportunities in this field of cancer research, with great potential impact on the clinical management of patients and, by extension, on the implementation of personalised and precision medicine.

Abstract

Cancer is one of the most detrimental diseases globally. Accordingly, the prognosis prediction of cancer patients has become a field of interest. In this review, we have gathered 43 state-of-the-art scientific papers published in the last 6 years that built cancer prognosis predictive models using multimodal data. We have defined the multimodality of data as four main types: clinical, anatomopathological, molecular, and medical imaging; and we have expanded on the information that each modality provides. The 43 studies were divided into three categories based on the modelling approach taken, and their characteristics were further discussed together with current issues and future trends. Research in this area has evolved from survival analysis through statistical modelling using mainly clinical and anatomopathological data to the prediction of cancer prognosis through a multi-faceted data-driven approach by the integration of complex, multimodal, and high-dimensional data containing multi-omics and medical imaging information and by applying Machine Learning and, more recently, Deep Learning techniques. This review concludes that cancer prognosis predictive multimodal models are capable of better stratifying patients, which can improve clinical management and contribute to the implementation of personalised medicine as well as provide new and valuable knowledge on cancer biology and its progression.

Caligola S, De Sanctis F, Canè S, Ugel S. Breaking the Immune Complexity of the Tumor Microenvironment Using Single-Cell Technologies. Front Genet 2022;13:867880. [PMID: 35651929 PMCID: PMC9149246 DOI: 10.3389/fgene.2022.867880] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/01/2022] [Accepted: 04/27/2022] [Indexed: 12/31/2022] Open

van Loon W, de Vos F, Fokkema M, Szabo B, Koini M, Schmidt R, de Rooij M. Analyzing Hierarchical Multi-View MRI Data With StaPLR: An Application to Alzheimer's Disease Classification. Front Neurosci 2022;16:830630. [PMID: 35546881 PMCID: PMC9082949 DOI: 10.3389/fnins.2022.830630] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/07/2021] [Accepted: 03/23/2022] [Indexed: 11/16/2022] Open

Watson ER, Taherian Fard A, Mar JC. Computational Methods for Single-Cell Imaging and Omics Data Integration. Front Mol Biosci 2022;8:768106. [PMID: 35111809 PMCID: PMC8801747 DOI: 10.3389/fmolb.2021.768106] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/14/2021] [Accepted: 11/29/2021] [Indexed: 12/12/2022] Open

Viaud G, Mayilvahanan P, Cournede PH. Representation Learning for the Clustering of Multi-Omics Data. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022;19:135-145. [PMID: 33600320 DOI: 10.1109/tcbb.2021.3060340] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]

Huminiecki Ł. Virtual Gene Concept and a Corresponding Pragmatic Research Program in Genetical Data Science. ENTROPY (BASEL, SWITZERLAND) 2021;24:17. [PMID: 35052043 PMCID: PMC8774939 DOI: 10.3390/e24010017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/30/2021] [Revised: 12/02/2021] [Accepted: 12/14/2021] [Indexed: 06/14/2023]

Vijayakumar S, Angione C. Protocol for hybrid flux balance, statistical, and machine learning analysis of multi-omic data from the cyanobacterium Synechococcus sp. PCC 7002. STAR Protoc 2021;2:100837. [PMID: 34632416 PMCID: PMC8488602 DOI: 10.1016/j.xpro.2021.100837] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022] Open

Mourragui SMC, Loog M, Vis DJ, Moore K, Manjon AG, van de Wiel MA, Reinders MJT, Wessels LFA. Predicting patient response with models trained on cell lines and patient-derived xenografts by nonlinear transfer learning. Proc Natl Acad Sci U S A 2021;118:e2106682118. [PMID: 34873056 PMCID: PMC8670522 DOI: 10.1073/pnas.2106682118] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 10/18/2021] [Indexed: 12/13/2022] Open

Ferré Q, Chèneby J, Puthier D, Capponi C, Ballester B. Anomaly detection in genomic catalogues using unsupervised multi-view autoencoders. BMC Bioinformatics 2021;22:460. [PMID: 34563116 PMCID: PMC8467021 DOI: 10.1186/s12859-021-04359-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/15/2020] [Revised: 06/04/2021] [Accepted: 08/09/2021] [Indexed: 11/13/2022] Open

Liu J, Ge S, Cheng Y, Wang X. Multi-View Spectral Clustering Based on Multi-Smooth Representation Fusion for Cancer Subtype Prediction. Front Genet 2021;12:718915. [PMID: 34552619 PMCID: PMC8450448 DOI: 10.3389/fgene.2021.718915] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/01/2021] [Accepted: 08/05/2021] [Indexed: 11/24/2022] Open

Abstract

It is a vital task to design an integrated machine learning model to discover cancer subtypes and understand the heterogeneity of cancer based on multiple omics data. In recent years, several multi-view clustering algorithms have been proposed and applied to the prediction of cancer subtypes. Among them, multi-view clustering methods based on graph learning have attracted wide attention. These multi-view approaches usually have one or more of the following problems. Many multi-view algorithms use the original omics data matrix to construct the similarity matrix and ignore the learning of the similarity matrix. They separate the data clustering process from the graph learning process, so that clustering performance depends heavily on the predefined graph. In the process of graph fusion, these methods simply take the average of the affinity graphs of multiple views as the fused graph, and the rich heterogeneous information is not fully utilized. To solve the above problems, this paper proposes a Multi-view Spectral Clustering Based on Multi-smooth Representation Fusion (MRF-MSC) method. First, MRF-MSC constructs a smooth representation for each data type, which can be viewed as a sample (patient) similarity matrix. The smooth representation can explicitly enhance the grouping effect. Second, MRF-MSC integrates the smooth representations of multiple omics data to form a similarity matrix containing all biological data information through graph fusion. In addition, MRF-MSC adaptively assigns weight factors to the smooth regularization representation of each omics data type by using the self-weighting method. Finally, MRF-MSC imposes a constrained Laplacian rank on the fused similarity matrix to get a better cluster structure. The resulting problem can be transformed into spectral clustering, from which the clustering results are obtained. MRF-MSC unifies graph construction, graph fusion and spectral clustering under one framework, which can learn better data representations and high-quality graphs, so as to achieve better clustering. In the experiments, MRF-MSC obtained good results on the TCGA cancer data sets.
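
As a rough illustration of the graph-fusion step discussed in this abstract, the sketch below contrasts plain averaging of per-view similarity matrices with a weighted fusion. It is not the MRF-MSC algorithm: the function name `fuse_graphs`, the toy matrices, and the hand-picked weights are all assumptions made for illustration, and the self-weighting scheme of the paper is not reproduced here.

```python
def fuse_graphs(views, weights=None):
    """Fuse per-view similarity matrices into one matrix by a weighted average.

    With weights=None every view contributes equally (plain averaging).
    The weights are supplied by the caller; they are NOT learned here.
    """
    n_views = len(views)
    if weights is None:
        weights = [1.0] * n_views
    total = sum(weights)
    weights = [w / total for w in weights]  # normalise so the weights sum to 1
    n = len(views[0])
    fused = [[0.0] * n for _ in range(n)]
    for w, similarity in zip(weights, views):
        for i in range(n):
            for j in range(n):
                fused[i][j] += w * similarity[i][j]
    return fused


# Two hypothetical 3-patient similarity matrices, one per omics view.
view_a = [[1.0, 0.8, 0.1], [0.8, 1.0, 0.2], [0.1, 0.2, 1.0]]
view_b = [[1.0, 0.2, 0.1], [0.2, 1.0, 0.6], [0.1, 0.6, 1.0]]

averaged = fuse_graphs([view_a, view_b])                      # equal weights
weighted = fuse_graphs([view_a, view_b], weights=[3.0, 1.0])  # favour view_a
```

With equal weights, a strong similarity seen in only one view is diluted; a weighted fusion lets a more informative view dominate, which is the motivation the abstract gives for learning the weights rather than averaging.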

Krantz M, Zimmer D, Adler SO, Kitashova A, Klipp E, Mühlhaus T, Nägele T. Data Management and Modeling in Plant Biology. FRONTIERS IN PLANT SCIENCE 2021;12:717958. [PMID: 34539712 PMCID: PMC8446634 DOI: 10.3389/fpls.2021.717958] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/31/2021] [Accepted: 07/29/2021] [Indexed: 05/25/2023]

Zeng P, Tang X, Wu T, Tian Q, Li M, Ding J. [Identification of potential regulatory genes for embryonic stem cell self-renewal and pluripotency by random forest]. NAN FANG YI KE DA XUE XUE BAO = JOURNAL OF SOUTHERN MEDICAL UNIVERSITY 2021;41:1234-1238. [PMID: 34549716 DOI: 10.12122/j.issn.1673-4254.2021.08.16] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]

Abstract

OBJECTIVE

To identify novel genes associated with self-renewal and pluripotency of mouse embryonic stem cells (mESCs) by integrating multiomics data based on machine learning methods.

METHODS

We integrated multiomics information of mESCs involving transcriptome, histone modifications, chromatin accessibility, transcription factor binding and architectural protein binding, and compared the signal differences between known stem cell self-renewal and pluripotency genes and other genes. By integrating these multiomics data, we established prediction models based on several machine learning classifiers including random forests and performed 5-fold cross validations. The model was trained using the training dataset containing two thirds of the input samples, and the remaining one third of the input samples were used as the test dataset to assess the performance of the model in independent tests. Finally, the results predicted by the model were validated through gene function annotation and cell function experiments including cell viability assay, colony formation assay and cell cycle analysis.

RESULTS

Compared with the random genes, the genes known to be associated with self-renewal and pluripotency of mESCs in the multiomics data showed significantly different features. Random forest outperformed the other machine learning algorithms tested on these multiomics data, with an area under the curve (AUC) of 0.883±0.018 for cross validation and an AUC of 0.880±0.028 for independent test. Based on this model, we identified 893 potential regulatory genes associated with self-renewal and pluripotency of mESCs, which were similar to the known genes in functional annotation. Knockdown of the predicted novel regulator gene Cct6a resulted in significant decreases in the cell viability of mESCs (P < 0.0001) and the number of cell clones (P < 0.01), significantly increased the number of cells in G1 phase (P < 0.01) and decreased the number of S phase cells (P < 0.05). Knockdown of Cct6a also led to failure of positive alkaline phosphatase staining of the mESCs.

CONCLUSION

A machine learning model based on multiomics data can be used to predict potential self-renewal and pluripotency regulators with high performance. By using this model, we predicted potential self-renewal and pluripotency regulatory genes including Cct6a and validated them experimentally. This model provides new insights into the regulatory mechanism of mESCs and contributes to stem cell research.
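
The evaluation protocol described in the Methods of this abstract (a two-thirds/one-third train/test split plus 5-fold cross validation) can be sketched as follows. This is one plausible reading only: the abstract does not say whether cross validation was run on the training portion or on all samples, so running it on the training indices is an assumption, as are the function names, the seed, and the sample count of 30.

```python
import random


def train_test_split_indices(n_samples, train_fraction=2 / 3, seed=0):
    """Shuffle sample indices and split them into training and test parts."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_train = round(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]


def k_fold_indices(indices, k=5):
    """Yield (training, validation) index lists for k-fold cross validation."""
    folds = [indices[i::k] for i in range(k)]  # k roughly equal, disjoint folds
    for i in range(k):
        validation = folds[i]
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation


# Hypothetical data set of 30 genes: 20 for training, 10 held out for testing.
train_idx, test_idx = train_test_split_indices(30)
splits = list(k_fold_indices(train_idx, k=5))
```

Each of the five `(training, validation)` pairs would be used to fit and score a classifier, and the held-out `test_idx` samples would be touched only once, for the independent test.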

Hulot A, Laloë D, Jaffrézic F. A unified framework for the integration of multiple hierarchical clusterings or networks from multi-source data. BMC Bioinformatics 2021;22:392. [PMID: 34348641 PMCID: PMC8336092 DOI: 10.1186/s12859-021-04303-4] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/15/2021] [Accepted: 07/13/2021] [Indexed: 11/30/2022] Open

Abstract

Background

Integrating data from different sources is a recurring question in computational biology. Much effort has been devoted to the integration of data sets of the same type, typically multiple numerical data tables. However, data types are generally heterogeneous: it is commonplace to gather data in the form of trees, networks or factorial maps, as these representations all have an appealing visual interpretation that helps to study grouping patterns and interactions between entities. The question we aim to answer in this paper is that of the integration of such representations.

Results

To this end, we provide a simple procedure to compare data with various types, in particular trees or networks, that relies essentially on two steps: the first step projects the representations into a common coordinate system; the second step then uses a multi-table integration approach to compare the projected data. We rely on efficient and well-known methodologies for each step: the projection step is achieved by retrieving a distance matrix for each representation form and then applying multidimensional scaling to provide a new set of coordinates from all the pairwise distances. The integration step is then achieved by applying a multiple factor analysis to the multiple tables of the new coordinates. This procedure provides tools to integrate and compare data available, for instance, as tree or network structures. Our approach is complementary to kernel methods, traditionally used to answer the same question.

Conclusion

Our approach is evaluated on simulation and used to analyze two real-world data sets: first, we compare several clusterings for different cell-types obtained from a transcriptomics single-cell data set in mouse embryos; second, we use our procedure to aggregate a multi-table data set from the TCGA breast cancer database, in order to compare several protein networks inferred for different breast cancer subtypes.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12859-021-04303-4.
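
The projection step described in the Results of this abstract (distance matrix, then multidimensional scaling) can be illustrated with a minimal classical MDS in pure Python. This is a sketch under stated assumptions, not the authors' implementation: only the first MDS axis is recovered, a plain power iteration stands in for a full eigendecomposition, the function names are hypothetical, and the subsequent multiple factor analysis step is not shown.

```python
import math


def double_center(dist):
    """Return B = -0.5 * J * D^2 * J for a square distance matrix D."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    col_mean = [sum(sq[i][j] for i in range(n)) / n for j in range(n)]
    grand_mean = sum(row_mean) / n
    return [[-0.5 * (sq[i][j] - row_mean[i] - col_mean[j] + grand_mean)
             for j in range(n)] for i in range(n)]


def first_mds_axis(dist, iterations=200):
    """Coordinates on the first classical-MDS axis, found by power iteration."""
    B = double_center(dist)
    n = len(B)
    v = [float(i + 1) for i in range(n)]  # arbitrary non-zero starting vector
    for _ in range(iterations):
        w = [sum(B[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the leading eigenvalue for the unit vector v.
    eigenvalue = sum(v[i] * sum(B[i][j] * v[j] for j in range(n)) for i in range(n))
    return [math.sqrt(eigenvalue) * x for x in v]


# Toy distance matrix for three items lying on a line at positions 0, 1 and 3.
distances = [[0.0, 1.0, 3.0],
             [1.0, 0.0, 2.0],
             [3.0, 2.0, 0.0]]
coords = first_mds_axis(distances)
```

The recovered coordinates reproduce the input distances up to translation and sign. In the procedure the abstract outlines, one such coordinate table per tree or network would then be passed to the multi-table integration step.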

Yu N, Wu MJ, Liu JX, Zheng CH, Xu Y. Correntropy-Based Hypergraph Regularized NMF for Clustering and Feature Selection on Multi-Cancer Integrated Data. IEEE TRANSACTIONS ON CYBERNETICS 2021;51:3952-3963. [PMID: 32603306 DOI: 10.1109/tcyb.2020.3000799] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]

Genome wide analysis implicates upregulation of proteasome pathway in major depressive disorder. Transl Psychiatry 2021;11:409. [PMID: 34321460 PMCID: PMC8319154 DOI: 10.1038/s41398-021-01529-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 10/24/2020] [Revised: 02/27/2021] [Accepted: 06/21/2021] [Indexed: 12/02/2022] Open

Stanton JE, Malijauskaite S, McGourty K, Grabrucker AM. The Metallome as a Link Between the "Omes" in Autism Spectrum Disorders. Front Mol Neurosci 2021;14:695873. [PMID: 34290588 PMCID: PMC8289253 DOI: 10.3389/fnmol.2021.695873] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/22/2021] [Accepted: 06/14/2021] [Indexed: 12/26/2022] Open

Picard M, Scott-Boyer MP, Bodein A, Périn O, Droit A. Integration strategies of multi-omics data for machine learning analysis. Comput Struct Biotechnol J 2021;19:3735-3746. [PMID: 34285775 PMCID: PMC8258788 DOI: 10.1016/j.csbj.2021.06.030] [Citation(s) in RCA: 148] [Impact Index Per Article: 49.3] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/30/2021] [Revised: 06/17/2021] [Accepted: 06/21/2021] [Indexed: 12/25/2022] Open

Choi HJ, Wang C, Pan X, Jang J, Cao M, Brazzo JA, Bae Y, Lee K. Emerging machine learning approaches to phenotyping cellular motility and morphodynamics. Phys Biol 2021;18:10.1088/1478-3975/abffbe. [PMID: 33971636 PMCID: PMC9131244 DOI: 10.1088/1478-3975/abffbe] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/22/2020] [Accepted: 05/10/2021] [Indexed: 12/22/2022]

Affiliation(s)

Hee June Choi Department of Biomedical Engineering, Worcester Polytechnic Institute, Worcester, MA 01609, United States of America Vascular Biology Program and Department of Surgery, Boston Children’s Hospital, Harvard Medical School, Boston, MA 02115, United States of America
Chuangqi Wang Department of Biomedical Engineering, Worcester Polytechnic Institute, Worcester, MA 01609, United States of America Present address. Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
Xiang Pan Department of Biomedical Engineering, Worcester Polytechnic Institute, Worcester, MA 01609, United States of America Vascular Biology Program and Department of Surgery, Boston Children’s Hospital, Harvard Medical School, Boston, MA 02115, United States of America
Junbong Jang Department of Biomedical Engineering, Worcester Polytechnic Institute, Worcester, MA 01609, United States of America Vascular Biology Program and Department of Surgery, Boston Children’s Hospital, Harvard Medical School, Boston, MA 02115, United States of America
Mengzhi Cao Data Science Program, Worcester Polytechnic Institute, Worcester, MA 01609, United States of America
Joseph A Brazzo Department of Pathology and Anatomical Sciences, Jacobs School of Medicine and Biomedical Sciences, University at Buffalo, State University of New York, Buffalo, NY 14203, United States of America
Yongho Bae Department of Pathology and Anatomical Sciences, Jacobs School of Medicine and Biomedical Sciences, University at Buffalo, State University of New York, Buffalo, NY 14203, United States of America
Kwonmoo Lee Department of Biomedical Engineering, Worcester Polytechnic Institute, Worcester, MA 01609, United States of America Vascular Biology Program and Department of Surgery, Boston Children’s Hospital, Harvard Medical School, Boston, MA 02115, United States of America

Wang T, Shao W, Huang Z, Tang H, Zhang J, Ding Z, Huang K. MOGONET integrates multi-omics data using graph convolutional networks allowing patient classification and biomarker identification. Nat Commun 2021;12:3445. [PMID: 34103512 PMCID: PMC8187432 DOI: 10.1038/s41467-021-23774-w] [Citation(s) in RCA: 105] [Impact Index Per Article: 35.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/13/2020] [Accepted: 05/04/2021] [Indexed: 12/18/2022] Open

Jin T, Rehani P, Ying M, Huang J, Liu S, Roussos P, Wang D. scGRNom: a computational pipeline of integrative multi-omics analyses for predicting cell-type disease genes and regulatory networks. Genome Med 2021;13:95. [PMID: 34044854 PMCID: PMC8161957 DOI: 10.1186/s13073-021-00908-9] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/04/2020] [Accepted: 05/13/2021] [Indexed: 02/06/2023] Open

Guo Y, Wang Q, Guo Y, Zhang Y, Fu Y, Zhang H. Preoperative prediction of perineural invasion with multi-modality radiomics in rectal cancer. Sci Rep 2021;11:9429. [PMID: 33941817 PMCID: PMC8093213 DOI: 10.1038/s41598-021-88831-2] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/18/2020] [Accepted: 04/14/2021] [Indexed: 02/06/2023] Open

Imagine…(a common language for ICU data inquiry and analysis). Intensive Care Med 2021;46:531-533. [PMID: 32123991 DOI: 10.1007/s00134-019-05895-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/24/2022]

Tian S, Wang C. An ensemble of the iCluster method to analyze longitudinal lncRNA expression data for psoriasis patients. Hum Genomics 2021;15:23. [PMID: 33879268 PMCID: PMC8056592 DOI: 10.1186/s40246-021-00323-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/30/2020] [Accepted: 04/12/2021] [Indexed: 11/17/2022] Open

Abstract

Background

Psoriasis is an immune-mediated, inflammatory disorder of the skin with chronic inflammation and hyper-proliferation of the epidermis. Since psoriasis has genetic components and the diseased tissue of psoriasis is very easily accessible, it is natural to use high-throughput technologies to characterize psoriasis and thus seek targeted therapies. Transcriptional profiles change correspondingly after an intervention. Unlike cross-sectional gene expression data, longitudinal gene expression data can capture the dynamic changes and thus facilitate causal inference.

Methods

Using the iCluster method as a building block, an ensemble method was proposed and applied to a longitudinal gene expression dataset for psoriasis, with the objective of identifying key lncRNAs that can discriminate the responders from the non-responders to two immune treatments of psoriasis.

Results

Using support vector machine models, the leave-one-out predictive accuracy of the 20-lncRNA signature identified by this ensemble was estimated as 80%, which outperforms several competing methods. Furthermore, pathway enrichment analysis was performed on the target mRNAs of the identified lncRNAs. The enriched GO terms and KEGG pathways include proteasome and protein deubiquitination. The ubiquitination-proteasome system is regarded as a key player in psoriasis, and a proteasome inhibitor targeting the ubiquitination pathway holds promise for treating psoriasis.

Conclusions

An integrative method such as iCluster for multiple data integration can be adopted directly to analyze longitudinal gene expression data, which offers more promising options for longitudinal big data analysis. A comprehensive evaluation and validation of the resulting 20-lncRNA signature is highly desirable.

Supplementary Information

The online version contains supplementary material available at 10.1186/s40246-021-00323-6.
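
The leave-one-out accuracy estimate mentioned in the Results of this abstract can be sketched as below. To keep the example self-contained, a nearest-centroid classifier is swapped in for the support vector machine used in the paper; the toy one-feature data, the labels, and the function names are all hypothetical, and the resulting number has no relation to the reported 80%.

```python
def nearest_centroid_predict(train_X, train_y, x):
    """Predict the label whose class centroid is closest to x (stand-in for an SVM)."""
    sums, counts = {}, {}
    for features, label in zip(train_X, train_y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for d, value in enumerate(features):
            acc[d] += value
        counts[label] = counts.get(label, 0) + 1

    def squared_distance(label):
        return sum((s / counts[label] - v) ** 2 for s, v in zip(sums[label], x))

    return min(sums, key=squared_distance)


def leave_one_out_accuracy(X, y):
    """Hold out each sample once, train on the rest, and count correct predictions."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if nearest_centroid_predict(train_X, train_y, X[i]) == y[i]:
            correct += 1
    return correct / len(X)


# Hypothetical expression values of a single lncRNA for six patients.
X = [[0.0], [0.1], [0.2], [1.0], [1.1], [1.2]]
y = ["responder", "responder", "non-responder",
     "non-responder", "non-responder", "non-responder"]
accuracy = leave_one_out_accuracy(X, y)
```

Leave-one-out is a natural choice when, as with small clinical cohorts, there are too few samples to set aside a separate test set.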

A New Era of Neuro-Oncology Research Pioneered by Multi-Omics Analysis and Machine Learning. Biomolecules 2021;11:biom11040565. [PMID: 33921457 PMCID: PMC8070530 DOI: 10.3390/biom11040565] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/11/2021] [Revised: 04/02/2021] [Accepted: 04/07/2021] [Indexed: 02/06/2023] Open

Termine A, Fabrizio C, Strafella C, Caputo V, Petrosini L, Caltagirone C, Giardina E, Cascella R. Multi-Layer Picture of Neurodegenerative Diseases: Lessons from the Use of Big Data through Artificial Intelligence. J Pers Med 2021;11:280. [PMID: 33917161 PMCID: PMC8067806 DOI: 10.3390/jpm11040280] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/05/2021] [Revised: 04/05/2021] [Accepted: 04/06/2021] [Indexed: 12/13/2022] Open

ORN: Inferring patient-specific dysregulation status of pathway modules in cancer with OR-gate Network. PLoS Comput Biol 2021;17:e1008792. [PMID: 33819263 PMCID: PMC8049496 DOI: 10.1371/journal.pcbi.1008792] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/22/2020] [Revised: 04/15/2021] [Accepted: 02/15/2021] [Indexed: 01/26/2023] Open

Abstract

Pathway level understanding of cancer plays a key role in precision oncology. However, the current amount of high-throughput data cannot support the elucidation of full pathway topology. In this study, instead of directly learning the pathway network, we adapted the probabilistic OR gate to model the modular structure of pathways and regulon. The resulting model, OR-gate Network (ORN), can simultaneously infer pathway modules of somatic alterations, patient-specific pathway dysregulation status, and downstream regulon. In a trained ORN, the differentially expressed genes (DEGs) in each tumour can be explained by somatic mutations perturbing a pathway module. Furthermore, the ORN handles one of the most important properties of pathway perturbation in tumours, the mutual exclusivity. We have applied the ORN to lower-grade glioma (LGG) samples and liver hepatocellular carcinoma (LIHC) samples in TCGA and breast cancer samples from METABRIC. Both datasets have shown abnormal pathway activities related to immune response and cell cycles. In LGG samples, ORN identified pathway modules closely related to glioma development and revealed two pathways closely related to patient survival. We had similar results with LIHC samples. Additional results from the METABRIC datasets showed that ORN could characterize critical mechanisms of cancer and connect them to less studied somatic mutations (e.g., BAP1, MIR604, MICAL3, and telomere activities), which may generate novel hypotheses for targeted therapy.

Cellular functions are carried out by a set of gene products. Mutation of a single gene is often sufficient to disrupt certain biological functions and promote tumorigenesis. Therefore, genes participating in the same function are less likely to mutate in the same sample. This phenomenon is called “mutual exclusivity”. In this study, our algorithm (ORN) has utilized this property to identify gene-level mutations that affect similar biological functions. It also considers mutations’ impact on mRNA expression. Functional modules identified by ORN tend to be mutually exclusive while causing similar differential expression profiles. When we applied ORN to lower-grade glioma and liver cancer datasets, we identified gene modules significantly related to patient survival. Furthermore, across different types of cancer, ORN has connected well-known cancer driver mutations with genes whose functions remain unclear. These connections, once validated, can generate novel hypotheses for biologists to further investigate cancer mechanisms and develop targeted therapy.
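
A probabilistic OR gate of the kind mentioned in this abstract is commonly written as a noisy-OR: a pathway module is dysregulated unless every mutated gene in it independently fails to perturb it. The sketch below uses that textbook form; the abstract does not give ORN's actual parameterisation, so the function name, the per-gene probabilities, and the `leak` term are assumptions for illustration.

```python
def noisy_or(mutated, activation_probs, leak=0.0):
    """Probability that a module is dysregulated under a noisy-OR gate.

    mutated          -- 0/1 flags, one per gene in the module
    activation_probs -- chance that each gene's mutation alone perturbs the module
    leak             -- chance of dysregulation with no mutation at all
    """
    p_all_fail = 1.0 - leak
    for is_mutated, p in zip(mutated, activation_probs):
        if is_mutated:
            p_all_fail *= 1.0 - p  # this mutation fails to perturb the module
    return 1.0 - p_all_fail


# Hypothetical three-gene module.
probs = [0.9, 0.8, 0.7]
p_none = noisy_or([0, 0, 0], probs)  # no mutation in the module
p_one = noisy_or([1, 0, 0], probs)   # a single mutated gene
p_two = noisy_or([1, 1, 0], probs)   # two mutated genes in the same module
```

Here a single mutation already drives the dysregulation probability to 0.9, and a second one raises it only to 0.98. That diminishing return is one way to see why mutations within a module tend to be mutually exclusive.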

Cancer Subtype Recognition Based on Laplacian Rank Constrained Multiview Clustering. Genes (Basel) 2021;12:genes12040526. [PMID: 33916856 PMCID: PMC8065670 DOI: 10.3390/genes12040526] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/03/2021] [Revised: 03/28/2021] [Accepted: 03/31/2021] [Indexed: 12/13/2022] Open

Vlachavas EI, Bohn J, Ückert F, Nürnberg S. A Detailed Catalogue of Multi-Omics Methodologies for Identification of Putative Biomarkers and Causal Molecular Networks in Translational Cancer Research. Int J Mol Sci 2021;22:2822. [PMID: 33802234 PMCID: PMC8000236 DOI: 10.3390/ijms22062822] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/31/2021] [Revised: 03/05/2021] [Accepted: 03/05/2021] [Indexed: 02/06/2023] Open