1
Liu F, Yang Y, Xu XS, Yuan M. MESBC: A novel mutually exclusive spectral biclustering method for cancer subtyping. Comput Biol Chem 2024; 109:108009. [PMID: 38219419 DOI: 10.1016/j.compbiolchem.2023.108009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2023] [Revised: 12/22/2023] [Accepted: 12/24/2023] [Indexed: 01/16/2024]
Abstract
Many soft biclustering algorithms have been developed and applied to various biological and biomedical data analyses. However, few mutually exclusive (hard) biclustering algorithms have been proposed, although such algorithms could better identify disease or molecular subtypes with survival significance based on genomic or transcriptomic data. In this study, we developed a novel mutually exclusive spectral biclustering (MESBC) algorithm, based on a spectral method, to detect mutually exclusive biclusters. MESBC simultaneously detects subgroups of relevant features (genes) and the corresponding conditions (patients) and, therefore, automatically uses the signature features of each subtype to perform the clustering. Extensive simulations revealed that MESBC provided superior accuracy in detecting pre-specified biclusters compared with non-negative matrix factorization (NMF) and Dhillon's algorithm, particularly in very noisy data. Further analysis of the algorithm on real datasets obtained from the TCGA database showed that MESBC provided more accurate (i.e., smaller p-value) overall survival prediction in patients with lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC) when compared to the existing, gold-standard subtypes for lung cancers (integrative clustering). Furthermore, MESBC detected several genes with significant prognostic value in both LUAD and LUSC patients. External validation on an independent, unseen GEO dataset of LUAD showed that MESBC-derived clusters based on TCGA data still exhibited clear biclustering patterns and consistent, outstanding prognostic predictability, demonstrating robust generalizability of MESBC. Therefore, MESBC could potentially be used as a risk stratification tool to optimize treatment for the patient, improve the selection of patients for clinical trials, and contribute to the development of novel therapeutic agents.
Affiliation(s)
- Fengrong Liu
- Department of Statistics and Finance, University of Science and Technology of China, Hefei 230026, China
- Yaning Yang
- Department of Statistics and Finance, University of Science and Technology of China, Hefei 230026, China
- Min Yuan
- School of Public Health Administration, Anhui Medical University, Hefei 230032, China
2
Qureshi TA, Chen X, Xie Y, Murakami K, Sakatani T, Kita Y, Kobayashi T, Miyake M, Knott SRV, Li D, Rosser CJ, Furuya H. MRI/RNA-Seq-Based Radiogenomics and Artificial Intelligence for More Accurate Staging of Muscle-Invasive Bladder Cancer. Int J Mol Sci 2023; 25:88. [PMID: 38203254 PMCID: PMC10778815 DOI: 10.3390/ijms25010088] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2023] [Revised: 12/08/2023] [Accepted: 12/14/2023] [Indexed: 01/12/2024] Open
Abstract
Accurate staging of bladder cancer assists in identifying optimal treatment (e.g., transurethral resection vs. radical cystectomy vs. bladder preservation). However, currently, about one-third of patients are over-staged and one-third are under-staged. There is a pressing need for a more accurate staging modality to evaluate patients with bladder cancer and assist clinical decision-making. We hypothesize that MRI/RNA-seq-based radiogenomics and artificial intelligence can more accurately stage bladder cancer. A total of 40 magnetic resonance imaging (MRI) scans and matched formalin-fixed paraffin-embedded (FFPE) tissues were available for analysis. Twenty-eight MRI scans and their matched FFPE tissues were used for training, and 12 matched MRI scans and FFPE tissues were used for validation. FFPE samples were subjected to bulk RNA-seq, followed by bioinformatics analysis. For the radiomics, several hundred image-based features were extracted from bladder tumors in MRI and analyzed. Overall, the model obtained mean sensitivity, specificity, and accuracy of 94%, 88%, and 92%, respectively, in differentiating intra- vs. extra-bladder cancer. The proposed model improved these three metrics by 17%, 33%, and 25% compared with the genetic-based model alone and by 17%, 16%, and 17% compared with the radiomic-based model alone. The radiogenomics of bladder cancer provides insight into discriminative features capable of more accurately staging bladder cancer. Additional studies are underway.
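For readers unfamiliar with the three reported metrics, the following minimal sketch shows how sensitivity, specificity, and accuracy are conventionally derived from confusion-matrix counts. It is an illustration only: the function name `staging_metrics`, the choice of which class counts as "positive", and the counts themselves are assumptions made here, not values or code from the study.

```python
def staging_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp/fn: positive cases classified correctly / incorrectly.
    tn/fp: negative cases classified correctly / incorrectly.
    """
    sensitivity = tp / (tp + fn)                 # true positive rate
    specificity = tn / (tn + fp)                 # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)   # overall fraction correct
    return sensitivity, specificity, accuracy


# Hypothetical counts for a 12-case validation set (NOT taken from the paper).
sens, spec, acc = staging_metrics(tp=7, fp=1, tn=3, fn=1)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```

With small validation sets such as the 12 cases described above, a single misclassified case shifts each metric by several percentage points, which is worth keeping in mind when comparing models.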
Affiliation(s)
- Touseef Ahmad Qureshi
- Biomedical Imaging Research Institute, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA
- Department of Biomedical Science, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA
- Xingyu Chen
- Department of Biomedical Science, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA
- Department of Urology, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA
- Yibin Xie
- Biomedical Imaging Research Institute, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA
- Kaoru Murakami
- Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA
- Department of Urology, Kyoto University, Kyoto 606-8507, Japan
- Toru Sakatani
- Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA
- Yuki Kita
- Department of Urology, Kyoto University, Kyoto 606-8507, Japan
- Takashi Kobayashi
- Department of Urology, Kyoto University, Kyoto 606-8507, Japan
- Makito Miyake
- Department of Urology, Nara Medical University, Kashihara 634-8522, Japan
- Simon R. V. Knott
- Department of Biomedical Science, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA
- Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA
- Debiao Li
- Biomedical Imaging Research Institute, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA
- Charles J. Rosser
- Department of Urology, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA
- Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA
- Hideki Furuya
- Department of Biomedical Science, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA
- Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA
3
Paul ED, Huraiová B, Valková N, Birknerova N, Gábrišová D, Gubova S, Ignačáková H, Ondris T, Bendíková S, Bíla J, Buranovská K, Drobná D, Krchnakova Z, Kryvokhyzha M, Lovíšek D, Mamoilyk V, Mančíková V, Vojtaššáková N, Ristová M, Comino-Méndez I, Andrašina I, Morozov P, Tuschl T, Pareja F, Čekan P. Multiplexed RNA-FISH-guided Laser Capture Microdissection RNA Sequencing Improves Breast Cancer Molecular Subtyping, Prognostic Classification, and Predicts Response to Antibody Drug Conjugates. medRxiv 2023:2023.12.05.23299341. [PMID: 38105959 PMCID: PMC10723508 DOI: 10.1101/2023.12.05.23299341] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/19/2023]
Abstract
On a retrospective cohort of 1,082 FFPE breast tumors, we demonstrated the analytical validity of a test using multiplexed RNA-FISH-guided laser capture microdissection (LCM) coupled with RNA-sequencing (mFISHseq), which showed 93% accuracy compared to immunohistochemistry. The combination of these technologies makes strides in i) precisely assessing tumor heterogeneity, ii) obtaining pure tumor samples using LCM to ensure accurate biomarker expression and multigene testing, and iii) providing thorough and granular data from whole transcriptome profiling. We also constructed a 293-gene intrinsic subtype classifier that performed equivalently to the research-based PAM50 and AIMS classifiers. By combining three molecular classifiers for consensus subtyping, mFISHseq alleviated single sample discordance, provided near perfect concordance with other classifiers (κ > 0.85), and reclassified 30% of samples into different subtypes with prognostic implications. We also used a consensus approach to combine information from 4 multigene prognostic classifiers and clinical risk to characterize high, low, and ultra-low risk patients that relapse early (< 5 years), late (> 10 years), and rarely, respectively. Lastly, to identify potential patient subpopulations that may be responsive to treatments like antibody-drug conjugates (ADC), we curated a list of 92 genes and 110 gene signatures to interrogate their association with molecular subtype and overall survival. Many genes and gene signatures related to ADC processing (e.g., antigen/payload targets, endocytosis, and lysosome activity) were independent predictors of overall survival in multivariate Cox regression models, thus highlighting potential ADC treatment-responsive subgroups. To test this hypothesis, we constructed a unique 19-feature classifier using multivariate logistic regression with elastic net that predicted response to trastuzumab emtansine (T-DM1; AUC = 0.96) better than either ERBB2 mRNA or Her2 IHC alone in the T-DM1 arm of the I-SPY2 trial. This test was deployed in a research-use only format on 26 patients and revealed clinical insights into patient selection for novel therapies like ADCs and immunotherapies and de-escalation of adjuvant chemotherapy.
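The concordance statistic quoted in this abstract (κ) is usually Cohen's kappa, which corrects raw agreement between two classifiers for the agreement expected by chance. The sketch below is a generic, assumption-laden illustration: the function `cohen_kappa` and the example subtype calls are hypothetical and are not the authors' code or data, and the abstract does not state which kappa variant was used.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equally long")
    n = len(labels_a)
    # Observed agreement: fraction of samples given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the marginal label frequencies, summed.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:  # both raters used one identical label throughout
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical subtype calls from two classifiers (illustrative only).
calls_a = ["LumA", "LumA", "LumB", "Basal", "Her2", "Basal"]
calls_b = ["LumA", "LumB", "LumB", "Basal", "Her2", "Basal"]
kappa = cohen_kappa(calls_a, calls_b)
print(f"kappa = {kappa:.3f}")
```

In this toy example the two classifiers agree on five of six samples, yet kappa is lower than the raw agreement because part of that agreement is attributable to chance.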
Affiliation(s)
- Evan D. Paul
- MultiplexDX, s.r.o., Comenius University Science Park, Bratislava, Slovakia
- MultiplexDX, Inc., Rockville, MD, USA
- Barbora Huraiová
- MultiplexDX, s.r.o., Comenius University Science Park, Bratislava, Slovakia
- MultiplexDX, Inc., Rockville, MD, USA
- Natália Valková
- MultiplexDX, s.r.o., Comenius University Science Park, Bratislava, Slovakia
- MultiplexDX, Inc., Rockville, MD, USA
- Institute of Clinical Biochemistry and Diagnostics, University Hospital, Faculty of Medicine in Hradec Kralove, Charles University, Hradec Kralove, Czech Republic
- Natalia Birknerova
- MultiplexDX, s.r.o., Comenius University Science Park, Bratislava, Slovakia
- MultiplexDX, Inc., Rockville, MD, USA
- Daniela Gábrišová
- MultiplexDX, s.r.o., Comenius University Science Park, Bratislava, Slovakia
- MultiplexDX, Inc., Rockville, MD, USA
- Sona Gubova
- MultiplexDX, s.r.o., Comenius University Science Park, Bratislava, Slovakia
- MultiplexDX, Inc., Rockville, MD, USA
- Helena Ignačáková
- MultiplexDX, s.r.o., Comenius University Science Park, Bratislava, Slovakia
- MultiplexDX, Inc., Rockville, MD, USA
- Tomáš Ondris
- MultiplexDX, s.r.o., Comenius University Science Park, Bratislava, Slovakia
- MultiplexDX, Inc., Rockville, MD, USA
- Silvia Bendíková
- MultiplexDX, s.r.o., Comenius University Science Park, Bratislava, Slovakia
- MultiplexDX, Inc., Rockville, MD, USA
- Jarmila Bíla
- MultiplexDX, s.r.o., Comenius University Science Park, Bratislava, Slovakia
- MultiplexDX, Inc., Rockville, MD, USA
- Katarína Buranovská
- MultiplexDX, s.r.o., Comenius University Science Park, Bratislava, Slovakia
- MultiplexDX, Inc., Rockville, MD, USA
- Diana Drobná
- MultiplexDX, s.r.o., Comenius University Science Park, Bratislava, Slovakia
- MultiplexDX, Inc., Rockville, MD, USA
- Zuzana Krchnakova
- MultiplexDX, s.r.o., Comenius University Science Park, Bratislava, Slovakia
- MultiplexDX, Inc., Rockville, MD, USA
- Maryna Kryvokhyzha
- MultiplexDX, s.r.o., Comenius University Science Park, Bratislava, Slovakia
- MultiplexDX, Inc., Rockville, MD, USA
- Daniel Lovíšek
- MultiplexDX, s.r.o., Comenius University Science Park, Bratislava, Slovakia
- MultiplexDX, Inc., Rockville, MD, USA
- Viktoriia Mamoilyk
- MultiplexDX, s.r.o., Comenius University Science Park, Bratislava, Slovakia
- MultiplexDX, Inc., Rockville, MD, USA
- Veronika Mančíková
- MultiplexDX, s.r.o., Comenius University Science Park, Bratislava, Slovakia
- MultiplexDX, Inc., Rockville, MD, USA
- Nina Vojtaššáková
- MultiplexDX, s.r.o., Comenius University Science Park, Bratislava, Slovakia
- MultiplexDX, Inc., Rockville, MD, USA
- Michaela Ristová
- MultiplexDX, s.r.o., Comenius University Science Park, Bratislava, Slovakia
- MultiplexDX, Inc., Rockville, MD, USA
- Wellcome Centre for Cell Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, Scotland, UK
- Iñaki Comino-Méndez
- Unidad de Gestión Clínica Intercentros de Oncología Medica, Hospitales Universitarios Regional y Virgen de la Victoria. The Biomedical Research Institute of Málaga (IBIMA-CIMES-UMA), Málaga, Spain
- Igor Andrašina
- Department of Radiotherapy and Oncology, East Slovakia Institute of Oncology, Košice, Slovakia
- Pavel Morozov
- Laboratory for RNA Molecular Biology, The Rockefeller University, New York NY, USA
- Thomas Tuschl
- Laboratory for RNA Molecular Biology, The Rockefeller University, New York NY, USA
- Fresia Pareja
- Department of Pathology and Laboratory Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Pavol Čekan
- MultiplexDX, s.r.o., Comenius University Science Park, Bratislava, Slovakia
- MultiplexDX, Inc., Rockville, MD, USA
4
Guo H, Lv X, Li Y, Li M. Attention-based GCN integrates multi-omics data for breast cancer subtype classification and patient-specific gene marker identification. Brief Funct Genomics 2023; 22:463-474. [PMID: 37114942 DOI: 10.1093/bfgp/elad013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2022] [Revised: 02/16/2023] [Accepted: 03/17/2023] [Indexed: 04/29/2023] Open
Abstract
Breast cancer is a heterogeneous disease and can be divided into several subtypes with unique prognostic and molecular characteristics. The classification of breast cancer subtypes plays an important role in the precision treatment and prognosis of breast cancer. Benefitting from the relation-aware ability of a graph convolution network (GCN), we present a multi-omics integrative method, the attention-based GCN (AGCN), for breast cancer molecular subtype classification using messenger RNA expression, copy number variation and deoxyribonucleic acid methylation multi-omics data. In extensive comparative studies, our AGCN models outperform state-of-the-art methods under different experimental conditions, and both the attention mechanisms and the graph convolution subnetwork play an important role in accurate cancer subtype classification. The layer-wise relevance propagation (LRP) algorithm is used to interpret model decisions and can identify patient-specific important biomarkers that are reported to be related to the occurrence and development of breast cancer. Our results highlight the effectiveness of the GCN and attention mechanisms in multi-omics integrative analysis, and the implementation of the LRP algorithm can provide biologically reasonable insights into model decisions.
Affiliation(s)
- Hui Guo
- College of Chemistry at Sichuan University
- Xiang Lv
- College of Chemistry at Sichuan University
- Yizhou Li
- College of Cyber Science and Engineering at Sichuan University
5
Sun P, Fan S, Li S, Zhao Y, Lu C, Wong KC, Li X. Automated exploitation of deep learning for cancer patient stratification across multiple types. Bioinformatics 2023; 39:btad654. [PMID: 37934154 PMCID: PMC10636288 DOI: 10.1093/bioinformatics/btad654] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2022] [Revised: 10/17/2023] [Indexed: 11/08/2023] Open
Abstract
MOTIVATION Recent frameworks based on deep learning have been developed to identify cancer subtypes from high-throughput gene expression profiles. Unfortunately, the performance of deep learning is highly dependent on its neural network architectures, which are often hand-crafted with expertise in deep neural networks; moreover, optimizing and adjusting the network is usually costly and time-consuming. RESULTS To address these limitations, we proposed a fully automated deep neural architecture search model for diagnosing consensus molecular subtypes from gene expression data (DNAS). The proposed model uses the ant colony algorithm, a heuristic swarm intelligence algorithm, to search and optimize the neural network architecture, and it can automatically find the optimal deep learning model architecture for cancer diagnosis in its search space. We validated DNAS on eight colorectal cancer datasets, achieving an average accuracy of 95.48%, an average specificity of 98.07%, and an average sensitivity of 96.24%. Without loss of generality, we further investigated the applicability of DNAS to other cancer types from different platforms, including lung cancer and breast cancer, where DNAS achieved an area under the curve of 95% and 96%, respectively. In addition, we conducted gene ontology enrichment and pathological analysis to reveal interesting insights into cancer subtype identification and characterization across multiple cancer types. AVAILABILITY AND IMPLEMENTATION The source code and data can be downloaded from https://github.com/userd113/DNAS-main. The web server of DNAS is publicly accessible at 119.45.145.120:5001.
Affiliation(s)
- Pingping Sun
- School of Information Science and Technology, Northeast Normal University, Jilin, China
- Shijie Fan
- School of Information Science and Technology, Northeast Normal University, Jilin, China
- Shaochuan Li
- School of Information Science and Technology, Northeast Normal University, Jilin, China
- School of Artificial Intelligence, Jilin University, Jilin, China
- Yingwei Zhao
- School of Information Science and Technology, Northeast Normal University, Jilin, China
- Chang Lu
- School of Information Science and Technology, Northeast Normal University, Jilin, China
- School of Psychology, Northeast Normal University, Jilin, China
- Ka-Chun Wong
- Department of Computer Science, City University of Hong Kong, Hong Kong China
- Xiangtao Li
- School of Artificial Intelligence, Jilin University, Jilin, China
6
Hegarty C, Neto N, Cahill P, Floudas A. Computational approaches in rheumatic diseases - Deciphering complex spatio-temporal cell interactions. Comput Struct Biotechnol J 2023; 21:4009-4020. [PMID: 37649712 PMCID: PMC10462794 DOI: 10.1016/j.csbj.2023.08.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2023] [Revised: 08/04/2023] [Accepted: 08/04/2023] [Indexed: 09/01/2023] Open
Abstract
Inflammatory arthritides, including rheumatoid arthritis (RA) and psoriatic arthritis (PsA), are clinically and immunologically heterogeneous diseases with no identified cure. Chronic inflammation of the synovial tissue leads to loss of joint function that severely impacts the patient's quality of life, eventually leading to disability and life-threatening comorbidities. The pathogenesis of synovial inflammation is the consequence of compounded immune and stromal cell interactions influenced by genetic and environmental factors. Deciphering the complexity of the synovial cellular landscape has accelerated primarily due to the utilisation of bulk and single-cell RNA sequencing. In particular, the capacity to generate cell-cell interaction networks could reveal evidence of previously unappreciated processes leading to disease. However, varied experimental and technological approaches have left the field without a universal nomenclature, which complicates the study of synovial inflammation. While spatial transcriptomic analysis that combines anatomical information with transcriptomic data of synovial tissue biopsies promises to provide more insights into disease pathogenesis, in vitro functional assays with single-cell resolution will be required to validate current bioinformatic applications. In order to provide a comprehensive approach and translate experimental data to clinical practice, a combination of clinical and molecular data with machine learning has the potential to enhance patient stratification and identify individuals at risk of arthritis who would benefit from early therapeutic intervention. This review aims to provide a comprehensive understanding of the effect of computational approaches in deciphering synovial inflammation pathogenesis and to discuss the impact that further experimental and novel computational tools may have on therapeutic target identification and drug development.
Affiliation(s)
- Ciara Hegarty
- Translational Immunology lab, School of Biotechnology, Dublin City University, Dublin, Ireland
- Nuno Neto
- Trinity Centre for Biomedical Engineering, Trinity College Dublin, Ireland
- Paul Cahill
- Vascular Biology lab, School of Biotechnology, Dublin City University, Dublin, Ireland
- Achilleas Floudas
- Translational Immunology lab, School of Biotechnology, Dublin City University, Dublin, Ireland
7
Cascianelli S, Galzerano A, Masseroli M. Supervised Relevance-Redundancy assessments for feature selection in omics-based classification scenarios. J Biomed Inform 2023; 144:104457. [PMID: 37488024 DOI: 10.1016/j.jbi.2023.104457] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2023] [Revised: 06/05/2023] [Accepted: 07/19/2023] [Indexed: 07/26/2023]
Abstract
BACKGROUND AND OBJECTIVE Many classification tasks in translational bioinformatics and genomics are characterized by the high dimensionality of potential features and unbalanced sample distribution among classes. This can affect classifier robustness and increase the risk of overfitting, curse of dimensionality and generalization leaks; furthermore, and most importantly, this can prevent obtaining the adequate patient stratification required for precision medicine in facing complex diseases, like cancer. Setting up a feature selection strategy able to extract only proper predictive features by removing irrelevant, redundant, and noisy ones is crucial to achieving valuable results on the desired task. METHODS We propose a new feature selection approach, called ReRa, based on supervised Relevance-Redundancy assessments. ReRa consists of a customized step of relevance-based filtering, to identify a reduced subset of meaningful features, followed by a supervised similarity-based procedure to minimize redundancy. This latter step innovatively uses a combination of global and class-specific similarity assessments to remove redundant features while preserving those differentiated across classes, even when these classes are strongly unbalanced. RESULTS We compared ReRa with several existing feature selection methods to obtain feature spaces on which to perform breast cancer patient subtyping using several classifiers: we considered two use cases based on gene or transcript isoform expression. In the vast majority of the assessed scenarios, when using ReRa-selected feature spaces, the performances were significantly increased compared to simple feature filtering, LASSO regularization, or even MRmr - another Relevance-Redundancy method. The two use cases represent an insightful example of translational application, taking advantage of ReRa capabilities to investigate and enhance a clinically-relevant patient stratification task, which could easily be applied to other cancer types and diseases as well. CONCLUSIONS The ReRa approach has the potential to improve the performance of machine learning models used in an unbalanced classification scenario. Compared to another Relevance-Redundancy approach like MRmr, ReRa does not require tuning the number of preserved features, ensures efficiency and scalability over huge initial dimensionalities and allows re-evaluation of all previously selected features at each iteration of the redundancy assessment, to ultimately preserve only the most relevant and class-differentiated features.
Affiliation(s)
- Silvia Cascianelli
- Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Piazza Leonardo da Vinci, 32, Milano, 20133, Italy
- Arianna Galzerano
- Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Piazza Leonardo da Vinci, 32, Milano, 20133, Italy
- Marco Masseroli
- Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Piazza Leonardo da Vinci, 32, Milano, 20133, Italy
8
Ortiz MMO, Andrechek ER. Molecular Characterization and Landscape of Breast cancer Models from a multi-omics Perspective. J Mammary Gland Biol Neoplasia 2023; 28:12. [PMID: 37269418 DOI: 10.1007/s10911-023-09540-2] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 03/09/2023] [Accepted: 05/25/2023] [Indexed: 06/05/2023] Open
Abstract
Breast cancer is well known to be a highly heterogeneous disease. This facet of cancer makes finding a research model that mirrors the disparate intrinsic features challenging. With advances in multi-omics technologies, establishing parallels between the various models and human tumors is increasingly intricate. Here we review the various model systems and their relation to primary breast tumors using available omics data platforms. Among the research models reviewed here, breast cancer cell lines have the least resemblance to human tumors since they have accumulated many mutations and copy number alterations during their long use. Moreover, individual proteomic and metabolomic profiles do not overlap with the molecular landscape of breast cancer. Interestingly, omics analysis revealed that the initial subtype classification of some breast cancer cell lines was inappropriate. In cell lines the major subtypes are all well represented and share some features with primary tumors. In contrast, patient-derived xenografts (PDX) and patient-derived organoids (PDO) are superior in mirroring human breast cancers at many levels, making them suitable models for drug screening and molecular analysis. While patient-derived organoids are spread across luminal, basal- and normal-like subtypes, the PDX samples were initially largely basal but other subtypes have been increasingly described. Murine models offer heterogeneous tumor landscapes, inter- and intra-model heterogeneity, and give rise to tumors of different phenotypes and histology. Murine models have a reduced mutational burden compared to human breast cancer but share some transcriptomic resemblance, and representation of many breast cancer subtypes can be found among the variety of subtypes. To date, while mammospheres and three-dimensional cultures lack comprehensive omics data, these are excellent models for the study of stem cells, cell fate decision and differentiation, and have also been used for drug screening. Therefore, this review explores the molecular landscapes and characterization of breast cancer research models by comparing recently published multi-omics data and analysis.
Affiliation(s)
- Mylena M O Ortiz
- Genetics and Genomics Science Program, Michigan State University, East Lansing, MI, USA
- Eran R Andrechek
- Department of Physiology, Michigan State University, 2194 BPS Building 567 Wilson Road, East Lansing, MI, 48824, USA
9
Cascianelli S, Barbera C, Ulla AA, Grassi E, Lupo B, Pasini D, Bertotti A, Trusolino L, Medico E, Isella C, Masseroli M. Multi-label transcriptional classification of colorectal cancer reflects tumor cell population heterogeneity. Genome Med 2023; 15:37. [PMID: 37189167 PMCID: PMC10184353 DOI: 10.1186/s13073-023-01176-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2022] [Accepted: 03/31/2023] [Indexed: 05/17/2023] Open
Abstract
BACKGROUND Transcriptional classification has been used to stratify colorectal cancer (CRC) into molecular subtypes with distinct biological and clinical features. However, it is not clear whether such subtypes represent discrete, mutually exclusive entities or molecular/phenotypic states with potential overlap. Therefore, we focused on the CRC Intrinsic Subtype (CRIS) classifier and evaluated whether assigning multiple CRIS subtypes to the same sample provides additional clinically and biologically relevant information. METHODS A multi-label version of the CRIS classifier (multiCRIS) was applied to newly generated RNA-seq profiles from 606 CRC patient-derived xenografts (PDXs), together with human CRC bulk and single-cell RNA-seq datasets. Biological and clinical associations of single- and multi-label CRIS were compared. Finally, a machine learning-based multi-label CRIS predictor (ML2CRIS) was developed for single-sample classification. RESULTS Surprisingly, about half of the CRC cases could be significantly assigned to more than one CRIS subtype. Single-cell RNA-seq analysis revealed that multiple CRIS membership can be a consequence of the concomitant presence of cells of different CRIS class or, less frequently, of cells with hybrid phenotype. Multi-label assignments were found to improve prediction of CRC prognosis and response to treatment. Finally, the ML2CRIS classifier was validated for retaining the same biological and clinical associations also in the context of single-sample classification. CONCLUSIONS These results show that CRIS subtypes retain their biological and clinical features even when concomitantly assigned to the same CRC sample. This approach could be potentially extended to other cancer types and classification systems.
Affiliation(s)
- Silvia Cascianelli
- Department of Electronics, Information and Bioengineering, Politecnico Di Milano, Piazza Leonardo da Vinci 32, 20133, Milan, Italy
- Chiara Barbera
- Department of Electronics, Information and Bioengineering, Politecnico Di Milano, Piazza Leonardo da Vinci 32, 20133, Milan, Italy
- Alexandra Ambra Ulla
- Department of Oncology, University of Turin, S.P. 142, Km 3.95, 10060, Candiolo (TO), Turin, Italy
- Elena Grassi
- Department of Oncology, University of Turin, S.P. 142, Km 3.95, 10060, Candiolo (TO), Turin, Italy
- Candiolo Cancer Institute, FPO-IRCCS, S.P. 142, Km 3.95, 10060, Candiolo (TO), Italy
- Barbara Lupo
- Department of Oncology, University of Turin, S.P. 142, Km 3.95, 10060, Candiolo (TO), Turin, Italy
- Candiolo Cancer Institute, FPO-IRCCS, S.P. 142, Km 3.95, 10060, Candiolo (TO), Italy
- Diego Pasini
- Department of Experimental Oncology, IEO, European Institute of Oncology IRCCS, Via Adamello 16, 20139, Milan, Italy
- Department of Health Sciences, University of Milan, Via A. Di Rudini 8, 20142, Milan, Italy
- Andrea Bertotti
- Department of Oncology, University of Turin, S.P. 142, Km 3.95, 10060, Candiolo (TO), Turin, Italy
- Candiolo Cancer Institute, FPO-IRCCS, S.P. 142, Km 3.95, 10060, Candiolo (TO), Italy
- Livio Trusolino
- Department of Oncology, University of Turin, S.P. 142, Km 3.95, 10060, Candiolo (TO), Turin, Italy
- Candiolo Cancer Institute, FPO-IRCCS, S.P. 142, Km 3.95, 10060, Candiolo (TO), Italy
- Enzo Medico
- Department of Oncology, University of Turin, S.P. 142, Km 3.95, 10060, Candiolo (TO), Turin, Italy
- Candiolo Cancer Institute, FPO-IRCCS, S.P. 142, Km 3.95, 10060, Candiolo (TO), Italy
- Claudio Isella
- Department of Oncology, University of Turin, S.P. 142, Km 3.95, 10060, Candiolo (TO), Turin, Italy
- Candiolo Cancer Institute, FPO-IRCCS, S.P. 142, Km 3.95, 10060, Candiolo (TO), Italy
- Marco Masseroli
- Department of Electronics, Information and Bioengineering, Politecnico Di Milano, Piazza Leonardo da Vinci 32, 20133, Milan, Italy
10
Bergom HE, Shabaneh A, Day A, Ali A, Boytim E, Tape S, Lozada JR, Shi X, Kerkvliet CP, McSweeney S, Pitzen SP, Ludwig M, Antonarakis ES, Drake JM, Dehm SM, Ryan CJ, Wang J, Hwang J. ALAN is a computational approach that interprets genomic findings in the context of tumor ecosystems. Commun Biol 2023; 6:417. [PMID: 37059746 PMCID: PMC10104859 DOI: 10.1038/s42003-023-04795-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2022] [Accepted: 04/03/2023] [Indexed: 04/16/2023] [Open Access]
Abstract
Gene behavior is governed by activity of other genes in an ecosystem as well as context-specific cues including cell type, microenvironment, and prior exposure to therapy. Here, we developed the Algorithm for Linking Activity Networks (ALAN) to compare gene behavior purely based on patient -omic data. The types of gene behaviors identifiable by ALAN include co-regulators of a signaling pathway, protein-protein interactions, or any set of genes that function similarly. ALAN identified direct protein-protein interactions in prostate cancer (AR, HOXB13, and FOXA1). We found differential and complex ALAN networks associated with the proto-oncogene MYC as prostate tumors develop and become metastatic, between different cancer types, and within cancer subtypes. We discovered that resistant genes in prostate cancer shared an ALAN ecosystem and activated similar oncogenic signaling pathways. Altogether, ALAN represents an informatics approach for developing gene signatures, identifying gene targets, and interpreting mechanisms of progression or therapy resistance.
Affiliation(s)
- Hannah E Bergom
  - Department of Medicine, University of Minnesota Masonic Cancer Center, Minneapolis, MN, USA
  - Division of Hematology, Oncology and Transplantation, University of Minnesota, Minneapolis, MN, USA
- Ashraf Shabaneh
  - Department of Medicine, University of Minnesota Masonic Cancer Center, Minneapolis, MN, USA
  - Institute for Health Informatics, University of Minnesota, Minneapolis, MN, USA
- Abderrahman Day
  - Department of Medicine, University of Minnesota Masonic Cancer Center, Minneapolis, MN, USA
  - Division of Hematology, Oncology and Transplantation, University of Minnesota, Minneapolis, MN, USA
  - Institute for Health Informatics, University of Minnesota, Minneapolis, MN, USA
- Atef Ali
  - Department of Medicine, University of Minnesota Masonic Cancer Center, Minneapolis, MN, USA
  - Division of Hematology, Oncology and Transplantation, University of Minnesota, Minneapolis, MN, USA
- Ella Boytim
  - Department of Medicine, University of Minnesota Masonic Cancer Center, Minneapolis, MN, USA
  - Division of Hematology, Oncology and Transplantation, University of Minnesota, Minneapolis, MN, USA
- Sydney Tape
  - Department of Medicine, University of Minnesota Masonic Cancer Center, Minneapolis, MN, USA
  - Division of Hematology, Oncology and Transplantation, University of Minnesota, Minneapolis, MN, USA
- John R Lozada
  - Department of Medicine, University of Minnesota Masonic Cancer Center, Minneapolis, MN, USA
- Xiaolei Shi
  - Department of Medicine, University of Minnesota Masonic Cancer Center, Minneapolis, MN, USA
- Carlos Perez Kerkvliet
  - Department of Medicine, University of Minnesota Masonic Cancer Center, Minneapolis, MN, USA
- Sean McSweeney
  - Department of Medicine, University of Minnesota Masonic Cancer Center, Minneapolis, MN, USA
- Samuel P Pitzen
  - Masonic Cancer Center, University of Minnesota, Minneapolis, MN, USA
  - Graduate Program in Molecular, Cellular, and Developmental Biology and Genetics, University of Minnesota, Minneapolis, MN, USA
- Megan Ludwig
  - Department of Pharmacology, University of Minnesota, Minneapolis, MN, USA
- Emmanuel S Antonarakis
  - Department of Medicine, University of Minnesota Masonic Cancer Center, Minneapolis, MN, USA
  - Division of Hematology, Oncology and Transplantation, University of Minnesota, Minneapolis, MN, USA
  - Masonic Cancer Center, University of Minnesota, Minneapolis, MN, USA
- Justin M Drake
  - Department of Medicine, University of Minnesota Masonic Cancer Center, Minneapolis, MN, USA
  - Department of Pharmacology, University of Minnesota, Minneapolis, MN, USA
  - Department of Urology, University of Minnesota, Minneapolis, MN, USA
- Scott M Dehm
  - Masonic Cancer Center, University of Minnesota, Minneapolis, MN, USA
  - Department of Urology, University of Minnesota, Minneapolis, MN, USA
  - Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN, USA
- Charles J Ryan
  - Department of Medicine, University of Minnesota Masonic Cancer Center, Minneapolis, MN, USA
  - Division of Hematology, Oncology and Transplantation, University of Minnesota, Minneapolis, MN, USA
  - Masonic Cancer Center, University of Minnesota, Minneapolis, MN, USA
  - Prostate Cancer Foundation, Santa Monica, CA, USA
- Jinhua Wang
  - Department of Medicine, University of Minnesota Masonic Cancer Center, Minneapolis, MN, USA
  - Institute for Health Informatics, University of Minnesota, Minneapolis, MN, USA
  - Masonic Cancer Center, University of Minnesota, Minneapolis, MN, USA
- Justin Hwang
  - Department of Medicine, University of Minnesota Masonic Cancer Center, Minneapolis, MN, USA
  - Division of Hematology, Oncology and Transplantation, University of Minnesota, Minneapolis, MN, USA
  - Masonic Cancer Center, University of Minnesota, Minneapolis, MN, USA
11
|
Delfino JG, Pennello GA, Barnhart HX, Buckler AJ, Wang X, Huang EP, Raunig DL, Guimaraes AR, Hall TJ, deSouza NM, Obuchowski N. Multiparametric Quantitative Imaging Biomarkers for Phenotype Classification: A Framework for Development and Validation. Acad Radiol 2023; 30:183-195. [PMID: 36202670 PMCID: PMC9825632 DOI: 10.1016/j.acra.2022.09.004] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 06/02/2022] [Revised: 08/22/2022] [Accepted: 09/05/2022] [Indexed: 01/11/2023]
Abstract
This manuscript is the third in a five-part series related to statistical assessment methodology for technical performance of multi-parametric quantitative imaging biomarkers (mp-QIBs). We outline approaches and statistical methodologies for developing and evaluating a phenotype classification model from a set of multiparametric QIBs. We then describe validation studies of the classifier for precision, diagnostic accuracy, and interchangeability with a comparator classifier. We follow with an end-to-end real-world example of development and validation of a classifier for atherosclerotic plaque phenotypes. We consider diagnostic accuracy and interchangeability to be clinically meaningful claims for a phenotype classification model informed by mp-QIB inputs, aiming to provide tools to demonstrate agreement between imaging-derived characteristics and clinically established phenotypes. Understanding that we are working in an evolving field, we close our manuscript with an acknowledgement of existing challenges and a discussion of where additional work is needed. In particular, we discuss the challenges involved with technical performance and analytical validation of mp-QIBs. We intend for this manuscript to further advance the robust and promising science of multiparametric biomarker development.
Affiliation(s)
- Jana G Delfino
  - Center for Devices and Radiological Health, US Food and Drug Administration, Silver Spring, MD
- Gene A Pennello
  - Center for Devices and Radiological Health, US Food and Drug Administration, Silver Spring, MD
- Huiman X Barnhart
  - Department of Biostatistics and Bioinformatics, Duke University, Durham, NC
- Xiaofeng Wang
  - Department of Quantitative Health Sciences, Lerner Research Institute, Cleveland Clinic, Cleveland, OH
- Erich P Huang
  - Biometric Research Program, Division of Cancer Treatment and Diagnosis - National Cancer Institute, National Institutes of Health, Bethesda, MD
- Dave L Raunig
  - Data Science Institute, Statistical and Quantitative Sciences, Takeda Pharmaceuticals America Inc, Lexington, MA
- Alexander R Guimaraes
  - Department of Diagnostic Radiology, Oregon Health & Science University, Portland, OR
- Timothy J Hall
  - Department of Medical Physics, University of Wisconsin, Madison, WI
- Nandita M deSouza
  - Division of Radiotherapy and Imaging, The Institute of Cancer Research and Royal Marsden NHS Foundation Trust, London, United Kingdom
  - European Imaging Biomarkers Alliance (EIBALL), European Society of Radiology (ESR), Vienna, Austria
- Nancy Obuchowski
  - Department of Quantitative Health Sciences, Lerner Research Institute, Cleveland Clinic, Cleveland, OH
12
|
Hamaneh M, Yu YK. A Simple Method for Robust and Accurate Intrinsic Subtyping of Breast Cancer. Cancer Inform 2023; 22:11769351231159893. [PMID: 37008073 PMCID: PMC10052604 DOI: 10.1177/11769351231159893] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2022] [Accepted: 02/07/2023] [Indexed: 04/04/2023] [Open Access]
Abstract
Motivation: The PAM50 signature/method is widely used for intrinsic subtyping of breast cancer samples. However, depending on the number and composition of the samples included in a cohort, the method may assign different subtypes to the same sample. This lack of robustness arises mainly because PAM50 subtracts a reference profile, computed from all samples in the cohort, from each sample before classification. In this paper we propose modifications to PAM50 to develop a simple and robust single-sample classifier, called MPAM50, for intrinsic subtyping of breast cancer. Like PAM50, the modified method uses a nearest centroid approach for classification, but the centroids are computed differently, and the distances to the centroids are determined using an alternative method. Additionally, MPAM50 uses unnormalized expression values for classification and does not subtract a reference profile from the samples. In other words, MPAM50 classifies each sample independently, and so avoids the previously mentioned robustness issue.
Results: A training set was employed to find the new MPAM50 centroids. MPAM50 was then tested on 19 independent datasets (obtained using various expression profiling technologies) containing 9637 samples. Overall, good agreement was observed between the PAM50- and MPAM50-assigned subtypes, with a median accuracy of 0.792, which (we show) is comparable with the median concordance between various implementations of PAM50. Additionally, MPAM50- and PAM50-assigned intrinsic subtypes were found to agree comparably with the reported clinical subtypes. Survival analyses also indicated that MPAM50 preserves the prognostic value of the intrinsic subtypes. These observations demonstrate that MPAM50 can replace PAM50 without loss of performance. In addition, MPAM50 was compared with 2 previously published single-sample classifiers and with 3 alternative modified PAM50 approaches. The results indicated a superior performance by MPAM50.
Conclusions: MPAM50 is a robust, simple, and accurate single-sample classifier of intrinsic subtypes of breast cancer.
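The single-sample idea in this abstract (classify each sample against fixed centroids, with no cohort-derived reference profile subtracted) can be sketched as below. This is a minimal illustration and not the MPAM50 implementation: the centroid values, the four-gene profile, and the choice of Spearman rank correlation as the similarity measure are all assumptions made for the example, since the abstract does not state which distance the method uses.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den if den else 0.0


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def classify_sample(sample, centroids):
    """Assign the subtype whose centroid correlates best with the sample.

    Only the sample itself and the fixed centroids are used, so the
    result does not depend on which other samples are in the cohort.
    """
    scores = {name: spearman(sample, c) for name, c in centroids.items()}
    return max(scores, key=scores.get), scores


# Hypothetical centroids over four genes (illustrative values only).
centroids = {
    "LumA": [5.0, 4.0, 1.0, 0.5],
    "Basal": [0.5, 1.0, 4.0, 5.0],
}
sample = [9.1, 7.3, 2.2, 1.0]  # unnormalized expression of one sample
label, scores = classify_sample(sample, centroids)
```

Because a rank-based similarity depends only on the ordering of genes within the sample, the assignment is unchanged by any monotone rescaling of that sample's values, which is one plausible way to tolerate unnormalized input.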
Affiliation(s)
- Yi-Kuo Yu
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894, USA
13
|
Cattelani L, Fortino V. Identifying gene expression-based biomarkers in online learning environments. Bioinformatics Advances 2022; 2:vbac074. [PMID: 36699355 PMCID: PMC9710669 DOI: 10.1093/bioadv/vbac074] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2022] [Revised: 09/07/2022] [Accepted: 10/11/2022] [Indexed: 11/06/2022]
Abstract
Motivation: Gene expression-based classifiers are often developed using historical data by training a model on a small set of patients and a large set of features. Models trained in this way can afterwards be applied to predict the output for new, unseen patient data. However, the accuracy of these models very often starts to decrease as soon as new data are fed into the trained model. This problem, known as concept drift, complicates the task of learning efficient biomarkers from data and requires special approaches, different from commonly used data mining techniques.
Results: Here, we propose an online ensemble learning method to continually validate and adjust gene expression-based biomarker panels over an increasing volume of data. We also propose a computational solution to the problem of feature drift, where gene expression signatures used to train the classifier become less relevant over time. A benchmark study was conducted to classify breast tumors into known subtypes by using a large-scale transcriptomic dataset (∼3500 patients), which was obtained by combining two datasets: SCAN-B and TCGA-BRCA. Remarkably, the proposed strategy improves the classification performance of gold-standard biomarker panels (e.g. PAM50, OncotypeDX and Endopredict) by adding features that are clinically relevant. Moreover, test results show that newly discovered biomarker models can retain a high classification accuracy when the source generating the gene expression profiles changes.
Availability and implementation: github.com/UEFBiomedicalInformaticsLab/OnlineLearningBD.
Supplementary information: Supplementary data are available at Bioinformatics Advances online.
Affiliation(s)
- Luca Cattelani
  - Institute of Biomedicine, School of Medicine, University of Eastern Finland, Kuopio, Finland
14
|
van der Kamp A, Waterlander TJ, de Bel T, van der Laak J, van den Heuvel-Eibrink MM, Mavinkurve-Groothuis AMC, de Krijger RR. Artificial Intelligence in Pediatric Pathology: The Extinction of a Medical Profession or the Key to a Bright Future? Pediatr Dev Pathol 2022; 25:380-387. [PMID: 35238696 DOI: 10.1177/10935266211059809] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
Artificial Intelligence (AI) has attracted increasing interest over the past decade. While digital image analysis (DIA) is already being used in radiology, it is still in its infancy in pathology. One of the reasons is that large-scale digitization of glass slides has only recently become available. With the advent of digital slide scanners, which digitize glass slides into whole slide images, many labs are now in a transition phase towards digital pathology. However, only a few departments worldwide are currently fully digital. Digital pathology provides the ability to annotate large datasets and train computers to develop and validate robust algorithms, similar to radiology. In this opinionated overview, we give a brief introduction to AI in pathology, discuss the potential positive and negative implications, and speculate about the future role of AI in the field of pediatric pathology.
Affiliation(s)
- Ananda van der Kamp
  - Princess Máxima Center for Pediatric Oncology, Utrecht, the Netherlands
- Tomas J Waterlander
  - Princess Máxima Center for Pediatric Oncology, Utrecht, the Netherlands
- Thomas de Bel
  - Department of Pathology, Radboud University Medical Center, Nijmegen, the Netherlands
- Jeroen van der Laak
  - Department of Pathology, Radboud University Medical Center, Nijmegen, the Netherlands
  - Center for Medical Image Science and Visualization, Linköping University, Linköping, Sweden
- Ronald R de Krijger
  - Princess Máxima Center for Pediatric Oncology, Utrecht, the Netherlands
  - Department of Pathology, University Medical Center Utrecht, Utrecht, the Netherlands
15
|
Immune Subtype Profiling and Establishment of Prognostic Immune-Related lncRNA Pairs in Human Ovarian Cancer. Computational and Mathematical Methods in Medicine 2022; 2022:8338137. [PMID: 35578596 PMCID: PMC9107039 DOI: 10.1155/2022/8338137] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2021] [Accepted: 02/28/2022] [Indexed: 11/18/2022]
Abstract
This study collected immune-related genes (IRGs) and used gene expression data from the TCGA database to construct molecular subtypes of ovarian cancer (OV) based on immune-related lncRNA gene pairs (IRLnc_GPs). The relationships between molecular subtypes and prognosis and clinical characteristics were further explored. IRGs were acquired from the ImmPort database, and round-robin pairing of immune-related lncRNAs was performed. The NMF algorithm was used to identify molecular subtypes, and the immune score of each sample was calculated through ESTIMATE, TIMER, ssGSEA, MCPcounter, and CIBERSORT. The relationship between molecular subtypes and immune microenvironments was identified. A hypergeometric test was used to test the lncRNA pairs among the OV molecular subtypes (C1 and C2 subtypes). The BH method was used to screen the differential lncRNA pairs, and a predictive risk model was constructed and verified. Finally, correlation analysis between the risk model, immune checkpoint genes, and chemotherapy drugs was carried out. Classification of 373 TCGA OV samples based on IRLnc_GPs divided them into two subtypes with significantly different prognoses. The C1 subtype, which had a poor prognosis, was more closely related to pathways of tumor occurrence and development. We identified 180 differential lncRNA pairs between subtypes and constructed a prognostic risk model based on 8 IRLnc_GPs. In the independent dataset, the distribution of subtypes in functional modules was different and highly repeatable. There were significant differences between the subtypes in molecular and clinical characteristics and in sensitivity to immunotherapy and chemotherapy drugs. In conclusion, the risk model established based on IRLnc_GPs can better evaluate the prognosis of OV samples and can also assess the effects of different drug treatments in the high- and low-risk groups, providing new insights and ideas for the treatment of OV.
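The screening step mentioned in this abstract can be illustrated with a short sketch of p-value adjustment, assuming that "BH" denotes the Benjamini-Hochberg false discovery rate procedure. The p-values below are made up for the example and are not taken from the study; the function name is likewise hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    The i-th smallest p-value is scaled by m / i, then a running minimum
    taken from the largest rank downwards enforces monotonicity.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Illustrative p-values, e.g. one per candidate lncRNA pair.
pvalues = [0.01, 0.04, 0.03, 0.20]
adjusted = benjamini_hochberg(pvalues)

# Keep the candidates whose adjusted p-value is below a chosen FDR level.
selected = [i for i, q in enumerate(adjusted) if q < 0.05]
```

With these inputs only the first candidate passes the 0.05 threshold: the raw values 0.03 and 0.04 are both adjusted to about 0.053.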
16
|
Sun P, Wu Y, Yin C, Jiang H, Xu Y, Sun H. Molecular Subtyping of Cancer Based on Distinguishing Co-Expression Modules and Machine Learning. Front Genet 2022; 13:866005. [PMID: 35586568 PMCID: PMC9108363 DOI: 10.3389/fgene.2022.866005] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/30/2022] [Accepted: 03/07/2022] [Indexed: 02/05/2023] [Open Access]
Abstract
Molecular subtyping of cancer is recognized as a critical and challenging step towards individualized therapy. Most existing computational methods solve this problem via multi-class classification of the gene expression profiles of cancer samples. Although these methods, especially deep learning, perform well in data classification, they usually require large amounts of data for model training and have limitations in interpretability. In addition, as cancer is a complex systemic disease, the phenotypic difference between cancer samples can hardly be fully understood by analyzing single molecules alone, and differential expression-based molecular subtyping methods are reportedly not conserved. To address these issues, we present a new framework for molecular subtyping of cancer that identifies a robust, specific co-expression module for each cancer subtype, generates network features for each sample by perturbing the correlation levels of specific edges, and then trains a deep neural network for multi-class classification. When applied to breast cancer (BRCA) and stomach adenocarcinoma (STAD) molecular subtyping, it shows superior classification performance over existing methods. In addition to improving classification performance, we consider the specific co-expressed modules selected for subtyping to be biologically meaningful, which potentially offers new insight for diagnostic biomarker design, mechanistic studies of cancer, and individualized treatment plan selection.
Affiliation(s)
- Peishuo Sun
  - School of Artificial Intelligence, Jilin University, Changchun, China
- Ying Wu
  - Phase I Clinical Trials Center, The First Affiliated Hospital, China Medical University, Shenyang, China
- Chaoyi Yin
  - School of Artificial Intelligence, Jilin University, Changchun, China
- Hongyang Jiang
  - School of Artificial Intelligence, Jilin University, Changchun, China
- Ying Xu
  - Computational Systems Biology Lab, Department of Biochemistry and Molecular Biology and Institute of Bioinformatics, University of Georgia, Athens, GA, United States
  - *Correspondence: Huiyan Sun; Ying Xu
- Huiyan Sun
  - School of Artificial Intelligence, Jilin University, Changchun, China
  - Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China
  - *Correspondence: Huiyan Sun; Ying Xu
17
|
Abstract
Gastric cancer (GC) is a leading contributor to global cancer incidence and mortality. Pioneering genomic studies, focusing largely on primary GCs, revealed driver alterations in genes such as ERBB2, FGFR2, TP53 and ARID1A as well as multiple molecular subtypes. However, clinical efforts targeting these alterations have produced variable results, hampered by complex co-alteration patterns in molecular profiles and intra-patient genomic heterogeneity. In this Review, we highlight foundational and translational advances in dissecting the genomic cartography of GC, including non-coding variants, epigenomic aberrations and transcriptomic alterations, and describe how these alterations interplay with environmental influences, germline factors and the tumour microenvironment. Mapping of these alterations over the GC life cycle in normal gastric tissues, metaplasia, primary carcinoma and distant metastasis will improve our understanding of biological mechanisms driving GC development and promoting cancer hallmarks. On the translational front, integrative genomic approaches are identifying diverse mechanisms of GC therapy resistance and emerging preclinical targets, enabled by technologies such as single-cell sequencing and liquid biopsies. Validating these insights will require specifically designed GC cohorts, converging multi-modal genomic data with longitudinal data on therapeutic challenges and patient outcomes. Genomic findings from these studies will facilitate 'next-generation' clinical initiatives in GC precision oncology and prevention.
Affiliation(s)
- Khay Guan Yeoh
  - Department of Medicine, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
  - Department of Gastroenterology and Hepatology, National University Health System, Singapore, Singapore
  - Singapore Gastric Cancer Consortium, Singapore, Singapore
- Patrick Tan
  - Singapore Gastric Cancer Consortium, Singapore, Singapore
  - Cancer and Stem Cell Biology, Duke-NUS Medical School Singapore, Singapore, Singapore
  - Genome Institute of Singapore, Singapore, Singapore
  - Cancer Science Institute of Singapore, National University of Singapore, Singapore, Singapore
18
|
Cristovao F, Cascianelli S, Canakoglu A, Carman M, Nanni L, Pinoli P, Masseroli M. Investigating Deep Learning Based Breast Cancer Subtyping Using Pan-Cancer and Multi-Omic Data. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:121-134. [PMID: 33270566 DOI: 10.1109/tcbb.2020.3042309] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 06/12/2023]
Abstract
Breast cancer comprises multiple subtypes implicated in prognosis. Existing stratification methods rely on the expression quantification of small gene sets. Next Generation Sequencing promises large amounts of omic data in the coming years. In this scenario, we explore the potential of machine learning and, particularly, deep learning for breast cancer subtyping. Due to the paucity of publicly available data, we leverage pan-cancer and non-cancer data to design semi-supervised settings. We make use of multi-omic data, including microRNA expressions and copy number alterations, and we provide an in-depth investigation of several supervised and semi-supervised architectures. The accuracy results obtained show that simpler models perform at least as well as the deep semi-supervised approaches on our task over gene expression data. When multi-omic data types are combined, the performance of deep models shows little (if any) improvement in accuracy, indicating the need for further analysis on larger datasets of multi-omic data as and when they become available. From a biological perspective, our linear model mostly confirms known gene-subtype annotations. Conversely, deep approaches model non-linear relationships, which is reflected in a more varied and still unexplored set of representative omic features that may prove useful for breast cancer subtyping.
19
|
Scott MA, Woolums AR, Swiderski CE, Perkins AD, Nanduri B. Genes and regulatory mechanisms associated with experimentally-induced bovine respiratory disease identified using supervised machine learning methodology. Sci Rep 2021; 11:22916. [PMID: 34824337 PMCID: PMC8616896 DOI: 10.1038/s41598-021-02343-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 08/07/2021] [Accepted: 11/08/2021] [Indexed: 11/28/2022] [Open Access]
Abstract
Bovine respiratory disease (BRD) is a multifactorial disease involving complex host immune interactions shaped by pathogenic agents and environmental factors. Advancements in RNA sequencing and associated analytical methods are improving our understanding of host response related to BRD pathophysiology. Supervised machine learning (ML) approaches present one such method for analyzing new and previously published transcriptome data to identify novel disease-associated genes and mechanisms. Our objective was to apply ML models to lung and immunological tissue datasets acquired from previous clinical BRD experiments to identify genes that classify disease with high accuracy. Raw mRNA sequencing reads from 151 bovine datasets (n = 123 BRD, n = 28 control) were downloaded from NCBI-GEO. Quality filtered reads were assembled in a HISAT2/Stringtie2 pipeline. Raw gene counts for ML analysis were normalized, transformed, and analyzed with MLSeq, utilizing six ML models. Cross-validation parameters (fivefold, repeated 10 times) were applied to 70% of the compiled datasets for ML model training and parameter tuning; optimized ML models were tested with the remaining 30%. Downstream analysis of significant genes identified by the top ML models, based on classification accuracy for each etiological association, was performed within WebGestalt and Reactome (FDR ≤ 0.05). Nearest shrunken centroid and Poisson linear discriminant analysis with power transformation models identified 154 and 195 significant genes for IBR and BRSV, respectively; from these genes, the two ML models discriminated IBR and BRSV with 100% accuracy compared to sham controls. Significant genes classified by the top ML models in IBR (154) and BRSV (195), but not BVDV (74), were related to type I interferon production and IL-8 secretion, specifically in lymphoid tissue and not homogenized lung tissue. 
Genes identified in Mannheimia haemolytica infections (97) were involved in activating classical and alternative pathways of complement. Novel findings, including expression of genes related to reduced mitochondrial oxygenation and ATP synthesis in consolidated lung tissue, were discovered. Genes identified in each analysis represent distinct genomic events relevant to understanding and predicting clinical BRD. Our analysis demonstrates the utility of ML with published datasets for discovering functional information to support the prediction and understanding of clinical BRD.
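The evaluation design described in this abstract (a 70%/30% split of the 151 datasets, with fivefold cross-validation repeated 10 times on the training portion) can be sketched as index bookkeeping. This is an assumption-laden outline rather than the MLSeq workflow: the function names and seeds are hypothetical, the split is not stratified by class, and how the authors rounded the 70% boundary is not stated.

```python
import random


def train_test_split_indices(n, train_fraction=0.7, seed=0):
    """Shuffle sample indices and split them into train and test lists."""
    rng = random.Random(seed)
    indices = list(range(n))
    rng.shuffle(indices)
    n_train = round(train_fraction * n)  # rounding choice is an assumption
    return indices[:n_train], indices[n_train:]


def repeated_kfold(indices, k=5, repeats=10, seed=0):
    """Yield (train, validation) index lists for repeated k-fold CV."""
    rng = random.Random(seed)
    for _ in range(repeats):
        shuffled = indices[:]
        rng.shuffle(shuffled)
        folds = [shuffled[i::k] for i in range(k)]
        for held_out in range(k):
            validation = folds[held_out]
            train = [
                idx
                for fold_number, fold in enumerate(folds)
                if fold_number != held_out
                for idx in fold
            ]
            yield train, validation


# 151 datasets in total, as reported in the abstract.
train_idx, test_idx = train_test_split_indices(151)
splits = list(repeated_kfold(train_idx, k=5, repeats=10))
```

Model tuning would use only `splits`; the indices in `test_idx` are touched once, at the end, to estimate accuracy on data the tuned model has not seen.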
Affiliation(s)
- Matthew A Scott
  - Veterinary Education, Research, and Outreach Center, Texas A&M University and West Texas A&M University, Canyon, TX, USA
- Amelia R Woolums
  - Department of Pathobiology and Population Medicine, Mississippi State University, Mississippi State, MS, USA
- Cyprianna E Swiderski
  - Department of Pathobiology and Population Medicine, Mississippi State University, Mississippi State, MS, USA
- Andy D Perkins
  - Department of Computer Science and Engineering, Mississippi State University, Mississippi State, MS, USA
- Bindu Nanduri
  - Department of Comparative Biomedical Sciences, Mississippi State University, Mississippi State, MS, USA
20
|
Eriksson P, Marzouka NAD, Sjödahl G, Bernardo C, Liedberg F, Höglund M. A comparison of rule-based and centroid single-sample multiclass predictors for transcriptomic classification. Bioinformatics 2021; 38:1022-1029. [PMID: 34788787 PMCID: PMC8796360 DOI: 10.1093/bioinformatics/btab763] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 07/21/2021] [Revised: 10/24/2021] [Accepted: 11/02/2021] [Indexed: 02/03/2023] [Open Access]
Abstract
MOTIVATION Gene expression-based multiclass prediction, such as tumor subtyping, is a non-trivial bioinformatic problem. Most classifier methods operate by comparing expression levels relative to other samples. Methods that base predictions on the expression pattern within a sample have been proposed as an alternative. As these methods are invariant to the cohort composition and can be applied to a sample in isolation, they can collectively be termed single sample predictors (SSP). Such predictors could potentially be used for preprocessing-free classification of new samples and be built to function across different expression platforms where proper batch and dataset normalization is challenging. Here, we evaluate the behavior of several multiclass SSPs based on binary gene-pair rules (k-Top Scoring Pairs, Absolute Intrinsic Molecular Subtyping and a new Random Forest approach) and compare them to centroids built with centered or raw expression values, with the criteria that an optimal predictor should have high accuracy, overcome differences in tumor purity, be robust across expression platforms and provide an informative prediction output score. RESULTS We found that gene-pair-based SSPs showed excellent performance on many expression-based classification tasks. The three methods differed in prediction score output, handling of tied scores and behavior in low purity samples. The k-Top Scoring Pairs and Random Forest approach both achieved high classification accuracy while providing an informative prediction score. Although gene-pair-based SSPs have been touted as being cross-platform compatible (through training on mixed platform data), out-of-the-box compatibility with a new dataset remains a potential issue that warrants cohort-to-cohort verification. 
AVAILABILITY AND IMPLEMENTATION Our R package 'multiclassPairs' (https://cran.r-project.org/package=multiclassPairs) (https://doi.org/10.1093/bioinformatics/btab088) is freely available and enables easy training, prediction, and visualization using the gene-pair rule-based Random Forest SSP method and provides additional multiclass functionalities to the switchBox k-Top-Scoring Pairs package. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
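To make the idea of a gene-pair rule concrete, the following is a minimal sketch of a single sample predictor, not the multiclassPairs or switchBox implementation. The rule table, gene names (`G1` to `G4`), class labels, and the function `predict_single_sample` are all hypothetical. Each rule compares two genes within one sample, so the prediction depends only on within-sample ordering.

```python
from typing import Dict, List, Tuple

# Hypothetical rule table: for each class, a list of gene pairs (a, b).
# A pair votes for its class when expr[a] > expr[b] in the sample.
Rules = Dict[str, List[Tuple[str, str]]]


def predict_single_sample(expr: Dict[str, float], rules: Rules) -> Tuple[str, Dict[str, float]]:
    """Score each class as the fraction of its gene-pair rules that hold."""
    scores: Dict[str, float] = {}
    for label, pairs in rules.items():
        votes = sum(1 for a, b in pairs if expr[a] > expr[b])
        scores[label] = votes / len(pairs)
    # Simplification: a tie goes to the first class in the rule table.
    best = max(scores, key=scores.get)
    return best, scores


rules: Rules = {
    "subtype_A": [("G1", "G2"), ("G3", "G4")],
    "subtype_B": [("G2", "G1"), ("G4", "G3")],
}
sample = {"G1": 8.0, "G2": 3.5, "G3": 6.1, "G4": 2.0}

# A strictly increasing rescaling (e.g. a different normalization) keeps
# every within-sample comparison, and therefore the prediction, unchanged.
rescaled = {gene: 10 * value + 100 for gene, value in sample.items()}

label, scores = predict_single_sample(sample, rules)
label_rescaled, _ = predict_single_sample(rescaled, rules)
```

Because only the order of expression values within the sample matters, the sketch needs no cohort-level centering, which is the property the abstract attributes to single sample predictors.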
Affiliation(s)
| | - Nour-al-dain Marzouka
- Division of Oncology, Department of Clinical Sciences, Lund University, Lund, Sweden
| | - Gottfrid Sjödahl
- Urology - urothelial cancer, Department of Translational Medicine, Lund University, Skåne University Hospital, Malmö, Sweden
| | - Carina Bernardo
- Division of Oncology, Department of Clinical Sciences, Lund University, Lund, Sweden
| | - Fredrik Liedberg
- Urology - urothelial cancer, Department of Translational Medicine, Lund University, Skåne University Hospital, Malmö, Sweden
| | - Mattias Höglund
- Division of Oncology, Department of Clinical Sciences, Lund University, Lund, Sweden
|
21
|
Abstract
High-throughput technologies such as next-generation sequencing allow biologists to observe cell function with unprecedented resolution, but the resulting datasets are too large and complicated for humans to understand without the aid of advanced statistical methods. Machine learning (ML) algorithms, which are designed to automatically find patterns in data, are well suited to this task. Yet these models are often so complex as to be opaque, leaving researchers with few clues about underlying mechanisms. Interpretable machine learning (iML) is a burgeoning subdiscipline of computational statistics devoted to making the predictions of ML models more intelligible to end users. This article is a gentle and critical introduction to iML, with an emphasis on genomic applications. I define relevant concepts, motivate leading methodologies, and provide a simple typology of existing approaches. I survey recent examples of iML in genomics, demonstrating how such techniques are increasingly integrated into research workflows. I argue that iML solutions are required to realize the promise of precision medicine. However, several open challenges remain. I examine the limitations of current state-of-the-art tools and propose a number of directions for future research. While the horizon for iML in genomics is wide and bright, continued progress requires close collaboration across disciplines.
Affiliation(s)
- David S Watson
- Department of Statistical Science, University College London, London, UK.
|
22
|
Anderson P, Gadgil R, Johnson WA, Schwab E, Davidson JM. Reducing variability of breast cancer subtype predictors by grounding deep learning models in prior knowledge. Comput Biol Med 2021; 138:104850. [PMID: 34536702 DOI: 10.1016/j.compbiomed.2021.104850] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/06/2021] [Revised: 08/31/2021] [Accepted: 09/05/2021] [Indexed: 12/23/2022]
Abstract
Deep learning neural networks have improved performance in many cancer informatics problems, including breast cancer subtype classification. However, many networks experience underspecification, where multiple combinations of parameters achieve similar performance, both in training and validation. Additionally, certain parameter combinations may perform poorly when the test distribution differs from the training distribution. Embedding prior knowledge from the literature may address this issue by boosting predictive models that provide crucial, in-depth information about a given disease. Breast cancer research provides a wealth of such knowledge, particularly in the form of subtype biomarkers and genetic signatures. In this study, we draw on past research on breast cancer subtype biomarkers, label propagation, and neural graph machines to present a novel methodology for embedding knowledge into machine learning systems. We embed prior knowledge into the loss function in the form of inter-subject distances derived from a well-known published breast cancer signature. Our results show that this methodology reduces predictor variability on state-of-the-art deep learning architectures and increases predictor consistency leading to improved interpretation. We find that pathway enrichment analysis is more consistent after embedding knowledge. This novel method applies to a broad range of existing studies and predictive models. Our method moves the traditional synthesis of predictive models from an arbitrary assignment of weights to genes toward a more biologically meaningful approach of incorporating knowledge.
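One common way to place inter-subject distances in a loss function is a graph-regularization term of the kind used in neural graph machines. The sketch below assumes that form; it is not the authors' implementation. The function names, the weight `alpha`, and the toy values are hypothetical, and `weights[i][j]` stands in for a similarity between subjects i and j derived from a prior signature.

```python
import math
from typing import List, Sequence


def cross_entropy(probs: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Mean negative log-likelihood of the true class for each subject."""
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)


def graph_penalty(embeddings: Sequence[Sequence[float]],
                  weights: Sequence[Sequence[float]]) -> float:
    """Sum over subject pairs of prior similarity times squared embedding distance."""
    total = 0.0
    n = len(embeddings)
    for i in range(n):
        for j in range(i + 1, n):
            dist_sq = sum((a - b) ** 2 for a, b in zip(embeddings[i], embeddings[j]))
            total += weights[i][j] * dist_sq
    return total


def knowledge_regularized_loss(probs, labels, embeddings, weights, alpha: float = 0.1) -> float:
    """Supervised loss plus a penalty for separating subjects the prior calls similar."""
    return cross_entropy(probs, labels) + alpha * graph_penalty(embeddings, weights)


# Toy values (hypothetical): subjects 0 and 1 are similar under the prior signature.
probs: List[List[float]] = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7]]
labels = [0, 0, 1]
embeddings = [[0.0, 0.0], [1.0, 0.0], [3.0, 4.0]]
weights = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

penalty = graph_penalty(embeddings, weights)
loss = knowledge_regularized_loss(probs, labels, embeddings, weights, alpha=0.5)
```

With `alpha` set to 0 the loss reduces to plain cross-entropy, so the penalty term can be read as the only place where prior knowledge enters this sketch.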
Affiliation(s)
- Paul Anderson
- Department of Computer Science and Software Engineering, California Polytechnic State University, San Luis Obispo, CA, USA
| | - Richa Gadgil
- Department of Computer Science and Software Engineering, California Polytechnic State University, San Luis Obispo, CA, USA
| | - William A Johnson
- Department of Biology, California Polytechnic State University, San Luis Obispo, CA, USA
| | - Ella Schwab
- Department of Biology, California Polytechnic State University, San Luis Obispo, CA, USA
| | - Jean M Davidson
- Department of Biology, California Polytechnic State University, San Luis Obispo, CA, USA.
|
23
|
Yoon J, Kim M, Posadas EM, Freedland SJ, Liu Y, Davicioni E, Den RB, Trock BJ, Karnes RJ, Klein EA, Freeman MR, You S. A comparative study of PCS and PAM50 prostate cancer classification schemes. Prostate Cancer Prostatic Dis 2021; 24:733-742. [PMID: 33531653 PMCID: PMC8326303 DOI: 10.1038/s41391-021-00325-4] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 05/25/2020] [Revised: 12/20/2020] [Accepted: 01/15/2021] [Indexed: 02/01/2023]
Abstract
BACKGROUND Two prostate cancer (PC) classification methods based on transcriptome profiles, a de novo method referred to as the "Prostate Cancer Classification System" (PCS) and a variation of the established PAM50 breast cancer algorithm, were recently proposed. Both studies concluded that most human PC can be assigned to one of three tumor subtypes, two categorized as luminal and one as basal, suggesting the two methods reflect consistency in underlying biology. Despite the similarity, differences and commonalities between the two classification methods have not yet been reported. METHODS Here, we describe a comparison of the PCS and PAM50 classification systems. PCS and PAM50 signatures consisting of 37 (PCS37) and 50 genes, respectively, were used to categorize 9,947 PC patients into PCS and PAM50 classes. Enrichment of hallmark gene sets and luminal and basal marker gene expression were assessed in the same datasets. Finally, survival analysis was performed to compare PCS and PAM50 subtypes in terms of clinical outcomes. RESULTS PCS and PAM50 subtypes show clear differential expression of PCS37 and PAM50 genes. While only three genes are shared in common between the two systems, there is some consensus between three subtype pairs (PCS1 versus Luminal B, PCS2 versus Luminal A, and PCS3 versus Basal) with respect to gene expression, cellular processes, and clinical outcomes. PCS categories displayed better separation of cellular processes and luminal and basal marker gene expression compared to PAM50. Although both PCS1 and Luminal B tumors exhibited the worst clinical outcomes, outcomes between aggressive and less aggressive subtypes were better defined in the PCS system, based on larger hazard ratios observed. CONCLUSION The PCS and PAM50 classification systems are similar in terms of molecular profiles and clinical outcomes. 
However, the PCS system exhibits greater separation in multiple clinical outcomes and provides better separation of prostate luminal and basal characteristics.
Affiliation(s)
- Junhee Yoon
- Department of Surgery, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - Minhyung Kim
- Department of Surgery, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - Edwin M Posadas
- Urologic Oncology Program & Uro-Oncology Research Program, Cedars-Sinai Cancer, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Division of Oncology, Department of Medicine, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - Stephen J Freedland
- Department of Surgery, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Division of Urology, Department of Surgery, Veteran Affairs Healthcare System, Durham, NC, USA
| | - Yang Liu
- Decipher Biosciences Inc., San Diego, CA, USA
| | | | - Robert B Den
- Department of Radiation Oncology, Jefferson Medical College of Thomas Jefferson University, Philadelphia, PA, USA
| | - Bruce J Trock
- James Buchanan Brady Urological Institute, Johns Hopkins Hospital, Baltimore, MD, USA
| | | | - Eric A Klein
- Glickman Urological and Kidney Institute, Cleveland Clinic, Cleveland, OH, USA
| | - Michael R Freeman
- Department of Surgery, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Department of Biomedical Sciences, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Department of Medicine, University of California, Los Angeles, CA, USA
| | - Sungyong You
- Department of Surgery, Cedars-Sinai Medical Center, Los Angeles, CA, USA.
- Department of Biomedical Sciences, Cedars-Sinai Medical Center, Los Angeles, CA, USA.
|
24
|
Fages T, Jolibois F, Poteau R. Recognition of the three-dimensional structure of small metal nanoparticles by a supervised artificial neural network. Theor Chem Acc 2021. [DOI: 10.1007/s00214-021-02795-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
|
25
|
Kontsevaya I, Lange C, Comella-Del-Barrio P, Coarfa C, DiNardo AR, Gillespie SH, Hauptmann M, Leschczyk C, Mandalakas AM, Martinecz A, Merker M, Niemann S, Reimann M, Rzhepishevska O, Schaible UE, Scheu KM, Schurr E, Abel Zur Wiesch P, Heyckendorf J. Perspectives for systems biology in the management of tuberculosis. Eur Respir Rev 2021; 30:200377. [PMID: 34039674 DOI: 10.1183/16000617.0377-2020] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 11/27/2020] [Accepted: 01/28/2021] [Indexed: 12/18/2022] Open
Abstract
Standardised management of tuberculosis may soon be replaced by individualised, precision medicine-guided therapies informed with knowledge provided by the field of systems biology. Systems biology is a rapidly expanding field of computational and mathematical analysis and modelling of complex biological systems that can provide insights into mechanisms underlying tuberculosis, identify novel biomarkers, and help to optimise prevention, diagnosis and treatment of disease. These advances are critically important in the context of the evolving epidemic of drug-resistant tuberculosis. Here, we review the available evidence on the role of systems biology approaches - human and mycobacterial genomics and transcriptomics, proteomics, lipidomics/metabolomics, immunophenotyping, systems pharmacology and gut microbiomes - in the management of tuberculosis including prediction of risk for disease progression, severity of mycobacterial virulence and drug resistance, adverse events, comorbidities, response to therapy and treatment outcomes. Application of the Grading of Recommendations, Assessment, Development and Evaluation (GRADE) approach demonstrated that at present most of the studies provide "very low" certainty of evidence for answering clinically relevant questions. Further studies in large prospective cohorts of patients, including randomised clinical trials, are necessary to assess the applicability of the findings in tuberculosis prevention and more efficient clinical management of patients.
Affiliation(s)
- Irina Kontsevaya
- Research Center Borstel, Borstel, Germany; German Center for Infection Research, Hamburg-Lübeck-Borstel-Riems, Borstel, Germany; International Health/Infectious Diseases, University of Lübeck, Lübeck, Germany
| | - Christoph Lange
- Research Center Borstel, Borstel, Germany; German Center for Infection Research, Hamburg-Lübeck-Borstel-Riems, Borstel, Germany; International Health/Infectious Diseases, University of Lübeck, Lübeck, Germany
| | - Patricia Comella-Del-Barrio
- Research Institute Germans Trias i Pujol, CIBER Respiratory Diseases, Universitat Autònoma de Barcelona, Badalona, Spain
| | - Cristian Coarfa
- Dan L Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, TX, USA; Molecular and Cellular Biology, Center for Precision Environmental Health, Baylor College of Medicine, Houston, TX, USA
| | - Andrew R DiNardo
- The Global Tuberculosis Program, Texas Children's Hospital, Dept of Pediatrics, Baylor College of Medicine, Houston, TX, USA
| | | | - Matthias Hauptmann
- Research Center Borstel, Borstel, Germany; German Center for Infection Research, Hamburg-Lübeck-Borstel-Riems, Borstel, Germany
| | - Christoph Leschczyk
- Research Center Borstel, Borstel, Germany; German Center for Infection Research, Hamburg-Lübeck-Borstel-Riems, Borstel, Germany
| | - Anna M Mandalakas
- The Global Tuberculosis Program, Texas Children's Hospital, Dept of Pediatrics, Baylor College of Medicine, Houston, TX, USA
| | - Antal Martinecz
- Dept of Biology, Pennsylvania State University, University Park, PA, USA; Center for Infectious Disease Dynamics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA, USA; Dept of Pharmacy, Faculty of Health Sciences, UiT, Arctic University of Norway, Tromsø, Norway
| | - Matthias Merker
- Research Center Borstel, Borstel, Germany; German Center for Infection Research, Hamburg-Lübeck-Borstel-Riems, Borstel, Germany
| | - Stefan Niemann
- Research Center Borstel, Borstel, Germany; German Center for Infection Research, Hamburg-Lübeck-Borstel-Riems, Borstel, Germany
| | - Maja Reimann
- Research Center Borstel, Borstel, Germany; German Center for Infection Research, Hamburg-Lübeck-Borstel-Riems, Borstel, Germany; International Health/Infectious Diseases, University of Lübeck, Lübeck, Germany
| | - Olena Rzhepishevska
- Dept of Chemistry, Umeå University, Umeå, Sweden; Dept of Clinical Microbiology, Umeå University, Umeå, Sweden
| | - Ulrich E Schaible
- Research Center Borstel, Borstel, Germany; German Center for Infection Research, Hamburg-Lübeck-Borstel-Riems, Borstel, Germany
| | | | - Erwin Schurr
- Infectious Diseases and Immunity in Global Health Program, Research Institute of the McGill University Health Centre, Montréal, Canada
| | - Pia Abel Zur Wiesch
- Dept of Biology, Pennsylvania State University, University Park, PA, USA; Center for Infectious Disease Dynamics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA, USA
| | - Jan Heyckendorf
- Research Center Borstel, Borstel, Germany; German Center for Infection Research, Hamburg-Lübeck-Borstel-Riems, Borstel, Germany; International Health/Infectious Diseases, University of Lübeck, Lübeck, Germany
|
26
|
Del Giudice M, Peirone S, Perrone S, Priante F, Varese F, Tirtei E, Fagioli F, Cereda M. Artificial Intelligence in Bulk and Single-Cell RNA-Sequencing Data to Foster Precision Oncology. Int J Mol Sci 2021; 22:4563. [PMID: 33925407 PMCID: PMC8123853 DOI: 10.3390/ijms22094563] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 03/20/2021] [Revised: 04/21/2021] [Accepted: 04/23/2021] [Indexed: 02/01/2023] Open
Abstract
Artificial intelligence, or the discipline of developing computational algorithms able to perform tasks that require human intelligence, offers the opportunity to improve our idea and delivery of precision medicine. Here, we provide an overview of artificial intelligence approaches for the analysis of large-scale RNA-sequencing datasets in cancer. We present the major solutions to disentangle inter- and intra-tumor heterogeneity of transcriptome profiles for an effective improvement of patient management. We outline the contributions of learning algorithms to the needs of cancer genomics, from identifying rare cancer subtypes to personalizing therapeutic treatments.
Affiliation(s)
- Marco Del Giudice
- Cancer Genomics and Bioinformatics Unit, IIGM—Italian Institute for Genomic Medicine, c/o IRCCS, Str. Prov.le 142, km 3.95, 10060 Candiolo, TO, Italy; (M.D.G.); (S.P.); (S.P.); (F.P.); (F.V.)
- Candiolo Cancer Institute, FPO—IRCCS, Str. Prov.le 142, km 3.95, 10060 Candiolo, TO, Italy
| | - Serena Peirone
- Cancer Genomics and Bioinformatics Unit, IIGM—Italian Institute for Genomic Medicine, c/o IRCCS, Str. Prov.le 142, km 3.95, 10060 Candiolo, TO, Italy; (M.D.G.); (S.P.); (S.P.); (F.P.); (F.V.)
- Department of Physics and INFN, Università degli Studi di Torino, via P.Giuria 1, 10125 Turin, Italy
| | - Sarah Perrone
- Cancer Genomics and Bioinformatics Unit, IIGM—Italian Institute for Genomic Medicine, c/o IRCCS, Str. Prov.le 142, km 3.95, 10060 Candiolo, TO, Italy; (M.D.G.); (S.P.); (S.P.); (F.P.); (F.V.)
- Department of Physics, Università degli Studi di Torino, via P.Giuria 1, 10125 Turin, Italy
| | - Francesca Priante
- Cancer Genomics and Bioinformatics Unit, IIGM—Italian Institute for Genomic Medicine, c/o IRCCS, Str. Prov.le 142, km 3.95, 10060 Candiolo, TO, Italy; (M.D.G.); (S.P.); (S.P.); (F.P.); (F.V.)
- Department of Physics, Università degli Studi di Torino, via P.Giuria 1, 10125 Turin, Italy
| | - Fabiola Varese
- Cancer Genomics and Bioinformatics Unit, IIGM—Italian Institute for Genomic Medicine, c/o IRCCS, Str. Prov.le 142, km 3.95, 10060 Candiolo, TO, Italy; (M.D.G.); (S.P.); (S.P.); (F.P.); (F.V.)
- Department of Life Science and System Biology, Università degli Studi di Torino, via Accademia Albertina 13, 10123 Turin, Italy
| | - Elisa Tirtei
- Paediatric Onco-Haematology Division, Regina Margherita Children’s Hospital, City of Health and Science of Turin, 10126 Turin, Italy; (E.T.); (F.F.)
| | - Franca Fagioli
- Paediatric Onco-Haematology Division, Regina Margherita Children’s Hospital, City of Health and Science of Turin, 10126 Turin, Italy; (E.T.); (F.F.)
- Department of Public Health and Paediatric Sciences, University of Torino, 10124 Turin, Italy
| | - Matteo Cereda
- Cancer Genomics and Bioinformatics Unit, IIGM—Italian Institute for Genomic Medicine, c/o IRCCS, Str. Prov.le 142, km 3.95, 10060 Candiolo, TO, Italy; (M.D.G.); (S.P.); (S.P.); (F.P.); (F.V.)
- Candiolo Cancer Institute, FPO—IRCCS, Str. Prov.le 142, km 3.95, 10060 Candiolo, TO, Italy
- Correspondence: ; Tel.: +39-011-993-3969
|
27
|
A Histone Acetylation Modulator Gene Signature for Classification and Prognosis of Breast Cancer. Curr Oncol 2021; 28:928-939. [PMID: 33617509 PMCID: PMC7985767 DOI: 10.3390/curroncol28010091] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 01/19/2021] [Revised: 02/07/2021] [Accepted: 02/12/2021] [Indexed: 02/08/2023]
Abstract
Regulators of histone acetylation are promising epigenetic targets for therapy in breast cancer. In this study, we comprehensively analyzed the expression of histone acetylation modulator genes in breast cancer using TCGA data sources. A gene signature composed of eight histone acetylation modulators (HAMs) was found to be effective for the classification and prognosis of breast cancers, especially in the HER2-enriched and basal-like molecular subtypes. The eight genes consist of two histone acetylation writers (GTF3C4 and CLOCK), two erasers (HDAC2 and SIRT7) and four readers (BRD4, BRD7, SP100, and BRWD3). Both histone acetylation writer genes and eraser genes were found to be differentially expressed between the two groups indicating a close relationship exists between overall histone acetylation level and prognosis of breast cancer in HER2-enriched and basal-like breast cancer.
|