1
Borisov N, Tkachev V, Simonov A, Sorokin M, Kim E, Kuzmin D, Karademir-Yilmaz B, Buzdin A. Uniformly shaped harmonization combines human transcriptomic data from different platforms while retaining their biological properties and differential gene expression patterns. Front Mol Biosci 2023; 10:1237129. [PMID: 37745690 PMCID: PMC10511763 DOI: 10.3389/fmolb.2023.1237129] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2023] [Accepted: 08/28/2023] [Indexed: 09/26/2023] Open
Abstract
Introduction: Co-normalization of RNA profiles obtained using different experimental platforms and protocols opens an avenue for comprehensive comparison of relevant features such as differentially expressed genes associated with disease. Currently, most bioinformatic tools perform normalization in a flexible format that depends on the individual datasets under analysis, so the outputs of such normalizations are poorly compatible with each other. We recently proposed a new approach to gene expression data normalization, termed Shambhala, which returns harmonized data in a uniform shape: every expression profile is transformed into a pre-defined universal format. We previously showed that after shambhalization of human RNA profiles, overall tissue-specific clustering features are strongly retained while platform-specific clustering is dramatically reduced. Methods: Here, we tested how well Shambhala retains fold-change gene expression features and other functional characteristics of gene clusters, such as pathway activation levels and predicted cancer drug activity scores. Results: Using 6,793 cancer and 11,135 normal tissue gene expression profiles from the literature and experimental datasets, we applied twelve performance criteria to different versions of Shambhala and to other transcriptomic harmonization methods with flexible output data formats. These criteria covered biological type classifiers, hierarchical clustering, correlation/regression properties, stability of drug efficiency scores, and data quality for machine learning classifiers. Discussion: The Shambhala-2 harmonizer demonstrated the best results, with correlation and linear regression coefficients close to 1 for the comparison of training vs validation datasets and more than twofold lower instability in the calculation of drug efficiency scores compared to other methods.
Affiliation(s)
- Nicolas Borisov
- Omicsway Corp, Walnut, CA, United States
- Moscow Institute of Physics and Technology, Dolgoprudny, Russia
- Alexander Simonov
- Moscow Institute of Physics and Technology, Dolgoprudny, Russia
- Oncobox Ltd., Moscow, Russia
- Maxim Sorokin
- Moscow Institute of Physics and Technology, Dolgoprudny, Russia
- Oncobox Ltd., Moscow, Russia
- World-Class Research Center “Digital Biodesign and Personalized Healthcare”, Sechenov First Moscow State Medical University, Moscow, Russia
- Ella Kim
- Clinic for Neurosurgery, Laboratory of Experimental Neurooncology, Johannes Gutenberg University Medical Centre, Mainz, Germany
- Denis Kuzmin
- Moscow Institute of Physics and Technology, Dolgoprudny, Russia
- Betul Karademir-Yilmaz
- Department of Biochemistry, School of Medicine/Genetic and Metabolic Diseases Research and Investigation Center (GEMHAM) Marmara University, Istanbul, Türkiye
- Anton Buzdin
- Moscow Institute of Physics and Technology, Dolgoprudny, Russia
- World-Class Research Center “Digital Biodesign and Personalized Healthcare”, Sechenov First Moscow State Medical University, Moscow, Russia
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Moscow, Russia
- PathoBiology Group, European Organization for Research and Treatment of Cancer (EORTC), Brussels, Belgium
2
Huss R, Raffler J, Märkl B. Artificial intelligence and digital biomarker in precision pathology guiding immune therapy selection and precision oncology. Cancer Rep (Hoboken) 2023:e1796. [PMID: 36813293 PMCID: PMC10363837 DOI: 10.1002/cnr2.1796] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2022] [Revised: 01/15/2023] [Accepted: 02/09/2023] [Indexed: 02/24/2023] Open
Abstract
BACKGROUND: Currently available immunotherapies have already changed how many cancers are treated, from first to last line. Understanding even the most complex heterogeneity in tumor tissue and mapping the spatial cartography of tumor immunity allows an optimized selection of immune-modulating agents to (re-)activate the patient's immune system and direct it against the individual cancer in the most effective way. RECENT FINDINGS: Primary cancers and metastases maintain a high degree of plasticity to escape immune surveillance and continue to evolve depending on many intrinsic and extrinsic factors. In the field of immuno-oncology (IO), immune-modulating agents are recognized as practice-changing therapeutic modalities. Recent studies have shown that optimal and lasting efficacy of IO therapeutics depends on understanding the spatial communication network and functional context of immune and cancer cells within the tumor microenvironment. Artificial intelligence (AI) provides insight into the immune-cancer network by visualizing very complex tumor and immune interactions in cancer tissue specimens and allows the computer-assisted development and clinical validation of such digital biomarkers. CONCLUSIONS: The successful implementation of AI-supported digital biomarker solutions guides the clinical selection of effective immune therapeutics based on the retrieval and visualization of spatial and contextual information from cancer tissue images and standardized data. As such, computational pathology (CP) turns into "precision pathology", delivering individual therapy response prediction. Precision pathology includes not only digital and computational solutions but also highly standardized processes in the routine histopathology workflow and the use of mathematical tools to support clinical and diagnostic decisions as the basic principle of "precision oncology".
Affiliation(s)
- Ralf Huss
- Medical Faculty University Augsburg, Augsburg, Germany
- Institute for Digital Medicine, University Hospital Augsburg, Augsburg, Germany
- Johannes Raffler
- Institute for Digital Medicine, University Hospital Augsburg, Augsburg, Germany
- Bruno Märkl
- Medical Faculty University Augsburg, Augsburg, Germany
3
Borisov N, Sorokin M, Zolotovskaya M, Borisov C, Buzdin A. Shambhala-2: A Protocol for Uniformly Shaped Harmonization of Gene Expression Profiles of Various Formats. Curr Protoc 2022; 2:e444. [PMID: 35617464 DOI: 10.1002/cpz1.444] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/11/2022]
Abstract
Uniformly shaped harmonization of gene expression profiles is central for the simultaneous comparison of multiple gene expression datasets. It is expected to operate with the gene expression data obtained using various experimental methods and equipment, and to return harmonized profiles in a uniform shape. Such uniformly shaped expression profiles from different initial datasets can be further compared directly. However, current harmonization techniques have strong limitations that prevent their broad use for bioinformatic applications. They can either operate with only up to two datasets/platforms or return data in a dynamic format that will be different for every comparison under analysis. This also does not allow for adding new data to the previously harmonized dataset(s), which complicates the analysis and increases calculation costs. We propose here a new method termed Shambhala-2 that can transform multi-platform expression data into a universal format that is identical for all harmonizations made using this technique. Shambhala-2 is based on sample-by-sample cubic conversion of the initial expression dataset into a preselected shape of the reference definitive dataset. Using 8390 samples of 12 healthy human tissue types and 4086 samples of colorectal, kidney, and lung cancer tissues, we verified Shambhala-2's capacity in restoring tissue-specific expression patterns for seven microarray and three RNA sequencing platforms. Shambhala-2 performed well for all tested combinations of RNAseq and microarray profiles, and retained gene-expression ranks, as evidenced by high correlations between different single- or aggregated gene expression metrics in pre- and post-Shambhalized samples, including preserving cancer-specific gene expression and pathway activation features. © 2022 Wiley Periodicals LLC. 
Basic Protocol: Shambhala-2 harmonizer
Alternate Protocol 1: Linear Shambhala/Shambhala-1
Alternate Protocol 2: Alternative (flexible-format and uniformly shaped) normalization methods
Support Protocol 1: Watermelon multisection (WM)
Support Protocol 2: Calculation of cancer-to-normal log-fold-change (LFC) and pathway activation level (PAL)
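The cancer-to-normal log-fold-change (LFC) named in Support Protocol 2 can be illustrated with a minimal Python sketch. This is not the protocol's own procedure: the function name `log_fold_change`, the use of an arithmetic mean over normal profiles, the base-2 logarithm, and the pseudocount of 1 are all assumptions made for illustration, and the gene names and values are invented.

```python
import math


def log_fold_change(tumor, normals, pseudocount=1.0):
    """Per-gene log2 ratio of one tumor profile to the mean of normal profiles.

    Hypothetical helper: `tumor` maps gene -> expression value and `normals`
    is a list of such mappings. The pseudocount (an assumption) avoids
    taking the logarithm of zero.
    """
    lfc = {}
    for gene, value in tumor.items():
        # Collect the reference values for this gene across normal samples.
        reference = [profile[gene] for profile in normals if gene in profile]
        if not reference:
            continue  # no normal reference available for this gene
        mean_normal = sum(reference) / len(reference)
        lfc[gene] = math.log2((value + pseudocount) / (mean_normal + pseudocount))
    return lfc


# Invented toy values, for illustration only.
tumor = {"GENE_A": 7.0, "GENE_B": 1.0}
normals = [{"GENE_A": 1.0, "GENE_B": 3.0}, {"GENE_A": 1.0, "GENE_B": 3.0}]
result = log_fold_change(tumor, normals)
print(result)
```

Positive values indicate higher expression in the tumor than in the normal reference, and negative values indicate lower expression.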
Affiliation(s)
- Nicolas Borisov
- Omicsway Corp., Walnut, California
- Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, Russia
- Maksim Sorokin
- Omicsway Corp., Walnut, California
- Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, Russia
- I.M. Sechenov First Moscow State Medical University, Moscow, Russia
- Marianna Zolotovskaya
- Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, Russia
- Oncobox Ltd., Moscow, Russia
- Anton Buzdin
- Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, Russia
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Moscow, Russia
- World-Class Research Center "Digital biodesign and personalized healthcare", Sechenov First Moscow State Medical University, Moscow, Russia
- PathoBiology Group, European Organization for Research and Treatment of Cancer (EORTC), Brussels, Belgium
4
Konovalov N, Timonin S, Asyutin D, Raevskiy M, Sorokin M, Buzdin A, Kaprovoy S. Transcriptomic Portraits and Molecular Pathway Activation Features of Adult Spinal Intramedullary Astrocytomas. Front Oncol 2022; 12:837570. [PMID: 35387112 PMCID: PMC8978956 DOI: 10.3389/fonc.2022.837570] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/16/2021] [Accepted: 02/21/2022] [Indexed: 11/30/2022] Open
Abstract
In this study, we report 31 spinal intramedullary astrocytoma (SIA) RNA sequencing (RNA-seq) profiles for 25 adult patients with documented clinical annotations. To our knowledge, this is the first clinically annotated RNA-seq dataset of spinal astrocytomas derived from the intradural intramedullary compartment. We compared these tumor profiles with the previous healthy central nervous system (CNS) RNA-seq data for spinal cord and brain and identified SIA-specific gene sets and molecular pathways. Our findings suggest a trend for SIA-upregulated pathways governing interactions with the immune cells and downregulated pathways for the neuronal functioning in the context of normal CNS activity. In two patient tumor biosamples, we identified diagnostic KIAA1549-BRAF fusion oncogenes, and we also found 16 new SIA-associated fusion transcripts. In addition, we bioinformatically simulated activities of targeted cancer drugs in SIA samples and predicted that several tyrosine kinase inhibitory drugs and thalidomide analogs could be potentially effective as second-line treatment agents to aid in the prevention of SIA recurrence and progression.
Affiliation(s)
- Mikhail Raevskiy
- Omicsway Corp., Walnut, CA, United States
- Moscow Institute of Physics and Technology, Moscow, Russia
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Moscow, Russia
- I.M. Sechenov First Moscow State Medical University, Moscow, Russia
- Maxim Sorokin
- Moscow Institute of Physics and Technology, Moscow, Russia
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Moscow, Russia
- I.M. Sechenov First Moscow State Medical University, Moscow, Russia
- Anton Buzdin
- Omicsway Corp., Walnut, CA, United States
- Moscow Institute of Physics and Technology, Moscow, Russia
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Moscow, Russia
- I.M. Sechenov First Moscow State Medical University, Moscow, Russia
- Oncobox Ltd., Moscow, Russia
5
Arjmand B, Hamidpour SK, Tayanloo-Beik A, Goodarzi P, Aghayan HR, Adibi H, Larijani B. Machine Learning: A New Prospect in Multi-Omics Data Analysis of Cancer. Front Genet 2022; 13:824451. [PMID: 35154283 PMCID: PMC8829119 DOI: 10.3389/fgene.2022.824451] [Citation(s) in RCA: 30] [Impact Index Per Article: 15.0] [Received: 11/29/2021] [Accepted: 01/10/2022] [Indexed: 12/11/2022] Open
Abstract
Cancer is a large group of diseases associated with abnormal cell growth and uncontrollable cell division, which may spread to other tissues of the body through metastasis. Cancer incidence is growing worldwide, with major health, economic, and social impacts on both patients and governments. Early cancer prognosis, diagnosis, and treatment can therefore play a crucial role at the front line of combating cancer. The onset and progression of cancer can occur under the influence of complicated mechanisms and alterations at the level of the genome, proteome, transcriptome, metabolome, etc. Consequently, the advent of omics science and its broad research branches (such as genomics, proteomics, transcriptomics, and metabolomics) has opened new doors to a comprehensive view of the cancer landscape. Because of the complexity of cancer formation and development, the study of the mechanisms underlying cancer has gone beyond any single omics field. Connecting the data from different branches of omics science and examining them in a multi-omics setting can therefore facilitate the discovery of novel prognostic, diagnostic, and therapeutic approaches. As the volume and complexity of data from omics studies in cancer are increasing dramatically, leading-edge technologies such as machine learning can play a promising role in assessing cancer research data. Machine learning is a subset of artificial intelligence that aims at data parsing, classification, and pattern identification by applying statistical methods and algorithms. The acquired knowledge subsequently allows computers to learn and to improve the accuracy of predictions through experience from data processing. In this context, the application of machine learning as a novel computational technology offers new opportunities for achieving in-depth knowledge of cancer through analysis of data from multi-omics studies. It can therefore be concluded that artificial intelligence technologies such as machine learning can have revolutionary roles in the fight against cancer.
Affiliation(s)
- Babak Arjmand
- Cell Therapy and Regenerative Medicine Research Center, Endocrinology and Metabolism Molecular-Cellular Sciences Institute, Tehran University of Medical Sciences, Tehran, Iran
- *Correspondence: Babak Arjmand; Bagher Larijani
- Shayesteh Kokabi Hamidpour
- Cell Therapy and Regenerative Medicine Research Center, Endocrinology and Metabolism Molecular-Cellular Sciences Institute, Tehran University of Medical Sciences, Tehran, Iran
- Akram Tayanloo-Beik
- Cell Therapy and Regenerative Medicine Research Center, Endocrinology and Metabolism Molecular-Cellular Sciences Institute, Tehran University of Medical Sciences, Tehran, Iran
- Parisa Goodarzi
- Cell Therapy and Regenerative Medicine Research Center, Endocrinology and Metabolism Molecular-Cellular Sciences Institute, Tehran University of Medical Sciences, Tehran, Iran
- Hamid Reza Aghayan
- Cell Therapy and Regenerative Medicine Research Center, Endocrinology and Metabolism Molecular-Cellular Sciences Institute, Tehran University of Medical Sciences, Tehran, Iran
- Hossein Adibi
- Diabetes Research Center, Endocrinology and Metabolism Clinical Sciences Institute, Tehran University of Medical Sciences, Tehran, Iran
- Bagher Larijani
- Endocrinology and Metabolism Research Center, Endocrinology and Metabolism Clinical Sciences Institute, Tehran University of Medical Sciences, Tehran, Iran
6
Improving Risk Assessment of Miscarriage During Pregnancy with Knowledge Graph Embeddings. Journal of Healthcare Informatics Research 2021; 5:359-381. [DOI: 10.1007/s41666-021-00096-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/13/2020] [Revised: 02/28/2021] [Accepted: 03/03/2021] [Indexed: 01/08/2023]
7
Borisov N, Sergeeva A, Suntsova M, Raevskiy M, Gaifullin N, Mendeleeva L, Gudkov A, Nareiko M, Garazha A, Tkachev V, Li X, Sorokin M, Surin V, Buzdin A. Machine Learning Applicability for Classification of PAD/VCD Chemotherapy Response Using 53 Multiple Myeloma RNA Sequencing Profiles. Front Oncol 2021; 11:652063. [PMID: 33937058 PMCID: PMC8083158 DOI: 10.3389/fonc.2021.652063] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 01/11/2021] [Accepted: 03/19/2021] [Indexed: 12/17/2022] Open
Abstract
Multiple myeloma (MM) affects ~500,000 people and results in ~100,000 deaths annually, and is currently considered treatable but incurable. There are several MM chemotherapy treatment regimens, eleven of which include bortezomib, a proteasome-targeted drug. MM patients respond differently to bortezomib, and new prognostic biomarkers are needed to personalize treatments. However, there is a shortage of clinically annotated MM molecular data that could be used to establish novel molecular diagnostics. We report new RNA sequencing profiles for 53 MM patients annotated with responses to two similar chemotherapy regimens: bortezomib, doxorubicin, dexamethasone (PAD), and bortezomib, cyclophosphamide, dexamethasone (VCD), or with responses to their combinations. Fourteen patients received both PAD and VCD; six received only PAD, and 33 received only VCD. We compared profiles for the good and poor responders and found five genes commonly regulated here and in the previous datasets for other bortezomib regimens (all upregulated in the good responders): FGFR3, MAF, IGHA2, IGHV1-69, and GRB14. Four of these genes are linked with known immunoglobulin locus rearrangements. We then used five machine learning (ML) methods to build a classifier distinguishing good and poor responders for two cohorts: PAD + VCD (53 patients), and separately VCD (47 patients). We showed that the application of FloWPS dynamic data trimming was beneficial for all ML methods tested in both cohorts, and also in the previous MM bortezomib datasets. However, the ML models built for the different datasets did not allow cross-transferring, which may be due to different treatment regimens, experimental profiling methods, and MM heterogeneity.
Affiliation(s)
- Nicolas Borisov
- Moscow Institute of Physics and Technology, Laboratory for Translational Genomic Bioinformatics, Dolgoprudny, Russia
- Anna Sergeeva
- National Research Center for Hematology, Ministry of Health of the Russian Federation, Moscow, Russia
- Maria Suntsova
- I.M. Sechenov First Moscow State Medical University, Institute of Personalized Medicine, Moscow, Russia
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Group for Genomic Analysis of Cell Signaling Systems, Moscow, Russia
- Mikhail Raevskiy
- Moscow Institute of Physics and Technology, Laboratory for Translational Genomic Bioinformatics, Dolgoprudny, Russia
- Nurshat Gaifullin
- Department of Pathology, Faculty of Medicine, Lomonosov Moscow State University, Moscow, Russia
- Larisa Mendeleeva
- National Research Center for Hematology, Ministry of Health of the Russian Federation, Moscow, Russia
- Alexander Gudkov
- I.M. Sechenov First Moscow State Medical University, Institute of Personalized Medicine, Moscow, Russia
- Maria Nareiko
- National Research Center for Hematology, Ministry of Health of the Russian Federation, Moscow, Russia
- Andrew Garazha
- Omicsway Corp., Research Department, Walnut, CA, United States
- Oncobox Ltd., Research Department, Moscow, Russia
- Victor Tkachev
- Omicsway Corp., Research Department, Walnut, CA, United States
- Oncobox Ltd., Research Department, Moscow, Russia
- Xinmin Li
- Department of Pathology and Laboratory Medicine, University of California Los Angeles, Los Angeles, CA, United States
- Maxim Sorokin
- I.M. Sechenov First Moscow State Medical University, Institute of Personalized Medicine, Moscow, Russia
- Omicsway Corp., Research Department, Walnut, CA, United States
- Oncobox Ltd., Research Department, Moscow, Russia
- Vadim Surin
- National Research Center for Hematology, Ministry of Health of the Russian Federation, Moscow, Russia
- Anton Buzdin
- I.M. Sechenov First Moscow State Medical University, Institute of Personalized Medicine, Moscow, Russia
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Group for Genomic Analysis of Cell Signaling Systems, Moscow, Russia
- Omicsway Corp., Research Department, Walnut, CA, United States
8
Using proteomic and transcriptomic data to assess activation of intracellular molecular pathways. Advances in Protein Chemistry and Structural Biology 2021; 127:1-53. [PMID: 34340765 DOI: 10.1016/bs.apcsb.2021.02.005] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 12/13/2022]
Abstract
Analysis of molecular pathway activation is a recent instrument that helps to quantify the activities of various intracellular signaling, structural, DNA synthesis and repair, and biochemical processes. This may have a deep impact on fundamental research, bioindustry, and medicine. Unlike gene ontology analyses and numerous qualitative methods that can establish whether a pathway is affected in principle, the quantitative approach has the advantage of measuring the extent of a pathway's up- or downregulation. This has led to the emergence of a new generation of molecular biomarkers, pathway activation levels, which reflect concentration changes of all measurable pathway components. The input data can be high-throughput proteomic or transcriptomic profiles, and the output numbers take both positive and negative values and positively reflect overall pathway activation. Due to their nature, pathway activation levels are more robust biomarkers than individual gene product/protein levels. Here, we review the current knowledge of quantitative gene expression interrogation methods and their applications for molecular pathway quantization. We consider the bioinformatic algorithms involved and their applications for solving real-world problems. Besides a plethora of applications in basic life sciences, quantitative pathway analysis can improve molecular design and clinical investigations in the pharmaceutical industry, can help in finding new active biotechnological components, and can significantly contribute to the progressive evolution of personalized medicine. In addition to the theoretical principles and concepts, we also propose publicly available software that uses large-scale protein/RNA expression data to assess human pathway activation levels.
9
Vladimirova U, Rumiantsev P, Zolotovskaia M, Albert E, Abrosimov A, Slashchuk K, Nikiforovich P, Chukhacheva O, Gaifullin N, Suntsova M, Zakharova G, Glusker A, Nikitin D, Garazha A, Li X, Kamashev D, Drobyshev A, Kochergina-Nikitskaya I, Sorokin M, Buzdin A. DNA repair pathway activation features in follicular and papillary thyroid tumors, interrogated using 95 experimental RNA sequencing profiles. Heliyon 2021; 7:e06408. [PMID: 33748479 PMCID: PMC7970325 DOI: 10.1016/j.heliyon.2021.e06408] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 09/04/2020] [Revised: 12/22/2020] [Accepted: 02/26/2021] [Indexed: 12/12/2022] Open
Abstract
DNA repair can prevent mutations and cancer development, but it can also restore damaged tumor cells after chemotherapy and radiation therapy. We performed RNA sequencing on 95 human pathological thyroid biosamples including 17 follicular adenomas, 23 follicular cancers, 3 medullary cancers, 51 papillary cancers and 1 poorly differentiated cancer. The gene expression profiles are annotated here with the clinical and histological diagnoses and, for papillary cancers, with BRAF gene V600E mutation status. DNA repair molecular pathway analysis showed strongly upregulated pathway activation levels for most of the differential pathways in the papillary cancers and a moderately upregulated pattern in the follicular cancers, when compared to the follicular adenomas. This was observed for the BRCA1, ATM, p53, excision repair, and mismatch repair pathways. This finding was validated using the independent thyroid tumor expression dataset PRJEB11591. We also analyzed gene expression patterns linked with radioiodine-resistant thyroid tumors (n = 13) and identified 871 differential genes that, according to Gene Ontology analysis, formed two functional groups: (i) response to topologically incorrect protein and (ii) aldo-keto reductase (NADP) activity. We also found RNA sequencing reads for two hybrid transcripts: one in-frame fusion for the well-known NCOA4-RET translocation, and another frameshift fusion of the ALK oncogene with a new partner, ARHGAP12. The latter could probably support increased expression of truncated ALK downstream from the 4th exon out of 28. Both fusions were found in papillary thyroid cancers of follicular histologic subtype with node metastases, one of them (NCOA4-RET) in a radioactive iodine-resistant tumor. The differences in DNA repair activation patterns may help to improve therapy of the different thyroid cancer types under investigation, and the data communicated may serve for finding additional markers of radioiodine resistance.
Affiliation(s)
- Uliana Vladimirova
- I.M. Sechenov First Moscow State Medical University, Moscow, 119991, Russia
- Pirogov Russian National Research Medical University, Moscow, 117997, Russia
- Pavel Rumiantsev
- Endocrinology Research Centre, Moscow, 117312, Russia
- Pirogov Russian National Research Medical University, Moscow, 117997, Russia
- Nurshat Gaifullin
- Faculty of Fundamental Medicine, Lomonosov Moscow State University, Moscow, 119992, Russia
- Maria Suntsova
- I.M. Sechenov First Moscow State Medical University, Moscow, 119991, Russia
- Alexander Glusker
- I.M. Sechenov First Moscow State Medical University, Moscow, 119991, Russia
- Daniil Nikitin
- Omicsway Corp., Walnut, CA, 91789, USA
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Moscow, 117997, Russia
- Xinmin Li
- Department of Pathology and Laboratory Medicine, University of California, Los Angeles, CA, 90095, USA
- Dmitriy Kamashev
- I.M. Sechenov First Moscow State Medical University, Moscow, 119991, Russia
- Alexei Drobyshev
- I.M. Sechenov First Moscow State Medical University, Moscow, 119991, Russia
- Maxim Sorokin
- I.M. Sechenov First Moscow State Medical University, Moscow, 119991, Russia
- Omicsway Corp., Walnut, CA, 91789, USA
- Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, 141701, Russia
- Anton Buzdin
- I.M. Sechenov First Moscow State Medical University, Moscow, 119991, Russia
- Omicsway Corp., Walnut, CA, 91789, USA
- Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, 141701, Russia
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Moscow, 117997, Russia
10
Borisov N, Ilnytskyy Y, Byeon B, Kovalchuk O, Kovalchuk I. System, Method and Software for Calculation of a Cannabis Drug Efficiency Index for the Reduction of Inflammation. Int J Mol Sci 2020; 22:388. [PMID: 33396562 PMCID: PMC7795809 DOI: 10.3390/ijms22010388] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/10/2020] [Revised: 12/26/2020] [Accepted: 12/28/2020] [Indexed: 12/19/2022] Open
Abstract
There are many varieties of Cannabis sativa that differ from each other in their composition of cannabinoids, terpenes and other molecules. The medicinal properties of these cultivars are often very different, with some being more efficient than others. This report describes the development of a method and software for analyzing the efficiency of various cannabis extracts in order to detect their anti-inflammatory properties. The method uses high-throughput gene expression profiling data but can potentially use other omics data as well. According to the signaling pathway topology, the gene expression profiles are convoluted into signaling pathway activities using a signaling pathway impact analysis (SPIA) method. The method was tested by inducing inflammation in human 3D epithelial tissues, including intestine, oral and skin, then exposing these tissues to various extracts and performing transcriptome analysis. The analysis showed that the extracts differed in their efficiency at restoring the transcriptome changes to the pre-inflammation state, thus allowing calculation of a cannabis drug efficiency index (CDEI) for each extract.
Affiliation(s)
- Nicolas Borisov
  - Moscow Institute of Physics and Technology, 9 Institutsky lane, Dolgoprudny, Moscow Region 141701, Russia
- Yaroslav Ilnytskyy
  - Department of Biological Sciences, University of Lethbridge, Lethbridge, AB T1K 3M4, Canada; (Y.I.); (B.B.); (O.K.)
  - Pathway Rx., 16 Sandstone Rd. S., Lethbridge, AB T1K 7X8, Canada
- Boseon Byeon
  - Department of Biological Sciences, University of Lethbridge, Lethbridge, AB T1K 3M4, Canada; (Y.I.); (B.B.); (O.K.)
  - Pathway Rx., 16 Sandstone Rd. S., Lethbridge, AB T1K 7X8, Canada
  - Biomedical and Health Informatics, Computer Science Department, State University of New York, 2 S Clinton St, Syracuse, NY 13202, USA
- Olga Kovalchuk
  - Department of Biological Sciences, University of Lethbridge, Lethbridge, AB T1K 3M4, Canada; (Y.I.); (B.B.); (O.K.)
  - Pathway Rx., 16 Sandstone Rd. S., Lethbridge, AB T1K 7X8, Canada
- Igor Kovalchuk
  - Department of Biological Sciences, University of Lethbridge, Lethbridge, AB T1K 3M4, Canada; (Y.I.); (B.B.); (O.K.)
  - Pathway Rx., 16 Sandstone Rd. S., Lethbridge, AB T1K 7X8, Canada
  - Correspondence:
|
11
|
Cancer gene expression profiles associated with clinical outcomes to chemotherapy treatments. BMC Med Genomics 2020; 13:111. [PMID: 32948183 PMCID: PMC7499993 DOI: 10.1186/s12920-020-00759-0] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 07/14/2020] [Accepted: 07/27/2020] [Indexed: 12/18/2022]
Abstract
Background: Machine learning (ML) methods still have limited applicability in personalized oncology due to the low number of available clinically annotated molecular profiles. This does not allow sufficient training of ML classifiers that could be used for improving molecular diagnostics. Methods: We reviewed published datasets of high-throughput gene expression profiles corresponding to cancer patients with known responses to chemotherapy treatments. We browsed the Gene Expression Omnibus (GEO), The Cancer Genome Atlas (TCGA) and Tumor Alterations Relevant for GEnomics-driven Therapy (TARGET) repositories. Results: We identified data collections suitable for building ML models that predict responses to certain chemotherapeutic schemes. We identified 26 datasets, ranging from 41 to 508 cases per dataset. All the datasets identified were checked for ML applicability and robustness with leave-one-out cross-validation. Twenty-three datasets with balanced numbers of treatment responder and non-responder cases were found suitable for ML. Conclusions: We collected a database of gene expression profiles associated with clinical responses to chemotherapy for 2786 individual cancer cases. Among them, seven datasets included RNA sequencing data (for 645 cases), and the others contained microarray expression profiles. The cases represented breast cancer, lung cancer, low-grade glioma, endothelial carcinoma, multiple myeloma, adult leukemia, pediatric leukemia and kidney tumors. Chemotherapeutics included taxanes, bortezomib, vincristine, trastuzumab, letrozole, tipifarnib, temozolomide, busulfan and cyclophosphamide.
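The leave-one-out check mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the nearest-centroid classifier, the function names, and the toy expression values are all assumptions chosen to keep the example self-contained.

```python
from statistics import mean


def nearest_centroid_predict(train_X, train_y, x):
    """Return the label whose class centroid is closest to x.

    Hypothetical stand-in classifier; any ML method could be used instead.
    """
    centroids = {}
    for label in set(train_y):
        rows = [row for row, y in zip(train_X, train_y) if y == label]
        # Column-wise mean of all training profiles with this label.
        centroids[label] = [mean(col) for col in zip(*rows)]

    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    # sorted() makes tie-breaking deterministic.
    return min(sorted(centroids), key=lambda lab: sq_dist(centroids[lab], x))


def leave_one_out_accuracy(X, y):
    """Hold out each sample once, train on the rest, and score the prediction."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if nearest_centroid_predict(train_X, train_y, X[i]) == y[i]:
            hits += 1
    return hits / len(X)


# Toy data (assumed): two "genes" per case, three cases per response class.
X = [[1.0, 2.0], [1.2, 1.9], [0.9, 2.1], [3.0, 0.5], [3.1, 0.4], [2.9, 0.6]]
y = ["responder"] * 3 + ["non-responder"] * 3
accuracy = leave_one_out_accuracy(X, y)
```

Leave-one-out is a natural choice for datasets of this size (41 to 508 cases), because each model is trained on all but one case, so very little of the scarce annotated data is withheld from training.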
|
12
|
Shi XJ, Wei Y, Ji B. Systems Biology of Gastric Cancer: Perspectives on the Omics-Based Diagnosis and Treatment. Front Mol Biosci 2020; 7:203. [PMID: 33005629 PMCID: PMC7479200 DOI: 10.3389/fmolb.2020.00203] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 04/17/2020] [Accepted: 07/27/2020] [Indexed: 12/14/2022]
Abstract
Gastric cancer is the fifth most diagnosed cancer in the world, affecting more than a million people and causing nearly 783,000 deaths each year. The prognosis of advanced gastric cancer remains extremely poor despite the use of surgery and adjuvant therapy. Therefore, understanding the mechanism of gastric cancer development and discovering novel diagnostic biomarkers and therapeutics are major goals in gastric cancer research. Here, we review recent progress in the application of omics technologies in gastric cancer research, with a special focus on the use of systems biology approaches to integrate multi-omics data. In addition, the association between gastrointestinal microbiota and gastric cancer is discussed, which may offer insights for exploring novel microbiota-targeted therapeutics. Finally, the application of data-driven systems biology and machine learning approaches could provide a predictive understanding of gastric cancer and pave the way to the development of novel biomarkers and the rational design of cancer therapeutics.
Affiliation(s)
- Xiao-Jing Shi
  - Laboratory Animal Center, State Key Laboratory of Esophageal Cancer Prevention and Treatment, Academy of Medical Science, Zhengzhou University, Zhengzhou, China
- Yongjun Wei
  - School of Pharmaceutical Sciences, Key Laboratory of Advanced Drug Preparation Technologies, Ministry of Education, Zhengzhou University, Zhengzhou, China
- Boyang Ji
  - Department of Biology and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Lyngby, Denmark
|
13
|
Jovčevska I. Next Generation Sequencing and Machine Learning Technologies Are Painting the Epigenetic Portrait of Glioblastoma. Front Oncol 2020; 10:798. [PMID: 32500035 PMCID: PMC7243123 DOI: 10.3389/fonc.2020.00798] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 11/03/2019] [Accepted: 04/23/2020] [Indexed: 12/31/2022]
Abstract
Even though they account for only 1.35% of cancer cases in the United States of America, brain tumors are considered one of the most lethal malignancies. The most aggressive and invasive type of brain tumor, glioblastoma, accounts for 60–70% of all gliomas and presents with a life expectancy of only 12–18 months. Despite trimodal treatment and advances in diagnostic and therapeutic methods, there have been no significant changes in patient outcome. Our understanding of glioblastoma was significantly improved by the introduction of next generation sequencing technologies. This led to the identification of different genetic and molecular subtypes, which greatly improved glioblastoma diagnosis. Still, because of the poor life expectancy, novel diagnostic and treatment methods are being broadly explored. Epigenetic modifications such as methylation and changes in histone acetylation are examples. Recently, in addition to genetic and molecular characteristics, epigenetic profiling of glioblastomas has also been used for sample classification. Further advancement of next generation sequencing technologies is expected to identify in detail the epigenetic signature of glioblastoma, which can open up new therapeutic opportunities for glioblastoma patients. This should be complemented with the use of computational power, i.e., machine and deep learning algorithms, for objective diagnostics and the design of individualized therapies. Using a combination of phenotypic, genotypic, and epigenetic parameters in glioblastoma diagnostics will bring us closer to precision medicine, where therapies will be tailored to suit the genetic profile and epigenetic signature of the tumor, which will grant longer life expectancy and better quality of life. Still, a number of obstacles have to be overcome on the way, including potential bias, availability of data for minorities in heterogeneous populations, data protection, and validation and independent testing of the learning algorithms.
Affiliation(s)
- Ivana Jovčevska
  - Medical Centre for Molecular Biology, Institute of Biochemistry, Faculty of Medicine, University of Ljubljana, Ljubljana, Slovenia
|
14
|
Tkachev V, Sorokin M, Borisov C, Garazha A, Buzdin A, Borisov N. Flexible Data Trimming Improves Performance of Global Machine Learning Methods in Omics-Based Personalized Oncology. Int J Mol Sci 2020; 21:713. [PMID: 31979006 PMCID: PMC7037338 DOI: 10.3390/ijms21030713] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 12/23/2019] [Revised: 01/16/2020] [Accepted: 01/17/2020] [Indexed: 12/21/2022]
Abstract
(1) Background: Machine learning (ML) methods are rarely used for omics-based prescription of cancer drugs, due to a shortage of case histories with clinical outcomes supplemented by high-throughput molecular data. This causes overtraining and high vulnerability of most ML methods. Recently, we proposed a hybrid global-local approach to ML termed floating window projective separator (FloWPS) that avoids extrapolation in the feature space. Its core property is data trimming, i.e., sample-specific removal of irrelevant features. (2) Methods: Here, we applied FloWPS to seven popular ML methods, including linear SVM, k nearest neighbors (kNN), random forest (RF), Tikhonov (ridge) regression (RR), binomial naïve Bayes (BNB), adaptive boosting (ADA) and multi-layer perceptron (MLP). (3) Results: We performed computational experiments for 21 high-throughput gene expression datasets (41–235 samples per dataset) representing a total of 1778 cancer patients with known responses to chemotherapy treatments. FloWPS substantially improved the classifier quality for all global ML methods (SVM, RF, BNB, ADA, MLP), where the area under the receiver operating characteristic curve (ROC AUC) for the treatment response classifiers increased from the 0.61–0.88 range to 0.70–0.94. We tested FloWPS-empowered methods for overtraining by interrogating the importance of different features for different ML methods in the same model datasets. (4) Conclusions: We showed that FloWPS increases the correlation of feature importance between the different ML methods, which indicates its robustness to overtraining. For all the datasets tested, the best performance of FloWPS data trimming was observed for the BNB method, which can be valuable for further building of ML classifiers in personalized oncology.
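The ROC AUC values quoted in this abstract can be read as the probability that a randomly chosen responder receives a higher classifier score than a randomly chosen non-responder. A minimal sketch of that pairwise definition follows. The function name, the 0/1 label encoding, and the toy scores are assumptions for illustration, and this is not the FloWPS code.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for responders, 0 for non-responders (assumed encoding).
    scores: classifier outputs, higher meaning "more likely a responder".
    Tied positive/negative pairs count as half a win.
    """
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one case of each class")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (assumed values): one responder is scored below one non-responder.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)  # 8 of the 9 responder/non-responder pairs are ordered correctly
```

On this scale 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives context for the reported shift from the 0.61–0.88 range to 0.70–0.94.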
Affiliation(s)
- Victor Tkachev
  - OmicsWayCorp, Walnut, CA 91788, USA; (V.T.); (M.S.); (A.G.)
- Maxim Sorokin
  - OmicsWayCorp, Walnut, CA 91788, USA; (V.T.); (M.S.); (A.G.)
  - Institute for Personalized Medicine, I.M. Sechenov First Moscow State Medical University, 119991 Moscow, Russia
- Constantin Borisov
  - National Research University Higher School of Economics, 101000 Moscow, Russia
- Andrew Garazha
  - OmicsWayCorp, Walnut, CA 91788, USA; (V.T.); (M.S.); (A.G.)
- Anton Buzdin
  - OmicsWayCorp, Walnut, CA 91788, USA; (V.T.); (M.S.); (A.G.)
  - Institute for Personalized Medicine, I.M. Sechenov First Moscow State Medical University, 119991 Moscow, Russia
  - Moscow Institute of Physics and Technology, 141701 Moscow Oblast, Russia
  - Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, 117997 Moscow, Russia
- Nicolas Borisov
  - OmicsWayCorp, Walnut, CA 91788, USA; (V.T.); (M.S.); (A.G.)
  - Institute for Personalized Medicine, I.M. Sechenov First Moscow State Medical University, 119991 Moscow, Russia
  - Moscow Institute of Physics and Technology, 141701 Moscow Oblast, Russia
  - Correspondence: Tel.: +7-903-218-7261
|