1. Borisov N, Tkachev V, Simonov A, Sorokin M, Kim E, Kuzmin D, Karademir-Yilmaz B, Buzdin A. Uniformly shaped harmonization combines human transcriptomic data from different platforms while retaining their biological properties and differential gene expression patterns. Front Mol Biosci 2023; 10:1237129. [PMID: 37745690 PMCID: PMC10511763 DOI: 10.3389/fmolb.2023.1237129] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2023] [Accepted: 08/28/2023] [Indexed: 09/26/2023]
Abstract
Introduction: Co-normalization of RNA profiles obtained using different experimental platforms and protocols opens an avenue for comprehensive comparison of relevant features such as differentially expressed genes associated with disease. Currently, most bioinformatic tools perform normalization in a flexible format that depends on the individual datasets under analysis, so the outputs of such normalizations are poorly compatible with each other. We recently proposed a new approach to gene expression data normalization, termed Shambhala, which returns harmonized data in a uniform shape, where every expression profile is transformed into a pre-defined universal format. We previously showed that following shambhalization of human RNA profiles, overall tissue-specific clustering features are strongly retained while platform-specific clustering is dramatically reduced. Methods: Here, we tested how well Shambhala retains fold-change gene expression features and other functional characteristics of gene clusters, such as pathway activation levels and predicted cancer drug activity scores. Results: Using 6,793 cancer and 11,135 normal tissue gene expression profiles from the literature and experimental datasets, we applied twelve performance criteria to different versions of Shambhala and to other transcriptomic harmonization methods with a flexible output data format. These criteria covered biological type classifiers, hierarchical clustering, correlation/regression properties, stability of drug efficiency scores, and data quality for machine learning classifiers. Discussion: The Shambhala-2 harmonizer demonstrated the best results, with correlation and linear regression coefficients close to 1 for the comparison of training vs validation datasets and less than half the instability in calculated drug efficiency scores compared to other methods.
Affiliation(s)
- Nicolas Borisov
  - Omicsway Corp, Walnut, CA, United States
  - Moscow Institute of Physics and Technology, Dolgoprudny, Russia
- Alexander Simonov
  - Moscow Institute of Physics and Technology, Dolgoprudny, Russia
  - Oncobox Ltd., Moscow, Russia
- Maxim Sorokin
  - Moscow Institute of Physics and Technology, Dolgoprudny, Russia
  - Oncobox Ltd., Moscow, Russia
  - World-Class Research Center “Digital Biodesign and Personalized Healthcare”, Sechenov First Moscow State Medical University, Moscow, Russia
- Ella Kim
  - Clinic for Neurosurgery, Laboratory of Experimental Neurooncology, Johannes Gutenberg University Medical Centre, Mainz, Germany
- Denis Kuzmin
  - Moscow Institute of Physics and Technology, Dolgoprudny, Russia
- Betul Karademir-Yilmaz
  - Department of Biochemistry, School of Medicine/Genetic and Metabolic Diseases Research and Investigation Center (GEMHAM) Marmara University, Istanbul, Türkiye
- Anton Buzdin
  - Moscow Institute of Physics and Technology, Dolgoprudny, Russia
  - World-Class Research Center “Digital Biodesign and Personalized Healthcare”, Sechenov First Moscow State Medical University, Moscow, Russia
  - Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Moscow, Russia
  - PathoBiology Group, European Organization for Research and Treatment of Cancer (EORTC), Brussels, Belgium
2. Neums L, Koestler DC, Xia Q, Hu J, Patel S, Bell-Glenn S, Pei D, Zhang B, Boyd S, Chalise P, Thompson JA. Assessing equivalent and inverse change in genes between diverse experiments. Frontiers in Bioinformatics 2022; 2:893032. [PMID: 36304274 PMCID: PMC9580844 DOI: 10.3389/fbinf.2022.893032] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/09/2022] [Accepted: 08/22/2022] [Indexed: 05/26/2024]
Abstract
Background: It is important to identify when two exposures impact a molecular marker (e.g., a gene's expression) in similar ways, for example, to learn that a new drug has a similar effect to an existing drug. Currently, statistically robust approaches for making comparisons of equivalence of effect sizes obtained from two independently run treatment vs. control comparisons have not been developed. Results: Here, we propose two approaches for evaluating the question of equivalence between effect sizes of two independent studies: a bootstrap test of the Equivalent Change Index (ECI), which we previously developed, and performing Two One-Sided t-Tests (TOST) on the difference in log-fold changes directly. The ECI of a gene is computed by taking the ratio of the effect size estimates obtained from the two different studies, weighted by the maximum of the two p-values and giving it a sign indicating if the effects are in the same or opposite directions, whereas TOST is a test of whether the difference in log-fold changes lies outside a region of equivalence. We used a series of simulation studies to compare the two tests on the basis of sensitivity, specificity, balanced accuracy, and F1-score. We found that TOST is not efficient for identifying equivalently changed gene expression values (F1-score = 0) because it is too conservative, while the ECI bootstrap test shows good performance (F1-score = 0.95). Furthermore, applying the ECI bootstrap test and TOST to publicly available microarray expression data from pancreatic cancer showed that, while TOST was not able to identify any equivalently or inversely changed genes, the ECI bootstrap test identified genes associated with pancreatic cancer. Additionally, when investigating publicly available RNAseq data of smoking vs. vaping, no equivalently changed genes were identified by TOST, but ECI bootstrap test identified genes associated with smoking. 
Conclusion: A bootstrap test of the ECI is a promising new statistical approach for determining if two diverse studies show similarity in the differential expression of genes and can help to identify genes which are similarly influenced by a specific treatment or exposure. The R package for the ECI bootstrap test is available at https://github.com/Hecate08/ECIbootstrap.
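The per-gene ECI described in this abstract can be illustrated with a minimal Python sketch. The abstract states only that the index is a ratio of the two effect sizes, weighted by the maximum of the two p-values and signed by direction. The specific choices below (smaller magnitude over larger magnitude for the ratio, and `1 - max(p1, p2)` as the weight) are assumptions made for illustration, and the function name `equivalent_change_index` is hypothetical rather than part of the ECIbootstrap package.

```python
def equivalent_change_index(beta1, p1, beta2, p2):
    """Illustrative ECI for one gene from two independent studies.

    beta1, beta2: effect size estimates (e.g. log-fold changes).
    p1, p2: the corresponding p-values.
    Assumed form (not confirmed by the abstract):
        sign * min(|b1|, |b2|) / max(|b1|, |b2|) * (1 - max(p1, p2))
    """
    if beta1 == 0 or beta2 == 0:
        # No change in one study: neither equivalent nor inverse.
        return 0.0
    # +1 when both effects point the same way, -1 when they are opposite.
    sign = 1.0 if beta1 * beta2 > 0 else -1.0
    # Ratio in (0, 1]: 1 means identical magnitudes.
    ratio = min(abs(beta1), abs(beta2)) / max(abs(beta1), abs(beta2))
    # Down-weight by the less significant of the two results.
    return sign * ratio * (1.0 - max(p1, p2))


if __name__ == "__main__":
    print(equivalent_change_index(1.2, 0.001, 1.1, 0.02))   # near +1: equivalent
    print(equivalent_change_index(2.0, 0.01, -1.0, 0.05))   # negative: inverse
```

Under these assumptions the value lies in [-1, 1], with values near +1 suggesting equivalent change and values near -1 suggesting inverse change. The bootstrap test itself (resampling to obtain a confidence interval) is not sketched here.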
Affiliation(s)
- Lisa Neums
  - Department of Biostatistics and Data Science, University of Kansas Medical Center, Kansas City, KS, United States
  - University of Kansas Cancer Center, Kansas City, KS, United States
- Devin C. Koestler
  - Department of Biostatistics and Data Science, University of Kansas Medical Center, Kansas City, KS, United States
  - University of Kansas Cancer Center, Kansas City, KS, United States
- Qing Xia
  - Department of Biostatistics and Data Science, University of Kansas Medical Center, Kansas City, KS, United States
  - University of Kansas Cancer Center, Kansas City, KS, United States
- Jinxiang Hu
  - Department of Biostatistics and Data Science, University of Kansas Medical Center, Kansas City, KS, United States
  - University of Kansas Cancer Center, Kansas City, KS, United States
- Shachi Patel
  - Department of Biostatistics and Data Science, University of Kansas Medical Center, Kansas City, KS, United States
  - University of Kansas Cancer Center, Kansas City, KS, United States
- Shelby Bell-Glenn
  - Department of Biostatistics and Data Science, University of Kansas Medical Center, Kansas City, KS, United States
  - University of Kansas Cancer Center, Kansas City, KS, United States
- Dong Pei
  - Department of Biostatistics and Data Science, University of Kansas Medical Center, Kansas City, KS, United States
  - University of Kansas Cancer Center, Kansas City, KS, United States
- Bo Zhang
  - Department of Biostatistics and Data Science, University of Kansas Medical Center, Kansas City, KS, United States
- Samuel Boyd
  - Department of Biostatistics and Data Science, University of Kansas Medical Center, Kansas City, KS, United States
  - University of Kansas Cancer Center, Kansas City, KS, United States
- Prabhakar Chalise
  - Department of Biostatistics and Data Science, University of Kansas Medical Center, Kansas City, KS, United States
  - University of Kansas Cancer Center, Kansas City, KS, United States
- Jeffrey A. Thompson
  - Department of Biostatistics and Data Science, University of Kansas Medical Center, Kansas City, KS, United States
  - University of Kansas Cancer Center, Kansas City, KS, United States
3. Transcriptomic Harmonization as the Way for Suppressing Cross-Platform Bias and Batch Effect. Biomedicines 2022; 10:2318. [PMID: 36140419 PMCID: PMC9496268 DOI: 10.3390/biomedicines10092318] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/22/2022] [Revised: 09/14/2022] [Accepted: 09/16/2022] [Indexed: 11/16/2022]
Abstract
(1) Background: The emergence of methods interrogating gene expression at high throughput gave birth to quantitative transcriptomics, but also raised the question of how to compare expression profiles obtained using different equipment and protocols and/or in different series of experiments. Addressing this issue is challenging, because all of the above variables can dramatically influence gene expression signals and, therefore, cause a plethora of peculiar features in the transcriptomic profiles. Millions of transcriptomic profiles have been obtained and deposited in public databases, but their usefulness is strongly limited by these inter-comparison issues; (2) Methods: Dozens of methods and software packages, which can be broadly classified as either flexible or predefined format harmonizers, have been proposed, but none has yet become the gold standard for unifying this type of Big Data; (3) Results: However, recent developments show that platform/protocol/batch bias can be efficiently reduced, and not only for comparisons of limited transcriptomic datasets: instruments have been proposed for transforming gene expression profiles into a universal, uniformly shaped format that can support multiple inter-comparisons at reasonable computational cost. This forms a basis for universal indexing of all or most types of RNA sequencing and microarray hybridization profiles; (4) Conclusions: In this paper, we give an overview of the landscape of modern approaches and methods in transcriptomic harmonization, focusing on the practical aspects of their application.
4. Yu J, Tu W, Payne A, Rudyk C, Cuadros Sanchez S, Khilji S, Kumarathasan P, Subedi S, Haley B, Wong A, Anghel C, Wang Y, Chauhan V. Adverse Outcome Pathways and Linkages to Transcriptomic Effects Relevant to Ionizing Radiation Injury. Int J Radiat Biol 2022; 98:1789-1801. [PMID: 35939063 DOI: 10.1080/09553002.2022.2110313] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 02/09/2023]
Abstract
BACKGROUND In the past three decades, a large body of data on the effects of exposure to ionizing radiation and the ensuing changes in gene expression has been generated. These data have allowed for an understanding of molecular-level events and shown a level of consistency in response despite the vast formats and experimental procedures being used across institutions. However, clarity on how this information may inform strategies for health risk assessment needs to be explored. An approach to bridge this gap is the adverse outcome pathway (AOP) framework. AOPs represent an illustrative framework characterizing a stressor associated with a sequential set of causally linked key events (KEs) at different levels of biological organization, beginning with a molecular initiating event (MIE) and culminating in an adverse outcome (AO). Here, we demonstrate the interpretation of transcriptomic datasets in the context of the AOP framework within the field of ionizing radiation by using a lung cancer AOP (AOP 272: https://www.aopwiki.org/aops/272) as a case example. METHODS Through the mining of the literature, radiation exposure-related transcriptomic studies in line with AOP 272 related to lung cancer, DNA damage response, and repair were identified. The differentially expressed genes within relevant studies were collated and subjected to the pathway and network analysis using Reactome and GeneMANIA platforms. Identified pathways were filtered (p < 0.001, ≥ 3 genes) and categorized based on relevance to KEs in the AOP. Gene connectivities were identified and further grouped by gene expression-informed associated events (AEs). Relevant quantitative dose-response data were used to inform the directionality in the expression of the genes in the network across AEs. RESULTS Reactome analyses identified 7 high-level biological processes with multiple pathways and associated genes that mapped to potential KEs in AOP 272. 
The gene connectivities were further represented as a network of AEs with associated expression profiles that highlighted patterns of gene expression levels. CONCLUSIONS This study demonstrates the application of transcriptomics data in AOP development and provides information on potential data gaps. Although the approach is new and anticipated to evolve, it shows promise for improving the understanding of underlying mechanisms of disease progression with a long-term vision to be predictive of adverse outcomes.
Affiliation(s)
- Jihang Yu
  - Canadian Nuclear Laboratories, Chalk River, Ontario, Canada
- Wangshu Tu
  - Carleton University, Ottawa, Ontario, Canada
- Chris Rudyk
  - Carleton University, Ottawa, Ontario, Canada
- Brittany Haley
  - Canadian Nuclear Laboratories, Chalk River, Ontario, Canada
- Alicia Wong
  - Canadian Nuclear Laboratories, Chalk River, Ontario, Canada
  - McMaster University, Hamilton, Ontario, Canada
- Yi Wang
  - Canadian Nuclear Laboratories, Chalk River, Ontario, Canada
  - University of Ottawa, Ottawa, Ontario, Canada
5. Borisov N, Sorokin M, Zolotovskaya M, Borisov C, Buzdin A. Shambhala-2: A Protocol for Uniformly Shaped Harmonization of Gene Expression Profiles of Various Formats. Curr Protoc 2022; 2:e444. [PMID: 35617464 DOI: 10.1002/cpz1.444] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/11/2022]
Abstract
Uniformly shaped harmonization of gene expression profiles is central for the simultaneous comparison of multiple gene expression datasets. It is expected to operate with the gene expression data obtained using various experimental methods and equipment, and to return harmonized profiles in a uniform shape. Such uniformly shaped expression profiles from different initial datasets can be further compared directly. However, current harmonization techniques have strong limitations that prevent their broad use for bioinformatic applications. They can either operate with only up to two datasets/platforms or return data in a dynamic format that will be different for every comparison under analysis. This also does not allow for adding new data to the previously harmonized dataset(s), which complicates the analysis and increases calculation costs. We propose here a new method termed Shambhala-2 that can transform multi-platform expression data into a universal format that is identical for all harmonizations made using this technique. Shambhala-2 is based on sample-by-sample cubic conversion of the initial expression dataset into a preselected shape of the reference definitive dataset. Using 8390 samples of 12 healthy human tissue types and 4086 samples of colorectal, kidney, and lung cancer tissues, we verified Shambhala-2's capacity in restoring tissue-specific expression patterns for seven microarray and three RNA sequencing platforms. Shambhala-2 performed well for all tested combinations of RNAseq and microarray profiles, and retained gene-expression ranks, as evidenced by high correlations between different single- or aggregated gene expression metrics in pre- and post-Shambhalized samples, including preserving cancer-specific gene expression and pathway activation features. © 2022 Wiley Periodicals LLC. 
- Basic Protocol: Shambhala-2 harmonizer
- Alternate Protocol 1: Linear Shambhala/Shambhala-1
- Alternate Protocol 2: Alternative (flexible-format and uniformly shaped) normalization methods
- Support Protocol 1: Watermelon multisection (WM)
- Support Protocol 2: Calculation of cancer-to-normal log-fold-change (LFC) and pathway activation level (PAL)
Affiliation(s)
- Nicolas Borisov
  - Omicsway Corp., Walnut, California
  - Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, Russia
- Maksim Sorokin
  - Omicsway Corp., Walnut, California
  - Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, Russia
  - I.M. Sechenov First Moscow State Medical University, Moscow, Russia
- Marianna Zolotovskaya
  - Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, Russia
  - Oncobox Ltd., Moscow, Russia
- Anton Buzdin
  - Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, Russia
  - Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Moscow, Russia
  - World-Class Research Center "Digital biodesign and personalized healthcare", Sechenov First Moscow State Medical University, Moscow, Russia
  - PathoBiology Group, European Organization for Research and Treatment of Cancer (EORTC), Brussels, Belgium
6. Federico A, Saarimäki LA, Serra A, Del Giudice G, Kinaret PAS, Scala G, Greco D. Microarray Data Preprocessing: From Experimental Design to Differential Analysis. Methods Mol Biol 2022; 2401:79-100. [PMID: 34902124 DOI: 10.1007/978-1-0716-1839-4_7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/14/2023]
Abstract
DNA microarray data preprocessing is of utmost importance in the analytical path starting from the experimental design and leading to a reliable biological interpretation. In fact, when all relevant aspects regarding the experimental plan have been considered, the following steps from data quality check to differential analysis will lead to robust, trustworthy results. In this chapter, all the relevant aspects and considerations about microarray preprocessing will be discussed. Preprocessing steps are organized in an orderly manner, from experimental design to quality check and batch effect removal, including the most common visualization methods. Furthermore, we will discuss data representation and differential testing methods with a focus on the most common microarray technologies, such as gene expression and DNA methylation.
Affiliation(s)
- Antonio Federico
  - Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
  - BioMediTech Institute, Tampere University, Tampere, Finland
  - Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Tampere University, Tampere, Finland
- Laura Aliisa Saarimäki
  - Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
  - BioMediTech Institute, Tampere University, Tampere, Finland
  - Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Tampere University, Tampere, Finland
- Angela Serra
  - Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
  - BioMediTech Institute, Tampere University, Tampere, Finland
  - Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Tampere University, Tampere, Finland
- Giusy Del Giudice
  - Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
  - BioMediTech Institute, Tampere University, Tampere, Finland
  - Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Tampere University, Tampere, Finland
- Pia Anneli Sofia Kinaret
  - Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
  - BioMediTech Institute, Tampere University, Tampere, Finland
  - Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Tampere University, Tampere, Finland
  - Institute of Biotechnology, University of Helsinki, Helsinki, Finland
- Giovanni Scala
  - Department of Biology, University of Naples Federico II, Naples, Italy
- Dario Greco
  - Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
  - BioMediTech Institute, Tampere University, Tampere, Finland
  - Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Tampere University, Tampere, Finland
  - Institute of Biotechnology, University of Helsinki, Helsinki, Finland
7. Emdadi A, Eslahchi C. Clinical drug response prediction from preclinical cancer cell lines by logistic matrix factorization approach. J Bioinform Comput Biol 2021; 20:2150035. [PMID: 34923927 DOI: 10.1142/s0219720021500359] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
Predicting tumor drug response using cancer cell line drug response values for a large number of anti-cancer drugs is a significant challenge in personalized medicine. Predicting patient response to drugs from data obtained from preclinical models is made easier by the availability of different knowledge on cell lines and drugs. This paper proposes the TCLMF method, a model for predicting drug response in tumor samples that was trained on preclinical samples and is based on the logistic matrix factorization approach. The TCLMF model is designed based on gene expression profiles, tissue type information, the chemical structure of drugs and drug sensitivity (IC50) data from cancer cell lines. We use preclinical data from the Genomics of Drug Sensitivity in Cancer dataset (GDSC) to train the proposed drug response model, which we then use to predict drug sensitivity of samples from the Cancer Genome Atlas (TCGA) dataset. The TCLMF approach focuses on identifying successful features of cell lines and drugs in order to calculate the probability of the tumor samples being sensitive to drugs. In this study, the closest cell line neighbours for each tumor sample are calculated using a description of similarity between tumor samples and cell lines. The drug response for a new tumor is then calculated by averaging the low-rank features obtained from its neighboring cell lines. We compare the results of the TCLMF model with the results of the previously proposed methods using two databases and two approaches to test the model's performance. In the first approach, 12 drugs with enough known clinical drug response, considered in previous methods, are studied. For 7 drugs out of 12, the TCLMF can significantly distinguish between patients who are resistant to these drugs and patients who are sensitive to them. In the second approach, the methods are converted to classification models using a threshold, and the results are compared.
The results demonstrate that the TCLMF method provides accurate predictions across the results of the other algorithms. Finally, we accurately classify tumor tissue type using the latent vectors obtained from TCLMF's logistic matrix factorization process. These findings demonstrate that the TCLMF approach produces effective latent vectors for tumor samples. The source code of the TCLMF method is available in https://github.com/emdadi/TCLMF.
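The neighbour-averaging step described in this abstract can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the TCLMF implementation: cosine similarity stands in for the unspecified similarity measure, the neighbourhood size `k` is arbitrary, bias terms are omitted, and the function names (`cosine`, `tumor_latent`, `sensitivity_probability`) are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two expression vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def tumor_latent(tumor_expr, cell_exprs, cell_latents, k=2):
    """Average the latent vectors of the k cell lines most similar to the tumor."""
    ranked = sorted(
        range(len(cell_exprs)),
        key=lambda i: cosine(tumor_expr, cell_exprs[i]),
        reverse=True,
    )
    nearest = ranked[:k]
    dim = len(cell_latents[0])
    return [sum(cell_latents[i][d] for i in nearest) / len(nearest) for d in range(dim)]


def sensitivity_probability(tumor_vec, drug_vec):
    """Logistic link on the latent dot product, as in logistic matrix factorization."""
    score = sum(t * d for t, d in zip(tumor_vec, drug_vec))
    return 1.0 / (1.0 + math.exp(-score))


if __name__ == "__main__":
    # Toy data: three cell lines with 2-gene profiles and 2-dimensional latent vectors.
    cell_exprs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
    cell_latents = [[1.0, 0.0], [3.0, 2.0], [-5.0, -5.0]]
    u = tumor_latent([1.0, 0.05], cell_exprs, cell_latents, k=2)
    print(u, sensitivity_probability(u, [0.5, -0.2]))
```

In a real setting the cell line and drug latent vectors would come from fitting the factorization to GDSC data; here they are toy values chosen only to show the data flow.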
Affiliation(s)
- Akram Emdadi
  - Department of Computer and Data Sciences, Faculty of Mathematical Sciences, Shahid Beheshti University, Tehran, Iran
- Changiz Eslahchi
  - Department of Computer and Data Sciences, Faculty of Mathematical Sciences, Shahid Beheshti University, Tehran, Iran
  - School of Biological Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran
8. Wang L, Mo C, Wang L, Cheng M. Identification of genes and pathways related to breast cancer metastasis in an integrated cohort. Eur J Clin Invest 2021; 51:e13525. [PMID: 33615456 DOI: 10.1111/eci.13525] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/25/2020] [Revised: 01/20/2021] [Accepted: 02/18/2021] [Indexed: 12/24/2022]
Abstract
BACKGROUND Breast cancer is the most common malignant disease in women. Metastasis is the most common cause of death from this cancer. Screening genes related to breast cancer metastasis may help elucidate the mechanisms governing metastasis and identify molecular targets for antimetastatic therapy. The development of advanced algorithms enables us to perform cross-study analysis to improve the robustness of the results. MATERIALS AND METHODS Ten data sets meeting our criteria for differential expression analyses were obtained from the Gene Expression Omnibus (GEO) database. Among these data sets, five based on the same platform were formed into a large cohort using the XPN algorithm. Differentially expressed genes (DEGs) associated with breast cancer metastasis were identified using the differential expression via distance synthesis (DEDS) algorithm. A cross-platform method was employed to verify these DEGs in all ten selected data sets. The top 50 validated DEGs are represented with heat maps. Based on the validated DEGs, Gene Ontology (GO) functional and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were performed. Protein interaction (PPI) networks were constructed to further illustrate the direct and indirect associations among the DEGs. Survival analysis was performed to explore whether these genes can affect breast cancer patient prognosis. RESULTS A total of 817 DEGs were identified using the DEDS algorithm. Of these DEGs, 450 genes were validated by the second algorithm. Enriched KEGG pathway terms demonstrated that these 450 DEGs may be involved in the cell cycle and oocyte meiosis in addition to their functions in ECM-receptor interaction and protein digestion and absorption. PPI network analysis for the proteins encoded by the DEGs indicated that these genes may be primarily involved in the cell cycle and extracellular matrix. 
In particular, several genes played roles in multiple signalling pathways and were related to patient survival. These genes were also observed to be targetable in the CTD2 database. CONCLUSIONS Our study analysed multiple cross-platform data sets using two different algorithms, helping elucidate the molecular mechanisms and identify several potential therapeutic targets of metastatic breast cancer. In addition, several genes exhibited promise for applications in targeted therapy against metastasis in future research.
Affiliation(s)
- Lingchen Wang
  - Center for Experimental Medicine, The First Affiliated Hospital of Nanchang University, Nanchang, China
  - Department of Biostatistics, School of Public Health, Nanchang University, Nanchang, China
- Changgan Mo
  - Department of Cardiology, The People's Hospital of Hechi, Hechi, China
- Liqin Wang
  - Department of Traditional Chinese Medicine, The First Affiliated Hospital of Nanchang University, Nanchang, China
- Minzhang Cheng
  - Center for Experimental Medicine, The First Affiliated Hospital of Nanchang University, Nanchang, China
  - Jiangxi Key Laboratory of Molecular Diagnostics and Precision Medicine, Nanchang, China
9. Using proteomic and transcriptomic data to assess activation of intracellular molecular pathways. Advances in Protein Chemistry and Structural Biology 2021; 127:1-53. [PMID: 34340765 DOI: 10.1016/bs.apcsb.2021.02.005] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 12/13/2022]
Abstract
Analysis of molecular pathway activation is a recent instrument that helps to quantize the activities of various intracellular signaling, structural, DNA synthesis and repair, and biochemical processes. This may have a deep impact on fundamental research, bioindustry, and medicine. Unlike gene ontology analyses and numerous qualitative methods that can establish whether a pathway is affected in principle, the quantitative approach has the advantage of measuring the extent of a pathway's up- or downregulation. This has led to the emergence of a new generation of molecular biomarkers, pathway activation levels, which reflect concentration changes of all measurable pathway components. The input data can be high-throughput proteomic or transcriptomic profiles, and the output numbers take both positive and negative values and positively reflect overall pathway activation. Due to their nature, pathway activation levels are more robust biomarkers than individual gene product or protein levels. Here, we review the current knowledge of quantitative gene expression interrogation methods and their applications for molecular pathway quantization. We consider the underlying bioinformatic algorithms and their applications for solving real-world problems. Besides a plethora of applications in basic life sciences, quantitative pathway analysis can improve molecular design and clinical investigations in the pharmaceutical industry, can help find new active biotechnological components, and can significantly contribute to the progressive evolution of personalized medicine. In addition to the theoretical principles and concepts, we also propose publicly available software for using large-scale protein/RNA expression data to assess human pathway activation levels.
10. Wang L, Chu CY, McCall MN, Slaunwhite C, Holden-Wiltse J, Corbett A, Falsey AR, Topham DJ, Caserta MT, Mariani TJ, Walsh EE, Qiu X. Airway gene-expression classifiers for respiratory syncytial virus (RSV) disease severity in infants. BMC Med Genomics 2021; 14:57. [PMID: 33632195 PMCID: PMC7908785 DOI: 10.1186/s12920-021-00913-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 08/20/2020] [Accepted: 02/19/2021] [Indexed: 02/08/2023]
Abstract
BACKGROUND A substantial number of infants infected with RSV develop severe symptoms requiring hospitalization. We currently lack accurate biomarkers that are associated with severe illness. METHOD We defined airway gene expression profiles based on RNA sequencing of nasal brush samples from 106 full-term, previously healthy, RSV-infected subjects during acute infection (day 1-10 of illness) and the convalescence stage (day 28 of illness). All subjects were assigned a clinical illness severity score (GRSS). Using AIC-based model selection, we built a sparse linear correlate of GRSS based on 41 genes (NGSS1). We also built an alternate model based upon 13 genes associated with severe infection acutely but displaying stable expression over time (NGSS2). RESULTS NGSS1 is strongly correlated with disease severity, demonstrating a naïve correlation of ρ = 0.935 and a cross-validated correlation of 0.813. As a binary classifier (mild versus severe), NGSS1 correctly classifies disease severity in 89.6% of the subjects following cross-validation. NGSS2 has slightly lower, but comparable, accuracy, with a cross-validated correlation of 0.741 and classification accuracy of 84.0%. CONCLUSION Airway gene expression patterns, obtained following a minimally invasive procedure, have potential utility for development of clinically useful biomarkers that correlate with disease severity in primary RSV infection.
Affiliation(s)
- Lu Wang
  - Department of Biostatistics and Computational Biology, University of Rochester School Medicine, Rochester, NY, USA
- Chin-Yi Chu
  - Department of Pediatrics, University of Rochester School Medicine, Rochester, NY, USA
- Matthew N McCall
  - Department of Biostatistics and Computational Biology, University of Rochester School Medicine, Rochester, NY, USA
- Jeanne Holden-Wiltse
  - Department of Biostatistics and Computational Biology, University of Rochester School Medicine, Rochester, NY, USA
- Anthony Corbett
  - Department of Biostatistics and Computational Biology, University of Rochester School Medicine, Rochester, NY, USA
- Ann R Falsey
  - Department of Medicine, University of Rochester School Medicine, Rochester, NY, USA
  - Department of Medicine, Rochester General Hospital, Rochester, NY, USA
- David J Topham
  - Department of Microbiology and Immunology, University of Rochester School Medicine, Rochester, NY, USA
- Mary T Caserta
  - Department of Pediatrics, University of Rochester School Medicine, Rochester, NY, USA
- Thomas J Mariani
  - Department of Pediatrics, University of Rochester School Medicine, Rochester, NY, USA.
- Edward E Walsh
  - Department of Medicine, University of Rochester School Medicine, Rochester, NY, USA.
  - Department of Medicine, Rochester General Hospital, Rochester, NY, USA.
- Xing Qiu
  - Department of Biostatistics and Computational Biology, University of Rochester School Medicine, Rochester, NY, USA.

11
Junet V, Farrés J, Mas JM, Daura X. CuBlock: a cross-platform normalization method for gene-expression microarrays. Bioinformatics 2021; 37:2365-2373. [PMID: 33609102 PMCID: PMC8388031 DOI: 10.1093/bioinformatics/btab105] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/14/2020] [Revised: 02/04/2021] [Accepted: 02/16/2021] [Indexed: 12/28/2022] Open
Abstract
Motivation Cross-(multi)platform normalization of gene-expression microarray data remains an unresolved issue. Despite the existence of several algorithms, they are either constrained by the need to normalize all samples of all platforms together, compromising scalability and reuse, by adherence to the platforms of a specific provider, or simply by poor performance. In addition, many of the methods presented in the literature have not been specifically tested against multi-platform data and/or other methods applicable in this context. Thus, we set out to develop a normalization algorithm appropriate for gene-expression studies based on multiple, potentially large microarray sets collected along multiple platforms and at different times, applicable in systematic studies aimed at extracting knowledge from the wealth of microarray data available in public repositories; for example, for the extraction of Real-World Data to complement data from Randomized Controlled Trials. Our main focus or criterion for performance was on the capacity of the algorithm to properly separate samples from different biological groups. Results We present CuBlock, an algorithm addressing this objective, together with a strategy to validate cross-platform normalization methods. To validate the algorithm and benchmark it against existing methods, we used two distinct datasets, one specifically generated for testing and standardization purposes and one from an actual experimental study. Using these datasets, we benchmarked CuBlock against ComBat (Johnson et al., 2007), UPC (Piccolo et al., 2013), YuGene (Lê Cao et al., 2014), DBNorm (Meng et al., 2017), Shambhala (Borisov et al., 2019) and a simple log2 transform as reference. We note that many other popular normalization methods are not applicable in this context. 
CuBlock was the only algorithm in this group that could always and clearly differentiate the underlying biological groups after mixing the data, from up to six different platforms in this study. Availability and implementation CuBlock can be downloaded from https://www.mathworks.com/matlabcentral/fileexchange/77882-cublock. Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Valentin Junet
  - Anaxomics Biotech SL, Barcelona, 08008, Spain; Institute of Biotechnology and Biomedicine, Universitat Autònoma de Barcelona, 08193, Spain
- José M Mas
  - Anaxomics Biotech SL, Barcelona, 08008, Spain
- Xavier Daura
  - Institute of Biotechnology and Biomedicine, Universitat Autònoma de Barcelona, 08193, Spain; Catalan Institution for Research and Advanced Studies (ICREA), Barcelona, 08010, Spain

12
Lung PY, Zhong D, Pang X, Li Y, Zhang J. Maximizing the reusability of gene expression data by predicting missing metadata. PLoS Comput Biol 2020; 16:e1007450. [PMID: 33156882 PMCID: PMC7673503 DOI: 10.1371/journal.pcbi.1007450] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/28/2019] [Revised: 11/18/2020] [Accepted: 10/09/2020] [Indexed: 11/18/2022] Open
Abstract
Reusability is part of the FAIR data principle, which aims to make data Findable, Accessible, Interoperable, and Reusable. One of the current efforts to increase the reusability of public genomics data has been to focus on the inclusion of quality metadata associated with the data. When necessary metadata are missing, most researchers will consider the data useless. In this study, we developed a framework to predict the missing metadata of gene expression datasets to maximize their reusability. We found that when using predicted data to conduct other analyses, it is not optimal to use all the predicted data. Instead, one should only use the subset of data, which can be predicted accurately. We proposed a new metric called Proportion of Cases Accurately Predicted (PCAP), which is optimized in our specifically-designed machine learning pipeline. The new approach performed better than pipelines using commonly used metrics such as F1-score in terms of maximizing the reusability of data with missing values. We also found that different variables might need to be predicted using different machine learning methods and/or different data processing protocols. Using differential gene expression analysis as an example, we showed that when missing variables are accurately predicted, the corresponding gene expression data can be reliably used in downstream analyses.
Affiliation(s)
- Pei-Yau Lung
  - Department of Statistics, Florida State University, Tallahassee, United States of America
- Dongrui Zhong
  - Department of Statistics, Florida State University, Tallahassee, United States of America
- Xiaodong Pang
  - Insilicom LLC, Tallahassee, United States of America
- Yan Li
  - Department of Breast Surgery, Peking Union Medical College Hospital, Peking Union Medical College, Chinese Academy of Medical Sciences, Beijing, China
- Jinfeng Zhang
  - Department of Statistics, Florida State University, Tallahassee, United States of America

13
Leng D, Yi J, Xiang M, Zhao H, Zhang Y. Identification of common signatures in idiopathic pulmonary fibrosis and lung cancer using gene expression modeling. BMC Cancer 2020; 20:986. [PMID: 33046043 PMCID: PMC7552373 DOI: 10.1186/s12885-020-07494-w] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 08/14/2019] [Accepted: 10/05/2020] [Indexed: 12/24/2022] Open
Abstract
Background Idiopathic pulmonary fibrosis (IPF) is associated with an increased risk for lung cancer, but the underlying mechanisms driving malignant transformation remain largely unknown. This study aimed to identify differentially expressed genes (DEGs) distinguishing IPF and lung cancer from healthy individuals and common genes driving the transformation from healthy to IPF and lung cancer. Methods The gene expression data for IPF and non-small cell lung cancer (NSCLC) were retrieved from the Gene Expression Omnibus (GEO) database. The DEG signatures were identified via unsupervised two-way clustering (TWC) analysis, supervised support vector machine analysis, dimensional reduction, and mutual exclusivity analysis. Gene enrichment and pathway analyses were performed to identify common signaling pathways. The most significant signature genes in common among IPF and lung cancer were further verified by immunohistochemistry. Results The gene expression data from GSE24206 and GSE18842 were merged into a super array dataset comprising 86 patients with lung disorders (17 IPF and 46 NSCLC) and 51 healthy controls and measuring 23,494 unique genes. Seventy-nine signature DEGs were found among IPF and NSCLC. The peroxisome proliferator-activated receptor (PPAR) signaling pathway was the most enriched pathway associated with lung disorders, and matrix metalloproteinase-1 (MMP-1) in this pathway was mutually exclusive with several genes in IPF and NSCLC. Subsequent immunohistochemical analysis verified enhanced MMP1 expression in NSCLC associated with IPF. Conclusions For the first time, we defined common signature genes for IPF and NSCLC. The mutually exclusive sets of genes were potential drivers for IPF and NSCLC.
Affiliation(s)
- Dong Leng
  - Clinical Laboratory, Beijing Chao-Yang Hospital, Capital Medical University, Beijing, 100020, China
- Jiawen Yi
  - Department of Respiratory and Critical Care Medicine, Beijing Chao-Yang Hospital, Capital Medical University, No. 8 Gongti South Road, Beijing, 100020, China
- Maodong Xiang
  - Tokyo Institute of Technology, 4259 Nagatsuta-cho, Midori-ku, Yokohama, Kanagawa, 226-8503, Japan
- Hongying Zhao
  - Department of Pathology, Beijing Chao-Yang Hospital, Capital Medical University, Beijing, 100020, China
- Yuhui Zhang
  - Department of Respiratory and Critical Care Medicine, Beijing Chao-Yang Hospital, Capital Medical University, No. 8 Gongti South Road, Beijing, 100020, China.

14
Zhang S, Shao J, Yu D, Qiu X, Zhang J. MatchMixeR: a cross-platform normalization method for gene expression data integration. Bioinformatics 2020; 36:2486-2491. [PMID: 31904810 DOI: 10.1093/bioinformatics/btz974] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 04/30/2019] [Revised: 09/19/2019] [Accepted: 12/31/2019] [Indexed: 01/18/2023] Open
Abstract
MOTIVATION Combining gene expression (GE) profiles generated from different platforms enables studies that were previously infeasible because of sample size limitations. Several cross-platform normalization methods have been developed to remove the systematic differences between platforms, but they may also remove meaningful biological differences among datasets. In this work, we propose a novel approach that removes the platform differences, not the biological differences. In this approach, dubbed 'MatchMixeR', we model platform differences by a linear mixed effects regression (LMER) model, and estimate them from matched GE profiles of the same cell line or tissue measured on different platforms. The resulting model can then be used to remove platform differences in other datasets. By using LMER, we achieve a better bias-variance trade-off in parameter estimation. We also design a computationally efficient algorithm based on the moment method, which is ideal for ultra-high-dimensional LMER analysis. RESULTS Compared with several prominent competing methods, MatchMixeR achieved the highest after-normalization concordance. Subsequent differential expression analyses based on datasets integrated from different platforms showed that using MatchMixeR achieved the best trade-off between true and false discoveries, and this advantage is more apparent in datasets with limited samples or unbalanced group proportions. AVAILABILITY AND IMPLEMENTATION Our method is implemented in an R package, 'MatchMixeR', freely available at: https://github.com/dy16b/Cross-Platform-Normalization. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
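The idea of estimating platform differences from matched profiles and then applying the fitted model to other data can be sketched as follows. This is a deliberately simplified stand-in, not the MatchMixeR algorithm: it fits an independent ordinary least-squares line per gene (fixed effects only) instead of a linear mixed effects model, so it lacks the shrinkage that gives LMER its bias-variance advantage. The gene names, values, and function names are hypothetical.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for y ~ a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((v - mean_x) ** 2 for v in x)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_platform_model(matched):
    """Fit one line per gene from matched (platform A, platform B) measurements."""
    return {gene: fit_line(a, b) for gene, (a, b) in matched.items()}


def to_platform_b(profile, model):
    """Map a platform-A profile onto the platform-B scale, gene by gene."""
    return {gene: model[gene][0] + model[gene][1] * value
            for gene, value in profile.items() if gene in model}


# Hypothetical matched samples: the same specimens measured on both platforms.
matched = {
    "GENE1": ([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0]),
    "GENE2": ([2.0, 4.0, 6.0, 8.0], [1.0, 2.0, 3.0, 4.0]),
}
model = fit_platform_model(matched)

# A new platform-A profile from an unrelated dataset.
converted = to_platform_b({"GENE1": 5.0, "GENE2": 10.0}, model)
print(converted)
```

Because the model is estimated once from matched samples, it can be reused on new datasets without refitting, which is the property the abstract emphasizes.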
Affiliation(s)
- Serin Zhang
  - Department of Statistics, Florida State University, Tallahassee, FL 32306, USA
- Jiang Shao
  - Gilead Sciences Inc., Foster City, CA 94404, USA
- Disa Yu
  - Department of Statistics, Florida State University, Tallahassee, FL 32306, USA
- Xing Qiu
  - Department of Biostatistics and Computational Biology, University of Rochester, Rochester, NY 14624, USA
- Jinfeng Zhang
  - Department of Statistics, Florida State University, Tallahassee, FL 32306, USA

15
Borisov N, Sorokin M, Garazha A, Buzdin A. Quantitation of Molecular Pathway Activation Using RNA Sequencing Data. Methods Mol Biol 2020; 2063:189-206. [PMID: 31667772 DOI: 10.1007/978-1-0716-0138-9_15] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Indexed: 06/10/2023]
Abstract
Intracellular molecular pathways (IMPs) control all major events in the living cell. IMPs are considered hotspots in biomedical sciences, and thousands of IMPs have been discovered for humans and model organisms. Knowledge of IMP activation is essential for understanding biological functions and differences between biological objects at the molecular level. Here we describe the Oncobox system for accurate quantitative scoring of the activities of up to several thousand molecular pathways based on high-throughput molecular data. Although initially designed for gene expression and mainly RNA sequencing data, Oncobox is now also applicable to quantitative proteomics, microRNA and transcription factor binding site mapping data. The Oncobox system includes modules for gene expression data harmonization, aggregation and comparison, and a recursive algorithm for automatic annotation of molecular pathways. The universal rationale of Oncobox enables scoring of signaling, metabolic, cytoskeleton, immunity, DNA repair, and other pathways in a multitude of biological objects. The Oncobox system can be helpful to all those working in the fields of genetics, biochemistry, interactomics, and big data analytics in molecular biomedicine.
Affiliation(s)
- Nicolas Borisov
  - Laboratory of Clinical Bioinformatics, I.M. Sechenov First Moscow State Medical University, Moscow, Russia
  - Omicsway Corp., Walnut, CA, USA
- Maxim Sorokin
  - Laboratory of Clinical Bioinformatics, I.M. Sechenov First Moscow State Medical University, Moscow, Russia
  - Omicsway Corp., Walnut, CA, USA
  - Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Moscow, Russia
- Anton Buzdin
  - Laboratory of Clinical Bioinformatics, I.M. Sechenov First Moscow State Medical University, Moscow, Russia.
  - Omicsway Corp., Walnut, CA, USA.
  - Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Moscow, Russia.

16
Tkachev V, Sorokin M, Garazha A, Borisov N, Buzdin A. Oncobox Method for Scoring Efficiencies of Anticancer Drugs Based on Gene Expression Data. Methods Mol Biol 2020; 2063:235-255. [PMID: 31667774 DOI: 10.1007/978-1-0716-0138-9_17] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Indexed: 01/19/2023]
Abstract
We describe here the Oncobox method for scoring efficiencies of anticancer target drugs (ATDs) using high-throughput gene expression data. The method rationale, design, and validation are given along with examples of its practical applications in biomedicine. The method is based on the analysis of intracellular molecular pathway activation and on measuring the expression of molecular target genes for every ATD under consideration. Using the Oncobox method requires a collection of normal (control) expression profiles and annotated databases of molecular pathways and drug target genes. Both microarray and RNA sequencing profiles are acceptable, although the latter type of data prevails in the most recent applications of this technique.
Affiliation(s)
- Maxim Sorokin
  - Omicsway Corp., Walnut, CA, USA
  - Laboratory of Clinical Bioinformatics, I.M. Sechenov First Moscow State Medical University, Moscow, Russia
- Nicolas Borisov
  - Omicsway Corp., Walnut, CA, USA
  - Laboratory of Clinical Bioinformatics, I.M. Sechenov First Moscow State Medical University, Moscow, Russia
- Anton Buzdin
  - Omicsway Corp., Walnut, CA, USA.
  - Laboratory of Clinical Bioinformatics, I.M. Sechenov First Moscow State Medical University, Moscow, Russia.
  - Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Moscow, Russia.

17
Zhang HP, Li SY. Clinical significance of expression of glutathione peroxidase 3 in gastric cancer. Shijie Huaren Xiaohua Zazhi 2019; 27:1483-1489. [DOI: 10.11569/wcjd.v27.i24.1483] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/06/2023] Open
Abstract
BACKGROUND Glutathione peroxidase 3 (GPX3) expression is down-regulated in gastric cancer (GC), but the relationship between GPX3 expression and prognosis in this malignancy is yet unknown.
AIM To explore the expression pattern and prognostic value of GPX3 in GC.
METHODS GPX3 expression was analyzed based on the Oncomine database. The prognostic value of GPX3 in GC patients was investigated using the KM Plotter database. To validate the expression pattern and prognostic value of GPX3, TCGA GC dataset was also analyzed. Finally, the expression pattern and prognostic value of GPX3 was evaluated by tissue microarray and immunohistochemistry in 90 GC patients.
RESULTS Oncomine database analysis showed that GPX3 was significantly down-regulated in GC tissues compared with normal tissues (P < 0.05). Data from the KM Plotter database showed that low GPX3 expression was significantly associated with overall survival (P < 0.05). TCGA dataset analysis also showed that low GPX3 expression was an indicator of better prognosis (P < 0.05). Tissue microarray and immunohistochemistry showed that GPX3 was significantly down-regulated in GC tissue (P = 0.037). GPX3 expression, but not age, gender, or tumor clinical stage, was associated with overall survival of GC patients (HR = 0.48, 95%CI: 0.28-0.85, P = 0.019).
CONCLUSION GPX3 is downregulated in GC, and GPX3 expression can be used to predict GC patients' prognosis.
Affiliation(s)
- Hai-Ping Zhang
  - Department of Gastroenterology, Zhongshan Hospital of Hubei Province, Wuhan 430000, Hubei Province, China
- Shu-Yu Li
  - Department of Gastroenterology, Zhongshan Hospital of Hubei Province, Wuhan 430000, Hubei Province, China

18
Ragonnaud E, Moritoh K, Bodogai M, Gusev F, Garaud S, Chen C, Wang X, Baljinnyam T, Becker KG, Maul RW, Willard-Gallo K, Rogaev E, Biragyn A. Tumor-Derived Thymic Stromal Lymphopoietin Expands Bone Marrow B-cell Precursors in Circulation to Support Metastasis. Cancer Res 2019; 79:5826-5838. [PMID: 31575547 DOI: 10.1158/0008-5472.can-19-1058] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 04/02/2019] [Revised: 07/29/2019] [Accepted: 09/23/2019] [Indexed: 12/21/2022]
Abstract
Immature B cells in the bone marrow emigrate into the spleen during adult lymphopoiesis. Here, we report that emigration is shifted to earlier B-cell stages in mice with orthotopic breast cancer, spontaneous ovarian cancer, and possibly in human breast carcinoma. Using mouse and human bone marrow aspirates and mouse models challenged with highly metastatic 4T1 breast cancer cells, we demonstrated that this was the result of secretion of thymic stromal lymphopoietin (TSLP) by cancer cells. First, TSLP downregulated surface expression of bone marrow (BM) retention receptors CXCR4 and VLA4 in B-cell precursors, increasing their motility and, presumably, emigration. Then, TSLP supported peripheral survival and proliferation of BM B-cell precursors such as pre-B-like cells. 4T1 cancer cells used the increased pool of circulating pre-B-like cells to generate metastasis-supporting regulatory B cells. As such, the loss of TSLP expression in cancer cells alone or TSLPR deficiency in B cells blocked both accumulation of pre-B-like cells in circulation and cancer metastasis, implying that the pre-B cell-TSLP axis can be an attractive therapeutic target. SIGNIFICANCE: Cancer cells induce premature emigration of B-cell precursors from the bone marrow to generate regulatory B cells.
Affiliation(s)
- Emeline Ragonnaud
  - Immunoregulation Section, National Institute on Aging, Baltimore, Maryland
- Kanako Moritoh
  - Immunoregulation Section, National Institute on Aging, Baltimore, Maryland
- Monica Bodogai
  - Immunoregulation Section, National Institute on Aging, Baltimore, Maryland
- Fedor Gusev
  - Department of Genomics and Human Genetics, Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
- Soizic Garaud
  - Molecular Immunology Unit, Jules Bordet Institute, Université Libre de Bruxelles, Brussels, Belgium
- Chen Chen
  - Immunoregulation Section, National Institute on Aging, Baltimore, Maryland
- Xin Wang
  - Immunoregulation Section, National Institute on Aging, Baltimore, Maryland
- Kevin G Becker
  - Gene Expression and Genomics Unit, National Institute on Aging, Baltimore, Maryland
- Robert W Maul
  - Antibody Diversity Section, Laboratory of Immunology and Molecular Biology, National Institute on Aging, Baltimore, Maryland
- Karen Willard-Gallo
  - Center for Genetics and Genetic Technologies, Faculty of Biology, Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, Russia
- Evgeny Rogaev
  - Department of Genomics and Human Genetics, Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
  - Center for Genetics and Genetic Technologies, Faculty of Biology, Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, Russia
  - Department of Psychiatry, University of Massachusetts Medical School, Worcester, Massachusetts
- Arya Biragyn
  - Immunoregulation Section, National Institute on Aging, Baltimore, Maryland.

19
Buzdin A, Sorokin M, Garazha A, Glusker A, Aleshin A, Poddubskaya E, Sekacheva M, Kim E, Gaifullin N, Giese A, Seryakov A, Rumiantsev P, Moshkovskii S, Moiseev A. RNA sequencing for research and diagnostics in clinical oncology. Semin Cancer Biol 2019; 60:311-323. [PMID: 31412295 DOI: 10.1016/j.semcancer.2019.07.010] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Received: 06/30/2019] [Accepted: 07/16/2019] [Indexed: 12/26/2022]
Abstract
Molecular diagnostics is becoming one of the major drivers of personalized oncology. With hundreds of different approved anticancer drugs and regimens of their administration, selecting the proper treatment for a patient is a nontrivial task, to say the least. This is especially true for recurrent and metastatic cancers where the standard lines of therapy have failed. Recent trials demonstrated that mutation assays have a strong limitation in personalized selection of therapeutics: most of the drugs cannot be ranked, and only a small percentage of patients can benefit from the screening. Other approaches are, therefore, needed to address the problem of finding proper targeted therapies. The analysis of RNA expression (transcriptomic) profiles presents a reasonable solution because transcriptomics stands a few steps closer to the tumor phenotype than genome analysis. Several recent studies pioneered the use of transcriptomics in practical oncology and showed truly encouraging clinical results. The possibility of directly measuring expression levels of molecular drug targets and profiling activation of the relevant molecular pathways enables personalized prioritization of all types of molecular-targeted therapies. RNA sequencing is the most robust tool for high-throughput quantitative transcriptomics. Its use, potential, and limitations in clinical oncology are reviewed here, along with technical aspects such as optimal types of biosamples, RNA sequencing profile normalization, quality controls, and several levels of data analysis.
Affiliation(s)
- Anton Buzdin
  - I.M. Sechenov First Moscow State Medical University, Moscow, Russia; Omicsway Corp., Walnut, CA, USA; Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Moscow, Russia.
- Maxim Sorokin
  - I.M. Sechenov First Moscow State Medical University, Moscow, Russia; Omicsway Corp., Walnut, CA, USA; Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Moscow, Russia
- Alex Aleshin
  - Stanford University School of Medicine, Stanford, 94305, CA, USA
- Elena Poddubskaya
  - I.M. Sechenov First Moscow State Medical University, Moscow, Russia; Vitamed Oncological Clinics, Moscow, Russia
- Marina Sekacheva
  - I.M. Sechenov First Moscow State Medical University, Moscow, Russia
- Ella Kim
  - Johannes Gutenberg University Mainz, Mainz, Germany
- Nurshat Gaifullin
  - Lomonosov Moscow State University, Faculty of Medicine, Moscow, Russia
- Sergey Moshkovskii
  - Institute of Biomedical Chemistry, Moscow, 119121, Russia; Pirogov Russian National Research Medical University (RNRMU), Moscow, 117997, Russia
- Alexey Moiseev
  - I.M. Sechenov First Moscow State Medical University, Moscow, Russia

20
Fang HY, Wang Q, Zhang JZ, Huang H. Prognostic value of expression of HOXB7 in gastric cancer. Shijie Huaren Xiaohua Zazhi 2019; 27:671-675. [DOI: 10.11569/wcjd.v27.i11.671] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/06/2023] Open
Abstract
BACKGROUND Gastric cancer (GC) is the fifth most common cancer and the third leading cause of cancer death worldwide. Identifying new targets for the treatment and predictive evaluation of GC is of great significance, especially for improving the prognosis. Few studies have focused on the clinical significance of homeobox B7 (HOXB7) expression in GC.
AIM To assess the prognostic value of HOXB7 expression in GC.
METHODS HOXB7 data were retrieved from the Oncomine GC database. The prognostic value of HOXB7 was assessed using an online survival analysis tool (KM Plotter database).
RESULTS Based on the Oncomine database, HOXB7 expression in GC was significantly higher than that in normal tissue (P < 0.05). Further analysis revealed that the expression of the HOXB7 gene in both intestinal and diffuse GCs was significantly higher than that in normal tissue. Moreover, KM Plotter analysis of overall survival indicated that high HOXB7 expression was closely associated with poor survival in GC (P < 0.05). Furthermore, high HOXB7 expression was also associated with overall survival in different GC subtypes (Lauren subtype) (P < 0.05).
CONCLUSION High HOXB7 expression might be an important biological event during gastric oncogenesis, and could be a novel prognostic predictive factor for GC.
Affiliation(s)
- Hong-Yan Fang
  - Department of Oncology, Wuhan Fifth Hospital, Wuhan 430050, Hubei Province, China
- Qun Wang
  - Department of Oncology, Wuhan Fifth Hospital, Wuhan 430050, Hubei Province, China
- Jiang-Zhou Zhang
  - Department of Oncology, Wuhan Fifth Hospital, Wuhan 430050, Hubei Province, China
- Hui Huang
  - Department of Oncology, Wuhan Fifth Hospital, Wuhan 430050, Hubei Province, China

21
Swindell WR, Kruse CPS, List EO, Berryman DE, Kopchick JJ. ALS blood expression profiling identifies new biomarkers, patient subgroups, and evidence for neutrophilia and hypoxia. J Transl Med 2019; 17:170. [PMID: 31118040 PMCID: PMC6530130 DOI: 10.1186/s12967-019-1909-0] [Citation(s) in RCA: 41] [Impact Index Per Article: 8.2] [Received: 01/10/2019] [Accepted: 05/07/2019] [Indexed: 12/26/2022] Open
Abstract
BACKGROUND Amyotrophic lateral sclerosis (ALS) is a debilitating disease with few treatment options. Progress towards new therapies requires validated disease biomarkers, but there is no consensus on which fluid-based measures are most informative. METHODS This study analyzed microarray data derived from blood samples of patients with ALS (n = 396), ALS mimic diseases (n = 75), and healthy controls (n = 645). Goals were to provide in-depth analysis of differentially expressed genes (DEGs), characterize patient-to-patient heterogeneity, and identify candidate biomarkers. RESULTS We identified 752 ALS-increased and 764 ALS-decreased DEGs (FDR < 0.10 with > 10% expression change). Gene expression shifts in ALS blood broadly resembled acute high altitude stress responses. ALS-increased DEGs had high exosome expression, were neutrophil-specific, associated with translation, and overlapped significantly with genes near ALS susceptibility loci (e.g., IFRD1, TBK1, CREB5). ALS-decreased DEGs, in contrast, had low exosome expression, were erythroid lineage-specific, and associated with anemia and blood disorders. Genes encoding neurofilament proteins (NEFH, NEFL) had poor diagnostic accuracy (50-53%). However, support vector machines distinguished ALS patients from ALS mimics and controls with 87% accuracy (sensitivity: 86%, specificity: 87%). Expression profiles were heterogeneous among patients and we identified two subgroups: (i) patients with higher expression of IL6R and myeloid lineage-specific genes and (ii) patients with higher expression of IL23A and lymphoid-specific genes. The gene encoding copper chaperone for superoxide dismutase (CCS) was most strongly associated with survival (HR = 0.77; P = 1.84e-05) and other survival-associated genes were linked to mitochondrial respiration. 
We identify a 61 gene signature that significantly improves survival prediction when added to Cox proportional hazard models with baseline clinical data (i.e., age at onset, site of onset and sex). Predicted median survival differed 2-fold between patients with favorable and risk-associated gene expression signatures. CONCLUSIONS Peripheral blood analysis informs our understanding of ALS disease mechanisms and genetic association signals. Our findings are consistent with low-grade neutrophilia and hypoxia as ALS phenotypes, with heterogeneity among patients partly driven by differences in myeloid and lymphoid cell abundance. Biomarkers identified in this study require further validation but may provide new tools for research and clinical practice.
Collapse
Affiliation(s)
- William R. Swindell
- Heritage College of Osteopathic Medicine, Ohio University, Athens, OH 45701 USA
- Department of Internal Medicine, The Jewish Hospital, Cincinnati, OH 45236 USA
| | - Colin P. S. Kruse
- Department of Environmental and Plant Biology, Ohio University, Athens, OH 45701 USA
- Edison Biotechnology Institute, Ohio University, Athens, OH 45701 USA
| | - Edward O. List
- Heritage College of Osteopathic Medicine, Ohio University, Athens, OH 45701 USA
- Edison Biotechnology Institute, Ohio University, Athens, OH 45701 USA
- The Diabetes Institute, Ohio University, Athens, OH 45701 USA
| | - Darlene E. Berryman
- Heritage College of Osteopathic Medicine, Ohio University, Athens, OH 45701 USA
- Edison Biotechnology Institute, Ohio University, Athens, OH 45701 USA
- The Diabetes Institute, Ohio University, Athens, OH 45701 USA
| | - John J. Kopchick
- Heritage College of Osteopathic Medicine, Ohio University, Athens, OH 45701 USA
- Edison Biotechnology Institute, Ohio University, Athens, OH 45701 USA
- The Diabetes Institute, Ohio University, Athens, OH 45701 USA
| |
|
22
|
Borisov N, Shabalina I, Tkachev V, Sorokin M, Garazha A, Pulin A, Eremin II, Buzdin A. Shambhala: a platform-agnostic data harmonizer for gene expression data. BMC Bioinformatics 2019; 20:66. [PMID: 30727942 PMCID: PMC6366102 DOI: 10.1186/s12859-019-2641-8] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 12/14/2018] [Accepted: 01/18/2019] [Indexed: 11/10/2022]
Abstract
Background: Harmonization techniques make different gene expression profiles and their sets compatible and ready for comparison. Here we present a new bioinformatic tool termed Shambhala for harmonization of multiple human gene expression datasets obtained using different experimental methods and platforms of microarray hybridization and RNA sequencing. Results: Unlike previously published methods, which enable good-quality data harmonization for only two datasets, Shambhala allows conversion of multiple datasets into a universal form suitable for further comparisons. Shambhala harmonization is based on the calibration of gene expression profiles using an auxiliary standardization dataset. Each profile is transformed to make it similar to the output of the microarray hybridization platform Affymetrix Human Gene. This platform was chosen because it has the largest number of human gene expression profiles deposited in public databases. We evaluated Shambhala's ability to retain biologically important features after harmonization. The same four biological samples taken in multiple replicates were profiled independently using three and four different experimental platforms, respectively, then Shambhala-harmonized and investigated by hierarchical clustering. Conclusion: Our results showed that, unlike other frequently used methods (quantile normalization and DESeq/DESeq2 normalization), Shambhala harmonization was the only method supporting sample-specific and platform-independent biologically meaningful clustering for the data obtained from multiple experimental platforms.
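For orientation, quantile normalization, one of the comparison methods named in this abstract, can be sketched as follows. This is a generic textbook version with ties broken by position; it is not Shambhala, and the function name `quantile_normalize` is an assumption for illustration.

```python
import numpy as np

def quantile_normalize(matrix):
    """Quantile-normalize the columns (samples) of a genes x samples matrix."""
    x = np.asarray(matrix, dtype=float)
    # Rank of every gene within its own sample (0 = lowest expression).
    order = np.argsort(x, axis=0, kind="stable")
    ranks = np.argsort(order, axis=0)
    # Reference distribution: mean of the k-th smallest value across samples.
    reference = np.sort(x, axis=0).mean(axis=1)
    # Replace each value by the reference value at its rank.
    return reference[ranks]

# Toy matrix (4 genes x 3 samples), made up for illustration.
toy = [[5, 4, 3],
       [2, 1, 4],
       [3, 3, 6],
       [4, 2, 8]]
normalized = quantile_normalize(toy)
```

The reference distribution is computed from whichever samples are supplied, so the normalized values change when the set of input datasets changes. A harmonizer with a fixed target format is intended to avoid that dependence.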
Affiliation(s)
- Nicolas Borisov
- I.M. Sechenov First Moscow State Medical University, Sechenov University, Moscow, 119991, Russia; Department of bioinformatics and molecular networks, OmicsWay Corporation, Walnut, CA, USA
| | - Irina Shabalina
- Faculty of Mathematics and Information Technologies, Petrozavodsk State University, Anokhina str., 20, Petrozavodsk, 185910, Russia
| | - Victor Tkachev
- Department of bioinformatics and molecular networks, OmicsWay Corporation, Walnut, CA, USA
| | - Maxim Sorokin
- I.M. Sechenov First Moscow State Medical University, Sechenov University, Moscow, 119991, Russia; Department of bioinformatics and molecular networks, OmicsWay Corporation, Walnut, CA, USA; Group for Genomic Regulation of Cell Signaling Systems, Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Moscow, 117997, Russia
| | - Andrew Garazha
- Department of bioinformatics and molecular networks, OmicsWay Corporation, Walnut, CA, USA; Laboratory of Bioinformatics, Oncology and Immunology, D. Rogachyov Federal Research Center of Pediatric Hematology, Moscow, 117198, Russia
| | - Andrey Pulin
- Laboratory for Cell Biology and Developmental Pathology, Federal State Institution "Institute of General Pathology and Pathophysiology", FSBSI "IGPP", Moscow, Russia
| | - Ilya I Eremin
- Department for Regenerative Medicine, JSC Generium, Moscow, Russia
| | - Anton Buzdin
- I.M. Sechenov First Moscow State Medical University, Sechenov University, Moscow, 119991, Russia; Department of bioinformatics and molecular networks, OmicsWay Corporation, Walnut, CA, USA; Group for Genomic Regulation of Cell Signaling Systems, Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Moscow, 117997, Russia
| |
|
23
|
Zhang HP, Li SY, Wang JP, Lin J. Clinical significance and biological roles of cyclins in gastric cancer. Onco Targets Ther 2018; 11:6673-6685. [PMID: 30349301 PMCID: PMC6186297 DOI: 10.2147/ott.s171716] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Indexed: 12/18/2022]
Abstract
Background and aim: Cyclins have been reported to be overexpressed and associated with poor prognosis in several human cancers. However, few studies have evaluated the expression and prognostic roles of cyclins in gastric cancer (GC). We aimed to evaluate the expression and prognostic roles of cyclins and to explore the biological functions of the differentially expressed cyclins. Methods: Cyclin expression was analyzed using the Oncomine and The Cancer Genome Atlas datasets, and the prognostic roles of cyclins in GC patients were investigated using the Kaplan–Meier Plotter database. A comprehensive PubMed literature search was then performed to identify reports on the expression and prognostic value of cyclins in GC. Biological functions of the differentially expressed cyclins were explored with the Enrichr platform, including KEGG pathway and transcription factor analyses. Results: The expression levels of CCNA2 (cyclin A2), CCNB1 (cyclin B1), CCNB2 (cyclin B2), and CCNE1 (cyclin E1) mRNAs were significantly higher in GC tissues than in normal tissues in both the Oncomine and The Cancer Genome Atlas datasets. High expression of CCNA2, CCNB1, and CCNB2 mRNAs was associated with poor overall survival in the Kaplan–Meier Plotter dataset. Evidence from clinical studies showed that CCNB1 was associated with overall survival in GC patients. Cyclins were associated with several biological pathways, including cell cycle, p53 signaling pathway, FoxO signaling pathway, viral carcinogenesis, and AMPK signaling pathway. Enrichment analysis also showed that cyclins interacted with certain transcription factors, such as FOXM1, SIN3A, NFYA, and E2F4. Conclusion: Based on our results, high expression of cyclins was associated with poor prognosis in GC patients. This information might be useful for better understanding the clinical and biological roles of cyclin mRNAs and for guiding individualized treatment of GC patients.
Affiliation(s)
- Hai-Ping Zhang
- Department of Gastroenterology, Zhongnan Hospital of Wuhan University, Wuhan City, Hubei Province 430071, China
| | - Shu-Yu Li
- Department of Gastroenterology, Zhongshan Hospital of Hubei Province, Wuhan City, Hubei Province 430071, China
| | - Jian-Ping Wang
- Department of Gastroenterology, Zhongnan Hospital of Wuhan University, Wuhan City, Hubei Province 430071, China
| | - Jun Lin
- Department of Gastroenterology, Zhongnan Hospital of Wuhan University, Wuhan City, Hubei Province 430071, China
| |
|
24
|
Song W, Liu H, Wang J, Kong Y, Yin X, Zang W. MATHT: A web server for comprehensive transcriptome data analysis. J Theor Biol 2018; 455:140-146. [PMID: 30040963 DOI: 10.1016/j.jtbi.2018.07.021] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 09/01/2016] [Revised: 07/17/2018] [Accepted: 07/19/2018] [Indexed: 12/15/2022]
Abstract
The current software/algorithms for high-throughput sequence data analysis are not user-friendly. We developed MATHT, the Multifaceted Analysis Tool for Human Transcriptome, which is a free web server available at www.biocloudservice.com, to provide more comprehensive and reliable analysis of transcriptome data. The web server provides modules for data preprocessing, differential expression analysis, dataset integration, functional analysis, and network analysis. The sequence and structure analysis module is specially designed for RNA-seq data. MATHT is a user-friendly web server that provides comprehensive analysis of transcriptome data, especially integration analysis using special standardization across different platforms.
Affiliation(s)
- Wei Song
- Eryun (ShangHai) Information Technology Co., Ltd., No. 951 Jianchuan Road, Minhang District, Shanghai 201109, PR China
| | - Huaping Liu
- Eryun (ShangHai) Information Technology Co., Ltd., No. 951 Jianchuan Road, Minhang District, Shanghai 201109, PR China
| | - Jiajia Wang
- Eryun (ShangHai) Information Technology Co., Ltd., No. 951 Jianchuan Road, Minhang District, Shanghai 201109, PR China
| | - Yan Kong
- Eryun (ShangHai) Information Technology Co., Ltd., No. 951 Jianchuan Road, Minhang District, Shanghai 201109, PR China
| | - Xia Yin
- Eryun (ShangHai) Information Technology Co., Ltd., No. 951 Jianchuan Road, Minhang District, Shanghai 201109, PR China
| | - Weidong Zang
- Eryun (ShangHai) Information Technology Co., Ltd., No. 951 Jianchuan Road, Minhang District, Shanghai 201109, PR China.
| |
|
25
|
Wang G, Gormley M, Qiao J, Zhao Q, Wang M, Di Sante G, Deng S, Dong L, Pestell T, Ju X, Casimiro MC, Addya S, Ertel A, Tozeren A, Li Q, Yu Z, Pestell RG. Cyclin D1-mediated microRNA expression signature predicts breast cancer outcome. Theranostics 2018; 8:2251-2263. [PMID: 29721077 PMCID: PMC5928887 DOI: 10.7150/thno.23877] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 11/15/2017] [Accepted: 12/25/2017] [Indexed: 01/03/2023]
Abstract
Background: Genetic classification of breast cancer based on the coding mRNA suggests the evolution of distinct subtypes. Whether the non-coding genome is altered concordantly with the coding genome and the mechanism by which the cell cycle directly controls the non-coding genome is poorly understood. Methods: Herein, the miRNA signature maintained by endogenous cyclin D1 in human breast cancer cells was defined. In order to determine the clinical significance of the cyclin D1-mediated miRNA signature, we defined a miRNA expression superset from 459 breast cancer samples. We compared the coding and non-coding genome of breast cancer subtypes. Results: Hierarchical clustering of human breast cancers defined four distinct miRNA clusters (G1-G4) associated with distinguishable relapse-free survival by Kaplan-Meier analysis. The cyclin D1-regulated miRNA signature included several oncomirs, was conserved in multiple breast cancer cell lines, was associated with the G2 tumor miRNA cluster, ERα+ status, better outcome and activation of the Wnt pathway. The coding and non-coding genome were discordant within breast cancer subtypes. Seed elements for cyclin D1-regulated miRNA were identified in 63 genes of the Wnt signaling pathway including DKK. Cyclin D1 restrained DKK1 via the 3'UTR. In vivo studies using inducible transgenics confirmed cyclin D1 induces Wnt-dependent gene expression. Conclusion: The non-coding genome defines breast cancer subtypes that are discordant with their coding genome subtype suggesting distinct evolutionary drivers within the tumors. Cyclin D1 orchestrates expression of a miRNA signature that induces Wnt/β-catenin signaling, therefore cyclin D1 serves both upstream and downstream of Wnt/β-catenin signaling.
Affiliation(s)
- Guangxue Wang
- Research Center for Translational Medicine, East Hospital, Tongji University School of Medicine, Shanghai 200120, China
| | - Michael Gormley
- Department of Cancer Biology, Thomas Jefferson University, 233 South 10th St., Philadelphia, PA 19107
| | - Jing Qiao
- Research Center for Translational Medicine, East Hospital, Tongji University School of Medicine, Shanghai 200120, China
| | - Qian Zhao
- Research Center for Translational Medicine, East Hospital, Tongji University School of Medicine, Shanghai 200120, China
| | - Min Wang
- Pennsylvania Cancer and Regenerative Medicine Research Center, Baruch S. Blumberg Institute, Pennsylvania Biotechnology Center and Lankenau Institute for Medical Research, 100 East Lancaster Avenue, Suite 222, Wynnewood, PA 19096
| | - Gabriele Di Sante
- Pennsylvania Cancer and Regenerative Medicine Research Center, Baruch S. Blumberg Institute, Pennsylvania Biotechnology Center and Lankenau Institute for Medical Research, 100 East Lancaster Avenue, Suite 222, Wynnewood, PA 19096
| | - Shengqiong Deng
- Research Center for Translational Medicine, East Hospital, Tongji University School of Medicine, Shanghai 200120, China
- Shanghai Gongli Hospital, the Second Military Medical University, Shanghai 200120, China
| | - Lin Dong
- Research Center for Translational Medicine, East Hospital, Tongji University School of Medicine, Shanghai 200120, China
| | - Tim Pestell
- Department of Cancer Biology, Thomas Jefferson University, 233 South 10th St., Philadelphia, PA 19107
| | - Xiaoming Ju
- Department of Cancer Biology, Thomas Jefferson University, 233 South 10th St., Philadelphia, PA 19107
| | - Mathew C. Casimiro
- Pennsylvania Cancer and Regenerative Medicine Research Center, Baruch S. Blumberg Institute, Pennsylvania Biotechnology Center and Lankenau Institute for Medical Research, 100 East Lancaster Avenue, Suite 222, Wynnewood, PA 19096
| | - Sankar Addya
- Department of Cancer Biology, Thomas Jefferson University, 233 South 10th St., Philadelphia, PA 19107
| | - Adam Ertel
- Department of Cancer Biology, Thomas Jefferson University, 233 South 10th St., Philadelphia, PA 19107
| | - Ayden Tozeren
- Center for Integrated Bioinformatics, Drexel University, Philadelphia, PA 19104
- School of Biomedical Engineering, Systems and Health Sciences, Drexel University, Philadelphia, PA 19104
| | - Qinchuan Li
- Research Center for Translational Medicine, East Hospital, Tongji University School of Medicine, Shanghai 200120, China
| | - Zuoren Yu
- Pennsylvania Cancer and Regenerative Medicine Research Center, Baruch S. Blumberg Institute, Pennsylvania Biotechnology Center and Lankenau Institute for Medical Research, 100 East Lancaster Avenue, Suite 222, Wynnewood, PA 19096
- Research Center for Translational Medicine, East Hospital, Tongji University School of Medicine, Shanghai 200120, China
| | - Richard G. Pestell
- Pennsylvania Cancer and Regenerative Medicine Research Center, Baruch S. Blumberg Institute, Pennsylvania Biotechnology Center and Lankenau Institute for Medical Research, 100 East Lancaster Avenue, Suite 222, Wynnewood, PA 19096
- Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore 637551, Singapore
| |
|
26
|
Zhu W, Li J, Wu B. Gene expression profiling of the mouse gut: Effect of intestinal flora on intestinal health. Mol Med Rep 2018; 17:3667-3673. [PMID: 29257327 PMCID: PMC5802172 DOI: 10.3892/mmr.2017.8298] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 08/09/2016] [Accepted: 05/12/2017] [Indexed: 12/15/2022]
Abstract
The present study aimed to investigate the molecular mechanisms, including potential genes, pathways and interactions, underlying the effect of intestinal flora on intestinal health. The gene expression profiles of GSE22648 were downloaded from the Gene Expression Omnibus database to screen differentially expressed genes (DEGs). The Database for Annotation, Visualization and Integrated Discovery was used for Gene Ontology (GO) functional and pathway enrichment analysis of the DEGs. DEG-associated literature was mined using the GenCLip 2.0 online tool. Finally, GO and pathway enrichment analyses of the DEGs in the literature were performed. By comparing microbiota-depleted mouse samples and control mouse samples, a total of 115 DEGs, including 58 upregulated genes and 57 downregulated genes, were screened. The upregulated genes were enriched in various GO terms, including microsome, oxidation reduction and heme binding, whereas the 57 downregulated DEGs were enriched in different functions, including DNA packaging and linoleic acid metabolism. A total of 19 genes, including baculoviral IAP repeat containing 5, aurora kinase A, angiotensin I converting enzyme 2 and free fatty acid receptor 2, were identified and enriched in four modules, including cell division, chromosome segregation, inflammatory bowel disease and inflammatory response. AURKA, inner centromere protein antigens 135/155 kDa, baculoviral IAP repeat containing 5, aurora kinase B and solute carrier family 22 (organic cation/zwitterion transporter) member 4 were identified as potential important genes for intestinal flora and intestinal disease treatment through their involvement in various functions, including cell division, chromosome segregation, inflammatory bowel disease and inflammatory response.
Affiliation(s)
- Wenhua Zhu
- Department of Gastroenterology, South Building, Chinese PLA General Hospital, Beijing 100853, P.R. China
- Department of Oncology, The 309th Hospital of Chinese PLA, Beijing 100091, P.R. China
| | - Jun Li
- Department of Gastroenterology, South Building, Chinese PLA General Hospital, Beijing 100853, P.R. China
| | - Benyan Wu
- Department of Gastroenterology, South Building, Chinese PLA General Hospital, Beijing 100853, P.R. China
| |
|
27
|
Abdulnour REE, Howrylak JA, Tavares AH, Douda DN, Henkels KM, Miller TE, Fredenburgh LE, Baron RM, Gomez-Cambronero J, Levy BD. Phospholipase D isoforms differentially regulate leukocyte responses to acute lung injury. J Leukoc Biol 2018; 103:919-932. [PMID: 29437245 DOI: 10.1002/jlb.3a0617-252rr] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 06/21/2017] [Revised: 01/03/2018] [Accepted: 01/10/2018] [Indexed: 12/30/2022]
Abstract
Phospholipase D (PLD) plays important roles in cellular responses to tissue injury that are critical to acute inflammatory diseases, such as the acute respiratory distress syndrome (ARDS). We investigated the expression of PLD isoforms and related phospholipid phosphatases in patients with ARDS, and their roles in a murine model of self-limited acute lung injury (ALI). Gene expression microarray analysis on whole blood obtained from patients that met clinical criteria for ARDS and clinically matched controls (non-ARDS) demonstrated that PLD1 gene expression was increased in patients with ARDS relative to non-ARDS and correlated with survival. In contrast, PLD2 expression was associated with mortality. In a murine model of self-resolving ALI, lung Pld1 expression increased and Pld2 expression decreased 24 h after intrabronchial acid. Total lung PLD activity was increased 24 h after injury. Pld1-/- mice demonstrated impaired alveolar barrier function and increased tissue injury relative to WT and Pld2-/- , whereas Pld2-/- mice demonstrated increased recruitment of neutrophils and macrophages, and decreased tissue injury. Isoform-specific PLD inhibitors mirrored the results with isoform-specific Pld-KO mice. PLD1 gene expression knockdown in human leukocytes was associated with decreased phagocytosis by neutrophils, whereas reactive oxygen species production and phagocytosis decreased in M2-macrophages. PLD2 gene expression knockdown increased neutrophil and M2-macrophage transmigration, and increased M2-macrophage phagocytosis. These results uncovered selective regulation of PLD isoforms after ALI, and opposing effects of selective isoform knockdown on host responses and tissue injury. These findings support therapeutic strategies targeting specific PLD isoforms for the treatment of ARDS.
Affiliation(s)
- Raja-Elie E Abdulnour
- Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, USA
| | - Judie A Howrylak
- Division of Pulmonary Allergy and Critical Care Medicine, Penn State Hershey Medical Center, Hershey, Pennsylvania, USA
| | - Alexander H Tavares
- Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, USA
| | - David N Douda
- Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, USA
| | - Karen M Henkels
- Department of Biochemistry and Molecular Biology, Wright State University, Dayton, Ohio, USA
| | - Taylor E Miller
- Department of Biochemistry and Molecular Biology, Wright State University, Dayton, Ohio, USA
| | - Laura E Fredenburgh
- Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, USA
| | - Rebecca M Baron
- Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, USA
| | - Julian Gomez-Cambronero
- Department of Biochemistry and Molecular Biology, Wright State University, Dayton, Ohio, USA; Center for Experimental Therapeutics and Reperfusion Injury, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, USA
| | - Bruce D Levy
- Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, USA; Center for Experimental Therapeutics and Reperfusion Injury, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, USA
| |
|
28
|
Leng D, Miao R, Huang X, Wang Y. In silico analysis identifies CRISP3 as a potential peripheral blood biomarker for multiple myeloma: From data modeling to validation with RT-PCR. Oncol Lett 2018; 15:5167-5174. [PMID: 29552153 DOI: 10.3892/ol.2018.7969] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 09/27/2016] [Accepted: 01/05/2018] [Indexed: 12/24/2022]
Abstract
Octamer-binding protein 2 (Oct2) binds to the ATGCAAAT octamer on the IgH enhancer and stimulates IgH expression in human multiple myeloma (MM). Cysteine-rich secreted protein 3 (CRISP3) possesses the ATGCAAAT sequence and thus is activated by Oct2 in mouse B cells, suggesting that CRISP3 may be activated in and be a potential biomarker for MM. The present study involved a meta-analysis of the gene expression profiling data of human MM peripheral blood. Significantly expressed genes were analyzed on merged super array microarray data, and selected sample data with significantly expressed genes were additionally analyzed by principal component analysis and Bayesian probit regression. CRISP3, Oct2, Alpha-1B-glycoprotein (A1BG) and Cyclin D2 (CCND2) were validated in clinical MM peripheral blood samples using reverse transcription quantitative polymerase chain reaction (RT-qPCR). In the gene expression profiling data, CRISP3 was significantly upregulated and had certain proportions on the discriminated principal component of significantly expressed gene sample data. RT-qPCR analysis revealed CRISP3 was significantly upregulated in MM. Therefore, CRISP3 is a potential peripheral blood biomarker for MM.
Affiliation(s)
- Dong Leng
- Clinical Laboratory, Beijing Chao-Yang Hospital, Capital Medical University, Beijing 100020, P.R. China
| | - Ran Miao
- Medical Research Center, Beijing Chao-Yang Hospital, Capital Medical University, Beijing 100020, P.R. China
| | - Xiaoxi Huang
- Medical Research Center, Beijing Chao-Yang Hospital, Capital Medical University, Beijing 100020, P.R. China
| | - Ying Wang
- Clinical Laboratory, Beijing Chao-Yang Hospital, Capital Medical University, Beijing 100020, P.R. China
| |
|
29
|
Borisov N, Tkachev V, Suntsova M, Kovalchuk O, Zhavoronkov A, Muchnik I, Buzdin A. A method of gene expression data transfer from cell lines to cancer patients for machine-learning prediction of drug efficiency. Cell Cycle 2018; 17:486-491. [PMID: 29251172 DOI: 10.1080/15384101.2017.1417706] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Indexed: 10/18/2022]
Abstract
Personalized medicine implies that distinct treatment methods are prescribed to individual patients according to several features that may be obtained from, e.g., a gene expression profile. The majority of machine learning methods suffer from a shortage of preceding cases, i.e. gene expression data on patients combined with the confirmed outcome of known treatment methods. At the same time, there exist thousands of various cell lines that were treated with hundreds of anti-cancer drugs in order to check the ability of these drugs to stop cell proliferation, and all these cell line cultures were profiled in terms of their gene expression. Here we present a new approach in machine learning, which can predict the clinical efficiency of anti-cancer drugs for individual patients by transferring features obtained from the expression-based data from cell lines. The method was validated on three datasets for cancer-like diseases (chronic myeloid leukemia, as well as lung adenocarcinoma and renal carcinoma) treated with targeted drugs: kinase inhibitors such as imatinib or sorafenib.
Affiliation(s)
- Nicolas Borisov
- National Research Centre "Kurchatov Institute", Centre for Convergence of Nano-, Bio-, Information and Cognitive Sciences and Technologies, Moscow, Russia; Department of R&D, First Oncology Research and Advisory Center, Moscow, Russia
| | - Victor Tkachev
- Department of R&D, First Oncology Research and Advisory Center, Moscow, Russia; Department of R&D, OmicsWay Corporation, Walnut, CA, USA
| | - Maria Suntsova
- Department of R&D, First Oncology Research and Advisory Center, Moscow, Russia; Group for Genomic Regulation of Cell Signaling Systems, Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Moscow, Russia; Laboratory of Bioinformatics, D. Rogachyov Federal Research Center of Pediatric Hematology, Oncology and Immunology, Moscow, 117198, Russia
| | - Olga Kovalchuk
- Department of Biological Sciences, University of Lethbridge, Lethbridge, AB, Canada; Canada Cancer and Aging Research Laboratories, Lethbridge, AB, Canada
| | - Alex Zhavoronkov
- Insilico Medicine, Inc, ETC, Johns Hopkins University, Baltimore, MD, USA
| | - Ilya Muchnik
- Rutgers University, Hill Center, Busch Campus, Piscataway, NJ, USA
| | - Anton Buzdin
- National Research Centre "Kurchatov Institute", Centre for Convergence of Nano-, Bio-, Information and Cognitive Sciences and Technologies, Moscow, Russia; Department of R&D, First Oncology Research and Advisory Center, Moscow, Russia; Department of R&D, OmicsWay Corporation, Walnut, CA, USA; Group for Genomic Regulation of Cell Signaling Systems, Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Moscow, Russia; Laboratory of Bioinformatics, D. Rogachyov Federal Research Center of Pediatric Hematology, Oncology and Immunology, Moscow, 117198, Russia
| |
|
30
|
Kim M, Tagkopoulos I. Data integration and predictive modeling methods for multi-omics datasets. Mol Omics 2018; 14:8-25. [DOI: 10.1039/c7mo00051k] [Citation(s) in RCA: 56] [Impact Index Per Article: 9.3] [Indexed: 12/24/2022]
Abstract
We provide an overview of opportunities and challenges in multi-omics predictive analytics with particular emphasis on data integration and machine learning methods.
Affiliation(s)
- Minseung Kim
- Department of Computer Science and Genome Center, University of California, Davis, USA
| | - Ilias Tagkopoulos
- Department of Computer Science and Genome Center, University of California, Davis, USA
| |
|
31
|
Prediction of Drug Efficiency by Transferring Gene Expression Data from Cell Lines to Cancer Patients. Braverman Readings in Machine Learning. Key Ideas from Inception to Current State 2018. [DOI: 10.1007/978-3-319-99492-5_9] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 12/21/2022]
|
32
|
Zhao N, Liu Y, Wei Y, Yan Z, Zhang Q, Wu C, Chang Z, Xu Y. Optimization of cell lines as tumour models by integrating multi-omics data. Brief Bioinform 2017; 18:515-529. [PMID: 27694350 DOI: 10.1093/bib/bbw082] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 04/03/2016] [Indexed: 12/13/2022]
Abstract
Cell lines are widely used as in vitro models of tumorigenesis. However, an increasing number of researchers have found that cell lines differ from their sourced tumour samples after long-term cell culture. The application of unsuitable cell lines in experiments will affect the experimental accuracy and the treatment of patients. Therefore, it is imperative to identify optimal cell lines for each cancer type. Here, we review the methods used to evaluate cell lines since 2005. Furthermore, gene expression, copy number and mutation profiles from The Cancer Genome Atlas and the Cancer Cell Line Encyclopedia are used to calculate similarity between tumours and cell lines. Then, the ideal cell lines to use for experiments for eight types of cancers are found by combining the results with Gene Ontology functional similarity. After verification, the optimal cell lines have the same genomic characteristics as their homologous tumour samples. The contaminated cell lines identified in previous research are also determined to be unsuitable in vitro cancer models here. Moreover, our study suggests that some of the commonly used cell lines are not suitable cancer models. In summary, we provide a reference for ideal cell lines to use in in vitro experiments and contribute to improving the accuracy of future cancer research. Furthermore, this research provides a foundation for identifying more effective treatment strategies.
|
33
|
Borisov N, Suntsova M, Sorokin M, Garazha A, Kovalchuk O, Aliper A, Ilnitskaya E, Lezhnina K, Korzinkin M, Tkachev V, Saenko V, Saenko Y, Sokov DG, Gaifullin NM, Kashintsev K, Shirokorad V, Shabalina I, Zhavoronkov A, Mishra B, Cantor CR, Buzdin A. Data aggregation at the level of molecular pathways improves stability of experimental transcriptomic and proteomic data. Cell Cycle 2017; 16:1810-1823. [PMID: 28825872 DOI: 10.1080/15384101.2017.1361068] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Indexed: 01/08/2023]
Abstract
High throughput technologies opened a new era in biomedicine by enabling massive analysis of gene expression at both RNA and protein levels. Unfortunately, expression data obtained in different experiments are often poorly compatible, even for the same biologic samples. Here, using experimental and bioinformatic investigation of major experimental platforms, we show that aggregation of gene expression data at the level of molecular pathways helps to diminish cross- and intra-platform bias otherwise clearly seen at the level of individual genes. We created a mathematical model of cumulative suppression of data variation that predicts the ideal parameters and the optimal size of a molecular pathway. We compared the abilities to aggregate experimental molecular data for the 5 alternative methods, also evaluated by their capacity to retain meaningful features of biologic samples. The bioinformatic method OncoFinder showed optimal performance in both tests and should be very useful for future cross-platform data analyses.
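The idea of aggregating gene-level values at the pathway level can be illustrated with a minimal sketch. It uses a plain average of log-ratios over pathway members, which is an assumption for illustration and not the OncoFinder scoring formula; the gene names, pathway name, and numbers are made up.

```python
def pathway_scores(log_ratios, pathways):
    """Average gene-level log-ratios over the member genes of each pathway."""
    scores = {}
    for name, genes in pathways.items():
        values = [log_ratios[g] for g in genes if g in log_ratios]
        if values:
            scores[name] = sum(values) / len(values)
    return scores

# Toy log2 case/control ratios for one sample measured on two platforms.
platform_a = {"G1": 1.0, "G2": 0.5, "G3": -0.2}
platform_b = {"G1": 0.7, "G2": 0.8, "G3": -0.1}
pathways = {"toy_pathway": ["G1", "G2", "G3"]}

# Largest gene-level disagreement between the two platforms.
gene_gap = max(abs(platform_a[g] - platform_b[g]) for g in platform_a)

# Disagreement after aggregation to the pathway level.
score_a = pathway_scores(platform_a, pathways)["toy_pathway"]
score_b = pathway_scores(platform_b, pathways)["toy_pathway"]
pathway_gap = abs(score_a - score_b)
```

In this toy case, gene-level deviations of opposite sign partly cancel in the average, so the pathway-level gap is smaller than the largest gene-level gap. Real data may behave differently when platform bias is correlated across the genes of a pathway.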
Affiliation(s)
- Nicolas Borisov
- a Centre for Convergence of Nano-, Bio-, Information and Cognitive Sciences and Technologies, National Research Centre "Kurchatov Institute", Moscow, Russia; b Department of R&D, First Oncology Research and Advisory Center, Moscow, Russia
| | - Maria Suntsova
- c Department of R&D, Center for Biogerontology and Regenerative Medicine, Moscow, Russia; d Laboratory of Bioinformatics, D. Rogachyov Federal Research Center of Pediatric Hematology, Oncology and Immunology, Moscow, Russia
| | - Maxim Sorokin
- a Centre for Convergence of Nano-, Bio-, Information and Cognitive Sciences and Technologies, National Research Centre "Kurchatov Institute", Moscow, Russia; e Group for Genomic Regulation of Cell Signaling Systems, Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Moscow, Russia
| | - Andrew Garazha
- c Department of R&D, Center for Biogerontology and Regenerative Medicine, Moscow, Russia; f Department of R&D, OmicsWay Corporation, Walnut, CA, USA
| | - Olga Kovalchuk
- g Department of Biological Sciences, University of Lethbridge, Lethbridge, AB, Canada
| | - Alexander Aliper
- d Laboratory of Bioinformatics, D. Rogachyov Federal Research Center of Pediatric Hematology, Oncology and Immunology, Moscow, Russia
| | - Elena Ilnitskaya
- c Department of R&D, Center for Biogerontology and Regenerative Medicine, Moscow, Russia
| | - Ksenia Lezhnina
- b Department of R&D, First Oncology Research and Advisory Center, Moscow, Russia
| | - Mikhail Korzinkin
- c Department of R&D, Center for Biogerontology and Regenerative Medicine, Moscow, Russia
| | - Victor Tkachev
- f Department of R&D, OmicsWay Corporation, Walnut, CA, USA
| | - Vyacheslav Saenko
- h Technological Research Institute S.P. Kapitsa, Ulyanovsk State University, Ulyanovsk, Russia
| | - Yury Saenko
- h Technological Research Institute S.P. Kapitsa, Ulyanovsk State University, Ulyanovsk, Russia
| | - Dmitry G Sokov
- i Chemotherapy Department, Moscow 1st Oncological Hospital, Moscow, Russia
| | - Nurshat M Gaifullin
- j Faculty of Fundamental Medicine, Lomonosov Moscow State University, Moscow, Russia; k Department of Oncology, Russian Medical Postgraduate Academy, Moscow, Russia
| | - Kirill Kashintsev
- l Chemotherapy Department, Moscow Oncological Hospital 62, Stepanovskoye, Russia
| | - Valery Shirokorad
- l Chemotherapy Department, Moscow Oncological Hospital 62, Stepanovskoye, Russia
| | - Irina Shabalina
- m Faculty of Mathematics and Information Technologies, Petrozavodsk State University, Petrozavodsk, Russia
| | - Alex Zhavoronkov
- d Laboratory of Bioinformatics, D. Rogachyov Federal Research Center of Pediatric Hematology, Oncology and Immunology, Moscow, Russia
| | - Charles R Cantor
- o Department of Biomedical Engineering, Boston University, Boston, MA, USA
| | - Anton Buzdin
- a Centre for Convergence of Nano-, Bio-, Information and Cognitive Sciences and Technologies, National Research Centre "Kurchatov Institute", Moscow, Russia; b Department of R&D, First Oncology Research and Advisory Center, Moscow, Russia; e Group for Genomic Regulation of Cell Signaling Systems, Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Moscow, Russia; f Department of R&D, OmicsWay Corporation, Walnut, CA, USA
| |
|
34
|
Gang X, Sun Y, Li F, Yu T, Jiang Z, Zhu X, Jiang Q, Wang Y. Identification of key genes associated with rheumatoid arthritis with bioinformatics approach. Medicine (Baltimore) 2017; 96:e7673. [PMID: 28767591 PMCID: PMC5626145 DOI: 10.1097/md.0000000000007673] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Indexed: 12/23/2022] Open
Abstract
We aimed to identify key genes associated with rheumatoid arthritis (RA). The microarray datasets of GSE1919, GSE12021, and GSE21959 (35 RA samples and 32 normal controls) were downloaded from the Gene Expression Omnibus database. Differentially expressed genes (DEGs) in RA samples were identified using the t test in limma package. Functional enrichment analysis was performed using clusterProfiler package. A protein-protein interaction (PPI) network of selected DEGs was constructed based on the Human Protein Reference Database. Active modules were explored using the jActiveModules plug-in in the Cytoscape Network Modeling package. In total, 537 DEGs in RA samples were identified, including 241 upregulated and 296 downregulated genes. A total of 24,451 PPI pairs were collected, and 5 active modules were screened. Furthermore, 19 submodules were acquired from the 5 active modules. Discs large homolog 1 (DLG1) and related DEGs such as guanylate cyclase 1, soluble, alpha 2 (GUCY1A2), N-methyl d-aspartate receptor 2A subunit (GRIN2A), and potassium voltage-gated channel member 1 (KCNA1) were identified in 8 submodules. Plasminogen (PLG) and related DEGs such as chemokine (C-X-C motif) ligand 2 (CXCL2), laminin, alpha 3 (LAMA3), complement component 7 (C7), and coagulation factor X (F10) were identified in 4 submodules. Our results indicate that DLG1, GUCY1A2, GRIN2A, KCNA1, PLG, CXCL2, LAMA3, C7, and F10 may play key roles in the progression of RA and may serve as putative therapeutic targets for treating RA.
Affiliation(s)
- Xiaokun Gang
- Department of Endocrinology and Metabolism, The First Hospital of Jilin University
| | - Yan Sun
- Department of Hematology and oncology, The Second Hospital of Jilin University, Changchun, Jilin Province 130041, China
| | - Fei Li
- Department of Endocrinology and Metabolism, The First Hospital of Jilin University
| | - Tong Yu
- Department of Orthopedics, The Second Hospital of Jilin University
| | - Zhende Jiang
- Department of Orthopedics, The Second Hospital of Jilin University
| | - Xiujie Zhu
- Department of Orthopedics, The Second Hospital of Jilin University
| | - Qiyao Jiang
- Department of Orthopedics, The Second Hospital of Jilin University
| | - Yao Wang
- Department of Orthopedics, The Second Hospital of Jilin University
| |
|
35
|
Mueller AJ, Canty-Laird EG, Clegg PD, Tew SR. Cross-species gene modules emerge from a systems biology approach to osteoarthritis. NPJ Syst Biol Appl 2017. [PMID: 28649440 PMCID: PMC5460168 DOI: 10.1038/s41540-017-0014-3] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Indexed: 01/10/2023] Open
Abstract
Complexities in degenerative disorders, such as osteoarthritis, arise from multiscale biological, environmental, and temporal perturbations. Animal models serve to provide controlled representations of the natural history of degenerative disorders, but in themselves represent an additional layer of complexity. Comparing transcriptomic networks arising from gene co-expression data across species can facilitate an understanding of the preservation of functional gene modules and establish associations with disease phenotypes. This study demonstrates the preservation of osteoarthritis-associated gene modules, described by immune system and system development processes, across human and rat studies. Class prediction analysis establishes a minimal gene signature, including the expression of the Rho GDP dissociation inhibitor ARHGDIB, which consistently defined healthy human cartilage from osteoarthritic cartilage in an independent data set. The age of human clinical samples remains a strong confounder in defining the underlying gene regulatory mechanisms in osteoarthritis; however, defining preserved gene models across species may facilitate standardization of animal models of osteoarthritis to better represent human disease and control for ageing phenomena.
Affiliation(s)
- Alan James Mueller
- Department of Musculoskeletal Biology, Institute of Ageing and Chronic Disease, Faculty of Health and Life Sciences, University of Liverpool, William Henry Duncan Building, 6 West Derby Street, Liverpool, L7 8TX UK
| | - Elizabeth G Canty-Laird
- Department of Musculoskeletal Biology, Institute of Ageing and Chronic Disease, Faculty of Health and Life Sciences, University of Liverpool, William Henry Duncan Building, 6 West Derby Street, Liverpool, L7 8TX UK; The MRC-Arthritis Research UK, Centre for Integrated Research into Musculoskeletal Ageing (CIMA), Liverpool, UK
| | - Peter D Clegg
- Department of Musculoskeletal Biology, Institute of Ageing and Chronic Disease, Faculty of Health and Life Sciences, University of Liverpool, William Henry Duncan Building, 6 West Derby Street, Liverpool, L7 8TX UK; The MRC-Arthritis Research UK, Centre for Integrated Research into Musculoskeletal Ageing (CIMA), Liverpool, UK
| | - Simon R Tew
- Department of Musculoskeletal Biology, Institute of Ageing and Chronic Disease, Faculty of Health and Life Sciences, University of Liverpool, William Henry Duncan Building, 6 West Derby Street, Liverpool, L7 8TX UK; The MRC-Arthritis Research UK, Centre for Integrated Research into Musculoskeletal Ageing (CIMA), Liverpool, UK
| |
|
36
|
Muntané G, Santpere G, Verendeev A, Seeley WW, Jacobs B, Hopkins WD, Navarro A, Sherwood CC. Interhemispheric gene expression differences in the cerebral cortex of humans and macaque monkeys. Brain Struct Funct 2017; 222:3241-3254. [PMID: 28317062 DOI: 10.1007/s00429-017-1401-7] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 01/02/2017] [Accepted: 03/05/2017] [Indexed: 11/25/2022]
Abstract
Handedness and language are two well-studied examples of asymmetrical brain function in humans. Approximately 90% of humans exhibit a right-hand preference, and the vast majority shows left-hemisphere dominance for language function. Although genetic models of human handedness and language have been proposed, the actual gene expression differences between cerebral hemispheres in humans remain to be fully defined. In the present study, gene expression profiles were examined in both hemispheres of three cortical regions involved in handedness and language in humans and their homologues in rhesus macaques: ventrolateral prefrontal cortex, posterior superior temporal cortex (STC), and primary motor cortex. Although the overall pattern of gene expression was very similar between hemispheres in both humans and macaques, weighted gene correlation network analysis revealed gene co-expression modules associated with hemisphere, which are different among the three cortical regions examined. Notably, a receptor-enriched gene module in STC was particularly associated with hemisphere and showed different expression levels between hemispheres only in humans.
Affiliation(s)
- Gerard Muntané
- Department of Anthropology and Center for the Advanced Study of Human Paleobiology, The George Washington University, Washington, DC, 20052, USA.
- Institut Biologia Evolutiva, Universitat Pompeu Fabra-CSIC, 08003, Barcelona, Spain.
| | - Gabriel Santpere
- Institut Biologia Evolutiva, Universitat Pompeu Fabra-CSIC, 08003, Barcelona, Spain
| | - Andrey Verendeev
- Department of Anthropology and Center for the Advanced Study of Human Paleobiology, The George Washington University, Washington, DC, 20052, USA
| | - William W Seeley
- Department of Neurology, Memory and Aging Center, University of California, San Francisco, CA, 94158, USA
| | - Bob Jacobs
- Laboratory of Quantitative Neuromorphology, Neuroscience Program, Colorado College, Colorado Springs, CO, 80903, USA
| | - William D Hopkins
- Neuroscience Institute and the Language Research Center, Georgia State University, Atlanta, GA, 30302, USA
| | - Arcadi Navarro
- Institut Biologia Evolutiva, Universitat Pompeu Fabra-CSIC, 08003, Barcelona, Spain
| | - Chet C Sherwood
- Department of Anthropology and Center for the Advanced Study of Human Paleobiology, The George Washington University, Washington, DC, 20052, USA
| |
|
37
|
Qin J, Yan B, Hu Y, Wang P, Wang J. Applications of integrative OMICs approaches to gene regulation studies. QUANTITATIVE BIOLOGY 2016. [DOI: 10.1007/s40484-016-0085-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
|
38
|
Ruiz-Larrabeiti O, Plágaro AH, Gracia C, Sevillano E, Gallego L, Hajnsdorf E, Kaberdin VR. A new custom microarray for sRNA profiling in Escherichia coli. FEMS Microbiol Lett 2016; 363:fnw131. [PMID: 27190161 DOI: 10.1093/femsle/fnw131] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Accepted: 05/16/2016] [Indexed: 12/25/2022] Open
Abstract
Bacterial small RNAs (sRNAs) play essential roles in the post-transcriptional control of gene expression. To improve their detection by conventional microarrays, we designed a custom microarray containing a group of probes targeting known and some putative Escherichia coli sRNAs. To assess its potential in detection of sRNAs, RNA profiling experiments were performed with total RNA extracted from E. coli MG1655 cells exponentially grown in rich (Luria-Bertani) and minimal (M9/glucose) media. We found that many sRNAs could yield reasonably strong and statistically significant signals corresponding to nearly all sRNAs annotated in the EcoCyc database. Besides differential expression of two sRNAs (GcvB and RydB), expression of other sRNAs was less affected by the composition of the growth media. Other examples of the differentially expressed sRNAs were revealed by comparing gene expression of the wild-type strain and its isogenic mutant lacking functional poly(A) polymerase I (pcnB). Further, northern blot analysis was employed to validate these data and to assess the existence of new putative sRNAs. Our results suggest that the use of custom microarrays with improved capacities for detection of sRNAs can offer an attractive opportunity for efficient gene expression profiling of sRNAs and their target mRNAs at the whole transcriptome level.
Affiliation(s)
- Olatz Ruiz-Larrabeiti
- Department of Immunology, Microbiology and Parasitology, University of the Basque Country UPV/EHU, Leioa, Spain
| | - Ander Hernández Plágaro
- Department of Immunology, Microbiology and Parasitology, University of the Basque Country UPV/EHU, Leioa, Spain
| | - Celine Gracia
- CNRS UMR8261 (previously FRE3630), University Paris Diderot, Sorbonne Paris Cité, Institut de Biologie Physico-Chimique, 75005 Paris, France
| | - Elena Sevillano
- Department of Immunology, Microbiology and Parasitology, University of the Basque Country UPV/EHU, Leioa, Spain
| | - Lucía Gallego
- Department of Immunology, Microbiology and Parasitology, University of the Basque Country UPV/EHU, Leioa, Spain
| | - Eliane Hajnsdorf
- CNRS UMR8261 (previously FRE3630), University Paris Diderot, Sorbonne Paris Cité, Institut de Biologie Physico-Chimique, 75005 Paris, France
| | - Vladimir R Kaberdin
- Department of Immunology, Microbiology and Parasitology, University of the Basque Country UPV/EHU, Leioa, Spain; IKERBASQUE, Basque Foundation for Science, 48011 Bilbao, Spain
| |
|
39
|
Efficient and biologically relevant consensus strategy for Parkinson's disease gene prioritization. BMC Med Genomics 2016; 9:12. [PMID: 26961748 PMCID: PMC4784386 DOI: 10.1186/s12920-016-0173-x] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 09/03/2015] [Accepted: 03/01/2016] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND The systemic information enclosed in microarray data encodes relevant clues to overcome the poorly understood combination of genetic and environmental factors in Parkinson's disease (PD), which represents the major obstacle to understand its pathogenesis and to develop disease-modifying therapeutics. While several gene prioritization approaches have been proposed, none dominate over the rest. Instead, hybrid approaches seem to outperform individual approaches. METHODS A consensus strategy is proposed for PD related gene prioritization from mRNA microarray data based on the combination of three independent prioritization approaches: Limma, machine learning, and weighted gene co-expression networks. RESULTS The consensus strategy outperformed the individual approaches in terms of statistical significance, overall enrichment and early recognition ability. In addition to a significant biological relevance, the set of 50 genes prioritized exhibited an excellent early recognition ability (6 of the top 10 genes are directly associated with PD). 40 % of the prioritized genes were previously associated with PD including well-known PD related genes such as SLC18A2, TH or DRD2. Eight genes (CCNH, DLK1, PCDH8, SLIT1, DLD, PBX1, INSM1, and BMI1) were found to be significantly associated to biological process affected in PD, representing potentially novel PD biomarkers or therapeutic targets. Additionally, several metrics of standard use in chemoinformatics are proposed to evaluate the early recognition ability of gene prioritization tools. CONCLUSIONS The proposed consensus strategy represents an efficient and biologically relevant approach for gene prioritization tasks providing a valuable decision-making tool for the study of PD pathogenesis and the development of disease-modifying PD therapeutics.
|
40
|
Gene Expression in HIV-Associated Neurocognitive Disorders: A Meta-Analysis. J Acquir Immune Defic Syndr 2016; 70:479-88. [PMID: 26569176 DOI: 10.1097/qai.0000000000000800] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Indexed: 12/27/2022]
Abstract
OBJECTIVES To identify differentially expressed (DE) genes in HIV-associated neurocognitive disorders (HAND) patients in comparison with HIV-infected patients without HAND and controls. DESIGN A meta-analysis of publicly available gene expression data from HIV postmortem brain tissue studies. METHODS We selected studies using clearly defined inclusion and exclusion criteria. Within study data preprocessing and individual analyses were performed for each brain region. The following meta-analytic methods were applied: combining P values, combining effect sizes with and without a permutation method. The DE genes were defined with a false discovery rate less than 5% using Benjamini-Hochberg method. RESULTS Our meta-analysis on 3 studies encompasses analyses of over 48 postmortem brains [25 HAND, 7 HIV encephalitis (HIVE), 8 HIV-infected patients, and 8 controls]. Overall, 411 genes in white matter were DE in HAND with HIVE patients when comparing with controls. Of these, 94 genes were significantly expressed in all statistical methods. These 94 genes participate in significant pathways such as immune system, interferon response, or antigen presentation. Sixty-six of the 94 genes were significantly upregulated with log2 intensities greater than 2-fold. Strong examples of the highly upregulated genes were PSMB8-AS1, APOL6, TRIM69, PSME1, CTSB, HLA-E, GPNMB, UBE2L6, PSME2, NET1, CAPG, B2M, RPL38, GBP1, and PLSCR1. Only BTN3A2 was expressed in HAND with HIVE patients as compared with HAND patients without HIVE. CONCLUSION A number of genes were DE in our meta-analysis that were not identified in the individual analyses. The meta-analytic approach has increased statistical power for identifying DE genes in HAND.
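The abstract above defines DE genes by a false discovery rate below 5% using the Benjamini-Hochberg method. As a rough illustration of that step only, the following minimal Python sketch computes Benjamini-Hochberg adjusted p-values. The function name `benjamini_hochberg` and the toy p-values are assumptions made for this example and are not taken from the cited study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    n = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical per-gene p-values (illustrative only).
pvals = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(pvals)
# A gene is called DE here when its adjusted p-value is below 0.05.
significant = [q < 0.05 for q in adj]
```

In this toy input only the first gene passes the 5% threshold, although three raw p-values are below 0.05.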
|
41
|
Maimari N, Pedrigi RM, Russo A, Broda K, Krams R. Integration of flow studies for robust selection of mechanoresponsive genes. Thromb Haemost 2016; 115:474-83. [PMID: 26842798 DOI: 10.1160/th15-09-0704] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 09/06/2015] [Accepted: 11/13/2015] [Indexed: 11/05/2022]
Abstract
Blood flow is an essential contributor to plaque growth, composition and initiation. It is sensed by endothelial cells, which react to blood flow by expressing > 1000 genes. The sheer number of genes implies that one needs genomic techniques to unravel their response in disease. Individual genomic studies have been performed but lack sufficient power to identify subtle changes in gene expression. In this study, we investigated whether a systematic meta-analysis of available microarray studies can improve their consistency. We identified 17 studies using microarrays, of which six were performed in vivo and 11 in vitro. The in vivo studies were disregarded due to the lack of the shear profile. Of the in vitro studies, a cross-platform integration of human studies (HUVECs in flow cells) showed high concordance (> 90 %). The human data set identified > 1600 genes to be shear responsive, more than any other study, and in this gene set all known mechanosensitive genes and pathways were present. A detailed network analysis indicated a power distribution (e.g. the presence of hubs), without a hierarchical organisation. The average cluster coefficient was high and further analysis indicated an aggregation of 3 and 4 element motifs, indicating a high prevalence of feedback and feed forward loops, similar to prokaryotic cells. In conclusion, this initial study presented a novel method to integrate human-based mechanosensitive studies to increase their power. The robust network was large, contained all known mechanosensitive pathways, and its structure revealed hubs and a large aggregate of feedback and feed forward loops.
Affiliation(s)
- Rob Krams
- Prof. Rob Krams, Chair in Molecular Bioengineering, Dept. Bioengineering, Imperial College London, Room 3.15, Royal School of Mines, Exhibition Road, SW7 2AZ London, UK, Tel.: +44 2075941473
| |
|
42
|
Yang Y, Mei Q. miRNA signature identification of retinoblastoma and the correlations between differentially expressed miRNAs during retinoblastoma progression. Mol Vis 2015; 21:1307-17. [PMID: 26730174 PMCID: PMC4688417] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2015] [Accepted: 12/12/2015] [Indexed: 11/17/2022] Open
Abstract
PURPOSE Retinoblastoma (RB) is a common pediatric cancer. The study aimed to uncover the mechanisms of RB progression and identify novel therapeutic biomarkers. METHODS The miRNA expression profile GSE7072, which includes three RB samples and three healthy retina samples, was used. After data normalization using the preprocessCore package, differentially expressed miRNAs (DE-miRs) were selected by the limma package. The targets of the DE-miRs were predicted based on two databases, followed by construction of the miRNA-target network. Pathway enrichment analysis was conducted for the targets of the DE-miRs using DAVID. The CTD database was used to predict RB-related genes, followed by clustering analysis using the pvclust package. The correlation network of DE-miRs was established. MiRNA expression was validated in another data set, GSE41321. RESULTS In total, 24 DE-miRs were identified whose targets were correlated with the cell cycle pathway. Among them, hsa-miR-373, hsa-miR-125b, and hsa-miR-181a were highlighted in the miRNA-target regulatory network; 14 DE-miRs, including hsa-miR-373, hsa-miR-125b, hsa-miR-18a, hsa-miR-25, hsa-miR-20a, and hsa-let-7 (a, b, c), were shown to distinguish RB from healthy tissue. In addition, hsa-miR-25, hsa-miR-18a, and hsa-miR-20a shared the common target BCL2L11; hsa-let-7b and hsa-miR-125b targeted the genes CDC25A, CDK6, and LIN28A. Expression of three miRNAs in GSE41321 was consistent with that in GSE7072. CONCLUSIONS Several critical miRNAs were identified in RB progression. Hsa-miR-373 might regulate RB invasion and metastasis, hsa-miR-181a might be involved in the CDKN1B-mediated cell cycle pathway, and hsa-miR-125b and hsa-let-7b might serve as tumor suppressors by coregulating CDK6, CDC25A, and LIN28A. The miRNAs hsa-miR-25, hsa-miR-18a, and hsa-miR-20a might exert their function by coregulating BCL2L11.
Affiliation(s)
- Yang Yang
- Department of Ophthalmology, Renmin Hospital of Wuhan University, Wuhan, China
| | - Qi Mei
- Department of Oncology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
| |
|
43
|
Yasrebi H. Comparative study of joint analysis of microarray gene expression data in survival prediction and risk assessment of breast cancer patients. Brief Bioinform 2015; 17:771-85. [PMID: 26504096 PMCID: PMC5863785 DOI: 10.1093/bib/bbv092] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 04/28/2015] [Indexed: 11/16/2022] Open
Abstract
Microarray gene expression data sets are jointly analyzed to increase statistical power. They could either be merged together or analyzed by meta-analysis. For a given ensemble of data sets, it cannot be foreseen which of these paradigms, merging or meta-analysis, works better. In this article, three joint analysis methods, Z-score normalization, ComBat and the inverse normal method (meta-analysis) were selected for survival prognosis and risk assessment of breast cancer patients. The methods were applied to eight microarray gene expression data sets, totaling 1324 patients with two clinical endpoints, overall survival and relapse-free survival. The performance derived from the joint analysis methods was evaluated using Cox regression for survival analysis and independent validation used as bias estimation. Overall, Z-score normalization had a better performance than ComBat and meta-analysis. Higher Area Under the Receiver Operating Characteristic curve and hazard ratio were also obtained when independent validation was used as bias estimation. With a lower time and memory complexity, Z-score normalization is a simple method for joint analysis of microarray gene expression data sets. The derived findings suggest further assessment of this method in future survival prediction and cancer classification applications.
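The abstract above describes Z-score normalization as a simple way to merge microarray data sets. A minimal Python sketch of one common form of that idea follows: each gene is standardized within each data set, then the sample columns are joined. The function names `zscore_genes` and `merge_datasets` and the toy values are assumptions for illustration, and the exact procedure used in the cited article may differ.

```python
from statistics import mean, stdev

def zscore_genes(dataset):
    """Standardize each gene (row) of one dataset to mean 0 and SD 1."""
    out = []
    for row in dataset:
        m, s = mean(row), stdev(row)
        # A constant gene has s == 0; map it to zeros instead of dividing by zero.
        out.append([(x - m) / s if s > 0 else 0.0 for x in row])
    return out

def merge_datasets(datasets):
    """Z-score each genes-by-samples table separately, then join the sample columns."""
    scaled = [zscore_genes(d) for d in datasets]
    n_genes = len(scaled[0])
    return [sum((d[g] for d in scaled), []) for g in range(n_genes)]

# Two hypothetical studies measuring the same 2 genes on different intensity scales.
study_a = [[8.0, 9.0, 10.0], [5.0, 5.0, 5.0]]
study_b = [[100.0, 300.0], [40.0, 20.0]]
merged = merge_datasets([study_a, study_b])
```

Because the scaling is done per data set, study-specific offsets and scale differences are removed before the samples are pooled, at the cost of also removing any genuine between-study difference in average expression.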
|
44
|
Su J, Ekman C, Oskolkov N, Lahti L, Ström K, Brazma A, Groop L, Rung J, Hansson O. A novel atlas of gene expression in human skeletal muscle reveals molecular changes associated with aging. Skelet Muscle 2015; 5:35. [PMID: 26457177 PMCID: PMC4600214 DOI: 10.1186/s13395-015-0059-1] [Citation(s) in RCA: 66] [Impact Index Per Article: 7.3] [Received: 07/02/2015] [Accepted: 09/28/2015] [Indexed: 12/17/2022] Open
Abstract
Background Although high-throughput studies of gene expression have generated large amounts of data, most of which is freely available in public archives, the use of this valuable resource is limited by computational complications and non-homogenous annotation. To address these issues, we have performed a complete re-annotation of public microarray data from human skeletal muscle biopsies and constructed a muscle expression compendium consisting of nearly 3000 samples. The created muscle compendium is a publicly available resource including all curated annotation. Using this data set, we aimed to elucidate the molecular mechanism of muscle aging and to describe how physical exercise may alleviate negative physiological effects. Results We find 957 genes to be significantly associated with aging (p < 0.05, FDR = 5 %, n = 361). Aging was associated with perturbation of many central metabolic pathways like mitochondrial function including reduced expression of genes in the ATP synthase, NADH dehydrogenase, cytochrome C reductase and oxidase complexes, as well as in glucose and pyruvate processing. Among the genes with the strongest association with aging were H3 histone, family 3B (H3F3B, p = 3.4 × 10⁻¹³), AHNAK nucleoprotein, desmoyokin (AHNAK, p = 6.9 × 10⁻¹²), and histone deacetylase 4 (HDAC4, p = 4.0 × 10⁻⁹). We also discover genes previously not linked to muscle aging and metabolism, such as fasciculation and elongation protein zeta 2 (FEZ2, p = 2.8 × 10⁻⁸). Out of the 957 genes associated with aging, 21 (p < 0.001, false discovery rate = 5 %, n = 116) were also associated with maximal oxygen consumption (VO2MAX). Strikingly, 20 out of those 21 genes are regulated in opposite direction when comparing increasing age with increasing VO2MAX. Conclusions These results support that mitochondrial dysfunction is a major age-related factor and also highlight the beneficial effects of maintaining a high physical capacity for prevention of age-related sarcopenia.
Electronic supplementary material The online version of this article (doi:10.1186/s13395-015-0059-1) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Jing Su
- European Molecular Biology Laboratory-European Bioinformatics Institute, Wellcome Trust Genome Campus Hinxton, Cambridge, CB10 1SD UK
| | - Carl Ekman
- Lund University Diabetes Center, Department of Clinical Sciences, Diabetes and Endocrinology, Skåne University Hospital Malmö, Lund University, Malmö, 20502 Sweden
| | - Nikolay Oskolkov
- Lund University Diabetes Center, Department of Clinical Sciences, Diabetes and Endocrinology, Skåne University Hospital Malmö, Lund University, Malmö, 20502 Sweden
| | - Leo Lahti
- Department of Veterinary Biosciences, University of Helsinki, PO Box 66, FI-00014 Helsinki, Finland
| | - Kristoffer Ström
- Lund University Diabetes Center, Department of Clinical Sciences, Diabetes and Endocrinology, Skåne University Hospital Malmö, Lund University, Malmö, 20502 Sweden; Swedish Winter Sports Research Centre, Department of Health Sciences, Mid Sweden University, SE-83125 Östersund, Sweden
| | - Alvis Brazma
- European Molecular Biology Laboratory-European Bioinformatics Institute, Wellcome Trust Genome Campus Hinxton, Cambridge, CB10 1SD UK
| | - Leif Groop
- Lund University Diabetes Center, Department of Clinical Sciences, Diabetes and Endocrinology, Skåne University Hospital Malmö, Lund University, Malmö, 20502 Sweden
| | - Johan Rung
- European Molecular Biology Laboratory-European Bioinformatics Institute, Wellcome Trust Genome Campus Hinxton, Cambridge, CB10 1SD UK; Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Rudbeck Laboratory, Uppsala University, 751 85 Uppsala, Sweden
| | - Ola Hansson
- Lund University Diabetes Center, Department of Clinical Sciences, Diabetes and Endocrinology, Skåne University Hospital Malmö, Lund University, Malmö, 20502 Sweden
| |
|
45
|
Xu Y, Huang Y, Cai D, Liu J, Cao X. Analysis of differences in the molecular mechanism of rheumatoid arthritis and osteoarthritis based on integration of gene expression profiles. Immunol Lett 2015; 168:246-53. [PMID: 26404854 DOI: 10.1016/j.imlet.2015.09.011] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 07/30/2015] [Revised: 09/14/2015] [Accepted: 09/18/2015] [Indexed: 11/16/2022]
Abstract
BACKGROUND We aimed to elucidate the molecular mechanisms underlying rheumatoid arthritis (RA) and osteoarthritis (OA) and to analyze the differences in mechanism between them. METHODS The gene expression profiles GSE1919, GSE12021, GSE21959 and GSE48780 were downloaded from Gene Expression Omnibus. A total of 165 samples of synovial fibroblasts (118 RA samples, 15 OA samples and 32 normal controls) were used. Genes differentially expressed in RA samples but not in OA samples (RA.DEGs) and genes differentially expressed in OA samples but not in RA samples (OA.DEGs) were screened using the limma package. Functional enrichment analysis was performed using DAVID. Moreover, a transcriptional regulatory network (TRN) and a microRNA regulatory network were constructed. RESULTS A total of 211 RA.DEGs (96 up- and 115 down-regulated) and 497 OA.DEGs (224 up- and 273 down-regulated) were identified. TRN analysis showed that C-ETS-1 and P53 were important transcription factors. C-ETS-1 could interact with matrix metallopeptidase 1 (MMP1) and CD53, while P53 could interact with epidermal growth factor receptor (EGFR) and dual specificity phosphatase 1 (DUSP1). In addition, v-myc avian myelocytomatosis viral oncogene homolog (MYC) and interleukin 1, beta (IL1B) could be regulated by the largest number of microRNAs in the microRNA regulatory network. Our study indicates that ETS-1 may contribute to RA progression by up-regulating MMP1 and to OA progression by up-regulating CD53. CONCLUSIONS P53 may be involved in the progression of RA and OA by targeting downstream EGFR and DUSP1, respectively. In addition, MYC and IL1B may play an important role in OA progression via regulation by microRNAs.
Affiliation(s)
- YiSheng Xu
- Orthopedics Department, Guangdong Provincial Hospital of Traditional Chinese Medicine, 111 Dade Road, Guangzhou 510120, Guangdong, China
- YongMing Huang
- Orthopedics Department, Guangdong Provincial Hospital of Traditional Chinese Medicine, 55 Neihuanxi Road, Guangzhou 510006, Guangdong, China
- DaKe Cai
- Guangdong Province Engineering Technology Research Institute of Traditional Chinese Medicine, Guangdong, Guangzhou 510095, China
- JinWen Liu
- Orthopedics Department, Guangdong Provincial Hospital of Traditional Chinese Medicine, 111 Dade Road, Guangzhou 510120, Guangdong, China
- XueWei Cao
- Orthopedics Department, Guangdong Provincial Hospital of Traditional Chinese Medicine, 111 Dade Road, Guangzhou 510120, Guangdong, China
|
46
|
Microarray Meta-Analysis and Cross-Platform Normalization: Integrative Genomics for Robust Biomarker Discovery. Microarrays 2015; 4:389-406. [PMID: 27600230 PMCID: PMC4996376 DOI: 10.3390/microarrays4030389] [Citation(s) in RCA: 66] [Impact Index Per Article: 7.3] [Received: 07/06/2015] [Revised: 08/16/2015] [Accepted: 08/17/2015] [Indexed: 01/24/2023]
Abstract
The diagnostic and prognostic potential of the vast quantity of publicly available microarray data has driven the development of methods for integrating data from different microarray platforms. Cross-platform integration, when appropriately implemented, has been shown to improve the reproducibility and robustness of gene signature biomarkers. Microarray platform integration can be conceptually divided into approaches that perform early-stage integration (cross-platform normalization) and those that perform late-stage integration (meta-analysis). A growing number of statistical methods and associated software for platform integration are available to the user; however, an understanding of their comparative performance and potential pitfalls is critical for best implementation. In this review we provide evidence-based, practical guidance to researchers performing cross-platform integration, particularly with the objective of discovering biomarkers.
|
47
|
Sha C, Barrans S, Care MA, Cunningham D, Tooze RM, Jack A, Westhead DR. Transferring genomics to the clinic: distinguishing Burkitt and diffuse large B cell lymphomas. Genome Med 2015; 7:64. [PMID: 26207141 PMCID: PMC4512160 DOI: 10.1186/s13073-015-0187-6] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 03/23/2015] [Accepted: 06/15/2015] [Indexed: 12/31/2022] Open
Abstract
BACKGROUND Classifiers based on molecular criteria such as gene expression signatures have been developed to distinguish Burkitt lymphoma and diffuse large B cell lymphoma, which help to explore the intermediate cases where traditional diagnosis is difficult. Transfer of these research classifiers into a clinical setting is challenging because there are competing classifiers in the literature based on different methodology and gene sets with no clear best choice; classifiers based on one expression measurement platform may not transfer effectively to another; and, classifiers developed using fresh frozen samples may not work effectively with the commonly used and more convenient formalin fixed paraffin-embedded samples used in routine diagnosis. METHODS Here we thoroughly compared two published high profile classifiers developed on data from different Affymetrix array platforms and fresh-frozen tissue, examining their transferability and concordance. Based on this analysis, a new Burkitt and diffuse large B cell lymphoma classifier (BDC) was developed and employed on Illumina DASL data from our own paraffin-embedded samples, allowing comparison with the diagnosis made in a central haematopathology laboratory and evaluation of clinical relevance. RESULTS We show that both previous classifiers can be recapitulated using very much smaller gene sets than originally employed, and that the classification result is closely dependent on the Burkitt lymphoma criteria applied in the training set. The BDC classification on our data exhibits high agreement (~95 %) with the original diagnosis. A simple outcome comparison in the patients presenting intermediate features on conventional criteria suggests that the cases classified as Burkitt lymphoma by BDC have worse response to standard diffuse large B cell lymphoma treatment than those classified as diffuse large B cell lymphoma. 
CONCLUSIONS In this study, we comprehensively investigate two previous Burkitt lymphoma molecular classifiers, and implement a new gene expression classifier, BDC, that works effectively on paraffin-embedded samples and provides useful information for treatment decisions. The classifier is available as a free software package under the GNU public licence within the R statistical software environment through the link http://www.bioinformatics.leeds.ac.uk/labpages/softwares/ or on github https://github.com/Sharlene/BDC.
Affiliation(s)
- Chulin Sha
- School of Molecular and Cellular Biology, Garstang Building, University of Leeds, Leeds, LS2 9JT UK
- Sharon Barrans
- Haematological Malignancy Diagnostic Service, St James’s University Hospital, Leeds, UK
- Matthew A. Care
- Section of Experimental Haematology, Leeds Institute of Cancer and Pathology, University of Leeds, Leeds, UK
- Reuben M. Tooze
- Haematological Malignancy Diagnostic Service, St James’s University Hospital, Leeds, UK
- Section of Experimental Haematology, Leeds Institute of Cancer and Pathology, University of Leeds, Leeds, UK
- Andrew Jack
- Haematological Malignancy Diagnostic Service, St James’s University Hospital, Leeds, UK
- David R. Westhead
- School of Molecular and Cellular Biology, Garstang Building, University of Leeds, Leeds, LS2 9JT UK
|
48
|
Hughey JJ, Butte AJ. Robust meta-analysis of gene expression using the elastic net. Nucleic Acids Res 2015; 43:e79. [PMID: 25829177 PMCID: PMC4499117 DOI: 10.1093/nar/gkv229] [Citation(s) in RCA: 81] [Impact Index Per Article: 9.0] [Received: 12/11/2014] [Accepted: 03/05/2015] [Indexed: 12/15/2022] Open
Abstract
Meta-analysis of gene expression has enabled numerous insights into biological systems, but current methods have several limitations. We developed a method to perform a meta-analysis using the elastic net, a powerful and versatile approach for classification and regression. To demonstrate the utility of our method, we conducted a meta-analysis of lung cancer gene expression based on publicly available data. Using 629 samples from five data sets, we trained a multinomial classifier to distinguish between four lung cancer subtypes. Our meta-analysis-derived classifier included 58 genes and achieved 91% accuracy on leave-one-study-out cross-validation and on three independent data sets. Our method makes meta-analysis of gene expression more systematic and expands the range of questions that a meta-analysis can be used to address. As the amount of publicly available gene expression data continues to grow, our method will be an effective tool to help distill these data into knowledge.
Affiliation(s)
- Jacob J Hughey
- Division of Systems Medicine, Department of Pediatrics, Stanford University, Stanford, CA 94305, USA
- Atul J Butte
- Division of Systems Medicine, Department of Pediatrics, Stanford University, Stanford, CA 94305, USA
|
49
|
Identification of crucial genes in intracranial aneurysm based on weighted gene coexpression network analysis. Cancer Gene Ther 2015; 22:238-45. [DOI: 10.1038/cgt.2015.10] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Received: 12/22/2014] [Revised: 01/14/2015] [Accepted: 01/16/2015] [Indexed: 01/17/2023]
|
50
|
Golf O, Muirhead LJ, Speller A, Balog J, Abbassi-Ghadi N, Kumar S, Mróz A, Veselkov K, Takáts Z. XMS: cross-platform normalization method for multimodal mass spectrometric tissue profiling. J Am Soc Mass Spectrom 2015; 26:44-54. [PMID: 25380777 DOI: 10.1007/s13361-014-0997-6] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 05/19/2014] [Revised: 09/01/2014] [Accepted: 09/02/2014] [Indexed: 06/04/2023]
Abstract
Here we present a proof-of-concept cross-platform normalization approach to convert raw mass spectra acquired by distinct desorption ionization methods and/or instrumental setups into cross-platform normalized analyte profiles. The initial step of the workflow is database-driven peak annotation, followed by summarization of the peak intensities of different ions from the same molecule. The resulting compound-intensity spectra are adjusted to a method-independent intensity scale by using predetermined, compound-specific normalization factors. The method is based on the assumption that distinct MS-based platforms capture a similar set of chemical species in a biological sample, though these species may exhibit platform-specific molecular ion intensity distribution patterns. The method was validated on two sample sets: (1) porcine tissue analyzed by laser desorption ionization (LDI), desorption electrospray ionization (DESI), and rapid evaporative ionization mass spectrometry (REIMS) in combination with Fourier transformation-based mass spectrometry; and (2) healthy/cancerous colorectal tissue analyzed by DESI and REIMS, with the latter being combined with time-of-flight mass spectrometry. We demonstrate the capacity of our method to reduce MS-platform-specific variation, resulting in (1) high inter-platform concordance coefficients of analyte intensities; (2) clear principal component-based clustering of analyte profiles according to histological tissue types, irrespective of the desorption ionization technique or mass spectrometer used; and (3) accurate "blind" classification of histological tissue types using cross-platform normalized analyte profiles.
Affiliation(s)
- Ottmar Golf
- Institute for Inorganic and Analytical Chemistry, Justus Liebig University, Giessen, Germany
|