1
He S, Gubin MM, Rafei H, Basar R, Dede M, Jiang X, Liang Q, Tan Y, Kim K, Gillison ML, Rezvani K, Peng W, Haymaker C, Hernandez S, Solis LM, Mohanty V, Chen K. Elucidating immune-related gene transcriptional programs via factorization of large-scale RNA-profiles. iScience 2024; 27:110096. [PMID: 38957791 PMCID: PMC11217617 DOI: 10.1016/j.isci.2024.110096] [Received: 12/14/2023] [Revised: 04/03/2024] [Accepted: 05/21/2024] [Indexed: 07/04/2024]
Abstract
Recent developments in immunotherapy, including immune checkpoint blockade (ICB) and adoptive cell therapy (ACT), have encountered challenges such as immune-related adverse events and resistance, especially in solid tumors. To advance the field, a deeper understanding of the molecular mechanisms behind treatment responses and resistance is essential. However, the lack of functionally characterized immune-related gene sets has limited data-driven immunological research. To address this gap, we adopted non-negative matrix factorization on 83 human bulk RNA sequencing (RNA-seq) datasets and constructed 28 immune-specific gene sets. After rigorous immunologist-led manual annotations and orthogonal validations across immunological contexts and functional omics data, we demonstrated that these gene sets can be applied to refine pan-cancer immune subtypes, improve ICB response prediction and functionally annotate spatial transcriptomic data. These functional gene sets, informing diverse immune states, will advance our understanding of immunology and cancer research.
Affiliation(s)
- Shan He
  - Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Matthew M. Gubin
  - Department of Immunology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Hind Rafei
  - Department of Stem Cell Transplantation and Cellular Therapy, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Rafet Basar
  - Department of Stem Cell Transplantation and Cellular Therapy, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Merve Dede
  - Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Xianli Jiang
  - Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Qingnan Liang
  - Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Yukun Tan
  - Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Kunhee Kim
  - Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Maura L. Gillison
  - Department of Thoracic/Head and Neck Medical Oncology, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Katayoun Rezvani
  - Department of Stem Cell Transplantation and Cellular Therapy, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Weiyi Peng
  - Department of Biology and Biochemistry, The University of Houston, Houston, TX, USA
- Cara Haymaker
  - Department of Translational Molecular Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Sharia Hernandez
  - Department of Translational Molecular Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Luisa M. Solis
  - Department of Translational Molecular Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Vakul Mohanty
  - Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Ken Chen
  - Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
2
He S, Gubin MM, Rafei H, Basar R, Dede M, Jiang X, Liang Q, Tan Y, Kim K, Gillison ML, Rezvani K, Peng W, Haymaker C, Hernandez S, Solis LM, Mohanty V, Chen K. Elucidating immune-related gene transcriptional programs via factorization of large-scale RNA-profiles. bioRxiv: the preprint server for biology 2024:2024.05.10.593433. [PMID: 38798470 PMCID: PMC11118452 DOI: 10.1101/2024.05.10.593433] [Indexed: 05/29/2024]
Abstract
Recent developments in immunotherapy, including immune checkpoint blockade (ICB) and adoptive cell therapy, have encountered challenges such as immune-related adverse events and resistance, especially in solid tumors. To advance the field, a deeper understanding of the molecular mechanisms behind treatment responses and resistance is essential. However, the lack of functionally characterized immune-related gene sets has limited data-driven immunological research. To address this gap, we adopted non-negative matrix factorization on 83 human bulk RNA-seq datasets and constructed 28 immune-specific gene sets. After rigorous immunologist-led manual annotations and orthogonal validations across immunological contexts and functional omics data, we demonstrated that these gene sets can be applied to refine pan-cancer immune subtypes, improve ICB response prediction and functionally annotate spatial transcriptomic data. These functional gene sets, informing diverse immune states, will advance our understanding of immunology and cancer research.
3
Yang G, Yu XR, Weisenberger DJ, Lu T, Liang G. A Multi-Omics Overview of Colorectal Cancer to Address Mechanisms of Disease, Metastasis, Patient Disparities and Outcomes. Cancers (Basel) 2023; 15:2934. [PMID: 37296894 DOI: 10.3390/cancers15112934] [Received: 04/09/2023] [Revised: 05/16/2023] [Accepted: 05/23/2023] [Indexed: 06/12/2023]
Abstract
Human colorectal cancer (CRC) is one of the most common malignancies in men and women across the globe, although CRC incidence and mortality show substantial racial and ethnic disparities, with the highest burden in African American patients. Even with effective screening tools such as colonoscopy and diagnostic detection assays, CRC remains a substantial health burden. In addition, primary tumors located in the proximal (right) or distal (left) sides of the colorectum have been shown to be unique tumor types that require unique treatment schema. Distal metastases in the liver and other organ systems are the major causes of mortality in CRC patients. Characterizing genomic, epigenomic, transcriptomic and proteomic (multi-omics) alterations has led to a better understanding of primary tumor biology, resulting in targeted therapeutic advancements. In this regard, molecular-based CRC subgroups have been developed that show correlations with patient outcomes. Molecular characterization of CRC metastases has highlighted similarities and differences between metastases and primary tumors; however, our understanding of how to improve patient outcomes based on metastasis biology is lagging, which remains a major obstacle to improving CRC patient outcomes. In this review, we will summarize the multi-omics features of primary CRC tumors and their metastases across racial and ethnic groups, the differences in proximal and distal tumor biology, molecular-based CRC subgroups, treatment strategies and challenges for improving patient outcomes.
Affiliation(s)
- Guang Yang
  - School of Sciences, China Pharmaceutical University, Nanjing 211121, China
  - China Grand Enterprises, Beijing 100101, China
- Xi Richard Yu
  - China Grand Enterprises, Beijing 100101, China
  - Huadong Medicine Co., Ltd., Hangzhou 310011, China
- Daniel J Weisenberger
  - Department of Biochemistry and Molecular Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA 90033, USA
  - USC Institute of Urology, Keck School of Medicine, University of Southern California, Los Angeles, CA 90033, USA
- Tao Lu
  - School of Sciences, China Pharmaceutical University, Nanjing 211121, China
  - State Key Laboratory of Natural Sciences, China Pharmaceutical University, Nanjing 211121, China
- Gangning Liang
  - USC Institute of Urology, Keck School of Medicine, University of Southern California, Los Angeles, CA 90033, USA
  - USC Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California, Los Angeles, CA 90033, USA
4
Agapito G, Milano M, Cannataro M. A Python Clustering Analysis Protocol of Genes Expression Data Sets. Genes (Basel) 2022; 13:1839. [PMID: 36292724 PMCID: PMC9601308 DOI: 10.3390/genes13101839] [Received: 09/14/2022] [Revised: 10/05/2022] [Accepted: 10/08/2022] [Indexed: 11/16/2022]
Abstract
Gene expression and SNP data hold great potential for a new understanding of disease prognosis, drug sensitivity, and toxicity evaluations. Cluster analysis is used to analyze data that do not contain any specific subgroups. The goal is to use the data itself to recognize meaningful and informative subgroups. In addition, cluster analysis aids data reduction, exposes hidden patterns, and generates hypotheses regarding the relationship between genes and phenotypes. Cluster analysis could also be used to identify bio-markers and yield computational predictive models. The methods used to analyze microarray data can profoundly influence the interpretation of the results. Therefore, a basic understanding of these computational tools is necessary for optimal experimental design and meaningful data analysis. This manuscript provides an analysis protocol to effectively analyze gene expression data sets through the K-means and DBSCAN algorithms. The general protocol enables analyzing omics data to identify subsets of features with low redundancy and high robustness, speeding up the identification of new bio-markers through pathway enrichment analysis. In addition, to demonstrate the effectiveness of our clustering analysis protocol, we analyze a real data set from the GEO database. Finally, the manuscript provides best practices and tips for overcoming issues in the analysis of omics data sets through unsupervised learning.
Affiliation(s)
- Giuseppe Agapito
  - Department of Law, Economics and Social Sciences, University Magna Græcia of Catanzaro, 88100 Catanzaro, Italy
  - Data Analytics Research Center, University Magna Græcia of Catanzaro, 88100 Catanzaro, Italy
- Marianna Milano
  - Data Analytics Research Center, University Magna Græcia of Catanzaro, 88100 Catanzaro, Italy
  - Department of Medical and Clinical Surgery, University Magna Græcia of Catanzaro, 88100 Catanzaro, Italy
- Mario Cannataro
  - Data Analytics Research Center, University Magna Græcia of Catanzaro, 88100 Catanzaro, Italy
  - Department of Medical and Clinical Surgery, University Magna Græcia of Catanzaro, 88100 Catanzaro, Italy