1
Wu P, Li D, Zhang C, Dai B, Tang X, Liu J, Wu Y, Wang X, Shen A, Zhao J, Zi X, Li R, Sun N, He J. A unique circulating microRNA pairs signature serves as a superior tool for early diagnosis of pan-cancer. Cancer Lett 2024; 588:216655. [PMID: 38460724 DOI: 10.1016/j.canlet.2024.216655] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2023] [Revised: 11/18/2023] [Accepted: 01/16/2024] [Indexed: 03/11/2024]
Abstract
Cancer remains a major global burden, and early diagnosis is critical. Although various miRNA-based signatures have been developed in past decades, their clinical use is limited by the lack of a precise cutoff value. Here, we developed a signature based on pairwise expression of miRNAs (miRPs) for pan-cancer diagnosis using a machine-learning approach. We analyzed the miRNA profiles of 15832 patients, covering 13 different cancers from 10 cohorts, who were divided into training, validation, test, and external test sets. Five machine-learning (ML) algorithms (XGBoost, SVM, RandomForest, LASSO, and Logistic) were used for signature construction. The best ML algorithm and the optimal number of miRPs were identified using the area under the curve (AUC) and the Youden index in the validation set. The AUC of the best model was compared with those of 25 previously published signatures. Overall, a Random Forest model including 31 miRPs (31-miRP) was developed and proved highly efficient in cancer diagnosis across different datasets and cancer types (AUC range: 0.980-1.000). For the diagnosis of early-stage cancers, 31-miRP also performed well, with AUC ranging from 0.961 to 0.998. Moreover, 31-miRP showed advantages in differentiating cancers from normal tissues (AUC range: 0.976-0.998) as well as from corresponding benign lesions. Compared with the 25 previously published signatures, 31-miRP also demonstrated clear advantages. In conclusion, 31-miRP is a powerful model for cancer diagnosis, characterized by high specificity and sensitivity as well as a clear cutoff value, and thus holds potential as a reliable tool for early-stage cancer diagnosis.
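The two ingredients named in this abstract, pairwise miRNA features and a Youden-index cutoff, can be illustrated with a minimal sketch. It is not the authors' pipeline: the miRNA names, expression values, scores, and labels are made up, and the helper names `pair_features` and `youden_cutoff` are hypothetical.

```python
from itertools import combinations

def pair_features(expr, names):
    """Binary within-sample features: 1 if miRNA a is expressed above miRNA b."""
    return {f"{a}>{b}": int(expr[a] > expr[b]) for a, b in combinations(names, 2)}

def youden_cutoff(scores, labels):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    A sample is called positive when its score is >= cutoff.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    best_cutoff, best_j = None, -1.0
    for c in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= c and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < c and y == 0)
        j = tp / pos + tn / neg - 1
        if j > best_j:
            best_cutoff, best_j = c, j
    return best_cutoff, best_j

# Made-up expression values for three placeholder miRNAs in one sample.
sample = {"miR-A": 5.2, "miR-B": 3.1, "miR-C": 4.0}
features = pair_features(sample, ["miR-A", "miR-B", "miR-C"])

# Made-up classifier scores (1 = cancer, 0 = control).
scores = [0.1, 0.2, 0.3, 0.5, 0.6, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1]
cutoff, j = youden_cutoff(scores, labels)
print(features, cutoff, j)
```

Because each feature depends only on the ordering of two miRNAs within one sample, it does not require a cohort-specific expression threshold; the single cutoff is applied to the model score instead.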
Affiliation(s)
- Peng Wu
  - Department of Thoracic Surgery, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
- Dongyu Li
  - Department of Thoracic Surgery, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China; 4+4 Medical Doctor Program, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
- Chaoqi Zhang
  - Department of Thoracic Surgery, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
- Bing Dai
  - School of Software, Tsinghua University, Beijing, 100084, China
- Xiaoya Tang
  - Department of Thoracic Surgery, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
- Jingjing Liu
  - Department of Thoracic Surgery, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
- Yue Wu
  - Department of Clinical Laboratory, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
- Xingwu Wang
  - Department of Thoracic Surgery, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
- Ao Shen
  - Department of Thoracic Surgery, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
- Jiapeng Zhao
  - 4+4 Medical Doctor Program, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
- Xiaohui Zi
  - Department of Thoracic Surgery, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
- Ruirui Li
  - Department of Pathology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
- Nan Sun
  - Department of Thoracic Surgery, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
- Jie He
  - Department of Thoracic Surgery, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
2
Yamaguchi N, Sawano T, Nakatani J, Nakano-Doi A, Nakagomi T, Matsuyama T, Tanaka H. Voluntary running exercise modifies astrocytic population and features in the peri-infarct cortex. IBRO Neurosci Rep 2023; 14:253-263. [PMID: 36880055 PMCID: PMC9984846 DOI: 10.1016/j.ibneur.2023.02.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2022] [Revised: 02/19/2023] [Accepted: 02/20/2023] [Indexed: 02/24/2023] Open
Abstract
Rehabilitative exercise following a brain stroke has beneficial effects on the morphological plasticity of neurons. In particular, voluntary running exercise after focal cerebral ischemia promotes functional recovery and ameliorates ischemia-induced dendritic spine loss in layer 5 of the peri-infarct motor cortex. Moreover, neuronal morphology is affected by changes in the perineuronal environment. Glial cells, whose phenotypes may be altered by exercise, are known to play a pivotal role in the formation of this perineuronal environment. Herein, we investigated the effects of voluntary running exercise on glial cells after middle cerebral artery occlusion. On post-operative day (POD) 15, voluntary running exercise had increased the population of glial fibrillary acidic protein-positive astrocytes born between POD0 and POD3 in the peri-infarct cortex. After exercise, transcriptomic analysis of post-ischemic astrocytes revealed 10 upregulated and 70 downregulated genes. Furthermore, gene ontology analysis showed that the 70 downregulated genes were significantly associated with neuronal morphology. In addition, exercise reduced the number of astrocytes expressing lipocalin 2, a regulator of dendritic spine density, on POD15. Our results suggest that exercise modifies the composition of the astrocytic population and its phenotype.
Key Words
- ACSA-2, astrocyte cell surface antigen-2
- Astrocytes
- BrdU, 5-bromo-2′-deoxyuridine
- Cerebral ischemia
- DEG, differentially expressed gene
- EDTA, ethylenediaminetetraacetic acid
- FBS, fetal bovine serum
- GFAP, glial fibrillary acidic protein
- GO, gene ontology
- GST-π, glutathione S-transferase-π
- Gstp1, glutathione S-transferase, pi 1
- Gstp2, glutathione S-transferase, pi 2
- Iba1, ionized calcium-binding adapter molecule 1
- Ig, immunoglobulin
- Lcn2, lipocalin 2
- MCAO, middle cerebral artery occlusion
- PBS, phosphate-buffered saline
- PFA, 4% paraformaldehyde
- POD, post-operative day
- Proliferation
- TUNEL, terminal deoxynucleotidyl transferase-mediated dUTP nick 3’-end labeling
- Transcriptome
- Vegfa, vascular endothelial growth factor A
- Voluntary running exercise
- Vtn, vitronectin
- qPCR, quantitative polymerase chain reaction
Affiliation(s)
- Natsumi Yamaguchi
  - Pharmacology Laboratory, Department of Biomedical Sciences, College of Life Sciences, Ritsumeikan University, 1-1-1 Noji-Higashi, Kusatsu, Shiga 525-8577, Japan
  - Ritsumeikan Advanced Research Academy, 1 Nishinokyo-Suzaku-cho, Nakagyo-ku, Kyoto 604-8520, Japan
- Toshinori Sawano
  - Pharmacology Laboratory, Department of Biomedical Sciences, College of Life Sciences, Ritsumeikan University, 1-1-1 Noji-Higashi, Kusatsu, Shiga 525-8577, Japan
- Jin Nakatani
  - Pharmacology Laboratory, Department of Biomedical Sciences, College of Life Sciences, Ritsumeikan University, 1-1-1 Noji-Higashi, Kusatsu, Shiga 525-8577, Japan
- Akiko Nakano-Doi
  - Institute for Advanced Medical Sciences, Hyogo College of Medicine, 1-1 Mukogawacho, Nishinomiya 663-8501, Japan
  - Department of Therapeutic Progress in Brain Diseases, Hyogo College of Medicine, 1-1 Mukogawacho, Nishinomiya 663-8501, Japan
- Takayuki Nakagomi
  - Institute for Advanced Medical Sciences, Hyogo College of Medicine, 1-1 Mukogawacho, Nishinomiya 663-8501, Japan
  - Department of Therapeutic Progress in Brain Diseases, Hyogo College of Medicine, 1-1 Mukogawacho, Nishinomiya 663-8501, Japan
- Tomohiro Matsuyama
  - Department of Therapeutic Progress in Brain Diseases, Hyogo College of Medicine, 1-1 Mukogawacho, Nishinomiya 663-8501, Japan
- Hidekazu Tanaka
  - Pharmacology Laboratory, Department of Biomedical Sciences, College of Life Sciences, Ritsumeikan University, 1-1-1 Noji-Higashi, Kusatsu, Shiga 525-8577, Japan
3
Liu Y, Lin Y, Yang W, Lin Y, Wu Y, Zhang Z, Lin N, Wang X, Tong M, Yu R. Application of individualized differential expression analysis in human cancer proteome. Brief Bioinform 2022; 23:6562685. [DOI: 10.1093/bib/bbac096] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2021] [Revised: 02/06/2022] [Accepted: 02/23/2022] [Indexed: 11/13/2022] Open
Abstract
Liquid chromatography–mass spectrometry-based quantitative proteomics can measure the expression of thousands of proteins from biological samples and has been increasingly applied in cancer research. Identifying differentially expressed proteins (DEPs) between tumors and normal controls is commonly used to investigate carcinogenesis mechanisms. While differential expression analysis (DEA) at an individual level is desired to identify patient-specific molecular defects for better patient stratification, most statistical DEP analysis methods only identify deregulated proteins at the population level. To date, robust individualized DEA algorithms have been proposed for ribonucleic acid data, but their performance on proteomics data is underexplored. Herein, we performed a systematic evaluation of five individualized DEA algorithms for proteins on cancer proteomic datasets from seven cancer types. Results show that the within-sample relative expression orderings (REOs) of protein pairs in normal tissues were highly stable, providing the basis for individualized DEA for proteins using REOs. Moreover, individualized DEA algorithms achieve higher precision in detecting sample-specific deregulated proteins than population-level methods. To facilitate the use of individualized DEA algorithms in proteomics for prognostic biomarker discovery and personalized medicine, we provide Individualized DEP Analysis (IDEPA-XMBD; XMBD: Xiamen Big Data, a biomedical open software initiative in the National Institute for Data Science in Health and Medicine, Xiamen University, China) (https://github.com/xmuyulab/IDEPA-XMBD), a user-friendly and open-source Python toolkit that integrates individualized DEA algorithms for DEP-associated deregulation pattern recognition.
Affiliation(s)
- Yachen Liu
  - School of Informatics, Xiamen University, Xiamen, Fujian 316000, China
  - National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, Fujian 316005, China
- Yalan Lin
  - School of Informatics, Xiamen University, Xiamen, Fujian 316000, China
- Wenxian Yang
  - Aginome Scientific, Xiamen, Fujian 316005, China
- Yuxiang Lin
  - National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, Fujian 316005, China
  - State Key Laboratory of Cellular Stress Biology, Innovation Center for Cell Signaling Network, School of Life Sciences, Xiamen University, Xiamen, Fujian 361102, China
- Yujuan Wu
  - School of Informatics, Xiamen University, Xiamen, Fujian 316000, China
- Zheyang Zhang
  - National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, Fujian 316005, China
  - State Key Laboratory of Cellular Stress Biology, Innovation Center for Cell Signaling Network, School of Life Sciences, Xiamen University, Xiamen, Fujian 361102, China
- Nuoqi Lin
  - State Key Laboratory of Cellular Stress Biology, Innovation Center for Cell Signaling Network, School of Life Sciences, Xiamen University, Xiamen, Fujian 361102, China
- Xianlong Wang
  - Department of Bioinformatics, School of Medical Technology and Engineering, Key Laboratory of Medical Bioinformatics, Key Laboratory of Ministry of Education for Gastrointestinal Cancer, Fujian Medical University, Fuzhou, Fujian 350122, China
- Mengsha Tong
  - National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, Fujian 316005, China
  - State Key Laboratory of Cellular Stress Biology, Innovation Center for Cell Signaling Network, School of Life Sciences, Xiamen University, Xiamen, Fujian 361102, China
- Rongshan Yu
  - School of Informatics, Xiamen University, Xiamen, Fujian 316000, China
  - National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, Fujian 316005, China
  - Aginome Scientific, Xiamen, Fujian 316005, China
4
Chekka LMS, Langaee T, Johnson JA. Comparison of Data Normalization Strategies for Array-Based MicroRNA Profiling Experiments and Identification and Validation of Circulating MicroRNAs as Endogenous Controls in Hypertension. Front Genet 2022; 13:836636. [PMID: 35432462 PMCID: PMC9008777 DOI: 10.3389/fgene.2022.836636] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/17/2021] [Accepted: 03/03/2022] [Indexed: 02/06/2023] Open
Abstract
Introduction: MicroRNAs are small noncoding RNAs with potential regulatory roles in hypertension and drug response. The presence of many of these RNAs in biofluids has spurred investigation into their role as possible biomarkers for use in precision approaches to healthcare. One of the major challenges in clinical translation of circulating miRNA biomarkers is the limited replication across studies due to a lack of standards for data normalization techniques for array-based approaches and a lack of consensus on an endogenous control normalizer for qPCR-based candidate miRNA profiling studies. Methods: We conducted genome-wide profiling of 754 miRNAs in baseline plasma of 36 European American individuals with uncomplicated hypertension selected from the PEAR clinical trial, who had been untreated for hypertension for at least one month prior to sample collection. After appropriate quality control with amplification score and missingness filters, we tested different normalization strategies, such as normalization with the global mean of imputed and unimputed data, the mean of a restricted set of miRNAs, quantile normalization, and endogenous control miRNA normalization, to identify the method that best reduces the technical/experimental variability in the data. We identified the best endogenous control candidates as those with an expression pattern closest to the mean miRNA expression in the sample, as well as by assessing their stability using a combination of the NormFinder, geNorm, BestKeeper and Delta Ct algorithms under the RefFinder software. The suitability of the four best endogenous controls was validated in 50 hypertensive African Americans from the same trial with reverse-transcription–qPCR and by evaluating their stability ranking in that cohort. Results: Among the compared normalization strategies, quantile normalization and global mean normalization performed better than the others in terms of reducing the standard deviation of miRNAs across samples in the array-based data. Among the four strongest candidate miRNAs from our selection process (miR-223-3p, 19b, 106a, and 126-5p), miR-223-3p and miR-126-5p were consistently expressed with the best stability ranking in the validation cohort. Furthermore, the combination of miR-223-3p and 126-5p showed a better stability ranking than single miRNAs. Conclusion: We identified quantile normalization followed by global mean normalization to be the best methods for reducing the variance in the data. We identified the combination of miR-223-3p and 126-5p as a potential endogenous control in studies of hypertension.
Affiliation(s)
- Lakshmi Manasa S. Chekka
  - Department of Pharmacotherapy and Translational Research and Center for Pharmacogenomics and Precision Medicine, University of Florida, Gainesville, FL, United States
- Taimour Langaee
  - Department of Pharmacotherapy and Translational Research and Center for Pharmacogenomics and Precision Medicine, University of Florida, Gainesville, FL, United States
- Julie A. Johnson
  - Department of Pharmacotherapy and Translational Research and Center for Pharmacogenomics and Precision Medicine, University of Florida, Gainesville, FL, United States
  - Division of Cardiovascular Medicine, Department of Medicine, University of Florida, Gainesville, FL, United States
  - *Correspondence: Julie A. Johnson,
5
Zhao Y, Wong L, Goh WWB. How to do quantile normalization correctly for gene expression data analyses. Sci Rep 2020; 10:15534. [PMID: 32968196 PMCID: PMC7511327 DOI: 10.1038/s41598-020-72664-6] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Received: 02/29/2020] [Accepted: 08/03/2020] [Indexed: 02/07/2023] Open
Abstract
Quantile normalization is an important normalization technique commonly used in high-dimensional data analysis. However, it is susceptible to class-effect proportion effects (the proportion of class-correlated variables in a dataset) and batch effects (the presence of potentially confounding technical variation) when applied blindly on whole data sets, resulting in higher false-positive and false-negative rates. We evaluate five strategies for performing quantile normalization and demonstrate that good performance in terms of batch-effect correction and statistical feature selection can be readily achieved by first splitting data by sample class labels and then performing quantile normalization independently on each split (“Class-specific”). Via simulations with both real and simulated batch effects, we demonstrate that the “Class-specific” strategy (and others relying on similar principles) readily outperforms whole-data quantile normalization and is robust, preserving useful signals even during the combined analysis of separately normalized datasets. Quantile normalization is a commonly used procedure, but when carelessly applied on whole datasets without first considering class-effect proportion and batch effects, it can result in poor performance. If quantile normalization must be used, we recommend the “Class-specific” strategy.
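The "Class-specific" idea described above (split samples by class label, then quantile-normalize each split independently) can be sketched in a few lines. This is an illustrative sketch only, not the authors' code: the function names are hypothetical, the data are made up, and ties are broken by position, which is a simplifying assumption.

```python
def quantile_normalize(columns):
    """Quantile-normalize equal-length sample columns (ties broken by position)."""
    n = len(columns[0])
    sorted_cols = [sorted(col) for col in columns]
    # Mean value at each rank across all samples in this group.
    rank_means = [sum(col[i] for col in sorted_cols) / len(columns) for i in range(n)]
    out = []
    for col in columns:
        order = sorted(range(n), key=lambda i: col[i])
        normalized = [0.0] * n
        for rank, idx in enumerate(order):
            normalized[idx] = rank_means[rank]
        out.append(normalized)
    return out

def class_specific_qn(columns, labels):
    """Apply quantile normalization separately within each class label."""
    result = [None] * len(columns)
    for cls in set(labels):
        idxs = [i for i, lab in enumerate(labels) if lab == cls]
        normed = quantile_normalize([columns[i] for i in idxs])
        for i, col in zip(idxs, normed):
            result[i] = col
    return result

# Made-up data: four samples (columns) measured on three features.
samples = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0], [30.0, 10.0, 20.0], [40.0, 20.0, 60.0]]
labels = ["normal", "normal", "tumor", "tumor"]
normalized = class_specific_qn(samples, labels)
print(normalized)
```

After this step, samples within a class share one value distribution, while the two classes keep their own distributions, so a global class difference is not averaged away as it would be under whole-data quantile normalization.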
Affiliation(s)
- Yaxing Zhao
  - School of Pharmaceutical Science and Technology, Tianjin University, Tianjin, China
- Limsoon Wong
  - Department of Computer Science, National University of Singapore, Singapore, Singapore
  - Department of Pathology, National University of Singapore, Singapore, Singapore
- Wilson Wen Bin Goh
  - School of Biological Sciences, Nanyang Technological University, Singapore, Singapore
6
Ho SY, Tan S, Sze CC, Wong L, Goh WWB. What can Venn diagrams teach us about doing data science better? International Journal of Data Science and Analytics 2020. [DOI: 10.1007/s41060-020-00230-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 12/11/2022]
7
Ho SY, Wong L, Goh WWB. Avoid Oversimplifications in Machine Learning: Going beyond the Class-Prediction Accuracy. Patterns 2020; 1:100025. [PMID: 33205097 PMCID: PMC7660406 DOI: 10.1016/j.patter.2020.100025] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 12/14/2022]
Abstract
Class-prediction accuracy provides a quick but superficial way of determining classifier performance. It does not inform on the reproducibility of the findings or whether the selected or constructed features used are meaningful and specific. Furthermore, the class-prediction accuracy oversummarizes and does not inform on how training and learning have been accomplished: two classifiers providing the same performance in one validation can disagree on many future validations. It does not provide explainability in its decision-making process and is not objective, as its value is also affected by class proportions in the validation set. Despite these issues, this does not mean we should omit the class-prediction accuracy. Instead, it needs to be enriched with accompanying evidence and tests that supplement and contextualize the reported accuracy. This additional evidence serves as augmentations and can help us perform machine learning better while avoiding naive reliance on oversimplified metrics. There is a huge potential for machine learning, but blind reliance on oversimplified metrics can mislead. Class-prediction accuracy is a common metric used for determining classifier performance. This article provides examples to show how the class-prediction accuracy is superficial and even misleading. We propose some augmentative measures to supplement the class-prediction accuracy. This in turn helps us to better understand the quality of learning of the classifier.
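The point that accuracy is affected by class proportions in the validation set can be made concrete with a small worked example. The sensitivity and specificity values below are made up for illustration, and `expected_accuracy` is a hypothetical helper name; the formula itself is the standard identity accuracy = sensitivity × P(positive) + specificity × P(negative).

```python
def expected_accuracy(sensitivity, specificity, positive_fraction):
    """Accuracy of a fixed classifier as a function of the positive-class fraction."""
    return sensitivity * positive_fraction + specificity * (1 - positive_fraction)

# Same classifier (sensitivity 0.9, specificity 0.6), three validation sets
# that differ only in the fraction of positive samples.
for frac in (0.1, 0.5, 0.9):
    print(frac, round(expected_accuracy(0.9, 0.6, frac), 3))
```

The classifier is unchanged across the three cases, yet its reported accuracy moves from 0.63 to 0.87 purely because the class mix of the validation set changes.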
Affiliation(s)
- Sung Yang Ho
  - School of Biological Sciences, Nanyang Technological University, Singapore 637551, Singapore
- Limsoon Wong
  - Department of Computer Science, National University of Singapore, Singapore 117417, Singapore
- Wilson Wen Bin Goh
  - School of Biological Sciences, Nanyang Technological University, Singapore 637551, Singapore
8
Li X, Shi G, Chu Q, Jiang W, Liu Y, Zhang S, Zhang Z, Wei Z, He F, Guo Z, Qi L. A qualitative transcriptional signature for the histological reclassification of lung squamous cell carcinomas and adenocarcinomas. BMC Genomics 2019; 20:881. [PMID: 31752667 PMCID: PMC6868745 DOI: 10.1186/s12864-019-6086-2] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 05/02/2019] [Accepted: 09/09/2019] [Indexed: 12/31/2022] Open
Abstract
Background: Targeted therapy for non-small cell lung cancer (NSCLC) is histology dependent. However, histological classification by routine pathological assessment with hematoxylin-eosin staining and immunostaining remains challenging for poorly differentiated tumors, particularly those from small biopsies. Additionally, the effectiveness of immunomarkers is limited by technical inconsistencies of immunostaining and a lack of standardization for staining interpretation. Results: Using gene expression profiles of pathologically determined lung adenocarcinomas and squamous cell carcinomas, denoted as pADC and pSCC respectively, we developed a qualitative transcriptional signature, based on the within-sample relative gene expression orderings (REOs) of gene pairs, to distinguish ADC from SCC. The signature consists of two genes, KRT5 and AGR2, which have the stable REO pattern of KRT5 > AGR2 in pSCC and KRT5 < AGR2 in pADC. In the two test datasets with relatively unambiguous NSCLC types, the apparent accuracies of the signature were 94.44% and 98.41%, respectively. In the other integrated dataset for frozen tissues, the signature reclassified 4.22% of the 805 pADC patients as SCC and 12% of the 125 pSCC patients as ADC. Similar results were observed in clinically challenging cases, including FFPE specimens, mixed tumors, small biopsy specimens and poorly differentiated specimens. Survival analyses showed that the pADC patients reclassified as SCC had significantly shorter overall survival than the signature-confirmed pADC patients (log-rank p = 0.0123, HR = 1.89), consistent with the knowledge that SCC patients have poorer prognoses than ADC patients. The proliferative activity, subtype-specific marker genes and consensus clustering analyses also supported the correctness of our signature. Conclusions: The non-subjective qualitative REO signature could effectively distinguish ADC from SCC and could serve as an auxiliary test for the pathological assessment of ambiguous cases.
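The decision rule stated in this abstract (KRT5 > AGR2 indicates SCC, KRT5 < AGR2 indicates ADC) is simple enough to write down directly. The sketch below is illustrative only: the expression values are made up, the function name is hypothetical, and returning "undetermined" for an exact tie is an assumption, since the abstract does not say how ties are handled.

```python
import math

def classify_nsclc(expr):
    """Apply the within-sample KRT5 vs AGR2 ordering rule to one sample."""
    krt5, agr2 = expr["KRT5"], expr["AGR2"]
    if krt5 > agr2:
        return "SCC"
    if krt5 < agr2:
        return "ADC"
    return "undetermined"  # assumption: the abstract does not define tie handling

# Made-up expression values for a single sample.
sample = {"KRT5": 120.0, "AGR2": 15.0}

# Any increasing transform applied to the whole sample leaves the ordering intact.
log_sample = {gene: math.log2(value + 1) for gene, value in sample.items()}

print(classify_nsclc(sample), classify_nsclc(log_sample))
```

Because the rule compares two genes within the same sample, it gives the same call on raw and log-scaled values, which is one reason such qualitative signatures need no cohort-specific threshold.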
Affiliation(s)
- Xin Li
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150086, China
- Gengen Shi
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150086, China
- Qingsong Chu
  - Fujian Key Laboratory for Translational Research, Institute of Translational Medicine, Fujian Medical University, Fuzhou, 350001, China
- Wenbin Jiang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150086, China
- Yixin Liu
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150086, China
- Sainan Zhang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150086, China
- Zheyang Zhang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150086, China
- Zixin Wei
  - Department of Medical Oncology, Harbin Medical University Cancer hospital, Harbin, 150081, China
- Fei He
  - Key Laboratory of Ministry of Education for Gastrointestinal Cancer, Department of Epidemiology and Health Statistics, School of Public Health, Fujian Medical University, Fuzhou, 350001, China
- Zheng Guo
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150086, China
  - Department of Bioinformatics, Key Laboratory of Ministry of Education for Gastrointestinal Cancer, School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350001, China
  - Key laboratory of Medical Bioinformatics, Fujian Province, Fuzhou, 350001, China
- Lishuang Qi
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150086, China
9
A qualitative transcriptional signature to reclassify estrogen receptor status of breast cancer patients. Breast Cancer Res Treat 2018; 170:271-277. [DOI: 10.1007/s10549-018-4758-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 01/18/2018] [Accepted: 03/13/2018] [Indexed: 12/11/2022]
10
Circumvent the uncertainty in the applications of transcriptional signatures to tumor tissues sampled from different tumor sites. Oncotarget 2018; 8:30265-30275. [PMID: 28427173 PMCID: PMC5444741 DOI: 10.18632/oncotarget.15754] [Citation(s) in RCA: 37] [Impact Index Per Article: 6.2] [Received: 11/25/2016] [Accepted: 01/30/2017] [Indexed: 11/25/2022] Open
Abstract
The expression measurements of thousands of genes are correlated with the proportion of tumor epithelial cells (PTEC) in clinical samples. Thus, for a tumor diagnostic or prognostic signature based on a summarization of expression levels of the signature genes, the risk score for a patient may depend on which tumor site, with its particular PTEC, the tissue was sampled from. Here, we proposed that gene pair signatures based on within-sample relative expression orderings (REOs) should be insensitive to PTEC variations. First, by analyzing paired microdissected tumor epithelial cell and stromal cell samples from 27 cancer patients, we showed that above 80% of gene pairs had consistent REOs between the two cell types, indicating that these REOs would be independent of PTEC in cancer tissues. Then, by simulating tumor tissues with different PTEC using each of the 27 paired samples, we showed that about 90% of the REOs of gene pairs in tumor epithelial cells were maintained in tumor samples even when PTEC decreased to 30%. In particular, the REOs of gene pairs with larger expression differences in tumor epithelial cells tended to be more robust against PTEC variations. Finally, as a case study, we developed a gene pair signature which could robustly distinguish colorectal cancer tissues with various PTEC from normal tissues. We concluded that REO-based signatures are robust against PTEC variations.
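The simulation idea in this abstract, checking how many epithelial-cell REOs survive as PTEC drops, can be sketched as follows. The abstract does not state how the tissues were simulated, so the linear mixing model here is an assumption, as are the gene names, expression values, and function names.

```python
from itertools import combinations

def reo_concordance(reference, other):
    """Fraction of gene pairs whose ordering in `reference` is kept in `other`."""
    genes = sorted(reference)
    pairs = [(a, b) for a, b in combinations(genes, 2) if reference[a] != reference[b]]
    kept = sum(
        1 for a, b in pairs
        if (reference[a] > reference[b]) == (other[a] > other[b])
    )
    return kept / len(pairs)

def mix(epithelial, stromal, ptec):
    """Assumed linear mixture: ptec is the proportion of tumor epithelial cells."""
    return {g: ptec * epithelial[g] + (1 - ptec) * stromal[g] for g in epithelial}

# Made-up expression profiles for four placeholder genes.
epithelial = {"g1": 10.0, "g2": 8.0, "g3": 2.0, "g4": 1.0}
stromal = {"g1": 9.0, "g2": 3.0, "g3": 7.0, "g4": 0.5}

for ptec in (1.0, 0.7, 0.3):
    print(ptec, reo_concordance(epithelial, mix(epithelial, stromal, ptec)))
```

In this toy example all six orderings are kept at a PTEC of 0.7, and one pair (g2 versus g3) flips at 0.3 because the two genes are ordered differently in the stromal profile.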
11
Chen R, Guan Q, Cheng J, He J, Liu H, Cai H, Hong G, Zhang J, Li N, Ao L, Guo Z. Robust transcriptional tumor signatures applicable to both formalin-fixed paraffin-embedded and fresh-frozen samples. Oncotarget 2018; 8:6652-6662. [PMID: 28036264 PMCID: PMC5351660 DOI: 10.18632/oncotarget.14257] [Citation(s) in RCA: 49] [Impact Index Per Article: 8.2] [Received: 10/27/2016] [Accepted: 12/02/2016] [Indexed: 12/19/2022] Open
Abstract
Formalin-fixed paraffin-embedded (FFPE) samples represent a valuable resource for clinical research. However, FFPE samples are usually considered an unreliable source for gene expression analysis due to partial RNA degradation. In this study, by comparing gene expression profiles between FFPE samples and paired fresh-frozen (FF) samples for three cancer types, we first showed that the expression measurements of thousands of genes changed at least two-fold in FFPE samples compared with paired FF samples. Therefore, for a transcriptional signature based on risk scores summarized from the expression levels of the signature genes, the risk score thresholds trained from FFPE (or FF) samples could not be applied to FF (or FFPE) samples. On the other hand, we found that more than 90% of the relative expression orderings (REOs) of gene pairs in the FF samples were maintained in their paired FFPE samples and were largely unaffected by the storage time. This result suggested that the REOs of gene pairs were highly robust against partial RNA degradation in FFPE samples. Finally, as a case study, we developed an REO-based signature to distinguish liver cirrhosis from hepatocellular carcinoma (HCC) using FFPE samples. The signature was validated in four datasets of FFPE samples and eight datasets of FF samples. In conclusion, the valuable FFPE samples can be fully exploited to identify REO-based diagnostic and prognostic signatures which could be robustly applicable to both FF samples and FFPE samples with degraded RNA.
Affiliation(s)
- Rou Chen
  - Key Laboratory of Ministry of Education for Gastrointestinal Cancer, Department of Bioinformatics, Fujian Medical University, Fuzhou 350001, China
- Qingzhou Guan
  - Key Laboratory of Ministry of Education for Gastrointestinal Cancer, Department of Bioinformatics, Fujian Medical University, Fuzhou 350001, China
- Jun Cheng
  - Key Laboratory of Ministry of Education for Gastrointestinal Cancer, Department of Bioinformatics, Fujian Medical University, Fuzhou 350001, China
- Jun He
  - Key Laboratory of Ministry of Education for Gastrointestinal Cancer, Department of Bioinformatics, Fujian Medical University, Fuzhou 350001, China
- Huaping Liu
  - Key Laboratory of Ministry of Education for Gastrointestinal Cancer, Department of Bioinformatics, Fujian Medical University, Fuzhou 350001, China
- Hao Cai
  - Key Laboratory of Ministry of Education for Gastrointestinal Cancer, Department of Bioinformatics, Fujian Medical University, Fuzhou 350001, China
- Guini Hong
  - Key Laboratory of Ministry of Education for Gastrointestinal Cancer, Department of Bioinformatics, Fujian Medical University, Fuzhou 350001, China
- Jiahui Zhang
  - Key Laboratory of Ministry of Education for Gastrointestinal Cancer, Department of Bioinformatics, Fujian Medical University, Fuzhou 350001, China
- Na Li
  - Key Laboratory of Ministry of Education for Gastrointestinal Cancer, Department of Bioinformatics, Fujian Medical University, Fuzhou 350001, China
- Lu Ao
  - Key Laboratory of Ministry of Education for Gastrointestinal Cancer, Department of Bioinformatics, Fujian Medical University, Fuzhou 350001, China
- Zheng Guo
  - Key Laboratory of Ministry of Education for Gastrointestinal Cancer, Department of Bioinformatics, Fujian Medical University, Fuzhou 350001, China
12
|
Discriminating cancer-related and cancer-unrelated chemoradiation-response genes for locally advanced rectal cancers. Sci Rep 2016; 6:36935. [PMID: 27845363 PMCID: PMC5109405 DOI: 10.1038/srep36935] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 08/26/2016] [Accepted: 10/24/2016] [Indexed: 02/07/2023]
Abstract
For patients with locally advanced rectal cancer (LARC) treated with preoperation chemoradiation (pCRT), identifying differentially expressed (DE) genes between non-responders and responders is a common approach for investigating mechanisms of chemoradiation resistance. However, some of such DE genes might be irrelevant to cancer itself but simply reflect the pharmacokinetic differences of the normal tissues. In this study, we adopted the RankComp algorithm to identify DE genes for each of LARC sample compared with its own normal state. Then, we identified genes with significantly different deregulation frequencies between the non-responders and responders, defined as cancer-related pCRT-response genes. Pathway enrichment and protein-protein interaction analyses showed that these genes specifically and intensively interacted with currently known effective genes of pCRT, involving in DNA replication, cell cycle and DNA repair. In contrast, after excluding the cancer-related pCRT-response genes, the other DE genes between non-responders and responders were enriched in many pathways of drug and protein metabolisms and transports, and interacted with both the known effective genes and pharmacokinetic genes. Hence, these two types of DE genes should be distinguished for investigating mechanisms of pCRT response in LARCs.
|
13
|
Guan Q, Chen R, Yan H, Cai H, Guo Y, Li M, Li X, Tong M, Ao L, Li H, Hong G, Guo Z. Differential expression analysis for individual cancer samples based on robust within-sample relative gene expression orderings across multiple profiling platforms. Oncotarget 2016; 7:68909-68920. [PMID: 27634898 PMCID: PMC5356599 DOI: 10.18632/oncotarget.11996] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.9] [Received: 02/23/2016] [Accepted: 08/09/2016] [Indexed: 11/25/2022]
Abstract
The highly stable within-sample relative expression orderings (REOs) of gene pairs in a particular type of human normal tissue are widely reversed in the cancer condition. Based on this finding, we have recently proposed an algorithm named RankComp to detect differentially expressed genes (DEGs) for individual disease samples measured by a particular platform. In this paper, with 461 normal lung tissue samples separately measured by four commonly used platforms, we demonstrated that tens of millions of gene pairs with significantly stable REOs in normal lung tissue can be consistently detected in samples measured by different platforms. However, about 20% of stable REOs commonly detected by two different platforms (e.g., Affymetrix and Illumina platforms) showed inconsistent REO patterns due to the differences in probe design principles. Based on the significantly stable REOs (FDR<0.01) for normal lung tissue consistently detected by the four platforms, which tended to have large rank differences, RankComp detected averagely 1184, 1335 and 1116 DEGs per sample with averagely 96.51%, 95.95% and 94.78% precisions in three evaluation datasets with 25, 57 and 58 paired lung cancer and normal samples, respectively. Individualized pathway analysis revealed some common and subtype-specific functional mechanisms of lung cancer. Similar results were observed for colorectal cancer. In conclusion, based on the cross-platform significantly stable REOs for a particular normal tissue, differentially expressed genes and pathways in any disease sample measured by any of the platforms can be readily and accurately detected, which could be further exploited for dissecting the heterogeneity of cancer.
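The idea of keeping only REOs that are stable within normal samples and consistent across platforms can be sketched as follows. This is a simplified illustration, not the published procedure: the function names, the fraction threshold `min_fraction` (used here in place of the FDR-controlled significance test mentioned in the abstract), and the toy data are all assumptions.

```python
from itertools import combinations


def stable_reos(samples, min_fraction=0.99):
    """Return {(g, h): direction} for gene pairs whose ordering is the same
    in at least min_fraction of the samples (+1: g > h, -1: g < h)."""
    genes = sorted(samples[0])
    n = len(samples)
    stable = {}
    for g, h in combinations(genes, 2):
        up = sum(1 for s in samples if s[g] > s[h])
        down = sum(1 for s in samples if s[g] < s[h])
        if up / n >= min_fraction:
            stable[(g, h)] = 1
        elif down / n >= min_fraction:
            stable[(g, h)] = -1
    return stable


def consistent_across_platforms(stable_a, stable_b):
    """Keep pairs that are stable on both platforms with the same direction."""
    return {pair: d for pair, d in stable_a.items() if stable_b.get(pair) == d}


# Hypothetical normal-tissue profiles from two platforms.
platform_a = [{"A": 9, "B": 5, "C": 1}, {"A": 8, "B": 6, "C": 2}]
platform_b = [{"A": 7, "B": 3, "C": 4}, {"A": 9, "B": 2, "C": 5}]

stable_a = stable_reos(platform_a)
stable_b = stable_reos(platform_b)
shared = consistent_across_platforms(stable_a, stable_b)
print(shared)  # the (B, C) pair is dropped: its direction differs by platform
```

In the toy data the pair (B, C) is stable on each platform but in opposite directions, which mirrors the platform-inconsistent REOs that the abstract attributes to probe design differences.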
Affiliation(s)
- Qingzhou Guan
  - Key Laboratory of Ministry of Education for Gastrointestinal Cancer, Department of Bioinformatics, Fujian Medical University, Fuzhou, 350001, China
- Rou Chen
  - Key Laboratory of Ministry of Education for Gastrointestinal Cancer, Department of Bioinformatics, Fujian Medical University, Fuzhou, 350001, China
- Haidan Yan
  - Key Laboratory of Ministry of Education for Gastrointestinal Cancer, Department of Bioinformatics, Fujian Medical University, Fuzhou, 350001, China
- Hao Cai
  - Key Laboratory of Ministry of Education for Gastrointestinal Cancer, Department of Bioinformatics, Fujian Medical University, Fuzhou, 350001, China
- You Guo
  - Key Laboratory of Ministry of Education for Gastrointestinal Cancer, Department of Bioinformatics, Fujian Medical University, Fuzhou, 350001, China
  - Department of Preventive Medicine, School of Basic Medicine Sciences, Gannan Medical University, Ganzhou, 341000, China
- Mengyao Li
  - Key Laboratory of Ministry of Education for Gastrointestinal Cancer, Department of Bioinformatics, Fujian Medical University, Fuzhou, 350001, China
- Xiangyu Li
  - Key Laboratory of Ministry of Education for Gastrointestinal Cancer, Department of Bioinformatics, Fujian Medical University, Fuzhou, 350001, China
- Mengsha Tong
  - Key Laboratory of Ministry of Education for Gastrointestinal Cancer, Department of Bioinformatics, Fujian Medical University, Fuzhou, 350001, China
- Lu Ao
  - Key Laboratory of Ministry of Education for Gastrointestinal Cancer, Department of Bioinformatics, Fujian Medical University, Fuzhou, 350001, China
- Hongdong Li
  - Key Laboratory of Ministry of Education for Gastrointestinal Cancer, Department of Bioinformatics, Fujian Medical University, Fuzhou, 350001, China
- Guini Hong
  - Key Laboratory of Ministry of Education for Gastrointestinal Cancer, Department of Bioinformatics, Fujian Medical University, Fuzhou, 350001, China
- Zheng Guo
  - Key Laboratory of Ministry of Education for Gastrointestinal Cancer, Department of Bioinformatics, Fujian Medical University, Fuzhou, 350001, China
|
14
|
Guan X, Yi Y, Huang Y, Hu Y, Li X, Wang X, Fan H, Wang G, Wang D. Revealing potential molecular targets bridging colitis and colorectal cancer based on multidimensional integration strategy. Oncotarget 2015; 6:37600-12. [PMID: 26461477 PMCID: PMC4741951 DOI: 10.18632/oncotarget.6067] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 08/16/2015] [Accepted: 09/24/2015] [Indexed: 02/05/2023]
Abstract
Chronic inflammation may play a vital role in the pathogenesis of inflammation-associated tumors. However, the underlying mechanisms bridging ulcerative colitis (UC) and colorectal cancer (CRC) remain unclear. Here, we integrated multidimensional interaction resources, including gene expression profiling, protein-protein interactions (PPIs), transcriptional and post-transcriptional regulation data, and virus-host interactions, to tentatively explore potential molecular targets that functionally link UC and CRC at a systematic level. In this work, by deciphering the overlapping genes, crosstalking genes and pivotal regulators of both UC- and CRC-associated functional module pairs, we revealed a variety of genes (including FOS and DUSP1, etc.), transcription factors (including SMAD3 and ETS1, etc.) and miRNAs (including miR-155 and miR-196b, etc.) that may have the potential to complete the connections between UC and CRC. Interestingly, further analyses of the virus-host interaction network demonstrated that several virus proteins (including EBNA-LP of EBV and protein E7 of HPV) frequently inter-connected to UC- and CRC-associated module pairs with their validated targets significantly enriched in both modules of the host. Together, our results suggested that multidimensional integration strategy provides a novel approach to discover potential molecular targets that bridge the connections between UC and CRC, which could also be extensively applied to studies on other inflammation-related cancers.
Affiliation(s)
- Xu Guan
  - Department of Colorectal Cancer Surgery, the Second Affiliated Hospital of Harbin Medical University, Harbin, China
- Ying Yi
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Yan Huang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Yongfei Hu
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Xiaobo Li
  - Department of Pathology, Harbin Medical University, Harbin, China
- Xishan Wang
  - Department of Colorectal Cancer Surgery, the Second Affiliated Hospital of Harbin Medical University, Harbin, China
- Huihui Fan
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Guiyu Wang
  - Department of Colorectal Cancer Surgery, the Second Affiliated Hospital of Harbin Medical University, Harbin, China
- Dong Wang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
  - Department of Biochemistry and Molecular Biology, Shantou University Medical College, Shantou, China
|
15
|
A functional module-based exploration between inflammation and cancer in esophagus. Sci Rep 2015; 5:15340. [PMID: 26489668 PMCID: PMC4614801 DOI: 10.1038/srep15340] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 07/08/2015] [Accepted: 09/23/2015] [Indexed: 12/26/2022]
Abstract
Inflammation contributing to the underlying progression of diverse human cancers has been generally appreciated, however, explorations into the molecular links between inflammation and cancer in esophagus are still at its early stage. In our study, we presented a functional module-based approach, in combination with multiple data resource (gene expression, protein-protein interactions (PPI), transcriptional and post-transcriptional regulations) to decipher the underlying links. Via mapping differentially expressed disease genes, functional disease modules were identified. As indicated, those common genes and interactions tended to play important roles in linking inflammation and cancer. Based on crosstalk analysis, we demonstrated that, although most disease genes were not shared by both kinds of modules, they might act through participating in the same or similar functions to complete the molecular links. Additionally, we applied pivot analysis to extract significant regulators for per significant crosstalk module pair. As shown, pivot regulators might manipulate vital parts of the module subnetworks, and then work together to bridge inflammation and cancer in esophagus. Collectively, based on our functional module analysis, we demonstrated that shared genes or interactions, significant crosstalk modules, and those significant pivot regulators were served as different functional parts underlying the molecular links between inflammation and cancer in esophagus.
|
16
|
Wang H, Cai H, Ao L, Yan H, Zhao W, Qi L, Gu Y, Guo Z. Individualized identification of disease-associated pathways with disrupted coordination of gene expression. Brief Bioinform 2015; 17:78-87. [PMID: 26023086 DOI: 10.1093/bib/bbv030] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 01/22/2015] [Indexed: 01/08/2023]
Abstract
Current pathway analysis approaches are primarily dedicated to capturing deregulated pathways at the population level and cannot provide patient-specific pathway deregulation information. In this article, the authors present a simple approach, called individPath, to detect pathways with significantly disrupted intra-pathway relative expression orderings for each disease sample compared with the stable, normal intra-pathway relative expression orderings pre-determined in previously accumulated normal samples. Through the analysis of multiple microarray data sets for lung and breast cancer, the authors demonstrate individPath's effectiveness for detecting cancer-associated pathways with disrupted relative expression orderings at the individual level and dissecting the heterogeneity of pathway deregulation among different patients. The portable use of this simple approach in clinical contexts is exemplified by the identification of prognostic intra-pathway gene pair signatures to predict overall survival of resected early-stage lung adenocarcinoma patients and signatures to predict relapse-free survival of estrogen receptor-positive breast cancer patients after tamoxifen treatment.
|
17
|
Wang H, Sun Q, Zhao W, Qi L, Gu Y, Li P, Zhang M, Li Y, Liu SL, Guo Z. Individual-level analysis of differential expression of genes and pathways for personalized medicine. Bioinformatics 2014; 31:62-8. [PMID: 25165092 DOI: 10.1093/bioinformatics/btu522] [Citation(s) in RCA: 91] [Impact Index Per Article: 9.1] [Indexed: 12/26/2022]
Abstract
MOTIVATION: The differential expression analysis focusing on inter-group comparison can capture only differentially expressed genes (DE genes) at the population level, which may mask the heterogeneity of differential expression in individuals. Thus, to provide patient-specific information for personalized medicine, it is necessary to conduct differential expression analysis at the individual level. RESULTS: We proposed a method to detect DE genes in individual disease samples by using the disrupted ordering in individual disease samples. In both simulated data and real paired cancer-normal sample data, this method showed excellent performance. It was found to be insensitive to experimental batch effects and data normalization. The landscape of stable gene pairs in a particular type of normal tissue could be predetermined using previously accumulated data, based on which dysregulated genes and pathways for any disease sample can be readily detected. The usefulness of the RankComp method in clinical settings was exemplified by the identification and application of prognostic markers for lung cancer. AVAILABILITY AND IMPLEMENTATION: RankComp is implemented in R script that is freely available from Supplementary Materials.
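The notion of "disrupted ordering" in a single disease sample can be illustrated with a short sketch. This is not the RankComp R script referred to in the abstract: it only lists the gene pairs whose predetermined normal ordering is reversed in one sample and counts how often each gene is involved, and it omits the statistical testing that the published method uses to call DE genes. The function names and toy values are hypothetical.

```python
from collections import Counter


def reversed_pairs(stable, sample):
    """List gene pairs whose ordering in `sample` is opposite to the
    predetermined normal ordering (+1: first gene higher, -1: lower)."""
    out = []
    for (g, h), direction in stable.items():
        diff = sample[g] - sample[h]
        if diff * direction < 0:
            out.append((g, h))
    return out


def reversal_counts(pairs):
    """Count how many reversed pairs each gene takes part in."""
    counts = Counter()
    for g, h in pairs:
        counts[g] += 1
        counts[h] += 1
    return counts


# Hypothetical stable normal orderings: A > B, A > C, B > C.
stable = {("A", "B"): 1, ("A", "C"): 1, ("B", "C"): 1}
tumor = {"A": 2.0, "B": 6.0, "C": 4.0}  # one hypothetical disease sample

pairs = reversed_pairs(stable, tumor)
counts = reversal_counts(pairs)
print(pairs)   # [('A', 'B'), ('A', 'C')]
print(counts)  # gene A is involved in both reversals
```

In this toy sample, gene A takes part in every reversed pair, which makes it the natural candidate for a down-regulated gene; a real analysis would need a significance test before drawing that conclusion.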
Affiliation(s)
- Hongwei Wang
  - College of Bioinformatics Science and Technology, Genomics Research Center, Harbin Medical University, Harbin 150086, China, Department of Microbiology and Infectious Diseases, University of Calgary, Calgary, AB, T2N 4N1, Canada and Bioinformatics Department, Basic Medical College, Fujian Medical University, Fuzhou 350004, China
- Qiang Sun
  - College of Bioinformatics Science and Technology, Genomics Research Center, Harbin Medical University, Harbin 150086, China, Department of Microbiology and Infectious Diseases, University of Calgary, Calgary, AB, T2N 4N1, Canada and Bioinformatics Department, Basic Medical College, Fujian Medical University, Fuzhou 350004, China
- Wenyuan Zhao
  - College of Bioinformatics Science and Technology, Genomics Research Center, Harbin Medical University, Harbin 150086, China, Department of Microbiology and Infectious Diseases, University of Calgary, Calgary, AB, T2N 4N1, Canada and Bioinformatics Department, Basic Medical College, Fujian Medical University, Fuzhou 350004, China
- Lishuang Qi
  - College of Bioinformatics Science and Technology, Genomics Research Center, Harbin Medical University, Harbin 150086, China, Department of Microbiology and Infectious Diseases, University of Calgary, Calgary, AB, T2N 4N1, Canada and Bioinformatics Department, Basic Medical College, Fujian Medical University, Fuzhou 350004, China
- Yunyan Gu
  - College of Bioinformatics Science and Technology, Genomics Research Center, Harbin Medical University, Harbin 150086, China, Department of Microbiology and Infectious Diseases, University of Calgary, Calgary, AB, T2N 4N1, Canada and Bioinformatics Department, Basic Medical College, Fujian Medical University, Fuzhou 350004, China
- Pengfei Li
  - College of Bioinformatics Science and Technology, Genomics Research Center, Harbin Medical University, Harbin 150086, China, Department of Microbiology and Infectious Diseases, University of Calgary, Calgary, AB, T2N 4N1, Canada and Bioinformatics Department, Basic Medical College, Fujian Medical University, Fuzhou 350004, China
- Mengmeng Zhang
  - College of Bioinformatics Science and Technology, Genomics Research Center, Harbin Medical University, Harbin 150086, China, Department of Microbiology and Infectious Diseases, University of Calgary, Calgary, AB, T2N 4N1, Canada and Bioinformatics Department, Basic Medical College, Fujian Medical University, Fuzhou 350004, China
- Yang Li
  - College of Bioinformatics Science and Technology, Genomics Research Center, Harbin Medical University, Harbin 150086, China, Department of Microbiology and Infectious Diseases, University of Calgary, Calgary, AB, T2N 4N1, Canada and Bioinformatics Department, Basic Medical College, Fujian Medical University, Fuzhou 350004, China
- Shu-Lin Liu
  - College of Bioinformatics Science and Technology, Genomics Research Center, Harbin Medical University, Harbin 150086, China, Department of Microbiology and Infectious Diseases, University of Calgary, Calgary, AB, T2N 4N1, Canada and Bioinformatics Department, Basic Medical College, Fujian Medical University, Fuzhou 350004, China
- Zheng Guo
  - College of Bioinformatics Science and Technology, Genomics Research Center, Harbin Medical University, Harbin 150086, China, Department of Microbiology and Infectious Diseases, University of Calgary, Calgary, AB, T2N 4N1, Canada and Bioinformatics Department, Basic Medical College, Fujian Medical University, Fuzhou 350004, China
|
18
|
Wu D, Kang J, Huang Y, Li X, Wang X, Huang D, Wang Y, Li B, Hao D, Gu Q, Tang N, Li K, Guo Z, Li X, Xu J, Wang D. Deciphering global signal features of high-throughput array data from cancers. Molecular BioSystems 2014; 10:1549-56. [PMID: 24695970 DOI: 10.1039/c4mb00084f] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Indexed: 12/20/2022]
Abstract
Normalization of array data relies on the assumption that most genes are not altered, which means that the signals for different samples should be scaled to have similar median or average values. However, accumulating evidence suggests that gene expression could be widely up-regulated in cancers. Our previous results and subsequent findings have shown that violation of the assumption led to erroneous interpretation of microarray data. To decipher the global signal features of microarray data from cancer samples, we empirically evaluated a large collection of gene and miRNA expression profiles and copy-number variation arrays. Our results showed that, at the transcriptomic level, genes and miRNAs are widely over-expressed in a large proportion of cancers. In contrast, at the genomic level, global raw signal intensities for methylation and copy number variation show negligible differences between cancer and normal samples. These results force us to re-evaluate the proper use of normalization procedures under different experimental conditions and for different array platforms.
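The consequence of the normalization assumption discussed in this abstract can be shown with a toy calculation. The sketch below assumes a hypothetical sample in which every gene is expressed at twice the normal level, then applies simple median scaling; the function name and values are illustrative only.

```python
from statistics import median


def median_scale(sample, target_median):
    """Rescale a sample so that its median equals target_median."""
    m = median(sample)
    return [x * target_median / m for x in sample]


normal = [10.0, 20.0, 30.0, 40.0, 50.0]
cancer = [2 * x for x in normal]  # hypothetical global 2-fold up-regulation

scaled_cancer = median_scale(cancer, median(normal))
print(scaled_cancer)  # [10.0, 20.0, 30.0, 40.0, 50.0]
```

After scaling, the cancer sample is indistinguishable from the normal one, so a genuine global increase in expression is removed by a procedure that assumes most genes are unchanged.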
Affiliation(s)
- Deng Wu
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China.
|
19
|
Zhao W, Qi L, Qin Y, Wang H, Chen B, Wang R, Gu Y, Liu C, Wang C, Guo Z. Functional comparison between genes dysregulated in ulcerative colitis and colorectal carcinoma. PLoS One 2013; 8:e71989. [PMID: 23991021 PMCID: PMC3750042 DOI: 10.1371/journal.pone.0071989] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 03/25/2013] [Accepted: 07/05/2013] [Indexed: 01/25/2023]
Abstract
Background: Patients with ulcerative colitis (UC) are predisposed to colitis-associated colorectal cancer (CAC). However, the transcriptional mechanism of the transformation from UC to CAC is not fully understood. Methodology: Firstly, we showed that CAC and non-UC-associated CRC were very similar in gene expression. Secondly, based on multiple datasets for UC and CRC, we extracted differentially expressed (DE) genes in UC and CRC versus normal controls, respectively. Thirdly, we compared the dysregulation directions (upregulation or downregulation) between DE genes of UC and CRC in CRC-related functions overrepresented with the DE genes of CRC, and proposed a regulatory model to explain the CRC-like dysregulation of genes in UC. A case study for “positive regulation of immune system process” was done to reveal the functional implication of DE genes with reversal dysregulations in these two diseases. Principal Findings: In all the 44 detected CRC-related functions except for “viral transcription”, the dysregulation directions of DE genes in UC were significantly similar with their counterparts in CRC, and such CRC-like dysregulation in UC could be regulated by transcription factors affected by pro-inflammatory stimuli for colitis. A small portion of genes in each CRC-related function were dysregulated in opposite directions in the two diseases. The case study showed that genes related to humoral immunity specifically expressed in B cells tended to be upregulated in UC but downregulated in CRC. Conclusions: The CRC-like dysregulation of genes in CRC-related functions in UC patients provides hints for understanding the transcriptional basis for UC to CRC transition. A small portion of genes with distinct dysregulation directions in each of the CRC-related functions in the two diseases implicate that their reversal dysregulations might be critical for UC to CRC transition. The case study indicates that the humoral immune response might be inhibited during the transformation from UC to CRC.
Affiliation(s)
- Wenyuan Zhao
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Lishuang Qi
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Yao Qin
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Hongwei Wang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Beibei Chen
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Ruiping Wang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Yunyan Gu
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Chunyang Liu
  - Department of Bioinformatics, School of Basic Medical Sciences, Fujian Medical University, Fuzhou, China
- Chenguang Wang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
  - * E-mail: (ZG); (CW)
- Zheng Guo
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
  - Department of Bioinformatics, School of Basic Medical Sciences, Fujian Medical University, Fuzhou, China
  - * E-mail: (ZG); (CW)
|
20
|
Chen M, Xiao J, Zhang Z, Liu J, Wu J, Yu J. Identification of human HK genes and gene expression regulation study in cancer from transcriptomics data analysis. PLoS One 2013; 8:e54082. [PMID: 23382867 PMCID: PMC3561342 DOI: 10.1371/journal.pone.0054082] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Received: 07/19/2012] [Accepted: 12/06/2012] [Indexed: 11/23/2022]
Abstract
The regulation of gene expression is essential for eukaryotes, as it drives the processes of cellular differentiation and morphogenesis, leading to the creation of different cell types in multicellular organisms. RNA-Sequencing (RNA-Seq) provides researchers with a powerful toolbox for characterization and quantification of transcriptome. Many different human tissue/cell transcriptome datasets coming from RNA-Seq technology are available on public data resource. The fundamental issue here is how to develop an effective analysis method to estimate expression pattern similarities between different tumor tissues and their corresponding normal tissues. We define the gene expression pattern from three directions: 1) expression breadth, which reflects gene expression on/off status, and mainly concerns ubiquitously expressed genes; 2) low/high or constant/variable expression genes, based on gene expression level and variation; and 3) the regulation of gene expression at the gene structure level. The cluster analysis indicates that gene expression pattern is higher related to physiological condition rather than tissue spatial distance. Two sets of human housekeeping (HK) genes are defined according to cell/tissue types, respectively. To characterize the gene expression pattern in gene expression level and variation, we firstly apply improved K-means algorithm and a gene expression variance model. We find that cancer-associated HK genes (a HK gene is specific in cancer group, while not in normal group) are expressed higher and more variable in cancer condition than in normal condition. Cancer-associated HK genes prefer to AT-rich genes, and they are enriched in cell cycle regulation related functions and constitute some cancer signatures. The expression of large genes is also avoided in cancer group. These studies will help us understand which cell type-specific patterns of gene expression differ among different cell types, and particularly for cancer.
Affiliation(s)
- Meili Chen
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
  - Graduate University of Chinese Academy of Sciences, Beijing, China
- Jingfa Xiao
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
- Zhang Zhang
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
- Jingxing Liu
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
  - Graduate University of Chinese Academy of Sciences, Beijing, China
- Jiayan Wu
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
  - * E-mail: (JW); (JY)
- Jun Yu
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
  - * E-mail: (JW); (JY)
|
21
|
Wang D, Zhang Y, Huang Y, Li P, Wang M, Wu R, Cheng L, Zhang W, Zhang Y, Li B, Wang C, Guo Z. Comparison of different normalization assumptions for analyses of DNA methylation data from the cancer genome. Gene 2012; 506:36-42. [PMID: 22771920 DOI: 10.1016/j.gene.2012.06.075] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Received: 10/21/2011] [Revised: 06/21/2012] [Accepted: 06/22/2012] [Indexed: 01/02/2023]
Abstract
Nowadays, some researchers normalized DNA methylation arrays data in order to remove the technical artifacts introduced by experimental differences in sample preparation, array processing and other factors. However, other researchers analyzed DNA methylation arrays without performing data normalization considering that current normalizations for methylation data may distort real differences between normal and cancer samples because cancer genomes may be extensively subject to hypomethylation and the total amount of CpG methylation might differ substantially among samples. In this study, using eight datasets by Infinium HumanMethylation27 assay, we systemically analyzed the global distribution of DNA methylation changes in cancer compared to normal control and its effect on data normalization for selecting differentially methylated (DM) genes. We showed more differentially methylated (DM) genes could be found in the Quantile/Lowess-normalized data than in the non-normalized data. We found the DM genes additionally selected in the Quantile/Lowess-normalized data showed significantly consistent methylation states in another independent dataset for the same cancer, indicating these extra DM genes were effective biological signals related to the disease. These results suggested normalization can increase the power of detecting DM genes in the context of diagnostic markers which were usually characterized by relatively large effect sizes. Besides, we evaluated the reproducibility of DM discoveries for a particular cancer type, and we found most of the DM genes additionally detected in one dataset showed the same methylation directions in the other dataset for the same cancer type, indicating that these DM genes were effective biological signals in the other dataset. Furthermore, we showed that some DM genes detected from different studies for a particular cancer type were significantly reproducible at the functional level.
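For readers unfamiliar with the quantile normalization mentioned in this abstract, the following is a minimal sketch of the general technique. It is not the pipeline used in the study: the function name, the tie handling (ties are broken by position, a simplification) and the toy methylation values are assumptions.

```python
def quantile_normalize(samples):
    """Quantile-normalize equal-length samples (one list per sample).

    Each value is replaced by the mean, across samples, of the values
    that share its rank. Ties are broken by position (a simplification).
    """
    n = len(samples[0])
    sorted_samples = [sorted(s) for s in samples]
    # Reference distribution: mean of the i-th smallest value of each sample.
    reference = [sum(s[i] for s in sorted_samples) / len(samples) for i in range(n)]
    normalized = []
    for s in samples:
        order = sorted(range(n), key=lambda i: s[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


# Hypothetical methylation values for three CpG sites in two samples.
sample_1 = [0.1, 0.5, 0.9]
sample_2 = [0.4, 0.2, 0.6]

norm_1, norm_2 = quantile_normalize([sample_1, sample_2])
print(norm_1)
print(norm_2)
```

After normalization both samples share the same set of values, and only the within-sample ranks are retained. This is why the procedure can remove technical differences, and also why it can mask a genuine global shift such as the widespread hypomethylation that the abstract raises as a concern.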
Affiliation(s)
- Dong Wang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China.
|