1
Huang X, Zheng S, Li S, Huang Y, Zhang W, Liu F, Cao Q. Machine learning-based pathomics model predicts ANGPT2 expression and prognosis in hepatocellular carcinoma. The American Journal of Pathology 2024:S0002-9440(24)00478-4. [PMID: 39746507 DOI: 10.1016/j.ajpath.2024.12.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2024] [Revised: 11/05/2024] [Accepted: 12/04/2024] [Indexed: 01/04/2025]
Abstract
Angiopoietin 2 (ANGPT2) is a promising prognostic marker and therapeutic target in hepatocellular carcinoma (HCC). However, assessing ANGPT2 expression and prognosis from histopathological images with the naked eye is challenging. In this study, machine learning was employed to develop a pathomics model that analyzes histopathological images to predict ANGPT2 status. A total of 267 cases obtained from TCGA-HCC were divided into training and testing sets, and 91 cases from a single center were used as a validation set. ANGPT2 was shown to be up-regulated in HCC, and patients with high ANGPT2 expression had a significant decline in overall survival (OS) in the TCGA-HCC cohort. Histopathological features in the training set were extracted, screened, and incorporated into a gradient boosting machine (GBM) model that generated a pathomics score (PS), which successfully identified the ANGPT2 expression level in all three sets and showed remarkable risk stratification for OS in the TCGA-HCC cohort (P < 0.0001) and the single-center cohort (P = 0.001). Multivariate analysis suggested that PS could serve as a predictor of prognosis (P < 0.001). Bioinformatics analysis revealed differences in enriched pathways related to tumor growth and development, in the expression of VEGF-related genes, and in immune cell infiltration across PS values. Our research indicates that histopathological image features can enhance prediction of molecular status and prognosis in HCC. The integration of image features with machine learning has potential for improving prognosis prediction in HCC.
Affiliation(s)
- Xinyi Huang
- Department of Pathology, The First Affiliated Hospital, Sun Yat-sen University. Guangzhou, 510080, China
- Shuang Zheng
- Department of Pathology, The First Affiliated Hospital, Sun Yat-sen University. Guangzhou, 510080, China; Department of Pathology, The Seventh Affiliated Hospital, Sun Yat-sen University. Shenzhen, 518107, China
- Shuqi Li
- Department of Pathology, The First Affiliated Hospital, Sun Yat-sen University. Guangzhou, 510080, China
- Yu Huang
- Department of Pathology, The First Affiliated Hospital, Sun Yat-sen University. Guangzhou, 510080, China
- Wenhui Zhang
- Department of Pathology, The First Affiliated Hospital, Sun Yat-sen University. Guangzhou, 510080, China
- Fang Liu
- State Key Laboratory of Organ Failure Research, Guangdong Provincial Key Laboratory of Viral Hepatitis Research, Department of Infectious Diseases, Department of Liver Tumor Center, Nanfang Hospital, Southern Medical University, Guangzhou, 510510, China; Department of Oncology, Nanfang Hospital, Southern Medical University, Guangzhou, 510510, China.
- Qinghua Cao
- Department of Pathology, The First Affiliated Hospital, Sun Yat-sen University. Guangzhou, 510080, China.
2
Gong S, Wang Q, Huang J, Huang R, Chen S, Cheng X, Liu L, Dai X, Zhong Y, Fan C, Liao Z. LC-MS/MS platform-based serum untargeted screening reveals the diagnostic biomarker panel and molecular mechanism of breast cancer. Methods 2024; 222:100-111. [PMID: 38228196 DOI: 10.1016/j.ymeth.2024.01.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2023] [Revised: 10/12/2023] [Accepted: 01/11/2024] [Indexed: 01/18/2024]
Abstract
BACKGROUND Breast cancer (BC), the most common form of malignant cancer affecting women worldwide, is characterized by heterogeneous metabolic disorder and a lack of effective biomarkers for diagnosis. The purpose of this study is to search for reliable metabolite biomarkers of BC as well as triple-negative breast cancer (TNBC) using a serum metabolomics approach. METHODS In this study, an untargeted metabolomics technique based on ultra-high performance liquid chromatography combined with mass spectrometry (UHPLC-MS) was utilized to investigate the differences in serum metabolic profile between the BC group (n = 53) and non-BC group (n = 57), as well as between TNBC patients (n = 23) and non-TNBC subjects (n = 30). Multivariate data analysis, fold-change determination, and the Mann-Whitney U test were used to screen out the differential metabolites. Additionally, machine learning methods including receiver operating characteristic curve analysis and logistic regression analysis were conducted to establish diagnostic biomarker panels. RESULTS There were 36 metabolites found to be significantly different between BC and non-BC groups, and 12 metabolites found to be significantly different between TNBC and non-TNBC patients. Results also showed that four metabolites, N-acetyl-D-tryptophan, 2-arachidonoylglycerol, pipecolic acid, and oxoglutaric acid, were considered vital biomarkers for discriminating BC from non-BC, with an area under the curve (AUC) of 0.995. Another two-metabolite panel of N-acetyl-D-tryptophan and 2-arachidonoylglycerol discriminated TNBC from non-TNBC with an AUC of 0.965. CONCLUSION This study demonstrated that serum metabolomics can be used to identify BC specifically and identified promising serum metabolic markers for TNBC diagnosis.
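The abstract above uses both the Mann-Whitney U test and the area under the ROC curve. The two are closely related: the AUC equals the U statistic of the positive group divided by the number of positive-negative pairs. The following is a minimal sketch of that relationship, not the authors' pipeline; the scores are hypothetical toy values and the function names are illustrative.

```python
def mann_whitney_u(pos, neg):
    """U statistic of `pos` versus `neg`: each pair with pos > neg counts 1, ties count 0.5."""
    u = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                u += 1.0
            elif p == n:
                u += 0.5
    return u


def auc_from_scores(pos, neg):
    """AUC as the probability that a random positive outscores a random negative."""
    return mann_whitney_u(pos, neg) / (len(pos) * len(neg))


# Hypothetical panel scores (NOT data from the study).
bc_scores = [0.9, 0.8, 0.7, 0.4]
non_bc_scores = [0.6, 0.3, 0.2, 0.1]

auc = auc_from_scores(bc_scores, non_bc_scores)
print(auc)  # 15 of 16 pairs are ordered correctly -> 0.9375
```

The pairwise loop is quadratic in the group sizes, which is acceptable for cohorts of this scale; a rank-based formula would be preferable for large samples.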
Affiliation(s)
- Sisi Gong
- Clinical Lab and Medical Diagnostics Laboratory, Donghai Hospital District, The Second Affiliated Hospital of Fujian Medical University, Quanzhou, PR China
- Qingshui Wang
- College of Life Sciences, Fujian Normal University, Fuzhou, PR China
- Jiewei Huang
- The Graduate School of Fujian Medical University, Fuzhou, PR China
- Rongfu Huang
- Clinical Lab and Medical Diagnostics Laboratory, Donghai Hospital District, The Second Affiliated Hospital of Fujian Medical University, Quanzhou, PR China
- Shanshan Chen
- Clinical Lab and Medical Diagnostics Laboratory, Donghai Hospital District, The Second Affiliated Hospital of Fujian Medical University, Quanzhou, PR China
- Xiaojuan Cheng
- Clinical Lab and Medical Diagnostics Laboratory, Donghai Hospital District, The Second Affiliated Hospital of Fujian Medical University, Quanzhou, PR China
- Lei Liu
- Clinical Lab and Medical Diagnostics Laboratory, Donghai Hospital District, The Second Affiliated Hospital of Fujian Medical University, Quanzhou, PR China
- Xiaofang Dai
- Clinical Lab and Medical Diagnostics Laboratory, Donghai Hospital District, The Second Affiliated Hospital of Fujian Medical University, Quanzhou, PR China
- Yameng Zhong
- Clinical Lab and Medical Diagnostics Laboratory, Donghai Hospital District, The Second Affiliated Hospital of Fujian Medical University, Quanzhou, PR China
- Chunmei Fan
- Clinical Lab and Medical Diagnostics Laboratory, Donghai Hospital District, The Second Affiliated Hospital of Fujian Medical University, Quanzhou, PR China.
- Zhijun Liao
- Clinical Lab and Medical Diagnostics Laboratory, Donghai Hospital District, The Second Affiliated Hospital of Fujian Medical University, Quanzhou, PR China; Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Fujian Medical University, Fuzhou, PR China.
3
Ye Y, Li M, Pan Q, Fang X, Yang H, Dong B, Yang J, Zheng Y, Zhang R, Liao Z. Machine learning-based classification of deubiquitinase USP26 and its cell proliferation inhibition through stabilizing KLF6 in cervical cancer. Comput Biol Med 2024; 168:107745. [PMID: 38064851 DOI: 10.1016/j.compbiomed.2023.107745] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/17/2023] [Revised: 10/31/2023] [Accepted: 11/20/2023] [Indexed: 01/10/2024]
Abstract
OBJECTIVE We aim to accurately distinguish ubiquitin-specific proteases (USPs) from other members within the deubiquitinating enzyme families based on protein sequences. Additionally, we seek to elucidate the specific regulatory mechanisms through which USP26 modulates Krüppel-like factor 6 (KLF6) and assess the subsequent effects of this regulation on both the proliferation and migration of cervical cancer cells. METHODS All the deubiquitinase (DUB) sequences were classified into USPs and non-USPs. Feature vectors, including 188D, n-gram, and 400D features, were extracted from these sequences and subjected to binary classification via the Weka software. Next, thirty human USPs were analyzed to identify conserved motifs and ascertain evolutionary relationships. Experimentally, more than 90 unique DUB-encoding plasmids were transfected into HeLa cell lines to assess alterations in KLF6 protein levels and to isolate a specific DUB involved in KLF6 regulation. Subsequent experiments utilized both wild-type (WT) USP26 overexpression and shRNA-mediated USP26 knockdown to examine changes in KLF6 protein levels. A half-life experiment was performed to assess the influence of USP26 on KLF6 protein stability. Immunoprecipitation was applied to confirm the USP26-KLF6 interaction, and ubiquitination assays were used to explore the role of USP26 in KLF6 deubiquitination. Additional cellular assays were conducted to evaluate the effects of USP26 on HeLa cell proliferation and migration. RESULTS 1. Among the extracted feature vectors of 188D, 400D, and n-gram, all 12 classifiers demonstrated excellent performance. The RandomForest classifier demonstrated superior performance in this assessment. Phylogenetic analysis of 30 human USPs revealed the presence of nine unique motifs, comprising zinc finger and ubiquitin-specific protease domains. 2. Through a systematic screening of the deubiquitinase library, USP26 was identified as the sole DUB associated with KLF6. 3. USP26 positively regulated the protein level of KLF6, as evidenced by the decrease in KLF6 protein expression upon shUSP26 knockdown in both 293T and HeLa cell lines. Additionally, half-life experiments demonstrated that USP26 prolonged the stability of KLF6. 4. Immunoprecipitation experiments revealed a strong interaction between USP26 and KLF6. Notably, the functional interaction domain was mapped to amino acids 285-913 of USP26, as opposed to the 1-295 region. 5. WT USP26 was found to attenuate the ubiquitination levels of KLF6. However, mutation of USP26 abrogated its deubiquitination activity. 6. Functional biological assays demonstrated that overexpression of USP26 inhibited both proliferation and migration of HeLa cells. Conversely, knockdown of USP26 was shown to promote these oncogenic properties. CONCLUSIONS 1. At the protein sequence level, members of the USP family can be effectively differentiated from non-USP proteins. Furthermore, specific functional motifs have been identified within the sequences of human USPs. 2. The deubiquitinating enzyme USP26 has been shown to target KLF6 for deubiquitination, thereby modulating its stability. Importantly, USP26 plays a pivotal role in the modulation of proliferation and migration in cervical cancer cells.
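The abstract does not define its feature sets. As an illustration only, the sketch below assumes that "400D" refers to the 20 × 20 dipeptide composition, a common 400-dimensional protein descriptor; this is an assumption, not a statement about the authors' tooling. The sequence is a toy string and the function name is hypothetical.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs


def dipeptide_composition(seq):
    """Return a 400-dimensional vector of overlapping dipeptide frequencies.

    Pairs containing non-standard residues (e.g. X, B, Z) are skipped.
    """
    seq = seq.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


# Toy sequence (hypothetical, not a real deubiquitinase).
vec = dipeptide_composition("MKKLAKK")
print(len(vec))  # 400
```

A vector like this can be exported as one row per sequence and passed to a classifier such as a random forest.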
Affiliation(s)
- Ying Ye
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350122, China
- Meng Li
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350122, China
- Qilong Pan
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350122, China
- Xin Fang
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350122, China; Laboratory of Non-communicable Chronic Disease Control, Fujian Provincial Center for Disease Control and Prevention, Fuzhou, 350012, China
- Hong Yang
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350122, China
- Bingying Dong
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350122, China
- Jiaying Yang
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350122, China
- Yuan Zheng
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350122, China
- Renxiang Zhang
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350122, China
- Zhijun Liao
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350122, China.
4
Wu D, Fang X, Luan K, Xu Q, Lin S, Sun S, Yang J, Dong B, Manavalan B, Liao Z. Identification of SH2 domain-containing proteins and motifs prediction by a deep learning method. Comput Biol Med 2023; 162:107065. [PMID: 37267826 DOI: 10.1016/j.compbiomed.2023.107065] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 04/04/2023] [Revised: 04/30/2023] [Accepted: 05/27/2023] [Indexed: 06/04/2023]
Abstract
The Src Homology 2 (SH2) domain plays an important role in signal transmission in organisms. It mediates protein-protein interactions through binding between phosphotyrosine and motifs in the SH2 domain. In this study, we designed a deep learning method to distinguish SH2 domain-containing proteins from non-SH2 domain-containing proteins. First, we collected SH2 and non-SH2 domain-containing protein sequences from multiple species. After data preprocessing, we built six deep learning models through DeepBIO and compared their performance. Second, we selected the model with the best overall performance, trained and tested it separately again, and analyzed the results visually. We found that 288-dimensional (288D) features could effectively distinguish the two types of proteins. Finally, motif analysis discovered the specific motif YKIR and revealed its function in signal transduction. In summary, we successfully distinguished SH2 domain and non-SH2 domain proteins with a deep learning method and obtained the best-performing 288D features. In addition, we found a new motif, YKIR, in the SH2 domain and analyzed its function, which helps to further understand the signaling mechanisms within the organism.
Affiliation(s)
- Duanzhi Wu
- School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350122, China
- Xin Fang
- School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350122, China; Laboratory of Non-communicable Chronic Disease Control, Fujian Provincial Center for Disease Control and Prevention, Fuzhou, 350012, China
- Kai Luan
- School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350122, China
- Qijin Xu
- School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350122, China
- Shiqi Lin
- School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350122, China
- Shiying Sun
- School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350122, China
- Jiaying Yang
- School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350122, China; Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350122, China
- Bingying Dong
- School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350122, China; Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350122, China
- Balachandran Manavalan
- Department of Integrative Biotechnology, College of Biotechnology and Bioengineering, Sungkyunkwan University, Suwon, 16419, Gyeonggi-do, Republic of Korea.
- Zhijun Liao
- School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350122, China; Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350122, China.
5
Li P, Chen X, Zhou S, Xia X, Wang E, Han R, Zeng D, Fei G, Wang R. High Expression of DEPDC1B Predicts Poor Prognosis in Lung Adenocarcinoma. J Inflamm Res 2022; 15:4171-4184. [PMID: 35912402 PMCID: PMC9332445 DOI: 10.2147/jir.s369219] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2022] [Accepted: 07/11/2022] [Indexed: 11/23/2022]
Abstract
Introduction Lung adenocarcinoma (LUAD) is the most common type of lung cancer. DEP domain-containing 1 B (DEPDC1B) is involved in the development of several cancers; however, its role in LUAD is unknown. Therefore, we aimed to determine the biological function and prognostic value of DEPDC1B in LUAD. Material and Methods We analyzed the correlation between DEPDC1B expression and the clinical features of LUAD and lung squamous cell carcinoma (LUSC). Survival was evaluated by generating Kaplan-Meier curves, which were used to analyze the relationship between DEPDC1B expression and prognosis in LUAD and LUSC. DEPDC1B expression in tumor and normal tissues from patients with LUAD and LUSC was determined using immunohistochemistry, and its clinical significance was analyzed. Finally, the correlation between the expression and biological function of DEPDC1B in LUAD was examined. Results Our findings revealed that DEPDC1B expression was higher in tumor tissues than that in normal tissues from patients with LUAD and LUSC (P < 0.001). These results were confirmed in clinical samples from patients using immunohistochemistry. Analysis of a dataset from The Cancer Genome Atlas (TCGA) showed that high DEPDC1B expression was associated with poor prognosis only in patients with LUAD (P < 0.001). Similarly, high DEPDC1B expression was related to shorter overall survival (OS) and progression-free interval (PFI) in patients with LUAD. These associations were not observed in LUSC. Functional enrichment analysis suggested that DEPDC1B promoted tumor development in LUAD by regulating the cell cycle. Conclusion High DEPDC1B expression predicts poor prognosis in patients with LUAD. Thus, DEPDC1B has potential as a therapeutic target for LUAD.
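The survival comparison above relies on Kaplan-Meier curves. As a rough illustration of how such a curve is estimated, the sketch below implements the product-limit estimator in plain Python. The follow-up times and event flags are hypothetical toy values, not data from the study, and the function name is illustrative.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of survival.

    times  -- follow-up time per patient
    events -- 1 if the event (death) was observed, 0 if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed  # events and censored patients both leave the risk set
    return curve


# Hypothetical follow-up (months) and event indicators.
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
```

Comparing the curves of high- and low-expression groups would additionally require a test such as the log-rank test, which is not shown here.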
Affiliation(s)
- Pulin Li
- Department of Respiratory and Critical Care Medicine, the First Affiliated Hospital of Anhui Medical University, Hefei, People's Republic of China
- Xiaojuan Chen
- Department of Infectious Diseases, Hefei Second People's Hospital, Hefei, People's Republic of China
- Sijing Zhou
- Department of Occupational Medicine, Hefei Third Clinical College of Anhui Medical University, Hefei, People's Republic of China
- Xingyuan Xia
- Department of Respiratory and Critical Care Medicine, the First Affiliated Hospital of Anhui Medical University, Hefei, People's Republic of China
- Enze Wang
- Department of Respiratory and Critical Care Medicine, the First Affiliated Hospital of Anhui Medical University, Hefei, People's Republic of China
- Rui Han
- Department of Respiratory and Critical Care Medicine, the First Affiliated Hospital of Anhui Medical University, Hefei, People's Republic of China
- Daxiong Zeng
- Department of Pulmonary and Critical Care Medicine, Suzhou Dushu Lake Hospital, Suzhou, People's Republic of China; Department of Pulmonary and Critical Care Medicine, Dushu Lake Hospital Affiliated to Soochow University, Medical Center of Soochow University, Suzhou, People's Republic of China
- Guanghe Fei
- Department of Respiratory and Critical Care Medicine, the First Affiliated Hospital of Anhui Medical University, Hefei, People's Republic of China
- Ran Wang
- Department of Respiratory and Critical Care Medicine, the First Affiliated Hospital of Anhui Medical University, Hefei, People's Republic of China
6
Ye T, Lin L, Cao L, Huang W, Wei S, Shan Y, Zhang Z. Novel Prognostic Signatures of Hepatocellular Carcinoma Based on Metabolic Pathway Phenotypes. Front Oncol 2022; 12:863266. [PMID: 35677150 PMCID: PMC9168273 DOI: 10.3389/fonc.2022.863266] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/27/2022] [Accepted: 04/06/2022] [Indexed: 12/03/2022]
Abstract
Hepatocellular carcinoma is a disastrous cancer with an aberrant metabolism. In this study, we aimed to assess the role of metabolism in the prognosis of hepatocellular carcinoma. Ten metabolism-related pathways were identified to classify the hepatocellular carcinoma into two clusters: Metabolism_H and Metabolism_L. Compared with Metabolism_L, patients in Metabolism_H had lower survival rates with more mutated TP53 genes and more immune infiltration. Moreover, risk scores for predicting overall survival based on eleven differentially expressed metabolic genes were developed by the least absolute shrinkage and selection operator (LASSO)-Cox regression model in The Cancer Genome Atlas (TCGA) dataset, which was validated in the International Cancer Genome Consortium (ICGC) dataset. The immunohistochemistry staining of liver cancer patient specimens also identified that the 11 genes were associated with the prognosis of liver cancer patients. Multivariate Cox regression analyses indicated that the differentially expressed metabolic gene-based risk score was also an independent prognostic factor for overall survival. Furthermore, the risk score (AUC = 0.767) outperformed other clinical variables in predicting overall survival. Therefore, the metabolism-related survival-predictor model may predict overall survival excellently for HCC patients.
Affiliation(s)
- Tingbo Ye
- Department of Hepatobiliary Surgery, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China; Key Laboratory of Diagnosis and Treatment of Severe Hepato-Pancreatic Diseases of Zhejiang Province, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
- Leilei Lin
- Department of Ultrasound, Wenzhou People's Hospital, Wenzhou, China
- Lulu Cao
- Department of Pathology, The Quzhou Affiliated Hospital of Wenzhou Medical University, Quzhou People's Hospital, Quzhou, China
- Weiguo Huang
- Department of Vascular Surgery, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
- Shengzhe Wei
- Department of Hand Surgery and Peripheral Neurosurgery, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
- Yunfeng Shan
- Department of Hepatobiliary Surgery, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
- Zhongjing Zhang
- Department of Vascular Surgery, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
7
Yu H, Wang X, Cao H. Construction and investigation of a circRNA-associated ceRNA regulatory network in Tetralogy of Fallot. BMC Cardiovasc Disord 2021; 21:437. [PMID: 34521346 PMCID: PMC8442392 DOI: 10.1186/s12872-021-02217-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 06/30/2020] [Accepted: 08/20/2021] [Indexed: 12/14/2022]
Abstract
Background As the most frequent type of cyanotic congenital heart disease (CHD), tetralogy of Fallot (TOF) has a relatively poor prognosis without corrective surgery. Circular RNAs (circRNAs) represent a novel class of endogenous noncoding RNAs that regulate target gene expression posttranscriptionally in heart development. Here, we investigated the potential role of the ceRNA network in the pathogenesis of TOF. Methods To identify circRNA expression profiles in TOF, microarrays were used to screen the differentially expressed circRNAs between 3 TOF and 3 control human myocardial tissue samples. Then, a dysregulated circRNA-associated ceRNA network was constructed using the established multistep screening strategy. Results In summary, a total of 276 differentially expressed circRNAs were identified, including 214 upregulated and 62 downregulated circRNAs in TOF samples. By constructing the circRNA-associated ceRNA network based on bioinformatics data, a total of 19 circRNAs, 9 miRNAs, and 34 mRNAs were further screened. Moreover, by enlarging the sample size, the qPCR results validated the positive correlations between hsa_circ_0007798 and HIF1A. Conclusions The findings in this study provide a comprehensive understanding of the ceRNA network involved in TOF biology, such as the hsa_circ_0007798/miR-199b-5p/HIF1A signalling axis, and may offer candidate diagnostic biomarkers or potential therapeutic targets for TOF. In addition, we propose that the ceRNA network regulates TOF progression. Supplementary Information The online version contains supplementary material available at 10.1186/s12872-021-02217-w.
Affiliation(s)
- Haifei Yu
- Department of Cardiac Surgery, Fujian Maternity and Child Health Hospital Affiliated to Fujian Medical University, Fuzhou, Fujian, People's Republic of China; Key Laboratory of Technical Evaluation of Fertility Regulation for Non-human Primates, National Health and Family Planning Commission, Fuzhou, Fujian, People's Republic of China
- Xinrui Wang
- Key Laboratory of Technical Evaluation of Fertility Regulation for Non-human Primates, National Health and Family Planning Commission, Fuzhou, Fujian, People's Republic of China; Medical Research Centre, Fujian Maternity and Child Health Hospital, Affiliated Hospital of Fujian Medical University, Fuzhou, Fujian, People's Republic of China.
- Hua Cao
- Department of Cardiac Surgery, Fujian Maternity and Child Health Hospital Affiliated to Fujian Medical University, Fuzhou, Fujian, People's Republic of China; Key Laboratory of Technical Evaluation of Fertility Regulation for Non-human Primates, National Health and Family Planning Commission, Fuzhou, Fujian, People's Republic of China; Medical Research Centre, Fujian Maternity and Child Health Hospital, Affiliated Hospital of Fujian Medical University, Fuzhou, Fujian, People's Republic of China.
8
Lu W, Cao Y, Wu H, Ding Y, Song Z, Zhang Y, Fu Q, Li H. Research on RNA secondary structure predicting via bidirectional recurrent neural network. BMC Bioinformatics 2021; 22:431. [PMID: 34496763 PMCID: PMC8427827 DOI: 10.1186/s12859-021-04332-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/08/2021] [Accepted: 08/23/2021] [Indexed: 11/10/2022]
Abstract
BACKGROUND RNA secondary structure prediction is an important research topic in bioinformatics. Predicting RNA secondary structure with pseudoknots has been proved to be an NP-hard problem. Traditional machine learning methods cannot effectively apply information from sequences of different lengths to the prediction process because of constraints of the models themselves. In addition, the number of paired bases differs greatly from the number of unpaired bases in RNA sequences, and this imbalance between positive and negative samples can easily cause the model to fall into a local optimum. To solve these problems, this paper proposes a variable-length dynamic bidirectional Gated Recurrent Unit (VLDB GRU) model. The model can accept sequences of different lengths through the introduction of a flag vector. The model can also make full use of the base information before and after the predicted base and avoids losing part of the information through truncation. A weight vector that dynamically adjusts the loss function of each base during training addresses the sample imbalance. RESULTS The proposed algorithm is compared with existing algorithms on five representative subsets of the RNA STRAND data set. The experimental results show that the accuracy and Matthews correlation coefficient of the method are improved by 4.7% and 11.4%, respectively. CONCLUSIONS The flag vector allows the model to effectively use the information before and after each position in the sequence; the weight vector addresses the sample imbalance. Compared with the other algorithms, the VLDB GRU algorithm proposed in this paper has the best results.
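Two ideas in this abstract lend themselves to a short illustration: weighting the per-base loss to counter the paired/unpaired imbalance, and scoring with the Matthews correlation coefficient (MCC). The sketch below is not the VLDB GRU implementation; it assumes simple inverse-frequency class weights, which the abstract does not specify, and all names and labels are hypothetical.

```python
import math


def class_weights(labels):
    """Inverse-frequency weights so both classes contribute equally to the loss.

    Assumption: the paper's exact weighting scheme is not given in the abstract.
    """
    n = len(labels)
    n_pos = sum(labels)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    return {1: n / (2 * n_pos), 0: n / (2 * n_neg)}


def weighted_bce(labels, probs, weights, eps=1e-12):
    """Class-weighted binary cross-entropy averaged over positions."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
        loss = -(y * math.log(p) + (1 - y) * math.log(1 - p))
        total += weights[y] * loss
    return total / len(labels)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Toy labels: 1 = paired base, 0 = unpaired base (hypothetical).
labels = [1, 0, 0, 0]
weights = class_weights(labels)
loss = weighted_bce(labels, [0.5, 0.5, 0.5, 0.5], weights)
```

MCC is a useful companion to accuracy here because it stays informative when one class dominates, which is the situation the abstract describes.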
Affiliation(s)
- Weizhong Lu
- School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou, 215009, China; Jiangsu Province Key Laboratory of Intelligent Building Energy Efficiency, Suzhou University of Science and Technology, Suzhou, 215009, China
- Yan Cao
- School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou, 215009, China
- Hongjie Wu
- School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou, 215009, China; Jiangsu Province Key Laboratory of Intelligent Building Energy Efficiency, Suzhou University of Science and Technology, Suzhou, 215009, China.
- Yijie Ding
- School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou, 215009, China; Jiangsu Province Key Laboratory of Intelligent Building Energy Efficiency, Suzhou University of Science and Technology, Suzhou, 215009, China
- Zhengwei Song
- School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou, 215009, China
- Yu Zhang
- Suzhou Industrial Park Institute of Services Outsourcing, Suzhou, 215123, China
- Qiming Fu
- School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou, 215009, China; Jiangsu Province Key Laboratory of Intelligent Building Energy Efficiency, Suzhou University of Science and Technology, Suzhou, 215009, China
- Haiou Li
- School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou, 215009, China
9
Dang XW, Pan Q, Lin ZH, Wang HH, Li LH, Li L, Shen DQ, Wang PJ. Overexpressed DEPDC1B contributes to the progression of hepatocellular carcinoma by CDK1. Aging (Albany NY) 2021; 13:20094-20115. [PMID: 34032605 PMCID: PMC8436915 DOI: 10.18632/aging.203016] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 01/23/2020] [Accepted: 02/16/2021] [Indexed: 12/24/2022]
Abstract
BACKGROUND Hepatocellular carcinoma (HCC) is the main type of primary liver cancer and imposes a heavy burden worldwide. Its recurrence and mortality rates remain poorly controlled by present treatments. Increasing attention has been focused on exploring specific genes that play important roles in HCC progression, yet the function of DEP domain containing 1B (DEPDC1B) in HCC has not been studied. METHODS Immunohistochemical staining was used to detect the expression level of DEPDC1B in tumor tissues and adjacent normal tissues. After DEPDC1B and CDK1 knockdown in the cell lines HEP3B2.1-7 and SK-HEP-1, MTT and colony formation assays were used to detect cell growth, flow cytometry was used to investigate cell apoptosis and the cell cycle, and wound-healing and Transwell assays were used to examine tumor cell migration. Moreover, a xenograft model was constructed to study the function of DEPDC1B in tumor growth in vivo. RESULTS DEPDC1B knockdown inhibited the progression of HCC by inhibiting cell proliferation, migration, and colony formation, inducing G2 phase arrest, and promoting cell apoptosis in vitro. CDK1 was selected for further mechanistic study according to the results of the Human GeneChip PrimeView. Recovery experiments showed that the effects of DEPDC1B on HCC progression were mediated by CDK1. DEPDC1B knockdown also inhibited tumor growth in vivo. CONCLUSIONS The study confirmed that DEPDC1B knockdown restrains tumor growth in vitro and in vivo, and that DEPDC1B can interact with CDK1 and its effects can be rescued by CDK1. The study suggests that DEPDC1B is a potential therapeutic target involved in HCC growth and progression.
Affiliation(s)
- Xiao-Wei Dang
  - Department of Hepatopancreatobiliary Surgery, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan, China
- Qi Pan
  - Department of Hepatic Surgery, The Cancer Hospital of Fudan University, Shanghai, China
- Zhen-Hai Lin
  - Department of Hepatic Surgery, The Cancer Hospital of Fudan University, Shanghai, China
- Hao-Hao Wang
  - Department of Hepatopancreatobiliary Surgery, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan, China
- Lu-Hao Li
  - Department of Hepatopancreatobiliary Surgery, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan, China
- Lin Li
  - Department of Hepatopancreatobiliary Surgery, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan, China
- Dong-Qi Shen
  - Department of Hepatopancreatobiliary Surgery, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan, China
- Pei-Ju Wang
  - Department of Hepatopancreatobiliary Surgery, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan, China

10
Liu T, Chen JM, Zhang D, Zhang Q, Peng B, Xu L, Tang H. ApoPred: Identification of Apolipoproteins and Their Subfamilies With Multifarious Features. Front Cell Dev Biol 2021; 8:621144. [PMID: 33490085 PMCID: PMC7820372 DOI: 10.3389/fcell.2020.621144] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 10/25/2020] [Accepted: 11/24/2020] [Indexed: 01/24/2023]
Abstract
Apolipoproteins are a group of plasma proteins that are associated with a variety of diseases, such as hyperlipidemia, atherosclerosis, Alzheimer's disease, and diabetes. To investigate the function of apolipoproteins and to develop effective targets for related diseases, it is necessary to accurately identify and classify apolipoproteins. Although apolipoproteins can be identified accurately through biochemical experiments, such experiments are expensive and time-consuming. This work aims to establish a high-efficiency and high-accuracy prediction model for the recognition of apolipoproteins and their subfamilies. We first constructed a high-quality benchmark dataset including 270 apolipoproteins and 535 non-apolipoproteins. Based on this dataset, pseudo-amino acid composition (PseAAC) and composition of k-spaced amino acid pairs (CKSAAP) were used as input vectors. To improve the prediction accuracy and eliminate redundant information, analysis of variance (ANOVA) was used to rank the features, and incremental feature selection was used to obtain the best feature subset. A support vector machine (SVM) was used to construct the classification model, which achieved an accuracy of 97.27%, a sensitivity of 96.30%, and a specificity of 97.76% for discriminating apolipoproteins from non-apolipoproteins in 10-fold cross-validation. In addition, the same process was repeated to generate a new model for predicting apolipoprotein subfamilies. The new model achieved an overall accuracy of 95.93% in 10-fold cross-validation. Based on the proposed model, a convenient webserver called ApoPred was established, which can be freely accessed at http://tang-biolab.com/server/ApoPred/service.html. We expect that this work will contribute to apolipoprotein function research and drug development in relevant diseases.
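The reported accuracy, sensitivity, and specificity can be cross-checked against the stated dataset size with a little arithmetic. The sketch below is not taken from the ApoPred paper. It assumes a confusion matrix of 260 true positives and 523 true negatives, which are inferred counts: given 270 apolipoproteins and 535 non-apolipoproteins, they round to the three reported percentages. The function name `binary_metrics` is illustrative, and the Matthews correlation coefficient it also returns is not reported in the abstract.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Compute standard binary-classification metrics from confusion-matrix counts."""
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Dataset sizes stated in the abstract.
POSITIVES = 270   # apolipoproteins
NEGATIVES = 535   # non-apolipoproteins

# Assumed (inferred) counts; the abstract reports only percentages.
TP = 260
TN = 523

metrics = binary_metrics(TP, POSITIVES - TP, TN, NEGATIVES - TN)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Under these assumed counts the three percentages agree with the abstract after rounding to two decimals, which suggests the reported figures are internally consistent with the benchmark dataset.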
Affiliation(s)
- Ting Liu
  - School of Basic Medical Sciences, Southwest Medical University, Luzhou, China
- Jia-Mao Chen
  - School of Basic Medical Sciences, Southwest Medical University, Luzhou, China
- Dan Zhang
  - Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Qian Zhang
  - School of Basic Medical Sciences, Southwest Medical University, Luzhou, China
- Bowen Peng
  - Division of International Cooperation, Health Commission of Sichuan Province, Chengdu, China
- Lei Xu
  - School of Electronic and Communication Engineering, Shenzhen Polytechnic, Shenzhen, China
- Hua Tang
  - School of Basic Medical Sciences, Southwest Medical University, Luzhou, China
  - Central Nervous System Drug Key Laboratory of Sichuan Province, Luzhou, China

11
Xiao X, Chen WJ, Qiu WR. A Novel Prediction of Quaternary Structural Type of Proteins with Gene Ontology. Protein Pept Lett 2019; 27:313-320. [PMID: 31749418 DOI: 10.2174/0929866526666191014144618] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 04/23/2019] [Revised: 05/20/2019] [Accepted: 06/29/2019] [Indexed: 11/22/2022]
Abstract
BACKGROUND Information on the quaternary structure attributes of proteins is very important because it is closely related to their biological functions. With the rapid development of next-generation sequencing technology, we face a challenge: how to automatically identify the quaternary structure attributes of new polypeptide chains from their sequence information (i.e., whether they form a monomer, a hetero-oligomer, or a homo-oligomer). OBJECTIVE Our goal is to find a new way to represent protein sequences, thereby improving the prediction rate of protein quaternary structure. METHODS We developed a prediction system for protein quaternary structural type in which a protein sequence is represented by combining Pfam functional-domain and gene ontology information. The system turns protein features into numerical sequences and completes the prediction of quaternary structure through specific machine learning and verification algorithms. RESULTS Our data set contains 5495 protein samples. With the proposed method, we classify proteins as monomers, hetero-oligomers, or homo-oligomers with a prediction rate of 74.38%, which is 3.24% higher than that of previous studies. With this new feature extraction method, we can further classify the quaternary structure of proteins, and the results are correspondingly improved. CONCLUSION Applying the new prediction system improved the prediction rate compared with previous results. We have reason to believe that the feature extraction method in this paper is more practical and can serve as a reference for other protein classification problems.
Affiliation(s)
- Xuan Xiao
  - School of Information, Jingdezhen Ceramic Institute, Jingdezhen 333403, China
- Wei-Jie Chen
  - School of Information, Jingdezhen Ceramic Institute, Jingdezhen 333403, China
- Wang-Ren Qiu
  - School of Information, Jingdezhen Ceramic Institute, Jingdezhen 333403, China
  - Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, 610054, China

12
Gong Z, Ma Q, Wang X, Cai Q, Gong X, Genchev GZ, Lu H, Zeng F. A Herpes Simplex Virus Thymidine Kinase-Induced Mouse Model of Hepatocellular Carcinoma Associated with Up-Regulated Immune-Inflammatory-Related Signals. Genes (Basel) 2018; 9:E380. [PMID: 30060537 PMCID: PMC6115908 DOI: 10.3390/genes9080380] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 06/07/2018] [Revised: 07/19/2018] [Accepted: 07/23/2018] [Indexed: 12/11/2022]
Abstract
Inflammation and fibrosis in the human liver are often precursors to hepatocellular carcinoma (HCC), yet neither is easily modeled in animals. We previously generated transgenic mice with hepatocyte-specific expression of herpes simplex virus thymidine kinase (HSV-tk). These mice develop hepatitis upon administration of ganciclovir (GCV) (Zhang, 2005). However, our HSV-tk transgenic mice developed hepatitis and HCC tumors as early as six months of age even without GCV administration. We analyzed the transcriptome of the HSV-tk HCC tumor and hepatitis tissue using microarray analysis to investigate the possible causes of HCC. Gene Ontology (GO) enrichment analysis showed that the up-regulated genes in the HCC tissue mainly include immune-inflammatory and cell cycle genes. The down-regulated genes in HCC tumors are mainly related to lipid metabolism. Gene set enrichment analysis (GSEA) showed that immune-inflammatory-related signals in the HSV-tk mice are up-regulated compared to those in Notch mice. Our study suggests that the immune system and inflammation play an important role in HCC development in HSV-tk mice. Specifically, increased expression of immune-inflammatory-related genes is characteristic of HSV-tk mice, and inflammation-induced cell cycle activation may be a precursory step to cancer. The HSV-tk mouse provides a suitable model for studying the relationship between immune inflammation and HCC and the underlying mechanisms, which may support the development of therapeutic applications in the future.
Affiliation(s)
- Zhijuan Gong
  - Shanghai Institute of Medical Genetics, Shanghai Children's Hospital, Shanghai Jiao Tong University, Shanghai 200040, China.
  - Department of Histo-Embryology, Genetics and Developmental Biology, Shanghai Jiao Tong University School of Medicine, Shanghai 200025, China.
  - Key Laboratory of Embryo Molecular Biology, Ministry of Health & Shanghai Key Laboratory of Embryo and Reproduction Engineering, Shanghai 200040, China.
- Qingwen Ma
  - Shanghai Institute of Medical Genetics, Shanghai Children's Hospital, Shanghai Jiao Tong University, Shanghai 200040, China.
  - Key Laboratory of Embryo Molecular Biology, Ministry of Health & Shanghai Key Laboratory of Embryo and Reproduction Engineering, Shanghai 200040, China.
- Xujun Wang
  - SJTU-Yale Joint Center for Biostatistics, School of Life Science and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai 200240, China.
- Qin Cai
  - Shanghai Institute of Medical Genetics, Shanghai Children's Hospital, Shanghai Jiao Tong University, Shanghai 200040, China.
  - Key Laboratory of Embryo Molecular Biology, Ministry of Health & Shanghai Key Laboratory of Embryo and Reproduction Engineering, Shanghai 200040, China.
- Xiuli Gong
  - Shanghai Institute of Medical Genetics, Shanghai Children's Hospital, Shanghai Jiao Tong University, Shanghai 200040, China.
  - Key Laboratory of Embryo Molecular Biology, Ministry of Health & Shanghai Key Laboratory of Embryo and Reproduction Engineering, Shanghai 200040, China.
- Georgi Z Genchev
  - SJTU-Yale Joint Center for Biostatistics, School of Life Science and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai 200240, China.
- Hui Lu
  - Shanghai Institute of Medical Genetics, Shanghai Children's Hospital, Shanghai Jiao Tong University, Shanghai 200040, China.
  - Key Laboratory of Embryo Molecular Biology, Ministry of Health & Shanghai Key Laboratory of Embryo and Reproduction Engineering, Shanghai 200040, China.
  - SJTU-Yale Joint Center for Biostatistics, School of Life Science and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai 200240, China.
- Fanyi Zeng
  - Shanghai Institute of Medical Genetics, Shanghai Children's Hospital, Shanghai Jiao Tong University, Shanghai 200040, China.
  - Department of Histo-Embryology, Genetics and Developmental Biology, Shanghai Jiao Tong University School of Medicine, Shanghai 200025, China.
  - Key Laboratory of Embryo Molecular Biology, Ministry of Health & Shanghai Key Laboratory of Embryo and Reproduction Engineering, Shanghai 200040, China.

13
He W, Jia C, Duan Y, Zou Q. 70ProPred: a predictor for discovering sigma70 promoters based on combining multiple features. BMC Systems Biology 2018; 12:44. [PMID: 29745856 PMCID: PMC5998878 DOI: 10.1186/s12918-018-0570-1] [Citation(s) in RCA: 56] [Impact Index Per Article: 8.0] [Indexed: 12/12/2022]
Abstract
BACKGROUND A promoter is an important regulatory sequence element that is in charge of gene transcription initiation. In prokaryotes, σ70 promoters regulate the transcription of most genes. Promoter recognition has been a crucial part of gene structure recognition. It is also the core issue in constructing gene transcriptional regulation networks. Even with the successful completion of genome sequencing for an increasing number of microbial species, the accurate identification of σ70 promoter regions in DNA sequences is not easy. RESULTS To improve the prediction accuracy of sigma70 promoters in prokaryotes, a promoter recognition model, 70ProPred, was established. In this work, two sequence-based features, position-specific trinucleotide propensity based on single-stranded characteristic (PSTNPss) and electron-ion potential values for trinucleotides (PseEIIP), were assessed to build the best prediction model. It was found that 79 features of PSTNPss combined with 64 features of PseEIIP obtained the best performance for sigma70 promoter identification, with an accuracy of 95.56% and a Matthews correlation coefficient (MCC) of 0.90. CONCLUSION The jackknife tests showed that 70ProPred outperforms the existing sigma70 promoter prediction approaches in terms of accuracy and stability. Additionally, this approach can be extended to predict promoters of other species. To assist experimental biologists, an online web server for the proposed method was established, which is freely available at http://server.malab.cn/70ProPred/ .
Affiliation(s)
- Wenying He
  - School of Computer Science and Technology, Tianjin University, Tianjin, 300072 China
- Cangzhi Jia
  - Department of Mathematics, Dalian Maritime University, Dalian, 116026 China
- Yucong Duan
  - College of Information and Technology, Hainan University, Haikou, 570228 China
- Quan Zou
  - School of Computer Science and Technology, Tianjin University, Tianjin, 300072 China

14
Wang X, Liao Z, Bai Z, He Y, Duan J, Wei L. MiR-93-5p Promotes Cell Proliferation through Down-Regulating PPARGC1A in Hepatocellular Carcinoma Cells by Bioinformatics Analysis and Experimental Verification. Genes (Basel) 2018; 9:51. [PMID: 29361788 PMCID: PMC5793202 DOI: 10.3390/genes9010051] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Received: 12/07/2017] [Revised: 01/15/2018] [Accepted: 01/16/2018] [Indexed: 12/11/2022]
Abstract
Peroxisome proliferator-activated receptor gamma coactivator-1 alpha (PPARGC1A, formerly known as PGC-1α) is a transcriptional coactivator and metabolic regulator. Previous studies have mainly focused on the association between PPARGC1A and hepatoma. However, the regulatory mechanism remains unknown. A microRNA associated with cancer (oncomiR), miR-93-5p, has recently been found to play an essential role in tumorigenesis and progression of various carcinomas, including liver cancer. Therefore, this paper aims to explore the regulatory mechanism linking these two molecules in hepatoma cells. First, an integrative analysis of miRNA–mRNA modules was performed on microarray and The Cancer Genome Atlas (TCGA) data, yielding the core regulatory network and the miR-93-5p/PPARGC1A pair. Second, a series of experiments in hepatoma cells showed that miR-93-5p was upregulated and promoted cell proliferation. Third, the inverse correlation between miR-93-5p and PPARGC1A expression was validated. Finally, we inferred that miR-93-5p plays an essential role in inhibiting PPARGC1A expression by directly targeting the 3′-untranslated region (UTR) of its mRNA. In conclusion, these results suggest that miR-93-5p overexpression contributes to hepatoma development by inhibiting PPARGC1A. It is anticipated to be a promising therapeutic strategy for patients with liver cancer in the future.
Affiliation(s)
- Xinrui Wang
  - State Key Laboratory for Medical Genomics, Shanghai Institute of Hematology, Rui Jin Hospital Affiliated to School of Medicine, Shanghai Jiao Tong University, Shanghai 200025, China.
- Zhijun Liao
  - Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Fujian Medical University, Fuzhou 350122, China.
- Zhimin Bai
  - Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Fujian Medical University, Fuzhou 350122, China.
  - Department of Clinical Laboratory, Jinjiang Municipal Hospital, Jinjiang 362200, China.
- Yan He
  - Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Fujian Medical University, Fuzhou 350122, China.
- Juan Duan
  - Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Fujian Medical University, Fuzhou 350122, China.
- Leyi Wei
  - School of Computer Science and Technology, Tianjin University, Tianjin 300350, China.

15
Antonets KS, Nizhnikov AA. Predicting Amyloidogenic Proteins in the Proteomes of Plants. Int J Mol Sci 2017; 18:2155. [PMID: 29035294 PMCID: PMC5666836 DOI: 10.3390/ijms18102155] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Received: 08/25/2017] [Revised: 10/12/2017] [Accepted: 10/13/2017] [Indexed: 12/21/2022]
Abstract
Amyloids are protein fibrils with characteristic spatial structure. Though amyloids were long perceived to be pathogens that cause dozens of incurable pathologies in humans and mammals, it is currently clear that amyloids also represent a functionally important form of protein structure implicated in a variety of biological processes in organisms ranging from archaea and bacteria to fungi and animals. Despite their social significance, plants remain the most poorly studied group of organisms in the field of amyloid biology. To date, amyloid properties have only been demonstrated in vitro or in heterologous systems for a small number of plant proteins. Here, for the first time, we performed a comprehensive analysis of the distribution of potentially amyloidogenic proteins in the proteomes of approximately 70 species of land plants using the Waltz and SARP (Sequence Analysis based on the Ranking of Probabilities) bioinformatic algorithms. We analyzed more than 2.9 million protein sequences and found that potentially amyloidogenic proteins are abundant in plant proteomes. We found that such proteins are overrepresented among membrane as well as DNA- and RNA-binding proteins of plants. Moreover, seed storage and defense proteins of most plant species are rich in amyloidogenic regions. Taken together, our data demonstrate the diversity of potentially amyloidogenic proteins in plant proteomes and suggest biological processes where formation of amyloids might be functionally important.
Affiliation(s)
- Kirill S Antonets
  - Laboratory for Proteomics of Supra-Organismal Systems, All-Russia Research Institute for Agricultural Microbiology, 196608 Podbelskogo sh., 3, Pushkin, St. Petersburg 196608, Russia.
  - Department of Genetics and Biotechnology, St. Petersburg State University, 199034 Universitetskaya nab., 7/9, St. Petersburg 199034, Russia.
- Anton A Nizhnikov
  - Laboratory for Proteomics of Supra-Organismal Systems, All-Russia Research Institute for Agricultural Microbiology, 196608 Podbelskogo sh., 3, Pushkin, St. Petersburg 196608, Russia.
  - Department of Genetics and Biotechnology, St. Petersburg State University, 199034 Universitetskaya nab., 7/9, St. Petersburg 199034, Russia.

16
Abstract
Classification problems from different domains vary in complexity, size, and imbalance of the number of samples from different classes. Although several classification models have been proposed, selecting the right model and parameters for a given classification task to achieve good performance is not trivial. Therefore, there is a constant interest in developing novel robust and efficient models suitable for a great variety of data. Here, we propose OmniGA, a framework for the optimization of omnivariate decision trees based on a parallel genetic algorithm, coupled with deep learning structure and ensemble learning methods. The performance of the OmniGA framework is evaluated on 12 different datasets taken mainly from biomedical problems and compared with the results obtained by several robust and commonly used machine-learning models with optimized parameters. The results show that OmniGA systematically outperformed these models for all the considered datasets, reducing the F1 score error in the range from 100% to 2.25%, compared to the best performing model. This demonstrates that OmniGA produces robust models with improved performance. OmniGA code and datasets are available at www.cbrc.kaust.edu.sa/omniga/.
Affiliation(s)
- Arturo Magana-Mora
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center, Thuwal, 23955-6900, Saudi Arabia
- Vladimir B Bajic
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center, Thuwal, 23955-6900, Saudi Arabia

17
Silva JCF, Carvalho TFM, Basso MF, Deguchi M, Pereira WA, Sobrinho RR, Vidigal PMP, Brustolini OJB, Silva FF, Dal-Bianco M, Fontes RLF, Santos AA, Zerbini FM, Cerqueira FR, Fontes EPB. Geminivirus data warehouse: a database enriched with machine learning approaches. BMC Bioinformatics 2017; 18:240. [PMID: 28476106 PMCID: PMC5420152 DOI: 10.1186/s12859-017-1646-4] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Received: 12/23/2016] [Accepted: 04/25/2017] [Indexed: 03/28/2023]
Abstract
BACKGROUND The Geminiviridae family encompasses a group of single-stranded DNA viruses with twinned and quasi-isometric virions, which infect a wide range of dicotyledonous and monocotyledonous plants and are responsible for significant economic losses worldwide. Geminiviruses are divided into nine genera, according to their insect vector, host range, genome organization, and phylogeny reconstruction. Using rolling-circle amplification approaches along with high-throughput sequencing technologies, thousands of full-length geminivirus and satellite genome sequences were amplified and have become available in public databases. As a consequence, many important challenges have emerged, namely, how to classify, store, and analyze massive datasets as well as how to extract information or new knowledge. Data mining approaches, mainly supported by machine learning (ML) techniques, are a natural means for high-throughput data analysis in the context of genomics, transcriptomics, proteomics, and metabolomics. RESULTS Here, we describe the development of a data warehouse enriched with ML approaches, designated geminivirus.org. We implemented search modules, bioinformatics tools, and ML methods to retrieve high precision information, demarcate species, and create classifiers for genera and open reading frames (ORFs) of geminivirus genomes. CONCLUSIONS The use of data mining techniques such as ETL (Extract, Transform, Load) to feed our database, as well as algorithms based on machine learning for knowledge extraction, allowed us to obtain a database with quality data and suitable tools for bioinformatics analysis. The Geminivirus Data Warehouse (geminivirus.org) offers a simple and user-friendly environment for information retrieval and knowledge discovery related to geminiviruses.
Affiliation(s)
- Jose Cleydson F Silva
  - Departamento de Informática, Universidade Federal de Viçosa, Viçosa, Brazil
  - National Institute of Science and Technology in Plant-Pest Interactions/BIOAGRO, Universidade Federal de Viçosa, Viçosa, Brazil
- Marcos F Basso
  - National Institute of Science and Technology in Plant-Pest Interactions/BIOAGRO, Universidade Federal de Viçosa, Viçosa, Brazil
- Michihito Deguchi
  - National Institute of Science and Technology in Plant-Pest Interactions/BIOAGRO, Universidade Federal de Viçosa, Viçosa, Brazil
- Welison A Pereira
  - National Institute of Science and Technology in Plant-Pest Interactions/BIOAGRO, Universidade Federal de Viçosa, Viçosa, Brazil
- Roberto R Sobrinho
  - National Institute of Science and Technology in Plant-Pest Interactions/BIOAGRO, Universidade Federal de Viçosa, Viçosa, Brazil
- Pedro M P Vidigal
  - Núcleo de Biomoléculas, Universidade Federal de Viçosa, Viçosa, MG, Brazil
- Otávio J B Brustolini
  - National Institute of Science and Technology in Plant-Pest Interactions/BIOAGRO, Universidade Federal de Viçosa, Viçosa, Brazil
- Fabyano F Silva
  - Departamento de Zootecnia, Universidade Federal de Viçosa, Viçosa, Brazil
- Maximiller Dal-Bianco
  - National Institute of Science and Technology in Plant-Pest Interactions/BIOAGRO, Universidade Federal de Viçosa, Viçosa, Brazil
- Anésia A Santos
  - National Institute of Science and Technology in Plant-Pest Interactions/BIOAGRO, Universidade Federal de Viçosa, Viçosa, Brazil
  - Departamento de Biologia Geral, Universidade Federal de Viçosa, Viçosa, Brazil
- Francisco Murilo Zerbini
  - National Institute of Science and Technology in Plant-Pest Interactions/BIOAGRO, Universidade Federal de Viçosa, Viçosa, Brazil
  - Departamento de Fitopatologia, Universidade Federal de Viçosa, Viçosa, MG, Brazil
- Fabio R Cerqueira
  - Departamento de Informática, Universidade Federal de Viçosa, Viçosa, Brazil
  - Departamento de Engenharia de Produção, Universidade Federal Fluminense, Petrópolis, Rio de Janeiro, Brazil
- Elizabeth P B Fontes
  - National Institute of Science and Technology in Plant-Pest Interactions/BIOAGRO, Universidade Federal de Viçosa, Viçosa, Brazil
  - Departamento de Bioquímica e Biologia Molecular, Universidade Federal de Viçosa, Viçosa, Brazil