1
Zhang ZM, Huang Y, Liu G, Yu W, Xie Q, Chen Z, Huang G, Wei J, Zhang H, Chen D, Du H. Development of machine learning-based predictors for early diagnosis of hepatocellular carcinoma. Sci Rep 2024; 14:5274. [PMID: 38438393 PMCID: PMC10912761 DOI: 10.1038/s41598-024-51265-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2023] [Accepted: 01/03/2024] [Indexed: 03/06/2024]
Abstract
Hepatocellular carcinoma (HCC) remains a formidable malignancy that significantly impacts human health, and the early diagnosis of HCC holds paramount importance. Therefore, it is imperative to develop an efficacious signature for the early diagnosis of HCC. In this study, we aimed to develop early HCC predictors (eHCC-pred) using machine learning-based methods and compare their performance with existing methods. The enhancements and advancements of eHCC-pred encompassed the following: (i) utilization of a substantial number of samples, including an increased representation of cirrhosis tissues without HCC (CwoHCC) samples for model training and augmented numbers of HCC and CwoHCC samples for model validation; (ii) incorporation of two feature selection methods, namely minimum redundancy maximum relevance and maximum relevance maximum distance, along with the inclusion of eight machine learning-based methods; (iii) improvement in the accuracy of early HCC identification, elevating it from 78.15 to 97% using identical independent datasets; and (iv) establishment of a user-friendly web server. The eHCC-pred is freely accessible at http://www.dulab.com.cn/eHCC-pred/ . Our approach, eHCC-pred, is anticipated to be robustly employed at the individual level for facilitating early HCC diagnosis in clinical practice, surpassing currently available state-of-the-art techniques.
Affiliation(s)
- Zi-Mei Zhang
  - School of Biology and Biological Engineering, South China University of Technology, Guangzhou, China
- Yuting Huang
  - School of Biology and Biological Engineering, South China University of Technology, Guangzhou, China
- Guanghao Liu
  - Key Laboratory of Ministry of Education for Gastrointestinal Cancer, School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350122, China
  - Fujian Key Laboratory of Medical Bioinformatics, Department of Bioinformatics, School of Medical Technology and Engineering, Fujian Medical University, Fuzhou, 350122, China
- Wenqi Yu
  - School of Biology and Biological Engineering, South China University of Technology, Guangzhou, China
- Qingsong Xie
  - School of Biology and Biological Engineering, South China University of Technology, Guangzhou, China
- Zixi Chen
  - School of Biology and Biological Engineering, South China University of Technology, Guangzhou, China
- Guanda Huang
  - School of Biology and Biological Engineering, South China University of Technology, Guangzhou, China
- Jinfen Wei
  - School of Biology and Biological Engineering, South China University of Technology, Guangzhou, China
- Haibo Zhang
  - School of Biology and Biological Engineering, South China University of Technology, Guangzhou, China
- Dong Chen
  - Fangrui Institute of Innovative Drugs, South China University of Technology, Guangzhou, China
- Hongli Du
  - School of Biology and Biological Engineering, South China University of Technology, Guangzhou, China
2
Su W, Deng S, Gu Z, Yang K, Ding H, Chen H, Zhang Z. Prediction of apoptosis protein subcellular location based on amphiphilic pseudo amino acid composition. Front Genet 2023; 14:1157021. [PMID: 36926588 PMCID: PMC10011625 DOI: 10.3389/fgene.2023.1157021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2023] [Accepted: 02/20/2023] [Indexed: 03/08/2023]
Abstract
Introduction: Apoptosis proteins play an important role in the process of cell apoptosis, which keeps the rates of cell proliferation and death in relative balance. Because the function of an apoptosis protein is closely related to its subcellular location, it is of great significance to study the subcellular locations of apoptosis proteins. Many efforts in bioinformatics research have been aimed at predicting their subcellular location. However, the subcellular localization of apoptotic proteins still needs to be carefully studied. Methods: In this paper, a new method based on amphiphilic pseudo amino acid composition and the support vector machine algorithm was proposed for predicting the subcellular location of apoptosis proteins. Results and Discussion: The method achieved good performance on three data sets, with Jackknife test accuracies of 90.5%, 93.9% and 84.0%, respectively. Compared with previous methods, the prediction accuracies of APACC_SVM were improved.
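The jackknife test mentioned in this abstract is commonly understood as leave-one-out cross-validation: each sample is held out once, the model is trained on the rest, and the held-out sample is predicted. The sketch below illustrates only that evaluation loop. It is not the authors' APACC_SVM pipeline: the nearest-centroid classifier is a stand-in for the support vector machine, and the function names and toy feature vectors are hypothetical.

```python
def nearest_centroid_predict(train_X, train_y, x):
    """Stand-in classifier (not an SVM): assign x to the class with the closest centroid."""
    sums, counts = {}, {}
    for vec, label in zip(train_X, train_y):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for j, value in enumerate(vec):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    best_label, best_dist = None, float("inf")
    for label, acc in sums.items():
        centroid = [s / counts[label] for s in acc]
        # Squared Euclidean distance from x to the class centroid
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def jackknife_accuracy(X, y, predict=nearest_centroid_predict):
    """Leave-one-out accuracy: hold out each sample once and train on the others."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if predict(train_X, train_y, X[i]) == y[i]:
            correct += 1
    return correct / len(X)


# Hypothetical two-class toy data, standing in for encoded protein features
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [1.0, 0.9], [0.9, 1.1], [1.1, 1.0]]
y = ["a", "a", "a", "b", "b", "b"]
acc = jackknife_accuracy(X, y)
print(f"jackknife accuracy: {acc:.3f}")
```

Any classifier with the same `predict(train_X, train_y, x)` shape could be passed in place of the stand-in.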
Affiliation(s)
- Wenxia Su
  - College of Science, Inner Mongolia Agriculture University, Hohhot, China
- Shuyi Deng
  - School of Life Science and Technology, Center for Information Biology, University of Electronic Science and Technology of China, Chengdu, China
- Zhifeng Gu
  - School of Life Science and Technology, Center for Information Biology, University of Electronic Science and Technology of China, Chengdu, China
- Keli Yang
  - Nonlinear Research Institute, Baoji University of Arts and Sciences, Baoji, China
- Hui Ding
  - School of Life Science and Technology, Center for Information Biology, University of Electronic Science and Technology of China, Chengdu, China
- Hui Chen
  - School of Healthcare Technology, Chengdu Neusoft University, Chengdu, China
- Zhaoyue Zhang
  - School of Life Science and Technology, Center for Information Biology, University of Electronic Science and Technology of China, Chengdu, China
  - School of Healthcare Technology, Chengdu Neusoft University, Chengdu, China
3
Zulfiqar H, Ahmed Z, Kissanga Grace-Mercure B, Hassan F, Zhang ZY, Liu F. Computational prediction of promotors in Agrobacterium tumefaciens strain C58 by using the machine learning technique. Front Microbiol 2023; 14:1170785. [PMID: 37125199 PMCID: PMC10133480 DOI: 10.3389/fmicb.2023.1170785] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2023] [Accepted: 03/17/2023] [Indexed: 05/02/2023]
Abstract
Promoters are genomic regions upstream of genes that are bound by RNA polymerase to start gene transcription. Because the promoter is the most critical element of gene expression, recognizing promoters is crucial to understanding the regulation of gene expression. This study aimed to develop a machine learning-based model to predict promoters in Agrobacterium tumefaciens (A. tumefaciens) strain C58. In the model, promoter sequences were encoded by three different kinds of feature descriptors, namely, accumulated nucleotide frequency, k-mer nucleotide composition, and binary encodings. The obtained features were optimized by using correlation and the mRMR-based algorithm. These optimized features were input into a random forest (RF) classifier to discriminate promoter sequences from non-promoter sequences in A. tumefaciens strain C58. Ten-fold cross-validation showed that the proposed model could yield an overall accuracy of 0.837. This model will help the study of promoters in A. tumefaciens strain C58.
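Of the three descriptors named in this abstract, k-mer nucleotide composition is the simplest to illustrate: a sequence is represented by the normalized frequency of every possible k-length substring. The sketch below is a minimal illustration under assumptions, not the authors' implementation: the function name `kmer_composition`, the choice k = 2, the handling of ambiguous bases, and the example sequence are all hypothetical.

```python
from itertools import product


def kmer_composition(seq: str, k: int = 2) -> dict[str, float]:
    """Return the normalized frequency of every possible DNA k-mer in seq."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = {kmer: 0 for kmer in kmers}
    total = 0
    # Slide a window of width k across the sequence, one base at a time
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # assumption: skip windows with ambiguous bases such as N
            counts[window] += 1
            total += 1
    if total == 0:
        return {kmer: 0.0 for kmer in kmers}
    return {kmer: counts[kmer] / total for kmer in kmers}


# Hypothetical example sequence; 4**2 = 16 features for k = 2
features = kmer_composition("ATGCATGC", k=2)
print(len(features), features["AT"])
```

The resulting fixed-length vector (4^k values) is the kind of input that a feature selection step and a classifier such as a random forest could consume.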
Affiliation(s)
- Hasan Zulfiqar
  - Yangtze Delta Region Institute (Huzhou), University of Electronic Science and Technology of China, Huzhou, China
  - School of Life Science and Technology and Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
  - *Correspondence: Hasan Zulfiqar
- Zahoor Ahmed
  - Yangtze Delta Region Institute (Huzhou), University of Electronic Science and Technology of China, Huzhou, China
- Bakanina Kissanga Grace-Mercure
  - School of Life Science and Technology and Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Farwa Hassan
  - School of Life Science and Technology and Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Zhao-Yue Zhang
  - School of Life Science and Technology and Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Fen Liu
  - Department of Radiation Oncology, Peking University Cancer Hospital (Inner Mongolia Campus), Affiliated Cancer Hospital of Inner Mongolia Medical University, Inner Mongolia Cancer Hospital, Hohhot, China
4
Guan Q, Zhao P, Tian Y, Yang L, Zhang Z, Li J. Identification of cancer risk assessment signature in patients with chronic obstructive pulmonary disease and exploration of the potential key genes. Ann Med 2022; 54:2309-2320. [PMID: 35993327 PMCID: PMC9415445 DOI: 10.1080/07853890.2022.2112070] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
It is essential to assess the cancer risk for patients with chronic obstructive pulmonary disease (COPD). Comparing gene expression data from patients with lung cancer (a total of 506 samples) and those with cancer-adjacent normal lung tissues (a total of 370 samples), we generated a qualitative transcriptional signature consisting of 2046 gene pairs. The signature was verified in an evaluation dataset comprising 18 subjects with severe disease and 52 subjects with moderate disease (Wilcoxon rank-sum test; p = 7.33 × 10⁻⁵). Similar results were obtained in other independent datasets. Among the gene pairs in the signature, 326 COPD stage-related gene pairs were identified based on Spearman's rank correlation tests and those gene pairs comprised 368 unique genes. Of these 368 genes, 16 genes were significantly dysregulated in COPD rat model data compared with control data. Some of these genes (Dhx16, Upf2, Notch3, Sec61a1, Dyrk2, and Hmmr) were altered when the COPD rat model was treated with traditional Chinese medicines (TCM), including Bufei Yishen formula, Bufei Jianpi formula, and Yiqi Zishen formula. Overall, the signature could predict the cancer incidence-risk of COPD and the identified key genes might provide guidance regarding both the treatment of COPD using TCM and the prevention of cancer in patients with COPD. KEY MESSAGES: A cancer risk assessment signature was identified in patients with COPD. The signature is insensitive to batch effects and is well verified. COPD key genes identified in this study might play a crucial role in TCM treatment and cancer prevention.
Affiliation(s)
- Qingzhou Guan
  - Academy of Chinese Medical Sciences, Henan University of Chinese Medicine, Zhengzhou, China
  - Henan Key Laboratory of Chinese Medicine for Respiratory Disease, Co-Construction Collaborative Innovation Center for Chinese Medicine and Respiratory Diseases by Henan & Education Ministry of P.R. China, Henan University of Chinese Medicine, Zhengzhou, China
- Peng Zhao
  - Academy of Chinese Medical Sciences, Henan University of Chinese Medicine, Zhengzhou, China
  - Henan Key Laboratory of Chinese Medicine for Respiratory Disease, Co-Construction Collaborative Innovation Center for Chinese Medicine and Respiratory Diseases by Henan & Education Ministry of P.R. China, Henan University of Chinese Medicine, Zhengzhou, China
- Yange Tian
  - Academy of Chinese Medical Sciences, Henan University of Chinese Medicine, Zhengzhou, China
  - Henan Key Laboratory of Chinese Medicine for Respiratory Disease, Co-Construction Collaborative Innovation Center for Chinese Medicine and Respiratory Diseases by Henan & Education Ministry of P.R. China, Henan University of Chinese Medicine, Zhengzhou, China
- Liping Yang
  - School of Basic Medicine, Henan University of Chinese Medicine, Zhengzhou, China
- Zhenzhen Zhang
  - Academy of Chinese Medical Sciences, Henan University of Chinese Medicine, Zhengzhou, China
  - Henan Key Laboratory of Chinese Medicine for Respiratory Disease, Co-Construction Collaborative Innovation Center for Chinese Medicine and Respiratory Diseases by Henan & Education Ministry of P.R. China, Henan University of Chinese Medicine, Zhengzhou, China
- Jiansheng Li
  - Henan Key Laboratory of Chinese Medicine for Respiratory Disease, Co-Construction Collaborative Innovation Center for Chinese Medicine and Respiratory Diseases by Henan & Education Ministry of P.R. China, Henan University of Chinese Medicine, Zhengzhou, China
  - The First Affiliated Hospital, Henan University of Chinese Medicine, Zhengzhou, China
5
Elrakaybi A, Ruess DA, Lübbert M, Quante M, Becker H. Epigenetics in Pancreatic Ductal Adenocarcinoma: Impact on Biology and Utilization in Diagnostics and Treatment. Cancers (Basel) 2022; 14:5926. [PMID: 36497404 PMCID: PMC9738647 DOI: 10.3390/cancers14235926] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2022] [Revised: 11/18/2022] [Accepted: 11/24/2022] [Indexed: 12/05/2022]
Abstract
Pancreatic ductal adenocarcinoma (PDAC) is one of the most aggressive malignancies with high potential of metastases and therapeutic resistance. Although genetic mutations drive PDAC initiation, they alone do not explain its aggressive nature. Epigenetic mechanisms, including aberrant DNA methylation and histone modifications, significantly contribute to inter- and intratumoral heterogeneity, disease progression and metastasis. Thus, increased understanding of the epigenetic landscape in PDAC could offer new potential biomarkers and tailored therapeutic approaches. In this review, we shed light on the role of epigenetic modifications in PDAC biology and on the potential clinical applications of epigenetic biomarkers in liquid biopsy. In addition, we provide an overview of clinical trials assessing epigenetically targeted treatments alone or in combination with other anticancer therapies to improve outcomes of patients with PDAC.
Affiliation(s)
- Asmaa Elrakaybi
  - Department of Hematology, Oncology and Stem Cell Transplantation, Medical Center University of Freiburg, Faculty of Medicine, University of Freiburg, 79106 Freiburg, Germany
  - Department of Clinical Pharmacy, Ain Shams University, Cairo 11566, Egypt
- Dietrich A. Ruess
  - Department of General and Visceral Surgery, Center of Surgery, Medical Center University of Freiburg, 79106 Freiburg, Germany
  - German Cancer Consortium (DKTK) and German Cancer Research Center (DKFZ), Partner Site Freiburg, 79106 Freiburg, Germany
- Michael Lübbert
  - Department of Hematology, Oncology and Stem Cell Transplantation, Medical Center University of Freiburg, Faculty of Medicine, University of Freiburg, 79106 Freiburg, Germany
  - German Cancer Consortium (DKTK) and German Cancer Research Center (DKFZ), Partner Site Freiburg, 79106 Freiburg, Germany
- Michael Quante
  - German Cancer Consortium (DKTK) and German Cancer Research Center (DKFZ), Partner Site Freiburg, 79106 Freiburg, Germany
  - Department of Gastroenterology and Hepatology, Medical Center University of Freiburg, Faculty of Medicine, University of Freiburg, 79106 Freiburg, Germany
- Heiko Becker
  - Department of Hematology, Oncology and Stem Cell Transplantation, Medical Center University of Freiburg, Faculty of Medicine, University of Freiburg, 79106 Freiburg, Germany
  - German Cancer Consortium (DKTK) and German Cancer Research Center (DKFZ), Partner Site Freiburg, 79106 Freiburg, Germany
  - Correspondence: Tel.: +49-761-270-36000
6
Comprehensive Analysis of the Molecular Characteristics and Prognosis value of AT II-associated Genes in Non-small Cell Lung Cancer. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2022; 2022:3106688. [PMID: 36203529 PMCID: PMC9530922 DOI: 10.1155/2022/3106688] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2022] [Accepted: 09/10/2022] [Indexed: 11/25/2022]
Abstract
Alveolar type II (AT II) cells are a key structure of the distal lung epithelium and are essential for maintaining normal lung homeostasis. Dedifferentiation of AT II cells is significantly correlated with lung tumor progression. However, the potential molecular mechanisms and clinical significance of AT II-associated genes in lung cancer have not yet been fully elucidated. In this study, we comprehensively analyzed the gene expression, prognostic value, genetic alteration, and immune cell infiltration of eight AT II-associated genes (AQP4, SFTPB, SFTPC, SFTPD, CLDN18, FOXA2, NKX2-1, and PGC) in non-small cell lung cancer (NSCLC). The results showed that the expression of the eight genes was remarkably reduced in cancer tissues and was related to clinical cancer stage. Survival analysis of the eight genes revealed that low expression of CLDN18, FOXA2, NKX2-1, PGC, SFTPB, SFTPC, and SFTPD was significantly related to reduced progression-free survival (FP), and low CLDN18, FOXA2, and SFTPD mRNA expression led to short postprogression survival (PPS). Meanwhile, alterations of the eight AT II-associated genes covered 273 of 1053 NSCLC samples (26%). Additionally, the expression levels of the eight genes were significantly correlated with the infiltration of diverse immune cells, including six types of CD4+ T cells, macrophages, neutrophils, B cells, CD8+ T cells, and dendritic cells. In summary, this study provides clues to the value of the eight AT II-associated genes as clinical biomarkers and therapeutic targets in NSCLC and might offer new ideas to assist the design of new immunotherapies.
7
Tonini V, Zanni M. Early diagnosis of pancreatic cancer: What strategies to avoid a foretold catastrophe. World J Gastroenterol 2022; 28:4235-4248. [PMID: 36159004 PMCID: PMC9453775 DOI: 10.3748/wjg.v28.i31.4235] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/24/2022] [Revised: 05/18/2022] [Accepted: 07/25/2022] [Indexed: 02/06/2023]
Abstract
While great strides in improving survival rates have been made for most cancers in recent years, pancreatic ductal adenocarcinoma (PDAC) remains one of the solid tumors with the worst prognosis. PDAC mortality often overlaps with incidence. Surgical resection is the only potentially curative treatment, but it can be performed in a very limited number of cases. In order to improve the prognosis of PDAC, there are ideally two possible ways: the discovery of new strategies or drugs that will make it possible to treat the tumor more successfully or an earlier diagnosis that will allow patients to be operated on at a less advanced stage. The aim of this review was to summarize all the possible strategies available today for the early diagnosis of PDAC and the paths that research needs to take to make this goal ever closer. All the most recent studies on risk factors and screening modalities, new laboratory tests including liquid biopsy, new imaging methods and possible applications of artificial intelligence and machine learning were reviewed and commented on. Unfortunately, in 2022 the results for this type of cancer still remain discouraging, while a catastrophic increase in cases is expected in the coming years. The article was also written with the aim of highlighting the urgency of devoting more attention and resources to this pathology in order to reach a solution that seems more and more unreachable every day.
Affiliation(s)
- Valeria Tonini
  - Department of Medical and Surgical Sciences, University of Bologna, Bologna 40138, Italy
- Manuel Zanni
  - Department of Medical and Surgical Sciences, University of Bologna, Bologna 40138, Italy
8
Jo Y, Yeo MK, Dao T, Kwon J, Yi H, Ryu D. Machine learning-featured Secretogranin V is a circulating diagnostic biomarker for pancreatic adenocarcinomas associated with adipopenia. Front Oncol 2022; 12:942774. [PMID: 36059698 PMCID: PMC9428794 DOI: 10.3389/fonc.2022.942774] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2022] [Accepted: 07/25/2022] [Indexed: 11/13/2022]
Abstract
Background: Pancreatic cancer is one of the most fatal gastrointestinal malignancies, and its early diagnosis is challenging because of the lack of distinctive symptoms and specific biomarkers. The exact etiology of pancreatic cancer is unknown, making the development of reliable biomarkers difficult. The accumulation of patient-derived omics data along with technological advances in artificial intelligence is giving way to a new era in the discovery of suitable biomarkers. Methods: We performed machine learning (ML)-based modeling using four independent transcriptomic datasets, including GSE16515, GSE62165, GSE71729, and the pancreatic adenocarcinoma (PAC) dataset of the Cancer Genome Atlas. To find candidates for circulating biomarkers, we exported expression profiles of 1,703 genes encoding secretory proteins. Integrating three transcriptomic datasets into either a training or test set, ML-based modeling distinguishing PAC from normal was carried out. Another ML model classifying long-lived and short-lived patients with PAC was also built to select prognosis-associated features. Finally, the circulating level of SCG5 in plasma was determined in an independent cohort (non-tumor = 25 and pancreatic cancer = 25). We also investigated the impact of SCG5 on adipocyte biology using recombinant protein. Results: Three distinct ML classifiers selected 29, 64, and 18 feature genes, respectively, with SCG5 the only gene common to all three. Consistent with the predictions of the ML models, SCG5 transcript levels were significantly reduced in PAC and decreased further with tumor progression, indicating its potential as a diagnostic as well as prognostic marker for PAC. External validation using plasma samples confirmed that SCG5 was significantly reduced in patients with PAC compared with controls. Interestingly, plasma SCG5 levels were correlated with the body mass index and age of donors, implying that pancreas-originated SCG5 could regulate energy metabolism systemically. Additionally, analyses using publicly available Genotype-Tissue Expression datasets, including adipose tissue histology and pancreatic SCG5 expression, further validated the association between pancreatic SCG5 expression and the size of subcutaneous adipocytes in humans. However, we did not observe any definite effect of recombinant SCG5 (rSCG5) on cultured adipocytes in 2D in vitro culture. Conclusion: Circulating SCG5, which may be associated with adipopenia, is a promising diagnostic biomarker for PAC.
Affiliation(s)
- Yunju Jo
  - Department of Molecular Cell Biology, Sungkyunkwan University (SKKU) School of Medicine, Suwon, South Korea
- Min-Kyung Yeo
  - Department of Pathology, Chungnam National University School of Medicine, Daejeon, South Korea
- Tam Dao
  - Department of Molecular Cell Biology, Sungkyunkwan University (SKKU) School of Medicine, Suwon, South Korea
- Jeongho Kwon
  - Department of Molecular Cell Biology, Sungkyunkwan University (SKKU) School of Medicine, Suwon, South Korea
- Hyon‐Seung Yi
  - Department of Medical Science, Chungnam National University School of Medicine, Daejeon, South Korea
  - Laboratory of Endocrinology and Immune System, Chungnam National University School of Medicine, Daejeon, South Korea
  - *Correspondence: Hyon‐Seung Yi; Dongryeol Ryu
- Dongryeol Ryu
  - Department of Molecular Cell Biology, Sungkyunkwan University (SKKU) School of Medicine, Suwon, South Korea
  - *Correspondence: Hyon‐Seung Yi; Dongryeol Ryu
9
Yin H, Zhang F, Yang X, Meng X, Miao Y, Noor Hussain MS, Yang L, Li Z. Research trends of artificial intelligence in pancreatic cancer: a bibliometric analysis. Front Oncol 2022; 12:973999. [PMID: 35982967 PMCID: PMC9380440 DOI: 10.3389/fonc.2022.973999] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 06/20/2022] [Accepted: 07/13/2022] [Indexed: 01/03/2023]
Abstract
Purpose: We evaluated research on artificial intelligence (AI) in pancreatic cancer (PC) through bibliometric analysis and explored the research hotspots and current status from 1997 to 2021. Methods: Publications related to AI in PC from 1997 to 2021 were retrieved from the Web of Science Core Collection (WoSCC). The Bibliometrix package of R software 4.0.3 and VOSviewer were used for bibliometric analysis. Results: A total of 587 publications in this field were retrieved from the WoSCC database. After 2018, the number of publications grew rapidly. The United States and Johns Hopkins University were the most influential country and institution, respectively. A total of 2805 keywords were investigated, 81 of which appeared more than 10 times. Co-occurrence analysis categorized these keywords into five clusters: (1) AI in biology of PC, (2) AI in pathology and radiology of PC, (3) AI in the therapy of PC, (4) AI in risk assessment of PC and (5) AI in endoscopic ultrasonography (EUS) of PC. Trend topics and thematic maps show that the keywords “diagnosis”, “survival”, “classification”, and “management” are the research hotspots in this field. Conclusion: Research on AI in pancreatic cancer is still at an initial stage. Currently, AI is widely studied in biology, diagnosis, treatment, risk assessment, and EUS of pancreatic cancer. This bibliometric study provides insight into AI in PC research and can help researchers identify new research directions.
Affiliation(s)
- Hua Yin
  - Department of Gastroenterology, General Hospital of Ningxia Medical University, Yinchuan, China
  - Postgraduate Training Base in Shanghai Gongli Hospital, Ningxia Medical University, Shanghai, China
- Feixiong Zhang
  - Department of Gastroenterology, General Hospital of Ningxia Medical University, Yinchuan, China
- Xiaoli Yang
  - Department of Gastroenterology, General Hospital of Ningxia Medical University, Yinchuan, China
- Xiangkun Meng
  - Department of Gastroenterology, General Hospital of Ningxia Medical University, Yinchuan, China
- Yu Miao
  - Department of Gastroenterology, General Hospital of Ningxia Medical University, Yinchuan, China
- Li Yang
  - Department of Gastroenterology, General Hospital of Ningxia Medical University, Yinchuan, China
  - *Correspondence: Zhaoshen Li; Li Yang
- Zhaoshen Li
  - Postgraduate Training Base in Shanghai Gongli Hospital, Ningxia Medical University, Shanghai, China
  - Clinical Medical College, Ningxia Medical University, Yinchuan, China
  - *Correspondence: Zhaoshen Li; Li Yang
10
Rangwani S, Ardeshna DR, Rodgers B, Melnychuk J, Turner R, Culp S, Chao WL, Krishna SG. Application of Artificial Intelligence in the Management of Pancreatic Cystic Lesions. Biomimetics (Basel) 2022; 7:79. [PMID: 35735595 PMCID: PMC9221027 DOI: 10.3390/biomimetics7020079] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2022] [Revised: 06/07/2022] [Accepted: 06/07/2022] [Indexed: 12/10/2022]
Abstract
The rate of incidentally detected pancreatic cystic lesions (PCLs) has increased over the past decade and was recently reported at 8%. These lesions pose a unique challenge, as each subtype of PCL carries a different risk of malignant transformation, ranging from 0% (pancreatic pseudocyst) to 34–68% (main duct intraductal papillary mucinous neoplasm). It is imperative to correctly risk-stratify the malignant potential of these lesions in order to provide the correct care course for the patient, ranging from monitoring to surgical intervention. Even with the multiplicity of guidelines (i.e., the American Gastroenterology Association guidelines and Fukuoka/International Consensus guidelines) and multitude of diagnostic information, risk stratification of PCLs falls short. Studies have reported that 25–64% of patients undergoing PCL resection have pancreatic cysts with no malignant potential, and up to 78% of mucin-producing cysts resected harbor no malignant potential on pathological evaluation. Clinicians are now incorporating artificial intelligence technology to aid in the management of these difficult lesions. This review article focuses on advancements in artificial intelligence within digital pathomics, radiomics, and genomics as they apply to the diagnosis and risk stratification of PCLs.
Affiliation(s)
- Shiva Rangwani
  - Department of Internal Medicine, Ohio State University Wexner Medical Center, Columbus, OH 43210, USA
- Devarshi R. Ardeshna
  - Department of Internal Medicine, Ohio State University Wexner Medical Center, Columbus, OH 43210, USA
- Brandon Rodgers
  - College of Medicine, The Ohio State University, Columbus, OH 43210, USA
- Jared Melnychuk
  - College of Medicine, The Ohio State University, Columbus, OH 43210, USA
- Ronald Turner
  - College of Medicine, The Ohio State University, Columbus, OH 43210, USA
- Stacey Culp
  - Department of Biomedical Informatics, The Ohio State University College of Medicine, Columbus, OH 43210, USA
- Wei-Lun Chao
  - Department of Computer Science and Engineering, The Ohio State University, Columbus, OH 43210, USA
- Somashekar G. Krishna
  - Department of Gastroenterology, Hepatology, and Nutrition, The Ohio State University Wexner Medical Center, Columbus, OH 43210, USA
  - Correspondence: Tel.: +614-293-6255
11
Ahmed Z, Zulfiqar H, Khan AA, Gul I, Dao FY, Zhang ZY, Yu XL, Tang L. iThermo: A Sequence-Based Model for Identifying Thermophilic Proteins Using a Multi-Feature Fusion Strategy. Front Microbiol 2022; 13:790063. [PMID: 35273581 PMCID: PMC8902591 DOI: 10.3389/fmicb.2022.790063] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/06/2021] [Accepted: 01/10/2022] [Indexed: 01/20/2023]
Abstract
Thermophilic proteins have important application value in biotechnology and industrial processes. The correct identification of thermophilic proteins provides important information for the application of these proteins in engineering. Biochemistry-based identification of thermophilic proteins is laborious, time-consuming, and costly. Therefore, there is an urgent need for a fast and accurate method to identify thermophilic proteins. Considering this urgency, we constructed a reliable benchmark dataset containing 1,368 thermophilic and 1,443 non-thermophilic proteins. A multi-layer perceptron (MLP) model based on a multi-feature fusion strategy was proposed to discriminate thermophilic proteins from non-thermophilic proteins. On an independent data set, the proposed model achieved an accuracy of 96.26%, which demonstrates that the model has good application prospects. To make the model convenient to use, a user-friendly software package called iThermo was established and can be freely accessed at http://lin-group.cn/server/iThermo/index.html. The high accuracy of the model and the practicability of the developed software package indicate that this study can accelerate the discovery and engineering application of thermally stable proteins.
Affiliation(s)
- Zahoor Ahmed
- School of Life Sciences and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Hasan Zulfiqar
- School of Life Sciences and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Abdullah Aman Khan
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China
- Sichuan Artificial Intelligence Research Institute, Yibin, China
- Ijaz Gul
- School of Life Sciences and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Tsinghua Shenzhen International Graduate School, Institute of Biopharmaceutical and Health Engineering, Tsinghua University, Shenzhen, China
- Fu-Ying Dao
- School of Life Sciences and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Zhao-Yue Zhang
- School of Life Sciences and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Xiao-Long Yu
- School of Materials Science and Engineering, Hainan University, Haikou, China
- Lixia Tang
- School of Life Sciences and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
12
Lin X. Genomic Variation Prediction: A Summary From Different Views. Front Cell Dev Biol 2021; 9:795883. [PMID: 34901036 PMCID: PMC8656232 DOI: 10.3389/fcell.2021.795883] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/15/2021] [Accepted: 11/11/2021] [Indexed: 12/02/2022] Open
Abstract
Structural variations in the genome are closely related to human health and the occurrence and development of various diseases. To understand the mechanisms of diseases, find pathogenic targets, and carry out personalized precision medicine, it is critical to detect such variations. The rapid development of high-throughput sequencing technologies has accelerated the accumulation of large amounts of genomic mutation data, including synonymous mutations. Identifying pathogenic synonymous mutations that play important roles in the occurrence and development of diseases from all the available mutation data is of great importance. In this paper, machine learning theories and methods are reviewed, efficient and accurate pathogenic synonymous mutation prediction methods are developed, and a standardized three-level variant analysis framework is constructed. In addition, multiple variation tolerance prediction models are studied and integrated, and new ideas for structural variation detection based on deep information mining are explored.
Affiliation(s)
- Xiuchun Lin
- College of Information and Electrical Engineering, China Agricultural University, Beijing, China
13
Zhou S, Meng Q, Li L, Hai L, Wang Z, Li Z, Sun Y. Identification of a Qualitative Signature for the Diagnosis of Dementia With Lewy Bodies. Front Genet 2021; 12:758103. [PMID: 34868234 PMCID: PMC8640079 DOI: 10.3389/fgene.2021.758103] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2021] [Accepted: 10/28/2021] [Indexed: 11/24/2022] Open
Abstract
Background and purpose: Diagnosis of dementia with Lewy bodies (DLB) is highly challenging, primarily due to a lack of valid and reliable diagnostic tools. To date, there is no report of a qualitative signature for the diagnosis of DLB. We aimed to develop a blood-based qualitative signature for differentiating DLB patients from healthy controls. Methods: The GSE120584 dataset was downloaded from the public database Gene Expression Omnibus (GEO). We combined multiple methods to select features based on the within-sample relative expression orderings (REOs) of microRNA (miRNA) pairs. Specifically, we first quickly selected miRNA pairs related to DLB by identifying reversal stable miRNA pairs. Then, an optimal miRNA pair subset was extracted by random forest (RF) and support vector machine-recursive feature elimination (SVM-RFE) methods. Furthermore, we applied logistic regression (LR) and SVM to build several prediction models. The model performance was assessed using receiver operating characteristic (ROC) curve analysis. Lastly, we conducted bioinformatics analyses to explore the molecular mechanisms of the discovered miRNAs. Results: A qualitative signature consisting of 17 miRNA pairs and two clinical factors was identified for discriminating DLB patients from healthy controls. The signature is robust against experimental batch effects and applicable at the individual level. The accuracies of the signature-based models on the test set are 82.61 and 79.35%, respectively, indicating that the signature has acceptable discrimination performance. Moreover, bioinformatics analyses revealed that predicted target genes were enriched in 11 GO terms and 2 KEGG pathways. In addition, five potential hub genes were found for DLB, including SRF, MAPK1, YWHAE, RPS6KA3, and KDM7A. Conclusion: This study provided a blood-based qualitative signature with the potential to be used as an effective tool to improve the accuracy of DLB diagnosis.
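The within-sample relative expression ordering (REO) encoding described in this abstract can be illustrated with a minimal Python sketch. It assumes each sample is a mapping from miRNA name to expression value and that a pair (a, b) is encoded as 1 when a is expressed above b in that sample, otherwise 0. The function name, miRNA labels, and values are hypothetical and are not taken from the study.

```python
def reo_features(expression, pairs):
    """Encode one sample as binary within-sample ordering features.

    expression: dict mapping miRNA name -> expression value (hypothetical data)
    pairs: list of (a, b) name tuples; feature is 1 if expression[a] > expression[b]
    """
    return [1 if expression[a] > expression[b] else 0 for a, b in pairs]


# Hypothetical sample and miRNA pairs, for illustration only.
sample = {"miR-A": 5.2, "miR-B": 3.1, "miR-C": 7.8}
pairs = [("miR-A", "miR-B"), ("miR-A", "miR-C")]

features = reo_features(sample, pairs)

# A monotone rescaling of the whole sample leaves the orderings unchanged.
rescaled = {name: 2.0 * value + 10.0 for name, value in sample.items()}
features_rescaled = reo_features(rescaled, pairs)
```

Because only the ordering within a single sample is used, any order-preserving shift or scaling of that sample's measurements yields the same features, which is one plausible reading of why the abstract describes the signature as robust to batch effects and usable at the individual level.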
Affiliation(s)
- Shu Zhou
- Institute of Biomedical and Health Engineering, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Central Laboratory, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital & Shenzhen Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Shenzhen, China
- Qingchun Meng
- Central Laboratory, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital & Shenzhen Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Shenzhen, China
- Lingyu Li
- Central Laboratory, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital & Shenzhen Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Shenzhen, China
- Luo Hai
- Central Laboratory, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital & Shenzhen Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Shenzhen, China
- Zexuan Wang
- Central Laboratory, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital & Shenzhen Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Shenzhen, China
- Zhicheng Li
- Institute of Biomedical and Health Engineering, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Yingli Sun
- Central Laboratory, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital & Shenzhen Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Shenzhen, China
14
Chen J, Zhang Q, Liu T, Tang H. Roles of M6A Regulators in Hepatocellular Carcinoma: Promotion or Suppression. Curr Gene Ther 2021; 22:40-50. [PMID: 34825870 DOI: 10.2174/1566523221666211126105940] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2021] [Revised: 06/15/2021] [Accepted: 10/14/2021] [Indexed: 11/22/2022]
Abstract
Hepatocellular carcinoma (HCC) is the sixth most commonly diagnosed cancer globally and has a poor prognosis. Although the pathological factors of hepatocellular carcinoma are well elucidated, the underlying molecular mechanisms remain unclear. N6-methyladenosine (m6A) is an adenosine methylation occurring at the N6 site, which is the most prevalent modification of eukaryotic mRNA. Recent studies have shown that m6A can regulate gene expression, thus modulating the processes of cell self-renewal, differentiation, and apoptosis. The methyl groups in m6A are installed by methyltransferases ("writers"), removed by demethylases ("erasers"), and recognized by m6A-binding proteins ("readers"). In this review, we discuss the roles of the above regulators in the progression and prognosis of HCC, and summarize the clinical association between m6A modification and hepatocellular carcinoma, so as to provide more valuable information for clinical treatment.
Affiliation(s)
- Jiamao Chen
- School of Basic Medical Sciences, Southwest Medical University, Luzhou, China
- Qian Zhang
- School of Basic Medical Sciences, Southwest Medical University, Luzhou, China
- Ting Liu
- School of Basic Medical Sciences, Southwest Medical University, Luzhou, China
- Hua Tang
- School of Basic Medical Sciences, Southwest Medical University, Luzhou, China
15
Hayashi H, Uemura N, Matsumura K, Zhao L, Sato H, Shiraishi Y, Yamashita YI, Baba H. Recent advances in artificial intelligence for pancreatic ductal adenocarcinoma. World J Gastroenterol 2021; 27:7480-7496. [PMID: 34887644 PMCID: PMC8613738 DOI: 10.3748/wjg.v27.i43.7480] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 04/03/2021] [Revised: 08/02/2021] [Accepted: 11/15/2021] [Indexed: 02/06/2023] Open
Abstract
Pancreatic ductal adenocarcinoma (PDAC) remains the most lethal type of cancer. The 5-year survival rate for patients with early-stage diagnosis can be as high as 20%, suggesting that early diagnosis plays a pivotal role in the prognostic improvement of PDAC cases. In the medical field, the broad availability of biomedical data has led to the advent of the "big data" era. To overcome this deadly disease, how to fully exploit big data is a new challenge in the era of precision medicine. Artificial intelligence (AI) is the ability of a machine to learn and display intelligence to solve problems. AI can help to transform big data into clinically actionable insights more efficiently, reduce inevitable errors to improve diagnostic accuracy, and make real-time predictions. AI-based omics analyses will become the next alternative approach to overcome this poor-prognosis disease by discovering biomarkers for early detection, providing molecular/genomic subtyping, offering treatment guidance, and predicting recurrence and survival. Advances in AI may therefore improve PDAC survival outcomes in the near future. The present review mainly focuses on recent advances of AI in PDAC for clinicians. We believe that breakthroughs will soon emerge to fight this deadly disease using AI-navigated precision medicine.
Affiliation(s)
- Hiromitsu Hayashi
- Department of Gastroenterological Surgery, Graduate School of Life Sciences, Kumamoto University, Kumamoto 860-8556, Japan
- Norio Uemura
- Department of Gastroenterological Surgery, Graduate School of Life Sciences, Kumamoto University, Kumamoto 860-8556, Japan
- Kazuki Matsumura
- Department of Gastroenterological Surgery, Graduate School of Life Sciences, Kumamoto University, Kumamoto 860-8556, Japan
- Liu Zhao
- Department of Gastroenterological Surgery, Graduate School of Life Sciences, Kumamoto University, Kumamoto 860-8556, Japan
- Hiroki Sato
- Department of Gastroenterological Surgery, Graduate School of Life Sciences, Kumamoto University, Kumamoto 860-8556, Japan
- Yuta Shiraishi
- Department of Gastroenterological Surgery, Graduate School of Life Sciences, Kumamoto University, Kumamoto 860-8556, Japan
- Yo-ichi Yamashita
- Department of Gastroenterological Surgery, Graduate School of Life Sciences, Kumamoto University, Kumamoto 860-8556, Japan
- Hideo Baba
- Department of Gastroenterological Surgery, Graduate School of Life Sciences, Kumamoto University, Kumamoto 860-8556, Japan
16
Jiao S, Zou Q, Guo H, Shi L. iTTCA-RF: a random forest predictor for tumor T cell antigens. J Transl Med 2021; 19:449. [PMID: 34706730 PMCID: PMC8554859 DOI: 10.1186/s12967-021-03084-x] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 06/23/2021] [Accepted: 09/16/2021] [Indexed: 12/21/2022] Open
Abstract
BACKGROUND: Cancer is one of the most serious diseases threatening human health. Cancer immunotherapy represents the most promising treatment strategy due to its high efficacy and selectivity and lower side effects compared with traditional treatment. The identification of tumor T cell antigens is one of the most important tasks for antitumor vaccine development and molecular function investigation. Although several machine learning predictors have been developed to identify tumor T cell antigens, accurate identification with existing methodology is still challenging. METHODS: In this study, we used a non-redundant dataset of 592 tumor T cell antigens (positive samples) and 393 non-tumor T cell antigens (negative samples). Four types of feature encoding methods have been studied to build an efficient predictor, including amino acid composition, global protein sequence descriptors and grouped amino acid and peptide composition. To improve the feature representation ability of the hybrid features, we further employed a two-step feature selection technique to search for the optimal feature subset. The final prediction model was constructed using the random forest algorithm. RESULTS: Finally, the top 263 informative features were selected to train the random forest classifier for detecting tumor T cell antigen peptides. iTTCA-RF provides satisfactory performance, with balanced accuracy, specificity and sensitivity values of 83.71%, 78.73% and 88.69% over tenfold cross-validation as well as 73.14%, 62.67% and 83.61% over independent tests, respectively. The online prediction server is freely accessible at http://lab.malab.cn/~acy/iTTCA. CONCLUSIONS: We have shown that the proposed predictor iTTCA-RF is superior to the other latest models, and will hopefully become an effective and useful tool for identifying tumor T cell antigens presented in the context of major histocompatibility complex class I.
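The balanced accuracy figures quoted in this abstract can be checked against the reported sensitivity and specificity, assuming the standard definition of balanced accuracy as the mean of the two. The short Python sketch below is only an arithmetic check under that assumption; the function name is illustrative and does not come from the paper.

```python
def balanced_accuracy(sensitivity, specificity):
    """Balanced accuracy as the mean of sensitivity and specificity (in percent)."""
    return (sensitivity + specificity) / 2


# Values (in percent) as reported in the abstract.
cross_validation = balanced_accuracy(sensitivity=88.69, specificity=78.73)
independent_test = balanced_accuracy(sensitivity=83.61, specificity=62.67)

print(round(cross_validation, 2))  # 83.71
print(round(independent_test, 2))  # 73.14
```

Both results agree with the reported 83.71% and 73.14%, so the quoted metrics are internally consistent under this definition.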
Affiliation(s)
- Shihu Jiao
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
- Quan Zou
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
- Huannan Guo
- Department of Oncology, General Hospital of Heilongjiang Province Land Reclamation Bureau, Harbin, China
- Lei Shi
- Department of Spine Surgery, Changzheng Hospital, Naval Medical University, Shanghai, China
17
Feng Y, Wang Z, Yang N, Liu S, Yan J, Song J, Yang S, Zhang Y. Identification of Biomarkers for Cervical Cancer Radiotherapy Resistance Based on RNA Sequencing Data. Front Cell Dev Biol 2021; 9:724172. [PMID: 34414195 PMCID: PMC8369412 DOI: 10.3389/fcell.2021.724172] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/12/2021] [Accepted: 07/14/2021] [Indexed: 11/28/2022] Open
Abstract
Cervical cancer, a common gynecological malignancy, threatens the health and lives of women. Resistance to radiotherapy is the primary cause of treatment failure and is mainly related to differences in the inherent vulnerability of tumors after radiotherapy. Here, we investigated signature genes associated with poor response to radiotherapy by analyzing an independent cervical cancer dataset from the Gene Expression Omnibus, including pre-irradiation and mid-irradiation information. A total of 316 significantly differentially expressed genes were identified. The correlations between these genes were investigated through Pearson correlation analysis. Subsequently, a random forest model was used to determine cancer-related genes, and all genes were ranked by random forest scoring. The top 30 candidate genes were selected for uncovering their biological functions. Functional enrichment analysis revealed that the biological functions were chiefly enriched in tumor immune responses, such as cellular defense response, negative regulation of immune system process, T cell activation, neutrophil activation involved in immune response, regulation of antigen processing and presentation, and peptidyl-tyrosine autophosphorylation. Finally, the top 30 genes were screened and analyzed through literature verification. After validation, 10 genes (KLRK1, LCK, KIF20A, CD247, FASLG, CD163, ZAP70, CD8B, ZNF683, and F10) met our objective. Overall, the present research confirmed that integrated bioinformatics methods can contribute to the understanding of the molecular mechanisms and potential therapeutic targets underlying radiotherapy resistance in cervical cancer.
Affiliation(s)
- Yue Feng
- Department of Gynecological Radiotherapy, Harbin Medical University Cancer Hospital, Harbin, China
- Zhao Wang
- Department of Gynecological Radiotherapy, Harbin Medical University Cancer Hospital, Harbin, China
- Nan Yang
- Department of Gynecological Radiotherapy, Harbin Medical University Cancer Hospital, Harbin, China
- Sijia Liu
- Department of Gynecological Radiotherapy, Harbin Medical University Cancer Hospital, Harbin, China
- Jiazhuo Yan
- Department of Gynecological Radiotherapy, Harbin Medical University Cancer Hospital, Harbin, China
- Jiayu Song
- Department of Gynecological Radiotherapy, Harbin Medical University Cancer Hospital, Harbin, China
- Shanshan Yang
- Department of Gynecological Radiotherapy, Harbin Medical University Cancer Hospital, Harbin, China
- Yunyan Zhang
- Department of Gynecological Radiotherapy, Harbin Medical University Cancer Hospital, Harbin, China
18
Khatun MS, Alam MA, Shoombuatong W, Mollah MNH, Kurata H, Hasan MM. Recent development of bioinformatics tools for microRNA target prediction. Curr Med Chem 2021; 29:865-880. [PMID: 34348604 DOI: 10.2174/0929867328666210804090224] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/11/2021] [Revised: 06/10/2021] [Accepted: 06/15/2021] [Indexed: 11/22/2022]
Abstract
MicroRNAs (miRNAs) are central players that regulate the post-transcriptional processes of gene expression. Binding of miRNAs to target mRNAs can repress their expression by inducing the degradation or by inhibiting the translation of the target mRNAs. High-throughput experimental approaches for miRNA target identification are costly and time-consuming, depending on various factors. It is vitally important to develop bioinformatics methods for accurately predicting miRNA targets. With the increase of RNA sequences in the post-genomic era, bioinformatics methods are being developed for miRNA studies, especially for miRNA target prediction. This review summarizes the current development of state-of-the-art bioinformatics tools for miRNA target prediction and points out the progress and limitations of the available miRNA databases and their working principles. Finally, we discuss the caveats and perspectives of the next-generation algorithms for the prediction of miRNA targets.
Affiliation(s)
- Mst Shamima Khatun
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan
- Md Ashad Alam
- Tulane Center for Biomedical Informatics and Genomics, Division of Biomedical Informatics and Genomics, John W. Deming Department of Medicine, School of Medicine, Tulane University, New Orleans, LA 70112, United States
- Watshara Shoombuatong
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand
- Md Nurul Haque Mollah
- Laboratory of Bioinformatics, Department of Statistics, University of Rajshahi, Rajshahi, Bangladesh
- Japan Society for the Promotion of Science, 5-3-1 Kojimachi, Chiyoda-ku, Tokyo 102-0083, Japan
- Hiroyuki Kurata
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan
- Md Mehedi Hasan
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan
19
Zulfiqar H, Yuan SS, Huang QL, Sun ZJ, Dao FY, Yu XL, Lin H. Identification of cyclin protein using gradient boost decision tree algorithm. Comput Struct Biotechnol J 2021; 19:4123-4131. [PMID: 34527186 PMCID: PMC8346528 DOI: 10.1016/j.csbj.2021.07.013] [Citation(s) in RCA: 35] [Impact Index Per Article: 11.7] [Received: 03/31/2021] [Revised: 07/15/2021] [Accepted: 07/15/2021] [Indexed: 12/12/2022] Open
Abstract
Cyclin proteins regulate the cell cycle by forming complexes with cyclin-dependent kinases, which activates cell cycle progression. Correct recognition of cyclin proteins could provide key clues for studying their functions. However, their sequences share low similarity, which results in poor prediction by sequence similarity-based methods. Thus, it is urgent to construct a machine learning model to identify cyclin proteins. This study aimed to develop a computational model to discriminate cyclin proteins from non-cyclin proteins. In our model, protein sequences were encoded by seven kinds of features: amino acid composition, composition of k-spaced amino acid pairs, tripeptide composition, pseudo amino acid composition, Geary correlation, normalized Moreau-Broto autocorrelation and composition/transition/distribution. Afterward, these features were optimized by using analysis of variance (ANOVA) and minimum redundancy maximum relevance (mRMR) with the incremental feature selection (IFS) technique. A gradient boost decision tree (GBDT) classifier was trained on the optimal features. Five-fold cross-validated results showed that our model could identify cyclins with an accuracy of 93.06% and an AUC value of 0.971, which are higher than those of two recent studies on the same data.
Affiliation(s)
- Hasan Zulfiqar
- School of Life Science and Technology and Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Shi-Shi Yuan
- School of Life Science and Technology and Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Qin-Lai Huang
- School of Life Science and Technology and Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Zi-Jie Sun
- School of Life Science and Technology and Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Fu-Ying Dao
- School of Life Science and Technology and Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Xiao-Long Yu
- School of Materials Science and Engineering, Hainan University, Haikou 570228, China
- Hao Lin
- School of Life Science and Technology and Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
20
Hunt C, Montgomery S, Berkenpas JW, Sigafoos N, Oakley JC, Espinosa J, Justice N, Kishaba K, Hippe K, Si D, Hou J, Ding H, Cao R. Recent Progress of Machine Learning in Gene Therapy. Curr Gene Ther 2021; 22:132-143. [PMID: 34161210 DOI: 10.2174/1566523221666210622164133] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 02/02/2021] [Revised: 03/15/2021] [Accepted: 04/02/2021] [Indexed: 11/22/2022]
Abstract
With new developments in biomedical technology, it is now a viable therapeutic treatment to alter genes with techniques like CRISPR. At the same time, it is increasingly cheaper to do whole genome sequencing, resulting in rapid advancement in gene therapy and editing in precision medicine. Thus, understanding the current industry and academic applications of gene therapy provides an important backdrop to future scientific developments. Additionally, machine learning and artificial intelligence techniques allow for the reduction of time and money spent in the development of new gene therapy products and techniques. In this paper, we survey the current progress of gene therapy treatments for several diseases and explore machine learning applications in gene therapy. We also discuss the ethical implications of gene therapy and the use of machine learning in precision medicine. Machine learning and gene therapy are both topics gaining popularity in various publications, and we conclude that there is still room for continued research and application of machine learning techniques in the gene therapy field.
Affiliation(s)
- Cassandra Hunt
- Department of Computer Science, Pacific Lutheran University, Tacoma, WA, United States
- Sandra Montgomery
- Department of Physics, Pacific Lutheran University, Tacoma, WA, United States
- Noel Sigafoos
- Department of Computer Science, Pacific Lutheran University, Tacoma, WA, United States
- John Christian Oakley
- Department of Computer Science, Pacific Lutheran University, Tacoma, WA, United States
- Jacob Espinosa
- Department of Mathematics, Pacific Lutheran University, Tacoma, WA, United States
- Nicola Justice
- Department of Mathematics, Pacific Lutheran University, Tacoma, WA, United States
- Kiyomi Kishaba
- Department of Humanities, Pacific Lutheran University, Tacoma, WA, United States
- Kyle Hippe
- Department of Computer Science, Pacific Lutheran University, Tacoma, WA, United States
- Dong Si
- Division of Computing Software Systems, University of Washington-Bothell, Bothell, WA, United States
- Jie Hou
- Department of Computer Science, Saint Louis University, St. Louis, MO, United States
- Hui Ding
- School of Life Science and Technology and Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Renzhi Cao
- Department of Computer Science, Pacific Lutheran University, Tacoma, WA, United States
21
Zulfiqar H, Khan RS, Hassan F, Hippe K, Hunt C, Ding H, Song XM, Cao R. Computational identification of N4-methylcytosine sites in the mouse genome with machine-learning method. MATHEMATICAL BIOSCIENCES AND ENGINEERING: MBE 2021; 18:3348-3363. [PMID: 34198389 DOI: 10.3934/mbe.2021167] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 04/24/2023]
Abstract
N4-methylcytosine (4mC) is a kind of DNA modification that can regulate multiple biological processes. Correctly identifying 4mC sites in genomic sequences can provide precise knowledge about their genetic roles. This study aimed to develop an ensemble model to predict 4mC sites in the mouse genome. In the proposed model, DNA sequences were encoded by k-mer, enhanced nucleic acid composition and composition of k-spaced nucleic acid pairs. Subsequently, these features were optimized by using minimum redundancy maximum relevance (mRMR) with incremental feature selection (IFS) and five-fold cross-validation. The obtained optimal features were input into a random forest classifier for discriminating 4mC from non-4mC sites in mouse. On the independent dataset, our model yielded an overall accuracy of 85.41%, which was approximately 3.8%-6.3% higher than the two existing models, i4mC-Mouse and 4mCpred-EL, respectively. The data and source code of the model can be freely downloaded from https://github.com/linDing-groups/model_4mc.
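As an illustration of the k-mer encoding named in this abstract, the following Python sketch maps a DNA sequence to a vector of normalized k-mer frequencies. It is a generic version of the technique under stated assumptions (alphabet ACGT, overlapping windows, windows containing other characters skipped); the function name and example sequence are hypothetical, and this is not the authors' released code.

```python
from itertools import product


def kmer_frequencies(sequence, k=2):
    """Return normalized k-mer frequencies in lexicographic ACGT order.

    Overlapping windows of length k are counted; windows containing
    characters outside ACGT are ignored (an assumption of this sketch).
    """
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    windows = len(sequence) - k + 1
    if windows <= 0:
        raise ValueError("sequence is shorter than k")
    for i in range(windows):
        window = sequence[i:i + k]
        if window in counts:
            counts[window] += 1
    return [counts[kmer] / windows for kmer in kmers]


# Hypothetical example sequence; real inputs would be fixed-length windows
# centred on candidate cytosine sites.
vector = kmer_frequencies("ACGTAC", k=2)
```

For k = 2 the vector has 16 entries; such vectors would then typically be concatenated with the other encodings before feature selection and classification.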
Affiliation(s)
- Hasan Zulfiqar
- School of Life Science and Technology and Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Rida Sarwar Khan
- School of Life Science and Technology and Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Farwa Hassan
- School of Life Science and Technology and Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Kyle Hippe
- Department of Computer Science, Pacific Lutheran University, Tacoma 98447, USA
- Cassandra Hunt
- Department of Computer Science, Pacific Lutheran University, Tacoma 98447, USA
- Hui Ding
- School of Life Science and Technology and Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Xiao-Ming Song
- School of Life Science and Technology and Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- School of Life Sciences, North China University of Science and Technology, Tangshan, Hebei 063210, China
- Renzhi Cao
- Department of Computer Science, Pacific Lutheran University, Tacoma 98447, USA
22
Zhang Z, Cui F, Lin C, Zhao L, Wang C, Zou Q. Critical downstream analysis steps for single-cell RNA sequencing data. Brief Bioinform 2021; 22:6210064. [PMID: 33822873 DOI: 10.1093/bib/bbab105] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 01/05/2021] [Revised: 02/20/2021] [Accepted: 03/09/2021] [Indexed: 12/13/2022] Open
Abstract
Single-cell RNA sequencing (scRNA-seq) has enabled us to study biological questions at the single-cell level. Currently, many analysis tools are available to better utilize these relatively noisy data. In this review, we summarize the most widely used methods for critical downstream analysis steps (i.e. clustering, trajectory inference, cell-type annotation and integrating datasets). The advantages and limitations are comprehensively discussed, and we provide suggestions for choosing proper methods in different situations. We hope this paper will be useful for scRNA-seq data analysts and bioinformatics tool developers.
Affiliation(s)
- Zilong Zhang
- University of Electronic Science and Technology of China
- Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China
23
Screening of Prospective Plant Compounds as H1R and CL1R Inhibitors and Its Antiallergic Efficacy through Molecular Docking Approach. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2021. [DOI: 10.1155/2021/6683407] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 12/16/2022]
Abstract
Allergens have the ability to enter the body and cause illness. Leukotriene is the widespread allergen which could stimulate mast cells to discharge histamine, which causes allergy symptoms. An effective strategy for treating leukotriene-induced allergy is to find inhibitors of leukotriene or histamine activity among phytochemicals. For this purpose, a library of 8,500 phytochemicals was generated using MOE software. The structures of histamine-1 receptor and cysteinyl leukotriene receptor-1 were predicted by the homology modeling method through the SWISS model. The phytochemicals were docked with the predicted structures of histamine-1 and cysteinyl leukotriene receptor-1 in MOE software to determine the binding affinity of the phytochemicals against the targets. Moreover, chemoinformatics properties and ADMET of the phytochemicals were assessed to find the drug-likeness behavior of the compounds. Compound ID 10054216 has the lowest S-score value for the H-1 receptor, -18.9186 kcal/mol, which is lower than the standard's value of -15.167 kcal/mol. The other compounds 393471, 71448939, 10722577, and 442614 also showed better S-score values than the standard. Moreover, compound ID 11843082 has the lowest S-score value for CL1R, -15.481 kcal/mol, which is lower than the standard's value of -12.453 kcal/mol. The other compounds 72284, 5282102, 66559251, and 102506430 also showed better S-score values than the standard. In this research article, we performed molecular docking to find the best inhibitors against H1R and CL1R and their antiallergic efficacy. This in silico knowledge will be helpful in the near future for the design of novel, safe, and less costly H-1 receptor and CL1R inhibitors with the aim of improving human quality of life.