1
Wang X, Gao X, Fan X, Huai Z, Zhang G, Yao M, Wang T, Huang X, Lai L. WUREN: Whole-modal union representation for epitope prediction. Comput Struct Biotechnol J 2024; 23:2122-2131. [PMID: 38817963 PMCID: PMC11137340 DOI: 10.1016/j.csbj.2024.05.023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2024] [Revised: 05/14/2024] [Accepted: 05/14/2024] [Indexed: 06/01/2024]
Abstract
B-cell epitope identification plays a vital role in the development of vaccines, therapies, and diagnostic tools. Currently, molecular docking tools for B-cell epitope prediction are heavily influenced by empirical parameters and require significant computational resources, which makes it very challenging to meet large-scale prediction demands. When predicting epitopes from antigen-antibody complexes, current artificial intelligence algorithms cannot predict accurately because their protein feature representations are insufficient, indicating that a novel algorithm for efficient protein information extraction is urgently needed. In this paper, we introduce a multimodal model called WUREN (Whole-modal Union Representation for Epitope predictioN), which effectively combines sequence, graph, and structural features. It achieved AUC-PR scores of 0.213 and 0.193 on the solved structures and AlphaFold-generated structures, respectively, for the independent test proteins selected from the DiscoTope3 benchmark. Our findings indicate that WUREN is an efficient feature extraction model for protein complexes, with generalizable application potential in the development of protein-based drugs. Moreover, the streamlined framework of WUREN could be readily extended to model similar biomolecules, such as nucleic acids, carbohydrates, and lipids.
Affiliation(s)
- Xuezhe Fan
  - XtalPi Innovation Center, Beijing, China
- Zhe Huai
  - XtalPi Innovation Center, Beijing, China
- Lipeng Lai
  - XtalPi Innovation Center, Beijing, China
2
Fang Y, Wan J, Zeng Y. Use machine learning to predict pulmonary metastasis of esophageal cancer: a population-based study. J Cancer Res Clin Oncol 2024; 150:420. [PMID: 39283330 PMCID: PMC11405433 DOI: 10.1007/s00432-024-05937-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2024] [Accepted: 08/30/2024] [Indexed: 09/22/2024]
Abstract
BACKGROUND This study aims to establish a predictive model for assessing the risk of lung metastasis in esophageal cancer using machine learning techniques. METHODS Data on esophageal cancer patients from 2010 to 2020 were extracted from the Surveillance, Epidemiology, and End Results (SEER) database. Through univariate and multivariate logistic regression analyses, eight indicators related to the risk of lung metastasis were selected. These indicators were incorporated into six machine learning classifiers to develop corresponding predictive models. The performance of these models was evaluated and compared using metrics such as the area under the curve (AUC), accuracy, sensitivity, specificity, and F1 score. RESULTS A total of 20,249 confirmed cases of esophageal cancer were included in this study. Among them, 14,174 cases (70%) were assigned to the training set, while 6,075 cases (30%) constituted the internal test set. Primary site, tumor histology, tumor grade, T stage, N stage, brain metastasis, bone metastasis, and liver metastasis emerged as independent risk factors for esophageal cancer with lung metastasis. Among the six constructed models, the GBM-based model demonstrated superior performance during internal validation, with AUC, accuracy, sensitivity, and specificity of 0.803, 0.849, 0.604, and 0.867, respectively. CONCLUSION We have developed an online calculator based on the GBM model (https://lvgrkyxcgdvo7ugoyxyywe.streamlit.app/) to aid clinical decision-making and treatment planning.
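The threshold-based metrics named in this abstract (accuracy, sensitivity, specificity, F1 score) all derive from the four cells of a binary confusion matrix. The following is a minimal sketch of those definitions in plain Python; the function names and the toy labels are illustrative assumptions and are not taken from the study.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity, specificity and F1 from hard predictions."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall on positives
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # recall on negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical toy labels (1 = metastasis, 0 = no metastasis), for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

When positives are rare, as with metastasis in a population cohort, accuracy and specificity can be high while sensitivity stays modest, which is why the abstract reports all of them rather than accuracy alone.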
Affiliation(s)
- Ying Fang
  - Department of Joint Surgery, Hangzhou Xiaoshan Hospital of Traditional Chinese Medicine, Hangzhou, Zhejiang, China
- Jun Wan
  - Department of Emergency Surgery, Yangtze University Jingzhou Hospital, No. 26, Chuyuan Road, Jingzhou, Hubei, China
- Yukai Zeng
  - Department of Thoracic Surgery, China-Japan Union Hospital of Jilin University, No. 126 Xiantai Street, Changchun, Jilin, China
3
Wan J, Zeng Y. Prediction of hepatic metastasis in esophageal cancer based on machine learning. Sci Rep 2024; 14:14507. [PMID: 38914571 PMCID: PMC11196737 DOI: 10.1038/s41598-024-63213-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2024] [Accepted: 05/27/2024] [Indexed: 06/26/2024]
Abstract
This study aimed to establish a machine learning (ML) model for predicting hepatic metastasis in esophageal cancer. We retrospectively analyzed patients with esophageal cancer recorded in the Surveillance, Epidemiology, and End Results (SEER) database from 2010 to 2020. We identified 11 indicators associated with the risk of liver metastasis through univariate and multivariate logistic regression. Subsequently, these indicators were incorporated into six ML classifiers to build corresponding predictive models. The performance of these models was evaluated using the area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, and specificity. A total of 17,800 patients diagnosed with esophageal cancer were included in this study. Age, primary site, histology, tumor grade, T stage, N stage, surgical intervention, radiotherapy, chemotherapy, bone metastasis, and lung metastasis were independent risk factors for hepatic metastasis in esophageal cancer patients. Among the six models developed, the ML model constructed using the GBM algorithm exhibited the highest performance during internal validation of the dataset, with AUC, accuracy, sensitivity, and specificity of 0.885, 0.868, 0.667, and 0.888, respectively. Based on the GBM algorithm, we developed an accessible web-based prediction tool (accessible at https://project2-dngisws9d7xkygjcvnue8u.streamlit.app/ ) for predicting the risk of hepatic metastasis in esophageal cancer.
Affiliation(s)
- Jun Wan
  - Department of Emergency Surgery, Yangtze University Jingzhou Hospital, Jingzhou, China
- Yukai Zeng
  - Department of Thoracic Surgery, China-Japan Union Hospital of Jilin University, No. 126 Xiantai Street, Changchun, Jilin, China
4
Tu JB, Liao WJ, Long SP, Li MP, Gao XH. Construction and validation of a machine learning model for the diagnosis of juvenile idiopathic arthritis based on fecal microbiota. Front Cell Infect Microbiol 2024; 14:1371371. [PMID: 38524178 PMCID: PMC10957563 DOI: 10.3389/fcimb.2024.1371371] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2024] [Accepted: 02/26/2024] [Indexed: 03/26/2024]
Abstract
Purpose Human gut microbiota has been shown to be significantly associated with various inflammatory diseases. Therefore, this study aimed to develop an auxiliary tool for the diagnosis of juvenile idiopathic arthritis (JIA) based on fecal microbial biomarkers. Method The fecal metagenomic sequencing data associated with JIA were extracted from NCBI, and the sequencing data were transformed into relative abundances of microorganisms using data cleaning software (KneadData, Trimmomatic and Bowtie2) and comparison software (Kraken2 and Bracken). After that, the fecal microbes with high abundance were extracted for subsequent analysis. The extracted fecal microbes were further screened by least absolute shrinkage and selection operator (LASSO) regression, and the selected fecal microbe biomarkers were used for model training. In this study, we constructed six different machine learning (ML) models, and then selected the best model for constructing a JIA diagnostic tool by comparing the performance of the models based on a combined consideration of area under the receiver operating characteristic curve (AUC), accuracy, specificity, F1 score, calibration curves and clinical decision curves. In addition, to further explain the model, Permutation Importance analysis and Shapley Additive Explanations (SHAP) were performed to understand the contribution of each biomarker in the prediction process. Result A total of 231 individuals were included in this study, including 203 JIA patients and 28 non-JIA individuals. In the analysis of diversity at the genus level, the alpha diversity represented by the Shannon value was not significantly different between the two groups, while the beta diversity was slightly different. After selection by LASSO regression, 10 fecal microbe biomarkers were selected for model training. By comparing six different models, the XGB model showed the best performance, with average AUC, accuracy and F1 score of 0.976, 0.914 and 0.952, respectively, and was therefore used to construct the final JIA diagnosis model. Conclusion A JIA diagnosis model based on the XGB algorithm was constructed with excellent performance, which may assist physicians in early detection of JIA patients and improve the prognosis of JIA patients.
Affiliation(s)
- Jun-Bo Tu
  - Department of Orthopaedics, Xinfeng County People’s Hospital, Xinfeng, Jiangxi, China
- Wei-Jie Liao
  - Department of ICU, GanZhou People’s Hospital, GanZhou, Jiangxi, China
- Si-Ping Long
  - The First Clinical Medical College of Nanchang University, Nanchang, China
- Meng-Pan Li
  - The First Clinical Medical College of Nanchang University, Nanchang, China
  - Department of Orthopedics, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Xing-Hua Gao
  - Department of Orthopaedics, Guangzhou First People’s Hospital, South China University of Technology, Guangzhou, China
5
Kumar N, Bajiya N, Patiyal S, Raghava GPS. Multi-perspectives and challenges in identifying B-cell epitopes. Protein Sci 2023; 32:e4785. [PMID: 37733481 PMCID: PMC10578127 DOI: 10.1002/pro.4785] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2023] [Revised: 09/11/2023] [Accepted: 09/16/2023] [Indexed: 09/23/2023]
Abstract
The identification of B-cell epitopes (BCEs) in antigens is a crucial step in developing recombinant vaccines or immunotherapies for various diseases. Over the past four decades, numerous in silico methods have been developed for predicting BCEs. However, existing reviews have only covered specific aspects, such as the progress in predicting conformational or linear BCEs. Therefore, in this paper, we have undertaken a systematic approach to provide a comprehensive review covering all aspects associated with the identification of BCEs. First, we have covered the experimental techniques developed over the years for identifying linear and conformational epitopes, including the limitations and challenges associated with these techniques. Second, we have briefly described the historical perspectives and resources that maintain experimentally validated information on BCEs. Third, we have extensively reviewed the computational methods developed for predicting conformational BCEs from the structure of the antigen, as well as the methods for predicting conformational epitopes from the sequence. Fourth, we have systematically reviewed the in silico methods developed in the last four decades for predicting linear or continuous BCEs. Finally, we have discussed the overall challenge of identifying continuous or conformational BCEs. In this review, we only listed major computational resources; a complete list with the URL is available from the BCinfo website (https://webs.iiitd.edu.in/raghava/bcinfo/).
Affiliation(s)
- Nishant Kumar
  - Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
- Nisha Bajiya
  - Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
- Sumeet Patiyal
  - Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
- Gajendra P. S. Raghava
  - Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
6
Angaitkar P, Janghel RR, Sahu TP. DL-TCNN: Deep Learning-based Temporal Convolutional Neural Network for prediction of conformational B-cell epitopes. 3 Biotech 2023; 13:297. [PMID: 37575599 PMCID: PMC10412510 DOI: 10.1007/s13205-023-03716-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/11/2023] [Accepted: 07/24/2023] [Indexed: 08/15/2023]
Abstract
Prediction of conformational B-cell epitopes (CBCE) is an essential phase for vaccine design, drug invention, and accurate disease diagnosis. Many laboratorial and computational approaches have been developed to predict CBCE. However, laboratorial experiments are costly and time consuming, leading to the popularity of Machine Learning (ML)-based computational methods. Although ML methods have succeeded in many domains, achieving higher accuracy in CBCE prediction remains a challenge. To overcome this drawback and consider the limitations of ML methods, this paper proposes a novel DL-based framework for CBCE prediction, leveraging the capabilities of deep learning in the medical domain. The proposed model is named Deep Learning-based Temporal Convolutional Neural Network (DL-TCNN), which hybridizes empirical hyper-tuned 1D-CNN and TCN. TCN is an architecture that employs causal convolutions and dilations, adapting well to sequential input with extensive receptive fields. To train the proposed model, physicochemical features are firstly extracted from antigen sequences. Next, the Synthetic Minority Oversampling Technique (SMOTE) is applied to address the class imbalance problem. Finally, the proposed DL-TCNN is employed for the prediction of CBCE. The model's performance is evaluated and validated on a benchmark antigen-antibody dataset. The DL-TCNN achieves 94.44% accuracy, and 0.989 AUC score for the training dataset, 78.53% accuracy, and 0.661 AUC score for the validation dataset; and 85.10% accuracy, 0.855 AUC score for the testing dataset. The proposed model outperforms all the existing CBCE methods.
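The abstract describes a TCN as using causal convolutions and dilations to obtain a large receptive field over sequential input. The sketch below illustrates that idea only; it is not the DL-TCNN architecture. The function names are hypothetical, and the receptive-field formula assumes one convolution per layer, whereas practical TCN implementations often stack two per residual block.

```python
def causal_dilated_conv1d(x, kernel, dilation=1):
    """Causal dilated convolution over a 1D sequence.

    y[t] = sum_k kernel[k] * x[t - k * dilation], treating positions
    before the start of the sequence as zero, so y[t] never depends
    on inputs later than t.
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # implicit left zero-padding keeps the output causal
                acc += w * x[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Receptive field of a stack with one causal convolution per layer."""
    return 1 + (kernel_size - 1) * sum(dilations)


# With kernel [1, 1] and dilation 2, each output is x[t] + x[t - 2].
y = causal_dilated_conv1d([1, 2, 3, 4, 5], [1, 1], dilation=2)

# Doubling the dilation at each layer grows the receptive field exponentially with depth.
rf = receptive_field(kernel_size=2, dilations=[1, 2, 4, 8])
```

Four layers with kernel size 2 and dilations 1, 2, 4, 8 already cover 16 positions, which is the property the abstract refers to as "extensive receptive fields".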
Affiliation(s)
- Pratik Angaitkar
  - Department of Information Technology, National Institute of Technology, Raipur, G.E. Road, Raipur, C.G. 492010 India
- Rekh Ram Janghel
  - Department of Information Technology, National Institute of Technology, Raipur, G.E. Road, Raipur, C.G. 492010 India
- Tirath Prasad Sahu
  - Department of Information Technology, National Institute of Technology, Raipur, G.E. Road, Raipur, C.G. 492010 India
7
Li MP, Liu WC, Sun BL, Zhong NS, Liu ZL, Huang SH, Zhang ZH, Liu JM. Prediction of bone metastasis in non-small cell lung cancer based on machine learning. Front Oncol 2023; 12:1054300. [PMID: 36698411 PMCID: PMC9869148 DOI: 10.3389/fonc.2022.1054300] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 09/26/2022] [Accepted: 11/21/2022] [Indexed: 01/12/2023]
Abstract
Objective The purpose of this paper was to develop a machine learning algorithm with good performance in predicting bone metastasis (BM) in non-small cell lung cancer (NSCLC) and to establish a simple web predictor based on the algorithm. Methods Patients who were diagnosed with NSCLC between 2010 and 2018 in the Surveillance, Epidemiology and End Results (SEER) database were included. To increase the extensibility of the research, data of patients who were first diagnosed with NSCLC at the First Affiliated Hospital of Nanchang University between January 2007 and December 2016 were also included in this study. Independent risk factors for BM in NSCLC were screened by univariate and multivariate logistic regression. On this basis, we chose six commonly used machine learning algorithms to build predictive models, including Logistic Regression (LR), Decision Tree (DT), Random Forest (RF), Gradient Boosting Machine (GBM), Naive Bayes classifiers (NBC) and eXtreme gradient boosting (XGB). Then, the best model was identified to build the web predictor for predicting BM in NSCLC patients. Finally, area under the receiver operating characteristic curve (AUC), accuracy, sensitivity and specificity were used to evaluate the performance of these models. Results A total of 50,581 NSCLC patients were included in this study, and 5,087 (10.06%) of them developed BM. Sex, grade, laterality, histology, T stage, N stage, and chemotherapy were independent risk factors for BM in NSCLC. Of these six models, the machine learning model built by the XGB algorithm performed best in both internal and external validation, with AUC scores of 0.808 and 0.841, respectively. Then, the XGB algorithm was used to build a web predictor of BM in NSCLC. Conclusion This study developed a web predictor based on the XGB algorithm for predicting the risk of BM in NSCLC patients, which may assist doctors in clinical decision making.
Affiliation(s)
- Meng-Pan Li
  - Department of Orthopedic Surgery, The First Affiliated Hospital of Nanchang University, Nanchang, China
  - The First Clinical Medical College of Nanchang University, Nanchang, China
- Wen-Cai Liu
  - Department of Orthopedic Surgery, The First Affiliated Hospital of Nanchang University, Nanchang, China
  - The First Clinical Medical College of Nanchang University, Nanchang, China
  - Department of Orthopaedics, Shanghai Jiao Tong University Affiliated Sixth People's Hospital, Shanghai, China
- Bo-Lin Sun
  - Department of Orthopedic Surgery, The First Affiliated Hospital of Nanchang University, Nanchang, China
  - Institute of Spine and Spinal Cord, Nanchang University, Nanchang, China
- Nan-Shan Zhong
  - Department of Orthopedic Surgery, The First Affiliated Hospital of Nanchang University, Nanchang, China
  - Institute of Spine and Spinal Cord, Nanchang University, Nanchang, China
- Zhi-Li Liu
  - Department of Orthopedic Surgery, The First Affiliated Hospital of Nanchang University, Nanchang, China
  - Institute of Spine and Spinal Cord, Nanchang University, Nanchang, China
- Shan-Hu Huang
  - Department of Orthopedic Surgery, The First Affiliated Hospital of Nanchang University, Nanchang, China
  - Institute of Spine and Spinal Cord, Nanchang University, Nanchang, China
- Zhi-Hong Zhang
  - Department of Orthopedic Surgery, The First Affiliated Hospital of Nanchang University, Nanchang, China
  - Institute of Spine and Spinal Cord, Nanchang University, Nanchang, China
  - *Correspondence: Jia-Ming Liu; Zhi-Hong Zhang
- Jia-Ming Liu
  - Department of Orthopedic Surgery, The First Affiliated Hospital of Nanchang University, Nanchang, China
  - Institute of Spine and Spinal Cord, Nanchang University, Nanchang, China
  - *Correspondence: Jia-Ming Liu; Zhi-Hong Zhang
8
A Quantum Vaccinomics Approach for the Design and Production of MSP4 Chimeric Antigen for the Control of Anaplasma phagocytophilum Infections. Vaccines (Basel) 2022; 10:vaccines10121995. [PMID: 36560405 PMCID: PMC9784196 DOI: 10.3390/vaccines10121995] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 10/13/2022] [Revised: 11/16/2022] [Accepted: 11/22/2022] [Indexed: 11/25/2022]
Abstract
Anaplasma phagocytophilum major surface protein 4 (MSP4) plays a role during infection and multiplication in host neutrophils and tick vector cells. Recently, vaccination trials with the A. phagocytophilum antigen MSP4 in sheep showed only partial protection against pathogen infection. However, in rabbits immunized with MSP4, this recombinant antigen was protective. Differences between rabbit and sheep antibody responses are probably associated with the recognition of non-protective epitopes by IgG of immunized lambs. To address this question, we applied quantum vaccinomics to identify and characterize MSP4 protective epitopes by microarray epitope mapping using sera from vaccinated rabbits and sheep. The identified candidate protective epitopes or immunological quantum were used for the design and production of a chimeric protective antigen. Inhibition assays of A. phagocytophilum infection in human HL60 and Ixodes scapularis tick ISE6 cells evidenced protection by IgG from sheep and rabbits immunized with the chimeric antigen. These results support the design of new chimeric candidate protective antigens using quantum vaccinomics to improve the protective capacity of antigens in multiple hosts.
9
Xu Z, Ismanto HS, Zhou H, Saputri DS, Sugihara F, Standley DM. Advances in antibody discovery from human BCR repertoires. FRONTIERS IN BIOINFORMATICS 2022; 2:1044975. [PMID: 36338807 PMCID: PMC9631452 DOI: 10.3389/fbinf.2022.1044975] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2022] [Accepted: 10/11/2022] [Indexed: 11/06/2022]
Abstract
Antibodies make up an important and growing class of compounds used for the diagnosis or treatment of disease. While traditional antibody discovery utilized immunization of animals to generate lead compounds, technological innovations have made it possible to search for antibodies targeting a given antigen within the repertoires of B cells in humans. Here we group these innovations into four broad categories: cell sorting allows the collection of cells enriched in specificity to one or more antigens; BCR sequencing can be performed on bulk mRNA, genomic DNA or on paired (heavy-light) mRNA; BCR repertoire analysis generally involves clustering BCRs into specificity groups or more in-depth modeling of antibody-antigen interactions, such as antibody-specific epitope predictions; validation of antibody-antigen interactions requires expression of antibodies, followed by antigen binding assays or epitope mapping. Together with innovations in Deep learning these technologies will contribute to the future discovery of diagnostic and therapeutic antibodies directly from humans.
Affiliation(s)
- Zichang Xu
  - Department of Genome Informatics, Research Institute for Microbial Diseases, Osaka University, Suita, Japan
- Hendra S. Ismanto
  - Department of Genome Informatics, Research Institute for Microbial Diseases, Osaka University, Suita, Japan
- Hao Zhou
  - Department of Genome Informatics, Research Institute for Microbial Diseases, Osaka University, Suita, Japan
- Dianita S. Saputri
  - Department of Genome Informatics, Research Institute for Microbial Diseases, Osaka University, Suita, Japan
- Fuminori Sugihara
  - Core Instrumentation Facility, Immunology Frontier Research Center, Osaka University, Suita, Japan
- Daron M. Standley
  - Department of Genome Informatics, Research Institute for Microbial Diseases, Osaka University, Suita, Japan
  - Department Systems Immunology, Immunology Frontier Research Center, Osaka University, Suita, Japan
10
Selecting the Suitable Resampling Strategy for Imbalanced Data Classification Regarding Dataset Properties. An Approach Based on Association Models. APPLIED SCIENCES-BASEL 2021. [DOI: 10.3390/app11188546] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 12/20/2022]
Abstract
In many application domains such as medicine, information retrieval, cybersecurity, social media, etc., datasets used for inducing classification models often have an unequal distribution of the instances of each class. This situation, known as imbalanced data classification, causes low predictive performance for the minority class examples. Thus, the prediction model is unreliable although the overall model accuracy can be acceptable. Oversampling and undersampling techniques are well-known strategies to deal with this problem by balancing the number of examples of each class. However, their effectiveness depends on several factors mainly related to data intrinsic characteristics, such as imbalance ratio, dataset size and dimensionality, overlapping between classes or borderline examples. In this work, the impact of these factors is analyzed through a comprehensive comparative study involving 40 datasets from different application areas. The objective is to obtain models for automatic selection of the best resampling strategy for any dataset based on its characteristics. These models allow us to check several factors simultaneously considering a wide range of values since they are induced from very varied datasets that cover a broad spectrum of conditions. This differs from most studies that focus on the individual analysis of the characteristics or cover a small range of values. In addition, the study encompasses both basic and advanced resampling strategies that are evaluated by means of eight different performance metrics, including new measures specifically designed for imbalanced data classification. The general nature of the proposal allows the choice of the most appropriate method regardless of the domain, avoiding the search for special purpose techniques that could be valid for the target data.
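Two of the notions in this abstract, the imbalance ratio and oversampling, can be illustrated with a short sketch. The code below shows only the most basic strategy (random oversampling by duplication), not any of the advanced methods the study compares; the function names and toy data are assumptions made for illustration.

```python
import random
from collections import Counter


def imbalance_ratio(labels):
    """Ratio of the largest class size to the smallest class size."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority examples until every class
    has as many examples as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_x, out_y = list(samples), list(labels)
    for cls, n in counts.items():
        pool = [s for s, y in zip(samples, labels) if y == cls]
        for _ in range(target - n):
            out_x.append(rng.choice(pool))  # sample with replacement
            out_y.append(cls)
    return out_x, out_y


# Hypothetical toy dataset: 8 majority examples (class 0), 2 minority examples (class 1).
samples = list(range(10))
labels = [0] * 8 + [1] * 2
balanced_x, balanced_y = random_oversample(samples, labels)
```

Resampling of this kind should be applied to the training split only; duplicating minority examples before the train/test split leaks copies into the test set and inflates the reported performance.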
11
Cai X, Li JJ, Liu T, Brian O, Li J. Infectious disease mRNA vaccines and a review on epitope prediction for vaccine design. Brief Funct Genomics 2021; 20:289-303. [PMID: 34089044 PMCID: PMC8194884 DOI: 10.1093/bfgp/elab027] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 01/11/2021] [Revised: 03/05/2021] [Accepted: 03/12/2021] [Indexed: 12/15/2022]
Abstract
Messenger RNA (mRNA) vaccines have recently emerged as a new type of vaccine technology, showing strong potential to combat the COVID-19 pandemic. In addition to SARS-CoV-2, which caused the pandemic, mRNA vaccines have been developed and tested to prevent infectious diseases caused by other viruses such as Zika virus, the dengue virus, the respiratory syncytial virus, influenza H7N9 and Flavivirus. Interestingly, mRNA vaccines may also be useful for preventing non-infectious diseases such as diabetes and cancer. This review summarises the current progress of mRNA vaccines designed for a range of diseases including COVID-19. As epitope study is a primary component in the in silico design of mRNA vaccines, we also survey advanced bioinformatics and machine learning algorithms which have been used for epitope prediction, and review user-friendly software tools available for this purpose. Finally, we discuss some of the unanswered concerns about mRNA vaccines, such as unknown long-term side effects, and present our perspectives on future developments in this exciting area.
Affiliation(s)
- Xinhui Cai
  - Data Science Institute, Faculty of Engineering & IT, University of Technology Sydney, 15 Broadway, Ultimo, 2007, New South Wales, Australia
- Jiao Jiao Li
  - School of Biomedical Engineering, Faculty of Engineering and IT, University of Technology Sydney, 15 Broadway, Ultimo, 2007, New South Wales, Australia
- Tao Liu
  - School of Life Sciences, Faculty of Science, University of Technology Sydney, 15 Broadway, Ultimo, 2007, New South Wales, Australia
- Oliver Brian
  - Children’s Cancer Institute Australia, University of New South Wales Sydney, Children’s Cancer Institute Australia, Randwick, Sydney, 2031, New South Wales, Australia
- Jinyan Li
  - Data Science Institute, Faculty of Engineering & IT, University of Technology Sydney, 15 Broadway, Ultimo, 2007, New South Wales, Australia
12
Azhir E, Jafari Navimipour N, Hosseinzadeh M, Sharifi A, Darwesh A. A technique for parallel query optimization using MapReduce framework and a semantic-based clustering method. PeerJ Comput Sci 2021; 7:e580. [PMID: 34141897 PMCID: PMC8176525 DOI: 10.7717/peerj-cs.580] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2020] [Accepted: 05/15/2021] [Indexed: 06/12/2023]
Abstract
Query optimization is the process of identifying the best Query Execution Plan (QEP). The query optimizer produces a close to optimal QEP for the given queries based on the minimum resource usage. The problem is that for a given query, there are plenty of different equivalent execution plans, each with a corresponding execution cost. To produce an effective query plan thus requires examining a large number of alternative plans. Access plan recommendation is an alternative technique to database query optimization, which reuses the previously-generated QEPs to execute new queries. In this technique, the query optimizer uses clustering methods to identify groups of similar queries. However, clustering such large datasets is challenging for traditional clustering algorithms due to huge processing time. Numerous cloud-based platforms have been introduced that offer low-cost solutions for the processing of distributed queries such as Hadoop, Hive, Pig, etc. This paper has applied and tested a model for clustering variant sizes of large query datasets parallelly using MapReduce. The results demonstrate the effectiveness of the parallel implementation of query workloads clustering to achieve good scalability.
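The idea of grouping similar queries with a map, shuffle and reduce pipeline can be sketched in a few lines of single-process Python. This is an illustration under stated assumptions, not the paper's method: the token-set Jaccard similarity stands in for the semantic similarity measure (which the abstract does not specify), the centroids are given rather than learned, and all function names are hypothetical.

```python
from collections import defaultdict


def tokens(query):
    """Very rough query signature: the set of lower-cased whitespace tokens."""
    return set(query.lower().split())


def jaccard(a, b):
    """Jaccard similarity between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def map_phase(queries, centroids):
    """Map: emit (index of most similar centroid, query) for each query."""
    for q in queries:
        sims = [jaccard(tokens(q), tokens(c)) for c in centroids]
        yield sims.index(max(sims)), q


def shuffle(pairs):
    """Shuffle: group emitted values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: produce one sorted cluster of queries per centroid key."""
    return {key: sorted(values) for key, values in groups.items()}


# Hypothetical workload and centroids, for illustration only.
queries = [
    "SELECT name FROM users",
    "SELECT name FROM users WHERE id = 1",
    "SELECT total FROM orders",
    "SELECT total FROM orders WHERE year = 2020",
]
centroids = ["SELECT name FROM users", "SELECT total FROM orders"]
clusters = reduce_phase(shuffle(map_phase(queries, centroids)))
```

Because each query is assigned independently in the map step, the workload can be split across workers, which is the source of the scalability the abstract reports. A new query that falls into an existing cluster could then reuse that cluster's previously generated QEP.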
Affiliation(s)
- Elham Azhir
  - Department of Computer Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran
- Nima Jafari Navimipour
  - Future Technology Research Center, National Yunlin University of Science and Technology, Douliou, Yunlin, Taiwan, R.O.C.
- Mehdi Hosseinzadeh
  - Pattern Recognition and Machine Learning Lab, Gachon University, 1342 Seongnamdaero, Sujeonggu, Seongnam, Republic of Korea
- Arash Sharifi
  - Department of Computer Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran
- Aso Darwesh
  - Department of Information Technology, University of Human Development, Sulaymaniyah, Iraq
13
De Benedetti S, Di Pisa F, Fassi EMA, Cretich M, Musicò A, Frigerio R, Mussida A, Bombaci M, Grifantini R, Colombo G, Bolognesi M, Grande R, Zanchetta N, Gismondo MR, Mileto D, Mancon A, Gourlay LJ. Structure, Immunoreactivity, and In Silico Epitope Determination of SmSPI S. mansoni Serpin for Immunodiagnostic Application. Vaccines (Basel) 2021; 9:vaccines9040322. [PMID: 33915716 PMCID: PMC8066017 DOI: 10.3390/vaccines9040322] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 03/03/2021] [Revised: 03/17/2021] [Accepted: 03/19/2021] [Indexed: 11/16/2022]
Abstract
The human parasitic disease schistosomiasis is caused by Schistosoma trematode flatworms, which infest fresh water in tropical regions of the world, particularly in Sub-Saharan Africa, South America, and the Far East. It has also been observed as an emerging disease in Europe, due to increased immigration. In addition to improved therapeutic strategies, it is imperative to develop novel, rapid, and sensitive diagnostic tests that can detect the Schistosoma parasite, allowing timely treatment. Present diagnosis is difficult and involves microscopy-based detection of Schistosoma eggs in the feces. In this context, we present the 3.22 Å resolution crystal structure of the circulating antigen serine protease inhibitor from S. mansoni (SmSPI), and we describe it as a potential serodiagnostic marker. Moreover, we identify three potential immunoreactive epitopes using in silico epitope mapping methods. We also confirm effective immune sera reactivity of the recombinant antigen, which supports further investigation of the protein and/or its predicted epitopes as serodiagnostic schistosomiasis biomarkers.
Affiliation(s)
- Stefano De Benedetti
- Department of Biosciences, Università degli Studi di Milano, Via Celoria 26, 20133 Milano, Italy; (S.D.B.); (F.D.P.); (M.B.)
- Flavio Di Pisa
- Department of Biosciences, Università degli Studi di Milano, Via Celoria 26, 20133 Milano, Italy; (S.D.B.); (F.D.P.); (M.B.)
- Enrico Mario Alessandro Fassi
- Consiglio Nazionale delle Ricerche, Istituto di Scienze e Tecnologie Chimiche “Giulio Natta” (SCITEC), Via Mario Bianco 9, 20131 Milano, Italy; (E.M.A.F.); (M.C.); (A.M.); (R.F.); (A.M.)
- Dipartimento di Scienze Farmaceutiche, Università degli Studi di Milano, Via L. Mangiagalli 25, 20133 Milano, Italy
- Marina Cretich
- Consiglio Nazionale delle Ricerche, Istituto di Scienze e Tecnologie Chimiche “Giulio Natta” (SCITEC), Via Mario Bianco 9, 20131 Milano, Italy; (E.M.A.F.); (M.C.); (A.M.); (R.F.); (A.M.)
- Angelo Musicò
- Consiglio Nazionale delle Ricerche, Istituto di Scienze e Tecnologie Chimiche “Giulio Natta” (SCITEC), Via Mario Bianco 9, 20131 Milano, Italy; (E.M.A.F.); (M.C.); (A.M.); (R.F.); (A.M.)
- Roberto Frigerio
- Consiglio Nazionale delle Ricerche, Istituto di Scienze e Tecnologie Chimiche “Giulio Natta” (SCITEC), Via Mario Bianco 9, 20131 Milano, Italy; (E.M.A.F.); (M.C.); (A.M.); (R.F.); (A.M.)
- Alessandro Mussida
- Consiglio Nazionale delle Ricerche, Istituto di Scienze e Tecnologie Chimiche “Giulio Natta” (SCITEC), Via Mario Bianco 9, 20131 Milano, Italy; (E.M.A.F.); (M.C.); (A.M.); (R.F.); (A.M.)
- Mauro Bombaci
- Istituto Nazionale Genetica Molecolare, Padiglione Romeo ed Enrica Invernizzi, IRCCS Ospedale Maggiore Policlinico, 20122 Milan, Italy; (M.B.); (R.G.)
- Renata Grifantini
- Istituto Nazionale Genetica Molecolare, Padiglione Romeo ed Enrica Invernizzi, IRCCS Ospedale Maggiore Policlinico, 20122 Milan, Italy; (M.B.); (R.G.)
- Giorgio Colombo
- Dipartimento di Chimica, Università di Pavia, V.le Taramelli 12, 27100 Pavia, Italy
- Martino Bolognesi
- Department of Biosciences, Università degli Studi di Milano, Via Celoria 26, 20133 Milano, Italy; (S.D.B.); (F.D.P.); (M.B.)
- Centro di Ricerca Pediatrica Romeo ed Enrica Invernizzi, Università degli Studi di Milano, 20133 Milano, Italy
- Romualdo Grande
- UOC Microbiologia Clinica, Virologia e Diagnostica delle Bioemergenze ASST FBF Sacco, 20157 Milano, Italy; (R.G.); (N.Z.); (M.R.G.); (D.M.); (A.M.)
- Nadia Zanchetta
- UOC Microbiologia Clinica, Virologia e Diagnostica delle Bioemergenze ASST FBF Sacco, 20157 Milano, Italy; (R.G.); (N.Z.); (M.R.G.); (D.M.); (A.M.)
- Maria Rita Gismondo
- UOC Microbiologia Clinica, Virologia e Diagnostica delle Bioemergenze ASST FBF Sacco, 20157 Milano, Italy; (R.G.); (N.Z.); (M.R.G.); (D.M.); (A.M.)
- Clinical Microbiology, Virology and Bioemergency Unit, Department of Biomedical and Clinical Sciences, Luigi Sacco Hospital, University of Milan, 20157 Milan, Italy
- Davide Mileto
- UOC Microbiologia Clinica, Virologia e Diagnostica delle Bioemergenze ASST FBF Sacco, 20157 Milano, Italy; (R.G.); (N.Z.); (M.R.G.); (D.M.); (A.M.)
- Alessandro Mancon
- UOC Microbiologia Clinica, Virologia e Diagnostica delle Bioemergenze ASST FBF Sacco, 20157 Milano, Italy; (R.G.); (N.Z.); (M.R.G.); (D.M.); (A.M.)
- Louise Jane Gourlay
- Department of Biosciences, Università degli Studi di Milano, Via Celoria 26, 20133 Milano, Italy; (S.D.B.); (F.D.P.); (M.B.)
- Correspondence: Tel.: +39-(0)2-5031-4914