1
Murmu S, Sinha D, Chaurasia H, Sharma S, Das R, Jha GK, Archak S. A review of artificial intelligence-assisted omics techniques in plant defense: current trends and future directions. Front Plant Sci 2024; 15:1292054. [PMID: 38504888 PMCID: PMC10948452 DOI: 10.3389/fpls.2024.1292054] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/10/2023] [Accepted: 01/24/2024] [Indexed: 03/21/2024]
Abstract
Plants intricately deploy defense systems to counter diverse biotic and abiotic stresses. Omics technologies, spanning genomics, transcriptomics, proteomics, and metabolomics, have revolutionized the exploration of plant defense mechanisms, unraveling molecular intricacies in response to various stressors. However, the complexity and scale of omics data necessitate sophisticated analytical tools for meaningful insights. This review delves into the application of artificial intelligence algorithms, particularly machine learning and deep learning, as promising approaches for deciphering complex omics data in plant defense research. The overview encompasses key omics techniques and addresses the challenges and limitations inherent in current AI-assisted omics approaches. Moreover, it contemplates potential future directions in this dynamic field. In summary, AI-assisted omics techniques present a robust toolkit, enabling a profound understanding of the molecular foundations of plant defense and paving the way for more effective crop protection strategies amidst climate change and emerging diseases.
Affiliation(s)
- Sneha Murmu
  - Indian Agricultural Statistics Research Institute, Indian Council of Agricultural Research (ICAR), New Delhi, India
- Dipro Sinha
  - Indian Agricultural Statistics Research Institute, Indian Council of Agricultural Research (ICAR), New Delhi, India
- Himanshushekhar Chaurasia
  - Central Institute for Research on Cotton Technology, Indian Council of Agricultural Research (ICAR), Mumbai, India
- Soumya Sharma
  - Indian Agricultural Statistics Research Institute, Indian Council of Agricultural Research (ICAR), New Delhi, India
- Ritwika Das
  - Indian Agricultural Statistics Research Institute, Indian Council of Agricultural Research (ICAR), New Delhi, India
- Girish Kumar Jha
  - Indian Agricultural Statistics Research Institute, Indian Council of Agricultural Research (ICAR), New Delhi, India
- Sunil Archak
  - National Bureau of Plant Genetic Resources, Indian Council of Agricultural Research (ICAR), New Delhi, India
2
Zhang G, Li M, Tang Q, Meng F, Feng P, Chen W. MulCNN-HSP: A multi-scale convolutional neural networks-based deep learning method for classification of heat shock proteins. Int J Biol Macromol 2024; 257:128802. [PMID: 38101670 DOI: 10.1016/j.ijbiomac.2023.128802] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2023] [Revised: 12/04/2023] [Accepted: 12/12/2023] [Indexed: 12/17/2023]
Abstract
Heat shock proteins (HSPs) are crucial cellular stress proteins that react to environmental cues, ensuring the preservation of cellular functions. They also play pivotal roles in orchestrating the immune response and participating in processes associated with cancer. Consequently, the classification of HSPs holds immense significance in enhancing our understanding of their biological functions and their roles in various diseases. However, the use of computational methods for identifying and classifying HSPs still faces challenges related to accuracy and interpretability. In this study, we introduced MulCNN-HSP, a novel deep learning model based on multi-scale convolutional neural networks, for identifying and classifying HSPs. Comparative results showed that MulCNN-HSP outperforms or matches existing models in the identification and classification of HSPs. Furthermore, MulCNN-HSP can extract and analyze essential features for the prediction task, enhancing its interpretability. To facilitate its accessibility, we have made MulCNN-HSP available at http://cbcb.cdutcm.edu.cn/HSP/. We hope that MulCNN-HSP will contribute to advancing the study of HSPs and their roles in various biological processes and diseases.
Affiliation(s)
- Guiyang Zhang
  - State Key Laboratory of Southwestern Chinese Medicine Resources, Innovative Institute of Chinese Medicine and Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu 611137, China
- Mingrui Li
  - State Key Laboratory of Southwestern Chinese Medicine Resources, School of Basic Medicine, Chengdu University of Traditional Chinese Medicine, Chengdu 611137, China
- Qiang Tang
  - State Key Laboratory of Southwestern Chinese Medicine Resources, School of Basic Medicine, Chengdu University of Traditional Chinese Medicine, Chengdu 611137, China
- Fanbo Meng
  - State Key Laboratory of Southwestern Chinese Medicine Resources, School of Basic Medicine, Chengdu University of Traditional Chinese Medicine, Chengdu 611137, China
- Pengmian Feng
  - State Key Laboratory of Southwestern Chinese Medicine Resources, School of Basic Medicine, Chengdu University of Traditional Chinese Medicine, Chengdu 611137, China
- Wei Chen
  - State Key Laboratory of Southwestern Chinese Medicine Resources, Innovative Institute of Chinese Medicine and Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu 611137, China
  - State Key Laboratory of Southwestern Chinese Medicine Resources, School of Basic Medicine, Chengdu University of Traditional Chinese Medicine, Chengdu 611137, China
3
Talkhi N, Nooghabi MJ, Esmaily H, Maleki S, Hajipoor M, Ferns GA, Ghayour-Mobarhan M. Prediction of serum anti-HSP27 antibody titers changes using a light gradient boosting machine (LightGBM) technique. Sci Rep 2023; 13:12775. [PMID: 37550399 PMCID: PMC10406940 DOI: 10.1038/s41598-023-39724-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2023] [Accepted: 07/29/2023] [Indexed: 08/09/2023] Open
Abstract
Previous studies have proposed that heat shock protein 27 (HSP27) and anti-HSP27 antibody titers may play a crucial role in several diseases, including cardiovascular disease. However, available studies have used simple analytical methods. This study aimed to determine the factors associated with serum anti-HSP27 antibody titers using ensemble machine learning methods and to demonstrate the magnitude and direction of the predictors using permutation feature importance (PFI) and SHapley Additive exPlanations (SHAP). The study employed Python 3 to apply various machine learning models, including LightGBM, CatBoost, XGBoost, AdaBoost, SVR, MLP, and MLR. The best models were selected using model evaluation metrics under a K-fold cross-validation strategy. The LightGBM model (RMSE: 0.1900 ± 0.0124; MAE: 0.1471 ± 0.0044; MAPE: 0.8027 ± 0.064, as mean ± sd) and the SHAP method revealed that several factors, including pro-oxidant-antioxidant balance (PAB), physical activity level (PAL), platelet distribution width, mid-upper arm circumference, systolic blood pressure, age, red cell distribution width, waist-to-hip ratio, neutrophils to lymphocytes ratio, platelet count, serum glucose, serum cholesterol, and red blood cells, were associated with anti-HSP27 antibody titers. The study found that PAB and PAL were strongly associated with serum anti-HSP27 antibody titers, indicating a direct and indirect relationship, respectively. These findings can help improve our understanding of the factors that determine anti-HSP27 antibody titers and their potential role in disease development.
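The three error metrics quoted in the abstract above (RMSE, MAE, MAPE, each summarised as mean ± sd over cross-validation folds) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the sample values are invented for the example, and MAPE is returned as a fraction because the abstract does not state its scale.

```python
import math
import statistics


def regression_errors(y_true, y_pred):
    """Return (RMSE, MAE, MAPE) for paired observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    rmse = math.sqrt(sum(r * r for r in residuals) / n)
    mae = sum(abs(r) for r in residuals) / n
    # MAPE is undefined when an observed value is zero; callers must avoid that.
    mape = sum(abs(r / t) for r, t in zip(residuals, y_true)) / n
    return rmse, mae, mape


def summarise_folds(fold_values):
    """Return (mean, sample standard deviation) of one metric across folds."""
    return statistics.mean(fold_values), statistics.stdev(fold_values)


# Illustrative values only (not data from the study).
rmse, mae, mape = regression_errors([1.0, 2.0, 4.0], [1.0, 3.0, 2.0])
fold_mean, fold_sd = summarise_folds([0.18, 0.20, 0.19])
```

Reporting the spread across folds alongside the mean, as the abstract does, shows how stable a model's error is when the training split changes.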
Affiliation(s)
- Nasrin Talkhi
  - Department of Biostatistics, School of Health, Mashhad University of Medical Sciences, Mashhad, Iran
  - International UNESCO Center for Health-Related Basic Sciences and Human Nutrition, Mashhad University of Medical Sciences, Mashhad, Iran
- Mehdi Jabbari Nooghabi
  - Department of Statistics, Ferdowsi University of Mashhad, Mashhad, Iran
  - Department of Mathematical Sciences, University of Copenhagen, 2100, Copenhagen, Denmark
- Habibollah Esmaily
  - Department of Biostatistics, School of Health, Mashhad University of Medical Sciences, Mashhad, Iran
  - Social Determinants of Health Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
- Saba Maleki
  - International UNESCO Center for Health-Related Basic Sciences and Human Nutrition, Mashhad University of Medical Sciences, Mashhad, Iran
- Mojtaba Hajipoor
  - Department of Nutrition Sciences, Varastegan Institute for Medical Sciences, Mashhad, Iran
- Gordon A Ferns
  - Division of Medical Education, Brighton & Sussex Medical School, Falmer, Brighton, BN1 9PH, Sussex, UK
- Majid Ghayour-Mobarhan
  - International UNESCO Center for Health-Related Basic Sciences and Human Nutrition, Mashhad University of Medical Sciences, Mashhad, Iran
  - Metabolic Syndrome Research Center, School of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
4
Huo A, Wang F. Biomarkers of ulcerative colitis disease activity CXCL1, CYP2R1, LPCAT1, and NEU4 and their relationship to immune infiltrates. Sci Rep 2023; 13:12126. [PMID: 37495756 PMCID: PMC10372061 DOI: 10.1038/s41598-023-39012-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/12/2022] [Accepted: 07/18/2023] [Indexed: 07/28/2023] Open
Abstract
The diagnosis and assessment of ulcerative colitis (UC) pose significant challenges, which may result in inadequate treatment and a poor prognosis for patients. This study aims to identify potential activity biomarkers for UC and investigate the role of infiltrating immune cells in the disease. To perform gene set enrichment analysis, we utilized the clusterProfiler and ggplot2 packages. The Kyoto Encyclopedia of Genes and Genomes (KEGG) was used to analyze degenerate enrichment genes. Significant gene set enrichment was determined using the clusterProfiler and ggplot2 packages. Additionally, quantitative PCR (qRT-PCR) was employed to validate the expression of each marker in the ulcerative colitis model. We identified 651 differentially expressed genes (DEGs) and further investigated potential UC activity biomarkers. Our analysis revealed that CXCL1 (AUC = 0.710), CYP2R1 (AUC = 0.863), LPCAT1 (AUC = 0.783), and NEU4 (AUC = 0.833) were promising activity markers for the diagnosis of UC. Using a rat DSS model, we validated these markers through qRT-PCR, which showed statistically significant differences between UC and normal colon mucosa. Infiltrating immune cell analysis indicated that M1 macrophages, M2 macrophages, activated dendritic cells (DCs), and neutrophils played crucial roles in the occurrence and progression of UC. Moreover, the activity markers exhibited varying degrees of correlation with activated memory CD4 T cells, M0 macrophages, T follicular helper cells, memory B cells, and activated DCs. The potential diagnostic genes for UC activity, such as CXCL1, CYP2R1, LPCAT1, and NEU4, as well as the infiltration of immune cells, may contribute to the pathogenesis and progression of UC.
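The per-gene AUC values quoted above summarise how well a single marker's expression separates two groups. A minimal sketch of that quantity, assuming the standard rank-based definition (the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as one half), is shown below. The function name and the expression values are hypothetical and are not taken from the study.

```python
def auc_from_scores(case_scores, control_scores):
    """Area under the ROC curve via pairwise comparison of the two groups."""
    if not case_scores or not control_scores:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0      # case ranked above control
            elif case == control:
                wins += 0.5      # ties count as half a win
    return wins / (len(case_scores) * len(control_scores))


# Illustrative marker expression values only.
example_auc = auc_from_scores([0.9, 0.8, 0.4], [0.5, 0.3])
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation, which is the scale on which the four reported markers (0.710 to 0.863) should be read.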
Affiliation(s)
- Aijing Huo
  - Department of Nephropathy and Immunology, The Third Central Clinical College of Tianjin Medical University, No. 83 Jintang Road, Hedong District, Tianjin, 300170, China
  - Tianjin Key Laboratory of Extracorporeal Life Support for Critical Diseases, Artificial Cell Engineering Technology Research Center, Tianjin Institute of Hepatobiliary Disease, The Third Central Hospital of Tianjin, Tianjin, China
- Fengmei Wang
  - Tianjin Key Laboratory of Extracorporeal Life Support for Critical Diseases, Artificial Cell Engineering Technology Research Center, Tianjin Institute of Hepatobiliary Disease, The Third Central Hospital of Tianjin, Tianjin, China
  - Department of Gastroenterology and Hepatology, The Third Central Clinical College of Tianjin Medical University, No. 83 Jintang Road, Hedong District, Tianjin, 300170, China
5
Gao Y, Li JN, Pu JJ, Tao KX, Zhao XX, Yang QQ. Genome-wide identification and characterization of the HSP gene superfamily in apple snails (Gastropoda: Ampullariidae) and expression analysis under temperature stress. Int J Biol Macromol 2022; 222:2545-2555. [DOI: 10.1016/j.ijbiomac.2022.10.038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2022] [Revised: 09/28/2022] [Accepted: 10/02/2022] [Indexed: 11/05/2022]
6
ASRmiRNA: Abiotic Stress-Responsive miRNA Prediction in Plants by Using Machine Learning Algorithms with Pseudo K-Tuple Nucleotide Compositional Features. Int J Mol Sci 2022; 23:ijms23031612. [PMID: 35163534 PMCID: PMC8835813 DOI: 10.3390/ijms23031612] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/29/2021] [Revised: 01/23/2022] [Accepted: 01/26/2022] [Indexed: 02/04/2023] Open
Abstract
MicroRNAs (miRNAs) play a significant role in plant response to different abiotic stresses. Thus, identification of abiotic stress-responsive miRNAs holds immense importance in crop breeding programmes to develop cultivars resistant to abiotic stresses. In this study, we developed a machine learning-based computational method for prediction of miRNAs associated with abiotic stresses. Three types of datasets were used for prediction, i.e., miRNA, Pre-miRNA, and Pre-miRNA + miRNA. The pseudo K-tuple nucleotide compositional features were generated for each sequence to transform the sequence data into numeric feature vectors. Support vector machine (SVM) was employed for prediction. Area under the receiver operating characteristic curve (auROC) values of 70.21%, 69.71%, and 77.94% and area under the precision-recall curve (auPRC) values of 69.96%, 65.64%, and 77.32% were obtained for the miRNA, Pre-miRNA, and Pre-miRNA + miRNA datasets, respectively. Overall prediction accuracies for the independent test set were 62.33%, 64.85%, and 69.21%, respectively, for the three datasets. The SVM also achieved higher accuracy than other learning methods such as random forest, extreme gradient boosting, and adaptive boosting. To implement our method with ease, an online prediction server “ASRmiRNA” has been developed. The proposed approach is believed to supplement the existing effort for identification of abiotic stress-responsive miRNAs and Pre-miRNAs.
7
Oso BJ, Olaoye IF, Ogidi CO. In silico Design of a Vaccine Candidate for SARS-CoV-2 Based on Multiple T-cell and B-cell Epitopes. Arch Razi Inst 2021; 76:1191-1202. [PMID: 35355741 PMCID: PMC8934067 DOI: 10.22092/ari.2020.351605.1526] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2020] [Accepted: 11/08/2020] [Indexed: 06/14/2023]
Abstract
Coronaviruses (2019-nCoV) are large single-stranded RNA viruses that usually cause respiratory infections with a crude lethality ratio of 3.8% and high levels of transmissibility. There is as yet no applicable clinical evaluation of the efficacy of the various therapeutic agents that have been suggested as investigational drugs against the viruses on the basis of their antiviral potential. Moreover, the development of a safe and effective vaccine has been suggested as an intervention to control the 2019-nCoV pandemic. However, a major concern in the development of a 2019-nCoV vaccine is the possibility of stimulating a corresponding immune response without enhancing the induction of the disease and associated side effects. The present investigation was carried out by predicting the antigenicity of the primary sequences of 2019-nCoV structural proteins and identifying B-cell and T-cell epitopes through the Bepipred and PEPVAC servers, respectively. The peptides of the vaccine construct include the selected epitopes based on the VaxiJen score with a threshold of 1.0 and β-defensin as an adjuvant. The putative binding of the vaccine constructs to intracellular toll-like receptors (TLRs) was assessed through molecular docking analysis and molecular dynamics simulations. The selected epitopes for the final vaccine construct are DPNFKD, SPLSLN, and LELQDHNE as B-cell epitopes and EPKLGSLVV, NFKDQVILL, and SSRSSSRSR as T-cell epitopes. The molecular docking analysis showed the vaccine construct could have favorable interactions with TLRs as indicated by the negative values of the computed binding energies. The constructed immunogen based on the immune informatics study could be employed in the strategy to develop potential vaccine candidates against 2019-nCoV.
Affiliation(s)
- B J Oso
  - Department of Biochemistry, McPherson University, Seriki Sotayo, Ogun State, Nigeria
- I F Olaoye
  - Department of Biochemistry, McPherson University, Seriki Sotayo, Ogun State, Nigeria
  - Biotechnology Unit, Department of Biological Sciences, Kings University, Odeomu, Nigeria
- C O Ogidi
  - School of Pharmacy and Biomolecular Sciences, Liverpool John Moores University, Liverpool, England
8
Meher PK, Rai A, Rao AR. mLoc-mRNA: predicting multiple sub-cellular localization of mRNAs using random forest algorithm coupled with feature selection via elastic net. BMC Bioinformatics 2021; 22:342. [PMID: 34167457 PMCID: PMC8223360 DOI: 10.1186/s12859-021-04264-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/18/2020] [Accepted: 06/11/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Localization of messenger RNAs (mRNAs) plays a crucial role in the growth and development of cells. Particularly, it plays a major role in regulating spatio-temporal gene expression. In situ hybridization is a promising experimental technique used to determine the localization of mRNAs, but it is costly and laborious. It is also known that a single mRNA can be present in more than one location, whereas the existing computational tools are capable of predicting only a single location for such mRNAs. Thus, the development of a high-end computational tool is required for reliable and timely prediction of multiple subcellular locations of mRNAs. Hence, we develop the present computational model to predict the multiple localizations of mRNAs. RESULTS The mRNA sequences from 9 different localizations were considered. Each sequence was first transformed to a numeric feature vector of size 5460, based on the k-mer features of sizes 1-6. Out of 5460 k-mer features, 1812 important features were selected by the Elastic Net statistical model. The Random Forest supervised learning algorithm was then employed for predicting the localizations with the selected features. Five-fold cross-validation accuracies of 70.87, 68.32, 68.36, 68.79, 96.46, 73.44, 70.94, 97.42 and 71.77% were obtained for the cytoplasm, cytosol, endoplasmic reticulum, exosome, mitochondrion, nucleus, pseudopodium, posterior and ribosome, respectively. With an independent test set, accuracies of 65.33, 73.37, 75.86, 72.99, 94.26, 70.91, 65.53, 93.60 and 73.45% were obtained for the respective localizations. The developed approach also achieved higher accuracies than the existing localization prediction tools. CONCLUSIONS This study presents a novel computational tool for predicting the multiple localizations of mRNAs. Based on the proposed approach, an online prediction server "mLoc-mRNA" is accessible at http://cabgrid.res.in:8080/mlocmrna/. The developed approach is believed to supplement the existing tools and techniques for the localization prediction of mRNAs.
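The feature-vector size of 5460 quoted in the abstract above follows from counting every k-mer of length 1 to 6 over a four-letter nucleotide alphabet: 4 + 16 + 64 + 256 + 1024 + 4096 = 5460. The sketch below builds such a vector. It is an illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, sequences are assumed to be written with A, C, G, T (U is mapped to T), and each k-mer count is normalised by the number of windows of that length, a choice the abstract does not specify.

```python
from itertools import product

ALPHABET = "ACGT"  # assumption: cDNA-style alphabet; U is mapped to T below


def kmer_vocabulary(max_k=6):
    """All k-mers of length 1..max_k in a fixed, reproducible order."""
    return ["".join(p) for k in range(1, max_k + 1)
            for p in product(ALPHABET, repeat=k)]


def kmer_features(sequence, max_k=6):
    """Relative frequency of every k-mer (k = 1..max_k) in the sequence."""
    sequence = sequence.upper().replace("U", "T")
    vocabulary = kmer_vocabulary(max_k)
    index = {kmer: i for i, kmer in enumerate(vocabulary)}
    features = [0.0] * len(vocabulary)
    for k in range(1, max_k + 1):
        windows = len(sequence) - k + 1
        if windows <= 0:
            continue  # sequence shorter than k: leave these features at zero
        for start in range(windows):
            position = index.get(sequence[start:start + k])
            if position is not None:  # skip windows with ambiguous bases
                features[position] += 1.0 / windows
    return features


vector = kmer_features("ACGUACGU")
```

Because the vocabulary order is fixed, the same position in the vector always refers to the same k-mer, which is what allows a downstream selector such as the Elastic Net step described above to keep or drop individual features.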
Affiliation(s)
- Prabina Kumar Meher
  - ICAR-Indian Agricultural Statistics Research Institute, New Delhi, 110012, India
- Anil Rai
  - ICAR-Indian Agricultural Statistics Research Institute, New Delhi, 110012, India
9
Chen YZ, Wang ZZ, Wang Y, Ying G, Chen Z, Song J. nhKcr: a new bioinformatics tool for predicting crotonylation sites on human nonhistone proteins based on deep learning. Brief Bioinform 2021; 22:6277413. [PMID: 34002774 DOI: 10.1093/bib/bbab146] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 02/22/2021] [Revised: 03/18/2021] [Accepted: 03/25/2021] [Indexed: 12/20/2022] Open
Abstract
Lysine crotonylation (Kcr) is a newly discovered type of protein post-translational modification and has been reported to be involved in various pathophysiological processes. High-resolution mass spectrometry is the primary approach for identification of Kcr sites. However, experimental approaches for identifying Kcr sites are often time-consuming and expensive when compared with computational approaches. To date, several predictors for Kcr site prediction have been developed, most of which are capable of predicting crotonylation sites on either histones alone or mixed histone and nonhistone proteins together. These methods exhibit high diversity in their algorithms, encoding schemes, feature selection techniques and performance assessment strategies. However, none of them were designed for predicting Kcr sites on nonhistone proteins. Therefore, it is desirable to develop an effective predictor for identifying Kcr sites from the large amount of nonhistone sequence data. For this purpose, we first provide a comprehensive review on six methods for predicting crotonylation sites. Second, we develop a novel deep learning-based computational framework termed as CNNrgb for Kcr site prediction on nonhistone proteins by integrating different types of features. We benchmark its performance against multiple commonly used machine learning classifiers (including random forest, logitboost, naïve Bayes and logistic regression) by performing both 10-fold cross-validation and independent test. The results show that the proposed CNNrgb framework achieves the best performance with high computational efficiency on large datasets. Moreover, to facilitate users' efforts to investigate Kcr sites on human nonhistone proteins, we implement an online server called nhKcr and compare it with other existing tools to illustrate the utility and robustness of our method. The nhKcr web server and all the datasets utilized in this study are freely accessible at http://nhKcr.erc.monash.edu/.
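The 10-fold cross-validation mentioned in the abstract above partitions the samples into ten disjoint folds and uses each fold once as the held-out set. A minimal sketch of that partitioning is given below. The function name, the shuffling seed, and the sample count are assumptions made for illustration; the abstract does not say whether the authors stratified or shuffled their folds.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split reproducible
    folds = [indices[i::k] for i in range(k)]  # fold sizes differ by at most one
    for held_out in range(k):
        test = folds[held_out]
        train = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield train, test


splits = list(k_fold_indices(25, k=10))
```

Every sample appears in exactly one test fold, so averaging a metric over the ten folds uses each observation once for evaluation; an independent test set, as also used in the study, is kept entirely outside this loop.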
Affiliation(s)
- Yong-Zi Chen
  - Laboratory of Tumor Cell Biology, Tianjin Medical University Cancer Institute and Hospital, Tianjin 300060, China
- Guoguang Ying
  - Laboratory of Tumor Cell Biology in Tianjin Medical University Cancer Institute and Hospital, Tianjin 300060, China
- Zhen Chen
  - Collaborative Innovation Center of Henan Grain Crops, Henan Agricultural University, China
- Jiangning Song
  - Monash Biomedicine Discovery Institute, Monash University, Australia
10
Min S, Kim H, Lee B, Yoon S. Protein transfer learning improves identification of heat shock protein families. PLoS One 2021; 16:e0251865. [PMID: 34003870 PMCID: PMC8130922 DOI: 10.1371/journal.pone.0251865] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 03/26/2021] [Accepted: 05/04/2021] [Indexed: 12/16/2022] Open
Abstract
Heat shock proteins (HSPs) play a pivotal role as molecular chaperones against unfavorable conditions. Although HSPs are of great importance, their computational identification remains a significant challenge. Previous studies have two major limitations. First, they relied heavily on amino acid composition features, which inevitably limited their prediction performance. Second, their prediction performance was overestimated because of the independent two-stage evaluations and train-test data redundancy. To overcome these limitations, we introduce two novel deep learning algorithms: (1) time-efficient DeepHSP and (2) high-performance DeeperHSP. We propose a convolutional neural network (CNN)-based DeepHSP that classifies both non-HSPs and six HSP families simultaneously. It outperforms state-of-the-art algorithms, despite taking 14–15 times less time for both training and inference. We further improve the performance of DeepHSP by taking advantage of protein transfer learning. While DeepHSP is trained on raw protein sequences, DeeperHSP is trained on top of pre-trained protein representations. Therefore, DeeperHSP remarkably outperforms state-of-the-art algorithms increasing F1 scores in both cross-validation and independent test experiments by 20% and 10%, respectively. We envision that the proposed algorithms can provide a proteome-wide prediction of HSPs and help in various downstream analyses for pathology and clinical research.
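The F1 improvements reported above concern a seven-class problem (non-HSPs plus six HSP families). One common way to condense such a task into a single number is the macro-averaged F1 score, sketched below. Whether the authors used macro averaging is an assumption, and the function name and example labels are hypothetical.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores over all observed labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    labels = sorted(set(y_true) | set(y_pred))
    per_class = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denominator = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        per_class.append(2 * tp / denominator if denominator else 0.0)
    return sum(per_class) / len(per_class)


# Illustrative labels only.
score = macro_f1(
    ["HSP70", "HSP70", "HSP90", "non-HSP"],
    ["HSP70", "HSP90", "HSP90", "non-HSP"],
)
```

Macro averaging weights every family equally, so a rare family that is predicted poorly lowers the score as much as a common one would, which matters when class sizes are very uneven.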
Affiliation(s)
- Seonwoo Min
  - Department of Electrical and Computer Engineering, Seoul National University, Seoul, South Korea
- HyunGi Kim
  - Department of Electrical and Computer Engineering, Seoul National University, Seoul, South Korea
- Byunghan Lee
  - Department of Electronic and IT Media Engineering, Seoul National University of Science and Technology, Seoul, South Korea
  - * E-mail: (BL); (SY)
- Sungroh Yoon
  - Department of Electrical and Computer Engineering, Seoul National University, Seoul, South Korea
  - Department of Biological Sciences, Interdisciplinary Program in Bioinformatics, Interdisciplinary Program in Artificial Intelligence, ASRI, INMC, and Institute of Engineering Research, Seoul National University, Seoul, South Korea
  - * E-mail: (BL); (SY)
11
Calaf GM, Bleak TC, Roy D. Signs of carcinogenicity induced by parathion, malathion, and estrogen in human breast epithelial cells (Review). Oncol Rep 2021; 45:24. [PMID: 33649804 PMCID: PMC7905528 DOI: 10.3892/or.2021.7975] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 11/17/2020] [Accepted: 01/29/2021] [Indexed: 02/07/2023] Open
Abstract
Cancer development is a multistep process that may be induced by a variety of compounds. Environmental substances, such as pesticides, have been associated with different human diseases. Organophosphorus pesticides (OPs) are among the most commonly used insecticides. Despite the fact that organophosphorus has been associated with an increased risk of cancer, particularly hormone-mediated cancer, few prospective studies have examined the use of individual insecticides. Reported results have demonstrated that OPs and estrogen induce a cascade of events indicative of the transformation of human breast epithelial cells. In vitro studies analyzing an immortalized non-tumorigenic human breast epithelial cell line may provide us with an approach to analyzing cell transformation under the effects of OPs in the presence of estrogen. The results suggested hormone-mediated effects of these insecticides on the risk of cancer among women. It can be concluded that, through experimental models, the initiation of cancer can be studied by analyzing the steps that transform normal breast cells to malignant ones through certain substances, such as pesticides and estrogen. Such substances cause genomic instability, and therefore tumor formation in the animal, and signs of carcinogenesis in vitro. Cancer initiation has been associated with an increase in genomic instability, indicated by the inactivation of tumor-suppressor genes and activation of oncogenes in the presence of malathion, parathion, and estrogen. In the present study, a comprehensive summary of the impact of OPs in human and rat breast cancer, specifically their effects on the cell cycle, signaling pathways linked to epidermal growth factor, drug metabolism, and genomic instability in an MCF-10F estrogen receptor-negative breast cell line is provided.
Affiliation(s)
- Gloria M Calaf
  - Instituto de Alta Investigación, Universidad de Tarapacá, Arica 1000000, Chile
- Tammy C Bleak
  - Instituto de Alta Investigación, Universidad de Tarapacá, Arica 1000000, Chile
- Debasish Roy
  - Department of Natural Sciences, Hostos Community College of The City University of New York, Bronx, NY 10451, USA
12
Identifying Heat Shock Protein Families from Imbalanced Data by Using Combined Features. Comput Math Methods Med 2020; 2020:8894478. [PMID: 33029195 PMCID: PMC7530508 DOI: 10.1155/2020/8894478] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 08/28/2020] [Revised: 09/08/2020] [Accepted: 09/14/2020] [Indexed: 11/29/2022]
Abstract
Heat shock proteins (HSPs) are ubiquitous in living organisms. HSPs are an essential component for cell growth and survival; the main function of HSPs is controlling the folding and unfolding process of proteins. According to molecular function and mass, HSPs are categorized into six different families: HSP20 (small HSPs), HSP40 (J-proteins), HSP60, HSP70, HSP90, and HSP100. In this paper, improved methods for HSP prediction are proposed: the split amino acid composition (SAAC), the dipeptide composition (DC), the conjoint triad feature (CTF), and the pseudoaverage chemical shift (PseACS) were selected to predict the HSPs with a support vector machine (SVM). To overcome the imbalanced-data classification problem, the synthetic minority oversampling technique (SMOTE) was used to balance the dataset. The overall accuracy was 99.72% with a balanced dataset in the jackknife test by using the optimized combination feature SAAC+DC+CTF+PseACS, which was 4.81% higher than that obtained on the imbalanced dataset with the same combination feature. The Sn, Sp, Acc, and MCC of HSP families in our predictive model were higher than those in existing methods. This improved method may be helpful for protein function prediction.
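The balancing step named above, SMOTE, creates synthetic minority-class samples by interpolating between an existing minority sample and one of its nearest minority-class neighbours. The sketch below shows that core idea in plain Python. It is a simplified illustration, not the authors' pipeline or a library implementation: the function name, the neighbour count, the seed, and the toy feature vectors are all assumptions.

```python
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic samples from a list of minority feature vectors."""
    if len(minority) < 2:
        raise ValueError("at least two minority samples are required")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [m for m in minority if m is not base]
        others.sort(key=lambda m: sum((a - b) ** 2 for a, b in zip(base, m)))
        neighbour = rng.choice(others[:k])  # one of the k nearest neighbours
        gap = rng.random()                  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Toy two-dimensional feature vectors, for illustration only.
minority_samples = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
new_samples = smote_sketch(minority_samples, n_new=4, k=2)
```

Because each synthetic point lies on a line segment between two real minority samples, oversampling should be applied only to training data; synthetic points leaking into an evaluation set would inflate the measured accuracy.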
13
Barman RK, Mukhopadhyay A, Maulik U, Das S. Identification of infectious disease-associated host genes using machine learning techniques. BMC Bioinformatics 2019; 20:736. [PMID: 31881961 PMCID: PMC6935192 DOI: 10.1186/s12859-019-3317-0] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 10/31/2019] [Accepted: 12/16/2019] [Indexed: 02/06/2023] Open
Abstract
Background With the global spread of multidrug resistance in pathogenic microbes, infectious diseases have emerged as a key public health concern of recent times. Identification of host genes associated with infectious diseases will improve our understanding of the mechanisms behind their development and help to identify novel therapeutic targets. Results We developed a machine learning-based classification approach to identify infectious disease-associated host genes by integrating sequence and protein interaction network features. Among different methods, the Deep Neural Network (DNN) model with 16 selected features for pseudo-amino acid composition (PAAC) and network properties achieved the highest accuracy of 86.33% with sensitivity of 85.61% and specificity of 86.57%. The DNN classifier also attained an accuracy of 83.33% on a blind dataset and a sensitivity of 83.1% on an independent dataset. Furthermore, to predict unknown infectious disease-associated host genes, we applied the proposed DNN model to all reviewed proteins from the database. Seventy-six out of 100 highly-predicted infectious disease-associated genes from our study were also found in experimentally-verified human-pathogen protein-protein interactions (PPIs). Finally, we validated the highly-predicted infectious disease-associated genes by disease and gene ontology enrichment analysis and found that many of them are shared by one or more of the other diseases, such as cancer, metabolic and immune related diseases. Conclusions To the best of our knowledge, this is the first computational method to identify infectious disease-associated host genes. The proposed method will help large-scale prediction of host genes associated with infectious diseases. However, our results indicated that for small datasets, the advanced DNN-based method does not offer a significant advantage over simpler supervised machine learning techniques, such as Support Vector Machine (SVM) or Random Forest (RF), for the prediction of infectious disease-associated host genes. Significant overlap of infectious disease with cancer and metabolic disease on disease and gene ontology enrichment analysis suggests that these diseases perturb the functions of the same cellular signaling pathways and may be treated by drugs that tend to reverse these perturbations. Moreover, identification of novel candidate genes associated with infectious diseases would help us to explain disease pathogenesis further and develop novel therapeutics.
Affiliation(s)
- Ranjan Kumar Barman
  - Biomedical Informatics Centre, ICMR-National Institute of Cholera and Enteric Diseases, Kolkata, West Bengal, India
  - Department of Computer Science and Engineering, Jadavpur University, Kolkata, West Bengal, India
- Anirban Mukhopadhyay
  - Department of Computer Science and Engineering, University of Kalyani, Kalyani, West Bengal, India
- Ujjwal Maulik
  - Department of Computer Science and Engineering, Jadavpur University, Kolkata, West Bengal, India
- Santasabuj Das
  - Biomedical Informatics Centre, ICMR-National Institute of Cholera and Enteric Diseases, Kolkata, West Bengal, India
  - Division of Clinical Medicine, ICMR-National Institute of Cholera and Enteric Diseases, P-33, C.I.T. Road Scheme XM, Beliaghata-700010, Kolkata, West Bengal, India
|
14
|
Expression of Heat Shock Proteins in Thermally Challenged Pacific Abalone Haliotis discus hannai. Genes (Basel) 2019; 11:genes11010022. [PMID: 31878084 PMCID: PMC7016835 DOI: 10.3390/genes11010022] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 11/20/2019] [Revised: 12/09/2019] [Accepted: 12/16/2019] [Indexed: 12/12/2022]
Abstract
Summer mortality, caused by thermal conditions, is the biggest threat to abalone aquaculture production industries. Various measures have been taken to mitigate this issue by adjusting the environment; however, the cellular processes of Pacific abalone (Haliotis discus hannai) have been overlooked due to the paucity of genetic information. The draft genome of H. discus hannai has recently been reported, prompting exploration of the genes responsible for thermal regulation in Pacific abalone. In this study, 413 proteins were systematically annotated as members of the heat shock protein (HSP) super families, and among them 26 HSP genes from four Pacific abalone tissues (hemocytes, gill, mantle, and muscle) were differentially expressed under cold and heat stress conditions. The co-expression network revealed that HSP expression patterns were tissue-specific and similar to those of other shellfish inhabiting intertidal zones. Finally, representative HSPs were selected at random and their expression patterns were identified by RNA sequencing and validated by qRT-PCR to assess expression significance. The HSPs expressed in hemocytes were highly similar in both analyses, suggesting that hemocytes could be more reliable samples for validating thermal condition markers compared to other tissues.
|
15
|
Luo A, Li X, Zhang X, Zhan H, Du H, Zhang Y, Peng X. Identification of AtHsp90.6 involved in early embryogenesis and its structure prediction by molecular dynamics simulations. ROYAL SOCIETY OPEN SCIENCE 2019; 6:190219. [PMID: 31218061 PMCID: PMC6550000 DOI: 10.1098/rsos.190219] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 02/17/2019] [Accepted: 04/02/2019] [Indexed: 05/29/2023]
Abstract
Heat-shock protein of 90 kDa (Hsp90) is a key molecular chaperone involved in folding the synthesized protein and controlling protein quality. Conformational dynamics coupled to ATPase activity in N-terminal domain is essential for Hsp90's function. However, the relevant process is still largely unknown in plant Hsp90s, especially those required for plant embryogenesis which is inextricably tied up with human survival. Here, AtHsp90.6, a member of Hsp90 family in Arabidopsis, was firstly identified as a protein essential for embryogenesis. Thus we modelled AtHsp90.6 in its functionally closed 'lid-down' and open 'lid-up' states, exploring the nucleotide binding mechanism in these two states. Free energy landscape and electrostatic potential analysis revealed the switching mechanism between these two states. Collectively, this study quantitatively analysed the conformational changes of AtHsp90.6 bound to ATP or ADP. This result may help us understand the mechanism of action of AtHsp90.6 in future.
Affiliation(s)
- An Luo
  - College of Life Science, Yangtze University, Jingzhou 434023, People's Republic of China
- Xinbo Li
  - College of Life Science, State Key Laboratory of Hybrid Rice, Wuhan University, Wuhan 430072, People's Republic of China
  - Center for Tissue Engineering and Regenerative Medicine, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430072, People's Republic of China
- Xuecheng Zhang
  - College of Life Science, State Key Laboratory of Hybrid Rice, Wuhan University, Wuhan 430072, People's Republic of China
- Huadong Zhan
  - College of Life and Environment Sciences, Shanghai Normal University, Shanghai, 200234, People's Republic of China
- Hewei Du
  - College of Life Science, Yangtze University, Jingzhou 434023, People's Republic of China
- Yubo Zhang
  - Department of Food Science, Foshan University, Foshan 528231, People's Republic of China
- Xiongbo Peng
  - College of Life Science, State Key Laboratory of Hybrid Rice, Wuhan University, Wuhan 430072, People's Republic of China
|
16
|
Kapakin KAT, Kapakin S, Imik H, Gumus R, Eser G. The Investigation of the Relationship Between HSP-27 Release and Oxidative DNA Damage in Broiler Chickens with Tibial Dyschondroplasia by Using Histopathological and Immunohistochemical Methods. BRAZILIAN JOURNAL OF POULTRY SCIENCE 2019. [DOI: 10.1590/1806-9061-2019-1091] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/21/2022]
Affiliation(s)
- H Imik
  - Atatürk University, Turkey
- R Gumus
  - Cumhuriyet University, Turkey
- G Eser
  - Atatürk University, Turkey
|
17
|
Boone K, Camarda K, Spencer P, Tamerler C. Antimicrobial peptide similarity and classification through rough set theory using physicochemical boundaries. BMC Bioinformatics 2018; 19:469. [PMID: 30522443 PMCID: PMC6282327 DOI: 10.1186/s12859-018-2514-6] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 06/28/2018] [Accepted: 11/20/2018] [Indexed: 01/09/2023]
Abstract
Background: Antimicrobial peptides attract considerable interest as novel agents to combat infections. Their long-time potency across bacteria, viruses and fungi as part of diverse innate immune systems offers a solution to overcome the rising concerns from antibiotic resistance. With the rapid increase of antimicrobial peptides reported in the databases, peptide selection becomes a challenge. We propose similarity analyses to describe key properties that distinguish between active and non-active peptide sequences building upon the physicochemical properties of antimicrobial peptides. We used an iterative supervised machine learning approach to classify active peptides from inactive peptides with low false discovery rates in a relatively short computational search time.
Results: By generating explicit boundaries, our method defines new categories of active and inactive peptides based on their physicochemical properties. Consequently, it describes physicochemical characteristics of similarity among active peptides and the physicochemical boundaries between active and inactive peptides in a single process. To build the similarity boundaries, we used the rough set theory approach; to our knowledge, this is the first time that this approach has been used to classify peptides. The modified rough set theory method limits the number of values describing a boundary to a user-defined limit. Our method is optimized for specificity over selectivity. Noting that false positives increase activity assays while false negatives only increase computational search time, our method provided a low false discovery rate. Published datasets were used to compare our rough set theory method to other published classification methods and based on this comparison, we achieved high selectivity and comparable sensitivity to currently available methods.
Conclusions: We developed rule sets that define physicochemical boundaries which allow us to directly classify the active sequences from inactive peptides. Existing classification methods are either sequence-order insensitive or length-dependent, whereas our method generates the rule sets that combine order-sensitive descriptors with length-independent descriptors. The method provides comparable or improved performance to currently available methods. Discovering the boundaries of physicochemical properties may lead to a new understanding of peptide similarity.
Electronic supplementary material: The online version of this article (10.1186/s12859-018-2514-6) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Kyle Boone
  - Bioengineering Program, Institute of Bioengineering Research, University of Kansas, Learned Hall, Room 5109, 1530 W 15th Street, Lawrence, KS, 66045, USA
- Kyle Camarda
  - Chemical and Petroleum Engineering Department, University of Kansas, Learned Hall, Room 4154, 1530 West 15th Street, Lawrence, KS, 66045, USA
- Paulette Spencer
  - Mechanical Engineering Department, Bioengineering Program, Institute of Bioengineering Research, University of Kansas, Learned Hall, Room 3111, 1530 West 15th Street, Lawrence, KS, 66045, USA
- Candan Tamerler
  - Mechanical Engineering Department, Bioengineering Program, Institute of Bioengineering Research, University of Kansas, Learned Hall, Room 3135A, 1530 W 15th St, Lawrence, KS, 66045, USA
|