1
Gillani M, Pollastri G. Protein subcellular localization prediction tools. Comput Struct Biotechnol J 2024; 23:1796-1807. [PMID: 38707539 PMCID: PMC11066471 DOI: 10.1016/j.csbj.2024.04.032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2024] [Revised: 04/11/2024] [Accepted: 04/11/2024] [Indexed: 05/07/2024] Open
Abstract
Protein subcellular localization prediction is of great significance in bioinformatics and biological research. Because most proteins do not have experimentally determined localization information, computational prediction methods and tools have been an active research area for more than two decades. Knowledge of the subcellular location of a protein provides valuable information about its functions, the functioning of the cell, and its possible interactions with other proteins. Fast, reliable, and accurate predictors provide platforms to harness the abundance of sequence data to predict subcellular locations. During the last decade, a considerable amount of research effort has been aimed at developing subcellular localization predictors. This paper reviews recent subcellular localization prediction tools in the eukaryotic, prokaryotic, and virus-based categories, followed by a detailed analysis. Each predictor is discussed in terms of its main features, strengths, weaknesses, algorithms, and prediction techniques. The review is supported by taxonomies of prediction tools that highlight their relevant areas and give examples, for straightforward categorization and ease of understanding. These taxonomies help users find suitable tools according to their needs. Furthermore, recent research gaps and challenges are discussed to cover areas that need the most attention. This survey provides an in-depth analysis of the most recent prediction tools and can serve as a quick guide for researchers to identify and explore recent advances in the literature.
Affiliation(s)
- Maryam Gillani
- School of Computer Science, University College Dublin (UCD), Dublin, D04 V1W8, Ireland
- Gianluca Pollastri
- School of Computer Science, University College Dublin (UCD), Dublin, D04 V1W8, Ireland
2
Bhushan V, Nita-Lazar A. Recent Advancements in Subcellular Proteomics: Growing Impact of Organellar Protein Niches on the Understanding of Cell Biology. J Proteome Res 2024; 23:2700-2722. [PMID: 38451675 PMCID: PMC11296931 DOI: 10.1021/acs.jproteome.3c00839] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/08/2024]
Abstract
The mammalian cell is a complex entity, with membrane-bound and membrane-less organelles playing vital roles in regulating cellular homeostasis. Organellar protein niches drive discrete biological processes and cell functions, thus maintaining cell equilibrium. Cellular processes such as signaling, growth, proliferation, motility, and programmed cell death require dynamic protein movements between cell compartments. Aberrant protein localization is associated with a wide range of diseases. Therefore, analyzing the subcellular proteome of the cell can provide a comprehensive overview of cellular biology. With recent advancements in mass spectrometry, imaging technology, computational tools, and deep machine learning algorithms, studies pertaining to subcellular protein localization and their dynamic distributions are gaining momentum. These studies reveal changing interaction networks because of "moonlighting proteins" and serve as a discovery tool for disease network mechanisms. Consequently, this review aims to provide a comprehensive repository for recent advancements in subcellular proteomics subcontexting methods, challenges, and future perspectives for method developers. In summary, subcellular proteomics is crucial to the understanding of the fundamental cellular mechanisms and the associated diseases.
Affiliation(s)
- Vanya Bhushan
- Functional Cellular Networks Section, Laboratory of Immune System Biology, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland 20892, United States
- Aleksandra Nita-Lazar
- Functional Cellular Networks Section, Laboratory of Immune System Biology, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland 20892, United States
3
Xiao H, Zou Y, Wang J, Wan S. A Review for Artificial Intelligence Based Protein Subcellular Localization. Biomolecules 2024; 14:409. [PMID: 38672426 PMCID: PMC11048326 DOI: 10.3390/biom14040409] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/29/2024] [Revised: 03/21/2024] [Accepted: 03/25/2024] [Indexed: 04/28/2024] Open
Abstract
Proteins need to be located in appropriate spatiotemporal contexts to carry out their diverse biological functions. Mislocalized proteins may lead to a broad range of diseases, such as cancer and Alzheimer's disease. Knowing where a target protein resides within a cell gives insights into tailored drug design for a disease. As the gold standard for validation, conventional wet-lab experiments use fluorescence microscopy imaging, immunoelectron microscopy, and fluorescent biomarker tags to identify protein subcellular locations. However, the booming era of proteomics and high-throughput sequencing generates vast numbers of newly discovered proteins, making it impractical to determine protein subcellular localization by wet-lab experiments alone. To address this, in the past decades, artificial intelligence (AI) and machine learning (ML), especially deep learning methods, have made significant progress in this research area. In this article, we review the latest advances in AI-based method development across three typical types of approaches: sequence-based, knowledge-based, and image-based methods. We also discuss in detail the existing challenges and future directions in AI-based method development in this research field.
Affiliation(s)
- Hanyu Xiao
- Department of Genetics, Cell Biology and Anatomy, College of Medicine, University of Nebraska Medical Center, Omaha, NE 68198, USA
- Yijin Zou
- College of Veterinary Medicine, China Agricultural University, Beijing 100193, China
- Jieqiong Wang
- Department of Neurological Sciences, College of Medicine, University of Nebraska Medical Center, Omaha, NE 68198, USA
- Shibiao Wan
- Department of Genetics, Cell Biology and Anatomy, College of Medicine, University of Nebraska Medical Center, Omaha, NE 68198, USA
4
Gong S, Wang Q, Huang J, Huang R, Chen S, Cheng X, Liu L, Dai X, Zhong Y, Fan C, Liao Z. LC-MS/MS platform-based serum untargeted screening reveals the diagnostic biomarker panel and molecular mechanism of breast cancer. Methods 2024; 222:100-111. [PMID: 38228196 DOI: 10.1016/j.ymeth.2024.01.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2023] [Revised: 10/12/2023] [Accepted: 01/11/2024] [Indexed: 01/18/2024] Open
Abstract
BACKGROUND: Breast cancer (BC), the most common malignant cancer affecting women worldwide, is characterized by heterogeneous metabolic disorders and a lack of effective diagnostic biomarkers. The purpose of this study was to search for reliable metabolite biomarkers of BC and of triple-negative breast cancer (TNBC) using a serum metabolomics approach. METHODS: An untargeted metabolomics technique based on ultra-high performance liquid chromatography combined with mass spectrometry (UHPLC-MS) was used to investigate the differences in serum metabolic profiles between the BC group (n = 53) and the non-BC group (n = 57), as well as between TNBC patients (n = 23) and non-TNBC subjects (n = 30). Multivariate data analysis, fold-change determination, and the Mann-Whitney U test were used to screen for differential metabolites. Additionally, machine learning methods, including receiver operating characteristic (ROC) curve analysis and logistic regression analysis, were used to establish diagnostic biomarker panels. RESULTS: Thirty-six metabolites differed significantly between the BC and non-BC groups, and 12 metabolites differed significantly between TNBC and non-TNBC patients. Four metabolites (N-acetyl-D-tryptophan, 2-arachidonoylglycerol, pipecolic acid, and oxoglutaric acid) were identified as key biomarkers for distinguishing BC from non-BC, with an area under the curve (AUC) of 0.995. A two-metabolite panel of N-acetyl-D-tryptophan and 2-arachidonoylglycerol discriminated TNBC from non-TNBC with an AUC of 0.965. CONCLUSION: This study demonstrated that serum metabolomics can be used to identify BC specifically, and it identified promising serum metabolic markers for TNBC diagnosis.
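The screening statistics named in this abstract (fold change, the Mann-Whitney U test, and the AUC of a ROC curve) are closely related: for a single marker, the AUC equals the U statistic divided by the number of case-control pairs. The following is a minimal sketch of that relationship in plain Python. The intensity values and the function names are hypothetical illustrations, not data or code from the study, and the study's panels were built with logistic regression, which is not reproduced here.

```python
from itertools import product

def mann_whitney_u(cases, controls):
    """U statistic: number of (case, control) pairs where the case value is higher.

    Ties contribute 0.5 each, the usual convention.
    """
    u = 0.0
    for c, k in product(cases, controls):
        if c > k:
            u += 1.0
        elif c == k:
            u += 0.5
    return u

def auc_single_marker(cases, controls):
    """AUC of one marker, obtained as U divided by the number of pairs."""
    return mann_whitney_u(cases, controls) / (len(cases) * len(controls))

def fold_change(cases, controls):
    """Ratio of the mean intensity in cases to the mean intensity in controls."""
    return (sum(cases) / len(cases)) / (sum(controls) / len(controls))

# Hypothetical metabolite intensities (arbitrary units), for illustration only.
bc = [5.1, 6.3, 7.0, 6.8]
non_bc = [3.2, 4.0, 5.1, 2.9]

print(mann_whitney_u(bc, non_bc))      # 15.5
print(auc_single_marker(bc, non_bc))   # 0.96875
print(round(fold_change(bc, non_bc), 3))
```

An AUC near 1 means almost every case outranks almost every control for that marker, which is how a value such as 0.995 for a panel score should be read.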
Affiliation(s)
- Sisi Gong
- Clinical Lab and Medical Diagnostics Laboratory, Donghai Hospital District, The Second Affiliated Hospital of Fujian Medical University, Quanzhou, PR China
- Qingshui Wang
- College of Life Sciences, Fujian Normal University, Fuzhou, PR China
- Jiewei Huang
- The Graduate School of Fujian Medical University, Fuzhou, PR China
- Rongfu Huang
- Clinical Lab and Medical Diagnostics Laboratory, Donghai Hospital District, The Second Affiliated Hospital of Fujian Medical University, Quanzhou, PR China
- Shanshan Chen
- Clinical Lab and Medical Diagnostics Laboratory, Donghai Hospital District, The Second Affiliated Hospital of Fujian Medical University, Quanzhou, PR China
- Xiaojuan Cheng
- Clinical Lab and Medical Diagnostics Laboratory, Donghai Hospital District, The Second Affiliated Hospital of Fujian Medical University, Quanzhou, PR China
- Lei Liu
- Clinical Lab and Medical Diagnostics Laboratory, Donghai Hospital District, The Second Affiliated Hospital of Fujian Medical University, Quanzhou, PR China
- Xiaofang Dai
- Clinical Lab and Medical Diagnostics Laboratory, Donghai Hospital District, The Second Affiliated Hospital of Fujian Medical University, Quanzhou, PR China
- Yameng Zhong
- Clinical Lab and Medical Diagnostics Laboratory, Donghai Hospital District, The Second Affiliated Hospital of Fujian Medical University, Quanzhou, PR China
- Chunmei Fan
- Clinical Lab and Medical Diagnostics Laboratory, Donghai Hospital District, The Second Affiliated Hospital of Fujian Medical University, Quanzhou, PR China
- Zhijun Liao
- Clinical Lab and Medical Diagnostics Laboratory, Donghai Hospital District, The Second Affiliated Hospital of Fujian Medical University, Quanzhou, PR China; Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Fujian Medical University, Fuzhou, PR China
5
Ye Y, Li M, Pan Q, Fang X, Yang H, Dong B, Yang J, Zheng Y, Zhang R, Liao Z. Machine learning-based classification of deubiquitinase USP26 and its cell proliferation inhibition through stabilizing KLF6 in cervical cancer. Comput Biol Med 2024; 168:107745. [PMID: 38064851 DOI: 10.1016/j.compbiomed.2023.107745] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/17/2023] [Revised: 10/31/2023] [Accepted: 11/20/2023] [Indexed: 01/10/2024]
Abstract
OBJECTIVE: We aim to accurately distinguish ubiquitin-specific proteases (USPs) from other members of the deubiquitinating enzyme families based on protein sequences. Additionally, we seek to elucidate the specific regulatory mechanisms through which USP26 modulates Krüppel-like factor 6 (KLF6) and to assess the subsequent effects of this regulation on the proliferation and migration of cervical cancer cells. METHODS: All the deubiquitinase (DUB) sequences were classified into USPs and non-USPs. Feature vectors, including 188D, n-gram, and 400D representations, were extracted from these sequences and subjected to binary classification in the Weka software. Next, thirty human USPs were analyzed to identify conserved motifs and ascertain evolutionary relationships. Experimentally, more than 90 unique DUB-encoding plasmids were transfected into HeLa cell lines to assess alterations in KLF6 protein levels and to isolate a specific DUB involved in KLF6 regulation. Subsequent experiments used both wild-type (WT) USP26 overexpression and shRNA-mediated USP26 knockdown to examine changes in KLF6 protein levels. A half-life experiment was performed to assess the influence of USP26 on KLF6 protein stability. Immunoprecipitation was applied to confirm the USP26-KLF6 interaction, and ubiquitination assays were used to explore the role of USP26 in KLF6 deubiquitination. Additional cellular assays were conducted to evaluate the effects of USP26 on HeLa cell proliferation and migration. RESULTS: 1. With the 188D, 400D, and n-gram feature vectors, all 12 classifiers performed well, and the RandomForest classifier performed best. Phylogenetic analysis of 30 human USPs revealed nine unique motifs, comprising zinc finger and ubiquitin-specific protease domains. 2. Through a systematic screening of the deubiquitinase library, USP26 was identified as the sole DUB associated with KLF6. 3. USP26 positively regulated the protein level of KLF6, as evidenced by the decrease in KLF6 protein expression upon shUSP26 knockdown in both 293T and HeLa cell lines. Additionally, half-life experiments demonstrated that USP26 prolonged the stability of KLF6. 4. Immunoprecipitation experiments revealed a strong interaction between USP26 and KLF6. Notably, the functional interaction domain was mapped to amino acids 285-913 of USP26 rather than the 1-295 region. 5. WT USP26 attenuated the ubiquitination levels of KLF6, whereas mutant USP26 lacked this deubiquitination activity. 6. Functional biological assays demonstrated that overexpression of USP26 inhibited both proliferation and migration of HeLa cells. Conversely, knockdown of USP26 promoted these oncogenic properties. CONCLUSIONS: 1. At the protein sequence level, members of the USP family can be effectively differentiated from non-USP proteins. Furthermore, specific functional motifs have been identified within the sequences of human USPs. 2. The deubiquitinating enzyme USP26 targets KLF6 for deubiquitination, thereby modulating its stability. Importantly, USP26 plays a pivotal role in the modulation of proliferation and migration in cervical cancer cells.
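The abstract does not define its 188D, n-gram, or 400D feature vectors. As an illustration only, the sketch below assumes that a 400-dimensional sequence feature means dipeptide composition (the normalized counts of all 20 x 20 ordered amino acid pairs), which is a common encoding but may not be the one the authors used. The function name and the toy sequence are hypothetical.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def ngram_composition(sequence, n=2):
    """Return normalized n-gram frequencies over the standard amino acids.

    For n=2 the vector has 20**2 = 400 entries (dipeptide composition).
    N-grams containing non-standard residues (e.g. 'X') are skipped.
    """
    keys = ["".join(p) for p in product(AMINO_ACIDS, repeat=n)]
    counts = dict.fromkeys(keys, 0)
    total = 0
    seq = sequence.upper()
    for i in range(len(seq) - n + 1):
        gram = seq[i:i + n]
        if gram in counts:
            counts[gram] += 1
            total += 1
    if total == 0:
        return [0.0] * len(keys)
    return [counts[k] / total for k in keys]

# Hypothetical toy sequence, not a real deubiquitinase.
vec = ngram_composition("MKVLAAKV")
print(len(vec))  # 400
```

A fixed-length vector like this can be written out per sequence and passed to any binary classifier, which is the role the feature vectors play in the classification step described above.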
Affiliation(s)
- Ying Ye
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350122, China
- Meng Li
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350122, China
- Qilong Pan
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350122, China
- Xin Fang
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350122, China; Laboratory of Non-communicable Chronic Disease Control, Fujian Provincial Center for Disease Control and Prevention, Fuzhou, 350012, China
- Hong Yang
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350122, China
- Bingying Dong
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350122, China
- Jiaying Yang
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350122, China
- Yuan Zheng
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350122, China
- Renxiang Zhang
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350122, China
- Zhijun Liao
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350122, China
6
Zou K, Wang S, Wang Z, Zou H, Yang F. Dual-Signal Feature Spaces Map Protein Subcellular Locations Based on Immunohistochemistry Image and Protein Sequence. Sensors (Basel, Switzerland) 2023; 23:9014. [PMID: 38005402 PMCID: PMC10675401 DOI: 10.3390/s23229014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2023] [Revised: 10/29/2023] [Accepted: 11/01/2023] [Indexed: 11/26/2023]
Abstract
Proteins are among the primary biochemical macromolecular regulators in the compartmentalized cell, and their subcellular locations can therefore provide information on the function of subcellular structures and physiological environments. Recently, data-driven systems have been developed to predict the subcellular location of proteins based on protein sequence, immunohistochemistry (IHC) images, or immunofluorescence (IF) images. However, the fusion of multiple protein signals has received little attention. In this study, we developed a dual-signal computational protocol that combines IHC images with protein sequences to learn protein subcellular localization. The protocol has three major steps. First, a benchmark database was built from 281 proteins selected from 4722 proteins in the Human Protein Atlas (HPA) and Swiss-Prot databases and located in the endoplasmic reticulum (ER), Golgi apparatus, cytosol, and nucleoplasm. Second, discriminative feature operators were employed to quantify the protein image-sequence samples, each comprising IHC images and a protein sequence. Finally, the feature subspaces of the different protein signals were used to construct multiple sub-classifiers via dimensionality reduction and binary relevance (BR), and the confidences derived from these sub-classifiers were combined by a centralized voting mechanism at the decision layer to decide the subcellular location. The experimental results indicated that the dual-signal model combining IHC images and protein sequences outperformed the single-signal models, with accuracy, precision, and recall of 75.41%, 80.38%, and 74.38%, respectively. These results are instructive for further research on protein subcellular location prediction under multi-signal fusion.
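The decision-layer step described above can be pictured with a small sketch. It assumes, purely for illustration, that each sub-classifier emits one confidence per location (the binary relevance view) and that the vote is a plain average compared against a threshold; the abstract does not specify the actual voting rule, and the confidence values, the threshold, and the function name below are hypothetical.

```python
LABELS = ["ER", "Golgi apparatus", "cytosol", "nucleoplasm"]

def vote(confidences, threshold=0.5):
    """Combine per-label confidences from several sub-classifiers.

    `confidences` is a list of dicts, one per sub-classifier, each mapping
    a location label to a confidence in [0, 1]. A label is assigned when the
    mean confidence reaches `threshold` (an assumed rule, not the paper's).
    """
    decided = []
    for label in LABELS:
        mean_conf = sum(c[label] for c in confidences) / len(confidences)
        if mean_conf >= threshold:
            decided.append(label)
    return decided

# Hypothetical outputs of an image-based and a sequence-based sub-classifier.
image_clf = {"ER": 0.8, "Golgi apparatus": 0.3, "cytosol": 0.6, "nucleoplasm": 0.1}
sequence_clf = {"ER": 0.7, "Golgi apparatus": 0.4, "cytosol": 0.3, "nucleoplasm": 0.2}

predicted = vote([image_clf, sequence_clf])
print(predicted)  # ['ER']
```

In this toy case the image signal alone would also call "cytosol", while the fused vote does not, which shows how a second signal can change a single-signal decision.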
Affiliation(s)
- Kai Zou
- School of Communications and Electronics, Jiangxi Science and Technology Normal University, Nanchang 330038, China
- School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
- Simeng Wang
- School of Communications and Electronics, Jiangxi Science and Technology Normal University, Nanchang 330038, China
- Ziqian Wang
- School of Communications and Electronics, Jiangxi Science and Technology Normal University, Nanchang 330038, China
- Hongliang Zou
- School of Communications and Electronics, Jiangxi Science and Technology Normal University, Nanchang 330038, China
- Fan Yang
- School of Communications and Electronics, Jiangxi Science and Technology Normal University, Nanchang 330038, China
- Artificial Intelligence and Bioinformation Cognition Laboratory, Jiangxi Science and Technology Normal University, Nanchang 330038, China
7
Faiz M, Khan SJ, Azim F, Ejaz N. Disclosing the locale of transmembrane proteins within cellular alcove by machine learning approach: systematic review and meta analysis. J Biomol Struct Dyn 2023:1-16. [PMID: 37768108 DOI: 10.1080/07391102.2023.2260490] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2023] [Accepted: 09/13/2023] [Indexed: 09/29/2023]
Abstract
Protein subcellular localization is a promising research question in proteomics and associated fields, including biological sciences, biomedical engineering, computational biology, bioinformatics, artificial intelligence, and biophysics. Computational techniques are preferred for exploring this attribute across a massive number of proteins, and they have yielded a diverse set of protein location identifiers. These identifiers differ in the database used, the organisms covered, the machine learning technique, and the accuracy. Despite their availability, most of the work has addressed the subcellular localization of proteins in general, and less has been done specifically on the locations of transmembrane proteins. This systematic review accounts for computational techniques applied to transmembrane protein localization. A literature search of the PubMed, Science Direct, and IEEE databases disclosed no systematic review or meta-analysis on the cellular locale of transmembrane proteins. A systematic review was conducted under the PRISMA guidelines using the Science Direct, PubMed, and IEEE databases. Journal publications from 2000 to 2023 were considered and screened. This review focused only on computational studies rather than experimental techniques. In total, 1004 studies were reviewed and categorized as relevant or non-relevant according to inclusion and exclusion criteria. All screening was done in EndNote after importing citations. This systematic review characterizes the gap in targeting the locale of transmembrane proteins and will aid researchers in exploring new horizons. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Mehwish Faiz
- Department of Biomedical Engineering, Ziauddin University (FESTM), Karachi, Pakistan
- Department of Electrical Engineering, Ziauddin University, (FESTM), Karachi, Pakistan
- Saad Jawaid Khan
- Department of Biomedical Engineering, Ziauddin University (FESTM), Karachi, Pakistan
- Fahad Azim
- Department of Electrical Engineering, Ziauddin University, (FESTM), Karachi, Pakistan
- Nazia Ejaz
- Balochistan University of Engineering and Technology, Khuzdar, Pakistan
8
Wu D, Fang X, Luan K, Xu Q, Lin S, Sun S, Yang J, Dong B, Manavalan B, Liao Z. Identification of SH2 domain-containing proteins and motifs prediction by a deep learning method. Comput Biol Med 2023; 162:107065. [PMID: 37267826 DOI: 10.1016/j.compbiomed.2023.107065] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 04/04/2023] [Revised: 04/30/2023] [Accepted: 05/27/2023] [Indexed: 06/04/2023]
Abstract
The Src Homology 2 (SH2) domain plays an important role in signal transmission in organisms. It mediates protein-protein interactions through binding between phosphotyrosine and motifs in the SH2 domain. In this study, we designed a method to distinguish SH2 domain-containing proteins from non-SH2 domain-containing proteins using deep learning. First, we collected SH2 and non-SH2 domain-containing protein sequences from multiple species. After data preprocessing, we built six deep learning models through DeepBIO and compared their performance. Second, we selected the model with the strongest overall performance, trained and tested it again separately, and analyzed the results visually. A 288-dimensional (288D) feature was found to identify the two types of proteins effectively. Finally, motif analysis discovered the specific motif YKIR and revealed its function in signal transduction. In summary, we successfully distinguished SH2 domain from non-SH2 domain proteins using a deep learning method and obtained the 288D features that performed best. In addition, we found a new motif, YKIR, in the SH2 domain and analyzed its function, which helps to further understand signaling mechanisms within the organism.
Affiliation(s)
- Duanzhi Wu
- School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350122, China
- Xin Fang
- School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350122, China; Laboratory of Non-communicable Chronic Disease Control, Fujian Provincial Center for Disease Control and Prevention, Fuzhou, 350012, China
- Kai Luan
- School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350122, China
- Qijin Xu
- School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350122, China
- Shiqi Lin
- School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350122, China
- Shiying Sun
- School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350122, China
- Jiaying Yang
- School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350122, China; Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350122, China
- Bingying Dong
- School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350122, China; Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350122, China
- Balachandran Manavalan
- Department of Integrative Biotechnology, College of Biotechnology and Bioengineering, Sungkyunkwan University, Suwon, 16419, Gyeonggi-do, Republic of Korea
- Zhijun Liao
- School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350122, China; Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350122, China
9
Tiwari S, Vaish S, Singh N, Basantani M, Bhargava A. Genome-wide identification and characterization of glutathione S-transferase gene family in quinoa (Chenopodium quinoa Willd.). 3 Biotech 2023; 13:230. [PMID: 37309406 PMCID: PMC10257622 DOI: 10.1007/s13205-023-03659-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2023] [Accepted: 06/01/2023] [Indexed: 06/14/2023] Open
Abstract
The present investigation was undertaken for large-scale in silico genome-wide identification and characterization of glutathione S-transferases (GSTs) in Chenopodium quinoa. A total of 120 GST genes (CqGSTs) were identified and divided into 11 classes, of which tau and phi were the most numerous. The average protein length was 279.06 amino acids, with a corresponding average molecular weight of 31,819.4 Da. The subcellular localization analysis showed that the proteins were mainly localized in the cytoplasm, followed by chloroplast, mitochondria and plastids. Structural analysis revealed the presence of 2-14 exons in CqGST genes. Most of the genes possessed a two-exon, one-intron organization. MEME analysis identified 15 significantly conserved motifs with a width of 6-50 amino acids. Motifs 1, 3, 2, 5, 6, 8, 9 and 13 were found specifically in the tau class family; motifs 3, 4, 5, 6, 7 and 9 were found in the phi class gene family, while motifs 3, 4, 13 and 14 were found in the metaxin class. Multiple sequence alignment revealed a highly conserved N-terminus with an active site serine (Ser; S) or cysteine (Cys; C) residue for the activation of GSH binding and GST catalytic activity. The gene loci were unevenly distributed across 18 different chromosomes, with a maximum of 17 genes located on chromosome 7. In secondary structure, alpha helix was dominant, followed by coil, extended strand and beta turns. Gene duplication analysis revealed that segmental duplication and purifying selection were the most frequent and were the main source of expansion of the GST gene family. Analysis of cis-acting regulatory elements showed the presence of 21 different elements involved in stress, hormone and light response and in cellular development. Evolutionary analysis of CqGST proteins using the maximum likelihood method revealed that all the tau and phi class GSTs were closely associated with those of G. max, O. sativa and A. thaliana. Molecular docking of GST molecules with the fungicide metalaxyl showed that CqGSTF1 had the lowest binding energy. This comprehensive study of the CqGST gene family in quinoa provides groundwork for further functional analysis of CqGST genes in the species at the molecular level and has potential applications in plant breeding.
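The reported average length and average molecular weight can be sanity-checked with simple arithmetic. The sketch below uses the common rule of thumb of roughly 110 Da per amino acid residue, which is an approximation introduced here and not a figure from the study. Under that assumption, a protein of about 279 residues weighs on the order of 31,000 Da (about 31 kDa), so the reported value of 31,819.4 is consistent with daltons.

```python
# Rule-of-thumb average residue mass in daltons (an assumption, not from the study).
AVG_RESIDUE_MASS_DA = 110.0

def estimate_mass_da(n_residues):
    """Rough protein mass in daltons from its length in residues."""
    return n_residues * AVG_RESIDUE_MASS_DA

reported_length = 279.06     # average length from the abstract
reported_mass = 31819.4      # average molecular weight figure from the abstract

estimate = estimate_mass_da(reported_length)
ratio = reported_mass / estimate

print(f"estimated mass: {estimate:.1f} Da ({estimate / 1000:.1f} kDa)")
print(f"reported / estimated: {ratio:.2f}")
```

A ratio close to 1 indicates that the length and mass figures agree with each other when the mass is read in daltons.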
Affiliation(s)
- Shivani Tiwari
- Department of Botany, School of Life Sciences, Mahatma Gandhi Central University, Motihari, Bihar 845401 India
- Swati Vaish
- Institute of Biosciences and Technology, Shri Ramswaroop Memorial University, Lucknow-Deva Road, Barabanki, Uttar Pradesh 225003 India
- Nootan Singh
- Institute of Biosciences and Technology, Shri Ramswaroop Memorial University, Lucknow-Deva Road, Barabanki, Uttar Pradesh 225003 India
- Mahesh Basantani
- Experiome Biotech Private Limited, Vibhuti Khand, Gomti Nagar, Lucknow, Uttar Pradesh 226010 India
- Atul Bhargava
- Department of Botany, School of Life Sciences, Mahatma Gandhi Central University, Motihari, Bihar 845401 India
10
A graph neural network model for deciphering the biological mechanisms of plant electrical signal classification. Appl Soft Comput 2023. [DOI: 10.1016/j.asoc.2023.110153] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/06/2023]
11
Nakai K, Wei L. Recent Advances in the Prediction of Subcellular Localization of Proteins and Related Topics. Frontiers in Bioinformatics 2022; 2:910531. [PMID: 36304291 PMCID: PMC9580943 DOI: 10.3389/fbinf.2022.910531] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/01/2022] [Accepted: 04/25/2022] [Indexed: 11/13/2022] Open
Abstract
Prediction of subcellular localization of proteins from their amino acid sequences has a long history in bioinformatics and is still actively developing, incorporating the latest advances in machine learning and proteomics. Notably, deep learning-based methods for natural language processing have made great contributions. Here, we review recent advances in the field as well as its related fields, such as subcellular proteomics and the prediction/recognition of subcellular localization from image data.
Affiliation(s)
- Kenta Nakai
- Institute of Medical Science, The University of Tokyo, Minato-Ku, Japan
- *Correspondence: Kenta Nakai,
- Leyi Wei
- School of Software, Shandong University, Jinan, China