1
Pradhan UK, Behera P, Das R, Naha S, Gupta A, Parsad R, Pradhan SK, Meher PK. AScirRNA: A novel computational approach to discover abiotic stress-responsive circular RNAs in plant genome. Comput Biol Chem 2024; 113:108205. [PMID: 39265460 DOI: 10.1016/j.compbiolchem.2024.108205] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2024] [Revised: 07/12/2024] [Accepted: 09/04/2024] [Indexed: 09/14/2024]
Abstract
In plant biology, understanding the intricate regulatory mechanisms governing stress responses is a pivotal pursuit. Circular RNAs (circRNAs), emerging as critical players in gene regulation, have recently garnered attention for their potential roles in abiotic stress adaptation. A comprehensive grasp of circRNAs' functions in stress response offers breeders avenues to develop abiotic stress-resistant crop cultivars that thrive in challenging climates. This study pioneers a machine learning-based model for predicting abiotic stress-responsive circRNAs. The K-tuple nucleotide composition (KNC) and Pseudo KNC (PKNC) features were utilized to numerically represent circRNAs. Three different feature selection strategies were employed to select relevant and non-redundant features. Eight shallow and four deep learning algorithms were evaluated to build the final predictive model. Following five-fold cross-validation, the XGBoost learning algorithm demonstrated superior performance with LightGBM-chosen 260 KNC features (Accuracy: 74.55 %, auROC: 81.23 %, auPRC: 76.52 %) and 160 PKNC features (Accuracy: 74.32 %, auROC: 81.04 %, auPRC: 76.43 %) over other combinations of learning algorithms and feature selection techniques. Further, the robustness of the developed models was evaluated using an independent test dataset, where the overall accuracy, auROC and auPRC were found to be 73.13 %, 72.34 % and 72.68 % for the KNC feature set and 73.52 %, 79.53 % and 73.09 % for the PKNC feature set, respectively. This computational approach was also integrated into an online prediction tool, AScirRNA (https://iasri-sg.icar.gov.in/ascirna/), for easy prediction by users. Both the proposed model and the developed tool are poised to augment ongoing efforts in identifying stress-responsive circRNAs in plants.
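To make the K-tuple nucleotide composition idea concrete, the following is a minimal sketch of one common way to turn a sequence into k-mer frequency features. The abstract does not state the value of k, the alphabet handling, or the normalization used, so the function name `knc_features`, the default `k=3`, and the choice to normalize by the number of valid windows are all assumptions made for illustration, not the authors' implementation.

```python
from itertools import product


def knc_features(seq, k=3):
    """Return normalized k-mer frequencies over the RNA alphabet ACGU.

    Hypothetical helper: k and the normalization are illustrative
    assumptions, not taken from the AScirRNA paper.
    """
    seq = seq.upper().replace("T", "U")  # accept DNA-style input as well
    kmers = ["".join(p) for p in product("ACGU", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    # Slide a window of width k along the sequence, one base at a time.
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows containing ambiguous bases (e.g. N)
            counts[window] += 1
            total += 1
    # Frequencies sum to 1 when at least one valid window exists.
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in kmers}
```

For example, `knc_features("ACGUACGU", k=2)` yields a 16-dimensional vector in which the dinucleotide `AC` has frequency 2/7, since it occurs in two of the seven windows.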
Affiliation(s)
- Upendra Kumar Pradhan
  - Division of Statistical Genetics, ICAR-Indian Agricultural Statistics Research Institute, PUSA, New Delhi 110012, India.
- Prasanjit Behera
  - Department of Bioinformatics, Odisha University of Agriculture & Technology, Bhubaneswar, Odisha 751003, India.
- Ritwika Das
  - Division of Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, PUSA, New Delhi 110012, India.
- Sanchita Naha
  - Division of Computer Applications, ICAR-Indian Agricultural Statistics Research Institute, PUSA, New Delhi 110012, India.
- Ajit Gupta
  - Division of Statistical Genetics, ICAR-Indian Agricultural Statistics Research Institute, PUSA, New Delhi 110012, India.
- Rajender Parsad
  - ICAR-Indian Agricultural Statistics Research Institute, PUSA, New Delhi 110012, India.
- Sukanta Kumar Pradhan
  - Department of Bioinformatics, Odisha University of Agriculture & Technology, Bhubaneswar, Odisha 751003, India.
- Prabina Kumar Meher
  - Division of Statistical Genetics, ICAR-Indian Agricultural Statistics Research Institute, PUSA, New Delhi 110012, India.
2
Chiappa G, Fassio G, Modica MV, Oliverio M. Potential Ancestral Conoidean Toxins in the Venom Cocktail of the Carnivorous Snail Raphitoma purpurea (Montagu, 1803) (Neogastropoda: Raphitomidae). Toxins (Basel) 2024; 16:348. [PMID: 39195758 PMCID: PMC11359391 DOI: 10.3390/toxins16080348] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2024] [Revised: 08/02/2024] [Accepted: 08/06/2024] [Indexed: 08/29/2024] Open
Abstract
Venomous marine gastropods of the superfamily Conoidea possess a rich arsenal of toxins, including neuroactive toxins. Venom adaptations might have played a fundamental role in the radiation of conoideans; nevertheless, there is still no knowledge about the venom of the most diversified family of the group: Raphitomidae Bellardi, 1875. In this study, transcriptomes were produced from the carcase, salivary glands, and proximal and distal venom ducts of the northeastern Atlantic species Raphitoma purpurea (Montagu, 1803). Using a gut barcoding approach, we were also able to report, for the first time, molecular evidence of a vermivorous diet for the genus. Transcriptomic analyses revealed over a hundred putative venom components (PVC), including 69 neurotoxins. Twenty novel toxin families, including some with high levels of expansion, were discovered. No significant difference was observed between the distal and proximal venom duct secretions. Peptides related to cone snail toxins (Cerm06, Pgam02, and turritoxin) and other venom-related proteins (disulfide isomerase and elevenin) were retrieved from the salivary glands. These salivary venom components may constitute ancestral adaptations for venom production in conoideans. Although often neglected, salivary gland secretions are of extreme importance for understanding the evolutionary history of conoidean venom.
Affiliation(s)
- Giacomo Chiappa
  - Department of Biology and Biotechnologies “Charles Darwin”, Sapienza University of Rome, Viale dell’Università 32, 00185 Rome, Italy
- Giulia Fassio
  - Department of Biology and Biotechnologies “Charles Darwin”, Sapienza University of Rome, Viale dell’Università 32, 00185 Rome, Italy
- Maria Vittoria Modica
  - Department of Biology and Evolution of Marine Organisms, Stazione Zoologica Anton Dohrn, Via Gregorio Allegri 1, 00198 Rome, Italy
- Marco Oliverio
  - Department of Biology and Biotechnologies “Charles Darwin”, Sapienza University of Rome, Viale dell’Università 32, 00185 Rome, Italy
3
Turcio R, Di Matteo F, Capolupo I, Ciaglia T, Musella S, Di Chio C, Stagno C, Campiglia P, Bertamino A, Ostacolo C. Voltage-Gated K+ Channel Modulation by Marine Toxins: Pharmacological Innovations and Therapeutic Opportunities. Mar Drugs 2024; 22:350. [PMID: 39195466 DOI: 10.3390/md22080350] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2024] [Revised: 07/25/2024] [Accepted: 07/28/2024] [Indexed: 08/29/2024] Open
Abstract
Bioactive compounds are abundant in animals originating from marine ecosystems. Ion channels, which include sodium, potassium, calcium, and chloride channels, together with their numerous variants and subtypes, are the primary molecular targets of these compounds. Depending on their cellular targets, these venom compounds show a range of potencies and selectivities and may have therapeutic properties. Due to their potential as medications to treat a range of human diseases, including pain, autoimmune disorders, and neurological diseases, marine molecules have been the focus of several studies over the last ten years. This review covers the various facets of marine (or marine-derived) molecules, ranging from structural characterization and discovery to pharmacology, culminating in the development of some "novel" candidate chemotherapeutic drugs that target potassium channels.
Affiliation(s)
- Rita Turcio
  - Department of Pharmacy, University of Salerno, 84084 Fisciano, Italy
- Ilaria Capolupo
  - Department of Pharmacy, University of Salerno, 84084 Fisciano, Italy
- Tania Ciaglia
  - Department of Pharmacy, University of Salerno, 84084 Fisciano, Italy
- Simona Musella
  - Department of Pharmacy, University of Salerno, 84084 Fisciano, Italy
- Carla Di Chio
  - Department of Chemical, Biological, Pharmaceutical and Environmental Sciences (CHIBIOFARAM), University of Messina, 98166 Messina, Italy
- Claudio Stagno
  - Department of Chemical, Biological, Pharmaceutical and Environmental Sciences (CHIBIOFARAM), University of Messina, 98166 Messina, Italy
- Pietro Campiglia
  - Department of Pharmacy, University of Salerno, 84084 Fisciano, Italy
- Alessia Bertamino
  - Department of Pharmacy, University of Salerno, 84084 Fisciano, Italy
- Carmine Ostacolo
  - Department of Pharmacy, University of Salerno, 84084 Fisciano, Italy
4
Ringeval A, Farhat S, Fedosov A, Gerdol M, Greco S, Mary L, Modica MV, Puillandre N. DeTox: a pipeline for the detection of toxins in venomous organisms. Brief Bioinform 2024; 25:bbae094. [PMID: 38493344 PMCID: PMC10944572 DOI: 10.1093/bib/bbae094] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2023] [Revised: 01/29/2024] [Accepted: 02/16/2024] [Indexed: 03/18/2024] Open
Abstract
Venomous organisms have independently evolved the ability to produce toxins 101 times during their evolutionary history, resulting in over 200 000 venomous species. Collectively, these species produce millions of toxins, making them a valuable resource for bioprospecting and understanding the evolutionary mechanisms underlying genetic diversification. RNA-seq is the preferred method for characterizing toxin repertoires, but the analysis of the resulting data remains challenging. While early approaches relied on similarity-based mapping to known toxin databases, recent studies have highlighted the importance of structural features for toxin detection. The few existing pipelines lack an integration between these complementary approaches, and tend to be difficult to run for non-experienced users. To address these issues, we developed DeTox, a comprehensive and user-friendly tool for toxin research. It combines fast execution, parallelization and customization of parameters. DeTox was tested on published transcriptomes from gastropod mollusks, cnidarians and snakes, retrieving most putative toxins from the original articles and identifying additional peptides as potential toxins to be confirmed through manual annotation and eventually proteomic analysis. By integrating a structure-based search with similarity-based approaches, DeTox allows the comprehensive characterization of toxin repertoire in poorly-known taxa. The effect of the taxonomic bias in existing databases is minimized in DeTox, as mirrored in the detection of unique and divergent toxins that would have been overlooked by similarity-based methods. DeTox streamlines toxin annotation, providing a valuable tool for efficient identification of venom components that will enhance venom research in neglected taxa.
Affiliation(s)
- Allan Ringeval
  - Institut Systématique Evolution Biodiversité (ISYEB), Muséum National d’Histoire Naturelle, CNRS, Sorbonne Université, EPHE, Université des Antilles, 57 rue Cuvier, 75005 Paris, France
- Sarah Farhat
  - Institut Systématique Evolution Biodiversité (ISYEB), Muséum National d’Histoire Naturelle, CNRS, Sorbonne Université, EPHE, Université des Antilles, 57 rue Cuvier, 75005 Paris, France
- Alexander Fedosov
  - Institut Systématique Evolution Biodiversité (ISYEB), Muséum National d’Histoire Naturelle, CNRS, Sorbonne Université, EPHE, Université des Antilles, 57 rue Cuvier, 75005 Paris, France
  - Department of Zoology, Swedish Museum of Natural History, P. O. Box 50007, SE-104 05, Stockholm, Sweden
- Marco Gerdol
  - Department of Life Sciences, University of Trieste, Trieste, Italy
  - Department of Biology and Evolution of Marine Organisms (BEOM), Stazione Zoologica Anton Dohrn, Roma, Italy
- Samuele Greco
  - Department of Life Sciences, University of Trieste, Trieste, Italy
- Lou Mary
  - Institut Systématique Evolution Biodiversité (ISYEB), Muséum National d’Histoire Naturelle, CNRS, Sorbonne Université, EPHE, Université des Antilles, 57 rue Cuvier, 75005 Paris, France
- Maria Vittoria Modica
  - Department of Biology and Evolution of Marine Organisms (BEOM), Stazione Zoologica Anton Dohrn, Roma, Italy
- Nicolas Puillandre
  - Institut Systématique Evolution Biodiversité (ISYEB), Muséum National d’Histoire Naturelle, CNRS, Sorbonne Université, EPHE, Université des Antilles, 57 rue Cuvier, 75005 Paris, France
5
Monroe LK, Truong DP, Miner JC, Adikari SH, Sasiene ZJ, Fenimore PW, Alexandrov B, Williams RF, Nguyen HB. Conotoxin Prediction: New Features to Increase Prediction Accuracy. Toxins (Basel) 2023; 15:641. [PMID: 37999504 PMCID: PMC10675404 DOI: 10.3390/toxins15110641] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2023] [Revised: 10/27/2023] [Accepted: 10/30/2023] [Indexed: 11/25/2023] Open
Abstract
Conotoxins are toxic, disulfide-bond-rich peptides from cone snail venom that target a wide range of receptors and ion channels with multiple pathophysiological effects. Conotoxins have extraordinary potential for medical therapeutics that include cancer, microbial infections, epilepsy, autoimmune diseases, neurological conditions, and cardiovascular disorders. Despite the potential for these compounds in novel therapeutic treatment development, the process of identifying and characterizing the toxicities of conotoxins is difficult, costly, and time-consuming. This challenge requires a series of diverse, complex, and labor-intensive biological, toxicological, and analytical techniques for effective characterization. While recent attempts using machine learning based solely on primary amino acid sequences to predict biological toxins (e.g., conotoxins and animal venoms) have improved toxin identification, these methods are limited due to peptide conformational flexibility and the high frequency of cysteines present in toxin sequences. This results in an enumerable set of disulfide-bridged foldamers with different conformations of the same primary amino acid sequence that affect function and toxicity levels. Consequently, a given peptide may be toxic when its cysteine residues form a particular disulfide-bond pattern, while alternative bonding patterns (isoforms) or its reduced form (free cysteines with no disulfide bridges) may have little or no toxicological effects. Similarly, the same disulfide-bond pattern may be possible for other peptide sequences and result in different conformations that all exhibit varying toxicities to the same receptor or to different receptors. We present here new features that, when combined with primary sequence features to train machine learning algorithms to predict conotoxins, significantly increase prediction accuracy.
Affiliation(s)
- Lyman K. Monroe
  - Bioscience Division, MS M888, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
- Duc P. Truong
  - Theoretical Division, MS M888, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
- Jacob C. Miner
  - Bioscience Division, MS M888, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
- Samantha H. Adikari
  - Bioscience Division, MS M888, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
- Zachary J. Sasiene
  - Bioscience Division, MS M888, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
- Paul W. Fenimore
  - Theoretical Division, MS M888, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
- Boian Alexandrov
  - Theoretical Division, MS M888, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
- Robert F. Williams
  - Bioscience Division, MS M888, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
- Hau B. Nguyen
  - Bioscience Division, MS M888, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
6
Pradhan UK, Meher PK, Naha S, Rao AR, Kumar U, Pal S, Gupta A. ASmiR: a machine learning framework for prediction of abiotic stress-specific miRNAs in plants. Funct Integr Genomics 2023; 23:92. [PMID: 36939943 DOI: 10.1007/s10142-023-01014-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2022] [Revised: 01/18/2023] [Accepted: 03/06/2023] [Indexed: 03/21/2023]
Abstract
Abiotic stresses have become a major challenge in recent years due to their pervasive nature and severe impacts on plant growth, development, and quality. MicroRNAs (miRNAs) play a significant role in plant response to different abiotic stresses. Thus, identification of specific abiotic stress-responsive miRNAs holds immense importance in crop breeding programmes to develop cultivars resistant to abiotic stresses. In this study, we developed a machine learning-based computational model for prediction of miRNAs associated with four specific abiotic stresses: cold, drought, heat and salt. The pseudo K-tuple nucleotide compositional features of k-mer sizes 1 to 5 were used to represent miRNAs in numeric form. A feature selection strategy was employed to select important features. With the selected feature sets, the support vector machine (SVM) achieved the highest cross-validation accuracy in all four abiotic stress conditions. The highest cross-validated prediction accuracies in terms of area under the precision-recall curve were found to be 90.15, 90.09, 87.71, and 89.25% for cold, drought, heat and salt, respectively. Overall prediction accuracies for the independent dataset were 84.57, 80.62, 80.38 and 82.78%, respectively, for the four abiotic stresses. The SVM was also seen to outperform different deep learning models for prediction of abiotic stress-responsive miRNAs. To implement our method with ease, an online prediction server "ASmiR" has been established at https://iasri-sg.icar.gov.in/asmir/ . The proposed computational model and the developed prediction tool are believed to supplement the existing effort for identification of specific abiotic stress-responsive miRNAs in plants.
Affiliation(s)
- Upendra Kumar Pradhan
  - Division of Statistical Genetics, ICAR-Indian Agricultural Statistics Research Institute, PUSA, New Delhi, 110012, India
- Prabina Kumar Meher
  - Division of Statistical Genetics, ICAR-Indian Agricultural Statistics Research Institute, PUSA, New Delhi, 110012, India
- Sanchita Naha
  - Division of Computer Applications, ICAR-Indian Agricultural Statistics Research Institute, PUSA, New Delhi, 110012, India
- Upendra Kumar
  - Department of Molecular Biology, Biotechnology and Bioinformatics, College of Basic Sciences and Humanities, CCS Haryana Agricultural University, Hisar, 125004, India
- Soumen Pal
  - Division of Computer Applications, ICAR-Indian Agricultural Statistics Research Institute, PUSA, New Delhi, 110012, India
- Ajit Gupta
  - Division of Statistical Genetics, ICAR-Indian Agricultural Statistics Research Institute, PUSA, New Delhi, 110012, India
7
Pohanka M. Immunosensors for Assay of Toxic Biological Warfare Agents. BIOSENSORS 2023; 13:402. [PMID: 36979614 PMCID: PMC10046508 DOI: 10.3390/bios13030402] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2023] [Revised: 03/17/2023] [Accepted: 03/19/2023] [Indexed: 06/18/2023]
Abstract
An immunosensor for the assay of toxic biological warfare agents is a biosensor suitable for detecting hazardous substances such as aflatoxin, botulinum toxin, ricin, Shiga toxin, and others. Immunosensors are used in outdoor assays, in point-of-care tests, as a backup method for more expensive devices, and even in the laboratory as a standard analytical method. Some immunosensors, such as automated flow-through analyzers or lateral flow tests, have been successfully commercialized as tools for toxin assays, but the research is ongoing. New devices are being developed, and the use of advanced materials and assay techniques makes immunosensors highly competitive analytical devices in the field of toxic biological warfare agent assays. This review summarizes current applications and new trends in immunosensors, drawing on recent papers in this area.
Affiliation(s)
- Miroslav Pohanka
  - Faculty of Military Health Sciences, University of Defense, Trebesska 1575, CZ-50001 Hradec Kralove, Czech Republic
8
Yue ZX, Yan TC, Xu HQ, Liu YH, Hong YF, Chen GX, Xie T, Tao L. A systematic review on the state-of-the-art strategies for protein representation. Comput Biol Med 2023; 152:106440. [PMID: 36543002 DOI: 10.1016/j.compbiomed.2022.106440] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/15/2022] [Revised: 12/08/2022] [Accepted: 12/15/2022] [Indexed: 12/23/2022]
Abstract
The study of drug-target protein interaction is a key step in drug research. In recent years, machine learning techniques have become attractive for research, including drug research, due to their automated nature, predictive power, and expected efficiency. Protein representation is a key step in the study of drug-target protein interaction by machine learning and plays a fundamental role in achieving accurate results. With the progress of machine learning, protein representation methods have gradually attracted attention and have consequently developed rapidly. Therefore, in this review, we systematically classify current protein representation methods, comprehensively review them, and discuss the latest advances of interest. According to the information extraction methods and information sources, these representation methods are generally divided into structure-based and sequence-based representation methods. Each primary class can be further divided into specific subcategories. The particular representation methods covered include both traditional and the latest approaches. This review contains a comprehensive assessment of the various methods, which researchers can use as a reference for their specific protein-related research requirements, including drug research.
Affiliation(s)
- Zi-Xuan Yue
  - Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
- Tian-Ci Yan
  - Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
- Hong-Quan Xu
  - Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
- Yu-Hong Liu
  - Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
- Yan-Feng Hong
  - Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
- Gong-Xing Chen
  - Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
- Tian Xie
  - Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
- Lin Tao
  - Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
9
Gao B, Huang Y, Peng C, Lin B, Liao Y, Bian C, Yang J, Shi Q. High-Throughput Prediction and Design of Novel Conopeptides for Biomedical Research and Development. BIODESIGN RESEARCH 2022; 2022:9895270. [PMID: 37850131 PMCID: PMC10521759 DOI: 10.34133/2022/9895270] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/24/2022] [Accepted: 07/23/2022] [Indexed: 10/19/2023] Open
Abstract
Cone snail venoms have been considered a valuable treasure for international scientists and businessmen, mainly due to their pharmacological applications in development of marine drugs for treatment of various human diseases. To date, around 800 Conus species are recorded, and each of them produces over 1,000 venom peptides (termed as conopeptides or conotoxins). This reflects the high diversity and complexity of cone snails, although most of their venoms are still uncharacterized. Advanced multiomics (such as genomics, transcriptomics, and proteomics) approaches have been recently developed to mine diverse Conus venom samples, with the main aim to predict and identify potentially interesting conopeptides in an efficient way. Some bioinformatics techniques have been applied to predict and design novel conopeptide sequences, related targets, and their binding modes. This review provides an overview of current knowledge on the high diversity of conopeptides and multiomics advances in high-throughput prediction of novel conopeptide sequences, as well as molecular modeling and design of potential drugs based on the predicted or validated interactions between these toxins and their molecular targets.
Affiliation(s)
- Bingmiao Gao
  - Key Laboratory of Tropical Translational Medicine of Ministry of Education, School of Pharmacy, Hainan Medical University, Haikou, Hainan 570102, China
- Yu Huang
  - Shenzhen Key Lab of Marine Genomics, Guangdong Provincial Key Lab of Molecular Breeding in Marine Economic Animals, BGI Academy of Marine Sciences, BGI Marine, Shenzhen, Guangdong 518081, China
- Chao Peng
  - Shenzhen Key Lab of Marine Genomics, Guangdong Provincial Key Lab of Molecular Breeding in Marine Economic Animals, BGI Academy of Marine Sciences, BGI Marine, Shenzhen, Guangdong 518081, China
  - BGI-Marine Research Institute for Biomedical Technology, Shenzhen Huahong Marine Biomedicine Co. Ltd., Shenzhen, Guangdong 518119, China
- Bo Lin
  - Hainan Provincial Key Laboratory of Carcinogenesis and Intervention, Hainan Medical University, Haikou, Hainan 570102, China
- Yanling Liao
  - Key Laboratory of Tropical Translational Medicine of Ministry of Education, School of Pharmacy, Hainan Medical University, Haikou, Hainan 570102, China
- Chao Bian
  - Shenzhen Key Lab of Marine Genomics, Guangdong Provincial Key Lab of Molecular Breeding in Marine Economic Animals, BGI Academy of Marine Sciences, BGI Marine, Shenzhen, Guangdong 518081, China
- Jiaan Yang
  - Research and Development Department, Micro Pharmtech Ltd., Wuhan, Hubei 430075, China
- Qiong Shi
  - Shenzhen Key Lab of Marine Genomics, Guangdong Provincial Key Lab of Molecular Breeding in Marine Economic Animals, BGI Academy of Marine Sciences, BGI Marine, Shenzhen, Guangdong 518081, China
  - BGI-Marine Research Institute for Biomedical Technology, Shenzhen Huahong Marine Biomedicine Co. Ltd., Shenzhen, Guangdong 518119, China
10
Wu Y, Yang M, Li Y, Zhang W, Zhou M. Synthesis and evaluation of a novel analgesic conotoxin Lt7b that inhibits calcium currents and increases sodium currents. J Cell Mol Med 2022; 26:5330-5334. [PMID: 36050866 PMCID: PMC9575111 DOI: 10.1111/jcmm.17521] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/19/2021] [Revised: 07/22/2022] [Accepted: 08/01/2022] [Indexed: 11/26/2022] Open
Abstract
Conotoxins are promising neuropharmacological tools and drug candidates due to their high efficiency and specificity in targeting ion channels or neurotransmitter receptors. In this study, a novel O2‐superfamily conotoxin, Lt7b, was synthesized and its pharmacological functions were evaluated. Lt7b with three modified amino acids and three disulfide bonds was successfully synthesized. CD spectra showed that Lt7b had a typical α‐helix in the secondary structure. Patch clamp experiments on rat DRG neurons showed that Lt7b could significantly inhibit calcium currents with an IC50 value of 856 ± 95 nM. Meanwhile, 10 μM Lt7b could significantly increase the sodium currents by 77 ± 8%, but it had no obvious effects on the potassium currents in DRG neurons. In addition, patch clamp experiments on ion channel subtypes showed that 10 μM Lt7b could inhibit 7.0 ± 1.2%, 8.0 ± 1.5%, 4.6 ± 3.4%, and 9.5 ± 0.1% of the hCav1.2, hCav2.1, hCav2.2, and hCav3.2 currents, respectively, while it did not increase the rNav1.7, rNav1.8, hNav1.5, hNav1.7, and hNav1.8 currents. Lt7b had no obvious toxicity to HaCaT and ND7/23 cells up to 1 mM and significantly increased the pain threshold at the testing time of 0.5–4 h in a dose‐dependent manner in the mouse hotplate assay. This novel conotoxin Lt7b may be a useful tool for ion channel studies and analgesic drug development.
Affiliation(s)
- Yun Wu
  - Guangdong Provincial Key Laboratory of Medical Molecular Diagnostics, The First Dongguan Affiliated Hospital, Guangdong Medical University, Dongguan, China
- Manyi Yang
  - Department of Hepatobiliary and Pancreatic Surgery, NHC Key Laboratory of Nanobiological Technology, Xiangya Hospital, Central South University, Changsha, China
- Yubin Li
  - Department of Oncology, NHC Key Laboratory of Cancer Proteomics, State Local Joint Engineering Laboratory for Anticancer Drugs, Xiangya Hospital, Central South University, Changsha, China
- Wei Zhang
  - Institute of High Energy Physics, Chinese Academy of Sciences, Shijingshan District, China
- Maojun Zhou
  - Department of Oncology, NHC Key Laboratory of Cancer Proteomics, State Local Joint Engineering Laboratory for Anticancer Drugs, Xiangya Hospital, Central South University, Changsha, China
11
Jin F, Xi Y, Xie D, Wang Q. Comprehensive analysis reveals a 5-gene signature and immune cell infiltration in Alzheimer’s disease with qPCR validation. Front Genet 2022; 13:913535. [PMID: 36092935 PMCID: PMC9454400 DOI: 10.3389/fgene.2022.913535] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2022] [Accepted: 07/19/2022] [Indexed: 11/30/2022] Open
Abstract
Over 50 million people around the world are currently suffering from Alzheimer’s disease (AD) without any effective therapy. Neuroinflammation plays a pivotal role in AD, which leads us to probe the profile of immune cell infiltration in AD. Here, we analyzed a microarray dataset (GSE44770) containing 115 AD and 115 control samples to determine biomarkers and immune infiltration characteristics of AD by multiple bioinformatics methods. First, we identified 3,840 DEGs (1,892 upregulated and 1,948 downregulated) by using the limma package and 2,697 hub genes by constructing a weighted gene correlation network, and they had a total of 2,167 intersecting genes. Second, combining the LASSO logistic regression and SVM-RFE, we obtained five biomarkers (DGKG, MAP3K7IP2, NFKBIE, VIP, and PCCB), which may reveal the key pathogenetic features of AD and serve as diagnostic markers, as assessed by the ROC curve (AUC = 0.9716) and validation on another AD dataset (GSE33000) (AUC = 0.9388). Third, immune cell infiltration analysis revealed that, compared with control samples, plasma cells, CD8 T cells, T follicular helper cells, and activated NK cells infiltrated less in AD, whereas monocytes, M2 macrophages, and neutrophils infiltrated more in AD. Neutrophils and activated NK cells demonstrated the most significant negative correlation. Then, Spearman correlation analysis between the five biomarkers and immune infiltrating cells revealed that all of them were significantly associated with plasma cells. Finally, mRNA levels of VIP and PCCB were confirmed in a murine AD model. In conclusion, DGKG, MAP3K7IP2, NFKBIE, VIP, and PCCB may be used as diagnostic markers of AD, and the disruption of the delicate immune balance may be a key process in the onset and development of AD.
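As a reminder of what the reported AUC values measure, the sketch below computes the area under the ROC curve as the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one (ties count as one half). The function name `roc_auc` and the toy labels and scores are illustrative assumptions. The abstract does not say which software the authors used, and this pairwise form is intended only for small inputs.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    Illustrative helper: labels are 1 (e.g. AD) or 0 (e.g. control);
    scores are any real-valued classifier outputs.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties contribute half a win
    return wins / (len(pos) * len(neg))
```

With toy labels `[1, 1, 0, 0]` and scores `[0.9, 0.4, 0.5, 0.1]`, three of the four positive-negative pairs are ordered correctly, so the function returns 0.75.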
Affiliation(s)
- Fanmao Jin
  - Lishui People's Hospital, Lishui, Zhejiang, China
- Yuemei Xi
  - School of Medicine, Xiamen University, Xiamen, Fujian, China
  - Xiamen Key Laboratory of Translational Medicine for Nucleic Acid Metabolism and Regulation, Xiamen, Fujian, China
- De Xie
  - School of Medicine, Xiamen University, Xiamen, Fujian, China
  - Xiamen Key Laboratory of Translational Medicine for Nucleic Acid Metabolism and Regulation, Xiamen, Fujian, China
- Qiang Wang
  - School of Medicine, Xiamen University, Xiamen, Fujian, China
  - Xiamen Key Laboratory of Translational Medicine for Nucleic Acid Metabolism and Regulation, Xiamen, Fujian, China
  - *Correspondence: Qiang Wang,
|
12
|
Wang H, Li Y, Yang M, Zhou M. Synthesis and characterization of αM-conotoxin SIIID, a reversible human α7 nicotinic acetylcholine receptor antagonist. Toxicon 2022; 210:141-147. [DOI: 10.1016/j.toxicon.2022.03.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/25/2021] [Revised: 02/14/2022] [Accepted: 03/03/2022] [Indexed: 11/24/2022]
|
13
|
Meng C, Ju Y, Shi H. TMPpred: A support vector machine-based thermophilic protein identifier. Anal Biochem 2022; 645:114625. [PMID: 35218736 DOI: 10.1016/j.ab.2022.114625] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/26/2021] [Revised: 02/18/2022] [Accepted: 02/21/2022] [Indexed: 11/13/2022]
Abstract
MOTIVATION: Thermostability allows proteins to overcome temperature limits and perform a wider range of functions. Using machine learning, we explored the mechanism of and reasons for protein thermostability characteristics. RESULTS: Unlike other methods that pursue only model performance, we aim to find important features so as to provide a useful reference for in vitro experiments. We framed this as a binary classification problem, that is, distinguishing thermophilic proteins from nonthermophilic proteins. Using support vector machine-based model construction and analysis, we inferred that Gly, Ala, Ser and Thr may be the most important components at the residue level that determine the thermal stability of proteins. It is also noteworthy that our proposed model obtains an Sn of 0.892, an Sp of 0.857, an ACC of 0.87566 and an AUC of 0.874. To facilitate use by other researchers, we wrapped our model and deployed it as a web server, which is accessible at http://112.124.26.17:7000/TMPpred/index.html.
Affiliation(s)
- Chaolu Meng
  - College of Computer and Information Engineering, Inner Mongolia Agricultural University, Hohhot, China; Inner Mongolia Autonomous Region Key Laboratory of Big Data Research and Application for Agriculture and Animal Husbandry, Hohhot, China
- Ying Ju
  - School of Informatics, Xiamen University, Xiamen, China.
- Hua Shi
  - School of Opto-electronic and Communication Engineering, Xiamen University of Technology, Xiamen, China.
|
14
|
ASRmiRNA: Abiotic Stress-Responsive miRNA Prediction in Plants by Using Machine Learning Algorithms with Pseudo K-Tuple Nucleotide Compositional Features. Int J Mol Sci 2022; 23:ijms23031612. [PMID: 35163534 PMCID: PMC8835813 DOI: 10.3390/ijms23031612] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 12/29/2021] [Revised: 01/23/2022] [Accepted: 01/26/2022] [Indexed: 02/04/2023] Open
Abstract
MicroRNAs (miRNAs) play a significant role in plant response to different abiotic stresses. Thus, identification of abiotic stress-responsive miRNAs holds immense importance in crop breeding programmes to develop cultivars resistant to abiotic stresses. In this study, we developed a machine learning-based computational method for prediction of miRNAs associated with abiotic stresses. Three types of datasets were used for prediction, i.e., miRNA, Pre-miRNA, and Pre-miRNA + miRNA. The pseudo K-tuple nucleotide compositional features were generated for each sequence to transform the sequence data into numeric feature vectors. Support vector machine (SVM) was employed for prediction. Areas under the receiver operating characteristic curve (auROC) of 70.21%, 69.71%, and 77.94% and areas under the precision-recall curve (auPRC) of 69.96%, 65.64%, and 77.32% were obtained for the miRNA, Pre-miRNA, and Pre-miRNA + miRNA datasets, respectively. Overall prediction accuracies for the independent test set were 62.33%, 64.85%, and 69.21%, respectively, for the three datasets. The SVM also achieved higher accuracy than other learning methods such as random forest, extreme gradient boosting, and adaptive boosting. To implement our method with ease, an online prediction server “ASRmiRNA” has been developed. The proposed approach is believed to supplement the existing effort for identification of abiotic stress-responsive miRNAs and Pre-miRNAs.
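As a rough illustration of how a sequence can be turned into a numeric feature vector, the sketch below computes plain K-tuple nucleotide composition (normalized k-mer frequencies). It is a simplified stand-in and not the authors' implementation: the pseudo K-tuple variant named in the abstract also encodes sequence-order information, which is omitted here, and the function name `knc_features` and the example sequence are hypothetical.

```python
from itertools import product


def knc_features(seq: str, k: int = 2) -> list[float]:
    """Return normalized k-mer frequencies, ordered lexicographically over ACGU."""
    seq = seq.upper().replace("T", "U")  # treat DNA and RNA input alike
    kmers = ["".join(p) for p in product("ACGU", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip windows containing ambiguous bases such as N
            counts[kmer] += 1
            total += 1
    # Divide by the number of valid windows so the vector sums to 1.
    return [counts[m] / total if total else 0.0 for m in kmers]


# Hypothetical 10-nt sequence; k=2 yields a 16-dimensional vector.
features = knc_features("AUGGCUAUGC", k=2)
```

A vector like this could then be passed to any classifier, such as the SVM mentioned in the abstract.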
|
15
|
Preparation and Functional Identification of a Novel Conotoxin QcMNCL-XIII0.1 from Conus quercinus. Toxins (Basel) 2022; 14:toxins14020099. [PMID: 35202127 PMCID: PMC8877388 DOI: 10.3390/toxins14020099] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/13/2021] [Revised: 01/20/2022] [Accepted: 01/24/2022] [Indexed: 01/14/2023] Open
Abstract
Conotoxins are tools used by marine Conus snails to hunt and are a significant repository for marine drug research. Conotoxins act with high selectivity on different subtypes of various ion channels, and a few have been used in pain management. Although more than 8000 conotoxin genes have been found, the biological activity and function of most have not yet been examined. In this report, we selected the toxin gene QcMNCL-XIII0.1 from our previous investigation and studied it in vitro. First, we successfully prepared active recombinant QcMNCL-XIII0.1 using a TrxA (Thioredoxin A)-assisted folding expression vector based on genetic engineering technology. Animal experiments showed that the recombinant QcMNCL-XIII0.1 exhibited nerve conduction inhibition similar to that of pethidine hydrochloride. Using flow cytometry combined with the fluorescent probe Fluo-4 AM, we found that 10 ng/μL recombinant QcMNCL-XIII0.1 reduced the fluorescence intensity by 31.07% in a 293T cell model transfected with Cav3.1, implying an interaction between the α1G T-type calcium channel protein and recombinant QcMNCL-XIII0.1. This toxin could be an important drug in biomedical research and medicine for pain control.
|
16
|
Mining the Royal Jelly Proteins: Combinatorial Hexapeptide Ligand Library Significantly Improves the MS-Based Proteomic Identification in Complex Biological Samples. Molecules (Basel, Switzerland) 2021; 26:molecules26092762. [PMID: 34067143 PMCID: PMC8125745 DOI: 10.3390/molecules26092762] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 03/15/2021] [Revised: 04/30/2021] [Accepted: 05/04/2021] [Indexed: 12/18/2022]
Abstract
Royal jelly (RJ) is a complex, creamy secretion produced by the glands of worker bees. Due to its health-promoting properties, it is used by humans as a dietary supplement. However, RJ compounds are not yet fully characterized. Hence, in this research, we aimed to broaden the knowledge of the proteomic composition of fresh RJ. Water extracts of the samples were pre-treated using combinatorial hexapeptide ligand libraries (ProteoMiner™ kit), trypsin-digested, and analyzed by a nanoLC-MALDI-TOF/TOF MS system. To check the ProteoMiner™ performance in MS-based protein identification, we also examined RJ extracts that were not prepared with the ProteoMiner™ kit. We identified a total of 86 proteins taxonomically classified to Apis spp. (bees). Among them, 74 proteins were detected in RJ extracts pre-treated with the ProteoMiner™ kit, and only 50 proteins were found in extracts not enriched with this technique. Ten of the identified features were hypothetical proteins whose existence has been predicted but for which no experimental evidence yet proves in vivo expression. Additionally, we detected four uncharacterized proteins of unknown function. The results of this research indicate that the ProteoMiner™ strategy improves proteomic identification in complex biological samples. Broadening the knowledge of RJ composition may contribute to the development of standards and regulations, enhancing the quality of RJ and, consequently, the safety of its supplementation.
|
17
|
He S, Guo F, Zou Q, Ding H. MRMD2.0: A Python Tool for Machine Learning with Feature Ranking and Reduction. Curr Bioinform 2021. [DOI: 10.2174/1574893615999200503030350] [Citation(s) in RCA: 101] [Impact Index Per Article: 33.7] [Indexed: 11/22/2022]
Abstract
Aims:
The study aims to find a way to reduce the dimensionality of a dataset.
Background:
Dimensionality reduction is a key issue in the machine learning process. It not only improves prediction performance but can also recommend the intrinsic features and help to explore the biological expression of the machine learning “black box”.
Objective:
A variety of feature selection algorithms are used to select data features and thereby achieve dimensionality reduction.
Methods:
First, MRMD2.0 integrates seven popular feature ranking algorithms using a PageRank strategy. Second, the optimal dimensionality is detected with a forward adding strategy.
Result:
We have achieved good results in our experiments.
Conclusion:
Several works have been tested with MRMD2.0, and it performed well. In addition, it can draw performance curves according to the feature dimensionality. Users who want to sacrifice some accuracy for fewer features can select the dimensionality from the performance curves.
Other:
We developed user-friendly Python tools together with a web server. Users can upload their files in csv, arff or libsvm format, and the web server then helps to rank features and find the optimal dimensionality.
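The forward adding strategy mentioned under Methods can be sketched roughly as follows. This is an illustrative outline and not MRMD2.0's actual code or API: the function `forward_adding`, the toy scoring function, and the feature names are all hypothetical, and in practice the score would typically be a cross-validated performance metric.

```python
from typing import Callable, Sequence


def forward_adding(
    ranked: Sequence[str],
    score: Callable[[list[str]], float],
) -> tuple[list[str], list[float]]:
    """Grow the feature set one ranked feature at a time.

    Returns the best-scoring prefix of the ranking and the full
    performance curve (one score per dimensionality).
    """
    curve: list[float] = []
    best_subset: list[str] = []
    best_score = float("-inf")
    for n in range(1, len(ranked) + 1):
        subset = list(ranked[:n])
        s = score(subset)
        curve.append(s)
        if s > best_score:  # strict '>' keeps the smaller subset on ties
            best_subset, best_score = subset, s
    return best_subset, curve


# Toy score (an assumption for illustration): reward informative
# features and lightly penalize dimensionality.
USEFUL = {"f1", "f3"}


def toy_score(subset: list[str]) -> float:
    return len(USEFUL & set(subset)) - 0.1 * len(subset)


best, curve = forward_adding(["f1", "f3", "f2", "f4"], toy_score)
```

The returned curve corresponds to the performance curve described in the Conclusion: a user willing to trade accuracy for fewer features could pick an earlier point on it.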
Affiliation(s)
- Shida He
  - College of Intelligence and Computing, Tianjin University, Tianjin, China
- Fei Guo
  - College of Intelligence and Computing, Tianjin University, Tianjin, China
- Quan Zou
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
- Hui Ding
  - Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
|
18
|
Wang H, Liang P, Zheng L, Long C, Li H, Zuo Y. eHSCPr discriminating the cell identity involved in endothelial to hematopoietic transition. Bioinformatics 2021; 37:2157-2164. [PMID: 33532815 DOI: 10.1093/bioinformatics/btab071] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 12/04/2020] [Revised: 01/15/2021] [Accepted: 01/28/2021] [Indexed: 12/11/2022] Open
Abstract
MOTIVATION: Hematopoietic stem cells (HSCs) give rise to all blood cells and play a vital role throughout the whole lifespan through their pluripotency and self-renewal properties. Accurately identifying the stages of early HSCs is extremely important, as it may open up new prospects for extracorporeal blood research. Existing experimental techniques for identifying the early stages of HSC development are time-consuming and expensive. Machine learning has shown its excellence in massive single-cell data processing, and it is desirable to develop related computational models as good complements to experimental techniques. RESULTS: In this study, we presented a novel predictor called eHSCPr specifically for predicting the early stages of HSC development. To reveal the distinct genes at each developmental stage of HSCs, we compared F-score with three state-of-the-art differential gene selection methods (limma, DESeq2, edgeR) and evaluated their performance. F-score captured more of the critical surface markers of endothelial cells and hematopoietic cells, and the area under the receiver operating characteristic (ROC) curve was 0.987. Based on SVM, the 10-fold cross-validation accuracy of eHSCPr in the independent dataset and the training dataset reached 94.84% and 94.19%, respectively. Importantly, we performed transcription analysis on the F-score gene set, which indeed further enriched the signal markers of HSC development stages. eHSCPr can be a powerful tool for predicting early stages of HSC development, facilitating hypothesis-driven experimental design and providing crucial clues for in vitro blood regeneration studies. AVAILABILITY: http://bioinfor.imu.edu.cn/ehscpr. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Hao Wang
  - State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, 010070, China
- Pengfei Liang
  - State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, 010070, China
- Lei Zheng
  - State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, 010070, China
- ChunShen Long
  - State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, 010070, China
- HanShuang Li
  - State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, 010070, China
- Yongchun Zuo
  - State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, 010070, China
|
19
|
Yang M, Zhou M. μ-conotoxin TsIIIA, a peptide inhibitor of human voltage-gated sodium channel hNav1.8. Toxicon 2020; 186:29-34. [PMID: 32758497 DOI: 10.1016/j.toxicon.2020.07.024] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 04/23/2020] [Revised: 07/19/2020] [Accepted: 07/22/2020] [Indexed: 10/23/2022]
Abstract
TsIIIA, the first μ-conotoxin from Conus tessulatus, can selectively inhibit rat tetrodotoxin-resistant sodium channels. TsIIIA also shows potent analgesic activity in a mouse hotplate analgesic assay, but its effect on human sodium channels remains unknown. In this study, eight human sodium channel subtypes, hNav1.1 to hNav1.8, were expressed in HEK293 or ND7/23 cells and tested against chemically synthesized TsIIIA. Patch clamp experiments showed that 10 μM TsIIIA had no effect on the tetrodotoxin-sensitive hNav1.1, hNav1.2, hNav1.3, hNav1.4, hNav1.6 and hNav1.7, or on the tetrodotoxin-resistant hNav1.5. For tetrodotoxin-resistant hNav1.8, concentrations of 1, 5 and 10 μM TsIIIA reduced the hNav1.8 currents to 59.26%, 36.21% and 24.93%, respectively. Further detailed dose-effect experiments showed that TsIIIA inhibited hNav1.8 currents with an IC50 value of 2.11 μM. In addition, 2 μM TsIIIA did not induce a shift in the current-voltage relationship of hNav1.8. Taken together, the hNav1.8 peptide inhibitor TsIIIA provides a pharmacological probe for sodium channels and a potential therapeutic agent for pain.
Affiliation(s)
- Manyi Yang
  - Department of Hepatobiliary and Pancreatic Surgery, NHC Key Laboratory of Nanobiological Technology, Xiangya Hospital, Central South University, Changsha, Hunan, China
- Maojun Zhou
  - Department of Oncology, State Local Joint Engineering Laboratory for Anticancer Drugs, NHC Key Laboratory of Cancer Proteomics, Xiangya Hospital, Central South University, Changsha, Hunan, 410008, China.
|
20
|
Gallo A, Boni R, Tosti E. Neurobiological activity of conotoxins via sodium channel modulation. Toxicon 2020; 187:47-56. [PMID: 32877656 DOI: 10.1016/j.toxicon.2020.08.019] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 05/29/2020] [Revised: 07/20/2020] [Accepted: 08/22/2020] [Indexed: 01/02/2023]
Abstract
Conotoxins (CnTX) are bioactive peptides produced by marine molluscs belonging to the genus Conus. The biochemical structure of these venomous peptides is characterized by a low number of amino acids linked with disulfide bonds formed by a high degree of post-translational modifications and glycosylation steps, which increase the diversity and rate of evolution of these molecules. Different CnTX isoforms are known to target ion channels and, in particular, voltage-gated sodium (Na+) channels (Nav channels). These are transmembrane proteins that are fundamental in excitable cells for generating the depolarization of the plasma membrane potential, known as the action potential, which propagates electrical signals in muscles and nerves for physiological functions. Disorders in Nav channel activity have been shown to induce neurological pathologies and pain states. Here, we describe the current knowledge of CnTX isoform modulation of Nav channel activity, the mechanism of action and the potential therapeutic use of these toxins in counteracting neurological dysfunctions.
Affiliation(s)
- Alessandra Gallo
  - Department of Biology and Evolution of Marine Organisms, Stazione Zoologica Anton Dohrn, Villa Comunale, 80121, Naples, Italy.
- Raffele Boni
  - Department of Sciences, University of Basilicata, 85100, Potenza, Italy.
- Elisabetta Tosti
  - Department of Biology and Evolution of Marine Organisms, Stazione Zoologica Anton Dohrn, Villa Comunale, 80121, Naples, Italy.
|
21
|
Tan JX, Lv H, Wang F, Dao FY, Chen W, Ding H. A Survey for Predicting Enzyme Family Classes Using Machine Learning Methods. Curr Drug Targets 2020; 20:540-550. [PMID: 30277150 DOI: 10.2174/1389450119666181002143355] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 07/21/2018] [Revised: 08/17/2018] [Accepted: 09/04/2018] [Indexed: 12/13/2022]
Abstract
Enzymes are proteins that act as biological catalysts to speed up cellular biochemical processes. According to their main Enzyme Commission (EC) numbers, enzymes are divided into six categories: EC-1: oxidoreductase; EC-2: transferase; EC-3: hydrolase; EC-4: lyase; EC-5: isomerase and EC-6: synthetase. Different enzymes have different biological functions and act on different substrates. Therefore, knowing which family an enzyme belongs to can help infer its catalytic mechanism and provide information about the relevant biological function. With the large number of protein sequences flowing into databanks in the post-genomics age, annotating the family of an enzyme is very important. Since experimental methods are not cost-effective, bioinformatics tools will be a great help for accurately classifying the family of enzymes. In this review, we summarized the application of machine learning methods in the prediction of enzyme family from different aspects. We hope that this review will provide insights and inspiration for research on enzyme family classification.
Affiliation(s)
- Jiu-Xin Tan
  - Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Hao Lv
  - Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Fang Wang
  - Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Fu-Ying Dao
  - Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Wei Chen
  - Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
  - Department of Physics, School of Sciences, and Center for Genomics and Computational Biology, North China University of Science and Technology, Tangshan 063000, China
  - Gordon Life Science Institute, Boston, MA 02478, United States
- Hui Ding
  - Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
|
22
|
Liang P, Yang W, Chen X, Long C, Zheng L, Li H, Zuo Y. Machine Learning of Single-Cell Transcriptome Highly Identifies mRNA Signature by Comparing F-Score Selection with DGE Analysis. Molecular Therapy. Nucleic Acids 2020; 20:155-163. [PMID: 32169803 PMCID: PMC7066034 DOI: 10.1016/j.omtn.2020.02.004] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 11/12/2019] [Revised: 12/27/2019] [Accepted: 02/05/2020] [Indexed: 12/21/2022]
Abstract
Human preimplantation development is a complex process involving dramatic changes in transcriptional architecture. For a better understanding of its spatiotemporal development, it is indispensable to identify key genes. Although single-cell RNA sequencing (RNA-seq) techniques can provide detailed clustering signatures, the identification of decisive factors remains difficult. Additionally, it entails high experimental cost and a long experimental period. Thus, it is highly desirable to develop computational methods for identifying effective genes of the development signature. In this study, we developed a predictor called EmPredictor to identify developmental stages of human preimplantation embryogenesis. First, we compared the F-score feature selection algorithm with differential gene expression (DGE) analysis to find specific signatures of the development stage. In addition, by training a support vector machine (SVM), four types of signature subsets were comprehensively discussed. The prediction results showed that a feature subset with 1,881 genes from the F-score algorithm obtained the best predictive performance, achieving the highest accuracy of 93.3% on the cross-validation set. Further function enrichment demonstrated that the gene set selected by the feature selection method was involved in more development-related pathways and cell fate determination biomarkers. This indicates that the F-score algorithm should be preferred for detecting key genes of multi-period data in mammalian early development.
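To make the F-score criterion more concrete, here is a minimal sketch that assumes the commonly used two-class definition: the squared deviations of each class mean from the overall mean, divided by the sum of the within-class sample variances. The abstract does not state the exact formula or how it is extended to several developmental stages, so the function `f_score` and the toy expression values below are illustrative assumptions only.

```python
from statistics import mean, variance


def f_score(pos: list[float], neg: list[float]) -> float:
    """Two-class F-score of one feature (each class needs >= 2 samples)."""
    overall = mean(pos + neg)
    # Between-class term: how far each class mean sits from the overall mean.
    between = (mean(pos) - overall) ** 2 + (mean(neg) - overall) ** 2
    # Within-class term: sum of sample variances (n - 1 denominator).
    within = variance(pos) + variance(neg)
    if within == 0:
        return 0.0 if between == 0 else float("inf")
    return between / within


# Toy expression values for two hypothetical genes across two stages.
genes = {
    "geneA": ([5.0, 6.0, 7.0], [1.0, 2.0, 3.0]),  # well separated
    "geneB": ([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]),  # uninformative
}
ranking = sorted(genes, key=lambda g: f_score(*genes[g]), reverse=True)
```

Genes at the top of such a ranking would be the candidates retained as a signature subset before training a classifier.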
Affiliation(s)
- Pengfei Liang
  - The State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot 010070, China
- Wuritu Yang
  - The State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot 010070, China
- Xing Chen
  - The State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot 010070, China
- Chunshen Long
  - The State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot 010070, China
- Lei Zheng
  - The State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot 010070, China
- Hanshuang Li
  - The State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot 010070, China
- Yongchun Zuo
  - The State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot 010070, China.
|
23
|
Zhou T, Law KMY, Yung KL. An empirical analysis of intention of use for bike-sharing system in China through machine learning techniques. ENTERP INF SYST-UK 2020. [DOI: 10.1080/17517575.2020.1758796] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 10/24/2022]
Affiliation(s)
- Tao Zhou
  - School of Engineering, Faculty of Science Engineering and Built Environment, Deakin University, Geelong, Australia
- Kris M. Y. Law
  - School of Engineering, Faculty of Science Engineering and Built Environment, Deakin University, Geelong, Australia
  - Department of Industrial Engineering Management, University of Oulu, Oulu, Finland
- K. L. Yung
  - Department of Industrial and Systems Engineering, The Hong Kong Polytechnic University, Hong Kong
|
24
|
Jin AH, Muttenthaler M, Dutertre S, Himaya SWA, Kaas Q, Craik DJ, Lewis RJ, Alewood PF. Conotoxins: Chemistry and Biology. Chem Rev 2019; 119:11510-11549. [PMID: 31633928 DOI: 10.1021/acs.chemrev.9b00207] [Citation(s) in RCA: 168] [Impact Index Per Article: 33.6] [Indexed: 02/06/2023]
Abstract
The venom of the marine predatory cone snails (genus Conus) has evolved for prey capture and defense, providing the basis for survival and rapid diversification of the now estimated 750+ species. A typical Conus venom contains hundreds to thousands of bioactive peptides known as conotoxins. These mostly disulfide-rich and well-structured peptides act on a wide range of targets such as ion channels, G protein-coupled receptors, transporters, and enzymes. Conotoxins are of interest to neuroscientists as well as drug developers due to their exquisite potency and selectivity, not just against prey but also mammalian targets, thereby providing a rich source of molecular probes and therapeutic leads. The rise of integrated venomics has accelerated conotoxin discovery with now well over 10,000 conotoxin sequences published. However, their structural and pharmacological characterization lags considerably behind. In this review, we highlight the diversity of new conotoxins uncovered since 2014, their three-dimensional structures and folds, novel chemical approaches to their syntheses, and their value as pharmacological tools to unravel complex biology. Additionally, we discuss challenges and future directions for the field.
Affiliation(s)
- Ai-Hua Jin
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane Queensland 4072, Australia
- Markus Muttenthaler
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane Queensland 4072, Australia
  - Institute of Biological Chemistry, Faculty of Chemistry, University of Vienna, 1090 Vienna, Austria
- Sebastien Dutertre
  - Département des Acides Amines, Peptides et Protéines, Unité Mixte de Recherche 5247, Université Montpellier 2-Centre Nationale de la Recherche Scientifique, Institut des Biomolécules Max Mousseron, Place Eugène Bataillon, 34095 Montpellier Cedex 5, France
- S W A Himaya
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane Queensland 4072, Australia
- Quentin Kaas
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane Queensland 4072, Australia
- David J Craik
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane Queensland 4072, Australia
- Richard J Lewis
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane Queensland 4072, Australia
- Paul F Alewood
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane Queensland 4072, Australia
|
25
|
Verma N, Singh H, Khanna D, Rana PS, Bhadada SK. Classification of drug molecules for oxidative stress signalling pathway. IET Syst Biol 2019; 13:243-250. [PMID: 31538958 PMCID: PMC8687196 DOI: 10.1049/iet-syb.2018.5078] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 01/18/2023] Open
Abstract
In humans, oxidative stress is involved in the development of diabetes, cancer, hypertension, Alzheimer's disease, and heart failure. One of the mechanisms in the cellular defence against oxidative stress is the activation of the Nrf2-antioxidant response element (ARE) signalling pathway. The main aim of this study is to compute the activity, efficacy, and potency scores of the ARE signalling pathway and to propose a multi-level prediction scheme for them, since this pathway contributes substantially to mitigating oxidative stress in humans. The required knowledge is gathered through the process of knowledge discovery from data, and machine learning techniques are then applied to propose the multi-level scheme. The proposed scheme is validated using the K-fold cross-validation method, and an accuracy of 90% is achieved in predicting the activity score of ARE molecules, which determines their power to mitigate oxidative stress.
Affiliation(s)
- Nikhil Verma
  - Computer Science and Engineering Department, Thapar Institute of Engineering and Technology, Patiala, Punjab 147004, India.
- Harpreet Singh
  - Computer Science and Engineering Department, Thapar Institute of Engineering and Technology, Patiala, Punjab 147004, India
- Divya Khanna
  - Computer Science and Engineering Department, Thapar Institute of Engineering and Technology, Patiala, Punjab 147004, India
- Prashant Singh Rana
  - Computer Science and Engineering Department, Thapar Institute of Engineering and Technology, Patiala, Punjab 147004, India
- Sanjay Kumar Bhadada
  - Department of Endocrinology, Postgraduate Institute of Medical Education and Research, Chandigarh 160012, India
|
26
|
Morales Duque H, Campos Dias S, Franco OL. Structural and Functional Analyses of Cone Snail Toxins. Mar Drugs 2019; 17:md17060370. [PMID: 31234371 PMCID: PMC6628382 DOI: 10.3390/md17060370] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 05/27/2019] [Revised: 06/16/2019] [Accepted: 06/17/2019] [Indexed: 12/12/2022] Open
Abstract
Cone snails are marine gastropod mollusks with one of the most powerful venoms in nature. The toxins, named conotoxins, must act quickly on the cone snails' prey because the snails are extremely slow, which limits their hunting capability. Therefore, the characteristics of conotoxins have become the object of investigation, and as a result medicines have been developed or are undergoing trials. Conotoxins interact with transmembrane proteins, showing specificity and potency. They target ion channels and ionotropic receptors with greater regularity, and when interaction occurs, there is immediate physiological decompensation. In this review we aimed to evaluate the structural features of conotoxins and the relationship with their target types.
Affiliation(s)
- Harry Morales Duque
- Centro de Análises Proteômicas e Bioquímicas, Programa de Pós-Graduação em Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, Brasília-DF 70.790-160, Brazil.
- Simoni Campos Dias
- Centro de Análises Proteômicas e Bioquímicas, Programa de Pós-Graduação em Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, Brasília-DF 70.790-160, Brazil.
- Octávio Luiz Franco
- Centro de Análises Proteômicas e Bioquímicas, Programa de Pós-Graduação em Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, Brasília-DF 70.790-160, Brazil.
- S-inova Biotech, Programa de Pós-Graduação em Biotecnologia, Universidade Católica Dom Bosco, Campo Grande-MS 79.117-900, Brazil.
27
Rajesh RP, Franklin JB, Badsha I, Arjun P, Jain RP, Vignesh MS, Kannan RR. Proteome based de novo sequencing of novel conotoxins from marine molluscivorous cone snail Conus amadis and neurological activities of its natural venom in zebrafish model. Protein Pept Lett 2019; 26:819-833. [PMID: 31203793 DOI: 10.2174/0929866526666190614144006] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 12/14/2018] [Revised: 04/09/2019] [Accepted: 04/19/2019] [Indexed: 11/22/2022]
Abstract
Conus amadis is a carnivorous snail found abundantly in the coastal waters of India. It is equipped with a potent chemical arsenal, a concoction of neurotoxic peptides used for predation and competition. In this study, we identified 19 novel conotoxins containing 1, 2 and 3 disulfides, belonging to different classes, from the molluscivorous cone snail Conus amadis using proteome-based MALDI-TOF and LC-MS-MS analysis. Among them, 2 novel contryphans, 3 T-superfamily conotoxins, 2 A-superfamily conotoxins and 2 Mini M-superfamily conotoxins were sequenced to the amino acid level from the fragmented spectra of singly and doubly charged parent ions using de novo sequencing strategies. ama1054, a contryphan peptide toxin, possesses a post-translationally modified bromotryptophan at its seventh position. Except for ama1251, all the sequenced peptide toxins possess C-terminal amidation. Moreover, we screened the crude venom for biological activity in a zebrafish model. The crude venom exhibited anticonvulsant properties against pentylenetetrazole-induced seizures in zebrafish larvae, which suggests anti-epileptic properties of the venom cocktail. Acetylcholinesterase activity was also identified in the venom complex.
Affiliation(s)
- R P Rajesh
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore 560012, India
- Jayaseelan Benjamin Franklin
- Andaman and Nicobar Centre for Ocean Science and Technology, National Institute of Ocean Technology, Ministry of Earth Sciences, Government of India, Port Blair 744103, India
- Iffath Badsha
- Molecular & Nanomedicine Research Unit, Centre for Nanoscience and Nanotechnology, Sathyabama Institute of Science and Technology, Chennai 600119, India
- P Arjun
- Molecular & Nanomedicine Research Unit, Centre for Nanoscience and Nanotechnology, Sathyabama Institute of Science and Technology, Chennai 600119, India
- Ruchi P Jain
- Molecular & Nanomedicine Research Unit, Centre for Nanoscience and Nanotechnology, Sathyabama Institute of Science and Technology, Chennai 600119, India
- M S Vignesh
- Molecular & Nanomedicine Research Unit, Centre for Nanoscience and Nanotechnology, Sathyabama Institute of Science and Technology, Chennai 600119, India
- R Rajesh Kannan
- Molecular & Nanomedicine Research Unit, Centre for Nanoscience and Nanotechnology, Sathyabama Institute of Science and Technology, Chennai 600119, India
28
Yonge F, Weixia X. Identification of Mitochondrial Proteins of Malaria Parasite Adding the New Parameter. LETT ORG CHEM 2019. [DOI: 10.2174/1570178615666180608100348] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/22/2022]
Abstract
Malaria is a serious infectious disease caused by Plasmodium falciparum (P. falciparum). Mitochondrial proteins of P. falciparum are regarded as effective drug targets against malaria, so it is necessary to identify them accurately. Many algorithms have been proposed for predicting mitochondrial proteins of the malaria parasite and have yielded good results. However, the parameters used by these methods were based primarily on amino acid sequences. In this study, we added a novel parameter based on protein secondary structure. Firstly, we extracted three feature parameters, namely the compositions of three kinds of protein secondary structure (3PSS), 20 amino acid compositions (20AAC) and 400 dipeptide compositions (400DC), and used analysis of variance (ANOVA) to screen the 400 dipeptides. Secondly, we used these features to predict mitochondrial proteins of the malaria parasite with a support vector machine (SVM). Finally, we found that 1) adding the protein secondary structure feature (3PSS) does improve prediction accuracy, demonstrating that it is a valid feature for predicting mitochondrial proteins of the malaria parasite; and 2) feature combination improves the prediction results, while feature selection reduces the dimension and simplifies the calculation. Using 3PSS + 20AAC + 34DC as features in 15-fold cross-validation, we achieved a sensitivity (Sn) of 98.16%, a specificity (Sp) of 97.64% and an overall accuracy (Acc) of 97.88%, with a Matthews correlation coefficient (MCC) of 0.957. Compared with similar work on the same dataset, this result shows the superiority of our approach.
Affiliation(s)
- Feng Yonge
- College of Science, Inner Mongolia Agriculture University, Hohhot 010018, China
- Xie Weixia
- College of Science, Inner Mongolia Agriculture University, Hohhot 010018, China
29
Khan YD, Batool A, Rasool N, Khan SA, Chou KC. Prediction of Nitrosocysteine Sites Using Position and Composition Variant Features. LETT ORG CHEM 2019. [DOI: 10.2174/1570178615666180802122953] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Indexed: 11/22/2022]
Abstract
S-nitrosylation is one of the most prominent post-translational modifications of proteins. It involves the addition of a nitric oxide group to cysteine thiols, forming S-nitrosocysteine. Evidence suggests that S-nitrosylation plays a major role in numerous human diseases and disorders. Techniques for robust identification of S-nitrosylated proteins are highly anticipated in biological research and drug discovery. The proposed system adopts a novel strategy based on statistical and computational intelligence methods for identifying S-nitrosocysteine sites within a given primary protein sequence. For this purpose, a 5-step rule was followed, comprising benchmark dataset creation, mathematical modelling, prediction, evaluation and web-server development. Statistical moments were used for position-relative feature extraction, and a multilayer neural network was trained using gradient descent and adaptive learning algorithms. The results were compared with existing techniques using benchmark datasets. The experiments indicate that the proposed scheme is promising, accurate and effective for the prediction of S-nitrosocysteine sites in protein sequences.
Affiliation(s)
- Yaser Daanial Khan
- Department of Computer Science, School of Systems and Technology, University of Management and Technology, Lahore, Pakistan
- Aroosa Batool
- Department of Computer Science, School of Systems and Technology, University of Management and Technology, Lahore, Pakistan
- Nouman Rasool
- Department of Life Sciences, School of Science, University of Management and Technology, Lahore, Pakistan
- Sher Afzal Khan
- Faculty of Computing and Information Technology in Rabigh, King Abdulaziz University, Jeddah, 21577, Saudi Arabia
- Kuo-Chen Chou
- Gordon Life Science Institute, Boston, MA 02478, United States
30
Akbar S, Hayat M, Kabir M, Iqbal M. iAFP-gap-SMOTE: An Efficient Feature Extraction Scheme Gapped Dipeptide Composition is Coupled with an Oversampling Technique for Identification of Antifreeze Proteins. LETT ORG CHEM 2019. [DOI: 10.2174/1570178615666180816101653] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Indexed: 11/22/2022]
Abstract
Antifreeze proteins (AFPs) play distinctive roles in maintaining homeostasis in living organisms and protect their cells and bodies from freezing in extremely cold conditions. Owing to the high diversity of protein sequences and structures, discriminating AFPs from non-AFPs through experimental approaches is expensive and lengthy. It is therefore highly desirable to propose a computationally intelligent, high-throughput model that identifies AFPs quickly and accurately. To this end, a new predictor called “iAFP-gap-SMOTE” is proposed for the identification of AFPs. Protein sequences are expressed using three numerical feature extraction schemes, namely Split Amino Acid Composition, G-gap Dipeptide Composition and Reduced Amino Acid Alphabet Composition. With an imbalanced dataset, the classification hypothesis is usually biased towards the majority class. The Synthetic Minority Over-sampling Technique (SMOTE) is employed to increase the instances of the minority class and control the bias. A 10-fold cross-validation test is applied to appraise the success rates of the “iAFP-gap-SMOTE” model. In the empirical investigation, the “iAFP-gap-SMOTE” model obtained 95.02% accuracy. The comparison suggested that the accuracy of the “iAFP-gap-SMOTE” model is higher than that of existing techniques in the literature. We expect that the proposed model “iAFP-gap-SMOTE” may be helpful for the research community and academia.
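As a rough illustration of the G-gap dipeptide composition named in this abstract, the sketch below counts residue pairs separated by g intervening residues and normalizes them into a 400-dimensional vector. The function name, the normalization by the number of valid pairs, and the toy sequence are assumptions for illustration, not details taken from the paper.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs

def g_gap_dipeptide_composition(seq, g=0):
    """Frequencies of residue pairs separated by g intervening residues."""
    counts = dict.fromkeys(PAIRS, 0)
    total = 0
    for i in range(len(seq) - g - 1):
        pair = seq[i] + seq[i + g + 1]
        if pair in counts:          # skip pairs containing non-standard residues
            counts[pair] += 1
            total += 1
    return [counts[p] / total if total else 0.0 for p in PAIRS]

# Toy sequence: with g=1 the only pairs are A_A and C_C.
features = g_gap_dipeptide_composition("ACAC", g=1)
```

With g = 0 this reduces to the ordinary (adjacent) dipeptide composition; larger g values capture longer-range residue correlations at the same feature dimension.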
Affiliation(s)
- Shahid Akbar
- Department of Computer Science, Abdul Wali Khan University, Mardan, KP 23200, Pakistan
- Maqsood Hayat
- Department of Computer Science, Abdul Wali Khan University, Mardan, KP 23200, Pakistan
- Muhammad Kabir
- Department of Computer Science, Abdul Wali Khan University, Mardan, KP 23200, Pakistan
- Muhammad Iqbal
- Department of Computer Science, Abdul Wali Khan University, Mardan, KP 23200, Pakistan
31
Mansbach RA, Travers T, McMahon BH, Fair JM, Gnanakaran S. Snails In Silico: A Review of Computational Studies on the Conopeptides. Mar Drugs 2019; 17:E145. [PMID: 30832207 PMCID: PMC6471681 DOI: 10.3390/md17030145] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 01/18/2019] [Revised: 02/21/2019] [Accepted: 02/22/2019] [Indexed: 12/26/2022]
Abstract
Marine cone snails are carnivorous gastropods that use peptide toxins called conopeptides both as a defense mechanism and as a means to immobilize and kill their prey. These peptide toxins exhibit a large chemical diversity that enables exquisite specificity and potency for target receptor proteins. This diversity arises in terms of variations both in amino acid sequence and length, and in posttranslational modifications, particularly the formation of multiple disulfide linkages. Most of the functionally characterized conopeptides target ion channels of animal nervous systems, which has led to research on their therapeutic applications. Many facets of the underlying molecular mechanisms responsible for the specificity and virulence of conopeptides, however, remain poorly understood. In this review, we will explore the chemical diversity of conopeptides from a computational perspective. First, we discuss current approaches used for classifying conopeptides. Next, we review different computational strategies that have been applied to understanding and predicting their structure and function, from machine learning techniques for predictive classification to docking studies and molecular dynamics simulations for molecular-level understanding. We then review recent novel computational approaches for rapid high-throughput screening and chemical design of conopeptides for particular applications. We close with an assessment of the state of the field, emphasizing important questions for future lines of inquiry.
Affiliation(s)
- Rachael A Mansbach
- Theoretical Biology and Biophysics Group, Los Alamos National Laboratory, Los Alamos, NM 87545, USA.
- Timothy Travers
- Theoretical Biology and Biophysics Group, Los Alamos National Laboratory, Los Alamos, NM 87545, USA.
- Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, NM 87545, USA.
- Benjamin H McMahon
- Theoretical Biology and Biophysics Group, Los Alamos National Laboratory, Los Alamos, NM 87545, USA.
- Jeanne M Fair
- Biosecurity and Public Health Group, Los Alamos National Laboratory, Los Alamos, NM 87545, USA.
- S Gnanakaran
- Theoretical Biology and Biophysics Group, Los Alamos National Laboratory, Los Alamos, NM 87545, USA.
32
Su R, Liu X, Wei L, Zou Q. Deep-Resp-Forest: A deep forest model to predict anti-cancer drug response. Methods 2019; 166:91-102. [PMID: 30772464 DOI: 10.1016/j.ymeth.2019.02.009] [Citation(s) in RCA: 135] [Impact Index Per Article: 27.0] [Received: 11/13/2018] [Revised: 01/13/2019] [Accepted: 02/10/2019] [Indexed: 12/01/2022]
Abstract
The identification of therapeutic biomarkers predictive of drug response is crucial in personalized medicine. A number of computational models for predicting the response of anti-cancer drugs have been developed following the establishment of several pharmacogenomics screening databases. In our study, we proposed a deep cascaded forest model, Deep-Resp-Forest, to classify the anti-cancer drug response as "sensitive" or "resistant". We made three contributions in this study. Firstly, diverse molecular data could be effectively integrated to provide more information than a single type of data for the classification. Combinations of two types of data were tested here. Secondly, we proposed two structures based on multi-grained scanning to transform the raw features into high-dimensional feature vectors and integrate the diverse data. Thirdly, the original deep and time-consuming architecture of the cascade forest was improved by a feature optimization operation, which emphasized the most discriminative features across layers. We evaluated the proposed method on the Cancer Cell Line Encyclopedia (CCLE) and Genomics of Drug Sensitivity in Cancer (GDSC) data sets and compared it with the Support Vector Machine. The proposed Deep-Resp-Forest demonstrated the promise of deep learning and the deep forest approach for drug response prediction tasks. The R implementation for running our experiments is available at https://github.com/RanSuLab/Deep-Resp-Forest.
Affiliation(s)
- Ran Su
- School of Computer Software, College of Intelligence and Computing, Tianjin University, Tianjin, China
- Xinyi Liu
- School of Computer Software, College of Intelligence and Computing, Tianjin University, Tianjin, China
- Leyi Wei
- School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
- Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China.
33
Ning J, Li R, Ren J, Zhangsun D, Zhu X, Wu Y, Luo S. Alanine-Scanning Mutagenesis of α-Conotoxin GI Reveals the Residues Crucial for Activity at the Muscle Acetylcholine Receptor. Mar Drugs 2018; 16:md16120507. [PMID: 30551685 PMCID: PMC6315591 DOI: 10.3390/md16120507] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 11/08/2018] [Revised: 11/25/2018] [Accepted: 12/10/2018] [Indexed: 01/30/2023]
Abstract
Recently, the muscle-type nicotinic acetylcholine receptors (nAChRs) have been pursued as a potential target for several diseases, including myogenic disorders, muscle dystrophies and myasthenia gravis. α-Conotoxin GI, isolated from Conus geographus, selectively and potently inhibits the muscle-type nAChRs and can be developed as a tool to study them. Herein, alanine scanning mutagenesis was used to reveal the structure–activity relationship (SAR) between GI and mouse α1β1δε nAChRs. Pro5, Gly8, Arg9 and Tyr11 proved to be the critical residues for receptor inhibition, as their replacement with alanine (Ala) led to a significant loss of potency on mouse α1β1δε nAChR. In contrast, substituting Asn4, His10 or Ser12 with Ala did not affect activity. Interestingly, the [E1A] GI analogue exhibited a three-fold potency for mouse α1β1δε nAChR, whereas its potency at rat α9α10 nAChR was clearly decreased compared to wildtype GI. Molecular dynamics simulations also suggest that loop2 of GI significantly affects the interaction with α1β1δε nAChR, and that Tyr11 of GI is a critical residue that binds three hydrophobic amino acids of the δ subunit: Leu93, Tyr95 and Leu103. Our research elucidates the interaction of GI and mouse α1β1δε nAChR in detail, which will help in developing novel analogues of GI.
Affiliation(s)
- Jiong Ning
- Key Laboratory of Tropical Biological Resources, Ministry of Education, Key Lab for Marine Drugs of Haikou, Hainan University, Haikou 570228, Hainan, China.
- Rui Li
- Key Laboratory of Tropical Biological Resources, Ministry of Education, Key Lab for Marine Drugs of Haikou, Hainan University, Haikou 570228, Hainan, China.
- Jie Ren
- Key Laboratory of Tropical Biological Resources, Ministry of Education, Key Lab for Marine Drugs of Haikou, Hainan University, Haikou 570228, Hainan, China.
- Dongting Zhangsun
- Key Laboratory of Tropical Biological Resources, Ministry of Education, Key Lab for Marine Drugs of Haikou, Hainan University, Haikou 570228, Hainan, China.
- Xiaopeng Zhu
- Key Laboratory of Tropical Biological Resources, Ministry of Education, Key Lab for Marine Drugs of Haikou, Hainan University, Haikou 570228, Hainan, China.
- Yong Wu
- Key Laboratory of Tropical Biological Resources, Ministry of Education, Key Lab for Marine Drugs of Haikou, Hainan University, Haikou 570228, Hainan, China.
- Sulan Luo
- Key Laboratory of Tropical Biological Resources, Ministry of Education, Key Lab for Marine Drugs of Haikou, Hainan University, Haikou 570228, Hainan, China.
34
Dao FY, Lv H, Wang F, Ding H. Recent Advances on the Machine Learning Methods in Identifying DNA Replication Origins in Eukaryotic Genomics. Front Genet 2018; 9:613. [PMID: 30619452 PMCID: PMC6295579 DOI: 10.3389/fgene.2018.00613] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 10/21/2018] [Accepted: 11/21/2018] [Indexed: 01/01/2023]
Abstract
The initiation site of DNA replication is called the origin of replication (ORI). It is regulated by a set of regulatory proteins and plays important roles in the basic biochemical processes of cell growth and division in all living organisms. The study of ORIs is therefore essential for understanding the cell-division cycle and the regulation of gene expression, so that scholars can use knowledge of DNA replication to develop new strategies against genetic diseases. Accurate identification of ORIs will thus provide key clues for DNA replication research and clinical medicine. Although conventional experiments can provide accurate results, they are time-consuming and costly. Bioinformatics-based methods, by contrast, can overcome these shortcomings. In particular, with the emergence of DNA sequences in the post-genomic era, it is highly desirable to develop high-throughput tools that identify ORIs from sequence information. In this review, we summarize the current progress in computational prediction of eukaryotic ORIs, including the collection of benchmark datasets, the application of machine learning-based techniques, the results obtained by these methods, and the construction of web servers. Finally, we give future perspectives on ORI prediction. The review provides readers with the background of machine learning-based ORI prediction, which will help researchers study DNA replication in depth and develop drug therapies for genetic defects.
Affiliation(s)
- Fu-Ying Dao
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Hao Lv
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Fang Wang
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Hui Ding
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
35
Zou Q, Lin G, Jiang X, Liu X, Zeng X. Sequence clustering in bioinformatics: an empirical study. Brief Bioinform 2018; 21:1-10. [PMID: 30239587 DOI: 10.1093/bib/bby090] [Citation(s) in RCA: 72] [Impact Index Per Article: 12.0] [Received: 07/03/2018] [Revised: 08/18/2018] [Accepted: 08/18/2018] [Indexed: 12/13/2022]
Abstract
Sequence clustering is a basic bioinformatics task that is attracting renewed attention with the development of metagenomics and microbiomics. The latest sequencing techniques have decreased costs and, as a result, massive amounts of DNA/RNA sequences are being produced. The challenge is to cluster the sequence data using stable, quick and accurate methods. For microbiome sequencing data, 16S ribosomal RNA operational taxonomic units are typically used. However, there is often a gap between algorithm developers and bioinformatics users. Different software tools can produce diverse results, and users can find them difficult to analyze. Understanding the different clustering mechanisms is crucial to understanding the results that they produce. In this review, we selected several popular clustering tools, briefly explained the key computing principles, analyzed their characteristics and compared them using two independent benchmark datasets. Our aim is to assist bioinformatics users in employing suitable clustering tools effectively to analyze big sequencing data. Related data, codes and software tools are accessible at http://lab.malab.cn/~lg/clustering/.
Affiliation(s)
- Quan Zou
- Tianjin University; University of Electronic Science and Technology of China
36
A Hybrid Deep Learning Model for Predicting Protein Hydroxylation Sites. Int J Mol Sci 2018; 19:ijms19092817. [PMID: 30231550 PMCID: PMC6164125 DOI: 10.3390/ijms19092817] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 08/19/2018] [Revised: 09/12/2018] [Accepted: 09/15/2018] [Indexed: 12/17/2022]
Abstract
Protein hydroxylation is a type of post-translational modification (PTM) that plays critical roles in human diseases. Protein sequences are known to contain many uncharacterized proline and lysine residues. The question to be answered is which residues can be hydroxylated and which cannot. The answer will not only help in understanding the mechanism of hydroxylation but can also benefit the development of new drugs. In this paper, we proposed a novel approach for predicting hydroxylation using a hybrid deep learning model integrating a convolutional neural network (CNN) and a long short-term memory network (LSTM). We employed a pseudo amino acid composition (PseAAC) method to construct valid benchmark datasets based on a sliding window strategy and used the position-specific scoring matrix (PSSM) to represent samples as inputs to the deep learning model. In addition, we compared our method with popular predictors including CNN, iHyd-PseAAC, and iHyd-PseCp. The results of 5-fold cross-validation demonstrated that our method significantly outperforms the other methods in prediction accuracy.
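A sliding window strategy of the kind mentioned in this abstract is commonly implemented by cutting a fixed-length peptide around every candidate residue. The sketch below is a minimal version under stated assumptions: the flank width, the padding character "X", and the function name are hypothetical, since the abstract does not give them.

```python
def extract_windows(seq, targets="PK", flank=6, pad="X"):
    """Return (position, residue, window) for every target residue in seq.

    Each window has length 2 * flank + 1 and is centred on the target;
    sequence ends are padded so that terminal residues are kept.
    """
    padded = pad * flank + seq + pad * flank
    windows = []
    for i, aa in enumerate(seq):
        if aa in targets:
            # Index i in seq corresponds to index i + flank in padded.
            windows.append((i, aa, padded[i : i + 2 * flank + 1]))
    return windows

# Toy sequence with one proline and one lysine candidate site.
samples = extract_windows("AAPAAKA", flank=2)
```

Each window can then be labelled as hydroxylated or not and encoded (for example with PSSM rows) before being passed to a classifier.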
37
Distribution Grids Fault Location employing ST based Optimized Machine Learning Approach. ENERGIES 2018. [DOI: 10.3390/en11092328] [Citation(s) in RCA: 37] [Impact Index Per Article: 6.2] [Indexed: 02/05/2023]
Abstract
Precise fault location information plays a vital role in expediting the restoration process after any kind of fault in power distribution grids. This paper proposed a Stockwell transform (ST) based optimized machine learning approach to locate faults and identify faulty sections in distribution grids. This research employed the ST to extract useful features from the recorded three-phase current signals and fed them as inputs to different machine learning tools (MLT), including multilayer perceptron neural networks (MLP-NN), support vector machines (SVM), and extreme learning machines (ELM). The proposed approach employed the constriction-factor particle swarm optimization (CF-PSO) technique to optimize the parameters of the SVM and ELM for better generalization performance. It then compared the results obtained on the test datasets in terms of selected statistical performance indices, including the root mean squared error (RMSE), mean absolute percentage error (MAPE), percent bias (PBIAS), RMSE-observations to standard deviation ratio (RSR), coefficient of determination (R2), Willmott’s index of agreement (WIA), and Nash–Sutcliffe model efficiency coefficient (NSEC), to confirm the effectiveness of the developed fault location scheme. The satisfactory values of the statistical performance indices indicated the superiority of the optimized machine learning tools over the non-optimized tools in locating faults. In addition, this research confirmed the efficacy of the faulty section identification scheme based on overall accuracy. Furthermore, the presented results validated the robustness of the developed approach against measurement noise and uncertainties associated with pre-fault loading condition, fault resistance, and inception angle.
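The performance indices listed in this abstract have widely used definitions, sketched below for paired observed and predicted values. Conventions vary between sources (notably the sign of PBIAS and whether R2 is taken as the squared Pearson correlation), so the choices here are assumptions rather than the paper's exact formulas, and the sample values are hypothetical.

```python
import math

def performance_indices(obs, pred):
    """Common agreement indices; assumes non-zero, non-constant inputs."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_p = sum(pred) / n
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))   # squared errors
    sso = sum((o - mean_o) ** 2 for o in obs)            # spread of observations
    ssp = sum((p - mean_p) ** 2 for p in pred)           # spread of predictions
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    wia_den = sum((abs(p - mean_o) + abs(o - mean_o)) ** 2
                  for o, p in zip(obs, pred))
    return {
        "RMSE": math.sqrt(sse / n),
        "MAPE": 100.0 / n * sum(abs((o - p) / o) for o, p in zip(obs, pred)),
        "PBIAS": 100.0 * sum(o - p for o, p in zip(obs, pred)) / sum(obs),
        "RSR": math.sqrt(sse) / math.sqrt(sso),
        "R2": cov ** 2 / (sso * ssp),       # squared Pearson correlation
        "WIA": 1.0 - sse / wia_den,         # Willmott's index of agreement
        "NSEC": 1.0 - sse / sso,            # Nash-Sutcliffe efficiency
    }

# Hypothetical fault distances, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.5, 3.5, 4.5]
indices = performance_indices(observed, predicted)
```

In this example the predictions are perfectly correlated with the observations but uniformly offset, so R2 stays at 1 while RMSE, PBIAS and NSEC all register the bias. This is one reason such indices are usually reported together.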
38
Yang H, Lv H, Ding H, Chen W, Lin H. iRNA-2OM: A Sequence-Based Predictor for Identifying 2'-O-Methylation Sites in Homo sapiens. J Comput Biol 2018; 25:1266-1277. [PMID: 30113871 DOI: 10.1089/cmb.2018.0004] [Citation(s) in RCA: 113] [Impact Index Per Article: 18.8] [Indexed: 01/27/2023]
Abstract
2'-O-methylation plays an important biological role in gene expression. Owing to the explosive increase in genomic sequencing data, it is necessary to develop a method for quickly and efficiently identifying whether a sequence contains a 2'-O-methylation site. As a complement to experimental techniques, a computational method may help to identify 2'-O-methylation sites. In this study, based on experimental 2'-O-methylation data from Homo sapiens, we proposed a support vector machine-based model to predict 2'-O-methylation sites in H. sapiens. In this model, the RNA sequences were encoded with the optimal features obtained from feature selection. In the fivefold cross-validation test, the accuracy reached 97.95%.
Affiliation(s)
- Hui Yang
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Hao Lv
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Hui Ding
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Wei Chen
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Department of Physics, School of Sciences, and Center for Genomics and Computational Biology, North China University of Science and Technology, Tangshan, China
- Hao Lin
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
39
Tan JX, Dao FY, Lv H, Feng PM, Ding H. Identifying Phage Virion Proteins by Using Two-Step Feature Selection Methods. Molecules 2018; 23:molecules23082000. [PMID: 30103458 PMCID: PMC6222849 DOI: 10.3390/molecules23082000] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Received: 07/13/2018] [Revised: 07/30/2018] [Accepted: 08/08/2018] [Indexed: 12/31/2022]
Abstract
Accurate identification of phage virion proteins is not only a key step in understanding their function but also helpful for further understanding the lysis mechanism of the bacterial cell. Since traditional experimental methods for identifying phage virion proteins are time-consuming and costly, there is an urgent need for machine learning methods that identify them accurately and efficiently. In this work, a support vector machine (SVM) based method was proposed by mixing multiple sets of optimal g-gap dipeptide compositions. The analysis of variance (ANOVA) and the minimal-redundancy-maximal-relevance (mRMR) with an incremental feature selection (IFS) were applied to single out the optimal feature set. In the five-fold cross-validation test, the proposed method achieved an overall accuracy of 87.95%. We believe that the proposed method will become an efficient and powerful tool for scientists studying phage virion proteins.
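The ANOVA step in a pipeline like the one described here typically ranks each feature by its one-way F statistic (between-class variance over within-class variance). The sketch below shows only that ranking step in plain Python; the mRMR and IFS stages are not reproduced, and the function names and toy data are hypothetical.

```python
def anova_f_score(values, labels):
    """One-way ANOVA F statistic of a single feature across class labels."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    if k < 2:
        raise ValueError("at least two classes are required")
    grand_mean = sum(values) / n
    means = {y: sum(g) / len(g) for y, g in groups.items()}
    ss_between = sum(len(g) * (means[y] - grand_mean) ** 2 for y, g in groups.items())
    ss_within = sum((v - means[y]) ** 2 for y, g in groups.items() for v in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def rank_features(X, y):
    """Return feature indices sorted by decreasing F score."""
    scores = [anova_f_score([row[j] for row in X], y) for j in range(len(X[0]))]
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)

# Toy data: feature 0 separates the classes, feature 1 does not.
X = [[1.0, 5.0], [2.0, 4.0], [8.0, 5.0], [9.0, 4.0]]
y = [0, 0, 1, 1]
order = rank_features(X, y)
```

An incremental feature selection loop would then add features in this order one at a time and keep the subset with the best cross-validated accuracy.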
Affiliation(s)
- Jiu-Xin Tan
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China.
| | - Fu-Ying Dao
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China.
| | - Hao Lv
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China.
| | - Peng-Mian Feng
- Hebei Province Key Laboratory of Occupational Health and Safety for Coal Industry, School of Public Health, North China University of Science and Technology, Tangshan 063000, China.
| | - Hui Ding
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China.
| |
|
40
|
Manavalan B, Shin TH, Kim MO, Lee G. PIP-EL: A New Ensemble Learning Method for Improved Proinflammatory Peptide Predictions. Front Immunol 2018; 9:1783. [PMID: 30108593 PMCID: PMC6079197 DOI: 10.3389/fimmu.2018.01783] [Citation(s) in RCA: 88] [Impact Index Per Article: 14.7] [Received: 03/08/2018] [Accepted: 07/19/2018] [Indexed: 02/03/2023] Open
Abstract
Proinflammatory cytokines have the capacity to increase the inflammatory reaction and play a central role in the first line of defence against invading pathogens. Proinflammatory inducing peptides (PIPs) have been used as antineoplastic agents, antibacterial agents, and vaccines in immunization therapies. Advances in sequencing technologies have produced an avalanche of protein sequence data. Therefore, it is necessary to develop an automated computational method to enable fast and accurate identification of novel PIPs within the vast number of candidate proteins and peptides. To address this, we proposed a new predictor, PIP-EL, for predicting PIPs using the strategy of ensemble learning (EL). Our benchmarking dataset is imbalanced. Thus, we applied a random under-sampling technique to generate 10 balanced models for each composition. Technically, PIP-EL is the fusion of 50 independent random forest (RF) models, where each of the five different compositions (amino acid, dipeptide, composition-transition-distribution, physicochemical properties, and amino acid index) contributes 10 RF models. PIP-EL achieves a Matthews' correlation coefficient (MCC) of 0.435 in a 5-fold cross-validation test, which is ~2-5% higher than that of the individual classifiers and the hybrid feature-based classifier. Furthermore, we evaluate the performance of PIP-EL on the independent dataset, showing that our method outperforms the existing method and two different machine learning methods developed in this study, with an MCC of 0.454. These results indicate that PIP-EL will be a useful tool for predicting PIPs and for researchers working in the field of peptide therapeutics and immunotherapy. The user-friendly web server, PIP-EL, is freely accessible.
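The random under-sampling step described in this abstract can be sketched as follows. This is an illustrative sketch only: the function name, the seed handling, and the use of sampling without replacement are assumptions, and the abstract does not state how the resulting models are fused.

```python
import random


def balanced_subsets(positives, negatives, n_subsets=10, seed=0):
    """Draw `n_subsets` balanced training sets by randomly under-sampling
    whichever class is larger down to the size of the smaller class."""
    rng = random.Random(seed)  # fixed seed so the draws are reproducible
    k = min(len(positives), len(negatives))
    subsets = []
    for _ in range(n_subsets):
        # Sampling without replacement; the smaller class is kept whole.
        pos = rng.sample(positives, k)
        neg = rng.sample(negatives, k)
        subsets.append((pos, neg))
    return subsets


# Hypothetical toy data: 5 positive and 30 negative sample identifiers.
positives = list(range(5))
negatives = list(range(100, 130))
subsets = balanced_subsets(positives, negatives)
```

Each of the 10 subsets would train one model per feature encoding, so five encodings would give the 50 models mentioned above.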
Affiliation(s)
| | - Tae Hwan Shin
- Department of Physiology, Ajou University School of Medicine, Suwon, South Korea
- Institute of Molecular Science and Technology, Ajou University, Suwon, South Korea
| | - Myeong Ok Kim
- Division of Life Science and Applied Life Science (BK21 Plus), College of Natural Sciences, Gyeongsang National University, Jinju, South Korea
| | - Gwang Lee
- Department of Physiology, Ajou University School of Medicine, Suwon, South Korea
- Institute of Molecular Science and Technology, Ajou University, Suwon, South Korea
| |
|
41
|
Niu M, Li Y, Wang C, Han K. RFAmyloid: A Web Server for Predicting Amyloid Proteins. Int J Mol Sci 2018; 19:ijms19072071. [PMID: 30013015 PMCID: PMC6073578 DOI: 10.3390/ijms19072071] [Citation(s) in RCA: 38] [Impact Index Per Article: 6.3] [Received: 06/12/2018] [Revised: 07/10/2018] [Accepted: 07/12/2018] [Indexed: 12/22/2022] Open
Abstract
Amyloid is an insoluble fibrous protein, and its mis-aggregation can lead to diseases such as Alzheimer’s disease and Creutzfeldt–Jakob disease. Therefore, the identification of amyloid is essential for the discovery and understanding of disease. We established a novel predictor called RFAmy, based on random forest, to identify amyloid. It employs the SVMProt 188-D feature extraction method, based on protein composition and physicochemical properties, and the pse-in-one feature extraction method, based on amino acid composition, autocorrelation pseudo acid composition, profile-based features and predicted structure features. In the ten-fold cross-validation test, RFAmy’s overall accuracy was 89.19% and its F-measure was 0.891. Results were obtained by comparison experiments with other features, classifiers, and existing methods. This shows the effectiveness of RFAmy in predicting amyloid proteins. The RFAmy predictor proposed in this paper can be accessed at http://server.malab.cn/RFAmyloid/.
Affiliation(s)
- Mengting Niu
- School of Information and Computer Engineering, Northeast Forestry University, Harbin 150040, China.
| | - Yanjuan Li
- School of Information and Computer Engineering, Northeast Forestry University, Harbin 150040, China.
| | - Chunyu Wang
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150040, China.
| | - Ke Han
- School of Computer and Information Engineering, Harbin University of Commerce, Harbin 150040, China.
| |
|
42
|
Ogawara H. Comparison of Strategies to Overcome Drug Resistance: Learning from Various Kingdoms. Molecules 2018; 23:E1476. [PMID: 29912169 PMCID: PMC6100412 DOI: 10.3390/molecules23061476] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 05/04/2018] [Revised: 06/13/2018] [Accepted: 06/15/2018] [Indexed: 11/16/2022] Open
Abstract
Drug resistance, especially antibiotic resistance, is a growing threat to human health. To overcome this problem, it is important to re-examine precisely the mechanisms of drug resistance and/or self-resistance in various kingdoms, from bacteria through plants to animals. This review compares the molecular mechanisms of resistance against phycotoxins, toxins from marine and terrestrial animals, plants and fungi, and antibiotics. The results reveal that each kingdom possesses characteristic features. The main mechanisms in each kingdom are transporters/efflux pumps in phycotoxins, mutation and modification of targets and sequestration in marine and terrestrial animal toxins, ABC transporters and sequestration in plant toxins, transporters in fungal toxins, and various or mixed mechanisms in antibiotics. Antibiotic producers in particular make tremendous efforts to avoid suicide, and are more flexible and adaptable to environmental changes. With these features in mind, potential alternative strategies to overcome these resistance problems are discussed. This paper will provide clues for solving the issues of drug resistance.
Affiliation(s)
- Hiroshi Ogawara
- HO Bio Institute, Yushima-2, Bunkyo-ku, Tokyo 113-0034, Japan.
- Department of Biochemistry, Meiji Pharmaceutical University, Noshio-2, Kiyose, Tokyo 204-8588, Japan.
| |
|
43
|
Chen W, Feng P, Yang H, Ding H, Lin H, Chou KC. iRNA-3typeA: Identifying Three Types of Modification at RNA's Adenosine Sites. MOLECULAR THERAPY. NUCLEIC ACIDS 2018; 11:468-474. [PMID: 29858081 PMCID: PMC5992483 DOI: 10.1016/j.omtn.2018.03.012] [Citation(s) in RCA: 133] [Impact Index Per Article: 22.2] [Received: 02/23/2018] [Revised: 03/25/2018] [Accepted: 03/27/2018] [Indexed: 01/09/2023]
Abstract
RNA modifications are additions of chemical groups to nucleotides or local structural changes of nucleotides. Knowledge about the occurrence sites of these modifications is essential for in-depth understanding of biological functions and mechanisms, and for treating some genomic diseases as well. With the avalanche of RNA sequences generated in the post-genomic age, many computational methods have been proposed for identifying various types of RNA modifications one by one. However, so far no method has been developed for simultaneously identifying several different types of RNA modifications. To address this challenge, we developed a predictor called "iRNA-3typeA," by which we can simultaneously identify the occurrence sites of the following three most frequently observed modifications in RNA: (1) N1-methyladenosine (m1A), (2) N6-methyladenosine (m6A), and (3) adenosine to inosine (A-to-I). Rigorous cross-validations on RNA sequences from Homo sapiens and Mus musculus transcriptomes have shown that the success rates achieved by the new predictor are quite high. For the convenience of experimental scientists, a user-friendly web server for iRNA-3typeA has been established at http://lin-group.cn/server/iRNA-3typeA/. It is anticipated that iRNA-3typeA may become a useful high-throughput tool for genome analysis.
Affiliation(s)
- Wei Chen
- Department of Physics, School of Sciences, and Center for Genomics and Computational Biology, North China University of Science and Technology, Tangshan 063000, China; Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China; Gordon Life Science Institute, Boston, MA 02478, USA.
| | - Pengmian Feng
- Hebei Province Key Laboratory of Occupational Health and Safety for Coal Industry, School of Public Health, North China University of Science and Technology, Tangshan 063000, China
| | - Hui Yang
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
| | - Hui Ding
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
| | - Hao Lin
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China; Gordon Life Science Institute, Boston, MA 02478, USA.
| | - Kuo-Chen Chou
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China; Gordon Life Science Institute, Boston, MA 02478, USA
| |
|
44
|
Yang H, Qiu WR, Liu G, Guo FB, Chen W, Chou KC, Lin H. iRSpot-Pse6NC: Identifying recombination spots in Saccharomyces cerevisiae by incorporating hexamer composition into general PseKNC. Int J Biol Sci 2018; 14:883-891. [PMID: 29989083 PMCID: PMC6036749 DOI: 10.7150/ijbs.24616] [Citation(s) in RCA: 135] [Impact Index Per Article: 22.5] [Received: 12/28/2017] [Accepted: 02/04/2018] [Indexed: 02/06/2023] Open
Abstract
Meiotic recombination is caused by meiotic double-strand DNA breaks. In some regions the frequency of DNA recombination is relatively high, while in other regions it is lower: the former is usually called a "recombination hotspot" and the latter a "recombination coldspot". Information about the hot and cold spots may provide important clues for understanding the mechanism of genome evolution. Therefore, it is important to accurately predict these spots. In this study, we rebuilt the benchmark dataset by unifying its samples to the same length (131 bp). On this foundation, and using an SVM (Support Vector Machine) classifier, a new predictor called "iRSpot-Pse6NC" was developed by incorporating the key hexamer features into the general PseKNC (Pseudo K-tuple Nucleotide Composition) via the binomial distribution approach. Rigorous cross-validations showed that the proposed predictor is superior to its counterparts in overall accuracy, stability, sensitivity and specificity. For the convenience of most experimental scientists, the web server for iRSpot-Pse6NC has been established at http://lin-group.cn/server/iRSpot-Pse6NC, by which users can easily obtain their desired result without the need to go through the detailed mathematical equations involved.
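The hexamer composition mentioned in this abstract is a special case (k = 6) of plain k-tuple nucleotide composition, which can be sketched as below. The function name and normalization are illustrative assumptions, and the sketch covers only the counting step: it does not implement the pseudo components of PseKNC or the binomial-distribution selection of key hexamers.

```python
from itertools import product


def kmer_composition(sequence, k=6):
    """Return the normalized frequency of every DNA k-mer (4**k values)
    found by sliding a window of width k along the sequence."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    vec = [0.0] * len(kmers)
    windows = 0
    seq = sequence.upper()
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])
        if j is not None:  # ignore windows containing N or other symbols
            vec[j] += 1
            windows += 1
    if windows:
        vec = [v / windows for v in vec]
    return vec


# A 131 bp sample gives 126 overlapping hexamer windows and 4096 features.
features = kmer_composition("A" * 131, k=6)
```

For k = 6 the vector has 4096 entries, which is why a feature selection step such as the one described in the abstract is typically applied afterwards.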
Affiliation(s)
- Hui Yang
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
| | - Wang-Ren Qiu
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China; Computer Department, Jingdezhen Ceramic Institute, Jingdezhen, 333403, China
| | - Guoqing Liu
- School of Life Science and Technology, Inner Mongolia University of Science and Technology, Baotou, 014010, China
| | - Feng-Biao Guo
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
| | - Wei Chen
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China; Department of Physics, School of Sciences, and Center for Genomics and Computational Biology, North China University of Science and Technology, Tangshan 063000, China; Gordon Life Science Institute, Boston, MA 02478, USA
| | - Kuo-Chen Chou
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China; Gordon Life Science Institute, Boston, MA 02478, USA
| | - Hao Lin
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China; Gordon Life Science Institute, Boston, MA 02478, USA
| |
|
45
|
Tang H, Zhao YW, Zou P, Zhang CM, Chen R, Huang P, Lin H. HBPred: a tool to identify growth hormone-binding proteins. Int J Biol Sci 2018; 14:957-964. [PMID: 29989085 PMCID: PMC6036759 DOI: 10.7150/ijbs.24174] [Citation(s) in RCA: 136] [Impact Index Per Article: 22.7] [Received: 12/04/2017] [Accepted: 01/15/2018] [Indexed: 12/19/2022] Open
Abstract
Hormone-binding protein (HBP) is a kind of soluble carrier protein that can selectively and non-covalently interact with hormones. HBP plays an important role in growth, but its function is still unclear. Correct recognition of HBPs is the first step toward studying their function and understanding their biological processes. However, it is difficult to correctly recognize HBPs among the growing number of known proteins through traditional biochemical experiments because of the high experimental cost and long experimental period. To overcome these disadvantages, we designed a computational method for identifying HBPs accurately in this study. First, we collected HBP data from UniProt to establish a high-quality benchmark dataset. Based on the dataset, the dipeptide composition was extracted from HBP residue sequences. To find the optimal features that provide key clues for HBP identification, analysis of variance (ANOVA) was performed for feature ranking. The optimal features were selected through the incremental feature selection strategy. Subsequently, the features were input into a support vector machine (SVM) for prediction model construction. Jackknife cross-validation results showed that 88.6% of HBPs and 81.3% of non-HBPs were correctly recognized, suggesting that our proposed model was powerful. This study provides a new strategy to identify HBPs. Moreover, based on the proposed model, we established a web server called HBPred, which can be freely accessed at http://lin-group.cn/server/HBPred.
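The ANOVA-based feature ranking named in this abstract can be sketched with the standard one-way F statistic (between-class variance divided by within-class variance). The function names and the toy data are hypothetical, and the handling of zero within-class variance is an assumption; this is not the authors' implementation.

```python
def anova_f_score(groups):
    """One-way ANOVA F statistic for one feature, given its values
    split into one list per class."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    between = sum(len(g) * (m - grand_mean) ** 2
                  for g, m in zip(groups, means)) / (k - 1)
    within = sum((x - m) ** 2
                 for g, m in zip(groups, means) for x in g) / (n - k)
    if within == 0:
        # Assumption: a perfectly separating feature ranks first.
        return float("inf") if between > 0 else 0.0
    return between / within


def rank_features(X, y):
    """Return feature indices sorted from highest to lowest F score."""
    labels = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        groups = [[row[j] for row, lab in zip(X, y) if lab == c]
                  for c in labels]
        scores.append(anova_f_score(groups))
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)


# Hypothetical toy data: feature 0 separates the classes, feature 1 barely does.
X = [[1.0, 5.0], [2.0, 5.1], [9.0, 4.9], [10.0, 5.0]]
y = [0, 0, 1, 1]
ranking = rank_features(X, y)
```

Incremental feature selection would then train a classifier on the top 1, top 2, ... features of this ranking and keep the prefix with the best cross-validated accuracy.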
Affiliation(s)
- Hua Tang
- Department of Pathophysiology, Southwest Medical University, Luzhou 646000, China
| | - Ya-Wei Zhao
- Key Laboratory for NeuroInformation of Ministry of Education, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
| | - Ping Zou
- Department of Pathophysiology, Southwest Medical University, Luzhou 646000, China
| | - Chun-Mei Zhang
- Department of Pathophysiology, Southwest Medical University, Luzhou 646000, China
| | - Rong Chen
- Department of Pathophysiology, Southwest Medical University, Luzhou 646000, China
| | - Po Huang
- Department of Pathophysiology, Southwest Medical University, Luzhou 646000, China
| | - Hao Lin
- Key Laboratory for NeuroInformation of Ministry of Education, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
| |
|
46
|
Selecting Feature Subsets Based on SVM-RFE and the Overlapping Ratio with Applications in Bioinformatics. Molecules 2017; 23:molecules23010052. [PMID: 29278382 PMCID: PMC5943966 DOI: 10.3390/molecules23010052] [Citation(s) in RCA: 45] [Impact Index Per Article: 6.4] [Received: 11/10/2017] [Revised: 12/15/2017] [Accepted: 12/16/2017] [Indexed: 11/29/2022] Open
Abstract
Feature selection is an important topic in bioinformatics. Identifying informative features in complex, high-dimensional biological data is critical in disease study, drug development, etc. Support vector machine-recursive feature elimination (SVM-RFE) is an efficient feature selection technique that has shown its power in many applications. It ranks the features according to the recursive feature deletion sequence based on SVM. In this study, we propose a method, SVM-RFE-OA, which combines the classification accuracy rate and the average overlapping ratio of the samples to determine the number of features to be selected from the feature rank of SVM-RFE. Meanwhile, to measure the feature weights more accurately, we propose a modified SVM-RFE-OA (M-SVM-RFE-OA) algorithm that temporarily screens out the samples lying in a heavily overlapping area in each iteration. The experiments on eight public biological datasets show that the discriminative ability of the feature subset can be measured more accurately by combining the classification accuracy rate with the average overlapping degree of the samples than by using the classification accuracy rate alone, and that shielding the samples in the overlapping area made the calculation of the feature weights more stable and accurate. The methods proposed in this study can also be used with other RFE techniques to identify potential biomarkers from big biological data.
|