1. Wang L, Zeng Z, Xue Z, Wang Y. DeepNeuropePred: A robust and universal tool to predict cleavage sites from neuropeptide precursors by protein language model. Comput Struct Biotechnol J 2024; 23:309-315. [PMID: 38179071 PMCID: PMC10764246 DOI: 10.1016/j.csbj.2023.12.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/28/2023] [Revised: 11/30/2023] [Accepted: 12/02/2023] [Indexed: 01/06/2024] Open
Abstract
Neuropeptides play critical roles in many biological processes such as growth, learning, memory, metabolism, and neuronal differentiation. A few approaches have been reported for predicting neuropeptides that are cleaved from precursor protein sequences. However, these cleavage site prediction models were developed using a limited number of neuropeptide precursor datasets and simple precursor representation models. In addition, a universal method for predicting neuropeptide cleavage sites that can be applied to all species is still lacking. In this paper, we propose a novel deep learning method called DeepNeuropePred, which combines a pre-trained language model with Convolutional Neural Networks for feature extraction and prediction of neuropeptide cleavage sites from precursors. To demonstrate the model's effectiveness and robustness, we evaluated the performance of DeepNeuropePred and four models from the NeuroPred server on the independent dataset. Our model achieved the highest AUC score (0.916), which is 6.9%, 7.8%, 8.8%, and 10.9% higher than the Mammalian (0.857), Insects (0.850), Mollusc (0.842), and Motif (0.826) models, respectively. For the convenience of researchers, we provide a web server (http://isyslab.info/NeuroPepV2/deepNeuropePred.jsp).
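The abstract does not say whether the quoted percentages are absolute or relative differences. The short check below suggests they are relative gains over each NeuroPred model's AUC; the function and variable names are illustrative only and do not come from the paper.

```python
# AUC values exactly as reported in the abstract above.
DEEP_NEUROPEPRED_AUC = 0.916
NEUROPRED_AUC = {
    "Mammalian": 0.857,
    "Insects": 0.850,
    "Mollusc": 0.842,
    "Motif": 0.826,
}


def relative_gain_percent(new: float, baseline: float) -> float:
    """Relative improvement of `new` over `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


# Round to one decimal place to match the precision used in the abstract.
gains = {
    name: round(relative_gain_percent(DEEP_NEUROPEPRED_AUC, auc), 1)
    for name, auc in NEUROPRED_AUC.items()
}

for name, gain in gains.items():
    print(f"{name}: +{gain}% relative AUC gain")
```

Read this way, the absolute AUC differences are smaller (0.059 to 0.090) than the percentages might suggest at first glance.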
Affiliation(s)
- Lei Wang
  - Institute of Medical Artificial Intelligence, Binzhou Medical University, Yantai, Shandong 264003, China
  - School of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
- Zilu Zeng
  - Wuhan Children's Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei 430010, China
- Zhidong Xue
  - Institute of Medical Artificial Intelligence, Binzhou Medical University, Yantai, Shandong 264003, China
  - School of Software Engineering, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
- Yan Wang
  - Institute of Medical Artificial Intelligence, Binzhou Medical University, Yantai, Shandong 264003, China
  - School of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
2. De La Toba EA, Anapindi KDB, Sweedler JV. Assessment and Comparison of Database Search Engines for Peptidomic Applications. J Proteome Res 2023; 22:3123-3134. [PMID: 36809008 PMCID: PMC10440370 DOI: 10.1021/acs.jproteome.2c00307] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/23/2023]
Abstract
Protein database search engines are an integral component of mass spectrometry-based peptidomic analyses. Given the unique computational challenges of peptidomics, many factors must be taken into consideration when optimizing search engine selection, as each platform has different algorithms by which tandem mass spectra are scored for subsequent peptide identifications. In this study, four different database search engines, PEAKS, MS-GF+, OMSSA, and X! Tandem, were compared with Aplysia californica and Rattus norvegicus peptidomics data sets, and various metrics were assessed such as the number of unique peptide and neuropeptide identifications, and peptide length distributions. Given the tested conditions, PEAKS was found to have the highest number of peptide and neuropeptide identifications out of the four search engines in both data sets. Furthermore, principal component analysis and multivariate logistic regression were employed to determine whether specific spectral features contribute to false C-terminal amidation assignments by each search engine. From this analysis, it was found that the primary features influencing incorrect peptide assignments were the precursor and fragment ion m/z errors. Finally, an assessment employing a mixed species protein database was performed to evaluate search engine precision and sensitivity when searched against an enlarged search space containing human proteins.
Affiliation(s)
- Eduardo A. De La Toba
  - Beckman Institute of Advanced Science and Technology, University of Illinois at Urbana-Champaign, 61801
  - Department of Chemistry, University of Illinois at Urbana-Champaign, 61801
- Krishna D. B. Anapindi
  - Beckman Institute of Advanced Science and Technology, University of Illinois at Urbana-Champaign, 61801
  - Department of Chemistry, University of Illinois at Urbana-Champaign, 61801
- Jonathan V. Sweedler
  - Beckman Institute of Advanced Science and Technology, University of Illinois at Urbana-Champaign, 61801
  - Department of Chemistry, University of Illinois at Urbana-Champaign, 61801
3. Yang N, Anapindi KDB, Romanova EV, Rubakhin SS, Sweedler JV. Improved identification and quantitation of mature endogenous peptides in the rodent hypothalamus using a rapid conductive sample heating system. Analyst 2018; 142:4476-4485. [PMID: 29098220 DOI: 10.1039/c7an01358b] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Indexed: 01/05/2023]
Abstract
Measurement, identification, and quantitation of endogenous peptides in tissue samples by mass spectrometry (MS) contribute to our understanding of the complex molecular mechanisms of numerous biological phenomena. For accurate results, it is essential to arrest the postmortem degradation of ubiquitous proteins in samples prior to performing peptidomic measurements. Doing so ensures that the detection of endogenous peptides, typically present at relatively low levels of abundance, is not overwhelmed by protein degradation products. Heat stabilization has been shown to inactivate the enzymes in tissue samples and minimize the presence of protein degradation products in the subsequent peptide extracts. However, the efficacy of different heat treatments to preserve the integrity of full-length endogenous peptides has not been well documented; prior peptidomic studies of heat stabilization methods have not distinguished between the full-length (mature) and numerous truncated (possible artifacts of sampling) forms of endogenous peptides. We show that thermal sample treatment via rapid conductive heat transfer is effective for detection of mature endogenous peptides in fresh and frozen rodent brain tissues. Freshly isolated tissue processing with the commercial Stabilizor T1 heat stabilization system resulted in the confident identification of 65% more full-length mature neuropeptides compared to widely used sample treatment in a hot water bath. This finding was validated by a follow-up quantitative multiple reaction monitoring MS analysis of select neuropeptides. The rapid conductive heating in partial vacuum provided by the Stabilizor T1 effectively reduces protein degradation and decreases the chemical complexity of the sample, as assessed by determining total protein content. This system enabled the detection, identification, and quantitation of neuropeptides related to 22 prohormones expressed in individual rat hypothalami and suprachiasmatic nuclei.
Affiliation(s)
- Ning Yang
  - Department of Chemistry and the Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana 61801, USA.
4. Veenstra JA. Neurohormones and neuropeptides encoded by the genome of Lottia gigantea, with reference to other mollusks and insects. Gen Comp Endocrinol 2010; 167:86-103. [PMID: 20171220 DOI: 10.1016/j.ygcen.2010.02.010] [Citation(s) in RCA: 161] [Impact Index Per Article: 11.5] [Received: 12/15/2009] [Revised: 02/04/2010] [Accepted: 02/12/2010] [Indexed: 12/23/2022]
Abstract
The Lottia gigantea genome was prospected for the presence of genes coding neuropeptides and neurohormones. Four genes code insulin-related peptides: two genes code molluscan insulin-like growth hormones, one gene an insulin very similar to vertebrate insulin, and the fourth a peptide related to Drosophila insulin-like peptide 7. Four other genes encode the cysteine-knot proteins GPA2/GPB5 and bursicon/parabursicon. Another 37 genes code for precursors of the following neuropeptides: achatin, APGWamide, allatostatin C, allatotropin, buccalin (perhaps an allatostatin A homolog), cerebrin, CCAP, conopressin, elevenin (the predicted neuropeptide made by abdominal neuron 11 in Aplysia), egg laying hormone (two genes), enterin, feeding circuit activating neuropeptide (FCAP), FFamide, FMRFamide, GGNG, a GnRH-like peptide, the newly discovered LASGLVamide, LFRFamide, LFRYamide, LRNFVamide, luqin, lymnokinin, myomodulin (two genes), the newly discovered NKY, NPY, pedal peptide (three genes), PKYMDT, pleurin, PXFVamide, small cardioactive peptides, tachykinins (two genes) and WWamide (an allatostatin B homolog). One gene was found to encode FWISamide, while about 20 closely related genes were found to encode WWFamide. These small neuropeptides appear homologous to the NdWFamide, which contains d-Trp; these genes are similar to the Aplysia gene encoding NWFamide. Some of these peptides had not been previously identified from mollusks, such as the predicted hormones similar to Drosophila and vertebrate insulins, bursicon, the putative proctolin homolog PKYMDT and allatostatin C. Together with neuropeptides which are likely homologs of other insect neuropeptides, such as cerebrin and WWamide, this shows that despite significant differences the molluscan and arthropod neuropeptidomes are more similar than generally recognized.
Affiliation(s)
- Jan A Veenstra
  - Université de Bordeaux, CNRS, CNIC UMR 5228, 33400 Talence, France.
5. Bora A, Annangudi SP, Millet LJ, Rubakhin SS, Forbes AJ, Kelleher NL, Gillette MU, Sweedler JV. Neuropeptidomics of the supraoptic rat nucleus. J Proteome Res 2008; 7:4992-5003. [PMID: 18816085 PMCID: PMC2646869 DOI: 10.1021/pr800394e] [Citation(s) in RCA: 53] [Impact Index Per Article: 3.3] [Indexed: 11/28/2022]
Abstract
The mammalian supraoptic nucleus (SON) is a neuroendocrine center in the brain regulating a variety of physiological functions. Within the SON, peptidergic magnocellular neurons that project to the neurohypophysis (posterior pituitary) are involved in controlling osmotic balance, lactation, and parturition, partly through secretion of signaling peptides such as oxytocin and vasopressin into the blood. An improved understanding of SON activity and function requires identification and characterization of the peptides used by the SON. Here, small-volume sample preparation approaches are optimized for neuropeptidomic studies of isolated SON samples ranging from entire nuclei down to single magnocellular neurons. Unlike most previous mammalian peptidome studies, tissues are not immediately heated or microwaved. SON samples are obtained from ex vivo brain slice preparations via tissue punch and the samples processed through sequential steps of peptide extraction. Analyses of the samples via liquid chromatography mass spectrometry and tandem mass spectrometry result in the identification of 85 peptides, including 20 unique peptides from known prohormones. As the sample size is further reduced, the depth of peptide coverage decreases; however, even from individually isolated magnocellular neuroendocrine cells, vasopressin and several other peptides are detected.
Affiliation(s)
- Adriana Bora
  - Neuroscience Program, Department of Cell and Developmental Biology, Beckman Institute, and Department of Chemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
6. Boonen K, Landuyt B, Baggerman G, Husson SJ, Huybrechts J, Schoofs L. Peptidomics: The integrated approach of MS, hyphenated techniques and bioinformatics for neuropeptide analysis. J Sep Sci 2008; 31:427-45. [DOI: 10.1002/jssc.200700450] [Citation(s) in RCA: 78] [Impact Index Per Article: 4.9] [Indexed: 01/02/2023]
7. Composition, Structure and Function from Endopeptidease of Aplysia Egg Analyzed with Matrix Assisted Laser Desorption Ionization-Time of Flight-Mass Spectrometry. CHINESE JOURNAL OF ANALYTICAL CHEMISTRY 2007. [DOI: 10.1016/s1872-2040(07)60072-3] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/21/2022]
8. Proekt A, Vilim FS, Alexeeva V, Brezina V, Friedman A, Jing J, Li L, Zhurov Y, Sweedler JV, Weiss KR. Identification of a new neuropeptide precursor reveals a novel source of extrinsic modulation in the feeding system of Aplysia. J Neurosci 2006; 25:9637-48. [PMID: 16237168 PMCID: PMC6725720 DOI: 10.1523/jneurosci.2932-05.2005] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.9] [Indexed: 11/21/2022] Open
Abstract
The Aplysia feeding system is advantageous for investigating the role of neuropeptides in behavioral plasticity. One family of Aplysia neuropeptides is the myomodulins (MMs), originally purified from one of the feeding muscles, the accessory radula closer (ARC). However, two MMs, MMc and MMe, are not encoded on the only known MM gene. Here, we identify MM gene 2 (MMG2), which encodes MMc and MMe and four new neuropeptides. We use matrix-assisted laser desorption/ionization time-of-flight mass spectrometry to verify that these novel MMG2-derived peptides (MMG2-DPs), as well as MMc and MMe, are synthesized from the precursor. Using antibodies against the MMG2-DPs, we demonstrate that neuronal processes that stain for MMG2-DPs are found in the buccal ganglion, which contains the feeding network, and in the buccal musculature including the ARC muscle. Surprisingly, however, no immunostaining is observed in buccal neurons including the ARC motoneurons. In situ hybridization reveals only a few MMG2-expressing neurons, which are mostly located in the pedal ganglion. Using immunohistochemical and electrophysiological techniques, we demonstrate that some of these pedal neurons project to the buccal ganglion and are the likely source of the MMG2-DP innervation of the feeding network and musculature. We show that the MMG2-DPs are bioactive both centrally and peripherally: they bias egestive feeding programs toward ingestive ones, and they modulate ARC muscle contractions. The multiple actions of the MMG2-DPs suggest that these peptides play a broad role in behavioral plasticity and that the pedal-buccal projection neurons that express them are a novel source of extrinsic modulation of the feeding system of Aplysia.
Affiliation(s)
- Alex Proekt
  - Department of Neuroscience, Mount Sinai School of Medicine, New York, New York 10029, USA
9. Fu Q, Christie AE, Li L. Mass spectrometric characterization of crustacean hyperglycemic hormone precursor-related peptides (CPRPs) from the sinus gland of the crab, Cancer productus. Peptides 2005; 26:2137-50. [PMID: 16269349 DOI: 10.1016/j.peptides.2005.03.040] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.8] [Received: 01/30/2005] [Revised: 03/15/2005] [Accepted: 03/17/2005] [Indexed: 12/25/2022]
Abstract
Crustacean hyperglycemic hormone (CHH) precursor-related peptides (CPRPs) are produced during the proteolytic processing of CHH preprohormones. Currently, the physiological roles played by CPRPs are unknown. Due to their large size, direct mass spectrometric sequencing of intact CPRPs is difficult. Here, we describe a novel strategy for sequencing Cancer productus CPRPs directly from a tissue extract using nanoflow liquid chromatography coupled to quadrupole time-of-flight tandem mass spectrometry. Four novel CPRPs were characterized with the aid of MS/MS de novo sequencing of 27 truncated CPRP peptides. Extensive modifications (methionine oxidation and carboxy-terminal methylation) were identified in both the full-length and truncated peptides. To investigate the origin of the modifications and truncations, a full-length CPRP was synthesized and subjected to the same storage and extraction protocols used for the characterization of the native peptides. Here, some methionine oxidation was seen; however, no methylation or truncation was evident, suggesting that much of the chemical complexity seen in the native CPRPs is unlikely to be a sample preparation artifact. Collectively, our study represents the most complete characterization of CPRPs to date and provides a foundation for future investigation of CPRP function in C. productus.
Affiliation(s)
- Qiang Fu
  - Department of Chemistry, University of Wisconsin, 1101 University Avenue, Madison, WI 53706, USA
10. Hummon AB, Hummon NP, Corbin RW, Li L, Vilim FS, Weiss KR, Sweedler JV. From precursor to final peptides: a statistical sequence-based approach to predicting prohormone processing. J Proteome Res 2004; 2:650-6. [PMID: 14692459 DOI: 10.1021/pr034046d] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.5] [Indexed: 11/29/2022]
Abstract
Predicting the final neuropeptide products from neuropeptide genes has been problematic because of the large number of enzymes responsible for their processing. The basic processing of 22 Aplysia californica prohormones representing 750 cleavage sites has been analyzed and statistically modeled using binary logistic regression analyses. Two models are presented that predict cleavage probabilities at basic residues based on prohormone sequence. The complex model has a correct classification rate of 97%, a sensitivity of 97%, and a specificity of 96% when tested on the Aplysia dataset.
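As a rough illustration of what a binary logistic model for cleavage at basic residues looks like, the sketch below scores every K or R in a sequence from a few neighbor-based features. The features, the intercept, and the weights are hypothetical placeholders chosen for readability; they are not the predictors or fitted coefficients reported by the authors, and the example sequence is invented.

```python
import math

BASIC_RESIDUES = {"K", "R"}

# Hypothetical coefficients for illustration only; NOT taken from the paper.
INTERCEPT = -2.0
WEIGHTS = {
    "prev_basic": 1.5,   # residue at position -1 is K or R
    "next_basic": -1.0,  # residue at position +1 is K or R
    "site_is_arg": 0.5,  # candidate site itself is R rather than K
}


def site_features(seq: str, i: int) -> dict:
    """Binary features describing the neighborhood of candidate site i."""
    return {
        "prev_basic": 1.0 if i > 0 and seq[i - 1] in BASIC_RESIDUES else 0.0,
        "next_basic": 1.0 if i + 1 < len(seq) and seq[i + 1] in BASIC_RESIDUES else 0.0,
        "site_is_arg": 1.0 if seq[i] == "R" else 0.0,
    }


def cleavage_probability(seq: str, i: int) -> float:
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum_k w_k * x_k)))."""
    x = site_features(seq, i)
    z = INTERCEPT + sum(WEIGHTS[name] * x[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def score_basic_sites(seq: str) -> list:
    """Return (index, residue, probability) for every basic residue in seq."""
    return [
        (i, aa, cleavage_probability(seq, i))
        for i, aa in enumerate(seq)
        if aa in BASIC_RESIDUES
    ]


# Invented toy sequence containing one dibasic pair (KR) and one lone K.
scores = score_basic_sites("AAKRAAKAA")
for index, residue, prob in scores:
    print(f"site {index} ({residue}): p(cleavage) = {prob:.3f}")
```

In a real analysis the coefficients would be estimated by fitting the regression to annotated cleavage sites, and a probability threshold would turn the scores into the cleaved/not-cleaved calls from which classification rate, sensitivity, and specificity are computed.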
Affiliation(s)
- Amanda B Hummon
  - Department of Chemistry and the Beckman Institute, University of Illinois, Urbana, Illinois 61801, USA