Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Raja K, Natarajan J. Mining protein phosphorylation information from biomedical literature using NLP parsing and Support Vector Machines. Comput Methods Programs Biomed 2018;160:57-64. [PMID: 29728247 DOI: 10.1016/j.cmpb.2018.03.022] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 11/11/2016] [Revised: 02/23/2018] [Accepted: 03/22/2018] [Indexed: 06/08/2023]

Cited by Other Article(s)

Savage SR, Zhang Y, Jaehnig EJ, Liao Y, Shi Z, Pham HA, Xu H, Zhang B. IDPpub: Illuminating the Dark Phosphoproteome Through PubMed Mining. Mol Cell Proteomics 2024;23:100682. [PMID: 37993103 PMCID: PMC10716774 DOI: 10.1016/j.mcpro.2023.100682] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2023] [Revised: 10/25/2023] [Accepted: 11/14/2023] [Indexed: 11/24/2023]

Abstract

Global phosphoproteomics experiments quantify tens of thousands of phosphorylation sites. However, data interpretation is hampered by our limited knowledge of the functions, biological contexts, or precipitating enzymes of the phosphosites. This study establishes a repository of phosphosites with associated evidence in biomedical abstracts, using deep learning-based natural language processing techniques. Our model for illuminating the dark phosphoproteome through PubMed mining (IDPpub) was generated by fine-tuning BioBERT, a deep learning tool for biomedical text mining. Trained using sentences containing protein substrates and phosphorylation site positions from 3000 abstracts, the IDPpub model was then used to extract phosphorylation sites from all MEDLINE abstracts. The extracted proteins were normalized to gene symbols using the National Center for Biotechnology Information gene query, and sites were mapped to human UniProt sequences using ProtMapper and to mouse UniProt sequences by direct match. Precision and recall were calculated using 150 curated abstracts, and utility was assessed by analyzing the CPTAC (Clinical Proteomic Tumor Analysis Consortium) pan-cancer phosphoproteomics datasets and the PhosphoSitePlus database. Using 10-fold cross validation, pairs of correct substrates and phosphosite positions were extracted with an average precision of 0.93 and recall of 0.94. After entity normalization and site mapping to human reference sequences, an independent validation achieved a precision of 0.91 and recall of 0.77. The IDPpub repository contains 18,458 unique human phosphorylation sites with evidence sentences from 58,227 abstracts and 5918 mouse sites in 14,610 abstracts. This included evidence sentences for 1803 sites identified in CPTAC studies that are not covered by manually curated functional information in PhosphoSitePlus. Evaluation results demonstrate the potential of IDPpub as an effective biomedical text mining tool for collecting phosphosites. Moreover, the repository (http://idppub.ptmax.org), which can be automatically updated, can serve as a powerful complement to existing resources.

Arumugam K, Sellappan M, Anand D, Anand S, Radhakrishnan SV. A Text Mining and Machine Learning Protocol for Extracting Posttranslational Modifications of Proteins from PubMed: A Special Focus on Glycosylation, Acetylation, Methylation, Hydroxylation, and Ubiquitination. Methods Mol Biol 2022;2496:179-202. [PMID: 35713865 DOI: 10.1007/978-1-0716-2305-3_10] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]

Anand S, Iyyappan OR, Manoharan S, Anand D, Jose MA, Shanker RR. Text Mining Protocol to Retrieve Significant Drug-Gene Interactions from PubMed Abstracts. Methods Mol Biol 2022;2496:17-39. [PMID: 35713857 DOI: 10.1007/978-1-0716-2305-3_2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]

Automated Extraction and Visualization of Protein-Protein Interaction Networks and Beyond: A Text-Mining Protocol. Methods Mol Biol 2020;2074:13-34. [PMID: 31583627 DOI: 10.1007/978-1-4939-9873-9_2] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 12/14/2022]