1
Sikirzhytskaya A, Tyagin I, Sutton SS, Wyatt MD, Safro I, Shtutman M. AI-based mining of biomedical literature: Applications for drug repurposing for the treatment of dementia. bioRxiv 2024:2024.06.06.597745. [PMID: 38895485 PMCID: PMC11185689 DOI: 10.1101/2024.06.06.597745] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/21/2024]
Abstract
Neurodegenerative pathologies such as Alzheimer's disease, Parkinson's disease, Huntington's disease, Amyotrophic lateral sclerosis, Multiple sclerosis, HIV-associated neurocognitive disorder, and others significantly affect individuals, their families, caregivers, and healthcare systems. While there are no cures yet, researchers worldwide are actively working on the development of novel treatments that have the potential to slow disease progression, alleviate symptoms, and ultimately improve the overall health of patients. Huge volumes of new scientific information necessitate new analytical approaches for meaningful hypothesis generation. To enable the automatic analysis of biomedical data we introduced AGATHA, an effective AI-based literature mining tool that can navigate massive scientific literature databases, such as PubMed. The overarching goal of this effort is to adapt AGATHA for drug repurposing by revealing hidden connections between FDA-approved medications and a health condition of interest. Our tool converts the abstracts of peer-reviewed papers from PubMed into multidimensional space where each gene and health condition are represented by specific metrics. We implemented advanced statistical analysis to reveal distinct clusters of scientific terms within the virtual space created using AGATHA-calculated parameters for selected health conditions and genes. Partial Least Squares Discriminant Analysis was employed for categorizing and predicting samples (122 diseases and 20889 genes) fitted to specific classes. Advanced statistics were employed to build a discrimination model and extract lists of genes specific to each disease class. Here we focus on drugs that can be repurposed for dementia treatment as an outcome of neurodegenerative diseases. Therefore, we determined dementia-associated genes statistically highly ranked in other disease classes. Additionally, we report a mechanism for detecting genes common to multiple health conditions. 
These sets of genes were classified based on their presence in biological pathways, aiding in selecting candidates and biological processes that are exploitable with drug repurposing.
Author Summary
This manuscript outlines our project involving the application of AGATHA, an AI-based literature mining tool, to discover drugs with the potential for repurposing in the context of neurocognitive disorders. The primary objective is to identify connections between approved medications and specific health conditions through advanced statistical analysis, including techniques like Partial Least Squares Discriminant Analysis (PLSDA) and unsupervised clustering. The methodology involves grouping scientific terms related to different health conditions and genes, followed by building discrimination models to extract lists of disease-specific genes. These genes are then analyzed through pathway analysis to select candidates for drug repurposing.
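To make the discriminant step concrete, the following is a minimal sketch of a two-class, one-component PLS-DA in plain Python. It is not the authors' pipeline: the feature names, the toy scores, and the helper functions `fit_plsda_one_component`, `predict_class`, and `rank_features` are all hypothetical, and a real analysis would use several components, more than two classes, and cross-validation. It assumes the features are not all uncorrelated with the labels (otherwise the weight vector has zero length).

```python
from math import sqrt


def fit_plsda_one_component(X, y):
    """Fit a one-component PLS-DA model for two classes.

    X: list of samples, each a list of numeric features.
    y: list of class labels coded as 0 or 1.
    """
    n, p = len(X), len(X[0])
    # Center the features and the labels.
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]

    # Weight vector: the feature-space direction with maximal covariance with y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]

    # Latent scores, then a least-squares fit of the centered labels on the scores.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    q = sum(t[i] * yc[i] for i in range(n)) / sum(v * v for v in t)
    return {"x_mean": x_mean, "y_mean": y_mean, "w": w, "q": q}


def predict_class(model, sample):
    """Project a sample onto the latent direction and threshold at 0.5."""
    score = sum(
        (sample[j] - model["x_mean"][j]) * model["w"][j]
        for j in range(len(sample))
    )
    y_hat = model["y_mean"] + model["q"] * score
    return 1 if y_hat >= 0.5 else 0


def rank_features(model, names):
    """Order feature names by the absolute size of their PLS weight."""
    return sorted(names, key=lambda name: -abs(model["w"][names.index(name)]))


# Hypothetical toy data: rows are samples, columns are per-feature scores.
feature_names = ["gene_a", "gene_b", "gene_c"]
X = [
    [1.0, 0.2, 0.5],
    [0.9, 0.1, 0.4],
    [0.1, 0.8, 0.5],
    [0.2, 0.9, 0.6],
]
y = [1, 1, 0, 0]

model = fit_plsda_one_component(X, y)
print(predict_class(model, [0.95, 0.15, 0.45]))
print(rank_features(model, feature_names))
```

Ranking features by weight magnitude is one simple way to pull out class-specific variables from such a model; the abstract does not say which ranking statistic the authors used.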
2
Yan J, Kurgan L. DRNApred, fast sequence-based method that accurately predicts and discriminates DNA- and RNA-binding residues. Nucleic Acids Res 2017; 45:e84. [PMID: 28132027 PMCID: PMC5449545 DOI: 10.1093/nar/gkx059] [Citation(s) in RCA: 75] [Impact Index Per Article: 10.7] [Received: 06/10/2016] [Accepted: 01/24/2017] [Indexed: 01/18/2023]
Abstract
Protein-DNA and protein-RNA interactions are part of many diverse and essential cellular functions, yet most of them remain to be discovered and characterized. Recent research shows that sequence-based predictors of DNA-binding residues accurately find these residues but also cross-predict many RNA-binding residues as DNA-binding, and vice versa. Most of these methods are also relatively slow, which prohibits applications on the whole-genome scale. We describe a novel sequence-based method, DRNApred, which accurately predicts and discriminates between DNA- and RNA-binding residues in a high-throughput manner. DRNApred was designed using a new dataset with both DNA- and RNA-binding proteins, regression that penalizes cross-predictions, and a novel two-layered architecture. DRNApred outperforms state-of-the-art predictors of DNA- or RNA-binding residues on a benchmark test dataset by substantially reducing cross-predictions and by predicting arguably higher-quality false positives that are located near the native binding residues. Moreover, it also predicts DNA- and RNA-binding proteins more accurately. Application to the human proteome confirms that DRNApred reduces cross-predictions among the native nucleic acid binders. Also, the novel putative DNA/RNA-binding proteins that it predicts share similar subcellular locations and residue charge profiles with the known native binding proteins. The DRNApred webserver is freely available at http://biomine.cs.vcu.edu/servers/DRNApred/.
Affiliation(s)
- Jing Yan
- Department of Electrical and Computer Engineering, University of Alberta, Edmonton T6G 2V4, Canada
- Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, 23284, USA
3
Palharini JG, Richter AC, Silva MF, Ferreira FB, Pirovani CP, Naves KSC, Goulart VA, Mineo TWP, Silva MJB, Santiago FM. Eutirucallin: A Lectin with Antitumor and Antimicrobial Properties. Front Cell Infect Microbiol 2017; 7:136. [PMID: 28487845 PMCID: PMC5403948 DOI: 10.3389/fcimb.2017.00136] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 08/26/2016] [Accepted: 03/31/2017] [Indexed: 11/13/2022]
Abstract
Eutirucallin is a lectin isolated from the latex of Euphorbia tirucalli, a plant known for its medicinal properties. The present study explores various characteristics of Eutirucallin, including stability, cytotoxicity against tumor cells, and antimicrobial and antiparasitic activities. Eutirucallin was stable from 2 to 40 days at 4°C, maintained hemagglutinating activity within a restricted range, and showed optimal activity at pH 7.0–8.0. Eutirucallin presented antiproliferative activity against HeLa, PC3, MDA-MB-231, and MCF-7 tumor cells but was not cytotoxic to non-tumorigenic cells such as macrophages and fibroblasts. Eutirucallin inhibited Ehrlich ascites carcinoma in vivo, and it also inhibited 62.5% of Escherichia coli growth. In addition, Eutirucallin proved effective when tested directly against Toxoplasma gondii infection in vitro. This study therefore opens perspectives for pharmacological applications of Eutirucallin.
Affiliation(s)
- Julio G Palharini
- Laboratory of Immunoparasitology "Dr. Mario Endsfeldz Camargo", Institute of Biomedical Sciences, Federal University of Uberlândia, Uberlândia, Brazil
- Aline C Richter
- Laboratory of Immunoparasitology "Dr. Mario Endsfeldz Camargo", Institute of Biomedical Sciences, Federal University of Uberlândia, Uberlândia, Brazil
- Mariana F Silva
- Laboratory of Immunoparasitology "Dr. Mario Endsfeldz Camargo", Institute of Biomedical Sciences, Federal University of Uberlândia, Uberlândia, Brazil
- Flavia B Ferreira
- Laboratory of Immunoparasitology "Dr. Mario Endsfeldz Camargo", Institute of Biomedical Sciences, Federal University of Uberlândia, Uberlândia, Brazil
- Carlos P Pirovani
- Biological Sciences Department, State University of Santa Cruz, Ilhéus, Brazil
- Karinne S C Naves
- Laboratory of Clinical Bacteriology, Institute of Biomedical Sciences, Federal University of Uberlândia, Uberlândia, Brazil
- Vivian A Goulart
- Laboratory of Nanobiotechnology, Institute of Genetics and Biochemistry, Federal University of Uberlândia, Uberlândia, Brazil
- Tiago W P Mineo
- Laboratory of Immunoparasitology "Dr. Mario Endsfeldz Camargo", Institute of Biomedical Sciences, Federal University of Uberlândia, Uberlândia, Brazil
- Marcelo J B Silva
- Laboratory of Tumor Biomarkers and Osteoimmunology, Institute of Biomedical Sciences, Federal University of Uberlândia, Uberlândia, Brazil
- Fernanda M Santiago
- Laboratory of Immunoparasitology "Dr. Mario Endsfeldz Camargo", Institute of Biomedical Sciences, Federal University of Uberlândia, Uberlândia, Brazil
4
Lee HC, Hsu YY, Kao HY. AuDis: an automatic CRF-enhanced disease normalization in biomedical text. Database: The Journal of Biological Databases and Curation 2016; 2016:baw091. [PMID: 27278815 PMCID: PMC4897593 DOI: 10.1093/database/baw091] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 12/04/2015] [Accepted: 05/09/2016] [Indexed: 01/22/2023]
Abstract
Diseases play central roles in many areas of biomedical research and healthcare. Consequently, aggregating disease knowledge and treatment research reports has become a critical issue, especially in rapidly growing knowledge bases (e.g. PubMed). We therefore developed a system, AuDis, for disease mention recognition and normalization in biomedical texts. Our system utilizes a second-order conditional random fields model. To optimize the results, we customized several post-processing steps, including abbreviation resolution, consistency improvement and stopword filtering. In the official evaluation of the CDR task in BioCreative V, AuDis obtained the best performance (F-score of 86.46%) among 40 runs (16 unique teams) on disease normalization in the DNER subtask. These results suggest that AuDis is a high-performance system for disease recognition and normalization from biomedical literature. Database URL: http://ikmlab.csie.ncku.edu.tw/CDR2015/AuDis.html.
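The three post-processing steps named in this abstract can be illustrated with a small Python sketch. This is not the AuDis implementation, whose rules are not given here: the stopword list, the "long form (SF)" abbreviation pattern, and the functions `find_abbreviations` and `postprocess` are assumptions made only for illustration, and the CRF tagger that would produce the input mentions is left out.

```python
import re

# Hypothetical list of generic words that should not count as disease mentions.
STOPWORDS = {"disease", "disorder", "syndrome"}


def find_abbreviations(text):
    """Map short forms to long forms for patterns like 'long form (SF)'.

    Assumption: the long form is the run of words just before the
    parenthesis whose initials spell the short form.
    """
    mapping = {}
    for match in re.finditer(r"\(([A-Z]{2,6})\)", text):
        short = match.group(1)
        words = text[:match.start()].split()
        candidate = words[-len(short):]
        initials_match = len(candidate) == len(short) and all(
            word[0].upper() == letter for word, letter in zip(candidate, short)
        )
        if initials_match:
            mapping[short] = " ".join(candidate)
    return mapping


def postprocess(mentions, text):
    """Apply stopword filtering, consistency improvement and abbreviation resolution.

    Returns a dict from each kept surface mention to its resolved long form.
    """
    abbreviations = find_abbreviations(text)
    # Stopword filtering: drop mentions that are only a generic word.
    known = {m for m in mentions if m.lower() not in STOPWORDS}
    # Consistency improvement: if either form of a defined abbreviation was
    # recognized, treat both the short and the long form as mentions.
    for short, long_form in abbreviations.items():
        if short in known or long_form in known:
            known.update({short, long_form})
    # Abbreviation resolution: map short forms to their long forms.
    return {m: abbreviations.get(m, m) for m in sorted(known)}


text = (
    "Patients with amyotrophic lateral sclerosis (ALS) were enrolled. "
    "ALS progressed quickly."
)
print(postprocess(["amyotrophic lateral sclerosis", "disease"], text))
```

In a full system the resolved long forms would then be looked up in a disease vocabulary to assign identifiers; that lookup is omitted here.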
Affiliation(s)
- Hsin-Chun Lee
- Institute of Medical Informatics, National Cheng Kung University, Tainan, Taiwan, R.O.C
- Yi-Yu Hsu
- Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan, R.O.C
- Hung-Yu Kao
- Institute of Medical Informatics, National Cheng Kung University, Tainan, Taiwan, R.O.C; Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan, R.O.C
5
Yang HT, Ju JH, Wong YT, Shmulevich I, Chiang JH. Literature-based discovery of new candidates for drug repurposing. Brief Bioinform 2016; 18:488-497. [DOI: 10.1093/bib/bbw030] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.0] [Received: 11/24/2015] [Indexed: 11/14/2022]
6
Mallory EK, Zhang C, Ré C, Altman RB. Large-scale extraction of gene interactions from full-text literature using DeepDive. Bioinformatics 2015; 32:106-13. [PMID: 26338771 PMCID: PMC4681986 DOI: 10.1093/bioinformatics/btv476] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.2] [Received: 03/31/2015] [Accepted: 08/11/2015] [Indexed: 01/18/2023]
Abstract
MOTIVATION: A complete repository of gene-gene interactions is key for understanding cellular processes, human disease and drug response. These gene-gene interactions include both protein-protein interactions and transcription factor interactions. The majority of known interactions are found in the biomedical literature. Interaction databases, such as BioGRID and ChEA, annotate these gene-gene interactions; however, curation becomes difficult as the literature grows exponentially. DeepDive is a trained system for extracting information from a variety of sources, including text. In this work, we used DeepDive to extract both protein-protein and transcription factor interactions from over 100,000 full-text PLOS articles.
METHODS: We built an extractor for gene-gene interactions that identified candidate gene-gene relations within an input sentence. For each candidate relation, DeepDive computed a probability that the relation was a correct interaction. We evaluated this system against the Database of Interacting Proteins and against randomly curated extractions.
RESULTS: Our system achieved 76% precision and 49% recall in extracting direct and indirect interactions involving gene symbols co-occurring in a sentence. For randomly curated extractions, the system achieved between 62% and 83% precision based on direct or indirect interactions, as well as sentence-level and document-level precision. Overall, our system extracted 3356 unique gene pairs using 724 features from over 100,000 full-text articles.
AVAILABILITY AND IMPLEMENTATION: Application source code is publicly available at https://github.com/edoughty/deepdive_genegene_app
CONTACT: russ.altman@stanford.edu
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
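Precision and recall figures of the kind reported in this abstract compare a set of extracted gene pairs with a curated reference set. The sketch below shows the arithmetic in Python under stated assumptions: the function `precision_recall` and both toy lists are hypothetical, gene pairs are treated as unordered, and nothing here reproduces the DeepDive evaluation itself.

```python
def precision_recall(extracted, gold):
    """Compute precision and recall of extracted gene pairs against a gold set.

    Pairs are treated as unordered, so ("TP53", "MDM2") equals ("MDM2", "TP53").
    """
    extracted_set = {frozenset(pair) for pair in extracted}
    gold_set = {frozenset(pair) for pair in gold}
    true_positives = len(extracted_set & gold_set)
    # Precision: fraction of extracted pairs that are in the gold set.
    precision = true_positives / len(extracted_set) if extracted_set else 0.0
    # Recall: fraction of gold pairs that were extracted.
    recall = true_positives / len(gold_set) if gold_set else 0.0
    return precision, recall


# Hypothetical toy data, for illustration only.
extracted_pairs = [
    ("TP53", "MDM2"),
    ("BRCA1", "BARD1"),
    ("MYC", "MAX"),
    ("EGFR", "TP53"),
]
gold_pairs = [
    ("MDM2", "TP53"),
    ("BARD1", "BRCA1"),
    ("MYC", "MAX"),
    ("FANCA", "FANCC"),
    ("FANCD2", "FANCI"),
    ("JUN", "FOS"),
]

precision, recall = precision_recall(extracted_pairs, gold_pairs)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

With these toy lists, three of the four extracted pairs appear among the six gold pairs, which gives a precision of 0.75 and a recall of 0.50.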
Affiliation(s)
- Emily K Mallory
- Biomedical Informatics Training Program, Stanford University, Stanford, CA 94305, USA
- Ce Zhang
- Department of Computer Sciences, University of Wisconsin-Madison, Madison, WI 53706, USA
- Russ B Altman
- Department of Bioengineering, Department of Genetics and Department of Medicine, Stanford University, Stanford, CA 94305, USA
7
Schmitz U, Naderi-Meshkin H, Gupta SK, Wolkenhauer O, Vera J. The RNA world in the 21st century-a systems approach to finding non-coding keys to clinical questions. Brief Bioinform 2015; 17:380-92. [PMID: 26330575 DOI: 10.1093/bib/bbv061] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 04/21/2015] [Indexed: 02/01/2023]
Abstract
Evidence that RNAs are a functionally rich class of molecules predates the arrival of next-generation sequencing technology. Non-coding RNAs (ncRNA) could be the key to accelerated diagnosis and enhanced prediction of disease and therapy outcomes, as well as to the design of advanced therapeutic strategies that overcome approaches that remain unsatisfactory. In this review, we discuss the state of the art in RNA systems biology with a focus on applications in systems biomedicine. We propose guidelines for analysing the role of microRNAs and long non-coding RNAs in human pathologies. We introduce RNA expression profiling and network approaches for the identification of stable and effective RNomics-based biomarkers, providing insights into the role of ncRNAs in disease regulation. Towards this, we discuss ways to model the dynamics of gene regulatory networks and signalling pathways that involve ncRNAs. We also describe data resources and computational methods for finding putative mechanisms of action of ncRNAs. Finally, we discuss avenues for the computer-aided design of novel RNA-based therapeutics.
8
Renaudin X, Guervilly JH, Aoufouchi S, Rosselli F. Proteomic analysis reveals a FANCA-modulated neddylation pathway involved in CXCR5 membrane targeting and cell mobility. J Cell Sci 2014; 127:3546-54. [PMID: 25015289 DOI: 10.1242/jcs.150706] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Indexed: 01/19/2023]
Abstract
The aim of this study was to identify novel substrates of the FANCcore complex, the inactivation of which leads to the genetic disorder Fanconi anemia, which is associated with bone marrow failure, developmental abnormalities and a predisposition to cancer. Eight FANC proteins participate in the nuclear FANCcore complex, which functions as an E3 ubiquitin-ligase that monoubiquitylates FANCD2 and FANCI in response to replicative stress. Here, we use mass spectrometry to compare proteins from FANCcore-complex-deficient cells to those of rescued control cells after treatment with hydroxyurea, an inducer of FANCD2 monoubiquitylation. FANCD2 and FANCI appear to be the only targets of the FANCcore complex. We identify other proteins that are post-translationally modified in a FANCA- or FANCC-dependent manner. The majority of these potential targets localize to the cell membrane. Finally, we demonstrate that (a) the chemokine receptor CXCR5 is neddylated; (b) FANCA but not FANCC appears to modulate CXCR5 neddylation through an unknown mechanism; (c) CXCR5 neddylation is involved in targeting the receptor to the cell membrane; and (d) CXCR5 neddylation stimulates cell migration and motility. Our work has uncovered a pathway involving FANCA in neddylation and cell motility.
Affiliation(s)
- Xavier Renaudin
- Université Paris-Sud, 91400 Orsay, France; CNRS UMR 8200 - Institut de Cancérologie Gustave Roussy, 94805 Villejuif, France; Equipe Labellisée Ligue Contre le Cancer, 14 Rue Corvisart, 75013 Paris
- Jean-Hugues Guervilly
- Université Paris-Sud, 91400 Orsay, France; CNRS UMR 8200 - Institut de Cancérologie Gustave Roussy, 94805 Villejuif, France; Equipe Labellisée Ligue Contre le Cancer, 14 Rue Corvisart, 75013 Paris
- Said Aoufouchi
- Université Paris-Sud, 91400 Orsay, France; CNRS UMR 8200 - Institut de Cancérologie Gustave Roussy, 94805 Villejuif, France
- Filippo Rosselli
- Université Paris-Sud, 91400 Orsay, France; CNRS UMR 8200 - Institut de Cancérologie Gustave Roussy, 94805 Villejuif, France; Equipe Labellisée Ligue Contre le Cancer, 14 Rue Corvisart, 75013 Paris
9
A lectin from Bothrops leucurus snake venom raises cytosolic calcium levels and promotes B16-F10 melanoma necrotic cell death via mitochondrial permeability transition. Toxicon 2014; 82:97-103. [PMID: 24593964 DOI: 10.1016/j.toxicon.2014.02.018] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.8] [Received: 12/11/2013] [Revised: 02/18/2014] [Accepted: 02/20/2014] [Indexed: 11/22/2022]
Abstract
BlL, a galactose-binding C-type lectin purified from Bothrops leucurus snake venom, exhibits anticancer activity. The current study was designed to elucidate the cellular mechanisms by which BlL induces melanoma cell death. The viabilities of B16-F10 melanoma cells and HaCaT keratinocytes treated with BlL were evaluated. Necrotic and apoptotic cell death, cytosolic Ca²⁺ levels, mitochondrial Ca²⁺ transport and superoxide levels were assessed in B16-F10 melanoma cells exposed to BlL. We found that treatment with BlL caused dose-dependent necrotic cell death in B16-F10 melanoma cells. Conversely, the viability of non-tumorigenic HaCaT cells was not affected by similar doses of BlL. BlL-induced B16-F10 necrosis was preceded by a significant (2-fold) increase in cytosolic calcium concentrations and a significant (3-fold) increase in mitochondrial superoxide generation. It is likely that BlL treatment triggers B16-F10 cell death via mitochondrial permeability transition (MPT) pore opening because the pharmacological MPT inhibitors bongkrekic acid and Debio 025 greatly attenuated BlL-induced cell death. Experiments evaluating mitochondrial Ca²⁺ transport in permeabilized B16-F10 cells strongly supported the hypothesis that BlL rapidly stimulates cyclosporine A-sensitive Ca²⁺-induced MPT pore opening. We therefore conclude that BlL causes selective B16-F10 melanoma cell death via dysregulation of cellular Ca²⁺ homeostasis and Ca²⁺-induced opening of the MPT pore.