1
|
Lötsch J, Lippmann C, Kringel D, Ultsch A. Integrated Computational Analysis of Genes Associated with Human Hereditary Insensitivity to Pain. A Drug Repurposing Perspective. Front Mol Neurosci 2017; 10:252. [PMID: 28848388 PMCID: PMC5550731 DOI: 10.3389/fnmol.2017.00252] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 04/02/2017] [Accepted: 07/26/2017] [Indexed: 12/31/2022] Open
Abstract
Genes causally involved in human insensitivity to pain provide a unique molecular source of studying the pathophysiology of pain and the development of novel analgesic drugs. The increasing availability of “big data” enables novel research approaches to chronic pain while also requiring novel techniques for data mining and knowledge discovery. We used machine learning to combine the knowledge about n = 20 genes causally involved in human hereditary insensitivity to pain with the knowledge about the functions of thousands of genes. An integrated computational analysis proposed that among the functions of this set of genes, the processes related to nervous system development and to ceramide and sphingosine signaling pathways are particularly important. This is in line with earlier suggestions to use these pathways as therapeutic target in pain. Following identification of the biological processes characterizing hereditary insensitivity to pain, the biological processes were used for a similarity analysis with the functions of n = 4,834 database-queried drugs. Using emergent self-organizing maps, a cluster of n = 22 drugs was identified sharing important functional features with hereditary insensitivity to pain. Several members of this cluster had been implicated in pain in preclinical experiments. Thus, the present concept of machine-learned knowledge discovery for pain research provides biologically plausible results and seems to be suitable for drug discovery by identifying a narrow choice of repurposing candidates, demonstrating that contemporary machine-learned methods offer innovative approaches to knowledge discovery from available evidence.
Affiliation(s)
- Jörn Lötsch
- Institute of Clinical Pharmacology, Goethe-University, Frankfurt am Main, Germany; Fraunhofer Institute of Molecular Biology and Applied Ecology-Project Group, Translational Medicine and Pharmacology (IME-TMP), Frankfurt am Main, Germany
- Catharina Lippmann
- Fraunhofer Institute of Molecular Biology and Applied Ecology-Project Group, Translational Medicine and Pharmacology (IME-TMP), Frankfurt am Main, Germany
- Dario Kringel
- Institute of Clinical Pharmacology, Goethe-University, Frankfurt am Main, Germany
- Alfred Ultsch
- DataBionics Research Group, University of Marburg, Marburg, Germany
|
2
|
Wierschin T, Wang K, Welter M, Waack S, Stanke M. Combining features in a graphical model to predict protein binding sites. Proteins 2015; 83:844-52. [PMID: 25663045 DOI: 10.1002/prot.24775] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 07/15/2014] [Revised: 01/16/2015] [Accepted: 01/26/2015] [Indexed: 11/08/2022]
Abstract
Large efforts have been made in classifying residues as binding sites in proteins using machine learning methods. The prediction task can be translated into the computational challenge of assigning each residue the label binding site or non-binding site. Observational data comes from various possibly highly correlated sources. It includes the structure of the protein but not the structure of the complex. The model class of conditional random fields (CRFs) has previously successfully been used for protein binding site prediction. Here, a new CRF-approach is presented that models the dependencies of residues using a general graphical structure defined as a neighborhood graph and thus our model makes fewer independence assumptions on the labels than sequential labeling approaches. A novel node feature "change in free energy" is introduced into the model, which is then denoted by ΔF-CRF. Parameters are trained with an online large-margin algorithm. Using the standard feature class relative accessible surface area alone, the general graph-structure CRF already achieves higher prediction accuracy than the linear chain CRF of Li et al. ΔF-CRF performs significantly better on a large range of false positive rates than the support-vector-machine-based program PresCont of Zellner et al. on a homodimer set containing 128 chains. ΔF-CRF has a broader scope than PresCont since it is not constrained to protein subgroups and requires no multiple sequence alignment. The improvement is attributed to the advantageous combination of the novel node feature with the standard feature and to the adopted parameter training method.
Affiliation(s)
- Torsten Wierschin
- Institute of Mathematics and Computer Science, University of Greifswald, 17487, Greifswald, Germany
|
3
|
Hindumathi V, Kranthi T, Rao SB, Manimaran P. The prediction of candidate genes for cervix related cancer through gene ontology and graph theoretical approach. Molecular BioSystems 2014; 10:1450-60. [PMID: 24647578 DOI: 10.1039/c4mb00004h] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Indexed: 12/13/2022]
Abstract
With rapidly changing technology, prediction of candidate genes has become an indispensable task in recent years, mainly in the field of biological research. The empirical methods for candidate gene prioritization that help to explore the potential pathway between genetic determinants and complex diseases are highly cumbersome and labor intensive. In such a scenario, predicting potential targets for a disease state through in silico approaches is of interest to researchers. The abundant availability of protein interaction data, coupled with gene annotation, eases the accurate determination of disease-specific candidate genes. In our work we have prioritized the cervix related cancer candidate genes by employing the approach of Csaba Ortutay and co-workers, which identifies candidate genes through graph theoretical centrality measures and gene ontology. Using the human protein interaction data, cervical cancer gene sets and the ontological terms, we were able to predict 15 novel candidates for cervical carcinogenesis. The disease relevance of the predicted candidate genes was corroborated through a literature survey. In addition, the presence of drugs for these candidates was detected through the Therapeutic Target Database (TTD) and DrugMap Central (DMC), which suggests that they may be potential drug targets for cervical cancer.
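The abstract refers to graph theoretical centrality measures without naming them. As a rough illustration only, the sketch below computes two common ones, degree and closeness centrality, on a toy undirected network and ranks the nodes by them. The node names, the network, and the choice of these two measures are assumptions made for the example and are not taken from the cited study.

```python
from collections import deque


def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly connected to."""
    n = len(graph)
    return {node: len(neighbors) / (n - 1) for node, neighbors in graph.items()}


def closeness_centrality(graph):
    """Number of reachable nodes divided by the summed shortest-path distance."""
    scores = {}
    for source in graph:
        # Breadth-first search gives shortest-path lengths in an unweighted graph.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            current = queue.popleft()
            for nxt in graph[current]:
                if nxt not in dist:
                    dist[nxt] = dist[current] + 1
                    queue.append(nxt)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total else 0.0
    return scores


# Hypothetical toy interaction network (undirected, stored as adjacency sets).
toy_network = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A", "E"},
    "E": {"D"},
}

degree = degree_centrality(toy_network)
closeness = closeness_centrality(toy_network)

# Rank nodes by degree first, using closeness to break ties.
ranked = sorted(toy_network, key=lambda g: (degree[g], closeness[g]), reverse=True)
```

In a real prioritization the ranking would be combined with annotation evidence such as gene ontology terms; here it only shows how a centrality score turns a network into an ordered candidate list.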
Affiliation(s)
- V Hindumathi
- C R Rao Advanced Institute of Mathematics, Statistics and Computer Science, University of Hyderabad Campus, Prof. C R Rao Road, Gachibowli, Hyderabad - 500046, India.
|
4
|
Csermely P, Korcsmáros T, Kiss HJM, London G, Nussinov R. Structure and dynamics of molecular networks: a novel paradigm of drug discovery: a comprehensive review. Pharmacol Ther 2013; 138:333-408. [PMID: 23384594 PMCID: PMC3647006 DOI: 10.1016/j.pharmthera.2013.01.016] [Citation(s) in RCA: 512] [Impact Index Per Article: 46.5] [Received: 01/22/2013] [Accepted: 01/22/2013] [Indexed: 02/02/2023]
Abstract
Despite considerable progress in genome- and proteome-based high-throughput screening methods and in rational drug design, the increase in approved drugs in the past decade did not match the increase of drug development costs. Network description and analysis not only give a systems-level understanding of drug action and disease complexity, but can also help to improve the efficiency of drug design. We give a comprehensive assessment of the analytical tools of network topology and dynamics. The state-of-the-art use of chemical similarity, protein structure, protein-protein interaction, signaling, genetic interaction and metabolic networks in the discovery of drug targets is summarized. We propose that network targeting follows two basic strategies. The "central hit strategy" selectively targets central nodes/edges of the flexible networks of infectious agents or cancer cells to kill them. The "network influence strategy" works against other diseases, where an efficient reconfiguration of rigid networks needs to be achieved by targeting the neighbors of central nodes/edges. It is shown how network techniques can help in the identification of single-target, edgetic, multi-target and allo-network drug target candidates. We review the recent boom in network methods helping hit identification, lead selection optimizing drug efficacy, as well as minimizing side-effects and drug toxicity. Successful network-based drug development strategies are shown through the examples of infections, cancer, metabolic diseases, neurodegenerative diseases and aging. Summarizing >1200 references we suggest an optimized protocol of network-aided drug development, and provide a list of systems-level hallmarks of drug quality. Finally, we highlight network-related drug development trends helping to achieve these hallmarks by a cohesive, global approach.
Affiliation(s)
- Peter Csermely
- Department of Medical Chemistry, Semmelweis University, P.O. Box 260, H-1444 Budapest 8, Hungary.
|
5
|
Sugaya N, Kanai S, Furuya T. Dr. PIAS 2.0: an update of a database of predicted druggable protein-protein interactions. Database: The Journal of Biological Databases and Curation 2012; 2012:bas034. [PMID: 23060433 PMCID: PMC3468816 DOI: 10.1093/database/bas034] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 12/26/2022]
Abstract
Druggable Protein–protein Interaction Assessment System (Dr. PIAS) is a database of druggable protein–protein interactions (PPIs) predicted by our support vector machine (SVM)-based method. Since the first publication of this database, Dr. PIAS has been updated to version 2.0. PPI data have been increased considerably, from 71 500 to 83 324 entries. As new positive instances in our method, 4 PPIs and 10 tertiary structures have been added. This addition increases the prediction accuracy of our SVM classifier in comparison with the previous classifier, even though the number of added PPIs and structures is small. We have introduced the novel concept of ‘similar positives’ of druggable PPIs, which will help researchers discover small compounds that can inhibit predicted druggable PPIs. Dr. PIAS will aid the effective search for druggable PPIs from a mine of interactome data being rapidly accumulated. Dr. PIAS 2.0 is available at http://www.drpias.net. Database URL: http://www.drpias.net.
Affiliation(s)
- Nobuyoshi Sugaya
- Drug Discovery Department, Research & Development Division, PharmaDesign, Inc., Hatchobori 2-19-8, Chuo-ku, Tokyo 104-0032, Japan.
|
6
|
Fujimori S, Hirai N, Masuoka K, Oshikubo T, Yamashita T, Washio T, Saito A, Nagasaki M, Miyano S, Miyamoto-Sato E. IRView: a database and viewer for protein interacting regions. Bioinformatics 2012; 28:1949-50. [PMID: 22592381 PMCID: PMC3389773 DOI: 10.1093/bioinformatics/bts289] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 12/02/2022] Open
Abstract
Summary: Protein–protein interactions (PPIs) are mediated through specific regions on proteins. Some proteins have two or more protein interacting regions (IRs) and some IRs are competitively used for interactions with different proteins. IRView currently contains data for 3417 IRs in human and mouse proteins. The data were obtained from different sources and combined with annotated region data from InterPro. Information on non-synonymous single nucleotide polymorphism sites and variable regions owing to alternative mRNA splicing is also included. The IRView web interface displays all IR data, including user-uploaded data, on reference sequences so that the positional relationship between IRs can be easily understood. IRView should be useful for analyzing underlying relationships between the proteins behind the PPI networks. Availability: IRView is publicly available on the web at http://ir.hgc.jp/. Contact: nekoneko@ims.u-tokyo.ac.jp
Affiliation(s)
- Shigeo Fujimori
- Division of Interactome Medical Sciences, Institute of Medical Science, The University of Tokyo, Tokyo 108-8039, Japan
|
7
|
|
8
|
Baths V, Roy U, Singh T. Disruption of cell wall fatty acid biosynthesis in Mycobacterium tuberculosis using a graph theoretic approach. Theor Biol Med Model 2011; 8:5. [PMID: 21453530 PMCID: PMC3087688 DOI: 10.1186/1742-4682-8-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 01/06/2011] [Accepted: 03/31/2011] [Indexed: 12/01/2022] Open
Abstract
Fatty acid biosynthesis of Mycobacterium tuberculosis was analyzed using graph theory and influential (impacting) proteins were identified. The graphs (digraphs) representing this biological network provide information concerning the connectivity of each protein or metabolite in a given pathway, providing an insight into the importance of various components in the pathway, and this can be quantitatively analyzed. Using a graph theoretic algorithm, the most influential set of proteins (sets of {1, 2, 3}, etc.), which when eliminated could cause a significant impact on the biosynthetic pathway, were identified. This set of proteins could serve as drug targets. In the present study, the metabolic network of Mycobacterium tuberculosis was constructed and the fatty acid biosynthesis pathway was analyzed for potential drug targeting. The metabolic network was constructed using the KEGG LIGAND database and subjected to graph theoretical analysis. The nearness index of a protein was used to determine the influence of the said protein on other components in the network, allowing the proteins in a pathway to be ordered according to their nearness indices. A method for identifying the most strategic nodes to target for disrupting the metabolic networks is proposed, aiding the development of new drugs to combat this deadly disease.
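The idea of ranking pathway components by how much their elimination disrupts a directed network can be sketched as follows. This is a minimal illustration under stated assumptions: the abstract does not define the nearness index, so the score used here (the drop in the number of reachable ordered node pairs after a node is removed) is a stand-in chosen for the example, and the pathway and its node names are hypothetical.

```python
def reachable_pairs(graph, removed=frozenset()):
    """Count ordered pairs (u, v), u != v, with a directed path avoiding removed nodes."""
    count = 0
    for source in graph:
        if source in removed:
            continue
        # Depth-first traversal from source, skipping removed nodes.
        seen = {source}
        stack = [source]
        while stack:
            current = stack.pop()
            for nxt in graph[current]:
                if nxt not in removed and nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        count += len(seen) - 1
    return count


def removal_impact(graph):
    """Drop in reachable pairs when each node is removed on its own."""
    baseline = reachable_pairs(graph)
    return {
        node: baseline - reachable_pairs(graph, frozenset([node]))
        for node in graph
    }


# Hypothetical toy pathway as a digraph (adjacency lists), not real KEGG data.
pathway = {
    "substrate": ["enzA"],
    "enzA": ["intermediate"],
    "intermediate": ["enzB", "enzC"],
    "enzB": ["product"],
    "enzC": ["byproduct"],
    "product": [],
    "byproduct": [],
}

impact = removal_impact(pathway)
most_influential = max(impact, key=impact.get)
```

Extending `removed` to sets of two or three nodes would mirror the abstract's search over sets of proteins, at the cost of evaluating every combination.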
Affiliation(s)
- Veeky Baths
- Department of Biological Sciences, Birla Institute of Technology & Science (BITS) Pilani K K BIRLA Goa Campus, Goa 403 726, India.
|
9
|
Dr. PIAS: an integrative system for assessing the druggability of protein-protein interactions. BMC Bioinformatics 2011; 12:50. [PMID: 21303559 PMCID: PMC3228542 DOI: 10.1186/1471-2105-12-50] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.5] [Received: 10/08/2010] [Accepted: 02/09/2011] [Indexed: 01/09/2023] Open
Abstract
Background: The amount of data on protein-protein interactions (PPIs) available in public databases and in the literature has rapidly expanded in recent years. PPI data can provide useful information for researchers in pharmacology and medicine as well as those in interactome studies. There is urgent need for a novel methodology or software allowing the efficient utilization of PPI data in pharmacology and medicine. Results: To address this need, we have developed the 'Druggable Protein-protein Interaction Assessment System' (Dr. PIAS). Dr. PIAS has a meta-database that stores various types of information (tertiary structures, drugs/chemicals, and biological functions associated with PPIs) retrieved from public sources. By integrating this information, Dr. PIAS assesses whether a PPI is druggable as a target for small chemical ligands by using a supervised machine-learning method, support vector machine (SVM). Dr. PIAS holds not only known druggable PPIs but also all PPIs of human, mouse, rat, and human immunodeficiency virus (HIV) proteins identified to date. Conclusions: The design concept of Dr. PIAS is distinct from other published PPI databases in that it focuses on selecting the PPIs most likely to make good drug targets, rather than merely collecting PPI data.
|
10
|
Biochemical network-based drug-target prediction. Curr Opin Biotechnol 2010; 21:511-6. [DOI: 10.1016/j.copbio.2010.05.004] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.9] [Received: 04/07/2010] [Revised: 05/18/2010] [Accepted: 05/21/2010] [Indexed: 01/09/2023]
|
11
|
Abstract
The candidate gene approach is one of the most commonly used methods for identifying genes underlying disease traits. Advances in genomics have greatly contributed to the development of this approach in the past decade. More recently, with the explosion of genomic resources accessible via the public Web, digital candidate gene approach (DigiCGA) has emerged as a new development in this field. DigiCGA, an approach still in its infancy, has already achieved some primary success in cancer gene discovery. However, a detailed discussion concerning the applications of DigiCGA in cancer gene identification has not been addressed. This chapter will focus on discussing DigiCGA in a generalized sense and its applications to the identification of cancer genes, including the cancer gene resources, application status, platform and tools, challenges, and prospects.
|
12
|
Sugaya N, Ikeda K. Assessing the druggability of protein-protein interactions by a supervised machine-learning method. BMC Bioinformatics 2009; 10:263. [PMID: 19703312 PMCID: PMC2739204 DOI: 10.1186/1471-2105-10-263] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.0] [Received: 02/17/2009] [Accepted: 08/25/2009] [Indexed: 02/01/2023] Open
Abstract
Background: Protein-protein interactions (PPIs) are challenging but attractive targets of small molecule drugs for therapeutic interventions of human diseases. In this era of rapid accumulation of PPI data, there is great need for a methodology that can efficiently select drug target PPIs by holistically assessing the druggability of PPIs. To address this need, we propose here a novel approach based on a supervised machine-learning method, support vector machine (SVM). Results: To assess the druggability of the PPIs, 69 attributes were selected to cover a wide range of structural, drug and chemical, and functional information on the PPIs. These attributes were used as feature vectors in the SVM-based method. Thirty PPIs known to be druggable were carefully selected from previous studies; these were used as positive instances. Our approach was applied to 1,295 human PPIs with tertiary structures of their protein complexes already solved. The best SVM model constructed discriminated the already-known target PPIs from others at an accuracy of 81% (sensitivity, 82%; specificity, 79%) in cross-validation. Among the attributes, the two with the greatest discriminative power in the best SVM model were the number of interacting proteins and the number of pathways. Conclusion: Using the model, we predicted several promising candidates for druggable PPIs, such as SMAD4/SKI. As more PPI data are accumulated in the near future, our method will have increased ability to accelerate the discovery of druggable PPIs.
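For readers unfamiliar with the three figures reported for the classifier, the sketch below shows how accuracy, sensitivity and specificity are derived from a binary confusion matrix. The counts are hypothetical values chosen for the example and are not the study's data; the function name is likewise an assumption.

```python
def classification_metrics(tp, fn, tn, fp):
    """Standard binary-classification summaries from confusion-matrix counts."""
    return {
        # Share of true positives (e.g. druggable PPIs) that were recovered.
        "sensitivity": tp / (tp + fn),
        # Share of true negatives that were correctly rejected.
        "specificity": tn / (tn + fp),
        # Share of all instances that were classified correctly.
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Hypothetical counts: 10 positive and 20 negative instances.
metrics = classification_metrics(tp=8, fn=2, tn=15, fp=5)
```

Because accuracy weights the two classes by their size, it can sit close to the specificity when negatives greatly outnumber positives, which is one reason all three values are usually reported together.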
Affiliation(s)
- Nobuyoshi Sugaya
- Drug Discovery Department, Research & Development Division, PharmaDesign, Inc, Chuo-ku, Tokyo, Japan.
|
13
|
Bruijnincx PCA, van Koten G, Klein Gebbink RJM. Mononuclear non-heme iron enzymes with the 2-His-1-carboxylate facial triad: recent developments in enzymology and modeling studies. Chem Soc Rev 2008; 37:2716-44. [DOI: 10.1039/b707179p] [Citation(s) in RCA: 412] [Impact Index Per Article: 25.8] [Indexed: 11/21/2022]
|
14
|
Zhu M, Zhao S. Candidate gene identification approach: progress and challenges. Int J Biol Sci 2007; 3:420-7. [PMID: 17998950 PMCID: PMC2043166 DOI: 10.7150/ijbs.3.420] [Citation(s) in RCA: 182] [Impact Index Per Article: 10.7] [Received: 08/26/2007] [Accepted: 10/24/2007] [Indexed: 11/05/2022] Open
Abstract
Although it has been widely applied in identification of genes responsible for biomedically, economically, or even evolutionarily important complex and quantitative traits, the traditional candidate gene approach is largely limited by its reliance on a priori knowledge about the physiological, biochemical or functional aspects of possible candidates. This limitation results in a fatal information bottleneck, which has apparently become an obstacle for further applications of the traditional candidate gene approach on many occasions. While the identification of candidate genes involved in genetic traits of specific interest remains a challenge, significant progress in this subject has been achieved in the last few years. Several strategies have been developed, or are being developed, to break the barrier of the information bottleneck. Recently, the digital candidate gene approach (DigiCGA) has emerged as a new, developing form of the candidate gene approach and has been primarily applied to identify potential candidate genes in some studies. This review summarizes the progress, application software, online tools, and challenges related to this approach.
Affiliation(s)
- Mengjin Zhu
- Key Laboratory of Agricultural Animal Genetics, Breeding, Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan 430070, PR China
|