1
|
Ghadermarzi S, Li X, Li M, Kurgan L. Sequence-Derived Markers of Drug Targets and Potentially Druggable Human Proteins. Front Genet 2019; 10:1075. [PMID: 31803227 PMCID: PMC6872670 DOI: 10.3389/fgene.2019.01075] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 05/30/2019] [Accepted: 10/09/2019] [Indexed: 12/16/2022] Open
Abstract
Recent research shows that the majority of the druggable human proteome is yet to be annotated and explored. Accurate identification of these unexplored druggable proteins would facilitate development, screening, repurposing, and repositioning of drugs, as well as prediction of new drug–protein interactions. We contrast the current drug targets against the datasets of non-druggable and possibly druggable proteins to formulate markers that could be used to identify druggable proteins. We focus on the markers that can be extracted from protein sequences or names/identifiers to ensure that they can be applied across the entire human proteome. These markers quantify key features covered in the past works (topological features of PPIs, cellular functions, and subcellular locations) and several novel factors (intrinsic disorder, residue-level conservation, alternative splicing isoforms, domains, and sequence-derived solvent accessibility). We find that the possibly druggable proteins have a significantly higher abundance of alternative splicing isoforms, a relatively large number of domains, a higher degree of centrality in the protein-protein interaction networks, and lower numbers of conserved and surface residues, when compared with the non-druggable proteins. We show that the current drug targets and possibly druggable proteins share involvement in the catalytic and signaling functions. However, unlike the drug targets, the possibly druggable proteins participate in the metabolic and biosynthesis processes, are enriched in intrinsic disorder, interact with proteins and nucleic acids, and are localized across the cell. To sum up, we formulate several markers that can help with finding novel druggable human proteins and provide interesting insights into the cellular functions and subcellular locations of the current drug targets and potentially druggable proteins.
Affiliation(s)
- Sina Ghadermarzi
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, United States
- Xingyi Li
- School of Computer Science and Engineering, Central South University, Changsha, China
- Min Li
- School of Computer Science and Engineering, Central South University, Changsha, China
- Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, United States
|
2
|
Hu G, Wang K, Song J, Uversky VN, Kurgan L. Taxonomic Landscape of the Dark Proteomes: Whole-Proteome Scale Interplay Between Structural Darkness, Intrinsic Disorder, and Crystallization Propensity. Proteomics 2018; 18:e1800243. [PMID: 30198635 DOI: 10.1002/pmic.201800243] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Received: 07/04/2018] [Revised: 08/30/2018] [Indexed: 12/14/2022]
Abstract
The growth rate of the protein sequence universe dramatically exceeds the speed of expansion for the protein structure universe, generating an immense dark proteome that includes proteins with unknown structure. A whole-proteome scale analysis of 5.4 million proteins from 987 proteomes in the three domains of life and viruses is performed to systematically dissect the interplay between structural coverage, degree of putative intrinsic disorder, and predicted propensity for structure determination. It has been found that Archaean and Bacterial proteomes have relatively high structural coverage and low amounts of disorder, whereas Eukaryotic and Viral proteomes are characterized by a broad spread of structural coverage and higher disorder levels. The analysis reveals that dark proteomes (i.e., proteomes containing high fractions of proteins with unknown structure) have significantly elevated amounts of intrinsic disorder and are predicted to be difficult to solve structurally. Although the majority of dark proteomes are of viral origin, many dark viral proteomes have at least modest crystallization propensity and only a handful of them are enriched in intrinsic disorder. The disorder, structural coverage, and propensity for structural determination are mapped onto a novel proteome-level sequence similarity network to analyze the interplay of these characteristics in the taxonomic landscape.
Affiliation(s)
- Gang Hu
- School of Mathematical Sciences and LPMC, Nankai University, Tianjin, 300071, P. R. China
- Kui Wang
- School of Mathematical Sciences and LPMC, Nankai University, Tianjin, 300071, P. R. China
- Jiangning Song
- Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia
- Monash Centre for Data Science, Faculty of Information Technology, Monash University, Melbourne, VIC 3800, Australia
- Vladimir N Uversky
- Department of Molecular Medicine and USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, 33612, USA
- Institute for Biological Instrumentation, Russian Academy of Sciences, Pushchino, 142290, Russia
- Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, 23284, USA
|
3
|
Bruni R, Kloss B. High-throughput cloning and expression of integral membrane proteins in Escherichia coli. Curr Protoc Protein Sci 2013; 74:29.6.1-29.6.34. [PMID: 24510647 PMCID: PMC3920300 DOI: 10.1002/0471140864.ps2906s74] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Indexed: 12/28/2022]
Abstract
Recently, several structural genomics centers have been established and a remarkable number of three-dimensional structures of soluble proteins have been solved. For membrane proteins, the number of structures solved has been significantly trailing those for their soluble counterparts, not least because over-expression and purification of membrane proteins is a much more arduous process. By using high-throughput technologies, a large number of membrane protein targets can be screened simultaneously and a greater number of expression and purification conditions can be employed, leading to a higher probability of successfully determining the structure of membrane proteins. This unit describes the cloning, expression, and screening of membrane proteins using high-throughput methodologies developed in the laboratory. Basic Protocol 1 describes cloning of inserts into expression vectors by ligation-independent cloning. Basic Protocol 2 describes the expression and purification of the target proteins on a miniscale. Lastly, for the targets that do express on the miniscale, Basic Protocols 3 and 4 outline the methods employed for the expression and purification of targets on a midi-scale, as well as a procedure for detergent screening and identification of detergent(s) in which the target protein is stable.
Affiliation(s)
- Renato Bruni
- New York Consortium on Membrane Protein Structure (NYCOMPS), New York Structural Biology Center (NYSBC), New York
- Brian Kloss
- New York Consortium on Membrane Protein Structure (NYCOMPS), New York Structural Biology Center (NYSBC), New York
|
4
|
Gao T, Petrlova J, He W, Huser T, Kudlick W, Voss J, Coleman MA. Characterization of de novo synthesized GPCRs supported in nanolipoprotein discs. PLoS One 2012; 7:e44911. [PMID: 23028674 PMCID: PMC3460959 DOI: 10.1371/journal.pone.0044911] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.5] [Received: 05/03/2012] [Accepted: 08/09/2012] [Indexed: 02/05/2023] Open
Abstract
The protein family known as G-protein coupled receptors (GPCRs) comprises an important class of membrane-associated proteins, which remains a difficult family of proteins to characterize because their function requires a native-like lipid membrane environment. This paper focuses on applying a single step method leading to the formation of nanolipoprotein particles (NLPs) capable of solubilizing functional GPCRs for biophysical characterization. NLPs were used to demonstrate increased solubility for multiple GPCRs such as the Neurokinin 1 Receptor (NK1R), the Adrenergic Receptor β2 (ADRB2) and the Dopamine Receptor D1 (DRD1). All three GPCRs showed affinity for their specific ligands using a simple dot blot assay. The NK1R was characterized in greater detail to demonstrate correct folding of the ligand pocket with nanomolar specificity. Electron paramagnetic resonance (EPR) spectroscopy validated the correct folding of the NK1R binding pocket for Substance P (SP). Fluorescence correlation spectroscopy (FCS) was used to identify SP-bound NK1R-containing NLPs and measure their dissociation rate in an aqueous environment. The dissociation constant was found to be 83 nM and was consistent with dot blot assays. This study represents a unique combinational approach involving the single step de novo production of a functional GPCR combined with biophysical techniques to demonstrate receptor association with the NLPs and binding affinity to specific ligands. Such a combined approach provides a novel path forward to screen and characterize GPCRs for drug discovery as well as structural studies outside of the complex cellular environment.
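To give a feel for what a dissociation constant of 83 nM implies, the following is a minimal sketch that assumes a simple one-site equilibrium binding model with ligand in excess. That model is an assumption made for illustration and is not stated in the abstract; the function name `fraction_bound` is hypothetical.

```python
def fraction_bound(ligand_nM: float, kd_nM: float = 83.0) -> float:
    """Fractional receptor occupancy at equilibrium.

    Assumes one-site binding with free ligand approximately equal to
    total ligand: occupancy = [L] / ([L] + Kd).
    The default Kd is the 83 nM value reported in the abstract.
    """
    if ligand_nM < 0 or kd_nM <= 0:
        raise ValueError("ligand must be >= 0 and Kd must be > 0")
    return ligand_nM / (ligand_nM + kd_nM)


if __name__ == "__main__":
    # Occupancy at a few illustrative ligand concentrations (nM).
    for conc in (8.3, 83.0, 830.0):
        print(f"{conc:7.1f} nM -> {fraction_bound(conc):.2f}")
```

Under this model, half of the receptor is occupied when the ligand concentration equals the dissociation constant.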
Affiliation(s)
- Tingjuan Gao
- NSF Center for Biophotonics Science and Technology, University of California Davis Medical Center, Sacramento, California, United States of America
- Department of Biochemistry and Molecular Medicine, University of California Davis Medical Center, Sacramento, California, United States of America
- Jitka Petrlova
- Department of Biochemistry and Molecular Medicine, University of California Davis Medical Center, Sacramento, California, United States of America
- Wei He
- Department of Radiation Oncology, University of California Davis Medical Center, Sacramento, California, United States of America
- Thomas Huser
- NSF Center for Biophotonics Science and Technology, University of California Davis Medical Center, Sacramento, California, United States of America
- Wieslaw Kudlick
- Life Technologies, Carlsbad, California, United States of America
- John Voss
- Department of Biochemistry and Molecular Medicine, University of California Davis Medical Center, Sacramento, California, United States of America
- * E-mail: (JV); (MAC)
- Matthew A. Coleman
- NSF Center for Biophotonics Science and Technology, University of California Davis Medical Center, Sacramento, California, United States of America
- Department of Radiation Oncology, University of California Davis Medical Center, Sacramento, California, United States of America
- Lawrence Livermore National Laboratory, Livermore, California, United States of America
- * E-mail: (JV); (MAC)
|
5
|
Öberg F, Hedfalk K. Recombinant production of the human aquaporins in the yeast Pichia pastoris (Invited Review). Mol Membr Biol 2012; 30:15-31. [DOI: 10.3109/09687688.2012.665503] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Indexed: 11/13/2022]
|
6
|
Öberg F, Sjöhamn J, Conner MT, Bill RM, Hedfalk K. Improving recombinant eukaryotic membrane protein yields in Pichia pastoris: The importance of codon optimization and clone selection. Mol Membr Biol 2011; 28:398-411. [DOI: 10.3109/09687688.2011.602219] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.5] [Indexed: 11/13/2022]
|
7
|
Xiong B, Wu J, Burk DL, Xue M, Jiang H, Shen J. BSSF: a fingerprint based ultrafast binding site similarity search and function analysis server. BMC Bioinformatics 2010; 11:47. [PMID: 20100327 PMCID: PMC3098077 DOI: 10.1186/1471-2105-11-47] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.5] [Received: 07/16/2009] [Accepted: 01/25/2010] [Indexed: 11/17/2022] Open
Abstract
Background: Genome sequencing and post-genomics projects such as structural genomics are extending the frontier of the study of sequence-structure-function relationship of genes and their products. Although many sequence/structure-based methods have been devised with the aim of deciphering this delicate relationship, there still remain large gaps in this fundamental problem, which continuously drives researchers to develop novel methods to extract relevant information from sequences and structures and to infer the functions of newly identified genes by genomics technology. Results: Here we present an ultrafast method, named BSSF (Binding Site Similarity & Function), which enables researchers to conduct similarity searches in a comprehensive three-dimensional binding site database extracted from PDB structures. This method utilizes a fingerprint representation of the binding site and a validated statistical Z-score function scheme to judge the similarity between the query and database items, even if their similarities are only constrained in a sub-pocket. This fingerprint based similarity measurement was also validated on a known binding site dataset by comparing with geometric hashing, which is a standard 3D similarity method. The comparison clearly demonstrated the utility of this ultrafast method. After conducting the database searching, the hit list is further analyzed to provide basic statistical information about the occurrences of Gene Ontology terms and Enzyme Commission numbers, which may benefit researchers by helping them to design further experiments to study the query proteins. Conclusions: This ultrafast web-based system will not only help researchers interested in drug design and structural genomics to identify similar binding sites, but also assist them by providing further analysis of hit list from database searching.
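The general idea of scoring fingerprint similarity against a database and normalizing the scores as Z-scores can be sketched as follows. This is an illustration only, not the BSSF implementation: the abstract does not specify the fingerprint encoding or the scoring function, so the Tanimoto coefficient on sets of "on" bits, the function names, and the toy data are all assumptions.

```python
from statistics import mean, pstdev


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient between two fingerprints given as sets of on-bits.

    Assumption: used here as a stand-in similarity measure.
    """
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def z_scores(query: frozenset, database: dict) -> dict:
    """Score the query against every database entry, then convert each raw
    similarity into a Z-score relative to the whole score distribution."""
    raw = {name: tanimoto(query, fp) for name, fp in database.items()}
    mu = mean(raw.values())
    sigma = pstdev(raw.values())
    if sigma == 0:
        # All entries scored identically, so no hit stands out.
        return {name: 0.0 for name in raw}
    return {name: (score - mu) / sigma for name, score in raw.items()}


if __name__ == "__main__":
    # Hypothetical toy database of binding-site fingerprints.
    db = {
        "site_a": frozenset({1, 2, 3}),
        "site_b": frozenset({7, 8}),
        "site_c": frozenset({9}),
    }
    hits = z_scores(frozenset({1, 2, 3}), db)
    for name, z in sorted(hits.items(), key=lambda kv: kv[1], reverse=True):
        print(name, round(z, 3))
```

Ranking by Z-score rather than raw similarity expresses each hit relative to the background distribution of scores, which is the role the abstract assigns to its statistical scoring scheme.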
Affiliation(s)
- Bing Xiong
- State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Zhangjiang Hi-Tech Park, Pudong, Shanghai, 201203, PR China.
|
8
|
Zhu F, Han B, Kumar P, Liu X, Ma X, Wei X, Huang L, Guo Y, Han L, Zheng C, Chen Y. Update of TTD: Therapeutic Target Database. Nucleic Acids Res 2009; 38:D787-91. [PMID: 19933260 PMCID: PMC2808971 DOI: 10.1093/nar/gkp1014] [Citation(s) in RCA: 200] [Impact Index Per Article: 13.3] [Indexed: 01/03/2023] Open
Abstract
Increasing numbers of proteins, nucleic acids and other molecular entities have been explored as therapeutic targets, hundreds of which are targets of approved and clinical trial drugs. Knowledge of these targets and corresponding drugs, particularly those in clinical uses and trials, is highly useful for facilitating drug discovery. Therapeutic Target Database (TTD) has been developed to provide information about therapeutic targets and corresponding drugs. In order to accommodate increasing demand for comprehensive knowledge about the primary targets of the approved, clinical trial and experimental drugs, numerous improvements and updates have been made to TTD. These updates include information about 348 successful, 292 clinical trial and 1254 research targets, 1514 approved, 1212 clinical trial and 2302 experimental drugs linked to their primary targets (3382 small molecule and 649 antisense drugs with available structure and sequence), new ways to access data by drug mode of action, recursive search of related targets or drugs, similarity target and drug searching, customized and whole data download, standardized target ID, and significant increase of data (1894 targets, 560 diseases and 5028 drugs compared with the 433 targets, 125 diseases and 809 drugs in the original release described in previous paper). This database can be accessed at http://bidd.nus.edu.sg/group/cjttd/TTD.asp.
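The counts quoted in the abstract are internally consistent, which a short check makes explicit. The sketch below uses only figures from the abstract; the variable names are illustrative.

```python
# Target and drug counts by category, as listed in the abstract.
targets = {"successful": 348, "clinical_trial": 292, "research": 1254}
drugs = {"approved": 1514, "clinical_trial": 1212, "experimental": 2302}

# Totals quoted for the updated release and for the original release.
updated = {"targets": 1894, "diseases": 560, "drugs": 5028}
original = {"targets": 433, "diseases": 125, "drugs": 809}

total_targets = sum(targets.values())
total_drugs = sum(drugs.values())

# The category counts add up to the quoted totals.
assert total_targets == updated["targets"]
assert total_drugs == updated["drugs"]

# Growth of each quantity relative to the original release.
fold_increase = {key: updated[key] / original[key] for key in updated}

if __name__ == "__main__":
    for key, fold in fold_increase.items():
        print(f"{key}: {fold:.1f}x")
```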
Affiliation(s)
- Feng Zhu
- Department of Pharmacy and Computation and Systems Biology, Center for Computational Science and Engineering, Singapore-MIT Alliance, National University of Singapore, Singapore
|
9
|
Sim DW, Lee YS, Kim JH, Seo MD, Lee BJ, Won HS. HP0902 from Helicobacter pylori is a thermostable, dimeric protein belonging to an all-β topology of the cupin superfamily. BMB Rep 2009; 42:387-92. [PMID: 19558799 DOI: 10.5483/bmbrep.2009.42.6.387] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/20/2022] Open
Affiliation(s)
- Dae-Won Sim
- Department of Biotechnology, College of Biomedical and Health Science, Konkuk University, Chungju, 380-701, Korea
|
10
|
Abstract
A protocol for ligation-dependent cloning using the Flexi Vector method in a 96-well format is described. The complete protocol includes PCR amplification of the desired gene to append Flexi Vector cloning sequences, restriction digestion of the PCR products, ligation of the digested PCR products into a similarly digested acceptor vector, transformation and growth of host cells, analysis of the transformed clones, and storage of a sequence-verified clone. The protocol also includes transfer of the sequence-verified clones into another Flexi Vector plasmid backbone. Smaller numbers of cloning reactions can be undertaken by appropriate scaling of the indicated reaction volumes.
|
11
|
Kinoshita K, Murakami Y, Nakamura H. eF-seek: prediction of the functional sites of proteins by searching for similar electrostatic potential and molecular surface shape. Nucleic Acids Res 2007; 35:W398-402. [PMID: 17567616 PMCID: PMC1933152 DOI: 10.1093/nar/gkm351] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.0] [Indexed: 11/14/2022] Open
Abstract
We have developed a method to predict ligand-binding sites in a new protein structure by searching for similar binding sites in the Protein Data Bank (PDB). The similarities are measured according to the shapes of the molecular surfaces and their electrostatic potentials. A new web server, eF-seek, provides an interface to our search method. It simply requires a coordinate file in the PDB format, and generates a prediction result as a virtual complex structure, with the putative ligands in a PDB format file as the output. In addition, the predicted interacting interface is displayed to facilitate the examination of the virtual complex structure on our own applet viewer with the web browser (URL: http://eF-site.hgc.jp/eF-seek).
Affiliation(s)
- Kengo Kinoshita
- Institute of Medical Science, University of Tokyo, 4-6-1 Shirokanedai, Minatoku, Tokyo, 108-8639, Japan.
|