1
|
Abstract
All proteins end with a carboxyl terminus that has unique biophysical properties and is often disordered. Although there are examples of important C-terminal functions, a more global role for the C-terminus is not yet established. In this review, we summarize research on C-termini, a unique region in proteins that cells exploit. Alternative splicing and proteolysis increase the diversity of proteins and peptides in cells with unique C-termini. The C-termini of proteins contain minimotifs, short peptides with an encoded function generally characterized as binding, posttranslational modifications, and trafficking. Many of these activities are specific to minimotifs on the C-terminus. Approximately 13% of C-termini in the human proteome have a known minimotif, and the majority, if not all, of the remaining termini have conserved motifs, implying a function that remains to be discovered. C-termini, their predictions, and their functions are collated in the C-terminome, Proteus, and Terminus Oriented Protein Function INferred Database (TopFIND) database/web systems. Many C-termini are well conserved, and some have a known role in health and disease. We envision that this summary of C-termini will guide future investigation of their biochemical and physiological significance.
Affiliation(s)
- Surbhi Sharma, Martin R Schiller: Nevada Institute of Personalized Medicine and School of Life Sciences, University of Nevada, Las Vegas, NV, USA
|
2
|
Sharma S, Toledo O, Hedden M, Lyon KF, Brooks SB, David RP, Limtong J, Newsome JM, Novakovic N, Rajasekaran S, Thapar V, Williams SR, Schiller MR. The Functional Human C-Terminome. PLoS One 2016; 11:e0152731. [PMID: 27050421 PMCID: PMC4822787 DOI: 10.1371/journal.pone.0152731] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 11/04/2015] [Accepted: 03/18/2016] [Indexed: 11/24/2022]
Abstract
All translated proteins end with a carboxylic acid commonly called the C-terminus. Many short functional sequences (minimotifs) are located on or immediately proximal to the C-terminus. However, information about the function of protein C-termini has not been consolidated into a single source. Here, we built a new "C-terminome" database and web system focused on human proteins. Approximately 3,600 C-termini in the human proteome have a minimotif with an established molecular function. To help evaluate the function of the remaining C-termini in the human proteome, we inferred minimotifs identified by experimentation in rodent cells, predicted minimotifs based upon consensus sequence matches, and predicted novel highly repetitive sequences in C-termini. Predictions can be ranked by enrichment scores or Gene Evolutionary Rate Profiling (GERP) scores, a measurement of evolutionary constraint. By searching the last 10 amino acids of proteins in the human proteome for new anchored sequences of 3-10 residues with up to 5 degenerate positions in the consensus sequences, we have identified new consensus sequences that predict instances in the majority of human genes. All of this information is consolidated into a database that can be accessed through a C-terminome web system with search and browse functions for minimotifs and human proteins. A known consensus sequence-based predicted function is assigned to nearly half the proteins in the human proteome. Weblink: http://cterminome.bio-toolkit.com.
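The anchored consensus search described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the use of a lowercase `x` for a degenerate position, and the toy sequences are all assumptions made for illustration. The sketch only enforces the limits stated in the abstract (consensus length of 3-10 residues, at most 5 degenerate positions, matching within the last 10 residues, anchored at the C-terminus).

```python
import re
from typing import Dict, List


def consensus_to_regex(consensus: str) -> "re.Pattern":
    """Compile a consensus such as 'TxV' into a regex anchored at the C-terminus.

    Assumption: a lowercase 'x' marks a degenerate position (any residue).
    """
    if not 3 <= len(consensus) <= 10:
        raise ValueError("consensus must be 3-10 residues long")
    if consensus.count("x") > 5:
        raise ValueError("at most 5 degenerate positions are allowed")
    body = "".join("[A-Z]" if ch == "x" else re.escape(ch) for ch in consensus)
    # '$' anchors the match to the final residue of the protein.
    return re.compile(body + "$")


def cterminal_matches(proteins: Dict[str, str], consensus: str,
                      window: int = 10) -> List[str]:
    """Return the names of proteins whose last `window` residues end with the consensus."""
    pattern = consensus_to_regex(consensus)
    return [name for name, seq in proteins.items()
            if pattern.search(seq[-window:])]


# Hypothetical toy sequences, not real human proteins.
toy_proteins = {
    "P1": "MKTAYIAKQRETSV",
    "P2": "MKTAYIAKQRETSVA",
    "P3": "MTKV",
}
hits = cterminal_matches(toy_proteins, "TxV")
```

Anchoring with `$` is the design choice that separates a C-terminal motif from an internal one: `P2` contains `TSV` but not at its final position, so it is not reported.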
Affiliation(s)
- Surbhi Sharma, Oniel Toledo, Michael Hedden, Kenneth F. Lyon, Steven B. Brooks, Roxanne P. David, Justin Limtong, Jacklyn M. Newsome, Nemanja Novakovic, Sean R. Williams, Martin R. Schiller: Nevada Institute of Personalized Medicine, and School of Life Sciences, University of Nevada, Las Vegas, Nevada, United States of America
- Sanguthevar Rajasekaran: Department of Computer Science and Engineering, University of Connecticut, Storrs, Connecticut 06269–2155, United States of America
- Vishal Thapar: Department of Pathology, Massachusetts General Hospital, Boston, Massachusetts 02114, United States of America
|
3
|
Kubrycht J, Sigler K, Souček P, Hudeček J. Structures composing protein domains. Biochimie 2013; 95:1511-24. [DOI: 10.1016/j.biochi.2013.04.001] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 01/22/2013] [Accepted: 04/02/2013] [Indexed: 12/21/2022]
|
4
|
Open source drug discovery--a new paradigm of collaborative research in tuberculosis drug development. Tuberculosis (Edinb) 2011; 91:479-86. [PMID: 21782516 DOI: 10.1016/j.tube.2011.06.004] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Received: 10/22/2010] [Revised: 05/11/2011] [Accepted: 06/12/2011] [Indexed: 11/23/2022]
Abstract
It is being realized that the traditional closed-door and market-driven approaches to drug discovery may not be the best-suited model for the diseases of the developing world such as tuberculosis and malaria, because most patients suffering from these diseases have poor paying capacity. To ensure that new drugs are created for patients suffering from these diseases, it is necessary to formulate an alternative paradigm for the drug discovery process. The current model, constrained by limits on collaboration and by confidentiality requirements on the sharing of resources, hampers the opportunities for bringing in expertise from diverse fields. These limitations hinder the possibilities of lowering the cost of drug discovery. The Open Source Drug Discovery project initiated by the Council of Scientific and Industrial Research, India, has adopted an open source model to power wide participation across geographical borders. Open Source Drug Discovery emphasizes integrative science through collaboration, open sharing, taking up multi-faceted approaches, and accruing benefits from advances on different fronts of new drug discovery. Because the open source model is based on community participation, it has the potential to self-sustain continuous development by generating a storehouse of alternatives for the continued pursuit of new drug discovery. Since the inventions are community generated, the new chemical entities developed by Open Source Drug Discovery will be taken up for clinical trial in a non-exclusive manner through the participation of multiple companies, with majority funding from Open Source Drug Discovery. This will ensure availability of drugs through a lower-cost, community-driven drug discovery process for diseases afflicting people with poor paying capacity. Hopefully, what LINUX and the World Wide Web have done for information technology, Open Source Drug Discovery will do for drug discovery.
|
5
|
Taneja B, Yadav J, Chakraborty TK, Brahmachari SK. An Indian effort towards affordable drugs: “Generic to designer drugs”. Biotechnol J 2009; 4:348-60. [DOI: 10.1002/biot.200900031] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 01/01/2023]
|
6
|
Rajasekaran S, Balla S, Gradie P, Gryk MR, Kadaveru K, Kundeti V, Maciejewski MW, Mi T, Rubino N, Vyas J, Schiller MR. Minimotif miner 2nd release: a database and web system for motif search. Nucleic Acids Res 2009; 37:D185-90. [PMID: 18978024 PMCID: PMC2686579 DOI: 10.1093/nar/gkn865] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.6] [Received: 08/15/2008] [Accepted: 10/16/2008] [Indexed: 11/24/2022]
Abstract
Minimotif Miner (MnM) consists of a minimotif database and a web-based application that enables prediction of motif-based functions in user-supplied protein queries. We have revised MnM by expanding the database more than 10-fold to approximately 5000 motifs and standardized the motif function definitions. The web-application user interface has been redeveloped with new features including improved navigation, screencast-driven help, support for alias names and expanded SNP analysis. A sample analysis of prion shows how MnM 2 can be used. Weblink: http://mnm.engr.uconn.edu, weblink for version 1 is http://sms.engr.uconn.edu.
Affiliation(s)
- Sanguthevar Rajasekaran, Sudha Balla, Patrick Gradie, Michael R. Gryk, Krishna Kadaveru, Vamsi Kundeti, Mark W. Maciejewski, Tian Mi, Nicholas Rubino, Jay Vyas, Martin R. Schiller: Department of Computer Science and Engineering, University of Connecticut, Storrs, CT 06029-2155, Department of Molecular, Microbial, and Structural Biology, Biological System Modeling Group, University of Connecticut Health Center, 263 Farmington Ave. Farmington, CT 06030-3305 and Memorial Sloan-Kettering Cancer Center, NY 10021, USA
|
7
|
Prakash T, Sandhu KS, Singh NK, Bhasin Y, Ramakrishnan C, Brahmachari SK. Structural assessment of glycyl mutations in invariantly conserved motifs. Proteins 2007; 69:617-32. [PMID: 17623846 DOI: 10.1002/prot.21488] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Abstract
Motifs that are evolutionarily conserved in proteins are crucial to their structure and function. In one of our earlier studies, we demonstrated that the conserved motifs occurring invariantly across several organisms could act as structural determinants of the proteins. We observed the abundance of glycyl residues in these invariantly conserved motifs. The role of glycyl residues in highly conserved motifs has not been studied extensively. Thus, it would be interesting to examine the structural perturbations induced by mutation in these conserved glycyl sites. In this work, we selected a representative set of invariant signature (IS) peptides for which both the PDB structure and mutation information was available. We thoroughly analyzed the conformational features of the glycyl sites and their local interactions with the surrounding residues. Using Ramachandran angles, we showed that the glycyl residues occurring in these IS peptides, which have undergone mutation, occurred more often in the L-disallowed as compared with the L-allowed region of the Ramachandran plot. Short range contacts around the mutation site were analyzed to study the steric effects. With the results obtained from our analysis, we hypothesize that any change of activity arising because of such mutations must be attributed to the long-range interaction(s) of the new residue if the glycyl residue in the IS peptide occurred in the L-allowed region of the Ramachandran plot. However, the mutation of those conserved glycyl residues that occurred in the L-disallowed region of the Ramachandran plot might lead to an altered activity of the protein as a result of an altered conformation of the backbone in the immediate vicinity of the glycyl residue, in addition to long range effects arising from the long side chains of the new residue. 
Thus, the loss of activity because of mutation in the conserved glycyl site might either relate to long range interactions or to local perturbations around the site depending upon the conformational preference of the glycyl residue.
Affiliation(s)
- Tulika Prakash: G. N. Ramachandran Knowledge Center for Genome Informatics, Institute of Genomics and Integrative Biology, Delhi 110007, India
|
8
|
Sobolevsky Y, Trifonov EN. Protein Modules Conserved Since LUCA. J Mol Evol 2006; 63:622-34. [PMID: 17075700 DOI: 10.1007/s00239-005-0190-4] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.3] [Received: 08/02/2005] [Accepted: 12/02/2005] [Indexed: 11/28/2022]
Abstract
A universal scale of sequence conservation has recently been introduced, based on the omnipresence of protein sequence motifs across species. A large spectrum of short sequences, up to eight residues long, has been found to reside in all or almost all prokaryotic organisms. This discovery introduces a fundamentally new quantitative approach to the problem of reconstructing the last universal common ancestor (LUCA). The most conserved elements (protein modules) with defined structures and sequences harboring the omnipresent motifs are outlined in this work by combining sequence and protein crystal structure data. The structurally conserved modules involve 25-30 amino acid residues and have the appearance of closed loops (loop-n-lock structures). This confirms earlier conclusions on the loop-fold structure of globular proteins. Many of the topmost conserved modules represent the primary closed-loop prototypes that have been derived by whole-genome sequence searches. The data presented thus provide a basis for further developments toward understanding the earliest stages of protein evolution.
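The notion of an "omnipresent" motif can be made concrete with a short Python sketch. This is an illustrative assumption about how such a count could be done, not the procedure used in the paper: the function names, the `min_fraction` parameter, and the toy proteomes are hypothetical. Each proteome contributes at most one vote per k-mer, and a k-mer is kept when it occurs in at least the requested fraction of proteomes.

```python
from collections import Counter
from typing import Dict, List, Set


def kmers(seq: str, k: int) -> Set[str]:
    """All distinct length-k substrings of a sequence (empty if the sequence is shorter than k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def omnipresent_kmers(proteomes: Dict[str, List[str]], k: int,
                      min_fraction: float = 1.0) -> Set[str]:
    """Return k-mers found in at least `min_fraction` of the proteomes."""
    counts: Counter = Counter()
    for proteins in proteomes.values():
        seen: Set[str] = set()
        for seq in proteins:
            seen |= kmers(seq, k)
        counts.update(seen)  # one vote per proteome, however often the k-mer occurs
    threshold = min_fraction * len(proteomes)
    return {motif for motif, n in counts.items() if n >= threshold}


# Hypothetical toy proteomes (organism name -> protein sequences).
toy_proteomes = {
    "A": ["MGKSTLA", "PQR"],
    "B": ["AAGKSTW"],
    "C": ["GKSTGKS"],
}
shared_4mers = omnipresent_kmers(toy_proteomes, k=4)
shared_3mers = omnipresent_kmers(toy_proteomes, k=3)
```

Lowering `min_fraction` below 1.0 corresponds to the "all or almost all" wording in the abstract, since a motif may be missing from a few organisms and still be retained.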
Affiliation(s)
- Yehoshua Sobolevsky: Genome Diversity Center, Institute of Evolution, University of Haifa, Haifa 31905, Israel
|
9
|
Pugalenthi G, Bhaduri A, Sowdhamini R. iMOTdb--a comprehensive collection of spatially interacting motifs in proteins. Nucleic Acids Res 2006; 34:D285-6. [PMID: 16381866 PMCID: PMC1347487 DOI: 10.1093/nar/gkj125] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/13/2022]
Abstract
Identification of the conserved residues that represent a protein family is crucial for a clearer understanding of biological function as well as for better recognition of additional members in sequence databases. Functionally important residues are recognized well due to their high degree of conservation in closely related sequences and are annotated in functional motif databases. Structural motifs are central to the integrity of the fold and require careful analysis for their identification. We report the availability of a database of spatially interacting motifs in single protein structures as well as those among distantly related protein structures that belong to a superfamily. Spatial interactions amongst conserved motifs are automatically measured using sequence similarity scores and distance calculations. Interactions between pairs of conserved motifs are described in the form of pseudoenergies. The iMOTdb database provides information for 854,488 motifs corresponding to 60,849 protein structural domains and 22,648 protein structural entries.
Affiliation(s)
- R. Sowdhamini: To whom correspondence should be addressed. Tel: +91 80 23636421; Fax: +91 80 23636462
|
10
|
Prakash T, Ramakrishnan C, Dash D, Brahmachari SK. Conformational Analysis of Invariant Peptide Sequences in Bacterial Genomes. J Mol Biol 2005; 345:937-55. [PMID: 15644196 DOI: 10.1016/j.jmb.2004.11.008] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.6] [Received: 06/09/2004] [Revised: 10/26/2004] [Accepted: 11/05/2004] [Indexed: 10/26/2022]
Abstract
The functional significance of evolutionarily conserved motifs/patterns of short regions in proteins is well documented. Although a large number of sequences are conserved, only a small fraction of these are invariant across several organisms. Here, we have examined the structural features of the functionally important peptide sequences, which have been found invariant across diverse bacterial genera. Ramachandran angles (phi,psi) have been used to analyze the conformation, folding patterns and geometrical location (buried/exposed) of these invariant peptides in different crystal structures harboring these sequences. The analysis indicates that the peptides preferred a single conformation in different protein structures, with the exception of only a few longer peptides that exhibited some conformational variability. In addition, it is noticed that the variability of conformation occurs mainly due to flipping of peptide units about the virtual C(alpha)...C(alpha) bond. However, for a given invariant peptide, the folding patterns are found to be similar in almost all the cases. Over and above, such peptides are found to be buried in the protein core. Thus, we can safely conclude that these invariant peptides are structurally important for the proteins, since they acquire unique structures across different proteins and can act as structural determinants (SD) of the proteins. The location of these SD peptides on the protein chain indicated that most of them are clustered towards the N-terminal and middle region of the protein with the C-terminal region exhibiting low preference. Another feature that emerges out of this study is that some of these SD peptides can also play the roles of "fold boundaries" or "hinge nucleus" in the protein structure. The study indicates that these SD peptides may act as chain-reversal signatures, guiding the proteins to adopt appropriate folds. 
In some cases the invariant signature peptides may also act as folding nuclei (FN) of the proteins.
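For readers unfamiliar with Ramachandran angles, the following Python sketch shows how phi and psi can be computed from backbone atom coordinates as dihedral angles. It is a generic textbook calculation, not code from the study; the function names and the toy coordinates are assumptions. Phi is the dihedral over C(i-1), N(i), CA(i), C(i), and psi is the dihedral over N(i), CA(i), C(i), N(i+1).

```python
import math
from typing import Tuple

Vec = Tuple[float, float, float]


def _sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a: Vec, b: Vec) -> Vec:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0: Vec, p1: Vec, p2: Vec, p3: Vec) -> float:
    """Signed dihedral angle in degrees, in (-180, 180], about the p1-p2 bond."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)  # normals of the two planes
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def phi_psi(c_prev: Vec, n: Vec, ca: Vec, c: Vec, n_next: Vec) -> Tuple[float, float]:
    """Backbone (phi, psi) of one residue from five consecutive backbone atoms."""
    return dihedral(c_prev, n, ca, c), dihedral(n, ca, c, n_next)


# Toy coordinates chosen so the expected angles are easy to check by hand.
quarter_turn = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
trans = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
```

In practice the coordinates would come from a crystal structure file, and the resulting (phi, psi) pairs would be placed on a Ramachandran plot to compare the conformation of the same peptide across different proteins.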
Affiliation(s)
- Tulika Prakash: G.N.R. Knowledge Centre for Genome Informatics, Institute of Genomics and Integrative Biology, CSIR, Mall Road, Delhi 110007, India
|