1
Moeckel C, Mareboina M, Konnaris MA, Chan CS, Mouratidis I, Montgomery A, Chantzi N, Pavlopoulos GA, Georgakopoulos-Soares I. A survey of k-mer methods and applications in bioinformatics. Comput Struct Biotechnol J 2024; 23:2289-2303. [PMID: 38840832 PMCID: PMC11152613 DOI: 10.1016/j.csbj.2024.05.025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2024] [Revised: 05/14/2024] [Accepted: 05/15/2024] [Indexed: 06/07/2024] Open
Abstract
The rapid progression of genomics and proteomics has been driven by the advent of advanced sequencing technologies, large, diverse, and readily available omics datasets, and the evolution of computational data processing capabilities. The vast amount of data generated by these advancements necessitates efficient algorithms to extract meaningful information. K-mers serve as a valuable tool when working with large sequencing datasets, offering several advantages in computational speed and memory efficiency and carrying the potential for intrinsic biological functionality. This review provides an overview of the methods, applications, and significance of k-mers in genomic and proteomic data analyses, as well as the utility of absent sequences, including nullomers and nullpeptides, in disease detection, vaccine development, therapeutics, and forensic science. Therefore, the review highlights the pivotal role of k-mers in addressing current genomic and proteomic problems and underscores their potential for future breakthroughs in research.
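As a minimal illustration of the k-mer counting that underlies the methods surveyed in this review, the sketch below slides a window of length k over a sequence and tallies each substring. The function name `count_kmers` and the toy sequence are illustrative assumptions, not taken from the article.

```python
from collections import Counter


def count_kmers(sequence: str, k: int) -> Counter:
    """Count every overlapping substring of length k in `sequence`."""
    if k <= 0:
        raise ValueError("k must be positive")
    seq = sequence.upper()
    # A sequence of length n contains n - k + 1 overlapping k-mers.
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


# Toy input (hypothetical): the 3-mers are ACG, CGT, GTA, TAC, ACG, CGT.
counts = count_kmers("ACGTACGT", 3)
```

Production tools typically replace the in-memory `Counter` with compact or probabilistic structures to handle genome-scale data; the sketch only shows the basic idea.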
Affiliation(s)
- Camille Moeckel
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Manvita Mareboina
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Maxwell A. Konnaris
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Candace S.Y. Chan
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
- Ioannis Mouratidis
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Huck Institute of the Life Sciences, Penn State University, University Park, Pennsylvania, USA
- Austin Montgomery
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Nikol Chantzi
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Ilias Georgakopoulos-Soares
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Huck Institute of the Life Sciences, Penn State University, University Park, Pennsylvania, USA
2
Fasano C, Lepore Signorile M, Di Nicola E, Pantaleo A, Forte G, De Marco K, Sanese P, Disciglio V, Grossi V, Simone C. The chromatin remodeling factors EP300 and TRRAP are novel SMYD3 interactors involved in the emerging 'nonmutational epigenetic reprogramming' cancer hallmark. Comput Struct Biotechnol J 2023; 21:5240-5248. [PMID: 37954147 PMCID: PMC10632561 DOI: 10.1016/j.csbj.2023.10.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2023] [Revised: 09/25/2023] [Accepted: 10/10/2023] [Indexed: 11/14/2023] Open
Abstract
SMYD3 is a histone-lysine N-methyltransferase involved in several oncogenic processes and is believed to play a major role in various cancer hallmarks. Recently, we identified ATM, BRCA2, CHK2, MTOR, BLM, MET, AMPK, and p130 as direct SMYD3 interactors by taking advantage of a library of rare tripeptides, which we first tested for their in vitro binding affinity to SMYD3 and then used as in silico probes to systematically search the human proteome. Here, we used this innovative approach to identify further SMYD3-interacting proteins involved in crucial cancer pathways and found that the chromatin remodeling factors EP300 and TRRAP interact directly with SMYD3, thus linking SMYD3 to the emerging 'nonmutational epigenetic reprogramming' cancer hallmark. Of note, we validated these interactions in gastrointestinal cancer cell lines, including HCT-116 cells, which harbor a C-terminal truncating mutation in EP300, suggesting that EP300 binds to SMYD3 via its N-terminal region. While additional studies are required to ascertain the functional mechanisms underlying these interactions and their significance, the identification of two novel SMYD3 interactors involved in epigenetic cancer hallmark pathways adds important pieces to the puzzle of how SMYD3 exerts its oncogenic role.
Affiliation(s)
- Candida Fasano
- Medical Genetics, National Institute of Gastroenterology - IRCCS “Saverio de Bellis” Research Hospital, Castellana Grotte, 70013 Bari, Italy
- Martina Lepore Signorile
- Medical Genetics, National Institute of Gastroenterology - IRCCS “Saverio de Bellis” Research Hospital, Castellana Grotte, 70013 Bari, Italy
- Elisabetta Di Nicola
- Medical Genetics, National Institute of Gastroenterology - IRCCS “Saverio de Bellis” Research Hospital, Castellana Grotte, 70013 Bari, Italy
- Antonino Pantaleo
- Medical Genetics, National Institute of Gastroenterology - IRCCS “Saverio de Bellis” Research Hospital, Castellana Grotte, 70013 Bari, Italy
- Giovanna Forte
- Medical Genetics, National Institute of Gastroenterology - IRCCS “Saverio de Bellis” Research Hospital, Castellana Grotte, 70013 Bari, Italy
- Katia De Marco
- Medical Genetics, National Institute of Gastroenterology - IRCCS “Saverio de Bellis” Research Hospital, Castellana Grotte, 70013 Bari, Italy
- Paola Sanese
- Medical Genetics, National Institute of Gastroenterology - IRCCS “Saverio de Bellis” Research Hospital, Castellana Grotte, 70013 Bari, Italy
- Vittoria Disciglio
- Medical Genetics, National Institute of Gastroenterology - IRCCS “Saverio de Bellis” Research Hospital, Castellana Grotte, 70013 Bari, Italy
- Valentina Grossi
- Medical Genetics, National Institute of Gastroenterology - IRCCS “Saverio de Bellis” Research Hospital, Castellana Grotte, 70013 Bari, Italy
- Cristiano Simone
- Medical Genetics, National Institute of Gastroenterology - IRCCS “Saverio de Bellis” Research Hospital, Castellana Grotte, 70013 Bari, Italy
- Medical Genetics, Department of Precision and Regenerative Medicine and Jonic Area (DiMePRe-J), University of Bari Aldo Moro, 70124 Bari, Italy
3
Short Linear Motifs in Colorectal Cancer Interactome and Tumorigenesis. Cells 2022; 11:cells11233739. [PMID: 36496998 PMCID: PMC9737320 DOI: 10.3390/cells11233739] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2022] [Revised: 11/16/2022] [Accepted: 11/21/2022] [Indexed: 11/25/2022] Open
Abstract
Colorectal tumorigenesis is driven by alterations in genes and proteins responsible for cancer initiation, progression, and invasion. This multistage process is based on a dense network of protein-protein interactions (PPIs) that become dysregulated as a result of changes in various cell signaling effectors. PPIs in signaling and regulatory networks are known to be mediated by short linear motifs (SLiMs), which are conserved contiguous regions of 3-10 amino acids within interacting protein domains. SLiMs are the minimum sequences required for modulating cellular PPI networks. Thus, several in silico approaches have been developed to predict and analyze SLiM-mediated PPIs. In this review, we focus on emerging evidence supporting a crucial role for SLiMs in driver pathways that are disrupted in colorectal cancer (CRC) tumorigenesis and related PPI network alterations. As a result, SLiMs, along with short peptides, are attracting the interest of researchers to devise small molecules amenable to be used as novel anti-CRC targeted therapies. Overall, the characterization of SLiMs mediating crucial PPIs in CRC may foster the development of more specific combined pharmacological approaches.
4
Fasano C, Lepore Signorile M, De Marco K, Forte G, Sanese P, Grossi V, Simone C. Identifying novel SMYD3 interactors on the trail of cancer hallmarks. Comput Struct Biotechnol J 2022; 20:1860-1875. [PMID: 35495117 PMCID: PMC9039736 DOI: 10.1016/j.csbj.2022.03.037] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/08/2021] [Revised: 03/30/2022] [Accepted: 03/31/2022] [Indexed: 12/30/2022] Open
Abstract
SMYD3 overexpression in several human cancers highlights its crucial role in carcinogenesis. Nonetheless, SMYD3 specific activity in cancer development and progression is currently under debate. Taking advantage of a library of rare tripeptides, which we first tested for their in vitro binding affinity to SMYD3 and then used as in silico probes, we recently identified BRCA2, ATM, and CHK2 as direct SMYD3 interactors. To gain insight into novel SMYD3 cancer-related roles, here we performed a comprehensive in silico analysis to cluster all potential SMYD3-interacting proteins identified by screening the human proteome for the previously tested tripeptides, based on their involvement in cancer hallmarks. Remarkably, we identified mTOR, BLM, MET, AMPK, and p130 as new SMYD3 interactors implicated in cancer processes. Further studies are needed to characterize the functional mechanisms underlying these interactions. Still, these findings could be useful to devise novel therapeutic strategies based on the combined inhibition of SMYD3 and its newly identified molecular partners. Of note, our in silico methodology may be useful to search for unidentified interactors of other proteins of interest.
Affiliation(s)
- Candida Fasano
- Medical Genetics, National Institute for Gastroenterology, IRCCS ‘S. de Bellis’ Research Hospital, Castellana Grotte (Ba), Italy
- Corresponding authors at: Medical Genetics, National Institute for Gastroenterology, IRCCS ‘S. de Bellis’ Research Hospital, Castellana Grotte (Ba), Italy (C. Fasano, C. Simone).
- Martina Lepore Signorile
- Medical Genetics, National Institute for Gastroenterology, IRCCS ‘S. de Bellis’ Research Hospital, Castellana Grotte (Ba), Italy
- Katia De Marco
- Medical Genetics, National Institute for Gastroenterology, IRCCS ‘S. de Bellis’ Research Hospital, Castellana Grotte (Ba), Italy
- Giovanna Forte
- Medical Genetics, National Institute for Gastroenterology, IRCCS ‘S. de Bellis’ Research Hospital, Castellana Grotte (Ba), Italy
- Paola Sanese
- Medical Genetics, National Institute for Gastroenterology, IRCCS ‘S. de Bellis’ Research Hospital, Castellana Grotte (Ba), Italy
- Valentina Grossi
- Medical Genetics, National Institute for Gastroenterology, IRCCS ‘S. de Bellis’ Research Hospital, Castellana Grotte (Ba), Italy
- Cristiano Simone
- Medical Genetics, National Institute for Gastroenterology, IRCCS ‘S. de Bellis’ Research Hospital, Castellana Grotte (Ba), Italy
- Medical Genetics, Department of Biomedical Sciences and Human Oncology (DIMO), University of Bari Aldo Moro, Bari, Italy
- Corresponding authors at: Medical Genetics, National Institute for Gastroenterology, IRCCS ‘S. de Bellis’ Research Hospital, Castellana Grotte (Ba), Italy (C. Fasano, C. Simone).
5
Georgakopoulos-Soares I, Yizhar-Barnea O, Mouratidis I, Hemberg M, Ahituv N. Absent from DNA and protein: genomic characterization of nullomers and nullpeptides across functional categories and evolution. Genome Biol 2021; 22:245. [PMID: 34433494 PMCID: PMC8386077 DOI: 10.1186/s13059-021-02459-z] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 07/15/2020] [Accepted: 08/09/2021] [Indexed: 11/13/2022] Open
Abstract
BACKGROUND: Nullomers and nullpeptides are short DNA or amino acid sequences that are absent from a genome or proteome, respectively. One potential cause for their absence could be their having a detrimental impact on an organism. RESULTS: Here, we identify all possible nullomers and nullpeptides in the genomes and proteomes of thirty eukaryotes and demonstrate that a significant proportion of these sequences are under negative selection. We also identify nullomers that are unique to specific functional categories: coding sequences, exons, introns, 5'UTR, 3'UTR, promoters, and show that coding sequence and promoter nullomers are most likely to be selected against. By analyzing all protein sequences across the tree of life, we further identify 36,081 peptides up to six amino acids in length that do not exist in any known organism, termed primes. We next characterize all possible single base pair mutations that can lead to the appearance of a nullomer in the human genome, observing a significantly higher number of mutations than expected by chance for specific nullomer sequences in transposable elements, likely due to their suppression. We also annotate nullomers that appear due to naturally occurring variants and show that a subset of them can be used to distinguish between different human populations. Analysis of nullomers and nullpeptides across vertebrate evolution shows they can also be used as phylogenetic classifiers. CONCLUSIONS: We provide a catalog of nullomers and nullpeptides in distinct functional categories, develop methods to systematically study them, and highlight the use of variability in these sequences in other analyses.
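The definition of a nullomer given in this abstract (a short sequence absent from a genome) can be sketched as a set difference between all possible k-mers and the k-mers actually observed. The function name `find_nullomers` and the toy sequence are hypothetical, and the sketch scans a single strand only; a real genome analysis would normally also account for the reverse complement and would not enumerate k-mers in memory for large k.

```python
from itertools import product


def find_nullomers(sequence: str, k: int, alphabet: str = "ACGT") -> list[str]:
    """Return all k-mers over `alphabet` that never occur in `sequence`."""
    seq = sequence.upper()
    # Every k-mer that occurs at least once in the sequence.
    observed = {seq[i:i + k] for i in range(len(seq) - k + 1)}
    # All len(alphabet) ** k possible k-mers, generated lazily.
    all_kmers = ("".join(p) for p in product(alphabet, repeat=k))
    return sorted(kmer for kmer in all_kmers if kmer not in observed)


# Toy "genome" (hypothetical): 8 of the 16 possible 2-mers occur in it.
toy_genome = "ACGTTGCAAC"
missing = find_nullomers(toy_genome, 2)
```

Passing the 20-letter amino acid alphabet instead of `"ACGT"` gives the analogous search for nullpeptides in a protein sequence.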
Affiliation(s)
- Ilias Georgakopoulos-Soares
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
- Ofer Yizhar-Barnea
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
- Ioannis Mouratidis
- Department of Computer Science, Katholieke Universiteit Leuven, Leuven, Belgium
- Martin Hemberg
- Evergrande Center for Immunologic Diseases, Harvard Medical School and Brigham and Women's Hospital, Boston, MA, USA.
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK.
- Nadav Ahituv
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA.
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA.
6
Sanese P, Fasano C, Buscemi G, Bottino C, Corbetta S, Fabini E, Silvestri V, Valentini V, Disciglio V, Forte G, Lepore Signorile M, De Marco K, Bertora S, Grossi V, Guven U, Porta N, Di Maio V, Manoni E, Giannelli G, Bartolini M, Del Rio A, Caretti G, Ottini L, Simone C. Targeting SMYD3 to Sensitize Homologous Recombination-Proficient Tumors to PARP-Mediated Synthetic Lethality. iScience 2020; 23:101604. [PMID: 33205017 PMCID: PMC7648160 DOI: 10.1016/j.isci.2020.101604] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 05/04/2020] [Revised: 08/07/2020] [Accepted: 09/21/2020] [Indexed: 12/17/2022] Open
Abstract
SMYD3 is frequently overexpressed in a wide variety of cancers. Indeed, its inactivation reduces tumor growth in preclinical in vivo animal models. However, extensive characterization in vitro failed to clarify SMYD3 function in cancer cells, although confirming its importance in carcinogenesis. Taking advantage of a SMYD3 mutant variant identified in a high-risk breast cancer family, here we show that SMYD3 phosphorylation by ATM enables the formation of a multiprotein complex including ATM, SMYD3, CHK2, and BRCA2, which is required for the final loading of RAD51 at DNA double-strand break sites and completion of homologous recombination (HR). Remarkably, SMYD3 pharmacological inhibition sensitizes HR-proficient cancer cells to PARP inhibitors, thereby extending the potential of the synthetic lethality approach in human tumors.
- SMYD3 phosphorylation by ATM favors the formation of HR complexes during DSB response
- SMYD3 mediates DSB repair by promoting RAD51 recruitment at DNA damage sites
- SMYD3 inhibition triggers a compensatory PARP-dependent DNA damage response
- Co-targeting SMYD3/PARP leads to synthetic lethality in HR-proficient cancer cells
Affiliation(s)
- Paola Sanese
- Medical Genetics, National Institute of Gastroenterology "S. de Bellis" Research Hospital, Castellana Grotte, Bari 70013, Italy
- Candida Fasano
- Medical Genetics, National Institute of Gastroenterology "S. de Bellis" Research Hospital, Castellana Grotte, Bari 70013, Italy
- Giacomo Buscemi
- Institute of Molecular Genetics, IGM "Luigi Luca Cavalli-Sforza", National Research Council (CNR), Pavia 27100, Italy
- Cinzia Bottino
- Department of Biosciences, University of Milan, Milan 20133, Italy
- Silvia Corbetta
- Department of Biosciences, University of Milan, Milan 20133, Italy
- Edoardo Fabini
- Department of Pharmacy and Biotechnology, Alma Mater Studiorum University of Bologna, Bologna 40126, Italy
- BioChemoInformatics Unit, Institute of Organic Synthesis and Photoreactivity (ISOF), National Research Council (CNR), Bologna 40129, Italy
- Valentina Silvestri
- Department of Molecular Medicine, University of Roma "La Sapienza", Roma 00185, Italy
- Virginia Valentini
- Department of Molecular Medicine, University of Roma "La Sapienza", Roma 00185, Italy
- Vittoria Disciglio
- Medical Genetics, National Institute of Gastroenterology "S. de Bellis" Research Hospital, Castellana Grotte, Bari 70013, Italy
- Giovanna Forte
- Medical Genetics, National Institute of Gastroenterology "S. de Bellis" Research Hospital, Castellana Grotte, Bari 70013, Italy
- Martina Lepore Signorile
- Medical Genetics, National Institute of Gastroenterology "S. de Bellis" Research Hospital, Castellana Grotte, Bari 70013, Italy
- Katia De Marco
- Medical Genetics, National Institute of Gastroenterology "S. de Bellis" Research Hospital, Castellana Grotte, Bari 70013, Italy
- Stefania Bertora
- Medical Genetics, National Institute of Gastroenterology "S. de Bellis" Research Hospital, Castellana Grotte, Bari 70013, Italy
- Valentina Grossi
- Medical Genetics, National Institute of Gastroenterology "S. de Bellis" Research Hospital, Castellana Grotte, Bari 70013, Italy
- Ummu Guven
- Department of Biosciences, University of Milan, Milan 20133, Italy
- Natale Porta
- Department of Medical-Surgical Sciences and Biotechnology, Polo Pontino University of Roma "La Sapienza", Latina 04100, Italy
- Valeria Di Maio
- Department of Medical-Surgical Sciences and Biotechnology, Polo Pontino University of Roma "La Sapienza", Latina 04100, Italy
- Elisabetta Manoni
- BioChemoInformatics Unit, Institute of Organic Synthesis and Photoreactivity (ISOF), National Research Council (CNR), Bologna 40129, Italy
- Gianluigi Giannelli
- Medical Genetics, National Institute of Gastroenterology "S. de Bellis" Research Hospital, Castellana Grotte, Bari 70013, Italy
- Manuela Bartolini
- Department of Pharmacy and Biotechnology, Alma Mater Studiorum University of Bologna, Bologna 40126, Italy
- Alberto Del Rio
- BioChemoInformatics Unit, Institute of Organic Synthesis and Photoreactivity (ISOF), National Research Council (CNR), Bologna 40129, Italy
- Innovamol Consulting Srl, Modena 41123, Italy
- Laura Ottini
- Department of Molecular Medicine, University of Roma "La Sapienza", Roma 00185, Italy
- Cristiano Simone
- Medical Genetics, National Institute of Gastroenterology "S. de Bellis" Research Hospital, Castellana Grotte, Bari 70013, Italy
- Department of Biomedical Sciences and Human Oncology (DIMO), Medical Genetics; University of Bari Aldo Moro, Bari 70124, Italy
7
Minkiewicz P, Darewicz M, Iwaniak A, Sokołowska J, Starowicz P, Bucholska J, Hrynkiewicz M. Common Amino Acid Subsequences in a Universal Proteome--Relevance for Food Science. Int J Mol Sci 2015; 16:20748-73. [PMID: 26340620 PMCID: PMC4613229 DOI: 10.3390/ijms160920748] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 07/07/2015] [Revised: 08/18/2015] [Accepted: 08/24/2015] [Indexed: 02/06/2023] Open
Abstract
A common subsequence is a fragment of the amino acid chain that occurs in more than one protein. Common subsequences may be an object of interest for food scientists as biologically active peptides, epitopes, and/or protein markers that are used in comparative proteomics. An individual bioactive fragment, in particular the shortest fragment containing two or three amino acid residues, may occur in many protein sequences. An individual linear epitope may also be present in multiple sequences of precursor proteins. Although recent recommendations for prediction of allergenicity and cross-reactivity include not only sequence identity, but also similarities in secondary and tertiary structures surrounding the common fragment, local sequence identity may be used to screen protein sequence databases for potential allergens in silico. The main weakness of the screening process is that it overlooks allergens and cross-reactivity cases without identical fragments corresponding to linear epitopes. A single peptide may also serve as a marker of a group of allergens that belong to the same family and, possibly, reveal cross-reactivity. This review article discusses the benefits for food scientists that follow from the common subsequences concept.
Affiliation(s)
- Piotr Minkiewicz
- Department of Food Biochemistry, University of Warmia and Mazury in Olsztyn, Plac Cieszyński 1, Olsztyn-Kortowo 10-726, Poland.
- Małgorzata Darewicz
- Department of Food Biochemistry, University of Warmia and Mazury in Olsztyn, Plac Cieszyński 1, Olsztyn-Kortowo 10-726, Poland.
- Anna Iwaniak
- Department of Food Biochemistry, University of Warmia and Mazury in Olsztyn, Plac Cieszyński 1, Olsztyn-Kortowo 10-726, Poland.
- Jolanta Sokołowska
- Department of Food Biochemistry, University of Warmia and Mazury in Olsztyn, Plac Cieszyński 1, Olsztyn-Kortowo 10-726, Poland.
- Piotr Starowicz
- Department of Food Biochemistry, University of Warmia and Mazury in Olsztyn, Plac Cieszyński 1, Olsztyn-Kortowo 10-726, Poland.
- Justyna Bucholska
- Department of Food Biochemistry, University of Warmia and Mazury in Olsztyn, Plac Cieszyński 1, Olsztyn-Kortowo 10-726, Poland.
- Monika Hrynkiewicz
- Department of Food Biochemistry, University of Warmia and Mazury in Olsztyn, Plac Cieszyński 1, Olsztyn-Kortowo 10-726, Poland.
8
Trost B, Kusalik A, Lucchese G, Kanduc D. Bacterial peptides are intensively present throughout the human proteome. SELF NONSELF 2014; 1:71-74. [PMID: 21559180 DOI: 10.4161/self.1.1.9588] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Received: 04/16/2009] [Revised: 07/16/2009] [Accepted: 07/22/2009] [Indexed: 11/19/2022]
Abstract
Forty bacterial proteomes (20 pathogens and 20 non-pathogens) were examined for amino acid sequence similarity to the human proteome. All bacterial proteomes, independent of their pathogenicity, share hundreds of nonamer sequences with the human proteome. This overlap is very widespread, with one third of human proteins sharing at least one nonapeptide with one of these bacteria. On the whole, the bacteria-versus-human nonamer overlap is numerically defined by 47,610 total perfect matches disseminated through 10,701 human proteins. These findings open new perspectives on the immune relationship between bacteria and host, and might help our understanding of fundamental phenomena such as self-nonself discrimination and tolerance versus auto-reactivity.
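The perfect-match nonamer overlap described in this abstract can be illustrated as an intersection of 9-mer sets. This is only a sketch under stated assumptions: the function names `peptide_kmers` and `shared_kmers`, the dictionary layout (protein name mapped to sequence), and the toy sequences are all hypothetical and are not the authors' pipeline.

```python
def peptide_kmers(protein: str, k: int = 9) -> set[str]:
    """Return the set of overlapping k-residue peptides in one protein."""
    return {protein[i:i + k] for i in range(len(protein) - k + 1)}


def shared_kmers(
    proteome_a: dict[str, str], proteome_b: dict[str, str], k: int = 9
) -> dict[str, set[str]]:
    """Map each protein in proteome_a to the k-mers it shares with proteome_b."""
    # Pool every k-mer found anywhere in proteome_b.
    b_kmers: set[str] = set()
    for seq in proteome_b.values():
        b_kmers |= peptide_kmers(seq, k)
    # Keep only proteins from proteome_a with at least one exact match.
    hits: dict[str, set[str]] = {}
    for name, seq in proteome_a.items():
        common = peptide_kmers(seq, k) & b_kmers
        if common:
            hits[name] = common
    return hits


# Toy data (hypothetical sequences, for illustration only).
human = {"H1": "MKTAYIAKQRQISFVK", "H2": "GGGGGGGGGGGG"}
bacterium = {"B1": "AATAYIAKQRQLL"}
hits = shared_kmers(human, bacterium)
```

Here `hits` contains only `H1`, which shares the single nonamer `TAYIAKQRQ` with the toy bacterial protein; counting the keys of `hits` corresponds to the number of proteins with at least one shared nonapeptide.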
Affiliation(s)
- Brett Trost
- Department of Computer Science; University of Saskatchewan; Saskatoon, SK CA
9
Trost B, Lucchese G, Stufano A, Bickis M, Kusalik A, Kanduc D. No human protein is exempt from bacterial motifs, not even one. SELF NONSELF 2014; 1:328-334. [PMID: 21487508 DOI: 10.4161/self.1.4.13315] [Citation(s) in RCA: 54] [Impact Index Per Article: 5.4] [Received: 07/12/2010] [Revised: 08/10/2010] [Accepted: 08/11/2010] [Indexed: 02/08/2023]
Abstract
The hypothesis that mimicry between a self and a microbial peptide antigen is strictly related to autoimmune pathology remains a debated concept in autoimmunity research. Clear evidence for a causal link between molecular mimicry and autoimmunity is still lacking. In recent studies we have demonstrated that viruses and bacteria share amino acid sequences with the human proteome to such a high extent that the molecular mimicry hypothesis becomes questionable as a causal factor in autoimmunity. Expanding upon our analysis, here we detail the bacterial peptide overlap with the human proteome at the penta-, hexa-, hepta- and octapeptide levels by exact peptide matching analysis and demonstrate that there does not exist a single human protein that does not harbor a bacterial pentapeptide or hexapeptide motif. This finding suggests that molecular mimicry between a self and a microbial peptide antigen cannot be assumed as a basis for autoimmune pathologies. Moreover, the data are discussed in relation to the microbial immune escape phenomenon and the possible vaccine-related autoimmune effects.
Affiliation(s)
- Brett Trost
- Department of Computer Science; University of Saskatchewan; Saskatoon, Canada
10
Motomura K, Nakamura M, Otaki JM. A frequency-based linguistic approach to protein decoding and design: Simple concepts, diverse applications, and the SCS Package. Comput Struct Biotechnol J 2013; 5:e201302010. [PMID: 24688703 PMCID: PMC3962227 DOI: 10.5936/csbj.201302010] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 10/24/2012] [Revised: 02/07/2013] [Accepted: 02/08/2013] [Indexed: 11/23/2022] Open
Abstract
Protein structure and function information is coded in amino acid sequences. However, the relationship between primary sequences and three-dimensional structures and functions remains enigmatic. Our approach to this fundamental biochemistry problem is based on the frequencies of short constituent sequences (SCSs) or words. A protein amino acid sequence is considered analogous to an English sentence, where SCSs are equivalent to words. Availability scores, which are defined as real SCS frequencies in the non-redundant amino acid database relative to their probabilistically expected frequencies, demonstrate the biological usage bias of SCSs. As a result, this frequency-based linguistic approach is expected to have diverse applications, such as secondary structure specifications by structure-specific SCSs and immunological adjuvants with rare or non-existent SCSs. Linguistic similarities (e.g., wide ranges of scale-free distributions) and dissimilarities (e.g., behaviors of low-rank samples) between proteins and the natural English language have been revealed in the rank-frequency relationships of SCSs or words. We have developed a web server, the SCS Package, which contains five applications for analyzing protein sequences based on the linguistic concept. These tools have the potential to assist researchers in deciphering structurally and functionally important protein sites, species-specific sequences, and functional relationships between SCSs. The SCS Package also provides researchers with a tool to construct amino acid sequences de novo based on the idiomatic usage of SCSs.
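The availability score described in this abstract (observed SCS frequency relative to its probabilistically expected frequency) can be sketched as a simple ratio. The sketch assumes that the expected frequency of a k-residue SCS is the product of its single-residue frequencies in the same dataset; this independence model, the function name `availability_scores`, and the toy input are assumptions for illustration and may differ from the exact definition used by the SCS Package.

```python
from collections import Counter
from math import prod


def availability_scores(sequences: list[str], k: int) -> dict[str, float]:
    """Observed / expected frequency for every k-mer seen in `sequences`."""
    # Single-residue frequencies across the whole dataset.
    residue_counts = Counter("".join(sequences))
    total_residues = sum(residue_counts.values())
    residue_freq = {aa: n / total_residues for aa, n in residue_counts.items()}

    # Observed k-mer counts, never spanning sequence boundaries.
    kmer_counts = Counter(
        seq[i:i + k] for seq in sequences for i in range(len(seq) - k + 1)
    )
    total_kmers = sum(kmer_counts.values())

    scores = {}
    for kmer, n in kmer_counts.items():
        observed = n / total_kmers
        # Assumed null model: residues occur independently of one another.
        expected = prod(residue_freq[aa] for aa in kmer)
        scores[kmer] = observed / expected
    return scores


# Toy input (hypothetical two-letter alphabet, for illustration only).
scores = availability_scores(["ABAB"], 2)
```

Under this model a score above 1 marks an SCS used more often than composition alone predicts, and a score below 1 marks an under-used one. K-mers that never occur are not returned; they would score 0 and correspond to the non-existent SCSs mentioned in the abstract.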
Affiliation(s)
- Kenta Motomura
- The BCPH Unit of Molecular Physiology, Department of Chemistry, Biology and Marine Science, University of the Ryukyus, Senbaru, Nishihara, Okinawa 903-0213, Japan
- Department of Information Science, University of the Ryukyus, Senbaru, Nishihara, Okinawa 903-0213, Japan
- Morikazu Nakamura
- Department of Information Science, University of the Ryukyus, Senbaru, Nishihara, Okinawa 903-0213, Japan
- Joji M Otaki
- The BCPH Unit of Molecular Physiology, Department of Chemistry, Biology and Marine Science, University of the Ryukyus, Senbaru, Nishihara, Okinawa 903-0213, Japan
11
Minkiewicz P, Bucholska J, Darewicz M, Borawska J. Epitopic hexapeptide sequences from Baltic cod parvalbumin beta (allergen Gad c 1) are common in the universal proteome. Peptides 2012; 38:105-9. [PMID: 22940202 DOI: 10.1016/j.peptides.2012.08.011] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 07/18/2012] [Revised: 08/14/2012] [Accepted: 08/14/2012] [Indexed: 01/25/2023]
Abstract
The aim of this study was to analyze the distribution of hexapeptide fragments considered as epitopes of Baltic cod parvalbumin beta (allergen Gad c 1) in the universal proteome. Cod (Gadus morhua subsp. callarias) parvalbumin hexapeptides cataloged in the Immune Epitope Database were used as query sequences. The UniProt database was screened using the WU-BLAST 2 program. The distribution of hexapeptide fragments was investigated in various protein families, classified according to the presence of the appropriate domains, and in proteins of plant, animal and microbial species. Hexapeptides from cod parvalbumin were found in the proteins of plants and animals which are food sources, microorganisms with various applications in food technology and biotechnology, microorganisms which are human symbionts and commensals as well as human pathogens. In the last case possible coverage between epitopes from pathogens and allergens should be avoided during vaccine design.
Affiliation(s)
- Piotr Minkiewicz
- University of Warmia and Mazury in Olsztyn, Chair of Food Biochemistry, Olsztyn-Kortowo, Poland.
12
Pentamers not found in the universal proteome can enhance antigen specific immune responses and adjuvant vaccines. PLoS One 2012; 7:e43802. [PMID: 22937099 PMCID: PMC3427150 DOI: 10.1371/journal.pone.0043802] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Received: 04/16/2012] [Accepted: 07/26/2012] [Indexed: 12/22/2022] Open
Abstract
Certain short peptides do not occur in humans and are rare or non-existent in the universal proteome. Antigens that contain rare amino acid sequences are in general highly immunogenic and may activate different arms of the immune system. We first generated a list of rare, semi-common, and common 5-mer peptides using bioinformatics tools to analyze the UniProtKB database. Experimental observations indicated that rare and semi-common 5-mers generated stronger cellular responses in comparison with commonly occurring sequences. We hypothesized that the biological process responsible for this enhanced immunogenicity could be used to positively modulate immune responses with potential application for vaccine development. Initially, twelve rare 5-mers, 9-mers, and 13-mers were incorporated in frame at the end of an H5N1 hemagglutinin (HA) antigen and expressed from a DNA vaccine. The presence of some 5-mer peptides induced improved immune responses. Adding one 5-mer peptide exogenously also offered improved clinical outcome and/or survival against a lethal H5N1 or H1N1 influenza virus challenge in BALB/c mice and ferrets, respectively. Interestingly, an up to 25-fold enhancement of anti-HBsAg antibody production was also observed in BALB/c mice in combination with a commercial Hepatitis B vaccine (Engerix-B, GSK). Mechanistically, NK cell activation and dependency were observed with enhancing peptides ex vivo and in NK-depleted mice. Overall, the data suggest that rare or non-existent oligopeptides can be developed as immunomodulators and support the further evaluation of some 5-mer peptides as potential vaccine adjuvants.
|
13
|
Pasikowski P, Goździewicz T, Stefanowicz P, Artym J, Zimecki M, Szewczuk Z. A novel immunosuppressory peptide originating from the ubiquitin sequence. Peptides 2011; 32:2418-27. [PMID: 22008734 DOI: 10.1016/j.peptides.2011.10.002] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 09/19/2011] [Revised: 10/03/2011] [Accepted: 10/03/2011] [Indexed: 01/01/2023]
Abstract
Ubiquitin is a conserved polypeptide present in every eukaryotic cell. Apart from its involvement in proteasomal degradation and other intracellular signaling pathways, it has been suggested to play an important role as an extracellular immunomodulator and antimicrobial agent. Moreover, ubiquitin-derived peptides have been shown to exhibit significant biological activities. Our previous studies showed a high immunosuppressive potency of the ubiquitin peptic hydrolysate, in which we identified over 70 different peptides. The present work focuses on synthesizing the most abundant of these peptides and investigating their immunomodulatory potency. The peptide VKTLTGKTI possessed the highest immunosuppressory activity in AFC experiments, comparable to that of the previously described ubiquitin-derived peptide LEDGRTLSDY. Moreover, some of the investigated peptides exhibited immunostimulatory effects. These findings support the idea that ubiquitin, together with products of its degradation, could represent a self-regulating immunoregulatory system. Peptide VKTLTGKTI was also tested for its ability to prolong skin graft survival in mice. The results showed that the peptide significantly extended the skin transplant rejection time, so it could be considered a potential supplementary medicine in post-transplantation therapy. Moreover, we synthesized two analogs of the investigated peptides, the first designed to mimic the non-linear epitope consisting of the ubiquitin 16-21 and ubiquitin 52-57 fragments, and the second designed to mimic the ubiquitin 5-13 hairpin. We also tested their immunosuppressory activity in in vitro experiments.
|
14
|
Bavaro SL, Kanduc D. Pentapeptide commonality between Corynebacterium diphtheriae toxin and the Homo sapiens proteome. Immunotherapy 2010; 3:49-58. [PMID: 21174557 DOI: 10.2217/imt.10.83] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/21/2022] Open
Abstract
Cross-reactivity may affect diagnostic tests and cause harmful autoimmune reactions following immunotherapy. To predict potential cross-reactivity and search for safe immunotherapeutic approaches, we analyzed sequence identity between microbial antigens and the human proteome. Using diphtheria toxin (DT) as a model, we examined its patterns of identity with human proteins at the pentapeptide level. DT shares 503 pentapeptides with the human proteome, while only 31 pentapeptides are unique to the toxin. DT pentapeptide identity involves multiple/repeated matches in human proteins (a total of 4966 occurrences). Human proteins containing bacterial peptide matches include antigens linked to fundamental cellular functions, such as cell cycle control, proliferation, development and differentiation. The data presented in this article offer a rational basis for designing peptide-based vaccines that specifically target DT and thus eliminate the potential risk of cross-reactivity with human proteins. More generally, this study proposes a methodological approach for avoiding cross-reactivity in immune reactions.
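The pentapeptide-level comparison described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `kmers` and `compare_to_proteome` are hypothetical, and the sequences are invented toy strings rather than diphtheria toxin or human proteins. The sketch splits the distinct 5-mers of a query protein into those shared with a reference proteome and those unique to the query, and counts how often the shared 5-mers occur across the reference proteins.

```python
from collections import Counter

def kmers(seq, k=5):
    """Yield every overlapping k-mer (window of length k) in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]

def compare_to_proteome(query, proteome, k=5):
    """Split the query's distinct k-mers into shared and unique sets and
    count the total occurrences of shared k-mers across the proteome."""
    proteome_counts = Counter()
    for protein in proteome:
        proteome_counts.update(kmers(protein, k))
    query_kmers = set(kmers(query, k))
    shared = {p for p in query_kmers if p in proteome_counts}
    unique = query_kmers - shared
    occurrences = sum(proteome_counts[p] for p in shared)
    return shared, unique, occurrences

# Toy sequences invented for illustration (assumption: not real proteins).
toy_query = "MKTLLVAGG"
toy_proteome = ["AAKTLLVQQ", "GGKTLLVKTLLV"]
shared, unique, occurrences = compare_to_proteome(toy_query, toy_proteome)
print(sorted(shared), len(unique), occurrences)
```

In this toy case one pentapeptide is shared and matches the reference proteins several times, mirroring the distinction the abstract draws between the number of shared pentapeptides and the total number of occurrences.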
Affiliation(s)
- Simona Lucia Bavaro
- Department of Biochemistry & Molecular Biology, University of Bari, Bari 70126, Italy
|
15
|
Capone G, Novello G, Fasano C, Trost B, Bickis M, Kusalik A, Kanduc D. The oligodeoxynucleotide sequences corresponding to never-expressed peptide motifs are mainly located in the non-coding strand. BMC Bioinformatics 2010; 11:383. [PMID: 20646284 PMCID: PMC2919516 DOI: 10.1186/1471-2105-11-383] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 06/03/2010] [Accepted: 07/20/2010] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: We study the usage of specific peptide platforms in protein composition. Using the pentapeptide as a unit of length, we find that in the universal proteome many pentapeptides are heavily repeated (even thousands of times), whereas some are quite rare, and a small number do not appear at all. To understand the physico-chemical-biological basis underlying peptide usage at the proteomic level, in this study we analyse the energetic costs for the synthesis of rare and never-expressed versus frequent pentapeptides. In addition, we explore residue bulkiness, hydrophobicity, and codon number as factors able to modulate specific peptide frequencies. Then, the possible influence of amino acid composition is investigated in zero- and high-frequency pentapeptide sets by analysing the frequencies of the corresponding inverse-sequence pentapeptides. As a final step, we analyse the pentadecamer oligodeoxynucleotide sequences corresponding to the never-expressed pentapeptides.
RESULTS: We find that only DNA context-dependent constraints (such as oligodeoxynucleotide sequence location in the minus strand, introns, pseudogenes, frameshifts, etc.) provide a coherent mechanistic platform to explain the occurrence of never-expressed versus frequent pentapeptides in the protein world.
CONCLUSIONS: This study is of importance in cell biology. Indeed, the rarity (or lack of expression) of specific 5-mer peptide modules implies the rarity (or lack of expression) of the corresponding n-mer peptide sequences (with n > 5), so possibly modulating protein compositional trends. Moreover, the data might further our understanding of the role exerted by rare pentapeptide modules as critical biological effectors in protein-protein interactions.
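The frequency analysis outlined in this abstract (counting peptides, listing those that never appear, and looking up the frequency of the inverse sequence) can be sketched in Python as follows. This is an illustrative sketch, not the authors' method: the function names are hypothetical, the toy proteome is invented, and the example uses k = 2 so that the full peptide space (20^2 = 400) stays small, whereas the study works with pentapeptides (20^5 = 3,200,000 possible sequences).

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def count_kmers(proteins, k):
    """Count every overlapping k-mer across a list of protein sequences."""
    counts = Counter()
    for protein in proteins:
        for i in range(len(protein) - k + 1):
            counts[protein[i:i + k]] += 1
    return counts

def absent_kmers(counts, k, alphabet=AMINO_ACIDS):
    """List all k-mers over the alphabet that never occur in counts."""
    candidates = ("".join(t) for t in product(alphabet, repeat=k))
    return [kmer for kmer in candidates if kmer not in counts]

def inverse_frequency(kmer, counts):
    """Return the frequency of the reversed (inverse-sequence) k-mer."""
    return counts[kmer[::-1]]

# Toy proteome invented for illustration (assumption: not real data).
toy_proteome = ["ACAC", "CADA"]
counts = count_kmers(toy_proteome, k=2)
absent = absent_kmers(counts, k=2)
print(len(absent), inverse_frequency("AD", counts))
```

Comparing a peptide's count with that of its inverse sequence keeps amino acid composition fixed, which is why the abstract uses it to probe whether composition alone explains zero- and high-frequency sets.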
Affiliation(s)
- Giovanni Capone
- Department of Biochemistry and Molecular Biology "Ernesto Quagliariello", University of Bari, Bari, Italy
- Giuseppe Novello
- Department of Biochemistry and Molecular Biology "Ernesto Quagliariello", University of Bari, Bari, Italy
- Candida Fasano
- Department of Biochemistry and Molecular Biology "Ernesto Quagliariello", University of Bari, Bari, Italy
- Brett Trost
- Department of Computer Science, University of Saskatchewan, Saskatoon, Canada
- Mik Bickis
- Department of Mathematics and Statistics, University of Saskatchewan, Saskatoon, Canada
- Anthony Kusalik
- Department of Computer Science, University of Saskatchewan, Saskatoon, Canada
- Darja Kanduc
- Department of Biochemistry and Molecular Biology "Ernesto Quagliariello", University of Bari, Bari, Italy
|