1
Choi DH, Kang SK, Lee KE, Jung J, Kim EJ, Kim WH, Kwon YG, Kim KP, Jo I, Park YS, Park SI. Nitrosylation of β2-Tubulin Promotes Microtubule Disassembly and Differentiated Cardiomyocyte Beating in Ischemic Mice. Tissue Eng Regen Med 2023; 20:921-937. [PMID: 37679590 PMCID: PMC10519925 DOI: 10.1007/s13770-023-00582-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2023] [Revised: 05/04/2023] [Accepted: 05/10/2023] [Indexed: 09/09/2023] Open
Abstract
BACKGROUND Beating cardiomyocyte regeneration therapies have emerged as alternatives to heart transplantation. Although the importance of nitric oxide (NO) in cardiomyocyte regeneration has been widely suggested, little has been reported concerning endogenous NO during cardiomyocyte differentiation. METHODS Here, we used P19CL6 cells and a myocardial infarction (MI) model to confirm NO-induced protein modification and its role in cardiac beating. Two tyrosine (Tyr) residues of β2-tubulin (Y106 and Y340) underwent nitrosylation (Tyr-NO) by endogenously generated NO during cardiomyocyte differentiation from pre-cardiomyocyte-like P19CL6 cells. RESULTS Tyr-NO-β2-tubulin mediated the interaction with Stathmin, which promotes microtubule disassembly, and was prominently observed in spontaneously beating cell clusters and mouse embryonic heart (E11.5d). In myocardial infarction mice, Tyr-NO-β2-tubulin in transplanted cells was closely associated with cardiac troponin-T expression and with functional recovery, reduced infarct size and a thickened left ventricular wall. CONCLUSION This is the first report of β2-tubulin as a new target molecule of NO that can promote normal cardiac beating and cardiomyocyte regeneration. Taken together, we suggest that Tyr-NO-β2-tubulin has therapeutic potential for ischemic cardiomyocytes and can reduce arrhythmogenesis, an unexpected side effect of stem cell transplantation.
Affiliation(s)
- Da Hyeon Choi
  - Department of Biological Sciences and Biotechnology, School of Biological Sciences, College of Natural Sciences, Chungbuk National University, Cheongju, Republic of Korea
- Seong Ki Kang
  - Division of Intractable Diseases, Center for Biomedical Sciences, Korea National Institute of Health (KNIH), Cheongju, Republic of Korea
  - Department of Laboratory Medicine, Green Cross Laboratories, Yongin, Republic of Korea
- Kyeong Eun Lee
  - Department of Biological Sciences and Biotechnology, School of Biological Sciences, College of Natural Sciences, Chungbuk National University, Cheongju, Republic of Korea
- Jongsun Jung
  - AI Drug Platform Center, Syntekabio, Daejeon, Republic of Korea
- Eun Ju Kim
  - Department of Applied Chemistry, Kyung Hee University, Yongin, Republic of Korea
- Won-Ho Kim
  - Division of Cardiovascular and Rare Diseases, Center for Biomedical Sciences, Korea National Institute of Health, Cheongju, Republic of Korea
- Young-Guen Kwon
  - Department of Biochemistry, College of Life Science and Biotechnology, Yonsei University, Seoul, Republic of Korea
- Kwang Pyo Kim
  - Department of Applied Chemistry, Kyung Hee University, Yongin, Republic of Korea
- Inho Jo
  - Department of Molecular Medicine, College of Ewha Womans University, Seoul, Republic of Korea
  - Graduate Program in System Health Science and Engineering, Ewha Womans University, Seoul, Republic of Korea
- Yoon Shin Park
  - Department of Biological Sciences and Biotechnology, School of Biological Sciences, College of Natural Sciences, Chungbuk National University, Cheongju, Republic of Korea
- Sang Ick Park
  - Division of Intractable Diseases, Center for Biomedical Sciences, Korea National Institute of Health (KNIH), Cheongju, Republic of Korea
2
Dixit H, Kulharia M, Verma SK. Metalloproteome of human-infective RNA viruses: a study towards understanding the role of metal ions in virology. Pathog Dis 2023; 81:ftad020. [PMID: 37653445 DOI: 10.1093/femspd/ftad020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2023] [Revised: 08/07/2023] [Accepted: 08/29/2023] [Indexed: 09/02/2023] Open
Abstract
Metalloproteins and metal-based inhibitors have been shown to effectively combat infectious diseases, particularly those caused by RNA viruses. In this study, a diverse set of bioinformatics methods was employed to identify metal-binding proteins of human RNA viruses. Seventy-three viral proteins with a high probability of being metal-binding proteins were identified. These proteins included 40 zinc-, 47 magnesium- and 14 manganese-binding proteins belonging to 29 viral species and eight significant viral families, including Coronaviridae, Flaviviridae and Retroviridae. Further functional characterization has revealed that these proteins play a critical role in several viral processes, including viral replication, fusion and host viral entry. They fall under the essential categories of viral proteins, including polymerase and protease enzymes. Magnesium ion is abundantly predicted to interact with these viral enzymes, followed by zinc. In addition, this study also examined the evolutionary aspects of predicted viral metalloproteins, offering essential insights into the metal utilization patterns among different viral species. The analysis indicates that the metal utilization patterns are conserved within the functional classes of the proteins. In conclusion, the findings of this study provide significant knowledge on viral metalloproteins that can serve as a valuable foundation for future research in this area.
Affiliation(s)
- Himisha Dixit
  - Centre for Computational Biology & Bioinformatics, Central University of Himachal Pradesh, Kangra 176206, Himachal Pradesh, India
- Mahesh Kulharia
  - Centre for Computational Biology & Bioinformatics, Central University of Himachal Pradesh, Kangra 176206, Himachal Pradesh, India
- Shailender Kumar Verma
  - Centre for Computational Biology & Bioinformatics, Central University of Himachal Pradesh, Kangra 176206, Himachal Pradesh, India
  - Department of Environmental Studies, University of Delhi 110007, Delhi, India
3
SeqCP: A sequence-based algorithm for searching circularly permuted proteins. Comput Struct Biotechnol J 2022; 21:185-201. [PMID: 36582435 PMCID: PMC9763678 DOI: 10.1016/j.csbj.2022.11.024] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/21/2022] [Revised: 11/10/2022] [Accepted: 11/10/2022] [Indexed: 11/16/2022] Open
Abstract
Circular permutation (CP) is a protein sequence rearrangement in which the amino- and carboxyl-termini of a protein can be created in different positions along the imaginary circularized sequence. Circularly permuted proteins usually exhibit conserved three-dimensional structures and functions. By comparing the structures of circular permutants (CPMs), protein research and bioengineering applications can be approached in ways that are difficult to achieve by traditional mutagenesis. Most current CP detection algorithms depend on structural information. Because there is a vast number of proteins with unknown structures, many CP pairs may remain unidentified. An efficient sequence-based CP detector will help identify more CP pairs and advance many protein studies. For instance, some hypothetical proteins may have CPMs with known functions and structures that are informative for functional annotation, but existing structure-based CP search methods cannot be applied when those hypothetical proteins lack structural information. Despite the considerable potential for applications, sequence-based CP search methods have not been well developed. We present a sequence-based method, SeqCP, which analyzes normal and duplicated sequence alignments to identify CPMs and determine candidate CP sites for proteins. SeqCP was trained on data obtained from the Circular Permutation Database and tested with nonredundant datasets from the Protein Data Bank. It shows high reliability in CP identification and achieves an AUC of 0.9. SeqCP has been implemented as a web server available at: http://pcnas.life.nthu.edu.tw/SeqCP/.
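The idea behind comparing a sequence against a duplicated copy of another can be illustrated with a toy exact-match case. The sketch below is not the SeqCP algorithm, which works with alignments of diverged sequences; it only assumes two equal-length sequences that are identical up to a rotation, and the function name `find_cp_site` is hypothetical.

```python
from typing import Optional


def find_cp_site(seq_a: str, seq_b: str) -> Optional[int]:
    """Toy circular-permutation check for exact rotations (illustrative only).

    Returns the 0-based index in seq_b at which seq_a starts, so that
    seq_a == seq_b[i:] + seq_b[:i], or None if seq_b is not a rotation
    of seq_a. An index of 0 means the two sequences are identical.
    """
    if not seq_a or len(seq_a) != len(seq_b):
        return None
    # Duplicating seq_b makes every rotation of it appear as a substring.
    doubled = seq_b + seq_b
    idx = doubled.find(seq_a)
    return idx if idx != -1 else None


if __name__ == "__main__":
    # "IAKQRMKTAY" is "MKTAYIAKQR" with its first five residues moved to the end.
    print(find_cp_site("MKTAYIAKQR", "IAKQRMKTAY"))
```

Real circular permutants differ by substitutions and indels as well, so a practical detector would replace the exact substring search with an alignment against the duplicated sequence.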
Key Words
- AUC, area under the ROC curve
- CE, combinatorial extension
- CE-CP, CE with Circular Permutations
- CP, circular permutation
- CPDB, Circular Permutation Database
- CPMs, circular permutants
- CPSARST, Circular Permutation Search Aided by Ramachandran Sequential Transformation
- Circular permutants
- Circular permutation
- MCC, Matthews correlation coefficient
- Protein sequence analysis
- Protein structure modeling
- RMSD, root-mean-square distance
- ROC, receiver operating characteristic
4
Liu C, Boland S, Scholle MD, Bardiot D, Marchand A, Chaltin P, Blatt LM, Beigelman L, Symons JA, Raboisson P, Gurard-Levin ZA, Vandyck K, Deval J. Dual inhibition of SARS-CoV-2 and human rhinovirus with protease inhibitors in clinical development. Antiviral Res 2021; 187:105020. [PMID: 33515606 PMCID: PMC7839511 DOI: 10.1016/j.antiviral.2021.105020] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Received: 12/08/2020] [Revised: 01/05/2021] [Accepted: 01/17/2021] [Indexed: 12/14/2022]
Abstract
The 3-chymotrypsin-like cysteine protease (3CLpro) of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is considered a major target for the discovery of direct antiviral agents. We previously reported the evaluation of SARS-CoV-2 3CLpro inhibitors in a novel self-assembled monolayer desorption ionization mass spectrometry (SAMDI-MS) enzymatic assay (Gurard-Levin et al., 2020). The assay was further improved by adding the rhinovirus HRV3C protease to the same well as the SARS-CoV-2 3CLpro enzyme. High substrate specificity for each enzyme allowed the proteases to be combined in a single assay reaction without interfering with their individual activities. This novel duplex assay was used to profile a diverse set of reference protease inhibitors. The protease inhibitors were grouped into three categories based on their relative potency against 3CLpro and HRV3C including those that are: equipotent against 3CLpro and HRV3C (GC376 and calpain inhibitor II), selective for 3CLpro (PF-00835231, calpain inhibitor XII, boceprevir), and selective for HRV3C (rupintrivir). Structural analysis showed that the combination of minimal interactions, conformational flexibility, and limited bulk allows GC376 and calpain inhibitor II to potently inhibit both enzymes. In contrast, bulkier compounds interacting more tightly with pockets P2, P3, and P4 due to optimization for a specific target display a more selective inhibition profile. Consistently, the most selective viral protease inhibitors were relatively weak inhibitors of human cathepsin L. Taken together, these results can guide the design of cysteine protease inhibitors that are either virus-specific or retain a broad antiviral spectrum against coronaviruses and rhinoviruses.
Affiliation(s)
- Cheng Liu
  - Aligos Therapeutics, Inc., South San Francisco, USA
- Patrick Chaltin
  - Cistim, Leuven, Belgium; Centre for Drug Design and Discovery (CD3), KU Leuven, Leuven, Belgium
- Jerome Deval
  - Aligos Therapeutics, Inc., South San Francisco, USA
5
Wen Z, He J, Huang SY. Topology-independent and global protein structure alignment through an FFT-based algorithm. Bioinformatics 2020; 36:478-486. [PMID: 31384919 DOI: 10.1093/bioinformatics/btz609] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 02/18/2019] [Revised: 07/22/2019] [Accepted: 08/02/2019] [Indexed: 12/12/2022] Open
Abstract
MOTIVATION Protein structure alignment is one of the fundamental problems in computational structural biology. A variety of algorithms have been developed to address this important issue in the past decade. However, due to their heuristic nature, current structure alignment methods may suffer from suboptimal alignment and/or over-fragmentation and thus lead to a biologically wrong alignment in some cases. To overcome these limitations, we have developed an accurate topology-independent and global structure alignment method through an FFT-based exhaustive search algorithm, which is referred to as FTAlign. RESULTS Our FTAlign algorithm was extensively tested on six commonly used datasets and compared with seven state-of-the-art structure alignment approaches: TMalign, DeepAlign, Kpax, 3DCOMB, MICAN, SPalignNS and CLICK. It was shown that FTAlign outperformed the other methods in reproducing manually curated alignments and obtained high success rates of 96.7 and 90.0% on two gold-standard benchmarks, MALIDUP and MALISAM, respectively. Moreover, FTAlign also achieved the overall best performance in terms of biologically meaningful structure overlap (SO) and TMscore on both the sequential alignment test sets, including MALIDUP, MALISAM and 64 difficult cases from HOMSTRAD, and the non-sequential sets, including MALIDUP-NS, MALISAM-NS and 199 topology-different cases, with FTAlign showing a particular advantage for non-sequential alignment. Despite its global search feature, FTAlign is also computationally efficient and can normally complete a pairwise alignment within one second. AVAILABILITY AND IMPLEMENTATION http://huanglab.phys.hust.edu.cn/ftalign/.
Affiliation(s)
- Zeyu Wen
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, People's Republic of China
- Jiahua He
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, People's Republic of China
- Sheng-You Huang
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, People's Republic of China
6
Benchmarking Methods of Protein Structure Alignment. J Mol Evol 2020; 88:575-597. [PMID: 32725409 DOI: 10.1007/s00239-020-09960-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 02/27/2020] [Accepted: 07/10/2020] [Indexed: 10/23/2022]
Abstract
The function of a protein is primarily determined by its structure and amino acid sequence. Many biological questions of interest rely on being able to accurately determine the group of structures to which domains of a protein belong; this can be done through alignment and comparison of protein structures. Dozens of different methods for Protein Structure Alignment (PSA) have been proposed that use a wide range of techniques. The aim of this study is to determine the ability of PSA methods to identify pairs of protein domains known to share differing levels of structural similarity, and to assess their utility for clustering domains from several different folds into known groups. We present the results of a comprehensive investigation into eighteen PSA methods, to our knowledge the largest piece of independent research on this topic. Overall, SP-AlignNS (non-sequential) was found to be the best method for classification, and among the best performing methods for clustering. Methods (where possible) were split into the algorithm used to find the optimal alignment and the score used to assess similarity. This allowed us to largely separate the algorithm from the score it maximizes and thus, to assess their effectiveness independently of each other. Surprisingly, we found that some hybrids of mismatched scores and algorithms performed better than either of the native methods at classification and, in some cases, clustering as well. It is hoped that this investigation and the accompanying discussion will be useful for researchers selecting or designing methods to align protein structures.
7
Frank M, Beccati D, Leeflang BR, Vliegenthart JFG. C-Mannosylation Enhances the Structural Stability of Human RNase 2. iScience 2020; 23:101371. [PMID: 32739833 PMCID: PMC7399192 DOI: 10.1016/j.isci.2020.101371] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 03/20/2020] [Revised: 06/22/2020] [Accepted: 07/13/2020] [Indexed: 12/25/2022] Open
Abstract
C-Mannosylation is a relatively rare form of protein glycosylation involving the attachment of an α-mannopyranosyl residue to C-2 of the indole moiety of the amino acid tryptophan. This type of linkage was initially discovered in RNase 2 from human urine but later confirmed to be present in many other important proteins. Based on NMR experiments and extensive molecular dynamics simulations on the hundred microsecond timescale we demonstrate that, for isolated glycopeptides and denatured RNase 2, the C-linked mannopyranosyl residue exists as an ensemble of conformations, among which ¹C₄ is the most abundant. However, for native RNase 2, molecular dynamics and NMR studies revealed that the mannopyranosyl residue favors a specific conformation, which optimally stabilizes the protein fold through a network of hydrogen bonds and which leads to a significant reduction of the protein dynamics on the microsecond timescale. Our findings contribute to the understanding of the biological role of C-mannosylation.
- NMR and MD show that C-linked mannose exists as an ensemble of conformations
- Conformation of mannose is influenced by the protein environment and solvent
- In RNase 2 mannose favors a conformation that optimally stabilizes the protein fold
- Efficient methods for analysis of a large number of MD trajectories are presented
Affiliation(s)
- Daniela Beccati
  - Bijvoet Center, Division of Bio-Organic Chemistry, Utrecht University, Padualaan 8, Utrecht 3584 CH, The Netherlands
- Bas R Leeflang
  - Bijvoet Center, Division of Bio-Organic Chemistry, Utrecht University, Padualaan 8, Utrecht 3584 CH, The Netherlands
- Johannes F G Vliegenthart
  - Bijvoet Center, Division of Bio-Organic Chemistry, Utrecht University, Padualaan 8, Utrecht 3584 CH, The Netherlands
8
Koo N, Shin AY, Oh S, Kim H, Hong S, Park SJ, Sim YM, Byeon I, Kim KY, Lim YP, Kwon SY, Kim YM. Comprehensive analysis of Translationally Controlled Tumor Protein (TCTP) provides insights for lineage-specific evolution and functional divergence. PLoS One 2020; 15:e0232029. [PMID: 32374732 PMCID: PMC7202613 DOI: 10.1371/journal.pone.0232029] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 10/22/2019] [Accepted: 04/06/2020] [Indexed: 12/28/2022] Open
Abstract
BACKGROUND Translationally controlled tumor protein (TCTP) is a conserved, multifunctional protein involved in numerous cellular processes in eukaryotes. Although the functions of TCTP have been investigated sporadically in animals, invertebrates, and plants, few lineage-specific activities of this molecule have been reported. An exception is in Arabidopsis thaliana, in which TCTP (AtTCTP1) functions in stomatal closure by regulating microtubule stability. Further, although the development of next-generation sequencing technologies has facilitated the analysis of many eukaryotic genomes in public databases, inter-kingdom comparative analyses using available genome information are comparatively scarce. METHODOLOGY To carry out inter-kingdom comparative analysis of TCTP, TCTP genes were identified from 377 species. Then phylogenetic analysis, prediction of protein structure, molecular docking simulation and molecular dynamics analysis were performed to investigate the evolution of TCTP genes and their binding proteins. RESULTS A total of 533 TCTP genes were identified from 377 eukaryotic species, including protozoa, fungi, invertebrates, vertebrates, and plants. Phylogenetic and secondary structure analyses reveal lineage-specific evolution of TCTP, and inter-kingdom comparisons highlight the lineage-specific emergence of, or changes in, secondary structure elements in TCTP proteins from different kingdoms. Furthermore, secondary structure comparisons between TCTP proteins within each kingdom, combined with measurements of the degree of sequence conservation, suggest that TCTP genes have evolved to conserve protein secondary structures in a lineage-specific manner. Additional tertiary structure analysis of TCTP-binding proteins and their interacting partners and docking simulations between these proteins further imply that TCTP gene variation may influence the tertiary structures of TCTP-binding proteins in a lineage-specific manner.
CONCLUSIONS Our analysis suggests that TCTP has undergone lineage-specific evolution and that structural changes in TCTP proteins may correlate with the tertiary structure of TCTP-binding proteins and their binding partners in a lineage-specific manner.
Affiliation(s)
- Namjin Koo
  - Korean Bioinformation Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, Republic of Korea
- Ah-Young Shin
  - Plant Systems Engineering Research Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, Republic of Korea
- Sangho Oh
  - Korean Bioinformation Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, Republic of Korea
- Hyeongmin Kim
  - Korean Bioinformation Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, Republic of Korea
  - Department of Biomedical Informatics, Center for Genome Science, National Institute of Health, KCDC, Choongchung-Buk-do, Republic of Korea
- Seongmin Hong
  - Korean Bioinformation Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, Republic of Korea
  - Molecular Genetics and Genomics Laboratory, Department of Horticulture, College of Agriculture and Life Science, Chungnam National University, Daejeon, Korea
- Seong-Jin Park
  - Korean Bioinformation Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, Republic of Korea
- Young Mi Sim
  - Korean Bioinformation Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, Republic of Korea
- Iksu Byeon
  - Korean Bioinformation Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, Republic of Korea
- Kye Young Kim
  - Korean Bioinformation Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, Republic of Korea
- Yong Pyo Lim
  - Molecular Genetics and Genomics Laboratory, Department of Horticulture, College of Agriculture and Life Science, Chungnam National University, Daejeon, Korea
- Suk-Yoon Kwon
  - Plant Systems Engineering Research Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, Republic of Korea
- Yong-Min Kim
  - Korean Bioinformation Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, Republic of Korea
9
Fallaize CJ, Green PJ, Mardia KV, Barber S. Bayesian protein sequence and structure alignment. J R Stat Soc Ser C Appl Stat 2020. [DOI: 10.1111/rssc.12394] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
Affiliation(s)
- Peter J. Green
  - University of Bristol, UK
  - University of Technology Sydney, Australia
10
Saidi R, Dhifli W, Maddouri M, Mephu Nguifo E. Efficiently Mining Recurrent Substructures from Protein Three-Dimensional Structure Graphs. J Comput Biol 2019; 26:561-571. [DOI: 10.1089/cmb.2018.0171] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/13/2022] Open
Affiliation(s)
- Rabie Saidi
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, United Kingdom
- Wajdi Dhifli
  - University of Lille, Faculty of Pharmaceutical and Biological Sciences, EA2694, F-59000 Lille, France
- Mondher Maddouri
  - University of Jeddah, School of Business, Jeddah, Kingdom of Saudi Arabia
11
Antlion optimization algorithm for pairwise structural alignment with bi-objective functions. Neural Comput Appl 2019. [DOI: 10.1007/s00521-019-04176-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 10/27/2022]
12
Lee J, Son A, Kim P, Kwon SB, Yu JE, Han G, Seong BL. RNA‐dependent chaperone (chaperna) as an engineered pro‐region for the folding of recombinant microbial transglutaminase. Biotechnol Bioeng 2019; 116:490-502. [DOI: 10.1002/bit.26879] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 08/29/2018] [Revised: 11/15/2018] [Accepted: 11/22/2018] [Indexed: 12/14/2022]
Affiliation(s)
- Jinhee Lee
  - Department of Integrated OMICS for Biomedical Science, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
- Ahyun Son
  - Department of Integrated OMICS for Biomedical Science, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
  - Present affiliation: Department of Chemistry and Biochemistry, Knoebel Institute for Healthy Aging, University of Denver, Denver, Colorado
- Paul Kim
  - Department of Integrated OMICS for Biomedical Science, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
- Soon Bin Kwon
  - Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
- Ji Eun Yu
  - Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
- Gyoonhee Han
  - Department of Integrated OMICS for Biomedical Science, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
  - Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
- Baik L. Seong
  - Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
13
Kim P, Jang YH, Kwon SB, Lee CM, Han G, Seong BL. Glycosylation of Hemagglutinin and Neuraminidase of Influenza A Virus as Signature for Ecological Spillover and Adaptation among Influenza Reservoirs. Viruses 2018; 10:v10040183. [PMID: 29642453 PMCID: PMC5923477 DOI: 10.3390/v10040183] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Received: 02/25/2018] [Revised: 03/25/2018] [Accepted: 04/05/2018] [Indexed: 12/12/2022] Open
Abstract
Glycosylation of the hemagglutinin (HA) and neuraminidase (NA) of influenza virus provides a crucial means for immune evasion and viral fitness in a host population. However, the time-dependent dynamics of individual glycosylation sites have not been addressed. We monitored the potential N-linked glycosylation (NLG) sites of over 10,000 HA and NA sequences of the H1N1 subtype isolated from human, avian, and swine species over the past century. The results show a shift in glycosylation sites as a hallmark of the 1918 and 2009 pandemics, and also of the 1976 “abortive pandemic”. Co-segregation of particular glycosylation sites was identified as a characteristic of zoonotic transmission from animal reservoirs, and interestingly, of “reverse zoonosis” of human viruses into swine populations as well. After the 2009 pandemic, recent isolates accrued glycosylation at canonical sites in HA, reflecting gradual seasonal adaptation, and a novel glycosylation in NA as an independent signature for adaptation among humans. Structural predictions indicated a remarkably pleiotropic influence of glycans on multiple HA epitopes for immune evasion, without sacrificing the receptor binding of HA or the activity of NA. The results provide a rationale for establishing the ecological niche of influenza viruses among the reservoirs and could be implemented for influenza surveillance and for improving pandemic preparedness.
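Potential N-linked glycosylation sites are commonly located by scanning for the sequon N-X-S/T, where X is any residue except proline. The sketch below shows one way such a scan might look; it is an illustration rather than the authors' pipeline, the function name `find_nlg_sites` and the example sequence are made up, and the input is assumed to be an upper-case one-letter amino acid string.

```python
import re

# Sequon N-X-[S/T] with X != P. The lookahead lets overlapping sequons
# (e.g. "NNTS") each be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_nlg_sites(protein_seq: str) -> list:
    """Return 1-based positions of the Asn in each potential NLG sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(protein_seq)]


if __name__ == "__main__":
    # Hypothetical peptide: "NAS" and the overlapping "NNT"/"NTS" qualify,
    # while "NPT" is skipped because proline occupies the X position.
    print(find_nlg_sites("MNASNPTANNTS"))
```

Counting these positions per isolate and per year would give the kind of site-by-time table the abstract describes, although whether a predicted sequon is actually glycosylated cannot be decided from sequence alone.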
Affiliation(s)
- Paul Kim
  - Vaccine Translational Research Center, Yonsei University, 50 Yonsei-ro, Seodaemun-gu, Seoul 03722, Korea.
  - Department of Integrated OMICS for Biomedical Science, College of World Class University, Yonsei University, 50 Yonsei-ro, Seodaemun-gu, Seoul 03722, Korea.
- Yo Han Jang
  - Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, 50 Yonsei-ro, Seodaemun-gu, Seoul 03722, Korea.
- Soon Bin Kwon
  - Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, 50 Yonsei-ro, Seodaemun-gu, Seoul 03722, Korea.
  - Vaccine Translational Research Center, Yonsei University, 50 Yonsei-ro, Seodaemun-gu, Seoul 03722, Korea.
- Chung Min Lee
  - Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, 50 Yonsei-ro, Seodaemun-gu, Seoul 03722, Korea.
  - Biomedicine Pharmaceutical Group, CJ Healthcare R&D Center, CJ HealthCare, 811 Deokpyeong-ro, Majang-myeon, Icheon 17389, Korea.
- Gyoonhee Han
  - Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, 50 Yonsei-ro, Seodaemun-gu, Seoul 03722, Korea.
  - Department of Integrated OMICS for Biomedical Science, College of World Class University, Yonsei University, 50 Yonsei-ro, Seodaemun-gu, Seoul 03722, Korea.
- Baik Lin Seong
  - Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, 50 Yonsei-ro, Seodaemun-gu, Seoul 03722, Korea.
  - Vaccine Translational Research Center, Yonsei University, 50 Yonsei-ro, Seodaemun-gu, Seoul 03722, Korea.
14
Aronsson A, Güler F, Petoukhov MV, Crennell SJ, Svergun DI, Linares-Pastén JA, Nordberg Karlsson E. Structural insights of Rm Xyn10A – A prebiotic-producing GH10 xylanase with a non-conserved aglycone binding region. Biochim Biophys Acta Proteins Proteom 2018; 1866:292-306. [DOI: 10.1016/j.bbapap.2017.11.006] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 03/20/2017] [Revised: 10/05/2017] [Accepted: 11/12/2017] [Indexed: 02/02/2023]
15
Dhifli W, Diallo AB. ProtNN: fast and accurate protein 3D-structure classification in structural and topological space. BioData Min 2016; 9:30. [PMID: 27688811 PMCID: PMC5034655 DOI: 10.1186/s13040-016-0108-2] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 03/23/2016] [Accepted: 08/22/2016] [Indexed: 11/30/2022] Open
Abstract
Background Studying the functions and structures of proteins is important for understanding the molecular mechanisms of life. The number of publicly available protein structures has grown extremely large. Still, the classification of a protein structure remains a difficult, costly, and time-consuming task. The difficulties are often due to the essential role of spatial and topological structures in the classification of protein structures. Results We propose ProtNN, a novel classification approach for protein 3D-structures. Given an unannotated query protein structure and a set of annotated proteins, ProtNN assigns to the query protein the class with the highest number of votes across the k nearest neighbor reference proteins, where k is a user-defined parameter. The search for the nearest neighbor annotated structures is based on a protein-graph representation model and pairwise similarities between vector embeddings of the query and the reference protein structures in structural and topological spaces. Conclusions We demonstrate through an extensive experimental evaluation that ProtNN is able to accurately classify several datasets with an extremely fast runtime compared to state-of-the-art approaches. We further show that ProtNN is able to scale up to a whole PDB dataset in a single-process mode with no parallelization, with a runtime speedup on the order of thousands compared to state-of-the-art approaches.
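The voting step described above, assigning the majority class among the k nearest reference proteins, can be sketched generically. This is not ProtNN's code: the embeddings here are made-up two-dimensional vectors, plain Euclidean distance stands in for whatever similarity the method uses, and the function name `knn_classify` is hypothetical.

```python
import math
from collections import Counter


def knn_classify(query, references, k=3):
    """Majority vote over the k reference vectors closest to the query.

    `references` is a list of (vector, class_label) pairs. Ties in the
    vote are resolved in favor of the class seen first among the
    nearest neighbors.
    """
    # Rank annotated reference vectors by Euclidean distance to the query.
    ranked = sorted(references, key=lambda item: math.dist(query, item[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical embeddings of annotated structures with fold classes.
    annotated = [
        ((0.0, 0.0), "alpha"),
        ((0.0, 1.0), "alpha"),
        ((1.0, 0.0), "alpha"),
        ((5.0, 5.0), "beta"),
        ((6.0, 5.0), "beta"),
    ]
    print(knn_classify((0.5, 0.5), annotated, k=3))
```

Because k is user-defined, a small k follows local structure closely while a larger k smooths over noisy annotations; this brute-force version rescans every reference per query, which a large-scale method would avoid.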
Affiliation(s)
- Wajdi Dhifli
- Department of Computer Science, University of Quebec At Montreal, PO box 8888, Downtown station, Montreal, H3C 3P8 Canada
- Abdoulaye Baniré Diallo
- Department of Computer Science, University of Quebec At Montreal, PO box 8888, Downtown station, Montreal, H3C 3P8 Canada
|
16
|
Identification of amino acid networks governing catalysis in the closed complex of class I terpene synthases. Proc Natl Acad Sci U S A 2016; 113:E958-67. [PMID: 26842837 DOI: 10.1073/pnas.1519680113] [Citation(s) in RCA: 49] [Impact Index Per Article: 5.4] [Indexed: 11/18/2022] Open
Abstract
Class I terpene synthases generate the structural core of bioactive terpenoids. Deciphering structure-function relationships in the reactive closed complex and targeted engineering is hampered by highly dynamic carbocation rearrangements during catalysis. Available crystal structures, however, represent the open, catalytically inactive form or harbor nonproductive substrate analogs. Here, we present a catalytically relevant, closed conformation of taxadiene synthase (TXS), the model class I terpene synthase, which simulates the initial catalytic time point. In silico modeling of subsequent catalytic steps allowed unprecedented insights into the dynamic reaction cascades and promiscuity mechanisms of class I terpene synthases. This generally applicable methodology enables the active-site localization of carbocations and demonstrates the presence of an active-site base motif and its dominating role during catalysis. It additionally allowed in silico-designed targeted protein engineering that unlocked the path to alternate monocyclic and bicyclic synthons representing the basis of a myriad of bioactive terpenoids.
|
17
|
Gutiérrez FI, Rodriguez-Valenzuela F, Ibarra IL, Devos DP, Melo F. Efficient and automated large-scale detection of structural relationships in proteins with a flexible aligner. BMC Bioinformatics 2016; 17:20. [PMID: 26732380 PMCID: PMC4702403 DOI: 10.1186/s12859-015-0866-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 08/19/2015] [Accepted: 12/21/2015] [Indexed: 12/01/2022] Open
Abstract
Background The total number of known three-dimensional protein structures is rapidly increasing. Consequently, the need for fast structural search against complete databases without a significant loss of accuracy is increasingly demanding. Recently, TopSearch, an ultra-fast method for finding rigid structural relationships between a query structure and the complete Protein Data Bank (PDB), at the multi-chain level, has been released. However, comparable accurate flexible structural aligners to perform efficient whole database searches of multi-domain proteins are not yet available. The availability of such a tool is critical for a sustainable boosting of biological discovery. Results Here we report on the development of a new method for the fast and flexible comparison of protein structure chains. The method relies on the calculation of 2D matrices containing a description of the three-dimensional arrangement of secondary structure elements (angles and distances). The comparison involves the matching of an ensemble of substructures through a nested-two-steps dynamic programming algorithm. The unique features of this new approach are the integration and trade-off balancing of the following: 1) speed, 2) accuracy and 3) global and semiglobal flexible structure alignment by integration of local substructure matching. The comparison, and matching with competitive accuracy, of one medium sized (250-aa) query structure against the complete PDB database (216,322 protein chains) takes about 8 min using an average desktop computer. The method is at least 2–3 orders of magnitude faster than other tested tools with similar accuracy. We validate the performance of the method for fold and superfamily assignment in a large benchmark set of protein structures. We finally provide a series of examples to illustrate the usefulness of this method and its application in biological discovery. 
Conclusions The method is able to detect partial structure matching, rigid body shifts, conformational changes and tolerates substantial structural variation arising from insertions, deletions and sequence divergence, as well as structural convergence of unrelated proteins. Electronic supplementary material The online version of this article (doi:10.1186/s12859-015-0866-8) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Fernando I Gutiérrez
- Departamento de Genética Molecular y Microbiología, Facultad de Ciencias Biológicas, Pontificia Universidad Católica de Chile, Alameda 340, Santiago, Chile; Centre for Organismal Studies (COS), Heidelberg University, Heidelberg, Germany
- Felipe Rodriguez-Valenzuela
- Departamento de Genética Molecular y Microbiología, Facultad de Ciencias Biológicas, Pontificia Universidad Católica de Chile, Alameda 340, Santiago, Chile
- Ignacio L Ibarra
- Departamento de Genética Molecular y Microbiología, Facultad de Ciencias Biológicas, Pontificia Universidad Católica de Chile, Alameda 340, Santiago, Chile; Centro Andaluz de Biología del Desarrollo (CABD), Universidad Pablo de Olavide, Sevilla, Spain
- Damien P Devos
- Centre for Organismal Studies (COS), Heidelberg University, Heidelberg, Germany; Centro Andaluz de Biología del Desarrollo (CABD), Universidad Pablo de Olavide, Sevilla, Spain
- Francisco Melo
- Departamento de Genética Molecular y Microbiología, Facultad de Ciencias Biológicas, Pontificia Universidad Católica de Chile, Alameda 340, Santiago, Chile
|
18
|
Stamm M, Forrest LR. Structure alignment of membrane proteins: Accuracy of available tools and a consensus strategy. Proteins 2015; 83:1720-32. [PMID: 26178143 DOI: 10.1002/prot.24857] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 02/25/2015] [Revised: 05/07/2015] [Accepted: 06/07/2015] [Indexed: 12/31/2022]
Abstract
Protein structure alignment methods are used for the detection of evolutionary and functionally related positions in proteins. A wide array of different methods are available, but the choice of the best method is often not apparent to the user. Several studies have assessed the alignment accuracy and consistency of structure alignment methods, but none of these explicitly considered membrane proteins, which are important targets for drug development and have distinct structural features. Here, we compared 13 widely used pairwise structural alignment methods on a test set of homologous membrane protein structures (called HOMEP3). Each pair of structures was aligned and the corresponding sequence alignment was used to construct homology models. The model accuracy compared to the known structures was assessed using scoring functions not incorporated in the tested structural alignment methods. The analysis shows that fragment-based approaches such as FR-TM-align are the most useful for aligning structures of membrane proteins. Moreover, fragment-based approaches are more suitable for comparison of protein structures that have undergone large conformational changes. Nevertheless, no method was clearly superior to all other methods. Additionally, all methods lack a measure to rate the reliability of a position within a structure alignment. To solve both of these problems, we propose a consensus-type approach, combining alignments from four different methods, namely FR-TM-align, DaliLite, MATT, and FATCAT. Agreement between the methods is used to assign confidence values to each position of the alignment. Overall, we conclude that there remains scope for the improvement of structural alignment methods for membrane proteins.
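One way to read the consensus idea in this abstract is to score each aligned residue pair by the fraction of methods that report it. The sketch below makes that assumption explicit; the data layout (one set of residue-index pairs per method) and the function name are hypothetical, and the published scheme may weight or combine methods differently.

```python
from collections import Counter


def consensus_confidence(alignments):
    """Map each aligned residue pair to the fraction of methods reporting it.

    alignments is a list with one entry per method; each entry is an
    iterable of (i, j) pairs, meaning residue i of the first structure
    is aligned to residue j of the second (hypothetical representation).
    """
    if not alignments:
        raise ValueError("at least one alignment is required")
    # set() guards against a method listing the same pair twice.
    counts = Counter(pair for aln in alignments for pair in set(aln))
    total = len(alignments)
    return {pair: count / total for pair, count in counts.items()}


# Four toy alignments standing in for four alignment methods.
method_outputs = [
    {(1, 1), (2, 2), (3, 3)},
    {(1, 1), (2, 2), (3, 4)},
    {(1, 1), (2, 3), (3, 4)},
    {(1, 1), (2, 2)},
]
confidence = consensus_confidence(method_outputs)
for pair in sorted(confidence):
    print(pair, confidence[pair])
```

A pair reported by every method gets confidence 1.0, while a pair reported by a single method out of four gets 0.25, which gives the per-position reliability measure the abstract says individual methods lack.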
Affiliation(s)
- Marcus Stamm
- Computational Structural Biology Group, Max Planck Institute of Biophysics, Frankfurt Am Main, Germany
- Lucy R Forrest
- Computational Structural Biology Group, Max Planck Institute of Biophysics, Frankfurt Am Main, Germany; Computational Structural Biology Section, National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, Maryland
|
19
|
Zhao C, Sacan A. UniAlign: protein structure alignment meets evolution. Bioinformatics 2015; 31:3139-46. [PMID: 26059715 DOI: 10.1093/bioinformatics/btv354] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 02/11/2015] [Accepted: 06/02/2015] [Indexed: 11/15/2022] Open
Abstract
MOTIVATION During the evolution, functional sites on the surface of the protein as well as the hydrophobic core maintaining the structural integrity are well-conserved. However, available protein structure alignment methods align protein structures based solely on the 3D geometric similarity, limiting their ability to detect functionally relevant correspondences between the residues of the proteins, especially for distantly related homologous proteins. RESULTS In this article, we propose a new protein pairwise structure alignment algorithm (UniAlign) that incorporates additional evolutionary information captured in the form of sequence similarity, sequence profiles and residue conservation. We define a per-residue score (UniScore) as a weighted sum of these and other features and develop an iterative optimization procedure to search for an alignment with the best overall UniScore. Our extensive experiments on CDD, HOMSTRAD and BAliBASE benchmark datasets show that UniAlign outperforms commonly used structure alignment methods. We further demonstrate UniAlign's ability to develop family-specific models to drastically improve the quality of the alignments. AVAILABILITY AND IMPLEMENTATION UniAlign is available as a web service at: http://sacan.biomed.drexel.edu/unialign CONTACT ahmet.sacan@drexel.edu SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
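The abstract defines the per-residue score as a weighted sum of features. A minimal sketch of that arithmetic follows; the feature names, weights, and values are invented for illustration and are not UniAlign's actual features or parameters.

```python
def weighted_score(features, weights):
    """Weighted sum of named feature values for one aligned residue pair."""
    missing = set(weights) - set(features)
    if missing:
        raise KeyError(f"missing feature values: {sorted(missing)}")
    return sum(weights[name] * features[name] for name in weights)


def alignment_score(aligned_pairs, weights):
    """Total score of an alignment: the sum of its per-pair weighted scores."""
    return sum(weighted_score(features, weights) for features in aligned_pairs)


# Hypothetical weights and feature values, assumed to be scaled to [0, 1].
weights = {"geometry": 0.5, "sequence": 0.3, "conservation": 0.2}
pair_a = {"geometry": 0.8, "sequence": 0.5, "conservation": 1.0}
pair_b = {"geometry": 0.4, "sequence": 0.0, "conservation": 0.5}

print(weighted_score(pair_a, weights))
print(alignment_score([pair_a, pair_b], weights))
```

An iterative optimizer of the kind the abstract mentions would repeatedly propose alignments and keep the one with the highest total; that search loop is outside the scope of this sketch.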
Affiliation(s)
- Chunyu Zhao
- Center for Integrated Bioinformatics, School of Biomedical Engineering, Science and Health System, Drexel University, Philadelphia, PA 19104, USA
- Ahmet Sacan
- Center for Integrated Bioinformatics, School of Biomedical Engineering, Science and Health System, Drexel University, Philadelphia, PA 19104, USA
|
20
|
Carugo O. Protomers of protein hetero-oligomers tend to resemble each other more than expected. SpringerPlus 2014; 3:680. [PMID: 26034682 PMCID: PMC4447755 DOI: 10.1186/2193-1801-3-680] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 07/16/2014] [Accepted: 11/14/2014] [Indexed: 11/26/2022]
Abstract
A large fraction of the proteome is made by proteins that are not permanently monomeric but form oligomeric assemblies, which can be either homo- or hetero-oligomeric. Here it is described that protomers of hetero-oligomeric proteins tend to resemble each other more than expected. This is verified by comparing the level of similarity of pairs of hetero-oligomeric protein protomers and of pairs of proteins that do not interact with each other. This observation, interesting per se, might reflect the evolution of hetero-oligomers from ancestral homo-oligomers, through gene duplication and paralogs divergence. However, other hypotheses cannot be excluded and the observed structural similarity might result from several causes.
Affiliation(s)
- Oliviero Carugo
- Department of Structural and Computational Biology, MFPL, Vienna University, Vienna, Austria; Department of Chemistry, University of Pavia, Pavia, Italy
|
21
|
Micale G, Pulvirenti A, Giugno R, Ferro A. Proteins comparison through probabilistic optimal structure local alignment. Front Genet 2014; 5:302. [PMID: 25228906 PMCID: PMC4151033 DOI: 10.3389/fgene.2014.00302] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 05/31/2014] [Accepted: 08/12/2014] [Indexed: 11/13/2022] Open
Abstract
Multiple local structure comparison helps to identify common structural motifs or conserved binding sites in 3D structures in distantly related proteins. Since there is no best way to compare structures and evaluate the alignment, a wide variety of techniques and different similarity scoring schemes have been proposed. Existing algorithms usually compute the best superposition of two structures or attempt to solve it as an optimization problem in a simpler setting (e.g., considering contact maps or distance matrices). Here, we present PROPOSAL (PROteins comparison through Probabilistic Optimal Structure local ALignment), a stochastic algorithm based on iterative sampling for multiple local alignment of protein structures. Our method can efficiently find conserved motifs across a set of protein structures. Only the distances between all pairs of residues in the structures are computed. To show the accuracy and the effectiveness of PROPOSAL we tested it on a few families of protein structures. We also compared PROPOSAL with two state-of-the-art tools for pairwise local alignment on a dataset of manually annotated motifs. PROPOSAL is available as a Java 2D standalone application or a command line program at http://ferrolab.dmi.unict.it/proposal/proposal.html.
Affiliation(s)
- Giovanni Micale
- Department of Computer Science, University of Pisa, Pisa, Italy
- Alfredo Pulvirenti
- Department of Clinical and Molecular Biomedicine, University of Catania, Catania, Italy
- Rosalba Giugno
- Department of Clinical and Molecular Biomedicine, University of Catania, Catania, Italy
- Alfredo Ferro
- Department of Clinical and Molecular Biomedicine, University of Catania, Catania, Italy
|
22
|
Nicholls RA, Fischer M, McNicholas S, Murshudov GN. Conformation-independent structural comparison of macromolecules with ProSMART. Acta Crystallographica Section D: Biological Crystallography 2014; 70:2487-99. [PMID: 25195761 PMCID: PMC4157452 DOI: 10.1107/s1399004714016241] [Citation(s) in RCA: 150] [Impact Index Per Article: 13.6] [Received: 04/09/2014] [Accepted: 07/12/2014] [Indexed: 12/05/2023]
Abstract
The identification and exploration of (dis)similarities between macromolecular structures can help to gain biological insight, for instance when visualizing or quantifying the response of a protein to ligand binding. Obtaining a residue alignment between compared structures is often a prerequisite for such comparative analysis. If the conformational change of the protein is dramatic, conventional alignment methods may struggle to provide an intuitive solution for straightforward analysis. To make such analyses more accessible, the Procrustes Structural Matching Alignment and Restraints Tool (ProSMART) has been developed, which achieves a conformation-independent structural alignment, as well as providing such additional functionalities as the generation of restraints for use in the refinement of macromolecular models. Sensible comparison of protein (or DNA/RNA) structures in the presence of conformational changes is achieved by enforcing neither chain nor domain rigidity. The visualization of results is facilitated by popular molecular-graphics software such as CCP4mg and PyMOL, providing intuitive feedback regarding structural conservation and subtle dissimilarities between close homologues that can otherwise be hard to identify. Automatically generated colour schemes corresponding to various residue-based scores are provided, which allow the assessment of the conservation of backbone and side-chain conformations relative to the local coordinate frame. Structural comparison tools such as ProSMART can help to break the complexity that accompanies the constantly growing pool of structural data into a more readily accessible form, potentially offering biological insight or influencing subsequent experiments.
Affiliation(s)
- Robert A. Nicholls
- Structural Studies Division, MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge Biomedical Campus, Cambridge CB2 0QH, England
- Marcus Fischer
- Department of Pharmaceutical Chemistry, University of California San Francisco, San Francisco, CA 94158, USA
- Stuart McNicholas
- Structural Biology Laboratory, Department of Chemistry, University of York, Heslington, York YO10 5DD, England
- Garib N. Murshudov
- Structural Studies Division, MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge Biomedical Campus, Cambridge CB2 0QH, England
|
23
|
Ma J, Wang S. Algorithms, Applications, and Challenges of Protein Structure Alignment. Advances in Protein Chemistry and Structural Biology 2014; 94:121-75. [DOI: 10.1016/b978-0-12-800168-4.00005-6] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.8] [Indexed: 12/29/2022]
|
24
|
Protein structure alignment beyond spatial proximity. Sci Rep 2013; 3:1448. [PMID: 23486213 PMCID: PMC3596798 DOI: 10.1038/srep01448] [Citation(s) in RCA: 98] [Impact Index Per Article: 8.2] [Received: 11/15/2012] [Accepted: 02/25/2013] [Indexed: 11/08/2022] Open
Abstract
Protein structure alignment is a fundamental problem in computational structure biology. Many programs have been developed for automatic protein structure alignment, but most of them align two protein structures purely based upon geometric similarity without considering evolutionary and functional relationship. As such, these programs may generate structure alignments which are not very biologically meaningful from the evolutionary perspective. This paper presents a novel method DeepAlign for automatic pairwise protein structure alignment. DeepAlign aligns two protein structures using not only spatial proximity of equivalent residues (after rigid-body superposition), but also evolutionary relationship and hydrogen-bonding similarity. Experimental results show that DeepAlign can generate structure alignments much more consistent with manually-curated alignments than other automatic tools especially when proteins under consideration are remote homologs. These results imply that in addition to geometric similarity, evolutionary information and hydrogen-bonding similarity are essential to aligning two protein structures.
|
25
|
Topham CM, Rouquier M, Tarrat N, André I. Adaptive Smith-Waterman residue match seeding for protein structural alignment. Proteins 2013; 81:1823-39. [DOI: 10.1002/prot.24327] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 02/23/2013] [Revised: 04/22/2013] [Accepted: 05/15/2013] [Indexed: 12/30/2022]
Affiliation(s)
- Christopher M. Topham
- Université de Toulouse, INSA, UPS, INP, LISBP; 135 Avenue de Rangueil F-31077 Toulouse France
- CNRS, UMR5504; F-31400 Toulouse France
- INRA, UMR792 Ingénierie des Systèmes Biologiques et des Procédés; F-31400 Toulouse France
- Mickaël Rouquier
- Université de Toulouse, INSA, UPS, INP, LISBP; 135 Avenue de Rangueil F-31077 Toulouse France
- CNRS, UMR5504; F-31400 Toulouse France
- INRA, UMR792 Ingénierie des Systèmes Biologiques et des Procédés; F-31400 Toulouse France
- Nathalie Tarrat
- Université de Toulouse, INSA, UPS, INP, LISBP; 135 Avenue de Rangueil F-31077 Toulouse France
- CNRS, UMR5504; F-31400 Toulouse France
- INRA, UMR792 Ingénierie des Systèmes Biologiques et des Procédés; F-31400 Toulouse France
- Isabelle André
- Université de Toulouse, INSA, UPS, INP, LISBP; 135 Avenue de Rangueil F-31077 Toulouse France
- CNRS, UMR5504; F-31400 Toulouse France
- INRA, UMR792 Ingénierie des Systèmes Biologiques et des Procédés; F-31400 Toulouse France
|
26
|
Khan MB, Sponder G, Sjöblom B, Svidová S, Schweyen RJ, Carugo O, Djinović-Carugo K. Structural and functional characterization of the N-terminal domain of the yeast Mg2+ channel Mrs2. Acta Crystallographica Section D: Biological Crystallography 2013; 69:1653-64. [DOI: 10.1107/s0907444913011712] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Received: 03/09/2013] [Accepted: 04/29/2013] [Indexed: 01/08/2023]
|
27
|
Cheraghi R, Hosseinkhani S, Davoodi J, Nazari M, Amini-Bayat Z, Karimi H, Shamseddin M, Gheidari F. Structural and functional effects of circular permutation on firefly luciferase: In vitro assay of caspase 3/7. Int J Biol Macromol 2013; 58:336-42. [DOI: 10.1016/j.ijbiomac.2013.04.015] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 11/28/2012] [Revised: 03/28/2013] [Accepted: 04/08/2013] [Indexed: 02/08/2023]
|
28
|
Wang JJY, Bensmail H, Gao X. Multiple graph regularized protein domain ranking. BMC Bioinformatics 2012; 13:307. [PMID: 23157331 PMCID: PMC3583823 DOI: 10.1186/1471-2105-13-307] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.8] [Received: 05/22/2012] [Accepted: 10/29/2012] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Protein domain ranking is a fundamental task in structural biology. Most protein domain ranking methods rely on the pairwise comparison of protein domains while neglecting the global manifold structure of the protein domain database. Recently, graph regularized ranking that exploits the global structure of the graph defined by the pairwise similarities has been proposed. However, the existing graph regularized ranking methods are very sensitive to the choice of the graph model and parameters, and this remains a difficult problem for most of the protein domain ranking methods. RESULTS To tackle this problem, we have developed the Multiple Graph regularized Ranking algorithm, MultiG-Rank. Instead of using a single graph to regularize the ranking scores, MultiG-Rank approximates the intrinsic manifold of protein domain distribution by combining multiple initial graphs for the regularization. Graph weights are learned with ranking scores jointly and automatically, by alternately minimizing an objective function in an iterative algorithm. Experimental results on a subset of the ASTRAL SCOP protein domain database demonstrate that MultiG-Rank achieves a better ranking performance than single graph regularized ranking methods and pairwise similarity based ranking methods. CONCLUSION The problem of graph model and parameter selection in graph regularized protein domain ranking can be solved effectively by combining multiple graphs. This aspect of generalization introduces a new frontier in applying multiple graphs to solving protein domain ranking applications.
Affiliation(s)
- Jim Jing-Yan Wang
- Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia.
|
29
|
Ritchie DW, Ghoorah AW, Mavridis L, Venkatraman V. Fast protein structure alignment using Gaussian overlap scoring of backbone peptide fragment similarity. Bioinformatics 2012; 28:3274-81. [DOI: 10.1093/bioinformatics/bts618] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.5] [Indexed: 11/12/2022] Open
|
30
|
Bonnel N, Marteau PF. LNA: fast protein structural comparison using a Laplacian characterization of tertiary structure. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2012; 9:1451-1458. [PMID: 22547433 DOI: 10.1109/tcbb.2012.64] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 05/31/2023]
Abstract
In the last two decades, a lot of protein 3D shapes have been discovered, characterized, and made available thanks to the Protein Data Bank (PDB), which is nevertheless growing very quickly. New scalable methods are thus urgently required to search through the PDB efficiently. This paper presents an approach entitled LNA (Laplacian Norm Alignment) that performs a structural comparison of two proteins with dynamic programming algorithms. This is achieved by characterizing each residue in the protein with scalar features. The feature values are calculated using a Laplacian operator applied on the graph corresponding to the adjacency matrix of the residues. The weighted Laplacian operator we use estimates, at various scales, local deformations of the topology where each residue is located. On some benchmarks, which are widely shared by the community, we obtain qualitatively similar results compared to other competing approaches, but with an algorithm one or two orders of magnitude faster. 180,000 protein comparisons can be done within 1 second with a single recent Graphical Processing Unit (GPU), which makes our algorithm very scalable and suitable for real-time database querying across the web.
Affiliation(s)
- Nicolas Bonnel
- IRISA, Université de Bretagne Sud, Campus de Tohannic, Vannes 56000, France.
|
31
|
Ho HK, Gange G, Kuiper MJ, Ramamohanarao K. BetaSearch: a new method for querying β-residue motifs. BMC Res Notes 2012; 5:391. [PMID: 22839199 PMCID: PMC3532365 DOI: 10.1186/1756-0500-5-391] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 05/10/2012] [Accepted: 06/15/2012] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Searching for structural motifs across known protein structures can be useful for identifying unrelated proteins with similar function and characterising secondary structures such as β-sheets. This is infeasible using conventional sequence alignment because linear protein sequences do not contain spatial information. β-residue motifs are β-sheet substructures that can be represented as graphs and queried using existing graph indexing methods, however, these approaches are designed for general graphs that do not incorporate the inherent structural constraints of β-sheets and require computationally-expensive filtering and verification procedures. 3D substructure search methods, on the other hand, allow β-residue motifs to be queried in a three-dimensional context but at significant computational costs. FINDINGS We developed a new method for querying β-residue motifs, called BetaSearch, which leverages the natural planar constraints of β-sheets by indexing them as 2D matrices, thus avoiding much of the computational complexities involved with structural and graph querying. BetaSearch exhibits faster filtering, verification, and overall query time than existing graph indexing approaches whilst producing comparable index sizes. Compared to 3D substructure search methods, BetaSearch achieves 33 and 240 times speedups over index-based and pairwise alignment-based approaches, respectively. Furthermore, we have presented case-studies to demonstrate its capability of motif matching in sequentially dissimilar proteins and described a method for using BetaSearch to predict β-strand pairing. CONCLUSIONS We have demonstrated that BetaSearch is a fast method for querying substructure motifs. The improvements in speed over existing approaches make it useful for efficiently performing high-volume exploratory querying of possible protein substructural motifs or conformations. 
BetaSearch was used to identify a nearly identical β-residue motif between an entirely synthetic (Top7) and a naturally-occurring protein (Charcot-Leyden crystal protein), as well as identifying structural similarities between biotin-binding domains of avidin, streptavidin and the lipocalin gamma subunit of human C8.
Affiliation(s)
- Hui Kian Ho
- Department of Computing and Information Systems, The University of Melbourne, Victoria, Australia.
|
32
|
Shealy P, Valafar H. Multiple structure alignment with msTALI. BMC Bioinformatics 2012; 13:105. [PMID: 22607234 PMCID: PMC3473313 DOI: 10.1186/1471-2105-13-105] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Received: 08/13/2011] [Accepted: 04/18/2012] [Indexed: 11/10/2022] Open
Abstract
Background Multiple structure alignments have received increasing attention in recent years as an alternative to multiple sequence alignments. Although multiple structure alignment algorithms can potentially be applied to a number of problems, they have primarily been used for protein core identification. A method that is capable of solving a variety of problems using structure comparison is still absent. Here we introduce a program msTALI for aligning multiple protein structures. Our algorithm uses several informative features to guide its alignments: torsion angles, backbone Cα atom positions, secondary structure, residue type, surface accessibility, and properties of nearby atoms. The algorithm allows the user to weight the types of information used to generate the alignment, which expands its utility to a wide variety of problems. Results msTALI exhibits competitive results on 824 families from the Homstrad and SABmark databases when compared to Matt and Mustang. We also demonstrate success at building a database of protein cores using 341 randomly selected CATH domains and highlight the contribution of msTALI compared to the CATH classifications. Finally, we present an example applying msTALI to the problem of detecting hinges in a protein undergoing rigid-body motion. Conclusions msTALI is an effective algorithm for multiple structure alignment. In addition to its performance on standard comparison databases, it utilizes clear, informative features, allowing further customization for domain-specific applications. The C++ source code for msTALI is available for Linux on the web at http://ifestos.cse.sc.edu/mstali.
Affiliation(s)
- Paul Shealy
- Department of Computer Science and Engineering, University of South Carolina, Columbia, SC 29208, USA
|
33
|
Wang J, Gao X, Wang Q, Li Y. ProDis-ContSHC: learning protein dissimilarity measures and hierarchical context coherently for protein-protein comparison in protein database retrieval. BMC Bioinformatics 2012; 13 Suppl 7:S2. [PMID: 22594999 PMCID: PMC3348016 DOI: 10.1186/1471-2105-13-s7-s2] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Indexed: 12/29/2022] Open
Abstract
BACKGROUND The need to retrieve or classify protein molecules using structure- or sequence-based similarity measures underlies a wide range of biomedical applications. Traditional protein search methods rely on a pairwise dissimilarity/similarity measure for comparing a pair of proteins. Such pairwise measures neglect the distribution of the other proteins in the database and thus cannot satisfy the need for high accuracy in retrieval systems. Recent work in the machine learning community has shown that exploiting the global structure of the database and learning contextual dissimilarity/similarity measures can improve retrieval performance significantly. However, most existing contextual dissimilarity/similarity learning algorithms work in an unsupervised manner and do not use the known class labels of proteins in the database. RESULTS In this paper, we propose a novel protein-protein dissimilarity learning algorithm, ProDis-ContSHC. ProDis-ContSHC regularizes an existing dissimilarity measure d_ij by considering the contextual information of the proteins. The context of a protein is defined by its neighboring proteins. The basic idea is that, for a pair of proteins (i, j), if their contexts N(i) and N(j) are similar to each other, the two proteins should also have a high similarity. We implement this idea by regularizing d_ij by a factor learned from the contexts N(i) and N(j). Moreover, we divide the context into hierarchical sub-contexts and obtain a contextual dissimilarity vector for each protein pair. Using the class label information of the proteins, we select the relevant (a pair of proteins that has the same class labels) and irrelevant (with different labels) protein pairs, and train an SVM model to distinguish between their contextual dissimilarity vectors. The SVM model is further used to learn a supervised regularizing factor.
Finally, with the new Supervised learned Dissimilarity measure, we update the Protein Hierarchical Context Coherently in an iterative algorithm, ProDis-ContSHC. We test the performance of ProDis-ContSHC on two benchmark sets, i.e., the ASTRAL 1.73 database and the FSSP/DALI database. Experimental results demonstrate that plugging our supervised contextual dissimilarity measures into the retrieval systems significantly outperforms the context-free dissimilarity/similarity measures and other unsupervised contextual dissimilarity measures that do not use the class label information. CONCLUSIONS Using the contextual proteins with their class labels in the database, we can improve the accuracy of the pairwise dissimilarity/similarity measures dramatically for the protein retrieval tasks. In this work, for the first time, we propose the idea of supervised contextual dissimilarity learning, resulting in the ProDis-ContSHC algorithm. Among different contextual dissimilarity learning approaches that can be used to compare a pair of proteins, ProDis-ContSHC provides the highest accuracy. Finally, ProDis-ContSHC compares favorably with other methods reported in the recent literature.
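The core idea of regularizing a pairwise dissimilarity by neighborhood context can be illustrated with a minimal sketch. It is not the ProDis-ContSHC algorithm: the supervised SVM step and the hierarchical sub-contexts are omitted, and the Jaccard overlap, the `strength` parameter, and the function names are assumptions made only for this illustration.

```python
def knn_context(D, i, k):
    """Return the indices of the k nearest neighbours of item i (excluding i)."""
    order = sorted((j for j in range(len(D)) if j != i), key=lambda j: D[i][j])
    return set(order[:k])


def contextual_dissimilarity(D, k=2, strength=0.5):
    """Shrink d_ij when the contexts N(i) and N(j) overlap.

    The factor (1 - strength * Jaccard(N(i), N(j))) is a hypothetical,
    unsupervised stand-in for the learned regularizing factor.
    """
    n = len(D)
    ctx = [knn_context(D, i, k) for i in range(n)]
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            union = ctx[i] | ctx[j]
            jaccard = len(ctx[i] & ctx[j]) / len(union) if union else 0.0
            out[i][j] = D[i][j] * (1.0 - strength * jaccard)
    return out
```

Pairs that share neighbours end up closer than their raw dissimilarity suggests, while pairs with disjoint contexts keep their original value.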
Affiliation(s)
- Jingyan Wang
- King Abdullah University of Science and Technology (KAUST), Mathematical and Computer Sciences and Engineering Division, Thuwal, 23955-6900, Saudi Arabia
|
34
|
Salem S, Zaki MJ, Bystroff C. Iterative non-sequential protein structural alignment. J Bioinform Comput Biol 2011; 7:571-96. [PMID: 19507290 DOI: 10.1142/s0219720009004205] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/01/2008] [Revised: 11/05/2008] [Accepted: 11/06/2008] [Indexed: 11/18/2022]
Abstract
Structural similarity between proteins gives us insights into their evolutionary relationships when there is low sequence similarity. In this paper, we present a novel approach called SNAP for non-sequential pair-wise structural alignment. Starting from an initial alignment, our approach iterates over a two-step process consisting of a superposition step and an alignment step, until convergence. We propose a novel greedy algorithm to construct both sequential and non-sequential alignments. The quality of SNAP alignments was assessed by comparing them against the manually curated reference alignments in the challenging SISY and RIPC datasets. Moreover, when applied to a dataset of 4410 protein pairs selected from the CATH database, SNAP produced longer alignments with lower RMSD than several state-of-the-art alignment methods. Classification of folds using SNAP alignments was both highly sensitive and highly selective. The SNAP software and the datasets are available online at
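The alignment step of such a two-step scheme can be sketched as a greedy, order-free pairing of residues in two structures that are assumed to be already superimposed. This is only an illustration of the general idea, not SNAP's greedy algorithm: the superposition step is left out, and the function names and the 3.0 distance cutoff are hypothetical.

```python
import math


def greedy_pairs(a, b, cutoff=3.0):
    """Pair points of a and b by increasing distance, ignoring sequence order.

    a and b are lists of (x, y, z) coordinates in a common frame.
    Each point is used at most once; pairs farther apart than cutoff are dropped.
    """
    candidates = sorted(
        (math.dist(p, q), i, j)
        for i, p in enumerate(a)
        for j, q in enumerate(b)
    )
    used_a, used_b, pairs = set(), set(), []
    for d, i, j in candidates:
        if d > cutoff:
            break  # all remaining candidates are even farther apart
        if i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)
            pairs.append((i, j))
    return pairs


def rmsd(a, b, pairs):
    """Root-mean-square distance over the aligned pairs."""
    total = sum(math.dist(a[i], b[j]) ** 2 for i, j in pairs)
    return math.sqrt(total / len(pairs))
```

Because no ordering constraint is imposed, the pairing can match the first residue of one chain to a later residue of the other, which is what makes the result non-sequential. In a full iteration the pairs would feed the next superposition.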
Affiliation(s)
- Saeed Salem
- Department of Computer Science, Rensselaer Polytechnic Institute, 110 8th St., Troy, New York 12180, USA
- Mohammed J. Zaki
- Department of Computer Science, Rensselaer Polytechnic Institute, 110 8th St., Troy, New York 12180, USA
- Christopher Bystroff
- Department of Computer Science, Rensselaer Polytechnic Institute, 110 8th St., Troy, New York 12180, USA
- Department of Biology, Rensselaer Polytechnic Institute, 110 8th St., Troy, New York 12180, USA
|
35
|
Daniluk P, Lesyng B. A novel method to compare protein structures using local descriptors. BMC Bioinformatics 2011; 12:344. [PMID: 21849047 PMCID: PMC3179968 DOI: 10.1186/1471-2105-12-344] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/02/2011] [Accepted: 08/17/2011] [Indexed: 11/15/2022] Open
Abstract
Background Protein structure comparison is one of the most widely performed tasks in bioinformatics. However, currently used methods have problems with the so-called "difficult similarities", including considerable shifts and distortions of structure, sequential swaps and circular permutations. There is a demand for efficient and automated systems capable of overcoming these difficulties, which may lead to the discovery of previously unknown structural relationships. Results We present a novel method for protein structure comparison based on the formalism of local descriptors of protein structure - DEscriptor Defined Alignment (DEDAL). Local similarities identified by pairs of similar descriptors are extended into global structural alignments. We demonstrate the method's capability by aligning structures in difficult benchmark sets: curated alignments in the SISYPHUS database, as well as SISY and RIPC sets, including non-sequential and non-rigid-body alignments. On the most difficult RIPC set of sequence alignment pairs the method achieves an accuracy of 77% (the second best method tested achieves 60% accuracy). Conclusions DEDAL is fast enough to be used in whole proteome applications, and by lowering the threshold of detectable structure similarity it may shed additional light on molecular evolution processes. It is well suited to improving automatic classification of structure domains, helping analyze protein fold space, or to improving protein classification schemes. DEDAL is available online at http://bioexploratorium.pl/EP/DEDAL.
Affiliation(s)
- Paweł Daniluk
- Faculty of Physics, Department of Biophysics and CoE BioExploratorium, University of Warsaw, Żwirki i Wigury 93, Warsaw, Poland
|
36
|
Shen YF, Li B, Liu ZP. Protein structure alignment based on internal coordinates. Interdiscip Sci 2010; 2:308-19. [DOI: 10.1007/s12539-010-0019-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/06/2008] [Revised: 01/05/2010] [Accepted: 01/06/2010] [Indexed: 10/18/2022]
|
37
|
Chu CH, Lo WC, Wang HW, Hsu YC, Hwang JK, Lyu PC, Pai TW, Tang CY. Detection and alignment of 3D domain swapping proteins using angle-distance image-based secondary structural matching techniques. PLoS One 2010; 5:e13361. [PMID: 20976204 PMCID: PMC2955075 DOI: 10.1371/journal.pone.0013361] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/10/2010] [Accepted: 09/13/2010] [Indexed: 11/18/2022] Open
Abstract
This work presents a novel detection method for three-dimensional domain swapping (DS), a mechanism for forming protein quaternary structures that can be visualized as if monomers had “opened” their “closed” structures and exchanged the opened portion to form intertwined oligomers. Since the first report of DS in the mid 1990s, an increasing number of identified cases has led to the postulation that DS might occur in a protein with an unconstrained terminus under appropriate conditions. DS may play important roles in the molecular evolution and functional regulation of proteins and the formation of depositions in Alzheimer's and prion diseases. Moreover, it is promising for designing auto-assembling biomaterials. Despite the increasing interest in DS, related bioinformatics methods are rarely available. Owing to a dramatic conformational difference between the monomeric/closed and oligomeric/open forms, conventional structural comparison methods are inadequate for detecting DS. Hence, there is also a lack of comprehensive datasets for studying DS. Based on angle-distance (A-D) image transformations of secondary structural elements (SSEs), specific patterns within A-D images can be recognized and classified for structural similarities. In this work, a matching algorithm to extract corresponding SSE pairs from A-D images and a novel DS score have been designed and demonstrated to be applicable to the detection of DS relationships. The Matthews correlation coefficient (MCC) and sensitivity of the proposed DS-detecting method were higher than 0.81 even when the sequence identities of the proteins examined were lower than 10%. On average, the alignment percentage and root-mean-square distance (RMSD) computed by the proposed method were 90% and 1.8Å for a set of 1,211 DS-related pairs of proteins. The performances of structural alignments remain high and stable for DS-related homologs with less than 10% sequence identities. 
In addition, the quality of its hinge loop determination is comparable to that of manual inspection. This method has been implemented as a web-based tool that takes two protein structures as input and determines the existence and/or type of DS relationship between them according to the A-D image-based structural alignments and the DS score. The proposed method is expected to trigger large-scale studies of this interesting structural phenomenon and facilitate related applications.
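For readers unfamiliar with the two evaluation metrics quoted above, the following sketch computes the Matthews correlation coefficient and sensitivity from a confusion matrix. The formulas are the standard definitions; any counts passed in are made up for illustration and are not taken from the paper.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal total is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def sensitivity(tp, fn):
    """Fraction of true positives that were detected (recall)."""
    return tp / (tp + fn) if (tp + fn) else 0.0
```

MCC ranges from -1 (total disagreement) to 1 (perfect prediction) and, unlike sensitivity alone, accounts for false positives and true negatives as well.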
Affiliation(s)
- Chia-Han Chu
- Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan, Republic of China
- Wei-Cheng Lo
- Institute of Bioinformatics and Structural Biology, National Tsing Hua University, Hsinchu, Taiwan, Republic of China
- Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, Taiwan, Republic of China
- Hsin-Wei Wang
- Department of Computer Science and Engineering, National Taiwan Ocean University, Keelung, Taiwan, Republic of China
- Yen-Chu Hsu
- Department of Computer Science and Engineering, National Taiwan Ocean University, Keelung, Taiwan, Republic of China
- Jenn-Kang Hwang
- Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, Taiwan, Republic of China
- Ping-Chiang Lyu
- Institute of Bioinformatics and Structural Biology, National Tsing Hua University, Hsinchu, Taiwan, Republic of China
- Tun-Wen Pai
- Department of Computer Science and Engineering, National Taiwan Ocean University, Keelung, Taiwan, Republic of China
- * E-mail: (T-WP); (CYT)
- Chuan Yi Tang
- Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan, Republic of China
- Department of Computer Science and Information Engineering, Providence University, Taichung, Taiwan, Republic of China
- * E-mail: (T-WP); (CYT)
|
38
|
Cagnoli C, Stevanin G, Brussino A, Barberis M, Mancini C, Margolis RL, Holmes SE, Nobili M, Forlani S, Padovan S, Pappi P, Zaros C, Leber I, Ribai P, Pugliese L, Assalto C, Brice A, Migone N, Dürr A, Brusco A. Missense mutations in the AFG3L2 proteolytic domain account for ∼1.5% of European autosomal dominant cerebellar ataxias. Hum Mutat 2010; 31:1117-24. [PMID: 20725928 DOI: 10.1002/humu.21342] [Citation(s) in RCA: 62] [Impact Index Per Article: 4.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/23/2023]
Abstract
Spinocerebellar ataxia type 28 is an autosomal dominant form of cerebellar ataxia (ADCA) caused by mutations in AFG3L2, a gene that encodes a subunit of the mitochondrial m-AAA protease. We screened 366 primarily Caucasian ADCA families, negative for the most common triplet expansions, for point mutations in AFG3L2 using DHPLC. Whole-gene deletions were excluded in 300 of the patients, and duplications were excluded in 129 patients. We found six missense mutations in nine unrelated index cases (9/366, 2.6%): c.1961C>T (p.Thr654Ile) in exon 15, c.1996A>G (p.Met666Val), c.1997T>G (p.Met666Arg), c.1997T>C (p.Met666Thr), c.2011G>A (p.Gly671Arg), and c.2012G>A (p.Gly671Glu) in exon 16. All mutated amino acids were located in the C-terminal proteolytic domain. In available cases, we demonstrated that the mutations segregated with the disease. Mutated amino acids are highly conserved, and bioinformatic analysis indicates the substitutions are likely deleterious. This investigation demonstrates that SCA28 accounts for ∼3% of ADCA Caucasian cases negative for triplet expansions and, by extension, for ∼1.5% of all ADCA. We further confirm both the involvement of the AFG3L2 gene in SCA28 and the presence of a mutational hotspot in exons 15-16. Screening for SCA28 is warranted in patients who test negative for more common SCAs and present with a slowly progressive cerebellar ataxia accompanied by oculomotor signs.
Affiliation(s)
- Claudia Cagnoli
- Department of Genetics, Biology and Biochemistry, University of Torino, Torino, Italy
|
39
|
Stivala AD, Stuckey PJ, Wirth AI. Fast and accurate protein substructure searching with simulated annealing and GPUs. BMC Bioinformatics 2010; 11:446. [PMID: 20813068 PMCID: PMC2944279 DOI: 10.1186/1471-2105-11-446] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/27/2010] [Accepted: 09/03/2010] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Searching a database of protein structures for matches to a query structure, or occurrences of a structural motif, is an important task in structural biology and bioinformatics. While there are many existing methods for structural similarity searching, faster and more accurate approaches are still required, and few current methods are capable of substructure (motif) searching. RESULTS We developed an improved heuristic for tableau-based protein structure and substructure searching using simulated annealing that is as fast as or faster than, and comparable in accuracy to, some widely used existing methods. Furthermore, we created a parallel implementation on a modern graphics processing unit (GPU). CONCLUSIONS The GPU implementation achieves up to a 34-fold speedup over the CPU implementation of tableau-based structure search with simulated annealing, making it one of the fastest available methods. To the best of our knowledge, this is the first application of a GPU to the protein structural search problem.
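To make the search strategy concrete, here is a minimal, generic simulated-annealing sketch that looks for a mapping of query secondary structure elements onto those of a target so that as many pairwise tableau entries as possible agree. It is not the authors' heuristic: the scoring function, the move set, the cooling schedule, and all names and parameter values are assumptions chosen for brevity, and the tableau codes are treated as opaque symbols.

```python
import math
import random


def score(mapping, tq, tt):
    """Count query SSE pairs whose tableau code matches under the mapping."""
    n = len(mapping)
    return sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if tq[i][j] == tt[mapping[i]][mapping[j]]
    )


def anneal(tq, tt, steps=2000, t0=2.0, cooling=0.995, seed=0):
    """Maximize score over injective mappings (requires len(tq) <= len(tt))."""
    rng = random.Random(seed)
    n, m = len(tq), len(tt)
    mapping = rng.sample(range(m), n)  # random injective starting point
    cur = best = score(mapping, tq, tt)
    best_map = mapping[:]
    temp = t0
    for _ in range(steps):
        cand = mapping[:]
        i, new = rng.randrange(n), rng.randrange(m)
        if new in cand:
            j = cand.index(new)  # swap to keep the mapping injective
            cand[i], cand[j] = cand[j], cand[i]
        else:
            cand[i] = new
        s = score(cand, tq, tt)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if s >= cur or rng.random() < math.exp((s - cur) / temp):
            mapping, cur = cand, s
            if cur > best:
                best, best_map = cur, mapping[:]
        temp *= cooling
    return best_map, best
```

Each annealing run is independent of the others, so many query-versus-database comparisons can proceed in parallel, which is the property a GPU implementation would exploit.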
Affiliation(s)
- Alex D Stivala
- Department of Computer Science and Software Engineering, The University of Melbourne, Victoria 3010, Australia
- Peter J Stuckey
- Department of Computer Science and Software Engineering, The University of Melbourne, Victoria 3010, Australia
- National ICT Australia Victoria Laboratory at The University of Melbourne, Victoria 3010, Australia
- Anthony I Wirth
- Department of Computer Science and Software Engineering, The University of Melbourne, Victoria 3010, Australia
|
40
|
Wohlers I, Domingues FS, Klau GW. Towards optimal alignment of protein structure distance matrices. Bioinformatics 2010; 26:2273-80. [PMID: 20639543 DOI: 10.1093/bioinformatics/btq420] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022] Open
Affiliation(s)
- Inken Wohlers
- CWI, Life Sciences Group, Amsterdam, The Netherlands.
|
41
|
The α-galactosidase type A gene aglA from Aspergillus niger encodes a fully functional α-N-acetylgalactosaminidase. Glycobiology 2010; 20:1410-9. [DOI: 10.1093/glycob/cwq105] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open
|
42
|
The challenge of annotating protein sequences: The tale of eight domains of unknown function in Pfam. Comput Biol Chem 2010; 34:210-4. [PMID: 20537955 DOI: 10.1016/j.compbiolchem.2010.04.001] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/11/2010] [Revised: 04/09/2010] [Accepted: 04/25/2010] [Indexed: 11/21/2022]
Abstract
The Pfam database is an important tool in genome annotation, since it provides a collection of curated protein families. However, a subset of these families, known as domains of unknown function (DUFs), remains poorly characterized. We have related sequences from DUF404, DUF407, DUF482, DUF608, DUF810, DUF853, DUF976 and DUF1111 to homologs in PDB, within the midnight zone (9-20%) of sequence identity. These relationships were extended to provide functional annotation by sequence analysis and model building. Also described are examples of residue plasticity within enzyme active sites, and change of function within homologous sequences of a DUF.
|
43
|
Hong KW, Jin HS, Lim JE, Cho YS, Go MJ, Jung J, Lee JE, Choi J, Shin C, Hwang SY, Lee SH, Park HK, Oh B. Non-synonymous single-nucleotide polymorphisms associated with blood pressure and hypertension. J Hum Hypertens 2010; 24:763-74. [PMID: 20147969 DOI: 10.1038/jhh.2010.9] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/11/2023]
Abstract
In this study, we determined the association of 1180 non-synonymous single-nucleotide polymorphisms (SNPs) with systolic blood pressure (SBP) and hypertensive status. A total of 8842 subjects were taken from two community-based cohorts in South Korea, Ansung (n=4183) and Ansan (n=4659), which had been established for genome-wide association studies (GWAS). Five SNPs (rs16835244, rs2286672, rs6265, rs17237198 and rs7312017) were significantly associated (P-values: 0.003-0.0001, not corrected for genome-wide significance) with SBP in both cohorts. Of these SNPs, rs16835244 and rs2286672 correlated with risk for hypertension. The rs16835244 SNP replaces Ala288 in arginine decarboxylase (ADC) with serine, and rs2286672 replaces Arg172 in phospholipase D2 (PLD2) with cysteine. A comparison of peptide sequences between vertebrate homologues revealed that the SNPs identified occur at conserved amino-acid residues. In silico analysis of the protein structure showed that the substitution of a polar residue, serine, for a non-polar alanine at amino-acid residue 288 induces a conformational change in ADC, and that Arg172 in PLD2 resides in the PX domain, which is important for membrane trafficking. These results provide insights into the function of these non-synonymous SNPs in the development of hypertension. Investigating non-synonymous SNPs from GWAS not only by statistical association analysis but also for biological relevance through protein structure might be a good approach for identifying genetic risk factors for hypertension, in addition to discovering causative variations.
Affiliation(s)
- K-W Hong
- Department of Biomedical Engineering, School of Medicine, Kyung Hee University, Seoul, Korea
|
44
|
Kairys V, Gilson MK, Lather V, Schiffer CA, Fernandes MX. Toward the design of mutation-resistant enzyme inhibitors: further evaluation of the substrate envelope hypothesis. Chem Biol Drug Des 2009; 74:234-45. [PMID: 19703025 DOI: 10.1111/j.1747-0285.2009.00851.x] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/27/2022]
Abstract
Previous studies have shown the usefulness of the substrate envelope concept in the analysis and prediction of drug resistance profiles for human immunodeficiency virus protease mutants. This study tests its applicability to several other therapeutic targets: Abl kinase, chitinase, thymidylate synthase, dihydrofolate reductase, and neuraminidase. For the targets where enough (≥6) mutation data points are available to compute the average mutation sensitivity of inhibitors, the total volume of an inhibitor molecule that projects outside the substrate envelope, V(out), is found to correlate with average mutation sensitivity. Analysis of a locally computed volume suggests that the same correlation would hold for the other targets if more extensive mutation data sets were available. It is concluded that the substrate envelope concept offers a promising and easily implemented computational tool for the design of drugs that will tend to resist mutations. Software implementing these calculations is provided with the 'Supporting Information'.
Affiliation(s)
- Visvaldas Kairys
- Centro de Química da Madeira, Departamento de Química, Universidade da Madeira, 9000-390 Funchal, Portugal
|
45
|
|
46
|
Micheletti C, Orland H. MISTRAL: a tool for energy-based multiple structural alignment of proteins. Bioinformatics 2009; 25:2663-9. [PMID: 19692555 DOI: 10.1093/bioinformatics/btp506] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022]
Abstract
MOTIVATION The steady growth of the number of available protein structures has constantly motivated the development of new algorithms for detecting structural correspondences in proteins. Detecting structural equivalences in two or more proteins is computationally demanding as it typically entails the exploration of the combinatorial space of all possible amino acid pairings in the parent proteins. The search is often aided by the introduction of various constraints such as considering protein fragments, rather than single amino acids, and/or seeking only sequential correspondences in the given proteins. An additional challenge is the difficulty of associating a reliable a priori measure of statistical significance with a given alignment. RESULTS Here, we present and discuss MISTRAL (Multiple STRuctural ALignment), a novel strategy for multiple protein alignment based on the minimization of an energy function over the low-dimensional space of the relative rotations and translations of the molecules. The energy minimization avoids combinatorial searches and returns pairwise alignment scores for which a reliable a priori statistical significance can be given. AVAILABILITY MISTRAL is freely available for academic users as a standalone program and as a web service at http://ipht.cea.fr/protein.php.
Affiliation(s)
- Cristian Micheletti
- SISSA, CNR-INFM Democritos and Italian Institute of Technology, Via Beirut 2-4, 34014 Trieste, Italy.
|
47
|
Chi PH, Pang B, Korkin D, Shyu CR. Efficient SCOP-fold classification and retrieval using index-based protein substructure alignments. Bioinformatics 2009; 25:2559-65. [PMID: 19667079 DOI: 10.1093/bioinformatics/btp474] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022]
Abstract
MOTIVATION To investigate structure-function relationships, life sciences researchers usually retrieve and classify proteins with similar substructures into the same fold. A manually constructed database, SCOP, is believed to be highly accurate; however, it is labor intensive. Another known method, DALI, is also precise but computationally expensive. We have developed an efficient algorithm, namely, index-based protein substructure alignment (IPSA), for protein-fold classification. IPSA constructs a two-layer indexing tree to quickly retrieve similar substructures in proteins and suggests possible folds by aligning these substructures. RESULTS Compared with known algorithms, such as DALI, CE, MultiProt and MAMMOTH, on a sample dataset of non-redundant proteins from SCOP v1.73, IPSA achieves speedups of 53.10, 16.87, 3.60 and 1.64 times, respectively. Evaluated on three different datasets of non-redundant proteins from SCOP, the average accuracy of IPSA is approximately equal to that of DALI and better than that of CE, MAMMOTH, MultiProt and SSM. With reliable accuracy and efficiency, this work will benefit the study of high-throughput protein structure-function relationships. AVAILABILITY IPSA is publicly accessible at http://ProteinDBS.rnet.missouri.edu/IPSA.php
Affiliation(s)
- Pin-Hao Chi
- Medical and Biological Digital Library Research Lab, Informatics Institute, University of Missouri, Columbia, MO 65211, USA
|
48
|
Kim C, Tai CH, Lee B. Iterative refinement of structure-based sequence alignments by Seed Extension. BMC Bioinformatics 2009; 10:210. [PMID: 19589133 PMCID: PMC2753854 DOI: 10.1186/1471-2105-10-210] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/02/2009] [Accepted: 07/09/2009] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Accurate sequence alignment is required in many bioinformatics applications but, when sequence similarity is low, it is difficult to obtain accurate alignments based on sequence similarity alone. The accuracy improves when the structures are available, but current structure-based sequence alignment procedures still mis-align substantial numbers of residues. In order to correct such errors, we previously explored the possibility of replacing the residue-based dynamic programming algorithm in structure alignment procedures with the Seed Extension algorithm, which does not use a gap penalty. Here, we describe a new procedure called RSE (Refinement with Seed Extension) that iteratively refines a structure-based sequence alignment. RESULTS RSE uses SE (Seed Extension) in its core, which is an algorithm that we reported recently for obtaining a sequence alignment from two superimposed structures. The RSE procedure was evaluated by comparing the correctly aligned fractions of residues before and after the refinement of the structure-based sequence alignments produced by popular programs. CE, DaliLite, FAST, LOCK2, MATRAS, MATT, TM-align, SHEBA and VAST were included in this analysis and the NCBI's CDD root node set was used as the reference alignments. RSE improved the average accuracy of sequence alignments for all programs tested when no shift error was allowed. The amount of improvement varied depending on the program. The average improvements were small for DaliLite and MATRAS but about 5% for CE and VAST. More substantial improvements have been seen in many individual cases. The additional computation times required for the refinements were negligible compared to the times taken by the structure alignment programs. CONCLUSION RSE is a computationally inexpensive way of improving the accuracy of a structure-based sequence alignment. 
It can be used as a standalone procedure following a regular structure-based sequence alignment, or to replace the traditional iterative refinement procedures based on residue-level dynamic programming algorithms in many structure alignment programs.
Affiliation(s)
- Changhoon Kim
- Laboratory of Molecular Biology, Center for Cancer Research, National Cancer Institute, NIH, Bethesda, MD 20892, USA.
|
49
|
Abstract
Protein structures often show similarities to one another that would not be seen at the sequence level. Given the coordinates of a protein chain, the SALAMI server at www.zbh.uni-hamburg.de/salami will search the Protein Data Bank and return a set of similar structures without using sequence information. The results page lists the related proteins, details of the sequence and structure similarity, and implied sequence alignments. Via a simple structure viewer, one can view superpositions of query and library structures and finally download superimposed coordinates. The alignment method is very tolerant of large gaps and insertions, and tends to produce slightly longer alignments than other similar programs.
Affiliation(s)
- Thomas Margraf
- Centre for Bioinformatics, University of Hamburg, Bundesstr. 43, 20146 Hamburg, Germany.
|
50
|
Thompson KE, Wang Y, Madej T, Bryant SH. Improving protein structure similarity searches using domain boundaries based on conserved sequence information. BMC Struct Biol 2009; 9:33. [PMID: 19454035 PMCID: PMC2694201 DOI: 10.1186/1472-6807-9-33] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 07/07/2008] [Accepted: 05/19/2009] [Indexed: 11/10/2022]
Abstract
BACKGROUND The identification of protein domains plays an important role in protein structure comparison. Domain query size and composition are critical to structure similarity search algorithms such as the Vector Alignment Search Tool (VAST), the method employed for computing related protein structures in the NCBI Entrez system. Currently, domains identified on the basis of structural compactness are used for VAST computations. In this study, we have investigated how alternative definitions of domains derived from conserved sequence alignments in the Conserved Domain Database (CDD) would affect the domain comparisons and structure similarity search performance of VAST. RESULTS Alternative domains, which have significantly different secondary structure composition from those based on structurally compact units, were identified based on the alignment footprints of curated protein sequence domain families. Our analysis indicates that domain boundaries disagree on roughly 8% of protein chains in the medium redundancy subset of the Molecular Modeling Database (MMDB). These conflicting sequence-based domain boundaries perform slightly better than structure domains in structure similarity searches, and there are interesting cases in which structure similarity search performance is markedly improved. CONCLUSION Structure similarity searches using domain boundaries based on conserved sequence information can provide an additional method for investigators to identify interesting similarities between proteins with known structures. Because of the improvement in performance of structure similarity searches using sequence domain boundaries, we are in the process of implementing their inclusion into the VAST search and MMDB resources in the NCBI Entrez system.
Affiliation(s)
- Kenneth Evan Thompson
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.
|