1. Moldenhauer HJ, Tammen K, Meredith AL. Structural mapping of patient-associated KCNMA1 gene variants. Biophys J 2024; 123:1984-2000. [PMID: 38042986 DOI: 10.1016/j.bpj.2023.11.3404] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2023] [Revised: 11/30/2023] [Accepted: 11/30/2023] [Indexed: 12/04/2023] Open
Abstract
KCNMA1-linked channelopathy is a neurological disorder characterized by seizures, motor abnormalities, and neurodevelopmental disabilities. The disease mechanisms are predicted to result from alterations in KCNMA1-encoded BK K+ channel activity; however, only a subset of the patient-associated variants have been functionally studied. The localization of these variants within the tertiary structure or evaluation by pathogenicity algorithms has not been systematically assessed. In this study, 82 nonsynonymous patient-associated KCNMA1 variants were mapped within the BK channel protein. Fifty-three variants localized within cryoelectron microscopy-resolved structures, including 21 classified as either gain of function (GOF) or loss of function (LOF) in BK channel activity. Clusters of LOF variants were identified in the pore, the AC region (RCK1), and near the Ca2+ bowl (RCK2), overlapping with sites of pharmacological or endogenous modulation. However, no clustering was found for GOF variants. To further understand variants of uncertain significance (VUSs), assessments by multiple standard pathogenicity algorithms were compared, and new thresholds for sensitivity and specificity were established from confirmed GOF and LOF variants. An ensemble algorithm was constructed (KCNMA1 meta score (KMS)), consisting of a weighted summation of this trained dataset combined with a structural component derived from the Ca2+-bound and unbound BK channels. KMS assessment differed from the highest-performing individual algorithm (REVEL) at 10 VUS residues, and a subset were studied further by electrophysiology in HEK293 cells. M578T, E656A, and D965V (KMS+;REVEL-) were confirmed to alter BK channel properties in voltage-clamp recordings, and D800Y (KMS-;REVEL+) was assessed as benign under the test conditions. However, KMS failed to accurately assess K457E. 
These combined results reveal the distribution of potentially disease-causing KCNMA1 variants within BK channel functional domains and pathogenicity evaluation for VUSs, suggesting strategies for improving channel-level predictions in future studies by building on ensemble algorithms such as KMS.
Affiliation(s)
- Hans J Moldenhauer
  - Department of Physiology, University of Maryland School of Medicine, Baltimore, Maryland
- Kelly Tammen
  - Department of Physiology, University of Maryland School of Medicine, Baltimore, Maryland
- Andrea L Meredith
  - Department of Physiology, University of Maryland School of Medicine, Baltimore, Maryland
2. Moldenhauer HJ, Tammen K, Meredith AL. Structural mapping of patient-associated KCNMA1 gene variants. bioRxiv 2023:2023.07.27.550850. [PMID: 37546746 PMCID: PMC10402178 DOI: 10.1101/2023.07.27.550850] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 08/08/2023]
Abstract
KCNMA1-linked channelopathy is a neurological disorder characterized by seizures, motor abnormalities, and neurodevelopmental disabilities. The disease mechanisms are predicted to result from alterations in KCNMA1-encoded BK K+ channel activity; however, only a subset of the patient-associated variants have been functionally studied. The localization of these variants within the tertiary structure or evaluation by pathogenicity algorithms has not been systematically assessed. In this study, 82 nonsynonymous patient-associated KCNMA1 variants were mapped within the BK channel protein. Fifty-three variants localized within cryo-EM-resolved structures, including 21 classified as either gain-of-function (GOF) or loss-of-function (LOF) in BK channel activity. Clusters of LOF variants were identified in the pore, the AC region (RCK1), and near the Ca2+ bowl (RCK2), overlapping with sites of pharmacological or endogenous modulation. However, no clustering was found for GOF variants. To further understand variants of uncertain significance (VUS), assessments by multiple standard pathogenicity algorithms were compared, and new thresholds for sensitivity and specificity were established from confirmed GOF and LOF variants. An ensemble algorithm was constructed (KCNMA1 Meta Score, KMS), consisting of a weighted summation of this trained dataset combined with a structural component derived from the Ca2+-bound and unbound BK channels. KMS assessment differed from the highest-performing individual algorithm (REVEL) at 10 VUS residues, and a subset were studied further by electrophysiology in HEK293 cells. M578T, E656A, and D965V (KMS+;REVEL-) were confirmed to alter BK channel properties in voltage-clamp recordings, and D800Y (KMS-;REVEL+) was assessed as benign under the test conditions. However, KMS failed to accurately assess K457E.
These combined results reveal the distribution of potentially disease-causing KCNMA1 variants within BK channel functional domains and pathogenicity evaluation for VUS, suggesting strategies for improving channel-level predictions in future studies by building on ensemble algorithms such as KMS.
3. Liang Y, Yang S, Zheng L, Wang H, Zhou J, Huang S, Yang L, Zuo Y. Research progress of reduced amino acid alphabets in protein analysis and prediction. Comput Struct Biotechnol J 2022; 20:3503-3510. [PMID: 35860409 PMCID: PMC9284397 DOI: 10.1016/j.csbj.2022.07.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2022] [Revised: 06/30/2022] [Accepted: 07/01/2022] [Indexed: 11/29/2022] Open
Abstract
A comprehensive summary of the literature on the reduced amino acid alphabets. A systematic review of the development history of reduced amino acid alphabets. Rich application cases of amino acid reduction alphabets are described in the article. A detailed analysis of the properties and uses of the reduced amino acid alphabets.
Proteins are the executors of cellular physiological activities, and accurate elucidation of their structure and function is crucial for the refined mapping of proteins. As a feature engineering method, the reduction of amino acid composition is not only an important tool for protein structure and function analysis but also opens a broad horizon for machine learning. Representing sequences with fewer amino acid types greatly reduces the dimensionality, complexity, and noise of traditional feature engineering and yields more interpretable predictive models that capture key features. In this paper, we systematically review the strategies and methods behind reduced amino acid (RAA) alphabets and summarize their main applications in protein sequence alignment, functional classification, and prediction of structural properties. Finally, we give a comprehensive analysis of 672 RAA alphabets from 74 reduction methods.
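As a rough illustration of what "representing sequences with fewer amino acid types" means in practice, the sketch below maps a 20-letter sequence onto a smaller alphabet and computes its reduced composition. The six-group partition, the representative letters, and the function names are assumptions chosen for illustration only; they are not one of the 672 alphabets analysed in the review.

```python
from collections import Counter

# Hypothetical 6-group reduction by rough physicochemical class.
# Illustrative only: not taken from the review's catalogue of alphabets.
GROUPS = {
    "A": "AVLIMC",  # aliphatic / hydrophobic
    "F": "FWYH",    # aromatic
    "S": "STNQ",    # polar, uncharged
    "K": "KR",      # positively charged
    "D": "DE",      # negatively charged
    "G": "GP",      # conformationally special
}

# Invert the grouping into a per-residue lookup table.
LOOKUP = {aa: rep for rep, members in GROUPS.items() for aa in members}


def reduce_sequence(seq):
    """Rewrite a protein sequence in the reduced alphabet ('X' for unknown residues)."""
    return "".join(LOOKUP.get(aa, "X") for aa in seq.upper())


def reduced_composition(seq):
    """Fraction of each reduced letter: a 6-dimensional feature vector instead of 20."""
    reduced = reduce_sequence(seq)
    if not reduced:
        return {rep: 0.0 for rep in GROUPS}
    counts = Counter(reduced)
    return {rep: counts[rep] / len(reduced) for rep in GROUPS}


if __name__ == "__main__":
    print(reduce_sequence("MKTAYIAKQR"))
    print(reduced_composition("MKTAYIAKQR"))
```

Composition features over the reduced alphabet have one dimension per group, which is the dimensionality reduction the abstract refers to; dipeptide or k-mer features shrink correspondingly faster.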
Affiliation(s)
- Yuchao Liang
  - State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, China
- Siqi Yang
  - State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, China
- Lei Zheng
  - State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, China
- Hao Wang
  - State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, China
- Jian Zhou
  - State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, China
- Shenghui Huang
  - State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, China
- Lei Yang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
  - Corresponding author
- Yongchun Zuo
  - State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, China
  - Corresponding author
4. Schmidt M, Hamacher K. Identification of biophysical interaction patterns in direct coupling analysis. Phys Rev E 2021; 103:042418. [PMID: 34005861 DOI: 10.1103/physreve.103.042418] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2020] [Accepted: 03/27/2021] [Indexed: 11/07/2022]
Abstract
Direct-coupling analysis is a statistical learning method for protein contact prediction based on sequence information alone. The maximum entropy principle leads to an effective inverse Potts model. Predictions on contacts are based on fitted local fields and couplings from an empirical multiple sequence alignment. Typically, the l_{2} norm of the resulting two-body couplings is used for contact prediction. However, this procedure discards important information. In this paper we show that the usage of the full fields and coupling information improves prediction accuracy.
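The conventional scoring step mentioned in the abstract, taking the l2 norm of each two-body coupling block, can be sketched as follows. This is only the baseline that the paper argues discards information, not the authors' improved method; the data layout (a dict from position pairs to q x q matrices) and the function names are assumptions made for illustration.

```python
import math


def coupling_norm(block):
    """l2 (Frobenius) norm of one q x q coupling block J_ij(a, b)."""
    return math.sqrt(sum(value * value for row in block for value in row))


def rank_contacts(couplings, min_separation=1):
    """Rank position pairs by coupling norm, strongest first.

    couplings: dict mapping (i, j) position pairs to q x q nested lists
               (hypothetical layout; a fitted Potts model would supply these).
    min_separation: skip pairs closer than this along the sequence.
    """
    scored = [
        ((i, j), coupling_norm(block))
        for (i, j), block in couplings.items()
        if abs(i - j) >= min_separation
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    toy = {
        (0, 1): [[3.0, 0.0], [0.0, 4.0]],
        (0, 2): [[1.0, 0.0], [0.0, 0.0]],
    }
    for pair, score in rank_contacts(toy):
        print(pair, score)
```

Collapsing each block to a single number is what loses the residue-specific pattern of the couplings and the local fields, which is the information the paper proposes to retain.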
Affiliation(s)
- Michael Schmidt
  - Department of Physics, TU Darmstadt, Karolinenpl. 5, 64289 Darmstadt, Germany
- Kay Hamacher
  - Department of Physics, TU Darmstadt, Karolinenpl. 5, 64289 Darmstadt, Germany
  - Department of Biology, TU Darmstadt, Schnittspahnstr. 10, 64287 Darmstadt, Germany
  - Department of Computer Science, TU Darmstadt, Karolinenpl. 5, 64289 Darmstadt, Germany
5. Dong GF, Zheng L, Huang SH, Gao J, Zuo YC. Amino Acid Reduction Can Help to Improve the Identification of Antimicrobial Peptides and Their Functional Activities. Front Genet 2021; 12:669328. [PMID: 33959153 PMCID: PMC8093877 DOI: 10.3389/fgene.2021.669328] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 02/18/2021] [Accepted: 03/23/2021] [Indexed: 02/03/2023] Open
Abstract
Antimicrobial peptides (AMPs) are considered potential substitutes for antibiotics in the design of new anti-infective drugs. Several machine learning algorithms and web servers exist for identifying AMPs and their functional activities. However, there is still room for improvement in prediction algorithms and feature extraction methods. The reduced amino acid (RAA) alphabet effectively simplifies protein complexity and helps recognize structurally conserved regions. This article details the evaluation of more than 5,000 reduced amino acid descriptors generated from 74 types of reduced amino acid alphabets in the first and second stages, which were used to construct a two-stage classifier, Identification of Antimicrobial Peptides by Reduced Amino Acid Cluster (iAMP-RAAC), for identifying AMPs and their functional activities, respectively. The results show that the first-stage AMP classifier achieves accuracies of 97.21% and 97.11% on the training dataset and the independent test dataset, respectively. In the second stage, the classifier still shows good performance: at least three of the four metrics, sensitivity (SN), specificity (SP), accuracy (ACC), and Matthews correlation coefficient (MCC), exceed the results reported in the literature. ANOVA with incremental feature selection (IFS) is then used for feature selection, which further improves prediction performance at each stage. Finally, a user-friendly web server, iAMP-RAAC, is established at http://bioinfor.imu.edu.cn/iampraac.
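The four evaluation metrics named in the abstract have standard definitions in terms of confusion-matrix counts. The sketch below computes them; the function name and the example counts are hypothetical and are not results from the paper.

```python
import math


def classification_metrics(tp, tn, fp, fn):
    """Sensitivity, specificity, accuracy, and MCC from confusion-matrix counts.

    tp/tn/fp/fn: true positives, true negatives, false positives, false negatives.
    Ratios with a zero denominator are reported as 0.0 by convention here.
    """
    total = tp + tn + fp + fn
    sn = tp / (tp + fn) if (tp + fn) else 0.0   # SN = TP / (TP + FN)
    sp = tn / (tn + fp) if (tn + fp) else 0.0   # SP = TN / (TN + FP)
    acc = (tp + tn) / total if total else 0.0   # ACC = (TP + TN) / all
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"SN": sn, "SP": sp, "ACC": acc, "MCC": mcc}


if __name__ == "__main__":
    # Made-up counts, purely to exercise the function.
    print(classification_metrics(tp=90, tn=80, fp=20, fn=10))
```

MCC is the most conservative of the four on imbalanced data because it uses all four cells of the confusion matrix, which is one reason it is commonly reported alongside accuracy.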
Affiliation(s)
- Gai-Fang Dong
  - Inner Mongolia Autonomous Region Key Laboratory of Big Data Research and Application of Agriculture and Animal Husbandry, College of Computer and Information Engineering, Inner Mongolia Agricultural University, Hohhot, China
- Lei Zheng
  - The State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, China
- Sheng-Hui Huang
  - The State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, China
- Jing Gao
  - Inner Mongolia Autonomous Region Key Laboratory of Big Data Research and Application of Agriculture and Animal Husbandry, College of Computer and Information Engineering, Inner Mongolia Agricultural University, Hohhot, China
- Yong-Chun Zuo
  - The State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, China
6. Solis AD. Amino acid alphabet reduction preserves fold information contained in contact interactions in proteins. Proteins 2015; 83:2198-216. [DOI: 10.1002/prot.24936] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Received: 05/06/2015] [Revised: 09/04/2015] [Accepted: 09/04/2015] [Indexed: 12/14/2022]
Affiliation(s)
- Armando D. Solis
  - Biological Sciences Department, New York City College of Technology, the City University of New York (CUNY), Brooklyn, New York 11201
7. Hatton L, Warr G. Protein structure and evolution: are they constrained globally by a principle derived from information theory? PLoS One 2015; 10:e0125663. [PMID: 25970335 PMCID: PMC4429977 DOI: 10.1371/journal.pone.0125663] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 12/08/2014] [Accepted: 02/19/2015] [Indexed: 01/01/2023] Open
Abstract
That the physicochemical properties of amino acids constrain the structure, function and evolution of proteins is not in doubt. However, principles derived from information theory may also set bounds on the structure (and thus also the evolution) of proteins. Here we analyze the global properties of the full set of proteins in release 13-11 of the SwissProt database, showing by experimental test of predictions from information theory that their collective structure exhibits properties that are consistent with their being guided by a conservation principle. This principle (Conservation of Information) defines the global properties of systems composed of discrete components each of which is in turn assembled from discrete smaller pieces. In the system of proteins, each protein is a component, and each protein is assembled from amino acids. Central to this principle is the inter-relationship of the unique amino acid count and total length of a protein and its implications for both average protein length and occurrence of proteins with specific unique amino acid counts. The unique amino acid count is simply the number of distinct amino acids (including those that are post-translationally modified) that occur in a protein, and is independent of the number of times that the particular amino acid occurs in the sequence. Conservation of Information does not operate at the local level (it is independent of the physicochemical properties of the amino acids) where the influences of natural selection are manifest in the variety of protein structure and function that is well understood. Rather, this analysis implies that Conservation of Information would define the global bounds within which the whole system of proteins is constrained; thus it appears to be acting to constrain evolution at a level different from natural selection, a conclusion that appears counter-intuitive but is supported by the studies described herein.
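The "unique amino acid count" defined in the abstract is straightforward to compute: it is the number of distinct residue types in a protein, regardless of how often each occurs. The sketch below is a minimal illustration; the function name and the token convention for modified residues (for example "pS" for phosphoserine) are assumptions, not the authors' notation.

```python
def unique_amino_acid_count(residues):
    """Number of distinct residue types in a protein.

    residues: a one-letter sequence string, or a list of residue tokens when
              post-translationally modified residues must be distinguished
              (hypothetical convention, e.g. "pS" for phosphoserine).
    """
    return len(set(residues))


def length_and_unique_count(residues):
    """Return (total length, unique amino acid count), the pair the abstract relates."""
    return len(residues), unique_amino_acid_count(residues)


if __name__ == "__main__":
    print(length_and_unique_count("MKKLLA"))
    print(length_and_unique_count(["S", "pS", "S", "K"]))
```

Because a plain one-letter string cannot distinguish modified from unmodified residues, the token-list form is needed to match the abstract's definition, which counts post-translationally modified amino acids as distinct.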
Affiliation(s)
- Leslie Hatton
  - Faculty of Science, Engineering and Computing, Kingston University, London, UK
- Gregory Warr
  - Medical University of South Carolina, Charleston, South Carolina, USA
8. Baruah A, Biswas P. The role of site-directed point mutations in protein misfolding. Phys Chem Chem Phys 2014; 16:13964-73. [PMID: 24898496 DOI: 10.1039/c3cp55367a] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Indexed: 11/21/2022]
Abstract
Mutations inducing higher clashing and lower matching residue pairs lead to misfolding.
Affiliation(s)
- Anupaul Baruah
  - Department of Chemistry, University of Delhi, Delhi-110007, India
- Parbati Biswas
  - Department of Chemistry, University of Delhi, Delhi-110007, India
9. Pape S, Hoffgaard F, Dür M, Hamacher K. Distance dependency and minimum amino acid alphabets for decoy scoring potentials. J Comput Chem 2012; 34:10-20. [DOI: 10.1002/jcc.23099] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 05/02/2012] [Revised: 07/12/2012] [Accepted: 07/26/2012] [Indexed: 11/09/2022]
10. Fuzzy clustering of physicochemical and biochemical properties of amino acids. Amino Acids 2011; 43:583-94. [PMID: 21993537 PMCID: PMC3397137 DOI: 10.1007/s00726-011-1106-9] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.9] [Received: 05/25/2011] [Accepted: 09/23/2011] [Indexed: 12/03/2022]
Abstract
In this article, we categorize presently available experimental and theoretical knowledge of various physicochemical and biochemical features of amino acids, as collected in the AAindex database of known 544 amino acid (AA) indices. Previously reported 402 indices were categorized into six groups using hierarchical clustering technique and 142 were left unclustered. However, due to the increasing diversity of the database these indices are overlapping, therefore crisp clustering method may not provide optimal results. Moreover, in various large-scale bioinformatics analyses of whole proteomes, the proper selection of amino acid indices representing their biological significance is crucial for efficient and error-prone encoding of the short functional sequence motifs. In most cases, researchers perform exhaustive manual selection of the most informative indices. These two facts motivated us to analyse the widely used AA indices. The main goal of this article is twofold. First, we present a novel method of partitioning the bioinformatics data using consensus fuzzy clustering, where the recently proposed fuzzy clustering techniques are exploited. Second, we prepare three high quality subsets of all available indices. Superiority of the consensus fuzzy clustering method is demonstrated quantitatively, visually and statistically by comparing it with the previously proposed hierarchical clustered results. The processed AAindex1 database, supplementary material and the software are available at http://sysbio.icm.edu.pl/aaindex/.
11. Computation of mutual information from Hidden Markov Models. Comput Biol Chem 2010; 34:328-33. [DOI: 10.1016/j.compbiolchem.2010.08.005] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 06/29/2010] [Revised: 08/30/2010] [Accepted: 08/30/2010] [Indexed: 11/22/2022]