1
Yang Y, Chong Z, Vihinen M. PON-Fold: Prediction of Substitutions Affecting Protein Folding Rate. Int J Mol Sci 2023; 24:13023. [PMID: 37629203 PMCID: PMC10455311 DOI: 10.3390/ijms241613023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2023] [Revised: 08/08/2023] [Accepted: 08/09/2023] [Indexed: 08/27/2023] Open
Abstract
Most proteins fold into characteristic three-dimensional structures. The rate of folding and unfolding varies widely and can be affected by variations in proteins. We developed a novel machine-learning-based method for the prediction of the folding rate effects of amino acid substitutions in two-state folding proteins. We collected a data set of experimentally defined folding rates for variants and used them to train a gradient boosting algorithm starting with 1161 features. Two predictors were designed. The three-class classifier had, in blind tests, specificity and sensitivity ranging from 0.324 to 0.419 and from 0.256 to 0.451, respectively. The other tool was a regression predictor that showed a Pearson correlation coefficient of 0.525. The error measures, mean absolute error and mean squared error, were 0.581 and 0.603, respectively. One of the previously presented tools could be used for comparison on the blind test data set; our method, called PON-Fold, showed superior performance on all measures used. The applicability of the tool was tested by predicting all possible substitutions in a protein domain. Predictions for different conformations of proteins, open and closed forms of a protein kinase, and apo and holo forms of an enzyme indicated that the choice of the structure had a large impact on the outcome. PON-Fold is freely available.
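The performance measures named in this abstract (Pearson correlation, mean absolute error, mean squared error, and per-class sensitivity and specificity) can be illustrated with a minimal Python sketch. It is not the PON-Fold code: the function names, the toy values, and the class labels "faster", "slower" and "neutral" are assumptions made for illustration, since the abstract does not name the three classes.

```python
import math


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def mean_squared_error(observed, predicted):
    """Average squared difference between observed and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def sensitivity_specificity(true_labels, predicted_labels, positive):
    """One-vs-rest sensitivity and specificity for one class of a multi-class task."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)


# Toy data only; the class names are hypothetical placeholders.
true_classes = ["faster", "faster", "slower", "neutral"]
pred_classes = ["faster", "slower", "slower", "neutral"]
sens, spec = sensitivity_specificity(true_classes, pred_classes, "faster")
```

For a three-class classifier, the one-vs-rest calculation is repeated once per class, which is consistent with the abstract reporting a range of values for each measure.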
Affiliation(s)
- Yang Yang
  - School of Computer Science and Technology, Soochow University, Suzhou 215006, China
  - Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing 210000, China
- Zhang Chong
  - School of Computer Science and Technology, Soochow University, Suzhou 215006, China
- Mauno Vihinen
  - Department of Experimental Medical Science, Lund University, BMC B13, SE-221 84 Lund, Sweden
2
Wang Z, Zhou M, Cao N, Wang X. Site-directed modification of multifunctional lignocellulose-degrading enzymes of straw based on homologous modeling. World J Microbiol Biotechnol 2023; 39:214. [PMID: 37256388 DOI: 10.1007/s11274-023-03663-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2023] [Accepted: 05/24/2023] [Indexed: 06/01/2023]
Abstract
Studying the straw lignocellulose strengthening mechanism during simultaneous degradation has important practical significance for improving resource utilization and reducing environmental pollution. In this paper, the degradation ability of four straw lignocellulose-degrading enzymes was evaluated by molecular docking and molecular dynamics. Using the degrading enzyme that binds significantly to straw lignocellulose as a template, a multifunctional lignocellulose-degrading enzyme, 3CBH-1KS5-4XQD-1B85, was constructed based on amino acid recombination and homologous modeling. Five efficient degrading enzymes (3CBH-1, 3CBH-2, 3CBH-3, 3CBH-4, and 3CBH-5) were designed by site-directed mutagenesis of the amino acid at position 346 of 3CBH-1KS5-4XQD-1B85. Molecular dynamics showed that the degradation ability of 3CBH-1 was significant and was 1.45 times higher than that of 3CBH-1KS5-4XQD-1B85. Moreover, the mechanism of enhanced degradability and the stability of the enzymes were explored. With the aid of Taguchi experiments, the suitable external environment for degrading straw was determined. In the presence of inhibitors (organic acids and phenolic compounds), the binding energy of 3CBH-1 (238.46 ± 30.96 kJ/mol) is 36.42% higher than that of 3CBH-1KS5-4XQD-1B85 (174.79 ± 20.35 kJ/mol) without external environmental stimulation. Based on homology modeling, this paper constructed a site-directed mutagenesis scheme for multifunctional enzymes, with the aim of obtaining multifunctional and efficient straw lignocellulose-degrading enzymes through protein engineering, which provides a feasible scheme for straw biodegradation.
Affiliation(s)
- Zini Wang
  - College of Plant Science, Jilin University, 5333 Xian Road, Changchun, 130062, China
- Mengying Zhou
  - China Guangdong Nuclear Research Institute Limited Company, 1001 Shangbu Middle Road, Shenzhen, 518000, China
- Ning Cao
  - College of Plant Science, Jilin University, 5333 Xian Road, Changchun, 130062, China
- Xiaoli Wang
  - College of Plant Science, Jilin University, 5333 Xian Road, Changchun, 130062, China
3
del Pino-Molina L, Bravo Gallego LY, Soto Serrano Y, Reche Yebra K, Marty Lobo J, González Martínez B, Bravo García-Morato M, Rodríguez Pena R, van der Burg M, López Granados E. Research-based flow cytometry assays for pathogenic assessment in the human B-cell biology of gene variants revealed in the diagnosis of inborn errors of immunity: a Bruton's tyrosine kinase case-study. Front Immunol 2023; 14:1095123. [PMID: 37197664 PMCID: PMC10183671 DOI: 10.3389/fimmu.2023.1095123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2022] [Accepted: 04/13/2023] [Indexed: 05/19/2023] Open
Abstract
Introduction: Inborn errors of immunity (IEI) are an expanding group of rare diseases whose field has been boosted by next-generation sequencing (NGS), revealing several new entities, accelerating routine diagnoses, expanding the number of atypical presentations and generating uncertainties regarding the pathogenic relevance of several novel variants. Methods: Research laboratories that diagnose and provide support for IEI require accurate, reproducible and sustainable phenotypic, cellular and molecular functional assays to explore the pathogenic consequences of human leukocyte gene variants and contribute to their assessment. We have implemented a set of advanced flow cytometry-based assays to better dissect human B-cell biology in a translational research laboratory. We illustrate the utility of these techniques for the in-depth characterization of a novel (c.1685G>A, p.R562Q) de novo gene variant predicted as probably pathogenic but with no previous insights into the protein and cellular effects, located in the tyrosine kinase domain of the Bruton's tyrosine kinase (BTK) gene, in an apparently healthy 14-year-old male patient referred to our clinic for an incidental finding of low immunoglobulin (Ig) M levels with no history of recurrent infections. Results and discussion: A phenotypic analysis of bone marrow (BM) revealed a slightly high percentage of pre-B-I subset in BM, with no blockage at this stage, as typically observed in classical X-linked agammaglobulinemia (XLA) patients. The phenotypic analysis in peripheral blood also revealed reduced absolute numbers of B cells, all pre-germinal center maturation stages, together with reduced but detectable numbers of different memory and plasma cell isotypes. The R562Q variant allows Btk expression and normal activation of anti-IgM-induced phosphorylation of Y551 but diminished autophosphorylation at Y223 after anti-IgM and CXCL12 stimulation.
Lastly, we explored the potential impact of the variant protein for downstream Btk signaling in B cells. Within the canonical nuclear factor kappa B (NF-κB) activation pathway, normal IκBα degradation occurs after CD40L stimulation in patient and control cells. In contrast, disturbed IκBα degradation and reduced calcium ion (Ca2+) influx occurs on anti-IgM stimulation in the patient's B cells, suggesting an enzymatic impairment of the mutated tyrosine kinase domain.
Affiliation(s)
- L. del Pino-Molina
  - Center for Biomedical Network Research on Rare Diseases, Instituto de Salud Carlos III (ISCIII) (CIBERER), Madrid, Spain
  - Lymphocyte Pathophysiology in Immunodeficiencies Group, La Paz Institute for Health Research (IdiPAZ), Madrid, Spain
  - *Correspondence: L. del Pino-Molina; E. López Granados
- L. Y. Bravo Gallego
  - Center for Biomedical Network Research on Rare Diseases, Instituto de Salud Carlos III (ISCIII) (CIBERER), Madrid, Spain
  - Lymphocyte Pathophysiology in Immunodeficiencies Group, La Paz Institute for Health Research (IdiPAZ), Madrid, Spain
- Y. Soto Serrano
  - Lymphocyte Pathophysiology in Immunodeficiencies Group, La Paz Institute for Health Research (IdiPAZ), Madrid, Spain
- K. Reche Yebra
  - Lymphocyte Pathophysiology in Immunodeficiencies Group, La Paz Institute for Health Research (IdiPAZ), Madrid, Spain
- J. Marty Lobo
  - Lymphocyte Pathophysiology in Immunodeficiencies Group, La Paz Institute for Health Research (IdiPAZ), Madrid, Spain
- B. González Martínez
  - Pediatric Hemato-Oncology Unit, La Paz University Hospital Madrid, Madrid, Spain
- M. Bravo García-Morato
  - Center for Biomedical Network Research on Rare Diseases, Instituto de Salud Carlos III (ISCIII) (CIBERER), Madrid, Spain
  - Lymphocyte Pathophysiology in Immunodeficiencies Group, La Paz Institute for Health Research (IdiPAZ), Madrid, Spain
  - Clinical Immunology Department, La Paz University Hospital Madrid, Madrid, Spain
- R. Rodríguez Pena
  - Center for Biomedical Network Research on Rare Diseases, Instituto de Salud Carlos III (ISCIII) (CIBERER), Madrid, Spain
  - Lymphocyte Pathophysiology in Immunodeficiencies Group, La Paz Institute for Health Research (IdiPAZ), Madrid, Spain
  - Clinical Immunology Department, La Paz University Hospital Madrid, Madrid, Spain
- M. van der Burg
  - Department of Pediatrics, Laboratory for Pediatric Immunology, Willem-Alexander Children’s Hospital, Leiden University Medical Centre, Leiden, Netherlands
- E. López Granados
  - Center for Biomedical Network Research on Rare Diseases, Instituto de Salud Carlos III (ISCIII) (CIBERER), Madrid, Spain
  - Lymphocyte Pathophysiology in Immunodeficiencies Group, La Paz Institute for Health Research (IdiPAZ), Madrid, Spain
  - Clinical Immunology Department, La Paz University Hospital Madrid, Madrid, Spain
4
Vihinen M. Individual Genetic Heterogeneity. Genes (Basel) 2022; 13:1626. [PMID: 36140794 PMCID: PMC9498725 DOI: 10.3390/genes13091626] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2022] [Revised: 08/25/2022] [Accepted: 09/08/2022] [Indexed: 11/28/2022] Open
Abstract
Genetic variation has been widely covered in the literature, but not from the perspective of an individual in any species. Here, a synthesis of genetic concepts and variations relevant for individual genetic constitution is provided. All the different levels of genetic information and variation are covered, ranging from whether an organism is unmixed or hybrid, has variations in genome, chromosomes, and more locally in DNA regions, to epigenetic variants or alterations in selfish genetic elements. Genetic constitution and heterogeneity of microbiota are highly relevant for health and wellbeing of an individual. Mutation rates vary widely for variation types, e.g., due to the sequence context. Genetic information guides numerous aspects in organisms. Types of inheritance, whether Mendelian or non-Mendelian, zygosity, sexual reproduction, and sex determination are covered. Functions of DNA and functional effects of variations are introduced, along with mechanisms that reduce and modulate functional effects, including TARAR countermeasures and intraindividual genetic conflict. TARAR countermeasures for tolerance, avoidance, repair, attenuation, and resistance are essential for life, integrity of genetic information, and gene expression. The genetic composition, effects of variations, and their expression are considered also in diseases and personalized medicine. The text synthesizes knowledge and insight on individual genetic heterogeneity and organizes and systematizes the central concepts.
Affiliation(s)
- Mauno Vihinen
  - Department of Experimental Medical Science, BMC B13, Lund University, SE-22184 Lund, Sweden
5
Targeted RNAseq Improves Clinical Diagnosis of Very Early-Onset Pediatric Immune Dysregulation. J Pers Med 2022; 12:jpm12060919. [PMID: 35743704 PMCID: PMC9224647 DOI: 10.3390/jpm12060919] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/28/2022] [Revised: 05/26/2022] [Accepted: 05/27/2022] [Indexed: 02/05/2023] Open
Abstract
Despite increased use of whole exome sequencing (WES) for the clinical analysis of rare disease, overall diagnostic yield for most disorders hovers around 30%. Previous studies of mRNA have succeeded in increasing diagnoses for clearly defined disorders of monogenic inheritance. We asked if targeted RNA sequencing could provide similar benefits for primary immunodeficiencies (PIDs) and very early-onset inflammatory bowel disease (VEOIBD), both of which are difficult to diagnose due to high heterogeneity and variable severity. We performed targeted RNA sequencing of a panel of 260 immune-related genes for a cohort of 13 patients (seven suspected PID cases and six VEOIBD) and analyzed variants, splicing, and exon usage. Exonic variants were identified in seven cases, some of which had been previously prioritized by exome sequencing. For four cases, allele specific expression or lack thereof provided additional insights into possible disease mechanisms. In addition, we identified five instances of aberrant splicing associated with four variants. Three of these variants had been previously classified as benign in ClinVar based on population frequency. Digenic or oligogenic inheritance is suggested for at least two patients. In addition to validating the use of targeted RNA sequencing, our results show that rare disease research will benefit from incorporating contributing genetic factors into the diagnostic approach.
6
Zhou Q, Teng Y, Pan J, Shi Q, Liu Y, Liang D, Li Z, Wu L. Identification of four novel mutations in BTK from six Chinese families with X-linked agammaglobulinemia. Clin Chim Acta 2022; 531:48-55. [PMID: 35245483 DOI: 10.1016/j.cca.2022.02.019] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/02/2022] [Revised: 02/15/2022] [Accepted: 02/26/2022] [Indexed: 01/17/2023]
Abstract
BACKGROUND Defects in the Bruton's tyrosine kinase (BTK) gene result in X-linked agammaglobulinemia (XLA), which is characterized by recurrent bacterial infections and immunodeficiency with low B-cell numbers and immunoglobulin levels. Diagnosis of XLA depends on clinical phenotype and genetic testing. METHODS Six unrelated Chinese families with high suspicion of XLA were enrolled in this study. Potential pathogenic variants were detected and validated by Whole Exome Sequencing (WES) and Sanger Sequencing. Western blot, Quantitative PCR (qPCR) analysis and immunofluorescence analysis were used to evaluate the preliminary function of candidate BTK variants. RESULTS A total of six variants were identified, four of which had not been reported before. For the novel missense mutation (c.1900T>G) and deletion (c.897delG), Western blot and qPCR showed reduced mutant protein and mRNA expression levels. We also constructed a minigene expression vector, which showed that the deletion (c.1751-6_1755delttctagGGGTT) results in skipping of 35 bp in exon 18. Meanwhile, the break point of the gross deletion (Exon2-5) discovered by WES was confirmed to be located at site ChX:101367539_101376531 through qPCR and Gap-PCR. CONCLUSION This study provides a definitive diagnosis for 6 families with suspected XLA and further expands the spectrum of BTK mutations, providing new information for the diagnosis of the disease.
Affiliation(s)
- Qimin Zhou
  - Center for Medical Genetics, Hunan Key Laboratory of Medical Genetics & Hunan Key Laboratory of Animal Models for Human Diseases, School of Life Sciences, Central South University, Changsha, China
- Yanling Teng
  - Center for Medical Genetics, Hunan Key Laboratory of Medical Genetics & Hunan Key Laboratory of Animal Models for Human Diseases, School of Life Sciences, Central South University, Changsha, China
- Jianyan Pan
  - Center for Medical Genetics, Hunan Key Laboratory of Medical Genetics & Hunan Key Laboratory of Animal Models for Human Diseases, School of Life Sciences, Central South University, Changsha, China
- Qingxin Shi
  - Center for Medical Genetics, Hunan Key Laboratory of Medical Genetics & Hunan Key Laboratory of Animal Models for Human Diseases, School of Life Sciences, Central South University, Changsha, China
- Yingdi Liu
  - Center for Medical Genetics, Hunan Key Laboratory of Medical Genetics & Hunan Key Laboratory of Animal Models for Human Diseases, School of Life Sciences, Central South University, Changsha, China
- Desheng Liang
  - Center for Medical Genetics, Hunan Key Laboratory of Medical Genetics & Hunan Key Laboratory of Animal Models for Human Diseases, School of Life Sciences, Central South University, Changsha, China; Laboratory of Molecular Genetics, Hunan Jiahui Genetics Hospital, Changsha, Hunan, China
- Zhuo Li
  - Center for Medical Genetics, Hunan Key Laboratory of Medical Genetics & Hunan Key Laboratory of Animal Models for Human Diseases, School of Life Sciences, Central South University, Changsha, China
- Lingqian Wu
  - Center for Medical Genetics, Hunan Key Laboratory of Medical Genetics & Hunan Key Laboratory of Animal Models for Human Diseases, School of Life Sciences, Central South University, Changsha, China; Laboratory of Molecular Genetics, Hunan Jiahui Genetics Hospital, Changsha, Hunan, China
7
Zhou M, Li Y. Modification of PAE-degrading Esterase (CarEW) for Higher Degradation Efficiency Through Integrated Homology Modeling, Molecular Docking, and Molecular Dynamics Simulation. Chem Res Chin Univ 2022. [DOI: 10.1007/s40242-022-1433-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
8
Zain R, Vihinen M. Structure-Function Relationships of Covalent and Non-Covalent BTK Inhibitors. Front Immunol 2021; 12:694853. [PMID: 34349760 PMCID: PMC8328433 DOI: 10.3389/fimmu.2021.694853] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Received: 04/13/2021] [Accepted: 06/21/2021] [Indexed: 01/20/2023] Open
Abstract
Low-molecular weight chemical compounds have a longstanding history as drugs. Target specificity and binding efficiency represent major obstacles for small molecules to become clinically relevant. Protein kinases are attractive cellular targets; however, they are challenging because they present one of the largest protein families and share structural similarities. Bruton tyrosine kinase (BTK), a cytoplasmic protein tyrosine kinase, has received much attention as a promising target for the treatment of B-cell malignancies and more recently autoimmune and inflammatory diseases. Here we describe the structural properties and binding modes of small-molecule BTK inhibitors, including irreversible and reversible inhibitors. Covalently binding compounds, such as ibrutinib, acalabrutinib and zanubrutinib, are discussed along with non-covalent inhibitors fenebrutinib and RN486. The focus of this review is on structure-function relationships.
Affiliation(s)
- Rula Zain
  - Department of Laboratory Medicine, Clinical Research Centre, Karolinska Institutet, Karolinska University Hospital, Huddinge, Sweden; Centre for Rare Diseases, Department of Clinical Genetics, Karolinska University Hospital, Stockholm, Sweden
- Mauno Vihinen
  - Department of Experimental Medical Science, Lund University, Lund, Sweden
9
Enhanced plant-microbe remediation of PCBs in soil using enzyme modification technique combined with molecular docking and molecular dynamics. Biochem J 2021; 478:1921-1941. [PMID: 33900386 DOI: 10.1042/bcj20210104] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 02/15/2021] [Revised: 04/21/2021] [Accepted: 04/26/2021] [Indexed: 11/17/2022]
Abstract
The study on the enhanced mechanisms of the enzymes involved in plant absorption, plant degradation, and microbial mineralization in the remediation of soils contaminated with polychlorinated biphenyls (PCBs) is of great significance for the application of plant-microbe combined remediation technique in PCB-contaminated soils. The present study first used a combination of molecular docking and molecular dynamics methods to calculate the effects of the plant absorption enzyme, plant degradation enzyme, and microbial mineralization enzyme on the PCBs in the soil environment. A multifunctional plant degradation enzyme was constructed with three functional roles of absorption, degradation, and mineralization using amino acid sequence recombination and site-directed mutagenesis to modify the template of plant degradation enzyme. Finally, using the Taguchi experimental design-assisted molecular dynamics simulation method, the suitable external environmental conditions of plant-microbe combined remediation of the PCB-contaminated soil were determined. In total, six multifunctional plant degradation enzymes were designed, which exhibited a significantly improved efficiency of PCB degradation. In comparison to the complex of plant absorption enzyme, plant degradation enzyme, and microorganism mineralization enzyme (6QIM-3GZX-1B85), the six multifunctional plant degradation enzymes exhibited significantly higher efficiency (2.10-2.38 times) in degrading the PCBs, with a maximum of 2.69 times under suitable external environmental conditions.
10
Primary Immunodeficiencies in India: Molecular Diagnosis and the Role of Next-Generation Sequencing. J Clin Immunol 2020; 41:393-413. [PMID: 33225392 DOI: 10.1007/s10875-020-00923-2] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 07/01/2020] [Accepted: 11/13/2020] [Indexed: 10/22/2022]
Abstract
Primary immunodeficiency diseases (PIDs) are a group of clinically and genetically heterogeneous disorders showing ethnic and geographic diversities. Next-generation sequencing (NGS) is a comprehensive tool to diagnose PID. Although PID is common in India, data on the genetic spectrum of PIDs are limited due to financial restrictions. The study aims to characterize the clinical and genetic spectrum of PID patients in India and highlight the importance of a cost-effective targeted gene panel sequencing approach for PID in a resource-limited setting. The study includes 229 patients with clinical and laboratory features suggestive of PIDs. Mutation analysis was done by Sanger sequencing and NGS targeting a customized panel of genes. Pathogenic variants were identified in 97 patients involving 42 different genes with BTK and IL12RB1 being the most common mutated genes. Autosomal recessive and X-linked recessive inheritance were seen in 51.6% and 23.7% of patients. Mendelian susceptibility to mycobacterial diseases (MSMD) and IL12RB1 mutations was more common in our population compared to the Western world and the Middle East. Two patients with hypomorphic RAG1 mutations and one female with skewed CYBB mutation were also identified. Another 40 patients had variants classified as variants of uncertain significance (VUS). The study shows that targeted NGS is an effective diagnostic strategy for PIDs in countries with limited diagnostic resources. Molecular diagnosis of PID helps in genetic counseling and to make therapeutic decisions including the need for a stem cell transplantation.
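The counts in this abstract can be turned into cohort proportions with a short Python sketch. The helper name `proportion` is hypothetical, and the unresolved count assumes that the 40 patients with variants of uncertain significance are distinct from the 97 with pathogenic variants, as the word "another" in the abstract suggests.

```python
def proportion(count, total):
    """Fraction of a cohort that falls into a category."""
    if total <= 0:
        raise ValueError("total must be positive")
    return count / total


cohort = 229      # patients with features suggestive of PID
pathogenic = 97   # patients with pathogenic variants identified
vus = 40          # further patients with variants of uncertain significance
unresolved = cohort - pathogenic - vus  # assumes the two groups do not overlap

print(f"Pathogenic variant identified: {proportion(pathogenic, cohort):.1%}")
print(f"VUS only: {proportion(vus, cohort):.1%}")
print(f"No candidate variant reported: {proportion(unresolved, cohort):.1%}")
```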
11
Sarkar A, Yang Y, Vihinen M. Variation benchmark datasets: update, criteria, quality and applications. Database: The Journal of Biological Databases and Curation 2020; 2020:5710862. [PMID: 32016318 PMCID: PMC6997940 DOI: 10.1093/database/baz117] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 04/12/2019] [Revised: 06/03/2019] [Accepted: 07/01/2019] [Indexed: 02/07/2023]
Abstract
Development of new computational methods and testing their performance has to be carried out using experimental data. Only in comparison to existing knowledge can method performance be assessed. For that purpose, benchmark datasets with known and verified outcome are needed. High-quality benchmark datasets are valuable and may be difficult, laborious and time consuming to generate. VariBench and VariSNP are the two existing databases for sharing variation benchmark datasets used mainly for variation interpretation. They have been used for training and benchmarking predictors for various types of variations and their effects. VariBench was updated with 419 new datasets from 109 papers containing altogether 329 014 152 variants; however, there is plenty of redundancy between the datasets. VariBench is freely available at http://structure.bmc.lu.se/VariBench/. The contents of the datasets vary depending on information in the original source. The available datasets have been categorized into 20 groups and subgroups. There are datasets for insertions and deletions, substitutions in coding and non-coding region, structure mapped, synonymous and benign variants. Effect-specific datasets include DNA regulatory elements, RNA splicing, and protein property for aggregation, binding free energy, disorder and stability. Then there are several datasets for molecule-specific and disease-specific applications, as well as one dataset for variation phenotype effects. Variants are often described at three molecular levels (DNA, RNA and protein) and sometimes also at the protein structural level including relevant cross references and variant descriptions. The updated VariBench facilitates development and testing of new methods and comparison of obtained performances to previously published methods. 
We compared the performance of the pathogenicity/tolerance predictor PON-P2 to several benchmark studies and show that such comparisons are feasible and useful; however, there may be limitations due to lack of provided details and shared data. Database URL: http://structure.bmc.lu.se/VariBench
Affiliation(s)
- Anasua Sarkar
  - Department of Experimental Medical Science, BMC B13, Lund University, SE-22 184 Lund, Sweden
- Yang Yang
  - School of Computer Science and Technology, Soochow University, No1. Shizi Street, Suzhou, 215006 Jiangsu, China; Provincial Key Laboratory for Computer Information Processing Technology, No1. Shizi Street, Soochow University, Suzhou, 215006 Jiangsu, China
- Mauno Vihinen
  - Department of Experimental Medical Science, BMC B13, Lund University, SE-22 184 Lund, Sweden
12
Vihinen M. Functional effects of protein variants. Biochimie 2020; 180:104-120. [PMID: 33164889 DOI: 10.1016/j.biochi.2020.10.009] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 01/30/2020] [Revised: 10/15/2020] [Accepted: 10/19/2020] [Indexed: 12/11/2022]
Abstract
Genetic and other variations frequently affect protein functions. Scientific articles can contain confusing descriptions about which function or property is affected, and in many cases the statements are pure speculation without any experimental evidence. To clarify functional effects of protein variations of genetic or non-genetic origin, a systematic conceptualisation and framework are introduced. This framework describes protein functional effects on abundance, activity, specificity and affinity, along with countermeasures, which allow cells, tissues and organisms to tolerate, avoid, repair, attenuate or resist (TARAR) the effects. Effects on abundance discussed include gene dosage, restricted expression, mis-localisation and degradation. Enzymopathies, effects on kinetics, allostery and regulation of protein activity are subtopics for the effects of variants on activity. Variation outcomes on specificity and affinity comprise promiscuity, specificity, affinity and moonlighting. TARAR mechanisms redress variations with active and passive processes including chaperones, redundancy, robustness, canalisation and metabolic and signalling rewiring. A framework for pragmatic protein function analysis and presentation is introduced. All of the mechanisms and effects are described along with representative examples, most often in relation to diseases. In addition, protein function is discussed from evolutionary point of view. Application of the presented framework facilitates unambiguous, detailed and specific description of functional effects and their systematic study.
Affiliation(s)
- Mauno Vihinen
  - Department of Experimental Medical Science, BMC B13, Lund University, SE-22 184, Lund, Sweden
13
Impact of amino acid substitution in the kinase domain of Bruton tyrosine kinase and its association with X-linked agammaglobulinemia. Int J Biol Macromol 2020; 164:2399-2408. [PMID: 32784026 DOI: 10.1016/j.ijbiomac.2020.08.057] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 06/17/2020] [Revised: 08/02/2020] [Accepted: 08/03/2020] [Indexed: 02/06/2023]
Abstract
X-linked agammaglobulinemia (XLA) is a rare disease that affects the immune system, characterized by a series of bacterial infections beginning in infancy. Bruton tyrosine kinase (BTK) is a non-receptor cytoplasmic kinase that plays a crucial role in B-lymphocyte maturation. Altered expression, mutation and/or structural variations of BTK are responsible for causing XLA. Here, we have performed extensive sequence and structure analyses of BTK to find deleterious variations and their pathogenic association with XLA. First, we screened the pathogenic variations in BTK from a pool of publicly available resources, and their pathogenicity/tolerance and stability predictions were carried out. Finally, two pathogenic variations (E589G and M630K) were studied in detail and subjected to all-atom molecular dynamics simulation for 200 ns. Intramolecular hydrogen bonds (H-bonds), secondary structure, and principal component analysis revealed significant conformational changes in variants that support the structural basis of BTK dysfunction in XLA. The free energy landscape analysis revealed the presence of multiple energy minima, suggesting that E589G brings a large destabilization and consequently unfolding behavior compared to M630K. Overall, our study suggests that the amino acid substitutions E589G and M630K significantly alter the structural conformation and stability of BTK.
14
Martin TA, Wu T, Tang Q, Dougherty LL, Parente DJ, Swint-Kruse L, Fenton AW. Identification of biochemically neutral positions in liver pyruvate kinase. Proteins 2020; 88:1340-1350. [PMID: 32449829 DOI: 10.1002/prot.25953] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 12/18/2019] [Revised: 03/10/2020] [Accepted: 05/16/2020] [Indexed: 01/08/2023]
Abstract
Understanding how each residue position contributes to protein function has been a long-standing goal in protein science. Substitution studies have historically focused on conserved protein positions. However, substitutions of nonconserved positions can also modify function. Indeed, we recently identified nonconserved positions that have large substitution effects in human liver pyruvate kinase (hLPYK), including altered allosteric coupling. To facilitate a comparison of which characteristics determine when a nonconserved position does vs does not contribute to function, the goal of the current work was to identify neutral positions in hLPYK. However, existing hLPYK data showed that three features commonly associated with neutral positions (high sequence entropy, high surface exposure, and alanine scanning) lacked the sensitivity needed to guide experimental studies. We used multiple evolutionary patterns identified in a sequence alignment of the PYK family to identify which positions were least patterned, reasoning that these were most likely to be neutral. Nine positions were tested with a total of 117 amino acid substitutions. Although exploring all potential functions is not feasible for any protein, five parameters associated with substrate/effector affinities and allosteric coupling were measured for hLPYK variants. For each position, the aggregate functional outcomes of all variants were used to quantify a "neutrality" score. Three positions showed perfect neutral scores for all five parameters. Furthermore, the nine positions showed larger neutral scores than 17 positions located near allosteric binding sites. Thus, our strategy successfully enriched the dataset for positions with neutral and modest substitutions.
Affiliation(s)
- Tyler A Martin
- Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, Kansas City, Kansas, USA
- Tiffany Wu
- Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, Kansas City, Kansas, USA
- Qingling Tang
- Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, Kansas City, Kansas, USA
- Larissa L Dougherty
- Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, Kansas City, Kansas, USA
- Daniel J Parente
- Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, Kansas City, Kansas, USA; Department of Family and Community Medicine, The University of Kansas Medical Center, Kansas City, Kansas, USA
- Liskin Swint-Kruse
- Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, Kansas City, Kansas, USA
- Aron W Fenton
- Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, Kansas City, Kansas, USA
15
New insights into the pathogenicity of non-synonymous variants through multi-level analysis. Sci Rep 2019; 9:1667. [PMID: 30733553 PMCID: PMC6367327 DOI: 10.1038/s41598-018-38189-9] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 07/11/2018] [Accepted: 12/19/2018] [Indexed: 12/17/2022] Open
Abstract
Precise classification of non-synonymous single nucleotide variants (SNVs) is a fundamental goal of clinical genetics. Next-generation sequencing technology is effective for establishing the basis of genetic diseases. However, identification of variants that are causal for genetic diseases remains a challenge. We analyzed human non-synonymous SNVs from a multilevel perspective to characterize pathogenicity. We showed that computational tools, though each has its own strengths and weaknesses, tend to be overly dependent on the degree of conservation. For the mutations at non-degenerate sites, the amino acid sites of pathogenic substitutions show a distinct distribution in the classes of protein domains compared with the sites of benign substitutions. Overlooked disease susceptibility of genes explains in part the failures of computational tools. The more pathogenic sites observed, the more likely the gene is to be expressed at high abundance or in a highly tissue-specific manner, and to have a high node degree in protein-protein interaction networks. The loss of function caused by some false-negative mutations may arise from release of an epigenetically repressed state, which should not occur under multiple biological conditions, rather than from a defective protein. Our work adds to our knowledge of the pathogenicity of non-synonymous SNVs and will thus benefit the field of clinical genetics.
16
Vince N, Mouillot G, Malphettes M, Limou S, Boutboul D, Guignet A, Bertrand V, Pellet P, Gourraud PA, Debré P, Oksenhendler E, Théodorou I, Fieschi C. Genetic screening of male patients with primary hypogammaglobulinemia can guide diagnosis and clinical management. Hum Immunol 2018; 79:571-577. [PMID: 29709555 DOI: 10.1016/j.humimm.2018.04.014] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 11/14/2017] [Revised: 04/25/2018] [Accepted: 04/26/2018] [Indexed: 10/17/2022]
Abstract
The precise diagnosis of an immunodeficiency is sometimes difficult to establish, especially given the large spectrum of phenotypic variation reported among patients. For a large part, common variable immunodeficiency disorders (CVID) do not have an identified genetic cause. The identification of a causal genetic mutation is important to confirm, or in some cases correct, the diagnosis. We screened >150 male patients with hypogammaglobulinemia for mutations in three genes involved in pediatric X-linked primary immunoglobulin deficiency: CD40LG, SH2D1A and BTK. The SH2D1A screening allowed two individuals with an initial CVID presentation to be reclassified as XLP after mutations were identified. All these mutations were associated with a lack of protein expression. In addition, 4 patients with a primary diagnosis of CVID and one with a primary IgG subclass deficiency were reclassified as XLA after BTK mutations were identified. Interestingly, two out of these 5 patients carried a damaging coding BTK mutation associated with a lower, but detectable, BTK expression in monocytes, suggesting that a dysfunctional protein explains the disease phenotype in these patients. In conclusion, our results support including SH2D1A and BTK in newly developed targeted NGS genetic testing, to help provide the most appropriate medical treatment and genetic counselling.
Affiliation(s)
- Nicolas Vince
- EA3963, Université Paris 7 Denis Diderot, Centre Hayem, Hôpital Saint-Louis, 1 Avenue Claude Vellefaux, 75010 Paris, France; Centre de Recherche en Transplantation et Immunologie UMR 1064, INSERM, Université de Nantes, Nantes, France; Institut de Transplantation Urologie Néphrologie (ITUN), CHU Nantes, Nantes, France.
- Gaël Mouillot
- Laboratoire Central d'Immunologie Cellulaire et Tissulaire, Hôpital Pitié Salpêtrière et INSERM UMR-S945, Bâtiment CERVI, Paris, France
- Marion Malphettes
- EA3963, Université Paris 7 Denis Diderot, Centre Hayem, Hôpital Saint-Louis, 1 Avenue Claude Vellefaux, 75010 Paris, France; Département d'Immunologie Clinique, Hôpital Saint-Louis, AP-HP, 1 Avenue Claude Vellefaux, 75010 Paris, France
- Sophie Limou
- Centre de Recherche en Transplantation et Immunologie UMR 1064, INSERM, Université de Nantes, Nantes, France; Institut de Transplantation Urologie Néphrologie (ITUN), CHU Nantes, Nantes, France; Ecole Centrale de Nantes, Nantes, France
- David Boutboul
- EA3963, Université Paris 7 Denis Diderot, Centre Hayem, Hôpital Saint-Louis, 1 Avenue Claude Vellefaux, 75010 Paris, France
- Angélique Guignet
- EA3963, Université Paris 7 Denis Diderot, Centre Hayem, Hôpital Saint-Louis, 1 Avenue Claude Vellefaux, 75010 Paris, France
- Véronique Bertrand
- Laboratoire Central d'Immunologie Cellulaire et Tissulaire, Hôpital Pitié Salpêtrière et INSERM UMR-S945, Bâtiment CERVI, Paris, France
- Philippe Pellet
- Laboratoire Central d'Immunologie Cellulaire et Tissulaire, Hôpital Pitié Salpêtrière et INSERM UMR-S945, Bâtiment CERVI, Paris, France
- Pierre-Antoine Gourraud
- Centre de Recherche en Transplantation et Immunologie UMR 1064, INSERM, Université de Nantes, Nantes, France; Institut de Transplantation Urologie Néphrologie (ITUN), CHU Nantes, Nantes, France
- Patrice Debré
- Laboratoire Central d'Immunologie Cellulaire et Tissulaire, Hôpital Pitié Salpêtrière et INSERM UMR-S945, Bâtiment CERVI, Paris, France
- Eric Oksenhendler
- Département d'Immunologie Clinique, Hôpital Saint-Louis, AP-HP, 1 Avenue Claude Vellefaux, 75010 Paris, France
- Ioannis Théodorou
- Laboratoire Central d'Immunologie Cellulaire et Tissulaire, Hôpital Pitié Salpêtrière et INSERM UMR-S945, Bâtiment CERVI, Paris, France
- Claire Fieschi
- EA3963, Université Paris 7 Denis Diderot, Centre Hayem, Hôpital Saint-Louis, 1 Avenue Claude Vellefaux, 75010 Paris, France; Département d'Immunologie Clinique, Hôpital Saint-Louis, AP-HP, 1 Avenue Claude Vellefaux, 75010 Paris, France
17
He S, Tong X, Han M, Bai Y, Dai F. Genome-Wide Identification and Characterization of Tyrosine Kinases in the Silkworm, Bombyx mori. Int J Mol Sci 2018; 19:E934. [PMID: 29561793 PMCID: PMC5979338 DOI: 10.3390/ijms19040934] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 03/07/2018] [Revised: 03/16/2018] [Accepted: 03/20/2018] [Indexed: 12/19/2022] Open
Abstract
The tyrosine kinases (TKs) are important parts of metazoan signaling pathways and play significant roles in cell growth, development, apoptosis and disease. Genome-wide characterization of TKs has been conducted in many metazoans; however, systematic information about this family in Lepidoptera is still lacking. We retrieved 33 TK-encoding genes in silkworm and classified them into 25 subfamilies by sequence analysis, with no members in the AXL, FRK, PDGFR, STYK1 and TIE subfamilies. Although domain sequences in each subfamily are conserved, TKs in vertebrates tend to be remarkably conserved and stable. Our phylogenetic analysis supported the previous conclusion regarding the second major expansion of the TK family. Gene Ontology (GO) analysis revealed that a higher proportion of BmTKs played roles in binding, catalysis, signal transduction, metabolism, biological regulation and response to stimulus, compared to all silkworm genes annotated in GO. Moreover, the expression profile analysis of BmTKs among multiple tissues and developmental stages demonstrated that many genes exhibited stage-specific and/or sex-related expression during embryogenesis, molting and metamorphosis, and that 8 BmTKs presented tissue-specific high expression. Our study provides a systematic description of silkworm tyrosine kinases, and may also provide further insights into metazoan TKs and assist future studies addressing their functions.
Affiliation(s)
- Songzhen He
- State Key Laboratory of Silkworm Genome Biology, Key Laboratory of Sericultural Biology and Genetic Breeding, Ministry of Agriculture, Southwest University, Chongqing 400715, China.
- Xiaoling Tong
- State Key Laboratory of Silkworm Genome Biology, Key Laboratory of Sericultural Biology and Genetic Breeding, Ministry of Agriculture, Southwest University, Chongqing 400715, China.
- Minjin Han
- State Key Laboratory of Silkworm Genome Biology, Key Laboratory of Sericultural Biology and Genetic Breeding, Ministry of Agriculture, Southwest University, Chongqing 400715, China.
- Yanmin Bai
- State Key Laboratory of Silkworm Genome Biology, Key Laboratory of Sericultural Biology and Genetic Breeding, Ministry of Agriculture, Southwest University, Chongqing 400715, China.
- Fangyin Dai
- State Key Laboratory of Silkworm Genome Biology, Key Laboratory of Sericultural Biology and Genetic Breeding, Ministry of Agriculture, Southwest University, Chongqing 400715, China.
18
Feng J, He L, Li Y, Xiao F, Hu G. Modeling of PH Domains and Phosphoinositides Interactions and Beyond. Adv Exp Med Biol 2018; 1111:19-32. [DOI: 10.1007/5584_2018_236] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Indexed: 12/12/2022]
19
Čalyševa J, Vihinen M. PON-SC - program for identifying steric clashes caused by amino acid substitutions. BMC Bioinformatics 2017; 18:531. [PMID: 29187139 PMCID: PMC5707825 DOI: 10.1186/s12859-017-1947-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 06/14/2017] [Accepted: 11/21/2017] [Indexed: 11/10/2022] Open
Abstract
Background: Amino acid substitutions due to DNA nucleotide replacements are frequently disease-causing because they affect functionally important sites. If the substituting amino acid does not fit into the protein, it causes structural alterations that are often harmful. Clashes of amino acids cause local or global structural changes. Testing structural compatibility of variations has been difficult due to the lack of a dedicated method that could handle vast amounts of variation data produced by next generation sequencing technologies. Results: We developed a method, PON-SC, for detecting protein structural clashes due to amino acid substitutions. The method utilizes a side chain rotamer library and tests whether any of the common rotamers can be fitted into the protein structure. The tool was tested both with variants that cause and do not cause clashes and found to have an accuracy of 0.71 over five test datasets. Conclusions: We developed a fast method for residue side chain clash detection. In addition to the prediction, the method provides a visualization of the variant in the three-dimensional structure.
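The rotamer test described in this abstract can be illustrated with a minimal sketch. It is not the PON-SC implementation: the function names, the toy coordinates, and the 3.0 Å distance cutoff are all assumptions made for illustration, since the abstract does not state the clash criterion used. The sketch only shows the general idea that a substitution is considered compatible if at least one candidate rotamer avoids all neighbouring atoms.

```python
import math

# Hypothetical cutoff in angstroms; the abstract does not give PON-SC's criterion.
MIN_DISTANCE = 3.0


def rotamer_clashes(rotamer_atoms, environment_atoms, min_distance=MIN_DISTANCE):
    """Return True if any rotamer atom is closer than min_distance to an environment atom."""
    return any(
        math.dist(atom, neighbor) < min_distance
        for atom in rotamer_atoms
        for neighbor in environment_atoms
    )


def substitution_fits(rotamers, environment_atoms, min_distance=MIN_DISTANCE):
    """Return True if at least one candidate rotamer fits without a clash."""
    return any(
        not rotamer_clashes(rotamer, environment_atoms, min_distance)
        for rotamer in rotamers
    )


# Toy data (assumed): two surrounding protein atoms and two candidate rotamers,
# each rotamer given as a list of side-chain atom coordinates.
environment = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
rotamer_a = [(1.0, 0.0, 0.0)]  # 1.0 from the first environment atom, so it clashes
rotamer_b = [(2.5, 4.0, 0.0)]  # about 4.7 from both environment atoms, so it fits

print(substitution_fits([rotamer_a], environment))             # only the clashing rotamer
print(substitution_fits([rotamer_a, rotamer_b], environment))  # one rotamer fits
```

With only the clashing rotamer available the substitution is reported as not fitting; adding the second rotamer makes it fit. A real tool would read atom coordinates from a structure file and use element-specific radii rather than a single cutoff.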
Affiliation(s)
- Jelena Čalyševa
- Protein Structure and Bioinformatics, Department of Experimental Medical Science, Lund University, BMC B13, SE-22 184, Lund, Sweden; Present address: EMBL Heidelberg, Meyerhofstraße 1, 69117, Heidelberg, Germany
- Mauno Vihinen
- Protein Structure and Bioinformatics, Department of Experimental Medical Science, Lund University, BMC B13, SE-22 184, Lund, Sweden.
20
Identification and characterization of tyrosine kinases in anole lizard indicate the conserved tyrosine kinase repertoire in vertebrates. Mol Genet Genomics 2017; 292:1405-1418. [PMID: 28819830 DOI: 10.1007/s00438-017-1356-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 03/03/2017] [Accepted: 08/08/2017] [Indexed: 10/19/2022]
Abstract
The tyrosine kinases (TKs) play principal roles in regulation of multicellular aspects of the organism and are implicated in many cancer types and congenital disorders. The anole lizard has recently been introduced as a model organism for laboratory-based studies of organismal function and field studies of ecology and evolution. However, the TK family of the anole lizard has not yet been systematically identified and characterized. In this study, we identified 82 TK-encoding genes in the anole lizard genome and classified them into 28 subfamilies through phylogenetic analysis, with no members of the ROS and STYK1 subfamilies identified. Although TK domain sequences and domain organization in each subfamily were conserved, the total number of TKs varied considerably between species. In addition, extensive evolutionary analysis in metazoans indicated that the TK repertoire in vertebrates tends to be remarkably stable. Phylogenetic analysis of the Eph subfamily indicated that the divergence of EphA and EphB occurred prior to the whole genome duplication (WGD) but after the split of Urochordates and vertebrates. Moreover, the expression pattern analysis of lizard TK genes among 9 different tissues showed that 14 TK genes exhibited tissue-specific expression and 6 TK genes were widely expressed. Comparative analysis of TK expression suggested that tissue-specifically expressed genes showed different expression patterns, whereas widely expressed genes showed similar patterns, between anole lizard and human. These results may provide insights into the evolutionary diversification of animal TK genes and would aid future studies on TK protein regulation of key growth and developmental processes.
21
Schaafsma GCP, Vihinen M. Large differences in proportions of harmful and benign amino acid substitutions between proteins and diseases. Hum Mutat 2017; 38:839-848. [DOI: 10.1002/humu.23236] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 02/22/2017] [Revised: 04/05/2017] [Accepted: 04/20/2017] [Indexed: 12/21/2022]
Affiliation(s)
- Gerard C. P. Schaafsma
- Protein Structure and Bioinformatics; Department of Experimental Medical Science; Lund University; Lund Sweden
22
Niroula A, Vihinen M. PON-P and PON-P2 predictor performance in CAGI challenges: Lessons learned. Hum Mutat 2017; 38:1085-1091. [PMID: 28224672 DOI: 10.1002/humu.23199] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 08/24/2016] [Revised: 01/25/2017] [Accepted: 02/17/2017] [Indexed: 01/14/2023]
Abstract
Computational tools are widely used for ranking and prioritizing variants for characterizing their disease relevance. Since numerous tools have been developed, they have to be properly assessed before being applied. Critical Assessment of Genome Interpretation (CAGI) experiments have significantly contributed toward the assessment of prediction methods for various tasks. Within and outside the CAGI, we have addressed several questions that facilitate development and assessment of variation interpretation tools. These areas include collection and distribution of benchmark datasets, their use for systematic large-scale method assessment, and the development of guidelines for reporting methods and their performance. For us, CAGI has provided a chance to experiment with new ideas, test the application areas of our methods, and network with other prediction method developers. In this article, we discuss our experiences and lessons learned from the various CAGI challenges. We describe our approaches, their performance, and impact of CAGI on our research. Finally, we discuss some of the possibilities that CAGI experiments have opened up and make some suggestions for future experiments.
Affiliation(s)
- Abhishek Niroula
- Protein Structure and Bioinformatics Group, Department of Experimental Medical Science, Lund University, Lund, Sweden
- Mauno Vihinen
- Protein Structure and Bioinformatics Group, Department of Experimental Medical Science, Lund University, Lund, Sweden
23
Smith CIE. From identification of the BTK kinase to effective management of leukemia. Oncogene 2017; 36:2045-2053. [PMID: 27669440 PMCID: PMC5395699 DOI: 10.1038/onc.2016.343] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.7] [Received: 05/20/2016] [Revised: 08/02/2016] [Accepted: 08/03/2016] [Indexed: 12/11/2022]
Abstract
BTK is a cytoplasmic protein-tyrosine kinase, whose corresponding gene was isolated in the early 1990s. BTK was initially identified by positional cloning of the gene causing X-linked agammaglobulinemia and independently in a search for new kinases. Given the phenotype of affected patients, namely lack of B-lymphocytes and plasma cells with the ensuing inability to mount humoral immune responses, BTK inhibitors were anticipated to have beneficial effects on antibody-mediated pathologies, such as autoimmunity. In contrast to, for example, the SRC family of cytoplasmic kinases, there was no obvious way in which structural alterations would yield constitutively active forms of BTK, and such mutations were also not found in leukemias or lymphomas. In 2007, the first efficient inhibitor, ibrutinib, was reported and was soon approved both in the United States and in Europe for the treatment of three B-cell malignancies: mantle cell lymphoma, chronic lymphocytic leukemia and Waldenström's macroglobulinemia. Over the past few years, additional inhibitors have been developed, with acalabrutinib being more selective and recently demonstrating fewer clinical adverse effects. The antitumor mechanism is also not related to mutations in BTK. Instead, tumor residency in lymphoid organs is inhibited, making these drugs highly versatile. BTK is one of only 10 human kinases that carry a cysteine in the adenosine triphosphate-binding cleft. As this allows for covalent, irreversible inhibitor binding, it provides these compounds with a highly advantageous character. This quality may be crucial and bodes well for the future of BTK-modifying medicines, which have been estimated to reach annual multi-billion dollar sales in the future.
Affiliation(s)
- C I E Smith
- Clinical Research Center, Department of Laboratory Medicine, Karolinska Institutet, Karolinska University Hospital, Huddinge, Sweden
24
Pons T, Vazquez M, Matey-Hernandez ML, Brunak S, Valencia A, Izarzugaza JM. KinMutRF: a random forest classifier of sequence variants in the human protein kinase superfamily. BMC Genomics 2016; 17 Suppl 2:396. [PMID: 27357839 PMCID: PMC4928150 DOI: 10.1186/s12864-016-2723-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 01/28/2023] Open
Abstract
Background: The association between aberrant signal processing by protein kinases and human diseases such as cancer was established a long time ago. However, understanding the link between sequence variants in the protein kinase superfamily and the mechanistic complex traits at the molecular level remains challenging: cells tolerate most genomic alterations and only a minor fraction disrupt molecular function sufficiently to drive disease. Results: KinMutRF is a novel random-forest method to automatically identify pathogenic variants in human kinases. Twenty-six decision trees implemented as a random forest weigh a battery of features that characterize the variants: a) at the gene level, including membership of a Kinbase group and Gene Ontology terms; b) at the PFAM domain level; and c) at the residue level, the types of amino acids involved, changes in biochemical properties, and functional annotations from UniProt, Phospho.ELM and FireDB. KinMutRF identifies disease-associated variants satisfactorily (Acc: 0.88, Prec: 0.82, Rec: 0.75, F-score: 0.78, MCC: 0.68) when trained and cross-validated with the 3689 human kinase variants from UniProt that have been annotated as neutral or pathogenic. All unclassified variants were excluded from the training set. Furthermore, KinMutRF is discussed with respect to two independent kinase-specific sets of mutations not included in the training and testing, Kin-Driver (643 variants) and Pon-BTK (1495 variants). Moreover, we provide predictions for the 848 protein kinase variants in UniProt that remained unclassified. A public implementation of KinMutRF, including documentation and examples, is available online (http://kinmut2.bioinfo.cnio.es). The source code for local installation is released under a GPL version 3 license, and can be downloaded from https://github.com/Rbbt-Workflows/KinMut2. Conclusions: KinMutRF is capable of classifying kinase variation with good performance. Predictions by KinMutRF compare favorably in a benchmark with other state-of-the-art methods (i.e. SIFT, PolyPhen-2, MutationAssessor, MutationTaster, LRT, CADD, FATHMM, and VEST). Kinase-specific features rank as the most elucidatory in terms of information gain and likely account for the improvement in prediction performance. This advocates for the development of family-specific classifiers able to exploit the discriminatory power of features unique to individual protein families.
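As a quick consistency check on the performance figures quoted in this abstract, the F-score can be recomputed from the reported precision and recall. The sketch below assumes the abstract's "F-score" is the standard F1 measure (the harmonic mean of precision and recall); the function name is illustrative and is not part of KinMutRF.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (the standard F1 measure)."""
    return 2 * precision * recall / (precision + recall)


# Values as reported in the abstract: Prec 0.82, Rec 0.75.
reported_precision = 0.82
reported_recall = 0.75

print(round(f1_score(reported_precision, reported_recall), 2))  # 0.78
```

The result agrees with the reported F-score of 0.78, which supports the assumption that F1 is the measure meant.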
Affiliation(s)
- Tirso Pons
- Structural Biology and BioComputing Programme, Spanish National Cancer Research Centre (CNIO), Melchor Fernández Almagro, 3, 28029, Madrid, Spain
- Miguel Vazquez
- Structural Biology and BioComputing Programme, Spanish National Cancer Research Centre (CNIO), Melchor Fernández Almagro, 3, 28029, Madrid, Spain
- María Luisa Matey-Hernandez
- Center for Biological Sequence Analysis (CBS), Systems Biology Department, Technical University of Denmark (DTU), Kemitorvet, Building 208, 2800 Kgs., Lyngby, Denmark
- Søren Brunak
- Center for Biological Sequence Analysis (CBS), Systems Biology Department, Technical University of Denmark (DTU), Kemitorvet, Building 208, 2800 Kgs., Lyngby, Denmark; Novo Nordisk Foundation Center for Protein Research, Faculty of Health Sciences, University of Copenhagen, Blegdamsvej 3A, 2200, Copenhagen, Denmark
- Alfonso Valencia
- Structural Biology and BioComputing Programme, Spanish National Cancer Research Centre (CNIO), Melchor Fernández Almagro, 3, 28029, Madrid, Spain
- Jose Mg Izarzugaza
- Center for Biological Sequence Analysis (CBS), Systems Biology Department, Technical University of Denmark (DTU), Kemitorvet, Building 208, 2800 Kgs., Lyngby, Denmark.
25
Substitution scanning identifies a novel, catalytically active ibrutinib-resistant BTK cysteine 481 to threonine (C481T) variant. Leukemia 2016; 31:177-185. [PMID: 27282255 PMCID: PMC5220130 DOI: 10.1038/leu.2016.153] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.5] [Received: 11/14/2015] [Revised: 05/11/2016] [Accepted: 05/18/2016] [Indexed: 01/01/2023]
Abstract
The irreversible Bruton tyrosine kinase (BTK) inhibitors ibrutinib and acalabrutinib have demonstrated remarkable clinical responses in multiple B-cell malignancies. Acquired resistance has been identified in a sub-population of patients in which mutations affecting BTK predominantly substitute cysteine 481 in the kinase domain for catalytically active serine, thereby ablating covalent binding of inhibitors. Activating substitutions in the BTK substrate phospholipase Cγ2 (PLCγ2) instead confer resistance independent of BTK. Herein, we generated all six possible amino acid substitutions due to single nucleotide alterations for the cysteine 481 codon, in addition to threonine, requiring two nucleotide substitutions, and performed functional analysis. Replacement by arginine, phenylalanine, tryptophan or tyrosine completely inactivated the catalytic activity, whereas substitution with glycine caused severe impairment. BTK with threonine replacement was catalytically active, similar to substitution with serine. We identify three potential ibrutinib resistance scenarios for cysteine 481 replacement: (1) Serine, being catalytically active and therefore predominating among patients. (2) Threonine, also being catalytically active, but predicted to be scarce, because two nucleotide changes are needed. (3) As BTK variants replaced with other residues are catalytically inactive, they presumably need compensatory mutations, therefore being very scarce. Glycine and tryptophan variants have not yet been reported but likely also provide resistance.
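The count of six single-nucleotide substitutions, and the need for two changes to reach threonine, follows directly from the standard genetic code. The sketch below enumerates the amino acids reachable from a cysteine codon by one nucleotide change. The abstract does not state which cysteine codon (TGT or TGC) encodes residue 481, so both are checked; the function names are illustrative only.

```python
from itertools import product

# Standard genetic code, with codons ordered by the bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def single_nucleotide_products(codon):
    """Amino acids (or '*' for stop) encoded by codons one nucleotide away."""
    products = set()
    for position in range(3):
        for base in BASES:
            if base != codon[position]:
                mutant = codon[:position] + base + codon[position + 1:]
                products.add(CODON_TABLE[mutant])
    return products


def missense_products(codon):
    """Single-nucleotide products excluding the original residue and stop codons."""
    return single_nucleotide_products(codon) - {CODON_TABLE[codon], "*"}


for cys_codon in ("TGT", "TGC"):
    reachable = sorted(missense_products(cys_codon))
    print(cys_codon, reachable, "T reachable:", "T" in reachable)
```

For either cysteine codon the missense products are F, G, R, S, W and Y (phenylalanine, glycine, arginine, serine, tryptophan and tyrosine), matching the six substitutions named in the abstract. Threonine is absent because all threonine codons (ACN) differ from TGT and TGC at two positions.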
26
Niroula A, Vihinen M. Variation Interpretation Predictors: Principles, Types, Performance, and Choice. Hum Mutat 2016; 37:579-97. [DOI: 10.1002/humu.22987] [Citation(s) in RCA: 90] [Impact Index Per Article: 11.3] [Received: 11/16/2015] [Accepted: 03/07/2016] [Indexed: 12/18/2022]
Affiliation(s)
- Abhishek Niroula
- Department of Experimental Medical Science; Lund University; BMC B13 Lund SE-22184 Sweden
- Mauno Vihinen
- Department of Experimental Medical Science; Lund University; BMC B13 Lund SE-22184 Sweden
27
Fang M, Abolhassani H, Lim CK, Zhang J, Hammarström L. Next Generation Sequencing Data Analysis in Primary Immunodeficiency Disorders - Future Directions. J Clin Immunol 2016; 36 Suppl 1:68-75. [PMID: 26993986 DOI: 10.1007/s10875-016-0260-y] [Citation(s) in RCA: 49] [Impact Index Per Article: 6.1] [Received: 02/11/2016] [Accepted: 02/28/2016] [Indexed: 12/16/2022]
Abstract
Primary immunodeficiency diseases (PIDs) comprise a group of highly heterogeneous immune system diseases and around 300 forms of PID have been described to date. Next Generation Sequencing (NGS) has recently become an increasingly used approach for gene identification and molecular diagnosis of human diseases. Herein we summarize the practical considerations for the interpretation of NGS data and the techniques for searching disease-related PID genes, and suggest future directions for research in this field.
Affiliation(s)
- Mingyan Fang
- Department of Laboratory Medicine, Division of Clinical Immunology and Transfusion Medicine, Karolinska University Hospital Huddinge, SE-141 86 Stockholm, Sweden; BGI-Shenzhen, Shenzhen, 518083, China
- Hassan Abolhassani
- Department of Laboratory Medicine, Division of Clinical Immunology and Transfusion Medicine, Karolinska University Hospital Huddinge, SE-141 86 Stockholm, Sweden; Research Center for Immunodeficiencies, Pediatrics Center of Excellence, Children's Medical Center, Tehran University of Medical Sciences, Tehran, Iran
- Che Kang Lim
- Department of Laboratory Medicine, Division of Clinical Immunology and Transfusion Medicine, Karolinska University Hospital Huddinge, SE-141 86 Stockholm, Sweden; Department of Clinical Research, Singapore General Hospital, Singapore, 169856, Singapore
- Lennart Hammarström
- Department of Laboratory Medicine, Division of Clinical Immunology and Transfusion Medicine, Karolinska University Hospital Huddinge, SE-141 86 Stockholm, Sweden.
28
Computational Analysis of the Binding Specificities of PH Domains. Biomed Res Int 2015; 2015:792904. [PMID: 26881206 PMCID: PMC4735990 DOI: 10.1155/2015/792904] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Received: 10/30/2015] [Revised: 12/13/2015] [Accepted: 12/17/2015] [Indexed: 12/15/2022]
Abstract
Pleckstrin homology (PH) domains share low sequence identity but highly conserved structures. They occur in many proteins, where they mediate cellular signal-dependent membrane targeting by binding inositol phosphates and thereby support different physiological functions. To understand the sequence-structure relationship and binding specificities of PH domains, we combined quantum mechanical (QM) calculations with sequence-based and structure-based binding analysis. Structurally, the binding specificities were shown to correlate with the hydropathy characteristics of PH domains and the electrostatic properties of the bound inositol phosphates. By comparing these structural properties with sequence-based profiles of physicochemical properties, PH domains can be classified into four functional subgroups according to their binding specificities and affinities for inositol phosphates. The method not only provides a simple and practical paradigm for predicting binding specificities in functional genomic research but also gives new insight into the basis of diseases with respect to PH domain structures.
29
Vazquez M, Pons T, Brunak S, Valencia A, Izarzugaza JMG. wKinMut-2: Identification and Interpretation of Pathogenic Variants in Human Protein Kinases. Hum Mutat 2015; 37:36-42. [PMID: 26443060 DOI: 10.1002/humu.22914] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 04/02/2015] [Accepted: 09/22/2015] [Indexed: 12/31/2022]
Abstract
Most genomic alterations are tolerated, while only a minor fraction disrupts molecular function sufficiently to drive disease. Protein kinases play central biological roles, and the functional consequences of their variants are abundantly characterized. However, this heterogeneous information is often scattered across different sources, which makes integrative analysis complex and laborious. wKinMut-2 facilitates the interpretation of the consequences of human protein kinase variation. Nine methods predict the pathogenicity of variants, including a kinase-specific random forest approach. To understand the biological mechanisms that cause human diseases and cancer, information from pertinent reference knowledge bases and the literature is automatically mined, digested, and homogenized. Variants are visualized in their structural contexts, and residues affecting catalysis and drug binding are identified. Known protein-protein interactions are reported. Altogether, this information is intended to assist the generation of new working hypotheses to be corroborated by subsequent experimental work. The wKinMut-2 system, along with a user manual and examples, is freely accessible at http://kinmut2.bioinfo.cnio.es; the code for local installations can be downloaded from https://github.com/Rbbt-Workflows/KinMut2.
Affiliation(s)
- Miguel Vazquez
- Structural Biology and BioComputing Programme, Spanish National Cancer Research Centre (CNIO), Madrid, 28029, Spain
- Tirso Pons
- Structural Biology and BioComputing Programme, Spanish National Cancer Research Centre (CNIO), Madrid, 28029, Spain
- Søren Brunak
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health Sciences, University of Copenhagen, Copenhagen 2200, Denmark; Center for Biological Sequence Analysis (CBS), Systems Biology Department, Technical University of Denmark (DTU), Kongens Lyngby 2800, Denmark
- Alfonso Valencia
- Structural Biology and BioComputing Programme, Spanish National Cancer Research Centre (CNIO), Madrid, 28029, Spain
- Jose M G Izarzugaza
- Center for Biological Sequence Analysis (CBS), Systems Biology Department, Technical University of Denmark (DTU), Kongens Lyngby 2800, Denmark
30
Niroula A, Vihinen M. Classification of Amino Acid Substitutions in Mismatch Repair Proteins Using PON-MMR2. Hum Mutat 2015; 36:1128-34. [PMID: 26333163 DOI: 10.1002/humu.22900] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 04/27/2015] [Accepted: 08/24/2015] [Indexed: 12/21/2022]
Abstract
Variations in mismatch repair (MMR) system genes cause Lynch syndrome and other cancers. Thousands of variants have been identified in MMR genes, but the clinical relevance is known for only a small proportion. Recently, the InSiGHT group classified 2,360 MMR variants into five classes. One-third of the variants, the majority of which are nonsynonymous, remain of uncertain clinical relevance. Computational tools can be used to prioritize variants for disease relevance investigations. Previously, we classified 248 MMR variants as likely pathogenic or likely benign using PON-MMR. We have developed a novel tool, PON-MMR2, which is trained on a larger and more reliable dataset. In a performance comparison, PON-MMR2 outperforms both generic tolerance prediction methods and methods optimized for MMR variants. It achieves an accuracy and MCC of 0.89 and 0.78, respectively, in cross-validation and 0.86 and 0.69, respectively, on an independent test dataset. We classified 354 class 3 variants in the InSiGHT database as well as all possible amino acid substitutions in four MMR proteins. Likely harmful variants mainly appear in the protein core, whereas likely benign variants are on the surface. PON-MMR2 is a highly reliable tool to prioritize variants for functional analysis. It is freely available at http://structure.bmc.lu.se/PON-MMR2/.
Affiliation(s)
- Abhishek Niroula
- Department of Experimental Medical Science, Lund University, BMC B13, SE-22184, Lund, Sweden
- Mauno Vihinen
- Department of Experimental Medical Science, Lund University, BMC B13, SE-22184, Lund, Sweden
31
Niroula A, Vihinen M. Harmful somatic amino acid substitutions affect key pathways in cancers. BMC Med Genomics 2015; 8:53. [PMID: 26282678 PMCID: PMC4539680 DOI: 10.1186/s12920-015-0125-x] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 06/18/2015] [Accepted: 07/30/2015] [Indexed: 12/14/2022] Open
Abstract
Background: Cancer is characterized by the accumulation of large numbers of genetic variations and alterations of multiple biological phenomena. Cancer genomics has largely focused on the identification of such genetic alterations and the genes containing them, known as ‘cancer genes’. However, non-functional somatic variations outnumber functional variations and remain a major challenge. Recurrent somatic variations are thought to be cancer drivers, but they are present in only a small fraction of patients.
Methods: We performed an extensive analysis of amino acid substitutions (AASs) from 6,861 cancer samples (whole genome or exome sequences) classified into 30 cancer types and performed pathway enrichment analysis. We also studied the overlap between the cancers based on proteins containing harmful AASs and the pathways affected by them.
Results: We found that only a fraction of AASs (39.88%) are harmful, even in known cancer genes. In addition, we found that proteins containing harmful AASs in cancers are often centrally located in protein interaction networks. Based on the proteins containing harmful AASs, we identified significantly affected pathways in 28 cancer types and show that proteins containing harmful AASs can affect pathways regardless of the frequency of AASs in them. Our cross-cancer overlap analysis showed that it would be more beneficial to identify affected pathways in cancers rather than individual genes and variations.
Conclusion: Pathways affected by harmful AASs reveal key processes involved in cancer development. Our approach filters out the putative benign AASs, thus reducing the list of cancer variations and allowing reliable identification of affected pathways. The pathways identified in individual cancers and the overlap between cancer types open avenues for further experimental research and for developing targeted therapies and interventions.
Electronic supplementary material: The online version of this article (doi:10.1186/s12920-015-0125-x) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Abhishek Niroula
- Department of Experimental Medical Science, Lund University, BMC B13, SE-22184, Lund, Sweden.
- Mauno Vihinen
- Department of Experimental Medical Science, Lund University, BMC B13, SE-22184, Lund, Sweden.
32