1. de Abreu AP, Carvalho FC, Mariano D, Bastos LL, Silva JRP, de Oliveira LM, de Melo-Minardi RC, Sabino ADP. An Approach for Engineering Peptides for Competitive Inhibition of the SARS-COV-2 Spike Protein. Molecules 2024; 29:1577. [PMID: 38611856 PMCID: PMC11013848 DOI: 10.3390/molecules29071577] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2024] [Revised: 02/29/2024] [Accepted: 03/22/2024] [Indexed: 04/14/2024] [Open Access]
Abstract
SARS-CoV-2 is the virus responsible for COVID-19, a respiratory disease that devastated global public health. Since 2020, there has been an intense effort by the scientific community to develop safe and effective prophylactic and therapeutic agents against this disease. In this context, peptides have emerged as an alternative for inhibiting the causative agent. However, designing peptides that bind efficiently is still an open challenge. Here, we present an algorithm for peptide engineering. Our strategy starts from a peptide whose structure is similar to the region of the human ACE2 protein that interacts with the spike protein, an interaction that is important for SARS-CoV-2 infection. Our methodology is based on a genetic algorithm that performs systematic steps of random mutation, protein-peptide docking (using the PyRosetta library), and selection of the best-optimized peptides based on the contacts made at the peptide-protein interface. We performed three case studies to evaluate the tool parameters and compared our results with proposals presented in the literature. Additionally, we performed molecular dynamics (MD) simulations (three systems, 200 ns each) to probe whether our suggested peptides could interact with the spike protein. Our results suggest that our methodology could be a good strategy for designing peptides.
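The mutate, score, select loop described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the docking step is replaced by a hypothetical placeholder `toy_score`, and the function names, population sizes, and seed peptide are all assumptions chosen for illustration. In a real pipeline the scoring function would wrap a docking run and count interface contacts.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def mutate(peptide, rng):
    """Return a copy of `peptide` with exactly one position substituted."""
    pos = rng.randrange(len(peptide))
    choices = [aa for aa in AMINO_ACIDS if aa != peptide[pos]]
    return peptide[:pos] + rng.choice(choices) + peptide[pos + 1:]


def toy_score(peptide):
    """Hypothetical stand-in for docking: counts charged/polar residues.

    This is NOT a docking score; it only gives the loop something to optimize.
    """
    return sum(peptide.count(aa) for aa in "DEKRQN")


def evolve(seed, score=toy_score, generations=20, population=30, keep=5, rng_seed=0):
    """Run a simple elitist genetic algorithm and return the best `keep` peptides."""
    rng = random.Random(rng_seed)
    pool = [seed]
    for _ in range(generations):
        # Random mutation: each child is a single-point mutant of a survivor.
        children = [mutate(rng.choice(pool), rng) for _ in range(population)]
        # Selection: parents and children compete; ties are broken by sequence
        # so that the run is reproducible.
        candidates = set(pool) | set(children)
        pool = sorted(candidates, key=lambda p: (score(p), p), reverse=True)[:keep]
    return pool


if __name__ == "__main__":
    for peptide in evolve("ACDEFGHIKL"):  # arbitrary seed, not a real ACE2 fragment
        print(peptide, toy_score(peptide))
```

Because parents compete with their children, the best score in the pool never decreases from one generation to the next, which is one common design choice for this kind of optimization.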
Affiliation(s)
- Ana Paula de Abreu
  - Laboratory of Bioinformatics and Systems, Department of Computer Science, Institute of Exact Sciences, Universidade Federal de Minas Gerais, Belo Horizonte 31270-901, Brazil; (A.P.d.A.); (F.C.C.); (L.L.B.); (L.M.d.O.)
- Frederico Chaves Carvalho
  - Laboratory of Bioinformatics and Systems, Department of Computer Science, Institute of Exact Sciences, Universidade Federal de Minas Gerais, Belo Horizonte 31270-901, Brazil; (A.P.d.A.); (F.C.C.); (L.L.B.); (L.M.d.O.)
- Diego Mariano
  - Laboratory of Bioinformatics and Systems, Department of Computer Science, Institute of Exact Sciences, Universidade Federal de Minas Gerais, Belo Horizonte 31270-901, Brazil; (A.P.d.A.); (F.C.C.); (L.L.B.); (L.M.d.O.)
- Luana Luiza Bastos
  - Laboratory of Bioinformatics and Systems, Department of Computer Science, Institute of Exact Sciences, Universidade Federal de Minas Gerais, Belo Horizonte 31270-901, Brazil; (A.P.d.A.); (F.C.C.); (L.L.B.); (L.M.d.O.)
- Juliana Rodrigues Pereira Silva
  - Department of Biochemistry and Immunology, Institute of Biological Sciences, Universidade Federal de Minas Gerais, Belo Horizonte 31270-901, Brazil
- Leandro Morais de Oliveira
  - Laboratory of Bioinformatics and Systems, Department of Computer Science, Institute of Exact Sciences, Universidade Federal de Minas Gerais, Belo Horizonte 31270-901, Brazil; (A.P.d.A.); (F.C.C.); (L.L.B.); (L.M.d.O.)
- Raquel C. de Melo-Minardi
  - Laboratory of Bioinformatics and Systems, Department of Computer Science, Institute of Exact Sciences, Universidade Federal de Minas Gerais, Belo Horizonte 31270-901, Brazil; (A.P.d.A.); (F.C.C.); (L.L.B.); (L.M.d.O.)
- Adriano de Paula Sabino
  - Laboratory of Clinical and Experimental Hematology, Clinical and Toxicological Analysis Department, Faculty of Pharmacy, Universidade Federal de Minas Gerais, Belo Horizonte 31270-901, Brazil
2. Sohrabi MA, Zare-Mirakabad F, Ghidary SS, Saadat M, Sadegh-Zadeh SA. A novel data augmentation approach for influenza A subtype prediction based on HA proteins. Comput Biol Med 2024; 172:108316. [PMID: 38503091 DOI: 10.1016/j.compbiomed.2024.108316] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Revised: 02/24/2024] [Accepted: 03/12/2024] [Indexed: 03/21/2024]
Abstract
Influenza, a pervasive viral respiratory illness, remains a significant global health concern. The influenza A virus, capable of causing pandemics, necessitates timely identification of specific subtypes for effective prevention and control, as highlighted by the World Health Organization. The genetic diversity of influenza A virus, especially in the hemagglutinin protein, presents challenges for accurate subtype prediction. This study introduces PreIS as a novel pipeline utilizing advanced protein language models and supervised data augmentation to discern subtle differences in hemagglutinin protein sequences. PreIS demonstrates two key contributions: leveraging pre-trained protein language models for influenza subtype classification and utilizing supervised data augmentation to generate additional training data without extensive annotations. The effectiveness of the pipeline has been rigorously assessed through extensive experiments, demonstrating a superior performance with an impressive accuracy of 94.54% compared to the current state-of-the-art model, the MC-NN model, which achieves an accuracy of 89.6%. PreIS also exhibits proficiency in handling unknown subtypes, emphasizing the importance of early detection. Pioneering the classification of HxNy subtypes solely based on the hemagglutinin protein chain, this research sets a benchmark for future studies. These findings promise more precise and timely influenza subtype prediction, enhancing public health preparedness against influenza outbreaks and pandemics. The data and code underlying this article are available in https://github.com/CBRC-lab/PreIS.
Affiliation(s)
- Mohammad Amin Sohrabi
  - Department of Mathematics and Computer Science, Amirkabir University of Technology, Tehran, Iran
- Fatemeh Zare-Mirakabad
  - Computational Biology Research Center (CBRC), Department of Mathematics and Computer Science, Amirkabir University of Technology, Tehran, Iran
- Saeed Shiri Ghidary
  - Department of Computing, School of Digital, Technologies, and Arts, Staffordshire University, Stoke-On-Trent, UK
- Mahsa Saadat
  - Computational Biology Research Center (CBRC), Department of Mathematics and Computer Science, Amirkabir University of Technology, Tehran, Iran
- Seyed-Ali Sadegh-Zadeh
  - Department of Computing, School of Digital, Technologies, and Arts, Staffordshire University, Stoke-On-Trent, UK
3. Alcaide C, Méndez-López E, Úbeda JR, Gómez P, Aranda MA. Characterization of Two Aggressive PepMV Isolates Useful in Breeding Programs. Viruses 2023; 15:2230. [PMID: 38005907 PMCID: PMC10674935 DOI: 10.3390/v15112230] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2023] [Revised: 11/06/2023] [Accepted: 11/07/2023] [Indexed: 11/26/2023] [Open Access]
Abstract
Pepino mosaic virus (PepMV) causes significant economic losses in tomato crops worldwide. Since its first detection infecting tomato in 1999, aggressive PepMV variants have emerged. This study aimed to characterize two aggressive PepMV isolates, PepMV-H30 and PepMV-KLP2. Both isolates were identified in south-eastern Spain infecting tomato plants, which showed severe symptoms, including bright yellow mosaics. Full-length infectious clones were generated, and phylogenetic relationships were inferred using their nucleotide sequences and another 35 full-length sequences from isolates representing the five known PepMV strains. Our analysis revealed that PepMV-H30 and PepMV-KLP2 belong to the EU and CH2 strains, respectively. Amino acid sequence comparisons between these and mild isolates identified 8 and 15 amino acid substitutions for PepMV-H30 and PepMV-KLP2, respectively, potentially involved in severe symptom induction. None of the substitutions identified in PepMV-H30 have previously been described as symptom determinants. The E236K substitution, originally present in the PepMV-H30 CP, was introduced into a mild PepMV-EU isolate, resulting in a virus that causes symptoms similar to those induced by the parental PepMV-H30 in Nicotiana benthamiana plants. In silico analyses revealed that this residue is located at the C-terminus of the CP and is solvent-accessible, suggesting its potential involvement in CP-host protein interactions. We also examined the subcellular localization of PepGFPm2E236K in comparison to that of PepGFPm2, focusing on effects on chloroplasts, but no differences were observed in the GFP subcellular distribution between the two viruses in epidermal cells of N. benthamiana plants. Because PepMV-H30 and PepMV-KLP2 induce easily visible symptoms, these isolates represent valuable tools in programs designed to breed resistance to PepMV in tomato.
Affiliation(s)
- Miguel A. Aranda
  - “Del Segura” Centre for Applied Biology (CEBAS), Consejo Superior de Investigaciones Científicas (CSIC), 30100 Murcia, Spain; (C.A.); (E.M.-L.); (J.R.Ú.); (P.G.)
4. Theophall GG, Ramirez LMS, Premo A, Reverdatto S, Manigrasso MB, Yepuri G, Burz DS, Ramasamy R, Schmidt AM, Shekhtman A. Disruption of the productive encounter complex results in dysregulation of DIAPH1 activity. J Biol Chem 2023; 299:105342. [PMID: 37832872 PMCID: PMC10656230 DOI: 10.1016/j.jbc.2023.105342] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/05/2023] [Revised: 09/27/2023] [Accepted: 10/06/2023] [Indexed: 10/15/2023] [Open Access]
Abstract
The diaphanous-related formin, Diaphanous 1 (DIAPH1), is required for the assembly of filamentous (F)-actin structures. DIAPH1 is an intracellular effector of the receptor for advanced glycation end products (RAGE) and contributes to RAGE signaling and effects such as increased cell migration upon RAGE stimulation. Mutations in DIAPH1, including those in the basic "RRKR" motif of its autoregulatory domain, the diaphanous autoinhibitory domain (DAD), are implicated in hearing loss, macrothrombocytopenia, and cardiovascular diseases. The solution structure of the complex between the N-terminal inhibitory domain, DID, and the C-terminal DAD, resolved by NMR spectroscopy, shows only transient interactions between DID and the basic motif of DAD, resembling those found in encounter complexes. Cross-linking studies placed the RRKR motif into the negatively charged cavity of DID. Neutralizing the cavity resulted in a 5-fold decrease in the binding affinity and a 4-fold decrease in the association rate constant of DAD for DID, indicating that the RRKR interactions with DID form a productive encounter complex. A DIAPH1 mutant containing a neutralized RRKR binding cavity shows excessive colocalization with actin and is unresponsive to RAGE stimulation. This is the first demonstration of a specific alteration of the surfaces responsible for productive encounter complexation with implications for human pathology.
Affiliation(s)
- Gregory G Theophall
  - Department of Chemistry, State University of New York at Albany, Albany, New York, USA
- Lisa M S Ramirez
  - Department of Chemistry, State University of New York at Albany, Albany, New York, USA
- Aaron Premo
  - Department of Chemistry, State University of New York at Albany, Albany, New York, USA
- Sergey Reverdatto
  - Department of Chemistry, State University of New York at Albany, Albany, New York, USA
- Michaele B Manigrasso
  - Department of Medicine, Diabetes Research Program, New York University Grossman School of Medicine, New York, New York, USA
- Gautham Yepuri
  - Department of Medicine, Diabetes Research Program, New York University Grossman School of Medicine, New York, New York, USA
- David S Burz
  - Department of Chemistry, State University of New York at Albany, Albany, New York, USA
- Ravichandran Ramasamy
  - Department of Medicine, Diabetes Research Program, New York University Grossman School of Medicine, New York, New York, USA
- Ann Marie Schmidt
  - Department of Medicine, Diabetes Research Program, New York University Grossman School of Medicine, New York, New York, USA
- Alexander Shekhtman
  - Department of Chemistry, State University of New York at Albany, Albany, New York, USA
5. Rappoport D, Jinich A. Enzyme Substrate Prediction from Three-Dimensional Feature Representations Using Space-Filling Curves. J Chem Inf Model 2023; 63:1637-1648. [PMID: 36802628 DOI: 10.1021/acs.jcim.3c00005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/22/2023]
Abstract
Compact and interpretable structural feature representations are required for accurately predicting properties and function of proteins. In this work, we construct and evaluate three-dimensional feature representations of protein structures based on space-filling curves (SFCs). We focus on the problem of enzyme substrate prediction, using two ubiquitous enzyme families as case studies: the short-chain dehydrogenase/reductases (SDRs) and the S-adenosylmethionine-dependent methyltransferases (SAM-MTases). Space-filling curves such as the Hilbert curve and the Morton curve generate a reversible mapping from discretized three-dimensional to one-dimensional representations and thus help to encode three-dimensional molecular structures in a system-independent way and with only a few adjustable parameters. Using three-dimensional structures of SDRs and SAM-MTases generated using AlphaFold2, we assess the performance of the SFC-based feature representations in predictions on a new benchmark database of enzyme classification tasks including their cofactor and substrate selectivity. Gradient-boosted tree classifiers yield binary prediction accuracy of 0.77-0.91 and area under curve (AUC) characteristics of 0.83-0.92 for the classification tasks. We investigate the effects of amino acid encoding, spatial orientation, and (the few) parameters of SFC-based encodings on the accuracy of the predictions. Our results suggest that geometry-based approaches such as SFCs are promising for generating protein structural representations and are complementary to the existing protein feature representations such as evolutionary scale modeling (ESM) sequence embeddings.
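To make the idea of a reversible mapping from discretized three-dimensional coordinates to a one-dimensional index concrete, here is a minimal sketch of the Morton (Z-order) curve, which interleaves the bits of the three grid coordinates. The function names, the `bits` parameter, and the dictionary-based voxel grid are assumptions for illustration and are not taken from the paper.

```python
def morton_encode(x, y, z, bits=10):
    """Interleave the low `bits` bits of non-negative integers x, y, z."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i)       # x takes bit positions 0, 3, 6, ...
        code |= ((y >> i) & 1) << (3 * i + 1)   # y takes bit positions 1, 4, 7, ...
        code |= ((z >> i) & 1) << (3 * i + 2)   # z takes bit positions 2, 5, 8, ...
    return code


def morton_decode(code, bits=10):
    """Invert morton_encode, recovering the (x, y, z) grid coordinates."""
    x = y = z = 0
    for i in range(bits):
        x |= ((code >> (3 * i)) & 1) << i
        y |= ((code >> (3 * i + 1)) & 1) << i
        z |= ((code >> (3 * i + 2)) & 1) << i
    return x, y, z


def flatten_voxels(voxels, bits=10):
    """Order a {(x, y, z): value} voxel map along the Morton curve."""
    ordered = sorted(voxels.items(), key=lambda item: morton_encode(*item[0], bits=bits))
    return [value for _, value in ordered]


if __name__ == "__main__":
    # Hypothetical 2x2x2 grid whose values are arbitrary labels.
    grid = {(1, 0, 0): "b", (0, 0, 0): "a", (0, 1, 0): "c", (1, 1, 1): "d"}
    print(flatten_voxels(grid))
```

Voxels that are close along the curve tend to be close in space, which is why such an ordering can serve as a one-dimensional feature layout; the Hilbert curve preserves locality better at the cost of a more involved construction.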
Affiliation(s)
- Dmitrij Rappoport
  - Department of Chemistry, University of California, Irvine, 1102 Natural Sciences 2, Irvine, California 92697, United States
- Adrian Jinich
  - Weill Cornell Medicine, 1300 York Avenue, Box 65, New York, New York 10065, United States
6. Robson B, Baek O. An ontology for very large numbers of longitudinal health records to facilitate data mining and machine learning. Informatics in Medicine Unlocked 2023. [DOI: 10.1016/j.imu.2023.101204] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/06/2023] [Open Access]
7. Panda G, Mishra N, Sharma D, Kutum R, Bhoyar RC, Jain A, Imran M, Senthilvel V, Divakar MK, Mishra A, Garg P, Banerjee P, Sivasubbu S, Scaria V, Ray A. Comprehensive Assessment of Indian Variations in the Druggable Kinome Landscape Highlights Distinct Insights at the Sequence, Structure and Pharmacogenomic Stratum. Front Pharmacol 2022; 13:858345. [PMID: 35865963 PMCID: PMC9294532 DOI: 10.3389/fphar.2022.858345] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2022] [Accepted: 06/07/2022] [Indexed: 11/13/2022] [Open Access]
Abstract
India is home to more than 17% of the world’s population and has a diverse genetic makeup with several clinically relevant rare mutations belonging to many sub-groups, which are underrepresented in global sequencing datasets such as the 1000 Genomes data (1KG), which contain limited samples of Indian ethnicity. Such databases are critical for the pharmaceutical and drug development industry, where diversity plays a crucial role in identifying genetic predisposition towards adverse drug reactions. A qualitative and comparative sequence and structural study utilizing variant information present in the recently published, largest curated Indian genome database (IndiGen) and the 1000 Genomes data was performed for variants belonging to the kinase coding genes, the second most targeted group of drug targets. The sequence-level analysis identified similarities and differences among different populations based on the nsSNVs and amino acid exchange frequencies, whereas a comparative structural analysis of IndiGen variants was performed with pathogenic variants reported in UniProtKB Humsavar data. The influence of these variations on structural features of the protein, such as structural stability, solvent accessibility, hydrophobicity, and the hydrogen-bond network, was investigated. In-silico screening of the known drugs against these Indian variation-containing proteins reveals critical differences in the strength of binding due to the variations present in the Indian population. In conclusion, this study constitutes a comprehensive investigation of the common variations present in the second largest population in the world and of their implications for the sequence, structural and pharmacogenomic landscape.
The preliminary investigation reported in this paper, which supports the screening and detection of adverse drug reactions (ADRs) specific to the Indian population, could aid in the development of techniques for pre-clinical and post-market screening of drug-related adverse events in the Indian population.
Affiliation(s)
- Gayatri Panda
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla, India
- Neha Mishra
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla, India
- Disha Sharma
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
  - CSIR-Institute of Genomics and Integrative Biology, Delhi, India
- Rintu Kutum
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
  - CSIR-Institute of Genomics and Integrative Biology, Delhi, India
  - Ashoka University, Sonipat, India
- Rahul C. Bhoyar
  - CSIR-Institute of Genomics and Integrative Biology, Delhi, India
- Abhinav Jain
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
  - CSIR-Institute of Genomics and Integrative Biology, Delhi, India
- Mohamed Imran
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
  - CSIR-Institute of Genomics and Integrative Biology, Delhi, India
- Vigneshwar Senthilvel
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
  - CSIR-Institute of Genomics and Integrative Biology, Delhi, India
- Mohit Kumar Divakar
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
  - CSIR-Institute of Genomics and Integrative Biology, Delhi, India
- Anushree Mishra
  - CSIR-Institute of Genomics and Integrative Biology, Delhi, India
- Parth Garg
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla, India
- Priyanka Banerjee
  - Institute for Physiology, Charité-University Medicine Berlin, Berlin, Germany
- Sridhar Sivasubbu
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
  - CSIR-Institute of Genomics and Integrative Biology, Delhi, India
- Vinod Scaria
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
  - CSIR-Institute of Genomics and Integrative Biology, Delhi, India
- Arjun Ray
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla, India
  - *Correspondence: Arjun Ray
8. Sumanaweera D, Allison L, Konagurthu AS. Bridging the gaps in statistical models of protein alignment. Bioinformatics 2022; 38:i229-i237. [PMID: 35758809 PMCID: PMC9235498 DOI: 10.1093/bioinformatics/btac246] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022] [Open Access]
Abstract
Summary: Sequences of proteins evolve by accumulating substitutions together with insertions and deletions (indels) of amino acids. However, it remains a common practice to disconnect substitutions and indels, and infer approximate models for each of them separately, to quantify sequence relationships. Although this approach brings with it computational convenience (which remains its primary motivation), there is a dearth of attempts to unify and model them systematically and together. To overcome this gap, this article demonstrates how a complete statistical model quantifying the evolution of pairs of aligned proteins can be constructed using a time-parameterized substitution matrix and a time-parameterized alignment state machine. Methods to derive all parameters of such a model from any benchmark collection of aligned protein sequences are described here. This has not only allowed us to generate a unified statistical model for each of the nine widely used substitution matrices (PAM, JTT, BLOSUM, JO, WAG, VTML, LG, MIQS and PFASUM), but also resulted in a new unified model, MMLSUM. Our underlying methodology measures the Shannon information content using each model to explain losslessly any given collection of alignments, which has allowed us to quantify the performance of all the above models on six comprehensive alignment benchmarks. Our results show that MMLSUM results in a new and clear overall best performance, followed by PFASUM, VTML, BLOSUM and MIQS, respectively, amongst the top five. We further analyze the statistical properties of the MMLSUM model and contrast it with others.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Dinithi Sumanaweera
  - Department of Data Science and Artificial Intelligence, Faculty of Information Technology, Monash University, Clayton, VIC 3800, Australia
- Lloyd Allison
  - Department of Data Science and Artificial Intelligence, Faculty of Information Technology, Monash University, Clayton, VIC 3800, Australia
- Arun S Konagurthu
  - Department of Data Science and Artificial Intelligence, Faculty of Information Technology, Monash University, Clayton, VIC 3800, Australia
9. Robson B. De novo protein folding on computers. Benefits and challenges. Comput Biol Med 2022; 143:105292. [PMID: 35158120 DOI: 10.1016/j.compbiomed.2022.105292] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 12/19/2021] [Revised: 01/19/2022] [Accepted: 01/20/2022] [Indexed: 01/05/2023]
Abstract
There has been recent success in prediction of the three-dimensional folded native structures of proteins, most famously by the AlphaFold algorithm running on Google's/Alphabet's DeepMind computer. However, this largely involves machine learning of protein structures and is not a de novo protein structure prediction method for predicting three-dimensional structures from amino acid residue sequences. A de novo approach would be based almost entirely on general principles of energy and entropy that govern protein folding energetics, and importantly do so without the use of the amino acid sequences and structural features of other proteins. Most consider that problem as still unsolved even though it has occupied leading scientists for decades. Many consider that it remains one of the major outstanding issues in modern science. There is crucial continuing help from experimental findings on protein unfolding and refolding in the laboratory, but only to a limited extent because many researchers consider that the speed by which real proteins fold themselves, often from milliseconds to minutes, is itself still not fully understood. This is unfortunate, because a practical solution to the problem would probably have a major effect on personalized medicine, the pharmaceutical industry, biotechnology, and nanotechnology, including for example "smaller" tasks such as better modeling of flexible "unfolded" regions of the SARS-CoV-2 spike glycoprotein when interacting with its cell receptor, antibodies, and therapeutic agents. Some important ideas from earlier studies are given before moving on to lessons from periodic and aperiodic crystals, and a possible role for quantum phenomena. The conclusion is that better computation of entropy should be the priority, though that is presented guardedly.
Affiliation(s)
- Barry Robson
  - Ingine Inc., Cleveland, Ohio, and The Dirac Foundation, Oxfordshire, UK
10. Robson B. Towards faster response against emerging epidemics and prediction of variants of concern. Informatics in Medicine Unlocked 2022; 31:100966. [PMID: 35611320 PMCID: PMC9119712 DOI: 10.1016/j.imu.2022.100966] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2022] [Revised: 05/05/2022] [Accepted: 05/11/2022] [Indexed: 01/11/2023] [Open Access]
Abstract
The author, the journal Computers in Biology and Medicine (CBM), and Elsevier Press more generally played a helpful, very early role in responding to COVID-19. Within a few days of the appearance of the "Wuhan Seafood isolate" genome on GenBank, a bioinformatics study was posted by the present author on ResearchGate in January 2020, "Preliminary Bioinformatics Studies on the Design of Synthetic Vaccines and Preventative Peptidomimetic Antagonists against the Wuhan Seafood Market Coronavirus. Possible Importance of the KRSFIEDLLFNKV Motif" DOI: 10.13140/RG.2.2.18275.09761. On February 2nd, 2020, a more thorough analysis was submitted to CBM, e-published on February 26, and formally published in April 2020, at about the same time as the virus named 2019-nCoV was identified as essentially SARS and renamed SARS-CoV-2. This was followed by four further papers describing in more detail some previously unreported aspects of the early investigation. The speed of research and writing of the papers was made possible by knowledge-gathering tools. Based on this and earlier experiences with fast responses to emerging epidemics such as HIV and Mad Cow Disease, it is possible to envisage the nature of a speedier response to emerging epidemics and new variants of concern in established epidemics.
Affiliation(s)
- B Robson
  - Ingine Inc., Cleveland, Ohio, USA
  - The Dirac Foundation, Oxfordshire, UK
11. Alvarez-Rodrigo I, Wainman A, Saurya S, Raff JW. Ana1 helps recruit Polo to centrioles to promote mitotic PCM assembly and centriole elongation. J Cell Sci 2021; 134:jcs258987. [PMID: 34156068 PMCID: PMC8325959 DOI: 10.1242/jcs.258987] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 06/02/2021] [Accepted: 06/08/2021] [Indexed: 01/12/2023] [Open Access]
Abstract
Polo kinase (PLK1 in mammals) is a master cell cycle regulator that is recruited to various subcellular structures, often by its polo-box domain (PBD), which binds to phosphorylated S-pS/pT motifs. Polo/PLK1 kinases have multiple functions at centrioles and centrosomes, and we have previously shown that in Drosophila phosphorylated Sas-4 initiates Polo recruitment to newly formed centrioles, while phosphorylated Spd-2 recruits Polo to the pericentriolar material (PCM) that assembles around mother centrioles in mitosis. Here, we show that Ana1 (Cep295 in humans) also helps to recruit Polo to mother centrioles in Drosophila. If Ana1-dependent Polo recruitment is impaired, mother centrioles can still duplicate, disengage from their daughters and form functional cilia, but they can no longer efficiently assemble mitotic PCM or elongate during G2. We conclude that Ana1 helps recruit Polo to mother centrioles to specifically promote mitotic centrosome assembly and centriole elongation in G2, but not centriole duplication, centriole disengagement or cilia assembly. This article has an associated First Person interview with the first author of the paper.
Affiliation(s)
- Jordan W. Raff
  - The Sir William Dunn School of Pathology, University of Oxford, South Parks Road, Oxford OX1 3RE, UK
12. Prasasty VD, Grazzolie K, Rosmalena R, Yazid F, Ivan FX, Sinaga E. Peptide-Based Subunit Vaccine Design of T- and B-Cells Multi-Epitopes against Zika Virus Using Immunoinformatics Approaches. Microorganisms 2019; 7:E226. [PMID: 31370224 PMCID: PMC6722788 DOI: 10.3390/microorganisms7080226] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 06/27/2019] [Revised: 07/15/2019] [Accepted: 07/24/2019] [Indexed: 12/17/2022] [Open Access]
Abstract
The Zika virus disease, also known as Zika fever, is an arboviral disease that became epidemic in the Pacific Islands and had spread to 18 territories of the Americas in 2016. Zika virus disease has been linked to several health problems such as microcephaly and the Guillain-Barré syndrome, but to date, there has been no vaccine available for Zika. Problems related to the development of a vaccine include the vaccination target, which covers pregnant women and children, and antibody-dependent enhancement (ADE), which can be caused by non-neutralizing antibodies. The peptide vaccine was chosen as the focus of this study as a safer platform to develop the Zika vaccine. In this study, a collection of Zika proteomes was used to find the best candidates for T- and B-cell epitopes using the immunoinformatics approach. The most promising T-cell epitopes were mapped using the selected human leukocyte antigen (HLA) alleles, and further molecular docking and dynamics studies showed a good peptide-HLA interaction for the best major histocompatibility complex-II (MHC-II) epitope. The most promising B-cell epitopes include four linear peptides predicted to be cross-reactive with T-cells, and conformational epitopes from two proteins accessible by antibodies in their native biological assembly. It is believed that the use of immunoinformatics methods is a promising strategy against the Zika viral infection in designing an efficacious multiepitope vaccine.
Affiliation(s)
- Vivitri Dewi Prasasty
  - Faculty of Biotechnology, Atma Jaya Catholic University of Indonesia, Jakarta 12930, Indonesia
- Karel Grazzolie
  - Department of Biology, Faculty of Life Science, Surya University, Tangerang, Banten 15143, Indonesia
- Rosmalena Rosmalena
  - Department of Medical Chemistry, Faculty of Medicine, Universitas Indonesia, Depok 16424, Indonesia
- Fatmawaty Yazid
  - Department of Medical Chemistry, Faculty of Medicine, Universitas Indonesia, Depok 16424, Indonesia
- Fransiskus Xaverius Ivan
  - Department of Biology, Faculty of Life Science, Surya University, Tangerang, Banten 15143, Indonesia
- Ernawati Sinaga
  - Faculty of Biology, Universitas Nasional, Jakarta 12520, Indonesia
13
|
Koehl P, Orland H, Delarue M. Numerical Encodings of Amino Acids in Multivariate Gaussian Modeling of Protein Multiple Sequence Alignments. Molecules 2018; 24:E104. [PMID: 30597916 PMCID: PMC6337344 DOI: 10.3390/molecules24010104] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 12/02/2018] [Revised: 12/21/2018] [Accepted: 12/24/2018] [Indexed: 11/17/2022] Open
Abstract
Residues in proteins that are in close spatial proximity are more prone to covariate as their interactions are likely to be preserved due to structural and evolutionary constraints. If we can detect and quantify such covariation, physical contacts may then be predicted in the structure of a protein solely from the sequences that decorate it. To carry out such predictions, and following the work of others, we have implemented a multivariate Gaussian model to analyze correlation in multiple sequence alignments. We have explored and tested several numerical encodings of amino acids within this model. We have shown that 1D encodings based on amino acid biochemical and biophysical properties, as well as higher dimensional encodings computed from the principal components of experimentally derived mutation/substitution matrices, do not perform as well as a simple twenty dimensional encoding with each amino acid represented with a vector of one along its own dimension and zero elsewhere. The optimum obtained from representations based on substitution matrices is reached by using 10 to 12 principal components; the corresponding performance is less than the performance obtained with the 20-dimensional binary encoding. We highlight also the importance of the prior when constructing the multivariate Gaussian model of a multiple sequence alignment.
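As a minimal sketch of the 20-dimensional binary encoding mentioned in this abstract, the Python snippet below maps each residue of an aligned sequence to a vector with a one along its own dimension and zeros elsewhere. The alphabet ordering, the function names, the toy alignment, and the choice to encode gaps as an all-zero vector are assumptions made for illustration only; they are not taken from the paper.

```python
# Illustrative one-hot ("20-dimensional binary") encoding of aligned sequences.
# Assumptions: the alphabet order and the all-zero gap handling are arbitrary choices.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot(residue):
    """Return a 20-element vector with a 1 along the residue's own dimension."""
    vec = [0] * len(AMINO_ACIDS)
    if residue in INDEX:  # gaps and unknown symbols stay all-zero (assumption)
        vec[INDEX[residue]] = 1
    return vec


def encode_sequence(seq):
    """Concatenate per-residue vectors: a row of L columns becomes 20 * L numbers."""
    encoded = []
    for residue in seq:
        encoded.extend(one_hot(residue))
    return encoded


def encode_alignment(msa):
    """Encode every row of a multiple sequence alignment (equal-length strings)."""
    return [encode_sequence(row) for row in msa]


toy_msa = ["ACD-", "ACE-", "GCDW"]  # hypothetical alignment, not real data
encoded = encode_alignment(toy_msa)
print(len(encoded), len(encoded[0]))
```

Each aligned row becomes a numeric vector of fixed length, which is the kind of input a multivariate Gaussian model of an alignment can work with; how the model itself and its prior are built is not shown here.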
Affiliation(s)
- Patrice Koehl
- Department of Computer Science, University of California, Davis, CA 95211, USA.
- Henri Orland
- Institut de Physique Théorique, CEA Saclay, 91191 Gif-sur-Yvette CEDEX, France.
- Marc Delarue
- Department of Structural Biology and Chemistry and UMR 3528 du CNRS, Institut Pasteur, 75015 Paris, France.
|
14
|
Facchiano A, Di Giulio M. The genetic code is not an optimal code in a model taking into account both the biosynthetic relationships between amino acids and their physicochemical properties. J Theor Biol 2018; 459:45-51. [DOI: 10.1016/j.jtbi.2018.09.021] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 02/28/2018] [Revised: 09/04/2018] [Accepted: 09/19/2018] [Indexed: 01/22/2023]
|
15
|
Bhattacharya S, Banerjee A, Sah PP, Mal C, Ray S. Mutations and functional analysis of 14-3-3 stress response protein from Triticum aestivum: An evolutionary analysis through in silico structural biochemistry approach. Comput Biol Chem 2018; 77:343-353. [PMID: 30466043 DOI: 10.1016/j.compbiolchem.2018.09.013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2018] [Revised: 09/08/2018] [Accepted: 09/19/2018] [Indexed: 10/28/2022]
Abstract
Wheat (Triticum aestivum), which has high nutritional value, is one of the staple foods of most countries in the world. The productivity of the crop decreases drastically when it encounters abiotic stresses, the most common of which are heat, drought, flood and salinity. Stress response proteins play a crucial role in the survival of crops under stress conditions, so the study of wheat stress response proteins is of great importance for raising wheat production under different stress conditions. In this study, we analysed the 14-3-3 protein, a stress response protein that is expressed under three major stresses (heat, drought and salinity) and helps plants to survive in those conditions. The effect of mutations in the 14-3-3 sequence was predicted using its domain, secondary structure and multiple sequence alignment of amino acid sequences from wheat and its related species. The functional diversity of the protein in different species was correlated with mutations, changes in secondary structure and the evolutionary relatedness of the protein in different species. This is the first work to analyse the mutational effect on the structure and function of a stress response protein (14-3-3) from Triticum aestivum and its related species.
Affiliation(s)
- Arundhati Banerjee
- Department of Biochemistry and Biophysics, University of Kalyani, Kalyani, Nadia, India
- Chittabrata Mal
- Amity Institute of Biotechnology, Amity University, Kolkata, India
- Sujay Ray
- Amity Institute of Biotechnology, Amity University, Kolkata, India.
|
16
|
Hamed G, Marey M, Amin SES, Tolba MF. Hybrid, randomized and high capacity conservative mutations DNA-based steganography for large sized data. Biosystems 2018; 167:47-61. [PMID: 29608931 DOI: 10.1016/j.biosystems.2018.03.003] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 06/28/2017] [Revised: 03/20/2018] [Accepted: 03/22/2018] [Indexed: 11/16/2022]
Abstract
In this paper, a well secured, high capacity, preserved algorithm is proposed through integrating the cryptography and steganography concepts with the molecular biology concepts. We achieved this by first encrypting the confidential data using the DNA Playfair cipher to avoid extra information sent to the receiver and it consequently acts as a trap for an attacker. Second, it achieves a randomized steganography process by exploiting the DNA conservative mutations. The DNA conservative mutations are utilized in a way that allows a DNA base to be substituted by another base to allow carrying two bits. Consequently, a high capacity feature is obtained with no payload for the used sequence. There are three main achieved contributions in this work. First, is hiding high capacity of data within DNA by exploiting each codon to hide two bits whilst preserving the sequence properties of protein after the steganography process, which is a trade off in the field. Secondly, using the conservative mutation with all its valid biological permutations, leads to the lowest cracking probability achieved and published till now, as proven in the security analysis section. Finally, a comparison is conducted between the proposed algorithm and five recent substitution based algorithms using large sized data up to three megabytes, to prove the algorithm's scalability.
Affiliation(s)
- Ghada Hamed
- Department of Scientific Computing, Faculty of Computer and Information Sciences, Ain Shams University, Cairo, Egypt.
- Mohammed Marey
- Department of Scientific Computing, Faculty of Computer and Information Sciences, Ain Shams University, Cairo, Egypt
- Safaa El-Sayed Amin
- Department of Scientific Computing, Faculty of Computer and Information Sciences, Ain Shams University, Cairo, Egypt
- Mohamed Fahmy Tolba
- Department of Scientific Computing, Faculty of Computer and Information Sciences, Ain Shams University, Cairo, Egypt
|
17
|
Nojoomi S, Koehl P. A weighted string kernel for protein fold recognition. BMC Bioinformatics 2017; 18:378. [PMID: 28841820 PMCID: PMC5574112 DOI: 10.1186/s12859-017-1795-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 04/07/2017] [Accepted: 08/15/2017] [Indexed: 11/10/2022] Open
Abstract
Background: Alignment-free methods for comparing protein sequences have proved to be viable alternatives to approaches that first rely on an alignment of the sequences to be compared. Much work, however, needs to be done before those methods provide reliable fold recognition for proteins whose sequences share little similarity. We have recently proposed an alignment-free method based on the concept of string kernels, SeqKernel (Nojoomi and Koehl, BMC Bioinformatics, 2017, 18:137). In this previous study, we have shown that while SeqKernel performs better than standard alignment-based methods, its applications are potentially limited because of biases due mostly to sequence length effects. Methods: In this study, we propose improvements to SeqKernel that follow two directions. First, we developed a weighted version of the kernel, WSeqKernel. Second, we expand the concept of string kernels into a novel framework for deriving information on amino acids from protein sequences. Results: Using a dataset that only contains remote homologs, we have shown that WSeqKernel performs remarkably well in fold recognition experiments. We have shown that with the appropriate weighting scheme, we can remove the length effects on the kernel values. WSeqKernel, just like any alignment-based sequence comparison method, depends on a substitution matrix. We have shown that this matrix can be optimized so that sequence similarity scores correlate well with structure similarity scores. Starting from no information on amino acid similarity, we have shown that we can derive a scoring matrix that echoes the physico-chemical properties of amino acids. Conclusion: We have made progress in characterizing and parametrizing string kernels as alignment-based methods for comparing protein sequences, and we have shown that they provide a framework for extracting sequence information from structure.
Electronic supplementary material The online version of this article (doi:10.1186/s12859-017-1795-5) contains supplementary material, which is available to authorized users.
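To make the idea of a string kernel and its length bias concrete, the Python sketch below implements a generic k-mer spectrum kernel with cosine normalization. This is not SeqKernel or WSeqKernel, whose definitions and weighting scheme are not given in the abstract; the value of k, the function names, and the example sequences are assumptions chosen only to illustrate why raw kernel values grow with sequence length and how a normalization can counter that.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq, k=3):
    """Count every overlapping substring of length k in the sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(a, b, k=3):
    """Raw kernel: dot product of the two k-mer count vectors."""
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    return sum(count * cb[kmer] for kmer, count in ca.items())


def normalized_kernel(a, b, k=3):
    """Cosine-normalized kernel, one common way to damp length effects."""
    kaa = spectrum_kernel(a, a, k)
    kbb = spectrum_kernel(b, b, k)
    if kaa == 0 or kbb == 0:  # a sequence shorter than k has no k-mers
        return 0.0
    return spectrum_kernel(a, b, k) / sqrt(kaa * kbb)


short = "ACDEFG"          # hypothetical sequences, not real proteins
doubled = short + short   # same content, twice the length

raw_self = spectrum_kernel(short, short)
raw_cross = spectrum_kernel(short, doubled)
norm_cross = normalized_kernel(short, doubled)
print(raw_self, raw_cross, norm_cross)
```

In this toy case the raw value against the doubled sequence exceeds the raw self-similarity of the short sequence, purely because the second sequence is longer; the normalized value stays bounded by 1.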
|
18
|
Proteins and bioactive peptides from donkey milk: The molecular basis for its reduced allergenic properties. Food Res Int 2017; 99:41-57. [PMID: 28784499 DOI: 10.1016/j.foodres.2017.07.002] [Citation(s) in RCA: 42] [Impact Index Per Article: 6.0] [Received: 04/28/2017] [Revised: 06/29/2017] [Accepted: 07/02/2017] [Indexed: 12/18/2022]
Abstract
The legendary therapeutic properties of donkey milk have recently been supported by many clinical trials, which have clearly demonstrated that, albeit with adequate lipid integration, it may represent a valid natural substitute for cow milk in feeding allergic children. During the last decade, many investigations by MS-based methods have been performed in order to obtain a better knowledge of donkey milk proteins. Knowledge of the primary structure of donkey milk proteins may now provide the basis for a more accurate understanding of its potential benefits for human nutrition. In this respect, the experimental data available today clearly demonstrate that donkey milk proteins (especially casein components) are more closely related to their human homologues than to their cow counterparts. Moreover, the low allergenic properties of donkey milk compared with cow milk seem to be related to the low total protein content, the low ratio of caseins to whey fraction, and finally to the presence, in almost all bovine IgE-binding linear epitopes, of multiple amino acid differences with respect to the corresponding regions of donkey milk counterparts.
|
19
|
Yang Y, Kelly PJ, Bai J, Zhang R, Wang C. First Molecular Characterization of Bovine Leukemia Virus Infections in the Caribbean. PLoS One 2016; 11:e0168379. [PMID: 27977761 PMCID: PMC5158060 DOI: 10.1371/journal.pone.0168379] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 09/08/2016] [Accepted: 11/30/2016] [Indexed: 12/15/2022] Open
Abstract
Bovine leukemia virus (BLV) is a retrovirus that causes enzootic bovine leucosis. To investigate the presence and genetic variability of BLV in the Caribbean for the first time, we performed fluorescence resonance energy transfer (FRET)-PCR for the pol of BLV on DNA from whole blood of cattle from Dominica, Montserrat, Nevis and St. Kitts. Standard PCRs with primers for the env were used for phylogenetic analysis of BLV in positive animals. We found FRET-PCR positive cattle (12.6%, 41/325) on Dominica (5.2%; 4/77) and St. Kitts (19.2%; 37/193) but not on Montserrat (0%, 0/12) or Nevis (0%, 0/43). Positive animals were cows on farms where animals were raised intensively. Phylogenetic analysis using the neighbor-joining (NJ) method on partial and full-length env sequences obtained for strains from Dominica (n = 2) and St. Kitts (n = 5) and those available in GenBank (n = 90) (genotypes 1-10) revealed the Caribbean strains belonged to genotype 1 (98-100% sequence homology). Ours is the first molecular characterization of BLV infections in the Caribbean and the first description of genotype 1 in the region.
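The prevalence figures quoted in this abstract can be checked with a few lines of arithmetic. The Python sketch below uses only the positive and tested counts reported above; the variable and function names are illustrative, and rounding to one decimal place is an assumption about how the percentages were reported.

```python
# (positive, tested) counts per island, as reported in the abstract.
counts = {
    "Dominica": (4, 77),
    "St. Kitts": (37, 193),
    "Montserrat": (0, 12),
    "Nevis": (0, 43),
}


def percent(positive, tested):
    """Prevalence as a percentage, rounded to one decimal place (assumed convention)."""
    return round(100.0 * positive / tested, 1)


per_island = {island: percent(p, n) for island, (p, n) in counts.items()}
total_positive = sum(p for p, _ in counts.values())
total_tested = sum(n for _, n in counts.values())
overall = percent(total_positive, total_tested)

print(per_island)
print(total_positive, total_tested, overall)
```

The island counts sum to the 41 positives among 325 animals given in the abstract, and the per-island and overall percentages agree with the reported 5.2%, 19.2% and 12.6%.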
Affiliation(s)
- Yi Yang
- Jiangsu Co-Innovation Center for the Prevention and Control of Important Animal Infectious Diseases and Zoonoses, Yangzhou University College of Veterinary Medicine, Yangzhou, Jiangsu, China
- Department of Veterinary Diagnostic Laboratory, College of Veterinary Medicine, Kansas State University, Kansas, Kansas, United States of America
- Patrick John Kelly
- Ross University School of Veterinary Medicine, Basseterre, Saint Kitts and Nevis
- Jianfa Bai
- Department of Veterinary Diagnostic Laboratory, College of Veterinary Medicine, Kansas State University, Kansas, Kansas, United States of America
- * E-mail: (CW); (JB)
- Rong Zhang
- Jiangsu Co-Innovation Center for the Prevention and Control of Important Animal Infectious Diseases and Zoonoses, Yangzhou University College of Veterinary Medicine, Yangzhou, Jiangsu, China
- Chengming Wang
- Jiangsu Co-Innovation Center for the Prevention and Control of Important Animal Infectious Diseases and Zoonoses, Yangzhou University College of Veterinary Medicine, Yangzhou, Jiangsu, China
- Department of Pathobiology, College of Veterinary Medicine, Auburn University, Auburn, Alabama, United States of America
- * E-mail: (CW); (JB)
|
20
|
Wang D, Xu C, Wang T, Li H, Li Y, Ren J, Tian Y, Li Z, Jiao Y, Kang X, Liu X. Discovery and functional characterization of leptin and its receptors in Japanese quail (Coturnix japonica). Gen Comp Endocrinol 2016; 225:1-12. [PMID: 26342967 DOI: 10.1016/j.ygcen.2015.09.003] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 06/09/2015] [Revised: 08/07/2015] [Accepted: 09/01/2015] [Indexed: 12/31/2022]
Abstract
Leptin is an important endocrine regulation factor of food intake and energy homeostasis in mammals; however, the existence of a poultry leptin gene (LEP) is still debated. Here, for the first time, we report the cloning of a partial exon 3 sequence of LEP (qLEP) and four different leptin receptor splicing variants, including a long receptor (qLEPRl) and three soluble receptors (qLEPR-a, qLEPR-b and qLEPR-c) in Japanese quail (Coturnix japonica). The qLEP gene had high GC content (64%), which is similar to other reported avian leptin genes. The encoded qLEP protein possessed the conserved pair of cysteine residues that are required to form a lasso knot for full biological activity, but shared relatively low identities with LEPs of other vertebrates. The translated qLEPRl protein contained 1143 amino acids and shared high amino acid sequence identity with a chicken homolog (89% identity). qLEPRl also contained all the motifs, domains, and basic tyrosine residues that are conserved in the LEPRl proteins of other vertebrates. qRT-PCR analysis showed that LEP and the four LEPR variants were expressed extensively in all tissues examined; the expression levels of LEP were relatively high in hypothalamus, skeletal muscle, and pancreas, while the expression levels of the LEPRs were highest in the pituitary. Compared with the expression levels of juvenile qLEP and total qLEPR (including all LEPR variants), the expression levels of mature qLEP and total qLEPR were up-regulated in the hypothalamus and pituitary, and down-regulated in the ovary. The expressions of LEP/LEPR increased when fasting and decreased when refeeding in the brain and peripheral tissues of juvenile quail, which suggested that the LEP/LEPR system modulated food intake and energy expenditure, although, unlike in mammals, LEP may actually act to inhibit food intake during fasting, at least in juvenile quail. 
The results indicate that qLEP and qLEPR have unique expression patterns and that the encoded proteins play important roles in the regulation of reproduction and energy status in Japanese quail.
Affiliation(s)
- Dandan Wang
- College of Animal Science and Veterinary Medicine, Henan Agricultural University, Zhengzhou 450002, China
- Chunlin Xu
- College of Animal Science and Veterinary Medicine, Henan Agricultural University, Zhengzhou 450002, China
- Taian Wang
- College of Animal Science and Veterinary Medicine, Henan Agricultural University, Zhengzhou 450002, China
- Hong Li
- College of Animal Science and Veterinary Medicine, Henan Agricultural University, Zhengzhou 450002, China
- Yanmin Li
- College of Animal Science and Veterinary Medicine, Henan Agricultural University, Zhengzhou 450002, China
- Junxiao Ren
- College of Animal Science and Veterinary Medicine, Henan Agricultural University, Zhengzhou 450002, China
- Yadong Tian
- College of Animal Science and Veterinary Medicine, Henan Agricultural University, Zhengzhou 450002, China; Henan Innovative Engineering Research Center of Poultry Germplasm Resource, Henan Agricultural University, Zhengzhou 450002, China; International Joint Research Laboratory for Poultry Breeding of Henan, Henan Agricultural University, Zhengzhou 450002, China
- Zhuanjian Li
- College of Animal Science and Veterinary Medicine, Henan Agricultural University, Zhengzhou 450002, China; Henan Innovative Engineering Research Center of Poultry Germplasm Resource, Henan Agricultural University, Zhengzhou 450002, China; International Joint Research Laboratory for Poultry Breeding of Henan, Henan Agricultural University, Zhengzhou 450002, China
- Yuping Jiao
- College of Animal Science and Veterinary Medicine, Henan Agricultural University, Zhengzhou 450002, China; Institute of Animal Husbandry and Veterinary Medicine, Henan Academy of Agricultural Sciences, Zhengzhou 450002, China
- Xiangtao Kang
- College of Animal Science and Veterinary Medicine, Henan Agricultural University, Zhengzhou 450002, China; Henan Innovative Engineering Research Center of Poultry Germplasm Resource, Henan Agricultural University, Zhengzhou 450002, China; International Joint Research Laboratory for Poultry Breeding of Henan, Henan Agricultural University, Zhengzhou 450002, China.
- Xiaojun Liu
- College of Animal Science and Veterinary Medicine, Henan Agricultural University, Zhengzhou 450002, China; Henan Innovative Engineering Research Center of Poultry Germplasm Resource, Henan Agricultural University, Zhengzhou 450002, China; International Joint Research Laboratory for Poultry Breeding of Henan, Henan Agricultural University, Zhengzhou 450002, China.
|
21
|
Kuster CJ, Von Elert E. High-resolution melting analysis: a genotyping tool for population studies on Daphnia. Mol Ecol Resour 2012; 12:1048-57. [PMID: 22925691 DOI: 10.1111/j.1755-0998.2012.03177.x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 04/20/2012] [Revised: 07/05/2012] [Accepted: 07/17/2012] [Indexed: 11/27/2022]
Abstract
Determining genetic variation at the DNA level within and between natural populations is important for understanding the role of natural selection on phenotypic traits, but many techniques of screening for genetic variation are either cost intensive, not sensitive enough or too labour- and time-consuming. Here, we demonstrate high-resolution melting analysis (HRMA) as a cost-effective and powerful tool for screening variable target genes in natural populations. HRMA is based on monitoring the melting of PCR amplicons. Owing to saturating concentrations of a dye that binds at high concentrations to double-stranded DNA, it is possible to genotype high numbers of samples rapidly and accurately. We analysed digestive trypsins of two Daphnia magna populations as an application example for HRMA. One population originated from a pond containing toxic cyanobacteria that possibly produce protease inhibitors and the other from a pond without such cyanobacteria. The hypothesis was that D. magna clones from ponds with cyanobacteria have undergone selection by these inhibitors, which has led to different trypsin alleles. We first sequenced pooled genomic PCR products of trypsins from both populations to identify variable DNA sequences of active trypsins. Second, we screened variable DNA sequences of each D. magna clone from both populations for single nucleotide polymorphisms via HRMA. The HRMA results revealed that both populations exhibited phenotypic differences in the analysed trypsins. Our results indicate that HRMA is a powerful genotyping tool for studying the variation of target genes in response to selection within and between natural Daphnia populations.
Affiliation(s)
- C J Kuster
- Zoological Institute, Aquatic Chemical Ecology, University of Cologne, Cologne, Germany.
|
22
|
Denver RJ, Bonett RM, Boorse GC. Evolution of leptin structure and function. Neuroendocrinology 2011; 94:21-38. [PMID: 21677426 DOI: 10.1159/000328435] [Citation(s) in RCA: 146] [Impact Index Per Article: 11.2] [Received: 01/31/2011] [Accepted: 04/11/2011] [Indexed: 12/15/2022]
Abstract
Leptin, the protein product of the obese (ob or Lep) gene, is a hormone synthesized by adipocytes that signals available energy reserves to the brain, and thereby influences development, growth, metabolism and reproduction. In mammals, leptin functions as an adiposity signal: circulating leptin fluctuates in proportion to fat mass, and it acts on the hypothalamus to suppress food intake. Orthologs of mammalian Lep genes were recently isolated from several fish and two amphibian species, and here we report the identification of two Lep genes in a reptile, the lizard Anolis carolinensis. While vertebrate leptins show large divergence in their primary amino acid sequence, they form similar tertiary structures, and may have similar potencies when tested in vitro on heterologous leptin receptors (LepRs). Leptin binds to LepRs on the plasma membrane, activating several intracellular signaling pathways. Vertebrate LepRs signal via the Janus kinase (Jak) and signal transducer and activator of transcription (STAT) pathway. Three tyrosine residues located within the LepR cytoplasmic domain are phosphorylated by Jak2 and are required for activation of SH2-containing tyrosine phosphatase-2, STAT5 and STAT3 signaling. These tyrosines are conserved from fishes to mammals, demonstrating their critical role in signaling by the LepR. Leptin is anorexigenic in representatives of all vertebrate classes, suggesting that its role in energy balance is ancient and has been evolutionarily conserved. In addition to its integral role as a regulator of appetite and energy balance, leptin exerts pleiotropic actions in development, physiology and behavior.
Affiliation(s)
- Robert J Denver
- Department of Molecular, Cellular and Developmental Biology, The University of Michigan, Ann Arbor, USA. rdenver @ umich.edu
|
23
|
Pape S, Hoffgaard F, Hamacher K. Distance-dependent classification of amino acids by information theory. Proteins 2010; 78:2322-8. [PMID: 20544967 DOI: 10.1002/prot.22744] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Indexed: 11/11/2022]
Abstract
Reduced amino acid alphabets are useful to understand molecular evolution as they reveal basal, shared properties of amino acids, which the structures and functions of proteins rely on. Several previous studies derived such reduced alphabets and linked them to the origin of life and biotechnological applications. However, all this previous work presupposes that only direct contacts of amino acids in native protein structures are relevant. We show in this work, using information-theoretical measures, that an appropriate alphabet reduction scheme is in fact a function of the maximum distance amino acids interact at. Although for small distances our results agree with previous ones, we show how long-range interactions change the overall picture and prompt for a revised understanding of the protein design process.
Affiliation(s)
- Susanne Pape
- Department of Mathematics, Technische Universität Darmstadt, 64287 Darmstadt, Germany
|
24
|
Friedrich A, Garnier N, Gagnière N, Nguyen H, Albou LP, Biancalana V, Bettler E, Deléage G, Lecompte O, Muller J, Moras D, Mandel JL, Toursel T, Moulinier L, Poch O. SM2PH-db: an interactive system for the integrated analysis of phenotypic consequences of missense mutations in proteins involved in human genetic diseases. Hum Mutat 2010; 31:127-35. [PMID: 19921752 DOI: 10.1002/humu.21155] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 01/19/2023]
Abstract
Understanding how genetic alterations affect gene products at the molecular level represents a first step in the elucidation of the complex relationships between genotypic and phenotypic variations, and is thus a major challenge in the postgenomic era. Here, we present SM2PH-db (http://decrypthon.igbmc.fr/sm2ph), a new database designed to investigate structural and functional impacts of missense mutations and their phenotypic effects in the context of human genetic diseases. A wealth of up-to-date interconnected information is provided for each of the 2,249 disease-related entry proteins (August 2009), including data retrieved from biological databases and data generated from a Sequence-Structure-Evolution Inference in Systems-based approach, such as multiple alignments, three-dimensional structural models, and multidimensional (physicochemical, functional, structural, and evolutionary) characterizations of mutations. SM2PH-db provides a robust infrastructure associated with interactive analysis tools supporting in-depth study and interpretation of the molecular consequences of mutations, with the more long-term goal of elucidating the chain of events leading from a molecular defect to its pathology. The entire content of SM2PH-db is regularly and automatically updated thanks to a computational grid data federation facilities provided in the context of the Decrypthon program.
Affiliation(s)
- Anne Friedrich
- Département de Biologie et Génomique Structurales, Institut de Génétique et de Biologie Moléculaire et Cellulaire (UMR7104), Centre National de la Recherche Scientifique/Institut National de la Santé et de la Recherche Médicale/Université de Strasbourg, Illkirch, France
|
25
|
Rakshit S, Ananthasuresh GK. An amino acid map of inter-residue contact energies using metric multi-dimensional scaling. J Theor Biol 2008; 250:291-7. [PMID: 17981305 DOI: 10.1016/j.jtbi.2007.09.032] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 04/03/2007] [Revised: 08/07/2007] [Accepted: 09/17/2007] [Indexed: 10/22/2022]
Abstract
We present an amino acid map based on inter-residue contact energies using the Miyazawa-Jernigan (MJ) matrix. This work is based on the method of metric multi-dimensional scaling (MMDS). The MMDS map shows, among other things, that the MJ contact energies imply the hydrophobic-hydrophilic nature of the amino acid residues. With the help of the map we are able to compare and draw inferences from uncorrelated data sets such as BLOSUM and PAM with MJ methods. We also use a hierarchical clustering method on our MMDS distance matrix to group the amino acids and arrive at an optimum number of groups for simplifying the amino acid set.
Affiliation(s)
- Sourav Rakshit
- Mechanical Engineering, Indian Institute of Science, Bangalore 560012, India.
|
26
|
Luthra A, Jha AN, Ananthasuresh GK, Vishveswara S. A method for computing the inter-residue interaction potentials for reduced amino acid alphabet. J Biosci 2007; 32:883-9. [PMID: 17914230 DOI: 10.1007/s12038-007-0088-y] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Indexed: 10/22/2022]
Abstract
Inter-residue potentials are extensively used in the design and evaluation of protein structures. However, dealing with all (20 × 20) interactions becomes computationally difficult in extensive investigations. Hence, it is desirable to reduce the alphabet of 20 amino acids to a smaller number. Currently, several methods of reducing the residue types exist; however, a critical assessment of these methods is not available. Towards this goal, here we review and evaluate different methods by comparing them with the complete (20 × 20) matrix of the Miyazawa-Jernigan potential, including a method of grouping adopted by us, based on multidimensional scaling (MDS). The second goal of this paper is the computation of inter-residue interaction energies for the reduced amino acid alphabet, which has not been explicitly addressed in the literature until now. By using a least squares technique, we present a systematic method of obtaining the interaction energy values for any type of grouping scheme that reduces the amino acid alphabet. This can be valuable in designing protein structures.
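One simple reading of the least squares idea in this abstract can be sketched as follows. If every residue pair in a block of the full matrix is approximated by a single group-level value, the unweighted least-squares choice for that value is the block average. The Python snippet below illustrates this on a toy example; the energies are hypothetical numbers (not Miyazawa-Jernigan values), the two-group split and all names are assumptions, and the paper's actual formulation may include weights or constraints not shown here.

```python
from itertools import product

# Hypothetical symmetric contact energies for four residue types.
# Illustrative numbers only, NOT Miyazawa-Jernigan values.
energies = {
    ("L", "L"): -7.0, ("L", "V"): -6.0, ("L", "K"): -3.0, ("L", "E"): -3.0,
    ("V", "V"): -5.0, ("V", "K"): -2.0, ("V", "E"): -2.0,
    ("K", "K"): -1.0, ("K", "E"): -2.0,
    ("E", "E"): -1.0,
}

# Hypothetical grouping scheme that reduces four residue types to two letters.
groups = {"hydrophobic": ["L", "V"], "charged": ["K", "E"]}


def energy(a, b):
    """Look up a pair energy in the symmetric table, in either order."""
    return energies[(a, b)] if (a, b) in energies else energies[(b, a)]


def group_energy(g1, g2):
    """Block average: the unweighted least-squares value for this group pair."""
    pairs = list(product(groups[g1], groups[g2]))
    return sum(energy(a, b) for a, b in pairs) / len(pairs)


def squared_error(g1, g2, value):
    """Sum of squared deviations of the block from a candidate group value."""
    return sum((energy(a, b) - value) ** 2
               for a, b in product(groups[g1], groups[g2]))


reduced = {(g1, g2): group_energy(g1, g2) for g1 in groups for g2 in groups}
print(reduced)
```

The helper `squared_error` can be used to confirm that moving a group value away from the block average increases the residual, which is the defining property of the least-squares fit in this simplified setting.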
Affiliation(s)
- Abhinav Luthra
- Department of Biotechnology, Indian Institute of Technology-Guwahati, Guwahati 781 039, India
|
27
|
Bragonzi A, Wiehlmann L, Klockgether J, Cramer N, Worlitzsch D, Döring G, Tümmler B. Sequence diversity of the mucABD locus in Pseudomonas aeruginosa isolates from patients with cystic fibrosis. MICROBIOLOGY-SGM 2007; 152:3261-3269. [PMID: 17074897 DOI: 10.1099/mic.0.29175-0] [Citation(s) in RCA: 78] [Impact Index Per Article: 4.6] [Indexed: 11/18/2022]
Abstract
The mucA gene of the muc operon, which is instrumental in the control of the biosynthesis of the exopolysaccharide alginate, is a hotspot of mutation in Pseudomonas aeruginosa, a micro-organism that chronically colonizes the airways of individuals with cystic fibrosis (CF). The mucA, mucB and mucD genes were sequenced in nine environmental isolates from aquatic habitats, and in 37 P. aeruginosa strains isolated from 10 patients with CF, at onset or at a late stage of chronic airway colonization, in order to elucidate whether there was any association between mutation and background genotype. The 61 identified single nucleotide polymorphisms (SNPs) segregated into 18 mucABD genotypes. Acquired and de novo stop mucA mutations were present in 14 isolates (38 %) of five mucABD genotypes. DeltaG430 was the most frequent and recurrent mucA mutation detected in four genotypes. The classification of strains by mucABD genotype was generally concordant with that by genome-wide SpeI fragment pattern or multilocus SNP genotypes. The exceptions point to intragenic mosaicism and interclonal recombination as major forces for intraclonal evolution at the mucABD locus.
Affiliation(s)
- Alessandra Bragonzi
- Institute for Experimental Treatment of Cystic Fibrosis, DIBIT - HS Raffaele, Milano, Italy
- Institute of Medical Microbiology and Hygiene, Universitätsklinikum Tübingen, Tübingen, Germany
- Lutz Wiehlmann
- Klinische Forschergruppe, OE 6710, Medizinische Hochschule Hannover, Carl-Neuberg-Str. 1, D-30625 Hannover, Germany
- Jens Klockgether
- Klinische Forschergruppe, OE 6710, Medizinische Hochschule Hannover, Carl-Neuberg-Str. 1, D-30625 Hannover, Germany
- Nina Cramer
- Klinische Forschergruppe, OE 6710, Medizinische Hochschule Hannover, Carl-Neuberg-Str. 1, D-30625 Hannover, Germany
- Dieter Worlitzsch
- Institute of Medical Microbiology and Hygiene, Universitätsklinikum Tübingen, Tübingen, Germany
- Gerd Döring
- Institute of Medical Microbiology and Hygiene, Universitätsklinikum Tübingen, Tübingen, Germany
- Burkhard Tümmler
- Klinische Forschergruppe, OE 6710, Medizinische Hochschule Hannover, Carl-Neuberg-Str. 1, D-30625 Hannover, Germany
|
28
|
Wrabl JO, Grishin NV. Grouping of amino acid types and extraction of amino acid properties from multiple sequence alignments using variance maximization. Proteins 2006; 61:523-34. [PMID: 16184599 DOI: 10.1002/prot.20648] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Indexed: 11/06/2022]
Abstract
Understanding of amino acid type co-occurrence in trusted multiple sequence alignments is a prerequisite for improved sequence alignment and remote homology detection algorithms. Two objective approaches were used to investigate co-occurrence, both based on variance maximization of the weighted residue frequencies in columns taken from a large alignment database. The first approach discretely grouped amino acid types, and the second approach extracted orthogonal properties of amino acids using principal components analysis. The grouping results corresponded to amino acid physical properties such as side chain hydrophobicity, size, or backbone flexibility, and an optimal arrangement of approximately eight groups was observed. However, interpretation of the orthogonal properties was more complex. Although the principal components accounting for the largest variances exhibited modest correlations with hydrophobicity and conservation of glycine, in general principal components did not correspond to physical properties of amino acids. Although not intuitive, these amino acid mathematical properties were demonstrated to be robust and to improve local pairwise alignment accuracy, relative to 20 amino acid frequencies alone, for a simple test case.
Affiliation(s)
- James O Wrabl
- Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas 75390-9050, USA
|
29
|
Robson B. Clinical and Pharmacogenomic Data Mining: 3. Zeta Theory As a General Tactic for Clinical Bioinformatics. J Proteome Res 2005; 4:445-55. [PMID: 15822921 DOI: 10.1021/pr049800p] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.1] [Indexed: 11/29/2022]
Abstract
A new approach, a Zeta Theory of observations, data, and data mining, is being forged from a theory of expected information into an even more cohesive and comprehensive form by the challenge of general genomic, pharmacogenomic, and proteomic data. In this paper, the focus is not on studies using the specific tool FANO (CliniMiner) but on extensions to a new broader theoretical approach, aspects of which can easily be implemented into, or otherwise support, excellent existing methods, such as forms of multivariate analysis and IBM's product Intelligent Miner. The theory should perhaps be distinguished from an existing purely number-theoretic area sometimes also known as Zeta Theory, which focuses on the Riemann Zeta Function and the ways in which it governs the distribution of prime numbers. However, Zeta Theory as used here overlaps heavily with it and actually makes use of these same matters. The distinction is that it enters from a Bayesian information theory and data representation perspective. It could thus be considered an application of the 'mathematician's version'. The application is by no means confined to areas of modern biomedicine, and indeed its generality, even merging into quantum mechanics, is a key feature. Other areas with some similar challenges as modern biology, and which have inspired data mining methods such as IBM's Intelligent Miner, include commerce. But for several reasons discussed, modern molecular biology and medicine seem particularly challenging, and this relates to the often irreducible high dimensionality of the data. This thus remains our main target.
Affiliation(s)
- Barry Robson
- T. J. Watson Research Center (IBM), 1101 Kitchawan Road, Yorktown Heights, New York 10598, USA
|
30
|
Robson B. The Dragon on the Gold: Myths and Realities for Data Mining in Biomedicine and Biotechnology Using Digital and Molecular Libraries. J Proteome Res 2004; 3:1113-9. [PMID: 15595719 DOI: 10.1021/pr0499242] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.2] [Indexed: 11/29/2022]
Abstract
To develop bioscience and personalized medicine in the post-genomic era, the biggest problem may be how to extract knowledge from the rich libraries of biomedical data. A particular dragon protects the gold therein: the dragon is the "curse of dimensionality" and its formidable fire weapon, which is burning researchers, is the "combinatorial explosion". This arises because many genomic, proteomic, clinical, and lifestyle factors may interact in ways that cannot necessarily be considered on a simple pairwise or additive basis. A suggested theoretical solution (or at least a "road map" that ameliorates management of these problems) borrows from several disciplines. It is undertaken also in the hope that it might lead to research with broader impact on several unresolved issues in biotechnology; conversely, mathematical understanding of processes involving molecular libraries, such as cDNA libraries and DNA in the living cell itself, may open opportunities to use biotechnology to construct nanotechnological storage and query systems.
Affiliation(s)
- Barry Robson
- IBM Research, T.J. Watson Research Laboratory, Route 132, Yorktown Heights, NY 10598, USA
|
31
|
Kosiol C, Goldman N, Buttimore NH. A new criterion and method for amino acid classification. J Theor Biol 2004; 228:97-106. [PMID: 15064085 DOI: 10.1016/j.jtbi.2003.12.010] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.4] [Received: 09/23/2003] [Revised: 12/04/2003] [Accepted: 12/10/2003] [Indexed: 11/18/2022]
Abstract
It is accepted that many evolutionary changes of amino acid sequence in proteins are conservative: the replacement of one amino acid by another residue has a far greater chance of being accepted if the two residues have similar properties. It is difficult, however, to identify relevant physicochemical properties that capture this similarity. In this paper we introduce a criterion that determines similarity from an evolutionary point of view. Our criterion is based on the description of protein evolution by a Markov process and the corresponding matrix of instantaneous replacement rates. It is inspired by the conductance, a quantity that reflects the strength of mixing in a Markov process. Furthermore we introduce a method to divide the 20 amino acid residues into subsets that achieve good scores with our criterion. The criterion has the time-invariance property that different time distances of the same amino acid replacement rate matrix lead to the same grouping; but different rate matrices lead to different groupings. Therefore it can be used as an automated method to compare matrices derived from consideration of different types of proteins, or from parts of proteins sharing different structural or functional features. We present the groupings resulting from two standard matrices used in sequence alignment and phylogenetic tree estimation.
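To make the notion of conductance concrete, here is a minimal sketch using one standard definition: the equilibrium flow leaving a set of states divided by the equilibrium weight of that set. The abstract does not give the exact scoring formula, so this should not be read as the authors' criterion. The 4-state rate matrix and uniform equilibrium distribution are hypothetical toy values.

```python
import numpy as np

def conductance(Q, pi, group):
    """Flow out of `group` per unit of equilibrium weight in `group`.

    Q is an instantaneous rate matrix (rows sum to zero) and pi its
    equilibrium distribution. Low values mean the group mixes poorly
    with the rest, i.e. its members mostly replace one another.
    """
    inside = np.zeros(len(pi), dtype=bool)
    inside[list(group)] = True
    flow = (pi[inside][:, None] * Q[np.ix_(inside, ~inside)]).sum()
    return flow / pi[inside].sum()

# Hypothetical 4-state process: states {0, 1} and {2, 3} exchange quickly
# within each pair (rate 1.0) and slowly between pairs (rate 0.1).
Q = np.array([
    [-1.2,  1.0,  0.1,  0.1],
    [ 1.0, -1.2,  0.1,  0.1],
    [ 0.1,  0.1, -1.2,  1.0],
    [ 0.1,  0.1,  1.0, -1.2],
])
pi = np.full(4, 0.25)

good = conductance(Q, pi, [0, 1])  # grouping that respects the fast exchanges
bad = conductance(Q, pi, [0, 2])   # grouping that cuts across them
print(good, bad)
```

A grouping method in this spirit would search for partitions whose subsets all have low conductance.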
Affiliation(s)
- Carolin Kosiol
- School of Mathematics, Trinity College, University of Dublin, Dublin 2, Ireland.
|
32
|
Robson B, Mushlin R. Clinical and Pharmacogenomic Data Mining: 2. A Simple Method for the Combination of Information from Associations and Multivariances to Facilitate Analysis, Decision, and Design in Clinical Research and Practice. J Proteome Res 2004; 3:697-711. [PMID: 15359722 DOI: 10.1021/pr0340680] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Indexed: 11/29/2022]
Abstract
The physician and researcher must ultimately be able to combine qualitative and quantitative features from a variety of combinations of observations on data of many component items (i.e., many dimensions), and hence reach simple conclusions about interpretation, rational courses of action, and design. In the first paper of this series, it was noted that such needs are challenging the classical means of using statistics. Hence, the paper proposed the use of a Generalized Theory of Expected Information or "Zeta Theory". The conjoint event [a,b,c,..] is seen as a rule of association for a,b,c,.. associated with a rule strength I(a;b;c;...) = xi(s,o[a,b,c,..]) - xi (s,e[a,b,c,...]), where xi is the incomplete Zeta Function. Here, o[a,b,c,...] is the observed, and e[a,b,c,..] the expected, frequency of occurrence of conjoint event [a,b,c,...]. The present paper explores how output from this approach might be assembled in a form better suited for decision support. Related to this is the difficulty that the treatment of covariance and multivariance was previously rendered as a "fuzzy association" so that the output would fall into a similar form as the true associations, but this was a somewhat ad hoc approach in which only the final I( ) had any meaning. Users at clinical research sites had subsequently requested an alternative approach in which "effective frequencies" o[ ] and e[ ] calculated from the above variances and used to evaluate I( ) give some intuitive feeling analogous to the association treatment, and this is explored here. Though the present paper is theoretical, real examples are used to illustrate application. One clinical-genomic example illustrates experimental design by identifying data which is, or is not, statistically germane to the study. 
We also report on some impressions based on applying these techniques in studies of real, extensive patient record data which are now emerging, as well as on molecular design data originally studied in part to test the ability to deduce the effects of simple natural patient sequence variations ("SNPs") on patient protein activity. On the basis of these study experiences, methods of rationalizing and condensing the rules implied by associations and variances between data, as well as discussion of the difficulty of what is meant by "condensed", are presented in the Appendix.
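The rule strength defined in this abstract, the difference between the incomplete zeta function evaluated at the observed and at the expected frequency, can be sketched as follows. The sketch assumes that the incomplete zeta function is the partial sum of k^(-s) for k = 1..n and that both frequencies are non-negative integers; the abstract does not spell out either point (expected frequencies are generally non-integer in practice), so treat both as labeled assumptions. The function names are illustrative.

```python
def incomplete_zeta(s, n):
    """Partial sum 1 + 2**-s + ... + n**-s (assumed form; n a non-negative integer)."""
    return sum(k ** -s for k in range(1, n + 1))

def rule_strength(observed, expected, s=1.0):
    """I = zeta(s, observed) - zeta(s, expected) for a conjoint event."""
    return incomplete_zeta(s, observed) - incomplete_zeta(s, expected)

# Hypothetical counts: a conjoint event seen 4 times where 2 were expected.
positive = rule_strength(4, 2)   # observed above expectation
neutral = rule_strength(3, 3)    # observed equals expectation
negative = rule_strength(2, 4)   # observed below expectation
print(positive, neutral, negative)
```

Under these assumptions a positive value indicates that the events co-occur more often than expected and a negative value that they co-occur less often.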
Affiliation(s)
- Barry Robson
- T. J. Watson Research Center, Yorktown Heights, New York 10598, USA
|
33
|
Farrell BD. Evolutionary assembly of the milkweed fauna: cytochrome oxidase I and the age of Tetraopes beetles. Mol Phylogenet Evol 2001; 18:467-78. [PMID: 11277638 DOI: 10.1006/mpev.2000.0888] [Citation(s) in RCA: 126] [Impact Index Per Article: 5.5] [Indexed: 11/22/2022]
Abstract
The insects that feed on the related plant families Apocynaceae and Asclepiadaceae (here collectively termed "milkweeds") comprise a "component community" of highly specialized, distinctive lineages of species that frequently sequester toxic cardiac glycosides from their host plants for defense against predators and are thus often aposematic, advertising their consequent unpalatability. Such sets of specialized lineages provide opportunities for comparative studies of the rate of adaptation, diversification, and habitat-related effects on molecular evolution. The cerambycid genus Tetraopes is the most diverse of the new world milkweed herbivores and the species are generally host specific, being restricted to single, different species of Asclepias, more often so than most other milkweed insects. Previous work revealed correspondence between the phylogeny of these beetles and that of their hosts. The present study provides analyses of near-complete DNA sequences for Tetraopes and relatives that are used to establish a molecular clock and temporal framework for Tetraopes evolution with their milkweed hosts.
Affiliation(s)
- B D Farrell
- Museum of Comparative Zoology, Harvard University, Cambridge, Massachusetts 02138, USA
|
34
|
Gómez-Zurita J, Juan C, Petitpierre E. The evolutionary history of the genus Timarcha (Coleoptera, Chrysomelidae) inferred from mitochondrial COII gene and partial 16S rDNA sequences. Mol Phylogenet Evol 2000; 14:304-17. [PMID: 10679162 DOI: 10.1006/mpev.1999.0712] [Citation(s) in RCA: 66] [Impact Index Per Article: 2.8] [Indexed: 11/22/2022]
Abstract
The apterous genus Timarcha consists of three subgenera and more than 100 species in its Palearctic distribution, with specialized feeding on few plant families. Fifty-four sequences sampled from 31 taxa of the genus plus three outgroup leaf beetles were studied for their complete cytochrome oxidase II (COII) and a fragment of 16S rDNA mitochondrial genes, representing a total of about 1200 bp. Phylogenetic analyses using maximum-parsimony and distance methods for each gene separately and for the combined data set gave compatible topologies. The subgenus Metallotimarcha consistently appears in a basal position and is well differentiated from the remaining Timarcha, but no clear monophyletic grouping of Timarchostoma and Timarcha s. str. subgenera can be deduced from our analysis. Calibration of the molecular clock has been done using the opening of the Gibraltar Strait after the Messinian salinity crisis (about 5.5 MYA) as the biogeographic event causing disjunction of two particular taxa. Accordingly, the COII evolutionary rate has been estimated to be 0.76 × 10^-8 substitutions/site/year in Timarcha. The relation between phylogeny and host-plant use indicates widening of trophic regime as a derived character in Timarcha.
Affiliation(s)
- J Gómez-Zurita
- Lab. Genètica, Universitat de les Illes Balears, Palma de Mallorca, Balearic Islands, E-07071, Spain
|
35
|
Adenot M, Sarrauste de Menthière C, Chavanieu A, Calas B, Grassy G. Peptides quantitative structure-function relationships: an automated mutation strategy to design peptides and pseudopeptides from substitution matrices. J Mol Graph Model 1999; 17:292-309. [PMID: 10840689 DOI: 10.1016/s1093-3263(99)00037-6] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.3] [Indexed: 11/23/2022]
Abstract
The process by which analogs in peptide chemistry are currently designed does not include any quantitative basis for amino acid substitutions from pharmacological leads. Here, we show that substitution matrices such as PAM 250 can provide quantitative constraints compatible with biological activity. This article describes its use in a strategy of rational amino acid substitution in peptides and proteins: we have computed a chemically derived matrix equivalent to the well-known PAM 250 matrix, reflecting the natural mutability rates of amino acids in protein evolutions but that can be extended to all the noncoded amino acids. Some of these noncoded amino acids are widely used to mimic secondary structure, to constrain backbone conformation, or to evade protease degradation. An automated sequence mutation (ASM) strategy has been defined to generate mutations within constraints. Application of such a substitution matrix to quantitative structure-function relationship studies will be of use in the design of proteins and peptides destined to become pharmaceutical drugs. In particular, issues such as which functionally conserved substitutions are able to satisfy conformational restrictions, oral bioavailability, or formulation demands can be quantitatively addressed.
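The idea of generating mutations within constraints set by a substitution matrix can be sketched as below. This is only an illustration of threshold-based filtering, not the authors' ASM procedure, whose details the abstract does not give. The scores in `TOY_SCORES` are hypothetical placeholders and are not PAM 250 values; the function names and the threshold are likewise assumptions.

```python
# Hypothetical similarity scores (NOT real PAM 250 entries); pairs that are
# absent from the table are treated as "unknown" and never proposed.
TOY_SCORES = {
    ("L", "I"): 2, ("L", "V"): 2, ("L", "D"): -4,
    ("K", "R"): 3, ("K", "E"): 0, ("K", "W"): -3,
}

def score(a, b):
    """Symmetric lookup; returns None when the pair is not tabulated."""
    return TOY_SCORES.get((a, b), TOY_SCORES.get((b, a)))

def allowed_substitutions(residue, alphabet, threshold):
    """Residues whose similarity to `residue` meets the threshold."""
    allowed = []
    for candidate in alphabet:
        value = score(residue, candidate)
        if candidate != residue and value is not None and value >= threshold:
            allowed.append(candidate)
    return allowed

def single_point_mutants(peptide, alphabet, threshold):
    """All one-residue variants of `peptide` that respect the threshold."""
    variants = []
    for position, residue in enumerate(peptide):
        for substitute in allowed_substitutions(residue, alphabet, threshold):
            variants.append(peptide[:position] + substitute + peptide[position + 1:])
    return variants

variants = single_point_mutants("LK", alphabet="DEIKLRVW", threshold=1)
print(variants)
```

Extending the score table with entries for noncoded amino acids would let the same filter propose pseudopeptide variants, which is the extension the abstract describes.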
Affiliation(s)
- M Adenot
- Centre de Biochimie Structurale, CNRS UMR 9955, INSERM U 414, Faculté de Pharmacie 15, Montpellier, France
|
36
|
Abstract
Phylogenetic relationships within the Aphidiinae, and between this and other subfamilies of Braconidae (Hymenoptera), were investigated using sequence data from three genes: elongation factor-1alpha, cytochrome b, and the second expansion segment of the 28S ribosomal subunit. Variation in both protein-coding genes was characterized by a high level of homoplasy, but analysis of the expansion segment (robust over a range of alignment methods and parameters) resolved some of the older divergences. Parsimony analysis of the combined data suggests the following tribal relationships: (Ephedrini + (Praini + (Aphidiini + Trioxini))). In addition, the cyclostome subfamilies were found to form a clade separate from the Aphidiinae, but relationships between the Aphidiinae and the noncyclostome braconids could not be resolved. The inferred phylogeny also supported a secondary loss of internal pupation within the Praini and a polyphyletic origin of endoparasitism within the Braconidae.
Affiliation(s)
- R Belshaw
- Biology Department, Imperial College at Silwood Park, Ascot, Berks, United Kingdom
|
37
|
Brabetz W, Brade H. Molecular cloning, sequence analysis and functional characterization of the gene kdsA, encoding 3-deoxy-D-manno-2-octulosonate-8-phosphate synthase of Chlamydia psittaci 6BC. EUROPEAN JOURNAL OF BIOCHEMISTRY 1997; 244:66-73. [PMID: 9063447 DOI: 10.1111/j.1432-1033.1997.00066.x] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.4] [Indexed: 02/03/2023]
Abstract
The kdsA gene encoding 3-deoxy-D-manno-2-octulosonate-8-phosphate (Kdo-8-P) synthase of Chlamydia psittaci 6BC was cloned by complementing the temperature-sensitive kdsA mutant Salmonella enterica serovar Typhimurium AG701i50. The sequence analysis of a recombinant DNA fragment revealed an open reading frame of 807 nucleotides which codes for a polypeptide of 269 amino acids with a high degree of similarity to known KdsA proteins. In addition, alignments of Kdo-8-P synthases with bacterial and fungal 3-deoxy-D-arabino-2-heptulosonate-7-phosphate (Dha-7-P) synthases suggested that both classes of enzymes are structurally related and may belong to a family of 2-keto-3-deoxy-aldonic acid synthases. The chlamydial protein was overexpressed and functionally characterized in vitro to synthesize Kdo-8-P from D-arabinose 5-phosphate and phosphoenolpyruvate. A chlamydial DNA region upstream of the gene exhibiting similarities to the consensus sequence of sigma 70 promoters of Escherichia coli was responsible for the heterologous expression of kdsA.
Affiliation(s)
- W Brabetz
- Division of Biochemical Microbiology, Research Center Borstel, Center for Medicine and Biosciences, Germany
|
38
|
Podlesek Z, Comino A, Herzog-Velikonja B, Zgur-Bertok D, Komel R, Grabnar M. Bacillus licheniformis bacitracin-resistance ABC transporter: relationship to mammalian multidrug resistance. Mol Microbiol 1995; 16:969-76. [PMID: 7476193 DOI: 10.1111/j.1365-2958.1995.tb02322.x] [Citation(s) in RCA: 98] [Impact Index Per Article: 3.4] [Indexed: 01/25/2023]
Abstract
The nucleotide sequence of the Bacillus licheniformis bacitracin-resistance locus was determined. The presence of three open reading frames, bcrA, bcrB and bcrC, was revealed. The BcrA protein shares a high degree of homology with the hydrophilic ATP-binding components of the ABC family of transport proteins. The bcrB and bcrC genes were found to encode hydrophobic proteins, which may function as membrane components of the permease. Apart from Bacillus subtilis, these genes also confer resistance upon the Gram-negative Escherichia coli. The presumed function of the Bcr transporter is to remove the bacitracin molecule from its membrane target. In addition to the homology of the nucleotide-binding sites, BcrA protein and mammalian multidrug transporter or P-glycoprotein share collateral detergent sensitivity of resistant cells and possibly the mode of Bcr transport activity within the membrane. The advantage of the resistance phenotype of the Bcr transporter was used to construct deletions within the nucleotide-binding protein to determine the importance of various regions in transport.
Affiliation(s)
- Z Podlesek
- Department of Biology, University of Ljubljana, Slovenia
|
39
|
Abstract
Arrestins constitute a superfamily of regulatory proteins that down-regulate phosphorylated G-protein membrane receptors, including rod and cone photoreceptors and adrenergic receptors. The potential role of arrestin in color visual processes led us to identify a cDNA encoding a cone-like arrestin in Xenopus laevis, the principal amphibian biological model system. Alignment of 18 deduced amino acid sequences of all known arrestins from both invertebrate and vertebrate species reveals five arrestin families. Further analysis identifies 7 variable and 4 conservative arrestin structural motifs that may identify potential functional domains. The adaptive evolutionary relationship of Xenopus cone arrestin to the arrestin gene tree suggests high intrafamily homology and early gene duplication events.
Affiliation(s)
- C M Craft
- Department of Cell and Neurobiology, University of Southern California School of Medicine, Mary D. Allen Laboratories, Doheny Eye Research Institute, San Pablo, Los Angeles 90033, USA
|
40
|
Venanzetti F, Cecconi F, Giorgi M, Cesaroni D, Sbordoni V, Mariottini P. Cloning and characterization of the European seabass, Dicentrarchus labrax, mitochondrial genome. Curr Genet 1994; 26:139-45. [PMID: 8001168 DOI: 10.1007/bf00313802] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.1] [Indexed: 01/28/2023]
Abstract
Mitochondrial DNA (mtDNA) from the European seabass, Dicentrarchus labrax, has been cloned and characterized. Its gene organization was deduced by a comparison of the sequenced termini of different subclones obtained from European seabass mtDNA to the completely-sequenced mtDNAs from carp and freshwater loach. The difference in genome size between the European seabass mtDNA (approximately 18 kb) and most of the other characterized fish mtDNAs (approximately 16.5 kb) is accounted for by the displacement-loop (D-loop). Comparisons have been performed between the derived amino-acid sequences of three sequenced genes, cytochrome c oxidase subunit 2 (COII), NADH dehydrogenase subunit 4L (ND4L) and ATP synthase subunit 8 (ATPase8), from D. labrax, and their counterparts in other fishes and Xenopus laevis.
Affiliation(s)
- F Venanzetti
- Dipartimento di Biologia, Università degli Studi di Roma Tor Vergata, Italy
|
41
|
Abstract
The widely used Mutation Data Matrix (MDM) is an amino acid comparison matrix calculated from a study of the exchange probabilities (or odds) derived from an analysis of the evolutionary changes seen in groups of very similar proteins. In this work, a mutation data matrix is calculated for membrane-spanning segments. This new mutation data matrix is found to be very different from matrices calculated from general sequence sets, which are biased towards water-soluble globular proteins, and the differences are discussed in the context of specific structural requirements of membrane-spanning segments. This new matrix will help improve the accuracy of integral membrane protein sequence alignments, and could also be of use in the rational design of site-directed mutagenesis experiments for this class of proteins.
Affiliation(s)
- D T Jones
- Department of Biochemistry and Molecular Biology, University College, London, UK
|
42
|
Colman PM. Structural basis of antigenic variation: studies of influenza virus neuraminidase. Immunol Cell Biol 1992; 70 (Pt 3):209-14. [PMID: 1452222 DOI: 10.1038/icb.1992.26] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.5] [Indexed: 12/27/2022]
Affiliation(s)
- P M Colman
- CSIRO Division of Biomolecular Engineering, Parkville, Victoria, Australia
|
43
|
Zabin HB, Horvath MP, Terwilliger TC. Approaches to predicting effects of single amino acid substitutions on the function of a protein. Biochemistry 1991; 30:6230-40. [PMID: 2059630 DOI: 10.1021/bi00239a022] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.4] [Indexed: 12/30/2022]
Abstract
The relative activities of 313 mutants of the gene V protein of bacteriophage f1, assayed in vivo, have been used to evaluate two approaches to predicting the effects of single amino acid substitutions on the function of a protein. First, we tested methods that only depend on the properties of the wild-type and substituting amino acids. None of the properties or measures of the functional equivalence of amino acids we tested, including the frequency of exchange of amino acids among homologous proteins as well as changes in side-chain size, hydrophobicity, and charge, were found to be more than weakly correlated with the activities of mutants. The principal reason for this poor correlation was found to be that the effect of a particular substitution varies considerably from site to site. We then tested an approach using the activities of several mutants with substitutions at a site to predict the activity of another mutant, and we find that this is a relatively good indicator of whether the other mutant at that site will be functional. A predictive scheme was developed that combines the weak information from the models depending on the properties of the wild-type and substituting amino acids with the stronger information from the tolerance of a site to substitution. Although this scheme requires no knowledge of the structure of a mutant protein, it is useful in predicting the activities of mutants.
Affiliation(s)
- H B Zabin
- Department of Biochemistry and Molecular Biology, University of Chicago, Illinois 60637
|
44
|
Kern L, de Montigny J, Lacroute F, Jund R. Regulation of the pyrimidine salvage pathway by the FUR1 gene product of Saccharomyces cerevisiae. Curr Genet 1991; 19:333-7. [PMID: 1913872 DOI: 10.1007/bf00309592] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.3] [Indexed: 12/29/2022]
Abstract
In Saccharomyces cerevisiae, the protein encoded by the FUR1 gene is absolutely required for the expression of uracil phosphoribosyl transferase (UPRTase) activity. The occurrence of semi-dominant mutations for 5-fluorouracil (5FU) resistance at this locus led us to clone and sequence the semi-dominant fur1-5 allele. A single point mutation, resulting in the substitution of arginine 134 for serine, is responsible for this mutant phenotype. The fur1-5 allele is transcribed and expressed at the same level as the wild-type allele. However, in contrast with the wild-type, the UPRTase activity of the fur1-5 mutant strain is stimulated in vitro by UTP and does not, therefore, correspond to a loss of feedback of UPRTase activity. We found that uracil, as a free base, induces a significant increase in transcription and UPRTase activity in a wild-type strain as well as in uracil-overproducing mutants, which principally explains the high efficiency of the pyrimidine salvage pathway in S. cerevisiae.
Affiliation(s)
- L Kern
- Laboratoire de Génétique Physiologique, Institut de Biologie Moléculaire et Cellulaire du CNRS, Strasbourg, France
|
45
|
Merchant S, Hill K, Kim JH, Thompson J, Zaitlin D, Bogorad L. Isolation and characterization of a complementary DNA clone for an algal pre-apoplastocyanin. J Biol Chem 1990. [DOI: 10.1016/s0021-9258(19)38356-5] [Citation(s) in RCA: 38] [Impact Index Per Article: 1.1] [Indexed: 11/29/2022] Open
|
46
|
|
47
|
Abstract
In this paper, I define a measure of the relative position of each amino acid in the genetic code by means of a 21-dimensional vector describing its potential for mutation, in a single step, to each of the other amino acids, or to a chain termination codon. This measure allows us to make a systematic investigation of the type and number of the physicochemical properties of the amino acids that were involved in evolution. The polar character and size of amino acids are identified in this analysis as properties that played a leading role in the evolutionary history of the genetic code. The application of cluster analysis and discriminant analysis reveals the characteristics of the structural organization of the genetic code. Finally, I suggest the existence of a relationship between the molecular weight of the amino acids and the number of synonymous codons.
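The 21-dimensional vector described in this abstract can be sketched directly from the standard genetic code. For each amino acid, the sketch counts how many single-nucleotide changes to its codons lead to each of the 20 amino acids or to a stop codon. Two choices here are assumptions rather than details given in the abstract: synonymous changes are kept as the amino acid's own component, and raw counts are used without normalization.

```python
from collections import Counter

BASES = "TCAG"
# Standard genetic code in TCAG order ('*' marks a stop codon).
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
SYMBOLS = sorted(set(AMINO))  # 20 amino acids plus '*': 21 components

def mutation_vector(amino_acid):
    """Counts of single-step codon changes from `amino_acid` to each symbol."""
    counts = Counter()
    for codon, residue in CODON_TABLE.items():
        if residue != amino_acid:
            continue
        for position in range(3):
            for base in BASES:
                if base == codon[position]:
                    continue
                neighbor = codon[:position] + base + codon[position + 1:]
                counts[CODON_TABLE[neighbor]] += 1
    return [counts[symbol] for symbol in SYMBOLS]

met = dict(zip(SYMBOLS, mutation_vector("M")))
trp = dict(zip(SYMBOLS, mutation_vector("W")))
print(met)
print(trp)
```

Distances between such vectors could then feed the cluster and discriminant analyses mentioned in the abstract.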
Affiliation(s)
- M Di Giulio
- International Institute of Genetics and Biophysics, CNR, Napoli, Italy
|
48
|
Howell N. Evolutionary conservation of protein regions in the protonmotive cytochrome b and their possible roles in redox catalysis. J Mol Evol 1989; 29:157-69. [PMID: 2509716 DOI: 10.1007/bf02100114] [Citation(s) in RCA: 107] [Impact Index Per Article: 3.1] [Indexed: 01/01/2023]
Abstract
The amino acid sequences of the protonmotive cytochrome b from seven representative and phylogenetically diverse species have been compared to identify protein regions or segments that are conserved during evolution. The sequences analyzed included both prokaryotic and eukaryotic examples as well as mitochondrial cytochrome b and chloroplast b6 proteins. The principal conclusion from these analyses is that there are five protein regions--each comprising about 20 amino acid residues--that are consistently conserved during evolution. These domains are evident despite the low density of invariant residues. The two most highly conserved regions, spanning approximately consensus residues 130-150 and 270-290, are located in extramembrane loops and are hypothesized to constitute part of the Qo reaction center. The intramembrane, hydrophobic protein regions containing the heme-ligating histidines are also conserved during evolution. It was found, however, that the conservation of the protein segments extramembrane to the histidine residues ligating the low potential b566 heme group showed a higher degree of sequence conservation. The location of these conserved regions suggests that these extramembrane segments are also involved in forming the Qo reaction center. A protein segment putatively constituting a portion of the Qi reaction center, located approximately in the region spanned by consensus residues 20-40, is conserved in species as divergent as mouse and Rhodobacter. This region of the protein shows substantially less sequence conservation in the chloroplast cytochrome b6. The catalytic role of these conserved regions is strongly supported by locations of residues that are altered in mutants resistant to inhibitors of cytochrome b electron transport.
Affiliation(s)
- N Howell
- Department of Radiation Therapy, University of Texas Medical Branch, Galveston 77550
|
49
|
Affiliation(s)
- H Bernstein
- Department of Microbiology and Immunology, College of Medicine, University of Arizona, Tucson 85724
|
50
|
Albright LM, Ronson CW, Nixon BT, Ausubel FM. Identification of a gene linked to Rhizobium meliloti ntrA whose product is homologous to a family to ATP-binding proteins. J Bacteriol 1989; 171:1932-41. [PMID: 2703463 PMCID: PMC209842 DOI: 10.1128/jb.171.4.1932-1941.1989] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.0] [Indexed: 01/02/2023] Open
Abstract
The ntrA gene of Rhizobium meliloti has recently been identified and shown to be required for a diverse set of metabolic functions (C. W. Ronson, B. T. Nixon, L. M. Albright, and F. M. Ausubel, J. Bacteriol. 169:2424-2431, 1987). As a result of sequencing the ntrA gene and its flanking regions from R. meliloti, we identified an open reading frame directly upstream of ntrA, ORF1, whose predicted product is homologous to a superfamily of ATP-binding proteins involved in transport, cell division, nodulation, and DNA repair. The homology of ORF1 to this superfamily and its proximity to ntrA led us to investigate its role in symbiosis by mutagenesis and expression studies. We were unable to isolate an insertion mutation in ORF1, suggesting that ORF1 may code for an essential function. We identified the start of transcription for the ntrA gene in vegetative cells and bacteroids and showed that ORF1 and ntrA are transcriptionally unlinked. ORF1 appears to be in an operon with one or more upstream genes.
Affiliation(s)
- L M Albright
- Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115
|