1
Brudenell EL, Pohare MB, Zafred D, Phipps J, Hornsby HR, Darby JF, Dai J, Liggett E, Cain KM, Barran PE, de Silva TI, Sayers JR. Efficient overexpression and purification of severe acute respiratory syndrome coronavirus 2 nucleocapsid proteins in Escherichia coli. Biochem J 2024; 481:669-682. [PMID: 38713013 PMCID: PMC11346444 DOI: 10.1042/bcj20240019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/22/2024] [Revised: 04/30/2024] [Accepted: 05/07/2024] [Indexed: 05/08/2024]
Abstract
The fundamental biology of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) nucleocapsid protein (Ncap), its use in diagnostic assays and its potential application as a vaccine component have received considerable attention since the outbreak of the COVID-19 pandemic in late 2019. Here we report the scalable expression and purification of soluble, immunologically active SARS-CoV-2 Ncap in Escherichia coli. Codon-optimised synthetic genes encoding the original Ncap sequence and four common variants, each with an N-terminal 6His affinity tag (sequence MHHHHHHG), were cloned into an inducible expression vector carrying a regulated bacteriophage T5 synthetic promoter controlled by lac operator binding sites. The constructs were used to express Ncap proteins, and protocols were developed that allow efficient production of purified Ncap with yields of over 200 mg per litre of culture media. These proteins were deployed in ELISA assays to compare their responses to human sera. Our results suggest that there was no detectable difference between the 6His-tagged and untagged original Ncap proteins, but there may be a slight loss of sensitivity of sera to other Ncap isolates.
Affiliation(s)
- Emma L. Brudenell
  - Sheffield Institute for Nucleic Acids and Florey Institute, Section of Infection and Immunity, Division of Clinical Medicine, School of Medicine and Population Health, The University of Sheffield, Beech Hill Road, Sheffield S10 2RX, UK
- Manoj B. Pohare
  - Sheffield Institute for Nucleic Acids and Florey Institute, Section of Infection and Immunity, Division of Clinical Medicine, School of Medicine and Population Health, The University of Sheffield, Beech Hill Road, Sheffield S10 2RX, UK
- Domen Zafred
  - Sheffield Institute for Nucleic Acids and Florey Institute, Section of Infection and Immunity, Division of Clinical Medicine, School of Medicine and Population Health, The University of Sheffield, Beech Hill Road, Sheffield S10 2RX, UK
- Janine Phipps
  - Sheffield Institute for Nucleic Acids and Florey Institute, Section of Infection and Immunity, Division of Clinical Medicine, School of Medicine and Population Health, The University of Sheffield, Beech Hill Road, Sheffield S10 2RX, UK
- Hailey R. Hornsby
  - Sheffield Institute for Nucleic Acids and Florey Institute, Section of Infection and Immunity, Division of Clinical Medicine, School of Medicine and Population Health, The University of Sheffield, Beech Hill Road, Sheffield S10 2RX, UK
- John F. Darby
  - Sheffield Institute for Nucleic Acids and Florey Institute, Section of Infection and Immunity, Division of Clinical Medicine, School of Medicine and Population Health, The University of Sheffield, Beech Hill Road, Sheffield S10 2RX, UK
- Junxiao Dai
  - Michael Barber Centre for Collaborative Mass Spectrometry, Department of Chemistry, Manchester Institute of Biotechnology, The University of Manchester, 131 Princess Street, Manchester M1 7DN, UK
- Ellen Liggett
  - Michael Barber Centre for Collaborative Mass Spectrometry, Department of Chemistry, Manchester Institute of Biotechnology, The University of Manchester, 131 Princess Street, Manchester M1 7DN, UK
- Kathleen M. Cain
  - Michael Barber Centre for Collaborative Mass Spectrometry, Department of Chemistry, Manchester Institute of Biotechnology, The University of Manchester, 131 Princess Street, Manchester M1 7DN, UK
- Perdita E. Barran
  - Michael Barber Centre for Collaborative Mass Spectrometry, Department of Chemistry, Manchester Institute of Biotechnology, The University of Manchester, 131 Princess Street, Manchester M1 7DN, UK
- Thushan I. de Silva
  - Sheffield Institute for Nucleic Acids and Florey Institute, Section of Infection and Immunity, Division of Clinical Medicine, School of Medicine and Population Health, The University of Sheffield, Beech Hill Road, Sheffield S10 2RX, UK
- Jon R. Sayers
  - Sheffield Institute for Nucleic Acids and Florey Institute, Section of Infection and Immunity, Division of Clinical Medicine, School of Medicine and Population Health, The University of Sheffield, Beech Hill Road, Sheffield S10 2RX, UK
2
Heron EA, Valle G, Bernasconi A. Editorial: Identification of phenotypically important genomic variants. Frontiers in Bioinformatics 2023; 3:1328945. [PMID: 38025396 PMCID: PMC10668015 DOI: 10.3389/fbinf.2023.1328945] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2023] [Accepted: 11/03/2023] [Indexed: 12/01/2023] Open
Affiliation(s)
- Giorgio Valle
  - Department of Biology, Università di Padova, Padova, Italy
- Anna Bernasconi
  - Department of Electronics, Information, and Bioengineering, Politecnico di Milano, Milan, Italy
3
Serna García G, Al Khalaf R, Invernici F, Ceri S, Bernasconi A. CoVEffect: interactive system for mining the effects of SARS-CoV-2 mutations and variants based on deep learning. Gigascience 2023; 12:giad036. [PMID: 37222749 PMCID: PMC10205000 DOI: 10.1093/gigascience/giad036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2022] [Revised: 04/11/2023] [Accepted: 04/27/2023] [Indexed: 05/25/2023] Open
Abstract
BACKGROUND: Literature about SARS-CoV-2 widely discusses the effects of variations that have spread in the past 3 years. Such information is dispersed across the texts of many research articles, which hinders integrating it in practice with related datasets (e.g., the millions of SARS-CoV-2 sequences available to the community). We aim to fill this gap by mining literature abstracts to extract, for each variant/mutation, its related effects (in epidemiological, immunological, clinical, or viral kinetics terms) with labeled higher/lower levels in relation to the nonmutated virus.
RESULTS: The proposed framework comprises (i) the provisioning of abstracts from a COVID-19-related big data corpus (CORD-19) and (ii) the identification of mutation/variant effects in abstracts using a GPT2-based prediction model. The above techniques enable the prediction of mutations/variants with their effects and levels in 2 distinct scenarios: (i) the batch annotation of the most relevant CORD-19 abstracts and (ii) the on-demand annotation of any user-selected CORD-19 abstract through the CoVEffect web application (http://gmql.eu/coveffect), which assists expert users with semiautomated data labeling. On the interface, users can inspect the predictions and correct them; user inputs can then extend the training dataset used by the prediction model. Our prototype model was trained through a carefully designed process, using a minimal and highly diversified pool of samples.
CONCLUSIONS: The CoVEffect interface serves for the assisted annotation of abstracts, allowing the download of curated datasets for further use in data integration or analysis pipelines. The overall framework can be adapted to resolve similar unstructured-to-structured text translation tasks, which are typical of biomedical domains.
Affiliation(s)
- Giuseppe Serna García
  - Dipartimento di Informazione, Elettronica e Bioingegneria, 20133 Milano, Italy
- Ruba Al Khalaf
  - Dipartimento di Informazione, Elettronica e Bioingegneria, 20133 Milano, Italy
- Francesco Invernici
  - Dipartimento di Informazione, Elettronica e Bioingegneria, 20133 Milano, Italy
- Stefano Ceri
  - Dipartimento di Informazione, Elettronica e Bioingegneria, 20133 Milano, Italy
- Anna Bernasconi
  - Dipartimento di Informazione, Elettronica e Bioingegneria, 20133 Milano, Italy
4
Bernasconi A, Guizzardi G, Pastor O, Storey VC. Semantic interoperability: ontological unpacking of a viral conceptual model. BMC Bioinformatics 2022; 23:491. [PMID: 36396980 PMCID: PMC9672571 DOI: 10.1186/s12859-022-05022-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/17/2022] [Accepted: 10/29/2022] [Indexed: 11/18/2022] Open
Abstract
BACKGROUND: Genomics and virology are unquestionably important, but complex, domains being investigated by a large number of scientists. The need to facilitate and support work within these domains requires sharing of databases, although it is often difficult to do so because of the different ways in which data is represented across the databases. To foster semantic interoperability, models are needed that provide a deep understanding and interpretation of the concepts in a domain, so that the data can be consistently interpreted among researchers.
RESULTS: In this research, we propose the use of conceptual models to support semantic interoperability among databases and assess their ontological clarity to support their effective use. This modeling effort is illustrated by its application to the Viral Conceptual Model (VCM) that captures and represents the sequencing of viruses, inspired by the need to understand the genomic aspects of the virus responsible for COVID-19. For achieving semantic clarity on the VCM, we leverage the "ontological unpacking" method, a process of ontological analysis that reveals the ontological foundation of the information that is represented in a conceptual model. This is accomplished by applying the stereotypes of the OntoUML ontology-driven conceptual modeling language. As a result, we propose a new OntoVCM, an ontologically grounded model, based on the initial VCM, but with guaranteed interoperability among the data sources that employ it.
CONCLUSIONS: We propose and illustrate how the unpacking of the Viral Conceptual Model resolves several issues related to semantic interoperability, the importance of which is recognized by the "I" in the FAIR principles. The research addresses conceptual uncertainty within the domain of SARS-CoV-2 data and knowledge. The method employed provides the basis for further analyses of complex models that are currently used in life science applications but lack ontological grounding, which hinders the interoperability scientists need to progress their research.
Affiliation(s)
- Anna Bernasconi
  - Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, Italy
  - PROS Research Center, VRAIN Research Institute, Universitat Politècnica de València, Valencia, Spain
- Giancarlo Guizzardi
  - Conceptual and Cognitive Modeling Research Group, Free University of Bozen-Bolzano, Bolzano, Italy
  - Services and Cybersecurity Group, University of Twente, Enschede, The Netherlands
- Oscar Pastor
  - PROS Research Center, VRAIN Research Institute, Universitat Politècnica de València, Valencia, Spain
- Veda C Storey
  - J. Mack Robinson College of Business, Georgia State University, Atlanta, Georgia, USA
5
Al Khalaf R, Bernasconi A, Pinoli P, Ceri S. Analysis of co-occurring and mutually exclusive amino acid changes and detection of convergent and divergent evolution events in SARS-CoV-2. Comput Struct Biotechnol J 2022; 20:4238-4250. [PMID: 35945925 PMCID: PMC9352683 DOI: 10.1016/j.csbj.2022.07.051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2022] [Revised: 07/29/2022] [Accepted: 07/29/2022] [Indexed: 11/28/2022] Open
Abstract
The proliferation of SARS-CoV-2 lineages with a high number of accumulated mutations (such as the recent case of Omicron) has raised concerns about the evolutionary capacity of this virus. Here, we propose a computational study to examine non-synonymous mutations gathered within genomes of SARS-CoV-2 from the beginning of the pandemic until February 2022. We provide both qualitative and quantitative descriptions of this corpus, focusing on statistically significant co-occurring and mutually exclusive mutations within single genomes. Then, we examine in depth the distributions of mutations over defined lineages and compare those of frequently co-occurring mutation pairs. Based on this comparison, we study the convergence/divergence of mutations on the phylogenetic tree. As a result, we identify 1,818 co-occurring pairs of non-synonymous mutations showing at least one event of convergent evolution and 6,625 co-occurring pairs with at least one event of divergent evolution. Notable examples of both types are shown by means of a tree-based representation of lineages, visually capturing the behavior of mutations. Our method confirms several well-known cases; moreover, the provided evidence suggests that our workflow can explain aspects of the future mutational evolution of SARS-CoV-2.
Affiliation(s)
- Ruba Al Khalaf
  - Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Milan, Italy
- Anna Bernasconi
  - Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Milan, Italy
- Pietro Pinoli
  - Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Milan, Italy
- Stefano Ceri
  - Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Milan, Italy