51
Bowling A, Eastman A, Merlo C, Lin G, West N, Patel S, Cutting G, Sharma N. Downstream Alternate Start Site Allows N-Terminal Nonsense Variants to Escape NMD and Results in Functional Recovery by Readthrough and Modulator Combination. J Pers Med 2022; 12:1448. [PMID: 36143233 PMCID: PMC9504986 DOI: 10.3390/jpm12091448] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2022] [Revised: 08/26/2022] [Accepted: 08/31/2022] [Indexed: 11/23/2022]
Abstract
Genetic variants that introduce premature termination codons (PTCs) have remained difficult to therapeutically target due to lack of protein product. Nonsense-mediated mRNA decay (NMD) targets PTC-bearing transcripts to reduce the potentially damaging effects of truncated proteins. Readthrough compounds have been tested on PTC-generating variants in an attempt to permit translation through a premature stop. However, readthrough compounds have not proved efficacious in a clinical setting due to lack of stable mRNA. Here, we investigate N-terminal variants in the cystic fibrosis transmembrane conductance regulator (CFTR) gene, which have been shown to escape NMD, potentially through a mechanism of alternative translation initiation at downstream AUG codons. We hypothesized that N-terminal variants in CFTR that evade NMD will produce stable transcript, allowing CFTR function to be restored by a combination of readthrough and protein modulator therapy. We investigate this using two cell line models expressing CFTR-expression minigenes (EMG; HEK293s and CFBEs) and primary human nasal epithelial (NE) cells, and we test readthrough compounds G418 and ELX-02 in combination with CFTR protein modulators. HEK293 cells expressing the variants E60X and L88X generate CFTR-specific core glycosylated products that are consistent with downstream translation initiation. Mutation of downstream methionines at codons 150 and 152 does not result in changes in CFTR protein processing in cells expressing L88X-CFTR-EMG. However, mutation of methionine at 265 results in loss of detectable CFTR protein in cells expressing E60X, L88X, and Y122X CFTR-EMGs, indicating that downstream translation initiation is occurring at the AUG codon at position M265. In HEK293 stable cells harboring L88X, treatment with readthrough compounds alone allows for formation of full-length but misfolded CFTR protein. Upon addition of protein modulators in combination with readthrough, we observe formation of mature, complex-glycosylated CFTR. In CFBE and NE cells, addition of readthrough ELX-02 and modulator therapy results in substantial recovery of CFTR function. Our work indicates that N-terminal variants generate stable CFTR transcript due to translation initiation at a downstream AUG codon. Thus, individuals with CF bearing 5′ nonsense variants that evade NMD are ideal candidates for treatment with clinically safe readthrough compounds and modulator therapy.
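The idea of downstream translation initiation described in this abstract can be illustrated with a minimal sketch that lists in-frame ATG codons located after a premature stop. The function name `downstream_in_frame_starts` and the toy sequence are assumptions made for illustration; the sequence is not CFTR, and the sketch says nothing about which candidate start site is actually used in cells.

```python
from typing import List

STOP_CODONS = {"TAA", "TAG", "TGA"}


def downstream_in_frame_starts(cds: str, ptc_codon: int) -> List[int]:
    """Return 1-based codon numbers of in-frame ATG codons after a premature stop.

    cds       -- coding sequence starting at the annotated start codon (DNA or RNA)
    ptc_codon -- 1-based codon number of the premature termination codon
    """
    # Normalise RNA to DNA letters and split into complete codons.
    cds = cds.upper().replace("U", "T")
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    if not 1 <= ptc_codon <= len(codons):
        raise ValueError("ptc_codon is outside the coding sequence")
    if codons[ptc_codon - 1] not in STOP_CODONS:
        raise ValueError("codon at ptc_codon is not a stop codon")
    # Keep only ATG codons that lie downstream of the premature stop.
    return [n for n, codon in enumerate(codons, start=1)
            if n > ptc_codon and codon == "ATG"]


if __name__ == "__main__":
    # Toy sequence (hypothetical): ATG GCC TAA GGG ATG CCC ATG TGA
    toy = "ATGGCCTAAGGGATGCCCATGTGA"
    print(downstream_in_frame_starts(toy, ptc_codon=3))
```

In this toy case the candidates are codons 5 and 7. Experimental mutation of each candidate methionine, as the authors did for codons 150, 152, and 265, is what distinguishes a used start site from a merely possible one.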
Affiliation(s)
- Alyssa Bowling
- McKusick-Nathans Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
- Alice Eastman
- McKusick-Nathans Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
- Christian Merlo
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, Johns Hopkins Hospital, Baltimore, MD 21205, USA
- Gabrielle Lin
- McKusick-Nathans Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
- Natalie West
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, Johns Hopkins Hospital, Baltimore, MD 21205, USA
- Shivani Patel
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, Johns Hopkins Hospital, Baltimore, MD 21205, USA
- Garry Cutting
- McKusick-Nathans Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
- Neeraj Sharma
- McKusick-Nathans Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
52
Condé L, Allatif O, Ohlmann T, de Breyne S. Translation of SARS-CoV-2 gRNA Is Extremely Efficient and Competitive despite a High Degree of Secondary Structures and the Presence of an uORF. Viruses 2022; 14:1505. [PMID: 35891485 PMCID: PMC9322171 DOI: 10.3390/v14071505] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 06/03/2022] [Revised: 07/06/2022] [Accepted: 07/07/2022] [Indexed: 12/15/2022]
Abstract
The SARS-CoV-2 infection generates up to nine different sub-genomic mRNAs (sgRNAs), in addition to the genomic RNA (gRNA). The 5'UTRs of the viral mRNAs share the first 75 nucleotides (nt) at their 5' end, called the leader, but differ in the variable sequence (0 to 190 nt long) that follows the leader. As a result, each viral mRNA has its own specific 5'UTR in terms of length, RNA structure, uORF and Kozak context; each of these characteristics could affect mRNA expression. In this study, we have measured and compared the translational efficiency of each of the ten viral transcripts. Our data show that most of them are very efficiently translated in all translational systems tested. Surprisingly, the gRNA 5'UTR, which is the longest and the most structured, was also the most efficient at initiating translation. This property is conserved in the 5'UTR of SARS-CoV-1 but not in the MERS-CoV strain, mainly due to the regulation imposed by the uORF. Interestingly, the translation initiation mechanism on the SARS-CoV-2 gRNA 5'UTR requires the cap structure and the components of the eIF4F complex but showed no dependence on the presence of the poly(A) tail in vitro. Our data strongly suggest that translation initiation on SARS-CoV-2 mRNAs occurs via an unusual cap-dependent mechanism.
Affiliation(s)
- Théophile Ohlmann
- CIRI, Centre International de Recherche en Infectiologie, (Team Ohlmann), Univ Lyon, Inserm U1111, Université Claude Bernard Lyon 1, CNRS UMR5308, ENS de Lyon, F-69007 Lyon, France; (L.C.); (O.A.)
- Sylvain de Breyne
- CIRI, Centre International de Recherche en Infectiologie, (Team Ohlmann), Univ Lyon, Inserm U1111, Université Claude Bernard Lyon 1, CNRS UMR5308, ENS de Lyon, F-69007 Lyon, France; (L.C.); (O.A.)
53
Benjamin R, Giacoletto CJ, FitzHugh ZT, Eames D, Buczek L, Wu X, Newsome J, Han MV, Pearson T, Wei Z, Banerjee A, Brown L, Valente LJ, Shen S, Deng HW, Schiller MR. GigaAssay - An adaptable high-throughput saturation mutagenesis assay platform. Genomics 2022; 114:110439. [PMID: 35905834 PMCID: PMC9420302 DOI: 10.1016/j.ygeno.2022.110439] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 01/06/2022] [Revised: 07/12/2022] [Accepted: 07/24/2022] [Indexed: 11/17/2022]
Abstract
High-throughput assay systems have had a large impact on understanding the mechanisms of basic cell functions. However, high-throughput assays that directly assess molecular functions are limited. Herein, we describe the "GigaAssay", a modular high-throughput one-pot assay system for measuring molecular functions of thousands of genetic variants at once. In this system, each cell was infected with one virus from a library encoding thousands of Tat mutant proteins, with each viral particle encoding a random unique molecular identifier (UMI). We demonstrate proof of concept by measuring transcription of a GFP reporter in an engineered reporter cell line driven by binding of the HIV Tat transcription factor to the HIV long terminal repeat. Infected cells were flow-sorted into 3 bins based on their GFP fluorescence readout. The transcriptional activity of each Tat mutant was calculated from the ratio of signals from each bin. The use of UMIs in the GigaAssay produced a high average accuracy (95%) and positive predictive value (98%) determined by comparison to literature benchmark data, known C-terminal truncations, and blinded independent mutant tests. Including the substitution tolerance with structure/function analysis shows restricted substitution types spatially concentrated in the Cys-rich region. Tat has abundant intragenic epistasis (10%) when single and double mutants are compared.
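The abstract states that activity was calculated from the ratio of signals across the three sorted bins but does not give the formula. The sketch below is therefore only one plausible scoring scheme, stated as an assumption: it collapses reads to distinct UMIs per variant and bin, then reports the fraction of a variant's UMIs found in the brightest bin. The bin labels (`low`, `mid`, `high`), the variant names, and the function `activity_scores` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

BINS = ("low", "mid", "high")  # assumed labels for the three GFP sort bins


def activity_scores(reads: Iterable[Tuple[str, str, str]]) -> Dict[str, float]:
    """Score each variant by the fraction of its distinct UMIs in the 'high' bin.

    reads -- iterable of (variant, umi, bin_name) tuples, one per sequencing read
    """
    # variant -> bin -> set of UMIs; sets collapse duplicate reads of one UMI.
    umis = defaultdict(lambda: {b: set() for b in BINS})
    for variant, umi, bin_name in reads:
        umis[variant][bin_name].add(umi)

    scores = {}
    for variant, per_bin in umis.items():
        total = sum(len(s) for s in per_bin.values())
        scores[variant] = len(per_bin["high"]) / total
    return scores


if __name__ == "__main__":
    example_reads = [
        ("WT", "u1", "high"), ("WT", "u2", "high"), ("WT", "u2", "high"),
        ("WT", "u3", "mid"),
        ("mutA", "u4", "low"), ("mutA", "u5", "low"),
    ]
    print(activity_scores(example_reads))
```

Counting distinct UMIs rather than raw reads is the point of the barcode: PCR duplicates of one infected cell's barcode do not inflate the score.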
Affiliation(s)
- Ronald Benjamin
- Nevada Institute of Personalized Medicine, University of Nevada, Las Vegas, 4505 S. Maryland Parkway, Las Vegas, Nevada 89154, USA
- Christopher J Giacoletto
- Nevada Institute of Personalized Medicine, University of Nevada, Las Vegas, 4505 S. Maryland Parkway, Las Vegas, Nevada 89154, USA; School of Life Sciences, University of Nevada, Las Vegas, 4505 S. Maryland Parkway, Las Vegas, Nevada 89154, USA; Heligenics Inc., 833 Las Vegas Blvd. North, Suite B, Las Vegas, NV 89101, USA
- Zachary T FitzHugh
- Nevada Institute of Personalized Medicine, University of Nevada, Las Vegas, 4505 S. Maryland Parkway, Las Vegas, Nevada 89154, USA
- Danielle Eames
- Nevada Institute of Personalized Medicine, University of Nevada, Las Vegas, 4505 S. Maryland Parkway, Las Vegas, Nevada 89154, USA
- Lindsay Buczek
- Nevada Institute of Personalized Medicine, University of Nevada, Las Vegas, 4505 S. Maryland Parkway, Las Vegas, Nevada 89154, USA
- Xiaogang Wu
- Nevada Institute of Personalized Medicine, University of Nevada, Las Vegas, 4505 S. Maryland Parkway, Las Vegas, Nevada 89154, USA
- Jacklyn Newsome
- Nevada Institute of Personalized Medicine, University of Nevada, Las Vegas, 4505 S. Maryland Parkway, Las Vegas, Nevada 89154, USA
- Mira V Han
- Nevada Institute of Personalized Medicine, University of Nevada, Las Vegas, 4505 S. Maryland Parkway, Las Vegas, Nevada 89154, USA; School of Life Sciences, University of Nevada, Las Vegas, 4505 S. Maryland Parkway, Las Vegas, Nevada 89154, USA
- Tony Pearson
- School of Life Sciences, University of Nevada, Las Vegas, 4505 S. Maryland Parkway, Las Vegas, Nevada 89154, USA; Heligenics Inc., 833 Las Vegas Blvd. North, Suite B, Las Vegas, NV 89101, USA
- Zhi Wei
- Department of Computer Science, New Jersey Institute of Technology, GITC 4214C, University Heights, Newark, NJ 07102, USA
- Atoshi Banerjee
- Nevada Institute of Personalized Medicine, University of Nevada, Las Vegas, 4505 S. Maryland Parkway, Las Vegas, Nevada 89154, USA
- Lancer Brown
- Heligenics Inc., 833 Las Vegas Blvd. North, Suite B, Las Vegas, NV 89101, USA
- Liz J Valente
- Heligenics Inc., 833 Las Vegas Blvd. North, Suite B, Las Vegas, NV 89101, USA
- Shirley Shen
- Nevada Institute of Personalized Medicine, University of Nevada, Las Vegas, 4505 S. Maryland Parkway, Las Vegas, Nevada 89154, USA
- Hong-Wen Deng
- Center for Biomedical Informatics & Genomics, Tulane University, 1440 Canal Street, Suite 1621, New Orleans, LA 70112, USA
- Martin R Schiller
- Nevada Institute of Personalized Medicine, University of Nevada, Las Vegas, 4505 S. Maryland Parkway, Las Vegas, Nevada 89154, USA; School of Life Sciences, University of Nevada, Las Vegas, 4505 S. Maryland Parkway, Las Vegas, Nevada 89154, USA; Heligenics Inc., 833 Las Vegas Blvd. North, Suite B, Las Vegas, NV 89101, USA.
54
Engineering Toehold-Mediated Switches for Native RNA Detection and Regulation in Bacteria. J Mol Biol 2022; 434:167689. [PMID: 35717997 DOI: 10.1016/j.jmb.2022.167689] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/11/2022] [Revised: 05/19/2022] [Accepted: 06/09/2022] [Indexed: 01/24/2023]
Abstract
RNA switches are versatile tools in synthetic biology for sensing and regulation applications. The discoveries of RNA-mediated translational and transcriptional control have facilitated the development of complex de novo designs of RNA switches. Specifically, RNA toehold-mediated switches, in which binding to the toehold sensing domain controls the transition between switch states via strand displacement, have been extensively adapted for coupling systems responses to specific trans-RNA inputs. This review highlights some of the challenges associated with applying these switches for native RNA detection in vivo, including transferability between organisms. The applicability and design considerations of toehold-mediated switches are discussed by highlighting twelve recently developed switch designs. This review finishes with future perspectives to address current gaps in the field, particularly regarding the power of structural prediction algorithms for improved in vivo functionality of RNA switches.
55
Andreev DE, Loughran G, Fedorova AD, Mikhaylova MS, Shatsky IN, Baranov PV. Non-AUG translation initiation in mammals. Genome Biol 2022; 23:111. [PMID: 35534899 PMCID: PMC9082881 DOI: 10.1186/s13059-022-02674-2] [Citation(s) in RCA: 43] [Impact Index Per Article: 14.3] [Received: 08/14/2021] [Accepted: 04/14/2022] [Indexed: 12/12/2022]
Abstract
Recent proteogenomic studies revealed extensive translation outside of annotated protein coding regions, such as non-coding RNAs and untranslated regions of mRNAs. This non-canonical translation is largely due to start codon plurality within the same RNA. This plurality is often due to the failure of some scanning ribosomes to recognize potential start codons leading to initiation downstream—a process termed leaky scanning. Codons other than AUG (non-AUG) are particularly leaky due to their inefficiency. Here we discuss our current understanding of non-AUG initiation. We argue for a near-ubiquitous role of non-AUG initiation in shaping the dynamic composition of mammalian proteomes.
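Leaky scanning, as described in this abstract, can be illustrated with a toy calculation. The sketch assumes that every scanning ribosome meets the candidate start codons in 5' to 3' order and recognises each with a fixed, independent probability; it ignores reinitiation, ribosome drop-off, and any other mechanism. The function name `initiation_fractions` and the efficiency values are assumptions for illustration, not measured data.

```python
from typing import List, Tuple


def initiation_fractions(efficiencies: List[float]) -> Tuple[List[float], float]:
    """Split scanning ribosomes among start codons under a simple leaky-scanning model.

    efficiencies -- recognition probability of each start codon, in 5' to 3' order
    Returns (fraction initiating at each codon, fraction that bypasses all codons).
    """
    remaining = 1.0  # fraction of ribosomes still scanning
    fractions = []
    for p in efficiencies:
        if not 0.0 <= p <= 1.0:
            raise ValueError("each efficiency must lie between 0 and 1")
        fractions.append(remaining * p)   # ribosomes that initiate here
        remaining *= 1.0 - p              # ribosomes that leak past this codon
    return fractions, remaining


if __name__ == "__main__":
    # Hypothetical values: a weak upstream non-AUG codon followed by a strong AUG.
    per_site, bypass = initiation_fractions([0.2, 0.9])
    print(per_site, bypass)
```

With these made-up values most ribosomes leak past the inefficient upstream codon and initiate at the downstream AUG, which is the qualitative behaviour the abstract attributes to non-AUG start codons.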
56
Chiu CW, Li YR, Lin CY, Yeh HH, Liu MJ. Translation initiation landscape profiling reveals hidden open-reading frames required for the pathogenesis of tomato yellow leaf curl Thailand virus. Plant Cell 2022; 34:1804-1821. [PMID: 35080617 PMCID: PMC9048955 DOI: 10.1093/plcell/koac019] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 09/10/2021] [Accepted: 01/06/2022] [Indexed: 05/12/2023]
Abstract
Plant viruses with densely packed genomes employ noncanonical translational strategies to increase the coding capacity for viral function. However, the diverse translational strategies used make it challenging to define the full set of viral genes. Here, using tomato yellow leaf curl Thailand virus (TYLCTHV, genus Begomovirus) as a model system, we identified genes beyond the annotated gene sets by experimentally profiling in vivo translation initiation sites (TISs). We found that unanticipated AUG TISs were prevalent and determined that their usage involves alternative transcriptional and/or translational start sites and is associated with flanking mRNA sequences. Specifically, two downstream in-frame TISs were identified in the viral gene AV2. These TISs were conserved in the begomovirus lineage and led to the translation of different protein isoforms localized to cytoplasmic puncta and at the cell periphery, respectively. In addition, we found translational evidence of an unexplored gene, BV2. BV2 is conserved among TYLCTHV isolates and localizes to the endoplasmic reticulum and plasmodesmata. Mutations of AV2 isoforms and BV2 significantly attenuated disease symptoms in tomato (Solanum lycopersicum). In conclusion, our study pinpointing in vivo TISs untangles the coding complexity of a plant viral genome and, more importantly, illustrates the biological significance of the hidden open-reading frames encoding viral factors for pathogenicity.
Affiliation(s)
- Ching-Wen Chiu
- Biotechnology Center in Southern Taiwan, Academia Sinica, Tainan 711, Taiwan
- Ya-Ru Li
- Biotechnology Center in Southern Taiwan, Academia Sinica, Tainan 711, Taiwan
- Cheng-Yuan Lin
- Biotechnology Center in Southern Taiwan, Academia Sinica, Tainan 711, Taiwan
- Hsin-Hung Yeh
- Agricultural Biotechnology Research Center, Academia Sinica, Taipei 115, Taiwan
57
Isolation and Characterization of Two Pseudorabies Virus and Evaluation of Their Effects on Host Natural Immune Responses and Pathogenicity. Viruses 2022; 14:712. [PMID: 35458442 PMCID: PMC9032386 DOI: 10.3390/v14040712] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 02/16/2022] [Revised: 03/26/2022] [Accepted: 03/26/2022] [Indexed: 02/01/2023]
Abstract
Pseudorabies, caused by the pseudorabies virus (PRV), is an acute fatal disease that can spread across species, infecting rodents and other mammals, including livestock and wild animals. Recently, the emergence of virulent PRV isolates indicates a high risk of a variant PRV epidemic and the need for continuous surveillance. In this study, PRV-GD and PRV-JM, two fatal PRV variants, were isolated and their pathogenicity as well as their effects on host natural immune responses were assessed. PRV-GD and PRV-JM were genetically closest to PRV variants currently circulating in Heilongjiang (HLJ8) and Jiangxi (JX/CH/2016), which belong to genotype 2.2. Consistently, antisera from sows immunized with the classical PRV-Ea vaccine showed much lower neutralization ability against PRV-GD and PRV-JM. However, the antisera from pigs infected with PRV-JM had a much higher neutralization ability against PRV-TJ (as a positive control), PRV-GD and PRV-JM. In vivo, PRV-GD and PRV-JM infections caused 100% death in mice and piglets and induced extensive tissue damage, cell death, and inflammatory cytokine release. Our analysis of the emergence of PRV variants indicates that the classical PRV vaccine does not provide pigs with sufficient protection against these PRV isolates, and that there is a risk of continuous evolution and virulence enhancement. Efforts are still needed to conduct epidemiological monitoring of PRV and to develop novel vaccines against this emerging and reemerging infectious disease.
58
Bartas M, Volná A, Beaudoin CA, Poulsen ET, Červeň J, Brázda V, Špunda V, Blundell TL, Pečinka P. Unheeded SARS-CoV-2 proteins? A deep look into negative-sense RNA. Brief Bioinform 2022; 23:6539840. [PMID: 35229157 PMCID: PMC9116216 DOI: 10.1093/bib/bbac045] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 10/27/2021] [Revised: 01/13/2022] [Accepted: 01/29/2022] [Indexed: 01/27/2023]
Abstract
SARS-CoV-2 is a novel positive-sense single-stranded RNA virus from the Coronaviridae family (genus Betacoronavirus), which has been established as causing the COVID-19 pandemic. The genome of SARS-CoV-2 is one of the largest among known RNA viruses, comprising at least 26 known protein-coding loci. Studies thus far have outlined the coding capacity of the positive-sense strand of the SARS-CoV-2 genome, which can be used directly for protein translation. However, it has been recently shown that transcribed negative-sense viral RNA intermediates that arise during viral genome replication from positive-sense viruses can also code for proteins. No studies have yet explored the potential for negative-sense SARS-CoV-2 RNA intermediates to contain protein-coding loci. Thus, using sequence and structure-based bioinformatics methodologies, we have investigated the presence and validity of putative negative-sense ORFs (nsORFs) in the SARS-CoV-2 genome. Nine nsORFs were discovered to contain strong eukaryotic translation initiation signals and high codon adaptability scores, and several of the nsORFs were predicted to interact with RNA-binding proteins. Evolutionary conservation analyses indicated that some of the nsORFs are deeply conserved among related coronaviruses. Three-dimensional protein modeling revealed the presence of higher order folding among all putative SARS-CoV-2 nsORFs, and subsequent structural mimicry analyses suggest similarity of the nsORFs to DNA/RNA-binding proteins and proteins involved in immune signaling pathways. Altogether, these results suggest the potential existence of still undescribed SARS-CoV-2 proteins, which may play an important role in the viral lifecycle and COVID-19 pathogenesis.
Affiliation(s)
- Martin Bartas
- Department of Biology and Ecology, University of Ostrava, Ostrava 710 00, Czech Republic
- Adriana Volná
- Department of Physics, University of Ostrava, Ostrava 710 00, Czech Republic
- Christopher A Beaudoin
- Department of Biochemistry, Sanger Building, University of Cambridge, Tennis Court Rd, Cambridge CB2 1GA, UK
- Jiří Červeň
- Department of Biology and Ecology, University of Ostrava, Ostrava 710 00, Czech Republic
- Václav Brázda
- Institute of Biophysics, Czech Academy of Sciences, Brno, 612 65, Czech Republic
- Vladimír Špunda
- Department of Physics, University of Ostrava, Ostrava 710 00, Czech Republic; Global Change Research Institute, Czech Academy of Sciences, Brno, 603 00, Czech Republic
- Tom L Blundell
- Department of Biochemistry, Sanger Building, University of Cambridge, Tennis Court Rd, Cambridge CB2 1GA, UK
- Petr Pečinka
- Department of Biology and Ecology, University of Ostrava, Ostrava 710 00, Czech Republic
59
Neumann T, Tuller T. Modeling the ribosomal small subunit dynamic in Saccharomyces cerevisiae based on TCP-seq data. Nucleic Acids Res 2022; 50:1297-1316. [PMID: 35100399 PMCID: PMC8860609 DOI: 10.1093/nar/gkac021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2021] [Revised: 12/31/2021] [Accepted: 01/07/2022] [Indexed: 11/13/2022]
Abstract
Translation Complex Profile Sequencing (TCP-seq), a protocol that was developed and implemented on Saccharomyces cerevisiae, provides the footprints of the small subunit (SSU) of the ribosome (with additional factors) across the entire transcriptome of the analyzed organism. In this study, based on the TCP-seq data, we developed for the first time a predictive model of the SSU density and analyzed the effect of transcript features on the dynamics of the SSU scan in the 5′UTR. Among other things, our model is based on novel tools for detecting complex statistical relations tailored to TCP-seq. We quantitatively estimated the effect of several important features, including the context of the upstream AUG, the upstream ORF length and the mRNA folding strength. Specifically, we suggest that around 50% of the variance related to the read counts (RC) distribution near a start codon can be attributed to the AUG context score. We provide the first large-scale direct quantitative evidence that AUG context indeed affects the movement of the small subunit. In addition, we suggest that strong folding may cause the detachment of the SSU from the mRNA. We also identified a number of novel sequence motifs that can affect the SSU scan; some of these motifs affect transcription factors and RNA binding proteins. The results presented in this study provide a better understanding of the biophysical aspects related to the SSU scan along the 5′UTR and of translation initiation in S. cerevisiae, a fundamental step toward a comprehensive modeling of initiation.
Affiliation(s)
- Tamar Neumann
- Department of Biomedical Engineering, Tel Aviv University, Tel Aviv 6997801, Israel
- Tamir Tuller
- Department of Biomedical Engineering, Tel Aviv University, Tel Aviv 6997801, Israel
- The Sagol School of Neuroscience, Tel-Aviv University, Tel Aviv 6997801, Israel
60
Ahmed S, Bhasin M, Manjunath K, Varadarajan R. Prediction of Residue-specific Contributions to Binding and Thermal Stability Using Yeast Surface Display. Front Mol Biosci 2022; 8:800819. [PMID: 35127820 PMCID: PMC8814602 DOI: 10.3389/fmolb.2021.800819] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 10/24/2021] [Accepted: 12/14/2021] [Indexed: 12/11/2022]
Abstract
Accurate prediction of residue burial as well as quantitative prediction of residue-specific contributions to protein stability and activity is challenging, especially in the absence of experimental structural information. This is important for prediction and understanding of disease-causing mutations, and for protein stabilization and design. Using yeast surface display of a saturation mutagenesis library of the bacterial toxin CcdB, we probe the relationship between ligand binding and expression level of displayed protein, with in vivo solubility in E. coli and in vitro thermal stability. We find that both the stability and solubility correlate well with the total amount of active protein on the yeast cell surface but not with total amount of expressed protein. We coupled FACS and deep sequencing to reconstruct the binding and expression mean fluorescence intensity of each mutant. The reconstructed mean fluorescence intensity (MFIseq) was used to differentiate between buried site, exposed non active-site and exposed active-site positions with high accuracy. The MFIseq was also used as a criterion to identify destabilized as well as stabilized mutants in the library, and to predict the melting temperatures of destabilized mutants. These predictions were experimentally validated and were more accurate than those of various computational predictors. The approach was extended to successfully identify buried and active-site residues in the receptor binding domain of the spike protein of SARS-CoV-2, suggesting it has general applicability.
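The abstract does not spell out how the mean fluorescence intensity is reconstructed from sorted bins and sequencing reads. One common way to combine the two, shown here purely as an assumed sketch, is a read-weighted average of the mean fluorescence of each sort bin. The function name `reconstruct_mfi`, the inputs, and the numbers are hypothetical, and real pipelines typically also normalise reads for sequencing depth and for the number of cells collected per bin.

```python
from typing import Sequence


def reconstruct_mfi(read_counts: Sequence[float], bin_mfi: Sequence[float]) -> float:
    """Estimate one mutant's mean fluorescence from its distribution across sort bins.

    read_counts -- (already normalised) reads of the mutant observed in each bin
    bin_mfi     -- mean fluorescence intensity measured for each bin
    """
    if len(read_counts) != len(bin_mfi):
        raise ValueError("read_counts and bin_mfi must have the same length")
    total = sum(read_counts)
    if total <= 0:
        raise ValueError("mutant has no reads in any bin")
    # Weight each bin's fluorescence by the share of the mutant's reads in that bin.
    return sum(c * f for c, f in zip(read_counts, bin_mfi)) / total


if __name__ == "__main__":
    # Hypothetical mutant: 10, 30 and 60 reads in bins with MFI 100, 1000 and 10000.
    print(reconstruct_mfi([10, 30, 60], [100.0, 1000.0, 10000.0]))
```

A mutant concentrated in dim bins gets a low reconstructed value and one concentrated in bright bins a high value, which is the kind of per-mutant score the abstract uses to separate buried, exposed, and active-site positions.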
Affiliation(s)
- Shahbaz Ahmed
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore, India
- Munmun Bhasin
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore, India
61
Castillo-Hair SM, Seelig G. Machine Learning for Designing Next-Generation mRNA Therapeutics. Acc Chem Res 2022; 55:24-34. [PMID: 34905691 DOI: 10.1021/acs.accounts.1c00621] [Citation(s) in RCA: 49] [Impact Index Per Article: 16.3] [Indexed: 12/28/2022]
Abstract
Over just the last 2 years, mRNA therapeutics and vaccines have undergone a rapid transition from an intriguing concept to real-world impact. However, whereas some aspects of mRNA therapeutics, such as the use of chemical modifications to increase stability and reduce immunogenicity, have been extensively optimized for over two decades, other aspects, particularly the selection and design of the noncoding leader and trailer sequences which control translation efficiency and stability, have received comparably less attention. In practice, such 5' and 3' untranslated regions (UTRs) are often borrowed from highly expressed human genes with few or no modifications, as in the case for the Pfizer/BioNTech Covid vaccine. Focusing on the 5'UTR, we here argue that model-driven design is a promising alternative that provides unprecedented control over 5'UTR function. We review recent work that combines synthetic biology with machine learning to build quantitative models that relate ribosome loading, and thus translation efficiency, to the 5'UTR sequence. We first introduce an experimental approach that uses polysome profiling and high-throughput sequencing to quantify ribosome loading for hundreds of thousands of 5'UTRs in parallel. We apply this approach to measure ribosome loading in synthetic RNA libraries with a random sequence inserted into the 5'UTR. We then review Optimus 5-Prime, a convolutional neural network model trained on the experimental data. We highlight that very accurate models of biological regulation can be learned from synthetic data sets with degenerate 5'UTRs. We validate model predictions not only on held-out data sets from our random library but also on a large library of over 30 000 human 5'UTR fragments and using translation reporter data collected independently by other groups. Both the experiment and model are compatible with commonly used chemically modified nucleosides, in particular, pseudouridine (Ψ) and 1-methyl-pseudouridine (m1Ψ). 
We find that, in general, 5'UTRs have very similar impacts when combined with different protein-coding sequences and even in the context of different chemical modifications. We demonstrate that Optimus 5-Prime can be combined with design algorithms to generate de novo sequences with precisely defined translation efficiencies. We emphasize recent developments in design algorithms that rely on activation maximization and generative modeling to improve both the fitness and diversity of designed sequences. Compared with prior approaches such as genetic algorithms, we show that these approaches are not only faster but also less likely to get stuck in local sequence optima. Finally, we discuss how the approach reviewed here can be generalized to other gene regions and applications.
Affiliation(s)
- Sebastian M. Castillo-Hair
- Department of Electrical & Computer Engineering, University of Washington, Seattle, Washington 98195, United States
- eScience Institute, University of Washington, Seattle, Washington 98195, United States
- Georg Seelig
- Department of Electrical & Computer Engineering, University of Washington, Seattle, Washington 98195, United States
- Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, Washington 98195, United States
62
May GE, McManus CJ. Multiplexed Analysis of Human uORF Regulatory Functions During the ISR Using PoLib-Seq. Methods Mol Biol 2022; 2428:41-62. [PMID: 35171472 DOI: 10.1007/978-1-0716-1975-9_3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/14/2023]
Abstract
Protein synthesis is a highly regulated essential process. As such, it is subjected to substantial regulation in response to stress. One hallmark of the Integrated Stress Response (ISR) is the immediate shutdown of most translation through phosphorylation of the alpha subunit of translation initiation factor eIF2 and activation of eIF4E binding proteins. While these posttranslational modifications largely inhibit cap-dependent translation, many mRNAs resist this inhibition by alternative translation mechanisms involving cis-regulatory sequences and structures in 5' transcript leaders, including upstream Open Reading Frames (uORFs), Internal Ribosome Entry Sites (IRESes), and Cap-Independent Translation Elements (CITEs). Studies of uORF and IRES activity are often performed on a gene-by-gene basis; however, high-throughput methods have recently emerged. Here, we describe a protocol for Polysome Library Sequencing (PoLib-Seq; Fig. 1), a multiplexed assay of reporter gene translation that can be used during the ISR. A designer library of reporter RNAs is transfected into tissue-culture cells, and their translation is assayed via sucrose gradient fractionation followed by high-throughput sequencing. As an example, we include PoLib-Seq results simultaneously assaying translation of wildtype and uORF mutant human ATF4 reporter RNAs, recapitulating the known function of uORF1 in resisting translational inhibition during the ISR.
Affiliation(s)
- Gemma E May
- Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA, USA
| | - C Joel McManus
- Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA, USA.
- Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA, USA.
63
|
Armetta J, Schantz-Klausen M, Shepelin D, Vazquez-Uribe R, Bahl MI, Laursen MF, Licht TR, Sommer MO. Escherichia coli Promoters with Consistent Expression throughout the Murine Gut. ACS Synth Biol 2021; 10:3359-3368. [PMID: 34842418 DOI: 10.1021/acssynbio.1c00325] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Indexed: 12/26/2022]
Abstract
Advanced microbial therapeutics have great potential as a novel modality to diagnose and treat a wide range of diseases. Yet, to realize this potential, robust parts for regulating gene expression and consequent therapeutic activity in situ are needed. In this study, we characterized the expression level of more than 8000 variants of the Escherichia coli sigma factor 70 (σ70) promoter in a range of different environmental conditions and growth states using fluorescence-activated cell sorting and deep sequencing. Sampled conditions include aerobic and anaerobic culture in the laboratory as well as growth in several locations of the murine gastrointestinal tract. We found that σ70 promoters in E. coli generally maintain consistent expression levels across the murine gut (R²: 0.55-0.85, p value < 1 × 10⁻⁵), suggesting a limited environmental influence but a higher variability between in vitro and in vivo expression levels, highlighting the challenges of translating in vitro promoter activity to in vivo applications. Based on these data, we design the Schantzetta library, composed of eight promoters spanning a wide expression range and displaying a high degree of robustness in both laboratory and in vivo conditions (R² = 0.98, p = 0.000827). This study provides a systematic assessment of the σ70 promoter activity in E. coli as it transits the murine gut, leading to the definition of robust expression cassettes that could be a valuable tool for reliable engineering and development of advanced microbial therapeutics.
Affiliation(s)
- Jeremy Armetta
- Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, DK-2800 Lyngby, Denmark
| | - Michael Schantz-Klausen
- Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, DK-2800 Lyngby, Denmark
| | - Denis Shepelin
- Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, DK-2800 Lyngby, Denmark
| | - Ruben Vazquez-Uribe
- Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, DK-2800 Lyngby, Denmark
| | - Martin Iain Bahl
- National Food Institute, Technical University of Denmark, DK-2800 Lyngby, Denmark
| | | | - Tine Rask Licht
- National Food Institute, Technical University of Denmark, DK-2800 Lyngby, Denmark
| | - Morten O.A. Sommer
- Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, DK-2800 Lyngby, Denmark
64
|
Abstract
Synthetic messenger RNA (mRNA), once delivered into cells, can be readily translated into proteins by ribosomes, which do not distinguish exogenous mRNAs from endogenous transcripts. Until recently, the intrinsic instability and immunostimulatory property of exogenous RNAs largely hindered the therapeutic application of synthetic mRNAs. Thanks to major technological innovations, such as the introduction of chemically modified nucleosides, synthetic mRNAs have become programmable therapeutic reagents. Compared to DNA or protein-based therapeutic reagents, synthetic mRNAs bear several advantages: flexible design, easy optimization, low-cost preparation, and scalable synthesis. Therapeutic mRNAs are commonly designed to encode specific antigens to elicit organismal immune response to pathogens like viruses, express functional proteins to replace defective ones inside cells, or introduce novel enzymes to achieve unique functions like genome editing. Recent years have witnessed stunning progress in the development of mRNA vaccines against SARS-CoV-2. This success is built upon our fundamental understanding of mRNA metabolism and translational control, knowledge accumulated during the past several decades. Given the astronomical number of sequence combinations of four nucleotides, sequence-dependent control of mRNA translation remains incompletely understood. Rational design of synthetic mRNAs with robust translation and optimal stability remains challenging. Massively parallel reporter assays (MPRAs) have proven powerful in identifying sequence elements that control mRNA translatability and stability. Indeed, a completely randomized sequence in the 5' untranslated region (5'UTR) drives a wide range of translational outputs. In this Account, we will discuss general principles of mRNA translation in eukaryotic cells and elucidate the role of coding and noncoding regions in translational regulation.
From the therapeutic perspective, we will highlight the unique features of 5' cap, 5'UTR, coding region (CDS), stop codon, 3'UTR, and poly(A) tail. By focusing on the design strategies in mRNA engineering, we hope this Account will contribute to the rational design of synthetic mRNAs with broad therapeutic potential.
Affiliation(s)
- Longfei Jia
- Division of Nutritional Sciences, Cornell University, Ithaca, New York 14853, United States
| | - Shu-Bing Qian
- Division of Nutritional Sciences, Cornell University, Ithaca, New York 14853, United States
65
|
Bhandari BK, Lim CS, Remus DM, Chen A, van Dolleweerd C, Gardner PP. Analysis of 11,430 recombinant protein production experiments reveals that protein yield is tunable by synonymous codon changes of translation initiation sites. PLoS Comput Biol 2021; 17:e1009461. [PMID: 34610008 PMCID: PMC8519471 DOI: 10.1371/journal.pcbi.1009461] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 06/02/2021] [Revised: 10/15/2021] [Accepted: 09/19/2021] [Indexed: 12/16/2022] Open
Abstract
Recombinant protein production is a key process in generating proteins of interest in the pharmaceutical industry and biomedical research. However, about 50% of recombinant proteins fail to be expressed in a variety of host cells. Here we show that the accessibility of translation initiation sites, modelled using the mRNA base-unpairing across the Boltzmann ensemble, significantly outperforms alternative features. This approach accurately predicts the successes or failures of expression experiments, which utilised Escherichia coli cells to express 11,430 recombinant proteins from over 189 diverse species. On this basis, we develop TIsigner, which uses simulated annealing to modify up to the first nine codons of mRNAs with synonymous substitutions. We show that accessibility captures the key propensity beyond the target region (initiation sites in this case), as a modest number of synonymous changes is sufficient to tune the recombinant protein expression levels. We build a stochastic simulation model and show that higher accessibility leads to higher protein production and slower cell growth, supporting the idea of protein cost, where cell growth is constrained by protein circuits during overexpression.
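The optimisation idea in this abstract (simulated annealing over synonymous substitutions in the first nine codons) can be illustrated with a minimal sketch. It is not TIsigner's implementation: the real tool scores accessibility from base-unpairing across the Boltzmann ensemble, whereas `toy_cost` below is a hypothetical stand-in that simply counts G/C bases. The names `optimise_start` and `toy_cost`, the linear cooling schedule, and all parameter defaults are assumptions made for illustration.

```python
import itertools
import math
import random

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    "".join(c): aa for c, aa in zip(itertools.product(BASES, repeat=3), AMINO)
}
SYNONYMS = {}
for codon, aa in CODON_TO_AA.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def translate(seq):
    """Translate an upper-case DNA coding sequence (T, not U)."""
    return "".join(CODON_TO_AA[seq[i:i + 3]] for i in range(0, len(seq), 3))


def toy_cost(seq):
    # Hypothetical stand-in for an accessibility score: fewer G/C = lower cost.
    return sum(base in "GC" for base in seq)


def optimise_start(cds, n_codons=9, steps=500, t0=1.0, seed=0):
    """Anneal synonymous codon choices within the first n_codons codons."""
    if not cds or len(cds) % 3:
        raise ValueError("coding sequence must be non-empty, length multiple of 3")
    rng = random.Random(seed)
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    limit = min(n_codons, len(codons))

    def cost(cs):
        return toy_cost("".join(cs[:limit]))

    current, best = codons[:], codons[:]
    for step in range(steps):
        temperature = t0 * (1 - step / steps) + 1e-9  # linear cooling
        i = rng.randrange(limit)
        candidate = current[:]
        # Swap one codon for a synonym, so the protein never changes.
        candidate[i] = rng.choice(SYNONYMS[CODON_TO_AA[current[i]]])
        delta = cost(candidate) - cost(current)
        # Always accept improvements; accept worse moves with Boltzmann odds.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current = candidate
            if cost(current) < cost(best):
                best = current[:]
    return "".join(best)
```

Calling `optimise_start("ATGGCCGCGCTGCCCGGCAGCGGGCCCACCGGCTAA")` returns a sequence that encodes the same protein and differs from the input only within the first 27 nucleotides. Swapping `toy_cost` for a real accessibility estimate would be the step that makes the sketch meaningful.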
Affiliation(s)
- Bikash K. Bhandari
- Department of Biochemistry, School of Biomedical Sciences, University of Otago, Dunedin, New Zealand
| | - Chun Shen Lim
- Department of Biochemistry, School of Biomedical Sciences, University of Otago, Dunedin, New Zealand
| | - Daniela M. Remus
- Callaghan Innovation Protein Science and Engineering, University of Canterbury, Christchurch, New Zealand
| | - Augustine Chen
- Department of Biochemistry, School of Biomedical Sciences, University of Otago, Dunedin, New Zealand
| | - Craig van Dolleweerd
- Biomolecular Interaction Center, University of Canterbury, Christchurch, New Zealand
| | - Paul P. Gardner
- Department of Biochemistry, School of Biomedical Sciences, University of Otago, Dunedin, New Zealand
- Biomolecular Interaction Center, University of Canterbury, Christchurch, New Zealand
66
|
Koberstein JN, Stewart ML, Mighell TL, Smith CB, Cohen MS. A Sort-Seq Approach to the Development of Single Fluorescent Protein Biosensors. ACS Chem Biol 2021; 16:1709-1720. [PMID: 34431656 PMCID: PMC9807264 DOI: 10.1021/acschembio.1c00423] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 01/05/2023]
Abstract
Motivated by the growing importance of single fluorescent protein biosensors (SFPBs) in biological research and the difficulty in rationally engineering these tools, we sought to increase the rate at which SFPB designs can be optimized. SFPBs generally consist of three components: a circularly permuted fluorescent protein, a ligand-binding domain, and linkers connecting the two domains. In the absence of predictive methods for biosensor engineering, most designs combining these three components will fail to produce allosteric coupling between ligand binding and fluorescence emission. While methods to construct diverse libraries with variation in the site of GFP insertion and linker sequences have been developed, the remaining bottleneck is the ability to test these libraries for functional biosensors. We address this challenge by applying a massively parallel assay termed "sort-seq," which combines binned fluorescence-activated cell sorting, next-generation sequencing, and maximum likelihood estimation to quantify the brightness and dynamic range for many biosensor variants in parallel. We applied this method to two common biosensor optimization tasks: the choice of insertion site and optimization of linker sequences. The sort-seq assay applied to a maltose-binding protein domain-insertion library not only identified previously described high-dynamic-range variants but also discovered new functional insertion sites with diverse properties. A sort-seq assay performed on a pyruvate biosensor linker library expressed in mammalian cell culture identified linker variants with substantially improved dynamic range. Machine learning models trained on the resulting data can predict dynamic range from linker sequences. This high-throughput approach will accelerate the design and optimization of SFPBs, expanding the biosensor toolbox.
Affiliation(s)
- John N. Koberstein
- Vollum Institute, Oregon Health & Science University, Portland, OR 97239, USA
| | - Melissa L. Stewart
- Vollum Institute, Oregon Health & Science University, Portland, OR 97239, USA
| | - Taylor L. Mighell
- Department of Molecular and Medical Genetics, Oregon Health & Science University, Portland, OR 97239, USA
| | - Chadwick B. Smith
- Vollum Institute, Oregon Health & Science University, Portland, OR 97239, USA
| | - Michael S. Cohen
- Department of Physiology and Pharmacology, Oregon Health & Science University, Portland, OR 97239, USA
67
|
Ramzan S, Tennstedt S, Tariq M, Khan S, Noor Ul Ayan H, Ali A, Munz M, Thiele H, Korejo AA, Mughal AR, Jamal SZ, Nürnberg P, Baig SM, Erdmann J, Ahmad I. A Novel Missense Mutation in TNNI3K Causes Recessively Inherited Cardiac Conduction Disease in a Consanguineous Pakistani Family. Genes (Basel) 2021; 12:genes12081282. [PMID: 34440456 PMCID: PMC8395014 DOI: 10.3390/genes12081282] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 07/07/2021] [Revised: 08/15/2021] [Accepted: 08/16/2021] [Indexed: 11/16/2022] Open
Abstract
Cardiac conduction disease (CCD), which causes altered electrical impulse propagation in the heart, is a life-threatening condition with high morbidity and mortality. It exhibits genetic and clinical heterogeneity with diverse pathomechanisms, but in most cases, it disrupts the synchronous activity of impulse-generating nodes and impulse-conduction underlying the normal heartbeat. In this study, we investigated a consanguineous Pakistani family comprising four patients with CCD. We applied whole exome sequencing (WES) and co-segregation analysis, which identified a novel homozygous missense mutation (c.1531T>C;(p.Ser511Pro)) in the highly conserved kinase domain of the cardiac troponin I-interacting kinase (TNNI3K) encoding gene. The behaviors of mutant and native TNNI3K were compared by performing all-atom long-term molecular dynamics simulations, which revealed changes at the protein surface and in the hydrogen bond network. Furthermore, intra- and intermolecular interaction analyses revealed that p.Ser511Pro causes structural variation in the ATP-binding pocket and the homodimer interface. These findings suggest p.Ser511Pro to be a pathogenic variant. Our study provides insights into how the variant perturbs the TNNI3K structure-function relationship, leading to a disease state. This is the first report of a recessive mutation in TNNI3K and the first mutation in this gene identified in the Pakistani population.
Affiliation(s)
- Shafaq Ramzan
- Institute for Cardiogenetics, University of Lübeck, 23562 Lübeck, Germany; (S.R.); (S.T.); (H.N.U.A.); (M.M.); (J.E.)
- National Institute for Biotechnology and Genetic Engineering (NIBGE-C), Institute of Engineering and Applied Sciences (PIEAS), Islamabad 44000, Pakistan; (M.T.); (S.K.); (A.A.); (S.M.B.)
| | - Stephanie Tennstedt
- Institute for Cardiogenetics, University of Lübeck, 23562 Lübeck, Germany; (S.R.); (S.T.); (H.N.U.A.); (M.M.); (J.E.)
- DZHK (German Research Centre for Cardiovascular Research) Partner Site Hamburg/Lübeck/Kiel, 23562 Lübeck, Germany
- University Heart Center Lübeck, 23562 Lübeck, Germany
| | - Muhammad Tariq
- National Institute for Biotechnology and Genetic Engineering (NIBGE-C), Institute of Engineering and Applied Sciences (PIEAS), Islamabad 44000, Pakistan; (M.T.); (S.K.); (A.A.); (S.M.B.)
| | - Sheraz Khan
- National Institute for Biotechnology and Genetic Engineering (NIBGE-C), Institute of Engineering and Applied Sciences (PIEAS), Islamabad 44000, Pakistan; (M.T.); (S.K.); (A.A.); (S.M.B.)
| | - Hafiza Noor Ul Ayan
- Institute for Cardiogenetics, University of Lübeck, 23562 Lübeck, Germany; (S.R.); (S.T.); (H.N.U.A.); (M.M.); (J.E.)
- National Institute for Biotechnology and Genetic Engineering (NIBGE-C), Institute of Engineering and Applied Sciences (PIEAS), Islamabad 44000, Pakistan; (M.T.); (S.K.); (A.A.); (S.M.B.)
| | - Aamir Ali
- National Institute for Biotechnology and Genetic Engineering (NIBGE-C), Institute of Engineering and Applied Sciences (PIEAS), Islamabad 44000, Pakistan; (M.T.); (S.K.); (A.A.); (S.M.B.)
| | - Matthias Munz
- Institute for Cardiogenetics, University of Lübeck, 23562 Lübeck, Germany; (S.R.); (S.T.); (H.N.U.A.); (M.M.); (J.E.)
- DZHK (German Research Centre for Cardiovascular Research) Partner Site Hamburg/Lübeck/Kiel, 23562 Lübeck, Germany
| | - Holger Thiele
- Cologne Center for Genomics (CCG), University of Cologne, Faculty of Medicine, University Hospital Cologne, 50931 Cologne, Germany; (H.T.); (P.N.)
| | - Asad Aslam Korejo
- National Institute of Cardiovascular Disease, Karachi 75510, Pakistan; (A.A.K.); (S.Z.J.)
| | | | - Syed Zahid Jamal
- National Institute of Cardiovascular Disease, Karachi 75510, Pakistan; (A.A.K.); (S.Z.J.)
| | - Peter Nürnberg
- Cologne Center for Genomics (CCG), University of Cologne, Faculty of Medicine, University Hospital Cologne, 50931 Cologne, Germany; (H.T.); (P.N.)
- Center for Molecular Medicine Cologne (CMMC), University of Cologne, Faculty of Medicine, University Hospital Cologne, 50931 Cologne, Germany
| | - Shahid Mahmood Baig
- National Institute for Biotechnology and Genetic Engineering (NIBGE-C), Institute of Engineering and Applied Sciences (PIEAS), Islamabad 44000, Pakistan; (M.T.); (S.K.); (A.A.); (S.M.B.)
- Department of Biological and Biomedical Sciences, Aga Khan University, Karachi 74000, Pakistan
- Pakistan Science Foundation (PSF), 1-Constitution Avenue, G-5/2, Islamabad 44000, Pakistan
| | - Jeanette Erdmann
- Institute for Cardiogenetics, University of Lübeck, 23562 Lübeck, Germany; (S.R.); (S.T.); (H.N.U.A.); (M.M.); (J.E.)
- DZHK (German Research Centre for Cardiovascular Research) Partner Site Hamburg/Lübeck/Kiel, 23562 Lübeck, Germany
- University Heart Center Lübeck, 23562 Lübeck, Germany
| | - Ilyas Ahmad
- Institute for Cardiogenetics, University of Lübeck, 23562 Lübeck, Germany; (S.R.); (S.T.); (H.N.U.A.); (M.M.); (J.E.)
- DZHK (German Research Centre for Cardiovascular Research) Partner Site Hamburg/Lübeck/Kiel, 23562 Lübeck, Germany
- University Heart Center Lübeck, 23562 Lübeck, Germany
- Correspondence: Tel.: +49-(0)451-3101-8320
68
|
Shukla N, Roelle SM, Suzart VG, Bruchez AM, Matreyek KA. Mutants of human ACE2 differentially promote SARS-CoV and SARS-CoV-2 spike mediated infection. PLoS Pathog 2021; 17:e1009715. [PMID: 34270613 PMCID: PMC8284657 DOI: 10.1371/journal.ppat.1009715] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 04/13/2021] [Accepted: 06/15/2021] [Indexed: 01/10/2023] Open
Abstract
SARS-CoV and SARS-CoV-2 encode spike proteins that bind human ACE2 on the cell surface to enter target cells during infection. A small fraction of humans encode variants of ACE2, thus altering the biochemical properties at the protein interaction interface. These and other ACE2 coding mutants can reveal how the spike proteins of each virus may differentially engage the ACE2 protein surface during infection. We created an engineered HEK 293T cell line for facile stable transgenic modification, and expressed the major human ACE2 allele or 28 of its missense mutants, 24 of which are possible through single nucleotide changes from the human reference sequence. Infection with SARS-CoV or SARS-CoV-2 spike pseudotyped lentiviruses revealed that high ACE2 cell-surface expression could mask the effects of impaired binding during infection. Drastically reducing ACE2 cell surface expression revealed a range of infection efficiencies across the panel of mutants. Our infection results revealed a non-linear relationship between soluble SARS-CoV-2 RBD binding to ACE2 and pseudovirus infection, supporting a major role for binding avidity during entry. While ACE2 mutants D355N, R357A, and R357T abrogated entry by both SARS-CoV and SARS-CoV-2 spike proteins, the Y41A mutant inhibited SARS-CoV entry much more than SARS-CoV-2, suggesting differential utilization of the ACE2 side-chains within the largely overlapping interaction surfaces utilized by the two CoV spike proteins. These effects correlated well with cytopathic effects observed during SARS-CoV-2 replication in ACE2-mutant cells. The panel of ACE2 mutants also revealed altered ACE2 surface dependencies by the N501Y spike variant, including a near-complete utilization of the K353D ACE2 variant, despite decreased infection mediated by the parental SARS-CoV-2 spike. 
Our results clarify the relationship between ACE2 abundance, binding, and infection for various SARS-like coronavirus spike proteins and their mutants, and inform our understanding of how changes to ACE2 sequence may correspond with different susceptibilities to infection.
Affiliation(s)
- Nidhi Shukla
- Department of Pathology, Case Western Reserve University School of Medicine, Cleveland, Ohio, United States of America
| | - Sarah M. Roelle
- Department of Pathology, Case Western Reserve University School of Medicine, Cleveland, Ohio, United States of America
| | - Vinicius G. Suzart
- Department of Pathology, Case Western Reserve University School of Medicine, Cleveland, Ohio, United States of America
| | - Anna M. Bruchez
- Department of Pathology, Case Western Reserve University School of Medicine, Cleveland, Ohio, United States of America
| | - Kenneth A. Matreyek
- Department of Pathology, Case Western Reserve University School of Medicine, Cleveland, Ohio, United States of America
69
|
Wright CF, Quaife NM, Ramos-Hernández L, Danecek P, Ferla MP, Samocha KE, Kaplanis J, Gardner EJ, Eberhardt RY, Chao KR, Karczewski KJ, Morales J, Gallone G, Balasubramanian M, Banka S, Gompertz L, Kerr B, Kirby A, Lynch SA, Morton JEV, Pinz H, Sansbury FH, Stewart H, Zuccarelli BD, Cook SA, Taylor JC, Juusola J, Retterer K, Firth HV, Hurles ME, Lara-Pezzi E, Barton PJR, Whiffin N. Non-coding region variants upstream of MEF2C cause severe developmental disorder through three distinct loss-of-function mechanisms. Am J Hum Genet 2021; 108:1083-1094. [PMID: 34022131 PMCID: PMC8206381 DOI: 10.1016/j.ajhg.2021.04.025] [Citation(s) in RCA: 36] [Impact Index Per Article: 9.0] [Received: 02/08/2021] [Accepted: 04/29/2021] [Indexed: 02/08/2023] Open
Abstract
Clinical genetic testing of protein-coding regions identifies a likely causative variant in only around half of developmental disorder (DD) cases. The contribution of regulatory variation in non-coding regions to rare disease, including DD, remains very poorly understood. We screened 9,858 probands from the Deciphering Developmental Disorders (DDD) study for de novo mutations in the 5' untranslated regions (5' UTRs) of genes within which variants have previously been shown to cause DD through a dominant haploinsufficient mechanism. We identified four single-nucleotide variants and two copy-number variants upstream of MEF2C in a total of ten individual probands. We developed multiple bespoke and orthogonal experimental approaches to demonstrate that these variants cause DD through three distinct loss-of-function mechanisms, disrupting transcription, translation, and/or protein function. These non-coding region variants represent 23% of likely diagnoses identified in MEF2C in the DDD cohort, but these would all be missed in standard clinical genetics approaches. Nonetheless, these variants are readily detectable in exome sequence data, with 30.7% of 5' UTR bases across all genes well covered in the DDD dataset. Our analyses show that non-coding variants upstream of genes within which coding variants are known to cause DD are an important cause of severe disease and demonstrate that analyzing 5' UTRs can increase diagnostic yield. We also show how non-coding variants can help inform both the disease-causing mechanism underlying protein-coding variants and dosage tolerance of the gene.
Affiliation(s)
- Caroline F Wright
- Institute of Biomedical and Clinical Science, University of Exeter Medical School, Royal Devon & Exeter Hospital, Exeter EX2 5DW, UK
| | - Nicholas M Quaife
- National Heart & Lung Institute and MRC London Institute of Medical Sciences, Imperial College London, London W12 0NN, UK; Cardiovascular Research Centre, Royal Brompton & Harefield Hospitals NHS Trust, London SW3 6NP, UK
| | - Laura Ramos-Hernández
- Centro Nacional de Investigaciones Cardiovasculares Carlos III (CNIC), 28029 Madrid, Spain
| | - Petr Danecek
- Human Genetics Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1RQ, UK
| | - Matteo P Ferla
- National Institute for Health Research Oxford Biomedical Research Centre, Wellcome Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK
| | - Kaitlin E Samocha
- Human Genetics Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1RQ, UK
| | - Joanna Kaplanis
- Human Genetics Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1RQ, UK
| | - Eugene J Gardner
- Human Genetics Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1RQ, UK
| | - Ruth Y Eberhardt
- Human Genetics Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1RQ, UK
| | - Katherine R Chao
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Konrad J Karczewski
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Joannella Morales
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge CB10 1SD, UK
| | - Giuseppe Gallone
- Human Genetics Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1RQ, UK
| | - Meena Balasubramanian
- Sheffield Clinical Genetics Service, Sheffield Children's NHS Foundation Trust, Sheffield S10 2TH, UK; Academic Unit of Child Health, Department of Oncology & Metabolism, University of Sheffield, Sheffield S10 2TH, UK
| | - Siddharth Banka
- Manchester Centre for Genomic Medicine, St Mary's Hospital, Manchester University Hospitals NHS Foundation Trust, Health Innovation Manchester, Manchester M13 9WL, UK; Division of Evolution and Genomic Sciences, School of Biological Sciences, University of Manchester, Oxford Road, Manchester M13 9PL, UK
| | - Lianne Gompertz
- Manchester Centre for Genomic Medicine, St Mary's Hospital, Manchester University Hospitals NHS Foundation Trust, Health Innovation Manchester, Manchester M13 9WL, UK
| | - Bronwyn Kerr
- Division of Evolution and Genomic Sciences, School of Biological Sciences, University of Manchester, Oxford Road, Manchester M13 9PL, UK
| | - Amelia Kirby
- Department of Pediatrics, Wake Forest School of Medicine, Winston-Salem, NC 27101, USA
| | - Sally A Lynch
- UCD Academic Centre on Rare Diseases, School of Medicine and Medical Sciences, University College Dublin, and Clinical Genetics, Temple Street Children's University Hospital, Dublin D01 XD99, Ireland
| | - Jenny E V Morton
- West Midlands Regional Clinical Genetics Service and Birmingham Health Partners, Birmingham Women's and Children's Hospitals NHS Foundation Trust, Birmingham B4 6NH, UK
| | - Hailey Pinz
- Department of Pediatrics, Saint Louis University School of Medicine, Saint Louis, MO 63104, USA
| | - Francis H Sansbury
- All Wales Medical Genomics Service, NHS Wales Cardiff and Vale University Health Board, Institute of Medical Genetics, University Hospital of Wales, Cardiff CF14 4AY, UK
| | - Helen Stewart
- Oxford Centre for Genomic Medicine, Oxford University Hospitals NHS Foundation Trust, Oxford OX3 7LE, UK
| | - Britton D Zuccarelli
- Department of Neurology, University of Kansas School of Medicine-Salina Campus, Salina, KS 67401, USA
| | - Stuart A Cook
- National Heart & Lung Institute and MRC London Institute of Medical Sciences, Imperial College London, London W12 0NN, UK
| | - Jenny C Taylor
- National Institute for Health Research Oxford Biomedical Research Centre, Wellcome Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK
| | | | | | - Helen V Firth
- Human Genetics Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1RQ, UK; East Anglian Medical Genetics Service, Cambridge University Hospitals NHS Foundation Trust, Cambridge CB2 0QQ, UK
| | - Matthew E Hurles
- Human Genetics Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1RQ, UK
| | - Enrique Lara-Pezzi
- Centro Nacional de Investigaciones Cardiovasculares Carlos III (CNIC), 28029 Madrid, Spain; CIBER de enfermedades CardioVasculares (CIBERCV), 28029 Madrid, Spain
| | - Paul J R Barton
- National Heart & Lung Institute and MRC London Institute of Medical Sciences, Imperial College London, London W12 0NN, UK; Cardiovascular Research Centre, Royal Brompton & Harefield Hospitals NHS Trust, London SW3 6NP, UK
| | - Nicola Whiffin
- Human Genetics Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1RQ, UK; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Wellcome Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK.
70
|
Giess A, Torres Cleuren YN, Tjeldnes H, Krause M, Bizuayehu TT, Hiensch S, Okon A, Wagner CR, Valen E. Profiling of Small Ribosomal Subunits Reveals Modes and Regulation of Translation Initiation. Cell Rep 2020; 31:107534. [PMID: 32320657 DOI: 10.1016/j.celrep.2020.107534] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Received: 11/08/2019] [Revised: 02/28/2020] [Accepted: 03/27/2020] [Indexed: 12/11/2022] Open
Abstract
Translation initiation is often attributed as the rate-determining step of eukaryotic protein synthesis and key to gene expression control. Despite this centrality, the series of steps involved in this process is poorly understood. Here, we capture the transcriptome-wide occupancy of ribosomes across all stages of translation initiation, enabling us to characterize the transcriptome-wide dynamics of ribosome recruitment to mRNAs, scanning across 5' UTRs and stop codon recognition, in a higher eukaryote. We provide mechanistic evidence for ribosomes attaching to the mRNA by threading the mRNA through the small subunit. Moreover, we identify features that regulate the recruitment and processivity of scanning ribosomes and redefine optimal initiation contexts. Our approach enables deconvoluting translation initiation into separate stages and identifying regulators at each step.
Affiliation(s)
- Adam Giess
- Computational Biology Unit, Department of Informatics, University of Bergen, Bergen 5020, Norway
| | - Yamila N Torres Cleuren
- Computational Biology Unit, Department of Informatics, University of Bergen, Bergen 5020, Norway.
| | - Håkon Tjeldnes
- Computational Biology Unit, Department of Informatics, University of Bergen, Bergen 5020, Norway
| | - Maximilian Krause
- Computational Biology Unit, Department of Informatics, University of Bergen, Bergen 5020, Norway; Sars International Centre for Marine Molecular Biology, University of Bergen, Bergen 5008, Norway
| | | | - Senna Hiensch
- Sars International Centre for Marine Molecular Biology, University of Bergen, Bergen 5008, Norway
| | - Aniekan Okon
- Department Medicinal Chemistry, University of Minnesota, Minneapolis, MN 55455, USA
| | - Carston R Wagner
- Department Medicinal Chemistry, University of Minnesota, Minneapolis, MN 55455, USA
| | - Eivind Valen
- Computational Biology Unit, Department of Informatics, University of Bergen, Bergen 5020, Norway; Sars International Centre for Marine Molecular Biology, University of Bergen, Bergen 5008, Norway.
71
|
Karollus A, Avsec Ž, Gagneur J. Predicting mean ribosome load for 5'UTR of any length using deep learning. PLoS Comput Biol 2021; 17:e1008982. [PMID: 33970899 PMCID: PMC8136849 DOI: 10.1371/journal.pcbi.1008982] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 07/14/2020] [Revised: 05/20/2021] [Accepted: 04/19/2021] [Indexed: 01/07/2023] Open
Abstract
The 5’ untranslated region plays a key role in regulating mRNA translation and consequently protein abundance. Therefore, accurate modeling of 5’UTR regulatory sequences shall provide insights into translational control mechanisms and help interpret genetic variants. Recently, a model was trained on a massively parallel reporter assay to predict mean ribosome load (MRL), a proxy for translation rate, directly from 5’UTR sequence with a high degree of accuracy. However, this model is restricted to sequence lengths investigated in the reporter assay and therefore cannot be applied to the majority of human sequences without a substantial loss of information. Here, we introduced frame pooling, a novel neural network operation that enabled the development of an MRL prediction model for 5’UTRs of any length. Our model shows state-of-the-art performance on fixed length randomized sequences, while offering better generalization performance on longer sequences and on a variety of translation-related genome-wide datasets. Variant interpretation is demonstrated on a 5’UTR variant of the gene HBB associated with beta-thalassemia. Frame pooling could find applications in other bioinformatics predictive tasks. Moreover, our model, released open source, could help pinpoint pathogenic genetic variants.

The human genome carries a complex code. It consists of genes, which provide blueprints to assemble proteins, and regulatory elements, which control when, where, and how often particular genes are transcribed and translated into protein. To read the genome correctly and specifically to find the causes of inherited diseases, we need to be able to find and interpret these regulatory elements. Here, we focus on particular regions of the genome, the so-called 5’ untranslated regions, which play an important role in determining how often a transcribed gene is translated into protein.
We develop deep learning models which can quantitatively interpret regulatory elements in human 5’ untranslated regions and use this information to predict a proxy of the translation efficiency. Our model generalizes a previous model to 5’ untranslated regions of any length, just as they are encountered in natural human genes. Because this model requires only the sequence as input, it can give estimates for the impact of mutations in the sequence, even if these particular mutations are very rare or entirely novel. Such estimates could help pinpoint mutations that disrupt the normal functioning of gene regulation, which could be used to better diagnose patients suffering from rare genetic disorders.
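The abstract names frame pooling but does not describe the operation. The following is a minimal sketch under a stated assumption: that frame pooling means pooling per-position features separately for each of the three reading frames, counted back from the 3' end of the 5'UTR (the position nearest the start codon), so that the output size no longer depends on sequence length. The function name `frame_pool`, the max-plus-mean choice, and the zero padding for empty frames are illustrative and are not taken from the released model.

```python
from statistics import mean

def frame_pool(features):
    """Pool variable-length per-position features into a fixed-length vector.

    features: non-empty list of equal-length lists of floats, ordered 5' to 3'.
    Assumption: frames are defined by distance from the 3' end, modulo 3.
    Returns [max, mean] for every channel of frame 0, then frame 1, then frame 2.
    """
    n = len(features)
    n_channels = len(features[0])
    pooled = []
    for frame in range(3):
        # Positions whose distance to the last position is `frame` modulo 3.
        rows = [features[i] for i in range(n) if (n - 1 - i) % 3 == frame]
        for c in range(n_channels):
            column = [row[c] for row in rows]
            # Sequences shorter than 3 leave some frames empty; pad with 0.0.
            pooled.append(max(column) if column else 0.0)
            pooled.append(mean(column) if column else 0.0)
    return pooled

# Toy single-channel input of length 4 (hypothetical values).
print(frame_pool([[1.0], [2.0], [3.0], [4.0]]))
```

Whatever the input length, the output has 3 x 2 x (number of channels) entries, which is what would let a fixed-size dense layer follow a convolution over sequences of arbitrary length.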
Affiliation(s)
- Alexander Karollus
- Department of Informatics, Technical University of Munich, Garching, Germany
- Žiga Avsec
- Department of Informatics, Technical University of Munich, Garching, Germany
- Graduate School of Quantitative Biosciences (QBM), Ludwig-Maximilians-Universität München, Munich, Germany
- Julien Gagneur
- Department of Informatics, Technical University of Munich, Garching, Germany
- Institute of Human Genetics, Technical University of Munich, Munich, Germany
- Institute of Computational Biology, Helmholtz Zentrum München, Neuherberg, Germany
|
72
|
McClements ME, Butt A, Piotter E, Peddle CF, MacLaren RE. An analysis of the Kozak consensus in retinal genes and its relevance to gene therapy. Mol Vis 2021; 27:233-242. [PMID: 34012226 PMCID: PMC8116250] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2020] [Accepted: 05/06/2021] [Indexed: 11/24/2022] Open
Abstract
PURPOSE: The classic Kozak consensus is a critical genetic element included in gene therapy transgenes to encourage the translation of the therapeutic coding sequence. Despite optimizations of other transgene elements, the Kozak consensus has not yet been considered for potential tissue-specific sequence refinement. We screened the -9 to -1 region relative to the AUG start codon of retina-specific genes to identify whether a Kozak consensus that is different from the classic sequence may be more appropriate for inclusion in gene therapy transgenes that treat inherited retinal disease.
METHODS: Sequences for 135 genes known to cause nonsyndromic inherited retinal disease were extracted from the NCBI database, and the -9 to -1 nucleotides were compared. This panel was then refined to 75 genes with specific retinal functions, for which the -9 to -1 nucleotides were placed in front of a GFP transcript sequence and RNAfold predictions performed. These were compared with a GFP sequence with the classic Kozak consensus (GCCGCCACC), and sequences from retinal genes with minimum free energy (MFE) predictions greater than the reference sequence were selected to generate an optimized Kozak consensus sequence. The original Kozak consensus and the refined retina Kozak consensus were placed upstream of the Renilla luciferase coding sequence, which were used to transfect retinoblastoma cell lines Y-79 and WERI-RB-1 and HEK 293T/17 cells.
RESULTS: The nucleotide frequencies of the original panel of genes were determined to be comparable to the classic Kozak consensus. RNAfold analysis of a GFP transcript with the classic Kozak sequence in the 5' untranslated region (UTR) generated an MFE prediction of -503.3 kcal/mol. RNAfold analysis was then performed with a GFP transcript containing each -9 to -1 Kozak sequence of 75 retinal genes. Thirty-eight of the 75 genes provided a greater MFE value than -503.3 kcal/mol and exhibited an absence of stable secondary structures before the AUG codon. The -9 to -1 nucleotide frequencies of these genes identified a Kozak consensus of ACCGAGACC, differing from the classic Kozak consensus at positions -9, -5, and -4. Applying this sequence to the GFP transcript increased the MFE prediction to -500.1 kcal/mol. The newly identified retina Kozak sequence was also applied to Renilla luciferase plus the REP1 and RPGR transcripts used in current clinical trials. In all examples, the predicted transcript MFE score increased when compared with the current transcript sequences containing classic Kozak consensus sequences. In vitro transfections identified a 7%-9% increase in Renilla activity when incorporating the optimized Kozak sequence.
CONCLUSIONS: The Kozak consensus is a critical element of eukaryotic genes; therefore, it is a required feature of gene therapy transgenes. To date, the classic sequence of GCCRCC (-6 to -1) has typically been incorporated in gene therapy transgenes, but the analysis described here suggests that, for vectors targeting the retina, using a Kozak consensus derived from retinal genes can provide increased expression of the target product.
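The consensus step described in the abstract (taking the most frequent nucleotide at each of the -9 to -1 positions, then comparing against the classic sequence) can be sketched as follows. This is an illustration only: the helper names `consensus` and `differing_positions` and the three toy input sequences are hypothetical, and only the two 9-nt consensus strings come from the abstract. The RNAfold filtering step is not reproduced here.

```python
from collections import Counter

CLASSIC_KOZAK = "GCCGCCACC"  # -9 to -1, as quoted in the abstract
RETINA_KOZAK = "ACCGAGACC"   # -9 to -1, as quoted in the abstract

def consensus(sequences):
    """Return the most frequent nucleotide at each position.

    All sequences must have the same length. Ties are resolved in favour of
    the nucleotide seen first, which is an arbitrary choice of this sketch.
    """
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("all sequences must have the same length")
    return "".join(
        Counter(s[i] for s in sequences).most_common(1)[0][0]
        for i in range(length)
    )

def differing_positions(a, b):
    """Positions, numbered -len(a)..-1 relative to the AUG, where a and b differ."""
    return [i - len(a) for i in range(len(a)) if a[i] != b[i]]

# Hypothetical toy panel of three upstream sequences (not real gene data).
print(consensus(["ACC", "ACG", "GCC"]))
# Compare the two consensus sequences reported in the abstract.
print(differing_positions(CLASSIC_KOZAK, RETINA_KOZAK))
```

Comparing the two quoted sequences this way reproduces the positions named in the abstract (-9, -5, and -4).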
|
73
|
Xu C, Zhang J. Mammalian Alternative Translation Initiation Is Mostly Nonadaptive. Mol Biol Evol 2021; 37:2015-2028. [PMID: 32145028 DOI: 10.1093/molbev/msaa063] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 01/03/2023] Open
Abstract
Alternative translation initiation (ATLI) refers to the existence of multiple translation initiation sites per gene and is a widespread phenomenon in eukaryotes. ATLI is commonly assumed to be advantageous through creating proteome diversity or regulating protein synthesis. We here propose an alternative hypothesis that ATLI arises primarily from nonadaptive initiation errors presumably due to the limited ability of ribosomes to distinguish sequence motifs truly signaling translation initiation from similar sequences. Our hypothesis, but not the adaptive hypothesis, predicts a series of global patterns of ATLI, all of which are confirmed at the genomic scale by quantitative translation initiation sequencing in multiple human and mouse cell lines and tissues. Similarly, although many codons differing from AUG by one nucleotide can serve as start codons, our analysis suggests that using non-AUG start codons is mostly disadvantageous. These and other findings strongly suggest that ATLI predominantly results from molecular error, requiring a major revision of our understanding of the precision and regulation of translation initiation.
Affiliation(s)
- Chuan Xu
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI
- Jianzhi Zhang
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI
|
74
|
Reprogramming translation for gene therapy. Prog Mol Biol Transl Sci 2021; 182:439-476. [PMID: 34175050 DOI: 10.1016/bs.pmbts.2021.01.028] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/20/2022]
Abstract
Translational control plays a fundamental role in the regulation of gene expression in eukaryotes. Modulating translational efficiency allows the cell to fine-tune the expression of genes, spatially control protein localization, and trigger fast responses to environmental stresses. Translational regulation involves mechanisms acting on multiple steps of the protein synthesis pathway: initiation, elongation, and termination. Many cis-acting elements present in the 5' UTR of transcripts can influence translation at the initiation step. Among them, the Kozak sequence impacts translational efficiency by regulating the recognition of the start codon; upstream open reading frames (uORFs) are associated with inhibition of translation of the downstream protein; internal ribosomal entry sites (IRESs) can promote cap-independent translation. CRISPR-Cas technology is a revolutionary gene-editing tool that has also been applied to the regulation of gene expression. In this chapter, we focus on the genome editing approaches developed to modulate the translational efficiency with the aim to find novel therapeutic approaches, in particular acting on the cis-elements, that regulate the initiation of protein synthesis.
|
75
|
Torma G, Tombácz D, Csabai Z, Göbhardter D, Deim Z, Snyder M, Boldogkői Z. An Integrated Sequencing Approach for Updating the Pseudorabies Virus Transcriptome. Pathogens 2021; 10:pathogens10020242. [PMID: 33672563 PMCID: PMC7924054 DOI: 10.3390/pathogens10020242] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 12/22/2020] [Revised: 02/17/2021] [Accepted: 02/18/2021] [Indexed: 01/06/2023] Open
Abstract
In the last couple of years, the implementation of long-read sequencing (LRS) technologies for transcriptome profiling has uncovered an extreme complexity of viral gene expression. In this study, we carried out a systematic analysis on the pseudorabies virus transcriptome by combining our current data obtained by using Pacific Biosciences Sequel and Oxford Nanopore Technologies MinION sequencing with our earlier data generated by other LRS and short-read sequencing techniques. As a result, we identified a number of novel genes, transcripts, and transcript isoforms, including splice and length variants, and also confirmed earlier annotated RNA molecules. One of the major findings of this study is the discovery of a large number of 5′-truncations of larger putative mRNAs being 3′-co-terminal with canonical mRNAs of PRV. A large fraction of these putative RNAs contain in-frame ATGs, which might initiate translation of N-terminally truncated polypeptides. Our analyses indicate that CTO-S, a replication origin-associated RNA molecule is expressed at an extremely high level. This study demonstrates that the PRV transcriptome is much more complex than previously appreciated.
Affiliation(s)
- Gábor Torma
- Department of Medical Biology, Faculty of Medicine, University of Szeged, 6720 Szeged, Hungary
- Dóra Tombácz
- Department of Medical Biology, Faculty of Medicine, University of Szeged, 6720 Szeged, Hungary
- Department of Genetics, School of Medicine, Stanford University, Stanford, CA 94304, USA
- Zsolt Csabai
- Department of Medical Biology, Faculty of Medicine, University of Szeged, 6720 Szeged, Hungary
- Dániel Göbhardter
- Department of Medical Biology, Faculty of Medicine, University of Szeged, 6720 Szeged, Hungary
- Zoltán Deim
- Department of Biotechnology, Faculty of Science and Informatics, University of Szeged, 6726 Szeged, Hungary
- Michael Snyder
- Department of Genetics, School of Medicine, Stanford University, Stanford, CA 94304, USA
- Zsolt Boldogkői
- Department of Medical Biology, Faculty of Medicine, University of Szeged, 6720 Szeged, Hungary
|
76
|
Zhang H, Wang Y, Wu X, Tang X, Wu C, Lu J. Determinants of genome-wide distribution and evolution of uORFs in eukaryotes. Nat Commun 2021; 12:1076. [PMID: 33597535 PMCID: PMC7889888 DOI: 10.1038/s41467-021-21394-y] [Citation(s) in RCA: 47] [Impact Index Per Article: 11.8] [Received: 10/30/2019] [Accepted: 01/20/2021] [Indexed: 01/02/2023] Open
Abstract
Upstream open reading frames (uORFs) play widespread regulatory functions in modulating mRNA translation in eukaryotes, but the principles underlying the genomic distribution and evolution of uORFs remain poorly understood. Here, we analyze ~17 million putative canonical uORFs in 478 eukaryotic species that span most of the extant taxa of eukaryotes. We demonstrate how positive and purifying selection, coupled with differences in effective population size (Ne), has shaped the contents of uORFs in eukaryotes. Besides, gene expression level is important in influencing uORF occurrences across genes in a species. Our analyses suggest that most uORFs might play regulatory roles rather than encode functional peptides. We also show that the Kozak sequence context of uORFs has evolved across eukaryotic clades, and that noncanonical uORFs tend to have weaker suppressive effects than canonical uORFs in translation regulation. This study provides insights into the driving forces underlying uORF evolution in eukaryotes.
Affiliation(s)
- Hong Zhang
- State Key Laboratory of Protein and Plant Gene Research, Center for Bioinformatics, School of Life Sciences, Peking University, Beijing, China
- Yirong Wang
- State Key Laboratory of Protein and Plant Gene Research, Center for Bioinformatics, School of Life Sciences, Peking University, Beijing, China
- College of Biology, Hunan University, Changsha, China
- Xinkai Wu
- State Key Laboratory of Protein and Plant Gene Research, Center for Bioinformatics, School of Life Sciences, Peking University, Beijing, China
- Xiaolu Tang
- State Key Laboratory of Protein and Plant Gene Research, Center for Bioinformatics, School of Life Sciences, Peking University, Beijing, China
- Changcheng Wu
- State Key Laboratory of Protein and Plant Gene Research, Center for Bioinformatics, School of Life Sciences, Peking University, Beijing, China
- Jian Lu
- State Key Laboratory of Protein and Plant Gene Research, Center for Bioinformatics, School of Life Sciences, Peking University, Beijing, China
|
77
|
Li YR, Liu MJ. Prevalence of alternative AUG and non-AUG translation initiators and their regulatory effects across plants. Genome Res 2020; 30:1418-1433. [PMID: 32973042 PMCID: PMC7605272 DOI: 10.1101/gr.261834.120] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 01/29/2020] [Accepted: 08/19/2020] [Indexed: 12/11/2022]
Abstract
Translation initiation is a key step determining protein synthesis. Studies have uncovered a number of alternative translation initiation sites (TISs) in mammalian mRNAs and showed their roles in reshaping the proteome. However, the extent to which alternative TISs affect gene expression across plants remains largely unclear. Here, by profiling initiating ribosome positions, we globally identified in vivo TISs in tomato and Arabidopsis and found thousands of genes with more than one TIS. Of the identified TISs, >19% and >20% were located at unannotated AUG and non-AUG sites, respectively. CUG and ACG were the most frequently observed codons at non-AUG TISs, a phenomenon also found in mammals. In addition, although alternative TISs were usually found in both orthologous genes, the TIS sequences were not conserved, suggesting the conservation of alternative initiation mechanisms but flexibility in using TISs. Unlike upstream AUG TISs, the presence of upstream non-AUG TISs was not correlated with the translational repression of main open reading frames, a pattern observed across plants. Also, the generation of proteins with diverse N-terminal regions through the use of alternative TISs contributes to differential subcellular localization, as mutating alternative TISs resulted in the loss of organelle localization. Our findings uncovered the hidden coding potential of plant genomes and, importantly, the constraint and flexibility of translational initiation mechanisms in the regulation of gene expression across plant species.
Affiliation(s)
- Ya-Ru Li
- Biotechnology Center in Southern Taiwan, Academia Sinica, Tainan 741, Taiwan
- Ming-Jung Liu
- Biotechnology Center in Southern Taiwan, Academia Sinica, Tainan 741, Taiwan
- Agricultural Biotechnology Research Center, Academia Sinica, Taipei 115, Taiwan
|
78
|
Unusually efficient CUG initiation of an overlapping reading frame in POLG mRNA yields novel protein POLGARF. Proc Natl Acad Sci U S A 2020; 117:24936-24946. [PMID: 32958672 DOI: 10.1073/pnas.2001433117] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Indexed: 01/01/2023] Open
Abstract
While near-cognate codons are frequently used for translation initiation in eukaryotes, their efficiencies are usually low (<10% compared to an AUG in optimal context). Here, we describe a rare case of highly efficient near-cognate initiation. A CUG triplet located in the 5' leader of POLG messenger RNA (mRNA) initiates almost as efficiently (∼60 to 70%) as an AUG in optimal context. This CUG directs translation of a conserved 260-triplet-long overlapping open reading frame (ORF), which we call POLGARF (POLG Alternative Reading Frame). Translation of a short upstream ORF 5' of this CUG governs the ratio between POLG (the catalytic subunit of mitochondrial DNA polymerase) and POLGARF synthesized from a single POLG mRNA. Functional investigation of POLGARF suggests a role in extracellular signaling. While unprocessed POLGARF localizes to the nucleoli together with its interacting partner C1QBP, serum stimulation results in rapid cleavage and secretion of a POLGARF C-terminal fragment. Phylogenetic analysis shows that POLGARF evolved ∼160 million y ago due to a mammalian-wide interspersed repeat (MIR) transposition into the 5' leader sequence of the mammalian POLG gene, which became fixed in placental mammals. This discovery of POLGARF unveils a previously undescribed mechanism of de novo protein-coding gene evolution.
|
79
|
Dever TE, Ivanov IP, Sachs MS. Conserved Upstream Open Reading Frame Nascent Peptides That Control Translation. Annu Rev Genet 2020; 54:237-264. [PMID: 32870728 DOI: 10.1146/annurev-genet-112618-043822] [Citation(s) in RCA: 51] [Impact Index Per Article: 10.2] [Indexed: 12/13/2022]
Abstract
Cells utilize transcriptional and posttranscriptional mechanisms to alter gene expression in response to environmental cues. Gene-specific controls, including changing the translation of specific messenger RNAs (mRNAs), provide a rapid means to respond precisely to different conditions. Upstream open reading frames (uORFs) are known to control the translation of mRNAs. Recent studies in bacteria and eukaryotes have revealed the functions of evolutionarily conserved uORF-encoded peptides. Some of these uORF-encoded nascent peptides enable responses to specific metabolites to modulate the translation of their mRNAs by stalling ribosomes and through ribosome stalling may also modulate the level of their mRNAs. In this review, we highlight several examples of conserved uORF nascent peptides that stall ribosomes to regulate gene expression in response to specific metabolites in bacteria, fungi, mammals, and plants.
Affiliation(s)
- Thomas E Dever
- Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, Maryland 20892, USA
- Ivaylo P Ivanov
- Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, Maryland 20892, USA
- Matthew S Sachs
- Department of Biology, Texas A&M University, College Station, Texas 77843, USA
|
80
|
Akirtava C, McManus CJ. Control of translation by eukaryotic mRNA transcript leaders-Insights from high-throughput assays and computational modeling. Wiley Interdiscip Rev RNA 2020; 12:e1623. [PMID: 32869519 DOI: 10.1002/wrna.1623] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 04/22/2020] [Revised: 07/23/2020] [Accepted: 07/30/2020] [Indexed: 12/21/2022]
Abstract
Eukaryotic gene expression is tightly regulated during translation of mRNA to protein. Mis-regulation of translation can lead to aberrant proteins which accumulate in cancers and cause neurodegenerative diseases. Foundational studies on model genes established fundamental roles for mRNA 5' transcript leader (TL) sequences in controlling ribosome recruitment, scanning, and initiation. TL cis-regulatory elements and their corresponding trans-acting factors control cap-dependent initiation under unstressed conditions. Under stress, cap-dependent initiation is suppressed, and specific mRNA structures and sequences promote translation of stress-responsive transcripts to remodel the proteome. In this review, we summarize current knowledge of TL functions in translation initiation. We focus on insights from high-throughput analyses of ribosome occupancy, mRNA structure, RNA Binding Protein occupancy, and massively parallel reporter assays. These data-driven approaches, coupled with computational analyses and modeling, have paved the way for a comprehensive understanding of TL functions. Finally, we will discuss areas of future research on the roles of mRNA sequences and structures in translation. This article is categorized under: Translation > Translation Mechanisms; RNA Evolution and Genomics > Computational Analyses of RNA; RNA Structure and Dynamics > Influence of RNA Structure in Biological Systems.
Affiliation(s)
- Christina Akirtava
- Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA
- Charles Joel McManus
- Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA
- Computational Biology Department, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA
|
81
|
Decoding mRNA translatability and stability from the 5' UTR. Nat Struct Mol Biol 2020; 27:814-821. [PMID: 32719458 DOI: 10.1038/s41594-020-0465-x] [Citation(s) in RCA: 130] [Impact Index Per Article: 26.0] [Received: 02/04/2020] [Accepted: 06/16/2020] [Indexed: 11/09/2022]
Abstract
Precise control of protein synthesis by engineering sequence elements in 5' untranslated regions (5' UTRs) remains a fundamental challenge. To accelerate our understanding of the cis-regulatory code embedded in 5' UTRs, we devised massively parallel reporter assays from a synthetic messenger RNA library composed of over one million 5' UTR variants. A completely randomized 10-nucleotide sequence preceding an upstream open reading frame (uORF) and downstream GFP drives a broad range of translational outputs and mRNA stability in mammalian cells. While efficient translation protects mRNA from degradation, uORF translation triggers mRNA decay in a UPF1-dependent manner. We also identified translational inhibitory elements with G-quadruplexes as marks for mRNA decay in P-bodies. Unexpectedly, an unstructured A-rich element in 5' UTRs destabilizes mRNAs in the absence of translation, although it enables cap-independent translation. Our results not only identify diverse sequence features of 5' UTRs that control mRNA translatability, but they also reveal ribosome-dependent and ribosome-independent mRNA-surveillance pathways.
|
82
|
Translation initiation downstream from annotated start codons in human mRNAs coevolves with the Kozak context. Genome Res 2020; 30:974-984. [PMID: 32669370 PMCID: PMC7397870 DOI: 10.1101/gr.257352.119] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 09/18/2019] [Accepted: 06/25/2020] [Indexed: 12/13/2022]
Abstract
Eukaryotic translation initiation involves preinitiation ribosomal complex 5′-to-3′ directional probing of mRNA for codons suitable for starting protein synthesis. The recognition of codons as starts depends on the codon identity and on its immediate nucleotide context known as Kozak context. When the context is weak (i.e., nonoptimal), leaky scanning takes place during which a fraction of ribosomes continues the mRNA probing. We explored the relationship between the context of AUG codons annotated as starts of protein-coding sequences and the next AUG codon occurrence. We found that AUG codons downstream from weak starts occur in the same frame more frequently than downstream from strong starts. We suggest that evolutionary selection on in-frame AUGs downstream from weak start codons is driven by the advantage of the reduction of wasteful out-of-frame product synthesis and also by the advantage of producing multiple proteoforms from certain mRNAs. We confirmed translation initiation downstream from weak start codons using ribosome profiling data. We also tested translation of alternative start codons in 10 specific human genes using reporter constructs. In all tested cases, initiation at downstream start codons was more productive than at the annotated ones. In most cases, optimization of Kozak context did not completely abolish downstream initiation, and in the specific example of CMPK1 mRNA, the optimized start remained unproductive. Collectively, our work reveals previously uncharacterized forces shaping the evolution of protein-coding genes and points to the plurality of translation initiation and the existence of sequence features influencing start codon selection, other than Kozak context.
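The relationship examined in this abstract, between the context of the annotated AUG and the frame of the next AUG downstream, can be illustrated with a short sketch. The abstract does not give its scoring scheme, so the sketch assumes the widely used simplification that a context is strong when position -3 is a purine and position +4 is G, adequate when only one holds, and weak when neither holds. The function names and the example sequences are hypothetical.

```python
def kozak_strength(mrna, start):
    """Classify the context of the AUG at 0-based index `start`.

    Assumption: only positions -3 (purine) and +4 (G) are scored.
    """
    minus3 = mrna[start - 3] if start >= 3 else None
    plus4 = mrna[start + 3] if start + 3 < len(mrna) else None
    hits = int(minus3 in ("A", "G")) + int(plus4 == "G")
    return ("weak", "adequate", "strong")[hits]

def next_aug_frame(mrna, start):
    """Find the first AUG downstream of the annotated start codon.

    Returns (position, in_frame), or None if no downstream AUG exists.
    in_frame is True when the downstream AUG shares the annotated reading frame.
    """
    if mrna[start:start + 3] != "AUG":
        raise ValueError("no AUG at the annotated start")
    pos = mrna.find("AUG", start + 3)
    if pos == -1:
        return None
    return pos, (pos - start) % 3 == 0

# Hypothetical toy transcripts, not real human mRNAs.
print(kozak_strength("UUUUUUAUGC", 6), next_aug_frame("AUGGCCAUGUAA", 0))
```

Applied across a transcriptome, tabulating `in_frame` against `kozak_strength` would give the kind of comparison the abstract reports, where downstream AUGs are more often in frame after weak starts.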
|
83
|
Li D, Wang J. Ribosome heterogeneity in stem cells and development. J Cell Biol 2020; 219:e202001108. [PMID: 32330234 PMCID: PMC7265316 DOI: 10.1083/jcb.202001108] [Citation(s) in RCA: 68] [Impact Index Per Article: 13.6] [Received: 01/30/2020] [Revised: 04/03/2020] [Accepted: 04/06/2020] [Indexed: 02/08/2023] Open
Abstract
Translation control is critical to regulate protein expression. By directly adjusting protein levels, cells can quickly respond to dynamic transitions during stem cell differentiation and embryonic development. Ribosomes are multisubunit cellular assemblies that mediate translation. Previously seen as invariant machines with the same composition of components in all conditions, recent studies indicate that ribosomes are heterogeneous and that different ribosome types can preferentially translate specific subsets of mRNAs. Such heterogeneity and specialized translation functions are very important in stem cells and development, as they allow cells to quickly respond to stimuli through direct changes of protein abundance. In this review, we discuss ribosome heterogeneity that arises from multiple features of rRNAs, including rRNA variants and rRNA modifications, and ribosomal proteins, including their stoichiometry, compositions, paralogues, and posttranslational modifications. We also discuss alterations of ribosome-associated proteins (RAPs), with a particular focus on their consequent specialized translational control in stem cells and development.
Affiliation(s)
- Dan Li
- Department of Cell, Developmental and Regenerative Biology, The Black Family Stem Cell Institute, Icahn School of Medicine at Mount Sinai, New York, NY
- The Graduate School of Biomedical Sciences, Icahn School of Medicine at Mount Sinai, New York, NY
- Jianlong Wang
- Department of Cell, Developmental and Regenerative Biology, The Black Family Stem Cell Institute, Icahn School of Medicine at Mount Sinai, New York, NY
- The Graduate School of Biomedical Sciences, Icahn School of Medicine at Mount Sinai, New York, NY
- Department of Medicine, Columbia Center for Human Development, Columbia University Irving Medical Center, New York, NY
|
84
|
Whiffin N, Karczewski KJ, Zhang X, Chothani S, Smith MJ, Evans DG, Roberts AM, Quaife NM, Schafer S, Rackham O, Alföldi J, O'Donnell-Luria AH, Francioli LC, Cook SA, Barton PJR, MacArthur DG, Ware JS. Characterising the loss-of-function impact of 5' untranslated region variants in 15,708 individuals. Nat Commun 2020; 11:2523. [PMID: 32461616 PMCID: PMC7253449 DOI: 10.1038/s41467-019-10717-9] [Citation(s) in RCA: 98] [Impact Index Per Article: 19.6] [Received: 03/06/2019] [Accepted: 05/23/2019] [Indexed: 01/17/2023] Open
Abstract
Upstream open reading frames (uORFs) are tissue-specific cis-regulators of protein translation. Isolated reports have shown that variants that create or disrupt uORFs can cause disease. Here, in a systematic genome-wide study using 15,708 whole genome sequences, we show that variants that create new upstream start codons, and variants disrupting stop sites of existing uORFs, are under strong negative selection. This selection signal is significantly stronger for variants arising upstream of genes intolerant to loss-of-function variants. Furthermore, variants creating uORFs that overlap the coding sequence show signals of selection equivalent to coding missense variants. Finally, we identify specific genes where modification of uORFs likely represents an important disease mechanism, and report a novel uORF frameshift variant upstream of NF2 in neurofibromatosis. Our results highlight uORF-perturbing variants as an under-recognised functional class that contribute to penetrant human disease, and demonstrate the power of large-scale population sequencing data in studying non-coding variant classes.
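The variant classes named in this abstract (new upstream start codons, disrupted uORF stop sites, and uORFs that overlap the coding sequence) can be made concrete with a small scanner. This is a sketch under simple assumptions, not the study's pipeline: it works on a DNA string holding the 5'UTR followed by the CDS, considers only ATG starts, and uses the hypothetical labels "uORF", "overlapping ORF", and "N-terminal extension". The function name and the toy sequences are illustrative.

```python
STOPS = {"TAA", "TAG", "TGA"}

def classify_upstream_atgs(seq, cds_start):
    """Classify every ATG that lies in the 5'UTR, i.e. in seq[:cds_start].

    Returns a list of (atg_position, stop_position_or_None, label), where the
    label describes how the upstream reading frame ends relative to the CDS.
    """
    results = []
    pos = seq.find("ATG")
    while pos != -1 and pos < cds_start:
        stop = None
        # Walk codon by codon in the frame set by this upstream ATG.
        for i in range(pos + 3, len(seq) - 2, 3):
            if seq[i:i + 3] in STOPS:
                stop = i
                break
        if stop is not None and stop + 3 <= cds_start:
            label = "uORF"                  # terminates inside the 5'UTR
        elif (cds_start - pos) % 3 == 0:
            label = "N-terminal extension"  # in frame with the CDS
        else:
            label = "overlapping ORF"       # out of frame, runs into the CDS
        results.append((pos, stop, label))
        pos = seq.find("ATG", pos + 1)
    return results

# Hypothetical toy sequence: 13-nt 5'UTR followed by a short CDS.
reference = "CCATGAAATAGCCATGGCCTAA"
# Same sequence with the uORF stop codon TAG changed to CAG.
stop_lost = "CCATGAAACAGCCATGGCCTAA"
print(classify_upstream_atgs(reference, 13))
print(classify_upstream_atgs(stop_lost, 13))
```

In the toy pair, losing the uORF stop turns a contained uORF into a reading frame that overlaps the coding sequence, which is the kind of change the abstract describes as being under strong negative selection.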
Affiliation(s)
- Nicola Whiffin
- National Heart and Lung Institute and MRC London Institute of Medical Sciences, Imperial College London, Du Cane Road, London, W12 0NN, UK.
- NIHR Royal Brompton Cardiovascular Research Centre, Royal Brompton and Harefield National Health Service Foundation Trust, Sydney Street, London, SW3 6NP, UK.
- Medical and Population Genetics, Broad Institute of MIT and Harvard, 415 Main Street, Cambridge, MA, 02142, USA.
- Konrad J Karczewski
- Medical and Population Genetics, Broad Institute of MIT and Harvard, 415 Main Street, Cambridge, MA, 02142, USA
- Analytical and Translational Genetics Unit, Massachusetts General Hospital, 55 Fruit Street, Boston, MA, 02114, USA
- Xiaolei Zhang
- National Heart and Lung Institute and MRC London Institute of Medical Sciences, Imperial College London, Du Cane Road, London, W12 0NN, UK
- NIHR Royal Brompton Cardiovascular Research Centre, Royal Brompton and Harefield National Health Service Foundation Trust, Sydney Street, London, SW3 6NP, UK
- Sonia Chothani
- Program in Cardiovascular and Metabolic Disorders, Duke-NUS Medical School, 8 College Road, Singapore, 169857, Singapore
- Miriam J Smith
- NW Genomic Laboratory Hub, Centre for Genomic Medicine, Division of Evolution and Genomic Science, St Mary's Hospital, University of Manchester, Oxford Road, Manchester, M13 9WL, UK
- D Gareth Evans
- NW Genomic Laboratory Hub, Centre for Genomic Medicine, Division of Evolution and Genomic Science, St Mary's Hospital, University of Manchester, Oxford Road, Manchester, M13 9WL, UK
- Angharad M Roberts
- National Heart and Lung Institute and MRC London Institute of Medical Sciences, Imperial College London, Du Cane Road, London, W12 0NN, UK
- NIHR Royal Brompton Cardiovascular Research Centre, Royal Brompton and Harefield National Health Service Foundation Trust, Sydney Street, London, SW3 6NP, UK
- Nicholas M Quaife
- National Heart and Lung Institute and MRC London Institute of Medical Sciences, Imperial College London, Du Cane Road, London, W12 0NN, UK
- NIHR Royal Brompton Cardiovascular Research Centre, Royal Brompton and Harefield National Health Service Foundation Trust, Sydney Street, London, SW3 6NP, UK
| | - Sebastian Schafer
- Program in Cardiovascular and Metabolic Disorders, Duke-NUS Medical School, 8 College Road, Singapore, 169857, Singapore
- National Heart Centre Singapore, 5 Hospital Drive, Singapore, 169609, Singapore
| | - Owen Rackham
- Program in Cardiovascular and Metabolic Disorders, Duke-NUS Medical School, 8 College Road, Singapore, 169857, Singapore
| | - Jessica Alföldi
- Medical and Population Genetics, Broad Institute of MIT and Harvard, 415 Main Street, Cambridge, MA, 02142, USA
- Analytical and Translational Genetics Unit, Massachusetts General Hospital, 55 Fruit Street, Boston, MA, 02114, USA
| | - Anne H O'Donnell-Luria
- Medical and Population Genetics, Broad Institute of MIT and Harvard, 415 Main Street, Cambridge, MA, 02142, USA
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, 02115, USA
- Department of Pediatrics, Harvard Medical School, Boston, MA, 02115, USA
| | - Laurent C Francioli
- Medical and Population Genetics, Broad Institute of MIT and Harvard, 415 Main Street, Cambridge, MA, 02142, USA
- Analytical and Translational Genetics Unit, Massachusetts General Hospital, 55 Fruit Street, Boston, MA, 02114, USA
| | - Stuart A Cook
- National Heart and Lung Institute and MRC London Institute of Medical Sciences, Imperial College London, Du Cane Road, London, W12 0NN, UK
- Program in Cardiovascular and Metabolic Disorders, Duke-NUS Medical School, 8 College Road, Singapore, 169857, Singapore
- National Heart Centre Singapore, 5 Hospital Drive, Singapore, 169609, Singapore
| | - Paul J R Barton
- National Heart and Lung Institute and MRC London Institute of Medical Sciences, Imperial College London, Du Cane Road, London, W12 0NN, UK
- NIHR Royal Brompton Cardiovascular Research Centre, Royal Brompton and Harefield National Health Service Foundation Trust, Sydney Street, London, SW3 6NP, UK
| | - Daniel G MacArthur
- Medical and Population Genetics, Broad Institute of MIT and Harvard, 415 Main Street, Cambridge, MA, 02142, USA
- Analytical and Translational Genetics Unit, Massachusetts General Hospital, 55 Fruit Street, Boston, MA, 02114, USA
- Centre for Population Genomics, Garvan Institute of Medical Research, and UNSW Sydney, Sydney, Australia
- Centre for Population Genomics, Murdoch Children's Research Institute, Melbourne, Australia
| | - James S Ware
- National Heart and Lung Institute and MRC London Institute of Medical Sciences, Imperial College London, Du Cane Road, London, W12 0NN, UK
- NIHR Royal Brompton Cardiovascular Research Centre, Royal Brompton and Harefield National Health Service Foundation Trust, Sydney Street, London, SW3 6NP, UK
- Medical and Population Genetics, Broad Institute of MIT and Harvard, 415 Main Street, Cambridge, MA, 02142, USA
|
85
|
Brunet MA, Brunelle M, Lucier JF, Delcourt V, Levesque M, Grenier F, Samandi S, Leblanc S, Aguilar JD, Dufour P, Jacques JF, Fournier I, Ouangraoua A, Scott MS, Boisvert FM, Roucou X. OpenProt: a more comprehensive guide to explore eukaryotic coding potential and proteomes. Nucleic Acids Res 2019; 47:D403-D410. [PMID: 30299502 PMCID: PMC6323990 DOI: 10.1093/nar/gky936] [Citation(s) in RCA: 41] [Impact Index Per Article: 8.2] [Received: 07/19/2018] [Accepted: 10/04/2018] [Indexed: 01/06/2023] Open
Abstract
Advances in proteomics and sequencing have highlighted many non-annotated open reading frames (ORFs) in eukaryotic genomes. Genome annotations, cornerstones of today's research, mostly rely on protein prior knowledge and on ab initio prediction algorithms. Such algorithms notably enforce an arbitrary criterion of one coding sequence (CDS) per transcript, leading to a substantial underestimation of the coding potential of eukaryotes. Here, we present OpenProt, the first database fully endorsing a polycistronic model of eukaryotic genomes to date. OpenProt contains all possible ORFs longer than 30 codons across 10 species, and cumulates supporting evidence such as protein conservation, translation and expression. OpenProt annotates all known proteins (RefProts), novel predicted isoforms (Isoforms) and novel predicted proteins from alternative ORFs (AltProts). It incorporates cutting-edge algorithms to evaluate protein orthology and re-interrogate publicly available ribosome profiling and mass spectrometry datasets, supporting the annotation of thousands of predicted ORFs. The constantly growing database currently cumulates evidence from 87 ribosome profiling and 114 mass spectrometry studies from several species, tissues and cell lines. All data is freely available and downloadable from a web platform (www.openprot.org) supporting a genome browser and advanced queries for each species. Thus, OpenProt enables a more comprehensive landscape of eukaryotic genomes’ coding potential.
Affiliation(s)
- Marie A Brunet
- Department of Biochemistry, Université de Sherbrooke, Sherbrooke, Québec, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Université de Lille, F-59000 Lille, France
| | - Mylène Brunelle
- Department of Biochemistry, Université de Sherbrooke, Sherbrooke, Québec, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Université de Lille, F-59000 Lille, France
| | - Jean-François Lucier
- Center for Computational Science, Université de Sherbrooke, Sherbrooke, Québec, Canada; Biology Department, Université de Sherbrooke, Sherbrooke, Québec, Canada
| | - Vivian Delcourt
- Department of Biochemistry, Université de Sherbrooke, Sherbrooke, Québec, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Université de Lille, F-59000 Lille, France; INSERM U1192, Laboratoire Protéomique, Réponse Inflammatoire & Spectrométrie de Masse (PRISM), Université de Lille, F-59000 Lille, France
| | - Maxime Levesque
- Center for Computational Science, Université de Sherbrooke, Sherbrooke, Québec, Canada; Biology Department, Université de Sherbrooke, Sherbrooke, Québec, Canada
| | - Frédéric Grenier
- Center for Computational Science, Université de Sherbrooke, Sherbrooke, Québec, Canada; Biology Department, Université de Sherbrooke, Sherbrooke, Québec, Canada
| | - Sondos Samandi
- Department of Biochemistry, Université de Sherbrooke, Sherbrooke, Québec, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Université de Lille, F-59000 Lille, France
| | - Sébastien Leblanc
- Department of Biochemistry, Université de Sherbrooke, Sherbrooke, Québec, Canada
| | - Jean-David Aguilar
- Department of Biochemistry, Université de Sherbrooke, Sherbrooke, Québec, Canada
| | - Pascal Dufour
- Department of Biochemistry, Université de Sherbrooke, Sherbrooke, Québec, Canada
| | - Jean-Francois Jacques
- Department of Biochemistry, Université de Sherbrooke, Sherbrooke, Québec, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Université de Lille, F-59000 Lille, France
| | - Isabelle Fournier
- INSERM U1192, Laboratoire Protéomique, Réponse Inflammatoire & Spectrométrie de Masse (PRISM), Université de Lille, F-59000 Lille, France
| | - Aida Ouangraoua
- Informatics Department, Université de Sherbrooke, Sherbrooke, Québec, Canada
| | - Michelle S Scott
- Department of Biochemistry, Université de Sherbrooke, Sherbrooke, Québec, Canada
| | | | - Xavier Roucou
- Department of Biochemistry, Université de Sherbrooke, Sherbrooke, Québec, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Université de Lille, F-59000 Lille, France
|
86
|
Blanco N, Williams AJ, Tang D, Zhan D, Misaghi S, Kelley RF, Simmons LC. Tailoring translational strength using Kozak sequence variants improves bispecific antibody assembly and reduces product‐related impurities in CHO cells. Biotechnol Bioeng 2020; 117:1946-1960. [DOI: 10.1002/bit.27347] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 11/11/2019] [Revised: 03/06/2020] [Accepted: 03/29/2020] [Indexed: 12/21/2022]
Affiliation(s)
- Noelia Blanco
- Department of Cell Culture, Genentech, Inc., 1 DNA Way, South San Francisco, California
| | - Ambrose J. Williams
- Department of Purification Development, Genentech, Inc., 1 DNA Way, South San Francisco, California
| | - Danming Tang
- Department of Cell Culture, Genentech, Inc., 1 DNA Way, South San Francisco, California
| | - Dejin Zhan
- Department of Cell Culture, Genentech, Inc., 1 DNA Way, South San Francisco, California
| | - Shahram Misaghi
- Department of Cell Culture, Genentech, Inc., 1 DNA Way, South San Francisco, California
| | - Robert F. Kelley
- Department of Drug Delivery, Genentech, Inc., 1 DNA Way, South San Francisco, California
| | - Laura C. Simmons
- Department of Cell Culture, Genentech, Inc., 1 DNA Way, South San Francisco, California
|
87
|
Williams JL, Paudyal A, Awad S, Nicholson J, Grzesik D, Botta J, Meimaridou E, Maharaj AV, Stewart M, Tinker A, Cox RD, Metherell LA. Mylk3 null C57BL/6N mice develop cardiomyopathy, whereas Nnt null C57BL/6J mice do not. Life Sci Alliance 2020; 3:3/4/e201900593. [PMID: 32213617 PMCID: PMC7103425 DOI: 10.26508/lsa.201900593] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 10/30/2019] [Revised: 03/10/2020] [Accepted: 03/10/2020] [Indexed: 12/30/2022] Open
Abstract
The C57BL/6J and C57BL/6N mice have well-documented phenotypic and genotypic differences, including the infamous nicotinamide nucleotide transhydrogenase (Nnt) null mutation in the C57BL/6J substrain, which has been linked to cardiovascular traits in mice and cardiomyopathy in humans. To assess whether Nnt loss alone causes a cardiovascular phenotype, we investigated the C57BL/6N, C57BL/6J mice and a C57BL/6J-BAC transgenic rescuing NNT expression, at 3, 12, and 18 mo. We identified a modest dilated cardiomyopathy in the C57BL/6N mice, absent in the two B6J substrains. Immunofluorescent staining of cardiomyocytes revealed eccentric hypertrophy in these mice, with defects in sarcomere organisation. RNAseq analysis identified differential expression of a number of cardiac remodelling genes commonly associated with cardiac disease segregating with the phenotype. Variant calling from RNAseq data identified a myosin light chain kinase 3 (Mylk3) mutation in C57BL/6N mice, which abolishes MYLK3 protein expression. These results indicate the C57BL/6J Nnt-null mice do not develop cardiomyopathy; however, we identified a null mutation in Mylk3 as a credible cause of the cardiomyopathy phenotype in the C57BL/6N.
Affiliation(s)
- Jack L Williams
- Centre for Endocrinology, William Harvey Research Institute, Charterhouse Square, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London, UK
| | - Anju Paudyal
- Medical Research Council Harwell Institute, Mary Lyon Centre, Harwell Campus, Oxfordshire, UK
| | - Sherine Awad
- Centre for Endocrinology, William Harvey Research Institute, Charterhouse Square, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London, UK
| | - James Nicholson
- Centre for Endocrinology, William Harvey Research Institute, Charterhouse Square, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London, UK
| | - Dominika Grzesik
- Centre for Endocrinology, William Harvey Research Institute, Charterhouse Square, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London, UK
| | - Joaquin Botta
- Centre for Endocrinology, William Harvey Research Institute, Charterhouse Square, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London, UK
| | - Eirini Meimaridou
- School of Human Sciences, London Metropolitan University, London, UK
| | - Avinaash V Maharaj
- Centre for Endocrinology, William Harvey Research Institute, Charterhouse Square, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London, UK
| | - Michelle Stewart
- Medical Research Council Harwell Institute, Mary Lyon Centre, Harwell Campus, Oxfordshire, UK
| | - Andrew Tinker
- William Harvey Heart Centre, William Harvey Research Institute, Charterhouse Square, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London, UK
| | - Roger D Cox
- Medical Research Council Harwell Institute, Mammalian Genetics Unit, Harwell Campus, Oxfordshire, UK
| | - Lou A Metherell
- Centre for Endocrinology, William Harvey Research Institute, Charterhouse Square, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London, UK
|
88
|
Esposito D, Weile J, Shendure J, Starita LM, Papenfuss AT, Roth FP, Fowler DM, Rubin AF. MaveDB: an open-source platform to distribute and interpret data from multiplexed assays of variant effect. Genome Biol 2019; 20:223. [PMID: 31679514 PMCID: PMC6827219 DOI: 10.1186/s13059-019-1845-6] [Citation(s) in RCA: 155] [Impact Index Per Article: 25.8] [Received: 03/31/2019] [Accepted: 10/01/2019] [Indexed: 11/10/2022] Open
Abstract
Multiplex assays of variant effect (MAVEs), such as deep mutational scans and massively parallel reporter assays, test thousands of sequence variants in a single experiment. Despite the importance of MAVE data for basic and clinical research, there is no standard resource for their discovery and distribution. Here, we present MaveDB (https://www.mavedb.org), a public repository for large-scale measurements of sequence variant impact, designed for interoperability with applications to interpret these datasets. We also describe the first such application, MaveVis, which retrieves, visualizes, and contextualizes variant effect maps. Together, the database and applications will empower the community to mine these powerful datasets.
Affiliation(s)
- Daniel Esposito
- Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
| | - Jochen Weile
- The Donnelly Centre, University of Toronto, Toronto, ON, Canada
- Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, ON, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Department of Computer Science, University of Toronto, Toronto, ON, Canada
| | - Jay Shendure
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
| | - Lea M Starita
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
| | - Anthony T Papenfuss
- Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, University of Melbourne, Melbourne, VIC, Australia
- Bioinformatics and Cancer Genomics Laboratory, Peter MacCallum Cancer Centre, Melbourne, VIC, Australia
- Sir Peter MacCallum Department of Oncology, University of Melbourne, Melbourne, VIC, Australia
- Department of Mathematics and Statistics, University of Melbourne, Melbourne, VIC, Australia
| | - Frederick P Roth
- The Donnelly Centre, University of Toronto, Toronto, ON, Canada.
- Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, ON, Canada.
- Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada.
- Department of Computer Science, University of Toronto, Toronto, ON, Canada.
- Canadian Institute for Advanced Research, Toronto, ON, Canada.
| | - Douglas M Fowler
- Department of Genome Sciences, University of Washington, Seattle, WA, USA.
- Canadian Institute for Advanced Research, Toronto, ON, Canada.
- Department of Bioengineering, University of Washington, Seattle, WA, USA.
| | - Alan F Rubin
- Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia.
- Department of Medical Biology, University of Melbourne, Melbourne, VIC, Australia.
- Bioinformatics and Cancer Genomics Laboratory, Peter MacCallum Cancer Centre, Melbourne, VIC, Australia.
|
89
|
Kemble H, Nghe P, Tenaillon O. Recent insights into the genotype-phenotype relationship from massively parallel genetic assays. Evol Appl 2019; 12:1721-1742. [PMID: 31548853 PMCID: PMC6752143 DOI: 10.1111/eva.12846] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Received: 03/27/2019] [Revised: 06/21/2019] [Accepted: 07/02/2019] [Indexed: 12/20/2022] Open
Abstract
With the molecular revolution in biology, a mechanistic understanding of the genotype-phenotype relationship became possible. Recently, advances in DNA synthesis and sequencing have enabled the development of deep mutational scanning assays, capable of scoring comprehensive libraries of genotypes for fitness and a variety of phenotypes in massively parallel fashion. The resulting empirical genotype-fitness maps pave the way to predictive models, potentially accelerating our ability to anticipate the behaviour of pathogen and cancerous cell populations from sequencing data. Besides cellular fitness, phenotypes of direct application in industry (e.g. enzyme activity) and medicine (e.g. antibody binding) can be quantified and even selected directly by these assays. This review discusses the technological basis of and recent developments in massively parallel genetics, along with the trends it is uncovering in the genotype-phenotype relationship (distribution of mutation effects, epistasis), their possible mechanistic bases and future directions for advancing towards the goal of predictive genetics.
Affiliation(s)
- Harry Kemble
- Infection, Antimicrobials, Modelling, Evolution, INSERM, Unité Mixte de Recherche 1137, Université Paris Diderot, Université Paris Nord, Paris, France
- École Supérieure de Physique et de Chimie Industrielles de la Ville de Paris (ESPCI Paris), UMR CNRS‐ESPCI CBI 8231, PSL Research University, Paris Cedex 05, France
| | - Philippe Nghe
- École Supérieure de Physique et de Chimie Industrielles de la Ville de Paris (ESPCI Paris), UMR CNRS‐ESPCI CBI 8231, PSL Research University, Paris Cedex 05, France
| | - Olivier Tenaillon
- Infection, Antimicrobials, Modelling, Evolution, INSERM, Unité Mixte de Recherche 1137, Université Paris Diderot, Université Paris Nord, Paris, France
|
90
|
Xiang JS, Kaplan M, Dykstra P, Hinks M, McKeague M, Smolke CD. Massively parallel RNA device engineering in mammalian cells with RNA-Seq. Nat Commun 2019; 10:4327. [PMID: 31548547 PMCID: PMC6757056 DOI: 10.1038/s41467-019-12334-y] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Received: 04/07/2019] [Accepted: 08/28/2019] [Indexed: 12/21/2022] Open
Abstract
Synthetic RNA-based genetic devices dynamically control a wide range of gene-regulatory processes across diverse cell types. However, the limited throughput of quantitative assays in mammalian cells has hindered fast iteration and interrogation of sequence space needed to identify new RNA devices. Here we report developing a quantitative, rapid and high-throughput mammalian cell-based RNA-Seq assay to efficiently engineer RNA devices. We identify new ribozyme-based RNA devices that respond to theophylline, hypoxanthine, cyclic-di-GMP, and folinic acid from libraries of ~22,700 sequences in total. The small molecule responsive devices exhibit low basal expression and high activation ratios, significantly expanding our toolset of highly functional ribozyme switches. The large datasets obtained further provide conserved sequence and structure motifs that may be used for rationally guided design. The RNA-Seq approach offers a generally applicable strategy for developing broad classes of RNA devices, thereby advancing the engineering of genetic devices for mammalian systems.
Affiliation(s)
- Joy S Xiang
- Department of Bioengineering, 443 Via Ortega, MC 4245, Stanford University, Stanford, CA, 94305, USA
| | - Matias Kaplan
- Department of Bioengineering, 443 Via Ortega, MC 4245, Stanford University, Stanford, CA, 94305, USA
| | - Peter Dykstra
- Department of Bioengineering, 443 Via Ortega, MC 4245, Stanford University, Stanford, CA, 94305, USA
| | - Michaela Hinks
- Department of Bioengineering, 443 Via Ortega, MC 4245, Stanford University, Stanford, CA, 94305, USA
| | - Maureen McKeague
- Department of Pharmacology and Therapeutics, McGill University, 3655 Prom. Sir-William-Osler, Montreal, Quebec, H3G 1Y6, Canada
- Department of Chemistry, McGill University, 801 Sherbrooke Street West, Montreal, Quebec, H3A 0B8, Canada
| | - Christina D Smolke
- Department of Bioengineering, 443 Via Ortega, MC 4245, Stanford University, Stanford, CA, 94305, USA.
- Chan Zuckerberg Biohub, San Francisco, CA, 94158, USA.
|
91
|
Vainberg Slutskin I, Weinberger A, Segal E. Sequence determinants of polyadenylation-mediated regulation. Genome Res 2019; 29:1635-1647. [PMID: 31530582 PMCID: PMC6771402 DOI: 10.1101/gr.247312.118] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 12/09/2018] [Accepted: 08/13/2019] [Indexed: 12/31/2022]
Abstract
The cleavage and polyadenylation reaction is a crucial step in transcription termination and pre-mRNA maturation in human cells. Despite extensive research, the encoding of polyadenylation-mediated regulation of gene expression within the DNA sequence is not well understood. Here, we utilized a massively parallel reporter assay to inspect the effect of over 12,000 rationally designed polyadenylation sequences (PASs) on reporter gene expression and cleavage efficiency. We find that the PAS sequence can modulate gene expression by over five orders of magnitude. By using a uniquely designed scanning mutagenesis data set, we gain mechanistic insight into various modes of action by which the cleavage efficiency affects the sensitivity or robustness of the PAS to mutation. Furthermore, we employ motif discovery to identify both known and novel sequence motifs associated with PAS-mediated regulation. By leveraging the large scale of our data, we train a deep learning model for the highly accurate prediction of RNA levels from DNA sequence alone (R = 0.83). Moreover, we devise unique approaches for predicting exact cleavage sites for our reporter constructs and for endogenous transcripts. Taken together, our results expand our understanding of PAS-mediated regulation, and provide an unprecedented resource for analyzing and predicting PAS for regulatory genomics applications.
Affiliation(s)
- Ilya Vainberg Slutskin
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot 7610001, Israel.,Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot 7610001, Israel
| | - Adina Weinberger
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot 7610001, Israel.,Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot 7610001, Israel
| | - Eran Segal
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot 7610001, Israel.,Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot 7610001, Israel
|
92
|
Jurkute N, Leu C, Pogoda HM, Arno G, Robson AG, Nürnberg G, Altmüller J, Thiele H, Motameny S, Toliat MR, Powell K, Höhne W, Michaelides M, Webster AR, Moore AT, Hammerschmidt M, Nürnberg P, Yu-Wai-Man P, Votruba M. SSBP1 mutations in dominant optic atrophy with variable retinal degeneration. Ann Neurol 2019; 86:368-383. [PMID: 31298765 PMCID: PMC8855788 DOI: 10.1002/ana.25550] [Citation(s) in RCA: 47] [Impact Index Per Article: 7.8] [Received: 12/06/2018] [Revised: 07/10/2019] [Accepted: 07/10/2019] [Indexed: 12/31/2022]
Abstract
OBJECTIVE: Autosomal dominant optic atrophy (ADOA) starts in early childhood with loss of visual acuity and color vision deficits. OPA1 mutations are responsible for the majority of cases, but in a portion of patients with a clinical diagnosis of ADOA, the cause remains unknown. This study aimed to identify novel ADOA-associated genes and explore their causality. METHODS: Linkage analysis and sequencing were performed in multigeneration families and unrelated patients to identify disease-causing variants. Functional consequences were investigated in silico and confirmed experimentally using the zebrafish model. RESULTS: We defined a new ADOA locus on 7q33-q35 and identified 3 different missense variants in SSBP1 (NM_001256510.1; c.113G>A [p.(Arg38Gln)], c.320G>A [p.(Arg107Gln)] and c.422G>A [p.(Ser141Asn)]) in affected individuals from 2 families and 2 singletons with ADOA and variable retinal degeneration. The mutated arginine residues are part of a basic patch that is essential for single-strand DNA binding. The loss of a positive charge at these positions is very likely to lower the affinity of SSBP1 for single-strand DNA. Antisense-mediated knockdown of endogenous ssbp1 messenger RNA (mRNA) in zebrafish resulted in compromised differentiation of retinal ganglion cells. A similar effect was achieved when mutated mRNAs were administered. These findings point toward an essential role of ssbp1 in retinal development and the dominant-negative nature of the identified human variants, which is consistent with the segregation pattern observed in 2 multigeneration families studied. INTERPRETATION: SSBP1 is an essential protein for mitochondrial DNA replication and maintenance. Our data have established pathogenic variants in SSBP1 as a cause of ADOA and variable retinal degeneration.
Affiliation(s)
- Neringa Jurkute
- Moorfields Eye Hospital NHS Foundation Trust, London, UK
- UCL Institute of Ophthalmology, University College London, London, UK
| | - Costin Leu
- Cologne Center for Genomics (CCG), University of Cologne, D-50931 Cologne, Germany
- Epilepsy Center, Neurological Institute, Cleveland Clinic, Cleveland, OH 44195, US
- Genomic Medicine Institute, Lerner Research Institute Cleveland Clinic, Cleveland, OH 44195, US
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Hans-Martin Pogoda
- Institute for Zoology, Developmental Biology Unit, University of Cologne, D-50674 Cologne, Germany
| | - Gavin Arno
- Moorfields Eye Hospital NHS Foundation Trust, London, UK
- UCL Institute of Ophthalmology, University College London, London, UK
| | - Anthony G. Robson
- Moorfields Eye Hospital NHS Foundation Trust, London, UK
- UCL Institute of Ophthalmology, University College London, London, UK
| | - Gudrun Nürnberg
- Cologne Center for Genomics (CCG), University of Cologne, D-50931 Cologne, Germany
| | - Janine Altmüller
- Cologne Center for Genomics (CCG), University of Cologne, D-50931 Cologne, Germany
- Center for Molecular Medicine Cologne (CMMC), University of Cologne, D-50931 Cologne, Germany
| | - Holger Thiele
- Cologne Center for Genomics (CCG), University of Cologne, D-50931 Cologne, Germany
| | - Susanne Motameny
- Cologne Center for Genomics (CCG), University of Cologne, D-50931 Cologne, Germany
| | - Mohammad Reza Toliat
- Cologne Center for Genomics (CCG), University of Cologne, D-50931 Cologne, Germany
| | - Kate Powell
- School of Optometry and Vision Sciences, Cardiff University, Cardiff, UK
| | - Wolfgang Höhne
- Cologne Center for Genomics (CCG), University of Cologne, D-50931 Cologne, Germany
| | - Michel Michaelides
- Moorfields Eye Hospital NHS Foundation Trust, London, UK
- UCL Institute of Ophthalmology, University College London, London, UK
| | - Andrew R Webster
- Moorfields Eye Hospital NHS Foundation Trust, London, UK
- UCL Institute of Ophthalmology, University College London, London, UK
| | - Anthony T. Moore
- Moorfields Eye Hospital NHS Foundation Trust, London, UK
- UCL Institute of Ophthalmology, University College London, London, UK
- Department of Ophthalmology, University of California, San Francisco, San Francisco, CA, USA
| | - Matthias Hammerschmidt
- Institute for Zoology, Developmental Biology Unit, University of Cologne, D-50674 Cologne, Germany
- Center for Molecular Medicine Cologne (CMMC), University of Cologne, D-50931 Cologne, Germany
- Cologne Excellence Cluster on Cellular Stress Responses in Aging-Associated Diseases (CECAD), University of Cologne, D-50931 Cologne, Germany
| | - Peter Nürnberg
- Cologne Center for Genomics (CCG), University of Cologne, D-50931 Cologne, Germany
- Center for Molecular Medicine Cologne (CMMC), University of Cologne, D-50931 Cologne, Germany
- Cologne Excellence Cluster on Cellular Stress Responses in Aging-Associated Diseases (CECAD), University of Cologne, D-50931 Cologne, Germany
| | - Patrick Yu-Wai-Man
- Moorfields Eye Hospital NHS Foundation Trust, London, UK
- UCL Institute of Ophthalmology, University College London, London, UK
- Cambridge Eye Unit, Addenbrooke’s Hospital, Cambridge University Hospitals, Cambridge, UK
- Cambridge Centre for Brain Repair and MRC Mitochondrial Biology Unit, Department of Clinical Neurosciences, University of Cambridge, Cambridge, UK
| | - Marcela Votruba
- School of Optometry and Vision Sciences, Cardiff University, Cardiff, UK
- Cardiff Eye Unit, University Hospital Wales, Cardiff, UK
|
93
|
Li JJ, Chew GL, Biggin MD. Quantitative principles of cis-translational control by general mRNA sequence features in eukaryotes. Genome Biol 2019; 20:162. [PMID: 31399036 PMCID: PMC6689182 DOI: 10.1186/s13059-019-1761-9] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 03/22/2019] [Accepted: 07/11/2019] [Indexed: 12/17/2022] Open
Abstract
Background: General translational cis-elements are present in the mRNAs of all genes and affect the recruitment, assembly, and progress of preinitiation complexes and the ribosome under many physiological states. These elements include mRNA folding, upstream open reading frames, specific nucleotides flanking the initiating AUG codon, protein coding sequence length, and codon usage. The quantitative contributions of these sequence features and how and why they coordinate to control translation rates are not well understood. Results: Here, we show that these sequence features specify 42–81% of the variance in translation rates in Saccharomyces cerevisiae, Schizosaccharomyces pombe, Arabidopsis thaliana, Mus musculus, and Homo sapiens. We establish that control by RNA secondary structure is chiefly mediated by highly folded 25–60 nucleotide segments within mRNA 5′ regions, that changes in tri-nucleotide frequencies between highly and poorly translated 5′ regions are correlated between all species, and that control by distinct biochemical processes is extensively correlated as is regulation by a single process acting in different parts of the same mRNA. Conclusions: Our work shows that general features control a much larger fraction of the variance in translation rates than previously realized. We provide a more detailed and accurate understanding of the aspects of RNA structure that direct translation in diverse eukaryotes. In addition, we note that the strongly correlated regulation between and within cis-control features will cause more even densities of translational complexes along each mRNA and therefore more efficient use of the translation machinery by the cell.
Affiliation(s)
- Jingyi Jessica Li
  - Department of Statistics, Department of Biomathematics, and Department of Human Genetics, University of California, Los Angeles, CA, 90095, USA.
- Guo-Liang Chew
  - Computational Biology Program, Public Health Sciences and Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA, 98109, USA
- Mark Douglas Biggin
  - Biological Systems and Engineering Division, Lawrence Berkeley National Laboratory, Berkeley, CA, 94708, USA.
|
94
|
Petersen SD, Zhang J, Lee JS, Jakociunas T, Grav LM, Kildegaard HF, Keasling JD, Jensen MK. Modular 5'-UTR hexamers for context-independent tuning of protein expression in eukaryotes. Nucleic Acids Res 2019; 46:e127. [PMID: 30124898 PMCID: PMC6265478 DOI: 10.1093/nar/gky734] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 03/09/2018] [Accepted: 08/01/2018] [Indexed: 11/25/2022] Open
Abstract
Functional characterization of regulatory DNA elements in broad genetic contexts is a prerequisite for forward engineering of biological systems. Translation initiation site (TIS) sequences are attractive to use for regulating gene activity and metabolic pathway fluxes because the genetic changes are minimal. However, limited knowledge is available on tuning gene outputs by varying TISs in different genetic and environmental contexts. Here, we created TIS hexamer libraries in baker’s yeast Saccharomyces cerevisiae directly at the 5′ end of a reporter gene in various promoter contexts and measured gene activity distributions for each library. Next, selected TIS sequences, which resulted in almost 10-fold changes in reporter outputs, were experimentally characterized in various environmental and genetic contexts in both yeast and mammalian cells. From our analyses, we observed strong linear correlations (R² = 0.75–0.98) between all pairwise combinations of TIS order and gene activity. Finally, our analysis enabled the identification of a TIS with almost 50% stronger output than a commonly used TIS for protein expression in mammalian cells, and selected TISs were also used to tune gene activities in yeast at a metabolic branch point in order to prototype fitness and carotenoid production landscapes. Taken together, the characterized TISs support reliable context-independent forward engineering of translation initiation in eukaryotes.
Affiliation(s)
- Søren D Petersen
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, 2800 Kgs. Lyngby, Denmark
- Jie Zhang
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, 2800 Kgs. Lyngby, Denmark
- Jae S Lee
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, 2800 Kgs. Lyngby, Denmark
- Tadas Jakociunas
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, 2800 Kgs. Lyngby, Denmark
- Lise M Grav
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, 2800 Kgs. Lyngby, Denmark
- Helene F Kildegaard
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, 2800 Kgs. Lyngby, Denmark
- Jay D Keasling
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, 2800 Kgs. Lyngby, Denmark
  - Joint BioEnergy Institute, Emeryville, CA 94608, USA
  - Biological Systems and Engineering Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
  - Department of Chemical and Biomolecular Engineering, University of California, Berkeley, CA 94720, USA
  - Department of Bioengineering, University of California, Berkeley, CA 94720, USA
  - Center for Synthetic Biochemistry, Institute for Synthetic Biology, Shenzhen Institutes of Advanced Technologies, Shenzhen 518055, China
- Michael K Jensen
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, 2800 Kgs. Lyngby, Denmark
|
95
|
Diaz de Arce AJ, Noderer WL, Wang CL. Complete motif analysis of sequence requirements for translation initiation at non-AUG start codons. Nucleic Acids Res 2019; 46:985-994. [PMID: 29228265 PMCID: PMC5778536 DOI: 10.1093/nar/gkx1114] [Citation(s) in RCA: 48] [Impact Index Per Article: 8.0] [Received: 07/01/2017] [Accepted: 12/06/2017] [Indexed: 01/23/2023] Open
Abstract
The initiation of mRNA translation from start codons other than AUG was previously believed to be rare and of relatively low impact. More recently, evidence has suggested that as much as half of all translation initiation utilizes non-AUG start codons, codons that deviate from AUG by a single base. Furthermore, non-AUG start codons have been shown to be involved in regulation of expression and disease etiology. Yet the ability to gauge expression based on the sequence of a translation initiation site (start codon and its flanking bases) has been limited. Here we have performed a comprehensive analysis of translation initiation sites that utilize non-AUG start codons. By combining genetic-reporter, cell-sorting, and high-throughput sequencing technologies, we have analyzed the expression associated with all possible variants of the -4 to +4 positions of non-AUG translation initiation site motifs. This complete motif analysis revealed that 1) with the right sequence context, certain non-AUG start codons can generate expression comparable to that of AUG start codons, 2) sequence context affects each non-AUG start codon differently, and 3) initiation at non-AUG start codons is highly sensitive to changes in the flanking sequences. Complete motif analysis has the potential to be a key tool for experimental and diagnostic genomics.
Affiliation(s)
- William L Noderer
  - Department of Chemical Engineering, Stanford University, Stanford, CA 94305, USA
- Clifford L Wang
  - Department of Chemical Engineering, Stanford University, Stanford, CA 94305, USA
|
96
|
Sample PJ, Wang B, Reid DW, Presnyak V, McFadyen IJ, Morris DR, Seelig G. Human 5' UTR design and variant effect prediction from a massively parallel translation assay. Nat Biotechnol 2019; 37:803-809. [PMID: 31267113 PMCID: PMC7100133 DOI: 10.1038/s41587-019-0164-5] [Citation(s) in RCA: 213] [Impact Index Per Article: 35.5] [Received: 05/20/2018] [Accepted: 05/21/2019] [Indexed: 12/20/2022]
Abstract
The ability to predict the impact of cis-regulatory sequences on gene expression would facilitate discovery in fundamental and applied biology. Here we combine polysome profiling of a library of 280,000 randomized 5' untranslated regions (UTRs) with deep learning to build a predictive model that relates human 5' UTR sequence to translation. Together with a genetic algorithm, we use the model to engineer new 5' UTRs that accurately direct specified levels of ribosome loading, providing the ability to tune sequences for optimal protein expression. We show that the same approach can be extended to chemically modified RNA, an important feature for applications in mRNA therapeutics and synthetic biology. We test 35,212 truncated human 5' UTRs and 3,577 naturally occurring variants and show that the model predicts ribosome loading of these sequences. Finally, we provide evidence of 45 single-nucleotide variants (SNVs) associated with human diseases that substantially change ribosome loading and thus may represent a molecular basis for disease.
Affiliation(s)
- Paul J Sample
  - Department of Electrical Engineering, University of Washington, Seattle, WA, USA
- Ban Wang
  - Department of Electrical Engineering, University of Washington, Seattle, WA, USA
- David R Morris
  - Department of Biochemistry, University of Washington, Seattle, WA, USA
- Georg Seelig
  - Department of Electrical Engineering, University of Washington, Seattle, WA, USA.
  - Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, WA, USA.
|
97
|
Chen HH, Tarn WY. uORF-mediated translational control: recently elucidated mechanisms and implications in cancer. RNA Biol 2019; 16:1327-1338. [PMID: 31234713 DOI: 10.1080/15476286.2019.1632634] [Citation(s) in RCA: 32] [Impact Index Per Article: 5.3] [Indexed: 12/30/2022] Open
Abstract
Protein synthesis is tightly regulated, and its dysregulation can contribute to the pathology of various diseases, including cancer. Increased or selective translation of mRNAs can promote cancer cell proliferation, metastasis and tumor expansion. Translational control is one of the most important means for cells to quickly adapt to environmental stresses. Adaptive translation involves various alternative mechanisms of translation initiation. Upstream open reading frames (uORFs) serve as a major regulator of stress-responsive translational control. Since recent advances in omics technologies including ribo-seq have expanded our knowledge of translation, we discuss emerging mechanisms for uORF-mediated translation regulation and its impact on cancer cell biology. A better understanding of dysregulated translational control of uORFs in cancer would facilitate the development of new strategies for cancer therapy.
Affiliation(s)
- Hung-Hsi Chen
  - Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
- Woan-Yuh Tarn
  - Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
|
98
|
Mortz M, Dégletagne C, Romestaing C, Duchamp C. Comparative genomic analysis identifies small open reading frames (sORFs) with peptide-encoding features in avian 16S rDNA. Genomics 2019; 112:1120-1127. [PMID: 31247329 DOI: 10.1016/j.ygeno.2019.06.026] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 02/05/2019] [Revised: 06/01/2019] [Accepted: 06/21/2019] [Indexed: 12/14/2022]
Abstract
The mitochondrial genome (mt-DNA) functional repertoire has recently been enriched in mammals by the identification of functional small open reading frames (sORFs) embedded in ribosomal DNAs. Through comparative genomic analyses, the presence of putatively functional sORFs was investigated in birds. Alignment of available avian mt-DNA sequences revealed highly conserved regions containing four putative sORFs that presented a low insertion/deletion polymorphism rate (<0.1%) and preserved in-frame start/stop codons in >80% of species. Detected sORFs included avian homologs of human Humanin and Short-Humanin-Like-Peptide 6 and two new sORFs not yet described in mammals. The amino-acid sequences of the four putative encoded peptides were strongly conserved among birds, with amino-acid p-distances (5.6 to 25.4%) similar to those calculated for typical avian mt-DNA-encoded proteins (14.8%). Conservation resulted from either drastic conservation of the nucleotide sequence or negative selection pressure. These data extend to birds the possibility that mitochondrial rDNA may encode small bioactive peptides.
Affiliation(s)
- Mathieu Mortz
  - Université de Lyon, Laboratoire d'Ecologie des Hydrosystèmes Naturels et Anthropisés, UMR 5023 CNRS, Université Claude Bernard Lyon 1, ENTPE, Villeurbanne Cedex, France
- Cyril Dégletagne
  - Université de Lyon, Laboratoire d'Ecologie des Hydrosystèmes Naturels et Anthropisés, UMR 5023 CNRS, Université Claude Bernard Lyon 1, ENTPE, Villeurbanne Cedex, France
- Caroline Romestaing
  - Université de Lyon, Laboratoire d'Ecologie des Hydrosystèmes Naturels et Anthropisés, UMR 5023 CNRS, Université Claude Bernard Lyon 1, ENTPE, Villeurbanne Cedex, France
- Claude Duchamp
  - Université de Lyon, Laboratoire d'Ecologie des Hydrosystèmes Naturels et Anthropisés, UMR 5023 CNRS, Université Claude Bernard Lyon 1, ENTPE, Villeurbanne Cedex, France.
|
99
|
Boersma S, Khuperkar D, Verhagen BMP, Sonneveld S, Grimm JB, Lavis LD, Tanenbaum ME. Multi-Color Single-Molecule Imaging Uncovers Extensive Heterogeneity in mRNA Decoding. Cell 2019; 178:458-472.e19. [PMID: 31178119 PMCID: PMC6630898 DOI: 10.1016/j.cell.2019.05.001] [Citation(s) in RCA: 101] [Impact Index Per Article: 16.8] [Received: 11/26/2018] [Revised: 03/05/2019] [Accepted: 04/30/2019] [Indexed: 12/17/2022]
Abstract
mRNA translation is a key step in decoding genetic information. Genetic decoding is surprisingly heterogeneous because multiple distinct polypeptides can be synthesized from a single mRNA sequence. To study translational heterogeneity, we developed the MoonTag, a fluorescence labeling system to visualize translation of single mRNAs. When combined with the orthogonal SunTag system, the MoonTag enables dual readouts of translation, greatly expanding the possibilities to interrogate complex translational heterogeneity. By placing MoonTag and SunTag sequences in different translation reading frames, each driven by distinct translation start sites, start site selection of individual ribosomes can be visualized in real time. We find that start site selection is largely stochastic but that the probability of using a particular start site differs among mRNA molecules and can be dynamically regulated over time. This study provides key insights into translation start site selection heterogeneity and provides a powerful toolbox to visualize complex translation dynamics. Highlights: Development of MoonTag, a fluorescence labeling system to visualize translation; combining MoonTag and SunTag enables visualization of translational heterogeneity; mRNAs from a single gene vary in initiation frequency at different start sites; ribosomes take many different “paths” along the 5′ UTR of a single mRNA molecule.
Affiliation(s)
- Sanne Boersma
  - Oncode Institute, Hubrecht Institute - KNAW and University Medical Center Utrecht, Utrecht, the Netherlands
- Deepak Khuperkar
  - Oncode Institute, Hubrecht Institute - KNAW and University Medical Center Utrecht, Utrecht, the Netherlands
- Bram M P Verhagen
  - Oncode Institute, Hubrecht Institute - KNAW and University Medical Center Utrecht, Utrecht, the Netherlands
- Stijn Sonneveld
  - Oncode Institute, Hubrecht Institute - KNAW and University Medical Center Utrecht, Utrecht, the Netherlands
- Jonathan B Grimm
  - Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, VA, USA
- Luke D Lavis
  - Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, VA, USA
- Marvin E Tanenbaum
  - Oncode Institute, Hubrecht Institute - KNAW and University Medical Center Utrecht, Utrecht, the Netherlands.
|
100
|
Qiu C, Kaplan CD. Functional assays for transcription mechanisms in high-throughput. Methods 2019; 159-160:115-123. [PMID: 30797033 PMCID: PMC6589137 DOI: 10.1016/j.ymeth.2019.02.017] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 01/29/2019] [Accepted: 02/18/2019] [Indexed: 01/12/2023] Open
Abstract
Dramatic increases in the scale of programmed synthesis of nucleic acid libraries coupled with deep sequencing have powered advances in understanding nucleic acid and protein biology. Biological systems centering on nucleic acids or encoded proteins greatly benefit from such high-throughput studies, given that large DNA variant pools can be synthesized and DNA, or RNA products of transcription, can be easily analyzed by deep sequencing. Here we review the scope of various high-throughput functional assays for studies of nucleic acids and proteins in general, followed by discussion of how these types of study have yielded insights into the RNA Polymerase II (Pol II) active site as an example. We discuss methodological considerations in the design and execution of these experiments that should be valuable to studies in any system.
Affiliation(s)
- Chenxi Qiu
  - Department of Medicine, Division of Translational Therapeutics, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA 02215, USA; Cancer Research Institute, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA 02215, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.
- Craig D Kaplan
  - Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA 15260, USA.
|