1
Luna-Cerralbo D, Blasco-Machín I, Adame-Pérez S, Lampaya V, Larraga A, Alejo T, Martínez-Oliván J, Broset E, Bruscolini P. A statistical-physics approach for codon usage optimisation. Comput Struct Biotechnol J 2024; 23:3050-3064. [PMID: 39188969 PMCID: PMC11345917 DOI: 10.1016/j.csbj.2024.07.020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2024] [Revised: 07/25/2024] [Accepted: 07/25/2024] [Indexed: 08/28/2024] Open
Abstract
The concept of "codon optimisation" involves adjusting the coding sequence of a target protein to account for the inherent codon preferences of a host species and maximise protein expression in that species. However, there is still a lack of consensus on the most effective approach to achieve optimal results. Existing methods typically depend on heuristic combinations of different variables, leaving the user with the final choice of the sequence hit. In this study, we propose a new statistical-physics model for codon optimisation. This model, called the Nearest-Neighbour interaction (NN) model, links the probability of any given codon sequence to the "interactions" between neighbouring codons. We used the model to design codon sequences for different proteins of interest, and we compared our sequences with the predictions of some commercial tools. In order to assess the importance of the pair interactions, we additionally compared the NN model with a simpler method (Ind) that disregards interactions. It was observed that the NN method yielded similar Codon Adaptation Index (CAI) values to those obtained by other commercial algorithms, despite the fact that CAI was not explicitly considered in the algorithm. By utilising both the NN and Ind methods to optimise the reporter protein luciferase, and then analysing the translation performance in human cell lines and in a mouse model, we found that the NN approach yielded the highest protein expression in vivo. Consequently, we propose that the NN model may prove advantageous in biotechnological applications, such as heterologous protein expression or mRNA-based therapies.
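The idea of tying a sequence's probability to interactions between neighbouring codons can be illustrated with a minimal sketch. The abstract does not give the functional form of the NN model, so everything here is an assumption: the score is taken to be a sum of per-codon terms plus pairwise terms for adjacent codons (read as a log-probability up to a constant), the highest-scoring sequence is found by dynamic programming, and `SYNONYMS`, `best_sequence`, and all numeric values are hypothetical. Passing an empty pair table mimics a method that disregards interactions, in the spirit of the Ind comparison.

```python
# Toy subset of the standard genetic code: amino acid -> synonymous codons.
SYNONYMS = {
    "M": ["ATG"],
    "K": ["AAA", "AAG"],
    "F": ["TTT", "TTC"],
}


def best_sequence(protein, site_score, pair_score):
    """Return (codon sequence, score) maximising
    sum_i site_score[c_i] + sum_i pair_score[(c_i, c_{i+1})].

    Missing entries in either table count as 0.0. This is a generic
    Viterbi-style recursion, not the published implementation.
    """
    if not protein:
        raise ValueError("protein must contain at least one residue")

    # best[c] = (score, path) of the best prefix that ends in codon c.
    best = {c: (site_score.get(c, 0.0), [c]) for c in SYNONYMS[protein[0]]}

    for aa in protein[1:]:
        new_best = {}
        for c in SYNONYMS[aa]:
            # Pick the predecessor codon that gives the best total so far.
            prev, (score, path) = max(
                best.items(),
                key=lambda kv: kv[1][0] + pair_score.get((kv[0], c), 0.0),
            )
            total = score + pair_score.get((prev, c), 0.0) + site_score.get(c, 0.0)
            new_best[c] = (total, path + [c])
        best = new_best

    score, path = max(best.values(), key=lambda v: v[0])
    return "".join(path), score


# Hypothetical scores: AAG and TTC are individually preferred,
# but the neighbouring pair AAA-TTT receives a bonus.
site = {"AAG": 1.0, "AAA": 0.8, "TTC": 1.0, "TTT": 0.9}
pair = {("AAA", "TTT"): 0.5}

print(best_sequence("MKF", site, pair)[0])  # ATGAAATTT (pair term wins)
print(best_sequence("MKF", site, {})[0])    # ATGAAGTTC (independent choice)
```

With these toy numbers the pairwise bonus changes the chosen codons at both positions, which is the kind of effect a comparison between an interacting and a non-interacting model is meant to expose.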
Affiliation(s)
- David Luna-Cerralbo
  - Department of Theoretical Physics, Faculty of Science, University of Zaragoza, c/ Pedro Cerbuna s/n, Zaragoza, 50009, Spain
  - Institute for Biocomputation and Physics of Complex Systems (BIFI), University of Zaragoza, c/ Mariano Esquillor s/n, Zaragoza, 50018, Spain
- Irene Blasco-Machín
  - Certest Pharma, Certest Biotec S.L, Polígono Industrial Río Gallego II, Calle J, 1, San Mateo de Gállego, 50840, Spain
- Susana Adame-Pérez
  - Certest Pharma, Certest Biotec S.L, Polígono Industrial Río Gallego II, Calle J, 1, San Mateo de Gállego, 50840, Spain
- Verónica Lampaya
  - Certest Pharma, Certest Biotec S.L, Polígono Industrial Río Gallego II, Calle J, 1, San Mateo de Gállego, 50840, Spain
- Ana Larraga
  - Certest Pharma, Certest Biotec S.L, Polígono Industrial Río Gallego II, Calle J, 1, San Mateo de Gállego, 50840, Spain
- Teresa Alejo
  - Certest Pharma, Certest Biotec S.L, Polígono Industrial Río Gallego II, Calle J, 1, San Mateo de Gállego, 50840, Spain
- Juan Martínez-Oliván
  - Certest Pharma, Certest Biotec S.L, Polígono Industrial Río Gallego II, Calle J, 1, San Mateo de Gállego, 50840, Spain
- Esther Broset
  - Certest Pharma, Certest Biotec S.L, Polígono Industrial Río Gallego II, Calle J, 1, San Mateo de Gállego, 50840, Spain
- Pierpaolo Bruscolini
  - Department of Theoretical Physics, Faculty of Science, University of Zaragoza, c/ Pedro Cerbuna s/n, Zaragoza, 50009, Spain
  - Institute for Biocomputation and Physics of Complex Systems (BIFI), University of Zaragoza, c/ Mariano Esquillor s/n, Zaragoza, 50018, Spain
2
Fumagalli SE, Smith S, Ghazanchyan T, Meyer D, Paul R, Campbell C, Santana-Quintero L, Golikov A, Ibla J, Bar H, Komar AA, Hunt RC, Lin B, DiCuccio M, Kimchi-Sarfaty C. Mouse embryo CoCoPUTs: novel murine transcriptomic-weighted usage website featuring multiple strains, tissues, and stages. BMC Bioinformatics 2024; 25:294. [PMID: 39242990 PMCID: PMC11380194 DOI: 10.1186/s12859-024-05906-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2024] [Accepted: 08/20/2024] [Indexed: 09/09/2024] Open
Abstract
Mouse (Mus musculus) models have been heavily utilized in developmental biology research to understand mammalian embryonic development, as mice share many genetic, physiological, and developmental characteristics with humans. New explorations into the integration of temporal (stage-specific) and transcriptional (tissue-specific) data have expanded our knowledge of mouse embryo tissue-specific gene functions. To better understand the substantial impact of synonymous mutational variations in the cell-state-specific transcriptome on a tissue's codon and codon pair usage landscape, we have established a novel resource, Mouse Embryo Codon and Codon Pair Usage Tables (Mouse Embryo CoCoPUTs). This webpage offers not only codon and codon pair usage but also GC, dinucleotide, and junction dinucleotide usage, encompassing four strains, 15 murine embryonic tissue groups, 18 Theiler stages, and 26 embryonic days. Here, we leverage Mouse Embryo CoCoPUTs and use heatmaps to depict usage changes over time and a comparison to human usage for each strain and embryonic time point, highlighting unique differences and similarities. The usage similarities found between mouse and human central nervous system data highlight the translation for projects leveraging mouse models. Data for this analysis can be directly retrieved from Mouse Embryo CoCoPUTs. This cutting-edge resource plays a crucial role in deciphering the complex interplay between usage patterns and embryonic development, offering valuable insights into variation across diverse tissues, strains, and stages. Its applications extend across multiple domains, with notable advantages for biotherapeutic development, where optimizing codon usage can enhance protein expression; one can compare strains, tissues, and mouse embryonic stages in one query.
Additionally, Mouse Embryo CoCoPUTs holds great potential in the field of tissue-specific genetic engineering, providing insights for tailoring gene expression to specific tissues for targeted interventions. Furthermore, this resource may enhance our understanding of the nuanced connections between usage biases and tissue-specific gene function, contributing to the development of more accurate predictive models for genetic disorders.
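The usage measures named in this abstract (codon, codon pair, GC, dinucleotide, and junction dinucleotide usage) can be sketched for a single coding sequence as follows. This is a simplified illustration, not the resource's pipeline: it produces raw, unweighted counts, whereas the published tables are transcriptomic-weighted. The function name `usage_tables` is hypothetical, and "junction dinucleotide" is assumed here to mean the last base of one codon followed by the first base of the next.

```python
from collections import Counter


def usage_tables(cds):
    """Count usage features for one in-frame coding sequence (DNA alphabet)."""
    cds = cds.upper()
    if not cds or len(cds) % 3 != 0:
        raise ValueError("sequence must be non-empty and a multiple of 3 long")

    # Split into consecutive, non-overlapping codons.
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    adjacent = list(zip(codons, codons[1:]))

    return {
        "codon": Counter(codons),
        "codon_pair": Counter(adjacent),
        # All overlapping dinucleotides along the sequence.
        "dinucleotide": Counter(cds[i:i + 2] for i in range(len(cds) - 1)),
        # Dinucleotides that span a codon boundary (assumed definition).
        "junction": Counter(a[2] + b[0] for a, b in adjacent),
        "gc": (cds.count("G") + cds.count("C")) / len(cds),
    }


tables = usage_tables("ATGAAATTTAAATTT")
print(tables["codon"])
print(tables["codon_pair"])
print(tables["junction"])
```

A weighted table in the spirit of the resource would, under the same assumptions, multiply each transcript's counts by its expression level before summing across transcripts.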
Affiliation(s)
- Sarah E Fumagalli
  - Hemostasis Branch 1, Division of Hemostasis, Office of Plasma Protein Therapeutics CMC, Office of Therapeutic Products, Center for Biologics Evaluation and Research (CBER), US Food and Drug Administration (FDA), Silver Spring, MD, USA
- Sean Smith
  - High-performance Integrated Virtual Environment (HIVE), Office of Biostatistics and Pharmacovigilance (OBPV), Center for Biologics Evaluation and Research (CBER), US Food and Drug Administration (FDA), Silver Spring, MD, USA
- Tigran Ghazanchyan
  - High-performance Integrated Virtual Environment (HIVE), Office of Biostatistics and Pharmacovigilance (OBPV), Center for Biologics Evaluation and Research (CBER), US Food and Drug Administration (FDA), Silver Spring, MD, USA
- Douglas Meyer
  - Hemostasis Branch 1, Division of Hemostasis, Office of Plasma Protein Therapeutics CMC, Office of Therapeutic Products, Center for Biologics Evaluation and Research (CBER), US Food and Drug Administration (FDA), Silver Spring, MD, USA
- Rahul Paul
  - High-performance Integrated Virtual Environment (HIVE), Office of Biostatistics and Pharmacovigilance (OBPV), Center for Biologics Evaluation and Research (CBER), US Food and Drug Administration (FDA), Silver Spring, MD, USA
- Collin Campbell
  - High-performance Integrated Virtual Environment (HIVE), Office of Biostatistics and Pharmacovigilance (OBPV), Center for Biologics Evaluation and Research (CBER), US Food and Drug Administration (FDA), Silver Spring, MD, USA
- Luis Santana-Quintero
  - High-performance Integrated Virtual Environment (HIVE), Office of Biostatistics and Pharmacovigilance (OBPV), Center for Biologics Evaluation and Research (CBER), US Food and Drug Administration (FDA), Silver Spring, MD, USA
- Anton Golikov
  - High-performance Integrated Virtual Environment (HIVE), Office of Biostatistics and Pharmacovigilance (OBPV), Center for Biologics Evaluation and Research (CBER), US Food and Drug Administration (FDA), Silver Spring, MD, USA
- Juan Ibla
  - Department of Anesthesiology, Critical Care and Pain Medicine, Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
- Haim Bar
  - Department of Statistics, University of Connecticut, Storrs, CT, USA
- Anton A Komar
  - Department of Biological, Geological and Environmental Sciences, Center for Gene Regulation in Health and Disease, Cleveland State University, Cleveland, OH, USA
- Ryan C Hunt
  - Hemostasis Branch 1, Division of Hemostasis, Office of Plasma Protein Therapeutics CMC, Office of Therapeutic Products, Center for Biologics Evaluation and Research (CBER), US Food and Drug Administration (FDA), Silver Spring, MD, USA
- Brian Lin
  - Hemostasis Branch 1, Division of Hemostasis, Office of Plasma Protein Therapeutics CMC, Office of Therapeutic Products, Center for Biologics Evaluation and Research (CBER), US Food and Drug Administration (FDA), Silver Spring, MD, USA
- Chava Kimchi-Sarfaty
  - Hemostasis Branch 1, Division of Hemostasis, Office of Plasma Protein Therapeutics CMC, Office of Therapeutic Products, Center for Biologics Evaluation and Research (CBER), US Food and Drug Administration (FDA), Silver Spring, MD, USA
3
Bhattacharya A, Dasgupta AK. Multifaceted perspectives of detecting and targeting solid tumors. International Review of Cell and Molecular Biology 2024; 389:1-66. [PMID: 39396844 DOI: 10.1016/bs.ircmb.2024.03.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/15/2024]
Abstract
Solid tumors are the most prevalent form of cancer. Considerable technological and medical advances have been achieved in the diagnosis of the disease. However, detection of the disease at an early stage, which is of utmost importance, is still far from reality. Moreover, treatments and therapeutics to combat solid tumors are still in their infancy. Conventional treatments like chemotherapy and radiation therapy pose challenges due to their indiscriminate impact on healthy and cancerous cells. In this context, efficient drug targeting is a pivotal approach in solid tumor treatment. This involves the precise delivery of drugs to cancer cells while minimizing harm to healthy cells. Targeted drugs exhibit superior efficacy in eradicating cancer cells while impeding tumor growth, and they mitigate side effects by optimizing absorption, which further diminishes the risk of resistance. Furthermore, tailoring targeted therapies to a patient's tumor-specific molecular profile augments treatment efficacy and reduces the likelihood of relapse. This chapter discusses the distinctive characteristics of solid tumors, the possibility of early detection of the disease, and potential therapeutic angles beyond conventional approaches. Additionally, the chapter delves into a hitherto unknown attribute of the magnetic field effect for targeting cancer cells, which exploits the lower susceptibility of normal cells to magnetic fields compared with cancer cells, suggesting a future potential of magnetic nanoparticles for selective cancer cell destruction. Lastly, bioinformatics tools and other unconventional methodologies such as AI-assisted codon bias analysis have a crucial role in comprehending tumor biology, aiding in the identification of futuristic targeted therapies.
Affiliation(s)
- Abhishek Bhattacharya
  - Department of Cardiovascular and Metabolic Sciences, Lerner Research Institute, Cleveland Clinic Foundation, Cleveland, OH, United States
- Anjan Kr Dasgupta
  - Department of Biochemistry, University of Calcutta, Kolkata, West Bengal, India
4
Lin BC, Katneni U, Jankowska KI, Meyer D, Kimchi-Sarfaty C. In silico methods for predicting functional synonymous variants. Genome Biol 2023; 24:126. [PMID: 37217943 PMCID: PMC10204308 DOI: 10.1186/s13059-023-02966-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 08/03/2022] [Accepted: 05/10/2023] [Indexed: 05/24/2023] Open
Abstract
Single nucleotide variants (SNVs) contribute to human genomic diversity. Synonymous SNVs were previously considered to be "silent," but mounting evidence has revealed that these variants can cause RNA and protein changes and are implicated in over 85 human diseases and cancers. Recent improvements in computational platforms have led to the development of numerous machine-learning tools, which can be used to advance synonymous SNV research. In this review, we discuss tools that should be used to investigate synonymous variants. We provide supportive examples from seminal studies that demonstrate how these tools have driven new discoveries of functional synonymous SNVs.
Affiliation(s)
- Brian C Lin
  - Hemostasis Branch 1, Division of Hemostasis, Office of Plasma Protein Therapeutics CMC, Office of Therapeutic Products, Center for Biologics Evaluation and Research, US FDA, Silver Spring, MD, USA
- Upendra Katneni
  - Hemostasis Branch 1, Division of Hemostasis, Office of Plasma Protein Therapeutics CMC, Office of Therapeutic Products, Center for Biologics Evaluation and Research, US FDA, Silver Spring, MD, USA
- Katarzyna I Jankowska
  - Hemostasis Branch 1, Division of Hemostasis, Office of Plasma Protein Therapeutics CMC, Office of Therapeutic Products, Center for Biologics Evaluation and Research, US FDA, Silver Spring, MD, USA
- Douglas Meyer
  - Hemostasis Branch 1, Division of Hemostasis, Office of Plasma Protein Therapeutics CMC, Office of Therapeutic Products, Center for Biologics Evaluation and Research, US FDA, Silver Spring, MD, USA
- Chava Kimchi-Sarfaty
  - Hemostasis Branch 1, Division of Hemostasis, Office of Plasma Protein Therapeutics CMC, Office of Therapeutic Products, Center for Biologics Evaluation and Research, US FDA, Silver Spring, MD, USA
5
Implementing computational methods in tandem with synonymous gene recoding for therapeutic development. Trends Pharmacol Sci 2023; 44:73-84. [PMID: 36307252 DOI: 10.1016/j.tips.2022.09.008] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/21/2022] [Revised: 09/26/2022] [Accepted: 09/27/2022] [Indexed: 12/24/2022]
Abstract
Synonymous gene recoding, the substitution of synonymous variants into the genetic sequence, has been used to overcome many production limitations in therapeutic development. However, the safety and efficacy of recoded therapeutics can be difficult to evaluate because synonymous codon substitutions can result in subtle, yet impactful changes in protein features and require sensitive methods for detection. Given that computational approaches have made significant leaps in recent years, we propose that machine-learning (ML) tools may be leveraged to assess gene-recoded therapeutics and foresee an opportunity to adapt codon contexts to enhance some powerful existing tools. Here, we examine how synonymous gene recoding has been used to address challenges in therapeutic development, explain the biological mechanisms underlying its effects, and explore the application of computational platforms to improve the surveillance of functional variants in therapeutic design.
6
Ran X, Xiao J, Cheng F, Wang T, Teng H, Sun Z. Pan-cancer analyses of synonymous mutations based on tissue-specific codon optimality. Comput Struct Biotechnol J 2022; 20:3567-3580. [PMID: 35860410 PMCID: PMC9287186 DOI: 10.1016/j.csbj.2022.07.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2022] [Revised: 06/22/2022] [Accepted: 07/03/2022] [Indexed: 11/24/2022] Open
Abstract
- Developed tissue-specific codon optimality in 29 human tissues.
- Applied these to analyze synonymous mutations in ∼10,000 tumor and normal samples.
- Synonymous mutations frequently increase optimal codons in most cancer types.
- Synonymous mutations frequently increase optimal codons in cell cycle-related genes.
- Frequency of optimal codon gain relates to proliferation, DDR deficiency, and survival.
Codon optimality has been demonstrated to be an important determinant of mRNA stability and expression levels in multiple model organisms and human cell lines. However, tissue-specific codon optimality has not been developed to investigate how codon optimality is usually perturbed by somatic synonymous mutations in human cancers. Here, we determined tissue-specific codon optimality in 29 human tissues based on mRNA expression data from the Genotype-Tissue Expression project. We found that optimal codons were associated with differentiation, whereas non-optimal codons were correlated with proliferation. Furthermore, codons biased toward differentiation displayed greater tissue specificity in codon optimality, and the tissue specificity of codon optimality was primarily present in amino acids with high degeneracy of the genetic code. By applying tissue-specific codon optimality to somatic synonymous mutations in 8532 tumor samples across 24 cancer types and to those in 416 normal cells across six human tissues, we found that synonymous mutations frequently increased optimal codons in tumor cells and cancer-related genes (e.g., genes involved in cell cycle). Furthermore, an elevated frequency of optimal codon gain was found to promote tumor cell proliferation in three cancer types characterized by DNA damage repair deficiency and could act as a prognostic biomarker for patients with triple-negative breast cancer. In summary, this study profiled tissue-specific codon optimality in human tissues, revealed alterations in codon optimality caused by synonymous mutations in human cancers, and highlighted the non-negligible role of optimal codon gain in tumorigenesis and therapeutics.
Affiliation(s)
- Xia Ran
  - Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing 100101, China
  - CAS Center for Excellence in Biotic Interactions, University of Chinese Academy of Sciences, Beijing 100049, China
- Jinyuan Xiao
  - Institute of Genomic Medicine, Wenzhou Medical University, Wenzhou 325000, China
- Fang Cheng
  - Institute of Genomic Medicine, Wenzhou Medical University, Wenzhou 325000, China
- Tao Wang
  - Center for Medical Genetics & Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Kaifu District, Changsha, Hunan 410078, China
- Huajing Teng
  - Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education/Beijing), Department of Radiation Oncology, Peking University Cancer Hospital & Institute, Beijing, China
- Zhongsheng Sun
  - Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing 100101, China
  - CAS Center for Excellence in Biotic Interactions, University of Chinese Academy of Sciences, Beijing 100049, China
  - Institute of Genomic Medicine, Wenzhou Medical University, Wenzhou 325000, China
7
Kaissarian NM, Meyer D, Kimchi-Sarfaty C. Synonymous Variants: Necessary Nuance in our Understanding of Cancer Drivers and Treatment Outcomes. J Natl Cancer Inst 2022; 114:1072-1094. [PMID: 35477782 PMCID: PMC9360466 DOI: 10.1093/jnci/djac090] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 01/18/2022] [Revised: 03/24/2022] [Accepted: 04/18/2022] [Indexed: 11/13/2022] Open
Abstract
Once called "silent mutations" and assumed to have no effect on protein structure and function, synonymous variants are now recognized to be drivers for some cancers. There have been significant advances in our understanding of the numerous mechanisms by which synonymous single nucleotide variants (sSNVs) can affect protein structure and function by affecting pre-mRNA splicing, mRNA expression, stability, folding, miRNA binding, translation kinetics, and co-translational folding. This review highlights the need for considering sSNVs in cancer biology to gain a better understanding of the genetic determinants of human cancers and to improve their diagnosis and treatment. We surveyed the literature for reports of sSNVs in cancer and found numerous studies on the consequences of sSNVs on gene function with supporting in vitro evidence. We also found reports of sSNVs that have statistically significant associations with specific cancer types but for which in vitro studies are lacking to support the reported associations. Additionally, we found reports of germline and somatic sSNVs that were observed in numerous clinical studies and for which in silico analysis predicts possible effects on gene function. We provide a review of these investigations and discuss necessary future studies to elucidate the mechanisms by which sSNVs disrupt protein function and play a role in tumorigenesis, cancer progression, and treatment efficacy. As splicing dysregulation is one of the most well-recognized mechanisms by which sSNVs impact protein function, we also include our own in silico analysis for predicting which sSNVs may disrupt pre-mRNA splicing.
Affiliation(s)
- Nayiri M Kaissarian
  - Hemostasis Branch, Division of Plasma Protein Therapeutics, Office of Tissues and Advanced Therapies, Center for Biologics Evaluation & Research, US Food and Drug Administration, Silver Spring, MD, USA
- Douglas Meyer
  - Hemostasis Branch, Division of Plasma Protein Therapeutics, Office of Tissues and Advanced Therapies, Center for Biologics Evaluation & Research, US Food and Drug Administration, Silver Spring, MD, USA
- Chava Kimchi-Sarfaty
  - Hemostasis Branch, Division of Plasma Protein Therapeutics, Office of Tissues and Advanced Therapies, Center for Biologics Evaluation & Research, US Food and Drug Administration, Silver Spring, MD, USA