1
|
Berezin CT, Aguilera LU, Billerbeck S, Bourne PE, Densmore D, Freemont P, Gorochowski TE, Hernandez SI, Hillson NJ, King CR, Köpke M, Ma S, Miller KM, Moon TS, Moore JH, Munsky B, Myers CJ, Nicholas DA, Peccoud SJ, Zhou W, Peccoud J. Ten simple rules for managing laboratory information. PLoS Comput Biol 2023; 19:e1011652. [PMID: 38060459 PMCID: PMC10703290 DOI: 10.1371/journal.pcbi.1011652] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/18/2023] Open
Abstract
Information is the cornerstone of research, from experimental (meta)data and computational processes to complex inventories of reagents and equipment. These 10 simple rules discuss best practices for leveraging laboratory information management systems to transform this large information load into useful scientific findings.
Affiliation(s)
- Casey-Tyler Berezin
  - Department of Chemical and Biological Engineering, Colorado State University, Fort Collins, Colorado, United States of America
- Luis U. Aguilera
  - Department of Chemical and Biological Engineering, Colorado State University, Fort Collins, Colorado, United States of America
- Sonja Billerbeck
  - Molecular Microbiology Unit, Faculty of Science and Engineering, University of Groningen, Groningen, the Netherlands
- Philip E. Bourne
  - School of Data Science, University of Virginia, Charlottesville, Virginia, United States of America
  - Department of Biomedical Engineering, University of Virginia, Charlottesville, Virginia, United States of America
- Douglas Densmore
  - College of Engineering, Boston University, Boston, Massachusetts, United States of America
- Paul Freemont
  - Department of Infectious Disease, Imperial College, London, United Kingdom
- Thomas E. Gorochowski
  - School of Biological Sciences, University of Bristol, Bristol, United Kingdom
  - BrisEngBio, University of Bristol, Bristol, United Kingdom
- Sarah I. Hernandez
  - Department of Chemical and Biological Engineering, Colorado State University, Fort Collins, Colorado, United States of America
- Nathan J. Hillson
  - Biological Systems and Engineering Division, Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
  - US Department of Energy Agile BioFoundry, Emeryville, California, United States of America
  - US Department of Energy Joint BioEnergy Institute, Emeryville, California, United States of America
- Connor R. King
  - Department of Chemical and Biological Engineering, Colorado State University, Fort Collins, Colorado, United States of America
- Michael Köpke
  - LanzaTech, Skokie, Illinois, United States of America
- Shuyi Ma
  - Center for Global Infectious Disease Research, Seattle Children’s Hospital, University of Washington Medicine, Seattle, Washington, United States of America
- Katie M. Miller
  - Department of Chemical and Biological Engineering, Colorado State University, Fort Collins, Colorado, United States of America
- Tae Seok Moon
  - Department of Energy, Environmental & Chemical Engineering, Washington University in St. Louis, St. Louis, Missouri, United States of America
- Jason H. Moore
  - Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, California, United States of America
- Brian Munsky
  - Department of Chemical and Biological Engineering, Colorado State University, Fort Collins, Colorado, United States of America
- Chris J. Myers
  - Department of Electrical, Computer & Energy Engineering, University of Colorado Boulder, Boulder, Colorado, United States of America
- Dequina A. Nicholas
  - Department of Molecular Biology & Biochemistry, University of California Irvine, Irvine, California, United States of America
- Samuel J. Peccoud
  - Department of Electrical and Computer Engineering, Colorado State University, Fort Collins, Colorado, United States of America
- Wen Zhou
  - Department of Statistics, Colorado State University, Fort Collins, Colorado, United States of America
- Jean Peccoud
  - Department of Chemical and Biological Engineering, Colorado State University, Fort Collins, Colorado, United States of America
|
2
|
Identifying metabolic features and engineering targets for productivity improvement in CHO cells by integrated transcriptomics and genome-scale metabolic model. Biochem Eng J 2020. [DOI: 10.1016/j.bej.2020.107624] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 01/11/2023]
|
3
|
A dual-parameter identification approach for data-based predictive modeling of hybrid gene regulatory network-growth kinetics in Pseudomonas putida mt-2. Bioprocess Biosyst Eng 2020; 43:1671-1688. [PMID: 32377941 DOI: 10.1007/s00449-020-02360-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 12/14/2019] [Accepted: 04/21/2020] [Indexed: 10/24/2022]
Abstract
Integrating data into model-based descriptions of biological systems that incorporate gene dynamics improves the performance of microbial systems. Bioprocess performance, typically predicted using empirical Monod-type models, is essential for a sustainable bioeconomy. To replace empirical models, we updated a hybrid gene regulatory network-growth kinetic model, predicting aromatic pollutant degradation and biomass growth in Pseudomonas putida mt-2. We modeled a complex biological system including extensive information to understand the role of the regulatory elements in toluene biodegradation and biomass growth. The updated model exhibited extra complications such as the existence of oscillations and discontinuities. As parameter estimation of complex biological models remains a key challenge, we used the updated model to present a dual-parameter identification approach (the 'dual approach') combining two independent methodologies. Approach I handled the complexity by incorporating demonstrated biological knowledge in the model-development process and combining global sensitivity analysis and optimisation. Approach II complemented Approach I by handling multimodality, ill-conditioning and overfitting through regularisation estimation, global optimisation, and identifiability analysis. To systematically quantify the biological system, we used a vast amount of high-quality time-course data. The dual approach resulted in an accurately calibrated kinetic model (NRMSE: 0.17055) efficiently handling the additional model complexity. We validated the model using three independent experimental data sets, achieving greater predictive power (NRMSE: 0.18776) than the individual approaches (NRMSE I: 0.25322, II: 0.25227) and increasing model robustness. These results demonstrated data-driven predictive modeling, potentially leading to model-based bioprocess control and optimisation.
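The abstract above reports fit quality as NRMSE values but does not say how the error is normalised. As a minimal sketch only, the Python function below assumes normalisation by the range of the observed values (one common convention; the paper may use a different one). The function name `nrmse` and the sample numbers are hypothetical and are not taken from the paper.

```python
import math


def nrmse(observed, predicted):
    """Root-mean-square error divided by the observed range (assumed normalisation)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between observations and model predictions
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    span = max(observed) - min(observed)
    if span == 0:
        raise ValueError("observed values have zero range")
    return math.sqrt(mse) / span


# Hypothetical time course in arbitrary units, not data from the paper
observed = [0.0, 1.0, 2.0, 4.0]
predicted = [0.0, 1.0, 2.0, 2.0]
print(nrmse(observed, predicted))
```

Under this convention a lower value means a closer fit, which is consistent with how the abstract compares the dual approach against the individual approaches.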
|
4
|
Richelle A, David B, Demaegd D, Dewerchin M, Kinet R, Morreale A, Portela R, Zune Q, von Stosch M. Towards a widespread adoption of metabolic modeling tools in biopharmaceutical industry: a process systems biology engineering perspective. NPJ Syst Biol Appl 2020; 6:6. [PMID: 32170148 PMCID: PMC7070029 DOI: 10.1038/s41540-020-0127-y] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 10/10/2019] [Accepted: 02/12/2020] [Indexed: 01/09/2023] Open
Abstract
In biotechnology, the emergence of high-throughput technologies challenges the interpretation of large datasets. One way to identify meaningful outcomes impacting process and product attributes from large datasets is using systems biology tools such as metabolic models. However, these tools are still not fully exploited for this purpose in industrial context due to gaps in our knowledge and technical limitations. In this paper, key aspects restraining the routine implementation of these tools are highlighted in three research fields: monitoring, network science and hybrid modeling. Advances in these fields could expand the current state of systems biology applications in biopharmaceutical industry to address existing challenges in bioprocess development and improvement.
|
5
|
Integration of Time-Series Transcriptomic Data with Genome-Scale CHO Metabolic Models for mAb Engineering. Processes (Basel) 2020. [DOI: 10.3390/pr8030331] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 12/21/2022] Open
Abstract
Chinese hamster ovary (CHO) cells are the most commonly used cell lines in biopharmaceutical manufacturing. Genome-scale metabolic models have become a valuable tool to study cellular metabolism. Despite the availability of a reference global genome-scale CHO model, context-specific metabolic models may still be required for specific cell lines (for example, CHO-K1, CHO-S, and CHO-DG44), and for specific process conditions. Many integration algorithms are available to reconstruct specific genome-scale models. These methods are mainly based on integrating omics data (i.e., transcriptomics, proteomics, and metabolomics) into reference genome-scale models. In the present study, we aimed to investigate the impact of the time points of transcriptomics integration on the genome-scale CHO model by assessing the prediction of growth rates with each reconstructed model. We also evaluated the feasibility of applying extracted models to different cell lines (generated from the same parental cell line). Our findings illustrate that gene expression at various stages of culture slightly impacts the reconstructed models. However, the prediction of cell growth is robust not only across different growth phases but also when extended to other cell lines.
|
6
|
Digital Twins and Their Role in Model-Assisted Design of Experiments. Adv Biochem Eng Biotechnol 2020; 177:29-61. [PMID: 32797268 DOI: 10.1007/10_2020_136] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 12/25/2022]
Abstract
Rising demands for biopharmaceuticals and the need to reduce manufacturing costs increase the pressure to develop productive and efficient bioprocesses. A major hurdle during process development and optimization studies is the huge experimental effort required by conventional design of experiments (DoE) methods. As an explorative approach, DoE requires extensive expert knowledge about the investigated factors and their boundary values and often leads to multiple rounds of time-consuming and costly experiments. The combination of DoE with a virtual representation of the bioprocess, called a digital twin, in model-assisted DoE (mDoE) can be used as an alternative to decrease the number of experiments significantly. mDoE enables knowledge-driven bioprocess development, including the definition of a mathematical process model in the early development stages. In this chapter, digital twins and their role in mDoE are discussed. First, statistical DoE methods are introduced as the basis of mDoE. Second, the combination of a mathematical process model and DoE into mDoE is examined. This includes mathematical model structures and a selection scheme for the choice of DoE designs. Finally, the application of mDoE is discussed in a case study on medium optimization in an antibody-producing Chinese hamster ovary cell culture process.
|
7
|
|
8
|
Rejc Ž, Magdevska L, Tršelič T, Osolin T, Vodopivec R, Mraz J, Pavliha E, Zimic N, Cvitanović T, Rozman D, Moškon M, Mraz M. Computational modelling of genome-scale metabolic networks and its application to CHO cell cultures. Comput Biol Med 2017; 88:150-160. [PMID: 28732234 DOI: 10.1016/j.compbiomed.2017.07.005] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 05/31/2017] [Revised: 07/04/2017] [Accepted: 07/05/2017] [Indexed: 01/30/2023]
Abstract
Genome-scale metabolic models (GEMs) have become increasingly important in recent years. Currently, GEMs are the most accurate in silico representation of the genotype-phenotype link. They allow us to study complex networks from the systems perspective. Their application may drastically reduce the amount of experimental and clinical work, improve diagnostic tools and increase our understanding of complex biological phenomena. GEMs have also demonstrated high potential for the optimisation of bio-based production of recombinant proteins. Herein, we review the basic concepts, methods, resources and software tools used for the reconstruction and application of GEMs. We overview the evolution of the modelling efforts devoted to the metabolism of Chinese Hamster Ovary (CHO) cells. We present a case study on CHO cell metabolism under different amino acid depletions. This leads us to the identification of the most influential as well as essential amino acids in selected CHO cell lines.
Affiliation(s)
- Živa Rejc
  - Faculty of Chemistry and Chemical Technology, University of Ljubljana, Ljubljana, Slovenia
- Lidija Magdevska
  - Faculty of Computer and Information Science, University of Ljubljana, Ljubljana, Slovenia
- Tilen Tršelič
  - Faculty of Chemistry and Chemical Technology, University of Ljubljana, Ljubljana, Slovenia
- Timotej Osolin
  - Faculty of Computer and Information Science, University of Ljubljana, Ljubljana, Slovenia
- Rok Vodopivec
  - Faculty of Computer and Information Science, University of Ljubljana, Ljubljana, Slovenia
- Jakob Mraz
  - Biotechnical Faculty, University of Ljubljana, Ljubljana, Slovenia
- Eva Pavliha
  - Biotechnical Faculty, University of Ljubljana, Ljubljana, Slovenia
- Nikolaj Zimic
  - Faculty of Computer and Information Science, University of Ljubljana, Ljubljana, Slovenia
- Tanja Cvitanović
  - Centre for Functional Genomics and Bio-Chips, Institute of Biochemistry, Faculty of Medicine, University of Ljubljana, Ljubljana, Slovenia
- Damjana Rozman
  - Centre for Functional Genomics and Bio-Chips, Institute of Biochemistry, Faculty of Medicine, University of Ljubljana, Ljubljana, Slovenia
- Miha Moškon
  - Faculty of Computer and Information Science, University of Ljubljana, Ljubljana, Slovenia
- Miha Mraz
  - Faculty of Computer and Information Science, University of Ljubljana, Ljubljana, Slovenia
|
9
|
Chen C, Le H, Goudar CT. Evaluation of two public genome references for Chinese hamster ovary cells in the context of RNA-Seq based gene expression analysis. Biotechnol Bioeng 2017; 114:1603-1613. [PMID: 28295162 DOI: 10.1002/bit.26290] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 11/28/2016] [Revised: 02/21/2017] [Accepted: 03/10/2017] [Indexed: 11/08/2022]
Abstract
RNA-Seq is a powerful transcriptomics tool for mammalian cell culture process development. Successful RNA-Seq data analysis requires a high quality reference for read mapping and gene expression quantification. Currently, there are two public genome references for Chinese hamster ovary (CHO) cells, the predominant mammalian cell line in the biopharmaceutical industry. In this study, we compared these two references by analyzing 60 RNA-Seq samples from a variety of CHO cell culture conditions. Among the 20,891 common genes in both references, we observed that 31.5% have more than 7.1% quantification differences, implying gene definition differences in the two references. We propose a framework to quantify this difference using two metrics, Consistency and Stringency, which account for the average quantification difference between the two references over all samples, and the sample-specific effect on the quantification result, respectively. These two metrics can be used to identify potential genes for future gene model improvement and to understand the reliability of differentially expressed genes identified by RNA-Seq data analysis. Before a more comprehensive genome reference for CHO cells emerges, the strategy proposed in this study can enable more robust transcriptome analysis from CHO cell RNA-Seq data. Biotechnol. Bioeng. 2017;114: 1603-1613. © 2017 Wiley Periodicals, Inc.
Affiliation(s)
- Chun Chen
  - Drug Substance Technologies, Process Development, Amgen Inc., 1 Amgen Center Drive, Thousand Oaks, California, 91320
- Huong Le
  - Drug Substance Technologies, Process Development, Amgen Inc., 1 Amgen Center Drive, Thousand Oaks, California, 91320
- Chetan T Goudar
  - Drug Substance Technologies, Process Development, Amgen Inc., 1 Amgen Center Drive, Thousand Oaks, California, 91320
|
10
|
Farzan P, Ierapetritou MG. Integrated modeling to capture the interaction of physiology and fluid dynamics in biopharmaceutical bioreactors. Comput Chem Eng 2017. [DOI: 10.1016/j.compchemeng.2016.11.037] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 01/09/2023]
|
11
|
Identifying model error in metabolic flux analysis - a generalized least squares approach. BMC Syst Biol 2016; 10:91. [PMID: 27619919 PMCID: PMC5020535 DOI: 10.1186/s12918-016-0335-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 06/02/2016] [Accepted: 08/30/2016] [Indexed: 01/22/2023]
Abstract
BACKGROUND: The estimation of intracellular flux through traditional metabolic flux analysis (MFA) using an overdetermined system of equations is a well established practice in metabolic engineering. Despite the continued evolution of the methodology since its introduction, there has been little focus on validation and identification of poor model fit outside of identifying "gross measurement error". The growing complexity of metabolic models, which are increasingly generated from genome-level data, has necessitated robust validation that can directly assess model fit.
RESULTS: In this work, MFA calculation is framed as a generalized least squares (GLS) problem, highlighting the applicability of the common t-test for model validation. To differentiate between measurement and model error, we simulate ideal flux profiles directly from the model, perturb them with estimated measurement error, and compare their validation to real data. Application of this strategy to an established Chinese Hamster Ovary (CHO) cell model shows how fluxes validated by traditional means may be largely non-significant due to a lack of model fit. With further simulation, we explore how t-test significance relates to calculation error and show that fluxes found to be non-significant have 2-4 fold larger error (if measurement uncertainty is in the 5-10% range).
CONCLUSIONS: The proposed validation method goes beyond traditional detection of "gross measurement error" to identify lack of fit between model and data. Although the focus of this work is on t-test validation and traditional MFA, the presented framework is readily applicable to other regression analysis methods and MFA formulations.
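To make the GLS framing in the abstract above more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: it assumes a hypothetical linear model r = A v + e in which the measured rates r depend on the unknown fluxes v through a design matrix A, and it assumes independent measurement errors with known variances (a diagonal covariance, which reduces GLS to weighted least squares). The names `solve` and `gls_fluxes` and the example numbers are illustrative only. The returned scores are estimates divided by their standard errors; a full t-test as described in the paper would also involve the degrees of freedom, which this sketch does not reproduce.

```python
import math


def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting.

    a is an n x n matrix and b an n x m matrix, both as lists of rows.
    """
    n = len(a)
    m = [list(row_a) + list(row_b) for row_a, row_b in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [x / p for x in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [row[n:] for row in m]


def gls_fluxes(design, measured, variances):
    """Estimate fluxes by GLS with a diagonal covariance (weighted least squares).

    Returns (estimates, scores), where each score is the estimate divided
    by its standard error.
    """
    rows, k = len(design), len(design[0])
    # Normal matrix A^T W A and right-hand side A^T W r, with W = diag(1 / variance)
    normal = [[sum(design[i][p] * design[i][q] / variances[i] for i in range(rows))
               for q in range(k)] for p in range(k)]
    rhs = [sum(design[i][p] * measured[i] / variances[i] for i in range(rows))
           for p in range(k)]
    identity = [[1.0 if p == q else 0.0 for q in range(k)] for p in range(k)]
    cov = solve(normal, identity)  # covariance of the estimates: (A^T W A)^-1
    est = [sum(cov[p][q] * rhs[q] for q in range(k)) for p in range(k)]
    scores = [est[p] / math.sqrt(cov[p][p]) for p in range(k)]
    return est, scores


# Hypothetical overdetermined system: three measurements, two unknown fluxes
design = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
measured = [1.0, 2.0, 3.0]
variances = [1.0, 1.0, 1.0]
estimates, scores = gls_fluxes(design, measured, variances)
print(estimates, scores)
```

A flux whose score is small relative to a chosen threshold would be flagged as non-significant, which is the kind of check the abstract describes for separating measurement error from lack of model fit.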
|
12
|
Sharfstein ST. Omics insights into production-scale bioreactors. Biotechnol J 2016; 11:1124-5. [DOI: 10.1002/biot.201600338] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 06/01/2016] [Revised: 06/01/2016] [Accepted: 06/15/2016] [Indexed: 12/27/2022]
Affiliation(s)
- Susan T. Sharfstein
  - Colleges of Nanoscale Science and Engineering; SUNY Polytechnic Institute; 257 Fuller Road Albany NY 12203 USA
|