1
Fadhil SH, Saheb EJ. Relationship between the serum level, polymorphism and gene expression of IL-33 in samples of recurrent miscarriage Iraqi women infected with toxoplasmosis. Exp Parasitol 2024; 263-264:108799. [PMID: 39025462 DOI: 10.1016/j.exppara.2024.108799] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2023] [Revised: 05/20/2024] [Accepted: 07/11/2024] [Indexed: 07/20/2024]
Abstract
One of the many warm-blooded hosts that toxoplasmosis-causing intracellular protozoan parasite Toxoplasma gondii can infect is humans. Cytokines are crucial to stimulate an effective immune response against T. gondii. Interleukin-33 (IL-33) is a unique anti-inflammatory cytokine that suppresses the immune response. The levels of cytokine gene expression are regulated by genetics, and the genetic polymorphisms of these cytokines play a functional role in this process. Single nucleotide polymorphisms (SNPs) are prognostic indicators of illnesses. This study aimed to determine whether toxoplasmosis interacts with serum levels of IL-33 and its SNP in miscarriage women as well as whether serum levels and IL-33 gene expression are related in toxoplasmosis-positive miscarriage women. Two hundred blood samples from patients and controls were collected from AL-Alawiya Maternity Teaching Hospital and AL-Yarmouk Teaching Hospital in Baghdad, Iraq from 2021 to 2022 in order to evaluate the serum level of IL-33 using ELISA test. For the SNP of IL-33, the allelic high-resolution approach was utilized, and real time-PCR was performed to assess gene expression. The results showed that compared to healthy and pregnant women, recurrent miscarriage with toxoplasmosis and recurrent miscarriage women had lower IL-33 concentrations. Additionally, there were significant differences among healthy women, pregnant women, and women with repeated miscarriage who experienced toxoplasmosis. Furthermore, no differences between patients and controls were revealed by gene expression data. The results revealed that recurrent miscarriage, pregnancy, and healthy women all had a slightly higher amount of the IL-33 gene fold. Additionally, the SNP of IL-33 data demonstrated that there was no significant genetic relationship between patients and controls. 
Recurrent miscarriage women with toxoplasmosis showed significant differences from pregnant women in the genotypes GG and AA as well as the alleles A and G. There were notable variations between recurrent miscarriage with and without toxoplasmosis in terms of the genotypes AA and AC. The genotypes GG and AA and the allele A are protective factors in recurrent miscarriage women with toxoplasmosis and in recurrent miscarriage women. Taken together, there was a statistically significant negative correlation between toxoplasmosis and IL-33 gene expression, which calls for more quantitative investigation in order to fully comprehend the interaction of mRNA and protein.
Affiliation(s)
- Sabreen Hadi Fadhil
  - Department of Biology, College of Science, Baghdad University, Baghdad, Iraq
- Entsar Jabbar Saheb
  - Department of Biology, College of Science, Baghdad University, Baghdad, Iraq
2
Ferreira MADM, Silveira WBD, Nikoloski Z. Protein constraints in genome-scale metabolic models: Data integration, parameter estimation, and prediction of metabolic phenotypes. Biotechnol Bioeng 2024; 121:915-930. [PMID: 38178617 DOI: 10.1002/bit.28650] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2022] [Revised: 10/24/2023] [Accepted: 12/18/2023] [Indexed: 01/06/2024]
Abstract
Genome-scale metabolic models provide a valuable resource to study metabolism and cell physiology. These models are employed with approaches from the constraint-based modeling framework to predict metabolic and physiological phenotypes. The prediction performance of genome-scale metabolic models can be improved by including protein constraints. The resulting protein-constrained models consider data on turnover numbers (kcat) and facilitate the integration of protein abundances. In this systematic review, we present and discuss the current state of the art regarding the estimation of kinetic parameters used in protein-constrained models. We also highlight how data-driven and constraint-based approaches can aid the estimation of turnover numbers and their usage in improving predictions of cellular phenotypes. Finally, we identify standing challenges in protein-constrained metabolic models and provide a perspective regarding future approaches to improve the predictive performance.
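As a rough illustration of what a protein constraint can look like, the block below writes out one commonly used general form; it is a sketch and not necessarily the formulation of any specific method covered in the review. The symbols are notation assumed here: S is the stoichiometric matrix, v_j the flux through reaction j, E_j the abundance of the enzyme catalyzing it, MW_j that enzyme's molecular weight, and P_total an assumed total protein budget.

```latex
\begin{align}
  S\,v &= 0
    && \text{(steady-state mass balance)} \\
  v_j &\le k_{\mathrm{cat},j}\, E_j
    && \text{(flux capped by turnover number times enzyme abundance)} \\
  \sum_j MW_j\, E_j &\le P_{\mathrm{total}}
    && \text{(enzyme mass capped by a total protein budget)}
\end{align}
```

Under this form, a measured protein abundance can be integrated by using it as an upper bound on the corresponding E_j.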
Affiliation(s)
- Zoran Nikoloski
  - Bioinformatics, Institute of Biochemistry and Biology, University of Potsdam, Potsdam, Germany
  - Systems Biology and Mathematical Modeling, Max Planck Institute of Molecular Plant Physiology, Potsdam, Germany
3
Monteiro GA, Duarte SOD. The Effect of Recombinant Protein Production in Lactococcus lactis Transcriptome and Proteome. Microorganisms 2022; 10:267. [PMID: 35208722 PMCID: PMC8877491 DOI: 10.3390/microorganisms10020267] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2021] [Revised: 01/18/2022] [Accepted: 01/20/2022] [Indexed: 11/18/2022]
Abstract
Lactococcus lactis is a food-grade bacterium that is generally recognized as safe, which makes it ideal for producing plasmid DNA (pDNA) or recombinant proteins for industrial or pharmaceutical applications. The present paper reviews the major findings from L. lactis transcriptome and proteome studies involving overexpression of native or recombinant proteins. These studies should provide important insights on how to engineer the plasmid vectors and/or the strains in order to achieve high pDNA or recombinant protein yields with high quality standards. L. lactis harboring high copy numbers of plasmids for DNA vaccine production showed altered proteome profiles when compared with a smaller copy number plasmid. For live mucosal vaccination applications, cell-wall anchored antigens have shown more promising results than intracellular or secreted antigens. However, previous transcriptome and proteome studies demonstrated that engineering L. lactis to express membrane proteins, mainly with a eukaryotic background, increases the overall cellular burden. Genome engineering strategies could be used to knock out or overexpress the pinpointed genes, so as to increase the profitability of the process. Studies about the effect of protein overexpression on the Escherichia coli and Bacillus subtilis transcriptome and proteome are also included.
Affiliation(s)
- Gabriel A. Monteiro
  - iBB—Institute for Bioengineering and Biosciences, Department of Bioengineering, Instituto Superior Técnico, Universidade de Lisboa, Av. Rovisco Pais, 1049-001 Lisboa, Portugal
- Sofia O. D. Duarte
  - iBB—Institute for Bioengineering and Biosciences, Instituto Superior Técnico, Universidade de Lisboa, Av. Rovisco Pais, 1049-001 Lisboa, Portugal
  - Correspondence:
4
Wang M, Gong C, Amakye W, Ren J. Exploring the Mechanisms of Anti-Aβ42 Aggregation Activity of Walnut-derived Peptides using Transcriptomics and Proteomics in vitro. eFood 2022. [DOI: 10.53365/efood.k/144885] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/05/2022]
Abstract
Inhibiting β-amyloid (Aβ) aggregation is of significance in finding potential candidates for Alzheimer’s disease (AD) treatment. Accumulating evidence suggests that nutrition is important for improving cognition and reducing AD risk. Walnut has been widely used as a functional food for brain health; however the underlying mechanisms remain unknown. Here, we investigated the molecular level alteration in Arctic mutant Aβ42 induced aggregation cell model by RNA-seq and iTRAQ approaches after walnut-derived peptides Pro-Pro-Lys-Asn-Trp (PW5) and Trp-Pro-Pro-Lys-Asn (WN5) interventions. PW5 or WN5 could significantly decrease abnormal Aβ42 aggregates. However, resultant alterations in transcriptome (substantially unchanged) were inconsistent with proteomic data (marked change). Proteomic analysis revealed 184 and 194 differentially expressed proteins unique to PW5 and WN5 treatment, respectively, for inhibiting Aβ42 protein production or increasing protein degradation via the mismatch repair pathways. Our study provides new insights into the effectiveness of food-derived peptides for anti-Aβ42 aggregation in AD.
5
Zhu X, Wang J, Sun B, Ren C, Yang T, Ding J. An efficient ensemble method for missing value imputation in microarray gene expression data. BMC Bioinformatics 2021; 22:188. [PMID: 33849444 PMCID: PMC8045198 DOI: 10.1186/s12859-021-04109-4] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/30/2020] [Accepted: 03/29/2021] [Indexed: 11/10/2022]
Abstract
BACKGROUND Genomics data analysis has been widely used to study disease genes and drug targets. However, the existence of missing values in genomics datasets poses a significant problem, which severely hinders the use of genomics data. Current imputation methods based on a single learner often exploit less of the known genomic data information for imputation and thus lose imputation performance. RESULTS In this study, multiple single imputation methods are combined into one imputation method by ensemble learning. In the ensemble method, bootstrap sampling is applied for predictions of missing values by each component method, and these predictions are weighted and summed to produce the final prediction. The optimal weights are learned from known gene data by minimizing a cost function of the imputation error, and the expression for the optimal weights is derived in closed form. Additionally, the performance of the ensemble method is analytically investigated in terms of the sum of squared regression errors. The proposed method is simulated on several typical genomic datasets and compared with state-of-the-art imputation methods at different noise levels, sample sizes and data missing rates. Experimental results show that the proposed method achieves improved imputation performance in terms of imputation accuracy, robustness and generalization. CONCLUSION The ensemble method possesses superior imputation performance since it can make use of known data information more efficiently for missing data imputation by integrating diverse imputation methods and learning the integration weights in a data-driven way.
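The weighting step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the cost function is the plain sum of squared errors on known entries, solves the resulting normal equations directly, and leaves out the bootstrap sampling and any constraints on the weights. The names `learn_weights` and `ensemble_impute` are hypothetical.

```python
def learn_weights(preds, truth):
    """Least-squares weights for combining component imputers.

    preds: list of rows, one per known entry; each row holds the value
           each component method predicted for that entry.
    truth: list of the known true values for the same entries.
    Assumes the normal-equation matrix is non-singular.
    """
    k = len(preds[0])
    # Normal equations (P^T P) w = P^T y.
    a = [[sum(row[i] * row[j] for row in preds) for j in range(k)] for i in range(k)]
    b = [sum(row[i] * y for row, y in zip(preds, truth)) for i in range(k)]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, k):
            factor = a[r][col] / a[col][col]
            for c in range(col, k):
                a[r][c] -= factor * a[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    w = [0.0] * k
    for r in range(k - 1, -1, -1):
        tail = sum(a[r][c] * w[c] for c in range(r + 1, k))
        w[r] = (b[r] - tail) / a[r][r]
    return w


def ensemble_impute(component_preds, weights):
    """Final prediction for one missing entry: weighted sum of component predictions."""
    return sum(w * p for w, p in zip(weights, component_preds))
```

In use, the known entries would be masked, each component method would predict them, and the learned weights would then be applied to the component predictions for the genuinely missing entries.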
Affiliation(s)
- Xinshan Zhu
  - School of Electrical and Information Engineering, Tianjin University, Tianjin, 300072, China
  - State Key Laboratory of Digital Publishing Technology, Beijing, 100871, China
- Jiayu Wang
  - School of Electrical and Information Engineering, Tianjin University, Tianjin, 300072, China
- Biao Sun
  - School of Electrical and Information Engineering, Tianjin University, Tianjin, 300072, China
- Chao Ren
  - School of Electrical and Information Engineering, Tianjin University, Tianjin, 300072, China
- Ting Yang
  - School of Electrical and Information Engineering, Tianjin University, Tianjin, 300072, China
- Jie Ding
  - China Institute of FTZ Supply Chain, Shanghai Maritime University, Shanghai, 201306, China
6
Bai B, van der Horst N, Cordewener JH, America AHP, Nijveen H, Bentsink L. Delayed Protein Changes During Seed Germination. Front Plant Sci 2021; 12:735719. [PMID: 34603360 PMCID: PMC8480309 DOI: 10.3389/fpls.2021.735719] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2021] [Accepted: 08/05/2021] [Indexed: 05/12/2023]
Abstract
Over the past decade, ample transcriptome data have been generated at different stages during seed germination; however, far less is known about protein synthesis during this important physiological process. Generally, the correlation between transcript levels and protein abundance is low, which strongly limits the use of transcriptome data to accurately estimate protein expression. Polysomal profiling has emerged as a tool to identify mRNAs that are actively translated. The association of the mRNA to the polysome, also referred to as translatome, provides a proxy for mRNA translation. In this study, the correlation between the changes in total mRNA, polysome-associated mRNA, and protein levels across seed germination was investigated. The direct correlation between polysomal mRNA and protein abundance at a single time-point during seed germination is low. However, once the polysomal mRNA of a time-point is compared to the proteome of the next time-point, the correlation is much higher. 35% of the investigated proteome has delayed changes at the protein level. Genes have been classified based on their delayed protein changes, and specific motifs in these genes have been identified. Moreover, mRNA and protein stability and mRNA length have been found as important predictors for changes in protein abundance. In conclusion, polysome association and/or dissociation predicts future changes in protein abundance in germinating seeds.
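The time-shifted comparison described in this abstract can be sketched as a lagged correlation. This is a minimal illustration under stated assumptions, not the authors' analysis: it assumes Pearson correlation across genes, equally spaced time points, and values keyed by gene identifier. The names `pearson` and `lagged_correlation` are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (assumes non-zero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def lagged_correlation(mrna, protein, lag=0):
    """Correlate mRNA at time t with protein at time t + lag, across genes.

    mrna, protein: dicts mapping gene id -> list of values per time point.
    Returns one correlation per usable time point t.
    """
    genes = sorted(set(mrna) & set(protein))
    n_times = len(mrna[genes[0]])
    result = []
    for t in range(n_times - lag):
        x = [mrna[g][t] for g in genes]
        y = [protein[g][t + lag] for g in genes]
        result.append(pearson(x, y))
    return result
```

Calling the function with `lag=0` gives the same-time-point comparison, and `lag=1` compares each time point's polysomal mRNA with the proteome of the next time point.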
Affiliation(s)
- Bing Bai
  - Wageningen Seed Science Centre, Laboratory of Plant Physiology, Wageningen University, Wageningen, Netherlands
  - *Correspondence: Bing Bai
- Jan H. Cordewener
  - BU Bioscience, Wageningen Plant Research, Wageningen, Netherlands
  - Centre for BioSystems Genomics, Wageningen, Netherlands
  - Netherlands Proteomics Centre, Utrecht, Netherlands
- Antoine H. P. America
  - BU Bioscience, Wageningen Plant Research, Wageningen, Netherlands
  - Centre for BioSystems Genomics, Wageningen, Netherlands
  - Netherlands Proteomics Centre, Utrecht, Netherlands
- Harm Nijveen
  - Bioinformatics Group, Wageningen University, Wageningen, Netherlands
- Leónie Bentsink
  - Wageningen Seed Science Centre, Laboratory of Plant Physiology, Wageningen University, Wageningen, Netherlands
  - *Correspondence: Leónie Bentsink
7
Jiang D, Armour CR, Hu C, Mei M, Tian C, Sharpton TJ, Jiang Y. Microbiome Multi-Omics Network Analysis: Statistical Considerations, Limitations, and Opportunities. Front Genet 2019; 10:995. [PMID: 31781153 PMCID: PMC6857202 DOI: 10.3389/fgene.2019.00995] [Citation(s) in RCA: 86] [Impact Index Per Article: 17.2] [Received: 02/13/2019] [Accepted: 09/18/2019] [Indexed: 12/21/2022]
Abstract
The advent of large-scale microbiome studies affords newfound analytical opportunities to understand how these communities of microbes operate and relate to their environment. However, the analytical methodology needed to model microbiome data and integrate them with other data constructs remains nascent. This emergent analytical toolset frequently ports over techniques developed in other multi-omics investigations, especially the growing array of statistical and computational techniques for integrating and representing data through networks. While network analysis has emerged as a powerful approach to modeling microbiome data, oftentimes by integrating these data with other types of omics data to discern their functional linkages, it is not always evident if the statistical details of the approach being applied are consistent with the assumptions of microbiome data or how they impact data interpretation. In this review, we overview some of the most important network methods for integrative analysis, with an emphasis on methods that have been applied or have great potential to be applied to the analysis of multi-omics integration of microbiome data. We compare advantages and disadvantages of various statistical tools, assess their applicability to microbiome data, and discuss their biological interpretability. We also highlight on-going statistical challenges and opportunities for integrative network analysis of microbiome data.
Affiliation(s)
- Duo Jiang
  - Department of Statistics, Oregon State University, Corvallis, OR, United States
- Courtney R Armour
  - Department of Microbiology, Oregon State University, Corvallis, OR, United States
- Chenxiao Hu
  - Department of Statistics, Oregon State University, Corvallis, OR, United States
- Meng Mei
  - Department of Statistics, Oregon State University, Corvallis, OR, United States
- Chuan Tian
  - Department of Statistics, Oregon State University, Corvallis, OR, United States
- Thomas J Sharpton
  - Department of Statistics, Oregon State University, Corvallis, OR, United States
  - Department of Microbiology, Oregon State University, Corvallis, OR, United States
- Yuan Jiang
  - Department of Statistics, Oregon State University, Corvallis, OR, United States
8
Li Y, Fan TWM, Lane AN, Kang WY, Arnold SM, Stromberg AJ, Wang C, Chen L. SDA: a semi-parametric differential abundance analysis method for metabolomics and proteomics data. BMC Bioinformatics 2019; 20:501. [PMID: 31623550 PMCID: PMC6798423 DOI: 10.1186/s12859-019-3067-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 03/19/2019] [Accepted: 09/03/2019] [Indexed: 12/21/2022]
Abstract
BACKGROUND Identifying differentially abundant features between different experimental groups is a common goal for many metabolomics and proteomics studies. However, analyzing data from mass spectrometry (MS) is difficult because the data may not be normally distributed and there is often a large fraction of zero values. Although several statistical methods have been proposed, they either require the data normality assumption or are inefficient. RESULTS We propose a new semi-parametric differential abundance analysis (SDA) method for metabolomics and proteomics data from MS. The method considers a two-part model, a logistic regression for the zero proportion and a semi-parametric log-linear model for the possibly non-normally distributed non-zero values, to characterize data from each feature. A kernel-smoothed likelihood method is developed to estimate model coefficients and a likelihood ratio test is constructed for differential abundance analysis. The method has been implemented in an R package, SDAMS, which is available at https://www.bioconductor.org/packages/release/bioc/html/SDAMS.html. CONCLUSION By introducing the two-part semi-parametric model, SDA is able to handle both non-normally distributed data and a large fraction of zero values in an MS dataset. It also allows for adjustment of covariates. Simulations and real data analyses demonstrate that SDA outperforms existing methods.
Affiliation(s)
- Yuntong Li
  - Department of Statistics, University of Kentucky, Lexington, 40536, USA
- Teresa W M Fan
  - Markey Cancer Center, University of Kentucky, Lexington, 40536, USA
  - Center for Environmental and Systems Biochemistry, University of Kentucky, Lexington, 40536, USA
  - Department of Toxicology and Cancer Biology, University of Kentucky, Lexington, 40536, USA
- Andrew N Lane
  - Markey Cancer Center, University of Kentucky, Lexington, 40536, USA
  - Center for Environmental and Systems Biochemistry, University of Kentucky, Lexington, 40536, USA
  - Department of Toxicology and Cancer Biology, University of Kentucky, Lexington, 40536, USA
- Woo-Young Kang
  - Markey Cancer Center, University of Kentucky, Lexington, 40536, USA
  - Center for Environmental and Systems Biochemistry, University of Kentucky, Lexington, 40536, USA
  - Department of Toxicology and Cancer Biology, University of Kentucky, Lexington, 40536, USA
- Susanne M Arnold
  - Markey Cancer Center, University of Kentucky, Lexington, 40536, USA
  - Department of Medicine, University of Kentucky, Lexington, 40536, USA
- Chi Wang
  - Markey Cancer Center, University of Kentucky, Lexington, 40536, USA
  - Department of Biostatistics, University of Kentucky, Lexington, 40536, USA
- Li Chen
  - Markey Cancer Center, University of Kentucky, Lexington, 40536, USA
  - Department of Biostatistics, University of Kentucky, Lexington, 40536, USA
9
Du Y, Clair GC, Al Alam D, Danopoulos S, Schnell D, Kitzmiller JA, Misra RS, Bhattacharya S, Warburton D, Mariani TJ, Pryhuber GS, Whitsett JA, Ansong C, Xu Y. Integration of transcriptomic and proteomic data identifies biological functions in cell populations from human infant lung. Am J Physiol Lung Cell Mol Physiol 2019; 317:L347-L360. [PMID: 31268347 DOI: 10.1152/ajplung.00475.2018] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Indexed: 12/13/2022]
Abstract
Systems biology uses computational approaches to integrate diverse data types to understand cell and organ behavior. Data derived from complementary technologies, for example transcriptomic and proteomic analyses, are providing new insights into development and disease. We compared mRNA and protein profiles from purified endothelial, epithelial, immune, and mesenchymal cells from normal human infant lung tissue. Signatures for each cell type were identified and compared at both mRNA and protein levels. Cell-specific biological processes and pathways were predicted by analysis of concordant and discordant RNA-protein pairs. Cell clustering and gene set enrichment comparisons identified shared versus unique processes associated with transcriptomic and/or proteomic data. Clear cell-cell correlations between mRNA and protein data were obtained from each cell type. Approximately 40% of RNA-protein pairs were coherently expressed. While the correlation between RNA and their protein products was relatively low (Spearman rank coefficient rs ~0.4), cell-specific signature genes involved in functional processes characteristic of each cell type were more highly correlated with their protein products. Consistency of cell-specific RNA-protein signatures indicated an essential framework for the function of each cell type. Visualization and reutilization of the protein and RNA profiles are supported by a new web application, "LungProteomics," which is freely accessible to the public.
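The Spearman rank coefficient quoted in this abstract can be computed as sketched below. This is a minimal illustration, not the authors' pipeline: it assumes paired RNA and protein values for the same genes, assigns tied values their average rank, and takes the Pearson correlation of the ranks. The names `average_ranks` and `spearman` are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation (assumes neither input is constant)."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)
```

Because only ranks enter the calculation, any monotone relationship between RNA and protein levels gives a coefficient of 1, whatever its shape.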
Affiliation(s)
- Yina Du
  - The Perinatal Institute and Section of Neonatology, Perinatal and Pulmonary Biology, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio
- Geremy C Clair
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington
- Denise Al Alam
  - Developmental Biology and Regenerative Medicine Program, Department of Pediatric Surgery, The Saban Research Institute, Children's Hospital Los Angeles, Los Angeles, California
  - Keck School of Medicine, University of Southern California, Los Angeles, California
- Soula Danopoulos
  - Developmental Biology and Regenerative Medicine Program, Department of Pediatric Surgery, The Saban Research Institute, Children's Hospital Los Angeles, Los Angeles, California
  - Keck School of Medicine, University of Southern California, Los Angeles, California
- Daniel Schnell
  - Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio
  - Heart Institute and Center for Translational Fibrosis Research, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio
- Joseph A Kitzmiller
  - The Perinatal Institute and Section of Neonatology, Perinatal and Pulmonary Biology, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio
- Ravi S Misra
  - Department of Pediatrics, University of Rochester Medical Center, Rochester, New York
- Soumyaroop Bhattacharya
  - Department of Pediatrics, University of Rochester Medical Center, Rochester, New York
  - Division of Neonatology and Program in Pediatric Molecular and Personalized Medicine, University of Rochester Medical Center, Rochester, New York
- David Warburton
  - Developmental Biology and Regenerative Medicine Program, Department of Pediatric Surgery, The Saban Research Institute, Children's Hospital Los Angeles, Los Angeles, California
  - Keck School of Medicine, University of Southern California, Los Angeles, California
- Thomas J Mariani
  - Department of Pediatrics, University of Rochester Medical Center, Rochester, New York
  - Division of Neonatology and Program in Pediatric Molecular and Personalized Medicine, University of Rochester Medical Center, Rochester, New York
- Gloria S Pryhuber
  - Department of Pediatrics, University of Rochester Medical Center, Rochester, New York
- Jeffrey A Whitsett
  - The Perinatal Institute and Section of Neonatology, Perinatal and Pulmonary Biology, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio
- Charles Ansong
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington
- Yan Xu
  - The Perinatal Institute and Section of Neonatology, Perinatal and Pulmonary Biology, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio
  - Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio
10
Lin D, Zhang J, Li J, Xu C, Deng HW, Wang YP. An integrative imputation method based on multi-omics datasets. BMC Bioinformatics 2016; 17:247. [PMID: 27329642 PMCID: PMC4915152 DOI: 10.1186/s12859-016-1122-6] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 09/30/2015] [Accepted: 06/05/2016] [Indexed: 12/26/2022]
Abstract
BACKGROUND Integrative analysis of multi-omics data is becoming increasingly important to unravel functional mechanisms of complex diseases. However, the currently available multi-omics datasets inevitably suffer from missing values due to technical limitations and various constraints in experiments. These missing values severely hinder integrative analysis of multi-omics data. Current imputation methods mainly focus on using single omics data while ignoring biological interconnections and information embedded in multi-omics data sets. RESULTS In this study, a novel multi-omics imputation method was proposed to integrate multiple correlated omics datasets for improving the imputation accuracy. Our method was designed to: 1) combine the estimates of missing values from the individual omics data itself as well as from other omics, and 2) simultaneously impute multiple missing omics datasets by an iterative algorithm. We compared our method with five imputation methods using single omics data at different noise levels, sample sizes and data missing rates. The results demonstrated the advantage and efficiency of our method, consistently in terms of the imputation error and the recovery of mRNA-miRNA network structure. CONCLUSIONS We concluded that our proposed imputation method can utilize more biological information to minimize the imputation error and thus can improve the performance of downstream analysis such as genetic regulatory network construction.
Affiliation(s)
- Dongdong Lin
  - Department of Biomedical Engineering, Tulane University, New Orleans, LA, 70118, USA
  - Center for Bioinformatics and Genomics, Tulane University, New Orleans, LA, 70112, USA
- Jigang Zhang
  - Center for Bioinformatics and Genomics, Tulane University, New Orleans, LA, 70112, USA
  - Department of Biostatistics and Bioinformatics, Tulane University, New Orleans, LA, 70112, USA
- Jingyao Li
  - Department of Biomedical Engineering, Tulane University, New Orleans, LA, 70118, USA
  - Center for Bioinformatics and Genomics, Tulane University, New Orleans, LA, 70112, USA
- Chao Xu
  - Center for Bioinformatics and Genomics, Tulane University, New Orleans, LA, 70112, USA
  - Department of Biostatistics and Bioinformatics, Tulane University, New Orleans, LA, 70112, USA
- Hong-Wen Deng
  - Center for Bioinformatics and Genomics, Tulane University, New Orleans, LA, 70112, USA
  - Department of Biostatistics and Bioinformatics, Tulane University, New Orleans, LA, 70112, USA
- Yu-Ping Wang
  - Department of Biomedical Engineering, Tulane University, New Orleans, LA, 70118, USA
  - Center for Bioinformatics and Genomics, Tulane University, New Orleans, LA, 70112, USA
  - Department of Biostatistics and Bioinformatics, Tulane University, New Orleans, LA, 70112, USA
11
Bundy JL, Inouye BD, Mercer RS, Nowakowski RS. Fractionation-dependent improvements in proteome resolution in the mouse hippocampus by IEF LC-MS/MS. Electrophoresis 2016; 37:2054-62. [DOI: 10.1002/elps.201600076] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 10/07/2015] [Revised: 04/03/2016] [Accepted: 04/20/2016] [Indexed: 01/19/2023]
Affiliation(s)
- Joseph L. Bundy
  - Department of Biomedical Sciences, College of Medicine; Florida State University; Tallahassee FL USA
- Brian D. Inouye
  - Department of Biological Science; Florida State University; Tallahassee FL USA
- Roger S. Mercer
  - Translational Science Laboratory; College of Medicine Florida State University; Tallahassee FL USA
- Richard S. Nowakowski
  - Department of Biomedical Sciences, College of Medicine; Florida State University; Tallahassee FL USA
12
Qi F, Zhao X, Kitahara Y, Li T, Ou X, Du W, Liu D, Huang J. Integrative transcriptomic and proteomic analysis of the mutant lignocellulosic hydrolyzate-tolerant Rhodosporidium toruloides. Eng Life Sci 2016; 17:249-261. [PMID: 32624772 DOI: 10.1002/elsc.201500143] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 10/14/2015] [Revised: 11/15/2015] [Accepted: 01/14/2016] [Indexed: 12/15/2022]
Abstract
The oleaginous yeast Rhodosporidium toruloides has been considered an economical lipid producer because it transforms carbohydrates from lignocellulosic hydrolyzate into triglycerides; however, R. toruloides cannot survive in hydrolyzate due to the inhibitors co-produced by hydrolysis. We have previously reported a plasma mutagenesis-generated mutant strain, M18, that had strong tolerance for the stress environments of hydrolyzate. Here, we applied transcriptomic and proteomic approaches to analyze the global metabolic responses of R. toruloides to the stress in hydrolyzate and elucidate the tolerance mechanism of the mutant strain. The results showed that 57% of genes matched and correlated well with their corresponding proteins. Five hundred and seven genes and 366 proteins had their transcription and expression levels changed, respectively, and 39 key genes with significantly changed transcription and expression levels (≥5-fold changes) were identified. The results demonstrated that four cellular processes and their key genes are likely related to the tolerance mechanism of the M18 strain. Enhanced expression of the key genes in R. toruloides could improve the cellular stress tolerance to lignocellulosic hydrolyzate, while the altered expression of most key genes is probably not caused by mutagenesis, but induced by the stressful environments of the hydrolyzate.
Collapse
Affiliation(s)
- Feng Qi
- College of Life Sciences Fujian Normal University Fuzhou, Fujian China.,Institute of Applied Chemistry Department of Chemical Engineering Tsinghua University Beijing China
| | - Xuebing Zhao
- Institute of Applied Chemistry Department of Chemical Engineering Tsinghua University Beijing China
| | - Yuki Kitahara
- Department of Bioengineering Tokyo Institute of Technology Yokohama, Kanagawa Japan
| | - Tian Li
- Institute of Applied Chemistry Department of Chemical Engineering Tsinghua University Beijing China
| | - Xianjin Ou
- Institute of Applied Chemistry Department of Chemical Engineering Tsinghua University Beijing China
| | - Wei Du
- Institute of Applied Chemistry Department of Chemical Engineering Tsinghua University Beijing China
| | - Dehua Liu
- Institute of Applied Chemistry Department of Chemical Engineering Tsinghua University Beijing China
| | - Jianzhong Huang
- College of Life Sciences Fujian Normal University Fuzhou, Fujian China
|
13
|
Lazar C, Gatto L, Ferro M, Bruley C, Burger T. Accounting for the Multiple Natures of Missing Values in Label-Free Quantitative Proteomics Data Sets to Compare Imputation Strategies. J Proteome Res 2016; 15:1116-25. [DOI: 10.1021/acs.jproteome.5b00981] [Citation(s) in RCA: 232] [Impact Index Per Article: 29.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/21/2023]
Affiliation(s)
- Cosmin Lazar
- Univ. Grenoble Alpes, iRTSV-BGE, F-38000 Grenoble, France
- CEA, iRTSV-BGE, F-38000 Grenoble, France
- INSERM, BGE, F-38000 Grenoble, France
| | - Laurent Gatto
- Computational Proteomics Unit, Cambridge CB2 1GA, United Kingdom
- Cambridge Center for Proteomics, Cambridge CB2 1GA, United Kingdom
| | - Myriam Ferro
- Univ. Grenoble Alpes, iRTSV-BGE, F-38000 Grenoble, France
- CEA, iRTSV-BGE, F-38000 Grenoble, France
- INSERM, BGE, F-38000 Grenoble, France
| | - Christophe Bruley
- Univ. Grenoble Alpes, iRTSV-BGE, F-38000 Grenoble, France
- CEA, iRTSV-BGE, F-38000 Grenoble, France
- INSERM, BGE, F-38000 Grenoble, France
| | - Thomas Burger
- Univ. Grenoble Alpes, iRTSV-BGE, F-38000 Grenoble, France
- CNRS, iRTSV-BGE, F-38000 Grenoble, France
- CEA, iRTSV-BGE, F-38000 Grenoble, France
- INSERM, BGE, F-38000 Grenoble, France
|
14
|
Wang J, Wu G, Chen L, Zhang W. Integrated Analysis of Transcriptomic and Proteomic Datasets Reveals Information on Protein Expressivity and Factors Affecting Translational Efficiency. Methods Mol Biol 2016; 1375:123-136. [PMID: 25762301 DOI: 10.1007/7651_2015_242] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/04/2023]
Abstract
Integrated analysis of large-scale transcriptomic and proteomic data can provide important insights into the metabolic mechanisms underlying complex biological systems. In this chapter, we present methods to address two issues related to integrated transcriptomic and proteomic analysis. First, proteomic datasets are often incomplete, and integrated analysis of partial proteomic data may introduce significant bias. To address this issue, we describe a zero-inflated Poisson (ZIP)-based model to uncover the complicated relationships between protein abundances and mRNA expression levels, and then apply it to predict protein abundance for the proteins not experimentally detected. The ZIP model takes into consideration the undetected proteins by assuming that there is a probability mass at zero representing expressed proteins that were undetected owing to technical limitations. The model validity is demonstrated using biological information of operons, regulons, and pathways. Second, weak correlation between transcriptomic and proteomic datasets is often due to biological factors affecting translational processes. To quantify the effects of these factors, we describe a multiple regression-based statistical framework to quantitatively examine the effects of various translational efficiency-related sequence features on mRNA-protein correlation. Using the datasets from the sulfate-reducing bacterium Desulfovibrio vulgaris, the analysis shows that translation-related sequence features can contribute up to 15.2-26.2% of the total variation of the correlation between transcriptomic and proteomic datasets, and also reveals the relative importance of various features in the translation process.
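The zero-inflated Poisson idea in this abstract can be illustrated with a minimal sketch. It uses the textbook ZIP probability mass function, not the chapter's fitted model; the names `zip_pmf`, `prob_extra_zero`, `lam`, and `pi_zero` are hypothetical and chosen only for illustration.

```python
import math


def zip_pmf(k: int, lam: float, pi_zero: float) -> float:
    """Textbook zero-inflated Poisson probability of observing count k.

    pi_zero is the extra probability mass at zero (here read as "expressed
    but undetected for technical reasons"); lam is the Poisson mean.
    Both parameter names are assumptions made for this sketch.
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        return pi_zero + (1.0 - pi_zero) * poisson
    return (1.0 - pi_zero) * poisson


def prob_extra_zero(lam: float, pi_zero: float) -> float:
    """Probability that an observed zero comes from the extra zero mass
    rather than from the Poisson component (Bayes' rule)."""
    return pi_zero / zip_pmf(0, lam, pi_zero)
```

With a Poisson mean of 2 and 30% extra zero mass, `prob_extra_zero(2.0, 0.3)` gives the share of observed zeros attributable to non-detection rather than to genuinely low counts.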
Affiliation(s)
- Jiangxin Wang
- Laboratory of Synthetic Microbiology, School of Chemical Engineering and Technology, Tianjin University, Tianjin, 300072, People's Republic of China
- Key Laboratory of Systems Bioengineering, Ministry of Education of China, Tianjin, 300072, People's Republic of China
- Collaborative Innovation Center of Chemical Science and Engineering, Tianjin, People's Republic of China
| | - Gang Wu
- University of Maryland at Baltimore County, Baltimore County, MD, USA
| | - Lei Chen
- Laboratory of Synthetic Microbiology, School of Chemical Engineering and Technology, Tianjin University, Tianjin, 300072, People's Republic of China
- Key Laboratory of Systems Bioengineering, Ministry of Education of China, Tianjin, 300072, People's Republic of China
- Collaborative Innovation Center of Chemical Science and Engineering, Tianjin, People's Republic of China
| | - Weiwen Zhang
- Laboratory of Synthetic Microbiology, School of Chemical Engineering and Technology, Tianjin University, Tianjin, 300072, People's Republic of China.
- Key Laboratory of Systems Bioengineering, Ministry of Education of China, Tianjin, 300072, People's Republic of China.
- Collaborative Innovation Center of Chemical Science and Engineering, Tianjin, People's Republic of China.
|
15
|
Gao L, Pei G, Chen L, Zhang W. A global network-based protocol for functional inference of hypothetical proteins in Synechocystis sp. PCC 6803. J Microbiol Methods 2015; 116:44-52. [DOI: 10.1016/j.mimet.2015.06.013] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/19/2015] [Revised: 06/24/2015] [Accepted: 06/25/2015] [Indexed: 01/15/2023]
|
16
|
A Post-Genomic View of the Ecophysiology, Catabolism and Biotechnological Relevance of Sulphate-Reducing Prokaryotes. Adv Microb Physiol 2015. [PMID: 26210106 DOI: 10.1016/bs.ampbs.2015.05.002] [Citation(s) in RCA: 174] [Impact Index Per Article: 19.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/29/2022]
Abstract
Dissimilatory sulphate reduction is the unifying and defining trait of sulphate-reducing prokaryotes (SRP). In their predominant habitats, sulphate-rich marine sediments, SRP have long been recognized to be major players in the carbon and sulphur cycles. Other, more recently appreciated, ecophysiological roles include activity in the deep biosphere, symbiotic relations, syntrophic associations, human microbiome/health and long-distance electron transfer. SRP include a high diversity of organisms, with large nutritional versatility and broad metabolic capacities, including anaerobic degradation of aromatic compounds and hydrocarbons. Elucidation of novel catabolic capacities as well as progress in the understanding of metabolic and regulatory networks, energy metabolism, evolutionary processes and adaptation to changing environmental conditions has greatly benefited from genomics, functional OMICS approaches and advances in genetic accessibility and biochemical studies. Important biotechnological roles of SRP range from (i) wastewater and off gas treatment, (ii) bioremediation of metals and hydrocarbons and (iii) bioelectrochemistry, to undesired impacts such as (iv) souring in oil reservoirs and other environments, and (v) corrosion of iron and concrete. Here we review recent advances in our understanding of SRPs focusing mainly on works published after 2000. The wealth of publications in this period, covering many diverse areas, is a testimony to the large environmental, biogeochemical and technological relevance of these organisms and how much the field has progressed in these years, although many important questions and applications remain to be explored.
|
17
|
Guindani M, Sepúlveda N, Paulino CD, Müller P. A Bayesian Semi-parametric Approach for the Differential Analysis of Sequence Counts Data. J R Stat Soc Ser C Appl Stat 2014; 63:385-404. [PMID: 24833809 PMCID: PMC4017673 DOI: 10.1111/rssc.12041] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/31/2022]
Abstract
Data obtained using modern sequencing technologies are often summarized by recording the frequencies of observed sequences. Examples include the analysis of T cell counts in immunological research and studies of gene expression based on counts of RNA fragments. In both cases the items being counted are sequences, of proteins and base pairs, respectively. The resulting sequence-abundance distribution is usually characterized by overdispersion. We propose a Bayesian semi-parametric approach to implement inference for such data. Besides modeling the overdispersion, the approach also takes into account two related sources of bias that are usually associated with sequence count data: some sequence types may not be recorded during the experiment, and the total count may differ from one experiment to another. We illustrate our methodology with two data sets, one regarding the analysis of CD4+ T cell counts in healthy and diabetic mice and another concerning the comparison of mRNA fragments recorded in a Serial Analysis of Gene Expression (SAGE) experiment with gastrointestinal tissue of healthy and cancer patients.
Affiliation(s)
- Michele Guindani
- Department of Biostatistics, U.T. M.D. Anderson Cancer Center, Houston, TX, USA
| | - Nuno Sepúlveda
- London School of Hygiene and Tropical Medicine, United Kingdom and Centre of Statistics and Applications of University of Lisbon, Portugal
| | - Carlos Daniel Paulino
- Departamento de Matemática, Instituto Superior Técnico, Portugal and Centre of Statistics and Applications of University of Lisbon, Portugal
| | - Peter Müller
- Department of Mathematics, University of Texas at Austin, Austin, TX, USA
|
18
|
Tomescu OA, Mattanovich D, Thallinger GG. Integrative omics analysis. A study based on Plasmodium falciparum mRNA and protein data. BMC SYSTEMS BIOLOGY 2014; 8 Suppl 2:S4. [PMID: 25033389 PMCID: PMC4101701 DOI: 10.1186/1752-0509-8-s2-s4] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]
Abstract
Background: Technological improvements have shifted the focus from data generation to data analysis. The availability of large amounts of data from transcriptomics, proteomics and metabolomics experiments raises new questions concerning suitable integrative analysis methods. We compare three integrative analysis techniques (co-inertia analysis, generalized singular value decomposition and integrative biclustering) by applying them to gene and protein abundance data from the six life cycle stages of Plasmodium falciparum. Co-inertia analysis (CIA) is an analysis method used to visualize and explore gene and protein data. The generalized singular value decomposition (GSVD) has shown its potential in the analysis of two transcriptome data sets. Integrative biclustering (IBC) applies biclustering to gene and protein data. Results: Using CIA, we visualize the six life cycle stages of Plasmodium falciparum, as well as GO terms, in a 2D plane and interpret the spatial configuration. With GSVD, we decompose the transcriptomic and proteomic data sets into matrices with biologically meaningful interpretations and explore the processes captured by the data sets. IBC identifies groups of genes, proteins, GO terms and life cycle stages of Plasmodium falciparum. We show method-specific results as well as a network view of the life cycle stages based on the results common to all three methods. Additionally, by combining the results of the three methods, we create a three-fold validated network of life cycle stage specific GO terms: sporozoites are associated with transcription and transport; merozoites with entry into host cell as well as biosynthetic and metabolic processes; rings with oxidation-reduction processes; trophozoites with glycolysis and energy production; schizonts with antigenic variation and immune response; gametocytes with DNA packaging and mitochondrial transport.
Furthermore, the network connectivity underlines the separation of the intraerythrocytic cycle from the gametocyte and sporozoite stages. Conclusion: Using integrative analysis techniques, we can integrate knowledge from different levels and obtain a wider view of the system under study. The overlap between method-specific and common results is considerable, even if the basic mathematical assumptions are very different. The three-fold validated network of life cycle stage characteristics of Plasmodium falciparum could identify a large amount of the known associations from literature in only one study.
|
19
|
Proteomic and transcriptomic analyses of "Candidatus Pelagibacter ubique" describe the first PII-independent response to nitrogen limitation in a free-living Alphaproteobacterium. mBio 2013; 4:e00133-12. [PMID: 24281717 PMCID: PMC3870248 DOI: 10.1128/mbio.00133-12] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022] Open
Abstract
Nitrogen is one of the major nutrients limiting microbial productivity in the ocean, and as a result, most marine microorganisms have evolved systems for responding to nitrogen stress. The highly abundant alphaproteobacterium "Candidatus Pelagibacter ubique," a cultured member of the order Pelagibacterales (SAR11), lacks the canonical GlnB, GlnD, GlnK, and NtrB/NtrC genes for regulating nitrogen assimilation, raising questions about how these organisms respond to nitrogen limitation. A survey of 266 Alphaproteobacteria genomes found these five regulatory genes nearly universally conserved, absent only in intracellular parasites and members of the order Pelagibacterales, including "Ca. Pelagibacter ubique." Global differences in mRNA and protein expression between nitrogen-limited and nitrogen-replete cultures were measured to identify nitrogen stress responses in "Ca. Pelagibacter ubique" strain HTCC1062. Transporters for ammonium (AmtB), taurine (TauA), amino acids (YhdW), and opines (OccT) were all elevated in nitrogen-limited cells, indicating that they devote increased resources to the assimilation of nitrogenous organic compounds. Enzymes for assimilating amine into glutamine (GlnA), glutamate (GltBD), and glycine (AspC) were similarly upregulated. Differential regulation of the transcriptional regulator NtrX in the two-component signaling system NtrY/NtrX was also observed, implicating it in control of the nitrogen starvation response. Comparisons of the transcriptome and proteome supported previous observations of uncoupling between transcription and translation in nutrient-deprived "Ca. Pelagibacter ubique" cells. Overall, these data reveal a streamlined, PII-independent response to nitrogen stress in "Ca. Pelagibacter ubique," and likely other Pelagibacterales, and show that they respond to nitrogen stress by allocating more resources to the assimilation of nitrogen-rich organic compounds.
IMPORTANCE Pelagibacterales are extraordinarily abundant and play a pivotal role in marine geochemical cycles, as one of the major recyclers of labile dissolved organic matter. They are also models for understanding how streamlining selection can reshape chemoheterotroph metabolism. Streamlining and its broad importance to environmental microbiology are emerging slowly from studies that reveal the complete genomes of uncultured organisms. Here, we report another remarkable example of streamlined metabolism in Pelagibacterales, this time in systems that control nitrogen assimilation. Pelagibacterales are major contributors to metatranscriptomes and metaproteomes from ocean systems, where patterns of gene expression are used to gain insight into ocean conditions and geochemical cycles. The data presented here supply background that is essential to interpreting data from field studies.
|
20
|
Haider S, Pal R. Integrated analysis of transcriptomic and proteomic data. Curr Genomics 2013; 14:91-110. [PMID: 24082820 PMCID: PMC3637682 DOI: 10.2174/1389202911314020003] [Citation(s) in RCA: 258] [Impact Index Per Article: 23.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/10/2012] [Revised: 01/09/2013] [Accepted: 01/22/2013] [Indexed: 12/14/2022] Open
Abstract
Until recently, understanding the regulatory behavior of cells has been pursued through independent analysis of the transcriptome or the proteome. Based on the central dogma, it was generally assumed that there exists a direct correspondence between mRNA transcripts and the resulting protein expression. However, recent studies have shown that the correlation between mRNA and protein expression can be low due to various factors such as different half-lives and post-transcriptional machinery. Thus, a joint analysis of the transcriptomic and proteomic data can provide useful insights that may not be deciphered from individual analysis of mRNA or protein expression. This article reviews the existing major approaches for joint analysis of transcriptomic and proteomic data. We categorize the different approaches into eight main categories based on the initial algorithm and final analysis goal. We further present analogies with other domains and discuss the existing research problems in this area.
Affiliation(s)
| | - Ranadip Pal
- Department of Electrical and Computer Engineering, Texas Tech University, Lubbock, TX, 79409, USA
|
21
|
A practical data processing workflow for multi-OMICS projects. BIOCHIMICA ET BIOPHYSICA ACTA-PROTEINS AND PROTEOMICS 2013; 1844:52-62. [PMID: 23501674 DOI: 10.1016/j.bbapap.2013.02.029] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/03/2012] [Revised: 02/15/2013] [Accepted: 02/20/2013] [Indexed: 12/11/2022]
Abstract
Multi-OMICS approaches aim at the integration of quantitative data obtained for different biological molecules in order to understand their interrelation and the functioning of larger systems. This paper deals with several data integration and data processing issues that frequently occur within this context. To this end, the data processing workflow within the PROFILE project is presented, a multi-OMICS project that aims at the identification of novel biomarkers and the development of new therapeutic targets for seven important liver diseases. Furthermore, a software tool called CrossPlatformCommander is sketched, which facilitates several steps of the proposed workflow in a semi-automatic manner. Application of the software is presented for the detection of novel biomarkers, their ranking and annotation with existing knowledge, using the example of corresponding Transcriptomics and Proteomics data sets obtained from patients suffering from hepatocellular carcinoma. Additionally, a linear regression analysis of Transcriptomics vs. Proteomics data is presented and its performance assessed. It was shown that, for capturing profound relations between Transcriptomics and Proteomics data, a simple linear regression analysis is not sufficient, and implementation and evaluation of alternative statistical approaches are needed. Additionally, the integration of multivariate variable selection and classification approaches is intended for further development of the software. Although this paper focuses only on the combination of data obtained from quantitative Proteomics and Transcriptomics experiments, several approaches and data integration steps are also applicable to other OMICS technologies. Keeping specific restrictions in mind, the suggested workflow (or at least parts of it) may be used as a template for similar projects that make use of different high-throughput techniques.
This article is part of a Special Issue entitled: Computational Proteomics in the Post-Identification Era. Guest Editors: Martin Eisenacher and Christian Stephan.
|
22
|
Wang J, Chen L, Huang S, Liu J, Ren X, Tian X, Qiao J, Zhang W. RNA-seq based identification and mutant validation of gene targets related to ethanol resistance in cyanobacterial Synechocystis sp. PCC 6803. BIOTECHNOLOGY FOR BIOFUELS 2012; 5:89. [PMID: 23259593 PMCID: PMC3564720 DOI: 10.1186/1754-6834-5-89] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/21/2012] [Accepted: 12/04/2012] [Indexed: 05/03/2023]
Abstract
BACKGROUND: Fermentation production of biofuel ethanol consumes agricultural crops, which will compete directly with the food supply. As an alternative, photosynthetic cyanobacteria have been proposed as microbial factories to produce ethanol directly from solar energy and CO2. However, the ethanol productivity from photoautotrophic cyanobacteria is still very low, mostly due to the low tolerance of cyanobacterial systems to ethanol stress. RESULTS: To build a foundation necessary to engineer robust ethanol-producing cyanobacterial hosts, in this study we applied a quantitative transcriptomics approach with a next-generation sequencing technology, combined with quantitative reverse-transcription PCR (RT-PCR) analysis, to reveal the global metabolic responses to ethanol in the model cyanobacterium Synechocystis sp. PCC 6803. The results showed that ethanol exposure induced genes involved in common stress responses, transport and cell envelope modification. In addition, the cells can also utilize enhanced polyhydroxyalkanoates (PHA) accumulation and the glyoxalase detoxication pathway as means against ethanol stress. The up-regulation of photosynthesis by ethanol was also further confirmed at the transcriptional level. Finally, we used gene knockout strains to validate the potential target genes related to ethanol tolerance. CONCLUSION: RNA-Seq based global transcriptomic analysis provided a comprehensive view of the cellular response to ethanol exposure. The analysis provided a list of gene targets for engineering ethanol tolerance in the cyanobacterium Synechocystis.
Affiliation(s)
- Jiangxin Wang
- School of Chemical Engineering & Technology, Tianjin University, Tianjin, 300072, People's Republic of China
- Key Laboratory of Systems Bioengineering, Ministry of Education, Tianjin, 300072, People's Republic of China
| | - Lei Chen
- School of Chemical Engineering & Technology, Tianjin University, Tianjin, 300072, People's Republic of China
- Key Laboratory of Systems Bioengineering, Ministry of Education, Tianjin, 300072, People's Republic of China
| | - Siqiang Huang
- School of Chemical Engineering & Technology, Tianjin University, Tianjin, 300072, People's Republic of China
- Key Laboratory of Systems Bioengineering, Ministry of Education, Tianjin, 300072, People's Republic of China
| | - Jie Liu
- School of Chemical Engineering & Technology, Tianjin University, Tianjin, 300072, People's Republic of China
- Key Laboratory of Systems Bioengineering, Ministry of Education, Tianjin, 300072, People's Republic of China
| | - Xiaoyue Ren
- School of Chemical Engineering & Technology, Tianjin University, Tianjin, 300072, People's Republic of China
- Key Laboratory of Systems Bioengineering, Ministry of Education, Tianjin, 300072, People's Republic of China
| | - Xiaoxu Tian
- School of Chemical Engineering & Technology, Tianjin University, Tianjin, 300072, People's Republic of China
- Key Laboratory of Systems Bioengineering, Ministry of Education, Tianjin, 300072, People's Republic of China
| | - Jianjun Qiao
- School of Chemical Engineering & Technology, Tianjin University, Tianjin, 300072, People's Republic of China
- Key Laboratory of Systems Bioengineering, Ministry of Education, Tianjin, 300072, People's Republic of China
| | - Weiwen Zhang
- School of Chemical Engineering & Technology, Tianjin University, Tianjin, 300072, People's Republic of China
- Key Laboratory of Systems Bioengineering, Ministry of Education, Tianjin, 300072, People's Republic of China
|
23
|
Prediction and Characterization of Missing Proteomic Data in Desulfovibrio vulgaris. Comp Funct Genomics 2011; 2011:780973. [PMID: 21687592 PMCID: PMC3114432 DOI: 10.1155/2011/780973] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/25/2010] [Revised: 12/17/2010] [Accepted: 03/01/2011] [Indexed: 11/17/2022] Open
Abstract
Proteomic datasets are often incomplete due to identification range and sensitivity issues. It becomes important to develop methodologies to estimate missing proteomic data, allowing better interpretation of proteomic datasets and metabolic mechanisms underlying complex biological systems. In this study, we applied an artificial neural network to approximate the relationships between cognate transcriptomic and proteomic datasets of Desulfovibrio vulgaris, and to predict protein abundance for the proteins not experimentally detected, based on several relevant predictors, such as mRNA abundance, cellular role and triple codon counts. The results showed that the coefficients of determination for the trained neural network models ranged from 0.47 to 0.68, providing better modeling than several previous regression models. The validity of the trained neural network model was evaluated using biological information (i.e. operons). To seek understanding of mechanisms causing missing proteomic data, we used a multivariate logistic regression analysis and the result suggested that some key factors, such as protein instability index, aliphatic index, mRNA abundance, effective number of codons (N(c)) and codon adaptation index (CAI) values may be ascribed to whether a given expressed protein can be detected. In addition, we demonstrated that biological interpretation can be improved by use of imputed proteomic datasets.
|
24
|
Torres-García W, Brown SD, Johnson RH, Zhang W, Runger GC, Meldrum DR. Integrative analysis of transcriptomic and proteomic data of Shewanella oneidensis: missing value imputation using temporal datasets. MOLECULAR BIOSYSTEMS 2011; 7:1093-104. [PMID: 21212895 DOI: 10.1039/c0mb00260g] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/21/2022]
Abstract
Despite significant improvements in recent years, proteomic datasets currently available still suffer from a large number of missing values. Integrative analyses based upon incomplete proteomic and transcriptomic datasets could seriously bias the biological interpretation. In this study, we applied a non-linear data-driven stochastic gradient boosted trees (GBT) model to impute missing proteomic values using a temporal transcriptomic and proteomic dataset of Shewanella oneidensis. In this dataset, gene expression was measured after the cells were exposed to 1 mM potassium chromate for 5, 30, 60, and 90 min, while protein abundance was measured for 45 and 90 min. With the ultimate objective of imputing protein values for experimentally undetected samples at 45 and 90 min, we applied a serial set of algorithms to capture relationships between temporal gene and protein expression. This work follows four main steps: (1) a quality control step for gene expression reliability, (2) mRNA imputation, (3) protein prediction, and (4) validation. Initially, an S control chart approach is performed on gene expression replicates to remove unwanted variability. Then, we focused on the missing measurements of gene expression through a nonlinear Smoothing Splines Curve Fitting. This method identifies temporal relationships among transcriptomic data at different time points and enables imputation of mRNA abundance at 45 min. After mRNA imputation was validated by biological constraints (i.e. operons), we used a data-driven GBT model to impute protein abundance for the proteins experimentally undetected in the 45 and 90 min samples, based on relevant predictors such as temporal mRNA gene expression data and cellular functional roles.
The imputed protein values were validated using biological constraints such as operon and pathway information through a permutation test to investigate whether dispersion measures are indeed smaller for known biological groups than for any set of random genes. Finally, we demonstrated that such missing value imputation improved characterization of the temporal response of S. oneidensis to chromate.
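The permutation test described in this abstract (checking whether dispersion is smaller within a known biological group than within random gene sets) can be sketched as follows. This is an illustrative reconstruction using only the Python standard library, not the authors' procedure; the function name, the use of the population standard deviation as the dispersion measure, and the parameters `n_perm` and `seed` are all assumptions.

```python
import random
import statistics


def dispersion_permutation_test(values, group, n_perm=2000, seed=0):
    """One-sided permutation test for low dispersion within a gene group.

    values: dict mapping gene id -> imputed abundance (hypothetical input)
    group:  list of gene ids forming a known biological group (e.g. an operon)
    Returns (observed dispersion, permutation p-value).
    """
    rng = random.Random(seed)
    genes = sorted(values)
    observed = statistics.pstdev([values[g] for g in group])
    hits = 0
    for _ in range(n_perm):
        # Draw a random gene set of the same size as the biological group.
        sample = rng.sample(genes, len(group))
        if statistics.pstdev([values[g] for g in sample]) <= observed:
            hits += 1
    # The +1 correction keeps the estimated p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)
```

A small p-value indicates that random gene sets rarely show dispersion as low as the known group, which is the pattern expected if the imputed values respect the biological grouping.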
Affiliation(s)
- Wandaliz Torres-García
- School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, Tempe, AZ 85287-5906, USA.
|
25
|
Abstract
Multiple Omics datasets (for example, high throughput mRNA and protein measurements for the same set of genes) are beginning to appear more widely within the fields of bioinformatics and computational biology. There are many tools available for the analysis of single datasets but two (or more) sets of coupled observations present more of a challenge. I describe some of the methods available - from classical statistical techniques to more recent advances from the fields of Machine Learning and Pattern Recognition for linking Omics data levels with particular focus on transcriptomics and proteomics profiles.
Affiliation(s)
- Simon Rogers
- Inference Research Group, Department of Computing Science, University of Glasgow, Glasgow, UK.
|
26
|
Transcriptome and proteome exploration to model translation efficiency and protein stability in Lactococcus lactis. PLoS Comput Biol 2009; 5:e1000606. [PMID: 20019804 PMCID: PMC2787624 DOI: 10.1371/journal.pcbi.1000606] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/25/2009] [Accepted: 11/12/2009] [Indexed: 11/19/2022] Open
Abstract
This genome-scale study analysed the various parameters influencing protein levels in cells. To achieve this goal, the model bacterium Lactococcus lactis was grown at steady state in continuous cultures at different growth rates, and proteomic and transcriptomic data were thoroughly compared. Ratios of mRNA to protein were highly variable among proteins but also, for a given gene, between the different growth conditions. The modeling of cellular processes combined with a data fitting modeling approach allowed both translation efficiencies and degradation rates to be estimated for each protein in each growth condition. Estimated translational efficiencies and degradation rates strongly differed between proteins and were tested for their biological significance through statistical correlations with relevant parameters such as codon or amino acid bias. These efficiencies and degradation rates were not constant in all growth conditions and were inversely proportional to the growth rate, indicating a more efficient translation at low growth rate but an antagonistic higher rate of protein degradation. Estimated protein median half-lives ranged from 23 to 224 min, underlying the importance of protein degradation notably at low growth rates. The regulation of intracellular protein level was analysed through regulatory coefficient calculations, revealing a complex control depending on protein and growth conditions. The modeling approach enabled translational efficiencies and protein degradation rates to be estimated, two biological parameters extremely difficult to determine experimentally and generally lacking in bacteria. This method is generic and can now be extended to other environments and/or other micro-organisms. This work is in the field of systems biology. Via an in-depth comparison of proteomic and transcriptomic data in various culture conditions, our objective was to better understand the regulation of protein levels. 
We have demonstrated that bacteria exert a tight control on intracellular protein levels, through a multi-level regulation involving translation but also dilution due to growth and protein degradation. We have estimated translational efficiencies and protein degradation rates by modeling. These two biological parameters are extremely difficult to measure experimentally and have not been previously determined in bacteria. We have found that they are growth rate dependent, indicating a fine control of translation and degradation processes. We have worked with the small genome bacterium Lactococcus lactis on a limited number of mRNA-protein couples, keeping in mind that this approach could be extended to other micro-organisms and biological phenomena. We have shown that mathematical modeling combined with experimental steady-state cultures is a powerful tool to understand microbial physiology.
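The balance between translation, dilution due to growth, and degradation described above can be illustrated with a generic first-order steady-state model. This is only a minimal sketch under stated assumptions: the abstract does not give the authors' equations, so the model form and every name below (`k_tl`, `k_deg`, `mu`, and the function names) are hypothetical and chosen for illustration.

```python
import math


def degradation_rate(half_life_min: float) -> float:
    """First-order degradation rate constant (per minute) for a given half-life.

    Assumes simple exponential decay, so k_deg = ln(2) / half-life.
    """
    if half_life_min <= 0:
        raise ValueError("half-life must be positive")
    return math.log(2) / half_life_min


def steady_state_protein(mrna: float, k_tl: float, mu: float, k_deg: float) -> float:
    """Protein level at which synthesis balances loss.

    Assumed balance (illustrative, not the published model):
        k_tl * mrna = (mu + k_deg) * protein
    mu (growth rate) and k_deg must be expressed in the same time unit.
    """
    return k_tl * mrna / (mu + k_deg)


def translation_efficiency(protein: float, mrna: float, mu: float, k_deg: float) -> float:
    """Invert the same balance to recover k_tl from measured protein and mRNA."""
    return protein * (mu + k_deg) / mrna


# The abstract reports median half-lives between 23 and 224 min.
for half_life in (23.0, 224.0):
    print(half_life, degradation_rate(half_life))
```

Under this assumed balance, a lower growth rate reduces dilution, so degradation accounts for a larger share of total protein loss, which is consistent with the emphasis the abstract places on degradation at low growth rates.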
|
27
|
Zhang W, Li F, Nie L. Integrating multiple 'omics' analysis for microbial biology: application and methodologies. MICROBIOLOGY-SGM 2009; 156:287-301. [PMID: 19910409 DOI: 10.1099/mic.0.034793-0] [Citation(s) in RCA: 281] [Impact Index Per Article: 18.7] [Indexed: 01/08/2023]
Abstract
Recent advances in various 'omics' technologies enable quantitative monitoring of the abundance of various biological molecules in a high-throughput manner, and thus allow determination of their variation between different biological states on a genomic scale. Several popular 'omics' platforms that have been used in microbial systems biology include transcriptomics, which measures mRNA transcript levels; proteomics, which quantifies protein abundance; metabolomics, which determines abundance of small cellular metabolites; interactomics, which resolves the whole set of molecular interactions in cells; and fluxomics, which establishes dynamic changes of molecules within a cell over time. However, no single 'omics' analysis can fully unravel the complexities of fundamental microbial biology. Therefore, integration of multiple layers of information, the multi-'omics' approach, is required to acquire a precise picture of living micro-organisms. In spite of this being a challenging task, some attempts have been made recently to integrate heterogeneous 'omics' datasets in various microbial systems and the results have demonstrated that the multi-'omics' approach is a powerful tool for understanding the functional principles and dynamics of total cellular systems. This article reviews some basic concepts of various experimental 'omics' approaches, recent application of the integrated 'omics' for exploring metabolic and regulatory mechanisms in microbes, and advances in computational and statistical methodologies associated with integrated 'omics' analyses. Online databases and bioinformatic infrastructure available for integrated 'omics' analyses are also briefly discussed.
Affiliation(s)
- Weiwen Zhang
- Center for Ecogenomics, Biodesign Institute, Arizona State University, Tempe, AZ 85287-6501, USA
- Feng Li
- Division of Biometrics II, Office of Biometrics/OTS/CDER/FDA, Silver Spring, MD 20993-0002, USA
- Lei Nie
- Division of Biometrics IV, Office of Biometrics/OTS/CDER/FDA, Silver Spring, MD 20993-0002, USA
|
28
|
Sun A, Zhang J, Wang C, Yang D, Wei H, Zhu Y, Jiang Y, He F. Modified Spectral Count Index (mSCI) for Estimation of Protein Abundance by Protein Relative Identification Possibility (RIPpro): A New Proteomic Technological Parameter. J Proteome Res 2009; 8:4934-42. [DOI: 10.1021/pr900252n] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Indexed: 11/29/2022]
Affiliation(s)
- Aihua Sun
- Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100730, P. R. China, State Key Laboratory of Proteomics, Beijing Proteome Research Center, Beijing Institute of Radiation Medicine, Beijing 102206, P. R. China, and Institutes of Biomedical Sciences and Department of Chemistry, Fudan University, Shanghai 200032, P. R. China
- Jiyang Zhang
- Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100730, P. R. China, State Key Laboratory of Proteomics, Beijing Proteome Research Center, Beijing Institute of Radiation Medicine, Beijing 102206, P. R. China, and Institutes of Biomedical Sciences and Department of Chemistry, Fudan University, Shanghai 200032, P. R. China
- Chunping Wang
- Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100730, P. R. China, State Key Laboratory of Proteomics, Beijing Proteome Research Center, Beijing Institute of Radiation Medicine, Beijing 102206, P. R. China, and Institutes of Biomedical Sciences and Department of Chemistry, Fudan University, Shanghai 200032, P. R. China
- Dong Yang
- Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100730, P. R. China, State Key Laboratory of Proteomics, Beijing Proteome Research Center, Beijing Institute of Radiation Medicine, Beijing 102206, P. R. China, and Institutes of Biomedical Sciences and Department of Chemistry, Fudan University, Shanghai 200032, P. R. China
- Handong Wei
- Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100730, P. R. China, State Key Laboratory of Proteomics, Beijing Proteome Research Center, Beijing Institute of Radiation Medicine, Beijing 102206, P. R. China, and Institutes of Biomedical Sciences and Department of Chemistry, Fudan University, Shanghai 200032, P. R. China
- Yunping Zhu
- Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100730, P. R. China, State Key Laboratory of Proteomics, Beijing Proteome Research Center, Beijing Institute of Radiation Medicine, Beijing 102206, P. R. China, and Institutes of Biomedical Sciences and Department of Chemistry, Fudan University, Shanghai 200032, P. R. China
- Ying Jiang
- Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100730, P. R. China, State Key Laboratory of Proteomics, Beijing Proteome Research Center, Beijing Institute of Radiation Medicine, Beijing 102206, P. R. China, and Institutes of Biomedical Sciences and Department of Chemistry, Fudan University, Shanghai 200032, P. R. China
- Fuchu He
- Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100730, P. R. China, State Key Laboratory of Proteomics, Beijing Proteome Research Center, Beijing Institute of Radiation Medicine, Beijing 102206, P. R. China, and Institutes of Biomedical Sciences and Department of Chemistry, Fudan University, Shanghai 200032, P. R. China
|
29
|
de Sousa Abreu R, Penalva LO, Marcotte EM, Vogel C. Global signatures of protein and mRNA expression levels. MOLECULAR BIOSYSTEMS 2009; 5:1512-26. [PMID: 20023718 DOI: 10.1039/b908315d] [Citation(s) in RCA: 578] [Impact Index Per Article: 38.5] [Indexed: 12/18/2022]
Abstract
Cellular states are determined by differential expression of the cell's proteins. The relationship between protein and mRNA expression levels informs about the combined outcomes of translation and protein degradation which are, in addition to transcription and mRNA stability, essential contributors to gene expression regulation. This review summarizes the state of knowledge about large-scale measurements of absolute protein and mRNA expression levels, and the degree of correlation between the two parameters. We summarize the information that can be derived from comparison of protein and mRNA expression levels and discuss how corresponding sequence characteristics suggest modes of regulation.
Affiliation(s)
- Raquel de Sousa Abreu
- Children's Cancer Research Institute, University of Texas Health Science Center at San Antonio, TX, USA
|
30
|
Tan CS, Salim A, Ploner A, Lehtiö J, Chia KS, Pawitan Y. Correlating gene and protein expression data using Correlated Factor Analysis. BMC Bioinformatics 2009; 10:272. [PMID: 19723309 PMCID: PMC2744708 DOI: 10.1186/1471-2105-10-272] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Received: 11/04/2008] [Accepted: 09/01/2009] [Indexed: 01/06/2023]
Abstract
BACKGROUND Joint analysis of transcriptomic and proteomic data taken from the same samples has the potential to elucidate complex biological mechanisms. Most current methods that integrate these datasets allow for the computation of the correlation between a gene and protein but only after a one-to-one matching of genes and proteins is done. However, genes and proteins are connected via biological pathways and their relationship is not necessarily one-to-one. In this paper, we investigate the use of Correlated Factor Analysis (CFA) for modeling the correlation of genome-scale gene and protein data. Unlike existing approaches, CFA considers all possible gene-protein pairs and utilizes all gene and protein information in its modeling framework. The Generalized Singular Value Decomposition (gSVD) is another method which takes into account all available transcriptomic and proteomic data. Comparison is made between CFA and gSVD. RESULTS Our simulation study indicates that the CFA estimates can consistently capture the dominant patterns of correlation between two sets of measurements; in contrast, the gSVD estimates cannot do that. Applied to real cancer data, the list of co-regulated genes and proteins identified by CFA has biologically meaningful interpretation, where both the gene and protein expressions are pointing to the same processes. Among the GO terms for which the genes and proteins are most correlated, we observed blood vessel morphogenesis and development. CONCLUSION We demonstrate that CFA is a useful tool for gene-protein data integration and modeling, where the main question is in finding which patterns of gene expression are most correlated with protein expression.
Affiliation(s)
- Chuen Seng Tan
- Lewis-Sigler Institute, Princeton University, New Jersey, USA.
|
31
|
Torres-García W, Zhang W, Runger GC, Johnson RH, Meldrum DR. Integrative analysis of transcriptomic and proteomic data of Desulfovibrio vulgaris: a non-linear model to predict abundance of undetected proteins. Bioinformatics 2009; 25:1905-14. [PMID: 19447782 DOI: 10.1093/bioinformatics/btp325] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.8] [Indexed: 11/13/2022]
Abstract
MOTIVATION Gene expression profiling technologies can generally produce mRNA abundance data for all genes in a genome. A dearth of proteomic data persists because identification range and sensitivity of proteomic measurements lag behind those of transcriptomic measurements. Using partial proteomic data, it is likely that integrative transcriptomic and proteomic analysis may introduce significant bias. Developing methodologies to accurately estimate missing proteomic data will allow better integration of transcriptomic and proteomic datasets and provide deeper insight into metabolic mechanisms underlying complex biological systems. RESULTS In this study, we present a non-linear data-driven model to predict abundance for undetected proteins using two independent datasets of cognate transcriptomic and proteomic data collected from Desulfovibrio vulgaris. We use stochastic gradient boosted trees (GBT) to uncover possible non-linear relationships between transcriptomic and proteomic data, and to predict protein abundance for the proteins not experimentally detected based on relevant predictors such as mRNA abundance, cellular role, molecular weight, sequence length, protein length, guanine-cytosine (GC) content and triple codon counts. Initially, we constructed a GBT model using all possible variables to assess their relative importance and characterize the behavior of the predictive model. A strong plateau effect in the regions of high mRNA values and sparse data occurred in this model. Hence, we removed genes in those areas based on thresholds estimated from the partial dependency plots where this behavior was captured. At this stage, only the strongest predictors of protein abundance were retained to reduce the complexity of the GBT model. After removing genes in the plateau region, mRNA abundance, main cellular functional categories and few triple codon counts emerged as the top-ranked predictors of protein abundance. 
We then created a new tuned GBT model using the five most significant predictors. The construction of our non-linear model consists of a set of serial regression trees models with implicit strength in variable selection. The model provides variable relative importance measures using as a criterion mean square error. The results showed that coefficients of determination for our nonlinear models ranged from 0.393 to 0.582 in both datasets, providing better results than linear regression used in the past. We evaluated the validity of this non-linear model using biological information of operons, regulons and pathways, and the results demonstrated that the coefficients of variation of estimated protein abundance values within operons, regulons or pathways are indeed smaller than those for random groups of proteins. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Wandaliz Torres-García
- Department of Industrial, Systems and Operations Engineering, Tempe, AZ 85287-5906, USA.
|
32
|
Baíllo A, Berrendero J, Cárcamo J. Tests for zero-inflation and overdispersion: A new approach based on the stochastic convex order. Comput Stat Data Anal 2009. [DOI: 10.1016/j.csda.2008.12.012] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 10/21/2022]
|
33
|
Nie L, Wu G, Zhang W. Statistical Application and Challenges in Global Gel-Free Proteomic Analysis by Mass Spectrometry. Crit Rev Biotechnol 2008; 28:297-307. [DOI: 10.1080/07388550802543158] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 10/21/2022]
|
34
|
Mishra Y, Chaurasia N, Rai LC. Heat pretreatment alleviates UV-B toxicity in the cyanobacterium Anabaena doliolum: A proteomic analysis of cross tolerance. Photochem Photobiol 2008; 85:824-33. [PMID: 19076303 DOI: 10.1111/j.1751-1097.2008.00469.x] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.6] [Indexed: 12/01/2022]
Abstract
This study offers proteomic elucidation of heat pretreatment-induced alleviation of UV-B toxicity in Anabaena doliolum. Heat-pretreated cells exposed to UV-B showed improved activity of PSI, PSII, whole chain, (14)C fixation, ATP and NADPH contents compared to UV-B alone. Proteomic analysis using two-dimensional gel electrophoresis (2-DE), MALDI-TOF MS/MS and reverse transcription polymerase chain reaction (RT-PCR) of UV-B and heat pretreatment followed by UV-B-treated cells exhibited significant and reproducible alterations in nine proteins homologous to phycocyanin-alpha-chain (PC-alpha-chain), phycoerythrocyanin-alpha-chain (PEC-alpha-chain), hypothetical protein alr0882, phycobilisome core component (PBS-CC), iron superoxide dismutase (Fe-SOD), fructose-1,6-bisphosphate aldolase (FBA), nucleoside diphosphate kinase (NDPK), phosphoribulokinase (PRK) and ribulose-1,5-bisphosphate carboxylase/oxygenase (RuBisCo) large chain. Except the PEC-alpha-chain, hypothetical protein alr0882 and PBS-CC, all other proteins showed upregulation at low doses of UV-B (U2) and significant downregulation at higher doses of UV-B (U5). The disruption of redox status, signaling, pentose phosphate pathway and Calvin cycle appears to be due to the downregulation of Fe-SOD, NDPK, FBA, PRK and RuBisCo thereby leading to the death of Anabaena. In contrast to this, the upregulation of all the above proteins in heat-pretreated cells, harboring different heat shock proteins (HSPs) like 60, 26 and 16.6, followed by UV-B treatment than only the UV-B-treated ones suggests a protective role of HSPs in mitigating UV-B toxicity.
Affiliation(s)
- Yogesh Mishra
- Center of Advanced Study in Botany, Banaras Hindu University, Varanasi, India
|
35
|
Bhargava P, Kumar A, Mishra Y, Rai LC. Copper pretreatment augments ultraviolet B toxicity in the cyanobacterium Anabaena doliolum: a proteomic analysis of cell death. FUNCTIONAL PLANT BIOLOGY : FPB 2008; 35:360-372. [PMID: 32688793 DOI: 10.1071/fp07267] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 11/15/2007] [Accepted: 05/22/2008] [Indexed: 06/11/2023]
Abstract
This study provides first-hand proteomic characterisation of Cu-pretreatment-induced augmentation of ultraviolet B toxicity in the cyanobacterium Anabaena doliolum Bharadwaja. Of the three treatments (i.e. Cu, UV-B and Cu + UV-B) tested, the UV-B treatment of Cu-pretreated Anabaena produced a greater inhibition of oxygen evolution, 14C fixation, ATP and NADPH contents than UV-B alone. Proteomic analysis using two-dimensional gel electrophoresis (2DE), MALDI-TOF MS/MS and reverse transcription polymerase chain reaction (RT-PCR) of Cu, UV-B, and Cu + UV-B treated Anabaena exhibited significant and reproducible alterations in 12 proteins. Of these, manganese superoxide dismutase (Mn-SOD), iron superoxide dismutase (Fe-SOD) and peroxiredoxin (PER) are antioxidative enzymes; ribulose-1,5-bisphosphate carboxylase/oxygenase (RuBisCo), phosphoribulokinase (PRK), flavodoxin (Flv), plastocyanin (PLC), phosphoglycerate kinase (PGK), phycocyanin (PC) and phycoerythrocyanin α-chain (PEC α-chain) are linked with photosynthesis and respiration; and DnaK and nucleoside diphosphate kinase (NDPK) are associated with cellular processes and light signalling, respectively. However, when subjected to a high dose of UV-B, Cu-pretreated Anabaena depicted a severe down-regulation of DnaK, NDPK and Flv, probably because of inevitable oxidative stress. Thus, the augmentation of UV-B toxicity by Cu can be attributed to the down-regulation of DnaK, NDPK and Flv.
Affiliation(s)
- Poonam Bhargava
- Molecular Biology Section, Laboratory of Algal Biology, Center of Advanced Study in Botany, Banaras Hindu University, Varanasi-221005, India
- Arvind Kumar
- Molecular Biology Section, Laboratory of Algal Biology, Center of Advanced Study in Botany, Banaras Hindu University, Varanasi-221005, India
- Yogesh Mishra
- Molecular Biology Section, Laboratory of Algal Biology, Center of Advanced Study in Botany, Banaras Hindu University, Varanasi-221005, India
- Lal Chand Rai
- Molecular Biology Section, Laboratory of Algal Biology, Center of Advanced Study in Botany, Banaras Hindu University, Varanasi-221005, India
|
36
|
Guo Y, Xiao P, Lei S, Deng F, Xiao GG, Liu Y, Chen X, Li L, Wu S, Chen Y, Jiang H, Tan L, Xie J, Zhu X, Liang S, Deng H. How is mRNA expression predictive for protein expression? A correlation study on human circulating monocytes. Acta Biochim Biophys Sin (Shanghai) 2008; 40:426-36. [PMID: 18465028 DOI: 10.1111/j.1745-7270.2008.00418.x] [Citation(s) in RCA: 311] [Impact Index Per Article: 19.4] [Indexed: 01/31/2023]
Abstract
A key assumption in studying mRNA expression is that it is informative in the prediction of protein expression. However, only limited studies have explored the mRNA-protein expression correlation in yeast or human tissues and the results have been relatively inconsistent. We carried out correlation analyses on mRNA-protein expressions in freshly isolated human circulating monocytes from 30 unrelated women. The expressed proteins for 71 genes were quantified and identified by 2-D electrophoresis coupled with mass spectrometry. The corresponding mRNA expressions were quantified by Affymetrix gene chips. Significant correlation (r=0.235, P<0.0001) was observed for the whole dataset including all studied genes and all samples. The correlations varied in different biological categories of gene ontology. For example, the highest correlation was achieved for genes of the extracellular region in terms of cellular component (r=0.643, P<0.0001) and the lowest correlation was obtained for genes of regulation (r=0.099, P=0.213) in terms of biological process. In the genome, half of the samples showed significant positive correlation for the 71 genes and significant correlation was found between the average mRNA and the average protein expression levels in all samples (r=0.296, P<0.01). However, at the study group level, only five studied genes had significant positive correlation across all the samples. Our results showed an overall positive correlation between mRNA and protein expression levels. However, the moderate and varied correlations suggest that mRNA expression might be sometimes useful, but certainly far from perfect, in predicting protein expression levels.
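The r values quoted above are correlation coefficients between paired mRNA and protein measurements. As a minimal sketch, assuming the standard Pearson product-moment definition (the abstract does not state which coefficient was used), such a value can be computed as follows. The function name and the sample numbers are hypothetical and are not data from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Made-up paired measurements for one gene across five samples.
mrna_levels = [5.1, 6.3, 4.8, 7.0, 5.9]
protein_levels = [2.0, 2.9, 2.2, 3.1, 2.4]
print(pearson_r(mrna_levels, protein_levels))
```

The same function could be applied per gene across samples or per sample across genes, which are the two directions of comparison the abstract distinguishes.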
Affiliation(s)
- Yanfang Guo
- Laboratory of Molecular and Statistical Genetics, College of Life Sciences, Hunan Normal University, Changsha 410081, China
|
37
|
Rohmer L, Guina T, Chen J, Gallis B, Taylor GK, Shaffer SA, Miller SI, Brittnacher MJ, Goodlett DR. Determination and Comparison of the Francisella tularensis subsp. novicida U112 Proteome to Other Bacterial Proteomes. J Proteome Res 2008; 7:2016-24. [DOI: 10.1021/pr700760z] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/30/2022]
Affiliation(s)
- Laurence Rohmer
- Department of Genome Sciences, Microbiology, Medicine, Medicinal Chemistry, and Department of Pediatrics, Division of Infectious Diseases, University of Washington, Seattle, Washington 98195
- Tina Guina
- Department of Genome Sciences, Microbiology, Medicine, Medicinal Chemistry, and Department of Pediatrics, Division of Infectious Diseases, University of Washington, Seattle, Washington 98195
- Jinzhi Chen
- Department of Genome Sciences, Microbiology, Medicine, Medicinal Chemistry, and Department of Pediatrics, Division of Infectious Diseases, University of Washington, Seattle, Washington 98195
- Byron Gallis
- Department of Genome Sciences, Microbiology, Medicine, Medicinal Chemistry, and Department of Pediatrics, Division of Infectious Diseases, University of Washington, Seattle, Washington 98195
- Greg K. Taylor
- Department of Genome Sciences, Microbiology, Medicine, Medicinal Chemistry, and Department of Pediatrics, Division of Infectious Diseases, University of Washington, Seattle, Washington 98195
- Scott A. Shaffer
- Department of Genome Sciences, Microbiology, Medicine, Medicinal Chemistry, and Department of Pediatrics, Division of Infectious Diseases, University of Washington, Seattle, Washington 98195
- Samuel I. Miller
- Department of Genome Sciences, Microbiology, Medicine, Medicinal Chemistry, and Department of Pediatrics, Division of Infectious Diseases, University of Washington, Seattle, Washington 98195
- Mitchell J. Brittnacher
- Department of Genome Sciences, Microbiology, Medicine, Medicinal Chemistry, and Department of Pediatrics, Division of Infectious Diseases, University of Washington, Seattle, Washington 98195
- David R. Goodlett
- Department of Genome Sciences, Microbiology, Medicine, Medicinal Chemistry, and Department of Pediatrics, Division of Infectious Diseases, University of Washington, Seattle, Washington 98195
|
38
|
Nie L, Wu G, Culley DE, Scholten JCM, Zhang W. Integrative analysis of transcriptomic and proteomic data: challenges, solutions and applications. Crit Rev Biotechnol 2007; 27:63-75. [PMID: 17578703 DOI: 10.1080/07388550701334212] [Citation(s) in RCA: 170] [Impact Index Per Article: 10.0] [Indexed: 10/23/2022]
Abstract
Recent advances in high-throughput technologies enable quantitative monitoring of the abundance of various biological molecules and allow determination of their variation between biological states on a genomic scale. Two popular platforms are DNA microarrays that measure messenger RNA transcript levels, and gel-free proteomic analyses that quantify protein abundance. Obviously, no single approach can fully unravel the complexities of fundamental biology and it is equally clear that integrative analysis of multiple levels of gene expression would be valuable in this endeavor. However, most integrative transcriptomic and proteomic studies have thus far either failed to find a correlation or only observed a weak correlation. In addition to various biological factors, it is suggested that the poor correlation could be quite possibly due to the inadequacy of available statistical tools to compensate for biases in the data collection methodologies. To address this issue, attempts have recently been made to systematically investigate the correlation patterns between transcriptomic and proteomic datasets, and to develop sophisticated statistical tools to improve the chances of capturing a relationship. The goal of these efforts is to enhance understanding of the relationship between transcriptomes and proteomes so that integrative analyses may be utilized to reveal new biological insights that are not accessible through one-dimensional datasets. In this review, we outline some of the challenges associated with integrative analyses and present some preliminary statistical solutions. In addition, some new applications of integrated transcriptomic and proteomic analysis to the investigation of post-transcriptional regulation are also discussed.
Affiliation(s)
- Lei Nie
- Department of Biostatistics, Bioinformatics, and Biomathematics, Georgetown University. Washington, DC, USA
|
39
|
Comulada WS, Weiss RE, Cumberland W, Rotheram-Borus MJ. Reductions in drug use among young people living with HIV. THE AMERICAN JOURNAL OF DRUG AND ALCOHOL ABUSE 2007; 33:493-501. [PMID: 17613977 PMCID: PMC2819808 DOI: 10.1080/00952990701301921] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 10/23/2022]
Abstract
ZIP models were used to detect reductions in drug abuse among young people living with HIV (YPLH) over 15 months when most young people abstain from use. YPLH (n = 171) aged 16 to 29 years were randomly assigned to an 18 session intervention or a delayed-intervention condition. The ZIP models showed significant reductions in abuse of multiple substances over time in the non-delayed intervention. Previous analyses did not find significant reductions. Intervention efficacy often cannot be detected if there are highly skewed distributions of outcomes, such as drug abuse. ZIP modeling offers an opportunity to more reliably detect behavioral changes.
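The abstract does not expand "ZIP"; assuming it refers to the zero-inflated Poisson model, the sketch below shows the usual mixture that lets such a model accommodate a large share of abstainers (zero counts) alongside Poisson-distributed use among the rest. The parameter names `lam` and `pi` and the example values are hypothetical, and no covariates or intervention effects are modeled here.

```python
import math


def zip_pmf(k: int, lam: float, pi: float) -> float:
    """Probability of observing count k under a zero-inflated Poisson model.

    With probability pi the outcome is a structural zero (for example, an
    abstainer); otherwise the count follows a Poisson distribution with mean lam.
    """
    if k < 0:
        raise ValueError("count must be non-negative")
    if lam <= 0 or not 0.0 <= pi <= 1.0:
        raise ValueError("need lam > 0 and 0 <= pi <= 1")
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    structural_zero = pi if k == 0 else 0.0
    return structural_zero + (1.0 - pi) * poisson


# Illustrative values only: 60% structural zeros, mean of 2 among the rest.
for count in range(4):
    print(count, zip_pmf(count, lam=2.0, pi=0.6))
```

Setting `pi` to 0 recovers the ordinary Poisson distribution, which makes clear that the extra parameter only absorbs the excess zeros.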
Affiliation(s)
- W Scott Comulada
- Semel Institute for Neuroscience and Human Behavior and the AIDS Institute, University of California, Los Angeles, California, USA.
|
40
|
Fagan A, Culhane AC, Higgins DG. A multivariate analysis approach to the integration of proteomic and gene expression data. Proteomics 2007; 7:2162-71. [PMID: 17549791 DOI: 10.1002/pmic.200600898] [Citation(s) in RCA: 56] [Impact Index Per Article: 3.3] [Indexed: 12/12/2022]
Abstract
In order to understand even the simplest cellular processes, we need to integrate proteomic, gene expression and other biomolecular data. To date, most computational approaches aimed at integrating proteomics and gene expression data use direct gene/protein correlation measures. However, due to post-transcriptional and translational regulations, the correspondence between the expression of a gene and its protein is complicated. We apply a multivariate statistical method, co-inertia analysis (CIA), to visualise gene and proteomic expression data stemming from the same biological samples. Principal components analysis or correspondence analysis can be used for data exploration on single datasets. CIA is then used to explore the relationships between two or more datasets. We further explore the data by projecting gene ontology (GO) information onto these plots to describe the cellular processes in action. We apply these techniques to gene expression and protein abundance data from studies of the human malarial parasite life cycle and the NCI-60 cancer cell lines. In each case, we visualise gene expression, protein abundance and GO classes in the same low dimensional projections and identify GO classes that are likely to be of biological importance.
Affiliation(s)
- Ailís Fagan
- Conway Institute for Biomolecular and Biomedical Research, University College Dublin, Belfield, Dublin, Ireland.
|
41
|
Nie L, Wu G, Zhang W. Correlation of mRNA expression and protein abundance affected by multiple sequence features related to translational efficiency in Desulfovibrio vulgaris: a quantitative analysis. Genetics 2006; 174:2229-43. [PMID: 17028312 PMCID: PMC1698625 DOI: 10.1534/genetics.106.065862] [Citation(s) in RCA: 163] [Impact Index Per Article: 9.1] [Indexed: 11/18/2022]
Abstract
The modest correlation between mRNA expression and protein abundance in large-scale data sets is explained in part by experimental challenges, such as technological limitations, and in part by fundamental biological factors in the transcription and translation processes. Among various factors affecting the mRNA-protein correlation, the roles of biological factors related to translation are poorly understood. In this study, using experimental mRNA expression and protein abundance data collected from Desulfovibrio vulgaris by DNA microarray and liquid chromatography coupled with tandem mass spectrometry (LC-MS/MS) proteomic analysis, we quantitatively examined the effects of several translational-efficiency-related sequence features on mRNA-protein correlation. Three classes of sequence features were investigated according to different translational stages: (i) initiation, Shine-Dalgarno sequences, start codon identity, and start codon context; (ii) elongation, codon usage and amino acid usage; and (iii) termination, stop codon identity and stop codon context. Surprisingly, although it is widely accepted that translation initiation is the rate-limiting step for translation, our results showed that the mRNA-protein correlation was affected the most by the features at elongation stages, i.e., codon usage and amino acid composition (5.3-15.7% and 5.8-11.9% of the total variation of mRNA-protein correlation, respectively), followed by stop codon context and the Shine-Dalgarno sequence (3.7-5.1% and 1.9-3.8%, respectively). Taken together, all sequence features contributed to 15.2-26.2% of the total variation of mRNA-protein correlation. This study provides the first comprehensive quantitative analysis of the mRNA-protein correlation in bacterial D. vulgaris and adds new insights into the relative importance of various sequence features in prokaryotic protein translation.
Affiliation(s)
- Lei Nie
- Department of Biostatistics, Bioinformatics and Biomathematics, Georgetown University, Washington, DC 20057, USA
|