51
|
Garcia B, Walter ND, Dolganov G, Coram M, Davis JL, Schoolnik GK, Strong M. A minimum variance method for genome-wide data-driven normalization of quantitative real-time polymerase chain reaction expression data. Anal Biochem 2014; 458:11-3. [PMID: 24780223 DOI: 10.1016/j.ab.2014.04.021] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 04/04/2014] [Accepted: 04/18/2014] [Indexed: 12/16/2022]
Abstract
Advances in multiplex qRT-PCR have enabled increasingly accurate and robust quantification of RNA, even at lower concentrations, facilitating RNA expression profiling in clinical and environmental samples. Here we describe a data-driven qRT-PCR normalization method, the minimum variance method, and evaluate it on clinically derived Mycobacterium tuberculosis samples with variable transcript detection percentages. For moderate to significant amounts of nondetection (∼50%), our minimum variance method consistently produces the lowest false discovery rates compared to commonly used data-driven normalization methods.
Affiliation(s)
- Benjamin Garcia
  - Integrated Center for Genes, Environment, and Health, National Jewish Health, Denver, CO 80206, USA; Computational Bioscience Program, University of Colorado Denver, Anschutz Medical Campus, Aurora, CO 80045, USA
- Nicholas D Walter
  - Division of Pulmonary Sciences and Critical Care Medicine, University of Colorado Denver, Anschutz Medical Campus, Aurora, CO 80045, USA
- Gregory Dolganov
  - Department of Microbiology and Immunology, Stanford University, Stanford, CA 94304, USA
- Marc Coram
  - Department of Health Research and Policy, Stanford University, Stanford, CA 94305, USA
- J Lucian Davis
  - Division of Pulmonary & Critical Care Medicine, San Francisco General Hospital, University of California, San Francisco, San Francisco, CA, USA
- Gary K Schoolnik
  - Department of Microbiology and Immunology, Stanford University, Stanford, CA 94304, USA
- Michael Strong
  - Integrated Center for Genes, Environment, and Health, National Jewish Health, Denver, CO 80206, USA; Computational Bioscience Program, University of Colorado Denver, Anschutz Medical Campus, Aurora, CO 80045, USA
|
52
|
Abstract
MOTIVATION: Quantitative real-time PCR (qPCR) is one of the most widely used methods to measure gene expression. Despite extensive research in qPCR laboratory protocols, normalization and statistical analysis, little attention has been given to qPCR non-detects, those reactions failing to produce a minimum amount of signal. RESULTS: We show that the common methods of handling qPCR non-detects lead to biased inference. Furthermore, we show that non-detects do not represent data missing completely at random and likely represent missing data occurring not at random. We propose a model of the missing data mechanism and develop a method to directly model non-detects as missing data. Finally, we show that our approach results in a sizeable reduction in bias when estimating both absolute and differential gene expression. AVAILABILITY AND IMPLEMENTATION: The proposed algorithm is implemented in the R package nondetects. This package also contains the raw data for the three example datasets used in this manuscript. The package is freely available at http://mnmccall.com/software and as part of the Bioconductor project.
Affiliation(s)
- Matthew N McCall
  - Department of Biostatistics and Computational Biology, Department of Biomedical Genetics and James P Wilmot Cancer Center, University of Rochester Medical Center, Rochester, NY 14642, USA
- Helene R McMurray
  - Department of Biostatistics and Computational Biology, Department of Biomedical Genetics and James P Wilmot Cancer Center, University of Rochester Medical Center, Rochester, NY 14642, USA
- Hartmut Land
  - Department of Biostatistics and Computational Biology, Department of Biomedical Genetics and James P Wilmot Cancer Center, University of Rochester Medical Center, Rochester, NY 14642, USA
- Anthony Almudevar
  - Department of Biostatistics and Computational Biology, Department of Biomedical Genetics and James P Wilmot Cancer Center, University of Rochester Medical Center, Rochester, NY 14642, USA
|
53
|
Nuutila K, Katayama S, Vuola J, Kankuri E. Human Wound-Healing Research: Issues and Perspectives for Studies Using Wide-Scale Analytic Platforms. Adv Wound Care (New Rochelle) 2014; 3:264-271. [PMID: 24761360 DOI: 10.1089/wound.2013.0502] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Received: 08/07/2013] [Accepted: 11/26/2013] [Indexed: 11/13/2022] Open
Abstract
Significance: Revealing the basic mechanisms in the healing process and then regulating these processes for faster healing or to avoid negative outcomes such as infection or scarring are fundamental to wound research. The normal healing process is basically known, but to thoroughly understand the very complex aspects involved, it is necessary to characterize the course of events at a higher resolution with the latest molecular techniques and methodologies. Recent Advances: Various animal models are used in wound-healing research. Rodent and pig models are the ones most often used, probably because of pre-existing sophisticated research methodologies and as the proper care and ethical use of these species are highly developed and organized to serve science throughout the world. Critical Issues: Since several animal models are used, their anatomical and physiological differences varyingly affect the translation of results on healing mechanisms. Hence, to avoid species-specific misinformation, more ways to study wound healing directly in humans are needed. Future Directions: Fortunately, novel techniques have enabled high-end molecular-level research even from small samples of tissue. Since these methods require only a small amount of patient skin, they make it possible to study wound healing directly in humans.
Affiliation(s)
- Kristo Nuutila
  - Institute of Biomedicine, Pharmacology, Biomedicum, University of Helsinki, Helsinki, Finland
- Shintaro Katayama
  - Department of Biosciences and Nutrition, Karolinska Institutet, Huddinge, Sweden
  - Science for Life Laboratory, Solna, Sweden
- Jyrki Vuola
  - Helsinki Burn Center, Töölö Hospital, Helsinki University Central Hospital, Helsinki, Finland
- Esko Kankuri
  - Institute of Biomedicine, Pharmacology, Biomedicum, University of Helsinki, Helsinki, Finland
|
54
|
Systemic toll-like receptor and interleukin-18 pathway activation in patients with acute ST elevation myocardial infarction. J Mol Cell Cardiol 2014; 67:94-102. [PMID: 24389343 DOI: 10.1016/j.yjmcc.2013.12.021] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Received: 09/09/2013] [Revised: 12/09/2013] [Accepted: 12/25/2013] [Indexed: 12/21/2022]
Abstract
Acute myocardial infarction (AMI) is accompanied by increased expression of Toll-like receptor (TLR) 2 and TLR4 on circulating monocytes. In animal models, blocking TLR2/4 signaling reduces inflammatory cell influx and infarct size. The clinical consequences of TLR activation during AMI in humans are unknown, including its role in long-term cardiac functional outcome. Therefore, we analyzed gene expression in whole blood samples from 28 patients with an acute ST elevation myocardial infarction (STEMI), enrolled in the EXenatide trial for AMI patients (EXAMI), both at admission and after 4-month follow-up, by whole genome expression profiling and real-time PCR. Cardiac function was determined by cardiac magnetic resonance (CMR) imaging at baseline and after 4-month follow-up. TLR pathway activation was shown by increased expression of TLR4 and its downstream genes, including IL-18R1, IL-18R2, IL-8, MMP9, HIF1A, and NFKBIA. In contrast, expression of the classical TLR-induced gene TNF was reduced. Bioinformatics analysis and in vitro experiments explained this noncanonical TLR response by identification of a pivotal role for HIF-1α. The extent of TLR activation and IL-18R1/2 expression in circulating cells preceded massive troponin-T release and correlated with the CMR-measured ischemic area (R=0.48, p=0.01). In conclusion, we identified a novel HIF-1-dependent noncanonical TLR activation pathway in circulating leukocytes leading to enhanced IL-18R expression, which correlated with the magnitude of the ischemic area. This knowledge may contribute to our mechanistic understanding of the involvement of the innate immune system during STEMI and may yield diagnostic and prognostic value for patients with myocardial infarction.
|
55
|
Abstract
Deep sequencing of transcriptomes has become an indispensable tool for biology, enabling expression levels for thousands of genes to be compared across multiple samples. Since transcript counts scale with sequencing depth, counts from different samples must be normalized to a common scale prior to comparison. We analyzed fifteen existing and novel algorithms for normalizing transcript counts, and evaluated the effectiveness of the resulting normalizations. For this purpose we defined two novel and mutually independent metrics: (1) the number of “uniform” genes (genes whose normalized expression levels have a sufficiently low coefficient of variation), and (2) low Spearman correlation between normalized expression profiles of gene pairs. We also define four novel algorithms, one of which explicitly maximizes the number of uniform genes, and compared the performance of all fifteen algorithms. The two most commonly used methods (scaling to a fixed total value, or equalizing the expression of certain ‘housekeeping’ genes) yielded particularly poor results, surpassed even by normalization based on randomly selected gene sets. Conversely, seven of the algorithms approached what appears to be optimal normalization. Three of these algorithms rely on the identification of “ubiquitous” genes: genes expressed in all the samples studied, but never at very high or very low levels. We demonstrate that these include a “core” of genes expressed in many tissues in a mutually consistent pattern, which is suitable for use as an internal normalization guide. The new methods yield robustly normalized expression values, which is a prerequisite for the identification of differentially expressed and tissue-specific genes as potential biomarkers.
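The first metric in the abstract above, the number of "uniform" genes, can be illustrated with a minimal sketch. The abstract does not give the coefficient-of-variation cutoff or the exact estimator, so the `cv_cutoff` value, the use of the population standard deviation, and the function names below are assumptions made for illustration only.

```python
from statistics import mean, pstdev

def coefficient_of_variation(levels):
    """Standard deviation divided by mean for one gene's normalized levels."""
    m = mean(levels)
    if m == 0:
        return float("inf")  # CV is undefined; treat the gene as non-uniform
    return pstdev(levels) / m

def count_uniform_genes(expression, cv_cutoff=0.25):
    """Count genes whose CV across samples is at or below cv_cutoff.

    expression maps a gene name to its normalized levels, one per sample.
    cv_cutoff is a hypothetical threshold, not a value from the paper.
    """
    return sum(
        1 for levels in expression.values()
        if coefficient_of_variation(levels) <= cv_cutoff
    )

# Toy normalized expression values (illustrative, not real data).
expression = {
    "geneA": [100, 110, 90],  # low variation across samples
    "geneB": [10, 200, 50],   # high variation across samples
    "geneC": [5, 5, 5],       # constant across samples
}
uniform = count_uniform_genes(expression)
```

Under this reading, a normalization that yields more uniform genes on the same data would score better on the first metric.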
|
56
|
Vischi Winck F, Arvidsson S, Riaño-Pachón DM, Hempel S, Koseska A, Nikoloski Z, Urbina Gomez DA, Rupprecht J, Mueller-Roeber B. Genome-wide identification of regulatory elements and reconstruction of gene regulatory networks of the green alga Chlamydomonas reinhardtii under carbon deprivation. PLoS One 2013; 8:e79909. [PMID: 24224019 PMCID: PMC3816576 DOI: 10.1371/journal.pone.0079909] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 11/03/2012] [Accepted: 10/01/2013] [Indexed: 11/18/2022] Open
Abstract
The unicellular green alga Chlamydomonas reinhardtii is a long-established model organism for studies on photosynthesis and carbon metabolism-related physiology. Under conditions of air-level carbon dioxide concentration [CO2], a carbon concentrating mechanism (CCM) is induced to facilitate cellular carbon uptake. The CCM increases the availability of carbon dioxide at the site of cellular carbon fixation. To improve our understanding of the transcriptional control of the CCM, we employed FAIRE-seq (Formaldehyde-Assisted Isolation of Regulatory Elements, followed by deep sequencing) to determine nucleosome-depleted chromatin regions of algal cells subjected to carbon deprivation. Our FAIRE data recapitulated the positions of known regulatory elements in the promoter of the periplasmic carbonic anhydrase (Cah1) gene, which is upregulated during CCM induction, and revealed new candidate regulatory elements at a genome-wide scale. In addition, time series expression patterns of 130 transcription factor (TF) and transcription regulator (TR) genes were obtained for cells cultured under photoautotrophic conditions and subjected to a shift from high to low [CO2]. Groups of co-expressed genes were identified and a putative directed gene-regulatory network underlying the CCM was reconstructed from the gene expression data using the recently developed IOTA (inner composition alignment) method. Among the candidate regulatory genes, two members of the MYB-related TF family, Lcr1 (Low-CO2 response regulator 1) and Lcr2 (Low-CO2 response regulator 2), may play an important role in down-regulating the expression of a particular set of TF and TR genes in response to low [CO2]. The results obtained provide new insights into the transcriptional control of the CCM and revealed more than 60 new candidate regulatory genes. Deep sequencing of nucleosome-depleted genomic regions indicated the presence of new, previously unknown regulatory elements in the C. reinhardtii genome.
Our work can serve as a basis for future functional studies of transcriptional regulator genes and genomic regulatory elements in Chlamydomonas.
Affiliation(s)
- Flavia Vischi Winck
  - GoFORSYS Research Unit for Systems Biology, Institute of Biochemistry and Biology, University of Potsdam, Potsdam-Golm, Germany
  - GoFORSYS Research Unit for Systems Biology, Max-Planck Institute of Molecular Plant Physiology, Potsdam-Golm, Germany
- Samuel Arvidsson
  - GoFORSYS Research Unit for Systems Biology, Max-Planck Institute of Molecular Plant Physiology, Potsdam-Golm, Germany
- Diego Mauricio Riaño-Pachón
  - Group of Computational and Evolutionary Biology, Biological Sciences Department, Universidad de los Andes, Bogotá, Colombia
- Sabrina Hempel
  - University of Potsdam, Institute of Physics, Potsdam-Golm, Germany
  - Potsdam Institute for Climate Impact Research (PIK), Potsdam, Germany
  - Department of Physics, Humboldt University of Berlin, Berlin, Germany
- Aneta Koseska
  - University of Potsdam, Institute of Physics, Potsdam-Golm, Germany
- Zoran Nikoloski
  - GoFORSYS Research Unit for Systems Biology, Institute of Biochemistry and Biology, University of Potsdam, Potsdam-Golm, Germany
  - Systems Biology and Mathematical Modeling Group, Max-Planck Institute of Molecular Plant Physiology, Potsdam-Golm, Germany
- David Alejandro Urbina Gomez
  - Group of Computational and Evolutionary Biology, Biological Sciences Department, Universidad de los Andes, Bogotá, Colombia
- Jens Rupprecht
  - GoFORSYS Research Unit for Systems Biology, Max-Planck Institute of Molecular Plant Physiology, Potsdam-Golm, Germany
- Bernd Mueller-Roeber
  - GoFORSYS Research Unit for Systems Biology, Institute of Biochemistry and Biology, University of Potsdam, Potsdam-Golm, Germany
  - GoFORSYS Research Unit for Systems Biology, Max-Planck Institute of Molecular Plant Physiology, Potsdam-Golm, Germany
|
57
|
Pizzamiglio S, Bottelli S, Ciniselli CM, Zanutto S, Bertan C, Gariboldi M, Pierotti MA, Verderio P. A normalization strategy for the analysis of plasma microRNA qPCR data in colorectal cancer. Int J Cancer 2013; 134:2016-8. [PMID: 24150995 DOI: 10.1002/ijc.28530] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 07/25/2013] [Revised: 08/30/2013] [Accepted: 09/30/2013] [Indexed: 11/07/2022]
Affiliation(s)
- Sara Pizzamiglio
- Unit of Medical Statistics Biometry and Bioinformatics, Fondazione IRCCS Istituto Nazionale dei Tumori, Milan, Italy
|
58
|
Bowler RP, Bahr TM, Hughes G, Lutz S, Kim YI, Coldren CD, Reisdorph N, Kechris KJ. Integrative omics approach identifies interleukin-16 as a biomarker of emphysema. OMICS-A JOURNAL OF INTEGRATIVE BIOLOGY 2013; 17:619-26. [PMID: 24138069 DOI: 10.1089/omi.2013.0038] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Indexed: 01/25/2023]
Abstract
Interleukin-16 (IL-16) is a multifunctional cytokine that has been associated with autoimmune and allergic diseases. To investigate comprehensively whether IL-16 is also associated with chronic obstructive pulmonary disease (COPD) and emphysema, we performed an integrated analysis of multiple "omics" data. Over 500 subjects participating in the COPDGene® study donated blood and were clinically characterized and genetically profiled. IL-16 mRNA levels were measured in peripheral blood mononuclear cells (PBMC), and protein levels were measured in fresh frozen plasma. A multivariate analysis found plasma IL-16 positively associated with age and body mass index, and negatively associated with current smoking and emphysema in the upper lobes. PBMC IL-16 expression was positively associated with gender and a composite score for airflow obstruction, emphysema, and gas trapping. Whole-genome expression quantitative trait locus (eQTL) analysis identified a novel IL-16 missense SNP (rs11556218) associated with lower IL-16 in plasma. In summary, an integrated "omics" analysis in a very large cohort identified an association between decreased IL-16 and emphysema and discovered a novel IL-16 cis-eQTL. Thus IL-16 plasma levels and IL-16 genotyping may be useful in a personalized medicine approach for lung disease.
Affiliation(s)
- Russell P Bowler
- Department of Medicine, National Jewish Health, Denver, Colorado
|
59
|
Carlsson J, Helenius G, Karlsson MG, Andrén O, Klinga-Levan K, Olsson B. Differences in microRNA expression during tumor development in the transition and peripheral zones of the prostate. BMC Cancer 2013; 13:362. [PMID: 23890084 PMCID: PMC3733730 DOI: 10.1186/1471-2407-13-362] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 10/04/2012] [Accepted: 07/09/2013] [Indexed: 01/07/2023] Open
Abstract
Background The prostate is divided into three glandular zones, the peripheral zone (PZ), the transition zone (TZ), and the central zone. Most prostate tumors arise in the peripheral zone (70-75%) and in the transition zone (20-25%) while only 10% arise in the central zone. The aim of this study was to investigate if differences in miRNA expression could be a possible explanation for the difference in propensity of tumors in the zones of the prostate. Methods Patients with prostate cancer were included in the study if they had a tumor with Gleason grade 3 in the PZ, the TZ, or both (n=16). Normal prostate tissue was collected from men undergoing cystoprostatectomy (n=20). The expression of 667 unique miRNAs was investigated using TaqMan low density arrays for miRNAs. Student’s t-test was used in order to identify differentially expressed miRNAs, followed by hierarchical clustering and principal component analysis (PCA) to study the separation of the tissues. The ADtree algorithm was used to identify markers for classification of tissues and a cross-validation procedure was used to test the generality of the identified miRNA-based classifiers. Results The t-tests revealed that the major differences in miRNA expression are found between normal and malignant tissues. Hierarchical clustering and PCA based on differentially expressed miRNAs between normal and malignant tissues showed perfect separation between samples, while the corresponding analyses based on differentially expressed miRNAs between the two zones showed several misplaced samples. A classification and cross-validation procedure confirmed these results and several potential miRNA markers were identified. Conclusions The results of this study indicate that the major differences in the transcription program are those arising during tumor development, rather than during normal tissue development. In addition, tumors arising in the TZ have more unique differentially expressed miRNAs compared to the PZ. 
The results also indicate that separate miRNA expression signatures for diagnosis might be needed for tumors arising in the different zones. MicroRNA signatures that are specific for PZ and TZ tumors could also lead to more accurate prognoses, since tumors arising in the PZ tend to be more aggressive than tumors arising in the TZ.
Affiliation(s)
- Jessica Carlsson
- Department of Urology, Örebro University Hospital, Örebro, Sweden.
|
60
|
Ejigu BA, Valkenborg D, Baggerman G, Vanaerschot M, Witters E, Dujardin JC, Burzykowski T, Berg M. Evaluation of normalization methods to pave the way towards large-scale LC-MS-based metabolomics profiling experiments. OMICS-A JOURNAL OF INTEGRATIVE BIOLOGY 2013; 17:473-85. [PMID: 23808607 DOI: 10.1089/omi.2013.0010] [Citation(s) in RCA: 75] [Impact Index Per Article: 6.8] [Indexed: 01/29/2023]
Abstract
Combining liquid chromatography-mass spectrometry (LC-MS)-based metabolomics experiments that were collected over a long period of time remains problematic due to systematic variability between LC-MS measurements. Until now, most normalization methods for LC-MS data are model-driven, based on internal standards or intermediate quality control runs, where an external model is extrapolated to the dataset of interest. In the first part of this article, we evaluate several existing data-driven normalization approaches on LC-MS metabolomics experiments, which do not require the use of internal standards. According to variability measures, each normalization method performs relatively well, showing that the use of any normalization method will greatly improve data-analysis originating from multiple experimental runs. In the second part, we apply cyclic-Loess normalization to a Leishmania sample. This normalization method allows the removal of systematic variability between two measurement blocks over time and maintains the differential metabolites. In conclusion, normalization allows for pooling datasets from different measurement blocks over time and increases the statistical power of the analysis, hence paving the way to increase the scale of LC-MS metabolomics experiments. From our investigation, we recommend data-driven normalization methods over model-driven normalization methods, if only a few internal standards were used. Moreover, data-driven normalization methods are the best option to normalize datasets from untargeted LC-MS experiments.
|
61
|
Prognostic microRNAs in cancer tissue from patients operated for pancreatic cancer--five microRNAs in a prognostic index. World J Surg 2013; 36:2699-707. [PMID: 22851141 DOI: 10.1007/s00268-012-1705-y] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.9] [Indexed: 12/15/2022]
Abstract
BACKGROUND The aim of the present study was to identify a panel of microRNAs (miRNAs) that can predict overall survival (OS) in non micro-dissected cancer tissues from patients operated for pancreatic cancer (PC). METHODS MiRNAs were purified from formalin-fixed paraffin embedded (FFPE) cancer tissue from 225 patients operated for PC. Only a few of those patients received adjuvant chemotherapy. Expressions of miRNAs were determined with the TaqMan MicroRNA Array v2.0. Two statistical methods, univariate selection and the Lasso (Least Absolute Shrinkage and Selection Operator) method, were applied in conjunction with the Cox proportional hazard model to relate miRNAs to OS. RESULTS High expression of miR-212 and miR-675 and low expression of miR-148a, miR-187, and let-7g predicted short OS independent of age, gender, calendar year of operation, KRAS mutation status, tumor stage, American Society of Anesthesiologists (ASA) score, localization (not miR-148a), and differentiation of tumor. A prognostic index (PI) based on these five miRNAs was calculated for each patient. The median survival was 1.09 years (Confidence Interval [CI] 0.98-1.43) for PI > median PI compared to 2.23 years (CI 1.84-4.36) for PI < median. MiR-212, miR-675, miR-187, miR-205, miR-944, miR-431, miR-194, miR-148a, and miR-769-5p showed the strongest prediction ability by the Lasso method. Thus miR-212, miR-675, miR-187, and miR-148a were predictors for OS in both statistical methods. CONCLUSIONS The combination of five miRNAs expression in non micro-dissected FFPE PC tissue can identify patients with short OS after radical surgery. The results are independent of chemotherapy treatment. Patients with a prognostic index > median had a very short median OS of only 1 year.
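The prognostic index and median split described above can be sketched as follows. The abstract does not give the formula or the fitted coefficients, so this assumes the conventional Cox linear predictor (a coefficient-weighted sum of expression values). The coefficient magnitudes, patient identifiers, and expression values are hypothetical; only the coefficient signs follow the abstract (high miR-212 and miR-675, and low miR-148a, miR-187, and let-7g, predict short OS).

```python
from statistics import median

# Hypothetical coefficients: magnitudes are placeholders, signs follow the abstract.
COEFFICIENTS = {
    "miR-212": 1.0,
    "miR-675": 1.0,
    "miR-148a": -1.0,
    "miR-187": -1.0,
    "let-7g": -1.0,
}

def prognostic_index(expression):
    """Weighted sum of one patient's miRNA expression values (assumed Cox form)."""
    return sum(beta * expression[name] for name, beta in COEFFICIENTS.items())

def split_by_median(indices):
    """Label each patient 'high' if PI exceeds the cohort median, else 'low'.

    Ties at the median fall into 'low' here, which is a simplification.
    """
    cutoff = median(indices.values())
    return {pid: ("high" if pi > cutoff else "low") for pid, pi in indices.items()}

# Toy expression values (illustrative, not real data).
patients = {
    "P1": {"miR-212": 2.0, "miR-675": 1.5, "miR-148a": 0.5, "miR-187": 0.4, "let-7g": 0.6},
    "P2": {"miR-212": 0.5, "miR-675": 0.4, "miR-148a": 2.0, "miR-187": 1.8, "let-7g": 1.7},
    "P3": {"miR-212": 1.2, "miR-675": 1.0, "miR-148a": 1.0, "miR-187": 1.1, "let-7g": 0.9},
    "P4": {"miR-212": 1.8, "miR-675": 1.6, "miR-148a": 0.8, "miR-187": 0.7, "let-7g": 0.9},
}
indices = {pid: prognostic_index(expr) for pid, expr in patients.items()}
groups = split_by_median(indices)
```

In the study, the group with PI above the median is the one reported to have the shorter median survival.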
|
62
|
Normalization of miRNA qPCR high-throughput data: a comparison of methods. Biotechnol Lett 2013; 35:843-51. [DOI: 10.1007/s10529-013-1150-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 09/20/2012] [Accepted: 01/23/2013] [Indexed: 10/27/2022]
|
63
|
Dai H, Charnigo R, Vyhlidal CA, Jones BL, Bhandary M. Mixed modeling and sample size calculations for identifying housekeeping genes. Stat Med 2013; 32:3115-25. [PMID: 23444319 DOI: 10.1002/sim.5768] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 06/12/2012] [Accepted: 01/29/2013] [Indexed: 11/05/2022]
Abstract
Normalization of gene expression data using internal control genes that have biologically stable expression levels is an important process for analyzing reverse transcription polymerase chain reaction data. We propose a three-way linear mixed-effects model to select optimal housekeeping genes. The mixed-effects model can accommodate multiple continuous and/or categorical variables with sample random effects, gene fixed effects, systematic effects, and gene by systematic effect interactions. We propose using the intraclass correlation coefficient among gene expression levels as the stability measure to select housekeeping genes that have low within-sample variation. Global hypothesis testing is proposed to ensure that selected housekeeping genes are free of systematic effects or gene by systematic effect interactions. A gene combination with the highest lower bound of 95% confidence interval for intraclass correlation coefficient and no significant systematic effects is selected for normalization. Sample size calculation based on the estimation accuracy of the stability measure is offered to help practitioners design experiments to identify housekeeping genes. We compare our methods with geNorm and NormFinder by using three case studies. A free software package written in SAS (Cary, NC, U.S.A.) is available at http://d.web.umkc.edu/daih under software tab.
Affiliation(s)
- Hongying Dai
- Research Development and Clinical Investigation, Children's Mercy Hospital, 2401 Gillham Road, Kansas City, MO 64108, USA.
|
64
|
Abstract
Background MicroRNAs (miRNAs) are short non-coding RNA molecules that regulate mRNA transcript levels and translation. Deregulation of microRNAs is indicated in a number of diseases and microRNAs are seen as a promising target for biomarker identification and drug development. miRNA expression is commonly measured by microarray or real-time polymerase chain reaction (RT-PCR). The findings of RT-PCR data are highly dependent on the normalization techniques used during preprocessing of the Cycle Threshold readings from RT-PCR. Some of the commonly used endogenous controls themselves have been discovered to be differentially expressed in various conditions such as cancer, making them inappropriate internal controls. Methods We demonstrate that RT-PCR data contains a systematic bias resulting in large variations in the Cycle Threshold (CT) values of the low-abundant miRNA samples. We propose a new data normalization method that considers all available microRNAs as endogenous controls. A weighted normalization approach is utilized to allow contribution from all microRNAs, weighted by their empirical stability. Results The systematic bias in RT-PCR data is illustrated on a microRNA dataset obtained from primary cutaneous melanocytic neoplasms. We show that through a single control parameter, this method is able to emulate other commonly used normalization methods and thus provides a more general approach. We explore the consistency of RT-PCR expression data with microarray expression by utilizing a dataset where both RT-PCR and microarray profiling data is available for the same miRNA samples. Conclusions A weighted normalization method allows the contribution of all of the miRNAs, whether they are highly abundant or have low expression levels. Our findings further suggest that the normalization of a particular miRNA should rely on only miRNAs that have comparable expression levels.
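The weighted normalization idea above can be sketched under stated assumptions. The abstract does not specify the weighting function, so the inverse-variance form `1 / variance**alpha`, the parameter name `alpha`, and the function names are hypothetical. With `alpha = 0` every miRNA gets equal weight, which reduces to global mean normalization; larger values let the most stable miRNAs dominate.

```python
from statistics import pvariance

def stability_weights(ct, alpha=1.0, eps=1e-9):
    """Weight each miRNA by inverse Ct variance raised to alpha (assumed form).

    ct maps a miRNA name to its Ct values, one per sample.
    eps avoids division by zero for a perfectly stable miRNA.
    """
    raw = {m: 1.0 / (pvariance(values) + eps) ** alpha for m, values in ct.items()}
    total = sum(raw.values())
    return {m: w / total for m, w in raw.items()}

def normalize(ct, alpha=1.0):
    """Subtract a per-sample weighted mean Ct from every miRNA in that sample."""
    weights = stability_weights(ct, alpha)
    n_samples = len(next(iter(ct.values())))
    factors = [
        sum(weights[m] * ct[m][s] for m in ct) for s in range(n_samples)
    ]
    return {m: [values[s] - factors[s] for s in range(n_samples)]
            for m, values in ct.items()}

# Toy Ct values for two miRNAs across two samples (illustrative, not real data).
ct = {"miR-a": [20.0, 22.0], "miR-b": [30.0, 30.0]}
global_mean = normalize(ct, alpha=0)  # equal weights: plain mean normalization
stability = normalize(ct, alpha=1.0)  # the stable miR-b dominates the factor
```

This sketch does not restrict the controls to miRNAs with comparable expression levels, which the abstract's conclusions recommend; it only shows how one parameter can move between normalization schemes.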
Affiliation(s)
- Rehman Qureshi
- Center for integrated Bioinformatics, School of Biomedical Engineering, Science and Health System, Drexel University, 3120 Market Street, Philadelphia, PA 19104, USA
|
65
|
Salzman J, Klass DM, Brown PO. Improved discovery of molecular interactions in genome-scale data with adaptive model-based normalization. PLoS One 2013; 8:e53930. [PMID: 23349766 PMCID: PMC3551948 DOI: 10.1371/journal.pone.0053930] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 10/14/2012] [Accepted: 12/07/2012] [Indexed: 11/18/2022] Open
Abstract
Background High throughput molecular-interaction studies using immunoprecipitations (IP) or affinity purifications are powerful and widely used in biology research. One of many important applications of this method is to identify the set of RNAs that interact with a particular RNA-binding protein (RBP). Here, the unique statistical challenge presented is to delineate a specific set of RNAs that are enriched in one sample relative to another, typically a specific IP compared to a non-specific control to model background. The choice of normalization procedure critically impacts the number of RNAs that will be identified as interacting with an RBP at a given significance threshold – yet existing normalization methods make assumptions that are often fundamentally inaccurate when applied to IP enrichment data. Methods In this paper, we present a new normalization methodology that is specifically designed for identifying enriched RNA or DNA sequences in an IP. The normalization (called adaptive or AD normalization) uses a basic model of the IP experiment and is not a variant of mean, quantile, or other methodology previously proposed. The approach is evaluated statistically and tested with simulated and empirical data. Results and Conclusions The adaptive (AD) normalization method results in a greatly increased range in the number of enriched RNAs identified, fewer false positives, and overall better concordance with independent biological evidence, for the RBPs we analyzed, compared to median normalization. The approach is also applicable to the study of pairwise RNA, DNA and protein interactions such as the analysis of transcription factors via chromatin immunoprecipitation (ChIP) or any other experiments where samples from two conditions, one of which contains an enriched subset of the other, are studied.
Affiliation(s)
- Julia Salzman
- Department of Biochemistry, Stanford University School of Medicine, Stanford, California, United States of America
- Howard Hughes Medical Institute, Stanford University School of Medicine, Stanford, California, United States of America
- Department of Statistics, Stanford University, Stanford, California, United States of America
- * E-mail: (JS); (POB)
- Daniel M. Klass
- Department of Biochemistry, Stanford University School of Medicine, Stanford, California, United States of America
- Howard Hughes Medical Institute, Stanford University School of Medicine, Stanford, California, United States of America
- Patrick O. Brown
- Department of Biochemistry, Stanford University School of Medicine, Stanford, California, United States of America
- Howard Hughes Medical Institute, Stanford University School of Medicine, Stanford, California, United States of America
- * E-mail: (JS); (POB)
|
66
|
MicroRNA expression profiles associated with pancreatic adenocarcinoma and ampullary adenocarcinoma. Mod Pathol 2012; 25:1609-22. [PMID: 22878649 DOI: 10.1038/modpathol.2012.122] [Citation(s) in RCA: 123] [Impact Index Per Article: 10.3] [Indexed: 12/31/2022]
Abstract
MicroRNAs have potential as diagnostic cancer biomarkers. The aim of this study was (1) to define microRNA expression patterns in formalin-fixed paraffin-embedded tissue from pancreatic ductal adenocarcinoma, ampullary adenocarcinoma, normal pancreas and chronic pancreatitis without using micro-dissection and (2) to discover new diagnostic microRNAs and combinations of microRNAs in cancer tissue. The expression of 664 microRNAs in tissue from 170 pancreatic adenocarcinomas and 107 ampullary adenocarcinomas was analyzed using a commercial microRNA assay. Results were compared with chronic pancreatitis, normal pancreas and duodenal adenocarcinoma. In all, 43 microRNAs had higher and 41 microRNAs reduced expression in pancreatic cancer compared with normal pancreas. In all, 32 microRNAs were differently expressed in pancreatic adenocarcinoma compared with chronic pancreatitis (17 higher; 15 reduced). Several of these microRNAs have not before been related to diagnosis of pancreatic cancer (eg, miR-492, miR-614, miR-622). MiR-614, miR-492, miR-622, miR-135b and miR-196 were most differently expressed. MicroRNA profiles of pancreatic and ampullary adenocarcinomas were correlated (0.990). MicroRNA expression profiles for pancreatic cancer described in the literature were consistent with our findings, and the microRNA profile for pancreatic adenocarcinoma (miR-196b-miR-217) was validated. We identified a more significant expression profile, the difference between miR-411 and miR-198 (P = 2.06 × 10^-54) and a diagnostic LASSO classifier using 19 microRNAs (sensitivity 98.5%; positive predictive value 97.8%; accuracy 97.0%). We also identified microRNA profiles to subclassify ampullary adenocarcinomas into pancreatobiliary or intestinal type. In conclusion, we found that combinations of two microRNAs could roughly separate neoplastic from non-neoplastic samples. 
A diagnostic 19 microRNA classifier was constructed which without micro-dissection could discriminate pancreatic and ampullary adenocarcinomas from chronic pancreatitis and normal pancreas with high sensitivity and accuracy. Ongoing prospective studies will evaluate if these microRNA profiles are useful on fine-needle biopsies for early diagnosis of pancreatic cancer.
|
67
|
Determinants of human adipose tissue gene expression: impact of diet, sex, metabolic status, and cis genetic regulation. PLoS Genet 2012; 8:e1002959. [PMID: 23028366 PMCID: PMC3459935 DOI: 10.1371/journal.pgen.1002959] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.8] [Received: 02/20/2012] [Accepted: 07/25/2012] [Indexed: 12/17/2022]
Abstract
Weight control diets favorably affect parameters of the metabolic syndrome and delay the onset of diabetic complications. The adaptations occurring in adipose tissue (AT) are likely to have a profound impact on the whole body response as AT is a key target of dietary intervention. Identification of environmental and individual factors controlling AT adaptation is therefore essential. Here, expression of 271 transcripts, selected for regulation according to obesity and weight changes, was determined in 515 individuals before, after 8-week low-calorie diet-induced weight loss, and after 26-week ad libitum weight maintenance diets. For 175 genes, opposite regulation was observed during calorie restriction and weight maintenance phases, independently of variations in body weight. Metabolism and immunity genes showed inverse profiles. During the dietary intervention, network-based analyses revealed strong interconnection between expression of genes involved in de novo lipogenesis and components of the metabolic syndrome. Sex had a marked influence on AT expression of 88 transcripts, which persisted during the entire dietary intervention and after control for fat mass. In women, the influence of body mass index on expression of a subset of genes persisted during the dietary intervention. Twenty-two genes revealed a metabolic syndrome signature common to men and women. Genetic control of AT gene expression by cis signals was observed for 46 genes. Dietary intervention, sex, and cis genetic variants independently controlled AT gene expression. These analyses help understanding the relative importance of environmental and individual factors that control the expression of human AT genes and therefore may foster strategies aimed at improving AT function in metabolic diseases. In obesity, an excess of adipose tissue is associated with dyslipidemia and diabetic complications. Gene expression is under the control of various genetic and environmental factors. 
As a central organ for the control of metabolic disturbances in conditions of both weight gain and loss, a comprehensive understanding of the control of adipose tissue gene expression is of paramount interest. We analyzed adipose tissue gene expression in obese individuals from the DiOGenes protocol, one of the largest dietary interventions worldwide. We found evidence for composite control of adipose tissue gene expression by nutrition, metabolic syndrome, body mass index, sex, and genotype with two main novel features. First, we observed a preeminent effect of sex on adipose tissue gene expression, which was independent of nutritional status, fat mass, and sex chromosomes. Second, the control of gene expression by cis genetic factors was unaffected by sex and nutritional status. Altogether, the effects of the investigated factors were most often independent of each other. Comprehension of the relative importance of environmental and individual factors that control the expression of human adipose tissue genes may help deciphering strategies aimed at controlling adipose tissue function during metabolic disorders.
|
68
|
Abstract
Data normalization is a crucial preliminary step in analyzing genomic datasets. The goal of normalization is to remove global variation to make readings across different experiments comparable. In addition, most genomic loci have non-uniform sensitivity to any given assay because of variation in local sequence properties. In microarray experiments, this non-uniform sensitivity is due to different DNA hybridization and cross-hybridization efficiencies, known as the probe effect. In this paper we introduce a new scheme, called Group Normalization (GN), to remove both global and local biases in one integrated step, whereby we determine the normalized probe signal by finding a set of reference probes with similar responses. Compared to conventional normalization methods such as Quantile normalization and physically motivated probe effect models, our proposed method is general in the sense that it does not require the assumption that the underlying signal distribution be identical for the treatment and control, and is flexible enough to correct for nonlinear and higher order probe effects. The Group Normalization algorithm is computationally efficient and easy to implement. We also describe a variant of the Group Normalization algorithm, called Cross Normalization, which efficiently amplifies biologically relevant differences between any two genomic datasets.
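For context, the conventional quantile normalization that the abstract uses as a point of comparison can be sketched as follows. This is a minimal illustration of standard quantile normalization only, not of Group Normalization, which the abstract does not describe in enough detail to reproduce. The function name `quantile_normalize` and the toy values are assumptions made for this example, and ties are broken by position rather than averaged.

```python
def quantile_normalize(columns):
    """Quantile-normalize equal-length columns (one column per array)."""
    n = len(columns[0])
    if any(len(col) != n for col in columns):
        raise ValueError("all columns must have the same length")

    # Reference distribution: mean of the k-th smallest value across arrays.
    sorted_cols = [sorted(col) for col in columns]
    reference = [sum(col[k] for col in sorted_cols) / len(columns) for k in range(n)]

    normalized = []
    for col in columns:
        # Indices of this column ordered from smallest to largest value.
        order = sorted(range(n), key=lambda i: col[i])
        out = [0.0] * n
        for rank, i in enumerate(order):
            out[i] = reference[rank]  # replace each value by the reference at its rank
        normalized.append(out)
    return normalized


# Toy signals for three arrays measuring the same four probes (made-up values).
arrays = [[5.0, 2.0, 3.0, 4.0], [4.0, 1.0, 4.0, 2.0], [3.0, 4.0, 6.0, 8.0]]
normalized = quantile_normalize(arrays)
```

After this step every array holds exactly the same set of values, which is the identical-distribution assumption that the abstract says Group Normalization does not require.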
|
69
|
Freeman K, Staehle MM, Vadigepalli R, Gonye GE, Ogunnaike BA, Hoek JB, Schwaber JS. Coordinated dynamic gene expression changes in the central nucleus of the amygdala during alcohol withdrawal. Alcohol Clin Exp Res 2012; 37 Suppl 1:E88-100. [PMID: 22827539 DOI: 10.1111/j.1530-0277.2012.01910.x] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Received: 04/03/2012] [Accepted: 06/06/2012] [Indexed: 12/22/2022]
Abstract
BACKGROUND Chronic alcohol use causes widespread changes in the cellular biology of the amygdala's central nucleus (CeA), a GABAergic center that integrates autonomic physiology with the emotional aspects of motivation and learning. While alcohol-induced neurochemical changes play a role in dependence and drinking behavior, little is known about the CeA's dynamic changes during withdrawal, a period of emotional and physiologic disturbance. METHODS We used a qRT-PCR platform to measure 139 transcripts in 92 rat CeA samples from control (N = 33), chronically alcohol exposed (N = 26), and withdrawn rats (t = 4, 8, 18, 32, and 48 hours; N = 5, 10, 7, 6, 5). This focused transcript set allowed us to identify significant dynamic expression patterns during the first 48 hours of withdrawal and propose potential regulatory mechanisms. RESULTS Chronic alcohol exposure causes a limited number of small magnitude expression changes. In contrast, withdrawal results in a greater number of large changes within 4 hours of removal of the alcohol diet. Sixty-five of the 139 measured transcripts (47%) showed differential regulation during withdrawal. Over the 48-hour period, dynamic changes in the expression of γ-aminobutyric acid type A (GABA(A)), ionotropic glutamate and neuropeptide system-related G-protein-coupled receptor subunits, and the Ras/Raf signaling pathway were seen as well as downstream transcription factors (TFs) and epigenetic regulators. Four temporally correlated gene clusters were identified with shared functional roles including NMDA receptors, MAPKKK and chemokine signaling cascades, and mediators of long-term potentiation, among others. Cluster promoter regions shared overrepresented binding sites for multiple TFs including Cebp, Usf-1, Smad3, Ap-2, and c-Ets, suggesting a potential regulatory role. 
CONCLUSIONS During alcohol withdrawal, the CeA experiences rapid changes in mRNA expression of these functionally related transcripts that were not predicted by measurement during chronic exposure. This study provides new insight into dynamic expression changes during alcohol withdrawal and suggests novel regulatory relationships that potentially impact the aspects of emotional modulation.
Affiliation(s)
- Kate Freeman, Mary M Staehle, Rajanikanth Vadigepalli, Gregory E Gonye, Babatunde A Ogunnaike, Jan B Hoek, James S Schwaber
- Department of Pathology, Anatomy and Cell Biology (KF, MMS, RV, GEG, JBH, JSS), Daniel Baugh Institute for Functional Genomics and Computational Biology, Thomas Jefferson University, Philadelphia, Pennsylvania; Department of Chemical Engineering (MMS), Rowan University, Glassboro, New Jersey; Department of Chemical Engineering (MMS, BAO), University of Delaware, Newark, Delaware
|
70
|
Perkins JR, Dawes JM, McMahon SB, Bennett DLH, Orengo C, Kohl M. ReadqPCR and NormqPCR: R packages for the reading, quality checking and normalisation of RT-qPCR quantification cycle (Cq) data. BMC Genomics 2012; 13:296. [PMID: 22748112 PMCID: PMC3443438 DOI: 10.1186/1471-2164-13-296] [Citation(s) in RCA: 154] [Impact Index Per Article: 12.8] [Received: 03/13/2012] [Accepted: 05/30/2012] [Indexed: 11/10/2022]
Abstract
Background Measuring gene transcription using real-time reverse transcription polymerase chain reaction (RT-qPCR) technology is a mainstay of molecular biology. Technologies now exist to measure the abundance of many transcripts in parallel. The selection of the optimal reference gene for the normalisation of this data is a recurring problem, and several algorithms have been developed in order to solve it. So far nothing in R exists to unite these methods, together with other functions to read in and normalise the data using the chosen reference gene(s). Results We have developed two R/Bioconductor packages, ReadqPCR and NormqPCR, intended for a user with some experience with high-throughput data analysis using R, who wishes to use R to analyse RT-qPCR data. We illustrate their potential use in a workflow analysing a generic RT-qPCR experiment, and apply this to a real dataset. Packages are available from http://www.bioconductor.org/packages/release/bioc/html/ReadqPCR.html and http://www.bioconductor.org/packages/release/bioc/html/NormqPCR.html Conclusions These packages increase the repertoire of RT-qPCR analysis tools available to the R user and allow them to (amongst other things) read their data into R, hold it in an ExpressionSet compatible R object, choose appropriate reference genes, normalise the data and look for differential expression between samples.
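As a rough illustration of what normalising Cq data against chosen reference genes involves, the sketch below computes delta-Cq values and a 2^-ddCq relative quantity. It is written in Python for illustration and does not reproduce the ReadqPCR or NormqPCR API. The function names, the choice of GAPDH and ACTB as reference genes, the target `geneX`, and all Cq values are assumptions, and the fold-change formula assumes roughly 100% amplification efficiency.

```python
def delta_cq(sample_cq, reference_genes):
    """Subtract the mean Cq of the reference genes from every target gene's Cq."""
    ref_mean = sum(sample_cq[g] for g in reference_genes) / len(reference_genes)
    return {g: cq - ref_mean for g, cq in sample_cq.items() if g not in reference_genes}


def fold_change(dcq_sample, dcq_calibrator):
    """Relative quantity 2^-(ddCq); assumes roughly 100% amplification efficiency."""
    return {g: 2.0 ** -(dcq_sample[g] - dcq_calibrator[g]) for g in dcq_sample}


# Made-up Cq values for one control and one treated sample.
reference_genes = ["GAPDH", "ACTB"]
control = {"GAPDH": 18.0, "ACTB": 20.0, "geneX": 25.0}
treated = {"GAPDH": 18.5, "ACTB": 20.5, "geneX": 23.5}

fold = fold_change(delta_cq(treated, reference_genes), delta_cq(control, reference_genes))
```

Averaging Cq values across reference genes corresponds to taking the geometric mean of their quantities, because Cq is on a log2 scale.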
Affiliation(s)
- James R Perkins
- Institute of Structural and Molecular Biology, University College of London, London, UK.
|
71
|
Deo A, Carlsson J, Lindlöf A. How to choose a normalization strategy for miRNA quantitative real-time (qPCR) arrays. J Bioinform Comput Biol 2012; 9:795-812. [PMID: 22084014 DOI: 10.1142/s0219720011005793] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.8] [Received: 08/24/2011] [Revised: 10/05/2011] [Accepted: 10/05/2011] [Indexed: 01/21/2023]
Abstract
Low-density arrays for quantitative real-time PCR (qPCR) are increasingly being used as an experimental technique for miRNA expression profiling. As with gene expression profiling using microarrays, data from such experiments needs effective analysis methods to produce reliable and high-quality results. In the pre-processing of the data, one crucial analysis step is normalization, which aims to reduce measurement errors and technical variability among arrays that might have arisen during the execution of the experiments. However, there are currently a number of different approaches to choose among and an unsuitable applied method may induce misleading effects, which could affect the subsequent analysis steps and thereby any conclusions drawn from the results. The choice of normalization method is hence an important issue to consider. In this study we present the comparison of a number of data-driven normalization methods for TaqMan low-density arrays for qPCR and different descriptive statistical techniques that can facilitate the choice of normalization method. The performance of the normalization methods was assessed and compared against each other as well as against standard normalization using endogenous controls. The results clearly show that the data-driven methods reduce variation and represent robust alternatives to using endogenous controls.
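The abstract does not say which data-driven methods were compared. As one illustration of the general idea, the sketch below applies global mean centering, a commonly used data-driven approach in which each array's mean Cq over detected assays is subtracted from its values, so no endogenous control is needed. The function name, the `max_cq` detection cutoff of 35, the assay labels and the Cq values are all assumptions for this example.

```python
def global_mean_normalize(cq_by_array, max_cq=35.0):
    """Center each array on the mean Cq of its detected assays."""
    normalized = {}
    for array_id, cqs in cq_by_array.items():
        # Keep only assays that were detected (a value below the cutoff).
        detected = {a: cq for a, cq in cqs.items() if cq is not None and cq < max_cq}
        if not detected:
            raise ValueError(f"no detected assays on array {array_id!r}")
        mean_cq = sum(detected.values()) / len(detected)
        normalized[array_id] = {a: cq - mean_cq for a, cq in detected.items()}
    return normalized


# Made-up data: array B is array A shifted by +1 Cq, as a loading difference might cause.
raw = {
    "A": {"miR-a": 20.0, "miR-b": 24.0, "miR-c": 28.0, "miR-d": None},
    "B": {"miR-a": 21.0, "miR-b": 25.0, "miR-c": 29.0, "miR-d": None},
}
centered = global_mean_normalize(raw)
```

In this toy case the uniform technical shift between the two arrays disappears after centering, which is the kind of between-array variability that normalization aims to reduce.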
Affiliation(s)
- Ameya Deo
- Systems Biology Research Centre, University of Skövde, Box 408 Skövde, 541 28, Sweden.
|
72
|
Diehl P, Fricke A, Sander L, Stamm J, Bassler N, Htun N, Ziemann M, Helbing T, El-Osta A, Jowett JBM, Peter K. Microparticles: major transport vehicles for distinct microRNAs in circulation. Cardiovasc Res 2012; 93:633-44. [PMID: 22258631 PMCID: PMC3291092 DOI: 10.1093/cvr/cvs007] [Citation(s) in RCA: 383] [Impact Index Per Article: 31.9] [Indexed: 12/14/2022]
Abstract
Aims Circulating microRNAs (miRNAs) have attracted major interest as biomarkers for cardiovascular diseases. Since RNases are abundant in circulating blood, there needs to be a mechanism protecting miRNAs from degradation. We hypothesized that microparticles (MP) represent protective transport vehicles for miRNAs and that these are specifically packaged by their maternal cells. Methods and results Conventional plasma preparations, such as the ones used for biomarker detection, are shown to contain substantial numbers of platelet-, leucocyte-, and endothelial cell-derived MP. To analyse the widest spectrum of miRNAs, Next Generation Sequencing was used to assess miRNA profiles of MP and their corresponding stimulated and non-stimulated cells of origin. THP-1 (monocytic origin) and human umbilical vein endothelial cell (HUVEC) MP were used for representing circulating MP at a high purity. miRNA profiles of MP differed significantly from those of stimulated and non-stimulated maternal THP-1 cells and HUVECs, respectively. Quantitative reverse transcription–polymerase chain reaction of miRNAs which have been associated with cardiovascular diseases also demonstrated significant differences in miRNA profiles between platelets and their MP. Notably, the main fraction of miRNA in plasma was localized in MP. Furthermore, miRNA profiles of MP differed significantly between patients with stable and unstable coronary artery disease. Conclusion Circulating MP represent transport vehicles for large numbers of specific miRNAs, which have been associated with cardiovascular diseases. miRNA profiles of MP are significantly different from their maternal cells, indicating an active mechanism of selective ‘packaging’ from cells into MP. These findings describe an interesting mechanism for transferring gene-regulatory function from MP-releasing cells to target cells via MP circulating in blood.
Affiliation(s)
- Philipp Diehl
- Atherothrombosis and Vascular Biology, BakerIDI Heart and Diabetes Institute, 75 Commercial Road, Melbourne, VIC 3004, Australia
|
73
|
Bell SM, Burgoon LD, Last RL. MIPHENO: data normalization for high throughput metabolite analysis. BMC Bioinformatics 2012; 13:10. [PMID: 22244038 PMCID: PMC3278354 DOI: 10.1186/1471-2105-13-10] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 12/01/2011] [Accepted: 01/13/2012] [Indexed: 01/27/2023]
Abstract
Background High throughput methodologies such as microarrays, mass spectrometry and plate-based small molecule screens are increasingly used to facilitate discoveries from gene function to drug candidate identification. These large-scale experiments are typically carried out over the course of months and years, often without the controls needed to compare directly across the dataset. Few methods are available to facilitate comparisons of high throughput metabolic data generated in batches where explicit in-group controls for normalization are lacking. Results Here we describe MIPHENO (Mutant Identification by Probabilistic High throughput-Enabled Normalization), an approach for post-hoc normalization of quantitative first-pass screening data in the absence of explicit in-group controls. This approach includes a quality control step and facilitates cross-experiment comparisons that decrease the false non-discovery rates, while maintaining the high accuracy needed to limit false positives in first-pass screening. Results from simulation show an improvement in both accuracy and false non-discovery rate over a range of population parameters (p < 2.2 × 10^-16) and a modest but significant (p < 2.2 × 10^-16) improvement in area under the receiver operator characteristic curve of 0.955 for MIPHENO vs 0.923 for a group-based statistic (z-score). Analysis of the high throughput phenotypic data from the Arabidopsis Chloroplast 2010 Project (http://www.plastid.msu.edu/) showed ~ 4-fold increase in the ability to detect previously described or expected phenotypes over the group based statistic. Conclusions Results demonstrate MIPHENO offers substantial benefit in improving the ability to detect putative mutant phenotypes from post-hoc analysis of large data sets. Additionally, it facilitates data interpretation and permits cross-dataset comparison where group-based controls are missing. 
MIPHENO is applicable to a wide range of high throughput screenings and the code is freely available as Additional file 1 as well as through an R package in CRAN.
Affiliation(s)
- Shannon M Bell
- Quantitative Biology Program, Michigan State University, East Lansing, MI, USA
|
74
|
Bell SM, Burgoon LD, Last RL. MIPHENO: data normalization for high throughput metabolite analysis. BMC Bioinformatics 2012. [PMID: 22244038 DOI: 10.1186/1471-2105-13-10] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/09/2023]
Abstract
BACKGROUND High throughput methodologies such as microarrays, mass spectrometry and plate-based small molecule screens are increasingly used to facilitate discoveries from gene function to drug candidate identification. These large-scale experiments are typically carried out over the course of months and years, often without the controls needed to compare directly across the dataset. Few methods are available to facilitate comparisons of high throughput metabolic data generated in batches where explicit in-group controls for normalization are lacking. RESULTS Here we describe MIPHENO (Mutant Identification by Probabilistic High throughput-Enabled Normalization), an approach for post-hoc normalization of quantitative first-pass screening data in the absence of explicit in-group controls. This approach includes a quality control step and facilitates cross-experiment comparisons that decrease the false non-discovery rates, while maintaining the high accuracy needed to limit false positives in first-pass screening. Results from simulation show an improvement in both accuracy and false non-discovery rate over a range of population parameters (p < 2.2 × 10^-16) and a modest but significant (p < 2.2 × 10^-16) improvement in area under the receiver operator characteristic curve of 0.955 for MIPHENO vs 0.923 for a group-based statistic (z-score). Analysis of the high throughput phenotypic data from the Arabidopsis Chloroplast 2010 Project (http://www.plastid.msu.edu/) showed ~ 4-fold increase in the ability to detect previously described or expected phenotypes over the group based statistic. CONCLUSIONS Results demonstrate MIPHENO offers substantial benefit in improving the ability to detect putative mutant phenotypes from post-hoc analysis of large data sets. Additionally, it facilitates data interpretation and permits cross-dataset comparison where group-based controls are missing. 
MIPHENO is applicable to a wide range of high throughput screenings and the code is freely available as Additional file 1 as well as through an R package in CRAN.
Affiliation(s)
- Shannon M Bell
- Quantitative Biology Program, Michigan State University, East Lansing, MI, USA
|
75
|
Heckmann LH, Sørensen PB, Krogh PH, Sørensen JG. NORMA-Gene: a simple and robust method for qPCR normalization based on target gene data. BMC Bioinformatics 2011; 12:250. [PMID: 21693017 PMCID: PMC3223928 DOI: 10.1186/1471-2105-12-250] [Citation(s) in RCA: 107] [Impact Index Per Article: 8.2] [Received: 05/30/2011] [Accepted: 06/21/2011] [Indexed: 12/27/2022]
Abstract
BACKGROUND Normalization of target gene expression, measured by real-time quantitative PCR (qPCR), is a requirement for reducing experimental bias and thereby improving data quality. The currently used normalization approach is based on using one or more reference genes. Yet, this approach extends the experimental work load and suffers from assumptions that may be difficult to meet and to validate. RESULTS We developed a data driven normalization algorithm (NORMA-Gene). An analysis of the performance of NORMA-Gene compared to reference gene normalization on artificially generated data-sets showed that the NORMA-Gene normalization yielded more precise results under a large range of parameters tested. Furthermore, when tested on three very different real qPCR data-sets NORMA-Gene was shown to be best at reducing variance due to experimental bias in all three data-sets compared to normalization based on the use of reference gene(s). CONCLUSIONS Here we present the NORMA-Gene algorithm that is applicable to all biological and biomedical qPCR studies, especially those that are based on a limited number of assayed genes. The method is based on a data-driven normalization and is useful for as few as five target genes comprising the data-set. NORMA-Gene does not require the identification and validation of reference genes allowing researchers to focus their efforts on studying target genes of biological relevance.
Affiliation(s)
- Lars-Henrik Heckmann
- National Environmental Research Institute, Aarhus University, Department of Terrestrial Ecology, Vejlsøvej 25, DK-8600 Silkeborg, Denmark
|
76
|
A miRNA expression signature that separates between normal and malignant prostate tissues. Cancer Cell Int 2011; 11:14. [PMID: 21619623 PMCID: PMC3123620 DOI: 10.1186/1475-2867-11-14] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.0] [Received: 01/11/2011] [Accepted: 05/27/2011] [Indexed: 12/19/2022]
Abstract
BACKGROUND MicroRNAs (miRNAs) constitute a class of small non-coding RNAs that post-transcriptionally regulate genes involved in several key biological processes and thus are involved in various diseases, including cancer. In this study we aimed to identify a miRNA expression signature that could be used to separate between normal and malignant prostate tissues. RESULTS Nine miRNAs were found to be differentially expressed (p < 0.00001). With the exception of two samples, this expression signature could be used to separate between the normal and malignant tissues. A cross-validation procedure confirmed the generality of this expression signature. We also identified 16 miRNAs that possibly could be used as a complement to current methods for grading of prostate tumor tissues. CONCLUSIONS We found an expression signature based on nine differentially expressed miRNAs that with high accuracy (85%) could classify the normal and malignant prostate tissues in patients from the Swedish Watchful Waiting cohort. The results show that there are significant differences in miRNA expression between normal and malignant prostate tissue, indicating that these small RNA molecules might be important in the biogenesis of prostate cancer and potentially useful for clinical diagnosis of the disease.
|
77
|
RefGenes: identification of reliable and condition specific reference genes for RT-qPCR data normalization. BMC Genomics 2011; 12:156. [PMID: 21418615 PMCID: PMC3072958 DOI: 10.1186/1471-2164-12-156] [Citation(s) in RCA: 215] [Impact Index Per Article: 16.5] [Received: 05/27/2010] [Accepted: 03/21/2011] [Indexed: 12/31/2022]
Abstract
Background RT-qPCR is a sensitive and increasingly used method for gene expression quantification. To normalize RT-qPCR measurements between samples, most laboratories use endogenous reference genes as internal controls. There is increasing evidence, however, that the expression of commonly used reference genes can vary significantly in certain contexts. Results Using the Genevestigator database of normalized and well-annotated microarray experiments, we describe the expression stability characteristics of the transcriptomes of several organisms. The results show that a) no genes are universally stable, b) most commonly used reference genes yield very high transcript abundances as compared to the entire transcriptome, and c) for each biological context a subset of stable genes exists that has smaller variance than commonly used reference genes or genes that were selected for their stability across all conditions. Conclusion We therefore propose the normalization of RT-qPCR data using reference genes that are specifically chosen for the conditions under study. RefGenes is a community tool developed for that purpose. Validation RT-qPCR experiments across several organisms showed that the candidates proposed by RefGenes generally outperformed commonly used reference genes. RefGenes is available within Genevestigator at http://www.genevestigator.com.
Collapse
|
78
|
TRAM (Transcriptome Mapper): database-driven creation and analysis of transcriptome maps from multiple sources. BMC Genomics 2011; 12:121. [PMID: 21333005 PMCID: PMC3052188 DOI: 10.1186/1471-2164-12-121] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/04/2010] [Accepted: 02/18/2011] [Indexed: 11/10/2022] Open
Abstract
Background Several tools have been developed to perform global gene expression profile data analysis, to search for specific chromosomal regions whose features meet defined criteria as well as to study neighbouring gene expression. However, most of these tools are tailored for a specific use in a particular context (e.g. they are species-specific, or limited to a particular data format) and they typically accept only gene lists as input. Results TRAM (Transcriptome Mapper) is a new general tool that allows the simple generation and analysis of quantitative transcriptome maps, starting from any source listing gene expression values for a given gene set (e.g. expression microarrays), implemented as a relational database. It includes a parser able to assign univocal and updated gene symbols to gene identifiers from different data sources. Moreover, TRAM is able to perform intra-sample and inter-sample data normalization, including an original variant of quantile normalization (scaled quantile), useful to normalize data from platforms with highly different numbers of investigated genes. When in 'Map' mode, the software generates a quantitative representation of the transcriptome of a sample (or of a pool of samples) and identifies if segments of defined lengths are over/under-expressed compared to the desired threshold. When in 'Cluster' mode, the software searches for a set of over/under-expressed consecutive genes. Statistical significance for all results is calculated with respect to genes localized on the same chromosome or to all genome genes. Transcriptome maps, showing differential expression between two sample groups, relative to two different biological conditions, may be easily generated. We present the results of a biological model test, based on a meta-analysis comparison between a sample pool of human CD34+ hematopoietic progenitor cells and a sample pool of megakaryocytic cells. 
Biologically relevant chromosomal segments and gene clusters with differential expression during the differentiation toward megakaryocyte were identified. Conclusions TRAM is designed to create, and statistically analyze, quantitative transcriptome maps, based on gene expression data from multiple sources. The release includes FileMaker Pro database management runtime application and it is freely available at http://apollo11.isto.unibo.it/software/, along with preconfigured implementations for mapping of human, mouse and zebrafish transcriptomes.
|
79
|
Ban J, Jug G, Mestdagh P, Schwentner R, Kauer M, Aryee DNT, Schaefer KL, Nakatani F, Scotlandi K, Reiter M, Strunk D, Speleman F, Vandesompele J, Kovar H. Hsa-mir-145 is the top EWS-FLI1-repressed microRNA involved in a positive feedback loop in Ewing's sarcoma. Oncogene 2011; 30:2173-80. [PMID: 21217773 DOI: 10.1038/onc.2010.581] [Citation(s) in RCA: 72] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/22/2022]
Abstract
EWS-FLI1 is a chromosome translocation-derived chimeric transcription factor that has a central and rate-limiting role in the pathogenesis of Ewing's sarcoma. Although the EWS-FLI1 transcriptomic signature has been extensively characterized on the mRNA level, information on its impact on non-coding RNA expression is lacking. We have performed a genome-wide analysis of microRNAs affected by RNAi-mediated silencing of EWS-FLI1 in Ewing's sarcoma cell lines, and differentially expressed between primary Ewing's sarcoma and mesenchymal progenitor cells. Here, we report on the identification of hsa-mir-145 as the top EWS-FLI1-repressed microRNA. Upon knockdown of EWS-FLI1, hsa-mir-145 expression dramatically increases in all Ewing's sarcoma cell lines tested. Vice versa, ectopic expression of the microRNA in Ewing's sarcoma cell lines strongly reduced EWS-FLI1 protein, whereas transfection of an anti-mir to hsa-mir-145 increased the EWS-FLI1 levels. Reporter gene assays revealed that this modulation of EWS-FLI1 protein was mediated by the microRNA targeting the FLI1 3'-untranslated region. Mutual regulations of EWS-FLI1 and hsa-mir-145 were mirrored by an inverse correlation between their expression levels in four of the Ewing's sarcoma cell lines tested. Consistent with the role of EWS-FLI1 in Ewing's sarcoma growth regulation, forced hsa-mir-145 expression halted Ewing's sarcoma cell line growth. These results identify feedback regulation between EWS-FLI1 and hsa-mir-145 as an important component of the EWS-FLI1-mediated Ewing's sarcomagenesis that may open a new avenue to future microRNA-mediated therapy of this devastating malignant disease.
Affiliation(s)
- J Ban
- Children's Cancer Research Institute, St Anna Kinderkrebsforschung, Vienna, Austria
|
80
|
Kawaji H, Severin J, Lizio M, Forrest ARR, van Nimwegen E, Rehli M, Schroder K, Irvine K, Suzuki H, Carninci P, Hayashizaki Y, Daub CO. Update of the FANTOM web resource: from mammalian transcriptional landscape to its dynamic regulation. Nucleic Acids Res 2010; 39:D856-60. [PMID: 21075797 PMCID: PMC3013704 DOI: 10.1093/nar/gkq1112] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/26/2023] Open
Abstract
The international Functional Annotation Of the Mammalian Genomes 4 (FANTOM4) research collaboration set out to better understand the transcriptional network that regulates macrophage differentiation and to uncover novel components of the transcriptome employing a series of high-throughput experiments. The primary and unique technique is cap analysis of gene expression (CAGE), sequencing mRNA 5′-ends with a second-generation sequencer to quantify promoter activities even in the absence of gene annotation. Additional genome-wide experiments complement the setup including short RNA sequencing, microarray gene expression profiling on large-scale perturbation experiments and ChIP–chip for epigenetic marks and transcription factors. All the experiments are performed in a differentiation time course of the THP-1 human leukemic cell line. Furthermore, we performed a large-scale mammalian two-hybrid (M2H) assay between transcription factors and monitored their expression profile across human and mouse tissues with qRT-PCR to address combinatorial effects of regulation by transcription factors. These interdependent data have been analyzed individually and in combination with each other and are published in related but distinct papers. We provide all data together with systematic annotation in an integrated view as a resource for the scientific community (http://fantom.gsc.riken.jp/4/). Additionally, we assembled a rich set of derived analysis results including published predicted and validated regulatory interactions. Here we introduce the resource and its update after the initial release.
Affiliation(s)
- Hideya Kawaji
- RIKEN Omics Science Center, RIKEN Yokohama Institute, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama City, Kanagawa 230-0045, Japan.
|
81
|
Maqbool SB, Mehrotra S, Kolpakas A, Durden C, Zhang B, Zhong H, Calvi BR. Dampened activity of E2F1-DP and Myb-MuvB transcription factors in Drosophila endocycling cells. J Cell Sci 2010; 123:4095-106. [PMID: 21045111 DOI: 10.1242/jcs.064519] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/30/2023] Open
Abstract
The endocycle is a variant cell cycle comprised of alternating gap (G) and DNA synthesis (S) phases (endoreplication) without mitosis (M), which results in DNA polyploidy and large cell size. Endocycles occur widely in nature, but much remains to be learned about the regulation of this modified cell cycle. Here, we compared gene expression profiles of mitotic cycling larval brain and disc cells with the endocycling cells of fat body and salivary gland of the Drosophila larva. The results indicated that many genes that are positively regulated by the heterodimeric E2F1-DP or Myb-MuvB complex transcription factors are expressed at lower levels in endocycling cells. Many of these target genes have functions in M phase, suggesting that dampened E2F1 and Myb activity promote endocycles. Many other E2F1 target genes that are required for DNA replication were also repressed in endocycling cells, an unexpected result given that these cells must duplicate up to thousands of genome copies during each S phase. For some E2F-regulated genes, the lower level of mRNA in endocycling cells resulted in lower protein concentration, whereas for other genes it did not, suggesting a contribution of post-transcriptional regulation. Both knockdown and overexpression of E2F1-DP and Myb-MuvB impaired endocycles, indicating that transcriptional activation and repression must be balanced. Our data suggest that dampened transcriptional activation by E2F1-DP and Myb-MuvB is important to repress mitosis and coordinate the endocycle transcriptional and protein stability oscillators.
|
82
|
Vinogradov AE. Human transcriptome nexuses: basic-eukaryotic and metazoan. Genomics 2010; 95:345-54. [PMID: 20298777 DOI: 10.1016/j.ygeno.2010.03.004] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/15/2009] [Revised: 03/01/2010] [Accepted: 03/08/2010] [Indexed: 01/10/2023]
Abstract
Using a new approach, I analysed the human transcriptome coexpression network and revealed two large-scale nexuses. Besides gene coexpression, each nexus is characterized by a combination of gene evolutionary origin, function and among-tissues expression breadth. The first nexus contains mostly genes of pre-metazoan origin, which are widely expressed and have cell-centred functions. The second nexus is enriched in genes of metazoan origin, which are expressed more narrowly and have organism-centred functions. The revealed nexuses are supported by asymmetry in distribution of transcription factor targets between them. Within the metazoan nexus, there is a subnexus that is more pronounced in the nervous tissues and is enriched in gene regulatory complexity. It mostly contains genes related to nervous system, cell communication and multicellular organism processes and development. The revealed nexuses indicate a dichotomy in the transcriptional regulation and can provide a framework for further functional genomics studies.
|
83
|
Fujita A, Patriota AG, Sato JR, Miyano S. The impact of measurement errors in the identification of regulatory networks. BMC Bioinformatics 2009; 10:412. [PMID: 20003382 PMCID: PMC2811120 DOI: 10.1186/1471-2105-10-412] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/16/2009] [Accepted: 12/13/2009] [Indexed: 11/21/2022] Open
Abstract
BACKGROUND There are several studies in the literature depicting measurement error in gene expression data and also several others about regulatory network models. However, only a small fraction describes a combination of measurement error in mathematical regulatory networks and shows how to identify these networks under different rates of noise. RESULTS This article investigates the effects of measurement error on the estimation of the parameters in regulatory networks. Simulation studies indicate that, in both time series (dependent) and non-time series (independent) data, the measurement error strongly affects the estimated parameters of the regulatory network models, biasing them as predicted by the theory. Moreover, when testing the parameters of the regulatory network models, p-values computed by ignoring the measurement error are not reliable, since the rate of false positives is not controlled under the null hypothesis. In order to overcome these problems, we present an improved version of the Ordinary Least Squares estimator in independent (regression models) and dependent (autoregressive models) data when the variables are subject to noise. Moreover, measurement error estimation procedures for microarrays are also described. Simulation results also show that both corrected methods perform better than the standard ones (i.e., ignoring measurement error). The proposed methodologies are illustrated using microarray data from lung cancer patients and mouse liver time series data. CONCLUSIONS Measurement error dangerously affects the identification of regulatory network models; thus, it must be reduced or taken into account in order to avoid erroneous conclusions. This could be one of the reasons for high biological false positive rates identified in actual regulatory network models.
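The bias described in this abstract can be illustrated with a small simulation. The sketch below is not the authors' estimator; it assumes the textbook errors-in-variables setting with a single predictor, additive noise of known variance (`noise_var`, a hypothetical parameter), and a simple method-of-moments correction. All names and simulated values are illustrative.

```python
import numpy as np

def naive_and_corrected_slope(w, y, noise_var):
    """Return (naive OLS slope, moment-corrected slope).

    w         : predictor observed with additive measurement error
    y         : response
    noise_var : assumed known variance of the measurement error in w
    """
    w = np.asarray(w, dtype=float)
    y = np.asarray(y, dtype=float)
    cov_wy = np.cov(w, y, ddof=1)[0, 1]
    var_w = np.var(w, ddof=1)
    naive = cov_wy / var_w                    # shrunk toward zero by the noise
    corrected = cov_wy / (var_w - noise_var)  # subtract the noise variance
    return naive, corrected

# Simulated data: true slope 2, true predictor variance 1, noise variance 1.
rng = np.random.default_rng(0)
n = 20000
x = rng.normal(0.0, 1.0, n)        # true (unobserved) expression level
w = x + rng.normal(0.0, 1.0, n)    # what is actually measured
y = 2.0 * x + rng.normal(0.0, 0.5, n)

naive, corrected = naive_and_corrected_slope(w, y, noise_var=1.0)
print(f"naive slope: {naive:.2f}, corrected slope: {corrected:.2f}")
```

With equal signal and noise variances, the naive slope is expected to be about half the true value, while the corrected slope should land near the true value of 2. The correction depends on a good estimate of the noise variance, which in practice has to come from replicate measurements or a separate error model.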
Affiliation(s)
- André Fujita
- Computational Science Research Program, RIKEN, 2-1 Hirosawa, Wako, Saitama, 351-0198, Japan
| | - Alexandre G Patriota
- Institute of Mathematics and Statistics, University of São Paulo, Rua do Matão, 1010 - São Paulo, 05508-090, Brazil
| | - João R Sato
- Center of Mathematics, Computation and Cognition, Universidade Federal do ABC, Rua Santa Adélia, 166 - Santo André, 09210-170, Brazil
| | - Satoru Miyano
- Computational Science Research Program, RIKEN, 2-1 Hirosawa, Wako, Saitama, 351-0198, Japan
- Human Genome Center, Institute of Medical Science, University of Tokyo, 4-6-1 Shirokanedai, Minato-ku, Tokyo, 108-8639, Japan
| |
Collapse
|
84
|
Schmeier S, MacPherson CR, Essack M, Kaur M, Schaefer U, Suzuki H, Hayashizaki Y, Bajic VB. Deciphering the transcriptional circuitry of microRNA genes expressed during human monocytic differentiation. BMC Genomics 2009; 10:595. [PMID: 20003307 PMCID: PMC2797535 DOI: 10.1186/1471-2164-10-595] [Citation(s) in RCA: 60] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/19/2008] [Accepted: 12/10/2009] [Indexed: 12/19/2022] Open
Abstract
Background Macrophages are immune cells involved in various biological processes including host defence, homeostasis, differentiation, and organogenesis. Disruption of macrophage biology has been linked to increased pathogen infection, inflammation and malignant diseases. Differential gene expression observed in monocytic differentiation is primarily regulated by interacting transcription factors (TFs). Current research suggests that microRNAs (miRNAs) degrade and repress translation of mRNA, but also may target genes involved in differentiation. We focus on gaining insight into the transcriptional circuitry regulating miRNA genes expressed during monocytic differentiation. Results We computationally analysed the transcriptional circuitry of miRNA genes during monocytic differentiation using in vitro time-course expression data for TFs and miRNAs. A set of TF→miRNA associations was derived from predicted TF binding sites in promoter regions of miRNA genes. Time-lagged expression correlation analysis was utilised to evaluate the TF→miRNA associations. Our analysis identified 12 TFs that potentially play a central role in regulating miRNAs throughout the differentiation process. Six of these 12 TFs (ATF2, E2F3, HOXA4, NFE2L1, SP3, and YY1) have not previously been described to be important for monocytic differentiation. The remaining six TFs are CEBPB, CREB1, ELK1, NFE2L2, RUNX1, and USF2. For several miRNAs (miR-21, miR-155, miR-424, and miR-17-92), we show how their inferred transcriptional regulation impacts monocytic differentiation. Conclusions The study demonstrates that miRNAs and their transcriptional regulatory control are integral molecular mechanisms during differentiation. Furthermore, it is the first study to decipher, on a large scale, how miRNAs are controlled by TFs during human monocytic differentiation. Subsequently, we have identified 12 candidate key controllers of miRNAs during this differentiation process.
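As a rough illustration of what a time-lagged expression correlation can look like, the sketch below correlates a TF profile with a miRNA profile shifted by 0, 1, 2, ... time points and reports the lag with the strongest absolute Pearson correlation. This is a generic sketch under stated assumptions, not the procedure used in the study: the function name, the lag range, and the toy expression values are all hypothetical.

```python
import numpy as np

def lagged_correlations(tf, mirna, max_lag):
    """Pearson correlation of tf[t] with mirna[t + lag] for lag = 0..max_lag."""
    tf = np.asarray(tf, dtype=float)
    mirna = np.asarray(mirna, dtype=float)
    result = {}
    for lag in range(max_lag + 1):
        a = tf[: len(tf) - lag]   # TF values that have a partner `lag` steps later
        b = mirna[lag:]           # miRNA values shifted forward by `lag`
        result[lag] = float(np.corrcoef(a, b)[0, 1])
    return result

# Toy time course: the miRNA profile repeats the TF profile two time points later.
base = np.array([0.1, 0.4, 1.2, 2.0, 1.5, 0.9, 0.3, 0.2, 0.8, 1.7, 2.4, 1.1, 0.5, 0.6])
tf_profile = base[2:]      # 12 time points
mirna_profile = base[:12]  # same signal, delayed by two time points

corrs = lagged_correlations(tf_profile, mirna_profile, max_lag=3)
best_lag = max(corrs, key=lambda lag: abs(corrs[lag]))
print(best_lag, round(corrs[best_lag], 3))
```

Taking the absolute value lets the same scan pick up both activating and repressing relationships. With short time courses, each added lag removes a data point, so large lags rest on very few observations and should be read with caution.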
Affiliation(s)
- Sebastian Schmeier
- South African National Bioinformatics Institute, University of the Western Cape, Modderdam Road, Bellville, South Africa.
|
85
|
Kawaji H, Severin J, Lizio M, Waterhouse A, Katayama S, Irvine KM, Hume DA, Forrest ARR, Suzuki H, Carninci P, Hayashizaki Y, Daub CO. The FANTOM web resource: from mammalian transcriptional landscape to its dynamic regulation. Genome Biol 2009; 10:R40. [PMID: 19374775 PMCID: PMC2688931 DOI: 10.1186/gb-2009-10-4-r40] [Citation(s) in RCA: 66] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/17/2008] [Revised: 03/09/2009] [Accepted: 04/19/2009] [Indexed: 12/02/2022] Open
Abstract
The genome-scale data collected by the FANTOM4 collaborative research project are presented as an integrated web resource. In FANTOM4, an international collaborative research project, we collected a wide range of genome-scale data, including 24 million mRNA 5'-reads (CAGE tags) and microarray expression profiles along a differentiation time course of the human THP-1 cell line and under 52 systematic siRNA perturbations. In addition, data regarding chromatin status derived from ChIP-chip to elucidate the transcriptional regulatory interactions are included. Here we present these data to the research community as an integrated web resource.
Affiliation(s)
- Hideya Kawaji
- RIKEN Omics Science Center, RIKEN Yokohama Institute, 1-7-22 Suehiro-cho Tsurumi-ku Yokohama, Kanagawa, 230-0045 Japan.
|