1
Cai C, Chen DZ, Tu HX, Chen WK, Ge LC, Fu TT, Tao Y, Ye SS, Li J, Lin Z, Wang XD, Xu LM, Chen YP. MicroRNA-29c Acting on FOS Plays a Significant Role in Nonalcoholic Steatohepatitis Through the Interleukin-17 Signaling Pathway. Front Physiol 2021; 12:597449. [PMID: 33927635 PMCID: PMC8078210 DOI: 10.3389/fphys.2021.597449] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 08/28/2020] [Accepted: 03/04/2021] [Indexed: 12/13/2022]
Abstract
Nonalcoholic fatty liver disease is the most common hepatic disease in western countries and is even more ubiquitous in Asian countries. Our study determined that TH17/Treg cells were imbalanced in animal models. Based on our interest in the mechanism underlying TH17/Treg cell imbalance in nonalcoholic fatty liver mice, we conducted a joint bioinformatics analysis to further investigate this process. Common gene sequencing analysis was based on one trial from one sequencing platform, where gene expression analysis and enrichment analysis were the only analyses performed. We compared different sequencing results from different trials performed using different sequencing platforms, and we utilized the intersection of these analytical results to perform joint analysis. We used a bioinformatics analysis method to perform enrichment analysis and map interaction network analysis and predict potential microRNA sites. Animal experiments were also designed to validate the results of the data analysis based on quantitative polymerase chain reaction (qPCR) and western blotting. Our results revealed 8 coexisting differentially expressed genes (DEGs) and 7 hinge genes. The identified DEGs may influence nonalcoholic steatohepatitis (NASH) through the interleukin-17 pathway. We found that microRNA-29c interacts with FOS and IGFBP1. Polymerase chain reaction analyses revealed both FOS and microRNA-29c expression in NASH mice, and western blot analyses indicated the same trend with regard to FOS protein levels. Based on these results, we suggest that microRNA-29c acts on FOS via the interleukin-17 signaling pathway to regulate TH17/Treg cells in NASH patients.
Affiliation(s)
- Chao Cai
  - Department of Infectious Disease, The First Affiliated Hospital of Wenzhou Medical University, Zhejiang Provincial Key Laboratory for Accurate Diagnosis and Treatment of Chronic Liver Disease, Hepatology Institute of Wenzhou Medical University, Wenzhou, China
- Da-Zhi Chen
  - Department of Gastroenterology, The First Hospital of Peking University, Beijing, China
- Han-Xiao Tu
  - Department of Infectious Disease, The First Affiliated Hospital of Wenzhou Medical University, Zhejiang Provincial Key Laboratory for Accurate Diagnosis and Treatment of Chronic Liver Disease, Hepatology Institute of Wenzhou Medical University, Wenzhou, China
- Wen-Kai Chen
  - Department of Infectious Disease, The First Affiliated Hospital of Wenzhou Medical University, Zhejiang Provincial Key Laboratory for Accurate Diagnosis and Treatment of Chronic Liver Disease, Hepatology Institute of Wenzhou Medical University, Wenzhou, China
- Li-Chao Ge
  - Department of Infectious Disease, The First Affiliated Hospital of Wenzhou Medical University, Zhejiang Provincial Key Laboratory for Accurate Diagnosis and Treatment of Chronic Liver Disease, Hepatology Institute of Wenzhou Medical University, Wenzhou, China
- Tian-Tian Fu
  - School of Wenzhou Medical University, Wenzhou, China
- Ying Tao
  - Department of Infectious Disease, The First Affiliated Hospital of Wenzhou Medical University, Zhejiang Provincial Key Laboratory for Accurate Diagnosis and Treatment of Chronic Liver Disease, Hepatology Institute of Wenzhou Medical University, Wenzhou, China
- Sha-Sha Ye
  - Department of Infectious Disease, The First Affiliated Hospital of Wenzhou Medical University, Zhejiang Provincial Key Laboratory for Accurate Diagnosis and Treatment of Chronic Liver Disease, Hepatology Institute of Wenzhou Medical University, Wenzhou, China
- Ji Li
  - Department of Infectious Disease, The First Affiliated Hospital of Wenzhou Medical University, Zhejiang Provincial Key Laboratory for Accurate Diagnosis and Treatment of Chronic Liver Disease, Hepatology Institute of Wenzhou Medical University, Wenzhou, China
- Zhuo Lin
  - Department of Infectious Disease, The First Affiliated Hospital of Wenzhou Medical University, Zhejiang Provincial Key Laboratory for Accurate Diagnosis and Treatment of Chronic Liver Disease, Hepatology Institute of Wenzhou Medical University, Wenzhou, China
- Xiao-Dong Wang
  - Department of Infectious Disease, The First Affiliated Hospital of Wenzhou Medical University, Zhejiang Provincial Key Laboratory for Accurate Diagnosis and Treatment of Chronic Liver Disease, Hepatology Institute of Wenzhou Medical University, Wenzhou, China
- Lan-Man Xu
  - Department of Infectious Disease, The First Affiliated Hospital of Wenzhou Medical University, Zhejiang Provincial Key Laboratory for Accurate Diagnosis and Treatment of Chronic Liver Disease, Hepatology Institute of Wenzhou Medical University, Wenzhou, China; Department of Infectious Diseases and Liver Diseases, Ningbo Medical Centre Lihuili Hospital, Affiliated Lihuili Hospital of Ningbo University, Ningbo Institute of Innovation for Combined Medicine and Engineering, Ningbo, China
- Yong-Ping Chen
  - Department of Infectious Disease, The First Affiliated Hospital of Wenzhou Medical University, Zhejiang Provincial Key Laboratory for Accurate Diagnosis and Treatment of Chronic Liver Disease, Hepatology Institute of Wenzhou Medical University, Wenzhou, China
2
Technique of Gene Expression Profiles Extraction Based on the Complex Use of Clustering and Classification Methods. Diagnostics (Basel) 2020; 10:diagnostics10080584. [PMID: 32806785 PMCID: PMC7460566 DOI: 10.3390/diagnostics10080584] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 07/05/2020] [Revised: 08/10/2020] [Accepted: 08/11/2020] [Indexed: 11/16/2022]
Abstract
In this paper, we present the results of research on extracting informative gene expression profiles from a high-dimensional array of gene expressions, taking into account the state of the patients' health, using a clustering method, ML-based binary classifiers, and a fuzzy inference system. The proposed stepwise procedure allows us to extract the most informative genes, taking into account the subtypes of the disease and/or the state of the patient's health, for further reconstruction of gene regulatory networks based on the allocated genes and subsequent simulation of the reconstructed models. As experimental data, we used publicly available gene expression data obtained from DNA microarray experiments, which contained two types of gene expression profiles: patients with lung cancer tumors and healthy patients. The stepwise data processing procedure comprises the following steps. First, we reduce the number of genes by removing genes that are non-informative in terms of statistical criteria and Shannon entropy; then, we perform stepwise hierarchical clustering of gene expression profiles at hierarchical levels from 1 to 10 using the SOTA (Self-Organizing Tree Algorithm) clustering algorithm with a correlation distance metric. The quality of the obtained clustering was evaluated using a complex clustering quality criterion that considers both the distribution of the gene expression profiles relative to the centers of the clusters to which they are allocated and the distribution of the cluster centers. The result of this stage was the selection, at each hierarchical level, of the optimal cluster corresponding to the minimum value of the quality criterion. At the next step, we implemented a classification procedure for the examined objects using four well-known binary classifiers: logistic regression, support-vector machine, decision trees, and random forest classifier.
The effectiveness of the appropriate technique was evaluated using ROC (Receiver Operating Characteristic) analysis with criteria that include, as components, errors of both the first and the second kind. The final decision on extracting the most informative subset of gene expression profiles was made using the fuzzy inference system, whose inputs are the results of the individual classifiers and whose output is the final decision on the state of the patient's health. In our view, implementing the proposed stepwise procedure for extracting informative gene expression profiles creates the conditions for increasing the effectiveness of the subsequent gene regulatory network reconstruction and simulation of the reconstructed models, considering the subtypes of the disease and/or the state of the patient's health.
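The entropy-based filtering step mentioned in this abstract can be illustrated with a small sketch. The abstract does not say how the entropy is estimated or which threshold is applied, so the equal-width binning, the `n_bins` parameter, and the function name `shannon_entropy` are all assumptions made here for illustration only.

```python
import math
from collections import Counter


def shannon_entropy(profile, n_bins=4):
    """Shannon entropy (in bits) of one expression profile.

    Assumption: values are discretised into n_bins equal-width bins;
    the cited paper may estimate entropy differently.
    """
    lo, hi = min(profile), max(profile)
    if hi == lo:
        return 0.0  # a constant profile has zero entropy
    width = (hi - lo) / n_bins
    # Map each value to a bin index; the maximum value goes into the last bin.
    bins = [min(int((x - lo) / width), n_bins - 1) for x in profile]
    n = len(profile)
    return -sum((c / n) * math.log2(c / n) for c in Counter(bins).values())


# Hypothetical profiles, one list of expression values per gene.
profiles = {
    "gene_a": [5.0, 5.0, 5.0, 5.0],
    "gene_b": [0.0, 1.0, 2.0, 3.0],
}
entropies = {name: shannon_entropy(values) for name, values in profiles.items()}
```

Genes could then be ranked or thresholded on `entropies`; which side of the threshold counts as non-informative is a modelling decision that the abstract leaves unspecified.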
3
Galichon P, Xu-Dubois YC, Buob D, Tinel C, Anglicheau D, Benbouzid S, Dahan K, Ouali N, Hertig A, Brocheriou I, Rondeau E. Urinary transcriptomics reveals patterns associated with subclinical injury of the renal allograft. Biomark Med 2018; 12:427-438. [PMID: 29697267 DOI: 10.2217/bmm-2017-0330] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 11/21/2022]
Abstract
AIM Subclinical pathological features in renal allograft biopsies predict poor outcomes, and noninvasive biomarkers are wanted. RNA quantification in urine predicts overt rejection. We hypothesized that a whole transcriptome analysis would be informative, even for discrete injury. PATIENTS & METHODS We performed an mRNA microarray with an optimized normalization method on 26 urinary cell pellets to study renal partial epithelial to mesenchymal transition (pEMT) in stable kidney allografts. RESULTS & CONCLUSION Unbiased pathway analysis revealed immune response as the main underlying biological process. In a subgroup of pristine biopsies, isolated pEMT was associated with reduced metabolic functions. Thus, pEMT translates into specific urinary mRNA patterns, in other words, increased inflammation and decreased metabolic functions. Deposited in Gene Expression Omnibus (GSE89652).
Affiliation(s)
- Pierre Galichon
  - Sorbonne Universités, UPMC Univ Paris 06, UMR_S1155, Paris, France; Institut National de la Santé et de la Recherche Médicale, UMR_S1155, Paris, France; Urgences Néphrologiques et Transplantation Rénale, Hôpital Tenon, APHP, Paris, France
- Yi-Chun Xu-Dubois
  - Institut National de la Santé et de la Recherche Médicale, UMR_S1155, Paris, France; Service de Santé Publique, Hôpital Tenon, APHP, Paris, France
- David Buob
  - Sorbonne Universités, UPMC Univ Paris 06, UMR_S1155, Paris, France; Institut National de la Santé et de la Recherche Médicale, UMR_S1155, Paris, France; Service d'Anatomie Pathologique, Hôpital Tenon, APHP, Paris, France
- Claire Tinel
  - Service de Néphrologie et Transplantation Adulte, Hôpital Necker, Assistance Publique-Hôpitaux de Paris, Paris, France
- Dany Anglicheau
  - Service de Néphrologie et Transplantation Adulte, Hôpital Necker, Assistance Publique-Hôpitaux de Paris, Paris, France; Université Paris Descartes, Sorbonne Paris Cité, Paris, France; RTRS « Centaure », Labex « Transplantex », Paris, France
- Karine Dahan
  - Néphrologie et Dialyses, Hôpital Tenon, APHP, Paris, France
- Nacera Ouali
  - Urgences Néphrologiques et Transplantation Rénale, Hôpital Tenon, APHP, Paris, France
- Alexandre Hertig
  - Sorbonne Universités, UPMC Univ Paris 06, UMR_S1155, Paris, France; Institut National de la Santé et de la Recherche Médicale, UMR_S1155, Paris, France; Urgences Néphrologiques et Transplantation Rénale, Hôpital Tenon, APHP, Paris, France
- Isabelle Brocheriou
  - Sorbonne Universités, UPMC Univ Paris 06, UMR_S1155, Paris, France; Institut National de la Santé et de la Recherche Médicale, UMR_S1155, Paris, France; Service d'Anatomie Pathologique, Hôpital de la Pitié-Salpêtrière, APHP, Paris, France
- Eric Rondeau
  - Sorbonne Universités, UPMC Univ Paris 06, UMR_S1155, Paris, France; Institut National de la Santé et de la Recherche Médicale, UMR_S1155, Paris, France; Urgences Néphrologiques et Transplantation Rénale, Hôpital Tenon, APHP, Paris, France
4
Binder H, Kurz T, Teschner S, Kreutz C, Geyer M, Donauer J, Kraemer-Guth A, Timmer J, Schumacher M, Walz G. Dealing with prognostic signature instability: a strategy illustrated for cardiovascular events in patients with end-stage renal disease. BMC Med Genomics 2016; 9:43. [PMID: 27439789 PMCID: PMC4955222 DOI: 10.1186/s12920-016-0210-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2015] [Accepted: 07/14/2016] [Indexed: 11/13/2022]
Abstract
Background Identification of prognostic gene expression markers from clinical cohorts might help to better understand disease etiology. A set of potentially important markers can be automatically selected when linking gene expression covariates to a clinical endpoint by multivariable regression models and regularized parameter estimation. However, this is hampered by instability due to selection from many measurements. Stability can be assessed by resampling techniques, which might guide modeling decisions, such as choice of the model class or the specific endpoint definition. Methods We specifically propose a strategy for judging the impact of different endpoint definitions, endpoint updates, different approaches for marker selection, and exclusion of outliers. This strategy is illustrated for a study with end-stage renal disease patients, who experience a yearly mortality of more than 20 %, with almost 50 % sudden cardiac death or myocardial infarction. The underlying etiology is poorly understood, and we specifically point out how our strategy can help to identify novel prognostic markers and targets for therapeutic interventions. Results For markers such as the potentially prognostic platelet glycoprotein IIb, the endpoint definition, in combination with the signature building approach is seen to have the largest impact. Removal of outliers, as identified by the proposed strategy, is also seen to considerably improve stability. Conclusions As the proposed strategy allowed us to precisely quantify the impact of modeling choices on the stability of marker identification, we suggest routine use also in other applications to prevent analysis-specific results, which are unstable, i.e. not reproducible.
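The idea of assessing marker-selection stability by resampling can be sketched as follows. This is only an illustration under stated assumptions: the abstract refers to regularized multivariable regression on clinical endpoints, whereas the selector below (`top_k_by_abs_covariance`) is a deliberately simple stand-in, and all names and parameters are hypothetical.

```python
import random


def top_k_by_abs_covariance(X, y, k=1):
    """Stand-in selector: indices of the k markers with largest |covariance| with y."""
    n = len(y)
    mean_y = sum(y) / n
    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        mean_x = sum(column) / n
        scores.append(abs(sum((a - mean_x) * (b - mean_y) for a, b in zip(column, y))))
    return sorted(range(len(scores)), key=lambda j: -scores[j])[:k]


def selection_frequencies(X, y, select, n_resamples=100, seed=0):
    """Fraction of bootstrap resamples in which each marker is selected."""
    rng = random.Random(seed)
    n, p = len(y), len(X[0])
    counts = [0] * p
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        X_b = [X[i] for i in idx]
        y_b = [y[i] for i in idx]
        for j in select(X_b, y_b):
            counts[j] += 1
    return [c / n_resamples for c in counts]


# Toy data: marker 0 tracks the endpoint, marker 1 is a weak alternating signal.
X = [[float(i), 0.1 * (i % 2)] for i in range(10)]
y = [float(i) for i in range(10)]
freqs = selection_frequencies(X, y, top_k_by_abs_covariance)
```

Markers with inclusion frequencies near 1 are selected consistently across resamples; comparing such frequencies across endpoint definitions or selection approaches is one way to quantify the impact of modeling choices in the spirit the abstract describes.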
Affiliation(s)
- Harald Binder
  - Institute of Medical Biostatistics, Epidemiology and Informatics, University Medical Center Mainz, Obere Zahlbacher Str. 69, Mainz, 55131, Germany.
- Thorsten Kurz
  - Core Facility Genomics, Centre for Systems Biology, University Freiburg, Freiburg, Germany
- Sven Teschner
  - Renal Division, Department of Medicine, University Hospital Freiburg, Freiburg, Germany
- Clemens Kreutz
  - Institute of Physics, University Freiburg, Freiburg, Germany
- Marcel Geyer
  - Renal Division, Department of Medicine, University Hospital Freiburg, Freiburg, Germany
- Jens Timmer
  - Institute of Physics, University Freiburg, Freiburg, Germany; BIOSS Center for Biological Signalling Studies, University Freiburg, Germany, Freiburg, Germany
- Martin Schumacher
  - Institute of Medical Biometry and Medical Informatics, University Medical Center Freiburg, Freiburg, Germany
- Gerd Walz
  - Renal Division, Department of Medicine, University Hospital Freiburg, Freiburg, Germany; BIOSS Center for Biological Signalling Studies, University Freiburg, Germany, Freiburg, Germany
5
Robinson JF, Piersma AH. Toxicogenomic approaches in developmental toxicology testing. Methods Mol Biol 2013; 947:451-73. [PMID: 23138921 DOI: 10.1007/978-1-62703-131-8_31] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Indexed: 12/17/2022]
Abstract
The emergence of toxicogenomic applications provides new tools to characterize, classify, and potentially predict teratogens. However, due to the vast number of experimental and statistical procedural steps, toxicogenomic studies are challenging. Here, we guide researchers through the basic framework of conducting toxicogenomic investigations in the field of developmental toxicology, providing examples of biological and technical factors that may influence response and interpretation. Furthermore, we review current, diverse applications of toxicogenomic-based approaches in teratology testing, including exposure-response characterization (dose and duration), chemical classification studies, and cross-model comparisons study designs. This review is intended to guide scientists through the challenging and complex structure of conducting toxicogenomic analyses, while considering the many applications of using toxicogenomics in study designs and the future of these types of "omics" approaches in developmental toxicology.
Affiliation(s)
- Joshua F Robinson
  - Laboratory for Health Protection Research-National Institute for Public Health and the Environment (RIVM), Bilthoven, The Netherlands.
6
Robinson JF, Pennings JLA, Piersma AH. A review of toxicogenomic approaches in developmental toxicology. Methods Mol Biol 2012; 889:347-371. [PMID: 22669676 DOI: 10.1007/978-1-61779-867-2_22] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Indexed: 06/01/2023]
Abstract
Over the past decade, the use of gene expression profiling (i.e., toxicogenomics or transcriptomics) has been established as the vanguard "omics" technology to investigate exposure-induced molecular changes that underlie the development of disease. As this technology quickly advances, researchers are striving to keep pace in grasping the complexity of toxicogenomic response while at the same time determine its applicability for the field of developmental toxicology. Initial studies suggest toxicogenomics to be a promising tool for multiple types of study designs, including exposure-response investigations (dose and duration), chemical classification, and model comparisons. In this review, we examine the use of toxicogenomics in developmental toxicology, discussing biological and technical factors that influence response and interpretation. Additionally, we provide a framework to guide toxicogenomic investigations in the field of developmental toxicology.
Affiliation(s)
- Joshua F Robinson
  - National Institute for Public Health and the Environment-RIVM, Bilthoven, The Netherlands
7
Saei AA, Omidi Y. A glance at DNA microarray technology and applications. BIOIMPACTS : BI 2011; 1:75-86. [PMID: 23678411 DOI: 10.5681/bi.2011.011] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 06/27/2011] [Revised: 07/13/2011] [Accepted: 07/20/2011] [Indexed: 01/06/2023]
Abstract
INTRODUCTION Because of the huge impact of "OMICS" technologies in the life sciences, many researchers aim to implement such high-throughput approaches to address cellular and/or molecular functions in response to any influential intervention at the genomics, proteomics, or metabolomics level. However, in many cases, the use of such technologies encounters cybernetic difficulties in extracting knowledge from large volumes of data using the related software. In fact, there is little guidance on data mining for novices. The main goal of this article is to provide a brief review of the different steps of microarray data handling and mining for novices and, finally, to introduce different PC and/or web-based software tools that can be used in preprocessing and/or data mining of microarray data. METHODS To pursue this aim, recently published papers and microarray software were reviewed. RESULTS It was found that defining the true place of the genes in cell networks is the main phase in our understanding of programming and functioning of living cells. This can be obtained with global/selected gene expression profiling. CONCLUSION Studying the regulation patterns of genes in groups using clustering and classification methods helps us understand different pathways in the cell, their functions and regulation, and the way one component in the system affects another. These networks can act as starting points for data mining and hypothesis generation, helping us reverse engineer.
Affiliation(s)
- Amir Ata Saei
  - Research Center for Pharmaceutical Nanotechnology, Faculty of Pharmacy, Tabriz University of Medical Sciences, Tabriz, Iran
8
Takahashi M, Negishi T, Imamura M, Sawano E, Kuroda Y, Yoshikawa Y, Tashiro T. Alterations in gene expression of glutamate receptors and exocytosis-related factors by a hydroxylated-polychlorinated biphenyl in the developing rat brain. Toxicology 2009; 257:17-24. [DOI: 10.1016/j.tox.2008.12.003] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Received: 10/23/2008] [Revised: 12/01/2008] [Accepted: 12/02/2008] [Indexed: 01/06/2023]
9
Takahashi M, Negishi T, Tashiro T. Identification of genes mediating thyroid hormone action in the developing mouse cerebellum. J Neurochem 2007; 104:640-52. [PMID: 18005342 DOI: 10.1111/j.1471-4159.2007.05049.x] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 11/29/2022]
Abstract
Despite the indispensable role thyroid hormone (TH) plays in brain development, only a small number of genes have been identified to be directly regulated by TH and its precise mechanism of action remains largely unknown, partly because most of the previous studies have been carried out at postnatal day 15 or later. In the present study, we screened for TH-responsive genes in the developing mouse cerebellum at postnatal day 4 when morphological alterations because of TH status are not apparent. Among the new candidate genes selected by comparing gene expression profiles of experimentally hypothyroid, hypothyroid with postnatal thyroxine replacement, and control animals using oligoDNA microarrays, six genes were confirmed by real-time PCR to be positively (orc1l, galr3, sort1, nlgn3, cdk5r2, and zfp367) regulated by TH. Among these, sort1, cdk5r2, and zfp367 were up-regulated already at 1 h after a single injection of thyroxine to the hypothyroid or control animal, suggesting them to be possible primary targets of the hormone. Cell proliferation and apoptosis examined by BrdU incorporation and terminal deoxynucleotide transferase-mediated dUTP nick-end labeling assay revealed that hypothyroidism by itself did not enhance apoptosis at this stage, but rather increased cell survival, possibly through regulation of these newly identified genes.
Affiliation(s)
- Masaki Takahashi
  - Laboratory of Molecular Neurobiology, Department of Chemistry and Biological Science, School of Science and Technology, Aoyama Gakuin University, Sagamihara, Kanagawa, Japan
10
Wei H, Cai Y, Chu J, Li C, Li L. Temporal gene expression profile in hippocampus of mice treated with D-galactose. Cell Mol Neurobiol 2007; 28:781-94. [PMID: 17710534 DOI: 10.1007/s10571-007-9177-6] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Received: 02/06/2007] [Accepted: 07/28/2007] [Indexed: 10/22/2022]
Abstract
(1) Rodent chronically treated with D-galactose (D-gal) is increasingly used in pharmacological studies on aging; however, its mechanism remains unclear. The present study investigated the alterations of gene expression in the hippocampus of D-gal-treated mice. (2) C57 mice were subcutaneously injected with D-gal for 2, 4, and 8 weeks or vehicle, and then were subjected to behavioral tests. Gene expression profiles in hippocampus of each group were finally examined with cDNA microarray. (3) Both 4- and 8-week D-gal treatment led to a decrease of discrimination index in object recognition test, and 8-week D-gal-treated mice showed significant spatial learning & memory impairment in Morris water maze test. In comparison with the vehicle control group, the 2-, 4-, and 8-week D-gal treatment repressed the expression of 10, 13, and 30 genes by 2-fold or more, respectively. The early pattern was mainly characterized by down regulation of genes involved in ion transport. The delayed pattern included genes involved in protein biosynthesis, transport and signal transduction, which were highly related to synaptic functions. (4) These findings will contribute to the understanding of the mechanism of learning and memory impairment in mice treated with D-galactose.
Affiliation(s)
- Haifeng Wei
  - Department of Pharmacology, Key Laboratory for Neurodegenerative Disease of Ministry of Education, Xuan-wu Hospital of Capital Medical University, 45 Chang-chun Street, Xuan-wu District, Beijing, China
11
Tsai CA, Hsueh HM, Chen JJ. A Generalized Additive Model For Microarray Gene Expression Data Analysis. J Biopharm Stat 2007; 14:553-73. [PMID: 15468752 DOI: 10.1081/bip-200025648] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/03/2022]
Abstract
Microarray technology allows the measurement of expression levels of a large number of genes simultaneously. There are inherent biases in microarray data generated from an experiment. Various statistical methods have been proposed for data normalization and data analysis. This paper proposes a generalized additive model for the analysis of gene expression data. This model consists of two sub-models: a non-linear model and a linear model. We propose a two-step normalization algorithm to fit the two sub-models sequentially. The first step involves a non-parametric regression using lowess fits to adjust for non-linear systematic biases. The second step uses a linear ANOVA model to estimate the remaining effects including the interaction effect of genes and treatments, the effect of interest in a study. The proposed model is a generalization of the ANOVA model for microarray data analysis. We show correspondences between the lowess fit and the ANOVA model methods. The normalization procedure does not assume the majority of genes do not change their expression levels, and neither does it assume two channel intensities from the same spot are independent. The procedure can be applied to either one channel or two channel data from the experiments with multiple treatments or multiple nuisance factors. Two toxicogenomic experiment data sets and a simulated data set are used to contrast the proposed method with the commonly known lowess fit and ANOVA methods.
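The first step of the two-step procedure described here, a lowess adjustment for non-linear systematic bias, can be sketched roughly as below. This is a simplified illustration, not the authors' algorithm: it uses a single non-robust local linear pass with tricube weights, and the names `lowess_fit`, `remove_intensity_bias`, `frac`, and the use of average log intensity versus log ratio as inputs are assumptions introduced here. The second step (a linear ANOVA fit to the residuals) is not shown.

```python
import math


def lowess_fit(x, y, frac=0.5):
    """Locally weighted linear fit with tricube weights, evaluated at each x."""
    n = len(x)
    k = max(2, math.ceil(frac * n))  # number of points in each local window
    fitted = []
    for x0 in x:
        # Bandwidth: distance to the k-th nearest point (including x0 itself).
        h = sorted(abs(xi - x0) for xi in x)[k - 1] or 1e-12
        w = [(1.0 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


def remove_intensity_bias(avg_log_intensity, log_ratio, frac=0.5):
    """Step one only: subtract the smooth intensity-dependent trend."""
    trend = lowess_fit(avg_log_intensity, log_ratio, frac)
    return [m - t for m, t in zip(log_ratio, trend)]


# Toy example with a purely linear intensity-dependent bias.
a = [float(i) for i in range(1, 9)]
m = [2.0 * ai + 1.0 for ai in a]
residuals = remove_intensity_bias(a, m)
```

In this toy case the bias is exactly linear, so the residuals are numerically zero; with real data, the residuals would carry the gene and treatment effects that the subsequent linear model is meant to estimate.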
Affiliation(s)
- Chen-An Tsai
  - Division of Biometry and Risk Assessment, National Center for Toxicological Research, Food and Drug Administration, Jefferson, Arkansas 72079, USA
12
Ju Z, Wells MC, Walter RB. DNA microarray technology in toxicogenomics of aquatic models: methods and applications. Comp Biochem Physiol C Toxicol Pharmacol 2007; 145:5-14. [PMID: 16828578 DOI: 10.1016/j.cbpc.2006.04.017] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.2] [Received: 01/17/2005] [Revised: 04/10/2006] [Accepted: 04/21/2006] [Indexed: 10/24/2022]
Abstract
Toxicogenomics represents the merging of toxicology with genomics and bioinformatics to investigate biological functions of genome in response to environmental contaminants. Aquatic species have traditionally been used as models in toxicology to characterize the actions of environmental stresses. Recent completion of the DNA sequencing for several fish species has spurred the development of DNA microarrays allowing investigators access to toxicogenomic approaches. However, since microarray technology is thus far limited to only a few aquatic species and derivation of biological meaning from microarray data is highly dependent on statistical arguments, the full potential of microarray in aquatic species research has yet to be realized. Herein we review some of the issues related to construction, probe design, statistical and bioinformatical data analyses, and current applications of DNA microarrays. As a model a recently developed medaka (Oryzias latipes) oligonucleotide microarray was described to highlight some of the issues related to array technology and its application in aquatic species exposed to hypoxia. Although there are known non-biological variations present in microarray data, it remains unquestionable that array technology will have a great impact on aquatic toxicology. Microarray applications in aquatic toxicogenomics will range from the discovery of diagnostic biomarkers, to establishment of stress-specific signatures and molecular pathways hallmarking the adaptation to new environmental conditions.
Affiliation(s)
- Zhenlin Ju
  - Molecular Biosciences Research Group, Department of Chemistry and Biochemistry, 419 Centennial Hall, Texas State University, San Marcos, TX 78666, USA
13
Whiteford CC, Bilke S, Greer BT, Chen Q, Braunschweig TA, Cenacchi N, Wei JS, Smith MA, Houghton P, Morton C, Reynolds CP, Lock R, Gorlick R, Khanna C, Thiele CJ, Takikita M, Catchpoole D, Hewitt SM, Khan J. Credentialing preclinical pediatric xenograft models using gene expression and tissue microarray analysis. Cancer Res 2007; 67:32-40. [PMID: 17210681 DOI: 10.1158/0008-5472.can-06-0610] [Citation(s) in RCA: 86] [Impact Index Per Article: 5.1] [Indexed: 12/29/2022]
Abstract
Human tumor xenografts have been used extensively for rapid screening of the efficacy of anticancer drugs for the past 35 years. The selection of appropriate xenograft models for drug testing has been largely empirical and has not incorporated a similarity to the tumor type of origin at the molecular level. This study is the first comprehensive analysis of the transcriptome of a large set of pediatric xenografts, which are currently used for preclinical drug testing. Suitable models representing the tumor type of origin were identified. It was found that the characteristic expression patterns of the primary tumors were maintained in the corresponding xenografts for the majority of samples. Because a prerequisite for developing rationally designed drugs is that the target is expressed at the protein level, we developed tissue arrays from these xenografts and corroborated that high mRNA levels yielded high protein levels for two tested genes. The web database and availability of tissue arrays will allow for the rapid confirmation of the expression of potential targets at both the mRNA and the protein level for molecularly targeted agents. The database will facilitate the identification of tumor markers predictive of response to tested agents as well as the discovery of new molecular targets.
Affiliation(s)
- Craig C Whiteford
- Oncogenomics Section, Comparative Oncology Program, and Cell and Molecular Biology Section, Pediatric Oncology Branch
|
14
|
Chen J, Sarkar SK. A Bayesian determination of threshold for identifying differentially expressed genes in microarray experiments. Stat Med 2007; 25:3174-89. [PMID: 16345048 DOI: 10.1002/sim.2422] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 11/11/2022]
Abstract
The original definitions of false discovery rate (FDR) and false non-discovery rate (FNR) can be understood as the frequentist risks of false rejections and false non-rejections, respectively, conditional on the unknown parameter, while the Bayesian posterior FDR and posterior FNR are conditioned on the data. From a Bayesian point of view, it seems natural to take into account the uncertainties in both the parameter and the data. In this spirit, we propose averaging out the frequentist risks of false rejections and false non-rejections with respect to some prior distribution of the parameters to obtain the average FDR (AFDR) and average FNR (AFNR), respectively. A linear combination of the AFDR and AFNR, called the average Bayes error rate (ABER), is considered as an overall risk. Some useful formulas for the AFDR, AFNR and ABER are developed for normal samples with hierarchical mixture priors. The idea of finding threshold values by minimizing the ABER or controlling the AFDR is illustrated using a gene expression data set. Simulation studies show that the proposed approaches are more powerful and robust than the widely used FDR method.
Affiliation(s)
- Jie Chen
- Merck Research Laboratories, P.O. Box 4, BL3-2, West Point, PA 19486, USA.
|
15
|
Lee HS, Park MH, Yang SJ, Park KC, Kim NS, Kim YS, Kim DI, Yoo HS, Choi EJ, Yeom YI. Novel candidate targets of Wnt/beta-catenin signaling in hepatoma cells. Life Sci 2006; 80:690-8. [PMID: 17157329 DOI: 10.1016/j.lfs.2006.10.024] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.3] [Received: 05/08/2006] [Revised: 09/21/2006] [Accepted: 10/26/2006] [Indexed: 02/06/2023]
Abstract
The activity of beta-catenin/TCF, the key component of the Wnt signaling pathway, is frequently deregulated in hepatocellular carcinoma (HCC), resulting in the activation of genes whose dysregulation has significant consequences for tumor development. Therefore, identifying the target genes of Wnt signaling is important for understanding beta-catenin-mediated carcinogenesis. We analyzed the transcriptome profile of human hepatoma cell lines using cDNA microarrays representing 15,127 unique, liver-enriched gene loci to identify the target genes of beta-catenin-mediated transcription (p<0.005). This analysis yielded 130 potential Wnt-associated classifier genes, 33 of which contain consensus TCF-binding sites in presumptive transcriptional regulatory sequences. These genes were then tested for Wnt-dependent expression in experimental models of Wnt activation. Genes such as RPL29, NEDD4L, FUT8, LYZ, STMN2, STARD7 and KIAA0998 were shown to be up-regulated upon Wnt/beta-catenin activation. Gene ontology analysis of the 33 candidate genes indicated the presence of functional categories relevant to the Wnt pathway, such as cell growth, proliferation, adhesion and signal transduction. In conclusion, we identified a number of candidate Wnt/beta-catenin target genes that can be useful for studying the role of altered Wnt signaling in liver cancer development, and showed that some of them might be direct targets of Wnt signaling in hepatoma cells.
Affiliation(s)
- Heun-Sik Lee
- Functional Genomics Research Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon 305-600, South Korea
|
16
|
Yauk CL, Williams A, Boucher S, Berndt LM, Zhou G, Zheng JL, Rowan-Carroll A, Dong H, Lambert IB, Douglas GR, Parfett CL. Novel design and controls for focused DNA microarrays: applications in quality assurance/control and normalization for the Health Canada ToxArray. BMC Genomics 2006; 7:266. [PMID: 17052352 PMCID: PMC1635050 DOI: 10.1186/1471-2164-7-266] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Received: 12/12/2005] [Accepted: 10/19/2006] [Indexed: 12/27/2022] Open
Abstract
BACKGROUND: Microarray normalizations typically apply methods that assume absence of global transcript shifts, or absence of changes in internal control features such as housekeeping genes. These normalization approaches are not appropriate for focused arrays with small sets of genes where a large portion may be expected to change. Furthermore, many microarrays lack control features that can be used for quality assurance (QA). Here, we describe a novel external control (EC) series integrated with a design feature that addresses the above issues. RESULTS: An EC dilution series that involves spike-in of a single concentration of the A. thaliana chlorophyll synthase gene to hybridize against spotted dilutions (0.000015 to 100 microM) of a single complementary oligonucleotide representing the gene was developed. The EC series is printed in duplicate within each subgrid of the microarray and covers the full range of signal intensities from background to saturation. The design and placement of the series allows for QA examination of frequently encountered problems in hybridization (e.g., uneven hybridizations) and printing (e.g., cross-spot contamination). Additionally, we demonstrate that the series can be integrated with a LOWESS normalization to improve the detection of differential gene expression (improved sensitivity and predictivity) over LOWESS normalization on its own. CONCLUSION: The quality of microarray experiments and the normalization methods used affect the ability to measure accurate changes in gene expression. Novel methods are required for normalization of small focused microarrays, and for incorporating measures of performance and quality. We demonstrate that dilution of oligonucleotides on the microarray itself provides an innovative approach allowing the full dynamic range of the scanner to be covered with a single gene spike-in. The dilution series can be used in a composite normalization to improve detection of differential gene expression and to provide quality control measures.
Affiliation(s)
- Carole L Yauk
- Mutagenesis Section, Environmental and Occupational Toxicology Division, Safe Environments Program, Health Canada, Ottawa, Ontario, K1A 0L2, Canada
- Andrew Williams
- Biostatistics and Epidemiology Division, Safe Environments Program, Health Canada, Ottawa, ON, K1A 0K9, Canada
- Sherri Boucher
- Mutagenesis Section, Environmental and Occupational Toxicology Division, Safe Environments Program, Health Canada, Ottawa, Ontario, K1A 0L2, Canada
- Lynn M Berndt
- Mutagenesis Section, Environmental and Occupational Toxicology Division, Safe Environments Program, Health Canada, Ottawa, Ontario, K1A 0L2, Canada
- Gu Zhou
- Mutagenesis Section, Environmental and Occupational Toxicology Division, Safe Environments Program, Health Canada, Ottawa, Ontario, K1A 0L2, Canada
- Jenny L Zheng
- Mutagenesis Section, Environmental and Occupational Toxicology Division, Safe Environments Program, Health Canada, Ottawa, Ontario, K1A 0L2, Canada
- Andrea Rowan-Carroll
- Mutagenesis Section, Environmental and Occupational Toxicology Division, Safe Environments Program, Health Canada, Ottawa, Ontario, K1A 0L2, Canada
- Hongyan Dong
- Mutagenesis Section, Environmental and Occupational Toxicology Division, Safe Environments Program, Health Canada, Ottawa, Ontario, K1A 0L2, Canada
- Iain B Lambert
- Department of Biology, Carleton University, 1125 Colonel By Drive, Ottawa, ON, Canada, K1S 5B6
- George R Douglas
- Mutagenesis Section, Environmental and Occupational Toxicology Division, Safe Environments Program, Health Canada, Ottawa, Ontario, K1A 0L2, Canada
- Craig L Parfett
- Mutagenesis Section, Environmental and Occupational Toxicology Division, Safe Environments Program, Health Canada, Ottawa, Ontario, K1A 0L2, Canada
|
17
|
Tsujimura K, Asamoto M, Suzuki S, Hokaiwado N, Ogawa K, Shirai T. Prediction of carcinogenic potential by a toxicogenomic approach using rat hepatoma cells. Cancer Sci 2006; 97:1002-10. [PMID: 16918996 PMCID: PMC11159364 DOI: 10.1111/j.1349-7006.2006.00280.x] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.8] [Indexed: 12/31/2022] Open
Abstract
The long-term rodent bioassay is the standard method to predict the carcinogenic hazard of chemicals for humans. However, this assay is costly, and the results take at least two years to produce. In the present study, we conducted gene expression profiling of cultured cells exposed to carcinogenic chemicals with the aim of providing a basis for rapid and reliable prediction of carcinogenicity using microarray technology. We selected 39 chemicals, including 17 rat hepatocarcinogens and eight compounds demonstrating carcinogenicity in organs other than the liver. The remaining 14 were non-carcinogens. When rat hepatoma cells (MH1C1) were treated with the chemicals for 3 days at a non-toxic dose, analysis of gene expression changes with our in-house microarray allowed a set of genes to be identified differentiating hepatocarcinogens from non-carcinogens, and all carcinogens from non-carcinogens, by statistical methods. Moreover, optimization of the two gene sets for classification with an SVM and LOO-CV resulted in selection of 39 genes. The highest predictivity was achieved with 207 genes for differentiation between non-hepatocarcinogens and non-carcinogens. The overlap between the two selected gene sets encompassed 26 genes. This gene set contained significant genes for prediction of carcinogenicity, with a concordance of 84.6% by LOO-CV SVM. Using nine external samples, correct prediction of carcinogenicity by SVM was 88.9%. These results indicate that short-term bioassay systems for carcinogenicity using gene expression profiling in hepatoma cells have great promise.
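The leave-one-out cross-validation (LOO-CV) protocol named in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' pipeline: a nearest-centroid classifier stands in for the SVM, and the function names and toy expression values are hypothetical.

```python
from statistics import mean

def nearest_centroid_predict(train_x, train_y, x):
    """Return the label whose class centroid is closest to x (squared Euclidean)."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [r for r, y in zip(train_x, train_y) if y == label]
        centroid = [mean(col) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def loo_cv_concordance(xs, ys):
    """Hold out each sample once, train on the rest, and return the
    fraction of held-out samples that are predicted correctly."""
    hits = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        hits += nearest_centroid_predict(train_x, train_y, xs[i]) == ys[i]
    return hits / len(xs)

# Hypothetical two-gene expression profiles for six compounds.
xs = [[0.1, 0.2], [0.0, 0.3], [0.2, 0.1], [1.0, 1.1], [0.9, 1.2], [1.1, 0.9]]
ys = ["non-carcinogen"] * 3 + ["carcinogen"] * 3
concordance = loo_cv_concordance(xs, ys)
```

For scale, a concordance of 84.6% over 39 chemicals is consistent with 33 of 39 correct held-out predictions, and 88.9% over nine external samples is consistent with 8 of 9.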
Affiliation(s)
- Kazunari Tsujimura
- Department of Experimental Pathology and Tumor Biology, Nagoya City University, Graduate School of Medical Sciences, 1 Kawasumi, Nagoya, Japan
|
18
|
Abstract
Microarrays have numerous applications in the clinical setting, and these uses are not confined to the study of common human diseases. Indeed, the high-throughput technology affects clinical diagnostics in a variety of contexts, and this is reflected in the increasing use of microarray-based tools in the development of diagnostic and prognostic tests and in the identification of novel therapeutic targets. While much of the value of microarray-based experimentation has been derived from the study of human disease, there is equivalent potential for its role in veterinary medicine. Even though the resources devoted to the study of animal molecular diagnostics may be less than those available for human research, there is nonetheless a growing appreciation of the value of genome-wide information as it applies to animal disease. Therefore, this review focuses on the basics of microarray experimentation, and how this technology lends itself to a variety of diagnostic approaches in veterinary medicine.
Affiliation(s)
- Harriet E Feilotter
- Department of Pathology and Molecular Medicine, Queen's University, Kingston, Ontario K7L 3N6, Canada.
|
19
|
Zhang JJ, Yi T, Zhao LP. Evaluation of nine strategies for analyzing a cDNA toxicology microarray data set. J Biopharm Stat 2005; 15:403-18. [PMID: 15920888 DOI: 10.1081/bip-200056518] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2022]
Abstract
Two-color cDNA microarray technology is commonly used in drug development, as well as in a much broader range of biomedical research. Among its applications, the two-group design is probably the most common, comparing, e.g., normal and abnormal tissue samples, treated and untreated tissues, or individuals who did and did not respond to a drug. Despite the apparent simplicity, there are numerous methods for analyzing such data in a statistically rigorous manner. Here, we discuss nine different analytical strategies, each of which is derived under a set of "reasonable" assumptions. Some of them resemble methods developed for different contexts. In the absence of the truth, investigators should consider the underlying assumptions before adopting one or more of these strategies to analyze data from a particular experiment. The question addressed here is how these analytical strategies are similar and how they differ. We present these strategies in the context of an actual microarray experiment performed at the U.S. Food and Drug Administration.
Affiliation(s)
- Juan Joanne Zhang
- Office of Biostatistics, Center for Drug Evaluation and Research, US Food and Drug Administration, Rockville, Maryland 20857, USA.
|
20
|
Abstract
This paper contains a description of several common normalization methods used in microarray analysis, and compares the effect of these methods on microarray data. The importance of background subtraction is also addressed. The research focuses on three parts. The first uses three statistical methods: t-test, Wilcoxon signed rank test, and sign test to measure the difference between background subtracted data and nonbackground subtracted data. The second part of the study uses the same three statistical methods to compare whether data normalized with different normalization methods yield similar results. The third part of the study focuses on whether these differently normalized data will influence the result of gene selection (dimension reduction). The comparisons are done for several data sets to help identify similarity patterns. The conclusion of this study is that background subtraction can make a difference, especially for some data sets with poorer quality data. The choice of normalization method, for the most part, makes little difference in the sense that the methods produce similarly normalized data. But, based on the third part of analysis, we found that when gene selection is performed on these differently normalized data, somewhat different gene sets are obtained. Thus, the choice of normalization method will likely have some effect on the final analysis.
Affiliation(s)
- Yuanyuan Ding
- University of Mississippi, Computer and Information Science Department, University, Mississippi 38677, USA
|
21
|
Bayesian Clustering of Prostate Cancer Patients by Using a Latent Class Poisson Model. Korean Journal of Applied Statistics 2005. [DOI: 10.5351/kjas.2005.18.1.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/11/2022]
|
22
|
Delongchamp RR, Bowyer JF, Chen JJ, Kodell RL. Multiple-testing strategy for analyzing cDNA array data on gene expression. Biometrics 2005; 60:774-82. [PMID: 15339301 DOI: 10.1111/j.0006-341x.2004.00228.x] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.5] [Indexed: 11/30/2022]
Abstract
An objective of many functional genomics studies is to estimate treatment-induced changes in gene expression. cDNA arrays interrogate each tissue sample for the levels of mRNA for hundreds to tens of thousands of genes, and the use of this technology leads to a multitude of treatment contrasts. By-gene hypothesis tests evaluate the evidence supporting no effect, but selecting a significance level requires dealing with the multitude of comparisons. The p-values from these tests order the genes such that a p-value cutoff divides the genes into two sets. Ideally one set would contain the affected genes and the other would contain the unaffected genes. However, the set of genes selected as affected will have false positives, i.e., genes that are not affected by treatment. Likewise, the other set of genes, selected as unaffected, will contain false negatives, i.e., genes that are affected. A plot of the observed p-values (1 - p) versus their expectation under a uniform [0, 1] distribution allows one to estimate the number of true null hypotheses. With this estimate, the false positive rates and false negative rates associated with any p-value cutoff can be estimated. When computed for a range of cutoffs, these rates summarize the ability of the study to resolve effects. In our work, we are more interested in selecting most of the affected genes rather than protecting against a few false positives. An optimum cutoff, i.e., the best set given the data, depends upon the relative cost of falsely classifying a gene as affected versus the cost of falsely classifying a gene as unaffected. We select the cutoff by a decision-theoretic method analogous to methods developed for receiver operating characteristic curves. In addition, we estimate the false discovery rate and the false nondiscovery rate associated with any cutoff value. Two functional genomics studies that were designed to assess a treatment effect are used to illustrate how the methods allowed the investigators to determine a cutoff to suit their research goals.
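The idea of estimating the number of true null hypotheses and then attaching error rates to a p-value cutoff can be sketched in a few lines. This sketch does not reproduce the plot-based procedure of the paper: it swaps in a common threshold-based estimator that relies on the same fact (null p-values are uniform on [0, 1]). The function names, the tuning value `lam`, and the toy p-values are assumptions made for illustration.

```python
def estimate_true_nulls(pvalues, lam=0.5):
    """Estimate the number of true nulls (m0) from the p-values above lam.

    Null p-values are uniform on [0, 1], so roughly m0 * (1 - lam) of them
    should exceed lam; affected genes rarely produce such large p-values.
    """
    above = sum(p > lam for p in pvalues)
    return min(len(pvalues), above / (1.0 - lam))

def error_rates(pvalues, cutoff, lam=0.5):
    """Return (estimated FDR, estimated false nondiscovery rate) at a cutoff."""
    m = len(pvalues)
    m0 = estimate_true_nulls(pvalues, lam)
    selected = sum(p <= cutoff for p in pvalues)
    false_pos = min(selected, m0 * cutoff)      # expected nulls below the cutoff
    true_pos = selected - false_pos
    false_neg = max(0.0, (m - m0) - true_pos)   # affected genes left unselected
    fdr = false_pos / selected if selected else 0.0
    fnr = false_neg / (m - selected) if m > selected else 0.0
    return fdr, fnr

# Hypothetical p-values for ten genes.
pvals = [0.001, 0.002, 0.004, 0.008, 0.3, 0.45, 0.6, 0.7, 0.8, 0.95]
m0_hat = estimate_true_nulls(pvals)
fdr_hat, fnr_hat = error_rates(pvals, cutoff=0.01)
```

Evaluating `error_rates` over a range of cutoffs gives the kind of trade-off curve from which a cutoff can be chosen according to the relative cost of the two error types.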
Affiliation(s)
- Robert R Delongchamp
- Division of Biometry and Risk Assessment, National Center for Toxicological Research, Jefferson, Arkansas 72079, USA.
|
23
|
Print-tip Normalization for DNA Microarray Data. Korean Journal of Applied Statistics 2005. [DOI: 10.5351/kjas.2005.18.1.115] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/11/2022]
|
24
|
Parrish RS, Spencer HJ. Effect of normalization on significance testing for oligonucleotide microarrays. J Biopharm Stat 2005; 14:575-89. [PMID: 15468753 DOI: 10.1081/bip-200025650] [Citation(s) in RCA: 64] [Impact Index Per Article: 3.4] [Indexed: 11/03/2022]
Abstract
MOTIVATION: Normalization techniques are used to reduce variation among gene expression measurements in oligonucleotide microarrays in an effort to improve the quality of the data and the power of significance tests for detecting differential expression. Of several such proposed methods, two that have commonly been employed include median-interquartile range normalization and quantile normalization. The median-IQR method applied directly to fold-changes for paired data also was considered. Two methods for calculating gene expression values include the MAS 5.0 algorithm [Affymetrix. (2002). Statistical Algorithms Description Document. Santa Clara, CA: Affymetrix, Inc. http://www.affymetrix.com/support/technical/whitepapers/sadd-whitepaper.pdf] and the RMA method [Irizarry, R. A., Bolstad, B. M., Collin, F., Cope, L. M., Hobbs, B., Speed, T. P. (2003a). Summaries of Affymetrix GeneChip probe level data. Nucleic Acids Res. 31(4,e15); Irizarry, R. A., Hobbs, B., Collin, F., Beazer-Barclay, Y. D., Antonellis, K. J., Scherf, U., Speed, T. P. (2003b). Exploration, normalization, and summaries of high density oligonucleotide array probe-level data. Biostatistics 4(2):249-264; Irizarry, R. A., Gautier, L., Cope, L. (2003c). An R package for analysis of Affymetrix oligonucleotide arrays. In: Parmigiani, R. I. G., Garrett, E. S., Ziegler, S., eds. The Analysis of Gene Expression Data: Methods and Software. Berlin: Springer, pp. 102-119]. RESULTS: In considering these methods applied to a prostate cancer data set derived from paired samples on normal and tumor tissue, it is shown that normalization methods may lead to substantial inflation of the number of genes identified by paired-t significance tests even after adjustment for multiple testing. This is shown to be due primarily to an unintended effect that normalization has on the experimental error variance. The impact appears to be greater in the RMA method compared to the MAS 5.0 algorithm and for quantile normalization compared to median-IQR normalization.
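Quantile normalization, one of the two methods compared in this abstract, forces every array to share the same empirical distribution. A minimal sketch, assuming one list of expression values per array, equal lengths, and ties broken by position (a simplification); the function name and toy values are hypothetical:

```python
def quantile_normalize(arrays):
    """Replace each value by the mean, across arrays, of the values
    holding the same rank.

    arrays: list of equal-length lists, one per microarray.
    """
    n = len(arrays[0])
    if any(len(a) != n for a in arrays):
        raise ValueError("all arrays must have the same number of probes")
    sorted_arrays = [sorted(a) for a in arrays]
    # Reference distribution: rank-wise mean over all arrays.
    reference = [sum(s[r] for s in sorted_arrays) / len(arrays) for r in range(n)]
    normalized = []
    for a in arrays:
        order = sorted(range(n), key=lambda i: a[i])  # probe indices by rank
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized

# Three hypothetical arrays measuring the same four probes.
chips = [[5.0, 2.0, 3.0, 4.0], [4.0, 1.0, 4.5, 2.0], [3.0, 4.0, 6.0, 8.0]]
normalized = quantile_normalize(chips)
```

After this step every array contains exactly the same set of values, so between-array spread at each rank is removed entirely. That is the kind of reduction in variation that, according to the abstract, can have an unintended effect on the experimental error variance.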
Affiliation(s)
- Rudolph S Parrish
- Department of Bioinformatics and Biostatistics, School of Public Health and Information Sciences, University of Louisville, Louisville, KY 42092, USA.
|
25
|
Yoon D, Yi SG, Kim JH, Park T. Two-stage normalization using background intensities in cDNA microarray data. BMC Bioinformatics 2004; 5:97. [PMID: 15268767 PMCID: PMC509428 DOI: 10.1186/1471-2105-5-97] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Received: 04/07/2004] [Accepted: 07/21/2004] [Indexed: 11/10/2022] Open
Abstract
Background: In microarray experiments, many undesirable systematic variations are commonly observed. Normalization is the process of removing such variation that affects the measured gene expression levels. Normalization plays an important role in the earlier stage of microarray data analysis. The subsequent analysis results are highly dependent on normalization. One major source of variation is the background intensities. Recently, some methods have been employed for correcting the background intensities. However, all these methods focus on defining signal intensities appropriately from foreground and background intensities in the image analysis. Although a number of normalization methods have been proposed, no systematic methods have been proposed using the background intensities in the normalization process. Results: In this paper, we propose a two-stage method adjusting for the effect of background intensities in the normalization process. The first stage fits a regression model to adjust for the effect of background intensities and the second stage applies the usual normalization method, such as a nonlinear LOWESS method, to the background-adjusted intensities. In order to carry out the two-stage normalization method, we consider nine different background measures and investigate their performance in normalization. The performance of two-stage normalization is compared to that of global median normalization as well as intensity-dependent nonlinear LOWESS normalization. We use the variability among the replicated slides to compare the performance of normalization methods. Conclusions: For the selected background measures, the proposed two-stage normalization method performs better than the global or intensity-dependent nonlinear LOWESS normalization methods. In particular, when there is a strong relationship between the background intensity and the signal intensity, the proposed method performs much better. Regardless of the background correction methods used in the image analysis, the proposed two-stage normalization method is applicable as long as both signal intensity and background intensity are available.
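The two-stage structure described here (regress out the background effect, then apply a conventional normalization) can be sketched as follows. The details are assumptions made for illustration: the abstract does not specify the regression model, so a simple linear fit of the log-ratio on a single background measure is used, and global median centering is swapped in for the LOWESS step to keep the example dependency-free. The function name and toy data are hypothetical.

```python
from statistics import mean, median

def two_stage_normalize(log_ratios, background):
    """Stage 1: least-squares fit of log-ratio on a background measure,
    keeping the residuals as background-adjusted values.
    Stage 2: center the residuals at their median (stand-in for LOWESS)."""
    b_bar, m_bar = mean(background), mean(log_ratios)
    sxx = sum((b - b_bar) ** 2 for b in background)
    sxy = sum((b - b_bar) * (m - m_bar) for b, m in zip(background, log_ratios))
    slope = sxy / sxx if sxx else 0.0
    intercept = m_bar - slope * b_bar
    residuals = [m - (intercept + slope * b) for b, m in zip(background, log_ratios)]
    center = median(residuals)
    return [r - center for r in residuals]

# Hypothetical spots whose log-ratio is driven entirely by background.
background = [1.0, 2.0, 3.0, 4.0]
log_ratios = [2.0 + 0.5 * b for b in background]
adjusted = two_stage_normalize(log_ratios, background)
```

In this toy case the log-ratios are a pure linear function of background, so the adjusted values are all (numerically) zero; with real data the residuals would carry the remaining biological and technical signal into the second stage.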
Affiliation(s)
- Dankyu Yoon
- Program in Bioinformatics, Seoul National University, San56-1, Shin Lim-Dong, Kwan Ak-Ku, Seoul 151-747, Republic of Korea
- Sung-Gon Yi
- Department of Statistics, College of Natural Science, Seoul National University, San56-1, Shin Lim-Dong, Kwan Ak-Ku, Seoul 151-747, Republic of Korea
- Ju-Han Kim
- SNUBI: Seoul National University Biomedical Informatics, Seoul National University School of Medicine, 28 Yongon-dong Chongno-gu, Seoul 110-799, Republic of Korea
- Taesung Park
- Department of Statistics, College of Natural Science, Seoul National University, San56-1, Shin Lim-Dong, Kwan Ak-Ku, Seoul 151-747, Republic of Korea
|
26
|
|
27
|
Park T, Yi SG, Kang SH, Lee S, Lee YS, Simon R. Evaluation of normalization methods for microarray data. BMC Bioinformatics 2003; 4:33. [PMID: 12950995 PMCID: PMC200968 DOI: 10.1186/1471-2105-4-33] [Citation(s) in RCA: 88] [Impact Index Per Article: 4.2] [Received: 05/13/2003] [Accepted: 09/02/2003] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Microarray technology allows the monitoring of expression levels for thousands of genes simultaneously. This novel technique helps us to understand gene regulation as well as gene-by-gene interactions more systematically. In microarray experiments, however, many undesirable systematic variations are observed. Even in replicated experiments, some variations are commonly observed. Normalization is the process of removing some sources of variation which affect the measured gene expression levels. Although a number of normalization methods have been proposed, it has been difficult to decide which methods perform best. Normalization plays an important role in the earlier stage of microarray data analysis. The subsequent analysis results are highly dependent on normalization. RESULTS: In this paper, we use the variability among the replicated slides to compare the performance of normalization methods. We also compare normalization methods with regard to bias and mean square error using simulated data. CONCLUSIONS: Our results show that intensity-dependent normalization often performs better than global normalization methods, and that linear and nonlinear normalization methods perform similarly. These conclusions are based on analysis of 36 cDNA microarrays of 3,840 genes obtained in an experiment to search for changes in gene expression profiles during neuronal differentiation of cortical stem cells. Simulation studies confirm our findings.
Affiliation(s)
- Taesung Park
- Department of Statistics, Seoul National University, Seoul, Korea
- Sung-Gon Yi
- Department of Statistics, Seoul National University, Seoul, Korea
- Sung-Hyun Kang
- Department of Statistics, Seoul National University, Seoul, Korea
- SeungYeoun Lee
- Department of Applied Mathematics, Sejong University, Seoul, Korea
- Yong-Sung Lee
- Department of Biochemistry, Hanyang University College of Medicine, Seoul, Korea
- Richard Simon
- Biometric Research Branch, Division of Cancer Treatment & Diagnosis, National Cancer Institute, Bethesda, MD, USA
|
28
|
Tsai CA, Chen YJ, Chen JJ. Testing for differentially expressed genes with microarray data. Nucleic Acids Res 2003; 31:e52. [PMID: 12711697 PMCID: PMC154240 DOI: 10.1093/nar/gng052] [Citation(s) in RCA: 53] [Impact Index Per Article: 2.5] [Indexed: 11/14/2022] Open
Abstract
This paper compares the type I error and power of the one- and two-sample t-tests, and the one- and two-sample permutation tests for detecting differences in gene expression between two microarray samples with replicates using Monte Carlo simulations. When data are generated from a normal distribution, type I errors and powers of the one-sample parametric t-test and one-sample permutation test are very close, as are the two-sample t-test and two-sample permutation test, provided that the number of replicates is adequate. When data are generated from a t-distribution, the permutation tests outperform the corresponding parametric tests if the number of replicates is at least five. For data from a two-color dye swap experiment, the one-sample test appears to perform better than the two-sample test since expression measurements for control and treatment samples from the same spot are correlated. For data from independent samples, such as the one-channel array or two-channel array experiment using reference design, the two-sample t-tests appear more powerful than the one-sample t-tests.
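A two-sample permutation test of the kind compared in this abstract can be written compactly. The sketch below enumerates every reassignment of the pooled replicates to the two groups (an exact test, feasible only for small replicate counts) and uses the absolute difference in means as the statistic; the function name and toy measurements are hypothetical, and the paper's own simulation settings are not reproduced.

```python
from itertools import combinations
from statistics import mean

def two_sample_permutation_pvalue(a, b):
    """Exact two-sided permutation p-value for a difference in means."""
    pooled = a + b
    observed = abs(mean(a) - mean(b))
    n, k = len(pooled), len(a)
    extreme = total = 0
    for idx in combinations(range(n), k):
        chosen = set(idx)
        g1 = [pooled[i] for i in chosen]
        g2 = [pooled[i] for i in range(n) if i not in chosen]
        total += 1
        # Small tolerance so float rounding cannot exclude the observed split.
        if abs(mean(g1) - mean(g2)) >= observed - 1e-12:
            extreme += 1
    return extreme / total

# Hypothetical expression values: three control and three treated replicates.
p_value = two_sample_permutation_pvalue([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```

With three replicates per group there are only 20 possible splits, so even perfectly separated groups cannot yield a two-sided p-value below 2/20 = 0.1. This illustrates why the number of replicates matters for the permutation tests discussed above.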
Affiliation(s)
- Chen-An Tsai
- Division of Biometry and Risk Assessment, National Center for Toxicological Research, Food and Drug Administration, Jefferson, AR 72079, USA
|