1
|
Patel CJ, Kerr J, Thomas DC, Mukherjee B, Ritz B, Chatterjee N, Jankowska M, Madan J, Karagas MR, McAllister KA, Mechanic LE, Fallin MD, Ladd-Acosta C, Blair IA, Teitelbaum SL, Amos CI. Opportunities and Challenges for Environmental Exposure Assessment in Population-Based Studies. Cancer Epidemiol Biomarkers Prev 2017; 26:1370-1380. [PMID: 28710076 PMCID: PMC5581729 DOI: 10.1158/1055-9965.epi-17-0459] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Received: 06/02/2017] [Revised: 06/14/2017] [Accepted: 06/22/2017] [Indexed: 12/15/2022] Open
Abstract
A growing number and increasing diversity of factors are available for epidemiological studies. These measures provide new avenues for discovery and prevention, yet they also raise many challenges for adoption in epidemiological investigations. Here, we evaluate 1) designs to investigate diseases that consider heterogeneous and multidimensional indicators of exposure and behavior, 2) the implementation of numerous methods to capture indicators of exposure, and 3) the analytical methods required for discovery and validation. We find that case-control studies have provided insights into genetic susceptibility but are insufficient for characterizing complex effects of environmental factors on disease development. Prospective and two-phase designs are required but must balance extended data collection with follow-up of study participants. We discuss innovations in assessments including the microbiome; mass spectrometry and metabolomics; behavioral assessment; dietary, physical activity, and occupational exposure assessment; air pollution monitoring; and global positioning and individual sensors. We claim that the availability of extensive correlated data raises new challenges in disentangling specific exposures that influence cancer risk from among extensive and often correlated exposures. In conclusion, new high-dimensional exposure assessments offer many new opportunities for environmental assessment in cancer development. Cancer Epidemiol Biomarkers Prev; 26(9); 1370-80. ©2017 AACR.
Affiliation(s)
- Chirag J Patel
- Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts.
- Jacqueline Kerr
- Department of Family Medicine and Public Health, University of California San Diego, La Jolla, California.
- Duncan C Thomas
- Department of Preventive Medicine, University of Southern California, Los Angeles, California.
- Bhramar Mukherjee
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, Michigan.
- Beate Ritz
- Department of Epidemiology, Fielding School of Public Health, University of California Los Angeles, Los Angeles, California.
- Nilanjan Chatterjee
- Department of Biostatistics and Department of Oncology, Johns Hopkins University, Baltimore, Maryland.
- Marta Jankowska
- Department of Family Medicine and Public Health, University of California San Diego, La Jolla, California.
- Juliette Madan
- Division of Neonatology, Department of Pediatrics, Dartmouth-Hitchcock Medical Center, Lebanon, New Hampshire.
- Margaret R Karagas
- Department of Epidemiology, Geisel School of Medicine, Dartmouth College, Lebanon, New Hampshire.
- Kimberly A McAllister
- Susceptibility and Population Health Branch, National Institute of Environmental Health Sciences, NIH, Research Triangle Park, North Carolina.
- Leah E Mechanic
- Epidemiology and Genomics Research Program, Division of Cancer Control and Population Sciences, National Cancer Institute, NIH, Bethesda, Maryland.
- M Daniele Fallin
- Johns Hopkins University Bloomberg School of Public Health, Baltimore, Maryland.
- Ian A Blair
- Center of Excellence in Environmental Toxicology and Penn SRP Center, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania.
- Department of Systems Pharmacology and Translational Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania.
- Susan L Teitelbaum
- Department of Preventive Medicine, Icahn School of Medicine at Mount Sinai, New York, New York.
- Christopher I Amos
- Department of Biomedical Data Science, Geisel School of Medicine at Dartmouth College, Lebanon, New Hampshire.
|
2
|
Huque MH, Carroll RJ, Diao N, Christiani DC, Ryan LM. Exposure Enriched Case-Control (EECC) Design for the Assessment of Gene-Environment Interaction. Genet Epidemiol 2016; 40:570-578. [PMID: 27313007 DOI: 10.1002/gepi.21986] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 12/28/2015] [Revised: 05/07/2016] [Accepted: 05/08/2016] [Indexed: 11/10/2022]
Abstract
Genetic susceptibility and environmental exposure both play an important role in the aetiology of many diseases. Case-control studies are often the first choice to explore the joint influence of genetic and environmental factors on the risk of developing a rare disease. In practice, however, such studies may have limited power, especially when susceptibility genes are rare and exposure distributions are highly skewed. We propose a variant of the classical case-control study, the exposure enriched case-control (EECC) design, where not only cases, but also high (or low) exposed individuals are oversampled, depending on the skewness of the exposure distribution. Of course, a traditional logistic regression model is no longer valid and results in biased parameter estimation. We show that addition of a simple covariate to the regression model removes this bias and yields reliable estimates of main and interaction effects of interest. We also discuss optimal design, showing that judicious oversampling of high/low exposed individuals can boost study power considerably. We illustrate our results using data from a study involving arsenic exposure and detoxification genes in Bangladesh.
Affiliation(s)
- Md Hamidul Huque
- School of Mathematical and Physical Sciences, University of Technology Sydney, New South Wales, Australia.
- Raymond J Carroll
- School of Mathematical and Physical Sciences, University of Technology Sydney, New South Wales, Australia; Department of Statistics, Texas A&M University, College Station, Texas, United States of America.
- Nancy Diao
- Department of Environmental Health, Harvard School of Public Health, Boston, Massachusetts, United States of America.
- David C Christiani
- Department of Environmental Health, Harvard School of Public Health, Boston, Massachusetts, United States of America.
- Louise M Ryan
- School of Mathematical and Physical Sciences, University of Technology Sydney, New South Wales, Australia; Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts, United States of America.
|
3
|
Rivera CL, Lumley T. Using the entire history in the analysis of nested case cohort samples. Stat Med 2016; 35:3213-28. [DOI: 10.1002/sim.6917] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 03/18/2015] [Revised: 01/20/2016] [Accepted: 01/31/2016] [Indexed: 11/10/2022]
Affiliation(s)
- C. L. Rivera
- Department of Biostatistics, Harvard School of Public Health, 677 Huntington Avenue, Kresge 803B, Boston, MA 02115, U.S.A.
- T. Lumley
- Department of Biostatistics, Harvard School of Public Health, 677 Huntington Avenue, Kresge 803B, Boston, MA 02115, U.S.A.
|
4
|
Keogh RH, White IR. Using full-cohort data in nested case-control and case-cohort studies by multiple imputation. Stat Med 2013; 32:4021-43. [PMID: 23613433 DOI: 10.1002/sim.5818] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Received: 04/25/2012] [Accepted: 03/17/2013] [Indexed: 11/08/2022]
Abstract
In many large prospective cohorts, expensive exposure measurements cannot be obtained for all individuals. Exposure-disease association studies are therefore often based on nested case-control or case-cohort studies in which complete information is obtained only for sampled individuals. However, in the full cohort, there may be a large amount of information on cheaply available covariates and possibly a surrogate of the main exposure(s), which typically goes unused. We view the nested case-control or case-cohort study plus the remainder of the cohort as a full-cohort study with missing data. Hence, we propose using multiple imputation (MI) to utilise information in the full cohort when data from the sub-studies are analysed. We use the fully observed data to fit the imputation models. We consider using approximate imputation models and also using rejection sampling to draw imputed values from the true distribution of the missing values given the observed data. Simulation studies show that using MI to utilise full-cohort information in the analysis of nested case-control and case-cohort studies can result in important gains in efficiency, particularly when a surrogate of the main exposure is available in the full cohort. In simulations, this method outperforms counter-matching in nested case-control studies and a weighted analysis for case-cohort studies, both of which use some full-cohort information. Approximate imputation models perform well except when there are interactions or non-linear terms in the outcome model, where imputation using rejection sampling works well.
Affiliation(s)
- Ruth H Keogh
- MRC Biostatistics Unit, Cambridge, U.K.; Department of Medical Statistics, London School of Hygiene and Tropical Medicine, Keppel Street, London WC1E 7HT, U.K.
|
5
|
Abstract
BACKGROUND High-throughput laboratory technologies coupled with sophisticated bioinformatics algorithms have tremendous potential for discovering novel biomarkers, or profiles of biomarkers, that could serve as predictors of disease risk, response to treatment or prognosis. We discuss methodological issues in wedding high-throughput approaches for biomarker discovery with the case-control study designs typically used in biomarker discovery studies, especially focusing on nested case-control designs. METHODS We review principles for nested case-control study design in relation to biomarker discovery studies and describe how the efficiency of biomarker discovery can be affected by study design choices. We develop a simulated prostate cancer cohort data set and a series of biomarker discovery case-control studies nested within the cohort to illustrate how study design choices can influence the biomarker discovery process. RESULTS Common elements of nested case-control design, incidence density sampling and matching of controls to cases, are not typically factored correctly into biomarker discovery analyses, inducing bias in the discovery process. We illustrate how incidence density sampling and matching of controls to cases reduce the apparent specificity of truly valid biomarkers 'discovered' in a nested case-control study. We also propose and demonstrate a new case-control matching protocol, which we call 'antimatching', that improves the efficiency of biomarker discovery studies. CONCLUSIONS For a valid, but as yet undiscovered, biomarker(s), disjunctions between correctly designed epidemiologic studies and the practice of biomarker discovery reduce the likelihood that true biomarker(s) will be discovered and increase the false-positive discovery rate.
Affiliation(s)
- Andrew Rundle
- Department of Epidemiology, Mailman School of Public Health, and Herbert Irving Comprehensive Cancer Center, Columbia University, New York, NY 10032, USA.
|
6
|
Aschard H, Lutz S, Maus B, Duell EJ, Fingerlin TE, Chatterjee N, Kraft P, Van Steen K. Challenges and opportunities in genome-wide environmental interaction (GWEI) studies. Hum Genet 2012; 131:1591-613. [PMID: 22760307 DOI: 10.1007/s00439-012-1192-0] [Citation(s) in RCA: 110] [Impact Index Per Article: 9.2] [Received: 03/01/2012] [Accepted: 06/11/2012] [Indexed: 02/03/2023]
Abstract
Interest in performing gene-environment interaction studies has increased significantly with the rise of advanced molecular genetics techniques. Practically, it became possible to investigate the role of environmental factors in disease risk and hence to investigate their role as genetic effect modifiers. The understanding that genetics is important in the uptake and metabolism of toxic substances is an example of how genetic profiles can modify important environmental risk factors for disease. Several rationales exist to set up gene-environment interaction studies, and the technical challenges related to these studies, when the number of environmental or genetic risk factors is relatively small, have been described before. In the post-genomic era, it is now possible to study thousands of genes and their interaction with the environment. This brings along a whole range of new challenges and opportunities. Despite a continuing effort in developing efficient methods and optimal bioinformatics infrastructures to deal with the available wealth of data, the challenge remains how to best present and analyze genome-wide environmental interaction (GWEI) studies involving multiple genetic and environmental factors. Since GWEIs are performed at the intersection of statistical genetics, bioinformatics and epidemiology, they usually face problems similar to those of genome-wide association gene-gene interaction studies. However, additional complexities need to be considered which are typical for large-scale epidemiological studies, but are also related to "joining" two heterogeneous types of data in explaining complex disease trait variation or for prediction purposes.
Affiliation(s)
- Hugues Aschard
- Department of Epidemiology, Harvard School of Public Health, Boston, MA, USA.
|
7
|
Mechanic LE, Chen HS, Amos CI, Chatterjee N, Cox NJ, Divi RL, Fan R, Harris EL, Jacobs K, Kraft P, Leal SM, McAllister K, Moore JH, Paltoo DN, Province MA, Ramos EM, Ritchie MD, Roeder K, Schaid DJ, Stephens M, Thomas DC, Weinberg CR, Witte JS, Zhang S, Zöllner S, Feuer EJ, Gillanders EM. Next generation analytic tools for large scale genetic epidemiology studies of complex diseases. Genet Epidemiol 2011; 36:22-35. [PMID: 22147673 DOI: 10.1002/gepi.20652] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.7] [Indexed: 01/19/2023]
Abstract
Over the past several years, genome-wide association studies (GWAS) have succeeded in identifying hundreds of genetic markers associated with common diseases. However, most of these markers confer relatively small increments of risk and explain only a small proportion of familial clustering. To identify obstacles to future progress in genetic epidemiology research and provide recommendations to NIH for overcoming these barriers, the National Cancer Institute sponsored a workshop entitled "Next Generation Analytic Tools for Large-Scale Genetic Epidemiology Studies of Complex Diseases" on September 15-16, 2010. The goal of the workshop was to facilitate discussions on (1) statistical strategies and methods to efficiently identify genetic and environmental factors contributing to the risk of complex disease; and (2) how to develop, apply, and evaluate these strategies for the design, analysis, and interpretation of large-scale complex disease association studies in order to guide NIH in setting the future agenda in this area of research. The workshop was organized as a series of short presentations covering scientific (gene-gene and gene-environment interaction, complex phenotypes, and rare variants and next generation sequencing) and methodological (simulation modeling and computational resources and data management) topic areas. Specific needs to advance the field were identified during each session and are summarized.
Affiliation(s)
- Leah E Mechanic
- Epidemiology and Genetics Research Program, Division of Cancer Control and Population Sciences, National Cancer Institute, NIH, Bethesda, Maryland 20892, USA.
|
8
|
Thomas D. Methods for investigating gene-environment interactions in candidate pathway and genome-wide association studies. Annu Rev Public Health 2010; 31:21-36. [PMID: 20070199 DOI: 10.1146/annurev.publhealth.012809.103619] [Citation(s) in RCA: 121] [Impact Index Per Article: 8.6] [Indexed: 01/11/2023]
Abstract
Despite the considerable enthusiasm about the yield of novel and replicated discoveries of genetic associations from the new generation of genome-wide association studies (GWAS), the proportion of the heritability of most complex diseases that have been studied to date remains small. Some of this "dark matter" could be due to gene-environment (G x E) interactions or more complex pathways involving multiple genes and exposures. We review the basic epidemiologic study design and statistical analysis approaches to studying G x E interactions individually and then consider more comprehensive approaches to studying entire pathways or GWAS data. In addition to the usual issues in genetic association studies, particular care is needed in exposure assessment, and very large sample sizes are required. Although hypothesis-driven, pathway-based and agnostic GWA study approaches are generally viewed as opposite poles, we suggest that the two can be usefully married using hierarchical modeling strategies that exploit external pathway knowledge in mining genome-wide data.
Affiliation(s)
- Duncan Thomas
- Department of Preventive Medicine, University of Southern California, Los Angeles, California, 90089-9011, USA.
|
9
|
Abstract
Despite the yield of recent genome-wide association (GWA) studies, the identified variants explain only a small proportion of the heritability of most complex diseases. This unexplained heritability could be partly due to gene-environment (G×E) interactions or more complex pathways involving multiple genes and exposures. This Review provides a tutorial on the available epidemiological designs and statistical analysis approaches for studying specific G×E interactions and choosing the most appropriate methods. I discuss the approaches that are being developed for studying entire pathways and available techniques for mining interactions in GWA data. I also explore methods for marrying hypothesis-driven pathway-based approaches with 'agnostic' GWA studies.
Affiliation(s)
- Duncan Thomas
- Medicine, University of Southern California, 1540 Alcazar Street, CHP‑220, Los Angeles, California 90089‑9011, USA.
|
10
|
Kazma R, Bonaïti-Pellié C, Norris JM, Génin E. On the use of sibling recurrence risks to select environmental factors liable to interact with genetic risk factors. Eur J Hum Genet 2010; 18:88-94. [PMID: 19584901 DOI: 10.1038/ejhg.2009.119] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/09/2022] Open
Abstract
Gene-environment interactions are likely to be involved in the susceptibility to multifactorial diseases but are difficult to detect. Available methods usually concentrate on some particular genetic and environmental factors. In this paper, we propose a new method to determine whether a given exposure is susceptible to interact with unknown genetic factors. Rather than focusing on a specific genetic factor, the degree of familial aggregation is used as a surrogate for genetic factors. A test comparing the recurrence risks in sibs according to the exposure of indexes is proposed and its power is studied for varying values of model parameters. The Exposed versus Unexposed Recurrence Analysis (EURECA) is valuable for common diseases with moderate familial aggregation, only when the role of exposure has been clearly outlined. Interestingly, accounting for a sibling correlation for the exposure increases the power of EURECA. An application on a sample ascertained through one index affected with type 2 diabetes is presented where gene-environment interactions involving obesity and physical inactivity are investigated. Association of obesity with type 2 diabetes is clearly evidenced and a potential interaction involving this factor is suggested in Hispanics (P=0.045), whereas a clear gene-environment interaction is evidenced involving physical inactivity only in non-Hispanic whites (P=0.028). The proposed method might be of particular interest before genetic studies to help determine the environmental risk factors that will need to be accounted for to increase the power to detect genetic risk factors and to select the most appropriate samples to genotype.
Affiliation(s)
- Rémi Kazma
- University of Paris-Sud, Le Kremlin Bicêtre, France.
|
11
|
Abstract
In this chapter, we discuss statistical methods for various study designs that are commonly used in epidemiological research and particularly in cancer epidemiological research. After a brief review of basic concepts in epidemiological studies, statistical methods for case-control studies and cohort studies are discussed. Statistical methods for nested case-control and case-cohort studies, which have been increasingly used in cancer epidemiology, also are discussed. This chapter is designed for cancer epidemiologists who understand basic statistical methods for commonly used epidemiological study designs and are able to initiate power and sample size calculations. Therefore, this chapter emphasizes newly developed statistical methods for epidemiological studies as well as study planning.
Affiliation(s)
- Xiaonan Xue
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, NY, USA
|
12
|
Wild P, Andrieu N, Goldstein AM, Schill W. Flexible Two-Phase studies for rare exposures: Feasibility, planning and efficiency issues of a new variant. Epidemiologic Perspectives & Innovations 2008; 5:4. [PMID: 18828892 PMCID: PMC2602593 DOI: 10.1186/1742-5573-5-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 05/28/2008] [Accepted: 10/01/2008] [Indexed: 11/30/2022]
Abstract
The two-phase design consists of an initial (Phase One) study with known disease status and inexpensive covariate information. Within this initial study one selects a subsample on which to collect detailed covariate data. Two-phase studies have been shown to be efficient compared to standard case-control designs. However, potential problems arise if one cannot assure minimum sample sizes in the rarest categories or if recontact of subjects is difficult. In the case of a rare exposure with an inexpensive proxy, the authors propose the flexible two-phase design for which there is a single time of contact, at which a decision about full covariate ascertainment is made based on the proxy. Subjects are screened until the desired numbers of cases and controls have been selected for full data collection. Strategies for optimizing the cost/efficiency of this design and corresponding software are presented. The design is applied to two examples from occupational and genetic epidemiology. By ensuring minimum numbers for the rarest disease-covariate combination(s), we obtain considerable efficiency gains over standard two-phase studies with an improved practical feasibility. The flexible two-phase design may be the design of choice in the case of well targeted studies of the effect of rare exposures with an inexpensive proxy.
Affiliation(s)
- Pascal Wild
- INRS, French National Institute for Research and Safety, Department of Epidemiology, France.
|
13
|
Estimating interaction between genetic and environmental risk factors: efficiency of sampling designs within a cohort. Epidemiology 2008; 19:83-93. [PMID: 18091418 DOI: 10.1097/ede.0b013e31815c4d0e] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 11/25/2022]
Abstract
Large prospective cohorts originally assembled to study environmental risk factors are increasingly exploited to study gene-environment interactions. Given the cost of genetic studies in large samples, being able to select a subsample for genotyping that contains most of the information from the cohort would lead to substantial savings. We consider nested case-control and case-cohort sampling designs with and without stratification and compare their efficiency relative to the entire cohort for estimating the effects of genetic and environmental risk factors and their interactions. Asymptotic calculations show that the relative efficiency of the case-cohort and nested case-control designs implementing the same sampling stratification are similar over a range of scenarios for the relationships among genes, environmental exposures, and disease status. Sampling equal numbers of exposed and unexposed subjects improves efficiency when the exposure is rare. The case-cohort designs had a slight advantage in simulations of sampling designs within the Framingham Offspring Study, using the interaction between apolipoprotein E and smoking on the risk of coronary heart disease as an example. It was possible to estimate the interaction effect with precision close to that of the full cohort when using case-cohort or nested case-control samples containing fewer than half the subjects of the cohort.
|
14
|
Bermejo JL, Hemminki K. Gene-environment studies: any advantage over environmental studies? Carcinogenesis 2007; 28:1526-32. [PMID: 17389613 DOI: 10.1093/carcin/bgm068] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Indexed: 11/14/2022] Open
Abstract
Gene-environment studies have been motivated by the likely existence of prevalent low-risk genes that interact with common environmental exposures. The present study assessed the statistical advantage of the simultaneous consideration of genes and environment to investigate the effect of environmental risk factors on disease. In particular, we contemplated the possibility that several genes modulate the environmental effect. Environmental exposures, genotypes and phenotypes were simulated according to a wide range of parameter settings. Different models of gene-gene-environment interaction were considered. For each parameter combination, we estimated the probability of detecting the main environmental effect, the power to identify the gene-environment interaction and the frequency of environmentally affected individuals at which environmental and gene-environment studies show the same statistical power. The proportion of cases in the population attributable to the modeled risk factors was also calculated. Our data indicate that environmental exposures with weak effects may account for a significant proportion of the population prevalence of the disease. A general result was that, if the environmental effect was restricted to rare genotypes, the power to detect the gene-environment interaction was higher than the power to identify the main environmental effect. In other words, when few individuals contribute to the overall environmental effect, individual contributions are large and result in easily identifiable gene-environment interactions. Moreover, when multiple genes interacted with the environment, the statistical benefit of gene-environment studies was limited to those studies that included major contributors to the gene-environment interaction. The advantage of gene-environment over plain environmental studies also depends on the inheritance mode of the involved genes, on the study design and, to some extent, on the disease prevalence.
Affiliation(s)
- Justo Lorenzo Bermejo
- Division of Molecular Genetic Epidemiology, German Cancer Research Center, Im Neuenheimer Feld 580, D-69120 Heidelberg, Germany.
|
15
|
Goldstein L, Langholz B. Cohort Sampling Schemes for the Mantel-Haenszel Estimator. Scand Stat Theory Appl 2007. [DOI: 10.1111/j.1467-9469.2006.00542.x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/28/2022]
|
16
|
Langholz B. Use of Cohort Information in the Design and Analysis of Case-Control Studies. Scand Stat Theory Appl 2007. [DOI: 10.1111/j.1467-9469.2006.00548.x] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Indexed: 11/27/2022]
|
17
|
Ahmed FE. Gene-gene, gene-environment & multiple interactions in colorectal cancer. Journal of Environmental Science and Health. Part C, Environmental Carcinogenesis & Ecotoxicology Reviews 2006; 24:1-101. [PMID: 16690537 DOI: 10.1080/10590500600614295] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.5] [Indexed: 05/09/2023]
Abstract
This review comprehensively evaluates the influence of gene-gene, gene-environment and multiple interactions on the risk of colorectal cancer (CRC). Methods of studying these interactions and their limitations have been discussed herein. There is a need to develop biomarkers of exposure and of risk that are sensitive, specific, present in the pathway of the disease, and that have been clinically tested for routine use. The influence of inherited variation (polymorphism) in several genes has been discussed in this review; however, due to study limitations and confounders, it is difficult to conclude which ones are associated with the highest risk (either individually or in combination with environmental factors) to CRC. The majority of the sporadic cancer is believed to be due to modification of mutation risk by other genetic and/or environmental factors. Micronutrient deficiency may explain the association between low consumption of fruit/vegetables and CRC in human studies. Mitochondrial modulation by dietary factors influences the balance between cell renewal and death critical in colon mucosal homeostasis. Both genetic and epigenetic interactions are intricately dependent on each other, and collectively influence the process of colorectal tumorigenesis. The genetic and environmental interactions present a good prospect and a challenge for prevention strategies for CRC because they support the view that this highly prevalent cancer is preventable.
Affiliation(s)
- Farid E Ahmed
- Department of Radiation Oncology, Leo W. Jenkins Cancer Center, The Brody School of Medicine, East Carolina University, Greenville, North Carolina 27858, USA.
|
18
|
Andrieu N, Goldstein AM. The case-combined-control design was efficient in detecting gene-environment interactions. J Clin Epidemiol 2004; 57:662-71. [PMID: 15358394 DOI: 10.1016/j.jclinepi.2003.11.014] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Accepted: 11/11/2003] [Indexed: 11/21/2022]
Abstract
OBJECTIVE The interest in studying gene-environment (GxE) interaction is increasing for complex diseases. A design combining both related and unrelated controls (e.g., population-based and siblings) is proposed to increase the power to detect GxE interaction. STUDY DESIGN AND SETTING We used simulations to assess the efficiency of the case-combined-control design relative to a classical case-control study under a variety of assumptions. RESULTS The case-combined-control design appears more efficient and feasible than a classical case-control study for detecting interaction involving rare exposures and/or genetic factors. The number of available sibling controls per case and the frequencies of the risk factors are the most important parameters for determining relative efficiency. Relative efficiencies decrease as the frequency of the gene (G) increases. A positive correlation in exposure (E) between siblings decreases relative efficiency. CONCLUSIONS Although the case-combined-control design may not be efficient for common genes with moderate effects, it appears to be a useful alternative in certain situations where classical approaches remain unrealistic.
Affiliation(s)
- N Andrieu
- Inserm EMI00-06, Tour Evry 2, 523 Place des Terrasses de l'Agora, 91034 Evry Cedex, France.
|
19
|
Bernstein JL, Langholz B, Haile RW, Bernstein L, Thomas DC, Stovall M, Malone KE, Lynch CF, Olsen JH, Anton-Culver H, Shore RE, Boice JD, Berkowitz GS, Gatti RA, Teitelbaum SL, Smith SA, Rosenstein BS, Børresen-Dale AL, Concannon P, Thompson WD. Study design: evaluating gene-environment interactions in the etiology of breast cancer - the WECARE study. Breast Cancer Res 2004; 6:R199-214. [PMID: 15084244 PMCID: PMC400669 DOI: 10.1186/bcr771] [Citation(s) in RCA: 92] [Impact Index Per Article: 4.6] [Received: 07/28/2003] [Revised: 01/15/2004] [Accepted: 01/30/2004] [Indexed: 11/10/2022] Open
Abstract
INTRODUCTION Deficiencies in cellular responses to DNA damage can predispose to cancer. Ionizing radiation can cause cluster damage and double-strand breaks (DSBs) that pose problems for cellular repair processes. Three genes (ATM, BRCA1, and BRCA2) encode products that are essential for the normal cellular response to DSBs, but predispose to breast cancer when mutated. DESIGN To examine the joint roles of radiation exposure and genetic susceptibility in the etiology of breast cancer, we designed a case-control study nested within five population-based cancer registries. We hypothesized that a woman carrying a mutant allele in one of these genes is more susceptible to radiation-induced breast cancer than is a non-carrier. In our study, 700 women with asynchronous bilateral breast cancer were individually matched to 1400 controls with unilateral breast cancer on date and age at diagnosis of the first breast cancer, race, and registry region, and counter-matched on radiation therapy. Each triplet comprised two women who received radiation therapy and one woman who did not. Radiation absorbed dose to the contralateral breast after initial treatment was estimated with a comprehensive dose reconstruction approach that included experimental measurements in anthropomorphic and water phantoms applying patient treatment parameters. Blood samples were collected from all participants for genetic analyses. CONCLUSIONS Our study design improves the potential for detecting gene-environment interactions for diseases when both gene mutations and the environmental exposures of interest are rare in the general population. This is particularly applicable to the study of bilateral breast cancer because both radiation dose and genetic susceptibility have important etiologic roles, possibly by interactive mechanisms. 
By using counter-matching, we optimized the informativeness of the collected dosimetry data by increasing the variability of radiation dose within the case-control sets and enhanced our ability to detect radiation-genotype interactions.
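The counter-matched sampling described in this abstract can be illustrated with a minimal Python sketch. It assumes a simplified setting: each matched set must contain a fixed number of subjects per sampling stratum (here two with radiation therapy and one without, as in the triplets above), and each sampled subject receives the standard counter-matching weight, the stratum size in the risk set divided by the number sampled from that stratum. The function name `counter_match`, the tuple layout, and the toy risk set are hypothetical and are not taken from the WECARE analysis itself.

```python
import random
from collections import Counter


def counter_match(case, risk_set, per_stratum, rng):
    """Build one counter-matched set (hypothetical helper, illustration only).

    case        -- (subject_id, stratum) for the case
    risk_set    -- list of (subject_id, stratum) eligible controls, case excluded
    per_stratum -- required number of set members per stratum, case included;
                   assumed to be at least 1 for the case's own stratum
    Returns (members, weights), where weights[subject_id] = n_s / m_s.
    """
    _, case_stratum = case
    members = [case]
    for stratum, m in per_stratum.items():
        # The case already fills one slot in its own stratum.
        need = m - (1 if stratum == case_stratum else 0)
        pool = [subj for subj in risk_set if subj[1] == stratum]
        members.extend(rng.sample(pool, need))

    # n_s: stratum sizes in the full risk set, counting the case.
    n = Counter(stratum for _, stratum in risk_set)
    n[case_stratum] += 1
    weights = {sid: n[s] / per_stratum[s] for sid, s in members}
    return members, weights


# Toy risk set: 6 eligible controls with radiation therapy, 4 without.
rng = random.Random(0)
risk_set = [(f"u{i}", "RT") for i in range(6)]
risk_set += [(f"u{i}", "no_RT") for i in range(6, 10)]

members, weights = counter_match(("c1", "RT"), risk_set, {"RT": 2, "no_RT": 1}, rng)
print(members)        # the case plus one RT and one no_RT control
print(weights["c1"])  # 7 RT subjects in the risk set / 2 sampled = 3.5
```

In an analysis these weights would enter the conditional likelihood as multipliers on each subject's relative-risk term; that modelling step is omitted here.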
MESH Headings
- Adult
- Alleles
- Ataxia Telangiectasia Mutated Proteins
- Breast Neoplasms/epidemiology
- Breast Neoplasms/etiology
- Breast Neoplasms/genetics
- Breast Neoplasms/radiotherapy
- Case-Control Studies
- Cell Cycle Proteins
- Cocarcinogenesis
- DNA-Binding Proteins
- Female
- Genes, BRCA1
- Genes, BRCA2
- Genes, Tumor Suppressor
- Genetic Predisposition to Disease
- Genotype
- Humans
- Likelihood Functions
- Middle Aged
- Neoplasms, Radiation-Induced/epidemiology
- Neoplasms, Radiation-Induced/etiology
- Neoplasms, Radiation-Induced/genetics
- Neoplasms, Second Primary/epidemiology
- Neoplasms, Second Primary/etiology
- Neoplasms, Second Primary/genetics
- Phantoms, Imaging
- Protein Serine-Threonine Kinases/genetics
- Radiotherapy/adverse effects
- Radiotherapy Dosage
- Registries/statistics & numerical data
- Research Design
- Single-Blind Method
- Tumor Suppressor Proteins
Affiliation(s)
- Jonine L Bernstein
- Department of Community and Preventive Medicine, Mount Sinai School of Medicine, New York, NY, USA.
|
20
|
Zondervan KT, Cardon LR, Kennedy SH. What makes a good case-control study? Design issues for complex traits such as endometriosis. Hum Reprod 2002; 17:1415-23. [PMID: 12042253 DOI: 10.1093/humrep/17.6.1415] [Citation(s) in RCA: 201] [Impact Index Per Article: 9.1] [Indexed: 11/13/2022] Open
Abstract
The combined investigation of environmental and genetic risk-factors in complex traits will refocus attention on the case-control study. Endometriosis is an example of a complex trait for which most case-control studies have not followed the basic criteria of epidemiological study design. Appropriate control selection has been a particular problem. This article reviews the principles underlying the design of case-control studies, and their application to the study of endometriosis. Only if it is designed well is the case-control study a suitable alternative to the prospective cohort study. Use of newly diagnosed over prevalent cases is preferable, as the latter may alter risk estimates and complicate the interpretation of findings. Controls should be selected from the source population from which cases arose. Potential confounding should be addressed in studies of both environmental and genetic factors. For endometriosis, a possible design would be to: (i) use newly diagnosed cases with 'endometriotic' disease; (ii) collect information predating symptom onset; and (iii) use at least one population-based female control group matched on unadjustable confounders and screened for pelvic symptoms. In conclusion, future studies of complex traits such as endometriosis will have to incorporate both environmental and genetic factors. Only adequately designed studies will allow reliable results to be obtained and any true aetiologic heterogeneity expected to underlie a complex trait to be detected.
Affiliation(s)
- Krina T Zondervan
- Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford OX3 6BN, UK.
|
21
|
Gauderman WJ. Sample size requirements for matched case-control studies of gene-environment interaction. Stat Med 2002; 21:35-50. [PMID: 11782049 DOI: 10.1002/sim.973] [Citation(s) in RCA: 496] [Impact Index Per Article: 22.5] [Indexed: 11/05/2022]
Abstract
Consideration of gene-environment (GxE) interaction is becoming increasingly important in the design of new epidemiologic studies. We present a method for computing required sample size or power to detect GxE interaction in the context of three specific designs: the standard matched case-control, the case-sibling, and the case-parent designs. The method is based on computation of the expected value of the likelihood ratio test statistic, assuming that the data will be analysed using conditional logistic regression. Comparisons of required sample sizes indicate that the family-based designs (case-sibling and case-parent) generally require fewer matched sets than the case-control design to achieve the same power for detecting a GxE interaction. The case-sibling design is most efficient when studying a dominant gene, while the case-parent design is preferred for a recessive gene. Methods are also presented for computing sample size when matched sets are obtained from a stratified population, for example, when the population consists of multiple ethnic groups. A software program that implements the method is freely available, and may be downloaded from the website http://hydra.usc.edu/gxe.
Affiliation(s)
- W James Gauderman
- Department of Preventive Medicine, University of Southern California, CA, USA.
|