1
Pereira DA, Luizon MR, Palei AC, Tanus-Santos JE, Cavalli RC, Sandrim VC. Functional polymorphisms of NOS3 and GUCY1A3 affect both nitric oxide formation and association with hypertensive disorders of pregnancy. Front Genet 2024; 15:1293082. [PMID: 38469120 PMCID: PMC10925623 DOI: 10.3389/fgene.2024.1293082] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2023] [Accepted: 02/12/2024] [Indexed: 03/13/2024] Open
Abstract
Impaired nitric oxide (NO) formation may be associated with endothelial dysfunction and increased cardiovascular disease risk in preeclampsia (PE). Functional single-nucleotide polymorphisms (SNPs) of nitric oxide synthase 3 (NOS3) (rs3918226) and guanylate cyclase 1, soluble, alpha 3 (GUCY1A3) (rs7692387) increase susceptibility to the adverse consequences due to inadequate generation of NO by the endothelium. However, no previous study has examined whether these SNPs affect NO formation in healthy pregnancy and in gestational hypertension (GH) and PE. Here, we compared the alleles and genotypes of NOS3 (rs3918226) and GUCY1A3 (rs7692387) SNPs in normotensive pregnant women (NP, n = 153), in GH (n = 96) and PE (n = 163), and examined whether these SNPs affect plasma nitrite concentrations (a marker of NO formation) in these groups. We further examined whether the interaction among SNP genotypes is associated with GH and PE. Genotypes were determined using TaqMan allele discrimination assays, and plasma nitrite concentrations were determined by an ozone-based chemiluminescence assay. Multifactor dimensionality reduction was used to examine the interactions among SNP genotypes. Regarding NOS3 rs3918226, the CT genotype (p = 0.046) and T allele (p = 0.020) were more frequent in NP than in GH, and GH patients carrying the CT+TT genotypes showed lower nitrite concentrations than NP carrying the CT+TT genotypes (p < 0.05). Regarding GUCY1A3 rs7692387, the GA genotype (p = 0.013) and A allele (p = 0.016) were more frequent in PE than in NP, and NP women carrying the GG genotype showed higher nitrite concentrations than GH or PE patients carrying the GG genotype (p < 0.05). However, we found no significant interactions among genotypes for these functional SNPs to be associated with GH or PE. Our novel findings suggest that NOS3 rs3918226 and GUCY1A3 rs7692387 may affect NO formation and association with hypertensive disorders of pregnancy.
Affiliation(s)
- Daniela A. Pereira
  - Department of Genetics, Ecology and Evolution, Institute of Biological Sciences, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
- Marcelo R. Luizon
  - Department of Genetics, Ecology and Evolution, Institute of Biological Sciences, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
  - Department of Biophysics and Pharmacology, Institute of Biosciences, Universidade Estadual Paulista (UNESP), Botucatu, Brazil
- Ana C. Palei
  - Department of Surgery, University of Mississippi Medical Center, Jackson, MS, United States
- José E. Tanus-Santos
  - Department of Pharmacology, Ribeirao Preto Medical School, University of Sao Paulo, Ribeirão Preto, Brazil
- Ricardo C. Cavalli
  - Department of Gynecology and Obstetrics, Ribeirao Preto Medical School, University of Sao Paulo, Ribeirão Preto, Brazil
- Valeria C. Sandrim
  - Department of Biophysics and Pharmacology, Institute of Biosciences, Universidade Estadual Paulista (UNESP), Botucatu, Brazil
2
Prieto-Fernández A, Sánchez-Barroso G, González-Domínguez J, García-Sanz-Calcedo J. Interaction between maintenance variables of medical ultrasound scanners through multifactor dimensionality reduction. Expert Rev Med Devices 2023; 20:851-864. [PMID: 37522639 DOI: 10.1080/17434440.2023.2243208] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2023] [Revised: 06/14/2023] [Accepted: 06/22/2023] [Indexed: 08/01/2023]
Abstract
BACKGROUND Proper maintenance of electro-medical devices is crucial for the quality of care to patients and the economic performance of healthcare organizations. This research aims to identify the interactions between maintenance variables of ultrasound scanners (US) as a function of maintenance indicators: US in service or decommissioned, excessive number of failures, and failure rate. Once those interactions are known, specific maintenance measures will be developed to improve the reliability of the US. RESEARCH DESIGN AND METHODS The multifactor dimensionality reduction (MDR) method was employed to analyze data from 222 US and their four-year maintenance history. Models were developed based on the variables with the greatest influence on maintenance indicators, where US were classified according to the associated risk. RESULTS US with more than one major failure or at least one major component replacement had up to 496.4% more failures than the average. Failure rate increased by up to 188.7% over the average for those US with more than three moderate failures, three replacements, or both. CONCLUSIONS This study identifies and quantifies the causes of risk to establish a specific maintenance plan for US. It helps to better understand the degradation of US to optimize their operation and maintenance.
Affiliation(s)
- Gonzalo Sánchez-Barroso
  - Engineering Projects Area, School of Industrial Engineering, University of Extremadura, Badajoz, Spain
- Jaime González-Domínguez
  - Engineering Projects Area, School of Industrial Engineering, University of Extremadura, Badajoz, Spain
- Justo García-Sanz-Calcedo
  - Engineering Projects Area, School of Industrial Engineering, University of Extremadura, Badajoz, Spain
3
Ott J, Park T. Overview of frequent pattern mining. Genomics Inform 2022; 20:e39. [PMID: 36617647 PMCID: PMC9847378 DOI: 10.5808/gi.22074] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2022] [Accepted: 12/22/2022] [Indexed: 12/31/2022] Open
Abstract
Various methods of frequent pattern mining have been applied to genetic problems, specifically, to the combined association of two genotypes (a genotype pattern, or diplotype) at different DNA variants with disease. These methods have the ability to come up with a selection of genotype patterns that are more common in affected than unaffected individuals, and the assessment of statistical significance for these selected patterns poses some unique problems, which are briefly outlined here.
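As a rough illustration of the idea in this abstract, the following Python sketch counts two-variant genotype patterns and keeps those that are more common in affected than in unaffected individuals. The function names, the `min_support` cutoff and the toy data are assumptions made for this example and are not taken from the paper. The sketch deliberately stops before any significance testing, since the abstract notes that assessing significance for patterns selected this way poses its own problems.

```python
from collections import Counter
from itertools import combinations


def pattern_counts(individuals):
    """Count every two-variant genotype pattern across individuals.

    Each individual is a dict mapping a variant name to a genotype string.
    """
    counts = Counter()
    for genotypes in individuals:
        items = sorted(genotypes.items())  # fixed order so patterns are comparable
        for first, second in combinations(items, 2):
            counts[(first, second)] += 1
    return counts


def enriched_patterns(cases, controls, min_support=2):
    """Return patterns seen at least min_support times in cases and
    more frequent in cases than in controls (hypothetical selection rule)."""
    case_counts = pattern_counts(cases)
    control_counts = pattern_counts(controls)
    selected = []
    for pattern, n_case in case_counts.items():
        if n_case < min_support:
            continue
        freq_case = n_case / len(cases)
        freq_control = control_counts[pattern] / len(controls)
        if freq_case > freq_control:
            selected.append((pattern, freq_case, freq_control))
    # Largest case-control frequency difference first
    selected.sort(key=lambda t: t[1] - t[2], reverse=True)
    return selected


# Toy data, invented for illustration only
cases = [
    {"v1": "AA", "v2": "GG"},
    {"v1": "AA", "v2": "GG"},
    {"v1": "AG", "v2": "GG"},
]
controls = [
    {"v1": "AA", "v2": "GT"},
    {"v1": "AG", "v2": "GG"},
    {"v1": "GG", "v2": "TT"},
]
result = enriched_patterns(cases, controls)
```

Because the patterns are chosen after looking at the data, an ordinary p-value computed on the same data would be too optimistic; a permutation scheme is one common way to address this, but it is not shown here.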
Affiliation(s)
- Jurg Ott
  - Laboratory of Statistical Genetics, Rockefeller University, New York, NY 10065, USA
- Taesung Park
  - Department of Statistics, Seoul National University, Seoul 08826, Korea
4
Park M, Jeong HB, Lee JH, Park T. Spatial rank-based multifactor dimensionality reduction to detect gene-gene interactions for multivariate phenotypes. BMC Bioinformatics 2021; 22:480. [PMID: 34607566 PMCID: PMC8489107 DOI: 10.1186/s12859-021-04395-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/19/2020] [Accepted: 09/17/2021] [Indexed: 01/11/2023] Open
Abstract
Background Identifying interaction effects between genes is one of the main tasks of genome-wide association studies aiming to shed light on the biological mechanisms underlying complex diseases. Multifactor dimensionality reduction (MDR) is a popular approach for detecting gene–gene interactions that has been extended in various forms to handle binary and continuous phenotypes. However, only a few multivariate MDR methods are available for multiple related phenotypes. Current approaches use Hotelling’s T2 statistic to evaluate interaction models, but it is well known that Hotelling’s T2 statistic is highly sensitive to heavily skewed distributions and outliers. Results We propose a robust approach based on nonparametric statistics such as spatial signs and ranks. The new multivariate rank-based MDR (MR-MDR) is mainly suitable for analyzing multiple continuous phenotypes and is less sensitive to skewed distributions and outliers. MR-MDR utilizes fuzzy k-means clustering and classifies multi-locus genotypes into two groups. Then, MR-MDR calculates a spatial rank-sum statistic as an evaluation measure and selects the best interaction model with the largest statistic. Our novel idea lies in adopting nonparametric statistics as an evaluation measure for robust inference. We adopt tenfold cross-validation to avoid overfitting. Intensive simulation studies were conducted to compare the performance of MR-MDR with current methods. Application of MR-MDR to a real dataset from a Korean genome-wide association study demonstrated that it successfully identified genetic interactions associated with four phenotypes related to kidney function. The R code for conducting MR-MDR is available at https://github.com/statpark/MR-MDR. Conclusions Intensive simulation studies comparing MR-MDR with several current methods showed that the performance of MR-MDR was outstanding for skewed distributions. Additionally, for symmetric distributions, MR-MDR showed comparable power. Therefore, we conclude that MR-MDR is a useful multivariate non-parametric approach that can be used regardless of the phenotype distribution, the correlations between phenotypes, and sample size.
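The spatial signs and ranks mentioned in this abstract can be illustrated with a small Python sketch. It uses the standard definitions (the spatial sign of a vector is the vector divided by its Euclidean norm, and the spatial rank of an observation is the average spatial sign of its differences from all observations). It is not the MR-MDR implementation: the fuzzy k-means step and the rank-sum statistic are not shown, and the function names are chosen here for illustration.

```python
import math


def spatial_sign(v):
    """Unit vector in the direction of v; the zero vector maps to zero."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return tuple(0.0 for _ in v)
    return tuple(x / norm for x in v)


def spatial_ranks(points):
    """Spatial rank of each point: mean spatial sign of (point - other)."""
    n = len(points)
    dim = len(points[0])
    ranks = []
    for p in points:
        total = [0.0] * dim
        for q in points:
            sign = spatial_sign(tuple(a - b for a, b in zip(p, q)))
            for k in range(dim):
                total[k] += sign[k]
        ranks.append(tuple(t / n for t in total))
    return ranks


# Three collinear bivariate phenotype vectors, invented for illustration
ranks = spatial_ranks([(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)])
```

Because every difference is scaled to unit length before averaging, an extreme observation contributes only its direction and not its magnitude, which is the intuition behind the robustness to outliers and skewness described above.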
Affiliation(s)
- Mira Park
  - Department of Preventive Medicine, Eulji University, Daejeon, 34824, Republic of Korea
- Hoe-Bin Jeong
  - Department of Statistics, Korea University, Seoul, 02841, Republic of Korea
- Jong-Hyun Lee
  - Department of Statistics, Korea University, Seoul, 02841, Republic of Korea
- Taesung Park
  - Department of Statistics, Seoul National University, Seoul, 08826, Republic of Korea.
5
Gao H, Yang C, Fan J, Lan L, Pang D. Hereditary and breastfeeding factors are positively associated with the aetiology of mammary gland hyperplasia: a case-control study. Int Health 2021; 13:240-247. [PMID: 32556322 PMCID: PMC8079319 DOI: 10.1093/inthealth/ihaa028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2020] [Revised: 04/10/2020] [Accepted: 05/18/2020] [Indexed: 11/30/2022] Open
Abstract
Background Hyperplasia of mammary gland (HMG) has become a common disorder in women. A family history of breast cancer and female reproductive factors may work together to increase the risk of HMG. However, this specific relationship has not been fully characterized. Methods A total of 1881 newly diagnosed HMG cases and 1900 controls were recruited from 2012 to 2017. Demographic characteristics including female reproductive factors and a family history of breast cancer were collected. A multi-analytic strategy combining unconditional logistic regression, multifactor dimensionality reduction (MDR) and crossover approaches was applied to systematically identify the interaction effect of family history of breast cancer and reproductive factors on HMG susceptibility. Results In MDR analysis, high-order interactions among higher-level education, shorter breastfeeding duration and family history of breast cancer were identified (odds ratio [OR] 7.07 [95% confidence interval {CI} 6.08 to 8.22]). Similarly, in crossover analysis, HMG risk increased significantly for those with higher-level education (OR 36.39 [95% CI 11.47 to 115.45]), shorter duration of breastfeeding (OR 27.70 [95% CI 3.73 to 205.70]) and a family history of breast cancer. Conclusion Higher-level education, shorter breastfeeding duration and a family history of breast cancer may synergistically increase the risk of HMG.
Affiliation(s)
- Hanlu Gao
  - Department of Preventive Health, The Affiliated Hospital of Medical School of Ningbo University, 247 Renmin Road, Ningbo, Zhejiang, P.R. China
  - Division of Chronic and Non-communicable Diseases, Harbin Center for Diseases Control and Prevention, 30 Weixing Road, Harbin, Heilongjiang, P.R. China
  - Department of Breast Surgery, Harbin Medical University Cancer Hospital, 150 Haping Road, Harbin, Heilongjiang, P.R. China
- Chao Yang
  - Division of Chronic and Non-communicable Diseases, Harbin Center for Diseases Control and Prevention, 30 Weixing Road, Harbin, Heilongjiang, P.R. China
- Jinqing Fan
  - Department of Dermatology, The Affiliated Hospital of Medical School of Ningbo University, 247 Renmin Road, Ningbo, Zhejiang, P.R. China
- Li Lan
  - Division of Chronic and Non-communicable Diseases, Harbin Center for Diseases Control and Prevention, 30 Weixing Road, Harbin, Heilongjiang, P.R. China
- Da Pang
  - Department of Breast Surgery, Harbin Medical University Cancer Hospital, 150 Haping Road, Harbin, Heilongjiang, P.R. China
6
Liu N, Chee ML, Koh ZX, Leow SL, Ho AFW, Guo D, Ong MEH. Utilizing machine learning dimensionality reduction for risk stratification of chest pain patients in the emergency department. BMC Med Res Methodol 2021; 21:74. [PMID: 33865317 PMCID: PMC8052947 DOI: 10.1186/s12874-021-01265-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/14/2020] [Accepted: 04/05/2021] [Indexed: 12/24/2022] Open
Abstract
BACKGROUND Chest pain is among the most common presenting complaints in the emergency department (ED). Swift and accurate risk stratification of chest pain patients in the ED may improve patient outcomes and reduce unnecessary costs. Traditional logistic regression with stepwise variable selection has been used to build risk prediction models for ED chest pain patients. In this study, we aimed to investigate if machine learning dimensionality reduction methods can improve performance in deriving risk stratification models. METHODS A retrospective analysis was conducted on the data of patients > 20 years old who presented to the ED of Singapore General Hospital with chest pain between September 2010 and July 2015. Variables used included demographics, medical history, laboratory findings, heart rate variability (HRV), and heart rate n-variability (HRnV) parameters calculated from five to six-minute electrocardiograms (ECGs). The primary outcome was 30-day major adverse cardiac events (MACE), which included death, acute myocardial infarction, and revascularization within 30 days of ED presentation. We used eight machine learning dimensionality reduction methods and logistic regression to create different prediction models. We further excluded cardiac troponin from candidate variables and derived a separate set of models to evaluate the performance of models without using laboratory tests. Receiver operating characteristic (ROC) and calibration analyses were used to compare model performance. RESULTS Seven hundred ninety-five patients were included in the analysis, of which 247 (31%) met the primary outcome of 30-day MACE. Patients with MACE were older and more likely to be male. All eight dimensionality reduction methods achieved comparable performance with the traditional stepwise variable selection; the multidimensional scaling algorithm performed the best with an area under the curve of 0.901. All prediction models generated in this study outperformed several existing clinical scores in ROC analysis. CONCLUSIONS Dimensionality reduction models showed marginal value in improving the prediction of 30-day MACE for ED chest pain patients. Moreover, they are black box models, making them difficult to explain and interpret in clinical practice.
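The area under the ROC curve used to compare models in this abstract can be computed directly from predicted scores, as in the following Python sketch. It uses the pairwise formulation: the AUC equals the fraction of (event, non-event) pairs in which the event case receives the higher score, with ties counted as one half. The function name and the example scores are invented for illustration and are unrelated to the study's data.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted risk per patient; labels: 1 for an event, 0 otherwise.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both outcome classes are required")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a concordant pair
    return wins / (len(positives) * len(negatives))


# Made-up scores for four patients, two of whom had the outcome
auc = roc_auc([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0])
```

The quadratic loop is adequate for a cohort of a few hundred patients; a rank-based computation gives the same value more efficiently on larger data.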
Affiliation(s)
- Nan Liu
  - Duke-NUS Medical School, National University of Singapore, Singapore, Singapore.
  - Health Services Research Centre, Singapore Health Services, Singapore, Singapore.
  - Institute of Data Science, National University of Singapore, Singapore, Singapore.
- Marcel Lucas Chee
  - Faculty of Medicine, Nursing and Health Sciences, Monash University, Melbourne, Australia
- Zhi Xiong Koh
  - Department of Emergency Medicine, Singapore General Hospital, Singapore, Singapore
- Su Li Leow
  - Duke-NUS Medical School, National University of Singapore, Singapore, Singapore
- Andrew Fu Wah Ho
  - Duke-NUS Medical School, National University of Singapore, Singapore, Singapore
  - Department of Emergency Medicine, Singapore General Hospital, Singapore, Singapore
- Dagang Guo
  - SingHealth Duke-NUS Emergency Medicine Academic Clinical Programme, Singapore, Singapore
- Marcus Eng Hock Ong
  - Duke-NUS Medical School, National University of Singapore, Singapore, Singapore
  - Health Services Research Centre, Singapore Health Services, Singapore, Singapore
  - Department of Emergency Medicine, Singapore General Hospital, Singapore, Singapore
7
Manavalan R, Priya S. Genetic interactions effects for cancer disease identification using computational models: a review. Med Biol Eng Comput 2021; 59:733-758. [PMID: 33839998 DOI: 10.1007/s11517-021-02343-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 10/20/2020] [Accepted: 03/10/2021] [Indexed: 11/29/2022]
Abstract
Genome-wide association studies (GWAS) provide clear insight into the genetic variations and environmental influences responsible for various human diseases. Cancer identification through genetic interactions (epistasis) is one of the significant ongoing research areas in GWAS. The growth of cancer cells emerges from multi-locus as well as complex genetic interactions. It is impractical for the physician to detect cancer via manual examination of SNP interactions. Due to its importance, several computational approaches have been modeled to infer epistasis effects. This article includes a comprehensive and multifaceted review of all relevant genetic studies published between 2001 and 2020. In this contemporary review, various computational methods, namely multifactor dimensionality reduction-based approaches, statistical strategies, machine learning, and optimization-based techniques, are carefully reviewed and presented with their evaluation results. Moreover, these computational approaches' strengths and limitations are described. The issues behind the computational methods for identifying cancer through genetic interactions and the various evaluation parameters used by researchers have been analyzed. This review is beneficial for researchers and medical professionals who want to learn techniques adapted to discover epistasis, and it aids the design of novel automatic epistasis detection systems with strong robustness and maximum efficiency to address different research problems and find practical solutions effectively.
Affiliation(s)
- R Manavalan
  - Department of Computer Science, Arignar Anna Government Arts College, Villupuram, Tamil Nadu, 605602, India.
- S Priya
  - Computer Science, Arignar Anna Government Arts College, Villupuram, Tamil Nadu, India
8
Sandrim VC, Luizon MR, Pilan E, Caldeira-Dias M, Coeli-Lacchini FB, Kors G, Berndt I, Lacchini R, Cavalli RC. Interaction Between NOS3 and HMOX1 on Antihypertensive Drug Responsiveness in Preeclampsia. REVISTA BRASILEIRA DE GINECOLOGIA E OBSTETRÍCIA 2020; 42:460-467. [PMID: 32559798 PMCID: PMC10309231 DOI: 10.1055/s-0040-1712484] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 01/19/2023] Open
Abstract
OBJECTIVE We examined the interaction of polymorphisms in the genes heme oxygenase-1 (HMOX1) and nitric oxide synthase (NOS3) in patients with preeclampsia (PE) as well as the responsiveness to methyldopa and to total antihypertensive therapy. METHODS The genes HMOX1 (rs2071746, A/T) and NOS3 (rs1799983, G/T) were genotyped using TaqMan allele discrimination assays (Applied Biosystems, Foster City, CA, USA), and the levels of the enzyme heme oxygenase-1 (HO-1) were measured using enzyme-linked immunosorbent assay (ELISA). RESULTS We found interactions between genotypes of the HMOX1 and NOS3 genes and responsiveness to methyldopa, and patients with PE genotyped as AT presented lower levels of HO-1 protein compared with those genotyped as AA. CONCLUSION We found interactions between the HMOX1 and NOS3 genes and responsiveness to methyldopa, and the HMOX1 polymorphism affects the levels of the enzyme HO-1 in responsiveness to methyldopa and to total antihypertensive therapy. These data suggest an impact of the combination of these two polymorphisms on antihypertensive responsiveness in PE.
Affiliation(s)
- Valeria Cristina Sandrim
  - Department of Pharmacology, Instituto de Biociências de Botucatu da Universidade Estadual Paulista, Botucatu, SP, Brazil
- Marcelo Rizzatti Luizon
  - Department of Genetics, Ecology and Evolution, Instituto de Ciências Biológicas da Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
- Eliane Pilan
  - Department of Pharmacology, Instituto de Biociências de Botucatu da Universidade Estadual Paulista, Botucatu, SP, Brazil
- Mayara Caldeira-Dias
  - Department of Pharmacology, Instituto de Biociências de Botucatu da Universidade Estadual Paulista, Botucatu, SP, Brazil
- Georgia Kors
  - Department of Pharmacology, Instituto de Biociências de Botucatu da Universidade Estadual Paulista, Botucatu, SP, Brazil
- Iuly Berndt
  - Department of Pharmacology, Instituto de Biociências de Botucatu da Universidade Estadual Paulista, Botucatu, SP, Brazil
- Riccardo Lacchini
  - Department of Psychiatric Nursing and Human Sciences, Escola de Enfermagem de Ribeirão Preto, Universidade de São Paulo, Ribeirão Preto, SP, Brazil
- Ricardo Carvalho Cavalli
  - Department of Gynecology and Obstetrics, Faculdade de Medicina de Ribeirão Preto, Universidade de São Paulo, Ribeirão Preto, SP, Brazil
9
Fernández-Torres J, Martínez-Nava GA, Zamudio-Cuevas Y, Lozada C, Garrido-Rodríguez D, Martínez-Flores K. Epistasis of polymorphisms related to the articular cartilage extracellular matrix in knee osteoarthritis: Analysis-based multifactor dimensionality reduction. Genet Mol Biol 2020; 43:e20180349. [PMID: 32240281 PMCID: PMC7197998 DOI: 10.1590/1678-4685-gmb-2018-0349] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 11/29/2018] [Accepted: 06/26/2019] [Indexed: 12/23/2022] Open
Abstract
Osteoarthritis (OA) is a complex disease with a multifactorial etiology. The genetic component is one of the main associated factors, resulting from interactions between genes and environmental factors. The aim of this study was to identify gene-gene interactions (epistasis) of the articular cartilage extracellular matrix (ECM) in knee OA. Ninety-two knee OA patients and 147 healthy individuals were included. Participants were genotyped in order to evaluate nine variants of eight genes associated with ECM metabolism using the OpenArray technology. Epistasis was analyzed using the multifactor dimensionality reduction (MDR) method. The MDR analysis showed significant gene-gene interactions between MMP3 (rs679620) and COL3A1 (rs1800255), and between COL3A1 (rs1800255) and VEGFA (rs699947) polymorphisms, with information gain values of 3.21% and 2.34%, respectively. Furthermore, in our study we found interactions in high-risk genotypes of the HIF1AN, MMP3 and COL3A1 genes; the most representative were [AA+CC+GA], [AA+CT+GA] and [AA+CT+GG], respectively; and low-risk genotypes [AA+CC+GG], [GG+TT+GA] and [AA+TT+GA], respectively. Knowing the interactions of these polymorphisms involved in articular cartilage ECM metabolism could provide a new tool to identify individuals at high risk of developing knee OA.
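The information gain values reported for SNP pairs in this abstract are entropy-based. One common definition of interaction information gain, assumed here because the abstract does not spell out the formula, is IG(A;B;C) = I(A,B;C) - I(A;C) - I(B;C), where I is mutual information, A and B are the two genotypes and C is case or control status. The Python sketch below implements that definition; how the study's software scales values to percentages is not stated in the abstract and is not reproduced.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy in bits of a sequence of hashable values."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(list(x)) + entropy(list(y)) - entropy(list(zip(x, y)))


def interaction_gain(a, b, outcome):
    """Information about the outcome gained from A and B jointly,
    beyond what each provides alone (assumed definition, see lead-in)."""
    joint = list(zip(a, b))
    return (mutual_information(joint, outcome)
            - mutual_information(a, outcome)
            - mutual_information(b, outcome))


# Toy XOR-style data: neither factor alone predicts the outcome,
# but the pair determines it completely
snp_a = [0, 0, 1, 1]
snp_b = [0, 1, 0, 1]
status = [0, 1, 1, 0]
gain = interaction_gain(snp_a, snp_b, status)
```

A positive value indicates synergy between the two factors, while a negative value indicates redundancy, which is why the measure is used to flag candidate epistatic pairs.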
Affiliation(s)
- Javier Fernández-Torres
  - Synovial Fluid Laboratory, Instituto Nacional de Rehabilitación "Luis Guillermo Ibarra Ibarra", Mexico City, Mexico
- Yessica Zamudio-Cuevas
  - Synovial Fluid Laboratory, Instituto Nacional de Rehabilitación "Luis Guillermo Ibarra Ibarra", Mexico City, Mexico
- Carlos Lozada
  - Rheumatology Service, Instituto Nacional de Rehabilitación "Luis Guillermo Ibarra Ibarra", Mexico City, Mexico
- Daniela Garrido-Rodríguez
  - Center for Research in Infectious Diseases, National Institute of Respiratory Diseases, Mexico City, Mexico
- Karina Martínez-Flores
  - Synovial Fluid Laboratory, Instituto Nacional de Rehabilitación "Luis Guillermo Ibarra Ibarra", Mexico City, Mexico
10
Xu Q, Guo L, Cheng J, Wang M, Geng Z, Zhu W, Zhang B, Liao W, Qiu S, Zhang H, Xu X, Yu Y, Gao B, Han T, Yao Z, Cui G, Liu F, Qin W, Zhang Q, Li MJ, Liang M, Chen F, Xian J, Li J, Zhang J, Zuo XN, Wang D, Shen W, Miao Y, Yuan F, Lui S, Zhang X, Xu K, Zhang LJ, Ye Z, Yu C. CHIMGEN: a Chinese imaging genetics cohort to enhance cross-ethnic and cross-geographic brain research. Mol Psychiatry 2020; 25:517-529. [PMID: 31827248 PMCID: PMC7042768 DOI: 10.1038/s41380-019-0627-6] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 09/26/2018] [Revised: 11/21/2019] [Accepted: 11/27/2019] [Indexed: 02/05/2023]
Abstract
The Chinese Imaging Genetics (CHIMGEN) study establishes the largest Chinese neuroimaging genetics cohort and aims to identify genetic and environmental factors and their interactions that are associated with neuroimaging and behavioral phenotypes. This study prospectively collected genomic, neuroimaging, environmental, and behavioral data from more than 7000 healthy Chinese Han participants aged 18-30 years. As a pioneer of large-sample neuroimaging genetics cohorts of non-Caucasian populations, this cohort can provide new insights into ethnic differences in genetic-neuroimaging associations by being compared with Caucasian cohorts. In addition to micro-environmental measurements, this study also collects hundreds of quantitative macro-environmental measurements from remote sensing and national survey databases based on the locations of each participant from birth to present, which will facilitate discoveries of new environmental factors associated with neuroimaging phenotypes. With lifespan environmental measurements, this study can also provide insights on the macro-environmental exposures that affect the human brain as well as their timing and mechanisms of action.
Collapse
Affiliation(s)
- Qiang Xu
- Department of Radiology and Tianjin Key Laboratory of Functional Imaging, Tianjin Medical University General Hospital, 300052, Tianjin, China
| | - Lining Guo
- Department of Radiology and Tianjin Key Laboratory of Functional Imaging, Tianjin Medical University General Hospital, 300052, Tianjin, China
| | - Jingliang Cheng
- Department of Magnetic Resonance Imaging, The First Affiliated Hospital of Zhengzhou University, 450052, Zhengzhou, China
| | - Meiyun Wang
- Department of Radiology, Zhengzhou University People's Hospital and Henan Provincial People's Hospital, 450003, Zhengzhou, China
- Henan Key Laboratory for Medical Imaging of Neurological Diseases, 450003, Zhengzhou, China
| | - Zuojun Geng
- Department of Medical Imaging, The Second Hospital of Hebei Medical University, 050000, Shijiazhuang, China
| | - Wenzhen Zhu
- Department of Radiology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, 430030, Wuhan, China
| | - Bing Zhang
- Department of Radiology, Drum Tower Hospital, Medical School of Nanjing University, 210008, Nanjing, China
| | - Weihua Liao
- Department of Radiology, Xiangya Hospital, Central South University, 410008, Changsha, China
- National Clinical Research Center for Geriatric Disorder, 410008, Changsha, China
| | - Shijun Qiu
- Department of Medical Imaging, The First Affiliated Hospital of Guangzhou University of Chinese Medicine, 510405, Guangzhou, China
| | - Hui Zhang
- Department of Radiology, The First Hospital of Shanxi Medical University, 030001, Taiyuan, China
| | - Xiaojun Xu
- Department of Radiology, The Second Affiliated Hospital of Zhejiang University, School of Medicine, 310009, Hangzhou, China
| | - Yongqiang Yu
- Department of Radiology, The First Affiliated Hospital of Anhui Medical University, 230022, Hefei, China
| | - Bo Gao
- Department of Radiology, Yantai Yuhuangding Hospital, 264000, Yantai, China
| | - Tong Han
- Department of Radiology, Tianjin Huanhu Hospital, 300350, Tianjin, China
- Tianjin Key Laboratory of Cerebral Vascular and Neurodegenerative Diseases, 300350, Tianjin, China
| | - Zhenwei Yao
- Department of Radiology, Huashan Hosptial, Fudan University, 200040, Shanghai, China
| | - Guangbin Cui
- Functional and Molecular Imaging Key Lab of Shaanxi Province & Department of Radiology, Tangdu Hospital, The Military Medical University of PLA Airforce (Fourth Military Medical University), 710038, Xi'an, China
| | - Feng Liu
- Department of Radiology and Tianjin Key Laboratory of Functional Imaging, Tianjin Medical University General Hospital, 300052, Tianjin, China
| | - Wen Qin
- Department of Radiology and Tianjin Key Laboratory of Functional Imaging, Tianjin Medical University General Hospital, 300052, Tianjin, China
| | - Quan Zhang
- Department of Radiology and Tianjin Key Laboratory of Functional Imaging, Tianjin Medical University General Hospital, 300052, Tianjin, China
| | - Mulin Jun Li
- Collaborative Innovation Center of Tianjin for Medical Epigenetics, Tianjin Key Laboratory of Medical Epigenetics, School of Basic Medical Sciences, Tianjin Medical University, 300070, Tianjin, China
| | - Meng Liang
- School of Medical Imaging, Tianjin Medical University, 300203, Tianjin, China
| | - Feng Chen
- Department of Radiology, Hainan General Hospital, 570311, Haikou, China
| | - Junfang Xian
- Department of Radiology, Beijing Tongren Hospital, Capital Medical University, 100730, Beijing, China
| | - Jiance Li
- Department of Radiology, The First Affiliated Hospital of Wenzhou Medical University, 325000, Wenzhou, China
| | - Jing Zhang
- Department of Magnetic Resonance, Lanzhou University Second Hospital, 730050, Lanzhou, China
| | - Xi-Nian Zuo
- Department of Psychology, University of Chinese Academy of Sciences (CAS), 100049, Beijing, China
- CAS Key Laboratory of Behavioral Science, Institute of Psychology, 100101, Beijing, China
| | - Dawei Wang
- Department of Radiology, Qilu Hospital of Shandong University, 250012, Jinan, China
| | - Wen Shen
- Department of Radiology, Tianjin First Center Hospital, 300192, Tianjin, China
| | - Yanwei Miao
- Department of Radiology, The First Affiliated Hospital of Dalian Medical University, 116011, Dalian, China
| | - Fei Yuan
- Department of Radiology, Pingjin Hospital, Logistics University of Chinese People's Armed Police Forces, 300162, Tianjin, China
| | - Su Lui
- Department of Radiology, The Center for Medical Imaging, West China Hospital of Sichuan University, 610041, Chengdu, China
- Department of Radiology, The Second Affiliated Hospital and Yuying Children's Hospital of Wenzhou Medical University, 325000, Wenzhou, China
| | - Xiaochu Zhang
- CAS Key Laboratory of Brain Function and Disease, University of Science and Technology of China, 230026, Hefei, China
- School of Life Sciences, University of Science & Technology of China, 230026, Hefei, China
| | - Kai Xu
- Department of Radiology, The Affiliated Hospital of Xuzhou Medical University, 221006, Xuzhou, China
- School of Medical Imaging, Xuzhou Medical University, 221004, Xuzhou, China
| | - Long Jiang Zhang
- Department of Medical Imaging, Jinling Hospital, Medical School of Nanjing University, 210002, Nanjing, China
| | - Zhaoxiang Ye
- Department of Radiology, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Tianjin's Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, 300060, Tianjin, China
| | - Chunshui Yu
- Department of Radiology and Tianjin Key Laboratory of Functional Imaging, Tianjin Medical University General Hospital, 300052, Tianjin, China.
- CAS Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai, 200031, China.
| |
|
11
|
Chattopadhyay A, Lu TP. Gene-gene interaction: the curse of dimensionality. ANNALS OF TRANSLATIONAL MEDICINE 2019; 7:813. [PMID: 32042829 DOI: 10.21037/atm.2019.12.87] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/11/2022]
Abstract
Identified genetic variants from genome-wide association studies frequently show only modest effects on disease risk, leading to the "missing heritability" problem. One avenue to account for part of this "missingness" is to evaluate gene-gene interactions (epistasis), thereby elucidating their effect on complex diseases. This can potentially help with identifying gene functions, pathways, and drug targets. However, the exhaustive evaluation of all possible genetic interactions among millions of single nucleotide polymorphisms (SNPs) raises several issues, otherwise known as the "curse of dimensionality". The dimensionality involved in the epistatic analysis of such exponentially growing numbers of SNP combinations diminishes the usefulness of traditional, parametric statistical methods. The immense popularity of multifactor dimensionality reduction (MDR), a non-parametric method proposed in 2001 that classifies multi-dimensional genotypes into a one-dimensional binary variable, led to the emergence of a fast-growing collection of methods based on the MDR approach. Moreover, machine-learning (ML) methods such as random forests and neural networks (NNs), deep-learning (DL) approaches, and hybrid approaches have also been applied extensively in recent years to tackle the dimensionality issue associated with whole-genome gene-gene interaction studies. However, exhaustive searching in MDR-based approaches or variable selection in ML methods still poses the risk of missing relevant SNPs. Furthermore, interpretability issues are a major hindrance for DL methods. To minimize this loss of information, Python-based tools such as PySpark can potentially take advantage of distributed computing resources in the cloud to bring back smaller subsets of data for further local analysis. Parallel computing can be a powerful resource for fighting this "curse". PySpark supports all standard Python libraries and C extensions, making it convenient to write code that delivers dramatic improvements in processing speed for extraordinarily large data sets.
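The core MDR step described in this abstract, pooling multi-locus genotype cells into a single high-risk/low-risk variable, can be sketched as follows. This is a minimal illustration under stated assumptions, not the original MDR software: the function names `mdr_labels` and `balanced_accuracy`, the toy data, and the threshold (the overall case:control ratio) are all hypothetical choices made for the example.

```python
from collections import Counter


def mdr_labels(genotypes, status):
    """Pool multi-locus genotype cells into 'H' (high risk) or 'L' (low risk).

    genotypes: list of tuples, one genotype code per SNP (e.g. 0/1/2).
    status:    list of 1 (case) / 0 (control), same length as genotypes.
    """
    cases = Counter(g for g, s in zip(genotypes, status) if s == 1)
    controls = Counter(g for g, s in zip(genotypes, status) if s == 0)
    n_cases, n_controls = sum(cases.values()), sum(controls.values())
    labels = {}
    for cell in set(cases) | set(controls):
        # A cell is high risk when its case:control ratio exceeds the overall
        # ratio (assumed threshold); cross-multiplied to avoid division by zero.
        high = cases[cell] * n_controls > controls[cell] * n_cases
        labels[cell] = "H" if high else "L"
    return labels


def balanced_accuracy(labels, genotypes, status):
    """Mean of sensitivity and specificity of the H/L rule on the given data."""
    tp = sum(1 for g, s in zip(genotypes, status) if s == 1 and labels[g] == "H")
    tn = sum(1 for g, s in zip(genotypes, status) if s == 0 and labels[g] == "L")
    n_cases = sum(status)
    n_controls = len(status) - n_cases
    return 0.5 * (tp / n_cases + tn / n_controls)


# Hypothetical two-SNP toy data: 4 cases and 4 controls.
genotypes = [(0, 0), (0, 0), (0, 1), (1, 1), (1, 1), (1, 1), (2, 0), (2, 0)]
status = [0, 0, 0, 1, 1, 0, 1, 1]

labels = mdr_labels(genotypes, status)
ba = balanced_accuracy(labels, genotypes, status)
print(labels, ba)
```

In practice the labelling would be learned on training folds and scored on held-out folds; the sketch scores on the same data only to keep the example short.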
Affiliation(s)
- Amrita Chattopadhyay
- Institute of Epidemiology and Preventive Medicine, Department of Public Health, National Taiwan University, Taipei
| | - Tzu-Pin Lu
- Institute of Epidemiology and Preventive Medicine, Department of Public Health, National Taiwan University, Taipei
| |
|
12
|
Uppu S, Krishna A. A deep hybrid model to detect multi-locus interacting SNPs in the presence of noise. Int J Med Inform 2018; 119:134-151. [DOI: 10.1016/j.ijmedinf.2018.09.003] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/26/2017] [Revised: 04/13/2018] [Accepted: 09/03/2018] [Indexed: 01/17/2023]
|
13
|
Zhang L, Kim I. Semiparametric Bayesian kernel survival model for evaluating pathway effects. Stat Methods Med Res 2018; 28:3301-3317. [PMID: 30289021 DOI: 10.1177/0962280218797360] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/07/2023]
Abstract
Massive amounts of high-dimensional data have been accumulated over the past two decades, which has fostered increasing interest in identifying gene pathways related to certain biological processes. In particular, since pathway-based analysis has the ability to detect subtle changes of differentially expressed genes that could be missed when using gene-based analysis, detecting the gene pathways that regulate certain diseases can provide new strategies for medical procedures and new targets for drug discovery. Limited work has been carried out, primarily in regression settings, to study the effects of pathways on survival outcomes. Motivated by a breast cancer gene-pathway data set, which exhibits the "small n, large p" characteristics, we propose a semiparametric Bayesian kernel survival model (s-BKSurv) to study the effects of both clinical covariates and gene expression levels within a pathway on survival time. We model the unknown high-dimensional functions of pathways via a Gaussian kernel machine to allow for the possibility that genes within the same pathway interact with each other. To address the multiple comparisons problem under a full Bayesian setting, we propose a similarity-dependent procedure based on the Bayes factor to control the family-wise error rate. We demonstrate the superior performance of our approach under various simulation settings and on pathway data.
Affiliation(s)
- Lin Zhang
- Department of Statistics, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA
| | - Inyoung Kim
- Department of Statistics, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA
| |
|
14
|
Uppu S, Krishna A, Gopalan RP. A Review on Methods for Detecting SNP Interactions in High-Dimensional Genomic Data. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2018; 15:599-612. [PMID: 28060710 DOI: 10.1109/tcbb.2016.2635125] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/06/2023]
Abstract
In this era of genome-wide association studies (GWAS), the quest for understanding the genetic architecture of complex diseases is rapidly increasing more than ever before. The development of high throughput genotyping and next generation sequencing technologies enables genetic epidemiological analysis of large scale data. These advances have led to the identification of a number of single nucleotide polymorphisms (SNPs) responsible for disease susceptibility. The interactions between SNPs associated with complex diseases are increasingly being explored in the current literature. These interaction studies are mathematically challenging and computationally complex. These challenges have been addressed by a number of data mining and machine learning approaches. This paper reviews the current methods and the related software packages to detect the SNP interactions that contribute to diseases. The issues that need to be considered when developing these models are addressed in this review. The paper also reviews the achievements in data simulation to evaluate the performance of these models. Further, it discusses the future of SNP interaction analysis.
|
15
|
Cole BS, Hall MA, Urbanowicz RJ, Gilbert‐Diamond D, Moore JH. Analysis of Gene‐Gene Interactions. Curr Protoc Hum Genet 2018; 95:1.14.1-1.14.10. [DOI: 10.1002/cphg.45] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]
Affiliation(s)
- Brian S. Cole
- Department of Biostatistics and Epidemiology, Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania Philadelphia Pennsylvania
| | - Molly A. Hall
- Department of Biostatistics and Epidemiology, Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania Philadelphia Pennsylvania
- The Center for Systems Genomics, The Pennsylvania State University, University Park Pennsylvania
| | - Ryan J. Urbanowicz
- Department of Biostatistics and Epidemiology, Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania Philadelphia Pennsylvania
| | - Diane Gilbert‐Diamond
- Institute for Quantitative Biomedical Sciences at Dartmouth Hanover New Hampshire
- Department of Epidemiology, Geisel School of Medicine at Dartmouth Hanover New Hampshire
| | - Jason H. Moore
- Department of Biostatistics and Epidemiology, Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania Philadelphia Pennsylvania
| |
|
16
|
Yang CH, Lin YD, Chuang LY, Chen JB, Chang HW. Joint Analysis of SNP-SNP-Environment Interactions for Chronic Dialysis by an Improved Branch and Bound Algorithm. J Comput Biol 2017; 24:1212-1225. [PMID: 28876085 DOI: 10.1089/cmb.2017.0090] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/13/2022] Open
Abstract
In previous studies, both single-nucleotide polymorphism (SNP)-SNP or gene-gene (G × G) interactions and SNP-environmental factor (G × E) interactions were reported to partially account for "missing" heritability. However, (G × G) × E interactions were less commonly addressed. The purpose of this study was to develop a novel strategy to evaluate possible (G × G) × E interactions in D-loop-based chronic dialysis association. Using values from our previously published data set (704 controls and 193 cases) of 77 D-loop SNPs and 7 environmental factors (coronary heart disease, hypertension, diabetes mellitus, triglyceride, cholesterol, blood thiol, and TBARS levels), we compared the performances of G, G × G, G × E, and (G × G) × E. We found that the interactions of four individual SNPs previously associated with a significantly high risk of chronic dialysis [odds ratio (OR) = 1.56-4.93] with environmental factors (G × E) increased the risk of chronic dialysis (maximum OR = 35.43). We then used an improved branch and bound algorithm to identify combinations of two to four SNPs that were most highly associated with chronic dialysis (OR = 9.27-34.39). When the interactions of the two- and three-SNP combinations with environmental factors were evaluated, we found that the (G × G) × E effects increased the risk of chronic dialysis (maximum OR = 8.32-57.54 and OR = 12.52-57.81, respectively; adjusted OR = 8.67-81.81 and OR = 12.29-81.95, respectively). Taken together, the (G × G) × E interactions identified chronic dialysis-associated SNPs that would not have been found using G × G or G × E interactions, suggesting that (G × G) × E interactions may be helpful to solve the problems of missing heritability in association studies.
Affiliation(s)
- Cheng-Hong Yang
- Department of Electronic Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung, Taiwan
- Graduate Institute of Clinical Medicine, Kaohsiung Medical University, Kaohsiung, Taiwan
| | - Yu-Da Lin
- Department of Electronic Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung, Taiwan
| | - Li-Yeh Chuang
- Department of Chemical Engineering & Institute of Biotechnology and Chemical Engineering, I-Shou University, Kaohsiung, Taiwan
| | - Jin-Bor Chen
- Division of Nephrology, Department of Internal Medicine, Mitochondrial Research Unit, Kaohsiung Chang Gung Memorial Hospital, Chang Gung University College of Medicine, Kaohsiung, Taiwan
| | - Hsueh-Wei Chang
- Institute of Medical Science and Technology, National Sun Yat-Sen University, Kaohsiung, Taiwan
- Department of Medical Research, Kaohsiung Medical University Hospital, Kaohsiung, Taiwan
- Department of Biomedical Science and Environmental Biology, Kaohsiung Medical University, Kaohsiung, Taiwan
| |
|
17
|
Abo Alchamlat S, Farnir F. KNN-MDR: a learning approach for improving interactions mapping performances in genome wide association studies. BMC Bioinformatics 2017; 18:184. [PMID: 28327091 PMCID: PMC5361736 DOI: 10.1186/s12859-017-1599-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/06/2016] [Accepted: 03/11/2017] [Indexed: 12/30/2022] Open
Abstract
Background Finding epistatic interactions in large association studies such as genome-wide association studies (GWAS), given the large volume of genomic data now available, is a challenging and largely unsolved issue. Few previous studies could handle genome-wide data because of the intractable difficulties met in searching a combinatorially explosive search space and in statistically evaluating epistatic interactions given a limited number of samples. Our work is a contribution to this field. We propose a novel approach combining K-Nearest Neighbors (KNN) and Multifactor Dimensionality Reduction (MDR) methods for detecting gene-gene interactions as a possible alternative to existing algorithms, especially in situations where the number of involved determinants is high. After describing the approach, we compare our method (KNN-MDR) with a set of the best-performing other methods (i.e., MDR, BOOST, BHIT, MegaSNPHunter and AntEpiSeeker) for detecting interactions, using simulated data as well as real genome-wide data. Results Experimental results on both simulated data and real genome-wide data show that KNN-MDR has interesting properties in terms of accuracy and power, and that, in many cases, it significantly outperforms its recent competitors. Conclusions The presented methodology (KNN-MDR) is valuable in the context of loci and interactions mapping and can be seen as an interesting addition to the arsenal used in complex traits analyses.
Affiliation(s)
- Sinan Abo Alchamlat
- Department of Biostatistics, Faculty of Veterinary Medicine, FARAH, University of Liège, Sart Tilman B43, 4000, Liege, Belgium
| | - Frédéric Farnir
- Department of Biostatistics, Faculty of Veterinary Medicine, FARAH, University of Liège, Sart Tilman B43, 4000, Liege, Belgium.
| |
|
18
|
Abstract
BACKGROUND Detection of gene-gene interaction (GGI) is a key challenge towards solving the problem of missing heritability in genetics. The multifactor dimensionality reduction (MDR) method has been widely studied for detecting GGIs. MDR reduces the dimensionality of multiple factors by means of binary classification into high-risk (H) or low-risk (L) groups. Unfortunately, this simple binary classification does not reflect the uncertainty of the H/L classification. Thus, we proposed Fuzzy MDR to overcome the limitations of binary classification by introducing the degree of membership in two fuzzy sets, H and L. While Fuzzy MDR demonstrated higher power than MDR, its performance is highly dependent on several tuning parameters. In real applications, it is not easy to choose appropriate tuning parameter values. RESULT In this work, we propose an empirical fuzzy MDR (EF-MDR) which does not require specifying tuning parameter values. Here, we propose an empirical approach in which the membership degree is estimated directly from the data. In EF-MDR, the membership degree is estimated by the maximum likelihood estimator of the proportion of cases (controls) in each genotype combination. We also show that the balanced accuracy measure derived from this new membership function is a linear function of the standard chi-square statistic. This relationship allows us to perform the standard significance test using p-values in the MDR framework without permutation. Through two simulation studies, the power of the proposed EF-MDR is shown to be higher than those of MDR and Fuzzy MDR. We illustrate the proposed EF-MDR by analyzing Crohn's disease (CD) and bipolar disorder (BD) in the Wellcome Trust Case Control Consortium (WTCCC) dataset. CONCLUSION We propose an empirical Fuzzy MDR for detecting GGI using the maximum likelihood estimate of the proportion of cases (controls) as the membership degree of the genotype combination. The program written in R for EF-MDR is available at http://statgen.snu.ac.kr/software/EF-MDR.
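The empirical membership idea in this abstract can be illustrated with a short sketch. It assumes that the membership degree of a genotype combination in the high-risk set H is simply the observed proportion of cases in that cell, with the low-risk membership as its complement. The function name `empirical_membership` and the toy data are hypothetical; this is not the authors' R program, and the balanced-accuracy and chi-square steps mentioned in the abstract are not reproduced here.

```python
from collections import Counter


def empirical_membership(genotypes, status):
    """Return {cell: (membership_H, membership_L)} for each genotype combination.

    membership_H is the observed proportion of cases in the cell, i.e. the
    maximum likelihood estimate of a binomial proportion (assumed definition).
    """
    totals = Counter(genotypes)
    cases = Counter(g for g, s in zip(genotypes, status) if s == 1)
    return {
        cell: (cases[cell] / n, 1 - cases[cell] / n)
        for cell, n in totals.items()
    }


# Hypothetical two-SNP toy data: 1 = case, 0 = control.
genotypes = [(0, 0), (0, 0), (0, 1), (1, 1), (1, 1), (1, 1), (2, 0), (2, 0)]
status = [0, 0, 0, 1, 1, 0, 1, 1]

membership = empirical_membership(genotypes, status)
for cell, (m_high, m_low) in sorted(membership.items()):
    print(cell, round(m_high, 3), round(m_low, 3))
```

Unlike a hard H/L label, a cell with two cases and one control keeps a graded membership of about 0.67 in H, which is the uncertainty that binary MDR discards.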
Affiliation(s)
- Sangseob Leem
- Department of Statistics, Seoul National University, Seoul, 08826 South Korea
| | - Taesung Park
- Department of Statistics, Seoul National University, Seoul, 08826 South Korea
| |
|
19
|
|
20
|
Chen Q, Mao X, Zhang Z, Zhu R, Yin Z, Leng Y, Yu H, Jia H, Jiang S, Ni Z, Jiang H, Han X, Liu C, Hu Z, Wu X, Hu G, Xin D, Qi Z. SNP-SNP Interaction Analysis on Soybean Oil Content under Multi-Environments. PLoS One 2016; 11:e0163692. [PMID: 27668866 PMCID: PMC5036806 DOI: 10.1371/journal.pone.0163692] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/08/2016] [Accepted: 09/13/2016] [Indexed: 11/22/2022] Open
Abstract
Soybean oil content is one of the main quality traits. In this study, we used the multifactor dimensionality reduction (MDR) method and a soybean high-density genetic map including 5,308 markers to identify stable single nucleotide polymorphism (SNP)-SNP interactions controlling oil content in soybean across 23 environments. In total, 36,442,756 SNP-SNP interaction pairs were detected, of which 1,865 were associated with soybean oil content under multiple environments after Bonferroni correction (p < 3.55 × 10⁻¹¹). Two and 1,863 SNP-SNP interaction pairs were stable across 12 and 11 environments, respectively, which account for around 50% of the total environments. Epistasis values and contribution rates of stable interaction pairs (pairs detected in more than 2 environments) were estimated by two-way ANOVA and ranged from 0.01 to 0.89 and from 0.01 to 0.85, respectively. For some interaction pairs, one of the two loci had been identified in previous research as a major QTL without epistatic effects. The results of this study provide insights into the genetic architecture of soybean oil content and can serve as a basis for marker-assisted selection breeding.
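For orientation, the scale of an exhaustive pairwise scan and a corresponding Bonferroni cut-off can be computed as below. This is only arithmetic on the marker count quoted in the abstract. The family-wise alpha of 0.05 and the choice to count each unordered marker pair once are assumptions, and the abstract does not state how its own pair count or threshold were derived, so the numbers here need not match those reported.

```python
from math import comb


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


n_markers = 5308          # marker count quoted in the abstract
alpha = 0.05              # assumed family-wise error rate

# Number of unordered marker pairs in one exhaustive two-locus scan.
n_pairs = comb(n_markers, 2)
threshold = bonferroni_threshold(alpha, n_pairs)

print(f"pairs per scan: {n_pairs:,}")
print(f"Bonferroni threshold: {threshold:.3e}")
```

Repeating the scan in several environments multiplies the number of tests, which would tighten the threshold further if the correction were applied across all environments at once.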
Affiliation(s)
- Qingshan Chen
- College of Agriculture, Soybean biology Key Laboratory of the Ministry of Education, Northeast Agricultural University, Harbin, 150030, Heilongjiang, People’s Republic of China
| | - Xinrui Mao
- College of Agriculture, Soybean biology Key Laboratory of the Ministry of Education, Northeast Agricultural University, Harbin, 150030, Heilongjiang, People’s Republic of China
| | - Zhanguo Zhang
- College of Agriculture, Soybean biology Key Laboratory of the Ministry of Education, Northeast Agricultural University, Harbin, 150030, Heilongjiang, People’s Republic of China
| | - Rongsheng Zhu
- College of Agriculture, Soybean biology Key Laboratory of the Ministry of Education, Northeast Agricultural University, Harbin, 150030, Heilongjiang, People’s Republic of China
| | - Zhengong Yin
- College of Agriculture, Soybean biology Key Laboratory of the Ministry of Education, Northeast Agricultural University, Harbin, 150030, Heilongjiang, People’s Republic of China
- Crop Breeding Institute, Heilongjiang Academy of Agricultural Sciences, Harbin, 150086, Heilongjiang, People’s Republic of China
| | - Yue Leng
- College of Agriculture, Soybean biology Key Laboratory of the Ministry of Education, Northeast Agricultural University, Harbin, 150030, Heilongjiang, People’s Republic of China
| | - Hongxiao Yu
- College of Agriculture, Soybean biology Key Laboratory of the Ministry of Education, Northeast Agricultural University, Harbin, 150030, Heilongjiang, People’s Republic of China
| | - Huiying Jia
- College of Agriculture, Soybean biology Key Laboratory of the Ministry of Education, Northeast Agricultural University, Harbin, 150030, Heilongjiang, People’s Republic of China
| | - Shanshan Jiang
- College of Agriculture, Soybean biology Key Laboratory of the Ministry of Education, Northeast Agricultural University, Harbin, 150030, Heilongjiang, People’s Republic of China
| | - Zhongqiu Ni
- College of Agriculture, Soybean biology Key Laboratory of the Ministry of Education, Northeast Agricultural University, Harbin, 150030, Heilongjiang, People’s Republic of China
| | - Hongwei Jiang
- The Crop Research and Breeding Center of Land-Reclamation of Heilongjiang Province, Harbin, 150090, Heilongjiang, People’s Republic of China
| | - Xue Han
- The Crop Research and Breeding Center of Land-Reclamation of Heilongjiang Province, Harbin, 150090, Heilongjiang, People’s Republic of China
| | - Chunyan Liu
- The Crop Research and Breeding Center of Land-Reclamation of Heilongjiang Province, Harbin, 150090, Heilongjiang, People’s Republic of China
| | - Zhenbang Hu
- College of Agriculture, Soybean biology Key Laboratory of the Ministry of Education, Northeast Agricultural University, Harbin, 150030, Heilongjiang, People’s Republic of China
| | - Xiaoxia Wu
- College of Agriculture, Soybean biology Key Laboratory of the Ministry of Education, Northeast Agricultural University, Harbin, 150030, Heilongjiang, People’s Republic of China
| | - Guohua Hu
- The Crop Research and Breeding Center of Land-Reclamation of Heilongjiang Province, Harbin, 150090, Heilongjiang, People’s Republic of China
| | - Dawei Xin
- College of Agriculture, Soybean biology Key Laboratory of the Ministry of Education, Northeast Agricultural University, Harbin, 150030, Heilongjiang, People’s Republic of China
| | - Zhaoming Qi
- College of Agriculture, Soybean biology Key Laboratory of the Ministry of Education, Northeast Agricultural University, Harbin, 150030, Heilongjiang, People’s Republic of China
| |
|
21
|
Gene–gene interactions in the NAMPT pathway, plasma visfatin/NAMPT levels, and antihypertensive therapy responsiveness in hypertensive disorders of pregnancy. THE PHARMACOGENOMICS JOURNAL 2016; 17:427-434. [DOI: 10.1038/tpj.2016.35] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/05/2015] [Revised: 02/08/2016] [Accepted: 03/28/2016] [Indexed: 12/16/2022]
|
22
|
Software for detecting gene-gene interactions in genome wide association studies. BIOTECHNOL BIOPROC E 2015. [DOI: 10.1007/s12257-015-0064-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/23/2022]
|
23
|
Lavender N, Hein DW, Brock G, Kidd LCR. Evaluation of Oxidative Stress Response Related Genetic Variants, Pro-oxidants, Antioxidants and Prostate Cancer. AIMS MEDICAL SCIENCE 2015; 2:271-294. [PMID: 26636131 PMCID: PMC4664461 DOI: 10.3934/medsci.2015.4.271] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022] Open
Abstract
Background Oxidative stress and detoxification mechanisms have been commonly studied in Prostate Cancer (PCa) due to their function in the detoxification of potentially damaging reactive oxygen species (ROS) and carcinogens. However, findings have been either inconsistent or inconclusive. These mixed findings may, in part, relate to failure to consider interactions among oxidative stress response related genetic variants along with pro- and antioxidant factors. Methods We examined the effects of 33 genetic and 26 environmental oxidative stress and defense factors on PCa risk and disease aggressiveness among 2,286 men from the Cancer Genetic Markers of Susceptibility project (1,175 cases, 1,111 controls). Single and joint effects were analyzed using a comprehensive statistical approach involving logistic regression, multi-dimensionality reduction, and entropy graphs. Results Inheritance of one CYP2C8 rs7909236 T or two SOD2 rs2758331 A alleles was linked to a 1.3- and 1.4-fold increase in risk of developing PCa, respectively (p-value = 0.006–0.013). CYP1B1 rs1800440GG, CYP2C8 rs1058932TC, and NAT2 (rs1208GG, rs1390358CC, rs7832071TT) genotypes were associated with a 1.3- to 2.2-fold increase in aggressive PCa [p-value = 0.04–0.001, FDR 0.088–0.939]. We observed a 23% reduction in aggressive disease linked to inheritance of one or more NAT2 rs4646247 A alleles (p = 0.04, FDR = 0.405). Only three NAT2 sequence variants remained significant after adjusting for multiple hypothesis testing, namely NAT2 rs1208, rs1390358, and rs7832071. Lastly, there were no significant gene-environment or gene-gene interactions associated with PCa outcomes. Conclusions Variations in genes involved in oxidative stress and defense pathways may modify PCa. Our findings do not firmly support the role of oxidative stress genetic variants combined with lifestyle/environmental factors as modifiers of PCa and disease progression. However, additional multi-center studies poised to pool genetic and environmental data are needed to draw strong conclusions.
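The abstract reports false discovery rate (FDR) values alongside raw p-values. A minimal sketch of one common way to obtain FDR-adjusted p-values follows, assuming the Benjamini-Hochberg procedure, which the abstract does not name. The function name and the example p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four variants.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
print([round(a, 4) for a in adj])
```

A variant can have a small raw p-value and still carry a large adjusted value when many hypotheses are tested, which is consistent with the wide FDR range quoted in the abstract.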
Affiliation(s)
- Nicole Lavender
- Department of Pharmacology and Toxicology and James Graham Brown Cancer Center, University of Louisville, Louisville, KY
| | - David W Hein
- Department of Pharmacology and Toxicology and James Graham Brown Cancer Center, University of Louisville, Louisville, KY
| | - Guy Brock
- Department of Bioinformatics and Biostatistics, University of Louisville, Louisville, KY
| | - La Creis R Kidd
- Department of Pharmacology and Toxicology and James Graham Brown Cancer Center, University of Louisville, Louisville, KY
| |
|
24
|
Yang CH, Lin YD, Yang CS, Chuang LY. An efficiency analysis of high-order combinations of gene-gene interactions using multifactor-dimensionality reduction. BMC Genomics 2015; 16:489. [PMID: 26126977 PMCID: PMC4487567 DOI: 10.1186/s12864-015-1717-8] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/21/2015] [Accepted: 06/24/2015] [Indexed: 12/21/2022] Open
Abstract
Background Multifactor dimensionality reduction (MDR) is widely used to analyze interactions of genes to determine the complex relationship between diseases and polymorphisms in humans. However, the astronomical number of high-order combinations makes MDR a highly time-consuming process which can be difficult to implement for multiple tests to identify more complex interactions between genes. This study proposes a new framework, named fast MDR (FMDR), which is a greedy search strategy based on the joint effect property. Results Six models with different minor allele frequencies (MAFs) and different sample sizes were used to generate six simulation data sets. A real data set was obtained from the mitochondrial D-loop of chronic dialysis patients. Comparison of results from the simulation data and real data sets showed that FMDR identified significant gene–gene interactions with less computational complexity than MDR in high-order interaction analysis. Conclusion FMDR reduces the computational load that MDR incurs for high-order SNP combinations and can be used to evaluate the relative effects of each individual SNP on disease susceptibility. FMDR is freely available at http://bioinfo.kmu.edu.tw/FMDR.rar.
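The "astronomical number of high-order combinations" can be made concrete with a short count. The sketch below assumes an exhaustive search over all SNP subsets of a given order and a fixed number of genotype classes per SNP. The function name, the panel size of 100 SNPs, and the default of three genotypes per SNP are hypothetical; it does not implement the FMDR greedy search itself.

```python
from math import comb


def mdr_search_size(n_snps, order, genotypes_per_snp=3):
    """Return (number of SNP subsets, genotype cells per subset) for one order."""
    n_models = comb(n_snps, order)
    n_cells = genotypes_per_snp ** order
    return n_models, n_cells


# Hypothetical panel of 100 SNPs, interaction orders 1 to 4.
for order in range(1, 5):
    n_models, n_cells = mdr_search_size(100, order)
    print(f"order {order}: {n_models:,} models x {n_cells} cells")
```

Both factors grow with the interaction order, so each extra order multiplies the work and thins the samples per cell, which is the motivation for search strategies that avoid full enumeration.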
Affiliation(s)
- Cheng-Hong Yang
- Department of Electronic Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung, Taiwan.
| | - Yu-Da Lin
- Department of Electronic Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung, Taiwan.
| | - Cheng-San Yang
- Department of Plastic Surgery, Chia-Yi Christian Hospital, Chiayi, Taiwan.
| | - Li-Yeh Chuang
- Department of Chemical Engineering & Institute of Biotechnology and Chemical Engineering, I-Shou University, Kaohsiung, Taiwan.
| |
|
25
|
Gola D, Mahachie John JM, van Steen K, König IR. A roadmap to multifactor dimensionality reduction methods. Brief Bioinform 2015; 17:293-308. [PMID: 26108231 PMCID: PMC4793893 DOI: 10.1093/bib/bbv038] [Citation(s) in RCA: 56] [Impact Index Per Article: 6.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/12/2015] [Indexed: 02/02/2023] Open
Abstract
Complex diseases are defined to be determined by multiple genetic and environmental factors alone as well as in interactions. To analyze interactions in genetic data, many statistical methods have been suggested, with most of them relying on statistical regression models. Given the known limitations of classical methods, approaches from the machine-learning community have also become attractive. From this latter family, a fast-growing collection of methods emerged that are based on the Multifactor Dimensionality Reduction (MDR) approach. Since its first introduction, MDR has enjoyed great popularity in applications and has been extended and modified multiple times. Based on a literature search, we here provide a systematic and comprehensive overview of these suggested methods. The methods are described in detail, and the availability of implementations is listed. Most recent approaches offer to deal with large-scale data sets and rare variants, which is why we expect these methods to even gain in popularity.
|
26
|
MDR method for nonbinary response variable. J MULTIVARIATE ANAL 2015. [DOI: 10.1016/j.jmva.2014.11.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]
|
27
|
Rodrigues P, de Marco G, Furriol J, Mansego ML, Pineda-Alonso M, Gonzalez-Neira A, Martin-Escudero JC, Benitez J, Lluch A, Chaves FJ, Eroles P. Oxidative stress in susceptibility to breast cancer: study in Spanish population. BMC Cancer 2014; 14:861. [PMID: 25416100 PMCID: PMC4251690 DOI: 10.1186/1471-2407-14-861] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/13/2014] [Accepted: 11/14/2014] [Indexed: 11/23/2022] Open
Abstract
Background Alterations in the redox balance are involved in the origin, promotion and progression of cancer. Inter-individual differences in the regulation of oxidative stress can explain part of the variability in cancer susceptibility. The aim of this study was to evaluate whether polymorphisms in genes coding for the different systems involved in oxidative stress levels play a role in susceptibility to breast cancer. Methods We analyzed 76 single base polymorphisms located in 27 genes involved in oxidative stress regulation by SNPlex technology. First, we tested all the selected SNPs in 493 breast cancer patients and 683 controls, and we replicated the significant results in a second independent set of samples (430 patients and 803 controls). Gene-gene interactions were analyzed by the multifactor dimensionality reduction approach. Results Six polymorphisms, rs1052133 (OGG1), rs406113 and rs974334 (GPX6), rs2284659 (SOD3), rs4135225 (TXN) and rs207454 (XDH), were significant in the global analysis. The gene-gene interaction analysis demonstrated a significant four-variant interaction among rs406113 (GPX6), rs974334 (GPX6), rs1052133 (OGG1) and rs2284659 (SOD3) (p-value = 0.0008), with the high-risk genotype combination showing increased risk for breast cancer (OR = 1.75 [95% CI; 1.26-2.44]). Conclusions The results of this study indicate that different genotypes in genes of the oxidant/antioxidant pathway could affect the susceptibility to breast cancer. Furthermore, our study highlighted the importance of analyzing epistatic interactions to define more accurately the influence of genetic variants on susceptibility to breast cancer.
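The abstract summarizes the high-risk genotype combination as an odds ratio with a 95% confidence interval. A minimal sketch of how such an estimate is commonly computed from a 2×2 table, using a Wald interval on the log scale, is given below. The counts are hypothetical and are not the study's data, and the abstract does not say which interval method the authors used.

```python
from math import exp, log, sqrt


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: cases with the high-risk combination     b: controls with it
    c: cases without the combination            d: controls without it
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se)
    upper = exp(log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
or_, lower, upper = odds_ratio_ci(a=20, b=10, c=10, d=20)
print(f"OR = {or_:.2f} [95% CI {lower:.2f}-{upper:.2f}]")
```

An interval that excludes 1, as in the abstract's reported estimate, indicates an association at roughly the 5% level under this approximation.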
Affiliation(s)
- Pilar Eroles
  - INCLIVA Biomedical Research Institute, Valencia, Spain.
28
Big data analysis using modern statistical and machine learning methods in medicine. Int Neurourol J 2014; 18:50-7. [PMID: 24987556 PMCID: PMC4076480 DOI: 10.5213/inj.2014.18.2.50] [Citation(s) in RCA: 66] [Impact Index Per Article: 6.6] [Received: 06/17/2014] [Accepted: 06/20/2014] [Indexed: 11/08/2022] Open
Abstract
In this article we introduce modern statistical machine learning and bioinformatics approaches that have been used to learn statistical relationships from big data in medicine and behavioral science, which typically include clinical, genomic (and proteomic) and environmental variables. Every year, the data collected in biomedical and behavioral science get larger and more complicated. Thus, in medicine, we also need to be aware of this trend and understand the statistical tools that are available to analyze these datasets. Many statistical analyses aimed at such big datasets have been introduced recently. However, given the many different types of clinical, genomic, and environmental data, it is rather uncommon to see statistical methods that combine knowledge resulting from those different data types. To this end, we introduce big data in terms of clinical data, single nucleotide polymorphism and gene expression studies, and their interactions with the environment. We introduce the concepts of well-known regression analyses, such as linear and logistic regression, that have been widely used in clinical data analyses, and of modern statistical models, such as Bayesian networks, that have been introduced to analyze more complicated data. We also discuss how to represent the interactions among clinical, genomic, and environmental data using modern statistical models. We conclude this article with a promising modern statistical method called Bayesian networks that is suitable for analyzing big data sets that consist of different types of large clinical, genomic, and environmental data. Such statistical models from big data will provide us with a more comprehensive understanding of human physiology and disease.
29
Luizon MR, Palei ACT, Sandrim VC, Amaral LM, Machado JSR, Lacchini R, Cavalli RC, Duarte G, Tanus-Santos JE. Tissue inhibitor of matrix metalloproteinase-1 polymorphism, plasma TIMP-1 levels, and antihypertensive therapy responsiveness in hypertensive disorders of pregnancy. Pharmacogenomics J 2014; 14:535-41. [PMID: 24913092 DOI: 10.1038/tpj.2014.26] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Received: 02/27/2014] [Revised: 03/29/2014] [Accepted: 04/29/2014] [Indexed: 01/01/2023]
Abstract
Tissue inhibitor of metalloproteinase (TIMP)-1 is a major endogenous inhibitor of matrix metalloproteinase (MMP)-9, which may affect the responsiveness to therapy in hypertensive disorders of pregnancy. We examined whether the TIMP-1 polymorphism (g.-9830T>G, rs2070584) modifies plasma MMP-9 and TIMP-1 levels and the response to antihypertensive therapy in 596 pregnant women: 206 patients with preeclampsia (PE), 183 patients with gestational hypertension (GH) and 207 healthy pregnant controls. We also studied the TIMP-3 polymorphism (g.-1296T>C, rs9619311). Plasma MMP-9 and TIMP-1 levels were measured by ELISA. GH patients with the GG genotype for the TIMP-1 polymorphism had lower MMP-9 levels and MMP-9/TIMP-1 ratios than those with the TT genotype. PE patients with the TG genotype had higher TIMP-1 levels. The G allele and the GG genotype were associated with PE and responsiveness to antihypertensive therapy in PE, but not in GH. Our results suggest that the TIMP-1 g.-9830T>G polymorphism not only promotes PE but also decreases the responses to antihypertensive therapy.
Affiliation(s)
- M R Luizon
  - Department of Pharmacology, Ribeirao Preto Medical School, University of Sao Paulo, Ribeirao Preto, Sao Paulo, Brazil
- A C T Palei
  - Department of Physiology and Biophysics, University of Mississippi Medical Center, Jackson, MS, USA
- V C Sandrim
  - Department of Pharmacology, Institute of Biosciences, Universidade Estadual Paulista (UNESP), Botucatu, Sao Paulo, Brazil
- L M Amaral
  - Department of Pharmacology, Ribeirao Preto Medical School, University of Sao Paulo, Ribeirao Preto, Sao Paulo, Brazil
- J S R Machado
  - Department of Gynecology and Obstetrics, Ribeirao Preto Medical School, University of Sao Paulo, Ribeirao Preto, Sao Paulo, Brazil
- R Lacchini
  - Department of Pharmacology, Ribeirao Preto Medical School, University of Sao Paulo, Ribeirao Preto, Sao Paulo, Brazil
- R C Cavalli
  - Department of Gynecology and Obstetrics, Ribeirao Preto Medical School, University of Sao Paulo, Ribeirao Preto, Sao Paulo, Brazil
- G Duarte
  - Department of Gynecology and Obstetrics, Ribeirao Preto Medical School, University of Sao Paulo, Ribeirao Preto, Sao Paulo, Brazil
- J E Tanus-Santos
  - Department of Pharmacology, Ribeirao Preto Medical School, University of Sao Paulo, Ribeirao Preto, Sao Paulo, Brazil
30
Liu J, Calhoun VD. A review of multivariate analyses in imaging genetics. Front Neuroinform 2014; 8:29. [PMID: 24723883 PMCID: PMC3972473 DOI: 10.3389/fninf.2014.00029] [Citation(s) in RCA: 69] [Impact Index Per Article: 6.9] [Received: 10/31/2013] [Accepted: 03/04/2014] [Indexed: 12/13/2022] Open
Abstract
Recent advances in neuroimaging technology and molecular genetics provide the unique opportunity to investigate genetic influence on the variation of brain attributes. Since the year 2000, when the initial publication on brain imaging and genetics was released, imaging genetics has been a rapidly growing research approach with increasing publications every year. Several reviews have been offered to the research community focusing on various study designs. In addition to study design, analytic tools and their proper implementation are also critical to the success of a study. In this review, we survey recent publications using data from neuroimaging and genetics, focusing on methods capturing multivariate effects accommodating the large number of variables from both imaging data and genetic data. We group the analyses of genetic or genomic data into either a priori driven or data driven approach, including gene-set enrichment analysis, multifactor dimensionality reduction, principal component analysis, independent component analysis (ICA), and clustering. For the analyses of imaging data, ICA and extensions of ICA are the most widely used multivariate methods. Given detailed reviews of multivariate analyses of imaging data available elsewhere, we provide a brief summary here that includes a recently proposed method known as independent vector analysis. Finally, we review methods focused on bridging the imaging and genetic data by establishing multivariate and multiple genotype-phenotype-associations, including sparse partial least squares, sparse canonical correlation analysis, sparse reduced rank regression and parallel ICA. These methods are designed to extract latent variables from both genetic and imaging data, which become new genotypes and phenotypes, and the links between the new genotype-phenotype pairs are maximized using different cost functions. The relationship between these methods along with their assumptions, advantages, and limitations are discussed.
Affiliation(s)
- Jingyu Liu
  - The Mind Research Network and Lovelace Biomedical and Environmental Research Institute, Albuquerque, NM, USA
  - Department of Electrical and Computer Engineering, University of New Mexico, Albuquerque, NM, USA
- Vince D. Calhoun
  - The Mind Research Network and Lovelace Biomedical and Environmental Research Institute, Albuquerque, NM, USA
  - Department of Electrical and Computer Engineering, University of New Mexico, Albuquerque, NM, USA
31
Aflakparast M, Salimi H, Gerami A, Dubé MP, Visweswaran S, Masoudi-Nejad A. Cuckoo search epistasis: a new method for exploring significant genetic interactions. Heredity (Edinb) 2014; 112:666-74. [PMID: 24549111 DOI: 10.1038/hdy.2014.4] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Received: 05/07/2013] [Revised: 12/09/2013] [Accepted: 12/18/2013] [Indexed: 11/09/2022] Open
Abstract
The advent of high-throughput sequencing technology has resulted in the ability to measure millions of single-nucleotide polymorphisms (SNPs) from thousands of individuals. Although these high-dimensional data have paved the way for better understanding of the genetic architecture of common diseases, they have also given rise to challenges in developing computational methods for learning epistatic relationships among genetic markers. We propose a new method, named cuckoo search epistasis (CSE) for identifying significant epistatic interactions in population-based association studies with a case-control design. This method combines a computationally efficient Bayesian scoring function with an evolutionary-based heuristic search algorithm, and can be efficiently applied to high-dimensional genome-wide SNP data. The experimental results from synthetic data sets show that CSE outperforms existing methods including multifactorial dimensionality reduction and Bayesian epistasis association mapping. In addition, on a real genome-wide data set related to Alzheimer's disease, CSE identified SNPs that are consistent with previously reported results, and show the utility of CSE for application to genome-wide data.
Affiliation(s)
- M Aflakparast
  - Laboratory of Systems Biology and Bioinformatics (LBB), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
  - Department of Mathematics, Faculty of Sciences, VU University, Amsterdam, The Netherlands
- H Salimi
  - Department of Computer Science, University of Tehran, Tehran, Iran
- A Gerami
  - Department of Statistics and Mathematics, Islamic Azad University, Qazvin Branch, Qazvin, Iran
- M-P Dubé
  - Department of Medicine, Faculty of Medicine, University of Montreal, Montreal, Quebec, Canada
- S Visweswaran
  - Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA
- A Masoudi-Nejad
  - Laboratory of Systems Biology and Bioinformatics (LBB), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
32
Yang CH, Lin YD, Chuang LY, Chen JB, Chang HW. MDR-ER: balancing functions for adjusting the ratio in risk classes and classification errors for imbalanced cases and controls using multifactor-dimensionality reduction. PLoS One 2013; 8:e79387. [PMID: 24236125 PMCID: PMC3827354 DOI: 10.1371/journal.pone.0079387] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.5] [Received: 06/10/2013] [Accepted: 09/20/2013] [Indexed: 12/25/2022] Open
Abstract
Background: Determining the complex relationship between diseases, polymorphisms in human genes and environmental factors is challenging. Multifactor dimensionality reduction (MDR) has proven capable of effectively detecting statistical patterns of epistasis. However, MDR is weak at accurately assigning multi-locus genotypes to either high-risk or low-risk groups, and generally does not provide accurate error rates when the case and control data sets are imbalanced. Consequently, classification error rates and odds ratios (OR) may take surprising values because the true positive (TP) value is often small. Methodology/Principal Findings: To address this problem, we introduce a classifier function based on the ratio between the percentage of cases in case data and the percentage of controls in control data to improve MDR (MDR-ER), so that multi-locus genotypes are classified correctly into high-risk and low-risk groups. In this study, a real data set with different ratios of cases to controls (1:4) was obtained from the mitochondrial D-loop of chronic dialysis patients in order to test MDR-ER. The TP and TN values were collected from all tests to analyze to what degree MDR-ER performed better than MDR. Conclusions/Significance: Results showed that MDR-ER can be successfully used to detect the complex associations in imbalanced data sets.
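The classifier rule described in this abstract can be sketched as follows. This is only a minimal illustration, assuming that a genotype combination is labeled high-risk when its share of all cases is at least as large as its share of all controls; the function name `label_cells` and the toy counts are hypothetical and are not taken from the paper.

```python
from collections import Counter

def label_cells(genotypes, status):
    """Label each multi-locus genotype combination as 'high' or 'low' risk.

    genotypes: one hashable genotype combination per subject
    status:    1 for a case, 0 for a control (same length as genotypes)
    Assumed rule: 'high' if (cases in cell / all cases) >=
    (controls in cell / all controls), otherwise 'low'.
    """
    cases = Counter(g for g, s in zip(genotypes, status) if s == 1)
    controls = Counter(g for g, s in zip(genotypes, status) if s == 0)
    n_cases = sum(cases.values())
    n_controls = sum(controls.values())
    if n_cases == 0 or n_controls == 0:
        raise ValueError("need at least one case and one control")
    labels = {}
    for cell in set(cases) | set(controls):
        case_share = cases[cell] / n_cases
        control_share = controls[cell] / n_controls
        labels[cell] = "high" if case_share >= control_share else "low"
    return labels

# Made-up data with a 1:4 case-to-control ratio (5 cases, 20 controls).
genotypes = ([("AA", "GG")] * 3 + [("AG", "GG")] * 2      # cases
             + [("AA", "GG")] * 6 + [("AG", "GG")] * 14)  # controls
status = [1] * 5 + [0] * 20
labels = label_cells(genotypes, status)
```

In this toy data the cell `("AA", "GG")` holds 3 of 5 cases and 6 of 20 controls, so the share-based rule labels it high-risk, whereas a raw within-cell cases-to-controls threshold of 1 would label it low-risk (3/6 < 1). That contrast is the kind of imbalance effect the abstract refers to.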
Affiliation(s)
- Cheng-Hong Yang
  - Department of Electronic Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung, Taiwan
- Yu-Da Lin
  - Department of Electronic Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung, Taiwan
- Li-Yeh Chuang
  - Department of Chemical Engineering and Institute of Biotechnology and Chemical Engineering, I-Shou University, Kaohsiung, Taiwan
- Jin-Bor Chen
  - Division of Nephrology, Department of Internal Medicine, Mitochondrial Research Unit, Kaohsiung Chang Gung Memorial Hospital, Chang Gung University College of Medicine, Kaohsiung, Taiwan
- Hsueh-Wei Chang
  - Department of Biomedical Science and Environmental Biology, Kaohsiung Medical University, Taiwan
  - Cancer Center, Kaohsiung Medical University Hospital, Kaohsiung Medical University, Kaohsiung, Taiwan
33
Gui J, Moore JH, Williams SM, Andrews P, Hillege HL, van der Harst P, Navis G, Van Gilst WH, Asselbergs FW, Gilbert-Diamond D. A Simple and Computationally Efficient Approach to Multifactor Dimensionality Reduction Analysis of Gene-Gene Interactions for Quantitative Traits. PLoS One 2013; 8:e66545. [PMID: 23805232 PMCID: PMC3689797 DOI: 10.1371/journal.pone.0066545] [Citation(s) in RCA: 59] [Impact Index Per Article: 5.4] [Received: 10/26/2012] [Accepted: 05/07/2013] [Indexed: 12/03/2022] Open
Abstract
We present an extension of the two-class multifactor dimensionality reduction (MDR) algorithm that enables detection and characterization of epistatic SNP-SNP interactions in the context of a quantitative trait. The proposed Quantitative MDR (QMDR) method handles continuous data by modifying MDR’s constructive induction algorithm to use a T-test. QMDR replaces the balanced accuracy metric with a T-test statistic as the score to determine the best interaction model. We used a simulation to identify the empirical distribution of QMDR’s testing score. We then applied QMDR to genetic data from the ongoing prospective Prevention of Renal and Vascular End-Stage Disease (PREVEND) study.
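A rough sketch of the scoring idea in this abstract, under stated assumptions: each genotype combination is labeled high or low by comparing its mean trait value with the overall mean, and the model score is the t-statistic between the pooled high and low groups. The labeling rule, the pooled-variance form of the t-statistic, the function name `qmdr_score`, and the toy data are assumptions made for illustration; the abstract itself only states that a T-test replaces balanced accuracy.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, variance

def qmdr_score(genotypes, trait):
    """Return (labels, t) for one candidate SNP combination.

    genotypes: one hashable genotype combination per subject
    trait:     one quantitative trait value per subject
    """
    overall = mean(trait)
    cells = defaultdict(list)
    for g, y in zip(genotypes, trait):
        cells[g].append(y)
    # Assumed rule: a cell is 'high' when its mean exceeds the overall mean.
    labels = {g: ("high" if mean(ys) > overall else "low")
              for g, ys in cells.items()}
    high = [y for g, y in zip(genotypes, trait) if labels[g] == "high"]
    low = [y for g, y in zip(genotypes, trait) if labels[g] == "low"]
    if len(high) < 2 or len(low) < 2:
        raise ValueError("each group needs at least two observations")
    # Pooled-variance two-sample t-statistic between the two groups.
    n1, n2 = len(high), len(low)
    pooled = ((n1 - 1) * variance(high) + (n2 - 1) * variance(low)) / (n1 + n2 - 2)
    t = (mean(high) - mean(low)) / sqrt(pooled * (1 / n1 + 1 / n2))
    return labels, t

# Made-up single-locus example with six subjects.
genotypes = ["AA", "AA", "AB", "AB", "BB", "BB"]
trait = [1.0, 2.0, 2.0, 3.0, 5.0, 6.0]
labels, t = qmdr_score(genotypes, trait)
```

Under these assumptions, candidate SNP combinations would be ranked by `t`, with the largest value taken as the best interaction model.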
Affiliation(s)
- Jiang Gui
  - Institute for Quantitative Biomedical Sciences, Geisel School of Medicine, Lebanon, New Hampshire, United States of America
  - Section of Biostatistics and Epidemiology, Departments of Community and Family Medicine, Geisel School of Medicine, Lebanon, New Hampshire, United States of America
- Jason H. Moore
  - Institute for Quantitative Biomedical Sciences, Geisel School of Medicine, Lebanon, New Hampshire, United States of America
  - Section of Biostatistics and Epidemiology, Departments of Community and Family Medicine, Geisel School of Medicine, Lebanon, New Hampshire, United States of America
  - Department of Genetics, Geisel School of Medicine, Lebanon, New Hampshire, United States of America
- Scott M. Williams
  - Institute for Quantitative Biomedical Sciences, Geisel School of Medicine, Lebanon, New Hampshire, United States of America
  - Department of Genetics, Geisel School of Medicine, Lebanon, New Hampshire, United States of America
- Peter Andrews
  - Institute for Quantitative Biomedical Sciences, Geisel School of Medicine, Lebanon, New Hampshire, United States of America
- Hans L. Hillege
  - Department of Cardiology, University Medical Center Groningen, Groningen, The Netherlands
- Pim van der Harst
  - Department of Cardiology, University Medical Center Groningen, Groningen, The Netherlands
- Gerjan Navis
  - Department of Nephrology, University Medical Center Groningen, Groningen, The Netherlands
- Wiek H. Van Gilst
  - Department of Cardiology, University Medical Center Groningen, Groningen, The Netherlands
- Folkert W. Asselbergs
  - Department of Cardiology, Division of Heart and Lungs, University Medical Center Utrecht, Utrecht, The Netherlands
- Diane Gilbert-Diamond
  - Institute for Quantitative Biomedical Sciences, Geisel School of Medicine, Lebanon, New Hampshire, United States of America
  - Section of Biostatistics and Epidemiology, Departments of Community and Family Medicine, Geisel School of Medicine, Lebanon, New Hampshire, United States of America
34
Rodrigues P, Furriol J, Tormo E, Ballester S, Lluch A, Eroles P. Epistatic interaction of Arg72Pro TP53 and −710 C/T VEGFR1 polymorphisms in breast cancer: predisposition and survival. Mol Cell Biochem 2013; 379:181-90. [DOI: 10.1007/s11010-013-1640-8] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 12/18/2012] [Accepted: 03/28/2013] [Indexed: 01/30/2023]
35
Pathway-based genetic analysis of preterm birth. Genomics 2013; 101:163-70. [PMID: 23298525 DOI: 10.1016/j.ygeno.2012.12.005] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.3] [Received: 08/28/2012] [Revised: 12/17/2012] [Accepted: 12/25/2012] [Indexed: 01/06/2023]
Abstract
The rate of preterm birth in the United States is now 12%. Multiple genes, gene networks, and variants have been associated with this disease. Using a custom database for preterm birth (dbPTB) with a refined set of genes extensively curated from the literature and biological databases, we analyzed GWAS of preterm birth with complete genotype data on nearly 2000 preterm and term mothers. We used both the curated genes and a genome-wide approach to carry out a pathway-based analysis. Nineteen significant pathways that withstood FDR correction for multiple testing were identified using both the curated genes and the genome-wide approach. The analysis based on the curated genes was more significant than the genome-wide analysis in 15 of the 19 pathways. This approach demonstrates the use of a validated set of genes, in the analysis of otherwise unsuccessful GWAS data, to identify gene-gene interactions in a way that enhances statistical power and discovery.
36
Abstract
Genome-wide association studies (GWASs) and other high-throughput initiatives have led to an information explosion in human genetics and genetic epidemiology. Conversion of this wealth of new information about genomic variation to knowledge about public health and human biology will depend critically on the complexity of the genotype to phenotype mapping relationship. We review here computational approaches to genetic analysis that embrace, rather than ignore, the complexity of human health. We focus on multifactor dimensionality reduction (MDR) as an approach for modeling one of these complexities: epistasis or gene-gene interaction.
Affiliation(s)
- Qinxin Pan
  - Computational Genetics Laboratory, Dartmouth Medical School, Dartmouth College, Lebanon, NH, USA
37
Papathomas M, Molitor J, Hoggart C, Hastie D, Richardson S. Exploring data from genetic association studies using Bayesian variable selection and the Dirichlet process: application to searching for gene × gene patterns. Genet Epidemiol 2012; 36:663-74. [PMID: 22851500 DOI: 10.1002/gepi.21661] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.0] [Received: 01/12/2012] [Revised: 05/16/2012] [Accepted: 06/08/2012] [Indexed: 11/09/2022]
Abstract
We construct data exploration tools for recognizing important covariate patterns associated with a phenotype, with particular focus on searching for association with gene-gene patterns. To this end, we propose a new variable selection procedure that employs latent selection weights and compare it to an alternative formulation. The selection procedures are implemented in tandem with a Dirichlet process mixture model for the flexible clustering of genetic and epidemiological profiles. We illustrate our approach with the aid of simulated data and the analysis of a real data set from a genome-wide association study.
Affiliation(s)
- Michail Papathomas
  - School of Mathematics and Statistics, University of St Andrews, Scotland, United Kingdom.
38
Silva PS, Fontana V, Luizon MR, Lacchini R, Silva WA, Biagi C, Tanus-Santos JE. eNOS and BDKRB2 genotypes affect the antihypertensive responses to enalapril. Eur J Clin Pharmacol 2012; 69:167-77. [DOI: 10.1007/s00228-012-1326-2] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.6] [Received: 04/02/2012] [Accepted: 05/24/2012] [Indexed: 12/13/2022]
39
Upstill-Goddard R, Eccles D, Fliege J, Collins A. Machine learning approaches for the discovery of gene-gene interactions in disease data. Brief Bioinform 2012; 14:251-60. [DOI: 10.1093/bib/bbs024] [Citation(s) in RCA: 69] [Impact Index Per Article: 5.8] [Indexed: 01/17/2023] Open
40
Epistasis among eNOS, MMP-9 and VEGF maternal genotypes in hypertensive disorders of pregnancy. Hypertens Res 2012; 35:917-21. [DOI: 10.1038/hr.2012.60] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Indexed: 11/08/2022]
41
Uzun A, Sharma S, Padbury J. A bioinformatics approach to preterm birth. Am J Reprod Immunol 2012; 67:273-7. [PMID: 22385126 DOI: 10.1111/j.1600-0897.2012.01122.x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 12/27/2011] [Accepted: 02/06/2012] [Indexed: 01/06/2023] Open
Abstract
A vast body of literature has suggested genetic programming of preterm birth. However, there is a complete lack of an organized analysis and stratification of genetic variants that may indeed be involved in the pathogenesis of preterm birth. We developed a novel bioinformatics approach to identify the nominal genetic variants associated with preterm birth. We used semantic data mining to extract all published articles related to preterm birth. Genes identified from public databases and archives of expression arrays were aggregated with genes curated from the literature. Pathway analysis was used to impute genes from pathways identified in the curations. The curated articles and collected genetic information are available in a web-based tool, the database for preterm birth (dbPTB) that forms a unique resource for investigators interested in preterm birth.
Affiliation(s)
- Alper Uzun
  - Department of Pediatrics, Women and Infants Hospital of Rhode Island, Providence, USA
42
Uzun A, Laliberte A, Parker J, Andrew C, Winterrowd E, Sharma S, Istrail S, Padbury JF. dbPTB: a database for preterm birth. Database (Oxford) 2012; 2012:bar069. [PMID: 22323062 PMCID: PMC3275764 DOI: 10.1093/database/bar069] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.8] [Received: 10/03/2011] [Revised: 12/28/2011] [Accepted: 12/29/2011] [Indexed: 02/07/2023]
Abstract
Genome-wide association studies (GWAS) query the entire genome in a hypothesis-free, unbiased manner. Since they have the potential for identifying novel genetic variants, they have become a very popular approach to the investigation of complex diseases. Nonetheless, since the success of the GWAS approach varies widely, the identification of genetic variants for complex diseases remains a difficult problem. We developed a novel bioinformatics approach to identify the nominal genetic variants associated with complex diseases. To test the feasibility of our approach, we developed a web-based aggregation tool to organize the genes, genetic variations and pathways involved in preterm birth. We used semantic data mining to extract all published articles related to preterm birth. All articles were reviewed by a team of curators. Genes identified from public databases and archives of expression arrays were aggregated with genes curated from the literature. Pathway analysis was used to impute genes from pathways identified in the curations. The curated articles and collected genetic information form a unique resource for investigators interested in preterm birth. The Database for Preterm Birth exemplifies an approach that is generalizable to other disorders for which there is evidence of significant genetic contributions.
Affiliation(s)
- Alper Uzun
  - Department of Pediatrics, Women & Infants Hospital of Rhode Island, Providence, RI 02905, USA, Center for Computational Molecular Biology, Brown University, Providence, RI 02912, USA and Brown Alpert Medical School, Providence, RI 02912, USA
- Alyse Laliberte
  - Department of Pediatrics, Women & Infants Hospital of Rhode Island, Providence, RI 02905, USA, Center for Computational Molecular Biology, Brown University, Providence, RI 02912, USA and Brown Alpert Medical School, Providence, RI 02912, USA
- Jeremy Parker
  - Department of Pediatrics, Women & Infants Hospital of Rhode Island, Providence, RI 02905, USA, Center for Computational Molecular Biology, Brown University, Providence, RI 02912, USA and Brown Alpert Medical School, Providence, RI 02912, USA
- Caroline Andrew
  - Department of Pediatrics, Women & Infants Hospital of Rhode Island, Providence, RI 02905, USA, Center for Computational Molecular Biology, Brown University, Providence, RI 02912, USA and Brown Alpert Medical School, Providence, RI 02912, USA
- Emily Winterrowd
  - Department of Pediatrics, Women & Infants Hospital of Rhode Island, Providence, RI 02905, USA, Center for Computational Molecular Biology, Brown University, Providence, RI 02912, USA and Brown Alpert Medical School, Providence, RI 02912, USA
- Surendra Sharma
  - Department of Pediatrics, Women & Infants Hospital of Rhode Island, Providence, RI 02905, USA, Center for Computational Molecular Biology, Brown University, Providence, RI 02912, USA and Brown Alpert Medical School, Providence, RI 02912, USA
- Sorin Istrail
  - Department of Pediatrics, Women & Infants Hospital of Rhode Island, Providence, RI 02905, USA, Center for Computational Molecular Biology, Brown University, Providence, RI 02912, USA and Brown Alpert Medical School, Providence, RI 02912, USA
- James F. Padbury
  - Department of Pediatrics, Women & Infants Hospital of Rhode Island, Providence, RI 02905, USA, Center for Computational Molecular Biology, Brown University, Providence, RI 02912, USA and Brown Alpert Medical School, Providence, RI 02912, USA
43
Xie M, Li J, Jiang T. Detecting genome-wide epistases based on the clustering of relatively frequent items. Bioinformatics 2011; 28:5-12. [PMID: 22053078 DOI: 10.1093/bioinformatics/btr603] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.0] [Indexed: 01/17/2023]
Abstract
MOTIVATION In genome-wide association studies (GWAS), up to millions of single nucleotide polymorphisms (SNPs) are genotyped for thousands of individuals. However, conventional single locus-based approaches are usually unable to detect gene-gene interactions underlying complex diseases. Due to the huge search space for complicated high order interactions, many existing multi-locus approaches are slow and may suffer from low detection power for GWAS. RESULTS In this article, we develop a simple, fast and effective algorithm to detect genome-wide multi-locus epistatic interactions based on the clustering of relatively frequent items. Extensive experiments on simulated data show that our algorithm is fast and more powerful in general than some recently proposed methods. On a real genome-wide case-control dataset for age-related macular degeneration (AMD), the algorithm has identified genotype combinations that are significantly enriched in the cases. AVAILABILITY http://www.cs.ucr.edu/~minzhux/EDCF.zip CONTACT minzhux@cs.ucr.edu; jingli@cwru.edu SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Minzhu Xie
  - Department of Computer Science and Engineering, University of California, Riverside, CA 92521, USA
44
Gilbert-Diamond D, Moore JH. Analysis of gene-gene interactions. Curr Protoc Hum Genet 2011; Chapter 1:Unit1.14. [PMID: 21735376 PMCID: PMC4086055 DOI: 10.1002/0471142905.hg0114s70] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.5] [Indexed: 11/06/2022]
Abstract
The goal of this unit is to introduce gene-gene interactions (epistasis) as a significant complicating factor in the search for disease susceptibility genes. This unit begins with an overview of gene-gene interactions and why they are likely to be common. Then, it reviews several statistical and computational methods for detecting and characterizing genes with effects that are dependent on other genes. The focus of this unit is genetic association studies of discrete and quantitative traits because most of the methods for detecting gene-gene interactions have been developed specifically for these study designs.
Affiliation(s)
- Diane Gilbert-Diamond
  - Computational Genetics Laboratory, Departments of Genetics and Community and Family Medicine, Dartmouth Medical School, Lebanon, New Hampshire, USA
45
Winham SJ, Slater AJ, Motsinger-Reif AA. A comparison of internal validation techniques for multifactor dimensionality reduction. BMC Bioinformatics 2010; 11:394. [PMID: 20650002 PMCID: PMC2920275 DOI: 10.1186/1471-2105-11-394] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.4] [Received: 08/14/2009] [Accepted: 07/22/2010] [Indexed: 11/24/2022] Open
Abstract
BACKGROUND It is hypothesized that common, complex diseases may be due to complex interactions between genetic and environmental factors, which are difficult to detect in high-dimensional data using traditional statistical approaches. Multifactor Dimensionality Reduction (MDR) is the most commonly used data-mining method to detect epistatic interactions. In all data-mining methods, it is important to consider internal validation procedures to obtain prediction estimates to prevent model over-fitting and reduce potential false positive findings. Currently, MDR utilizes cross-validation for internal validation. In this study, we incorporate the use of a three-way split (3WS) of the data in combination with a post-hoc pruning procedure as an alternative to cross-validation for internal model validation to reduce computation time without impairing performance. We compare the power to detect true disease causing loci using MDR with both 5- and 10-fold cross-validation to MDR with 3WS for a range of single-locus and epistatic disease models. Additionally, we analyze a dataset in HIV immunogenetics to demonstrate the results of the two strategies on real data. RESULTS MDR with 3WS is computationally approximately five times faster than 5-fold cross-validation. The power to find the exact true disease loci without detecting false positive loci is higher with 5-fold cross-validation than with 3WS before pruning. However, the power to find the true disease causing loci in addition to false positive loci is equivalent to the 3WS. With the incorporation of a pruning procedure after the 3WS, the power of the 3WS approach to detect only the exact disease loci is equivalent to that of MDR with cross-validation. In the real data application, the cross-validation and 3WS analyses indicate the same two-locus model. CONCLUSIONS Our results reveal that the performance of the two internal validation methods is equivalent with the use of pruning procedures. 
The specific pruning procedure should be chosen understanding the trade-off between identifying all relevant genetic effects but including false positives and missing important genetic factors. This implies 3WS may be a powerful and computationally efficient approach to screen for epistatic effects, and could be used to identify candidate interactions in large-scale genetic studies.
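The three-way split described in this abstract can be illustrated with a small sketch. It assumes the samples are shuffled once and divided into disjoint training, testing and validation subsets; the 50/25/25 proportions, the function name `three_way_split`, and the way the subsets would be used (fit candidate models on the training set, narrow them on the testing set, confirm the final model on the validation set) are assumptions for illustration rather than details stated in the abstract.

```python
import random

def three_way_split(n_samples, fractions=(0.5, 0.25, 0.25), seed=0):
    """Shuffle sample indices and split them into three disjoint subsets.

    Returns (train, test, validation) lists of indices. The default
    fractions are illustrative only.
    """
    if len(fractions) != 3 or abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must be three values that sum to 1")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    n_train = int(n_samples * fractions[0])
    n_test = int(n_samples * fractions[1])
    train = indices[:n_train]
    test = indices[n_train:n_train + n_test]
    validation = indices[n_train + n_test:]  # remainder
    return train, test, validation

train, test, validation = three_way_split(100)
```

Because every candidate model is fit once on a single training subset instead of once per fold, a split of this kind is consistent with the roughly fivefold speed-up over 5-fold cross-validation reported above.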
Affiliation(s)
- Stacey J Winham
  - Department of Statistics, North Carolina State University, Raleigh, NC 27695, USA
- Andrew J Slater
  - Bioinformatics Research Center, North Carolina State University, Raleigh, NC 27695, USA
  - Department of Genetics, North Carolina State University, Raleigh, NC 27695, USA
- Alison A Motsinger-Reif
  - Department of Statistics, North Carolina State University, Raleigh, NC 27695, USA
  - Bioinformatics Research Center, North Carolina State University, Raleigh, NC 27695, USA