1
Lee E, Ibrahim JG, Zhu H. Bayesian bi-level variable selection for genome-wide survival study. Genomics Inform 2023; 21:e28. [PMID: 37813624 PMCID: PMC10584651 DOI: 10.5808/gi.23047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2023] [Revised: 06/26/2023] [Accepted: 06/27/2023] [Indexed: 10/11/2023] Open
Abstract
Mild cognitive impairment (MCI) is a clinical syndrome characterized by the onset and evolution of cognitive impairments, often considered a transitional stage to Alzheimer's disease (AD). The genetic traits of MCI patients who experience a rapid progression to AD can enhance early diagnosis capabilities and facilitate drug discovery for AD. While a genome-wide association study (GWAS) is a standard tool for identifying single nucleotide polymorphisms (SNPs) related to a disease, it fails to detect SNPs with small effect sizes due to stringent control for multiple testing. Additionally, the method does not consider the group structures of SNPs, such as genes or linkage disequilibrium blocks, which can provide valuable insights into the genetic architecture. To address the limitations, we propose a Bayesian bi-level variable selection method that detects SNPs associated with time of conversion from MCI to AD. Our approach integrates group inclusion indicators into an accelerated failure time model to identify important SNP groups. Additionally, we employ data augmentation techniques to impute censored time values using a predictive posterior. We adapt Dirichlet-Laplace shrinkage priors to incorporate the group structure for SNP-level variable selection. In the simulation study, our method outperformed other competing methods regarding variable selection. The analysis of Alzheimer's Disease Neuroimaging Initiative (ADNI) data revealed several genes directly or indirectly related to AD, whereas a classical GWAS did not identify any significant SNPs.
Affiliation(s)
- Eunjee Lee
  - Department of Information and Statistics, Chungnam National University, Daejeon 34134, Korea
- Joseph G. Ibrahim
  - Department of Biostatistics, University of North Carolina, Chapel Hill, NC 27599, USA
- Hongtu Zhu
  - Department of Biostatistics, University of North Carolina, Chapel Hill, NC 27599, USA
2
Zhou F, Lu X, Ren J, Fan K, Ma S, Wu C. Sparse group variable selection for gene-environment interactions in the longitudinal study. Genet Epidemiol 2022; 46:317-340. [PMID: 35766061 DOI: 10.1002/gepi.22461] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2021] [Revised: 01/31/2022] [Accepted: 03/15/2022] [Indexed: 11/06/2022]
Abstract
Penalized variable selection for high-dimensional longitudinal data has received much attention as it can account for the correlation among repeated measurements while providing additional and essential information for improved identification and prediction performance. Despite this success, the potential of penalization methods for accommodating structured sparsity in longitudinal studies is far from fully understood. In this article, we develop a sparse group penalization method to conduct the bi-level gene-environment (G × E) interaction study under the repeatedly measured phenotype. Within the quadratic inference function framework, the proposed method can achieve simultaneous identification of main and interaction effects on both the group and individual levels. Simulation studies have shown that the proposed method outperforms major competitors. In the case study of asthma data from the Childhood Asthma Management Program, we conduct a G × E study by using high-dimensional single nucleotide polymorphism data as genetic factors and the longitudinal trait, forced expiratory volume in 1 s, as the phenotype. Our method leads to improved prediction and identification of main and interaction effects with important implications.
Affiliation(s)
- Fei Zhou
  - Department of Statistics, Kansas State University, Manhattan, Kansas, 66506, USA
- Xi Lu
  - Department of Statistics, Kansas State University, Manhattan, Kansas, 66506, USA
- Jie Ren
  - Department of Biostatistics and Health Data Sciences, Indiana University School of Medicine, Indianapolis, Indiana, 46202, USA
- Kun Fan
  - Department of Statistics, Kansas State University, Manhattan, Kansas, 66506, USA
- Shuangge Ma
  - Department of Biostatistics, Yale University, New Haven, Connecticut, 06520, USA
- Cen Wu
  - Department of Statistics, Kansas State University, Manhattan, Kansas, 66506, USA
3
Hu X, Meng Z. Using potential variable to study gene-gene and gene-environment interaction effects with genetic model uncertainty. Ann Hum Genet 2022; 86:257-267. [PMID: 35582845 DOI: 10.1111/ahg.12470] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2021] [Revised: 03/02/2022] [Accepted: 04/08/2022] [Indexed: 11/28/2022]
Abstract
One of the critical issues in genetic association studies is to evaluate the risk of a disease associated with gene-gene or gene-environment interactions. The commonly employed procedures are derived by assigning a particular set of scores to genotypes. However, the underlying genetic models of inheritance are rarely known in practice. Misspecifying a genetic model may result in power loss. By using some potential genetic variables to separate the genotype coding and genetic model parameter, we construct a model-embedded score test (MEST). Our test is free of the assumption of gene-environment independence and allows for covariates in the model. An effective sequential optimization algorithm is developed. Extensive simulations show the proposed MEST is robust and powerful in most scenarios. Finally, we apply the proposed method to rheumatoid arthritis data from the Genetic Analysis Workshop 16 to further investigate the potential interaction effects.
Affiliation(s)
- Xiaonan Hu
  - NCMIS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
- Zhen Meng
  - School of Statistics, Capital University of Economics and Business, Beijing, China
4
Akond Z, Ahsan MA, Alam M, Mollah MNH. Robustification of GWAS to explore effective SNPs addressing the challenges of hidden population stratification and polygenic effects. Sci Rep 2021; 11:13060. [PMID: 34158546 PMCID: PMC8219685 DOI: 10.1038/s41598-021-90774-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/06/2020] [Accepted: 05/12/2021] [Indexed: 11/24/2022] Open
Abstract
Genome-wide association studies (GWAS) play a vital role in identifying important genes that are associated with the phenotypic variations of living organisms. There are several statistical methods for GWAS, including the linear mixed model (LMM), which is popular for addressing the challenges of hidden population stratification and polygenic effects. However, most of these methods, including LMM, are sensitive to phenotypic outliers, which may lead to misleading results. To overcome this problem, we propose a way to robustify the LMM approach by reducing the influence of outlying observations using the β-divergence method. The performance of the proposed method was investigated using both synthetic and real data analysis. Simulation results showed that the proposed method performs better than both the linear regression model (LRM) and LMM approaches in terms of power and false discovery rate in the presence of phenotypic outliers. In the absence of outliers, the proposed method performed similarly to the LMM approach and much better than the LRM approach. In the real data analysis, our proposed method identified 11 SNPs that are significantly associated with rice flowering time. Among the identified candidate SNPs, some were involved in seed development and flowering time pathways, and some were connected with flower and other developmental processes. These candidate SNPs could assist rice breeding programs. Thus, our findings highlight the importance of robust GWAS in identifying candidate genes.
Affiliation(s)
- Zobaer Akond
  - Bioinformatics Lab, Department of Statistics, University of Rajshahi, Rajshahi, 6205, Bangladesh
  - Institute of Environmental Science, University of Rajshahi, Rajshahi, 6205, Bangladesh
  - Agricultural Statistics and ICT Division, Bangladesh Agricultural Research Institute (BARI), Gazipur, 1701, Bangladesh
- Md Asif Ahsan
  - Bioinformatics Lab, Department of Statistics, University of Rajshahi, Rajshahi, 6205, Bangladesh
- Munirul Alam
  - Molecular Ecology and Metagenomic Laboratory, Infectious Diseases Division, International Centre for Diarrheal Disease Research (Icddr,b), Rajshahi, Bangladesh
- Md Nurul Haque Mollah
  - Bioinformatics Lab, Department of Statistics, University of Rajshahi, Rajshahi, 6205, Bangladesh
5
Liu Y, Sun W, Reiner AP, Kooperberg C, He Q. Statistical inference of genetic pathway analysis in high dimensions. Biometrika 2019; 106:651. [PMID: 31427824 DOI: 10.1093/biomet/asz033] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 07/03/2017] [Indexed: 11/13/2022] Open
Abstract
Genetic pathway analysis has become an important tool for investigating the association between a group of genetic variants and traits. With dense genotyping and extensive imputation, the number of genetic variants in biological pathways has increased considerably and sometimes exceeds the sample size n. Conducting genetic pathway analysis and statistical inference in such settings is challenging. We introduce an approach that can handle pathways whose dimension p could be greater than n. Our method can be used to detect pathways that have nonsparse weak signals, as well as pathways that have sparse but stronger signals. We establish the asymptotic distribution for the proposed statistic and conduct theoretical analysis on its power. Simulation studies show that our test has correct Type I error control and is more powerful than existing approaches. An application to a genome-wide association study of high-density lipoproteins demonstrates the proposed approach.
Affiliation(s)
- Yang Liu
  - Department of Mathematics and Statistics, Wright State University, 3640 Colonel Glenn Highway, Dayton, Ohio, U.S.A
- Wei Sun
  - Public Health Sciences Division, Fred Hutchinson Cancer Research Center, 1100 Fairview Avenue North, Seattle, Washington, U.S.A
- Alexander P Reiner
  - Public Health Sciences Division, Fred Hutchinson Cancer Research Center, 1100 Fairview Avenue North, Seattle, Washington, U.S.A
- Charles Kooperberg
  - Public Health Sciences Division, Fred Hutchinson Cancer Research Center, 1100 Fairview Avenue North, Seattle, Washington, U.S.A
- Qianchuan He
  - Public Health Sciences Division, Fred Hutchinson Cancer Research Center, 1100 Fairview Avenue North, Seattle, Washington, U.S.A
6
Chen G, Yuan A, Cai T, Li CM, Bentley AR, Zhou J, Shriner DN, Adeyemo AA, Rotimi CN. Measuring gene-gene interaction using Kullback-Leibler divergence. Ann Hum Genet 2019; 83:405-417. [PMID: 31206606 DOI: 10.1111/ahg.12324] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 08/31/2018] [Revised: 03/30/2019] [Accepted: 04/12/2019] [Indexed: 12/29/2022]
Abstract
Genome-wide association studies (GWAS) are used to investigate genetic variants contributing to complex traits. Despite discovering many loci, a large proportion of "missing" heritability remains unexplained. Gene-gene interactions may help explain some of this gap. Traditionally, gene-gene interactions have been evaluated using parametric statistical methods such as linear and logistic regression, with multifactor dimensionality reduction (MDR) used to address sparseness of data in high dimensions. We propose a method for the analysis of gene-gene interactions across independent single-nucleotide polymorphisms (SNPs) in two genes. Typical methods for this problem use statistics based on an asymptotic chi-squared mixture distribution, which is not easy to use. Here, we propose a Kullback-Leibler-type statistic, which follows an asymptotic, positive, normal distribution under the null hypothesis of no relationship between SNPs in the two genes, and is normally distributed under the alternative hypothesis. The performance of the proposed method is evaluated by simulation studies, which show promising results. The method is also used to analyze real data and identifies gene-gene interactions among RAB3A, MADD, and PTPRN on type 2 diabetes (T2D) status.
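As a rough illustration of scoring dependence between two SNPs with a Kullback-Leibler divergence, the sketch below computes the divergence between the joint genotype distribution of a SNP pair and the product of its marginals (the mutual information). This is a generic textbook quantity, not the statistic proposed by Chen et al.; the function names and the 3 × 3 toy count tables are assumptions made for illustration.

```python
import numpy as np

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions on the same support."""
    p = np.asarray(p, dtype=float).ravel()
    q = np.asarray(q, dtype=float).ravel()
    mask = p > 0  # terms with p == 0 contribute 0 by convention
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def snp_pair_dependence(counts):
    """KL divergence between the joint genotype distribution of two SNPs
    and the product of its marginals (the mutual information).

    counts: 3x3 table of genotype counts (rows: SNP A with 0/1/2 minor
    alleles, columns: SNP B with 0/1/2 minor alleles). Hypothetical input.
    """
    joint = np.asarray(counts, dtype=float)
    joint = joint / joint.sum()
    row = joint.sum(axis=1, keepdims=True)
    col = joint.sum(axis=0, keepdims=True)
    return kl_divergence(joint, row * col)

# Toy tables (made-up counts, not data from the paper).
independent = np.outer([50, 30, 20], [40, 40, 20])  # exact product structure
dependent = np.array([[60, 5, 5], [5, 40, 5], [5, 5, 30]])

kl_independent = snp_pair_dependence(independent)
kl_dependent = snp_pair_dependence(dependent)
```

For the table built as an outer product the divergence is zero up to rounding, while the table with counts concentrated on the diagonal gives a clearly positive value.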
Affiliation(s)
- Guanjie Chen
  - Center for Research on Genomics and Global Health, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland
- Ao Yuan
  - Department of Biostatistics, Bioinformatics and Biomathematics, Georgetown University, Washington, DC
- Tao Cai
  - Experimental Medicine Section, Laboratory of Sensory Biology, NIDCR, NIH, Bethesda, Maryland
- Chuan-Ming Li
  - Division of Scientific Program, National Institute of Deafness and Other Communication Disorders, Rockville, Maryland, 20892
- Amy R Bentley
  - Center for Research on Genomics and Global Health, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland
- Jie Zhou
  - Center for Research on Genomics and Global Health, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland
- Daniel N Shriner
  - Center for Research on Genomics and Global Health, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland
- Adebowale A Adeyemo
  - Center for Research on Genomics and Global Health, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland
- Charles N Rotimi
  - Center for Research on Genomics and Global Health, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland
7
Ahsan A, Monir M, Meng X, Rahaman M, Chen H, Chen M. Identification of epistasis loci underlying rice flowering time by controlling population stratification and polygenic effect. DNA Res 2019; 26:119-130. [PMID: 30590457 PMCID: PMC6476725 DOI: 10.1093/dnares/dsy043] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 07/17/2018] [Accepted: 11/21/2018] [Indexed: 01/28/2023] Open
Abstract
Flowering time is an important agronomic trait, attributable to multiple genes, gene-gene interactions and environmental factors. Population stratification and polygenic effects might confound genetic effects of the causal loci underlying this complex trait. We proposed a two-step approach for detecting epistatic interactions underlying rice flowering time by accounting for population structure and polygenic effects. Simulation studies showed that the approach used in this study performs better than classical and PC-linear approaches in terms of power and false discovery rate in the case of population stratification and polygenic effects. Whole genome epistasis analyses identified 589 putative genetic interactions for flowering time. Eighteen of these interactions are located within 10 kilobases of regions of known protein-protein interactions. Thirty-seven SNPs lie near twenty-five genes involved in the rice and/or Arabidopsis (orthologue) flowering pathway. Bioinformatics analysis showed that 66.55% of the gene pairs in the identified interactions (392 out of the 589 interactions) are similar in various genomic features. Moreover, a significant number of the detected epistatic genes are highly expressed in different floral tissues. Our findings highlight the importance of epistasis analysis that controls for population stratification and polygenic effects, and provide novel insights into the genetic architecture of rice flowering which could assist breeding programmes.
Affiliation(s)
- Asif Ahsan
  - The State Key Laboratory of Plant Physiology and Biochemistry, Department of Bioinformatics, College of Life Sciences, Zhejiang University, Hangzhou, China
- Mamun Monir
  - Institute of Bioinformatics, Zhejiang University, Hangzhou, China
- Xianwen Meng
  - The State Key Laboratory of Plant Physiology and Biochemistry, Department of Bioinformatics, College of Life Sciences, Zhejiang University, Hangzhou, China
- Matiur Rahaman
  - The State Key Laboratory of Plant Physiology and Biochemistry, Department of Bioinformatics, College of Life Sciences, Zhejiang University, Hangzhou, China
  - Department of Statistics, Faculty of Science, Bangabandhu Sheikh Mujibur Rahman Science and Technology University, Gopalganj, Bangladesh
- Hongjun Chen
  - The State Key Laboratory of Plant Physiology and Biochemistry, Department of Bioinformatics, College of Life Sciences, Zhejiang University, Hangzhou, China
- Ming Chen
  - The State Key Laboratory of Plant Physiology and Biochemistry, Department of Bioinformatics, College of Life Sciences, Zhejiang University, Hangzhou, China
  - Institute of Bioinformatics, Zhejiang University, Hangzhou, China
8
Zheng C, Ferrari D, Zhang M, Baird P. Ranking the importance of genetic factors by variable‐selection confidence sets. J R Stat Soc Ser C Appl Stat 2019. [DOI: 10.1111/rssc.12337] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 12/01/2022]
Affiliation(s)
- Davide Ferrari
  - University of Bozen–Bolzano, Bolzano, Italy
  - University of Melbourne, Melbourne, Australia
- Michael Zhang
  - University of Melbourne, Melbourne, Australia
  - Royal Victorian Eye and Ear Hospital, Melbourne, Australia
- Paul Baird
  - University of Melbourne, Melbourne, Australia
  - Royal Victorian Eye and Ear Hospital, Melbourne, Australia
9
Choi S, Lee S, Kim Y, Hwang H, Park T. HisCoM-GGI: Hierarchical structural component analysis of gene-gene interactions. J Bioinform Comput Biol 2018; 16:1840026. [PMID: 30567476 DOI: 10.1142/s0219720018400267] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Indexed: 12/23/2022]
Abstract
Although genome-wide association studies (GWAS) have successfully identified thousands of single nucleotide polymorphisms (SNPs) associated with common diseases, these findings fall short of fully explaining "missing heritability". Determining gene-gene interactions (GGI) is one possible avenue for addressing the missing heritability problem. While many statistical approaches have been proposed to detect GGI, most of these focus primarily on SNP-to-SNP interactions. Although gene-based GGI analyses have many advantages, such as reducing the burden of multiple-testing correction and increasing power by aggregating multiple causal signals across SNPs in specific genes, only a few methods are available. In this study, we proposed a new statistical approach for gene-based GGI analysis, "Hierarchical structural CoMponent analysis of Gene-Gene Interactions" (HisCoM-GGI). HisCoM-GGI is based on generalized structured component analysis, and can consider hierarchical structural relationships between genes and SNPs. For a pair of genes, HisCoM-GGI first summarizes all possible pairwise SNP-SNP interactions into a latent variable, from which it then performs GGI analysis. HisCoM-GGI can evaluate both gene-level and SNP-level interactions. In simulation studies, HisCoM-GGI demonstrated higher statistical power than existing gene-based GGI methods. In an analysis of a GWAS of a Korean population for identifying GGI associated with body mass index, HisCoM-GGI identified 14 potential GGI, two of which, (NCOR2 × SPOCK1) and (LINGO2 × ZNF385D), were successfully replicated in independent datasets. We conclude that the HisCoM-GGI method may be a valuable tool for identifying GGI underlying missing heritability, allowing us to better understand the biological genetic mechanisms of complex traits. An implementation of HisCoM-GGI can be downloaded from the website ( http://statgen.snu.ac.kr/software/hiscom-ggi ).
Affiliation(s)
- Sungkyoung Choi
  - Department of Pharmacology, Yonsei University College of Medicine, 50-1 Yonsei-ro Seodaemun-gu, Seoul 03722, Republic of Korea
- Sungyoung Lee
  - Center for Precision Medicine, Seoul National University Hospital, 71 Daehak-ro Jongno-gu, Seoul 03082, Republic of Korea
- Yongkang Kim
  - Department of Statistics, Seoul National University, 1 Gwanak-ro Gwanak-gu, Seoul 08826, Republic of Korea
  - Department of Psychology, McGill University, 2001 Avenue McGill College, Montreal, Quebec H3A 1G1, Canada
- Heungsun Hwang
  - Department of Psychology, McGill University, 2001 Avenue McGill College, Montreal, Quebec H3A 1G1, Canada
- Taesung Park
  - Department of Statistics, Seoul National University, 1 Gwanak-ro Gwanak-gu, Seoul 08826, Republic of Korea
  - Interdisciplinary Program in Bioinformatics, Seoul National University, 1 Gwanak-ro Gwanak-gu, Seoul 08826, Republic of Korea
10
Wang JH, Chen YH. Overlapping group screening for detection of gene-gene interactions: application to gene expression profiles with survival trait. BMC Bioinformatics 2018; 19:335. [PMID: 30241463 PMCID: PMC6150983 DOI: 10.1186/s12859-018-2372-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 03/29/2018] [Accepted: 09/12/2018] [Indexed: 01/29/2023] Open
Abstract
Background: The development of a disease is a complex process that may result from joint effects of multiple genes. In this article, we propose the overlapping group screening (OGS) approach to determining active genes and gene-gene interactions incorporating prior pathway information. The OGS method is developed to overcome the challenges in genome-wide data analysis that the number of the genes and gene-gene interactions is far greater than the sample size, and the pathways generally overlap with one another. The OGS method is further proposed for patients’ survival prediction based on gene expression data. Results: Simulation studies demonstrate that the performance of the OGS approach in identifying the true main and interaction effects is good and the survival prediction accuracy of OGS with the Lasso penalty is better than the ordinary Lasso method. In real data analysis, we identify several significant genes and/or epistasis interactions that are associated with clinical survival outcomes of diffuse large B-cell lymphoma (DLBCL) and non-small-cell lung cancer (NSCLC) by utilizing prior pathway information from the KEGG pathway and the GO biological process databases, respectively. Conclusions: The OGS approach is useful for selecting important genes and epistasis interactions in the ultra-high dimensional feature space. The prediction ability of OGS with the Lasso penalty is better than existing methods. The OGS approach is generally applicable to various types of outcome data (quantitative, qualitative, censored event time data) and regression models (e.g. linear, logistic, and Cox’s regression models). Electronic supplementary material: The online version of this article (10.1186/s12859-018-2372-2) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Jie-Huei Wang
  - Institute of Statistical Science, Academia Sinica, Nankang, Taipei, Taiwan
- Yi-Hau Chen
  - Institute of Statistical Science, Academia Sinica, Nankang, Taipei, Taiwan
11
Fang YH, Wang JH, Hsiung CA. TSGSIS: a high-dimensional grouped variable selection approach for detection of whole-genome SNP-SNP interactions. Bioinformatics 2018. [PMID: 28651334 DOI: 10.1093/bioinformatics/btx409] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 11/13/2022] Open
Abstract
Motivation: Identification of single nucleotide polymorphism (SNP) interactions is an important and challenging topic in genome-wide association studies (GWAS). Many approaches have been applied to detecting whole-genome interactions. However, these approaches tend to miss causal interaction effects when the individual marginal effects are uncorrelated with the trait while their interaction effects are highly associated with the trait. Results: A grouped variable selection technique, called two-stage grouped sure independence screening (TS-GSIS), is developed to study interactions that may not have marginal effects. The proposed TS-GSIS is shown to be very helpful in identifying not only causal SNP effects that are uncorrelated with the trait but also their corresponding SNP-SNP interaction effects. The benefits of TS-GSIS are improved detection of interaction effects, obtained by using the joint information among the SNPs, and the determination of the size of candidate sets in the model. Simulation studies under various scenarios are performed to compare the performance of TS-GSIS and current approaches. We also apply our approach to a real rheumatoid arthritis (RA) dataset. Both the simulation and real data studies show that TS-GSIS performs very well in detecting SNP-SNP interactions. Availability and implementation: The R package is delivered through CRAN and is available at: https://cran.r-project.org/web/packages/TSGSIS/index.html. Contact: hsiung@nhri.org.tw. Supplementary information: Supplementary data are available at Bioinformatics online.
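The limitation described in the abstract above, that screening on marginal effects can miss SNPs acting only through an interaction, can be seen in a small simulation. The sketch below is not the TS-GSIS algorithm and does not use the API of the TSGSIS R package; it only ranks features by absolute marginal correlation (the basic sure-independence-screening idea) on simulated ±1 predictors, and all names and settings are assumptions chosen for illustration.

```python
import numpy as np

def abs_marginal_correlation(X, y):
    """Absolute Pearson correlation between each column of X and y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    return np.abs(Xc.T @ yc) / denom

rng = np.random.default_rng(0)
n, p = 5000, 20
# Simulated predictors coded -1/+1 (a simplification of SNP genotypes).
X = rng.choice([-1.0, 1.0], size=(n, p))
# The trait depends on columns 0 and 1 only through their product.
y = X[:, 0] * X[:, 1] + 0.5 * rng.standard_normal(n)

# Marginal screening statistic for every single predictor.
marginal = abs_marginal_correlation(X, y)
# The same statistic for the product (interaction) term of columns 0 and 1.
interaction = abs_marginal_correlation((X[:, 0] * X[:, 1])[:, None], y)[0]
```

In this setup the marginal correlations of columns 0 and 1 stay close to zero while the product term is strongly correlated with the trait, which is the situation that grouped or two-stage screening on joint information is meant to handle.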
Affiliation(s)
- Yao-Hwei Fang
  - Division of Biostatistics and Bioinformatics, Institute of Population Health Sciences, National Health Research Institutes, Zhunan 35053, Taiwan
- Jie-Huei Wang
  - Division of Biostatistics and Bioinformatics, Institute of Population Health Sciences, National Health Research Institutes, Zhunan 35053, Taiwan
- Chao A Hsiung
  - Division of Biostatistics and Bioinformatics, Institute of Population Health Sciences, National Health Research Institutes, Zhunan 35053, Taiwan
12
Gosik K, Sun L, Chinchilli VM, Wu R. An Ultrahigh-Dimensional Mapping Model of High-order Epistatic Networks for Complex Traits. Curr Genomics 2018; 19:384-394. [PMID: 30065614 PMCID: PMC6030858 DOI: 10.2174/1389202919666171218162210] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 01/02/2017] [Revised: 03/28/2017] [Accepted: 05/04/2017] [Indexed: 11/23/2022] Open
Abstract
BACKGROUND: Genetic interactions involving more than two loci have been thought to affect quantitatively inherited traits and diseases more pervasively than previously appreciated. However, the detection of such high-order interactions to chart a complete portrait of genetic architecture has not been well explored. METHODS: We present an ultrahigh-dimensional model to systematically characterize genetic main effects and interaction effects of various orders among all possible markers in a genetic mapping or association study. The model was built on the extension of a variable selection procedure, called iFORM, derived from forward selection. The model shows its unique power to estimate the magnitudes and signs of high-order epistatic effects, in addition to those of main effects and pairwise epistatic effects. RESULTS: The statistical properties of the model were tested and validated through simulation studies. By analyzing a real dataset for shoot growth in a mapping population of a woody plant, mei (Prunus mume), we demonstrated the usefulness of the model in practical genetic studies. The model identified important high-order interactions that contribute to shoot growth in mei. CONCLUSION: The model provides a tool to precisely construct genotype-phenotype maps for quantitative traits by identifying any possible high-order epistasis, which is often ignored in the current genetic literature.
Affiliation(s)
- Kirk Gosik
  - Department of Public Health Sciences, Penn State College of Medicine, Hershey, PA 17033, USA
- Lidan Sun
  - Department of Public Health Sciences, Penn State College of Medicine, Hershey, PA 17033, USA
- Vernon M. Chinchilli
  - Department of Public Health Sciences, Penn State College of Medicine, Hershey, PA 17033, USA
- Rongling Wu
  - Department of Public Health Sciences, Penn State College of Medicine, Hershey, PA 17033, USA
13
Coker OO, Dai Z, Nie Y, Zhao G, Cao L, Nakatsu G, Wu WKK, Wong SH, Chen Z, Sung JJY, Yu J. Mucosal microbiome dysbiosis in gastric carcinogenesis. Gut 2018; 67:1024-1032. [PMID: 28765474 PMCID: PMC5969346 DOI: 10.1136/gutjnl-2017-314281] [Citation(s) in RCA: 396] [Impact Index Per Article: 66.0] [Received: 04/05/2017] [Revised: 05/23/2017] [Accepted: 06/09/2017] [Indexed: 12/12/2022]
Abstract
OBJECTIVES: We aimed to characterise the microbial changes associated with histological stages of gastric tumourigenesis. DESIGN: We performed 16S rRNA gene analysis of gastric mucosal samples from 81 cases including superficial gastritis (SG), atrophic gastritis (AG), intestinal metaplasia (IM) and gastric cancer (GC) from Xi'an, China, to determine mucosal microbiome dysbiosis across stages of GC. We validated the results in mucosal samples of 126 cases from Inner Mongolia, China. RESULTS: We observed significant mucosa microbial dysbiosis in IM and GC subjects, with significant enrichment of 21 and depletion of 10 bacterial taxa in GC compared with SG (q<0.05). Microbial network analysis showed increasing correlation strengths among them with disease progression (p<0.001). Five GC-enriched bacterial taxa whose species identifications correspond to Peptostreptococcus stomatis, Streptococcus anginosus, Parvimonas micra, Slackia exigua and Dialister pneumosintes had significant centralities in the GC ecological network (p<0.05) and classified GC from SG with an area under the receiver-operating curve (AUC) of 0.82. Moreover, stronger interactions among gastric microbes were observed in Helicobacter pylori-negative samples compared with H. pylori-positive samples in SG and IM. The fold changes of selected bacteria, and strengths of their interactions were successfully validated in the Inner Mongolian cohort, in which the five bacterial markers distinguished GC from SG with an AUC of 0.81. CONCLUSIONS: In addition to microbial compositional changes, we identified differences in bacterial interactions across stages of gastric carcinogenesis. The significant enrichments and network centralities suggest potentially important roles of P. stomatis, D. pneumosintes, S. exigua, P. micra and S. anginosus in GC progression.
Affiliation(s)
- Olabisi Oluwabukola Coker
- Department of Medicine and Therapeutics, Institute of Digestive Disease, State Key Laboratory of Digestive Disease, Li Ka Shing Institute of Health Sciences, CUHK Shenzhen Research Institute, The Chinese University of Hong Kong, Hong Kong, China
- Zhenwei Dai
- Department of Medicine and Therapeutics, Institute of Digestive Disease, State Key Laboratory of Digestive Disease, Li Ka Shing Institute of Health Sciences, CUHK Shenzhen Research Institute, The Chinese University of Hong Kong, Hong Kong, China
- Yongzhan Nie
- State Key Laboratory of Cancer Biology, Xijing Hospital, Fourth Military Medical University, Xian, China
- Guijun Zhao
- Department of Gastroenterology and Hepatology, Inner Mongolia People’s Hospital, Hohhot, China
- Lei Cao
- Department of Medicine and Therapeutics, Institute of Digestive Disease, State Key Laboratory of Digestive Disease, Li Ka Shing Institute of Health Sciences, CUHK Shenzhen Research Institute, The Chinese University of Hong Kong, Hong Kong, China
- Geicho Nakatsu
- Department of Medicine and Therapeutics, Institute of Digestive Disease, State Key Laboratory of Digestive Disease, Li Ka Shing Institute of Health Sciences, CUHK Shenzhen Research Institute, The Chinese University of Hong Kong, Hong Kong, China
- William KK Wu
- Department of Medicine and Therapeutics, Institute of Digestive Disease, State Key Laboratory of Digestive Disease, Li Ka Shing Institute of Health Sciences, CUHK Shenzhen Research Institute, The Chinese University of Hong Kong, Hong Kong, China
- Sunny Hei Wong
- Department of Medicine and Therapeutics, Institute of Digestive Disease, State Key Laboratory of Digestive Disease, Li Ka Shing Institute of Health Sciences, CUHK Shenzhen Research Institute, The Chinese University of Hong Kong, Hong Kong, China
- Zigui Chen
- Department of Microbiology, Faculty of Medicine, The Chinese University of Hong Kong, Hong Kong, China
- Joseph J Y Sung
- Department of Medicine and Therapeutics, Institute of Digestive Disease, State Key Laboratory of Digestive Disease, Li Ka Shing Institute of Health Sciences, CUHK Shenzhen Research Institute, The Chinese University of Hong Kong, Hong Kong, China
- Jun Yu
- Department of Medicine and Therapeutics, Institute of Digestive Disease, State Key Laboratory of Digestive Disease, Li Ka Shing Institute of Health Sciences, CUHK Shenzhen Research Institute, The Chinese University of Hong Kong, Hong Kong, China
14
|
Detection of Epistasis for Flowering Time Using Bayesian Multilocus Estimation in a Barley MAGIC Population. Genetics 2017; 208:525-536. [PMID: 29254994 PMCID: PMC5788519 DOI: 10.1534/genetics.117.300546] [Citation(s) in RCA: 31] [Impact Index Per Article: 4.4] [Received: 09/13/2017] [Accepted: 12/12/2017] [Indexed: 12/16/2022] Open
Abstract
Gene-by-gene interactions, also known as epistasis, regulate many complex traits in different species. With the availability of low-cost genotyping it is now possible to study epistasis on a genome-wide scale. However, identifying genome-wide epistasis is a high-dimensional multiple regression problem and needs the application of dimensionality reduction techniques. Flowering Time (FT) in crops is a complex trait that is known to be influenced by many interacting genes and pathways in various crops. In this study, we successfully apply Sure Independence Screening (SIS) for dimensionality reduction to identify two-way and three-way epistasis for the FT trait in a Multiparent Advanced Generation Inter-Cross (MAGIC) barley population using the Bayesian multilocus model. The MAGIC barley population was generated from intercrossing among eight parental lines and thus, offered greater genetic diversity to detect higher-order epistatic interactions. Our results suggest that SIS is an efficient dimensionality reduction approach to detect high-order interactions in a Bayesian multilocus model. We also observe that many of our findings (genomic regions with main or higher-order epistatic effects) overlap with known candidate genes that have been already reported in barley and closely related species for the FT trait.
15
|
Sun L, Wang J, Sang M, Jiang L, Zhao B, Cheng T, Zhang Q, Wu R. Landscaping Crossover Interference Across a Genome. TRENDS IN PLANT SCIENCE 2017; 22:894-907. [PMID: 28822625 DOI: 10.1016/j.tplants.2017.06.008] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 02/22/2017] [Revised: 06/09/2017] [Accepted: 06/12/2017] [Indexed: 05/14/2023]
Abstract
The evolutionary success of eukaryotic organisms crucially depends on the capacity to produce genetic diversity through reciprocal exchanges of each chromosome pair, or crossovers (COs), during meiosis. It has been recognized that COs arise more evenly across a given chromosome than at random. This phenomenon, termed CO interference, occurs pervasively in eukaryotes and may confer a selective advantage. We describe here a multipoint linkage analysis procedure for segregating families to quantify the strength of CO interference over the genome, and extend this procedure to illustrate the landscape of CO interference in natural populations. We further discuss the crucial role of CO interference in amplifying and maintaining genetic diversity through sex-, stress-, and age-induced differentiation.
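As a rough illustration of what "quantifying the strength of CO interference" means, the sketch below computes the classical coefficient of coincidence and interference from three-point cross counts. This is the textbook two-interval calculation, not the multipoint linkage procedure the abstract refers to; the function name and the example counts are hypothetical.

```python
def co_interference(n_total, single_co_1, single_co_2, double_co):
    """Classical three-point estimate of crossover interference.

    n_total      total number of scored gametes (hypothetical counts)
    single_co_1  gametes with a crossover only in interval 1
    single_co_2  gametes with a crossover only in interval 2
    double_co    gametes with a crossover in both intervals
    """
    # Recombination fraction of each interval includes the double crossovers.
    r1 = (single_co_1 + double_co) / n_total
    r2 = (single_co_2 + double_co) / n_total
    # Expected double crossovers if the two intervals recombine independently.
    expected_double = r1 * r2 * n_total
    coincidence = double_co / expected_double
    # Interference > 0 means fewer double crossovers than independence predicts.
    return coincidence, 1.0 - coincidence


coincidence, interference = co_interference(
    n_total=1000, single_co_1=90, single_co_2=190, double_co=10
)
print(f"coincidence={coincidence:.2f}, interference={interference:.2f}")
```

With these illustrative counts, half as many double crossovers are observed as independence would predict, which is the "more evenly than at random" pattern the abstract describes.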
Affiliation(s)
- Lidan Sun
- Beijing Key Laboratory of Ornamental Plants Germplasm Innovation and Molecular Breeding, National Engineering Research Center for Floriculture, Beijing Forestry University, Beijing 100083, China
- Jing Wang
- Center for Computational Biology, College of Biological Sciences and Technology, Beijing Forestry University, Beijing 100083, China
- Mengmeng Sang
- Center for Computational Biology, College of Biological Sciences and Technology, Beijing Forestry University, Beijing 100083, China
- Libo Jiang
- Beijing Key Laboratory of Ornamental Plants Germplasm Innovation and Molecular Breeding, National Engineering Research Center for Floriculture, Beijing Forestry University, Beijing 100083, China
- Bingyu Zhao
- Department of Horticulture, Virginia Tech, Blacksburg, VA 24061, USA
- Tangran Cheng
- Beijing Key Laboratory of Ornamental Plants Germplasm Innovation and Molecular Breeding, National Engineering Research Center for Floriculture, Beijing Forestry University, Beijing 100083, China
- Qixiang Zhang
- Beijing Key Laboratory of Ornamental Plants Germplasm Innovation and Molecular Breeding, National Engineering Research Center for Floriculture, Beijing Forestry University, Beijing 100083, China
- Rongling Wu
- Center for Computational Biology, College of Biological Sciences and Technology, Beijing Forestry University, Beijing 100083, China; Center for Statistical Genetics, Pennsylvania State University, Hershey, PA 17033, USA.
16
|
Kong Y, Li D, Fan Y, Lv J. Interaction pursuit in high-dimensional multi-response regression via distance correlation. Ann Stat 2017. [DOI: 10.1214/16-aos1474] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Indexed: 11/19/2022]
17
|
Mieth B, Kloft M, Rodríguez JA, Sonnenburg S, Vobruba R, Morcillo-Suárez C, Farré X, Marigorta UM, Fehr E, Dickhaus T, Blanchard G, Schunk D, Navarro A, Müller KR. Combining Multiple Hypothesis Testing with Machine Learning Increases the Statistical Power of Genome-wide Association Studies. Sci Rep 2016; 6:36671. [PMID: 27892471 PMCID: PMC5125008 DOI: 10.1038/srep36671] [Citation(s) in RCA: 35] [Impact Index Per Article: 4.4] [Received: 05/31/2016] [Accepted: 10/06/2016] [Indexed: 12/21/2022] Open
Abstract
The standard approach to the analysis of genome-wide association studies (GWAS) is based on testing each position in the genome individually for statistical significance of its association with the phenotype under investigation. To improve the analysis of GWAS, we propose a combination of machine learning and statistical testing that takes correlation structures within the set of SNPs under investigation in a mathematically well-controlled manner into account. The novel two-step algorithm, COMBI, first trains a support vector machine to determine a subset of candidate SNPs and then performs hypothesis tests for these SNPs together with an adequate threshold correction. Applying COMBI to data from a WTCCC study (2007) and measuring performance as replication by independent GWAS published within the 2008-2015 period, we show that our method outperforms ordinary raw p-value thresholding as well as other state-of-the-art methods. COMBI presents higher power and precision than the examined alternatives while yielding fewer false (i.e. non-replicated) and more true (i.e. replicated) discoveries when its results are validated on later GWAS studies. More than 80% of the discoveries made by COMBI upon WTCCC data have been validated by independent studies. Implementations of the COMBI method are available as a part of the GWASpi toolbox 2.0.
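The two-step idea described above (screen with a linear classifier, then test only the screened SNPs against a corrected threshold) can be sketched as follows. This is a toy illustration under stated assumptions, not the COMBI implementation: the SVM is a plain batch subgradient solver, the per-SNP test is a permutation test on a covariance statistic, and the threshold is a Bonferroni correction over the screened set, all of which are stand-ins chosen for self-containment. Every name and parameter value (`combi_like`, `k`, `alpha`, the simulated data) is hypothetical.

```python
import random


def standardize(col):
    """Centre and scale one genotype column."""
    n = len(col)
    mean = sum(col) / n
    sd = (sum((v - mean) ** 2 for v in col) / n) ** 0.5
    return [(v - mean) / sd if sd > 0 else 0.0 for v in col]


def train_linear_svm(rows, y, lam=0.01, lr=0.1, epochs=100):
    """Batch subgradient descent on the L2-regularised hinge loss (no bias)."""
    n, p = len(rows), len(rows[0])
    w = [0.0] * p
    for _ in range(epochs):
        grad = [lam * wj for wj in w]
        for x, yi in zip(rows, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, x))
            if margin < 1.0:  # sample violates the margin
                for j in range(p):
                    grad[j] -= yi * x[j] / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


def permutation_pvalue(col, y, rng, n_perm=200):
    """Permutation p-value for |sum(x * y)| with x already centred."""
    observed = abs(sum(a * b for a, b in zip(col, y)))
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if abs(sum(a * b for a, b in zip(col, y_perm))) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def combi_like(columns, y, k=5, alpha=0.05, seed=1):
    """Step 1: keep the k SNPs with largest |SVM weight|. Step 2: test them."""
    rng = random.Random(seed)
    cols = [standardize(c) for c in columns]
    rows = [list(r) for r in zip(*cols)]
    w = train_linear_svm(rows, y)
    ranked = sorted(range(len(cols)), key=lambda j: abs(w[j]), reverse=True)
    candidates = ranked[:k]
    # Bonferroni over the k screened SNPs only (an assumption of this sketch).
    discoveries = [j for j in candidates
                   if permutation_pvalue(cols[j], y, rng) <= alpha / k]
    return sorted(candidates), sorted(discoveries)


# Simulated data: 20 SNPs coded 0/1/2, with SNP 5 driving a binary phenotype.
data_rng = random.Random(0)
n, p, causal = 300, 20, 5
columns = [[data_rng.choice((0, 1, 1, 2)) for _ in range(n)] for _ in range(p)]
y = [1 if 1.5 * (columns[causal][i] - 1) + data_rng.gauss(0, 1) > 0 else -1
     for i in range(n)]

candidates, discoveries = combi_like(columns, y)
print("candidates:", candidates)
print("discoveries:", discoveries)
```

The design point the sketch tries to convey is that the multiplicity correction is paid only for the screened candidates rather than for every SNP, which is where the claimed gain in power comes from.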
Affiliation(s)
- Bettina Mieth
- Machine Learning Group, Technische Universität Berlin, Berlin, 10587, Germany
- Marius Kloft
- Department of Computer Science, Humboldt University of Berlin, Berlin, 10099, Germany
- Juan Antonio Rodríguez
- Institut de Biología Evolutiva (CSIC-UPF). Departament de Ciències Experimentals i de la Salut. Universitat Pompeu Fabra, Barcelona, 08003, Spain
- Robin Vobruba
- Machine Learning Group, Technische Universität Berlin, Berlin, 10587, Germany
- Carlos Morcillo-Suárez
- Institut de Biología Evolutiva (CSIC-UPF). Departament de Ciències Experimentals i de la Salut. Universitat Pompeu Fabra, Barcelona, 08003, Spain
- Xavier Farré
- Institut de Biología Evolutiva (CSIC-UPF). Departament de Ciències Experimentals i de la Salut. Universitat Pompeu Fabra, Barcelona, 08003, Spain
- Urko M. Marigorta
- School of Biology, Georgia Institute of Technology, Atlanta, 30332, GA, USA
- Ernst Fehr
- Department of Economics, Laboratory for Social and Neural Systems Research, University of Zurich, Zurich, 8006, Switzerland
- Thorsten Dickhaus
- Institute for Statistics (FB 3), University of Bremen, Bremen, 28359, Germany
- Gilles Blanchard
- Department of Mathematics, University of Potsdam, Potsdam, 14476, Germany
- Daniel Schunk
- Department of Economics, University of Mainz, Mainz, 55099, Germany
- Arcadi Navarro
- Institut de Biología Evolutiva (CSIC-UPF). Departament de Ciències Experimentals i de la Salut. Universitat Pompeu Fabra, Barcelona, 08003, Spain
- Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, 08010, Spain
- Center for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, 08003, Spain
- Klaus-Robert Müller
- Machine Learning Group, Technische Universität Berlin, Berlin, 10587, Germany
- Department of Brain and Cognitive Engineering, Korea University, Seoul, Republic of Korea
18
|
Liu J. Feature screening and variable selection for partially linear models with ultrahigh-dimensional longitudinal data. Neurocomputing 2016. [DOI: 10.1016/j.neucom.2015.09.122] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 11/24/2022]
19
|
Wang N, Gosik K, Li R, Lindsay B, Wu R. A block mixture model to map eQTLs for gene clustering and networking. Sci Rep 2016; 6:21193. [PMID: 26892775 PMCID: PMC4759821 DOI: 10.1038/srep21193] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 08/13/2015] [Accepted: 01/19/2016] [Indexed: 01/13/2023] Open
Abstract
To study how genes function in a cellular and physiological process, a general procedure is to classify gene expression profiles into categories based on their similarity and reconstruct a regulatory network for functional elements. However, this procedure has not been implemented with the genetic mechanisms that underlie the organization of gene clusters and networks, despite much effort made to map expression quantitative trait loci (eQTLs) that affect the expression of individual genes. Here we address this issue by developing a computational approach that integrates gene clustering and network reconstruction with genetic mapping into a unifying framework. The approach can not only identify specific eQTLs that control how genes are clustered and organized toward biological functions, but also enable the investigation of the biological mechanisms that individual eQTLs perturb in a signaling pathway. We applied the new approach to characterize the effects of eQTLs on the structure and organization of gene clusters in Caenorhabditis elegans. This study provides the first characterization, to our knowledge, of the effects of genetic variants on the regulatory network of gene expression. The approach developed can also facilitate the genetic dissection of other dynamic processes, including development, physiology and disease progression in any organisms.
Affiliation(s)
- Ningtao Wang
- Department of Biostatistics, University of Texas School of Public Health, Houston, TX 77030, USA; Department of Public Health Sciences, The Pennsylvania State University, Hershey, PA 17033, USA
- Kirk Gosik
- Department of Statistics, The Pennsylvania State University, University Park, PA 16802, USA
- Runze Li
- Department of Biostatistics, University of Texas School of Public Health, Houston, TX 77030, USA; Department of Statistics, The Pennsylvania State University, University Park, PA 16802, USA
- Bruce Lindsay
- Department of Biostatistics, University of Texas School of Public Health, Houston, TX 77030, USA
- Rongling Wu
- Department of Biostatistics, University of Texas School of Public Health, Houston, TX 77030, USA; Department of Statistics, The Pennsylvania State University, University Park, PA 16802, USA
20
|
Yan Q, Zhu X, Jiang L, Ye M, Sun L, Terblanche JS, Wu R. A computing platform to map ecological metabolism by integrating functional mapping and the metabolic theory of ecology. Brief Bioinform 2016; 18:137-144. [DOI: 10.1093/bib/bbv116] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 10/29/2015] [Revised: 12/17/2015] [Indexed: 11/12/2022] Open
21
|
JingYuan LIU, Wei ZHONG, RunZe LI. A selective overview of feature screening for ultrahigh-dimensional data. SCIENCE CHINA. MATHEMATICS 2015; 58:2033-2054. [PMID: 26779257 PMCID: PMC4711389 DOI: 10.1007/s11425-015-5062-9] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.3] [Indexed: 05/21/2023]
Abstract
High-dimensional data have frequently been collected in many scientific areas including genomewide association study, biomedical imaging, tomography, tumor classifications, and finance. Analysis of high-dimensional data poses many challenges for statisticians. Feature selection and variable selection are fundamental for high-dimensional data analysis. The sparsity principle, which assumes that only a small number of predictors contribute to the response, is frequently adopted and deemed useful in the analysis of high-dimensional data. Following this general principle, a large number of variable selection approaches via penalized least squares or likelihood have been developed in the recent literature to estimate a sparse model and select significant variables simultaneously. While the penalized variable selection methods have been successfully applied in many high-dimensional analyses, modern applications in areas such as genomics and proteomics push the dimensionality of data to an even larger scale, where the dimension of data may grow exponentially with the sample size. This has been called ultrahigh-dimensional data in the literature. This work aims to present a selective overview of feature screening procedures for ultrahigh-dimensional data. We focus on insights into how to construct marginal utilities for feature screening on specific models and motivation for the need of model-free feature screening procedures.
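To make the notion of a "marginal utility" concrete, here is a minimal sketch of correlation-based screening in the spirit of sure independence screening: each feature is scored on its own by the absolute Pearson correlation with the response, and only the top-ranked features are kept for later modelling. The function names, the simulated data, and the cutoff `keep=20` are illustrative assumptions and are not taken from the article.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def sis_screen(columns, y, keep):
    """Keep the `keep` features with the largest |marginal correlation|."""
    scores = [abs(pearson(col, y)) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:keep])


def simulate(n=200, p=500, active=(3, 42), seed=0):
    """Hypothetical sparse linear model: only the `active` features matter."""
    rng = random.Random(seed)
    columns = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(p)]
    y = [sum(2.0 * columns[j][i] for j in active) + rng.gauss(0, 1)
         for i in range(n)]
    return columns, y


columns, y = simulate()
selected = sis_screen(columns, y, keep=20)
print(selected)
```

Because each score uses one feature at a time, the cost grows linearly in the number of features, which is what makes this kind of screening practical before a penalized method is applied to the reduced set. Marginal correlation can miss features that matter only jointly, one motivation for the model-free utilities the abstract mentions.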
Affiliation(s)
- LIU JingYuan
- Department of Statistics, School of Economics, Xiamen University, Xiamen 361005, China
- Wang Yanan Institute for Studies in Economics, Xiamen University, Xiamen 361005, China
- Fujian Key Laboratory of Statistical Science, Xiamen University, Xiamen 361005, China
- ZHONG Wei
- Department of Statistics, School of Economics, Xiamen University, Xiamen 361005, China
- Wang Yanan Institute for Studies in Economics, Xiamen University, Xiamen 361005, China
- Fujian Key Laboratory of Statistical Science, Xiamen University, Xiamen 361005, China
- LI RunZe
- Department of Statistics and The Methodology Center, Pennsylvania State University, University Park, PA 16802-2111, USA
22
|
Abstract
Despite increasing emphasis on the genetic study of quantitative traits, we are still far from being able to chart a clear picture of their genetic architecture, given an inherent complexity involved in trait formation. A competing theory for studying such complex traits has emerged by viewing their phenotypic formation as a "system" in which a high-dimensional group of interconnected components act and interact across different levels of biological organization from molecules through cells to whole organisms. This system is initiated by a machinery of DNA sequences that regulate a cascade of biochemical pathways to synthesize endophenotypes and further assemble these endophenotypes toward the end-point phenotype in virtue of various developmental changes. This review focuses on a conceptual framework for genetic mapping of complex traits by which to delineate the underlying components, interactions and mechanisms that govern the system according to biological principles and understand how these components function synergistically under the control of quantitative trait loci (QTLs) to comprise a unified whole. This framework is built by a system of differential equations that quantifies how alterations of different components lead to the global change of trait development and function, and provides a quantitative and testable platform for assessing the multiscale interplay between QTLs and development. The method will enable geneticists to shed light on the genetic complexity of any biological system and predict, alter or engineer its physiological and pathological states.
Affiliation(s)
- Lidan Sun
- National Engineering Research Center for Floriculture, College of Landscape Architecture, Beijing Forestry University, Beijing 100083, China; Center for Statistical Genetics, Departments of Public Health Sciences and Statistics, The Pennsylvania State University, Hershey, PA 17033, USA
- Rongling Wu
- Center for Computational Biology, College of Biological Sciences and Technology, Beijing Forestry University, Beijing 100083, China; Center for Statistical Genetics, Departments of Public Health Sciences and Statistics, The Pennsylvania State University, Hershey, PA 17033, USA.