Reference Citation Analysis

For: Zhou H, Sehl ME, Sinsheimer JS, Lange K. Association screening of common and rare genetic variants by penalized regression. Bioinformatics 2010;26:2375-82. [PMID: 20693321 DOI: 10.1093/bioinformatics/btq448] [Citation(s) in RCA: 104] [Impact Index Per Article: 7.4] [Indexed: 11/14/2022] Open

Cited by Other Article(s)

Kim K, Jun TH, Ha BK, Wang S, Sun H. New statistical selection method for pleiotropic variants associated with both quantitative and qualitative traits. BMC Bioinformatics 2023;24:381. [PMID: 37817069 PMCID: PMC10563219 DOI: 10.1186/s12859-023-05505-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/10/2023] [Accepted: 09/28/2023] [Indexed: 10/12/2023] Open

Abstract

BACKGROUND

Identification of pleiotropic variants associated with multiple phenotypic traits has received increasing attention in genetic association studies. Overlapping genetic associations from multiple traits help to detect weak genetic associations missed by single-trait analyses. Many statistical methods have been developed to identify pleiotropic variants, but most of them are limited to quantitative traits, even though pleiotropic effects on both quantitative and qualitative traits have been observed. This is a statistically challenging problem because no appropriate multivariate distribution exists to model quantitative and qualitative data together. Alternatively, meta-analysis methods can be applied, which integrate summary statistics of individual variants associated with either a quantitative or a qualitative trait without accounting for correlations among genetic variants.

RESULTS

We propose a new statistical selection method based on a unified selection score quantifying how strongly a genetic variant, i.e., a pleiotropic variant, is associated with both quantitative and qualitative traits. In our extensive simulation studies, where various types of pleiotropic effects on both quantitative and qualitative traits were considered, we demonstrated that the proposed method outperforms the existing meta-analysis methods in terms of true positive selection. We also applied the proposed method to a peanut dataset with 6 quantitative and 2 qualitative traits, and a cowpea dataset with 2 quantitative and 6 qualitative traits. We were able to detect some potentially pleiotropic variants missed by the existing methods in both analyses.

CONCLUSIONS

The proposed method is able to locate pleiotropic variants associated with both quantitative and qualitative traits. It has been implemented in an R package, 'UNISS', which can be downloaded from http://github.com/statpng/uniss.

Caballero FF, Lana A, Struijk EA, Arias-Fernández L, Yévenes-Briones H, Cárdenas-Valladolid J, Salinero-Fort MÁ, Banegas JR, Rodríguez-Artalejo F, Lopez-Garcia E. Prospective Association Between Plasma Concentrations of Fatty Acids and Other Lipids, and Multimorbidity in Older Adults. J Gerontol A Biol Sci Med Sci 2023;78:1763-1770. [PMID: 37156635 DOI: 10.1093/gerona/glad122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2022] [Indexed: 05/10/2023] Open

Abstract

Biological mechanisms that lead to multimorbidity are mostly unknown, and metabolomic profiles are promising to explain different pathways in the aging process. The aim of this study was to assess the prospective association between plasma fatty acids and other lipids, and multimorbidity in older adults. Data were obtained from the Spanish Seniors-ENRICA 2 cohort, comprising noninstitutionalized adults ≥65 years old. Blood samples were obtained at baseline and after a 2-year follow-up period for a total of 1 488 subjects. Morbidity was also collected at baseline and end of the follow-up from electronic health records. Multimorbidity was defined as a quantitative score, after weighting morbidities (from a list of 60 mutually exclusive chronic conditions) by their regression coefficients on physical functioning. Generalized estimating equation models were employed to assess the longitudinal association between fatty acids and other lipids, and multimorbidity, and stratified analyses by diet quality, measured with the Alternative Healthy Eating Index-2010, were also conducted. Among study participants, higher concentrations of omega-6 fatty acids [coef. per 1-SD increase (95% CI) = -0.76 (-1.23, -0.30)], phosphoglycerides [-1.26 (-1.77, -0.74)], total cholines [-1.48 (-1.99, -0.96)], phosphatidylcholines [-1.23 (-1.74, -0.71)], and sphingomyelins [-1.65 (-2.12, -1.18)], were associated with lower multimorbidity scores. The strongest associations were observed for those with a higher diet quality. Higher plasma concentrations of omega-6 fatty acids, phosphoglycerides, total cholines, phosphatidylcholines, and sphingomyelins were prospectively associated with lower multimorbidity in older adults, although diet quality could modulate the associations found. These lipids may serve as risk markers for multimorbidity.
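
The weighted score described in the abstract above can be illustrated with a minimal sketch. It assumes the score is simply the sum of per-condition weights over the conditions a participant has; the condition names and weight values below are hypothetical placeholders, not the study's actual list of 60 conditions or its regression coefficients.

```python
def multimorbidity_score(conditions_present, weights):
    """Sum the weights of the chronic conditions a participant has.

    `weights` maps condition name -> weight. In the abstract, weights come
    from regression coefficients on physical functioning; the values used
    here are made up for illustration only.
    """
    unknown = set(conditions_present) - set(weights)
    if unknown:
        raise KeyError(f"no weight defined for: {sorted(unknown)}")
    # Use a set so a condition listed twice is only counted once.
    return sum(weights[c] for c in set(conditions_present))


# Hypothetical weights (assumption, not taken from the study).
example_weights = {"diabetes": 1.5, "osteoarthritis": 2.0, "hypertension": 0.5}

score = multimorbidity_score(["diabetes", "hypertension"], example_weights)
```

A participant with no listed conditions scores 0, and higher scores correspond to a heavier weighted disease burden.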

Affiliation(s)

Francisco Félix Caballero Department of Preventive Medicine and Public Health, Universidad Autónoma de Madrid and CIBER of Epidemiology and Public Health, Madrid, Spain
Alberto Lana Department of Medicine, Universidad de Oviedo/ISPA, Oviedo, Spain
Ellen A Struijk Department of Preventive Medicine and Public Health, Universidad Autónoma de Madrid and CIBER of Epidemiology and Public Health, Madrid, Spain
Lucía Arias-Fernández Primary Health Care Network, Asturias Health Service, Asturias, Spain
Humberto Yévenes-Briones Department of Preventive Medicine and Public Health, Universidad Autónoma de Madrid and CIBER of Epidemiology and Public Health, Madrid, Spain
Juan Cárdenas-Valladolid Dirección Técnica de Sistemas de Información. Gerencia Asistencial de Atención Primaria, Servicio Madrileño de Salud, Fundación de Investigación e Innovación Biosanitaria de Atención Primaria, Madrid, Spain; Enfermería, Universidad Alfonso X El Sabio, Villanueva de la Cañada, Spain
Miguel Ángel Salinero-Fort Subdirección General de Investigación Sanitaria, Consejería de Sanidad, Fundación de Investigación e Innovación Sanitaria de Atención Primaria, Madrid, Spain; Red de Investigación en Servicios de Salud en Enfermedades Crónicas, Grupo de Envejecimiento y Fragilidad de las personas mayores. IdIPAZ, Madrid, Spain
José R Banegas Department of Preventive Medicine and Public Health, Universidad Autónoma de Madrid and CIBER of Epidemiology and Public Health, Madrid, Spain
Fernando Rodríguez-Artalejo Department of Preventive Medicine and Public Health, Universidad Autónoma de Madrid and CIBER of Epidemiology and Public Health, Madrid, Spain; IMDEA-Food Institute. CEI UAM+CSIC, Madrid, Spain
Esther Lopez-Garcia Department of Preventive Medicine and Public Health, Universidad Autónoma de Madrid and CIBER of Epidemiology and Public Health, Madrid, Spain; IMDEA-Food Institute. CEI UAM+CSIC, Madrid, Spain

Liang X, Sun H. Weighted Selection Probability to Prioritize Susceptible Rare Variants in Multi-Phenotype Association Studies with Application to a Soybean Genetic Data Set. J Comput Biol 2023;30:1075-1088. [PMID: 37871292 DOI: 10.1089/cmb.2022.0487] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2023] Open

Boutry S, Helaers R, Lenaerts T, Vikkula M. Rare variant association on unrelated individuals in case-control studies using aggregation tests: existing methods and current limitations. Brief Bioinform 2023;24:bbad412. [PMID: 37974506 DOI: 10.1093/bib/bbad412] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2023] [Revised: 10/14/2023] [Accepted: 10/28/2023] [Indexed: 11/19/2023] Open

Wu M, Hao S, Wang X, Su S, Du S, Zhou S, Yang R, Du H. A pyroptosis-related gene signature that predicts immune infiltration and prognosis in colon cancer. Front Oncol 2023;13:1173181. [PMID: 37503314 PMCID: PMC10369052 DOI: 10.3389/fonc.2023.1173181] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/11/2023] [Accepted: 06/23/2023] [Indexed: 07/29/2023] Open

Abstract

Background

Colon cancer (CC) is a highly heterogeneous malignancy associated with high morbidity and mortality. Pyroptosis is a type of programmed cell death characterized by an inflammatory response that can affect the tumor immune microenvironment and has potential prognostic and therapeutic value. The aim of this study was to evaluate the association between pyroptosis-related gene (PRG) expression and CC.

Methods

Based on the expression profiles of PRGs, we classified CC samples from The Cancer Genome Atlas and Gene Expression Omnibus databases into different clusters by unsupervised clustering analysis. The best prognostic signature was screened and established using least absolute shrinkage and selection operator (LASSO) and multivariate Cox regression analyses. Subsequently, a nomogram was established based on multivariate Cox regression analysis. Next, gene set enrichment analysis (GSEA) and gene set variation analysis (GSVA) were performed to explore the potential molecular mechanisms distinguishing the high- and low-risk groups and to explore the differences in clinicopathological characteristics, gene mutation characteristics, abundance of infiltrating immune cells, and immune microenvironment between the two groups. We also evaluated the association between common immune checkpoints and drug sensitivity using risk scores. Immunohistochemistry staining was used to confirm the expression in CC of the genes selected for the prognostic model.

Results

The 1163 CC samples were divided into two clusters (clusters A and B) based on the expression profiles of the 33 PRGs. Genes with prognostic value were screened from the differentially expressed genes (DEGs) between the two clusters, and an eight-PRG prognostic model was constructed. GSEA and GSVA of the high- and low-risk groups revealed that they were mainly enriched in inflammatory response-related pathways. Compared to those in the low-risk group, patients in the high-risk group had worse overall survival, an immunosuppressive microenvironment, and worse sensitivity to immunotherapy and drug treatment.

Conclusion

Our findings provide a foundation for future research targeting pyroptosis and new insights into prognosis and immunotherapy from the perspective of pyroptosis in CC.

Chu BB, Ko S, Zhou JJ, Jensen A, Zhou H, Sinsheimer JS, Lange K. Multivariate genome-wide association analysis by iterative hard thresholding. Bioinformatics 2023;39:btad193. [PMID: 37067496 PMCID: PMC10133532 DOI: 10.1093/bioinformatics/btad193] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2022] [Revised: 04/07/2023] [Accepted: 04/13/2023] [Indexed: 04/18/2023] Open

Survival Analysis with High-Dimensional Omics Data Using a Threshold Gradient Descent Regularization-Based Neural Network Approach. Genes (Basel) 2022;13:genes13091674. [PMID: 36140842 PMCID: PMC9498566 DOI: 10.3390/genes13091674] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2022] [Revised: 09/13/2022] [Accepted: 09/16/2022] [Indexed: 11/17/2022] Open

Caballero FF, Lana A, Struijk EA, Arias-Fernández L, Yévenes-Briones H, Cárdenas-Valladolid J, Salinero-Fort MÁ, Banegas JR, Rodríguez-Artalejo F, Lopez-Garcia E. Prospective Association Between Plasma Amino Acids and Multimorbidity in Older Adults. J Gerontol A Biol Sci Med Sci 2022;78:637-644. [PMID: 35876753 DOI: 10.1093/gerona/glac144] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2022] [Indexed: 11/14/2022] Open

Affiliation(s)

Francisco Félix Caballero Department of Preventive Medicine and Public Health. Universidad Autónoma de Madrid and CIBER of Epidemiology and Public Health, Madrid
Alberto Lana Department of Medicine. Universidad de Oviedo/ISPA, Oviedo
Ellen A Struijk Department of Preventive Medicine and Public Health. Universidad Autónoma de Madrid and CIBER of Epidemiology and Public Health, Madrid
Lucía Arias-Fernández Primary Health Care Network. Asturias Health Service, Asturias
Humberto Yévenes-Briones Department of Preventive Medicine and Public Health. Universidad Autónoma de Madrid and CIBER of Epidemiology and Public Health, Madrid
Juan Cárdenas-Valladolid Dirección Técnica de Sistemas de Información. Gerencia Asistencial de Atención Primaria, Servicio Madrileño de Salud, Madrid; Fundación de Investigación e Innovación Biosanitaria de Atención Primaria, Madrid; Enfermería. Universidad Alfonso X El Sabio, Villanueva de la Cañada
Miguel Ángel Salinero-Fort Fundación de Investigación e Innovación Biosanitaria de Atención Primaria, Madrid; Subdirección General de Investigación Sanitaria. Consejería de Sanidad, Madrid; Red de Investigación en Servicios de Salud en Enfermedades Crónicas; Grupo de Envejecimiento y Fragilidad de las personas mayores. IdIPAZ, Madrid
José R Banegas Department of Preventive Medicine and Public Health. Universidad Autónoma de Madrid and CIBER of Epidemiology and Public Health, Madrid
Fernando Rodríguez-Artalejo Department of Preventive Medicine and Public Health. Universidad Autónoma de Madrid and CIBER of Epidemiology and Public Health, Madrid; IMDEA-Food Institute. CEI UAM+CSIC, Madrid
Esther Lopez-Garcia Department of Preventive Medicine and Public Health. Universidad Autónoma de Madrid and CIBER of Epidemiology and Public Health, Madrid; IMDEA-Food Institute. CEI UAM+CSIC, Madrid

Liu J, Si Y, Niu Y, Zhang R. Projection quantile correlation and its use in high-dimensional grouped variable screening. Comput Stat Data Anal 2022. [DOI: 10.1016/j.csda.2021.107369] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]

Wang K, Liu Y, Lu G, Xiao J, Huang J, Lei L, Peng J, Li Y, Wei S. A functional methylation signature to predict the prognosis of Chinese lung adenocarcinoma based on TCGA. Cancer Med 2021;11:281-294. [PMID: 34854250 PMCID: PMC8704183 DOI: 10.1002/cam4.4431] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/27/2021] [Revised: 10/08/2021] [Accepted: 10/10/2021] [Indexed: 01/16/2023] Open

Abstract

Background

Lung cancer is the leading cause of cancer morbidity and mortality worldwide; however, individualized treatment is still unsatisfactory. DNA methylation can affect gene regulation and may be one of the most valuable biomarkers in predicting the prognosis of lung adenocarcinoma. This study aimed to identify methylation CpG sites that may be used to predict lung adenocarcinoma prognosis.

Methods

The Cancer Genome Atlas (TCGA) database was used to detect methylation CpG sites associated with lung adenocarcinoma prognosis and construct a methylation signature model. Then, a Chinese cohort study was carried out to estimate the association between methylation and lung adenocarcinoma prognosis. Biological function studies, including demethylation treatment, cell proliferative capacity, and gene expression changes in lung adenocarcinoma cell lines, were further performed.

Results

In the TCGA set, three methylation CpG sites were selected that were associated with lung adenocarcinoma prognosis (cg14517217, cg15386964, and cg18878992). The risk of mortality increased in lung adenocarcinoma patients as the level of the methylation signature, based on the levels of the three methylation sites, gradually increased (HR = 45.30, 95% CI = 26.69–66.83; p < 0.001). The C‐statistic value increased to 0.77 when age, gender, and other clinical variables were added to the signature in the prediction model. A similar pattern was confirmed in the Chinese lung adenocarcinoma cohort. In the biological function studies, the proliferative capacity of cell lines was inhibited when the cells were demethylated with 5‐aza‐2'‐deoxycytidine (5‐aza‐2dC). The mRNA and protein expression levels of SEPT9 and HIST1H2BH (cg14517217 and cg15386964) were downregulated with different concentrations of 5‐aza‐2dC treatment, while cg18878992 showed the opposite result.

Conclusion

This study is the first to develop a three‐CpG‐based model for lung adenocarcinoma, which is a practical and useful tool for prognostic prediction that has been validated in a Chinese population.

Kim J, Shen J, Wang A, Mehrotra DV, Ko S, Zhou JJ, Zhou H. VCSEL: Prioritizing SNP-set by penalized variance component selection. Ann Appl Stat 2021;15:1652-1672. [DOI: 10.1214/21-aoas1491] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]

Mieth B, Rozier A, Rodriguez JA, Höhne MMC, Görnitz N, Müller KR. DeepCOMBI: explainable artificial intelligence for the analysis and discovery in genome-wide association studies. NAR Genom Bioinform 2021;3:lqab065. [PMID: 34296082 PMCID: PMC8291080 DOI: 10.1093/nargab/lqab065] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 11/24/2020] [Revised: 05/27/2021] [Accepted: 07/08/2021] [Indexed: 02/06/2023] Open

Zou K, Kim KS, Kim K, Kang D, Park YH, Sun H, Ha BK, Ha J, Jun TH. Genetic Diversity and Genome-Wide Association Study of Seed Aspect Ratio Using a High-Density SNP Array in Peanut (Arachis hypogaea L.). Genes (Basel) 2020;12:E2. [PMID: 33375051 PMCID: PMC7822046 DOI: 10.3390/genes12010002] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 08/22/2020] [Revised: 12/09/2020] [Accepted: 12/17/2020] [Indexed: 12/12/2022] Open

Selection probability of multivariate regularization to identify pleiotropic variants in genetic association studies. Communications for Statistical Applications and Methods 2020. [DOI: 10.29220/csam.2020.27.5.535] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]

Chu BB, Keys KL, German CA, Zhou H, Zhou JJ, Sobel EM, Sinsheimer JS, Lange K. Iterative hard thresholding in genome-wide association studies: Generalized linear models, prior weights, and double sparsity. Gigascience 2020;9:giaa044. [PMID: 32491161 PMCID: PMC7268817 DOI: 10.1093/gigascience/giaa044] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 11/19/2019] [Revised: 02/27/2020] [Accepted: 04/14/2020] [Indexed: 11/17/2022] Open

Blagus R, Goeman JJ. Mean squared error of ridge estimators in logistic regression. Stat Neerl 2020. [DOI: 10.1111/stan.12201] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/28/2022]

Kim K, Sun H. Incorporating genetic networks into case-control association studies with high-dimensional DNA methylation data. BMC Bioinformatics 2019;20:510. [PMID: 31640538 PMCID: PMC6805595 DOI: 10.1186/s12859-019-3040-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 04/10/2019] [Accepted: 08/21/2019] [Indexed: 12/23/2022] Open

Abstract

Background

In human genetic association studies with high-dimensional gene expression data, it is well known that statistical selection methods utilizing prior biological network knowledge, such as genetic pathways and signaling pathways, can outperform methods that ignore genetic network structures in terms of true positive selection. In recent epigenetic research on case-control association studies, many statistical methods have been proposed to identify cancer-related CpG sites and their corresponding genes from high-dimensional DNA methylation array data. However, most existing methods are not designed to utilize genetic network information, although methylation levels of linked genes in genetic networks tend to be highly correlated with each other.

Results

We propose a new approach that combines data dimension reduction techniques with network-based regularization to identify outcome-related genes for analysis of high-dimensional DNA methylation data. In simulation studies, we demonstrated that the proposed approach outperforms other statistical methods that do not utilize genetic network information in terms of true positive selection. We also applied it to the 450K DNA methylation array data of the four breast invasive carcinoma cancer subtypes from The Cancer Genome Atlas (TCGA) project.

Conclusions

The proposed variable selection approach can utilize prior biological network information for analysis of high-dimensional DNA methylation array data. It first captures gene-level signals from multiple CpG sites using a data dimension reduction technique and then performs network-based regularization based on biological network graph information. It can select potentially cancer-related genes and genetic pathways that were missed by the existing methods.

Electronic supplementary material

The online version of this article (10.1186/s12859-019-3040-x) contains supplementary material, which is available to authorized users.

Luo S, Chen Z. Feature Selection by Canonical Correlation Search in High-Dimensional Multiresponse Models With Complex Group Structures. J Am Stat Assoc 2019. [DOI: 10.1080/01621459.2019.1609972] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 10/26/2022]

Zhou H, Sinsheimer JS, Bates DM, Chu BB, German CA, Ji SS, Keys KL, Kim J, Ko S, Mosher GD, Papp JC, Sobel EM, Zhai J, Zhou JJ, Lange K. OPENMENDEL: a cooperative programming project for statistical genetics. Hum Genet 2019;139:61-71. [PMID: 30915546 DOI: 10.1007/s00439-019-02001-z] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 10/26/2018] [Accepted: 03/15/2019] [Indexed: 01/06/2023]

Qian W, Li W, Sogawa Y, Fujimaki R, Yang X, Liu J. An Interactive Greedy Approach to Group Sparsity in High Dimensions. Technometrics 2019. [DOI: 10.1080/00401706.2018.1537897] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 10/27/2022]

Katsevich E, Sabatti C. Multilayer Knockoff Filter: Controlled Variable Selection at Multiple Resolutions. Ann Appl Stat 2019;13:1-33. [PMID: 31687060 PMCID: PMC6827557 DOI: 10.1214/18-aoas1185] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Indexed: 11/19/2022]

Fan X, Wang H, Sun L, Zheng X, Yin X, Zuo X, Peng Q, Standish KA, Cheng H, Zhang Y, Wang Z, Xiao F, Yang S, Zhang X, Schork NJ. Fine mapping and subphenotyping implicates ADRA1B gene variants in psoriasis susceptibility in a Chinese population. Epigenomics 2019;11:455-467. [PMID: 30785334 DOI: 10.2217/epi-2018-0131] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 02/08/2023] Open

Affiliation(s)

Xing Fan Department of Dermatology, Anhui Medical University, The First Affiliated Hospital of Anhui Medical University, 218 Jixi Road, Shushan District, Hefei City, Anhui, 230022, PR China
Hongyan Wang Department of Dermatology, Anhui Medical University, The First Affiliated Hospital of Anhui Medical University, 218 Jixi Road, Shushan District, Hefei City, Anhui, 230022, PR China
Liangdan Sun Department of Dermatology, Anhui Medical University, The First Affiliated Hospital of Anhui Medical University, 218 Jixi Road, Shushan District, Hefei City, Anhui, 230022, PR China
Xiaodong Zheng Institute of Dermatology, Anhui Medical University, 81 Meishan Road, Shushan District, Hefei City, Anhui, 230032, PR China
Xianyong Yin Institute of Dermatology, Anhui Medical University, 81 Meishan Road, Shushan District, Hefei City, Anhui, 230032, PR China
Xianbo Zuo Institute of Dermatology, Anhui Medical University, 81 Meishan Road, Shushan District, Hefei City, Anhui, 230032, PR China
Qian Peng Molecular & Cellular Neuroscience, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA 92037, USA
Kristopher A Standish Genomics, Bioinformatics, J. Craig Venter Institute, 4120 Capricorn Lane, La Jolla, CA 92037, USA
Hui Cheng Department of Dermatology, Anhui Medical University, The First Affiliated Hospital of Anhui Medical University, 218 Jixi Road, Shushan District, Hefei City, Anhui, 230022, PR China
Yaohua Zhang Institute of Dermatology, Department of Dermatology, Huashan Hospital, Fudan University, No.12, Middle Urumqi Road, Shanghai, 200040, PR China
Zaixing Wang Department of Dermatology, Anhui Medical University, The First Affiliated Hospital of Anhui Medical University, 218 Jixi Road, Shushan District, Hefei City, Anhui, 230022, PR China
Fengli Xiao Department of Dermatology, Anhui Medical University, The First Affiliated Hospital of Anhui Medical University, 218 Jixi Road, Shushan District, Hefei City, Anhui, 230022, PR China
Sen Yang Department of Dermatology, Anhui Medical University, The First Affiliated Hospital of Anhui Medical University, 218 Jixi Road, Shushan District, Hefei City, Anhui, 230022, PR China
Xuejun Zhang Department of Dermatology, Anhui Medical University, The First Affiliated Hospital of Anhui Medical University, 218 Jixi Road, Shushan District, Hefei City, Anhui, 230022, PR China
Nicholas J Schork Human Biology, J. Craig Venter Institute, 4120 Capricorn Lane, La Jolla, CA 92037, USA

Genomic prediction of relapse in recipients of allogeneic haematopoietic stem cell transplantation. Leukemia 2018;33:240-248. [PMID: 30089915 PMCID: PMC6326954 DOI: 10.1038/s41375-018-0229-3] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 12/15/2017] [Revised: 06/21/2018] [Accepted: 07/17/2018] [Indexed: 02/06/2023]

Choi J, Kim K, Sun H. New variable selection strategy for analysis of high-dimensional DNA methylation data. J Bioinform Comput Biol 2018;16:1850010. [PMID: 29954287 DOI: 10.1142/s0219720018500105] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 01/22/2023]

Mat AM, Klopp C, Payton L, Jeziorski C, Chalopin M, Amzil Z, Tran D, Wikfors GH, Hégaret H, Soudant P, Huvet A, Fabioux C. Oyster transcriptome response to Alexandrium exposure is related to saxitoxin load and characterized by disrupted digestion, energy balance, and calcium and sodium signaling. Aquatic Toxicology (Amsterdam, Netherlands) 2018;199:127-137. [PMID: 29621672 DOI: 10.1016/j.aquatox.2018.03.030] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 02/06/2018] [Revised: 03/22/2018] [Accepted: 03/25/2018] [Indexed: 06/08/2023]

Abstract

Harmful Algal Blooms are worldwide occurrences that can cause poisoning in human seafood consumers as well as mortality and sublethal effects in wildlife, propagating economic losses. One of the most widespread toxigenic microalgal taxa is the dinoflagellate genus Alexandrium, which includes species producing neurotoxins referred to as PST (Paralytic Shellfish Toxins). Blooms cause shellfish harvest restrictions to protect human consumers from accumulated toxins. Large inter-individual variability in toxin load within an exposed bivalve population complicates monitoring of shellfish toxicity for ecology and human health regulation. To decipher the physiological pathways involved in the bivalve response to PST, we explored the whole transcriptome of the digestive gland of the Pacific oyster Crassostrea gigas fed experimentally with a toxic Alexandrium minutum culture. The largest differences in transcript abundance were between oysters with contrasting toxin loads (1098 transcripts), rather than between exposed and non-exposed oysters (16 transcripts), emphasizing the importance of toxin load in oyster response to toxic dinoflagellates. Additionally, penalized regressions, innovative in this field, accurately modeled toxin load based upon only 70 transcripts. Transcriptomic differences between oysters with contrasting PST burdens revealed a limited suite of metabolic pathways affected, including ion channels, neuromuscular communication, and digestion, all of which are interconnected and linked to sodium and calcium exchanges. Carbohydrate metabolism, unconsidered previously in studies of harmful algal effects on shellfish, was also highlighted, suggesting energy challenge in oysters with high toxin loads. Associations between toxin load, genotype, and mRNA levels were revealed that open new doors for genetic studies identifying genetically based low toxin accumulation.

Keys KL, Chen GK, Lange K. Iterative hard thresholding for model selection in genome-wide association studies. Genet Epidemiol 2017;41:756-768. [PMID: 28875524 DOI: 10.1002/gepi.22068] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 02/14/2017] [Revised: 07/13/2017] [Accepted: 08/02/2017] [Indexed: 11/05/2022]

Bomba L, Walter K, Soranzo N. The impact of rare and low-frequency genetic variants in common disease. Genome Biol 2017;18:77. [PMID: 28449691 PMCID: PMC5408830 DOI: 10.1186/s13059-017-1212-4] [Citation(s) in RCA: 217] [Impact Index Per Article: 31.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/15/2022] Open

Longitudinal data analysis for rare variants detection with penalized quadratic inference function. Sci Rep 2017;7:650. [PMID: 28381821 PMCID: PMC5429681 DOI: 10.1038/s41598-017-00712-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/09/2016] [Accepted: 03/08/2017] [Indexed: 11/08/2022] Open

Zhou H, Blangero J, Dyer TD, Chan KHK, Lange K, Sobel EM. Fast Genome-Wide QTL Association Mapping on Pedigree and Population Data. Genet Epidemiol 2017;41:174-186. [PMID: 27943406 PMCID: PMC5340631 DOI: 10.1002/gepi.21988] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/12/2015] [Revised: 05/02/2016] [Accepted: 05/08/2016] [Indexed: 01/14/2023]

Abstract

Since most analysis software for genome-wide association studies (GWAS) currently exploits only unrelated individuals, there is a need for efficient applications that can handle general pedigree data or mixtures of both population and pedigree data. Even datasets thought to consist of only unrelated individuals may include cryptic relationships that can lead to false positives if not discovered and controlled for. In addition, family designs possess compelling advantages. They are better equipped to detect rare variants, control for population stratification, and facilitate the study of parent-of-origin effects. Pedigrees selected for extreme trait values often segregate a single gene with strong effect. Finally, many pedigrees are available as an important legacy from the era of linkage analysis. Unfortunately, pedigree likelihoods are notoriously hard to compute. In this paper, we reexamine the computational bottlenecks and implement ultra-fast pedigree-based GWAS analysis. Kinship coefficients can either be based on explicitly provided pedigrees or automatically estimated from dense markers. Our strategy (a) works for random sample data, pedigree data, or a mix of both; (b) entails no loss of power; (c) allows for any number of covariate adjustments, including correction for population stratification; (d) allows for testing SNPs under additive, dominant, and recessive models; and (e) accommodates both univariate and multivariate quantitative traits. On a typical personal computer (six CPU cores at 2.67 GHz), analyzing a univariate HDL (high-density lipoprotein) trait from the San Antonio Family Heart Study (935,392 SNPs on 1,388 individuals in 124 pedigrees) takes less than 2 min and 1.5 GB of memory. Complete multivariate QTL analysis of the three time-points of the longitudinal HDL multivariate trait takes less than 5 min and 1.5 GB of memory.
The algorithm is implemented as the Ped-GWAS Analysis (Option 29) in the Mendel statistical genetics package, which is freely available for Macintosh, Linux, and Windows platforms from http://genetics.ucla.edu/software/mendel.

Brzyski D, Peterson CB, Sobczyk P, Candès EJ, Bogdan M, Sabatti C. Controlling the Rate of GWAS False Discoveries. Genetics 2017;205:61-75. [PMID: 27784720 PMCID: PMC5223524 DOI: 10.1534/genetics.116.193987] [Citation(s) in RCA: 72] [Impact Index Per Article: 10.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/18/2016] [Accepted: 10/11/2016] [Indexed: 01/13/2023] Open

Wang C, Ruggeri F, Hsiao CK, Argiento R. Bayesian nonparametric clustering and association studies for candidate SNP observations. Int J Approx Reason 2017. [DOI: 10.1016/j.ijar.2016.07.014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/21/2022]

Mieth B, Kloft M, Rodríguez JA, Sonnenburg S, Vobruba R, Morcillo-Suárez C, Farré X, Marigorta UM, Fehr E, Dickhaus T, Blanchard G, Schunk D, Navarro A, Müller KR. Combining Multiple Hypothesis Testing with Machine Learning Increases the Statistical Power of Genome-wide Association Studies. Sci Rep 2016;6:36671. [PMID: 27892471 PMCID: PMC5125008 DOI: 10.1038/srep36671] [Citation(s) in RCA: 35] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/31/2016] [Accepted: 10/06/2016] [Indexed: 12/21/2022] Open

Affiliation(s)

Bettina Mieth Machine Learning Group, Technische Universität Berlin, Berlin, 10587, Germany
Marius Kloft Department of Computer Science, Humboldt University of Berlin, Berlin, 10099, Germany
Juan Antonio Rodríguez Institut de Biología Evolutiva (CSIC-UPF). Departament de Ciències Experimentals i de la Salut. Universitat Pompeu Fabra, Barcelona, 08003, Spain
Sören Sonnenburg TomTom Research, Berlin, 12555, Germany
Robin Vobruba Machine Learning Group, Technische Universität Berlin, Berlin, 10587, Germany
Carlos Morcillo-Suárez Institut de Biología Evolutiva (CSIC-UPF). Departament de Ciències Experimentals i de la Salut. Universitat Pompeu Fabra, Barcelona, 08003, Spain
Xavier Farré Institut de Biología Evolutiva (CSIC-UPF). Departament de Ciències Experimentals i de la Salut. Universitat Pompeu Fabra, Barcelona, 08003, Spain
Urko M. Marigorta School of Biology, Georgia Institute of Technology, Atlanta, 30332, GA, USA
Ernst Fehr Department of Economics, Laboratory for Social and Neural Systems Research, University of Zurich, Zurich, 8006, Switzerland
Thorsten Dickhaus Institute for Statistics (FB 3), University of Bremen, Bremen, 28359, Germany
Gilles Blanchard Department of Mathematics, University of Potsdam, Potsdam, 14476, Germany
Daniel Schunk Department of Economics, University of Mainz, Mainz, 55099, Germany
Arcadi Navarro Institut de Biología Evolutiva (CSIC-UPF). Departament de Ciències Experimentals i de la Salut. Universitat Pompeu Fabra, Barcelona, 08003, Spain Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, 08010, Spain Center for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, 08003, Spain
Klaus-Robert Müller Machine Learning Group, Technische Universität Berlin, Berlin, 10587, Germany Department of Brain and Cognitive Engineering, Korea University, Seoul, Republic of Korea

Identifying rare and common variants with Bayesian variable selection. BMC Proc 2016;10:379-384. [PMID: 27980665 PMCID: PMC5133477 DOI: 10.1186/s12919-016-0059-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/04/2022] Open

Larson NB, McDonnell S, Albright LC, Teerlink C, Stanford J, Ostrander EA, Isaacs WB, Xu J, Cooney KA, Lange E, Schleutker J, Carpten JD, Powell I, Bailey-Wilson J, Cussenot O, Cancel-Tassin G, Giles G, MacInnis R, Maier C, Whittemore AS, Hsieh CL, Wiklund F, Catalona WJ, Foulkes W, Mandal D, Eeles R, Kote-Jarai Z, Ackerman MJ, Olson TM, Klein CJ, Thibodeau SN, Schaid DJ. Post hoc Analysis for Detecting Individual Rare Variant Risk Associations Using Probit Regression Bayesian Variable Selection Methods in Case-Control Sequencing Studies. Genet Epidemiol 2016;40:461-9. [PMID: 27312771 PMCID: PMC5063501 DOI: 10.1002/gepi.21983] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/16/2015] [Revised: 04/22/2016] [Accepted: 04/27/2016] [Indexed: 12/27/2022]

Affiliation(s)

Nicholas B. Larson Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, Rochester, MN
Shannon McDonnell Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, Rochester, MN
Lisa Cannon Albright Dept. Internal Medicine, University of Utah School of Medicine, Salt Lake City, UT
Craig Teerlink Dept. Internal Medicine, University of Utah School of Medicine, Salt Lake City, UT
Janet Stanford Fred Hutchinson Cancer Research Center, Seattle, WA
Elaine A. Ostrander National Human Genome Research Institute, Bethesda, MD
William B. Isaacs Johns Hopkins Hospital, Department of Urology, Baltimore, MD
Jianfeng Xu NorthShore University Health System Research Institute, Chicago, IL
Kathleen A. Cooney Depts. of Internal Medicine and Urology, University of Michigan Medical School, Ann Arbor, MI
Ethan Lange Dept. of Genetics, University of North Carolina, Chapel Hill, NC
Johanna Schleutker Dept. of Medical Biochemistry and Genetics, Institute of Biomedicine, University of Turku, Finland
John D. Carpten Integrated Cancer Genomics Division, The Translational Genomics Research Institute, Phoenix, AZ
Isaac Powell Wayne State University, Detroit, MI
Joan Bailey-Wilson Statistical Genetics Section, National Human Genome Research Institute, Bethesda, MD
Olivier Cussenot CeRePP, Hopital Tenon, Paris, France
Geraldine Cancel-Tassin CeRePP, Hopital Tenon, Paris, France
Graham Giles Cancer Epidemiology Centre, Cancer Council Victoria, and Centre for Epidemiology and Biostatistics, School of Population and Global Health, University of Melbourne, Melbourne, Australia
Robert MacInnis Cancer Epidemiology Centre, Cancer Council Victoria, and Centre for Epidemiology and Biostatistics, School of Population and Global Health, University of Melbourne, Melbourne, Australia
Christiane Maier Dept. of Urology, University of Ulm, Ulm, Germany
Alice S. Whittemore Dept. Health Research and Policy, Stanford University, Stanford, CA
Chih-Lin Hsieh Dept. of Urology, University of Southern California, Los Angeles, CA
Fredrik Wiklund Dept. of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
William J. Catalona Northwestern University Feinberg School of Medicine, Chicago, IL
William Foulkes Depts. of Oncology and Human Genetics, Montreal General Hospital, Montreal QC, Canada
Diptasri Mandal Dept. of Genetics, LSU Health Sciences Center, New Orleans, LA
Rosalind Eeles Genetics and Epidemiology, Institute of Cancer Research, Sutton Surrey, UK
Zsofia Kote-Jarai Genetics and Epidemiology, Institute of Cancer Research, Sutton Surrey, UK
Michael J. Ackerman Dept. of Pediatric and Adolescent Medicine, Mayo Clinic, Rochester, MN
Timothy M. Olson Dept. of Pediatric and Adolescent Medicine, Mayo Clinic, Rochester, MN
Christopher J. Klein Dept. of Neurology, Mayo Clinic, Rochester, MN
Stephen N. Thibodeau Dept. of Laboratory Medicine/Pathology, Mayo Clinic, Rochester, MN
Daniel J. Schaid Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, Rochester, MN

He Q, Cai T, Liu Y, Zhao N, Harmon QE, Almli LM, Binder EB, Engel SM, Ressler KJ, Conneely KN, Lin X, Wu MC. Prioritizing individual genetic variants after kernel machine testing using variable selection. Genet Epidemiol 2016;40:722-731. [PMID: 27488097 DOI: 10.1002/gepi.21993] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/07/2015] [Revised: 05/28/2016] [Accepted: 06/20/2016] [Indexed: 01/06/2023]

Nicolae DL. Association Tests for Rare Variants. Annu Rev Genomics Hum Genet 2016;17:117-30. [PMID: 27147090 DOI: 10.1146/annurev-genom-083115-022609] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]

Mallick H, Tiwari HK. EM Adaptive LASSO-A Multilocus Modeling Strategy for Detecting SNPs Associated with Zero-inflated Count Phenotypes. Front Genet 2016;7:32. [PMID: 27066062 PMCID: PMC4811966 DOI: 10.3389/fgene.2016.00032] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/22/2015] [Accepted: 02/22/2016] [Indexed: 11/13/2022] Open

Abstract

Count data are increasingly ubiquitous in genetic association studies, where it is possible to observe excess zero counts as compared to what is expected based on standard assumptions. For instance, in rheumatology, data are usually collected in multiple joints within a person or multiple sub-regions of a joint, and it is not uncommon for the phenotypes to contain an enormous number of zeros due to the presence of excessive zero counts in the majority of patients. Most existing statistical methods assume that the count phenotypes follow one of these four distributions with appropriate dispersion-handling mechanisms: Poisson, Zero-inflated Poisson (ZIP), Negative Binomial, and Zero-inflated Negative Binomial (ZINB). However, little is known about their implications in genetic association studies. Also, there is a relative paucity of literature on their usefulness with respect to model misspecification and variable selection. In this article, we have investigated the performance of several state-of-the-art approaches for handling zero-inflated count data along with a novel penalized regression approach with an adaptive LASSO penalty, by simulating data under a variety of disease models and linkage disequilibrium patterns. By taking into account data-adaptive weights in the estimation procedure, the proposed method provides greater flexibility in multi-SNP modeling of zero-inflated count phenotypes. A fast coordinate descent algorithm nested within an EM (expectation-maximization) algorithm is implemented for estimating the model parameters and conducting variable selection simultaneously. Results show that the proposed method has optimal performance in the presence of multicollinearity, as measured by both prediction accuracy and empirical power, which is especially apparent as the sample size increases. Moreover, the Type I error rates become more or less uncontrollable for the competing methods when a model is misspecified, a phenomenon routinely encountered in practice.

Stell L, Sabatti C. Genetic Variant Selection: Learning Across Traits and Sites. Genetics 2016;202:439-55. [PMID: 26680660 PMCID: PMC4788227 DOI: 10.1534/genetics.115.184572] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/06/2015] [Accepted: 11/30/2015] [Indexed: 11/18/2022] Open

Pineda S, Real FX, Kogevinas M, Carrato A, Chanock SJ, Malats N, Van Steen K. Integration Analysis of Three Omics Data Using Penalized Regression Methods: An Application to Bladder Cancer. PLoS Genet 2015;11:e1005689. [PMID: 26646822 PMCID: PMC4672920 DOI: 10.1371/journal.pgen.1005689] [Citation(s) in RCA: 50] [Impact Index Per Article: 5.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/08/2015] [Accepted: 10/30/2015] [Indexed: 01/10/2023] Open

Abstract

Omics data integration is becoming necessary to investigate the genomic mechanisms involved in complex diseases. During the integration process, many challenges arise such as data heterogeneity, the smaller number of individuals in comparison to the number of parameters, multicollinearity, and interpretation and validation of results due to their complexity and lack of knowledge about biological processes. To overcome some of these issues, innovative statistical approaches are being developed. In this work, we propose a permutation-based method to concomitantly assess significance and correct for multiple testing with the MaxT algorithm. This was applied with penalized regression methods (LASSO and ENET) when exploring relationships between common genetic variants, DNA methylation and gene expression measured in bladder tumor samples. The overall analysis flow consisted of three steps: (1) SNPs/CpGs were selected for each gene probe within a 1 Mb window upstream and downstream of the gene; (2) LASSO and ENET were applied to assess the association between each expression probe and the selected SNPs/CpGs in three multivariable models (SNP, CpG, and Global models, the latter integrating SNPs and CpGs); and (3) the significance of each model was assessed using the permutation-based MaxT method. We identified 48 genes whose expression levels were significantly associated with both SNPs and CpGs. Importantly, 36 (75%) of them were replicated in an independent data set (TCGA) and the performance of the proposed method was checked with a simulation study. We further support our results with a biological interpretation based on an enrichment analysis. The approach we propose allows reducing computational time and is flexible and easy to implement when analyzing several types of omics data.
Our results highlight the importance of integrating omics data by applying appropriate statistical strategies to discover new insights into the complex genetic mechanisms involved in disease conditions.

At present, it is already possible to generate different types of omics (high-throughput) data in the same individuals. However, we lack methodology to adequately combine them. Many challenges arise as the amount of data increases, and we need to find a way to identify and understand the complex relationships when integrating data. In this regard, new statistical approaches are needed, such as the ones we propose and apply here to integrate three types of omics data (genomics, epigenomics, and transcriptomics) generated using bladder cancer tumor samples. These innovative approaches (LASSO and ENET combined with a permutation-based MaxT method) allowed us to find 48 genes whose expression levels were significantly associated with genomic and epigenomic markers. The adequacy of this approach was confirmed by the use of an independent data set from The Cancer Genome Atlas Consortium: 75% of the genes were replicated. Sound prior biological evidence further supports the results obtained.
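
As a rough illustration of the permutation-based MaxT step described in this abstract, the following sketch computes single-step maxT adjusted p-values. It is not the authors' pipeline: where the abstract pairs MaxT with LASSO and ENET model fits, this sketch substitutes the absolute Pearson correlation of each predictor with the outcome as the test statistic, and the function names, toy data, and number of permutations are all hypothetical choices made for illustration.

```python
import random
from math import sqrt


def abs_corr(x, y):
    """Absolute Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / sqrt(sxx * syy)


def maxt_adjusted_pvalues(features, outcome, n_perm=500, seed=0):
    """Single-step maxT: compare each observed statistic with the
    permutation distribution of the maximum statistic over all features."""
    rng = random.Random(seed)
    observed = [abs_corr(f, outcome) for f in features]
    shuffled = list(outcome)
    exceed = [0] * len(features)
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any feature-outcome association
        max_stat = max(abs_corr(f, shuffled) for f in features)
        for j, obs in enumerate(observed):
            if max_stat >= obs:
                exceed[j] += 1
    # Add-one correction keeps every adjusted p-value strictly positive.
    return [(1 + c) / (n_perm + 1) for c in exceed]


# Toy data (assumed): feature 0 tracks the outcome, features 1-4 are noise.
rng = random.Random(42)
outcome = [rng.gauss(0, 1) for _ in range(40)]
features = [[y + rng.gauss(0, 0.1) for y in outcome]]
features += [[rng.gauss(0, 1) for _ in range(40)] for _ in range(4)]
pvals = maxt_adjusted_pvalues(features, outcome)
print(pvals)
```

Because every observed statistic is compared against the permutation distribution of the maximum, the adjustment for multiple testing is built into the p-values themselves.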

Qiu C, Gelaye B, Denis M, Tadesse MG, Luque Fernandez MA, Enquobahrie DA, Ananth CV, Sanchez SE, Williams MA. Circadian clock-related genetic risk scores and risk of placental abruption. Placenta 2015;36:1480-6. [PMID: 26515929 PMCID: PMC5010362 DOI: 10.1016/j.placenta.2015.10.005] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 06/16/2015] [Revised: 10/06/2015] [Accepted: 10/11/2015] [Indexed: 10/22/2022]

Liquet B, Lafaye de Micheaux P, Hejblum BP, Thiébaut R. Group and sparse group partial least square approaches applied in genomics context. Bioinformatics 2015;32:35-42. [DOI: 10.1093/bioinformatics/btv535] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/14/2015] [Accepted: 09/03/2015] [Indexed: 01/07/2023] Open

Fouladi R, Bessonov K, Van Lishout F, Van Steen K. Model-Based Multifactor Dimensionality Reduction for Rare Variant Association Analysis. Hum Hered 2015. [PMID: 26201701 DOI: 10.1159/000381286] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/19/2022] Open

Li Y, Nan B, Zhu J. Multivariate sparse group lasso for the multivariate multiple linear regression with an arbitrary group structure. Biometrics 2015;71:354-63. [PMID: 25732839 DOI: 10.1111/biom.12292] [Citation(s) in RCA: 61] [Impact Index Per Article: 6.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/01/2013] [Revised: 12/01/2014] [Accepted: 01/01/2015] [Indexed: 11/27/2022]

Garner C. Confounded by sequencing depth in association studies of rare alleles. Genet Epidemiol 2011;35:261-8. [PMID: 21328616 DOI: 10.1002/gepi.20574] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/01/2010] [Accepted: 01/12/2011] [Indexed: 11/12/2022]

Interactions of early adversity with stress-related gene polymorphisms impact regional brain structure in females. Brain Struct Funct 2015;221:1667-79. [PMID: 25630611 DOI: 10.1007/s00429-015-0996-9] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/06/2014] [Accepted: 01/21/2015] [Indexed: 12/17/2022]

Matsui H. SPARSE REGULARIZATION FOR BI-LEVEL VARIABLE SELECTION. JOURNAL JAPANESE SOCIETY OF COMPUTATIONAL STATISTICS 2015. [DOI: 10.5183/jjscs.1502001_216] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]

Denis M, Enquobahrie DA, Tadesse MG, Gelaye B, Sanchez SE, Salazar M, Ananth CV, Williams MA. Placental genome and maternal-placental genetic interactions: a genome-wide and candidate gene association study of placental abruption. PLoS One 2014;9:e116346. [PMID: 25549360 PMCID: PMC4280220 DOI: 10.1371/journal.pone.0116346] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/18/2014] [Accepted: 12/08/2014] [Indexed: 01/02/2023] Open

Abstract

While available evidence supports the role of genetics in the pathogenesis of placental abruption (PA), PA-related placental genome variations and maternal-placental genetic interactions have not been investigated. Maternal blood and placental samples collected from participants in the Peruvian Abruptio Placentae Epidemiology study were genotyped using Illumina's Cardio-Metabochip platform. We examined 118,782 genome-wide SNPs and 333 SNPs in 32 candidate genes from mitochondrial biogenesis and oxidative phosphorylation pathways in placental DNA from 280 PA cases and 244 controls. We assessed maternal-placental interactions in the candidate gene SNPs and two imprinted regions (IGF2/H19 and C19MC). Univariate and penalized logistic regression models were fit to estimate odds ratios. We examined the combined effect of multiple SNPs on PA risk using weighted genetic risk scores (WGRS) with repeated ten-fold cross-validations. A multinomial model was used to investigate maternal-placental genetic interactions. In placental genome-wide and candidate gene analyses, no SNP was significant after false discovery rate correction. The top genome-wide association study (GWAS) hits were rs544201, rs1484464 (CTNNA2), rs4149570 (TNFRSF1A) and rs13055470 (ZNRF3) (p-values: 1.11e-05 to 3.54e-05). The top 200 SNPs of the GWAS overrepresented genes involved in cell cycle, growth and proliferation. The top candidate gene hits were rs16949118 (COX10) and rs7609948 (THRB) (p-values: 6.00e-03 and 8.19e-03). Participants in the highest quartile of WGRS based on cross-validations using SNPs selected from the GWAS and candidate gene analyses had an 8.40-fold (95% CI: 5.8-12.56) and a 4.46-fold (95% CI: 2.94-6.72) higher odds of PA compared to participants in the lowest quartile. We found maternal-placental genetic interactions on PA risk for two SNPs in PPARG (chr3:12313450 and chr3:12412978) and maternal imprinting effects for multiple SNPs in the C19MC and IGF2/H19 regions.
Variations in the placental genome and interactions between maternal-placental genetic variations may contribute to PA risk. Larger studies may help advance our understanding of PA pathogenesis.

Ionita-Laza I, Capanu M, De Rubeis S, McCallum K, Buxbaum JD. Identification of rare causal variants in sequence-based studies: methods and applications to VPS13B, a gene involved in Cohen syndrome and autism. PLoS Genet 2014;10:e1004729. [PMID: 25502226 PMCID: PMC4263785 DOI: 10.1371/journal.pgen.1004729] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/15/2014] [Accepted: 09/02/2014] [Indexed: 11/18/2022] Open

Li J, Zhong W, Li R, Wu R. A FAST ALGORITHM FOR DETECTING GENE-GENE INTERACTIONS IN GENOME-WIDE ASSOCIATION STUDIES. Ann Appl Stat 2014;8:2292-2318. [PMID: 26457126 DOI: 10.1214/14-aoas771] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/22/2022]

Abstract

With the recent advent of high-throughput genotyping techniques, genetic data for genome-wide association studies (GWAS) have become increasingly available, which calls for the development of efficient and effective statistical approaches. Although many such approaches have been developed and used to identify single-nucleotide polymorphisms (SNPs) that are associated with complex traits or diseases, few are able to detect gene-gene interactions among different SNPs. Genetic interactions, also known as epistasis, have been recognized to play a pivotal role in contributing to the genetic variation of phenotypic traits. However, because of an extremely large number of SNP-SNP combinations in GWAS, the model dimensionality can quickly become so overwhelming that no prevailing variable selection methods are capable of handling this problem. In this paper, we present a statistical framework for characterizing main genetic effects and epistatic interactions in a GWAS. Specifically, we first propose a two-stage sure independence screening (TS-SIS) procedure and generate a pool of candidate SNPs and interactions, which serve as predictors to explain and predict the phenotypes of a complex trait. We also propose a rates adjusted thresholding estimation (RATE) approach to determine the size of the reduced model selected by an independence screening. Regularization regression methods, such as LASSO or SCAD, are then applied to further identify important genetic effects. Simulation studies show that the TS-SIS procedure is computationally efficient and has an outstanding finite sample performance in selecting potential SNPs as well as gene-gene interactions. We apply the proposed framework to analyze an ultrahigh-dimensional GWAS data set from the Framingham Heart Study, and select 23 active SNPs and 24 active epistatic interactions for the body mass index variation. This shows the capability of our procedure to resolve the complexity of genetic control.
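
The screen-then-regularize idea in this abstract can be sketched as follows. This is a deliberately simplified illustration, not the TS-SIS procedure itself: it performs a single marginal-correlation screening stage on main effects only, omits interaction terms and the RATE thresholding rule, and then fits a LASSO by coordinate descent on the retained SNPs. The function names, the simulated genotypes, the retained model size (`keep=10`), and the penalty (`lam=0.1`) are all assumptions made for the example.

```python
import random


def standardize(col):
    """Center a column and scale it to unit (population) variance."""
    n = len(col)
    m = sum(col) / n
    sd = (sum((v - m) ** 2 for v in col) / n) ** 0.5
    return [(v - m) / sd if sd > 0 else 0.0 for v in col]


def screen(columns, y, keep):
    """Keep the `keep` predictors with the largest |marginal correlation| with y."""
    n = len(y)
    ys = standardize(y)
    scores = []
    for j, col in enumerate(columns):
        xs = standardize(col)
        scores.append((abs(sum(a * b for a, b in zip(xs, ys)) / n), j))
    scores.sort(reverse=True)
    return sorted(j for _, j in scores[:keep])


def soft_threshold(z, lam):
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(columns, y, lam, n_iter=200):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam*||b||_1 on standardized columns."""
    n = len(y)
    X = [standardize(c) for c in columns]
    ybar = sum(y) / n
    resid = [v - ybar for v in y]
    beta = [0.0] * len(X)
    for _ in range(n_iter):
        for j, xj in enumerate(X):
            # x_j has unit variance, so the partial-residual fit is x_j'r/n + beta_j.
            rho = sum(a * r for a, r in zip(xj, resid)) / n + beta[j]
            new = soft_threshold(rho, lam)
            if new != beta[j]:
                delta = new - beta[j]
                resid = [r - a * delta for r, a in zip(resid, xj)]
                beta[j] = new
    return beta


# Simulated data (assumed): 200 individuals, 200 SNPs coded 0/1/2,
# with SNP 3 and SNP 17 affecting the trait.
rng = random.Random(1)
n, p = 200, 200
genotypes = [[rng.choice([0, 1, 2]) for _ in range(n)] for _ in range(p)]
trait = [genotypes[3][i] - 0.8 * genotypes[17][i] + rng.gauss(0, 0.5) for i in range(n)]

kept = screen(genotypes, trait, keep=10)
beta = lasso_cd([genotypes[j] for j in kept], trait, lam=0.1)
selected = {j: b for j, b in zip(kept, beta) if b != 0.0}
print(selected)
```

Screening first keeps the penalized fit small: the coordinate descent runs over 10 retained columns rather than all 200, which is the practical motivation for reducing dimensionality before applying LASSO or SCAD.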

Sabourin J, Nobel AB, Valdar W. Fine-mapping additive and dominant SNP effects using group-LASSO and fractional resample model averaging. Genet Epidemiol 2014;39:77-88. [PMID: 25417853 DOI: 10.1002/gepi.21869] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/20/2014] [Revised: 09/25/2014] [Accepted: 09/30/2014] [Indexed: 12/28/2022]