Reference Citation Analysis

For: de Vlaming R, Groenen PJ. The Current and Future Use of Ridge Regression for Prediction in Quantitative Genetics. Biomed Res Int 2015;2015:143712. [PMID: 26273586 DOI: 10.1155/2015/143712] [Citation(s) in RCA: 50] [Impact Index Per Article: 5.6] [Received: 11/28/2014] [Accepted: 12/24/2014] [Indexed: 01/05/2023]



Cited by Other Article(s)

Kim R, Lin T, Pang G, Liu Y, Tungate AS, Hendry PL, Kurz MC, Peak DA, Jones J, Rathlev NK, Swor RA, Domeier R, Velilla MA, Lewandowski C, Datner E, Pearson C, Lee D, Mitchell PM, McLean SA, Linnstaedt SD. Derivation and validation of risk prediction for posttraumatic stress symptoms following trauma exposure. Psychol Med 2023;53:4952-4961. [PMID: 35775366 DOI: 10.1017/s003329172200191x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]

Affiliation(s)

Raphael Kim Institute for Trauma Recovery, University of North Carolina, Chapel Hill, NC, USA Department of Anesthesiology, University of North Carolina, Chapel Hill, NC, USA Department of Computer Science, University of North Carolina, Chapel Hill, NC, USA Department of Statistics and Operations Research, University of North Carolina, Chapel Hill, NC, USA
Tina Lin Institute for Trauma Recovery, University of North Carolina, Chapel Hill, NC, USA Department of Anesthesiology, University of North Carolina, Chapel Hill, NC, USA
Gehao Pang Institute for Trauma Recovery, University of North Carolina, Chapel Hill, NC, USA Department of Anesthesiology, University of North Carolina, Chapel Hill, NC, USA
Yufeng Liu Department of Statistics and Operations Research, University of North Carolina, Chapel Hill, NC, USA Department of Biostatistics, University of North Carolina, Chapel Hill, NC, USA Department of Genetics, Carolina Center for Genome Sciences, Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC, USA
Andrew S Tungate Institute for Trauma Recovery, University of North Carolina, Chapel Hill, NC, USA Department of Anesthesiology, University of North Carolina, Chapel Hill, NC, USA
Phyllis L Hendry Department of Emergency Medicine, University of Florida College of Medicine, Jacksonville, FL, USA
Michael C Kurz Department of Emergency Medicine, University of Alabama, Birmingham, AL, USA
David A Peak Department of Emergency Medicine, Massachusetts General Hospital, Boston, MA, USA
Jeffrey Jones Department of Emergency Medicine, Spectrum Health Butterworth Campus, Grand Rapids, MI, USA
Niels K Rathlev Department of Emergency Medicine, Baystate State Health System, Springfield, MA, USA
Robert A Swor Department of Emergency Medicine, Beaumont Hospital, Royal Oak, MI, USA
Robert Domeier Department of Emergency Medicine, St Joseph Mercy Health System, Ann Arbor, MI, USA
Marc-Anthony Velilla Department of Emergency Medicine, Sinai Grace, Detroit, MI, USA
Christopher Lewandowski Department of Emergency Medicine, Henry Ford Hospital, Detroit, MI, USA
Elizabeth Datner Department of Emergency Medicine, Albert Einstein Medical Center, Philadelphia, PA, USA
Claire Pearson Department of Emergency Medicine, Detroit Receiving, Detroit, MI, USA
David Lee Department of Emergency Medicine, North Shore University Hospital, Manhasset, NY, USA
Patricia M Mitchell Department of Emergency Medicine, Boston University School of Medicine, Boston, MA, USA
Samuel A McLean Institute for Trauma Recovery, University of North Carolina, Chapel Hill, NC, USA Department of Anesthesiology, University of North Carolina, Chapel Hill, NC, USA Department of Emergency Medicine, University of North Carolina, Chapel Hill, NC, USA
Sarah D Linnstaedt Institute for Trauma Recovery, University of North Carolina, Chapel Hill, NC, USA Department of Anesthesiology, University of North Carolina, Chapel Hill, NC, USA


Role of artificial intelligence and machine learning in interventional cardiology. Curr Probl Cardiol 2023;48:101698. [PMID: 36921654 DOI: 10.1016/j.cpcardiol.2023.101698] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 03/07/2023] [Accepted: 03/08/2023] [Indexed: 03/16/2023]

Hsu W, Warren JR, Riddle PJ. Medication adherence prediction through temporal modelling in cardiovascular disease management. BMC Med Inform Decis Mak 2022;22:313. [DOI: 10.1186/s12911-022-02052-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2022] [Accepted: 11/16/2022] [Indexed: 11/30/2022] Open

Abstract

Background

Chronic conditions place a considerable burden on modern healthcare systems. Within New Zealand and worldwide, cardiovascular disease (CVD) affects a significant proportion of the population and is the leading cause of death. Like other chronic diseases, the course of cardiovascular disease is usually prolonged and its management necessarily long-term. Although long-term medication is highly effective in reducing CVD risk, non-adherence to it continues to be a longstanding challenge in healthcare delivery. The study investigates the benefits of integrating patient history and assesses the contribution of explicitly temporal models to medication adherence prediction in the context of lipid-lowering therapy.

Methods

Data from a CVD risk assessment tool is linked to routinely collected national and regional data sets including pharmaceutical dispensing, hospitalisation, lab test results and deaths. The study extracts a sub-cohort from 564,180 patients who had primary CVD risk assessment for analysis. Based on community pharmaceutical dispensing records, proportion of days covered (PDC) ≥ 80 is used as the threshold for adherence. Two years (8 quarters) of patient history before their CVD risk assessment is used as the observation window to predict patient adherence in the subsequent 5 years (20 quarters). The predictive performance of the temporal deep learning models long short-term memory (LSTM) and simple recurrent neural network (Simple RNN) is compared against that of the non-temporal models multilayer perceptron (MLP), ridge classifier (RC) and logistic regression (LR). Further, the study investigates the effect of lengthening the observation window on the task of adherence prediction.

Results

Temporal models that use sequential data outperform non-temporal models, with LSTM producing the best predictive performance, achieving a ROC AUC of 0.805. A performance gap is observed between models that can discover non-linear interactions between predictor variables and their linear counterparts, with neural network (NN) based models significantly outperforming linear models. Additionally, the predictive advantage of temporal models becomes more pronounced when the length of the observation window is increased.

Conclusion

The findings of the study provide evidence that using deep temporal models to integrate patient history in adherence prediction is advantageous. In particular, the RNN architecture LSTM significantly outperforms all other model comparators.

Comparison of artificial intelligence algorithms and their ranking for the prediction of genetic merit in sheep. Sci Rep 2022;12:18726. [PMID: 36333409 PMCID: PMC9636184 DOI: 10.1038/s41598-022-23499-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/25/2022] [Accepted: 11/01/2022] [Indexed: 11/06/2022] Open

Cheng B, Zhou P, Chen Y. Machine-learning algorithms based on personalized pathways for a novel predictive model for the diagnosis of hepatocellular carcinoma. BMC Bioinformatics 2022;23:248. [PMID: 35739471 PMCID: PMC9219178 DOI: 10.1186/s12859-022-04805-9] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 03/30/2022] [Accepted: 06/20/2022] [Indexed: 12/14/2022] Open

Patiyal S, Dhall A, Raghava GPS. Prediction of risk-associated genes and high-risk liver cancer patients from their mutation profile: Benchmarking of mutation calling techniques. Biol Methods Protoc 2022;7:bpac012. [PMID: 35734767 PMCID: PMC9204470 DOI: 10.1093/biomethods/bpac012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2021] [Revised: 05/20/2022] [Accepted: 05/20/2022] [Indexed: 11/12/2022] Open

Abstract

Identification of somatic mutations with high precision is one of the major challenges in the prediction of high-risk liver-cancer patients. In the past, a number of mutation-calling techniques have been developed, including MuTect2, MuSE, Varscan2, and SomaticSniper. In this study, an attempt has been made to benchmark the potential of these techniques in predicting the prognostic biomarkers for liver cancer. Initially, we extracted somatic mutations in liver cancer patients using Variant Call Format (VCF) and Mutation Annotation Format (MAF) files from The Cancer Genome Atlas. In terms of size, the MAF files are 42 times smaller than the VCF files and contain only high-quality somatic mutations. Further, machine learning based models have been developed for predicting high-risk cancer patients using mutations obtained from different techniques. The performance of the different techniques and data files has been compared based on their potential to discriminate high- and low-risk liver-cancer patients. Based on correlation analysis, we selected 80 genes having a significant negative correlation with the overall survival of liver cancer patients. The univariate survival analysis revealed the prognostic role of highly mutated genes. Single-gene based analysis showed that the MuTect2 technique based MAF file achieved a maximum hazard ratio (HR_LAMC3) of 9.25 with p-value 1.78E-06. Further, we developed various prediction models using the risk-associated top-10 genes for each technique. Our results indicate that MuTect2 technique based VCF files outperform all other methods, with a maximum Area Under the Receiver-Operating Characteristic (AUROC) curve of 0.765 and HR 4.50 (p-value 3.83E-15). Overall, the VCF files generated using the MuTect2 technique perform better than those of the other mutation calling techniques for the prediction of high-risk liver cancer patients.

We hope that our findings will provide a useful and comprehensive comparison of various mutation calling techniques for the prognostic analysis of cancer patients. In order to serve the scientific community, we have provided a Python-based pipeline to develop the prediction models using mutation profiles (VCF/MAF) of cancer patients. It is available on GitHub at https://github.com/raghavagps/mutation_bench.

Ma C, Wu M, Ma S. Analysis of cancer omics data: a selective review of statistical techniques. Brief Bioinform 2022;23:6510158. [PMID: 35039832 DOI: 10.1093/bib/bbab585] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2021] [Revised: 12/19/2021] [Accepted: 12/20/2021] [Indexed: 11/13/2022] Open

Evaluation and Prediction on the Effect of Ionic Properties of Solvent Extraction Performance of Oily Sludge Using Machine Learning. Molecules 2021;26:molecules26247551. [PMID: 34946635 PMCID: PMC8708711 DOI: 10.3390/molecules26247551] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/31/2021] [Revised: 12/06/2021] [Accepted: 12/08/2021] [Indexed: 11/20/2022] Open

Tatsumi K, Igarashi N, Mengxue X. Prediction of plant-level tomato biomass and yield using machine learning with unmanned aerial vehicle imagery. Plant Methods 2021;17:77. [PMID: 34266447 PMCID: PMC8281694 DOI: 10.1186/s13007-021-00761-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/18/2021] [Accepted: 06/04/2021] [Indexed: 06/13/2023]

Abstract

BACKGROUND

The objective of this study is twofold. First, ascertain the important variables that predict tomato yields from plant height (PH) and vegetation index (VI) maps. The maps were derived from images taken by unmanned aerial vehicles (UAVs). Second, examine the accuracy of predictions of tomato fresh shoot masses (SM), fruit weights (FW), and the number of fruits (FN) from multiple machine learning algorithms using selected variable sets. To realize our objective, ultra-high-resolution RGB and multispectral images were collected by a UAV on ten days in 2020's tomato growing season. From these images, 756 total variables, including first- (e.g., average, standard deviation, skewness, range, and maximum) and second-order (e.g., gray-level co-occurrence matrix features and growth rates of PH and VIs) statistics for each plant, were extracted. Several selection algorithms (i.e., Boruta, DALEX, genetic algorithm, least absolute shrinkage and selection operator, and recursive feature elimination) were used to select the variable sets useful for predicting SM, FW, and FN. Random forests, ridge regressions, and support vector machines were used to predict the yield using the top five selected variable sets.

RESULTS

First-order statistics of PH and VIs collected during the early to mid-fruit formation periods, about one month prior to harvest, were important variables for predicting SM. Similar to the case for SM, variables collected approximately one month prior to harvest were important for predicting FW and FN. Furthermore, variables related to PH were unimportant for prediction. Compared with predictions obtained using only first-order statistics, those obtained using the second-order statistics of VIs were more accurate for FW and FN. The prediction accuracy of SM, FW, and FN by models constructed from all variables (rRMSE = 8.8-28.1%) was better than that from first-order statistics (rRMSE = 10.0-50.1%).

CONCLUSIONS

In addition to basic statistics (e.g., average and standard deviation), we derived second-order statistics of PH and VIs at the plant level using the ultra-high-resolution UAV images. Our findings indicated that our variable selection method reduced the number of variables needed for tomato yield prediction, improving the efficiency of phenotypic data collection and assisting with the selection of high-yield lines within breeding programs.


Tatsumi K, Igarashi N, Mengxue X. Prediction of plant-level tomato biomass and yield using machine learning with unmanned aerial vehicle imagery. Plant Methods 2021;17:77. [PMID: 34266447 DOI: 10.21203/rs.3.rs-344860/v1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/18/2021] [Accepted: 06/04/2021] [Indexed: 05/21/2023]


Scott MF, Fradgley N, Bentley AR, Brabbs T, Corke F, Gardner KA, Horsnell R, Howell P, Ladejobi O, Mackay IJ, Mott R, Cockram J. Limited haplotype diversity underlies polygenic trait architecture across 70 years of wheat breeding. Genome Biol 2021;22:137. [PMID: 33957956 PMCID: PMC8101041 DOI: 10.1186/s13059-021-02354-7] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 08/04/2020] [Accepted: 04/16/2021] [Indexed: 11/25/2022] Open

Abstract

Background

Selection has dramatically shaped genetic and phenotypic variation in bread wheat. We can assess the genomic basis of historical phenotypic changes, and the potential for future improvement, using experimental populations that attempt to undo selection through the randomizing effects of recombination.

Results

We bred the NIAB Diverse MAGIC multi-parent population comprising over 500 recombinant inbred lines, descended from sixteen historical UK bread wheat varieties released between 1935 and 2004. We sequence the founders’ genes and promoters by capture, and the MAGIC population by low-coverage whole-genome sequencing. We impute 1.1 M high-quality SNPs that are over 99% concordant with array genotypes. Imputation accuracy only marginally improves when including the founders’ genomes as a haplotype reference panel. Despite capturing 73% of global wheat genetic polymorphism, 83% of genes cluster into no more than three haplotypes. We phenotype 47 agronomic traits over 2 years and map 136 genome-wide significant associations, concentrated at 42 genetic loci with large and often pleiotropic effects. Around half of these overlap known quantitative trait loci. Most traits exhibit extensive polygenicity, as revealed by multi-locus shrinkage modelling.

Conclusions

Our results are consistent with a gene pool of low haplotypic diversity, containing few novel loci of large effect. Most past, and projected future, phenotypic changes arising from existing variation involve fine-scale shuffling of a few haplotypes to recombine dozens of polygenic alleles of small effect. Moreover, extensive pleiotropy means selection on one trait will have unintended consequences, exemplified by the negative trade-off between yield and protein content, unless selection and recombination can break unfavorable trait-trait associations.

Supplementary Information

The online version contains supplementary material available at 10.1186/s13059-021-02354-7.


Item response theory as a feature selection and interpretation tool in the context of machine learning. Med Biol Eng Comput 2021;59:471-482. [PMID: 33534111 DOI: 10.1007/s11517-020-02301-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/26/2020] [Accepted: 12/22/2020] [Indexed: 10/22/2022]

Frouin A, Dandine-Roulland C, Pierre-Jean M, Deleuze JF, Ambroise C, Le Floch E. Exploring the Link Between Additive Heritability and Prediction Accuracy From a Ridge Regression Perspective. Front Genet 2020;11:581594. [PMID: 33329721 PMCID: PMC7672157 DOI: 10.3389/fgene.2020.581594] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 07/09/2020] [Accepted: 09/29/2020] [Indexed: 11/13/2022] Open

Bernardo R. Reinventing quantitative genetics for plant breeding: something old, something new, something borrowed, something BLUE. Heredity (Edinb) 2020;125:375-385. [PMID: 32296132 PMCID: PMC7784685 DOI: 10.1038/s41437-020-0312-1] [Citation(s) in RCA: 49] [Impact Index Per Article: 12.3] [Received: 01/23/2020] [Revised: 03/23/2020] [Accepted: 03/23/2020] [Indexed: 01/19/2023] Open

CUX2, BRAP and ALDH2 are associated with metabolic traits in people with excessive alcohol consumption. Sci Rep 2020;10:18118. [PMID: 33093602 PMCID: PMC7583246 DOI: 10.1038/s41598-020-75199-y] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 10/13/2019] [Accepted: 10/12/2020] [Indexed: 12/21/2022] Open

Zhou J, Qiu Y, Chen S, Liu L, Liao H, Chen H, Lv S, Li X. A Novel Three-Stage Framework for Association Analysis Between SNPs and Brain Regions. Front Genet 2020;11:572350. [PMID: 33193677 PMCID: PMC7542238 DOI: 10.3389/fgene.2020.572350] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 06/13/2020] [Accepted: 08/17/2020] [Indexed: 12/17/2022] Open

Thomas M, Sakoda LC, Hoffmeister M, Rosenthal EA, Lee JK, van Duijnhoven FJB, Platz EA, Wu AH, Dampier CH, de la Chapelle A, Wolk A, Joshi AD, Burnett-Hartman A, Gsur A, Lindblom A, Castells A, Win AK, Namjou B, Van Guelpen B, Tangen CM, He Q, Li CI, Schafmayer C, Joshu CE, Ulrich CM, Bishop DT, Buchanan DD, Schaid D, Drew DA, Muller DC, Duggan D, Crosslin DR, Albanes D, Giovannucci EL, Larson E, Qu F, Mentch F, Giles GG, Hakonarson H, Hampel H, Stanaway IB, Figueiredo JC, Huyghe JR, Minnier J, Chang-Claude J, Hampe J, Harley JB, Visvanathan K, Curtis KR, Offit K, Li L, Le Marchand L, Vodickova L, Gunter MJ, Jenkins MA, Slattery ML, Lemire M, Woods MO, Song M, Murphy N, Lindor NM, Dikilitas O, Pharoah PDP, Campbell PT, Newcomb PA, Milne RL, MacInnis RJ, Castellví-Bel S, Ogino S, Berndt SI, Bézieau S, Thibodeau SN, Gallinger SJ, Zaidi SH, Harrison TA, Keku TO, Hudson TJ, Vymetalkova V, Moreno V, Martín V, Arndt V, Wei WQ, Chung W, Su YR, Hayes RB, White E, Vodicka P, Casey G, Gruber SB, Schoen RE, Chan AT, Potter JD, Brenner H, Jarvik GP, Corley DA, Peters U, Hsu L. Genome-wide Modeling of Polygenic Risk Score in Colorectal Cancer Risk. Am J Hum Genet 2020;107:432-444. [PMID: 32758450 PMCID: PMC7477007 DOI: 10.1016/j.ajhg.2020.07.006] [Citation(s) in RCA: 101] [Impact Index Per Article: 25.3] [Received: 11/27/2019] [Accepted: 07/13/2020] [Indexed: 02/08/2023] Open

Abstract

Accurate colorectal cancer (CRC) risk prediction models are critical for identifying individuals at low and high risk of developing CRC, as they can then be offered targeted screening and interventions to address their risks of developing disease (if they are in a high-risk group) and avoid unnecessary screening and interventions (if they are in a low-risk group). As it is likely that thousands of genetic variants contribute to CRC risk, it is clinically important to investigate whether these genetic variants can be used jointly for CRC risk prediction. In this paper, we derived and compared different approaches to generating predictive polygenic risk scores (PRS) from genome-wide association studies (GWASs) including 55,105 CRC-affected case subjects and 65,079 control subjects of European ancestry. We built the PRS in three ways, using (1) 140 previously identified and validated CRC loci; (2) SNP selection based on linkage disequilibrium (LD) clumping followed by machine-learning approaches; and (3) LDpred, a Bayesian approach for genome-wide risk prediction. We tested the PRS in an independent cohort of 101,987 individuals with 1,699 CRC-affected case subjects. The discriminatory accuracy, calculated by the age- and sex-adjusted area under the receiver operating characteristics curve (AUC), was highest for the LDpred-derived PRS (AUC = 0.654) including nearly 1.2 M genetic variants (the proportion of causal genetic variants for CRC assumed to be 0.003), whereas the PRS of the 140 known variants identified from GWASs had the lowest AUC (AUC = 0.629). Based on the LDpred-derived PRS, we are able to identify 30% of individuals without a family history as having risk for CRC similar to those with a family history of CRC, whereas the PRS based on known GWAS variants identified only the top 10% as having a similar relative risk.
About 90% of these individuals have no family history and would have been considered average risk under current screening guidelines, but might benefit from earlier screening. The developed PRS offers a way for risk-stratified CRC screening and other targeted interventions.


Affiliation(s)

Minta Thomas Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
Lori C Sakoda Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA; Division of Research, Kaiser Permanente Northern California, Oakland, CA 94612, USA
Michael Hoffmeister Division of Clinical Epidemiology and Aging Research, German Cancer Research Center (DKFZ), Heidelberg 69120, Germany
Elisabeth A Rosenthal Department of Medicine (Medical Genetics), University of Washington Medical Center, Seattle, WA 98195, USA
Jeffrey K Lee Division of Research, Kaiser Permanente Northern California, Oakland, CA 94612, USA
Franzel J B van Duijnhoven Division of Human Nutrition and Health, Wageningen University & Research, Wageningen 176700, the Netherlands
Elizabeth A Platz Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, and the Sidney Kimmel Comprehensive Cancer Center at Johns Hopkins, Baltimore, MD 21287, USA
Anna H Wu University of Southern California, Preventative Medicine, Los Angeles, CA 90089, USA
Christopher H Dampier Department of Surgery, University of Virginia Health System, Charlottesville, VA 22903, USA
Albert de la Chapelle Department of Cancer Biology and Genetics and the Comprehensive Cancer Center, The Ohio State University, Columbus, OH 43210, USA
Alicja Wolk Institute of Environmental Medicine, Karolinska Institutet, Stockholm 17177, Sweden
Amit D Joshi Clinical and Translational Epidemiology Unit, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA; Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
Andrea Burnett-Hartman Institute for Health Research, Kaiser Permanente Colorado, Denver, CO 80014, USA
Andrea Gsur Institute of Cancer Research, Department of Medicine I, Medical University Vienna, Vienna 1090, Austria
Annika Lindblom Department of Clinical Genetics, Karolinska University Hospital, Stockholm 17177, Sweden; Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm 17177, Sweden
Antoni Castells Gastroenterology Department, Hospital Clínic, Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBEREHD), University of Barcelona, Barcelona 08007, Spain
Aung Ko Win Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, The University of Melbourne, Melbourne, VIC 3000, Australia
Bahram Namjou Center for Autoimmune Genomics and Etiology (CAGE), Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA; University of Cincinnati College of Medicine, Cincinnati, OH 45229, USA; Cincinnati VA Medical Center, Cincinnati, OH 45229, USA
Bethany Van Guelpen Department of Radiation Sciences, Oncology Unit, Umeå University, Umeå 90187, Sweden; Wallenberg Centre for Molecular Medicine, Umeå University, Umeå 90187, Sweden
Catherine M Tangen SWOG Statistical Center, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
Qianchuan He Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
Christopher I Li Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
Clemens Schafmayer Department of General Surgery, University Hospital Rostock, Rostock 18051, Germany
Corinne E Joshu Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, and the Sidney Kimmel Comprehensive Cancer Center at Johns Hopkins, Baltimore, MD 21287, USA
Cornelia M Ulrich Huntsman Cancer Institute and Department of Population Health Sciences, University of Utah, Salt Lake City, UT 84112, USA
D Timothy Bishop Leeds Institute of Cancer and Pathology, University of Leeds, Leeds LS2 9JT, UK
Daniel D Buchanan University of Melbourne Centre for Cancer Research, Victorian Comprehensive Cancer Centre, Parkville, VIC 3010, Australia; Colorectal Oncogenomics Group, Department of Clinical Pathology, The University of Melbourne, Parkville, VIC 3010, Australia; Genomic Medicine and Family Cancer Clinic, Royal Melbourne Hospital, Parkville, VIC 3010, Australia
Daniel Schaid Department of Health Sciences Research, Mayo Clinic, Rochester, MN 55905, USA
David A Drew Clinical and Translational Epidemiology Unit, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
David C Muller School of Public Health, Imperial College London, London SW7 2AZ, UK
David Duggan Translational Genomics Research Institute - An Affiliate of City of Hope, Phoenix, AZ 85003, USA
David R Crosslin Department of Bioinformatics and Medical Education, University of Washington Medical Center, Seattle, WA 98195, USA
Demetrius Albanes Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
Edward L Giovannucci Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA; Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA; Department of Nutrition, Harvard T.H. Chan School of Public Health, Harvard University, Boston, MA 02108, USA
Eric Larson Kaiser Permanente Washington Research Institute, Seattle, WA 98101, USA
Flora Qu Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
Frank Mentch Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
Graham G Giles Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, The University of Melbourne, Melbourne, VIC 3000, Australia; Cancer Epidemiology Division, Cancer Council Victoria, 615 St Kilda Road, Melbourne, VIC 3004, Australia; Precision Medicine, School of Clinical Sciences at Monash Health, Monash University, Clayton, VIC 3168, Australia
Hakon Hakonarson Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
Heather Hampel Division of Human Genetics, Department of Internal Medicine, The Ohio State University Comprehensive Cancer Center, Columbus, OH 43210, USA
Ian B Stanaway Department of Medicine (Medical Genetics), University of Washington Medical Center, Seattle, WA 98195, USA
Jane C Figueiredo Department of Medicine, Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA; Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA 90033, USA
Jeroen R Huyghe Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
Jessica Minnier School of Public Health, Oregon Health & Science University, Portland, OR 97239, USA
Jenny Chang-Claude Division of Cancer Epidemiology, German Cancer Research Center (DKFZ), Heidelberg 69120, Germany; University Medical Centre Hamburg-Eppendorf, University Cancer Centre Hamburg (UCCH), Hamburg 20246, Germany
Jochen Hampe Department of Medicine I, University Hospital Dresden, Technische Universität Dresden (TU Dresden), Dresden 01062, Germany
John B Harley Center for Autoimmune Genomics and Etiology (CAGE), Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA; University of Cincinnati College of Medicine, Cincinnati, OH 45229, USA; Cincinnati VA Medical Center, Cincinnati, OH 45229, USA
Kala Visvanathan Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, and the Sidney Kimmel Comprehensive Cancer Center at Johns Hopkins, Baltimore, MD 21287, USA
Keith R Curtis Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
Kenneth Offit Clinical Genetics Service, Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY 10021, USA; Department of Medicine, Weill Cornell Medical College, NY 10065, USA
Li Li Department of Family Medicine, University of Virginia, Charlottesville, VA 22903, USA
Loic Le Marchand University of Hawaii Cancer Center, Honolulu, HI 96813, USA
Ludmila Vodickova Department of Molecular Biology of Cancer, Institute of Experimental Medicine of the Czech Academy of Sciences, 142 20 Prague 4, Czech Republic; Institute of Biology and Medical Genetics, First Faculty of Medicine, Charles University, 128 00 Prague, Czech Republic; Faculty of Medicine and Biomedical Center in Pilsen, Charles University, 323 00 Pilsen, Czech Republic
Marc J Gunter Nutrition and Metabolism Section, International Agency for Research on Cancer, World Health Organization, Lyon 69372, France
Mark A Jenkins Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, The University of Melbourne, Melbourne, VIC 3000, Australia
Martha L Slattery Department of Internal Medicine, University of Utah, Salt Lake City, UT 84132, USA
Mathieu Lemire PanCuRx Translational Research Initiative, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
Michael O Woods Memorial University of Newfoundland, Discipline of Genetics, St. John's, NL A1B 3R7, Canada
Mingyang Song Clinical and Translational Epidemiology Unit, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA; Division of Gastroenterology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA; Broad Institute of Harvard and MIT, Cambridge, MA 02141, USA; Department of Nutrition, Harvard T.H. Chan School of Public Health, Harvard University, Boston, MA 02115, USA
Neil Murphy Nutrition and Metabolism Section, International Agency for Research on Cancer, World Health Organization, Lyon 69372, France
Noralane M Lindor Department of Health Science Research, Mayo Clinic, Scottsdale, AZ 85260, USA
Ozan Dikilitas Department of Cardiovascular Medicine, Mayo Clinic, Rochester, MN 55905, USA
Paul D P Pharoah Department of Public Health and Primary Care, University of Cambridge, Cambridge CB2 0SR, UK
Peter T Campbell Behavioral and Epidemiology Research Group, American Cancer Society, Atlanta, GA 30303, USA
Polly A Newcomb Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA; School of Public Health, University of Washington, Seattle, WA 98195, USA
Roger L Milne Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, The University of Melbourne, Melbourne, VIC 3000, Australia; Cancer Epidemiology Division, Cancer Council Victoria, 615 St Kilda Road, Melbourne, VIC 3004, Australia; Precision Medicine, School of Clinical Sciences at Monash Health, Monash University, Clayton, VIC 3168, Australia
Robert J MacInnis Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, The University of Melbourne, Melbourne, VIC 3000, Australia; Cancer Epidemiology Division, Cancer Council Victoria, 615 St Kilda Road, Melbourne, VIC 3004, Australia
Sergi Castellví-Bel Gastroenterology Department, Hospital Clínic, Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBEREHD), University of Barcelona, Barcelona 08007, Spain
Shuji Ogino Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA; Broad Institute of Harvard and MIT, Cambridge, MA 02141, USA; Program in MPE Molecular Pathological Epidemiology, Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA; Department of Oncologic Pathology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
Sonja I Berndt Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
Stéphane Bézieau Service de Génétique Médicale, Centre Hospitalier Universitaire (CHU) Nantes, Nantes 44093, France
Stephen N Thibodeau Division of Laboratory Genetics, Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN 85054, USA
Steven J Gallinger Lunenfeld Tanenbaum Research Institute, Mount Sinai Hospital, University of Toronto, Toronto, ON M5G1X5, Canada
Syed H Zaidi Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
Tabitha A Harrison Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
Temitope O Keku Center for Gastrointestinal Biology and Disease, University of North Carolina, Chapel Hill, NC 27599, USA
Thomas J Hudson Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
Veronika Vymetalkova Department of Molecular Biology of Cancer, Institute of Experimental Medicine of the Czech Academy of Sciences, 142 20 Prague 4, Czech Republic; Institute of Biology and Medical Genetics, First Faculty of Medicine, Charles University, 128 00 Prague, Czech Republic; Faculty of Medicine and Biomedical Center in Pilsen, Charles University, 323 00 Pilsen, Czech Republic
Victor Moreno Oncology Data Analytics Program, Catalan Institute of Oncology, L'Hospitalet de Llobregat, Barcelona 08908, Spain; CIBER Epidemiología y Salud Pública (CIBERESP), Madrid 28029, Spain; Department of Clinical Sciences, Faculty of Medicine, University of Barcelona, Barcelona 08907, Spain; ONCOBEL Program, Bellvitge Biomedical Research Institute (IDIBELL), L'Hospitalet de Llobregat, Barcelona 08908, Spain
Vicente Martín CIBER Epidemiología y Salud Pública (CIBERESP), Madrid 28029, Spain; Biomedicine Institute (IBIOMED), University of León, León 24071, Spain
Volker Arndt Division of Clinical Epidemiology and Aging Research, German Cancer Research Center (DKFZ), Heidelberg 69120, Germany
Wei-Qi Wei Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37232, USA
Wendy Chung Office of Research & Development, Department of Veterans Affairs, Washington, DC 20420, USA; Departments of Pediatrics and Medicine, Columbia University Medical Center, New York, NY 10032, USA
Yu-Ru Su Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
Richard B Hayes Division of Epidemiology, Department of Population Health, New York University School of Medicine, New York, NY 10016, USA
Emily White Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA; Department of Epidemiology, University of Washington, Seattle, WA 98195, USA
Pavel Vodicka Department of Molecular Biology of Cancer, Institute of Experimental Medicine of the Czech Academy of Sciences, 142 20 Prague 4, Czech Republic; Institute of Biology and Medical Genetics, First Faculty of Medicine, Charles University, 128 00 Prague, Czech Republic; Faculty of Medicine and Biomedical Center in Pilsen, Charles University, 323 00 Pilsen, Czech Republic
Graham Casey Center for Public Health Genomics, University of Virginia, Charlottesville, VA 22903, USA
Stephen B Gruber Department of Preventive Medicine, USC Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California, Los Angeles, CA 90089, USA
Robert E Schoen Department of Medicine and Epidemiology, University of Pittsburgh Medical Center, Pittsburgh, PA 15219, USA
Andrew T Chan Clinical and Translational Epidemiology Unit, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA; Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA; Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA; Division of Gastroenterology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA; Broad Institute of Harvard and MIT, Cambridge, MA 02141, USA; Department of Immunology and Infectious Diseases, Harvard T.H. Chan School of Public Health, Harvard University, Boston, MA 02115, USA
John D Potter Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA; Centre for Public Health Research, Massey University, Wellington 6140, New Zealand
Hermann Brenner Division of Clinical Epidemiology and Aging Research, German Cancer Research Center (DKFZ), Heidelberg 69120, Germany; Division of Preventive Oncology, German Cancer Research Center (DKFZ) and National Center for Tumor Diseases (NCT), Heidelberg 69120, Germany; German Cancer Consortium (DKTK), German Cancer Research Center (DKFZ), Heidelberg 69120, Germany
Gail P Jarvik Department of Medicine (Medical Genetics), University of Washington Medical Center, Seattle, WA 98195, USA; Genome Sciences, University of Washington Medical Center, Seattle, WA 98195, USA
Douglas A Corley Division of Research, Kaiser Permanente Northern California, Oakland, CA 94612, USA
Ulrike Peters Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA; Department of Epidemiology, University of Washington, Seattle, WA 98195, USA
Li Hsu Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA; Department of Biostatistics, University of Washington, Seattle, WA 98195, USA

Han Y, Adolphs R. Estimating the heritability of psychological measures in the Human Connectome Project dataset. PLoS One 2020;15:e0235860. [PMID: 32645058 PMCID: PMC7347217 DOI: 10.1371/journal.pone.0235860] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 03/12/2020] [Accepted: 06/24/2020] [Indexed: 12/03/2022]

Abstract

The Human Connectome Project (HCP) is a large structural and functional MRI dataset with a rich array of behavioral and genotypic measures, as well as a biologically verified family structure. This makes it a valuable resource for investigating questions about individual differences, including questions about heritability. While its MRI data have been analyzed extensively in this regard, to our knowledge a comprehensive estimation of the heritability of the behavioral dataset has never been conducted. Using a set of behavioral measures of personality, emotion and cognition, we show that it is possible to re-identify the same individual across two testing times (fingerprinting), and to identify identical twins significantly above chance. Standard heritability estimates of 37 behavioral measures were derived from twin correlations, and machine-learning models (univariate linear model, Ridge classifier and Random Forest model) were trained to classify monozygotic twins and dizygotic twins. Correlations between the standard heritability metric and each set of model weights ranged from 0.36 to 0.7, and questionnaire-based and task-based measures did not differ significantly in their heritability. We further explored the heritability of a smaller number of latent factors extracted from the 37 measures and repeated the heritability estimation; in this case, the correlations between the standard heritability and each set of model weights were lower, ranging from 0.05 to 0.43. One specific discrepancy arose for the general intelligence factor, which all models assigned high importance, but the standard heritability calculation did not. We present a thorough investigation of the heritabilities of the behavioral measures in the HCP as a resource for other investigators, and illustrate the utility of machine-learning methods for qualitative characterization of the differential heritability across diverse measures.

Hybrid Breeding for MLN Resistance: Heterosis, Combining Ability, and Hybrid Prediction. PLANTS 2020;9:plants9040468. [PMID: 32276322 PMCID: PMC7238107 DOI: 10.3390/plants9040468] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 02/20/2020] [Revised: 03/24/2020] [Accepted: 03/25/2020] [Indexed: 11/18/2022]

Abstract

Prior knowledge of heterosis and quantitative genetic parameters for maize lethal necrosis (MLN) can help breeders develop numerous resistant or tolerant hybrids with optimum resources. Our objectives were to (1) estimate the quantitative genetic parameters for MLN disease severity, (2) investigate the efficiency of the prediction of hybrid performance based on parental per se and general combining ability (GCA) effects, and (3) examine the potential of hybrid prediction for MLN resistance or tolerance based on markers. Fifty elite maize inbred lines were selected based on their response to MLN under artificial inoculation. Crosses were made in a half diallel mating design to produce 307 F1 hybrids. All hybrids were evaluated in the MLN quarantine facility in Naivasha, Kenya for two seasons under artificial inoculation. All 50 inbreds were genotyped with genotyping-by-sequencing (GBS) SNPs. The phenotypic variation was significant for all traits and the heritability was moderate to high. We observed that hybrids were superior to the mean performance of the parents for disease severity (−14.57%) and area under disease progress curve (AUDPC) (14.9%). Correlations were significant and moderate between line per se performance and GCA, and between the mean parental value and hybrid performance, for both disease severity and AUDPC value. A very low and negative correlation was observed between the marker-based genetic distance of the parental lines and heterosis. Nevertheless, the correlation of GCA effects with hybrid performance was very high, which suggests that GCA is a good predictor of MLN resistance. Genomic prediction accuracy of hybrid performance for MLN was high for both traits. We therefore conclude that there is potential for prediction of hybrid performance for MLN. Overall, the estimated quantitative genetic parameters suggest that, through a targeted approach, it is possible to develop outstanding lines and hybrids for MLN resistance.

Beesley LJ, Salvatore M, Fritsche LG, Pandit A, Rao A, Brummett C, Willer CJ, Lisabeth LD, Mukherjee B. The emerging landscape of health research based on biobanks linked to electronic health records: Existing resources, statistical challenges, and potential opportunities. Stat Med 2020;39:773-800. [PMID: 31859414 PMCID: PMC7983809 DOI: 10.1002/sim.8445] [Citation(s) in RCA: 48] [Impact Index Per Article: 12.0] [Received: 10/10/2018] [Revised: 09/10/2019] [Accepted: 11/16/2019] [Indexed: 01/03/2023]

Radiomics as Applied in Precision Medicine. Clin Nucl Med 2020. [DOI: 10.1007/978-3-030-39457-8_3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/25/2022]

Nyaga C, Gowda M, Beyene Y, Muriithi WT, Makumbi D, Olsen MS, Suresh LM, Bright JM, Das B, Prasanna BM. Genome-Wide Analyses and Prediction of Resistance to MLN in Large Tropical Maize Germplasm. Genes (Basel) 2019;11:genes11010016. [PMID: 31877962 PMCID: PMC7016728 DOI: 10.3390/genes11010016] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 11/27/2019] [Revised: 12/17/2019] [Accepted: 12/18/2019] [Indexed: 11/16/2022]

Affiliation(s)

Christine Nyaga Department of Agricultural Science and Technology, Kenyatta University, Nairobi 43844-00100, Kenya; International Maize and Wheat Improvement Centre (CIMMYT), World Agroforestry Centre (ICRAF), United Nations Avenue, Gigiri, Nairobi 1041-00621, Kenya
Manje Gowda International Maize and Wheat Improvement Centre (CIMMYT), World Agroforestry Centre (ICRAF), United Nations Avenue, Gigiri, Nairobi 1041-00621, Kenya. Correspondence: Tel.: +254-727-019-454
Yoseph Beyene International Maize and Wheat Improvement Centre (CIMMYT), World Agroforestry Centre (ICRAF), United Nations Avenue, Gigiri, Nairobi 1041-00621, Kenya
Wilson T. Muriithi Department of Agricultural Science and Technology, Kenyatta University, Nairobi 43844-00100, Kenya
Dan Makumbi International Maize and Wheat Improvement Centre (CIMMYT), World Agroforestry Centre (ICRAF), United Nations Avenue, Gigiri, Nairobi 1041-00621, Kenya
Michael S. Olsen International Maize and Wheat Improvement Centre (CIMMYT), World Agroforestry Centre (ICRAF), United Nations Avenue, Gigiri, Nairobi 1041-00621, Kenya
L. M. Suresh International Maize and Wheat Improvement Centre (CIMMYT), World Agroforestry Centre (ICRAF), United Nations Avenue, Gigiri, Nairobi 1041-00621, Kenya
Jumbo M. Bright International Maize and Wheat Improvement Centre (CIMMYT), World Agroforestry Centre (ICRAF), United Nations Avenue, Gigiri, Nairobi 1041-00621, Kenya
Biswanath Das International Maize and Wheat Improvement Centre (CIMMYT), World Agroforestry Centre (ICRAF), United Nations Avenue, Gigiri, Nairobi 1041-00621, Kenya
Boddupalli M. Prasanna International Maize and Wheat Improvement Centre (CIMMYT), World Agroforestry Centre (ICRAF), United Nations Avenue, Gigiri, Nairobi 1041-00621, Kenya

Li J, Veeranampalayam-Sivakumar AN, Bhatta M, Garst ND, Stoll H, Stephen Baenziger P, Belamkar V, Howard R, Ge Y, Shi Y. Principal variable selection to explain grain yield variation in winter wheat from features extracted from UAV imagery. PLANT METHODS 2019;15:123. [PMID: 31695728 PMCID: PMC6824016 DOI: 10.1186/s13007-019-0508-7] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 05/24/2019] [Accepted: 10/19/2019] [Indexed: 05/23/2023]

Abstract

BACKGROUND

Automated phenotyping technologies are continually advancing the breeding process. However, collecting various secondary traits throughout the growing season and processing massive amounts of data still takes great effort and time. Selecting a minimum number of secondary traits that have the maximum predictive power has the potential to reduce phenotyping efforts. The objective of this study was to select principal features extracted from UAV imagery and critical growth stages that contributed the most in explaining winter wheat grain yield. Five dates of multispectral images and seven dates of RGB images were collected by a UAV system during the spring growing season in 2018. Two classes of features (variables), totaling 172 variables, were extracted for each plot from the vegetation index and plant height maps, including pixel statistics and dynamic growth rates. A parametric algorithm, LASSO regression (the least absolute shrinkage and selection operator), and a non-parametric algorithm, random forest, were applied for variable selection. The regression coefficients estimated by LASSO and the permutation importance scores provided by random forest were used to determine the ten most important variables influencing grain yield from each algorithm.

RESULTS

Both selection algorithms assigned the highest importance score to the variables related to plant height around the grain filling stage. Some variables related to vegetation indices were also selected by the algorithms, mainly at early to mid growth stages and during senescence. Compared with yield prediction using all 172 variables derived from measured phenotypes, prediction using the selected variables performed comparably or even better. We also noticed that the prediction accuracy on the adapted NE lines (r = 0.58-0.81) was higher than on the other lines (r = 0.21-0.59) included in this study, which had different genetic backgrounds.

CONCLUSIONS

With the ultra-high resolution plot imagery obtained by UAS-based phenotyping, we are now able to derive more features, such as the variation of plant height or vegetation indices within a plot rather than just an averaged number, that are potentially very useful for breeding purposes. However, too many features or variables can be derived in this way. The promising results from this study suggest that a selected subset of those variables can achieve grain yield prediction accuracies comparable to those of the full set, while possibly allowing a better allocation of effort and resources in phenotypic data collection and processing.

Williams DR, Rhemtulla M, Wysocki AC, Rast P. On Nonregularized Estimation of Psychological Networks. MULTIVARIATE BEHAVIORAL RESEARCH 2019;54:719-750. [PMID: 30957629 PMCID: PMC6736701 DOI: 10.1080/00273171.2019.1575716] [Citation(s) in RCA: 66] [Impact Index Per Article: 13.2] [Indexed: 05/08/2023]

Abstract

An important goal for psychological science is developing methods to characterize relationships between variables. Customary approaches use structural equation models to connect latent factors to a number of observed measurements, or test causal hypotheses between observed variables. More recently, regularized partial correlation networks have been proposed as an alternative approach for characterizing relationships among variables through off-diagonal elements in the precision matrix. While the graphical Lasso (glasso) has emerged as the default network estimation method, it was optimized in fields outside of psychology with very different needs, such as high dimensional data where the number of variables (p) exceeds the number of observations (n). In this article, we describe the glasso method in the context of the fields where it was developed, and then we demonstrate that the advantages of regularization diminish in settings where psychological networks are often fitted (p ≪ n). We first show that improved properties of the precision matrix, such as eigenvalue estimation, and predictive accuracy with cross-validation are not always appreciable. We then introduce nonregularized methods based on multiple regression and a nonparametric bootstrap strategy, after which we characterize performance with extensive simulations. Our results demonstrate that the nonregularized methods can be used to reduce the false-positive rate, compared to glasso, and they appear to provide consistent performance across sparsity levels, sample composition (p/n), and partial correlation size. We end by reviewing recent findings in the statistics literature that suggest alternative methods often outperform glasso, as well as suggesting areas for future research in psychology. The nonregularized methods have been implemented in the R package GGMnonreg.

Li C, Huang Q, Yang R, Dai Y, Zeng Y, Tao L, Li X, Zeng J, Wang Q. Gut microbiota composition and bone mineral loss-epidemiologic evidence from individuals in Wuhan, China. Osteoporos Int 2019;30:1003-1013. [PMID: 30666372 DOI: 10.1007/s00198-019-04855-5] [Citation(s) in RCA: 88] [Impact Index Per Article: 17.6] [Received: 11/26/2018] [Accepted: 01/13/2019] [Indexed: 12/11/2022]

Abstract

UNLABELLED

We explored the association between gut microbiota composition and bone mineral loss in Chinese elderly people by high-throughput 16S ribosomal RNA (rRNA) gene sequencing. Compared with controls, a smaller number of operational taxonomic units (OTUs), several taxa with altered abundance, and specific functional pathways were found in individuals with low-bone mineral density (BMD).

INTRODUCTION

Gut microbiota plays important roles in human health and is associated with a number of diseases. However, few studies have explored its association with bone mineral loss in humans.

METHODS

We collected 102 fecal samples from each eligible individual belonging to low-BMD and control groups for high-throughput 16S rRNA gene sequencing.

RESULTS

The low-BMD individuals had a smaller number of OTUs and bacterial taxa at each level. At the phylum level, Bacteroidetes were more abundant in the low-BMD group; Firmicutes were enriched in the control group; Firmicutes and Actinobacteria positively correlated and Bacteroidetes negatively correlated with the BMD and T-score in all subjects. At the family level, the abundance of Lachnospiraceae was reduced in low-BMD individuals and positively correlated with BMD and T-score; meanwhile, BMD increased with increasing Bifidobacteriaceae. At the genus level, low-BMD individuals had decreased proportions of Roseburia compared with controls (P < 0.05). Roseburia, Bifidobacterium, and Lactobacillus positively correlated with BMD and T-score. Furthermore, BMD increased with rising abundance of Bifidobacterium. Functional prediction revealed that 93 metabolic pathways significantly differed between the two groups (FDR-corrected P < 0.05). Most pathways, especially pathways related to LPS biosynthesis, were more abundant in low-BMD individuals than in controls.

CONCLUSIONS

Several taxa with altered abundance and specific functional pathways were discovered in low-BMD individuals. Our findings provide novel epidemiologic evidence to elucidate the underlying microbiota-relevant mechanism in bone mineral loss and osteoporosis.

Chasioti D, Yan J, Nho K, Saykin AJ. Progress in Polygenic Composite Scores in Alzheimer's and Other Complex Diseases. Trends Genet 2019;35:371-382. [PMID: 30922659 PMCID: PMC6475476 DOI: 10.1016/j.tig.2019.02.005] [Citation(s) in RCA: 44] [Impact Index Per Article: 8.8] [Received: 11/26/2018] [Revised: 02/12/2019] [Accepted: 02/22/2019] [Indexed: 11/25/2022]

Keaton SA, Madaj ZB, Heilman P, Smart L, Grit J, Gibbons R, Postolache T, Roaten K, Achtyes E, Brundin L. An inflammatory profile linked to increased suicide risk. J Affect Disord 2019;247:57-65. [PMID: 30654266 PMCID: PMC6860980 DOI: 10.1016/j.jad.2018.12.100] [Citation(s) in RCA: 63] [Impact Index Per Article: 12.6] [Received: 09/14/2018] [Revised: 11/25/2018] [Accepted: 12/24/2018] [Indexed: 12/19/2022]

Abstract

BACKGROUND

Suicide risk assessments are often challenging for clinicians, and therefore, biological markers are warranted as guiding tools in these assessments. Suicidal patients display increased cytokine levels in peripheral blood, although the composite inflammatory profile in the subjects is still unknown. It is also not yet established whether certain inflammatory changes are specific to suicidal subjects. To address this, we measured 45 immunobiological factors in peripheral blood and identified the biological profiles associated with cross-diagnostic suicide risk and depression, respectively.

METHODS

Sixty-six women with mood and anxiety disorders underwent computerized adaptive testing for mental health, assessing depression and suicide risk. Weighted correlation network analysis was used to uncover system level associations between suicide risk, depression, and the immunobiological factors in plasma. Secondary regression models were used to establish the sensitivity of the results to potential confounders, including age, body mass index (BMI), treatment and symptoms of depression and anxiety.

RESULTS

The biological profile of patients assessed to be at increased suicide risk differed from that associated with depression. At the system level, a biological cluster containing increased levels of interleukin-6, lymphocytes, monocytes, white blood cell count and polymorphonuclear leukocyte count significantly impacted suicide risk, with the latter two having the strongest influence. The cytokine interleukin-8 was independently and negatively associated with increased suicide risk. The results remained after adjusting for confounders.

LIMITATIONS

This study is cross-sectional and not designed to prove causality.

DISCUSSION

A unique immunobiological profile was linked to increased suicide risk. The profile was different from that observed in patients with depressive symptoms, and indicates that granulocyte mediated biological mechanisms could be activated in patients at risk for suicide.

Cherlin S, Plant D, Taylor JC, Colombo M, Spiliopoulou A, Tzanis E, Morgan AW, Barnes MR, McKeigue P, Barrett JH, Pitzalis C, Barton A, Consortium MATURA, Cordell HJ. Prediction of treatment response in rheumatoid arthritis patients using genome-wide SNP data. Genet Epidemiol 2018;42:754-771. [PMID: 30311271 PMCID: PMC6334178 DOI: 10.1002/gepi.22159] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 03/27/2018] [Revised: 07/06/2018] [Accepted: 07/28/2018] [Indexed: 01/13/2023]

Affiliation(s)

Svetlana Cherlin Institute of Genetic Medicine, Newcastle University, Newcastle upon Tyne, UK
Darren Plant NIHR Manchester Biomedical Research Centre, Manchester University NHS Foundation Trust, Manchester Academic Health Science Centre, Manchester, UK
John C. Taylor Leeds Institute of Cancer and Pathology, University of Leeds, Leeds, UK; NIHR Leeds Biomedical Research Centre, Leeds Teaching Hospitals NHS Trust, Leeds, UK
Marco Colombo Centre for Population Health Sciences, Usher Institute of Population Health Sciences and Informatics, University of Edinburgh, Edinburgh, UK
Athina Spiliopoulou Centre for Population Health Sciences, Usher Institute of Population Health Sciences and Informatics, University of Edinburgh, Edinburgh, UK
Evan Tzanis Centre for Experimental Medicine and Rheumatology, William Harvey Research Institute, Barts and the London School of Medicine and Dentistry, Queen Mary University of London and Barts Health NHS Trust, London, UK
Ann W. Morgan NIHR Leeds Biomedical Research Centre, Leeds Teaching Hospitals NHS Trust, Leeds, UK; Leeds Institute of Rheumatic and Musculoskeletal Medicine, University of Leeds, Leeds, UK
Michael R. Barnes Centre for Experimental Medicine and Rheumatology, William Harvey Research Institute, Barts and the London School of Medicine and Dentistry, Queen Mary University of London and Barts Health NHS Trust, London, UK
Paul McKeigue Centre for Population Health Sciences, Usher Institute of Population Health Sciences and Informatics, University of Edinburgh, Edinburgh, UK
Jennifer H. Barrett Leeds Institute of Cancer and Pathology, University of Leeds, Leeds, UK; NIHR Leeds Biomedical Research Centre, Leeds Teaching Hospitals NHS Trust, Leeds, UK
Costantino Pitzalis Centre for Experimental Medicine and Rheumatology, William Harvey Research Institute, Barts and the London School of Medicine and Dentistry, Queen Mary University of London and Barts Health NHS Trust, London, UK
Anne Barton NIHR Manchester Biomedical Research Centre, Manchester University NHS Foundation Trust, Manchester Academic Health Science Centre, Manchester, UK; Arthritis Research UK Centre for Genetics and Genomics, Centre for Musculoskeletal Research, The University of Manchester, Manchester, UK
MATURA Consortium Institute of Genetic Medicine, Newcastle University, Newcastle upon Tyne, UK; NIHR Manchester Biomedical Research Centre, Manchester University NHS Foundation Trust, Manchester Academic Health Science Centre, Manchester, UK; Leeds Institute of Cancer and Pathology, University of Leeds, Leeds, UK; NIHR Leeds Biomedical Research Centre, Leeds Teaching Hospitals NHS Trust, Leeds, UK; Centre for Population Health Sciences, Usher Institute of Population Health Sciences and Informatics, University of Edinburgh, Edinburgh, UK; Centre for Experimental Medicine and Rheumatology, William Harvey Research Institute, Barts and the London School of Medicine and Dentistry, Queen Mary University of London and Barts Health NHS Trust, London, UK; Leeds Institute of Rheumatic and Musculoskeletal Medicine, University of Leeds, Leeds, UK; Arthritis Research UK Centre for Genetics and Genomics, Centre for Musculoskeletal Research, The University of Manchester, Manchester, UK
Heather J. Cordell NIHR Manchester Biomedical Research Centre, Manchester University NHS Foundation Trust, Manchester Academic Health Science Centre, Manchester, UK

Jiang W, Lakshminarayanan P, Hui X, Han P, Cheng Z, Bowers M, Shpitser I, Siddiqui S, Taylor RH, Quon H, McNutt T. Machine Learning Methods Uncover Radiomorphologic Dose Patterns in Salivary Glands that Predict Xerostomia in Patients with Head and Neck Cancer. Adv Radiat Oncol 2018;4:401-412. [PMID: 31011686 PMCID: PMC6460328 DOI: 10.1016/j.adro.2018.11.008] [Citation(s) in RCA: 39] [Impact Index Per Article: 6.5] [Received: 08/23/2018] [Accepted: 11/14/2018] [Indexed: 01/06/2023]

Abstract

Purpose

Patients with head-and-neck cancer (HNC) may experience xerostomia after radiation therapy (RT), which leads to compromised quality of life. The purpose of this study is to explore how the spatial pattern of radiation dose (radiomorphology) in the major salivary glands influences xerostomia in patients with HNC.

Methods and materials

A data-driven approach using spatially explicit dosimetric predictors, voxel dose (ie, actual radiation dose in voxels in parotid glands [PG] and submandibular glands [SMG]) was used to predict whether patients would develop xerostomia 3 months after RT. Using planned radiation dose data and other nondose covariates including baseline xerostomia grade of 427 patients with HNC in our database, the machine learning methods were used to investigate the influence of dose patterns across subvolumes in PG and SMG on xerostomia.

Results

Of the 3 supervised learning methods studied, ridge logistic regression yielded the best predictive performance. Ridge logistic regression was also preferred to evaluate the influence pattern of highly correlated dose on xerostomia, which showed a discriminative pattern of influence of doses in the PG and SMG on xerostomia. Moreover, the superior–anterior portion of the contralateral PG and medial portion of the ipsilateral PG were determined to be the most influential regions regarding dose effect on xerostomia. The area under the receiver operating characteristic curve from a 10-fold cross-validation was 0.70 ± 0.04.

Conclusions

Radiomorphology, combined with machine learning methods, is able to suggest patterns of dose in PG and SMG that are the most influential on xerostomia. The influence pattern identified by this data-driven approach and machine learning methods may help improve RT treatment planning and reduce xerostomia after treatment.

Coupé C. Modeling Linguistic Variables With Regression Models: Addressing Non-Gaussian Distributions, Non-independent Observations, and Non-linear Predictors With Random Effects and Generalized Additive Models for Location, Scale, and Shape. Front Psychol 2018;9:513. [PMID: 29713298 PMCID: PMC5911484 DOI: 10.3389/fpsyg.2018.00513] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 11/15/2017] [Accepted: 03/27/2018] [Indexed: 11/13/2022]

Abstract

As statistical approaches are getting increasingly used in linguistics, attention must be paid to the choice of methods and algorithms used. This is especially true since they require assumptions to be satisfied to provide valid results, and because scientific articles still often fall short of reporting whether such assumptions are met. Progress is being, however, made in various directions, one of them being the introduction of techniques able to model data that cannot be properly analyzed with simpler linear regression models. We report recent advances in statistical modeling in linguistics. We first describe linear mixed-effects regression models (LMM), which address grouping of observations, and generalized linear mixed-effects models (GLMM), which offer a family of distributions for the dependent variable. Generalized additive models (GAM) are then introduced, which allow modeling non-linear parametric or non-parametric relationships between the dependent variable and the predictors. We then highlight the possibilities offered by generalized additive models for location, scale, and shape (GAMLSS). We explain how they make it possible to go beyond common distributions, such as Gaussian or Poisson, and offer the appropriate inferential framework to account for 'difficult' variables such as count data with strong overdispersion. We also demonstrate how they offer interesting perspectives on data when not only the mean of the dependent variable is modeled, but also its variance, skewness, and kurtosis. As an illustration, the case of phonemic inventory size is analyzed throughout the article. For over 1,500 languages, we consider as predictors the number of speakers, the distance from Africa, an estimation of the intensity of language contact, and linguistic relationships. We discuss the use of random effects to account for genealogical relationships, the choice of appropriate distributions to model count data, and non-linear relationships. 
Relying on GAMLSS, we assess a range of candidate distributions, including the Sichel, Delaporte, Box-Cox Green and Cole, and Box-Cox t distributions. We find that the Box-Cox t distribution, with appropriate modeling of its parameters, best fits the conditional distribution of phonemic inventory size. We finally discuss the specificities of phoneme counts, weak effects, and how GAMLSS should be considered for other linguistic variables.

Dron JS, Hegele RA. Polygenic influences on dyslipidemias. Curr Opin Lipidol 2018;29:133-143. [PMID: 29300201 DOI: 10.1097/mol.0000000000000482] [Citation(s) in RCA: 46] [Impact Index Per Article: 7.7] [Indexed: 11/26/2022]

Schwabe I, Janss L, van den Berg SM. Can We Validate the Results of Twin Studies? A Census-Based Study on the Heritability of Educational Achievement. Front Genet 2017;8:160. [PMID: 29123543 PMCID: PMC5662588 DOI: 10.3389/fgene.2017.00160] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 07/12/2017] [Accepted: 10/10/2017] [Indexed: 11/13/2022]

Mazo Lopera MA, Coombes BJ, de Andrade M. An Efficient Test for Gene-Environment Interaction in Generalized Linear Mixed Models with Family Data. Int J Environ Res Public Health 2017;14:ijerph14101134. [PMID: 28953253 PMCID: PMC5664635 DOI: 10.3390/ijerph14101134] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 08/19/2017] [Revised: 09/20/2017] [Accepted: 09/25/2017] [Indexed: 02/07/2023]

Incorporating Gene Annotation into Genomic Prediction of Complex Phenotypes. Genetics 2017;207:489-501. [PMID: 28839043 DOI: 10.1534/genetics.117.300198] [Citation(s) in RCA: 30] [Impact Index Per Article: 4.3] [Received: 12/18/2016] [Accepted: 08/16/2017] [Indexed: 11/18/2022]

Abstract

Today, genomic prediction (GP) is an established technology in plant and animal breeding programs. Current standard methods are purely based on statistical considerations but do not make use of the abundant biological knowledge, which is easily available from public databases. Major questions that have to be answered before biological prior information can be used routinely in GP approaches are which types of information can be used, and at which points they can be incorporated into prediction methods. In this study, we propose a novel strategy to incorporate gene annotation into GP of complex phenotypes by defining haploblocks according to gene positions. Haplotype effects are then modeled as categorical or as numerical allele dosage variables. The underlying concept of this approach is to build the statistical model on variables representing the biologically functional units. We evaluate the new methods with data from a heterogeneous stock mouse population, the Drosophila Genetic Reference Panel (DGRP), and a rice breeding population from the Rice Diversity Panel. Our results show that using gene annotation to define haploblocks often leads to a comparable, but for some traits to a higher, predictive ability compared to SNP-based models or to haplotype models that do not use gene annotation information. Modeling gene interaction effects can further improve predictive ability. We also illustrate that the additional use of markers that have not been mapped to any gene in a second separate relatedness matrix does in many cases not lead to a relevant additional increase in predictive ability when the first matrix is based on haploblocks defined with gene annotation data, suggesting that intergenic markers only provide redundant information on the considered data sets. Therefore, gene annotation information seems to be appropriate to perceive the importance of DNA segments. 
Finally, we discuss the effects of gene annotation quality, marker density, and linkage disequilibrium on the performance of the new methods. To our knowledge, this is the first work that incorporates epistatic interaction or gene annotation into haplotype-based prediction approaches.

Su YR, Di CZ, Hsu L. A unified powerful set-based test for sequencing data analysis of GxE interactions. Biostatistics 2016;18:119-131. [PMID: 27474101 DOI: 10.1093/biostatistics/kxw034] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 10/20/2015] [Revised: 04/27/2016] [Accepted: 06/17/2016] [Indexed: 11/13/2022]

Abstract

The development of next-generation sequencing technologies has allowed researchers to study comprehensively the contribution of genetic variation, particularly rare variants, to complex diseases. To date, many sequencing analyses of rare variants have focused on marginal genetic effects and have not explored the potential role environmental factors play in modifying genetic risk. Analysis of gene-environment interaction (GxE) for rare variants poses considerable challenges because of variant rarity and paucity of subjects who carry the variants while being exposed. To tackle this challenge, we propose a hierarchical model to jointly assess the GxE effects of a set of rare variants, for example, in a gene or regulatory region, leveraging the information across the variants. Under this model, GxE is modeled by two components. The first component incorporates variant functional information as weights to calculate the weighted burden of variant alleles across variants, and then assesses their GxE interaction with the environmental factor. Since this information is known a priori, this component enters the model as fixed effects. The second component involves residual GxE effects that have not been accounted for by the fixed effects. In this component, the residual GxE effects are postulated to follow an unspecified distribution with mean 0 and variance [Formula: see text]. We develop a novel testing procedure by deriving two independent score statistics for the fixed effects and the variance component separately. We propose two data-adaptive combination approaches for combining these two score statistics and establish the asymptotic distributions. An extensive simulation study shows that the proposed approaches maintain the correct type I error and that the power is comparable to or better than existing methods under a wide range of scenarios. Finally, we illustrate the proposed methods with an exome-wide GxE analysis with NSAIDs use in colorectal cancer.
