1
Cho SJ, Wu H, Naveiras M. The effective sample size in Bayesian information criterion for level-specific fixed and random-effect selection in a two-level nested model. Br J Math Stat Psychol 2024; 77:289-315. [PMID: 38591555 DOI: 10.1111/bmsp.12327] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2022] [Revised: 08/01/2023] [Accepted: 10/17/2023] [Indexed: 04/10/2024]
Abstract
Popular statistical software provides the Bayesian information criterion (BIC) for multi-level models or linear mixed models. However, it has been observed that the combination of statistical literature and software documentation has led to discrepancies in the formulas of the BIC and uncertainties as to the proper use of the BIC in selecting a multi-level model with respect to level-specific fixed and random effects. These discrepancies and uncertainties result from different specifications of sample size in the BIC's penalty term for multi-level models. In this study, we derive the BIC's penalty term for level-specific fixed- and random-effect selection in a two-level nested design. In this new version of BIC, called $\mathrm{BIC}_{E_1}$, this penalty term is decomposed into two parts if the random-effect variance-covariance matrix has full rank: (a) a term with the log of average sample size per cluster and (b) the total number of parameters times the log of the total number of clusters. Furthermore, we derive the new version of BIC, called $\mathrm{BIC}_{E_2}$, in the presence of redundant random effects. We show that the derived formulae, $\mathrm{BIC}_{E_1}$ and $\mathrm{BIC}_{E_2}$, adhere to empirical values via numerical demonstration and that $\mathrm{BIC}_{E}$ ($E$ indicating either $E_1$ or $E_2$) is the best global selection criterion, as it performs at least as well as BIC with the total sample size and BIC with the number of clusters across various multi-level conditions through a simulation study. In addition, the use of $\mathrm{BIC}_{E_1}$ is illustrated with a textbook example dataset.
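The sample-size discrepancy described in this abstract can be illustrated with the textbook BIC, which penalizes the log-likelihood by the number of parameters times the log of a sample size. The sketch below is only an illustration under assumed values: the design, log-likelihood, and parameter count are hypothetical, and the code does not implement the paper's derived criteria, whose exact form the abstract does not give. It shows how much the two common conventions (total sample size versus number of clusters) differ in a balanced design.

```python
import math


def bic(log_lik: float, n_params: int, sample_size: float) -> float:
    """Textbook BIC: -2 * log-likelihood + n_params * log(sample_size)."""
    return -2.0 * log_lik + n_params * math.log(sample_size)


# Hypothetical balanced two-level design (all numbers are made up).
n_clusters = 50          # J: number of level-2 units
n_per_cluster = 20       # average number of level-1 units per cluster
n_total = n_clusters * n_per_cluster

# Hypothetical fitted model.
log_lik = -1234.5
n_params = 6

bic_total = bic(log_lik, n_params, n_total)        # sample size = N
bic_clusters = bic(log_lik, n_params, n_clusters)  # sample size = J

# log(N) = log(J) + log(N / J), so the two conventions differ by
# n_params * log(average cluster size).
gap = bic_total - bic_clusters
expected_gap = n_params * math.log(n_per_cluster)

print(f"BIC with total N:        {bic_total:.2f}")
print(f"BIC with cluster count:  {bic_clusters:.2f}")
print(f"Difference:              {gap:.2f}")
```

Because the gap grows with both the number of parameters and the cluster size, the two conventions can rank candidate models differently, which is the practical concern the abstract raises.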
Affiliation(s)
- Sun-Joo Cho
- Vanderbilt University, Nashville, Tennessee, USA
- Hao Wu
- Vanderbilt University, Nashville, Tennessee, USA
2
Zhao T, Wang F, Mott R, Dekkers J, Cheng H. Using encrypted genotypes and phenotypes for collaborative genomic analyses to maintain data confidentiality. Genetics 2024; 226:iyad210. [PMID: 38085098 DOI: 10.1093/genetics/iyad210] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2023] [Accepted: 11/13/2023] [Indexed: 03/08/2024] Open
Abstract
To adhere to and capitalize on the benefits of the FAIR (findable, accessible, interoperable, and reusable) principles in agricultural genome-to-phenome studies, it is crucial to address privacy and intellectual property issues that prevent sharing and reuse of data in research and industry. Direct sharing of genotype and phenotype data is often prohibited due to intellectual property and privacy concerns. Thus, there is a pressing need for encryption methods that obscure confidential aspects of the data, without affecting the outcomes of certain statistical analyses. A homomorphic encryption method for genotypes and phenotypes (HEGP) has been proposed for single-marker regression in genome-wide association studies (GWAS) using linear mixed models with Gaussian errors. This methodology permits frequentist likelihood-based parameter estimation and inference. In this paper, we extend HEGP to broader applications in genome-to-phenome analyses. We show that HEGP is suited to commonly used linear mixed models for genetic analyses of quantitative traits including genomic best linear unbiased prediction (GBLUP) and ridge-regression best linear unbiased prediction (RR-BLUP), as well as Bayesian variable selection methods (e.g. those in Bayesian Alphabet), for genetic parameter estimation, genomic prediction, and GWAS. By advancing the capabilities of HEGP, we offer researchers and industry professionals a secure and efficient approach for collaborative genomic analyses while preserving data confidentiality.
Affiliation(s)
- Tianjing Zhao
- Department of Animal Science, University of California, Davis, CA 95616, USA
- Department of Animal Science, University of Nebraska-Lincoln, Lincoln, NE 68583, USA
- Fangyi Wang
- Department of Plant Sciences, University of California, Davis, CA 95616, USA
- Richard Mott
- Genetics Institute, University College London, London, WC1E 6BT, UK
- Jack Dekkers
- Department of Animal Science, Iowa State University, Ames, IA 50011, USA
- Hao Cheng
- Department of Animal Science, University of California, Davis, CA 95616, USA
3
Liu PS, Kuo TY, Chen IC, Lee SW, Chang TG, Chen HL, Chen JP. Optimizing methadone dose adjustment in patients with opioid use disorder. Front Psychiatry 2024; 14:1258029. [PMID: 38260800 PMCID: PMC10800821 DOI: 10.3389/fpsyt.2023.1258029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2023] [Accepted: 12/21/2023] [Indexed: 01/24/2024] Open
Abstract
Introduction: Opioid use disorder is a cause for concern globally. This study aimed to optimize methadone dose adjustments using mixed modeling and machine learning. Methods: This retrospective study was conducted at Taichung Veterans General Hospital between January 1, 2019, and December 31, 2020. Overall, 40,530 daily dosing records and 1,508 urine opiate test results were collected from 96 patients with opioid use disorder. A two-stage approach was used to create a model of the optimized methadone dose. In Stage 1, mixed modeling was performed to analyze the association between methadone dose, age, sex, treatment duration, HIV positivity, referral source, urine opiate level, last methadone dose taken, treatment adherence, and likelihood of treatment discontinuation. In Stage 2, machine learning was performed to build a model for optimized methadone dose. Results: Likelihood of discontinuation was associated with reduced methadone doses (β = 0.002, 95% CI = 0.000-0.081). Correlation analysis between the methadone dose determined by physicians and the optimized methadone dose showed a mean correlation coefficient of 0.995 ± 0.003, indicating that the difference between the methadone dose determined by physicians and that determined by the model was within the allowable range (p < 0.001). Conclusion: We developed a model for methadone dose adjustment in patients with opioid use disorders. By integrating urine opiate levels, treatment adherence, and likelihood of treatment discontinuation, the model could suggest automatic adjustment of the methadone dose, particularly when face-to-face encounters are impractical.
Affiliation(s)
- Po-Shen Liu
- Department of Psychiatry, Taichung Veterans General Hospital, Taichung, Taiwan
- Teng-Yao Kuo
- Fundamental General Education Center, National Chinyi University of Technology, Taiping, Taiwan
- I-Chun Chen
- Department of Psychiatry, Taichung Veterans General Hospital, Taichung, Taiwan
- Faculty of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Department of Post-Baccalaureate Medicine, College of Medicine, National Chung Hsing University, Taichung, Taiwan
- Shu-Wua Lee
- Department of Psychiatry, Taichung Veterans General Hospital, Taichung, Taiwan
- Ting-Gang Chang
- Department of Psychiatry, Taichung Veterans General Hospital, Taichung, Taiwan
- Hou-Liang Chen
- Tsaotun Psychiatric Center, Ministry of Health and Welfare, Nantou, Taiwan
- Jun-Peng Chen
- Biostatistics Task Force of Taichung Veterans General Hospital, Taichung, Taiwan
4
Le Bourdonnec K, Samieri C, Tzourio C, Mura T, Mishra A, Trégouët DA, Proust-Lima C. Addressing unmeasured confounders in cohort studies: Instrumental variable method for a time-fixed exposure on an outcome trajectory. Biom J 2024; 66:e2200358. [PMID: 38098309 DOI: 10.1002/bimj.202200358] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2022] [Revised: 06/12/2023] [Accepted: 08/11/2023] [Indexed: 01/30/2024]
Abstract
Instrumental variable methods, which handle unmeasured confounding by targeting the part of the exposure explained by an exogenous variable not subject to confounding, have gained much interest in observational studies. We consider the very frequent setting of estimating the unconfounded effect of an exposure measured at baseline on the subsequent trajectory of an outcome repeatedly measured over time. We didactically explain how to apply the instrumental variable method in such a setting by adapting the two-stage classical methodology with (1) the prediction of the exposure according to the instrumental variable, (2) its inclusion into a mixed model to quantify the exposure association with the subsequent outcome trajectory, and (3) the computation of the estimated total variance. A simulation study illustrates the consequences of unmeasured confounding in classical analyses and the usefulness of the instrumental variable approach. The methodology is then applied to 6224 participants of the 3C cohort to estimate the association of type-2 diabetes with subsequent cognitive trajectory, using 42 genetic polymorphisms as instrumental variables. This contribution shows how to handle endogeneity when interested in repeated outcomes, along with an R implementation. However, it should still be used with caution as it relies on instrumental variable assumptions that are hardly testable in practice.
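The two-stage idea in this abstract can be sketched on simulated data. The example below is a deliberately simplified illustration, not the authors' method: it uses a single instrument, a single cross-sectional outcome fitted by ordinary least squares in place of the mixed model on repeated measures, and it omits the total-variance computation of step (3). All variable names, coefficients, and the data-generating model are assumptions made for the illustration.

```python
import random


def mean(values):
    return sum(values) / len(values)


def slope_intercept(x, y):
    """Ordinary least squares fit of y on x; returns (slope, intercept)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


rng = random.Random(0)
n = 20000
true_effect = 1.0  # hypothetical causal effect of the exposure on the outcome

z = [rng.gauss(0, 1) for _ in range(n)]  # instrument (independent of u)
u = [rng.gauss(0, 1) for _ in range(n)]  # unmeasured confounder
x = [0.8 * zi + ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]  # exposure
y = [true_effect * xi + 2.0 * ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]

# Classical analysis: regress the outcome on the observed exposure.
naive_slope, _ = slope_intercept(x, y)

# Stage 1: predict the exposure from the instrument.
b1, a1 = slope_intercept(z, x)
x_hat = [a1 + b1 * zi for zi in z]

# Stage 2: regress the outcome on the predicted exposure.
iv_slope, _ = slope_intercept(x_hat, y)

print(f"true effect:        {true_effect:.3f}")
print(f"naive estimate:     {naive_slope:.3f}")
print(f"two-stage estimate: {iv_slope:.3f}")
```

In this simulation the naive estimate absorbs the confounder's contribution, whereas the two-stage estimate uses only the variation in the exposure that the instrument explains. The standard error reported by the second-stage regression alone would not account for the first-stage prediction, which is why the abstract lists a separate variance step.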
Affiliation(s)
- Cécilia Samieri
- Inserm, BPH, U1219, University of Bordeaux, Bordeaux, France
- Thibault Mura
- Institute for Neurosciences of Montpellier INM, University of Montpellier, INSERM, Montpellier, France
- Aniket Mishra
- Inserm, BPH, U1219, University of Bordeaux, Bordeaux, France
5
Stüber AT, Coors S, Schachtner B, Weber T, Rügamer D, Bender A, Mittermeier A, Öcal O, Seidensticker M, Ricke J, Bischl B, Ingrisch M. A Comprehensive Machine Learning Benchmark Study for Radiomics-Based Survival Analysis of CT Imaging Data in Patients With Hepatic Metastases of CRC. Invest Radiol 2023; 58:874-881. [PMID: 37504498 PMCID: PMC10662603 DOI: 10.1097/rli.0000000000001009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2023] [Accepted: 05/24/2023] [Indexed: 07/29/2023]
Abstract
OBJECTIVES: Optimizing a machine learning (ML) pipeline for radiomics analysis involves numerous choices in data set composition, preprocessing, and model selection. Objective identification of the optimal setup is complicated by correlated features, interdependency structures, and a multitude of available ML algorithms. Therefore, we present a radiomics-based benchmarking framework to optimize a comprehensive ML pipeline for the prediction of overall survival. This study is conducted on an image set of patients with hepatic metastases of colorectal cancer, for which radiomics features of the whole liver and of metastases from computed tomography images were calculated. A mixed model approach was used to find the optimal pipeline configuration and to identify the added prognostic value of radiomics features. MATERIALS AND METHODS: In this study, a large-scale ML benchmark pipeline consisting of preprocessing, feature selection, dimensionality reduction, hyperparameter optimization, and training of different models was developed for radiomics-based survival analysis. Portal-venous computed tomography imaging data from a previous prospective randomized trial evaluating radioembolization of liver metastases of colorectal cancer were quantitatively accessible through a radiomics approach. One thousand two hundred eighteen radiomics features of hepatic metastases and the whole liver were calculated, and 19 clinical parameters (age, sex, laboratory values, and treatment) were available for each patient. Three ML algorithms, namely a regression model with elastic net regularization (glmnet), a random survival forest (RSF), and a gradient tree-boosting technique (xgboost), were evaluated for 5 combinations of clinical data, tumor radiomics, and whole-liver features. Hyperparameter optimization and model evaluation were optimized toward the performance metric integrated Brier score via nested cross-validation. To address dependency structures in the benchmark setup, a mixed-model approach was developed to compare ML and data configurations and to identify the best-performing model. RESULTS: Within our radiomics-based benchmark experiment, 60 ML pipeline variations were evaluated on clinical data and radiomics features from 491 patients. Descriptive analysis of the benchmark results showed a preference for RSF-based pipelines, especially for the combination of clinical data with radiomics features. This observation was supported by the quantitative analysis via a linear mixed model approach, computed to differentiate the effect of data sets and pipeline configurations on the resulting performance. This revealed that the RSF pipelines consistently performed similarly to or better than glmnet and xgboost. Further, for the RSF, there was no significantly better-performing pipeline composition regarding the sort of preprocessing or hyperparameter optimization. CONCLUSIONS: Our study introduces a benchmark framework for radiomics-based survival analysis, aimed at identifying the optimal settings with respect to different radiomics data sources and various ML pipeline variations, including preprocessing techniques and learning algorithms. A suitable analysis tool for the benchmark results is provided via a mixed model approach, which showed, for our study on patients with intrahepatic liver metastases, that radiomics features captured the patients' clinical situation in a manner comparable to the information provided by clinical parameters alone. However, we did not observe a relevant additional prognostic value obtained by these radiomics features.
6
Errekagorri I, López del Campo R, Resta R, Castellano J. Performance Analysis of the Spanish Men's Top and Second Professional Football Division Teams during Eight Consecutive Seasons. Sensors (Basel) 2023; 23:9115. [PMID: 38005503 PMCID: PMC10675284 DOI: 10.3390/s23229115] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2023] [Revised: 11/07/2023] [Accepted: 11/09/2023] [Indexed: 11/26/2023]
Abstract
The present study aimed to analyse the performance of the Spanish men's top (LaLiga1) and second (LaLiga2) professional football division teams for eight consecutive seasons (from 2011-2012 to 2018-2019). The variables recorded were Passes, Successful Passes, Crosses, Shots, Goals, Corners, Fouls, Width, Length, Height, distance from the goalkeeper to the nearest defender (GkDef) and total distance covered (TD). The main results were that (1) LaLiga1 teams showed lower values of Length from 2013-2014, and lower values of GkDef and TD from 2014-2015; (2) LaLiga2 teams showed fewer Passes and lower values of GkDef and TD from 2014-2015, and fewer Goals and lower values of Length from 2015-2016; and (3) LaLiga1 teams showed more Passes, Successful Passes, Shots and Goals and higher values of TD compared to LaLiga2 teams during the eight-season period. This study concludes that LaLiga1 teams showed fewer final offensive actions, LaLiga2 teams showed fewer Passes and Goals and the teams of both leagues played in a space with greater density (meters by player), covering less distance as the seasons passed. The information provided in this study makes it possible to have reference values that have characterised the performance of the teams.
Affiliation(s)
- Ibai Errekagorri
- Society, Sports and Physical Exercise Research Group (GIKAFIT), Department of Physical Education and Sport, Faculty of Education and Sport, University of the Basque Country (UPV/EHU), Lasarte 71, 01007 Vitoria-Gasteiz, Spain
- Roberto López del Campo
- Department of Competitions and Mediacoach, LaLiga, Torrelaguna 60, 28043 Madrid, Spain
- Ricardo Resta
- Department of Competitions and Mediacoach, LaLiga, Torrelaguna 60, 28043 Madrid, Spain
- Julen Castellano
- Society, Sports and Physical Exercise Research Group (GIKAFIT), Department of Physical Education and Sport, Faculty of Education and Sport, University of the Basque Country (UPV/EHU), Lasarte 71, 01007 Vitoria-Gasteiz, Spain
7
Modric T, Versic S, Winter C, Coll I, Chmura P, Andrzejewski M, Konefał M, Sekulic D. The effect of team formation on match running performance in UEFA Champions League matches: implications for position-specific conditioning. SCI MED FOOTBALL 2023; 7:366-373. [PMID: 36093788 DOI: 10.1080/24733938.2022.2123952] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 10/14/2022]
Abstract
This study aimed to determine the effect of team formation on position-specific match running performance (MRP) in highest-level football. Players' MRP (n = 226) was observed in four team formations: 3-5-2 (n = 24), 4-4-2 (n = 44), 4-2-3-1 (n = 77) and 4-3-3 (n = 81). Central defenders in the 3-5-2 formation achieved a greater amount of high-intensity running distance than in the 4-3-3 formation (mean difference (MD) [95% confidence interval] = 144 m [12, 267], medium ES). Fullbacks in the 4-4-2 formation covered less total distance than in 3-5-2 (MD = -762 m [-1431, -94], large ES) and 4-2-3-1 (MD = -662 m [-1055, -269], medium ES). Central midfielders' total distance in the 4-4-2 formation was lower than that in the 3-5-2 (MD = -645 m [-79, -1211], medium ES) and 4-3-3 (MD = -656 m [-1181, -132], medium ES) formations. Wide midfielders' walking distance in the 4-4-2 formation was lower than that in the 4-3-3 (MD = -484 m [-742, -226], very large ES) and 4-2-3-1 (MD = -535 m [-789, -282], very large ES) formations. Forwards' high-intensity running in the 4-2-3-1 formation was lower than that in the 4-3-3 (MD = -363 m [-613, -112], large ES) and 4-4-2 (MD = -396 m [-688, -103], large ES) formations. These findings show that conditioning programs for players in all playing positions should be tailored according to the formations of their teams.
Affiliation(s)
- Toni Modric
- Faculty of Kinesiology, University of Split, Split, Croatia
- Sime Versic
- Faculty of Kinesiology, University of Split, Split, Croatia
- HNK Hajduk Split, Split, Croatia
- Christan Winter
- Institute of Sport Science, Johannes Gutenberg University Mainz, Mainz, Germany
- Ian Coll
- HNK Hajduk Split, Split, Croatia
- Paweł Chmura
- Department of Team Games, Wrocław University of Health and Sport Sciences, Wrocław, Poland
- Marcin Andrzejewski
- Department of Methodology of Recreation, Poznań University of Physical Education, Poznań, Poland
- Marek Konefał
- Department of Human Motor Skills, Wrocław University of Health and Sport Sciences, Wrocław, Poland
- Damir Sekulic
- Faculty of Kinesiology, University of Split, Split, Croatia
8
Sampaio Filho JS, Olivoto T, Campos MDS, de Oliveira EJ. Multi-trait selection in multi-environments for performance and stability in cassava genotypes. Front Plant Sci 2023; 14:1282221. [PMID: 37965017 PMCID: PMC10642803 DOI: 10.3389/fpls.2023.1282221] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2023] [Accepted: 10/16/2023] [Indexed: 11/16/2023]
Abstract
Genotype-environment interaction (GEI) presents challenges when aiming to select optimal cassava genotypes, often due to biased genetic estimates. Various strategies have been proposed to address the need for simultaneous improvements in multiple traits, while accounting for performance and yield stability. Among these methods are mean performance and stability (MPS) and the multi-trait mean performance and stability index (MTMPS), both utilizing linear mixed models. This study's objective was to assess genetic variation and GEI effects on fresh root yield (FRY), along with three primary and three secondary traits. A comprehensive evaluation of 22 genotypes was conducted using a randomized complete block design with three replicates across 47 distinct environments (year × location) in Brazil. The broad-sense heritability ($H^2$) averaged 0.37 for primary traits and 0.44 for secondary traits, with plot-based heritability ($h^2_{mg}$) consistently exceeding 0.90 for all traits. The high extent of GEI variance ($\sigma^2_{g \times e}$) demonstrates the GEI effect on the expression of these traits. The dominant analytic factor (FA3) accounted for over 85% of the total variance, and the communality (ɧ) surpassed 87% for all traits. These values collectively suggest a substantial capacity for genetic variance explanation. In Cluster 1, composed of remarkably productive and stable genotypes for primary traits, genotypes BRS Novo Horizonte and BR11-34-69 emerged as prime candidates for FRY enhancement, while BRS Novo Horizonte and BR12-107-002 were indicated for optimizing dry matter content. Moreover, MTMPS, employing a selection intensity of 30%, identified seven genotypes distinguished by heightened stability. This selection encompassed innovative genotypes chosen based on regression variance index ($S^2_{di}$, $R^2$, and RMSE) considerations for multiple traits. In essence, incorporating methodologies that account for stability and productive performance can significantly bolster the credibility of recommendations for novel cassava cultivars.
Affiliation(s)
- Tiago Olivoto
- Department of Crop Science, Federal University of Santa Catarina, Florianópolis, Brazil
9
Chang YHH, Buras MR, Davis JM, Crowson CS. Avoiding Blunders When Analyzing Correlated Data, Clustered Data, or Repeated Measures. J Rheumatol 2023; 50:1269-1272. [PMID: 37188383 PMCID: PMC10543393 DOI: 10.3899/jrheum.2022-1109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 05/09/2023] [Indexed: 05/17/2023]
Abstract
Rheumatology research often involves correlated and clustered data. A common error when analyzing these data is to treat them as independent observations. This can lead to incorrect statistical inference. The data used are a subset of the 2017 study from Raheel et al consisting of 633 patients with rheumatoid arthritis (RA) between 1988 and 2007. RA flare and the number of swollen joints served as our binary and continuous outcomes, respectively. Generalized linear models (GLM) were fitted for each, while adjusting for rheumatoid factor (RF) positivity and sex. Additionally, a generalized linear mixed model with a random intercept and a generalized estimating equation were used to model RA flare and the number of swollen joints, respectively, to take additional correlation into account. The GLM's β coefficients and their 95% confidence intervals (CIs) are then compared to their mixed-effects equivalents. The β coefficients compared between methodologies are very similar. However, their standard errors increase when correlation is accounted for. As a result, if the additional correlations are not considered, the standard error can be underestimated. This results in an overestimated effect size, narrower CIs, increased type I error, and a smaller P value, thus potentially producing misleading results. It is important to model the additional correlation that occurs in correlated data.
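The underestimation of standard errors described in this abstract can be reproduced with a small simulation. The sketch below is an assumption-laden illustration rather than the authors' analysis: instead of a generalized linear mixed model or a generalized estimating equation, it fits a simple linear regression and compares the usual standard error, which assumes independent observations, with a cluster-robust (sandwich) standard error without small-sample correction. The simulated design, with a cluster-level covariate and a random intercept, and all numeric values are hypothetical.

```python
import math
import random

rng = random.Random(1)
n_clusters, n_per_cluster = 100, 10  # e.g. patients and visits per patient

x, y, cluster = [], [], []
for g in range(n_clusters):
    x_g = rng.gauss(0, 1)  # covariate measured once per cluster
    b_g = rng.gauss(0, 1)  # random intercept shared within the cluster
    for _ in range(n_per_cluster):
        x.append(x_g)
        cluster.append(g)
        y.append(0.5 * x_g + b_g + rng.gauss(0, 1))

# Ordinary least squares fit of y on x.
n = len(x)
mx, my = sum(x) / n, sum(y) / n
sxx = sum((xi - mx) ** 2 for xi in x)
slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
intercept = my - slope * mx
resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]

# Naive standard error: treats all observations as independent.
sigma2 = sum(e * e for e in resid) / (n - 2)
naive_se = math.sqrt(sigma2 / sxx)

# Cluster-robust standard error: sum the score contributions within each
# cluster before squaring, so within-cluster correlation is retained.
scores = [0.0] * n_clusters
for xi, ei, g in zip(x, resid, cluster):
    scores[g] += (xi - mx) * ei
robust_se = math.sqrt(sum(s * s for s in scores)) / sxx

print(f"slope:             {slope:.3f}")
print(f"naive SE:          {naive_se:.3f}")
print(f"cluster-robust SE: {robust_se:.3f}")
```

The point estimate is the same in both cases; only the uncertainty attached to it changes, which matches the abstract's observation that the β coefficients are similar across methods while the standard errors increase once correlation is accounted for.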
Affiliation(s)
- Yu-Hui H Chang
- Y.H.H. Chang, PhD, MS, M.R. Buras, MS, Department of Quantitative Health Sciences, Mayo Clinic, Scottsdale, Arizona
- Matthew R Buras
- Y.H.H. Chang, PhD, MS, M.R. Buras, MS, Department of Quantitative Health Sciences, Mayo Clinic, Scottsdale, Arizona
- John M Davis
- J.M. Davis III, MD, MS, Division of Rheumatology, Mayo Clinic, Rochester, Minnesota
- Cynthia S Crowson
- C.S. Crowson, PhD, Division of Rheumatology, and Department of Quantitative Health Sciences, Mayo Clinic, Rochester, Minnesota, USA.
10
Pedrazzini C, Strasser H, Zemp N, Holderegger R, Widmer F, Enkerli J. Spatial and temporal patterns in the population genomics of the European cockchafer Melolontha melolontha in the Alpine region. Evol Appl 2023; 16:1586-1597. [PMID: 37752964 PMCID: PMC10519412 DOI: 10.1111/eva.13588] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2023] [Revised: 08/04/2023] [Accepted: 08/12/2023] [Indexed: 09/28/2023] Open
Abstract
The European cockchafer Melolontha melolontha is an agricultural pest in many European countries. Populations have a synchronized 3- or 4-year life cycle, leading to temporally isolated populations. Despite the economic importance and availability of comprehensive historical as well as current records on cockchafer occurrence, population genomic analyses of M. melolontha are missing. For example, the effects of geographic separation caused by the mountainous terrain of the Alps and of temporal isolation on the genomic structure of M. melolontha still remain unknown. To address this gap, we genotyped 475 M. melolontha adults collected during 3 years from 35 sites in a central Alpine region. Subsequent population structure analyses discriminated two main genetic clusters, i.e., the South Tyrol cluster including collections located southeast of the Alpine mountain range, and a northwestern alpine cluster with all the other collections, reflecting distinct evolutionary history and geographic barriers. The "passo di Resia" linking South and North Tyrol represented a regional contact zone of the two genetic clusters, highlighting genomic differentiation between the collections from the northern and southern regions. Although the collections from northwestern Italy were assigned to the northwestern alpine genetic cluster, they displayed evidence of admixture with the South Tyrolean genetic cluster, suggesting shared ancestry. A linear mixed model confirmed that both geographic distance and, to a lesser extent, temporal isolation had a significant effect on the genetic distance among M. melolontha populations. These effects may be attributed to limited dispersal capacity and reproductive isolation resulting from synchronized and non-synchronized swarming flights, respectively. This study contributes to the understanding of the phylogeography of an organism that is recognized as an agricultural problem and provides significant information on the population genomics of insect species with prolonged temporally shifted and locally synchronized life cycles.
Affiliation(s)
- Chiara Pedrazzini
- Molecular Ecology, Agroscope, Zürich, Switzerland
- Institute of Environmental Systems Science, ETH, Zürich, Switzerland
- Hermann Strasser
- Institute of Microbiology, Leopold-Franzens University Innsbruck, Innsbruck, Austria
- Niklaus Zemp
- Genetic Diversity Centre (GDC), ETH, Zürich, Switzerland
- Rolf Holderegger
- Institute of Environmental Systems Science, ETH, Zürich, Switzerland
- Swiss Federal Research Institute WSL, Birmensdorf, Switzerland
11
Brossard M, Paterson AD, Espin-Garcia O, Craiu RV, Bull SB. Characterization of direct and/or indirect genetic associations for multiple traits in longitudinal studies of disease progression. Genetics 2023; 225:iyad119. [PMID: 37369448 DOI: 10.1093/genetics/iyad119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2023] [Revised: 06/07/2023] [Accepted: 06/19/2023] [Indexed: 06/29/2023] Open
Abstract
When quantitative longitudinal traits are risk factors for disease progression and subject to random biological variation, joint model analysis of time-to-event and longitudinal traits can effectively identify direct and/or indirect genetic association of single nucleotide polymorphisms (SNPs) with time-to-event. We present a joint model that integrates: (1) a multivariate linear mixed model describing trajectories of multiple longitudinal traits as a function of time, SNP effects, and subject-specific random effects and (2) a frailty Cox survival model that depends on SNPs, longitudinal trajectory effects, and subject-specific frailty accounting for dependence among multiple time-to-event traits. Motivated by the complex genetic architecture of type 1 diabetes complications (T1DC) observed in the Diabetes Control and Complications Trial (DCCT), we implement a 2-stage approach to inference with bootstrap joint covariance estimation and develop a hypothesis testing procedure to classify direct and/or indirect SNP association with each time-to-event trait. In a realistic simulation study, we show that joint modeling of 2 time-to-T1DC (retinopathy and nephropathy) and 2 longitudinal risk factors (HbA1c and systolic blood pressure) reduces estimation bias in genetic effects and improves classification accuracy of direct and/or indirect SNP associations, compared to methods that ignore within-subject risk factor variability and dependence among longitudinal and time-to-event traits. Through DCCT data analysis, we demonstrate feasibility for candidate SNP modeling and quantify effects of sample size and Winner's curse bias on classification for 2 SNPs identified as having indirect associations with time-to-T1DC traits. Joint analysis of multiple longitudinal and multiple time-to-event traits provides insight into complex trait architecture.
Affiliation(s)
- Myriam Brossard
- Prosserman Centre for Population Health Research, Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto M5T 3L9, Ontario, Canada
- Andrew D Paterson
- Program in Genetics and Genome Biology, Hospital for Sick Children Research Institute, Toronto M5G 1X8, Ontario, Canada
- Division of Biostatistics, Dalla Lana School of Public Health, University of Toronto, Toronto M5T 3M7, Ontario, Canada
- Osvaldo Espin-Garcia
- Division of Biostatistics, Dalla Lana School of Public Health, University of Toronto, Toronto M5T 3M7, Ontario, Canada
- Department of Biostatistics, Princess Margaret Cancer Centre, Toronto M5G 2C1, Ontario, Canada
- Department of Statistical Sciences, University of Toronto, Toronto M5S 3G3, Ontario, Canada
- Department of Epidemiology and Biostatistics, Western University, London N6A 5C1, Ontario, Canada
- Radu V Craiu
- Department of Statistical Sciences, University of Toronto, Toronto M5S 3G3, Ontario, Canada
- Shelley B Bull
- Prosserman Centre for Population Health Research, Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto M5T 3L9, Ontario, Canada
- Division of Biostatistics, Dalla Lana School of Public Health, University of Toronto, Toronto M5T 3M7, Ontario, Canada
| |
Collapse
|
12
|
An Y, Lee C. Identification and Interpretation of eQTL and eGenes for Hodgkin Lymphoma Susceptibility. Genes (Basel) 2023; 14:1142. [PMID: 37372322 DOI: 10.3390/genes14061142] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2023] [Revised: 05/19/2023] [Accepted: 05/23/2023] [Indexed: 06/29/2023] Open
Abstract
Genome-wide association studies (GWAS) have revealed approximately 100 genomic signals associated with Hodgkin lymphoma (HL); however, their target genes and underlying mechanisms causing HL susceptibility remain unclear. In this study, transcriptome-wide analysis of expression quantitative trait loci (eQTL) was conducted to identify target genes associated with HL GWAS signals. A mixed model, which explains polygenic regulatory effects by the genomic covariance among individuals, was implemented to discover expression genes (eGenes) using genotype data from 462 European/African individuals. Overall, 80 eGenes were identified to be associated with 20 HL GWAS signals. Enrichment analysis identified apoptosis, immune responses, and cytoskeletal processes as functions of these eGenes. The eGene of rs27524 encodes ERAP1 that can cleave peptides attached to human leukocyte antigen in immune responses; its minor allele may help Reed-Sternberg cells to escape the immune response. The eGene of rs7745098 encodes ALDH8A1 that can oxidize the precursor of acetyl-CoA for the production of ATP; its minor allele may increase oxidization activity to evade apoptosis of pre-apoptotic germinal center B cells. Thus, these minor alleles may be genetic risk factors for HL susceptibility. Experimental studies on genetic risk factors are needed to elucidate the underlying mechanisms of HL susceptibility and improve the accuracy of precision oncology.
Affiliation(s)
- Yeeun An
  - Department of Bioinformatics and Life Science, Soongsil University, 369 Sangdo-ro, Dongjak-gu, Seoul 06978, Republic of Korea
- Chaeyoung Lee
  - Department of Bioinformatics and Life Science, Soongsil University, 369 Sangdo-ro, Dongjak-gu, Seoul 06978, Republic of Korea
13
Bi W, Zhou W, Zhang P, Sun Y, Yue W, Lee S. Scalable mixed model methods for set-based association studies on large-scale categorical data analysis and its application to exome-sequencing data in UK Biobank. Am J Hum Genet 2023; 110:762-773. [PMID: 37019109 PMCID: PMC10183366 DOI: 10.1016/j.ajhg.2023.03.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2022] [Accepted: 03/13/2023] [Indexed: 04/07/2023] Open
Abstract
The ongoing release of large-scale sequencing data in the UK Biobank allows for the identification of associations between rare variants and complex traits. SAIGE-GENE+ is a valid approach to conducting set-based association tests for quantitative and binary traits. However, for ordinal categorical phenotypes, applying SAIGE-GENE+ with treating the trait as quantitative or binarizing the trait can cause inflated type I error rates or power loss. In this study, we propose a scalable and accurate method for rare-variant association tests, POLMM-GENE, in which we used a proportional odds logistic mixed model to characterize ordinal categorical phenotypes while adjusting for sample relatedness. POLMM-GENE fully utilizes the categorical nature of phenotypes and thus can well control type I error rates while remaining powerful. In the analyses of UK Biobank 450k whole-exome-sequencing data for five ordinal categorical traits, POLMM-GENE identified 54 gene-phenotype associations.
Affiliation(s)
- Wenjian Bi
  - Department of Medical Genetics, School of Basic Medical Sciences, Peking University, Beijing, China
  - Center for Medical Genetics, School of Basic Medical Sciences, Peking University, Beijing, China
  - Department of Biomedical Informatics, School of Basic Medical Sciences, Peking University, Beijing, China
- Wei Zhou
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Peipei Zhang
  - Department of Biochemistry and Biophysics, School of Basic Medical Sciences, Peking University Health Science Center, Beijing, China
  - Key Laboratory for Neuroscience, Ministry of Education/National Health and Family Planning Commission, Peking University, Beijing, China
- Yaoyao Sun
  - Peking University Sixth Hospital, Peking University Institute of Mental Health, Beijing, China
  - NHC Key Laboratory of Mental Health (Peking University), National Clinical Research Center for Mental Disorders (Peking University Sixth Hospital), Beijing, China
- Weihua Yue
  - Peking University Sixth Hospital, Peking University Institute of Mental Health, Beijing, China
  - NHC Key Laboratory of Mental Health (Peking University), National Clinical Research Center for Mental Disorders (Peking University Sixth Hospital), Beijing, China
  - Henan Key Lab of Biological Psychiatry, the Second Affiliated Hospital of Xinxiang Medical University, Xinxiang, Henan, China
  - Chinese Institute for Brain Research, Beijing, China
- Seunggeun Lee
  - Graduate School of Data Science, Seoul National University, Seoul, Korea
14
Cen S, Gebregziabher M, Moazami S, Azevedo C, Pelletier D. Toward Precision Medicine Using a "Digital Twin" Approach: Modeling the Onset of Disease-Specific Brain Atrophy in Individuals with Multiple Sclerosis. Res Sq 2023:rs.3.rs-2833532. [PMID: 37205476 PMCID: PMC10187410 DOI: 10.21203/rs.3.rs-2833532/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/21/2023]
Abstract
Digital Twin (DT) is a novel concept that may bring a paradigm shift for precision medicine. In this study we demonstrate a DT application for estimating the age of onset of disease-specific brain atrophy in individuals with multiple sclerosis (MS) using brain MRI. We first augmented longitudinal data from a well-fitted spline model derived from a large cross-sectional normal aging dataset. Then we compared different mixed spline models using both simulated and real-life data and identified the mixed spline model with the best fit. Using the appropriate covariate structure selected from 52 different candidate structures, we augmented the thalamic atrophy trajectory over the lifespan for each individual MS patient and a corresponding hypothetical twin with normal aging. Theoretically, the age at which the brain atrophy trajectory of an MS patient deviates from the trajectory of their hypothetical healthy twin can be considered the onset of progressive brain tissue loss. With a 10-fold cross-validation procedure through 1000 bootstrap samples, we found the onset age of progressive brain tissue loss was, on average, 5-6 years prior to clinical symptom onset. Our novel approach also discovered two clear patterns of patient clusters: earlier onset vs. simultaneous onset of brain atrophy.
15
Cakar S, Yavuz FG. Hybrid statistical and machine learning modeling of cognitive neuroscience data. J Appl Stat 2023; 51:1076-1097. [PMID: 38628450 PMCID: PMC11018039 DOI: 10.1080/02664763.2023.2176834] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2022] [Accepted: 01/31/2023] [Indexed: 02/18/2023]
Abstract
The nested data structure is prevalent in cognitive measurement experiments because observations are taken repeatedly from different brain locations within subjects. The analysis methods used for this data type should consider the dependency structure among the repeated measurements. However, the dependency assumption is mainly ignored in the cognitive neuroscience data analysis literature. We consider both statistical and machine learning methods extended to repeated data analysis and compare distinct algorithms in terms of their advantages and disadvantages. Unlike basic algorithm comparison studies, this article analyzes novel neuroscience data considering the dependency structure for the first time with several statistical and machine learning methods and their hybrid forms. In addition, the fitting performances of different algorithms are compared using contaminated data sets and the cross-validation approach. One of our findings suggests that the GLMM tree, including random term indices indicating the location of functional near-infrared spectroscopy optodes nested within experimental units, shows the best predictive performance, with the lowest MSE, RMSE, and MAE. However, there is a trade-off between accuracy and speed, since this algorithm requires the highest computational time.
Affiliation(s)
- Serenay Cakar
  - Department of Statistics, Middle East Technical University, Ankara, Turkey
- Fulya Gokalp Yavuz
  - Department of Statistics, Middle East Technical University, Ankara, Turkey
16
Kemppainen L, Kemppainen T, Fokkema T, Wrede S, Kouvonen A. Neighbourhood Ethnic Density, Local Language Skills, and Loneliness among Older Migrants-A Population-Based Study on Russian Speakers in Finland. Int J Environ Res Public Health 2023; 20:1117. [PMID: 36673878 PMCID: PMC9859331 DOI: 10.3390/ijerph20021117] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2022] [Revised: 01/04/2023] [Accepted: 01/05/2023] [Indexed: 06/17/2023]
Abstract
So far, little attention has been paid to contextual factors shaping loneliness and their interaction with individual characteristics. Moreover, the few existing studies have not included older migrants, identified as a group who are vulnerable to loneliness. This study examined the association between neighbourhood ethnic density (the proportion of own-group residents and the proportion of other ethnic residents in an area) and loneliness among older migrants. Furthermore, we investigated whether local language skills moderated this association. A population-based representative survey (The CHARM study, n = 1082, 57% men, mean age 63.2 years) and postal code area statistics were used to study Russian-speaking migrants aged 50 or older in Finland. The study design and data are hierarchical, with individuals nested in postcode areas. We accounted for this by estimating corresponding mixed models. We used a linear outcome specification and conducted logistic and ordinal robustness checks. After controlling for covariates, we found that ethnic density variables (measured as the proportion of Russian speakers and the proportion of other foreign speakers) were not associated with loneliness. Our interaction results showed that increased own-group ethnic density was associated with a higher level of loneliness among those with good local language skills but not among those with weaker skills. Good local language skills may indicate a stronger orientation towards the mainstream destination society and living in a neighbourhood with a higher concentration of own-language speakers may feel alienating for those who wish to be more included in mainstream society.
Affiliation(s)
- Laura Kemppainen
  - Faculty of Social Sciences, University of Helsinki, P.O. Box 4 (Yliopistonkatu 3), 00014 Helsinki, Finland
- Teemu Kemppainen
  - Department of Geosciences and Geography, University of Helsinki, P.O. Box 4 (Yliopistonkatu 3), 00014 Helsinki, Finland
- Tineke Fokkema
  - Netherlands Interdisciplinary Demographic Institute (NIDI)-KNAW/University of Groningen, Lange Houtstraat 19, 2511 CV The Hague, The Netherlands
  - Department of Public Administration and Sociology, Erasmus School of Social and Behavioural Sciences, Erasmus University Rotterdam, 3062 PA Rotterdam, The Netherlands
- Sirpa Wrede
  - Faculty of Social Sciences, University of Helsinki, P.O. Box 4 (Yliopistonkatu 3), 00014 Helsinki, Finland
- Anne Kouvonen
  - Faculty of Social Sciences, University of Helsinki, P.O. Box 4 (Yliopistonkatu 3), 00014 Helsinki, Finland
  - Centre for Public Health, Institute of Clinical Science, Queen’s University Belfast, Block A, Royal Victoria Hospital, BT12 6BA Belfast, Ireland
17
Rezapour M, Ksaibati K. The mixed-mixed multinomial logit model for identification of factors to the passengers' seatbelt use. Int J Inj Contr Saf Promot 2023; 30:262-269. [PMID: 36595470 DOI: 10.1080/17457300.2022.2164308] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/04/2023]
Abstract
A better understanding of the factors underlying the choice of seatbelt use could contribute to policy solutions that increase the rate of seatbelt usage. To achieve that goal, it is important to obtain unbiased and reliable results by employing a valid statistical technique. In this paper, the latent class (LC) model was extended to account for unobserved heterogeneity across parameters within the same class. The random parameter latent class, or mixed-mixed (MM), model extends the mixed and LC models by adding another layer to the LC model, with the objective of accounting for heterogeneity within the same class. The results indicated that although the LC model outperformed the mixed model, the standard LC model did not account for all of the heterogeneity in the dataset, and adding an extra layer that lets the parameters vary across observations improved the model fit. The results indicated that the seatbelt status of the driver, vehicle type, day of the week, and driver gender are some of the factors affecting whether or not passengers wear their seatbelts. It was also observed that accounting for heterogeneity in day of the week, driver gender, and vehicle type in the second layer of the MM model results in a better fit than the LC technique. The results of this study expand our understanding of the factors behind the choice of seatbelt use while capturing extra heterogeneity in front-seat passengers' choice of seatbelt use. This is one of the earliest studies to implement the technique in the context of traffic safety with individual-specific observations.
Affiliation(s)
- Khaled Ksaibati
  - Department of Civil Engineering, University of Wyoming, Laramie, Wyoming, USA
18
Cesarani A, Mastrangelo S, Congiu M, Portolano B, Gaspa G, Tolone M, Macciotta NPP. Relationship between inbreeding and milk production traits in two Italian dairy sheep breeds. J Anim Breed Genet 2023; 140:28-38. [PMID: 36239218 PMCID: PMC10092622 DOI: 10.1111/jbg.12741] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/22/2022] [Accepted: 09/22/2022] [Indexed: 12/13/2022]
Abstract
The effects of inbreeding in livestock species breeds have been well documented and they have a negative impact on profitability. The objective of this study was to evaluate the levels of inbreeding in Sarda (SAR, n = 785) and Valle del Belice (VdB, n = 473) dairy sheep breeds and their impact on milk production traits. Two inbreeding coefficients (F) were estimated: one using pedigree (F_PED) and one using runs of homozygosity (ROH; F_ROH) at different minimum ROH lengths and different ROH classes. After quality control, 38,779 single nucleotide polymorphisms remained for further analyses. A mixed linear model was used to evaluate the impact of inbreeding coefficients on production traits within each breed. VdB showed higher inbreeding coefficients than SAR, with both breeds showing lower estimates as the minimum ROH length increased. Significant inbreeding depression was found only for milk yield, with a loss of around 7 g/day (SAR) and 9 g/day (VdB) for a 1% increase in F_ROH. The present study confirms that genomic information can be used to manage intra-breed diversity and to calculate the effects of inbreeding on phenotypic traits.
Affiliation(s)
- Alberto Cesarani
  - Dipartimento di Agraria, Università di Sassari, Sassari, Italy
  - Department of Animal and Dairy Science, University of Georgia, Athens, Georgia, USA
- Salvatore Mastrangelo
  - Dipartimento Scienze Agrarie, Alimentari e Forestali, Università di Palermo, Palermo, Italy
- Michele Congiu
  - Dipartimento di Agraria, Università di Sassari, Sassari, Italy
- Baldassare Portolano
  - Dipartimento Scienze Agrarie, Alimentari e Forestali, Università di Palermo, Palermo, Italy
- Giustino Gaspa
  - Dipartimento di Scienze Agrarie, Forestali e Alimentari, Università di Torino, Grugliasco, Italy
- Marco Tolone
  - Dipartimento Scienze Agrarie, Alimentari e Forestali, Università di Palermo, Palermo, Italy
19
Cai R, Zhang J, Li Z, Zeng C, Qiao S, Li X. Using Twitter Data to Estimate the Prevalence of Symptoms of Mental Disorders in the United States During the COVID-19 Pandemic: Ecological Cohort Study. JMIR Form Res 2022; 6:e37582. [PMID: 36459569 PMCID: PMC9770024 DOI: 10.2196/37582] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2022] [Revised: 11/29/2022] [Accepted: 11/30/2022] [Indexed: 12/05/2022] Open
Abstract
BACKGROUND: Existing research and national surveillance data suggest an increase of the prevalence of mental disorders during the COVID-19 pandemic. Social media platforms, such as Twitter, could be a source of data for estimation owing to its real-time nature, high availability, and large geographical coverage. However, there is a dearth of studies validating the accuracy of the prevalence of mental disorders on Twitter compared to that reported by the Centers for Disease Control and Prevention (CDC).
OBJECTIVE: This study aims to verify the feasibility of Twitter-based prevalence of mental disorders symptoms being an instrument for prevalence estimation, where feasibility is gauged via correlations between Twitter-based prevalence of mental disorder symptoms (ie, anxiety and depressive symptoms) and that based on national surveillance data. In addition, this study aims to identify how the correlations changed over time (ie, the temporal trend).
METHODS: State-level prevalence of anxiety and depressive symptoms was retrieved from the national Household Pulse Survey (HPS) of the CDC from April 2020 to July 2021. Tweets were retrieved from the Twitter streaming application programming interface during the same period and were used to estimate the prevalence of symptoms of mental disorders for each state using keyword analysis. Stratified linear mixed models were used to evaluate the correlations between the Twitter-based prevalence of symptoms of mental disorders and those reported by the CDC. The magnitude and significance of model parameters were considered to evaluate the correlations. Temporal trends of correlations were tested after adding the time variable to the model. Geospatial differences were compared on the basis of random effects.
RESULTS: Pearson correlation coefficients between the overall prevalence reported by the CDC and that on Twitter for anxiety and depressive symptoms were 0.587 (P<.001) and 0.368 (P<.001), respectively. Stratified by 4 phases (ie, April 2020, August 2020, October 2020, and April 2021) defined by the HPS, linear mixed models showed that Twitter-based prevalence for anxiety symptoms had a positive and significant correlation with CDC-reported prevalence in phases 2 and 3, while a significant correlation for depressive symptoms was identified in phases 1 and 3.
CONCLUSIONS: Positive correlations were identified between Twitter-based and CDC-reported prevalence, and temporal trends of these correlations were found. Geospatial differences in the prevalence of symptoms of mental disorders were found between the northern and southern United States. Findings from this study could inform future investigation on leveraging social media platforms to estimate symptoms of mental disorders and the provision of immediate prevention measures to improve health outcomes.
Affiliation(s)
- Ruilie Cai
  - Department of Epidemiology and Biostatistics, Arnold School of Public Health, University of South Carolina, Columbia, SC, United States
- Jiajia Zhang
  - Department of Epidemiology and Biostatistics, Arnold School of Public Health, University of South Carolina, Columbia, SC, United States
  - South Carolina SmartState Center for Healthcare Quality, Arnold School of Public Health, University of South Carolina, Columbia, SC, United States
  - University of South Carolina Big Data Health Science Center, Columbia, SC, United States
- Zhenlong Li
  - South Carolina SmartState Center for Healthcare Quality, Arnold School of Public Health, University of South Carolina, Columbia, SC, United States
  - University of South Carolina Big Data Health Science Center, Columbia, SC, United States
  - Geoinformation and Big Data Research Lab, Department of Geography, University of South Carolina, Columbia, SC, United States
- Chengbo Zeng
  - South Carolina SmartState Center for Healthcare Quality, Arnold School of Public Health, University of South Carolina, Columbia, SC, United States
  - University of South Carolina Big Data Health Science Center, Columbia, SC, United States
  - Department of Health Promotion, Education and Behavior, Arnold School of Public Health, University of South Carolina, Columbia, SC, United States
- Shan Qiao
  - South Carolina SmartState Center for Healthcare Quality, Arnold School of Public Health, University of South Carolina, Columbia, SC, United States
  - University of South Carolina Big Data Health Science Center, Columbia, SC, United States
  - Department of Health Promotion, Education and Behavior, Arnold School of Public Health, University of South Carolina, Columbia, SC, United States
- Xiaoming Li
  - South Carolina SmartState Center for Healthcare Quality, Arnold School of Public Health, University of South Carolina, Columbia, SC, United States
  - University of South Carolina Big Data Health Science Center, Columbia, SC, United States
  - Department of Health Promotion, Education and Behavior, Arnold School of Public Health, University of South Carolina, Columbia, SC, United States
20
Fischer A, Dai X, Kalscheur KF. Feed efficiency of lactating Holstein cows is repeatable within diet but less reproducible when changing dietary starch and forage concentrations. Animal 2022; 16:100599. [PMID: 35907383 DOI: 10.1016/j.animal.2022.100599] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2021] [Revised: 06/13/2022] [Accepted: 06/16/2022] [Indexed: 11/01/2022] Open
Abstract
Improving feed efficiency has become an important target for dairy farmers to produce more milk with fewer feed resources. With decreasing availability of arable land to produce feeds that are edible for human consumption, it will be important to increase the proportion of feeds in the diets for dairy cattle that are less edible for human consumption. The current research analyzed the ability of lactating dairy cows to maintain their feed efficiency when switching between a high starch diet (HS diet: 27% starch, 29% NDF, 47.1% forages on a DM basis) and a low starch diet (LS diet: 13% starch, 37% NDF, 66.4% forages on a DM basis). Sixty-two lactating Holstein cows (137 ± 23 days in milk (DIM) at the start of the experiment), of which 29 were primiparous, were used in a crossover design with two 70-d experimental periods, each including a 14-d adaptation period. Feed efficiency was estimated as the individual deviation from the population average intercept in a mixed model predicting DM intake (DMI) with net energy in milk, maintenance, and BW gain and loss. Repeatability was estimated within each diet by comparing feed efficiency estimated over the first 28-day period and the second 28-day period within each diet, using Pearson's and intraclass correlations, and the estimation of error of repeatability. Similarly, reproducibility was estimated by comparing the second 28-day period of one diet with the first 28-day period of the other diet. Feed efficiency was less reproducible across diets than repeatable within the same diet. This was shown by a lower intraclass correlation (0.399) across diets compared to that in the HS diet (0.587) and LS diet (0.806), as well as a lower Pearson's correlation coefficient (0.418) across diets compared to that in the HS diet (0.630) and LS diet (0.809). In addition, the estimation of error of repeatability was higher (0.830 kg DM/d) across diets compared to that in the HS diet (0.761 kg DM/d) and LS diet (0.504 kg DM/d). This means that the feed efficiency of dairy cows is more likely to change after a diet change than over subsequent lactation stages. Other determinants, such as digestive processes, need to be further investigated to determine their effects on estimating feed efficiency.
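The abstract compares two 28-day feed-efficiency estimates per cow using Pearson's and intraclass correlations. A minimal sketch of both statistics follows. It assumes a one-way random-effects ICC(1,1), which the abstract does not specify, and the function names and the example values are hypothetical illustrations, not data from the study.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two paired lists of measurements."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def icc_oneway(x, y):
    """One-way random-effects ICC(1,1) for two measurements per subject.

    Assumption: the study may have used a different ICC form.
    """
    n, k = len(x), 2
    subject_means = [(a + b) / 2 for a, b in zip(x, y)]
    grand_mean = mean(subject_means)
    # Between-subject and within-subject mean squares from a one-way ANOVA.
    msb = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    msw = sum((a - m) ** 2 + (b - m) ** 2
              for a, b, m in zip(x, y, subject_means)) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


# Hypothetical feed-efficiency deviations (kg DM/d) for five cows.
period_1 = [-0.8, -0.2, 0.1, 0.5, 0.9]
period_2 = [-0.6, -0.3, 0.2, 0.4, 1.1]
print(pearson(period_1, period_2), icc_oneway(period_1, period_2))
```

Unlike Pearson's correlation, the ICC also penalizes a systematic shift between the two periods, which may be why reporting both is informative when the diet changes.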
Affiliation(s)
- A Fischer
  - U.S. Dairy Forage Research Center, USDA-Agricultural Research Service, Madison, WI 53706, USA
- X Dai
  - U.S. Dairy Forage Research Center, USDA-Agricultural Research Service, Madison, WI 53706, USA
- K F Kalscheur
  - U.S. Dairy Forage Research Center, USDA-Agricultural Research Service, Madison, WI 53706, USA
21
Zhao T, Zeng J, Cheng H. Extend mixed models to multilayer neural networks for genomic prediction including intermediate omics data. Genetics 2022; 221:6536967. [PMID: 35212766 PMCID: PMC9071534 DOI: 10.1093/genetics/iyac034] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 12/06/2021] [Accepted: 02/17/2022] [Indexed: 11/13/2022] Open
Abstract
With the growing amount and diversity of intermediate omics data complementary to genomics (e.g. DNA methylation, gene expression, and protein abundance), there is a need to develop methods to incorporate intermediate omics data into conventional genomic evaluation. The omics data help decode the multiple layers of regulation from genotypes to phenotypes, thus naturally forming a connected multilayer network. We developed a new method named NN-MM to model the multiple layers of regulation from genotypes to intermediate omics features, then to phenotypes, by extending conventional linear mixed models ("MM") to multilayer artificial neural networks ("NN"). NN-MM incorporates intermediate omics features by adding middle layers between genotypes and phenotypes. Linear mixed models (e.g. pedigree-based BLUP, GBLUP, Bayesian Alphabet, single-step GBLUP, or single-step Bayesian Alphabet) can be used to sample marker effects or genetic values on intermediate omics features, and activation functions in neural networks are used to capture the nonlinear relationships between intermediate omics features and phenotypes. NN-MM had significantly better prediction performance than the recently proposed single-step approach for genomic prediction with intermediate omics data. Compared to the single-step approach, NN-MM can handle various patterns of missing omics measures and allows nonlinear relationships between intermediate omics features and phenotypes. NN-MM has been implemented in an open-source package called "JWAS".
Affiliation(s)
- Tianjing Zhao
  - Department of Animal Science, University of California Davis, Davis, CA 95616, USA
  - Integrative Genetics and Genomics Graduate Group, University of California Davis, Davis, CA 95616, USA
- Jian Zeng
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD 4072, Australia
- Hao Cheng
  - Department of Animal Science, University of California Davis, Davis, CA 95616, USA
  - Corresponding author: Department of Animal Science, University of California, Davis, CA 95616, USA
22
Abstract
We propose fast univariate inferential approaches for longitudinal Gaussian and non-Gaussian functional data. The approach consists of three steps: (1) fit massively univariate pointwise mixed effects models; (2) apply any smoother along the functional domain; and (3) obtain joint confidence bands using analytic approaches for Gaussian data or a bootstrap of study participants for non-Gaussian data. Methods are motivated by two applications: (1) Diffusion Tensor Imaging (DTI) measured at multiple visits along the corpus callosum of multiple sclerosis (MS) patients; and (2) physical activity data measured by body-worn accelerometers for multiple days. An extensive simulation study indicates that model fitting and inference are accurate and much faster than existing approaches. Moreover, the proposed approach was the only one that was computationally feasible for the physical activity data application. Methods are accompanied by R software, though the method is "read-and-use", as it can be implemented by any analyst who is familiar with mixed effects model software.
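The three steps in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' R software: step 1 replaces the pointwise mixed model fit with the average of subject-level means (which matches the fixed-intercept estimate of a random-intercept model only for balanced data), step 2 uses a plain moving average as the smoother, and step 3 builds the joint band from the bootstrap distribution of the maximum standardized deviation, which is one common construction and an assumption here. All function names and the simulated data are hypothetical.

```python
import math
import random
from statistics import mean, pstdev


def simulate_subjects(n_subjects=10, n_visits=3, n_grid=20, seed=0):
    """Made-up data: subjects x visits x grid points along the functional domain."""
    rng = random.Random(seed)
    subjects = []
    for _ in range(n_subjects):
        shift = rng.gauss(0, 0.5)  # subject-specific random intercept
        visits = [[math.sin(2 * math.pi * s / n_grid) + shift + rng.gauss(0, 0.2)
                   for s in range(n_grid)] for _ in range(n_visits)]
        subjects.append(visits)
    return subjects


def pointwise_mean(subjects):
    """Step 1 stand-in: at each grid point, average the subject-level means."""
    n_grid = len(subjects[0][0])
    return [mean([mean([visit[s] for visit in subj]) for subj in subjects])
            for s in range(n_grid)]


def smooth(values, half_window=1):
    """Step 2: moving-average smoother along the functional domain."""
    n = len(values)
    return [mean(values[max(0, i - half_window):min(n, i + half_window + 1)])
            for i in range(n)]


def joint_band(subjects, n_boot=200, level=0.95, seed=0):
    """Step 3: joint confidence band from a bootstrap of subjects."""
    rng = random.Random(seed)
    estimate = smooth(pointwise_mean(subjects))
    n_grid = len(estimate)
    # Resample whole subjects so within-subject dependence is preserved.
    boots = [smooth(pointwise_mean([rng.choice(subjects) for _ in subjects]))
             for _ in range(n_boot)]
    sd = [pstdev([b[s] for b in boots]) or 1e-12 for s in range(n_grid)]
    # Quantile of the maximum standardized deviation over the whole domain.
    max_dev = sorted(max(abs(b[s] - estimate[s]) / sd[s] for s in range(n_grid))
                     for b in boots)
    q = max_dev[min(n_boot - 1, int(level * n_boot))]
    lower = [e - q * d for e, d in zip(estimate, sd)]
    upper = [e + q * d for e, d in zip(estimate, sd)]
    return estimate, lower, upper


estimate, lower, upper = joint_band(simulate_subjects())
print(len(estimate), lower[0] <= estimate[0] <= upper[0])
```

Because each grid point is handled separately in step 1, the pointwise fits are independent of one another and could be run in parallel, which is presumably part of why the approach scales to dense data such as accelerometry.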
Affiliation(s)
- Erjia Cui
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, USA
- Andrew Leroux
  - Department of Biostatistics and Informatics, University of Colorado, USA
23
Mishra SK, Pradhan SK, Pati S, Sahu S, Nanda RK. Waning of Anti-spike Antibodies in AZD1222 (ChAdOx1) Vaccinated Healthcare Providers: A Prospective Longitudinal Study. Cureus 2021; 13:e19879. [PMID: 34976499 PMCID: PMC8712221 DOI: 10.7759/cureus.19879] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Accepted: 11/24/2021] [Indexed: 12/11/2022] Open
Abstract
Introduction: Coronavirus disease 2019 (COVID-19) vaccines are nothing short of a miracle story halting the pandemic across the globe. Nearly half of the global population has received at least one dose. Nevertheless, antibody levels in vaccinated people have shown waning, and breakthrough infections have occurred. Our study aims to measure antibody kinetics over the six months following the second dose of the AZD1222 (ChAdOx1) vaccine and to identify the factors affecting the kinetics. Materials and methods: We conducted a prospective longitudinal study monitoring for six months after the second of two AZD1222 (ChAdOx1) vaccine doses in healthcare professionals and healthcare facility employees at Veer Surendra Sai Institute of Medical Sciences and Research (including doctors, nurses, paramedical staff, security and sanitary workers, and students). Two 0.5-mL doses of the vaccine, each containing 5 × 10^10 viral particles, were administered intramuscularly 28 to 30 days apart. We collected blood samples one month after the first dose (Round 1), one month after the second dose (Round 2), and six months after the second dose (Round 3). We tested for immunoglobulin G (IgG) levels against the receptor-binding domain of the spike protein of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) by chemiluminescence microparticle immunoassay. We conducted a linear mixed model analysis to study the antibody kinetics and influencing factors. Results: Our study included 122 participants (mean age, 41.5 years; 66 men, 56 women). The geometric mean IgG titers were 138.01 binding antibody units (BAU)/mL in Round 1, 176.48 BAU/mL in Round 2, and 112.95 BAU/mL in Round 3. Seven participants showed seroreversion, and 11 had breakthrough infections. Eighty-six participants showed a substantial decline in antibody titer from Round 2 to Round 3. Persons aged 45 or older had a higher mean titer than people younger than 45 years. Overweight and obese participants (BMI ≥ 25 kg/m²) had a higher mean titer than average-weight or underweight persons. On mixed model analysis, the only significant predictor of IgG titers at six months was SARS-CoV-2 infection. Conclusion: We found a substantial decline in antibody levels leading to seven cases of seroreversion in healthcare professionals who received the ChAdOx1 vaccine. History of prior COVID-19 was the only significant factor in antibody levels at six months. Seroreversion and breakthrough infection warrant further research into the optimal timing and potential benefits of booster doses of the AZD1222 (ChAdOx1) COVID-19 vaccine.
Affiliation(s)
- Sanjeeb K Mishra
- Community Medicine, Veer Surendra Sai Institute of Medical Sciences And Research, Sambalpur, IND
- Field Epidemiology, Indian Council of Medical Research, Chennai, IND
- Subrat K Pradhan
- Community Medicine, Veer Surendra Sai Institute of Medical Sciences And Research, Sambalpur, IND
- Sanghamitra Pati
- Biochemistry, Regional Medical Research Centre, Bhubaneswar, Bhubaneswar, IND
- Sumanta Sahu
- Microbiology, Veer Surendra Sai Institute of Medical Sciences And Research, Sambalpur, IND
- Rajiv K Nanda
- Physiology, Veer Surendra Sai Institute of Medical Sciences And Research, Sambalpur, IND
24
Birk N, Matsuzaki M, Fung TT, Li Y, Batis C, Stampfer MJ, Deitchler M, Willett WC, Fawzi WW, Bromage S, Kinra S, Bhupathiraju SN, Lake E. Exploration of Machine Learning and Statistical Techniques in Development of a Low-Cost Screening Method Featuring the Global Diet Quality Score for Detecting Prediabetes in Rural India. J Nutr 2021; 151:110S-118S. [PMID: 34689190 PMCID: PMC8542097 DOI: 10.1093/jn/nxab281] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 06/30/2021] [Revised: 07/26/2021] [Accepted: 08/02/2021] [Indexed: 12/03/2022] Open
Abstract
BACKGROUND The prevalence of type 2 diabetes has increased substantially in India over the past 3 decades. Undiagnosed diabetes presents a public health challenge, especially in rural areas, where access to laboratory testing for diagnosis may not be readily available. OBJECTIVES The present work explores the use of several machine learning and statistical methods in the development of a predictive tool to screen for prediabetes using survey data from an FFQ to compute the Global Diet Quality Score (GDQS). METHODS The outcome variable prediabetes status (yes/no) used throughout this study was determined based upon a fasting blood glucose measurement ≥100 mg/dL. The algorithms utilized included the generalized linear model (GLM), random forest, least absolute shrinkage and selection operator (LASSO), elastic net (EN), and generalized linear mixed model (GLMM) with family unit as a (cluster) random (intercept) effect to account for intrafamily correlation. Model performance was assessed on held-out test data, and comparisons made with respect to area under the receiver operating characteristic curve (AUC), sensitivity, and specificity. RESULTS The GLMM, GLM, LASSO, and random forest modeling techniques each performed quite well (AUCs >0.70) and included the GDQS food groups and age, among other predictors. The fully adjusted GLMM, which included a random intercept for family unit, achieved slightly superior results (AUC of 0.72) in classifying the prediabetes outcome in these cluster-correlated data. CONCLUSIONS The models presented in the current work show promise in identifying individuals at risk of developing diabetes, although further studies are necessary to assess other potentially impactful predictors, as well as the consistency and generalizability of model performance. In addition, future studies to examine the utility of the GDQS in screening for other noncommunicable diseases are recommended.
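The evaluation described in the abstract above (an outcome defined by fasting blood glucose ≥100 mg/dL, then AUC, sensitivity, and specificity on held-out data) can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the function names, the 0.5 decision threshold, and the toy scores are all assumptions made for the example.

```python
def prediabetes_label(fasting_glucose_mg_dl):
    """Outcome definition from the abstract: fasting blood glucose >= 100 mg/dL."""
    return int(fasting_glucose_mg_dl >= 100)


def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def sensitivity_specificity(labels, scores, threshold=0.5):
    """Sensitivity and specificity at an assumed decision threshold."""
    predicted = [int(s >= threshold) for s in scores]
    tp = sum(1 for y, p in zip(labels, predicted) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, predicted) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, predicted) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, predicted) if y == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Toy held-out data, made up for illustration (not from the study).
y_test = [0, 0, 1, 1]
risk_scores = [0.1, 0.4, 0.35, 0.8]

print(auc(y_test, risk_scores))
print(sensitivity_specificity(y_test, risk_scores))
```

The pairwise definition of AUC is used here because it needs no external library; for real data a library routine would normally be preferred.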
Affiliation(s)
- Nick Birk
- Department of Biostatistics, Harvard TH Chan School of Public Health, Boston, MA, USA
- Department of Non-Communicable Disease Epidemiology, London School of Hygiene and Tropical Medicine, University of London, London, United Kingdom
- Mika Matsuzaki
- Department of Nutrition, Harvard TH Chan School of Public Health, Boston, MA, USA
- Department of International Health, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- Teresa T Fung
- Nutrition Department, Simmons University, Boston, MA, USA
- Yanping Li
- Department of Nutrition, Harvard TH Chan School of Public Health, Boston, MA, USA
- Carolina Batis
- CONACYT—Health and Nutrition Research Center, National Institute of Public Health, Cuernavaca, Mexico
- Meir J Stampfer
- Department of Nutrition, Harvard TH Chan School of Public Health, Boston, MA, USA
- Department of Epidemiology, Harvard TH Chan School of Public Health, Boston, MA, USA
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Megan Deitchler
- Intake—Center for Dietary Assessment, FHI Solutions, Washington, DC, USA
- Walter C Willett
- Department of Nutrition, Harvard TH Chan School of Public Health, Boston, MA, USA
- Department of Epidemiology, Harvard TH Chan School of Public Health, Boston, MA, USA
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Wafaie W Fawzi
- Department of Global Health and Population, Harvard TH Chan School of Public Health, Boston, MA, USA
- Sabri Bromage
- Department of Nutrition, Harvard TH Chan School of Public Health, Boston, MA, USA
- Sanjay Kinra
- Department of Non-Communicable Disease Epidemiology, London School of Hygiene and Tropical Medicine, University of London, London, United Kingdom
- Shilpa N Bhupathiraju
- Department of Nutrition, Harvard TH Chan School of Public Health, Boston, MA, USA
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Erin Lake
- Department of Biostatistics, Harvard TH Chan School of Public Health, Boston, MA, USA
25
Suchocki T, Czech B, Dunislawska A, Slawinska A, Derebecka N, Wesoly J, Siwek M, Szyda J. SNP prioritization in targeted sequencing data associated with humoral immune responses in chicken. Poult Sci 2021; 100:101433. [PMID: 34551372 PMCID: PMC8458985 DOI: 10.1016/j.psj.2021.101433] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2021] [Revised: 07/13/2021] [Accepted: 08/15/2021] [Indexed: 11/30/2022] Open
Abstract
Our study aimed to identify single nucleotide polymorphisms (SNPs) with a significant impact on the innate immunity represented by antibody response against lipopolysaccharide (LPS) and lipoteichoic acid (LTA) and on the adaptive immune response represented by antibody response toward keyhole limpet hemocyanin (KLH), using the SNP prioritization method. The data set consisted of 288 F2 experimental individuals, created by crossing Green-legged Partridgelike and White Leghorn. The analyzed SNPs were located within 24 short genomic regions of GGA1, GGA2, GGA3, GGA4, GGA9, GGA10, GGA14, GGA18, and GGZ, pre-targeted based on literature references and database information. For the specific antibody response toward KLH at d 0, the most highly prioritized SNPs for additive and dominance effects were located on GGA2 in the 3’UTR of MYD88. For the response at d 7, the most highly prioritized SNP pointed at the 3’UTR of MYD88, but potential causal additive variants were located within ADIPOQ and one in PROCR. The highest priority for additive and dominance effects in the antibody response toward lipoteichoic acid at d 0 was attributed to the same SNP, located on GGA2 in the 3’UTR region of MYD88. Two SNPs among the top 10 for additive effect were located in the exon of NOCT. SNPs selected for their additive effect on antibody response toward lipopolysaccharide at d 0 marked 3 genes: NOCT, MYD88, and SNX8, while SNPs selected for their dominance effect marked NOCT, ADIPOQ, and MYD88. The top 10 variants identified in our study were located in different functional parts of the genome. In the context of causality, three groups can be distinguished: variants located in exons of protein-coding genes (ADIPOQ, NOCT, PROCR, SNX8), variants within exons of non-coding transcripts, and variants located in genes’ UTR regions. Variants from the first group influence protein structure, and variants from the latter two groups exhibit regulatory roles on DNA (UTR) or RNA (lncRNA).
Affiliation(s)
- Tomasz Suchocki
- Biostatistics Group, Department of Genetics, Wrocław University of Environmental and Life Sciences, Wrocław, Poland; National Research Institute of Animal Production, Balice, Poland
- Bartosz Czech
- Biostatistics Group, Department of Genetics, Wrocław University of Environmental and Life Sciences, Wrocław, Poland
- Aleksandra Dunislawska
- Department of Animal Biotechnology and Genetics, UTP University of Science and Technology, Bydgoszcz 85-084, Poland
- Anna Slawinska
- Department of Animal Biotechnology and Genetics, UTP University of Science and Technology, Bydgoszcz 85-084, Poland
- Natalia Derebecka
- Laboratory of High Throughput Technologies, Faculty of Biology, Adam Mickiewicz University, Poznan, Poland
- Joanna Wesoly
- Laboratory of High Throughput Technologies, Faculty of Biology, Adam Mickiewicz University, Poznan, Poland
- Maria Siwek
- Department of Animal Biotechnology and Genetics, UTP University of Science and Technology, Bydgoszcz 85-084, Poland
- Joanna Szyda
- Biostatistics Group, Department of Genetics, Wrocław University of Environmental and Life Sciences, Wrocław, Poland; National Research Institute of Animal Production, Balice, Poland
26
Chuy V, Gentreau M, Artero S, Berticat C, Rigalleau V, Pérès K, Helmer C, Samieri C, Féart C. Simple carbohydrate intake and higher risk for physical frailty over 15 years in community-dwelling older adults. J Gerontol A Biol Sci Med Sci 2021; 77:10-18. [PMID: 34417799 DOI: 10.1093/gerona/glab243] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2021] [Indexed: 11/13/2022] Open
Abstract
Insulin resistance is a major mechanism involved in the onset of physical frailty (PF). Although carbohydrate-rich diets may promote insulin resistance, few studies have examined their association with PF risk. This study aimed to investigate the association of the spectrum of carbohydrate exposure, including carbohydrate intake (simple, complex, and total), glycemic load (a measure of diet-related insulin demand), and adherence to a low-carbohydrate diet, with the incident risk of PF in community-dwelling older adults. Baseline carbohydrate exposure was assessed in non-frail participants of the Three-City-Bordeaux cohort using a 24-hour dietary recall. Over 15 years of follow-up, participants were screened for PF, defined by the FRAIL scale (≥3 criteria out of Fatigue, Resistance, Ambulation, Illnesses, and weight Loss). Associations were estimated using mixed-effects logistic models adjusted for sex, age, education, smoking status, alcohol consumption, depressive symptomatology, global cognitive performance, and protein and energy intakes. The sample included 1,210 participants (62% females, mean age 76 years). Over the follow-up, 295 (24%) incident cases of PF were documented (28% in females, 18% in males). Higher intake of simple carbohydrates was significantly associated with greater odds of incident PF (per 1-SD increase: OR = 1.29; 95% CI = 1.02-1.62), specifically among males (OR = 1.52; 95% CI = 1.04-2.22). No association was observed with complex or total carbohydrate intake, glycemic load, or low-carbohydrate diet. Across the whole spectrum of carbohydrate exposure, only higher consumption of simple carbohydrates in older age was associated with a higher risk of developing PF. Further studies are required to explore underlying mechanisms.
Affiliation(s)
- Virginie Chuy
- Univ. Bordeaux, INSERM, BPH, U1219, Bordeaux, France; Univ. Bordeaux, CHU Bordeaux, Department of Dentistry and Oral Health, Bordeaux, France
- Mélissa Gentreau
- Institute of Functional Genomics, University of Montpellier, CNRS, INSERM, Montpellier, France
- Sylvaine Artero
- Institute of Functional Genomics, University of Montpellier, CNRS, INSERM, Montpellier, France
- Claire Berticat
- ISEM, University of Montpellier, CNRS, EPHE, IRD, Montpellier, France
- Vincent Rigalleau
- Univ. Bordeaux, INSERM, BPH, U1219, Bordeaux, France; Univ. Bordeaux, CHU Bordeaux, Department of Endocrinology, Bordeaux, France
- Karine Pérès
- Univ. Bordeaux, INSERM, BPH, U1219, Bordeaux, France
- Catherine Helmer
- Univ. Bordeaux, INSERM, BPH, U1219, Bordeaux, France; Clinical and Epidemiological Research Unit, INSERM CIC1401, Bordeaux, France
27
Devasia TP, Dewaraja YK, Frey KA, Wong KK, Schipper MJ. A Novel Time-Activity Information-Sharing Approach Using Nonlinear Mixed Models for Patient-Specific Dosimetry with Reduced Imaging Time Points: Application in SPECT/CT After 177Lu-DOTATATE. J Nucl Med 2021; 62:1118-1125. [PMID: 33443063 DOI: 10.2967/jnumed.120.256255] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 09/04/2020] [Accepted: 12/01/2020] [Indexed: 11/16/2022] Open
Abstract
Multiple-time-point SPECT/CT imaging for dosimetry is burdensome for patients and lacks statistical efficiency. A novel method for joint kidney time-activity estimation based on a statistical mixed model, a prior cohort of patients with complete time-activity data, and only 1 or 2 imaging points for new patients was compared with previously proposed single-time-point methods in virtual and clinical patient data. Methods: Data were available for 10 patients with neuroendocrine tumors treated with 177Lu-DOTATATE and imaged up to 4 times between days 0 and 7 using SPECT/CT. Mixed models using 1 or 2 time points were evaluated retrospectively in the clinical cohort, using the multiple-time-point fit as the reference. Time-activity data for 250 virtual patients were generated using parameter values from the clinical cohort. Mixed models were fit using 1 (∼96 h) and 2 (4 h, ∼96 h) time points for each virtual patient combined with complete data for the other patients in each dataset. Time-integrated activities (TIAs) calculated from mixed model fits and other reduced-time-point methods were compared with known values. Results: All mixed models and single-time-point methods performed well overall, achieving mean bias < 7% in the virtual cohort. Mixed models exhibited lower bias, greater precision, and substantially fewer outliers than did single-time-point methods. For clinical patients, 1- and 2-time-point mixed models resulted in more accurate TIA estimates for 94% (17/18) and 72% (13/18) of kidneys, respectively. In virtual patients, mixed models resulted in more than a 2-fold reduction in the proportion of kidneys with |bias| > 10% (6% vs. 15%). Conclusion: Mixed models based on a historical cohort of patients with complete time-activity data and new patients with only 1 or 2 SPECT/CT scans demonstrate less bias on average and significantly fewer outliers when estimating kidney TIA, compared with popular reduced-time-point methods. 
Use of mixed models allows for reduction of the imaging burden while maintaining accuracy, which is crucial for clinical implementation of dosimetry-based treatment.
Affiliation(s)
- Theresa P Devasia
- Department of Biostatistics, University of Michigan, Ann Arbor, Michigan
- Yuni K Dewaraja
- Department of Radiology, University of Michigan, Ann Arbor, Michigan
- Kirk A Frey
- Department of Radiology, University of Michigan, Ann Arbor, Michigan
- Ka Kit Wong
- Department of Radiology, University of Michigan, Ann Arbor, Michigan
- Matthew J Schipper
- Department of Biostatistics, University of Michigan, Ann Arbor, Michigan
28
Tao C, Gu M, Xu P, Wang J, Xiao L, Gui W, Li F, Jiang S, Liu X, Hu W, Sun W. Stressful life events can predict post-stroke fatigue in patients with ischemic stroke. Eur J Neurol 2021; 28:3080-3088. [PMID: 34129716 DOI: 10.1111/ene.14977] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/10/2021] [Revised: 06/09/2021] [Accepted: 06/10/2021] [Indexed: 01/04/2023]
Abstract
OBJECTIVE To investigate whether stressful life events (SLEs) can predict post-stroke fatigue (PSF) in patients with acute ischemic stroke (AIS). METHODS This prospective cohort study included data from patients with AIS who were followed up to the 2-year interview. PSF was assessed at admission and at 6 (n = 916), 12 (n = 880), and 24 (n = 857) months with the fatigue severity scale (FSS). SLEs were measured with the Social Readjustment Rating Scale questionnaire at the 6-, 12- and 24-month interviews. RESULTS A significant dose-response association was found between SLEs and FSS score across all examined time-points: compared with those who did not experience SLEs, FSS score was higher for those experiencing ≥3 SLEs at 6 months (β 0.53, 95% CI 0.28-0.78), 12 months (β 0.54, 95% CI 0.30-0.78) and 24 months (β 0.48, 95% CI 0.29-0.68). Longitudinal analyses indicated a significantly positive relationship between the number of SLEs and FSS score (SLEs: ≥3 vs. 0, β 0.14, 95% CI 0.09-0.19). Moreover, a distinct interaction of follow-up time and SLE numbers on FSS score was observed (p < 0.05), which means elevated exposure to SLEs during follow-up was associated with a lower rate of fatigue decline. A similar association was found in SLE load analysis. CONCLUSION Patients with severe fatigue were more likely to report an increased number of SLEs in the previous 6 months, which could suggest that a non-specific stressful event adds an extra burden to an already vulnerable psychological system.
Affiliation(s)
- Chunrong Tao
- Stroke Center & Department of Neurology, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China
- Mengmeng Gu
- Department of Neurology, Nanjing First Hospital, Nanjing Medical University, Nanjing, China
- Pengfei Xu
- Stroke Center & Department of Neurology, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China
- Jinjing Wang
- Department of Neurology, Affiliated Jinling Hospital, Medical School of Nanjing University, Nanjing, China
- Lulu Xiao
- Department of Neurology, Affiliated Jinling Hospital, Medical School of Nanjing University, Nanjing, China
- Wei Gui
- Stroke Center & Department of Neurology, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China
- Fengli Li
- Department of Neurology, Xinqiao Hospital and The Second Affiliated Hospital, Army Medical University (Third Military Medical University), Chongqing, China
- Shiyi Jiang
- Stroke Center & Department of Neurology, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China
- Xinfeng Liu
- Stroke Center & Department of Neurology, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China; Department of Neurology, Nanjing First Hospital, Nanjing Medical University, Nanjing, China
- Wei Hu
- Stroke Center & Department of Neurology, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China
- Wen Sun
- Stroke Center & Department of Neurology, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China; Department of Neurology, Affiliated Jinling Hospital, Medical School of Nanjing University, Nanjing, China
29
Arakawa A, Hayashi T, Taniguchi M, Mikawa S, Nishio M. Hamiltonian Monte Carlo method for estimating variance components. Anim Sci J 2021; 92:e13575. [PMID: 34227195 DOI: 10.1111/asj.13575] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/23/2020] [Revised: 05/09/2021] [Accepted: 05/12/2021] [Indexed: 11/27/2022]
Abstract
The Hamiltonian Monte Carlo algorithm is a Markov chain Monte Carlo method with the potential to make parameter estimation more efficient. Hamiltonian Monte Carlo is based on Hamiltonian dynamics and follows Hamilton's equations, which are expressed as two differential equations. In the sampling process, a numerical integration method called leapfrog integration is used to solve Hamilton's equations approximately; the integration requires setting the number of discrete time steps and the integration step size. These two parameters require some tuning and calibration for effective sampling. In this study, we applied the Hamiltonian Monte Carlo method to animal breeding data and identified the optimal tunings of leapfrog integration for normal and inverse chi-square distributions. Then, using real pig data, we revealed the properties of the Hamiltonian Monte Carlo method with the optimal tuning by applying models including variance explained by pedigree information or genomic information. Compared with the Gibbs sampling method, the Hamiltonian Monte Carlo method had superior performance in both models. We have provided the source code of this method, written in Fortran, at https://github.com/A-ARAKAWA/HMC.
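The role of the two tuning parameters named in the abstract above (the number of leapfrog steps and the step size) may be easier to see in a small sketch. The code below is a generic one-dimensional Hamiltonian Monte Carlo sampler for a standard normal target, written in Python for illustration. It is not the authors' Fortran implementation, and the function names, the target distribution, and the default tuning values are all assumptions.

```python
import math
import random


def leapfrog(q, p, grad_u, step_size, n_steps):
    """Approximate Hamilton's equations for position q and momentum p."""
    p = p - 0.5 * step_size * grad_u(q)        # initial half step for momentum
    for i in range(n_steps):
        q = q + step_size * p                  # full step for position
        if i < n_steps - 1:
            p = p - step_size * grad_u(q)      # full step for momentum
    p = p - 0.5 * step_size * grad_u(q)        # final half step for momentum
    return q, p


def hmc_sample(u, grad_u, q0, n_samples, step_size=0.2, n_steps=20, seed=0):
    """Draw samples from exp(-u(q)) with a Metropolis accept/reject correction."""
    rng = random.Random(seed)
    q = q0
    samples = []
    for _ in range(n_samples):
        p0 = rng.gauss(0.0, 1.0)               # fresh momentum each iteration
        q_new, p_new = leapfrog(q, p0, grad_u, step_size, n_steps)
        h_old = u(q) + 0.5 * p0 * p0           # Hamiltonian before the trajectory
        h_new = u(q_new) + 0.5 * p_new * p_new # Hamiltonian after the trajectory
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            q = q_new                          # accept the proposal
        samples.append(q)
    return samples


def u_normal(q):
    """Potential energy of a standard normal target (assumed for this example)."""
    return 0.5 * q * q


def grad_u_normal(q):
    """Gradient of u_normal."""
    return q


draws = hmc_sample(u_normal, grad_u_normal, q0=0.0, n_samples=2000)
```

A step size that is too large makes the leapfrog energy error grow and proposals get rejected, while too few steps make successive draws highly correlated, which is why both values need calibration.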
Affiliation(s)
- Aisaku Arakawa
- Division of Animal Breeding and Reproduction Research, Institute of Livestock and Grassland Science, National Agriculture and Food Research Organization, Tsukuba, Ibaraki, Japan
- Takeshi Hayashi
- Division of Basic Research, Institute of Crop Science, National Agriculture and Food Research Organization, Tsukuba, Ibaraki, Japan
- Masaaki Taniguchi
- Division of Animal Breeding and Reproduction Research, Institute of Livestock and Grassland Science, National Agriculture and Food Research Organization, Tsukuba, Ibaraki, Japan
- Satoshi Mikawa
- Division of Animal Breeding and Reproduction Research, Institute of Livestock and Grassland Science, National Agriculture and Food Research Organization, Tsukuba, Ibaraki, Japan
- Motohide Nishio
- Division of Animal Breeding and Reproduction Research, Institute of Livestock and Grassland Science, National Agriculture and Food Research Organization, Tsukuba, Ibaraki, Japan
30
Ibidhi R, Bharanidharan R, Kim JG, Hong WH, Nam IS, Baek YC, Kim TH, Kim KH. Developing Equations for Converting Digestible Energy to Metabolizable Energy for Korean Hanwoo Beef Cattle. Animals (Basel) 2021; 11:1696. [PMID: 34200254 DOI: 10.3390/ani11061696] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2021] [Revised: 05/29/2021] [Accepted: 06/05/2021] [Indexed: 11/29/2022] Open
Abstract
Simple Summary: The available energy in feedstuff represents the largest proportion of the total cost for intensive beef production. Therefore, the energy content of feeds must be known before diet formulation. The determination of digestible energy (DE) and metabolizable energy (ME) values by animal experiments is both time-consuming and costly. Predictive equations to estimate the ME from DE can be useful for feed ingredient evaluations and diet formulations. A range of regression equations were developed in the present study, taking into consideration the gender and body weights of the animals, as well as the feed nutrients, to predict the relationship between the DE and ME. An evaluation of these equations suggested predicting the ME value based on ME = 0.9215 × DE − 0.1434 (R² = 0.999). The generation of these predictive equations represents a step towards updating the ME:DE default conversion factor value of 0.82 adopted from the National Research Council to meet the ME requirements of beef cattle in Korea. The new recommended predictive equation enables the adjustment of the nutrient requirements, thus enhancing animal productivity and maximising the economic return for beef farmers.
Abstract: This study was performed to update and generate prediction equations for converting digestible energy (DE) to metabolizable energy (ME) for Korean Hanwoo beef cattle, taking into consideration the gender (male and female) and body weights (BW above and below 350 kg) of the animals. The data consisted of 141 measurements from respiratory chambers with a wide range of diets and energy intake levels. A simple linear regression of the overall unadjusted data suggested a strong relationship between the DE and ME (Mcal/kg DM): ME = 0.8722 × DE + 0.0016 (coefficient of determination (R²) = 0.946, root mean square error (RMSE) = 0.107, p < 0.001 for intercept and slope). Mixed-model regression analyses to adjust for the effects of the experiment from which the data were obtained similarly showed a strong linear relationship between the DE and ME (Mcal/kg of DM): ME = 0.9215 × DE − 0.1434 (R² = 0.999, RMSE = 0.004, p < 0.001 for the intercept and slope). The DE was strongly related to the ME for both genders: ME = 0.8621 × DE + 0.0808 (R² = 0.9600, RMSE = 0.083, p < 0.001 for the intercept and slope) and ME = 0.7785 × DE + 0.1546 (R² = 0.971, RMSE = 0.070, p < 0.001 for the intercept and slope) for male and female Hanwoo cattle, respectively. By BW, the simple linear regression similarly showed a strong relationship between the DE and ME for Hanwoo above and below 350 kg BW: ME = 0.9833 × DE − 0.2760 (R² = 0.991, RMSE = 0.055, p < 0.001 for the intercept and slope) and ME = 0.72975 × DE + 0.38744 (R² = 0.913, RMSE = 0.100, p < 0.001 for the intercept and slope), respectively. A multiple regression using the DE and dietary factors as independent variables did not improve the accuracy of the ME prediction (ME = 1.149 × DE − 0.045 × crude protein + 0.011 × neutral detergent fibre − 0.027 × acid detergent fibre + 0.683).
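As a worked illustration of the recommended equation, the short sketch below compares the mixed-model prediction, ME = 0.9215 × DE − 0.1434, with the default 0.82 conversion factor mentioned in the summary. The coefficients come from the abstract; the function names and the example DE value of 3.0 Mcal/kg DM are hypothetical and chosen only for the arithmetic.

```python
NRC_DEFAULT_FACTOR = 0.82  # ME:DE default conversion factor cited in the summary


def me_from_de(de, slope=0.9215, intercept=-0.1434):
    """ME (Mcal/kg DM) from DE (Mcal/kg DM) using the mixed-model equation."""
    return slope * de + intercept


def me_from_de_default(de):
    """ME (Mcal/kg DM) from DE using the fixed default conversion factor."""
    return NRC_DEFAULT_FACTOR * de


example_de = 3.0  # hypothetical diet, Mcal/kg DM
print(round(me_from_de(example_de), 4))          # 0.9215 * 3.0 - 0.1434 = 2.6211
print(round(me_from_de_default(example_de), 4))  # 0.82 * 3.0 = 2.46
```

For this hypothetical diet the regression equation gives a higher ME than the fixed factor. The size and direction of that difference depend on the DE value, because the equation has a non-zero intercept.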
31
Calvo E, Allel K, Staudinger UM, Castillo-Carniglia A, Medina JT, Keyes KM. Cross-country differences in age trends in alcohol consumption among older adults: a cross-sectional study of individuals aged 50 years and older in 22 countries. Addiction 2021; 116:1399-1412. [PMID: 33241648 PMCID: PMC8131222 DOI: 10.1111/add.15292] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 02/20/2020] [Revised: 06/01/2020] [Accepted: 10/02/2020] [Indexed: 12/14/2022]
Abstract
BACKGROUND AND AIMS Age-related changes in physiological, metabolic and medication profiles make alcohol consumption likely to be more harmful among older than younger adults. This study aimed to estimate cross-national variation in the quantity and patterns of drinking throughout older age, and to investigate country-level variables explaining cross-national variation in consumption for individuals aged 50 years and older. DESIGN Cross-sectional observational study using previously harmonized survey data. SETTING Twenty-two countries surveyed in 2010 or the closest available year. PARTICIPANTS A total of 106 180 adults aged 50 years and over. MEASUREMENTS Cross-national variation in age trends were estimated for two outcomes: weekly number of standard drink units (SDUs) and patterns of alcohol consumption (never, ever, occasional, moderate and heavy drinking). Human Development Index and average prices of vodka were used as country-level variables moderating age-related declines in drinking. FINDINGS Alcohol consumption was negatively associated with age (risk ratio = 0.98; 95% confidence interval = 0.97, 0.99; P-value < 0.001), but there was substantial cross-country variation in the age-related differences in alcohol consumption [likelihood ratio (LR) test P-value < 0.001], even after adjusting for the composition of populations. Countries' development level and alcohol prices explained 31% of cross-country variability in SDUs (LR test P-value < 0.001) but did not explain cross-country variability in the prevalence of heavy drinkers. CONCLUSIONS Use and harmful use of alcohol among older adults appears to vary widely across age and countries. This variation can be partly explained both by the country-specific composition of populations and country-level contextual factors such as development level and alcohol prices.
Affiliation(s)
- Esteban Calvo
  - Society and Health Research Center, School of Public Health, Universidad Mayor, Santiago, Chile
  - Laboratory on Aging and Social Epidemiology, Facultad de Humanidades, Universidad Mayor, Santiago, Chile
  - Department of Epidemiology, Mailman School of Public Health, Columbia University, NY, USA
  - Robert N. Butler Columbia Aging Center, Mailman School of Public Health, Columbia University, NY, USA
- Kasim Allel
  - Society and Health Research Center, School of Public Health, Universidad Mayor, Santiago, Chile
  - Laboratory on Aging and Social Epidemiology, Facultad de Humanidades, Universidad Mayor, Santiago, Chile
  - Institute for Global Health, University College London, London, UK
- Ursula M. Staudinger
  - Robert N. Butler Columbia Aging Center, Mailman School of Public Health, Columbia University, NY, USA
  - Department of Sociomedical Sciences, Mailman School of Public Health, Columbia University, NY, USA
- Alvaro Castillo-Carniglia
  - Society and Health Research Center, School of Public Health, Universidad Mayor, Santiago, Chile
  - Laboratory on Aging and Social Epidemiology, Facultad de Humanidades, Universidad Mayor, Santiago, Chile
  - Department of Population Health, New York University School of Medicine, NY, USA
- José T. Medina
  - Society and Health Research Center, School of Public Health, Universidad Mayor, Santiago, Chile
  - Laboratory on Aging and Social Epidemiology, Facultad de Humanidades, Universidad Mayor, Santiago, Chile
- Katherine M. Keyes
  - Society and Health Research Center, School of Public Health, Universidad Mayor, Santiago, Chile
  - Department of Epidemiology, Mailman School of Public Health, Columbia University, NY, USA
|
32
|
Saroj R, Soumya SL, Singh S, Sankar SM, Chaudhary R, Saini N, Vasudev S, Yadava DK. Unraveling the Relationship Between Seed Yield and Yield-Related Traits in a Diversity Panel of Brassica juncea Using Multi-Traits Mixed Model. Front Plant Sci 2021; 12:651936. [PMID: 34017349 PMCID: PMC8129585 DOI: 10.3389/fpls.2021.651936] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 01/11/2021] [Accepted: 03/16/2021] [Indexed: 06/12/2023]
Abstract
The response to selection in any crop improvement program depends on the degree of variance and heritability. The objective of the current study was to explain variance and heritability components in Indian mustard, Brassica juncea (L.) Czern & Coss, to recognize promising genotypes for effective breeding. Two hundred and eighty-nine diverse accessions of Indian mustard belonging to four continents were analyzed for yield and yield-related traits (20 traits) over two seasons (2017-2018 and 2018-2019) using an alpha lattice design. The genetic variance was found to be significant (P ≤ 0.01) for the individual and under pooled analysis for all of the evaluated traits, demonstrating the presence of significant genetic variability in the diversity panel, which offers greater opportunities for utilizing these traits in future breeding programs. High heritability combined with high genetic advance as percent of mean and genotypic coefficient of variation was observed for flowering traits, plant height traits, seed size, and seed yield/plant; hence, a better genetic gain is expected upon the selection of these traits over subsequent generations. Both correlation and stepwise regression analysis indicated that the main shoot length, biological yield, total seed yield, plant height up to the first primary branch, seed size, total siliqua count, days to flowering initiation, plant height at maturity, siliquae on the main shoot, main shoot length, and siliqua length were the most significant contributory traits for seed yield/plant. Also, promising genotypes were identified among the diversity panel, which can be utilized as donors to improve Indian mustard further. These results indicated a greater scope for improving seed yield per plant directly through a selection of genotypes having the parsimonious combination of these nine traits.
|
33
|
Klich A, Ecochard R, Subtil F. Trajectory clustering using mixed classification models. Stat Med 2021; 40:3425-3439. [PMID: 33827149 DOI: 10.1002/sim.8975] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/19/2020] [Revised: 02/15/2021] [Accepted: 03/21/2021] [Indexed: 11/12/2022]
Abstract
Trajectory classification has become frequent in clinical research to understand the heterogeneity of individual trajectories. The standard classification model for trajectories assumes no between-individual variance within groups. However, this assumption is often not appropriate, which may overestimate the error variance of the model, leading to a biased classification. Hence, two extensions of the standard classification model were developed through a mixed model. The first one considers an equal between-individual variance across groups, and the second one considers unequal between-individual variance. Simulations were performed to evaluate the impact of these considerations on the classification. The simulation results showed that the first extended model gives a lower misclassification percentage (with differences up to 50%) than the standard one when there is a true variance between individuals within groups. The second model decreases the misclassification percentage compared with the first one (up to 11%) when the between-individual variance is unequal between groups. However, these two extensions require a high number of repeated measurements to be fitted correctly. Using human chorionic gonadotropin trajectories after curettage for hydatidiform mole, the standard classification model classified trajectories mainly according to their levels whereas the two extended models classified them according to their patterns, which provided more clinically relevant groups. In conclusion, for studies with a nonnegligible number of repeated measurements, using in the first instance a classification model that considers equal between-individual variance across groups, rather than a standard classification model, appears more appropriate. A model that considers unequal between-individual variance may find its place thereafter.
Affiliation(s)
- Amna Klich
  - Université de Lyon, Lyon, France
  - Université Lyon 1, Villeurbanne, France
  - Service de Biostatistique-Bioinformatique, Pôle Santé Publique, Hospices Civils de Lyon, Lyon, France
  - Équipe Biostatistique-Santé, Laboratoire de Biométrie et Biologie Évolutive, UMR CNRS 5558, Villeurbanne, France
- René Ecochard
  - Université de Lyon, Lyon, France
  - Université Lyon 1, Villeurbanne, France
  - Service de Biostatistique-Bioinformatique, Pôle Santé Publique, Hospices Civils de Lyon, Lyon, France
  - Équipe Biostatistique-Santé, Laboratoire de Biométrie et Biologie Évolutive, UMR CNRS 5558, Villeurbanne, France
- Fabien Subtil
  - Université de Lyon, Lyon, France
  - Université Lyon 1, Villeurbanne, France
  - Service de Biostatistique-Bioinformatique, Pôle Santé Publique, Hospices Civils de Lyon, Lyon, France
  - Équipe Biostatistique-Santé, Laboratoire de Biométrie et Biologie Évolutive, UMR CNRS 5558, Villeurbanne, France
|
34
|
Josey KP, Ringham BM, Barón AE, Schenkman M, Sauder KA, Muller KE, Dabelea D, Glueck DH. Power for balanced linear mixed models with complex missing data processes. COMMUN STAT-THEOR M 2021; 52:46-64. [PMID: 36743328 PMCID: PMC9897326 DOI: 10.1080/03610926.2021.1909732] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2019] [Accepted: 03/21/2021] [Indexed: 02/07/2023]
Abstract
When designing repeated measures studies, both the amount and the pattern of missing outcome data can affect power. The chance that an observation is missing may vary across measurements, and missingness may be correlated across measurements. For example, in a physiotherapy study of patients with Parkinson's disease, increasing intermittent dropout over time yielded missing measurements of physical function. In this example, we assume data are missing completely at random, since the chance that a data point was missing appears to be unrelated to either outcomes or covariates. For data missing completely at random, we propose noncentral F power approximations for the Wald test for balanced linear mixed models with Gaussian responses. The power approximations are based on moments of missing data summary statistics. The moments were derived assuming a conditional linear missingness process. The approach provides approximate power for both complete-case analyses, which include independent sampling units where all measurements are present, and observed-case analyses, which include all independent sampling units with at least one measurement. Monte Carlo simulations demonstrate the accuracy of the method in small samples. We illustrate the utility of the method by computing power for proposed replications of the Parkinson's study.
Affiliation(s)
- Kevin P. Josey
  - Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado Denver, Denver, Colorado, USA
- Brandy M. Ringham
  - Lifecourse Epidemiology of Adiposity and Disease (LEAD) Center, Colorado School of Public Health, University of Colorado Denver, Denver, Colorado, USA
- Anna E. Barón
  - Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado Denver, Denver, Colorado, USA
- Margaret Schenkman
  - Physical Therapy Program, School of Medicine, University of Colorado Denver, Denver, Colorado, USA
- Katherine A. Sauder
  - Department of Pediatrics, School of Medicine, University of Colorado Denver, Denver, Colorado, USA
- Keith E. Muller
  - Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, Florida, USA
- Dana Dabelea
  - Department of Epidemiology and the Lifecourse Epidemiology of Adiposity and Diabetes (LEAD) Center, Colorado School of Public Health, University of Colorado Denver, Denver, Colorado, USA
- Deborah H. Glueck
  - Department of Pediatrics, School of Medicine, University of Colorado Denver, Denver, Colorado, USA
|
35
|
Happ MM, Graef GL, Wang H, Howard R, Posadas L, Hyten DL. Comparing a Mixed Model Approach to Traditional Stability Estimators for Mapping Genotype by Environment Interactions and Yield Stability in Soybean [Glycine max (L.) Merr.]. Front Plant Sci 2021; 12:630175. [PMID: 33868333 PMCID: PMC8044453 DOI: 10.3389/fpls.2021.630175] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 11/16/2020] [Accepted: 03/11/2021] [Indexed: 06/12/2023]
Abstract
Identifying genetic loci associated with yield stability has helped plant breeders and geneticists begin to understand the role and influence of genotype by environment (GxE) interactions in soybean [Glycine max (L.) Merr.] productivity, as well as other crops. Methods for quantifying a genotype's range of performance across testing locations have been developed over decades, with dozens of methodologies available. These include directly modeling GxE interactions as part of an overall model for yield, as well as methods which generate overall yield "stability" values from multi-environment trial data. Correspondence between these methods as it pertains to the outcomes of genome-wide association studies (GWAS) has not been well defined. In this study, the GWAS results for yield and yield stability were compared in 213 soybean lines across 11 environments to determine their utility and potential intersection. Both univariate and multivariate conventional stability estimates were considered alongside a mixed model for yield that fit marker by environment interactions as a random effect. One hundred and six total QTL were discovered across all mapping results; however, genetic loci that were significant in the mixed model for grain yield that fit marker by environment interactions were completely distinct from those that were significant when mapping using traditional stability measures as a phenotype. Furthermore, 73.21% of QTL discovered in the mixed model were determined to cause a crossover interaction effect, which causes genotype rank changes between environments. Overall, the QTL discovered via explicitly mapping GxE interactions also explained more yield variance than those QTL associated with differences in traditional stability estimates, making their theoretical impact on selection greater. A lack of intersecting results between mapping approaches highlights the importance of examining stability in multiple contexts when attempting to manipulate GxE interactions in soybean.
Affiliation(s)
- Mary M. Happ
  - Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE, United States
- George L. Graef
  - Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE, United States
- Haichuan Wang
  - Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE, United States
- Reka Howard
  - Department of Statistics, University of Nebraska-Lincoln, Lincoln, NE, United States
- Luis Posadas
  - Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE, United States
- David L. Hyten
  - Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE, United States
|
36
|
Ebrahimpoor M, Spitali P, Goeman JJ, Tsonaka R. Pathway testing for longitudinal metabolomics. Stat Med 2021; 40:3053-3065. [PMID: 33768548 PMCID: PMC8252476 DOI: 10.1002/sim.8957] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/26/2019] [Revised: 02/19/2021] [Accepted: 03/04/2021] [Indexed: 01/12/2023]
Abstract
We propose a top‐down approach for pathway analysis of longitudinal metabolite data. We apply a score test based on a shared latent process mixed model which can identify pathways with differentially progressing metabolites. The strength of our approach is that it can handle unbalanced designs, deals with potential missing values in the longitudinal markers, and gives valid results even with small sample sizes. Contrary to bottom‐up approaches, correlations between metabolites are explicitly modeled, leveraging power gains. For large pathway sizes, a computationally efficient solution is proposed based on pseudo‐likelihood methodology. We demonstrate the advantages of the proposed method in identification of differentially expressed pathways through simulation studies. Finally, longitudinal metabolite data from a mouse experiment are analyzed to demonstrate our methodology.
Affiliation(s)
- Mitra Ebrahimpoor
  - Medical Statistics, Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands
- Pietro Spitali
  - Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands
- Jelle J Goeman
  - Medical Statistics, Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands
- Roula Tsonaka
  - Medical Statistics, Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands
|
37
|
Wu H, Jones MP. Proportional likelihood ratio mixed model for discrete longitudinal data. Stat Med 2021; 40:2272-2285. [PMID: 33588517 DOI: 10.1002/sim.8902] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2020] [Revised: 12/23/2020] [Accepted: 01/21/2021] [Indexed: 11/07/2022]
Abstract
Rathouz and Gao [2] and Luo and Tsai [3] proposed valuable extensions to the generalized linear model for modeling a nonlinear monotonic relationship between the mean response and a set of covariates. In their extensions for discrete data the baseline response distribution is unspecified and is estimated from the data. We propose to extend this model for the analysis of longitudinal data by incorporating random effects into the linear predictor, and using maximum likelihood for estimation and inference. Motivated in particular by longitudinal studies of clinical scale outcomes, we developed an estimation procedure for a finite-support response using a generalized expectation-maximization algorithm where Gauss-Hermite quadrature is employed to approximate the integrals in the E step of the algorithm. Upon convergence, the observed information matrix is estimated through second-order numerical differentiation of the log-likelihood function. Asymptotic properties of the maximum likelihood estimates follow under the usual regularity conditions. Simulation studies are conducted to assess its finite-sample properties and compare the proposed model to the generalized linear mixed model. The proposed method is illustrated in an analysis of data from a longitudinal study of Huntington's disease.
Affiliation(s)
- Michael P Jones
  - Department of Biostatistics, University of Iowa, Iowa City, Iowa, USA
|
38
|
Newans T, Bellinger P, Buxton S, Quinn K, Minahan C. Movement Patterns and Match Statistics in the National Rugby League Women's (NRLW) Premiership. Front Sports Act Living 2021; 3:618913. [PMID: 33644751 PMCID: PMC7904888 DOI: 10.3389/fspor.2021.618913] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 10/19/2020] [Accepted: 01/20/2021] [Indexed: 01/22/2023]
Abstract
As women's rugby league grows, the need for understanding the movement patterns of the sport is essential for coaches and sports scientists. The aims of the present study were to quantify the position-specific demographics, technical match statistics, and movement patterns of the National Rugby League Women's (NRLW) Premiership and to identify whether there was a change in the intensity of play as a function of game time played. A retrospective observational study was conducted utilizing global positioning system, demographic, and match statistics collected from 117 players from all NRLW clubs across the full 2018 and 2019 seasons and were compared between the ten positions using generalized linear mixed models. The GPS data were separated into absolute (i.e., total distance, high-speed running distance, and acceleration load) and relative movement patterns (i.e., mean speed, mean high speed (> 12 km·h⁻¹), and mean acceleration). For absolute external outputs, fullbacks covered the greatest distance (5,504 m), greatest high-speed distance (1,081 m), and most ball-carry meters (97 m), while five-eighths recorded the greatest acceleration load (1,697 m·s⁻²). For relative external outputs, there were no significant differences in mean speed and mean high speed between positions, while mean acceleration only significantly differed between wingers and interchanges. Only interchange players significantly decreased in mean speed as their number of minutes played increased. By understanding the load of NRLW matches, coaches, high-performance staff, and players can better prepare as the NRLW Premiership expands. These movement patterns and match statistics of NRLW matches can lay the foundation for future research as women's rugby league expands. Similarly, coaches, high-performance staff, and players can also refine conditioning practices with a greater understanding of the external output of NRLW players.
Affiliation(s)
- Tim Newans
  - Griffith Sports Science, Griffith University, Gold Coast, QLD, Australia
  - Queensland Academy of Sport, Nathan, QLD, Australia
- Phillip Bellinger
  - Griffith Sports Science, Griffith University, Gold Coast, QLD, Australia
  - Queensland Academy of Sport, Nathan, QLD, Australia
- Karlee Quinn
  - Griffith Sports Science, Griffith University, Gold Coast, QLD, Australia
  - Queensland Academy of Sport, Nathan, QLD, Australia
- Clare Minahan
  - Griffith Sports Science, Griffith University, Gold Coast, QLD, Australia
|
39
|
Maupetit A, Fabre B, Pétrowski J, Andrieux A, De Mita S, Frey P, Halkett F, Hayden KJ. Evolution of morphological but not aggressiveness-related traits following a major resistance breakdown in the poplar rust fungus, Melampsora larici-populina. Evol Appl 2021; 14:513-523. [PMID: 33664791 PMCID: PMC7896724 DOI: 10.1111/eva.13136] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/03/2020] [Revised: 09/01/2020] [Accepted: 09/02/2020] [Indexed: 11/29/2022]
Abstract
Crop varieties carrying qualitative resistance to targeted pathogens lead to strong selection pressure on parasites, often resulting in resistance breakdown. It is well known that qualitative resistance breakdowns modify pathogen population structure, but few studies have analyzed the consequences on their quantitative aggressiveness-related traits. The aim of this study was to characterize the evolution of these traits following a resistance breakdown in the poplar rust fungus, Melampsora larici-populina. We based our experiment on three temporal populations sampled just before the breakdown event, immediately after and four years later. First, we quantified phenotypic differences among populations for a set of aggressiveness traits on a universally susceptible cultivar (infection efficiency, latent period, lesion size, mycelium quantity, and sporulation rate) and one morphological trait (mean spore volume). Then, we estimated heritability to establish which traits could be subjected to adaptive evolution and tested for evidence of selection. Our results revealed significant changes in the morphological trait but no variation in aggressiveness traits. By contrast, recent works have demonstrated that quantitative resistance (initially assumed more durable) could be eroded and lead to increased aggressiveness. Hence, this study is one example suggesting that the use of qualitative resistance may prove less detrimental to long-term sustainable crop production.
Affiliation(s)
- Agathe Maupetit
  - INRAE, Université de Lorraine, Nancy, France
  - Royal Botanical Garden Edinburgh, Edinburgh, UK
  - Present address: IFREMER, Physiology and Biotechnology of Algae Laboratory, Nantes, France
|
40
|
Straumfors A, Corbin M, McLean D, 't Mannetje A, Olsen R, Afanou A, Daae HL, Skare Ø, Ulvestad B, Laier Johnsen H, Eduard W, Douwes J. Exposure Determinants of Wood Dust, Microbial Components, Resin Acids and Terpenes in the Saw- and Planer Mill Industry. Ann Work Expo Health 2021; 64:282-296. [PMID: 31942929 PMCID: PMC7064270 DOI: 10.1093/annweh/wxz096] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 04/05/2019] [Revised: 11/04/2019] [Accepted: 12/11/2019] [Indexed: 12/02/2022]
Abstract
Objectives Sawmill workers have an increased risk of adverse respiratory outcomes, but knowledge about exposure–response relationships is incomplete. The objective of this study was to assess exposure determinants of dust, microbial components, resin acids, and terpenes in sawmills processing pine and spruce, to guide the development of department and task-based exposure prediction models. Methods 2474 full-shift repeated personal airborne measurements of dust, resin acids, fungal spores and fragments, endotoxins, mono-, and sesquiterpenes were conducted in 10 departments of 11 saw- and planer mills in Norway in 2013–2016. Department and task-based exposure determinants were identified and geometric mean ratios (GMRs) estimated using mixed model regression. The effects of season and wood type were also studied. Results The exposure ratio of individual components was similar in many of the departments. Nonetheless, the highest microbial and monoterpene exposure (expressed per hour) were estimated in the green part of the sawmills: endotoxins [GMR (95% confidence interval) 1.2 (1.0–1.3)], fungal spores [1.1 (1.0–1.2)], and monoterpenes [1.3 (1.1–1.4)]. The highest resin acid GMR was estimated in the dry part of the sawmills [1.4 (1.2–1.5)]. Season and wood type had a large effect on the estimated exposure. In particular, summer and spruce were strong determinants of increased exposure to endotoxin (GMRs [4.6 (3.5–6.2)] and [2.0 (1.4–3.0)], respectively) and fungal spores (GMRs [2.2 (1.7–2.8)] and [1.5 (1.0–2.1)], respectively). Pine was a strong determinant for increased exposure to both resin acid and monoterpenes. Work as a boilerman was associated with moderate to relatively high exposure to all components [1.0–1.4 (0.8–2.0)], although the estimates were based on 13–15 samples only. 
Cleaning in the saw, planer, and sorting of dry timber departments was associated with high exposure estimates for several components, whereas work with transportation and stock/finished goods were associated with low exposure estimates for all components. The department-based models explained 21–61% of the total exposure variances, 0–90% of the between worker (BW) variance, and 1–36% of the within worker (WW) variances. The task-based models explained 22–62% of the total variance, 0–91% of the BW variance, and 0–33% of the WW variance. Conclusions Exposure determinants in sawmills including department, task, season, and wood type differed for individual components, and explained a relatively large proportion of the total variances. Application of department/task-based exposure prediction models for specific exposures will therefore likely improve the assessment of exposure–response associations.
Affiliation(s)
- Anne Straumfors
  - National Institute of Occupational Health, Majorstuen, Oslo, Norway
  - Centre for Public Health Research, Massey University, Wellington Campus, Wellington, New Zealand
- Marine Corbin
  - Centre for Public Health Research, Massey University, Wellington Campus, Wellington, New Zealand
- Dave McLean
  - Centre for Public Health Research, Massey University, Wellington Campus, Wellington, New Zealand
- Andrea 't Mannetje
  - Centre for Public Health Research, Massey University, Wellington Campus, Wellington, New Zealand
- Raymond Olsen
  - National Institute of Occupational Health, Majorstuen, Oslo, Norway
- Anani Afanou
  - National Institute of Occupational Health, Majorstuen, Oslo, Norway
- Hanne-Line Daae
  - National Institute of Occupational Health, Majorstuen, Oslo, Norway
- Øivind Skare
  - National Institute of Occupational Health, Majorstuen, Oslo, Norway
- Bente Ulvestad
  - National Institute of Occupational Health, Majorstuen, Oslo, Norway
- Wijnand Eduard
  - National Institute of Occupational Health, Majorstuen, Oslo, Norway
- Jeroen Douwes
  - Centre for Public Health Research, Massey University, Wellington Campus, Wellington, New Zealand
|
41
|
Ujeneza EL, Ndifon W, Sawry S, Fatti G, Riou J, Davies MA, Nieuwoudt M. A mechanistic model for long-term immunological outcomes in South African HIV-infected children and adults receiving ART. eLife 2021; 10:42390. [PMID: 33443013 PMCID: PMC7857728 DOI: 10.7554/elife.42390] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/27/2018] [Accepted: 01/13/2021] [Indexed: 01/23/2023]
Abstract
Long-term effects of the growing population of HIV-treated people in Southern Africa on individuals and the public health sector at large are not yet understood. This study proposes a novel ‘ratio’ model that relates CD4+ T-cell counts of HIV-infected individuals to the CD4+ count reference values from healthy populations. We use mixed-effects regression to fit the model to data from 1616 children (median age 4.3 years at ART initiation) and 14,542 adults (median age 36 years at ART initiation). We found that the scaled carrying capacity, maximum CD4+ count relative to an HIV-negative individual of similar age, and baseline scaled CD4+ counts were closer to healthy values in children than in adults. Post-ART initiation, CD4+ growth rate was inversely correlated with baseline CD4+ T-cell counts, and consequently higher in adults than children. Our results highlight the impacts of age on dynamics of the immune system of healthy and HIV-infected individuals. The human immunodeficiency virus (HIV) remains an ongoing global pandemic. There is currently no cure for HIV, but antiretroviral therapies can keep the virus in check and allow individuals with HIV to live longer, healthier lives. These drugs work in two ways. They block the ability of the virus to multiply and they allow numbers of an important type of infection-fighting cell called CD4+ T cells to rebound. As more patients with HIV survive and transition from one life stage to the next, it is critical to understand how long-term antiretroviral therapies will affect normal age-related changes in their immune systems. The health of an immune system can be evaluated by looking at the number of CD4+ T cells an individual has, though this will vary by age and location. Clinicians use the same metrics to assess the immune health of individuals with HIV, however, as they age, it becomes a challenge to identify if a patient’s immune system recovers normally or insufficiently. 
Thus, learning more about age-related differences in CD4+ T cells in people living with HIV may help improve their care. Using data from 1,616 children and 14,542 adults from South Africa, Ujeneza et al. created a simple mathematical model that can compare the immune system of a person with HIV with the immune system of a similarly aged healthy individual. The model shows that among individuals with HIV receiving antiretroviral therapies, children have CD4+ T-cell numbers that are closest to the numbers seen in healthy individuals of the same age. This suggests that children may be more able to recover immune system function than adults after beginning treatment. Children also start antiretroviral therapies before their immune system has been severely damaged, while adults tend to start treatment much later when they have fewer CD4+ T cells left. Ujeneza et al. show that the fewer CD4+ T cells a person has when they start treatment, the faster the number of these cells grows after starting treatment. This suggests that the more damaged the immune system is, the harder it works to recover. This reinforces the need to identify people infected with HIV as soon as possible through testing and to begin treatment promptly. The new model may help clinicians and policy makers develop screening and treatment protocols tailored to the specific needs of children and adults living with HIV.
Affiliation(s)
- Eva Liliane Ujeneza
- Department of Science and Technology and National Research Foundation, South African Centre for Epidemiological Modelling and Analysis (SACEMA), Stellenbosch University, Stellenbosch, South Africa; African Institute for Mathematical Sciences (AIMS), Next Einstein Initiative, Kigali, Rwanda
- Wilfred Ndifon
- African Institute for Mathematical Sciences (AIMS), Next Einstein Initiative, Kigali, Rwanda
- Shobna Sawry
- Harriet Shezi Children's Clinic, Wits Reproductive Health and HIV Institute, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa
- Geoffrey Fatti
- Kheth'Impilo AIDS Free Living, Cape Town, South Africa; Division of Epidemiology and Biostatistics, Department of Global Health, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
- Julien Riou
- Institute of Social and Preventive Medicine, University of Bern, Bern, Switzerland
- Mary-Ann Davies
- Centre for Infectious Disease Epidemiology and Research, School of Public Health and Family Medicine, University of Cape Town, Cape Town, South Africa
- Martin Nieuwoudt
- Department of Science and Technology and National Research Foundation, South African Centre for Epidemiological Modelling and Analysis (SACEMA), Stellenbosch University, Stellenbosch, South Africa; Institute for Biomedical Engineering (IBE), Stellenbosch University, Stellenbosch, South Africa
|
42
|
Tegegne AS. Joint Predictors of Hypertension and Type 2 Diabetes Among Adults Under Treatment in Amhara Region (North-Western Ethiopia). Diabetes Metab Syndr Obes 2021; 14:2453-2463. [PMID: 34103954 PMCID: PMC8179751 DOI: 10.2147/dmso.s309925] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2021] [Accepted: 05/07/2021] [Indexed: 11/23/2022]
Abstract
BACKGROUND Hypertension is one of the most important chronic diseases worldwide because it contributes significantly to other health problems. Hypertensive patients are known to be prone to diabetes, and the reverse is also true. The objective of the current investigation was to identify joint risk factors for hypertension and type 2 diabetes among adults under treatment. METHODS A random sample of 748 hypertensive and type 2 diabetic patients was selected. A retrospective longitudinal study was conducted with the selected patients, who were receiving treatment for both hypertension and type 2 diabetes. A joint linear mixed-effects model was used for data analysis. RESULTS Age (β = 0.18, p-value = 0.04 for hypertension; β = 0.81, p-value = 0.02 for type 2 diabetes) and weight (β = 0.52, p-value <0.01 for hypertension; β = 0.32, p-value <0.01 for type 2 diabetes) were positively and significantly associated with hypertension and type 2 diabetes, whereas visiting times (β = -0.08, p-value = 0.04 for hypertension; β = -0.38, p-value = 0.03 for type 2 diabetes) were negatively associated with both outcomes. Similarly, not exercising, smoking, drinking alcohol, and a family history of disease were positively associated with both outcomes. CONCLUSION Hypertension and diabetes are highly correlated, and one is a cause of the other. Hypertensive and diabetic patients should be advised to stop drinking alcohol and smoking and to take their medication as prescribed by health staff. They should also be advised to undertake physical exercise to reduce risks related to these two correlated diseases.
Affiliation(s)
- Awoke Seyoum Tegegne
- Department of Statistics, Bahir Dar University, Bahir Dar, Ethiopia
- Correspondence: Awoke Seyoum Tegegne, Department of Statistics, Bahir Dar University, Bahir Dar, Ethiopia. Tel +251918779451
|
43
|
Frouin A, Dandine-Roulland C, Pierre-Jean M, Deleuze JF, Ambroise C, Le Floch E. Exploring the Link Between Additive Heritability and Prediction Accuracy From a Ridge Regression Perspective. Front Genet 2020; 11:581594. [PMID: 33329721 PMCID: PMC7672157 DOI: 10.3389/fgene.2020.581594] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 07/09/2020] [Accepted: 09/29/2020] [Indexed: 11/13/2022]
Abstract
Genome-Wide Association Studies (GWAS) explain only a small fraction of heritability for most complex human phenotypes. Genomic heritability estimates the variance explained by the SNPs on the whole genome using mixed models and accounts for the many small contributions of SNPs in the explanation of a phenotype. This paper approaches heritability from a machine learning perspective, and examines the close link between mixed models and ridge regression. Our contribution is two-fold. First, we propose estimating genomic heritability using a predictive approach via ridge regression and Generalized Cross Validation (GCV). We show that this is consistent with classical mixed model based estimation. Second, we derive simple formulae that express prediction accuracy as a function of the ratio n/p, where n is the population size and p the total number of SNPs. These formulae clearly show that a high heritability does not imply an accurate prediction when p > n. Both the estimation of heritability via GCV and the prediction accuracy formulae are validated using simulated data and real data from UK Biobank.
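The link between ridge regression, GCV and genomic heritability described in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `gcv_heritability`, the simulated genotypes, the penalty grid, and the mapping h² = p/(p + λ) (which follows from assuming standardized SNPs with i.i.d. effects of variance σ_g²/p) are all assumptions made here.

```python
import numpy as np

def gcv_heritability(Z, y, lambdas):
    """Pick the ridge penalty by GCV and map it to a heritability estimate."""
    n, p = Z.shape
    # Assumed preprocessing: standardize SNP columns, center the phenotype.
    Z = (Z - Z.mean(axis=0)) / Z.std(axis=0)
    y = y - y.mean()
    U, d, _ = np.linalg.svd(Z, full_matrices=False)
    Uty = U.T @ y
    best_lam, best_gcv = None, np.inf
    for lam in lambdas:
        shrink = d**2 / (d**2 + lam)       # eigenvalues of the ridge hat matrix H
        resid = y - U @ (shrink * Uty)     # y - H y
        dof_resid = n - shrink.sum()       # trace of (I - H)
        gcv = n * (resid @ resid) / dof_resid**2
        if gcv < best_gcv:
            best_lam, best_gcv = lam, gcv
    # Assumption: effects u_j ~ N(0, sigma_g^2 / p), so lambda = p (1 - h2) / h2.
    return best_lam, p / (p + best_lam)

# Simulated data (hypothetical sizes and allele frequency).
rng = np.random.default_rng(0)
n, p, h2_true = 300, 1000, 0.5
Z = rng.binomial(2, 0.3, size=(n, p)).astype(float)
Zs = (Z - Z.mean(axis=0)) / Z.std(axis=0)
u = rng.normal(0.0, np.sqrt(h2_true / p), size=p)
y = Zs @ u + rng.normal(0.0, np.sqrt(1 - h2_true), size=n)

lambdas = np.logspace(1, 5, 60)
lam_hat, h2_hat = gcv_heritability(Z, y, lambdas)
print(f"lambda = {lam_hat:.1f}, h2 estimate = {h2_hat:.2f}")
```

With p much larger than n, as here, the estimate is noisy, which is in line with the abstract's point that a high heritability does not by itself guarantee accurate prediction when p > n.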
Affiliation(s)
- Arthur Frouin
- CNRGH, Institut Jacob, CEA - Université Paris-Saclay, Évry, France
- Jean-François Deleuze
- CNRGH, Institut Jacob, CEA - Université Paris-Saclay, Évry, France; Centre d'Etude du Polymorphisme Humain, Fondation Jean Dausset, Paris, France
- Christophe Ambroise
- LaMME, Université Paris-Saclay, CNRS, Université d'Évry val d'Essonne, Évry, France
- Edith Le Floch
- CNRGH, Institut Jacob, CEA - Université Paris-Saclay, Évry, France
|
44
|
Shin Y, Sun S, Bandyopadhyay D. Impact of adolescent obesity on middle-age health of women given data MAR. Biom J 2020; 62:1702-1716. [PMID: 32542849 DOI: 10.1002/bimj.201900094] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2019] [Revised: 02/12/2020] [Accepted: 03/26/2020] [Indexed: 11/10/2022]
Abstract
We analyze adolescent BMI and middle-age systolic blood pressure (SBP) repeatedly measured on women enrolled in the Fels Longitudinal Study (FLS) between 1929 and 2010 to address three questions: Do adolescent-specific growth rates in body mass index (BMI) and menarche affect middle-age SBP? Do they moderate the aging effect on middle-age SBP? Have the effects changed over historical time? To address the questions, we propose analyzing a growth curve model (GCM) that controls for age, birth-year cohort, and historical time. However, several complications in the data make the GCM analysis nonstandard. First, the person-specific adolescent BMI and middle-age SBP trajectories are unobservable. Second, missing data are substantial on BMI, SBP, and menarche. Finally, modeling the latent trajectories for BMI and SBP, repeatedly measured on two distinct sets of unbalanced time points, is computationally intensive. We adopt a bivariate GCM for BMI and SBP with correlated random coefficients. To efficiently handle missing values of BMI, SBP, and menarche assumed missing at random, we estimate their joint distribution by maximum likelihood via the EM algorithm where the correlated random coefficients and menarche are multivariate normal. The estimated distribution will be transformed to the desired GCM for SBP that includes the random coefficients of BMI and menarche as covariates. We demonstrate unbiased estimation by simulation. We find that adolescent growth rates in BMI and menarche are positively associated with, and moderate, the aging effect on SBP in middle age, controlling for age, cohort, and historical time, but the effect sizes are at most modest. The aging effect is significant on SBP, controlling for cohort and historical time, but not vice versa.
Affiliation(s)
- Yongyun Shin
- Department of Biostatistics, Virginia Commonwealth University, Richmond, VA, USA
- Shumei Sun
- Department of Biostatistics, Virginia Commonwealth University, Richmond, VA, USA
|
45
|
Gruner P, Schmitt AK, Flath K, Schmiedchen B, Eifler J, Gordillo A, Schmidt M, Korzun V, Fromme FJ, Siekmann D, Tratwal A, Danielewicz J, Korbas M, Marciniak K, Krysztofik R, Niewińska M, Koch S, Piepho HP, Miedaner T. Mapping Stem Rust (Puccinia graminis f. sp. secalis) Resistance in Self-Fertile Winter Rye Populations. Front Plant Sci 2020; 11:667. [PMID: 32528509 PMCID: PMC7265987 DOI: 10.3389/fpls.2020.00667] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 02/19/2020] [Accepted: 04/29/2020] [Indexed: 06/03/2023]
Abstract
Rye stem rust caused by Puccinia graminis f. sp. secalis can be found in all European rye growing regions. When the summers are warm and dry, the disease can cause severe yield losses over large areas. To date, little research has been done in Europe to initiate resistance breeding. To our knowledge, all varieties currently registered in Germany are susceptible. In this study, three biparental populations of inbred lines and one testcross population developed for mapping resistance were investigated. Over 2 years, 68-70 genotypes per population were tested, each in three locations. Combining the phenotypic data with genotyping results of a custom 10k Infinium iSelect single nucleotide polymorphism (SNP) array, we identified both quantitatively inherited adult plant resistance and monogenic all-stage resistance. A single resistance gene, tentatively named Pgs1, located at the distal end of chromosome 7R, could be identified in two independently developed populations. With high probability, it is closely linked to a nucleotide-binding leucine-rich repeat (NB-LRR) resistance gene homolog. A marker for a competitive allele-specific polymerase chain reaction (KASP) genotyping assay was designed that could explain 73 and 97% of the genetic variance in the two populations, respectively. Additional investigation of naturally occurring rye leaf rust (caused by Puccinia recondita ROEBERGE) revealed a gene complex on chromosome 7R. The gene Pgs1 and further identified quantitative trait loci (QTL) have high potential to be used for breeding stem rust resistant rye.
Affiliation(s)
- Paul Gruner
- State Plant Breeding Institute, University of Hohenheim, Stuttgart, Germany
- Anne-Kristin Schmitt
- Institute for Plant Protection in Field Crops and Grassland, Julius-Kuehn Institute, Kleinmachnow, Germany
- Kerstin Flath
- Institute for Plant Protection in Field Crops and Grassland, Julius-Kuehn Institute, Kleinmachnow, Germany
- Viktor Korzun
- KWS SAAT SE & Co. KGaA, Einbeck, Germany
- Federal State Budgetary Institution of Science Federal Research Center “Kazan Scientific Center of Russian Academy of Sciences”, Kazan, Russia
- Anna Tratwal
- Institute of Plant Protection – National Research Institute, Poznań, Poland
- Jakub Danielewicz
- Institute of Plant Protection – National Research Institute, Poznań, Poland
- Marek Korbas
- Institute of Plant Protection – National Research Institute, Poznań, Poland
- Silvia Koch
- State Plant Breeding Institute, University of Hohenheim, Stuttgart, Germany
- Hans-Peter Piepho
- Biostatistics Unit, Institute of Crop Science, University of Hohenheim, Stuttgart, Germany
- Thomas Miedaner
- State Plant Breeding Institute, University of Hohenheim, Stuttgart, Germany
|
46
|
Delhez P, Colinet F, Vanderick S, Bertozzi C, Gengler N, Soyeurt H. Predicting milk mid-infrared spectra from first-parity Holstein cows using a test-day mixed model with the perspective of herd management. J Dairy Sci 2020; 103:6258-6270. [PMID: 32418684 DOI: 10.3168/jds.2019-17717] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/08/2019] [Accepted: 02/27/2020] [Indexed: 11/19/2022]
Abstract
The use of test-day models to model milk mid-infrared (MIR) spectra for genetic purposes has already been explored; however, little attention has been given to their use to predict milk MIR spectra for management purposes. The aim of this paper was to study the ability of a test-day mixed model to predict milk MIR spectra for management purposes. A data set containing 467,496 test-day observations from 53,781 Holstein dairy cows in first lactation was used for model building. Principal component analysis was implemented on the selected 311 MIR spectral wavenumbers to reduce the number of traits for modeling; 12 principal components (PC) were retained, explaining approximately 96% of the total spectral variation. Each of the retained PC was modeled using a single trait test-day mixed model. The model solutions were used to compute the predicted scores of each PC, followed by a back-transformation to obtain the 311 predicted MIR spectral wavenumbers. Four new data sets, containing altogether 122,032 records, were used to test the ability of the model to predict milk MIR spectra in 4 distinct scenarios with different levels of information about the cows. The average correlation between observed and predicted values of each spectral wavenumber was 0.85 for the modeling data set and ranged from 0.36 to 0.62 for the scenarios. Correlations between milk fat, protein, and lactose contents predicted from the observed spectra and from the modeled spectra ranged from 0.83 to 0.89 for the modeling set and from 0.32 to 0.73 for the scenarios. Our results demonstrated a moderate but promising ability to predict milk MIR spectra using a test-day mixed model. Current and future MIR traits prediction equations could be applied on the modeled spectra to predict all MIR traits in different situations instead of developing one test-day model separately for each trait. 
Modeling MIR spectra would benefit farmers for cow and herd management, for instance through prediction of future records or comparison between observed and expected wavenumbers or MIR traits for the detection of health and management problems. Potential resulting tools could be incorporated into milk recording systems.
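The dimension-reduction and back-transformation steps in this abstract (311 wavenumbers reduced to 12 principal components, then mapped back) can be sketched as follows. This is only an illustration on synthetic data: the simulated spectra, the latent-factor structure and all variable names are assumptions, and the observed PC scores stand in for the scores that the test-day mixed model would predict.

```python
import numpy as np

rng = np.random.default_rng(1)
n_records, n_wavenumbers, n_latent = 500, 311, 5

# Synthetic "spectra": a few latent factors plus noise (purely illustrative data).
loadings = rng.normal(size=(n_latent, n_wavenumbers))
spectra = (rng.normal(size=(n_records, n_latent)) @ loadings
           + 0.1 * rng.normal(size=(n_records, n_wavenumbers)))

# Principal component analysis via SVD of the centered data.
mean = spectra.mean(axis=0)
centered = spectra - mean
_, s, Vt = np.linalg.svd(centered, full_matrices=False)
explained = s**2 / np.sum(s**2)

k = 12                               # number of retained components, as in the abstract
components = Vt[:k]                  # shape (k, 311)
scores = centered @ components.T     # shape (n_records, k)

# Stand-in: a mixed model would supply predicted scores for each PC here.
predicted_scores = scores

# Back-transformation from PC scores to the 311 wavenumbers.
reconstructed = predicted_scores @ components + mean
print(f"variance explained by {k} PCs: {explained[:k].sum():.3f}")
```

Modeling 12 scores instead of 311 wavenumbers is what keeps the number of single-trait models manageable; the back-transformation then recovers a full predicted spectrum from those scores.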
Affiliation(s)
- P Delhez
- National Fund for Scientific Research (FRS-FNRS), Brussels 1000, Belgium; TERRA Teaching and Research Centre, Gembloux Agro-Bio Tech, University of Liège, Gembloux 5030, Belgium
- F Colinet
- TERRA Teaching and Research Centre, Gembloux Agro-Bio Tech, University of Liège, Gembloux 5030, Belgium
- S Vanderick
- TERRA Teaching and Research Centre, Gembloux Agro-Bio Tech, University of Liège, Gembloux 5030, Belgium
- C Bertozzi
- Walloon Breeding Association (awé Groupe), Ciney 5590, Belgium
- N Gengler
- TERRA Teaching and Research Centre, Gembloux Agro-Bio Tech, University of Liège, Gembloux 5030, Belgium
- H Soyeurt
- TERRA Teaching and Research Centre, Gembloux Agro-Bio Tech, University of Liège, Gembloux 5030, Belgium
|
47
|
Francq BG, Lin D, Hoyer W. Confidence, prediction, and tolerance in linear mixed models. Stat Med 2019; 38:5603-5622. [PMID: 31659784 PMCID: PMC6916346 DOI: 10.1002/sim.8386] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 01/07/2019] [Revised: 08/05/2019] [Accepted: 09/13/2019] [Indexed: 11/15/2022]
Abstract
The literature about Prediction Intervals (PI) and Tolerance Intervals (TI) in linear mixed models is usually developed for specific designs, which is a main limitation to their use. This paper proposes to reformulate the two‐sided PI to be generalizable under a wide variety of designs (one random factor, nested and crossed designs for multiple random factors, and balanced or unbalanced designs). This new methodology is based on the Hessian matrix, namely, the inverse of (observed) Fisher Information matrix, and is built with a cell mean model. The degrees of freedom for the total variance are calculated with the generalized Satterthwaite method and compared to the Kenward‐Roger's degrees of freedom for fixed effects. Construction of two‐sided TIs is also detailed with one random factor, and two nested and two crossed random variables. An extensive simulation study is carried out to compare the widths and coverage probabilities of Confidence Intervals (CI), PIs, and TIs to their nominal levels. It shows excellent coverage whatever the design and the sample size are. Finally, these CIs, PIs, and TIs are applied to two real data sets: one from an orthopedic surgery study (intralesional resection risk) and the other from an assay validation study during vaccine development.
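To make the "degrees of freedom for the total variance" concrete, the sketch below computes the classical Satterthwaite approximation for the simplest case only: a balanced design with one random factor. It is a textbook special case written for illustration, not the generalized method of the paper; the function name `total_variance_satterthwaite` and the example data are hypothetical.

```python
from statistics import mean

def total_variance_satterthwaite(groups):
    """Balanced one-way random-effects ANOVA: total variance and its Satterthwaite df."""
    k = len(groups)            # number of levels of the random factor
    n = len(groups[0])         # observations per level
    assert all(len(g) == n for g in groups), "sketch assumes a balanced design"

    grand = mean(x for g in groups for x in g)
    group_means = [mean(g) for g in groups]

    # Between-level and within-level mean squares.
    msa = n * sum((m - grand) ** 2 for m in group_means) / (k - 1)
    mse = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g) / (k * (n - 1))

    # sigma_a^2 + sigma_e^2 is estimated by MSA / n + (1 - 1/n) * MSE.
    c1, c2 = 1.0 / n, 1.0 - 1.0 / n
    total_var = c1 * msa + c2 * mse

    # Satterthwaite: (sum c_i MS_i)^2 / sum((c_i MS_i)^2 / df_i).
    df = total_var ** 2 / ((c1 * msa) ** 2 / (k - 1) + (c2 * mse) ** 2 / (k * (n - 1)))
    return total_var, df

# Hypothetical measurements: 4 batches, 3 replicates each.
groups = [[10.1, 9.8, 10.4], [11.0, 11.3, 10.9], [9.5, 9.9, 9.7], [10.6, 10.2, 10.8]]
total_var, df = total_variance_satterthwaite(groups)
print(f"total variance = {total_var:.3f}, Satterthwaite df = {df:.2f}")
```

A two-sided prediction interval would then combine the estimated mean, the total variance plus the variance of the mean, and a Student t quantile with this df; the quantile is left out here because it needs a statistics library beyond the Python standard library.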
Affiliation(s)
- Dan Lin
- Pre-Clinical & Research - Biostatistics and Statistical Programming, GSK, Rixensart, Belgium
- Walter Hoyer
- TRD - CMC Statistical Sciences, GSK, Marburg, Germany
|
48
|
Baldoni PL, Rashid NU, Ibrahim JG. Improved detection of epigenomic marks with mixed-effects hidden Markov models. Biometrics 2019; 75:1401-1413. [PMID: 31081192 PMCID: PMC6851437 DOI: 10.1111/biom.13083] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2018] [Accepted: 05/03/2019] [Indexed: 11/30/2022]
Abstract
Chromatin immunoprecipitation followed by next-generation sequencing (ChIP-seq) is a technique to detect genomic regions containing protein-DNA interaction, such as transcription factor binding sites or regions containing histone modifications. One goal of the analysis of ChIP-seq experiments is to identify genomic loci enriched for sequencing reads pertaining to DNA bound to the factor of interest. The accurate identification of such regions aids in the understanding of epigenomic marks and gene regulatory mechanisms. Given the reduction of massively parallel sequencing costs, methods to detect consensus regions of enrichment across multiple samples are of interest. Here, we present a statistical model to detect broad consensus regions of enrichment from ChIP-seq technical or biological replicates through a class of zero-inflated mixed-effects hidden Markov models. We show that the proposed model outperforms existing methods for consensus peak calling in common epigenomic marks by accounting for the excess zeros and sample-specific biases. We apply our method to data from the Encyclopedia of DNA Elements and Roadmap Epigenomics projects and also from an extensive simulation study.
Affiliation(s)
- Pedro L. Baldoni
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, U.S.A
- Naim U. Rashid
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, U.S.A
- Joseph G. Ibrahim
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, U.S.A
|
49
|
Purfield DC, Evans RD, Carthy TR, Berry DP. Genomic Regions Associated With Gestation Length Detected Using Whole-Genome Sequence Data Differ Between Dairy and Beef Cattle. Front Genet 2019; 10:1068. [PMID: 31749838 PMCID: PMC6848454 DOI: 10.3389/fgene.2019.01068] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 06/24/2019] [Accepted: 10/04/2019] [Indexed: 12/17/2022]
Abstract
While many association studies exist that have attempted to relate genomic markers to phenotypic performance in cattle, very few have considered gestation length as a phenotype, and of those that did, none used whole genome sequence data from multiple breeds. The objective of the present study was therefore to relate imputed whole genome sequence data to estimated breeding values for gestation length using 22,566 sires (representing 2,262,706 progeny) of multiple breeds [Angus (AA), Charolais (CH), Holstein-Friesian (HF), and Limousin (LM)]. The associations were undertaken within breed using linear mixed models that accounted for genomic relatedness among sires; a separate association analysis was undertaken with all breeds analysed together but with breed included as a fixed effect in the model. Furthermore, the genome was divided into 500 kb segments and whether or not segments harboured a single nucleotide polymorphism (SNP) with a P ≤ 1 × 10⁻⁴ common to different combinations of breeds was determined. Putative quantitative trait loci (QTL) regions associated with gestation length were detected in all breeds; significant associations with gestation length were only detected in the HF population and in the across-breed analysis of all 22,566 sires. Twenty-five SNPs were significantly associated (P ≤ 5 × 10⁻⁸) with gestation length in the HF population. Of the 25 significant SNPs, 18 were located within three QTLs on Bos taurus autosome number (BTA) 18, six were in two QTL on BTA19, and one was located within a QTL on BTA7. The strongest association was rs381577268, a downstream variant of ZNF613 located within a QTL spanning from 58.06 to 58.19 Mb on BTA18; it accounted for 1.37% of the genetic variance in gestation length.
Overall there were 11 HF animals within the edited dataset that were homozygous for the T allele at rs381577268 and these had a 3.3 day longer (P < 0.0001) estimated breeding value (EBV) for gestation length than the heterozygous animals and a 4.7 day longer (P < 0.0001) EBV for gestation length than the homozygous CC animals. The majority of the 500 kb windows harboring a SNP with a P ≤ 1 × 10⁻⁴ were unique to a single breed and no window was shared among all four breeds for gestation length, suggesting any QTLs identified are breed-specific associations.
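The window-sharing comparison described in this abstract can be sketched with plain sets. The SNP records below are hypothetical, and the fixed grid of non-overlapping 500 kb windows starting at position 0 is an assumption; the abstract does not say how the segments were anchored.

```python
from collections import defaultdict

WINDOW = 500_000      # 500 kb segments, as in the abstract
THRESHOLD = 1e-4      # suggestive P-value cut-off, as in the abstract

def suggestive_windows(snps):
    """Return the set of (chromosome, window index) pairs holding a SNP with P <= THRESHOLD."""
    return {(chrom, pos // WINDOW) for chrom, pos, p in snps if p <= THRESHOLD}

# Hypothetical per-breed results: (chromosome, base-pair position, P-value).
results = {
    "AA": [(18, 58_100_000, 3e-5), (7, 1_200_000, 0.2)],
    "CH": [(18, 58_150_000, 8e-5), (19, 40_000_000, 5e-6)],
    "HF": [(18, 58_120_000, 2e-9), (19, 40_300_000, 1e-5)],
    "LM": [(7, 1_250_000, 9e-5)],
}

# Map each window to the set of breeds in which it holds a suggestive SNP.
sharing = defaultdict(set)
for breed, snps in results.items():
    for window in suggestive_windows(snps):
        sharing[window].add(breed)

shared_by_all = [w for w, breeds in sharing.items() if len(breeds) == len(results)]
unique = [w for w, breeds in sharing.items() if len(breeds) == 1]
print("shared by all breeds:", shared_by_all)
print("unique to one breed:", unique)
```

In this toy input no window is shared by all four breeds and one window is unique to a single breed, which mirrors the kind of tally the abstract reports.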
Affiliation(s)
- Deirdre C Purfield
- Animal & Grassland Research and Innovation Centre, Teagasc, Cork, Ireland
- Tara R Carthy
- Animal & Grassland Research and Innovation Centre, Teagasc, Cork, Ireland
- Donagh P Berry
- Animal & Grassland Research and Innovation Centre, Teagasc, Cork, Ireland
|
50
|
Pooley CM, Bishop SC, Doeschl-Wilson A, Marion G. Posterior-based proposals for speeding up Markov chain Monte Carlo. R Soc Open Sci 2019; 6:190619. [PMID: 31827823 PMCID: PMC6894579 DOI: 10.1098/rsos.190619] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 04/13/2019] [Accepted: 10/23/2019] [Indexed: 05/23/2023]
Abstract
Markov chain Monte Carlo (MCMC) is widely used for Bayesian inference in models of complex systems. Performance, however, is often unsatisfactory in models with many latent variables due to so-called poor mixing, necessitating the development of application-specific implementations. This paper introduces 'posterior-based proposals' (PBPs), a new type of MCMC update applicable to a huge class of statistical models (whose conditional dependence structures are represented by directed acyclic graphs). PBPs generate large joint updates in parameter and latent variable space, while retaining good acceptance rates (typically 33%). Evaluation against other approaches (from standard Gibbs/random walk updates to state-of-the-art Hamiltonian and particle MCMC methods) was carried out for widely varying model types: an individual-based model for disease diagnostic test data, a financial stochastic volatility model, a mixed model used in statistical genetics and a population model used in ecology. While different methods worked better or worse in different scenarios, PBPs were found to be either near to the fastest or significantly faster than the next best approach (by up to a factor of 10). PBPs, therefore, represent an additional general purpose technique that can be usefully applied in a wide variety of contexts.
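For readers unfamiliar with the baseline that this abstract compares against, the following is a minimal random-walk Metropolis sampler. It is not an implementation of posterior-based proposals, whose construction the abstract does not spell out; the function name, the standard normal target, and the step size are assumptions chosen for illustration.

```python
import math
import random

def random_walk_metropolis(log_post, x0, step, n_iter, seed=0):
    """Sample from a 1-D target given its log density (up to an additive constant)."""
    rng = random.Random(seed)
    x, lp = x0, log_post(x0)
    samples, accepted = [], 0
    for _ in range(n_iter):
        proposal = x + rng.gauss(0.0, step)      # symmetric Gaussian proposal
        lp_new = log_post(proposal)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, lp_new - lp)):
            x, lp = proposal, lp_new
            accepted += 1
        samples.append(x)
    return samples, accepted / n_iter

# Illustrative target: standard normal, log density -x^2 / 2 up to a constant.
samples, acc_rate = random_walk_metropolis(lambda x: -0.5 * x * x,
                                           x0=0.0, step=2.4, n_iter=20000)
sample_mean = sum(samples) / len(samples)
print(f"acceptance rate = {acc_rate:.2f}, sample mean = {sample_mean:.3f}")
```

In models with many latent variables, updates of this kind move one small step at a time, which is the poor mixing that the abstract says larger joint updates are meant to address.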
Affiliation(s)
- C. M. Pooley
- The Roslin Institute, The University of Edinburgh, Midlothian EH25 9RG, UK
- Biomathematics and Statistics Scotland, James Clerk Maxwell Building, The King's Buildings, Peter Guthrie Tait Road, Edinburgh EH9 3FD, UK
- S. C. Bishop
- The Roslin Institute, The University of Edinburgh, Midlothian EH25 9RG, UK
- A. Doeschl-Wilson
- The Roslin Institute, The University of Edinburgh, Midlothian EH25 9RG, UK
- G. Marion
- Biomathematics and Statistics Scotland, James Clerk Maxwell Building, The King's Buildings, Peter Guthrie Tait Road, Edinburgh EH9 3FD, UK
|