1
Radiotranscriptomics of non-small cell lung carcinoma for assessing high-level clinical outcomes using a machine learning-derived multi-modal signature. Biomed Eng Online 2023; 22:125. [PMID: 38102586 PMCID: PMC10724973 DOI: 10.1186/s12938-023-01190-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2023] [Accepted: 12/07/2023] [Indexed: 12/17/2023] Open
Abstract
BACKGROUND Multi-omics research has the potential to holistically capture intra-tumor variability, thereby improving therapeutic decisions by incorporating the key principles of precision medicine. The purpose of this study is to identify a robust method of integrating features from different sources, such as imaging, transcriptomics, and clinical data, to predict the survival and therapy response of non-small cell lung cancer patients. METHODS A total of 2996 radiomics, 5268 transcriptomics, and 8 clinical features were extracted from the NSCLC Radiogenomics dataset. Radiomics and deep features were calculated based on the volume of interest in pre-treatment, routine CT examinations, and then combined with RNA-seq and clinical data. Several machine learning classifiers were used to perform survival analysis and assess the patient's response to adjuvant chemotherapy. The proposed analysis was evaluated on an unseen testing set in a k-fold cross-validation scheme. Score- and concatenation-based multi-omics were used as feature integration techniques. RESULTS Six radiomics (elongation, cluster shade, entropy, variance, gray-level non-uniformity, and maximal correlation coefficient), six deep features (NasNet-based activations), and three transcriptomics (OTUD3, SUCGL2, and RQCD1) were found to be significant for therapy response. The examined score-based multi-omics approach improved the AUC by up to 0.10 on the unseen testing set (0.74 ± 0.06) and improved the balance between sensitivity and specificity for predicting therapy response for 106 patients, resulting in less biased models than the single-source models, which were either highly sensitive or highly specific. Six radiomics (kurtosis, GLRLM- and GLSZM-based non-uniformity from images with no filtering, biorthogonal, and daubechies wavelets), seven deep features (ResNet-based activations), and seven transcriptomics (ELP3, ZZZ3, PGRMC2, TRAK1, ATIC, USP7, and PNPLA2) were found to be significant for the survival analysis. Accordingly, the survival analysis for 115 patients was also improved by up to 0.20 in terms of the C-index (0.79 ± 0.03) by the proposed score-based multi-omics approach. CONCLUSIONS Compared to single-source models, multi-omics integration has the potential to improve prediction performance, increase model stability, and reduce bias for both treatment response and survival analysis.
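The abstract above reports a C-index for score-based integration without saying how either is computed. The following is a minimal sketch, not the authors' pipeline: it assumes that "score-based" integration means averaging each patient's per-source risk scores (the abstract does not specify the fusion rule), and it uses a simplified Harrell's C-index that skips pairs with tied times. The function names `fuse_scores` and `concordance_index` are hypothetical.

```python
from itertools import combinations


def fuse_scores(*per_source_scores):
    """Score-based (late) fusion, assumed here to be a plain per-patient average."""
    return [sum(scores) / len(scores) for scores in zip(*per_source_scores)]


def concordance_index(times, events, risk_scores):
    """Simplified Harrell's C-index for right-censored survival data.

    A pair is comparable when the patient with the shorter observed time
    had an event. The pair is concordant when that patient also has the
    higher risk score. Pairs with tied times are skipped (a simplification).
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue
        # a is the patient with the shorter observed time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the order is unknown
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: observed times, event indicators, and two hypothetical score sources.
times = [5, 3, 9, 7]
events = [1, 1, 0, 1]
radiomics_risk = [0.4, 0.9, 0.1, 0.3]
transcriptomics_risk = [0.6, 0.7, 0.2, 0.5]
fused = fuse_scores(radiomics_risk, transcriptomics_risk)
print(concordance_index(times, events, fused))
```

A C-index of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the reported 0.79 ± 0.03 should be read.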
2
Psychosis Symptom Trajectories Across Childhood and Adolescence in Three Longitudinal Studies: An Integrative Data Analysis with Mixture Modeling. PREVENTION SCIENCE : THE OFFICIAL JOURNAL OF THE SOCIETY FOR PREVENTION RESEARCH 2023; 24:1636-1647. [PMID: 37615885 DOI: 10.1007/s11121-023-01581-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 08/14/2023] [Indexed: 08/25/2023]
Abstract
Psychotic-like experiences (PLEs) are common throughout childhood, and the presence of these experiences is a significant risk factor for poor mental health later in development. Given the association of PLEs with a broad number of mental health diagnoses, these experiences serve as an important malleable target for early preventive interventions. However, little is known about these experiences across childhood. While these experiences may be common, longitudinal measurement in non-clinical settings is not. Therefore, in order to explore longitudinal trajectories of PLEs in childhood, we harmonized three school-based randomized control trials with longitudinal follow-up to identify heterogeneity in trajectories of these experiences. In an integrative data analysis (IDA) using growth mixture modeling, we identified three latent trajectory classes. One trajectory class was characterized by persistent PLEs, one was characterized by high initial probabilities but improving across the analytic period, and one was characterized by no reports of PLEs. Compared to the class without PLEs, those in the improving class were more likely to be male and have higher levels of aggressive and disruptive behavior at baseline. In addition to the substantive impact this work has on PLE research, we also discuss the methodological innovation as it relates to IDA. This IDA demonstrates the complexity of pooling data across multiple studies to estimate longitudinal mixture models.
3
Retrospective Psychometrics and Effect Heterogeneity in Integrated Data Analysis: Commentary on the Special Issue. PREVENTION SCIENCE : THE OFFICIAL JOURNAL OF THE SOCIETY FOR PREVENTION RESEARCH 2023; 24:1672-1681. [PMID: 37938526 PMCID: PMC11018253 DOI: 10.1007/s11121-023-01592-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 10/02/2023] [Indexed: 11/09/2023]
Abstract
The current special issue of Prevention Science indicates that momentum in using individual participant data (IPD) and integrative data analysis (IDA) to combine and synthesize findings in prevention science has accelerated over the past decade. In this commentary, we focus on two general themes involving methods for harmonizing measures and findings of effect heterogeneity. We describe methods for harmonization as retrospective psychometrics, requiring that we attend to the assumptions necessary for accurate measurement, but adjust our methods given the constraints of working with existing datasets that often involve different measures in different studies. We point to novel approaches for increasing confidence that semantic matching and empirical modeling used in these studies will yield accurate and valid measurements that can be combined in IDA. We also review findings about effect heterogeneity, emphasizing the importance of using etiologic and action theories to identify and evaluate sources of such effects. We note that all of the papers in this issue deserve careful attention, as they illustrate how prevention scientists are approaching the complexities of IDA and exploring novel methods for overcoming its challenges.
4
An Integrative Data Analysis of Main and Moderated Crossover Effects of Parent-Mediated Interventions on Depression and Anxiety Symptoms in Youth in Foster Care. PREVENTION SCIENCE : THE OFFICIAL JOURNAL OF THE SOCIETY FOR PREVENTION RESEARCH 2023; 24:1547-1557. [PMID: 36930405 DOI: 10.1007/s11121-023-01524-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 03/07/2023] [Indexed: 03/18/2023]
Abstract
Without preventative intervention, youth with a history of foster care (FC) involvement have a high likelihood of developing depression and anxiety (DA) symptoms. The current study used integrative data analysis to harmonize data across four foster and kinship parent-mediated interventions (and seven randomized control trials) designed to reduce youth externalizing and other problem behaviors to determine if, and for how long, these interventions may have crossover effects on youth DA symptoms. Moderation of intervention effects by youth biological sex, developmental period, number of prior placements, and race/ethnicity was also examined. Youth (N = 1891; 59% female; ages 4 to 18 years) behaviors were assessed via the Child Behavior Checklist, Parent Daily Report, and Eyberg Child Behavior Inventory at baseline, the end of the interventions (4-6 months post baseline), and two follow-up assessments (9-12 months and 18-24 months post baseline), yielding 4830 total youth-by-time assessments. The interventions were effective at reducing DA symptoms at the end of the interventions; however, effects were only sustained for one program at the follow-up assessments. No moderation effects were found. The current study indicates that parent-mediated interventions implemented during childhood or adolescence aimed at reducing externalizing and other problem behaviors had crossover effects on youth DA symptoms at the end of the interventions. Such intervention effects were sustained 12 and 24 months later only for the most at-risk youth involved in the most intensive intervention.
5
Who Benefits from School-Based Teen Pregnancy Prevention Programs? Examining Multidimensional Moderators of Program Effectiveness Across Four Studies. PREVENTION SCIENCE : THE OFFICIAL JOURNAL OF THE SOCIETY FOR PREVENTION RESEARCH 2023; 24:1535-1546. [PMID: 35994193 DOI: 10.1007/s11121-022-01423-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 08/05/2022] [Indexed: 10/15/2022]
Abstract
Recent research has suggested the importance of understanding for whom programs are most effective (Supplee et al., 2013) and that multidimensional profiles of risk and protective factors may moderate the effectiveness of programs (Lanza & Rhoades, 2012). For school-based prevention programs, moderators of program effectiveness may occur at both the individual and school levels. However, due to the relatively small number of schools in most individual trials, integrative data analysis across multiple studies may be necessary to fully understand the multidimensional individual and school factors that may influence program effectiveness. In this study, we applied multilevel latent class analysis to integrated data across four studies of a middle school pregnancy prevention program to examine moderators of program effectiveness on initiation of vaginal sex. Findings suggest that the program may be particularly effective for schools with USA-born students who speak another language at home. In addition, findings suggest potential positive outcomes of the program for individuals who are lower risk and engaging in normative dating or individuals with family risk. Findings suggest potential mechanisms by which teen pregnancy prevention programs may be effective.
6
Introduction to the Special Issue on Innovations and Applications of Integrative Data Analysis (IDA) and Related Data Harmonization Procedures in Prevention Science. PREVENTION SCIENCE : THE OFFICIAL JOURNAL OF THE SOCIETY FOR PREVENTION RESEARCH 2023; 24:1425-1434. [PMID: 37943445 DOI: 10.1007/s11121-023-01600-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 10/10/2023] [Indexed: 11/10/2023]
Abstract
This paper serves as an introduction to the special issue of Prevention Science entitled, "Innovations and Applications of Integrative Data Analysis (IDA) and Related Data Harmonization Procedures in Prevention Science." This special issue includes a collection of original papers from multiple disciplines that apply individual-level data synthesis methodologies, including IDA, individual participant meta-analysis, and other related methods to harmonize and integrate multiple datasets from intervention trials of the same or similar interventions. This work builds on a series of papers appearing in a prior Prevention Science special issue, entitled "Who Benefits from Programs to Prevent Adolescent Depression?" (Howe, Pantin, & Perrino, 2018). Since the publication of this prior work, the use of individual-level data synthesis has increased considerably in and outside of prevention. As such, there is a need for an update on current and future directions in IDA, with careful consideration of innovations and applications of these methods to fill important research gaps in prevention science. The papers in this issue are organized into two broad categories of (1) evidence synthesis papers that apply best practices in data harmonization and individual-level data synthesis and (2) new and emerging design, psychometric, and methodological issues and solutions. This collection of original papers is followed by two invited commentaries which provide insight and important reflections on the field and future directions for prevention science.
7
Mitigating Multiple Sources of Bias in a Quasi-Experimental Integrative Data Analysis: Does Treating Childhood Anxiety Prevent Substance Use Disorders in Late Adolescence/Young Adulthood? PREVENTION SCIENCE : THE OFFICIAL JOURNAL OF THE SOCIETY FOR PREVENTION RESEARCH 2023; 24:1622-1635. [PMID: 36057023 DOI: 10.1007/s11121-022-01422-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 08/05/2022] [Indexed: 11/26/2022]
Abstract
Psychiatric epidemiologists, developmental psychopathologists, prevention scientists, and treatment researchers have long speculated that treating child anxiety disorders could prevent alcohol and other drug use disorders in young adulthood. A primary challenge in examining long-term effects of anxiety disorder treatment from randomized controlled trials is that all participants receive an immediate or delayed study-related treatment prior to long-term follow-up assessment. Thus, if a long-term follow-up is conducted, a comparison condition no longer exists within the trial. Quasi-experimental designs (QEDs) pairing such clinical samples with comparable untreated epidemiological samples offer a method of addressing this challenge. Selection bias, often a concern in QEDs, can be mitigated by propensity score weighting. A second challenge may arise because the clinical and epidemiological studies may not have used identical measures, necessitating Integrative Data Analysis (IDA) for measure harmonization and scale score estimation. The present study uses a combination of propensity score weighting, zero-inflated mixture moderated nonlinear factor analysis (ZIM-MNLFA), and potential outcomes mediation in a child anxiety treatment QED/IDA (n = 396). Under propensity score-weighted potential outcomes mediation, CBT led to reductions in substance use disorder severity, the effects of which were mediated by reductions in anxiety severity in young adulthood. Sensitivity analyses highlighted the importance of attending to multiple types of bias. This study illustrates how hybrid QED/IDAs can be used in secondary prevention contexts for improved measurement and causal inference, particularly when control participants in clinical trials receive study-related treatment prior to long-term assessment.
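Propensity score weighting, as mentioned in the abstract above, can be illustrated with a short sketch. This is not the study's analysis: it assumes the propensity scores have already been estimated elsewhere, uses average-treatment-effect style inverse probability weights (the abstract does not state the estimand), and leaves out the measurement model (ZIM-MNLFA) and the mediation step entirely. All names are hypothetical.

```python
def ipw_weights(treated, propensity):
    """Inverse probability weights: 1/e for treated units, 1/(1 - e) for controls.

    `propensity` holds each unit's estimated probability of being treated,
    assumed to be strictly between 0 and 1.
    """
    return [1.0 / p if t else 1.0 / (1.0 - p) for t, p in zip(treated, propensity)]


def weighted_mean_difference(outcome, treated, weights):
    """Difference of weighted outcome means, treated minus control."""
    def weighted_mean(flag):
        rows = [(y, w) for y, t, w in zip(outcome, treated, weights) if bool(t) == flag]
        return sum(y * w for y, w in rows) / sum(w for _, w in rows)

    return weighted_mean(True) - weighted_mean(False)


# Toy example: treated (clinical) units and untreated (epidemiological) units.
treated = [True, True, False, False]
propensity = [0.5, 0.5, 0.5, 0.5]
outcome = [1.0, 3.0, 4.0, 6.0]
weights = ipw_weights(treated, propensity)
print(weighted_mean_difference(outcome, treated, weights))
```

Units that look unlike their own group (a treated unit with a low propensity, or a control with a high one) receive larger weights, which is how the weighting rebalances the two samples on the covariates behind the propensity model.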
8
Harmonizing Social, Emotional, and Behavioral Constructs in Prevention Science: Digging into the Weeds of Aligning Disparate Measures. PREVENTION SCIENCE : THE OFFICIAL JOURNAL OF THE SOCIETY FOR PREVENTION RESEARCH 2023; 24:1581-1594. [PMID: 36753042 DOI: 10.1007/s11121-022-01467-0] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Accepted: 11/08/2022] [Indexed: 02/09/2023]
Abstract
While integrative data analysis (IDA) presents great opportunity, it also necessitates a myriad of methodological decisions related to harmonizing disparate measures collected across multiple studies. There is a lack of step-by-step methodological guidance for harmonizing disparate measures of latent constructs differently conceptualized or operationalized across studies, such as social, emotional, and behavioral constructs often utilized in prevention science. The current paper addressed this gap by providing methodological guidance and a case illustration focused on harmonizing measures of disparately conceptualized and operationalized constructs. We do so by outlining a five-phased harmonization approach paired with an illustrative example of the approach as applied to harmonization of broadband latent emotional and behavioral health constructs assessed with different measures across studies. This approach builds on and expands upon procedures currently recommended in the IDA literature with parallels to best practices in test development procedures. The illustrative example of our phased approach is drawn from an IDA study of 11 randomized controlled trials of Coping Power (Lochman & Wells, 2004), an evidence-based preventive intervention. We demonstrate the harmonization of two constructs, internalizing and externalizing problems, as harmonized across the teacher-reported scales of the Achenbach System of Empirically Based Assessment (Achenbach, 1991a) and the Behavior Assessment System for Children (Reynolds & Kamphaus, 2004). Finally, we consider the potential strengths and limitations of this phased approach, underscoring areas for future methodological research and conclude with some recommendations.
9
Methodological Strategies for Prospective Harmonization of Studies: Application to 10 Distinct Outcomes Studies of Preventive Interventions Targeting Opioid Misuse. PREVENTION SCIENCE : THE OFFICIAL JOURNAL OF THE SOCIETY FOR PREVENTION RESEARCH 2023; 24:16-29. [PMID: 35976525 PMCID: PMC9935745 DOI: 10.1007/s11121-022-01412-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Accepted: 07/18/2022] [Indexed: 02/02/2023]
Abstract
The Helping to End Addiction Long-Term (HEAL) Prevention Cooperative (HPC) is rapidly developing 10 distinct evidence-based interventions for implementation in a variety of settings to prevent opioid misuse and opioid use disorder. One HPC objective is to compare intervention impacts on opioid misuse initiation, escalation, severity, and disorder and identify whether any HPC interventions are more effective than others for types of individuals. It provides a rare opportunity to prospectively harmonize measures across distinct outcomes studies. This paper describes the needs, opportunities, strategies, and processes that were used to harmonize HPC data. They are illustrated with a strategy to measure opioid use that spans the spectrum of opioid use experiences (termed involvement) and is composed of common "anchor items" ranging from initiation to symptoms of opioid use disorder. The limitations and opportunities anticipated from this approach to data harmonization are reviewed. Lastly, implications for future research cooperatives and the broader HEAL data ecosystem are discussed.
10
Examining the latent structure and correlates of sensory reactivity in autism: a multi-site integrative data analysis by the autism sensory research consortium. Mol Autism 2023; 14:31. [PMID: 37635263 PMCID: PMC10464466 DOI: 10.1186/s13229-023-00563-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/05/2023] [Accepted: 08/11/2023] [Indexed: 08/29/2023] Open
Abstract
BACKGROUND Differences in responding to sensory stimuli, including sensory hyperreactivity (HYPER), hyporeactivity (HYPO), and sensory seeking (SEEK) have been observed in autistic individuals across sensory modalities, but few studies have examined the structure of these "supra-modal" traits in the autistic population. METHODS Leveraging a combined sample of 3868 autistic youth drawn from 12 distinct data sources (ages 3-18 years and representing the full range of cognitive ability), the current study used modern psychometric and meta-analytic techniques to interrogate the latent structure and correlates of caregiver-reported HYPER, HYPO, and SEEK within and across sensory modalities. Bifactor statistical indices were used to both evaluate the strength of a "general response pattern" factor for each supra-modal construct and determine the added value of "modality-specific response pattern" scores (e.g., Visual HYPER). Bayesian random-effects integrative data analysis models were used to examine the clinical and demographic correlates of all interpretable HYPER, HYPO, and SEEK (sub)constructs. RESULTS All modality-specific HYPER subconstructs could be reliably and validly measured, whereas certain modality-specific HYPO and SEEK subconstructs were psychometrically inadequate when measured using existing items. Bifactor analyses supported the validity of a supra-modal HYPER construct (ωH = .800) but not a supra-modal HYPO construct (ωH = .653), and supra-modal SEEK models suggested a more limited version of the construct that excluded some sensory modalities (ωH = .800; 4/7 modalities). Modality-specific subscales demonstrated significant added value for all response patterns. Meta-analytic correlations varied by construct, although sensory features tended to correlate most with other domains of core autism features and co-occurring psychiatric symptoms (with general HYPER and speech HYPO demonstrating the largest numbers of practically significant correlations). 
LIMITATIONS Conclusions may not be generalizable beyond the specific pool of items used in the current study, which was limited to caregiver report of observable behaviors and excluded multisensory items that reflect many "real-world" sensory experiences. CONCLUSION Of the three sensory response patterns, only HYPER demonstrated sufficient evidence for valid interpretation at the supra-modal level, whereas supra-modal HYPO/SEEK constructs demonstrated substantial psychometric limitations. For clinicians and researchers seeking to characterize sensory reactivity in autism, modality-specific response pattern scores may represent viable alternatives that overcome many of these limitations.
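For readers unfamiliar with the ωH values reported above, coefficient omega hierarchical is conventionally defined for a bifactor model with orthogonal factors as the share of total-score variance attributable to the general factor. The notation below is introduced here for illustration and does not come from the abstract: λ_{g,i} is the general-factor loading of item i, λ_{k,i} is its loading on specific factor k (with item set S_k), and θ_i is its residual variance.

```latex
\omega_H =
\frac{\left(\sum_{i=1}^{p} \lambda_{g,i}\right)^{2}}
     {\left(\sum_{i=1}^{p} \lambda_{g,i}\right)^{2}
      + \sum_{k=1}^{K} \left(\sum_{i \in S_k} \lambda_{k,i}\right)^{2}
      + \sum_{i=1}^{p} \theta_i}
```

Read this way, ωH = .800 for HYPER means that roughly 80% of the variance in the total score is attributed to the general factor, against roughly 65% for HYPO.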
11
An integrative approach for the analysis of risk and health across the life course: challenges, innovations, and opportunities for life course research. DISCOVER SOCIAL SCIENCE AND HEALTH 2023; 3:14. [PMID: 37469576 PMCID: PMC10352429 DOI: 10.1007/s44155-023-00044-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/26/2023] [Accepted: 06/26/2023] [Indexed: 07/21/2023]
Abstract
Life course epidemiology seeks to understand the intricate relationships between risk factors and health outcomes across different stages of life to inform prevention and intervention strategies to optimize health throughout the lifespan. However, extant evidence has predominantly been based on separate analyses of data from individual birth cohorts or panel studies, which may not be sufficient to unravel the complex interplay of risk and health across different contexts. We highlight the importance of a multi-study perspective that enables researchers to: (a) Compare and contrast findings from different contexts and populations, which can help identify generalizable patterns and context-specific factors; (b) Examine the robustness of associations and the potential for effect modification by factors such as age, sex, and socioeconomic status; and (c) Improve statistical power and precision by pooling data from multiple studies, thereby allowing for the investigation of rare exposures and outcomes. This integrative framework combines the advantages of multi-study data with a life course perspective to guide research in understanding life course risk and resilience on adult health outcomes by: (a) Encouraging the use of harmonized measures across studies to facilitate comparisons and synthesis of findings; (b) Promoting the adoption of advanced analytical techniques that can accommodate the complexities of multi-study, longitudinal data; and (c) Fostering collaboration between researchers, data repositories, and funding agencies to support the integration of longitudinal data from diverse sources. An integrative approach can help inform the development of individualized risk scores and personalized interventions to promote health and well-being at various life stages.
12
Harmonizing Ethno-Regionally Diverse Datasets to Advance the Global Epidemiology of Dementia. Clin Geriatr Med 2023; 39:177-190. [PMID: 36404030 PMCID: PMC9767705 DOI: 10.1016/j.cger.2022.07.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
Abstract
Understanding dementia and cognitive impairment is a global effort needing data from multiple sources across diverse ethno-regional groups. Methodological heterogeneity means that these data often require harmonization to make them comparable before analysis. We discuss the benefits and challenges of harmonization, both retrospective and prospective, broadly and with a focus on data types that require particular sorts of approaches, including neuropsychological test scores and neuroimaging data. Throughout our discussion, we illustrate general principles and give examples of specific approaches in the context of contemporary research in dementia and cognitive impairment from around the world.
13
Design and methodology for an integrative data analysis of coping power: Direct and indirect effects on adolescent suicidality. Contemp Clin Trials 2022; 115:106705. [PMID: 35176503 PMCID: PMC9018598 DOI: 10.1016/j.cct.2022.106705] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 06/21/2021] [Revised: 02/08/2022] [Accepted: 02/08/2022] [Indexed: 01/02/2023]
Abstract
As suicide rates have risen in the last decade, there has been greater emphasis on targeting early risk conditions for suicidality among youth and adolescents as a form of suicide "inoculation". Two particular needs that have been raised in this nascent literature are a) the dearth of examination of early intervention effects on distal suicide risk that target externalizing behaviors and b) the need to harmonize multiple existing intervention datasets for greater precision in modeling intervention effects on low base rate outcomes such as suicidal behaviors. This project, entitled "Integrative Data Analysis of Coping Power (CP): Effects on Adolescent Suicidality", funded by the National Institute of Mental Health (NIMH), will harmonize and analyze data from 11 randomized controlled trials of CP (total individual-level N = 3183, total school-level N = 189). CP is an empirically-supported, child- and family-focused preventive intervention that focuses on reducing externalizing more broadly among youth who exhibit early aggression, which makes it ideally suited to targeting externalizing pathways to suicidality. The project utilizes three measurement and data analysis frameworks that have emerged across multiple independent disciplines: integrative data analysis (IDA), random treatment effects multilevel modeling (RTE-MLM), and propensity score weighting (PSW). If successful, the project will a) provide initial evidence that CP would have gender-specific indirect effects on suicidality through reductions in externalizing for boys and reductions in internalizing for girls and b) identify optimal conditions under which CP is delivered (e.g., groups, individuals, online) across participants on reductions in suicidality and other key intermediate endpoints.
14
Abstract
Multiple myeloma (MM) is the second most frequent hematological malignancy in the world, although its pathogenesis remains unclear. Gene profiling studies, commonly carried out through next-generation sequencing (NGS) and microarray technologies, represent powerful tools for discovering prognostic markers in MM. NGS technologies have made great leaps forward both economically and technically, gaining in popularity. As NGS techniques become simpler and cheaper, researchers choose NGS over microarrays for more of their genomic applications. However, microarrays still provide significant benefits with respect to NGS. For instance, RNA-Seq requires more complex bioinformatic analysis than microarrays and lacks standardized analysis protocols. Therefore, a synergy between the two technologies may well be expected in the future. To take up this challenge, a valid tool for integrative analysis of MM data retrieved through NGS techniques is MMRFBiolinks, a new R package for integrating and analyzing datasets from the Multiple Myeloma Research Foundation (MMRF) CoMMpass (Clinical Outcomes in MM to Personal Assessment of Genetic Profile) study, available at the MMRF Researcher Gateway (MMRF-RG) and at the National Cancer Institute Genomic Data Commons (NCI-GDC) Data Portal. Instead of developing a completely new package from scratch, we decided to leverage TCGAbiolinks, an R/Bioconductor package, because it provides some useful methods to access and analyze MMRF-CoMMpass data.
An integrative analysis workflow based on MMRFBiolinks is illustrated. In particular, it leads towards a comparative analysis of RNA-Seq data stored at the GDC Data Portal that makes it possible to carry out a Kaplan-Meier (KM) survival analysis and an enrichment analysis for a differential gene expression (DGE) gene set. Furthermore, it deals with MMRF-RG data for analyzing the correlation between canonical variants and treatment outcome as well as treatment class. To show the potential of the workflow, we present two case studies. The former deals with data of MM bone marrow sample types available at the GDC Data Portal. The latter deals with MMRF-RG data for analyzing the correlation between canonical variants in a gene set obtained from case study 1 and the treatment outcome as well as the treatment class.
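The Kaplan-Meier survival analysis mentioned in the workflow can be sketched independently of any package. The block below is a generic product-limit estimator written in Python for illustration; it is not the MMRFBiolinks or TCGAbiolinks API (both are R packages), and the function name `kaplan_meier` is hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  : observed follow-up time per subject
    events : 1 (or True) if the event occurred, 0 (or False) if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Gather every subject whose observed time equals t.
        while i < len(data) and data[i][0] == t:
            deaths += int(data[i][1])
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # events and censored subjects both leave the risk set
    return curve


# Toy cohort: five subjects, two of them censored.
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
for t, s in curve:
    print(t, round(s, 3))
```

Comparing two such curves, for example high versus low expression of a gene from the DGE set, would additionally need a test such as the log-rank test, which this sketch does not include.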
15
An application of moderated nonlinear factor analysis to develop a commensurate measure of alcohol problems across four alcohol treatment studies. Drug Alcohol Depend 2021; 229:109068. [PMID: 34628095 PMCID: PMC8671250 DOI: 10.1016/j.drugalcdep.2021.109068] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2021] [Revised: 08/17/2021] [Accepted: 08/17/2021] [Indexed: 10/20/2022]
Abstract
BACKGROUND Self-report measures of alcohol problems are commonly included in studies evaluating treatment and recovery from alcohol use disorder (AUD), but no prior study has examined the replicability of the measurement of alcohol problems across studies with various measures and diverse samples. Further, it is unclear which items may be better indicators of alcohol problems for patient subgroups. In the present study, we integrated data from four large alcohol treatment studies to develop a commensurate measure of alcohol problems using moderated nonlinear factor analysis (MNLFA). METHODS Data were from the COMBINE study, Project MATCH, the Relapse Replication and Extension Project (RREP), and the United Kingdom Alcohol Treatment Trial (UKATT), yielding a total sample size of 4414. MNLFA was carried out on the Drinker Inventory of Consequences (COMBINE, MATCH, RREP) and Alcohol Problems Questionnaire (UKATT). RESULTS We successfully created a 78-item commensurate measure of alcohol problems and examined differential item functioning (DIF) by study membership, time, and socio-demographic characteristics. Sixty-two items demonstrated intercept DIF, suggesting differences in rates of item endorsement for clients with the same underlying levels of alcohol problems across patient subgroups. Six items demonstrated loading DIF, suggesting differences in the extent to which the items were indicative of alcohol problems across patient subgroups. CONCLUSIONS The self-reported measurement of alcohol problems replicates across measures and diverse samples. Items with DIF have clinical implications for the treatment of AUD. Finally, MNLFA scores can be used to test substantive research questions across these studies.
|
16
|
A Structural Equation Modeling Approach to Meta-analytic Mediation Analysis Using Individual Participant Data: Testing Protective Behavioral Strategies as a Mediator of Brief Motivational Intervention Effects on Alcohol-Related Problems. Prev Sci 2021; 23:390-402. [PMID: 34767159 PMCID: PMC8975788 DOI: 10.1007/s11121-021-01318-4] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Accepted: 10/01/2021] [Indexed: 11/25/2022]
Abstract
This paper introduces a meta-analytic mediation analysis approach for individual participant data (IPD) from multiple studies. Mediation analysis evaluates whether the effectiveness of an intervention on health outcomes occurs because of change in a key behavior targeted by the intervention. However, individual trials are often statistically underpowered to test mediation hypotheses. Existing approaches for evaluating mediation in the meta-analytic context are limited by their reliance on aggregate data; thus, findings may be confounded with study-level differences unrelated to the pathway of interest. To overcome the limitations of existing meta-analytic mediation approaches, we used a one-stage estimation approach using structural equation modeling (SEM) to combine IPD from multiple studies for mediation analysis. This approach (1) accounts for the clustering of participants within studies, (2) accommodates missing data via multiple imputation, and (3) allows valid inferences about the indirect (i.e., mediated) effects via bootstrapped confidence intervals. We used data (N = 3691 from 10 studies) from Project INTEGRATE (Mun et al. Psychology of Addictive Behaviors, 29, 34–48, 2015) to illustrate the SEM approach to meta-analytic mediation analysis by testing whether improvements in the use of protective behavioral strategies mediate the effectiveness of brief motivational interventions for alcohol-related problems among college students. To facilitate the application of the methodology, we provide annotated computer code in R and data for replication. At a substantive level, stand-alone personalized feedback interventions reduced alcohol-related problems via greater use of protective behavioral strategies; however, the net-mediated effect across strategies was small in size, on average.
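The idea of a bootstrapped confidence interval for an indirect effect can be sketched as follows. This is a minimal, single-level illustration on simulated data using ordinary least squares and the product-of-coefficients estimate. It is not the authors' annotated R code, and the variable names, sample size, and true path values are assumptions made for the example.

```python
import numpy as np

def indirect_effect(x, m, y):
    """Product-of-coefficients estimate a*b for the path X -> M -> Y."""
    # a path: slope of the mediator M regressed on the intervention X
    a = np.polyfit(x, m, 1)[0]
    # b path: coefficient of M when Y is regressed on M and X together
    design = np.column_stack([np.ones_like(x), m, x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return a * coef[1]

def bootstrap_ci(x, m, y, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the indirect effect."""
    rng = np.random.default_rng(seed)
    n = len(x)
    estimates = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample participants with replacement
        estimates[i] = indirect_effect(x[idx], m[idx], y[idx])
    lo, hi = np.percentile(estimates, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return lo, hi

# Simulated data (hypothetical): intervention raises the mediator, mediator lowers problems.
rng = np.random.default_rng(42)
n = 500
x = rng.integers(0, 2, size=n).astype(float)   # 0 = control, 1 = intervention
m = 0.5 * x + rng.normal(size=n)               # e.g. protective strategy use
y = -0.4 * m + rng.normal(size=n)              # e.g. alcohol-related problems
est = indirect_effect(x, m, y)
lo, hi = bootstrap_ci(x, m, y)
print(f"indirect effect = {est:.3f}, 95% CI = [{lo:.3f}, {hi:.3f}]")
```

This sketch ignores the clustering of participants within studies and the handling of missing data, both of which the one-stage SEM approach described above is designed to address.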
|
17
|
A Coordinated Multi-study Analysis of the Longitudinal Association Between Handgrip Strength and Cognitive Function in Older Adults. J Gerontol B Psychol Sci Soc Sci 2021; 76:229-241. [PMID: 31187137 DOI: 10.1093/geronb/gbz072] [Citation(s) in RCA: 37] [Impact Index Per Article: 12.3] [Received: 09/06/2018] [Indexed: 12/11/2022] Open
Abstract
OBJECTIVE Handgrip strength, an indicator of overall muscle strength, has been found to be associated with a slower rate of cognitive decline and decreased risk for cognitive impairment and dementia. However, evaluating the replicability of associations between aging-related changes in physical and cognitive functioning is challenging due to differences in study designs and analytical models. A multiple-study coordinated analysis approach was used to generate new longitudinal results based on comparable construct-level measurements and identical statistical models and to facilitate replication and research synthesis. METHODS We performed coordinated analysis on 9 cohort studies affiliated with the Integrative Analysis of Longitudinal Studies of Aging and Dementia (IALSA) research network. Bivariate linear mixed models were used to examine associations among individual differences in baseline level, rate of change, and occasion-specific variation across grip strength and indicators of cognitive function, including mental status, processing speed, attention and working memory, perceptual reasoning, verbal ability, and learning and memory. Results were summarized using meta-analysis. RESULTS After adjustment for covariates, we found an overall moderate association between change in grip strength and change in each cognitive domain for both males and females: Average correlation coefficient was 0.55 (95% CI = 0.44-0.56). We also found a high level of heterogeneity in this association across studies. DISCUSSION Meta-analytic results from nine longitudinal studies showed consistently positive associations between linear rates of change in grip strength and changes in cognitive functioning. Future work will benefit from the examination of individual patterns of change to understand the heterogeneity in rates of aging and health-related changes across physical and cognitive biomarkers.
|
18
|
An ontology-based documentation of data discovery and integration process in cancer outcomes research. BMC Med Inform Decis Mak 2020; 20:292. [PMID: 33317497 PMCID: PMC7734720 DOI: 10.1186/s12911-020-01270-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 09/09/2020] [Accepted: 09/17/2020] [Indexed: 01/24/2023] Open
Abstract
Background To reduce cancer mortality and improve cancer outcomes, it is critical to understand the various cancer risk factors (RFs) across different domains (e.g., genetic, environmental, and behavioral risk factors) and levels (e.g., individual, interpersonal, and community levels). However, prior research on RFs of cancer outcomes has primarily focused on individual-level RFs due to the lack of integrated datasets that contain multi-level, multi-domain RFs. Further, the lack of consensus and proper guidance on how to systematically identify RFs also increases the difficulty of RF selection from heterogeneous data sources in a multi-level integrative data analysis (mIDA) study. More importantly, as mIDA studies require integrating heterogeneous data sources, the data integration processes in the limited number of existing mIDA studies are inconsistently performed and poorly documented, thus threatening transparency and reproducibility. Methods Informed by the National Institute on Minority Health and Health Disparities (NIMHD) research framework, we (1) reviewed existing reporting guidelines from the Enhancing the QUAlity and Transparency Of health Research (EQUATOR) network and (2) developed a theory-driven reporting guideline to guide the RF variable selection, data source selection, and data integration process. Then, we developed an ontology to standardize the documentation of the RF selection and data integration process in mIDA studies. Results We summarized the review results and created a reporting guideline—ATTEST—for reporting the variable selection and data source selection and integration process. We provided an ATTEST checklist to help researchers annotate and clearly document each step of their mIDA studies to ensure transparency and reproducibility.
We used ATTEST to report two mIDA case studies and further transformed the annotation results into semantic triples, so that the relationships among variables, data sources, and integration processes are explicitly standardized and modeled using the classes and properties from OD-ATTEST. Conclusion Our ontology-based reporting guideline solves some key challenges in current mIDA studies for cancer outcomes research by providing (1) theory-driven guidance for multi-level and multi-domain RF variable and data source selection; and (2) standardized documentation of the data selection and integration processes powered by an ontology, thereby enabling the sharing of mIDA study reports among researchers.
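A semantic triple is a (subject, predicate, object) statement, and a set of triples can be queried by pattern. The sketch below shows the idea with plain Python tuples. Every identifier in it (the `attest:`, `study:`, `var:`, `source:`, and `level:` names) is hypothetical and is not taken from the actual OD-ATTEST ontology.

```python
# Each triple is (subject, predicate, object); all identifiers are hypothetical.
triples = [
    ("study:caseA", "attest:usesDataSource", "source:cancerRegistry"),
    ("study:caseA", "attest:usesDataSource", "source:censusData"),
    ("var:smokingStatus", "attest:measuredAtLevel", "level:individual"),
    ("var:povertyRate", "attest:measuredAtLevel", "level:community"),
    ("var:povertyRate", "attest:derivedFrom", "source:censusData"),
]

def objects(subject, predicate):
    """Return every object linked to `subject` by `predicate`, in insertion order."""
    return [o for s, p, o in triples if s == subject and p == predicate]

# Which data sources does the (hypothetical) case study draw on?
print(objects("study:caseA", "attest:usesDataSource"))
```

Because each relationship is an explicit statement rather than free text, the same question can be asked of any study report that uses the shared vocabulary.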
|
19
|
Harmonizing altered measures in integrative data analysis: A methods analogue study. Behav Res Methods 2020; 53:1031-1045. [PMID: 32939683 DOI: 10.3758/s13428-020-01472-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2022]
Abstract
In the current study, we used an analogue integrative data analysis (IDA) design to test optimal scoring strategies for harmonizing alcohol- and drug-use consequence measures with varying degrees of alteration across four study conditions. We evaluated performance of mean, confirmatory factor analysis (CFA), and moderated nonlinear factor analysis (MNLFA) scores based on traditional indices of reliability (test-retest, internal, and score recovery or parallel forms) and validity. Participants in the analogue study included 854 college students (46% male; 21% African American, 5% Hispanic/Latino, 56% European American) who completed two versions of the altered measures at two sessions, separated by 2 weeks. As expected, mean, CFA, and MNLFA scores all resulted in scales with lower reliability given increasing scale alteration (with less fidelity to formerly developed scales) and shorter scale length. MNLFA and CFA scores, however, showed greater validity than mean scores, demonstrating stronger relationships with external correlates. Implications for measurement harmonization in the context of IDA are discussed.
|
20
|
Age-related associations between substance use and sexual risk behavior among high-risk young African American women in the South. Addict Behav 2019; 96:110-118. [PMID: 31075728 DOI: 10.1016/j.addbeh.2019.04.031] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 12/11/2018] [Revised: 04/27/2019] [Accepted: 04/28/2019] [Indexed: 11/23/2022]
Abstract
BACKGROUND We assessed age-related associations between substance use and sexual risk behavior using data from three HIV prevention trials that enrolled young African American women. METHODS We used integrative data analysis to pool data from 1862 individuals aged 16-25 years. We used time-varying effect models to examine associations between substance use (alcoholic drinks per month, recent marijuana use, cigarettes smoked per day) and sexual risk behaviors (monthly frequency of vaginal sex, multiple sex partners, condomless sex), adjusting for the fixed effect of trial. RESULTS In models that included all three substances, cigarette smoking was not associated with any outcome. Alcohol quantity was associated with greater frequency of sex at all ages, an increased likelihood of having multiple sex partners from about age 17-24 years, and an increased likelihood of condomless sex after about age 18.5 years. Associations between alcohol quantity and sex frequency were relatively stable; associations with having multiple sex partners and condomless sex increased beginning at about age 22 years. Marijuana use was associated with greater sex frequency at approximate ages 16.5-24 years and an increased likelihood of having multiple sex partners at ages 18-24 years. Associations with sex frequency were relatively stable; associations with having multiple sex partners increased from about age 18 and peaked at about age 23 years. CONCLUSIONS We observed developmentally-dependent relationships between both alcohol and marijuana and sexual risk behavior. The findings underscore the need to address substance-related sexual risk among young African American women and may inform optimal timing of intervention.
|
21
|
Impact of Behavioral Drug Abuse Treatment on Sexual Risk Behaviors: An Integrative Data Analysis of Eight Trials Conducted Within the National Drug Abuse Treatment Clinical Trials Network. Prev Sci 2019; 19:761-771. [PMID: 29868998 DOI: 10.1007/s11121-018-0913-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 10/14/2022]
Abstract
The extent to which behavioral drug abuse treatments affect sexual risk behaviors is largely unknown. This study examined the impact of behavioral drug abuse treatments on sexual risk behaviors using an integrative data analysis approach across eight trials conducted within the National Drug Abuse Treatment Clinical Trials Network (CTN). Participants (N = 1305) from eight randomized controlled trials who were sexually active at baseline were included in the pooled dataset; 48.7% were female, 64.1% self-identified as a racial/ethnic minority, with M (SD) age of 34.9 (9.6). Longitudinal logistic regression estimated the probability of risky sexual behavior (i.e., inconsistent condom use and/or > 1 sexual partner in past 30 days) post-intervention with an indicator variable (1 for post-intervention), study condition (control, intervention), and their interaction as predictors; the analysis employed random effects for each trial and included relevant control variables. Time-varying differences in effects based on weeks post-intervention were incorporated using interacted linear and quadratic terms with condition status. Approximately 84.2% reported risky sexual behaviors at baseline. The control and intervention conditions were 18.5 and 17.3 percentage points less likely to report risky sexual behavior post-intervention, respectively. Results suggest decreasing rates of risky sex engagement until 8 weeks (control) or 9 weeks (intervention) post-intervention; risky sexual behavior subsequently increased. Behavioral CTN trial participation was associated with decreased sexual risk behaviors in both the intervention and control trial conditions. Participation in behavioral substance use treatment may result in secondary benefits of sexual risk behavior reductions.
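When a logistic model includes linear and quadratic terms for weeks post-intervention, the week at which the predicted risk stops falling and starts rising is the vertex of the quadratic on the logit scale. The sketch below computes that week. The coefficient values are hypothetical, chosen only so that the result lands at 8 weeks, and are not the estimates reported in the study.

```python
def turning_point(b_linear, b_quadratic):
    """Week at which logit(p) = b0 + b_linear*week + b_quadratic*week**2 is lowest.

    A minimum exists only when the quadratic coefficient is positive.
    """
    if b_quadratic <= 0:
        raise ValueError("no interior minimum: quadratic coefficient must be positive")
    return -b_linear / (2.0 * b_quadratic)

# Hypothetical coefficients: risk declines at first, then the decline reverses.
week = turning_point(b_linear=-0.16, b_quadratic=0.01)
print(f"risk is lowest at about week {week:.1f}")
```

In the model described above these terms are interacted with condition, so each arm has its own pair of coefficients and therefore its own turning point.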
|
22
|
Integrative Data Analysis of Gender and Ethnic Measurement Invariance in Nicotine Dependence Symptoms. Prev Sci 2019; 19:748-760. [PMID: 29396761 DOI: 10.1007/s11121-018-0867-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/28/2022]
Abstract
Little research has evaluated whether conflicting evidence for gender and racial/ethnic differences in nicotine dependence (ND) may be attributed to differences in psychometric properties of ND symptoms, particularly for young Hispanic smokers. Inadequate racial/ethnic diversity and limited smoking exposure variability have hampered research in young smokers. We used integrative data analysis (IDA) to pool DSM-IV ND symptom data for current smokers aged 12-25 (N = 20,328) from three nationally representative surveys (1999, 2000 National Surveys on Drug Use and Health (NSDUH) and Wave 1 National Epidemiologic Survey on Alcohol and Related Conditions (NESARC)). Moderated nonlinear factor analysis (MNLFA) tested symptom measurement invariance in the pooled sample containing greater ethnic and smoking exposure variability. There was study noninvariance for most symptoms. NESARC participants were more likely to report tolerance, using larger amounts or for longer periods, inability to cut down/quit, and more time spent smoking at higher levels of ND severity, but reported emotional/physical health problems at lower ND severity. Four symptoms showed gender or race/ethnicity noninvariance, but observed differences were small. An ND severity factor score adjusting for symptom noninvariance related to study membership, gender, and race/ethnicity did not differ substantively from traditional DSM-IV diagnosis and number of endorsed symptoms in estimated gender and race/ethnicity differences in ND. Results were consistent with studies finding minimal gender and racial/ethnic differences in ND, and suggest that symptom noninvariance is not a major contributor to observed differences. Results support IDA as a potentially promising approach for testing novel ND hypotheses not possible in independent studies.
|
23
|
Integrative analysis of loss-of-function variants in clinical and genomic data reveals novel genes associated with cardiovascular traits. BMC Med Genomics 2019; 12:108. [PMID: 31345219 PMCID: PMC6657044 DOI: 10.1186/s12920-019-0542-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 12/24/2022] Open
Abstract
Background Genetic loss-of-function variants (LoFs) associated with disease traits are increasingly recognized as critical evidence for the selection of therapeutic targets. We integrated the analysis of genetic and clinical data from 10,511 individuals in the Mount Sinai BioMe Biobank to identify genes with LoFs significantly associated with cardiovascular disease (CVD) traits, and used RNA-sequence data of seven metabolic and vascular tissues isolated from 600 CVD patients in the Stockholm-Tartu Atherosclerosis Reverse Network Engineering Task (STARNET) study for validation. We also carried out in vitro functional studies of several candidate genes, and in vivo studies of one gene. Results We identified LoFs in 433 genes significantly associated with at least one of 10 major CVD traits. Next, we used RNA-sequence data from the STARNET study to validate 115 of the 433 LoF-harboring genes, in that their expression levels were concordantly associated with corresponding CVD traits. Together with the documented hepatic lipid-lowering gene, APOC3, the expression levels of six additional liver LoF-genes were positively associated with levels of plasma lipids in STARNET. Candidate LoF-genes were subjected to gene silencing in HepG2 cells, with marked overall effects on cellular LDLR, levels of triglycerides, and secreted APOB100 and PCSK9. In addition, we identified novel LoFs in DGAT2 associated with lower plasma cholesterol and glucose levels in BioMe that were also confirmed in STARNET, and showed that a selective DGAT2 inhibitor in C57BL/6 mice not only significantly lowered fasting glucose levels but also affected body weight. Conclusion In sum, by integrating genetic and electronic medical record data, and leveraging one of the world’s largest human RNA-sequence datasets (STARNET), we identified known and novel CVD-trait related genes that may serve as targets for CVD therapeutics and as such merit further investigation.
|
24
|
Programs for Preventing Depression in Adolescence: Who Benefits and Who Does Not? An Introduction to the Supplemental Issue. Prev Sci 2019; 19:1-5. [PMID: 29368296 DOI: 10.1007/s11121-018-0870-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Abstract
We introduce this supplemental issue of Prevention Science, which brings together a set of papers from leading investigators who have conducted trials testing whether intervention programs prevent adolescent depression. Using data from these trials, these papers explore a series of factors that might account for variation in intervention benefit, employing several novel methods for assessing effect heterogeneity. These studies follow two general paradigms: three papers report findings from single randomized preventive intervention trials, while the remaining papers develop and apply new methods for combining data from multiple studies to evaluate effect heterogeneity more broadly. Colleagues from NIMH and SAMHSA also provide commentaries on these studies. They conclude that synthesis of findings from multiple trials holds great promise for advancing the field, and progress will be accelerated if collaborative data sharing becomes the norm rather than the exception.
|
25
|
Addressing Methodologic Challenges and Minimizing Threats to Validity in Synthesizing Findings from Individual-Level Data Across Longitudinal Randomized Trials. Prev Sci 2019; 19:60-73. [PMID: 28434055 DOI: 10.1007/s11121-017-0769-1] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Indexed: 11/28/2022]
Abstract
Integrative Data Analysis (IDA) encompasses a collection of methods for data synthesis that pools participant-level data across multiple studies. Compared with single-study analyses, IDA provides larger sample sizes, better representation of participant characteristics, and often increased statistical power. Many of the methods currently available for IDA have focused on examining developmental changes using longitudinal observational studies employing different measures across time and study. However, IDA can also be useful in synthesizing across multiple randomized clinical trials to improve our understanding of the comprehensive effectiveness of interventions, as well as mediators and moderators of those effects. The pooling of data from randomized clinical trials presents a number of methodological challenges, and we discuss ways to examine potential threats to internal and external validity. Using as an illustration a synthesis of 19 randomized clinical trials on the prevention of adolescent depression, we articulate IDA methods that can be used to minimize threats to internal validity, including (1) heterogeneity in the outcome measures across trials, (2) heterogeneity in the follow-up assessments across trials, (3) heterogeneity in the sample characteristics across trials, (4) heterogeneity in the comparison conditions across trials, and (5) heterogeneity in the impact trajectories. We also demonstrate a technique for minimizing threats to external validity in synthesis analysis that may result from non-availability of some trial datasets. The proposed methods rely heavily on latent variable modeling extensions of the latent growth curve model, as well as missing data procedures. The goal is to provide strategies for researchers considering IDA.
|
26
|
Abstract
DNA methylation is a widely investigated epigenetic mark with important roles in development and disease. High-throughput assays enable genome-scale DNA methylation analysis in large numbers of samples. Here, we describe a new version of our RnBeads software - an R/Bioconductor package that implements start-to-finish analysis workflows for Infinium microarrays and various types of bisulfite sequencing. RnBeads 2.0 (https://rnbeads.org/) provides additional data types and analysis methods, new functionality for interpreting DNA methylation differences, improved usability with a novel graphical user interface, and better use of computational resources. We demonstrate RnBeads 2.0 in four re-runnable use cases focusing on cell differentiation and cancer.
|
27
|
Approaches for creating comparable measures of alcohol use symptoms: Harmonization with eight studies of criminal justice populations. Drug Alcohol Depend 2019; 194:59-68. [PMID: 30412898 PMCID: PMC6312501 DOI: 10.1016/j.drugalcdep.2018.10.003] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 09/04/2018] [Revised: 10/16/2018] [Accepted: 10/17/2018] [Indexed: 11/28/2022]
Abstract
BACKGROUND With increasing data archives comprised of studies with similar measurement, optimal methods for data harmonization and measurement scoring are a pressing need. We compare three methods for harmonizing and scoring the AUDIT as administered with minimal variation across 11 samples from eight study sites within the STTR (Seek-Test-Treat-Retain) Research Harmonization Initiative. Descriptive statistics and predictive validity results for cut-scores, sum scores, and Moderated Nonlinear Factor Analysis scores (MNLFA; a psychometric harmonization method) are presented. METHODS Across the eight study sites, sample sizes ranged from 50 to 2405 and target populations varied based on sampling frame, location, and inclusion/exclusion criteria. The pooled sample included 4667 participants (82% male, 52% Black, 24% White, 13% Hispanic, and 8% Asian/Pacific Islander; mean age of 38.9 years). Participants completed the AUDIT at baseline in all studies. RESULTS After logical harmonization of items, we scored the AUDIT using three methods: published cut-scores, sum scores, and MNLFA. We found greater variation, fewer floor effects, and the ability to directly address missing data in MNLFA scores as compared to cut-scores and sum scores. MNLFA scores showed stronger associations with binge drinking and clearer study differences than did other scores. CONCLUSIONS MNLFA scores are a promising tool for data harmonization and scoring in pooled data analysis. Model complexity with large multi-study applications, however, may require new statistical advances to fully realize the benefits of this approach.
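The two simpler scoring methods can be sketched in a few lines. The sketch assumes the standard 10-item AUDIT with each item scored 0 to 4, and it uses a threshold of 8 as an illustrative cut-score; the abstract does not state which published cut-scores were applied, so that value and the function names are assumptions.

```python
from typing import Optional, Sequence

def audit_sum_score(items: Sequence[Optional[int]]) -> Optional[int]:
    """Sum of the 10 AUDIT items (each 0-4); None if any item is missing."""
    if len(items) != 10:
        raise ValueError("the AUDIT has 10 items")
    if any(v is None for v in items):
        return None  # a plain sum score cannot handle missing items directly
    if any(not 0 <= v <= 4 for v in items):
        raise ValueError("each item must be scored 0-4")
    return sum(items)

def audit_cut_score(total: Optional[int], threshold: int = 8) -> Optional[bool]:
    """Dichotomize a sum score; the threshold of 8 is an assumed, illustrative value."""
    return None if total is None else total >= threshold

complete = [2, 1, 1, 0, 0, 1, 2, 0, 0, 2]
partial = [2, 1, None, 0, 0, 1, 2, 0, 0, 2]
print(audit_sum_score(complete), audit_cut_score(audit_sum_score(complete)))
print(audit_sum_score(partial), audit_cut_score(audit_sum_score(partial)))
```

The second respondent receives no score at all, which illustrates the missing-data limitation that model-based scores such as MNLFA are reported to address. Dichotomizing also discards variation among respondents on the same side of the threshold.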
|
28
|
An ontology-guided semantic data integration framework to support integrative data analysis of cancer survival. BMC Med Inform Decis Mak 2018; 18:41. [PMID: 30066664 PMCID: PMC6069766 DOI: 10.1186/s12911-018-0636-4] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND Cancer is the second leading cause of death in the United States, exceeded only by heart disease. Extant cancer survival analyses have primarily focused on individual-level factors due to limited data availability from a single data source. There is a need to integrate data from different sources to simultaneously study as many risk factors as possible. Thus, we proposed an ontology-based approach to integrate heterogeneous datasets addressing key data integration challenges. METHODS Following best practices in ontology engineering, we created the Ontology for Cancer Research Variables (OCRV) adapting existing semantic resources such as the National Cancer Institute (NCI) Thesaurus. Using the global-as-view data integration approach, we created mapping axioms to link the data elements in different sources to OCRV. Implemented upon the Ontop platform, we built a data integration pipeline to query, extract, and transform data in relational databases using semantic queries into a pooled dataset according to the downstream multi-level Integrative Data Analysis (IDA) needs. RESULTS Based on our use cases in the cancer survival IDA, we created tailored ontological structures in OCRV to facilitate the data integration tasks. Specifically, we created a flexible framework addressing key integration challenges: (1) using a shared, controlled vocabulary to make data understandable to both humans and computers, (2) explicitly modeling the semantic relationships so that the data can be computed and reasoned with, (3) linking patients to contextual and environmental factors through geographic variables, and (4) documenting the data manipulation and integration processes clearly in the ontologies.
CONCLUSIONS Using an ontology-based data integration approach not only standardizes the definitions of data variables through a common, controlled vocabulary, but also makes the semantic relationships among variables from different sources explicit and clear to all users of the same datasets. Such an approach resolves the ambiguity in variable selection, extraction, and integration processes and thus improves the reproducibility of the IDA.
|
29
|
Understanding Who Benefits from Parenting Interventions for Children's Conduct Problems: an Integrative Data Analysis. Prev Sci 2018; 19:579-588. [PMID: 29349546 PMCID: PMC5899103 DOI: 10.1007/s11121-018-0864-y] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Indexed: 11/28/2022]
Abstract
Parenting interventions are an effective strategy to reduce children's conduct problems, but not all families benefit equally. Individual trials tend to be underpowered and often lack the variability needed to differentiate between families who benefit less or more. Integrating individual family level data across trials, we aimed to provide more conclusive results about often presumed key family (parental education and ethnic background) and child characteristics (problem severity, ADHD symptoms and emotional problems) as putative moderators of parenting intervention effects. We included data from 786 families (452 intervention; 334 control) from all four trials on the Incredible Years parenting intervention in The Netherlands (three randomized; one matched control). Children ranged between 2 and 10 years (M = 5.79; SD = 1.66). Of the families, 31% had a lower educational level and 29% had an ethnic minority background. Using multilevel regression, we tested whether each of the putative moderators affected intervention effects. Incredible Years reduced children's conduct problems (d = -.34). There were no differential effects by families' educational or ethnic background, or by children's level of ADHD symptoms. Children with more severe conduct problems and those with more emotional problems benefited more. Post hoc sensitivity analyses showed that for the two trials with longer-term data, moderation effects disappeared at 4 or 12 months follow-up. Often-assumed moderators have some, but limited, ability to explain who benefits from parenting interventions. This suggests the need for studying theoretically more precise moderators in prevention research, other than relatively static family characteristics alone.
|
30
|
Using Integrative Data Analysis to Examine Changes in Alcohol Use and Changes in Sexual Risk Behavior Across Four Samples of STI Clinic Patients. Ann Behav Med 2018; 51:39-56. [PMID: 27550626 DOI: 10.1007/s12160-016-9826-4] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 10/21/2022] Open
Abstract
BACKGROUND Patients in sexually transmitted infection (STI) clinics report high levels of alcohol use, which are associated with risky sexual behavior. However, no studies have examined how changes in alcohol use relate to changes in sexual risk behavior. PURPOSE We used parallel process latent growth modeling to explore how changes in alcohol use related to changes in sexual behavior across four samples of clinic patients. METHODS Patients participating in HIV prevention trials from urban clinics in the Northeastern and Midwestern USA (N = 3761, 59% male, 72% Black) completed measures at 3-month intervals over 9-12 months. Integrative data analysis was used to create composite measures of alcohol use across samples. Sexual risk measures were counts of partners and unprotected sex acts. Parallel process models tested whether alcohol use changes were correlated with changes in the number of partners and unprotected sex. RESULTS Growth models with good fit showed decreases that slowed over time in sexual risk behaviors and alcohol use. Parallel process models showed positive correlations between levels of (rs = 0.17-0.40, ps < 0.001) and changes in (rs = 0.21-0.80, ps < 0.05) alcohol use and number of sexual partners across studies. There were strong associations between levels of (rs = 0.25-0.43, ps < 0.001) and changes in (rs = 0.24-0.57, ps < 0.01) alcohol use and unprotected sex in one study recruiting hazardous drinkers. CONCLUSIONS Across four samples of clinic patients, reductions in alcohol use were associated with reductions in the number of sexual partners. HIV prevention interventions may be strengthened by addressing alcohol use.
|
31
|
An integrative data analysis of gender differences in children's understanding of mathematical equivalence. J Exp Child Psychol 2017; 163:140-150. [PMID: 28705552 DOI: 10.1016/j.jecp.2017.06.002] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 07/07/2016] [Revised: 05/26/2017] [Accepted: 06/05/2017] [Indexed: 11/29/2022]
Abstract
This study examined gender as a potential source of variation in children's formal understanding of mathematical equivalence. The hypothesis was that girls would perform more poorly than boys. An integrative data analysis was conducted with 960 second and third graders across 14 previously conducted studies of children's understanding of mathematical equivalence. Measures included problem solving, problem encoding, and equal sign definition. Overall, children performed poorly on all measures. As predicted, girls were less likely than boys to solve mathematical equivalence problems correctly, even though there were no gender differences in calculation accuracy. In addition, girls were more likely than boys to use the "add-all" strategy, an incorrect strategy that has been shown to be more resistant to change than other incorrect strategies. There were no statistically significant differences for encoding or defining the equal sign, suggesting that deficits may reflect girls' tendency to follow taught algorithms.
|
32
|
Modeling Pathways of Character Development across the First Three Decades of Life: An Application of Integrative Data Analysis Techniques to Understanding the Development of Hopeful Future Expectations. J Youth Adolesc 2017; 46:1216-1237. [PMID: 28332053 DOI: 10.1007/s10964-017-0660-1] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 02/22/2017] [Accepted: 03/08/2017] [Indexed: 11/28/2022]
Abstract
There were two purposes of the present research: first, to add to scholarship about a key character virtue, hopeful future expectations; and second, to demonstrate a recent innovation in longitudinal methodology that may be especially useful in enhancing the understanding of the developmental course of hopeful future expectations and other character virtues that have been the focus of recent scholarship in youth development. Burgeoning interest in character development has led to a proliferation of short-term, longitudinal studies on character. These data sets are sometimes limited in their ability to model character development trajectories due to low power or relatively brief time spans assessed. However, the integrative data analysis approach allows researchers to pool raw data across studies in order to fit one model to an aggregated data set. The purpose of this article is to demonstrate the promises and challenges of this new tool for modeling character development. We used data from four studies evaluating youth character strengths in different settings to fit latent growth curve models of hopeful future expectations from participants aged 7 through 26 years. We describe the analytic strategy for pooling the data and modeling the growth curves. Implications for future research are discussed in regard to the advantages of integrative data analysis. Finally, we discuss issues researchers should consider when applying these techniques in their own work.
|
33
|
Abstract
Since the first report of a genome-wide association study (GWAS) on human age-related macular degeneration, GWAS has successfully been used to discover genetic variants for a variety of complex human diseases and/or traits, and thousands of associated loci have been identified. However, the underlying mechanisms for these loci remain largely unknown. To make these GWAS findings more useful, it is necessary to perform in-depth data mining. The data analysis in the post-GWAS era will include the following aspects: fine-mapping of susceptibility regions to identify susceptibility genes for elucidating the biological mechanism of action; joint analysis of susceptibility genes in different diseases; integration of GWAS, transcriptome, and epigenetic data to analyze expression and methylation quantitative trait loci at the whole-genome level, and find single-nucleotide polymorphisms that influence gene expression and DNA methylation; and genome-wide association analysis of disease-related DNA copy number variations. Applying these strategies and methods will serve to strengthen GWAS data to enhance the utility and significance of GWAS in improving understanding of the genetics of complex diseases or traits and translate these findings for clinical applications.
|
34
|
Participant-level meta-analysis of mobile phone-based interventions for smoking cessation across different countries. Prev Med 2016; 89:90-97. [PMID: 27154349 PMCID: PMC4969103 DOI: 10.1016/j.ypmed.2016.05.002] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Received: 12/11/2015] [Revised: 04/14/2016] [Accepted: 05/01/2016] [Indexed: 11/28/2022]
Abstract
With meta-analysis, participant-level data from five text messaging-based smoking cessation intervention studies were pooled to investigate cessation patterns across studies and participants. Individual participant data (N=8315) collected in New Zealand (2001-2003; n=1705), U.K. (2008-2009; n=5792), U.S. (2012; n=503; n=164) and Turkey (2012; n=151) were collectively analyzed in 2014. The primary outcome was self-reported 7-day continuous abstinence at 4 weeks post-quit day. Secondary outcomes were: (1) self-reported 7-day continuous abstinence at 3 months and (2) self-reported continuous abstinence at 6 months post-quit day. Generalized linear mixed models were fit to estimate the overall treatment effect, while accounting for clustering within individual studies. Estimates were adjusted for age, sex, socioeconomic status, previous quit attempts, and baseline Fagerstrom score. Analyses were intention to treat. Participants lost to follow-up were treated as smokers. Twenty-nine percent of intervention participants and 12% of control participants quit smoking at 4 weeks (adjusted odds ratio [aOR]=2.89, 95% CI [2.57, 3.26], p<.0001). An attenuated but significant effect for cessation for those in the intervention versus control groups was observed at 3 months (aOR=1.88, 95% CI [1.53, 2.31]) and 6 months (aOR=2.24, 95% CI [1.90, 2.64]). Subgroup analyses were conducted but few significant findings were noted. Text messaging-based smoking cessation programs increase self-reported quitting rates across a diversity of countries and cultures. Efforts to expand these low-cost and scalable programs, along with ongoing evaluation, appear warranted.
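As a rough plausibility check on the 4-week result, the crude (unadjusted) odds ratio implied by the two reported quit rates can be computed directly. The sketch below is illustrative only: the function names are hypothetical, it uses the rounded percentages from the abstract rather than participant-level data, and it does not reproduce the covariate-adjusted mixed model the authors fit.

```python
def odds(p: float) -> float:
    """Convert a probability (0 < p < 1) to odds."""
    if not 0.0 < p < 1.0:
        raise ValueError("probability must lie strictly between 0 and 1")
    return p / (1.0 - p)


def crude_odds_ratio(p_intervention: float, p_control: float) -> float:
    """Unadjusted odds ratio from two group-level proportions."""
    return odds(p_intervention) / odds(p_control)


if __name__ == "__main__":
    # Rounded 4-week quit rates quoted in the abstract: 29% vs 12%.
    print(round(crude_odds_ratio(0.29, 0.12), 2))
```

Under these assumptions the crude value comes out at roughly 3.0, close to the reported adjusted estimate of 2.89; the small gap is what one would expect from covariate adjustment and rounding of the percentages.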
|
35
|
Network-based analysis for identification of candidate genes for colorectal cancer progression. Biochem Biophys Res Commun 2016; 476:534-540. [PMID: 27255996 DOI: 10.1016/j.bbrc.2016.05.158] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 05/19/2016] [Accepted: 05/29/2016] [Indexed: 01/10/2023]
Abstract
Although high-throughput biological technologies have been producing a vast amount of multi-omics data regarding cancer genomics and several disease susceptible genes have been reported, many of these genes are likely to be irrelevant for the cancer process because only one feature of the tumor pathway could be focused on. By identifying 'CpG core', which was extracted from CpG sites in genomic DNA by our newly developed method, we performed integrated analysis using gene expression and DNA methylation profiles of 116 colorectal cancer samples. First, based on gene expression values, colorectal cancer samples were divided into three clusters (Cluster-1, -2, and -3) by k-means clustering. The 5-year overall survival rates of colorectal cancer patients were 74.8%, 29.2%, and 29.4% in Cluster-1, -2, and -3, respectively, and the prognosis of Cluster-2 was significantly poorer than that of the other two clusters owing to liver metastasis (P < 0.001). Second, each cluster was divided into two subgroups based on methylation status, and the 5-year overall survival rate of Cluster-1H (36.8%) was significantly lower than that of Cluster-1L (96.1%) due to the accumulation of aberrant DNA methylation (P = 0.014). Third, network-based analysis using expression and methylation profiles demonstrated that nucleoporin family genes were downregulated in Cluster-2 and that the PTX3 gene was highly methylated in Cluster-1H. These combined data indicate that integrated analysis can identify disease characteristics that would be missed using single comprehensive analysis, and that multiple pathways would play pivotal roles in the liver metastasis of colorectal cancer.
|
36
|
Reproducibility and differential item functioning of the alcohol dependence syndrome construct across four alcohol treatment studies: An integrative data analysis. Drug Alcohol Depend 2016; 158:86-93. [PMID: 26613839 PMCID: PMC4698096 DOI: 10.1016/j.drugalcdep.2015.11.001] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 07/19/2015] [Revised: 11/03/2015] [Accepted: 11/04/2015] [Indexed: 10/22/2022]
Abstract
BACKGROUND The validity of the alcohol dependence syndrome has been supported. The question of whether different measures of the construct are comparable across studies and patient subgroups has not been examined. This study examined the alcohol dependence construct across four diverse large-scale treatment samples using integrative data analysis (IDA). METHOD We utilized existing data (n=4393) from the COMBINE Study, Project MATCH, the Relapse Replication and Extension Project (RREP), and the United Kingdom Alcohol Treatment Trial (UKATT). We focused on four measures of alcohol dependence: the Alcohol Dependence Scale (COMBINE and RREP), Alcohol Use Inventory (MATCH), the Leeds Dependence Questionnaire (UKATT), and the Diagnostic and Statistical Manual of Mental Disorders (COMBINE and MATCH). Moderated nonlinear factor analysis was used to create a measure of alcohol dependence severity that was moderated by study membership, gender, age, and marital status. RESULTS A commensurate measure of alcohol dependence severity was successfully created using 20 items available in four studies. We identified differential item functioning by study membership, age, gender, and/or marital status for 12 of the 20 items, indicating specific patient subgroups who responded differently to items based on their underlying dependence severity. CONCLUSIONS Alcohol dependence severity is a single unidimensional construct that is comparable across studies. The use of IDA provided a strong test of the validity of the alcohol dependence syndrome and clues as to how some items used to measure dependence severity may be more or less central to the construct for some patients.
|
37
|
Size and consistency of problem-solving consultation outcomes: an empirical analysis. J Sch Psychol 2015; 53:161-78. [PMID: 25746825 DOI: 10.1016/j.jsp.2015.01.001] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Received: 01/23/2013] [Revised: 01/17/2015] [Accepted: 01/24/2015] [Indexed: 12/14/2022]
Abstract
In this study, we analyzed extant data to evaluate the variability and magnitude of students' behavior change outcomes (academic, social, and behavioral) produced by consultants through problem-solving consultation with teachers. Research questions were twofold: (a) Do consultants produce consistent and sizeable positive student outcomes across their cases as measured through direct and frequent assessment? and (b) What proportion of variability in student outcomes is attributable to consultants? Extant data were drawn from problem-solving consultation outcome studies that used single-case, time-series AB designs with multiple participants. Four such studies ultimately met the inclusion criteria for the extant data, comprising 124 consultants who worked with 302 school teachers regarding 453 individual students. Consultants constituted the independent variable, while the primary dependent variable was a descriptive effect size based on student behavior change as measured by (a) curriculum-based measures, (b) permanent products, or (c) direct observations. Primary analyses involved visual and statistical evaluation of effect size magnitude and variability observed within and between consultants and studies. Given the nested nature of the data, multilevel analyses were used to assess consultant effects on student outcomes. Results suggest that consultants consistently produced positive effect sizes on average across their cases, but outcomes varied between consultants. Findings also indicated that consultants, teachers, and the corresponding studies accounted for a significant proportion of variability in student outcomes. This investigation advances the use of multilevel and integrative data analyses to evaluate consultation outcomes and extends research on problem-solving consultation, consultant effects, and meta-analysis of case study AB designs. Practical implications for evaluating consultation service delivery in school settings are also discussed.
|