1
Cheng C, Messerschmidt L, Bravo I, Waldbauer M, Bhavikatti R, Schenk C, Grujic V, Model T, Kubinec R, Barceló J. A General Primer for Data Harmonization. Sci Data 2024; 11:152. [PMID: 38297013 PMCID: PMC10831085 DOI: 10.1038/s41597-024-02956-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2023] [Accepted: 01/11/2024] [Indexed: 02/02/2024] Open
Affiliation(s)
- Cindy Cheng
- Hochschule für Politik, Technical University of Munich, Richard-Wagner Str. 1, Munich, 80333, Bavaria, Germany
- Luca Messerschmidt
- Hochschule für Politik, Technical University of Munich, Richard-Wagner Str. 1, Munich, 80333, Bavaria, Germany
- Isaac Bravo
- Hochschule für Politik, Technical University of Munich, Richard-Wagner Str. 1, Munich, 80333, Bavaria, Germany
- Marco Waldbauer
- Hochschule für Politik, Technical University of Munich, Richard-Wagner Str. 1, Munich, 80333, Bavaria, Germany
- Caress Schenk
- School of Humanities and Social Sciences, Nazarbayev University, Kabanbay Batry Ave., 53, Astana, 010000, Kazakhstan
- Vanja Grujic
- Faculty of Law, University of Brasilia, Campus Universitário Darcy Ribeiro Asa Norte, Brasília, 10587, Brazil
- Tim Model
- Delve, 2225 3rd St, San Francisco, 94107, California, USA
- Robert Kubinec
- Division of Social Science, New York University Abu Dhabi, Social Science Building (A5), Abu Dhabi, 129188, United Arab Emirates
- Joan Barceló
- Division of Social Science, New York University Abu Dhabi, Social Science Building (A5), Abu Dhabi, 129188, United Arab Emirates
2
Fortier I, Wey TW, Bergeron J, Pinot de Moira A, Nybo-Andersen AM, Bishop T, Murtagh MJ, Miočević M, Swertz MA, van Enckevort E, Marcon Y, Mayrhofer MT, Ornelas JP, Sebert S, Santos AC, Rocha A, Wilson RC, Griffith LE, Burton P. Life course of retrospective harmonization initiatives: key elements to consider. J Dev Orig Health Dis 2023; 14:190-198. [PMID: 35957574 DOI: 10.1017/s2040174422000460] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 11/06/2022]
Abstract
Optimizing research on the developmental origins of health and disease (DOHaD) involves implementing initiatives maximizing the use of the available cohort study data; achieving sufficient statistical power to support subgroup analysis; and using participant data presenting adequate follow-up and exposure heterogeneity. It also involves being able to undertake comparison, cross-validation, or replication across data sets. To answer these requirements, cohort study data need to be findable, accessible, interoperable, and reusable (FAIR), and more particularly, it often needs to be harmonized. Harmonization is required to achieve or improve comparability of the putatively equivalent measures collected by different studies on different individuals. Although the characteristics of the research initiatives generating and using harmonized data vary extensively, all are confronted by similar issues. Having to collate, understand, process, host, and co-analyze data from individual cohort studies is particularly challenging. The scientific success and timely management of projects can be facilitated by an ensemble of factors. The current document provides an overview of the 'life course' of research projects requiring harmonization of existing data and highlights key elements to be considered from the inception to the end of the project.
Affiliation(s)
- Isabel Fortier
- Research Institute of the McGill University Health Centre, Montreal, QC, Canada
- Tina W Wey
- Research Institute of the McGill University Health Centre, Montreal, QC, Canada
- Julie Bergeron
- Research Institute of the McGill University Health Centre, Montreal, QC, Canada
- Tom Bishop
- Epidemiology Unit, University of Cambridge, England, UK
- Madeleine J Murtagh
- School of Social and Political Sciences, University of Glasgow, Scotland, UK
- Milica Miočević
- Department of Psychology, McGill University, Montreal, QC, Canada
- Morris A Swertz
- University Medical Center Groningen, University of Groningen, Netherlands
- Esther van Enckevort
- Department of Genetics, University Medical Center Groningen, University of Groningen, Netherlands
- José Pedro Ornelas
- INESC TEC - Institute for Systems and Computer Engineering, Technology and Science, Porto, Portugal
- Ana Cristina Santos
- Department of Epidemiology, Institute of Public Health of the University of Porto, Portugal
- Artur Rocha
- INESC TEC - Institute for Systems and Computer Engineering, Technology and Science, Porto, Portugal
- Rebecca C Wilson
- Department of Public Health, Policy and Systems, University of Liverpool, Liverpool, England, UK
- Lauren E Griffith
- Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada
- Paul Burton
- Population Health Sciences Institute, Newcastle University, Newcastle-upon-Tyne, England, UK
3
Abbasizanjani H, Torabi F, Bedston S, Bolton T, Davies G, Denaxas S, Griffiths R, Herbert L, Hollings S, Keene S, Khunti K, Lowthian E, Lyons J, Mizani MA, Nolan J, Sudlow C, Walker V, Whiteley W, Wood A, Akbari A. Harmonising electronic health records for reproducible research: challenges, solutions and recommendations from a UK-wide COVID-19 research collaboration. BMC Med Inform Decis Mak 2023; 23:8. [PMID: 36647111 PMCID: PMC9842203 DOI: 10.1186/s12911-022-02093-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2022] [Accepted: 12/21/2022] [Indexed: 01/18/2023] Open
Abstract
BACKGROUND The CVD-COVID-UK consortium was formed to understand the relationship between COVID-19 and cardiovascular diseases through analyses of harmonised electronic health records (EHRs) across the four UK nations. Beyond COVID-19, data harmonisation and common approaches enable analysis within and across independent Trusted Research Environments. Here we describe the reproducible harmonisation method developed using large-scale EHRs in Wales to accommodate the fast and efficient implementation of cross-nation analysis in England and Wales as part of the CVD-COVID-UK programme. We characterise current challenges and share lessons learnt. METHODS Serving the scope and scalability of multiple study protocols, we used linked, anonymised individual-level EHR, demographic and administrative data held within the SAIL Databank for the population of Wales. The harmonisation method was implemented as a four-layer reproducible process, starting from raw data in the first layer. Then each of the layers two to four is framed by, but not limited to, the characterised challenges and lessons learnt. We achieved curated data as part of our second layer, followed by extracting phenotyped data in the third layer. We captured any project-specific requirements in the fourth layer. RESULTS Using the implemented four-layer harmonisation method, we retrieved approximately 100 health-related variables for the 3.2 million individuals in Wales, which are harmonised with corresponding variables for > 56 million individuals in England. We processed 13 data sources into the first layer of our harmonisation method: five of these are updated daily or weekly, and the rest at various frequencies providing sufficient data flow updates for frequent capturing of up-to-date demographic, administrative and clinical information. CONCLUSIONS We implemented an efficient, transparent, scalable, and reproducible harmonisation method that enables multi-nation collaborative research. 
With a current focus on COVID-19 and its relationship with cardiovascular outcomes, the harmonised data has supported a wide range of research activities across the UK.
Affiliation(s)
- Hoda Abbasizanjani
- Population Data Science, Swansea University Medical School, Faculty of Medicine, Health and Life Science, Swansea University, Swansea, UK
- Fatemeh Torabi
- Population Data Science, Swansea University Medical School, Faculty of Medicine, Health and Life Science, Swansea University, Swansea, UK
- Stuart Bedston
- Population Data Science, Swansea University Medical School, Faculty of Medicine, Health and Life Science, Swansea University, Swansea, UK
- Thomas Bolton
- British Heart Foundation Data Science Centre, Health Data Research UK, London, UK
- Gareth Davies
- Population Data Science, Swansea University Medical School, Faculty of Medicine, Health and Life Science, Swansea University, Swansea, UK
- Spiros Denaxas
- British Heart Foundation Data Science Centre, Health Data Research UK, London, UK
- Institute of Health Informatics, University College London, London, UK
- Rowena Griffiths
- Population Data Science, Swansea University Medical School, Faculty of Medicine, Health and Life Science, Swansea University, Swansea, UK
- Laura Herbert
- Population Data Science, Swansea University Medical School, Faculty of Medicine, Health and Life Science, Swansea University, Swansea, UK
- Spencer Keene
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Kamlesh Khunti
- Diabetes Research Centre, University of Leicester, Leicester, UK
- Emily Lowthian
- Population Data Science, Swansea University Medical School, Faculty of Medicine, Health and Life Science, Swansea University, Swansea, UK
- Jane Lyons
- Population Data Science, Swansea University Medical School, Faculty of Medicine, Health and Life Science, Swansea University, Swansea, UK
- Mehrdad A Mizani
- British Heart Foundation Data Science Centre, Health Data Research UK, London, UK
- John Nolan
- British Heart Foundation Data Science Centre, Health Data Research UK, London, UK
- Cathie Sudlow
- British Heart Foundation Data Science Centre, Health Data Research UK, London, UK
- Venexia Walker
- Department of Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
- William Whiteley
- Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, UK
- Angela Wood
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Ashley Akbari
- Population Data Science, Swansea University Medical School, Faculty of Medicine, Health and Life Science, Swansea University, Swansea, UK
4
Timmermans EJ, Leeuwis AE, Bots ML, van Alphen JL, Biessels GJ, Brunner-La Rocca HP, Kappelle LJ, van Rossum AC, van Osch MJP, Vaartjes I. Neighbourhood walkability in relation to cognitive functioning in patients with disorders along the heart-brain axis. Health Place 2023; 79:102956. [PMID: 36525834 DOI: 10.1016/j.healthplace.2022.102956] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2022] [Revised: 12/01/2022] [Accepted: 12/05/2022] [Indexed: 12/15/2022]
Abstract
This study examined associations of neighbourhood walkability with cognitive functioning (i.e., global cognition, memory, language, attention-psychomotor speed, and executive functioning) in participants without or with either heart failure, carotid occlusive disease, or vascular cognitive impairment. Neighbourhood walkability at baseline was positively associated with global cognition and attention-psychomotor speed. These associations were stronger in patients with vascular cognitive impairment. Individuals who live in residential areas with higher walkability levels were less likely to have impairments in language and executive functioning at two-year follow-up. These findings highlight the importance of the built environment for cognitive functioning in healthy and vulnerable groups.
Affiliation(s)
- Erik J Timmermans
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands
- Anna E Leeuwis
- Alzheimer Center Amsterdam, Department of Neurology, Amsterdam Neuroscience, Amsterdam UMC, VU University Medical Center, Amsterdam, the Netherlands
- Michiel L Bots
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands
- Juliette L van Alphen
- Alzheimer Center Amsterdam, Department of Neurology, Amsterdam Neuroscience, Amsterdam UMC, VU University Medical Center, Amsterdam, the Netherlands
- Geert Jan Biessels
- Department of Neurology, UMC Utrecht Brain Center, University Medical Center Utrecht, Utrecht, the Netherlands
- L Jaap Kappelle
- Department of Neurology, UMC Utrecht Brain Center, University Medical Center Utrecht, Utrecht, the Netherlands
- Albert C van Rossum
- Department of Cardiology, Amsterdam UMC, VU University Medical Center, Amsterdam, the Netherlands
- Matthias J P van Osch
- C.J. Gorter MRI Center, Department of Radiology, Leiden University Medical Center, Leiden, the Netherlands
- Ilonca Vaartjes
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands
5
Krokstad S, Sund ER, Kvaløy K, Rangul V, Næss M. HUNT for better public health. Scand J Public Health 2022; 50:968-971. [PMID: 36113104 PMCID: PMC9578099 DOI: 10.1177/14034948221102309] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
Abstract
Aims: The Trøndelag Health Study (HUNT) has collected population data through comprehensive decennial surveys over the last four decades and has so far collected data from 240,000 participants. The participants are identified with the unique Norwegian birth number, which enables them to be followed throughout different life stages, from survey to survey, and to endpoint measures in Norwegian national health registers without attrition bias. Methods: The study design of HUNT offers several advantages: it provides an overview of the public health development in the population over decades, the data can be used in health services research, clinical epidemiology, studies of causation, trajectories, and consequences of diseases, and to study gene × environment interactions. Results: HUNT data have shown major shifts in public health trends, such as decreasing mean blood pressure and resting heart rate among adults, increasing prevalence of obesity, geographical and socioeconomic inequalities in health, increasing mental health distress among adolescents and young adults with an opposite development among the elderly. Data from HUNT have been used in several major international research projects, where data harmonization with several other population cohorts internationally has been done. HUNT has placed great emphasis on safeguarding research ethics, privacy, and data security. The Norwegian authorities established national regulations for the surveys from the time General Data Protection Regulation was introduced in 2018. Conclusions: Researchers can apply for HUNT data access from HUNT Research Centre provided they have obtained project approval from the Regional Committee for Medical and Health Research Ethics. Researchers not affiliated to a Norwegian research institution must collaborate with and apply through a Norwegian principal investigator. Information on the application and conditions for data access is available at www.ntnu.edu/hunt/data.
Affiliation(s)
- Steinar Krokstad
- HUNT Research Centre, Department of Public Health and Nursing, Faculty of Medicine and Health Sciences, NTNU, Levanger, Norway
- Levanger Hospital, Nord-Trøndelag Hospital Trust, Levanger, Norway
- Erik R. Sund
- HUNT Research Centre, Department of Public Health and Nursing, Faculty of Medicine and Health Sciences, NTNU, Levanger, Norway
- Levanger Hospital, Nord-Trøndelag Hospital Trust, Levanger, Norway
- Kirsti Kvaløy
- HUNT Research Centre, Department of Public Health and Nursing, Faculty of Medicine and Health Sciences, NTNU, Levanger, Norway
- Levanger Hospital, Nord-Trøndelag Hospital Trust, Levanger, Norway
- Vegar Rangul
- HUNT Research Centre, Department of Public Health and Nursing, Faculty of Medicine and Health Sciences, NTNU, Levanger, Norway
- Faculty of Nursing and Health Sciences, Nord University, Levanger, Norway
- Marit Næss
- HUNT Research Centre, Department of Public Health and Nursing, Faculty of Medicine and Health Sciences, NTNU, Levanger, Norway
- Levanger Hospital, Nord-Trøndelag Hospital Trust, Levanger, Norway
6
Eva G, Liese G, Stephanie B, Petr H, Leslie M, Roel V, Martine V, Sergi B, Mette H, Sarah J, Laura RM, Arnout S, Morris A S, Jan T, Xenia T, Nina V, Koert VE, Sylvie R, Greet S. Position paper on management of personal data in environment and health research in Europe. Environment International 2022; 165:107334. [PMID: 35696847 DOI: 10.1016/j.envint.2022.107334] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/22/2021] [Revised: 05/30/2022] [Accepted: 06/01/2022] [Indexed: 06/15/2023]
Abstract
Management of datasets that include health information and other sensitive personal information of European study participants has to be compliant with the General Data Protection Regulation (GDPR, Regulation (EU) 2016/679). Within scientific research, the widely subscribed 'FAIR' data principles should apply, meaning that research data should be findable, accessible, interoperable and re-usable. Balancing the aim of open science driven FAIR data management with GDPR compliant personal data protection safeguards is now a common challenge for many research projects dealing with (sensitive) personal data. In December 2020 a workshop was held with representatives of several large EU research consortia and of the European Commission to reflect on how to apply the FAIR data principles for environment and health research (E&H). Several recent data intensive EU funded E&H research projects face this challenge and work intensively towards developing solutions to access, exchange, store, handle, share, process and use such sensitive personal data, with the aim to support European and transnational collaborations. As a result, several recommendations, opportunities and current limitations were formulated. New technical developments such as federated data management and analysis systems, machine learning together with advanced search software, harmonized ontologies and data quality standards should in principle facilitate the FAIRification of data. To address ethical, legal, political and financial obstacles to the wider re-use of data for research purposes, both specific expertise and underpinning infrastructure are needed. There is a need for the E&H research data to find their place in the European Open Science Cloud. Communities using health and population data, environmental data and other publicly available data have to interconnect and synergize.
To maximize the use and re-use of environment and health data, a dedicated supporting European infrastructure effort, such as the EIRENE research infrastructure within the ESFRI roadmap 2021, is needed that would interact with existing infrastructures.
Affiliation(s)
- Govarts Eva
- VITO Health, Flemish Institute for Technological Research (VITO), Mol, Belgium
- Gilles Liese
- VITO Health, Flemish Institute for Technological Research (VITO), Mol, Belgium
- Bopp Stephanie
- European Commission, Joint Research Centre (JRC), Ispra, Italy
- Matalonga Leslie
- CNAG-CRG, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Vermeulen Roel
- Institute for Risk Assessment Sciences, Utrecht University, Utrecht, Netherlands
- Vrijheid Martine
- ISGlobal, Barcelona, Spain; Spanish Consortium for Research on Epidemiology and Public Health (CIBERESP), Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Beltran Sergi
- CNAG-CRG, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain; Departament de Genètica, Microbiologia i Estadística, Facultat de Biologia, Universitat de Barcelona (UB), Barcelona, Spain
- Hartlev Mette
- Faculty of Law, University of Copenhagen, Copenhagen, Denmark
- Standaert Arnout
- VITO Health, Flemish Institute for Technological Research (VITO), Mol, Belgium
- Swertz Morris A
- Department of Genetics & Genomics Coordination Center, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
- Theunis Jan
- VITO Health, Flemish Institute for Technological Research (VITO), Mol, Belgium
- Trier Xenia
- European Environment Agency (EEA), Copenhagen, Denmark
- Vogel Nina
- German Environment Agency (UBA), Berlin, Germany
- Remy Sylvie
- VITO Health, Flemish Institute for Technological Research (VITO), Mol, Belgium
- Schoeters Greet
- VITO Health, Flemish Institute for Technological Research (VITO), Mol, Belgium; Department of Biomedical Sciences, University of Antwerp, Antwerp, Belgium