1
Edwards L, Pickett J, Ashcroft DM, Dambha-Miller H, Majeed A, Mallen C, Petersen I, Qureshi N, van Staa T, Abel G, Carvalho C, Denholm R, Kontopantelis E, Macaulay A, Macleod J. UK research data resources based on primary care electronic health records: review and summary for potential users. BJGP Open 2023; 7:BJGPO.2023.0057. [PMID: 37429634 PMCID: PMC10646196 DOI: 10.3399/bjgpo.2023.0057] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/02/2023] [Revised: 06/12/2023] [Accepted: 07/07/2023] [Indexed: 07/12/2023]
Abstract
BACKGROUND The range and scope of electronic health record (EHR) data assets in the UK has recently increased, which has been mainly in response to the COVID-19 pandemic. Summarising and comparing the large primary care resources will help researchers to choose the data resources most suited to their needs.
AIM To describe the current landscape of UK EHR databases and considerations of access and use of these resources relevant to researchers.
DESIGN & SETTING Narrative review of EHR databases in the UK.
METHOD Information was collected from the Health Data Research Innovation Gateway, publicly available websites and other published data, and from key informants. The eligibility criteria were population-based open-access databases sampling EHRs across the whole population of one or more countries in the UK. Published database characteristics were extracted and summarised, and these were corroborated with resource providers. Results were synthesised narratively.
RESULTS Nine large national primary care EHR data resources were identified and summarised. These resources are enhanced by linkage to other administrative data to a varying extent. Resources are mainly intended to support observational research, although some can support experimental studies. There is considerable overlap of populations covered. While all resources are accessible to bona fide researchers, access mechanisms, costs, timescales, and other considerations vary across databases.
CONCLUSION Researchers are currently able to access primary care EHR data from several sources. Choice of data resource is likely to be driven by project needs and access considerations. The landscape of data resources based on primary care EHRs in the UK continues to evolve.
Affiliation(s)
- Darren M Ashcroft
  - Centre for Pharmacoepidemiology and Drug Safety, NIHR Greater Manchester Patient Safety Translational Research Centre, School of Health Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, UK
- Azeem Majeed
  - Primary Care and Public Health, Imperial College London, London, UK
- Irene Petersen
  - Department of Primary Care & Population Health, Institute of Epidemiology & Health, University College London, London, UK
- Nadeem Qureshi
  - Centre for Academic Primary Care, University of Nottingham, Nottingham, UK
- Tjeerd van Staa
  - Health eResearch Centre, University of Manchester, Manchester, UK
- Gary Abel
  - Department of Health and Community Sciences (Medical School), Faculty of Health and Life Sciences, University of Exeter, Exeter, UK
- Chris Carvalho
  - Clinical Effectiveness Group, Queen Mary University of London, London, UK
- Rachel Denholm
  - Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
  - Centre for Academic Primary Care, University of Bristol, Bristol, UK
  - NIHR Bristol Biomedical Research Centre, Bristol, UK
  - Health Data Research UK South-West, Bristol, UK
  - NIHR Applied Research Collaboration (ARC) West, Bristol, UK
- Evangelos Kontopantelis
  - Division of Informatics, Imaging and Data Sciences, University of Manchester, Manchester, UK
- John Macleod
  - Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
  - NIHR Applied Research Collaboration (ARC) West, Bristol, UK
2
Kirk D, Kok E, Tufano M, Tekinerdogan B, Feskens EJM, Camps G. Machine Learning in Nutrition Research. Adv Nutr 2022; 13:2573-2589. [PMID: 36166846 PMCID: PMC9776646 DOI: 10.1093/advances/nmac103] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Received: 06/11/2022] [Revised: 08/02/2022] [Accepted: 09/22/2022] [Indexed: 01/29/2023]
Abstract
Data currently generated in the field of nutrition are becoming increasingly complex and high-dimensional, bringing with them new methods of data analysis. The characteristics of machine learning (ML) make it suitable for such analysis and thus lend itself as an alternative tool to deal with data of this nature. ML has already been applied in important problem areas in nutrition, such as obesity, metabolic health, and malnutrition. Despite this, experts in nutrition are often without an understanding of ML, which limits its application and therefore potential to solve currently open questions. The current article aims to bridge this knowledge gap by supplying nutrition researchers with a resource to facilitate the use of ML in their research. ML is first explained and distinguished from existing solutions, with key examples of applications in the nutrition literature provided. Two case studies of domains in which ML is particularly applicable, precision nutrition and metabolomics, are then presented. Finally, a framework is outlined to guide interested researchers in integrating ML into their work. By acting as a resource to which researchers can refer, we hope to support the integration of ML in the field of nutrition to facilitate modern research.
Affiliation(s)
- Daniel Kirk
  - Division of Human Nutrition and Health, Wageningen University and Research, Wageningen, The Netherlands
- Esther Kok
  - Division of Human Nutrition and Health, Wageningen University and Research, Wageningen, The Netherlands
- Michele Tufano
  - Division of Human Nutrition and Health, Wageningen University and Research, Wageningen, The Netherlands
- Bedir Tekinerdogan
  - Information Technology Group, Wageningen University and Research, Wageningen, The Netherlands
- Edith J M Feskens
  - Division of Human Nutrition and Health, Wageningen University and Research, Wageningen, The Netherlands
- Guido Camps
  - Division of Human Nutrition and Health, Wageningen University and Research, Wageningen, The Netherlands
  - OnePlanet Research Center, Wageningen, The Netherlands
3
Loef B, Wong A, Janssen NAH, Strak M, Hoekstra J, Picavet HSJ, Boshuizen HCH, Verschuren WMM, Herber GCM. Using random forest to identify longitudinal predictors of health in a 30-year cohort study. Sci Rep 2022; 12:10372. [PMID: 35725920 PMCID: PMC9209521 DOI: 10.1038/s41598-022-14632-w] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 10/27/2021] [Accepted: 06/09/2022] [Indexed: 11/09/2022]
Abstract
Due to the wealth of exposome data from longitudinal cohort studies that is currently available, the need for methods to adequately analyze these data is growing. We propose an approach in which machine learning is used to identify longitudinal exposome-related predictors of health, and illustrate its potential through an application. Our application involves studying the relation between exposome and self-perceived health based on the 30-year running Doetinchem Cohort Study. Random Forest (RF) was used to identify the strongest predictors due to its favorable prediction performance in prior research. The relation between predictors and outcome was visualized with partial dependence and accumulated local effects plots. To facilitate interpretation, exposures were summarized by expressing them as the average exposure and average trend over time. The RF model's ability to discriminate poor from good self-perceived health was acceptable (Area-Under-the-Curve = 0.707). Nine exposures from different exposome-related domains were largely responsible for the model's performance, while 87 exposures seemed to contribute little to the performance. Our approach demonstrates that ML can be interpreted more than widely believed, and can be applied to identify important longitudinal predictors of health over the life course in studies with repeated measures of exposure. The approach is context-independent and broadly applicable.
Affiliation(s)
- Bette Loef
  - Center for Nutrition, Prevention and Health Services, National Institute for Public Health and the Environment, P.O. Box 1, 3720 BA, Bilthoven, The Netherlands
- Albert Wong
  - Center for Nutrition, Prevention and Health Services, National Institute for Public Health and the Environment, P.O. Box 1, 3720 BA, Bilthoven, The Netherlands
- Nicole A H Janssen
  - Center for Nutrition, Prevention and Health Services, National Institute for Public Health and the Environment, P.O. Box 1, 3720 BA, Bilthoven, The Netherlands
- Maciek Strak
  - Center for Nutrition, Prevention and Health Services, National Institute for Public Health and the Environment, P.O. Box 1, 3720 BA, Bilthoven, The Netherlands
- Jurriaan Hoekstra
  - Center for Nutrition, Prevention and Health Services, National Institute for Public Health and the Environment, P.O. Box 1, 3720 BA, Bilthoven, The Netherlands
- H Susan J Picavet
  - Center for Nutrition, Prevention and Health Services, National Institute for Public Health and the Environment, P.O. Box 1, 3720 BA, Bilthoven, The Netherlands
- H C Hendriek Boshuizen
  - Center for Nutrition, Prevention and Health Services, National Institute for Public Health and the Environment, P.O. Box 1, 3720 BA, Bilthoven, The Netherlands
  - Wageningen University and Research, Wageningen, The Netherlands
- W M Monique Verschuren
  - Center for Nutrition, Prevention and Health Services, National Institute for Public Health and the Environment, P.O. Box 1, 3720 BA, Bilthoven, The Netherlands
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Gerrie-Cor M Herber
  - Center for Nutrition, Prevention and Health Services, National Institute for Public Health and the Environment, P.O. Box 1, 3720 BA, Bilthoven, The Netherlands