1
Woltmann L, Deepe J, Hartmann C, Lehner W. evalPM: a framework for evaluating machine learning models for particulate matter prediction. ENVIRONMENTAL MONITORING AND ASSESSMENT 2023; 195:1491. [PMID: 37979062 PMCID: PMC10657320 DOI: 10.1007/s10661-023-11996-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2023] [Accepted: 10/22/2023] [Indexed: 11/19/2023]
Abstract
Air pollution through particulate matter (PM) is one of the largest threats to human health. To understand the causes of PM pollution and enact suitable countermeasures, reliable predictions of future PM concentrations are required. In the scientific literature, many methods exist for machine learning (ML)-based PM prediction, though their quality is difficult to compare because, among other things, they use different data sets and evaluate the resulting predictions differently. For a new data set, it is not apparent which of the existing prediction methods is best suited. In order to ease the assessment of said models, we present evalPM, a framework to easily create, evaluate, and compare different ML models for immission-based PM prediction. To achieve this, the framework provides flexibility regarding data sets, input features, target variables, model types, hyperparameters, and model evaluation. It has a modular design consisting of several components, each providing at least one required flexibility. The individual capabilities of the framework are demonstrated using 16 different models from the related literature by means of temporal prediction of PM concentrations for four European data sets, showing the capabilities and advantages of the evalPM framework. In doing so, it is shown that the framework allows fast creation and evaluation of ML-based PM prediction models.
Affiliation(s)
- Lucas Woltmann
  - TU Dresden, Dresden Database Research Group, Dresden, Germany
- Jonas Deepe
  - TU Dresden, Dresden Database Research Group, Dresden, Germany
- Wolfgang Lehner
  - TU Dresden, Dresden Database Research Group, Dresden, Germany
2
Tsai CY, Su CL, Wang YH, Wu SM, Liu WT, Hsu WH, Majumdar A, Stettler M, Chen KY, Lee YT, Hu CJ, Lee KY, Tsuang BJ, Tseng CH. Impact of lifetime air pollution exposure patterns on the risk of chronic disease. ENVIRONMENTAL RESEARCH 2023; 229:115957. [PMID: 37084949 DOI: 10.1016/j.envres.2023.115957] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 09/20/2022] [Revised: 04/17/2023] [Accepted: 04/18/2023] [Indexed: 05/03/2023]
Abstract
Long-term exposure to air pollution can lead to cardiovascular disease, metabolic syndrome, and chronic respiratory disease. However, from a lifetime perspective, the critical period of air pollution exposure in terms of health risk is unknown. This study aimed to evaluate the impact of air pollution exposure at different life stages. The study participants were recruited from community centers in Northern Taiwan between October 2018 and April 2021. Their annual averages for fine particulate matter (PM2.5) exposure were derived from a national visibility database. Lifetime PM2.5 exposures were determined using residential address information and were separated into three stages (<20, 20-40, and >40 years). We employed exponentially weighted moving averages, applying different weights to the aforementioned life stages to simulate various weighting distribution patterns. Regression models were implemented to examine associations between weighting distributions and disease risk. We applied a random forest model to compare the relative importance of the three exposure life stages. We also compared model performance by evaluating the accuracy and F1 scores (the harmonic mean of precision and recall) of late-stage (>40 years) and lifetime exposure models. Models with 89% weighting on late-stage exposure showed significant associations between PM2.5 exposure and metabolic syndrome, hypertension, diabetes, and cardiovascular disease, but not gout or osteoarthritis. Lifetime exposure models showed higher precision, accuracy, and F1 scores for metabolic syndrome, hypertension, diabetes, and cardiovascular disease, whereas late-stage models showed lower performance metrics for these outcomes. We conclude that exposure to high-level PM2.5 after 40 years of age may increase the risk of metabolic syndrome, hypertension, diabetes, and cardiovascular disease. However, models considering lifetime exposure showed higher precision, accuracy, and F1 scores and lower equal error rates than models incorporating only late-stage exposures. Future studies of long-term air pollution modelling should consider lifelong exposure patterns.
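The two quantities named in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, the stage means and weights in the usage lines are made-up illustration values, and a plain weighted average stands in for the exponentially weighted moving average scheme, whose exact form the abstract does not give. The F1 score follows the definition stated in the abstract (harmonic mean of precision and recall).

```python
def weighted_exposure(stage_means, weights):
    """Weighted average of per-life-stage mean PM2.5 exposures.

    Assumption: a plain weighted average, used here only to show how
    shifting weight between life stages changes the exposure summary.
    """
    if len(stage_means) != len(weights):
        raise ValueError("stage_means and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(m * w for m, w in zip(stage_means, weights)) / total_weight


def f1_score(true_pos, false_pos, false_neg):
    """F1 score: the harmonic mean of precision and recall."""
    predicted_pos = true_pos + false_pos
    actual_pos = true_pos + false_neg
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    recall = true_pos / actual_pos if actual_pos else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustration values only (not taken from the study).
late_weighted = weighted_exposure([10.0, 20.0, 30.0], [1, 1, 2])
score = f1_score(true_pos=8, false_pos=2, false_neg=2)
```

Raising the third weight relative to the first two moves the summary toward the late-stage mean, which is the kind of weighting pattern the abstract describes comparing.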
Affiliation(s)
- Cheng-Yu Tsai
  - Department of Civil and Environmental Engineering, Imperial College London, London, SW7 2AZ, United Kingdom; Division of Pulmonary Medicine, Department of Internal Medicine, Shuang Ho Hospital, Taipei Medical University, New Taipei City, 235041, Taiwan
- Chien-Ling Su
  - Division of Pulmonary Medicine, Department of Internal Medicine, Shuang Ho Hospital, Taipei Medical University, New Taipei City, 235041, Taiwan; School of Respiratory Therapy, College of Medicine, Taipei Medical University, Taipei, 110301, Taiwan; Department of Physical Therapy, Shu-Zen Junior College of Medicine and Management, Kaohsiung City, 821004, Taiwan
- Yuan-Hung Wang
  - Graduate Institute of Clinical Medicine, College of Medicine, Taipei Medical University, Taipei, 110301, Taiwan; Department of Medical Research, Shuang Ho Hospital, Taipei Medical University, New Taipei City, 235041, Taiwan
- Sheng-Ming Wu
  - School of Respiratory Therapy, College of Medicine, Taipei Medical University, Taipei, 110301, Taiwan; Division of Pulmonary Medicine, Department of Internal Medicine, School of Medicine, College of Medicine, Taipei Medical University, Taipei, 110301, Taiwan
- Wen-Te Liu
  - Division of Pulmonary Medicine, Department of Internal Medicine, Shuang Ho Hospital, Taipei Medical University, New Taipei City, 235041, Taiwan; School of Respiratory Therapy, College of Medicine, Taipei Medical University, Taipei, 110301, Taiwan; Sleep Center, Shuang Ho Hospital, Taipei Medical University, New Taipei City, Taiwan; Research Center of Artificial Intelligence in Medicine, Taipei Medical University, Taipei, 110301, Taiwan
- Wen-Hua Hsu
  - School of Respiratory Therapy, College of Medicine, Taipei Medical University, Taipei, 110301, Taiwan
- Arnab Majumdar
  - Department of Civil and Environmental Engineering, Imperial College London, London, SW7 2AZ, United Kingdom
- Marc Stettler
  - Department of Civil and Environmental Engineering, Imperial College London, London, SW7 2AZ, United Kingdom
- Kuan-Yuan Chen
  - Division of Pulmonary Medicine, Department of Internal Medicine, Shuang Ho Hospital, Taipei Medical University, New Taipei City, 235041, Taiwan
- Ya-Ting Lee
  - Division of Pulmonary Medicine, Department of Internal Medicine, Shuang Ho Hospital, Taipei Medical University, New Taipei City, 235041, Taiwan
- Chaur-Jong Hu
  - Department of Neurology, Shuang Ho Hospital, Taipei Medical University, New Taipei City, 235041, Taiwan; Department of Neurology, School of Medicine, College of Medicine, Taipei Medical University, Taipei, 11031, Taiwan
- Kang-Yun Lee
  - Division of Pulmonary Medicine, Department of Internal Medicine, Shuang Ho Hospital, Taipei Medical University, New Taipei City, 235041, Taiwan; Division of Pulmonary Medicine, Department of Internal Medicine, School of Medicine, College of Medicine, Taipei Medical University, Taipei, 110301, Taiwan
- Ben-Jei Tsuang
  - Department of Environmental Engineering, National Chung-Hsing University, Taichung, Taiwan
- Chien-Hua Tseng
  - Division of Pulmonary Medicine, Department of Internal Medicine, Shuang Ho Hospital, Taipei Medical University, New Taipei City, 235041, Taiwan; Division of Pulmonary Medicine, Department of Internal Medicine, School of Medicine, College of Medicine, Taipei Medical University, Taipei, 110301, Taiwan; Division of Critical Care Medicine, Department of Emergency and Critical Care Medicine, Shuang Ho Hospital, Taipei Medical University, New Taipei City, Taiwan
3
Embedded Generative Air Pollution Model with Variational Autoencoder and Environmental Factor Effect in Ulaanbaatar City. ATMOSPHERE 2021. [DOI: 10.3390/atmos13010071] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/19/2022]
Abstract
Air pollution is one of the most pressing modern-day issues in cities around the world. However, most cities have adopted air quality measurement devices that only measure the past pollution levels without paying attention to the influencing factors. To obtain preliminary pollution information with regard to environmental factors, we developed a variational autoencoder and feedforward neural network-based embedded generative model to examine the relationship between air quality and the effects of environmental factors. In the model, actual SO2, NO2, PM2.5, PM10, and CO measurements from 2016 to 2020 were used, which were assembled from 15 differently located ground monitoring stations in Ulaanbaatar city. A wide range of weather and fuel measurements were used as the data for the influencing factors, and were collected over the same period as the air pollution data were recorded. The prediction results concerned all measurement stations, and the results were visualized as a spatial–temporal distribution of pollution and the performance of individual stations. A cross-validated R2 was used to estimate the entire pollution distribution through the regions as SO2: 0.81, PM2.5: 0.76, PM10: 0.89, and CO: 0.83. Pearson’s chi-squared tests were used for assessing each measurement station, and the contingency tables represent a high correlation between the actual and model results. The model can be applied to perform specific analysis of the interdependencies between pollution and environmental factors, and the performance of the model improves with long-range data.
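For readers unfamiliar with the cross-validated R2 reported above, the following is a minimal Python sketch of one common way to compute it. The abstract does not say how the folds were combined, so pooling the held-out predictions from all folds into a single R2 is an assumption, and the function names and data layout are hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


def cross_validated_r2(folds):
    """Pool held-out (observed, predicted) pairs from every fold, then score once.

    `folds` is a list of (observed_values, predicted_values) tuples, one per
    held-out fold. Pooling is an assumed convention; averaging per-fold
    R^2 values is another common choice.
    """
    observed = [o for obs, _ in folds for o in obs]
    predicted = [p for _, pred in folds for p in pred]
    return r_squared(observed, predicted)
```

A value of 1 means the held-out predictions match the measurements exactly, while 0 means the model does no better than predicting the mean of the observations.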