1
Zhang X, Zhao X, Bian Y, Huang J, Yin L. Interactive effects analysis of road, traffic, and weather characteristics on shared e-bike speeding risk: A data-driven approach. ACCIDENT; ANALYSIS AND PREVENTION 2024; 207:107755. [PMID: 39214034 DOI: 10.1016/j.aap.2024.107755] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/11/2023] [Revised: 07/26/2024] [Accepted: 08/21/2024] [Indexed: 09/04/2024]
Abstract
As electric bikes (e-bikes) rapidly proliferate in China, their traffic safety issues are becoming increasingly prominent. Accurately detecting risky riding behaviors and analyzing the mechanisms of the multiple risk factors behind them are crucial to formulating and implementing precise management policies. The emergence of shared e-bikes and the advancements in interpretable machine learning present new opportunities for accurately analyzing the determinants of risky riding behaviors. The primary objective of this study is to examine the risk factors related to speeding behavior to aid urban management agencies in crafting necessary management policies. This study uses a large-scale dataset of shared e-bike trajectory data to establish a framework for detecting speeding behavior. Subsequently, the extreme gradient boosting (XGBoost) model is employed to identify the level of speeding risk. Moreover, after measuring the degree of interaction among road, traffic, and weather characteristics, the complex interactive effects of these risk factors on high-risk speeding are investigated using bivariate partial dependence plots (PDP). Feature importance analysis indicates that the five variables that most strongly affect the identified speeding risk levels are land use density, rainfall, road level, curbside parking density, and bike lane width. The interaction analysis indicates that higher road levels and greater bike lane widths correspond to an increased likelihood of high-risk speeding among riders. Land use density, curbside parking density, and rainfall display a nonlinear effect on high-risk speeding. Introducing road level, bike lane width, and time interval can change the patterns of the nonlinear effects of land use density, curbside parking density, and rainfall. Finally, several policy recommendations are proposed to improve e-bike traffic safety by using the extracted feature values associated with a higher probability of high-risk speeding.
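The bivariate partial dependence mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `bivariate_pdp`, the `predict` callable, and the list-of-rows data layout are hypothetical names chosen here, and any fitted model (for example an XGBoost classifier's probability output) would stand in for `predict`. For each pair of grid values, the two features of interest are overwritten in every row and the predictions are averaged.

```python
def bivariate_pdp(predict, rows, j, k, grid_j, grid_k):
    """Average prediction when features j and k are forced to each grid pair.

    predict : callable taking one feature list and returning a number
              (hypothetical stand-in for a fitted model)
    rows    : list of feature lists (the background dataset)
    Returns a len(grid_j) x len(grid_k) nested list.
    """
    surface = []
    for a in grid_j:
        line = []
        for b in grid_k:
            total = 0.0
            for row in rows:
                x = list(row)          # copy so the dataset is not modified
                x[j], x[k] = a, b      # overwrite the two features of interest
                total += predict(x)
            line.append(total / len(rows))
        surface.append(line)
    return surface
```

A surface that is not simply the sum of the two one-dimensional partial dependence curves suggests an interaction between the two features.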
Affiliation(s)
- Xiaolong Zhang
- Faculty of Architecture, Civil and Transportation Engineering, Beijing University of Technology, Beijing 100124, PR China.
- Xiaohua Zhao
- Faculty of Architecture, Civil and Transportation Engineering, Beijing University of Technology, Beijing 100124, PR China.
- Yang Bian
- Faculty of Architecture, Civil and Transportation Engineering, Beijing University of Technology, Beijing 100124, PR China.
- Jianling Huang
- Beijing Intelligent Transportation Development Center, Beijing 100073, PR China.
- Luyao Yin
- Faculty of Architecture, Civil and Transportation Engineering, Beijing University of Technology, Beijing 100124, PR China.
2
Joshi RC, Srivastava P, Mishra R, Burget R, Dutta MK. Biomarker profiling and integrating heterogeneous models for enhanced multi-grade breast cancer prognostication. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2024; 255:108349. [PMID: 39096573 DOI: 10.1016/j.cmpb.2024.108349] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/22/2024] [Revised: 07/01/2024] [Accepted: 07/22/2024] [Indexed: 08/05/2024]
Abstract
BACKGROUND Breast cancer remains a leading cause of female mortality worldwide, exacerbated by limited awareness and by inadequate screening resources and treatment options. Accurate and early diagnosis is crucial for improving survival rates and effective treatment. OBJECTIVES This study aims to develop an innovative artificial intelligence (AI) based model for predicting breast cancer and its various histopathological grades by integrating multiple biomarkers and subject age, thereby enhancing diagnostic accuracy and prognostication. METHODS A novel ensemble-based machine learning (ML) framework has been introduced that integrates three distinct biomarkers, namely beta-human chorionic gonadotropin (β-hCG), Programmed Cell Death Ligand 1 (PD-L1), and alpha-fetoprotein (AFP), alongside subject age. Hyperparameter optimization was performed using the Particle Swarm Optimization (PSO) algorithm, and minority oversampling techniques were employed to mitigate overfitting. The model's performance was validated through rigorous five-fold cross-validation. RESULTS The proposed model demonstrated superior performance, achieving a 97.93% accuracy and a 98.06% F1-score on meticulously labeled test data across diverse age groups. Comparative analysis showed that the model outperforms state-of-the-art approaches, highlighting its robustness and generalizability. CONCLUSION By providing a comprehensive analysis of multiple biomarkers and effectively predicting tumor grades, this study offers a significant advancement in breast cancer screening, particularly in regions with limited medical resources. The proposed framework has the potential to reduce breast cancer mortality rates and improve early intervention and personalized treatment strategies.
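As a rough illustration of the Particle Swarm Optimization step named in the METHODS, the sketch below minimizes an arbitrary objective over box bounds. It is a generic textbook variant, not the authors' configuration: the function name `pso_minimize`, the inertia and acceleration values, and the swarm size are all assumptions made for this example. In a hyperparameter search the objective would typically be a cross-validated error as a function of the hyperparameters.

```python
import random

def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal global-best PSO; all default parameter values are assumptions."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the bounds, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))  # clamp to bounds
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val
```

Fixing the seed makes a run repeatable; PSO is stochastic, so different seeds can give different results.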
Affiliation(s)
- Rakesh Chandra Joshi
- Amity Centre for Artificial Intelligence, Amity University, Noida, Uttar Pradesh, India; Centre for Advanced Studies, Dr. A.P.J. Abdul Kalam Technical University, Lucknow, Uttar Pradesh, India
- Pallavi Srivastava
- Department of Biotechnology, Noida Institute of Engineering & Technology, Greater Noida, Uttar Pradesh, India
- Rashmi Mishra
- Department of Biotechnology, Noida Institute of Engineering & Technology, Greater Noida, Uttar Pradesh, India
- Radim Burget
- Department of Telecommunications, Faculty of Electrical Engineering and Communication, Brno University of Technology, Brno, Czech Republic
- Malay Kishore Dutta
- Amity Centre for Artificial Intelligence, Amity University, Noida, Uttar Pradesh, India.
3
Mejía R, Quinteros E, Ribó Arnau A. Geographic areas with the highest concentration of traffic accidents in San Salvador, El Salvador: a spatial analysis of the 2014-2018 period. Rev Peru Med Exp Salud Publica 2024; 40:413-422. [PMID: 38597469 PMCID: PMC11138829 DOI: 10.17843/rpmesp.2023.404.12963] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2023] [Accepted: 11/22/2023] [Indexed: 04/11/2024] Open
Abstract
OBJECTIVE. This study aimed to identify the areas with the highest concentration of traffic accidents and injuries in the San Salvador Metropolitan Area (SSMA). MATERIALS AND METHODS. Traffic accidents were analyzed spatially by point location and by the sum of events in areas of 200 m². The point locations were analyzed by "nearest neighbor analysis", while the areas with the sum of traffic accidents were analyzed by Getis-Ord Gi* to obtain the hot spots. The resulting hot spots with the highest concentration of traffic accidents in the SSMA were evaluated in the field using an observation form to collect data on infrastructure and road safety characteristics. RESULTS. Five areas with the highest number of traffic accidents and injuries, mainly containing primary roads, were identified by analyzing 8191 traffic accidents reported between 2014 and 2018. CONCLUSION. The sites with the highest concentration of traffic accidents and injuries were characterized by considerably damaged road infrastructure and the lack of safety systems for drivers and pedestrians. The spatial analysis of traffic accidents and injuries can contribute to improving surveillance and road safety in the SSMA.
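The Getis-Ord Gi* hot spot statistic named in the methods can be sketched on a small grid of accident counts. This is an illustrative example only: the function name `getis_ord_gi_star`, the binary weights, and the 3×3 neighbourhood that includes the cell itself are assumptions, not details taken from the study. Each cell's z-score compares the local sum of counts with what the global mean would predict.

```python
import math

def getis_ord_gi_star(grid):
    """Gi* z-score per cell, binary weights over the 3x3 neighbourhood (self included).

    Assumes the counts are not all identical and the grid is larger than 3x3,
    otherwise the denominator is zero.
    """
    rows, cols = len(grid), len(grid[0])
    values = [v for row in grid for v in row]
    n = len(values)
    x_bar = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - x_bar ** 2)
    z = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            neigh = [grid[rr][cc]
                     for rr in range(max(0, r - 1), min(rows, r + 2))
                     for cc in range(max(0, c - 1), min(cols, c + 2))]
            w_sum = len(neigh)  # binary weights: sum(w) == sum(w^2) == cell count
            numerator = sum(neigh) - x_bar * w_sum
            denominator = s * math.sqrt((n * w_sum - w_sum ** 2) / (n - 1))
            z[r][c] = numerator / denominator
    return z
```

Cells with large positive z-scores (for example above 1.96 at the 5% level) are candidate hot spots; large negative values indicate cold spots.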
Affiliation(s)
- Roberto Mejía
- Instituto Nacional de Salud, San Salvador, El Salvador.
- Edgar Quinteros
- Instituto Nacional de Salud, San Salvador, El Salvador.
- Alexandre Ribó Arnau
- Departament d'ensenyament, Generalitat de Catalunya, España.
4
Tao H, Jawad AH, Shather AH, Al-Khafaji Z, Rashid TA, Ali M, Al-Ansari N, Marhoon HA, Shahid S, Yaseen ZM. Machine learning algorithms for high-resolution prediction of spatiotemporal distribution of air pollution from meteorological and soil parameters. ENVIRONMENT INTERNATIONAL 2023; 175:107931. [PMID: 37119651 DOI: 10.1016/j.envint.2023.107931] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2022] [Revised: 03/18/2023] [Accepted: 04/11/2023] [Indexed: 05/22/2023]
Abstract
This study uses machine learning (ML) models for a high-resolution (0.1°×0.1°) prediction of the concentration of fine particulate matter (PM2.5) in air, the pollutant most harmful to human health, from meteorological and soil data. Iraq was considered the study area to implement the method. Different lags and the changing patterns of four European Reanalysis (ERA5) meteorological variables (rainfall, mean temperature, wind speed, and relative humidity) and one soil parameter (soil moisture) were used to select a suitable set of predictors using a non-greedy algorithm known as simulated annealing (SA). The selected predictors were used to simulate the temporal and spatial variability of air PM2.5 concentration over Iraq during the early summer (May-July), the most polluted months, using three advanced ML models: extremely randomized trees (ERT), stochastic gradient descent backpropagation (SGD-BP), and long short-term memory (LSTM) integrated with a Bayesian optimizer. The spatial distribution of the annual average PM2.5 revealed that the population of the whole of Iraq is exposed to a pollution level above the standard limit. The changes in temperature and soil moisture and the mean wind speed and humidity of the month before the early summer can predict the temporal and spatial variability of PM2.5 over Iraq during May-July. Results revealed the higher performance of LSTM, with a normalized root-mean-square error and Kling-Gupta efficiency of 13.4% and 0.89, compared to 16.02% and 0.81 for SGD-BP and 17.9% and 0.74 for ERT. The LSTM could also reconstruct the observed spatial distribution of PM2.5 with MapCurve and Cramer's V values of 0.95 and 0.91, compared to 0.9 and 0.86 for SGD-BP and 0.83 and 0.76 for ERT. The study provided a methodology for forecasting the spatial variability of PM2.5 concentration at high resolution during the peak pollution months from freely available data, which can be replicated in other regions for generating high-resolution PM2.5 forecasting maps.
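The two skill scores reported above can be computed as in the following sketch. The Kling-Gupta efficiency uses the standard decomposition into correlation, variability ratio, and bias ratio. The abstract does not say how the RMSE is normalized, so dividing by the observed mean is an assumption made here (range and standard deviation are also common choices); the function names are illustrative.

```python
import math
from statistics import mean, pstdev

def nrmse(obs, sim):
    """RMSE as a percentage of the observed mean (normalisation is an assumption)."""
    rmse = math.sqrt(mean((s - o) ** 2 for o, s in zip(obs, sim)))
    return 100.0 * rmse / mean(obs)

def kge(obs, sim):
    """Kling-Gupta efficiency: 1 - sqrt((r-1)^2 + (alpha-1)^2 + (beta-1)^2)."""
    mu_o, mu_s = mean(obs), mean(sim)
    sd_o, sd_s = pstdev(obs), pstdev(sim)   # both must be non-zero
    cov = mean((o - mu_o) * (s - mu_s) for o, s in zip(obs, sim))
    r = cov / (sd_o * sd_s)                 # Pearson correlation
    alpha = sd_s / sd_o                     # variability ratio
    beta = mu_s / mu_o                      # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)
```

A perfect simulation gives an NRMSE of 0% and a KGE of 1; a constant offset leaves the correlation and variability terms untouched and lowers the KGE only through the bias ratio.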
Affiliation(s)
- Hai Tao
- School of Computer and Information, Qiannan Normal University for Nationalities, Duyun, Guizhou 558000, China; State Key Laboratory of Public Big Data, Guizhou University, Guizhou, Guiyang 550025, China; Institute for Big Data Analytics and Artificial Intelligence (IBDAAI), Universiti Teknologi MARA, 40450 Shah Alam, Selangor, Malaysia.
- Ali H Jawad
- Faculty of Applied Sciences, Universiti Teknologi MARA, 40450 Shah Alam, Selangor, Malaysia.
- A H Shather
- Dep of Computer Technology Engineering, Engineering Technical College, University of Alkitab, Iraq.
- Zainab Al-Khafaji
- Department of Building and Construction Technologies Engineering, AL-Mustaqbal University College, Hillah 51001, Iraq.
- Tarik A Rashid
- Computer Science and Engineering Department, University of Kurdistan Hewler, Erbil, KR, Iraq.
- Mumtaz Ali
- UniSQ College, University of Southern Queensland, QLD 4350, Australia.
- Nadhir Al-Ansari
- Dept. of Civil, Environmental and Natural Resources Engineering, Lulea Univ. of Technology, Lulea T3334, Sweden.
- Haydar Abdulameer Marhoon
- Information and Communication Technology Research Group, Scientific Research Center, Al-Ayen University, Thi-Qar, Iraq; College of Computer Sciences and Information Technology, University of Kerbala, Karbala, Iraq.
- Shamsuddin Shahid
- Department of Hydraulics and Hydrology, School of Civil Engineering, Faculty of Engineering, Universiti Teknologi Malaysia (UTM), 81310 Skudai, Johor, Malaysia.
- Zaher Mundher Yaseen
- Civil and Environmental Engineering Department, King Fahd University of Petroleum & Minerals, Dhahran 31261, Saudi Arabia; Interdisciplinary Research Center for Membranes and Water Security, King Fahd University of Petroleum & Minerals, Dhahran 31261, Saudi Arabia.
5
Khan A, Uddin J, Ali F, Kumar H, Alghamdi W, Ahmad A. AFP-SPTS: An Accurate Prediction of Antifreeze Proteins Using Sequential and Pseudo-Tri-Slicing Evolutionary Features with an Extremely Randomized Tree. J Chem Inf Model 2023; 63:826-834. [PMID: 36649569 DOI: 10.1021/acs.jcim.2c01417] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Indexed: 01/19/2023]
Abstract
The development of intracellular ice in the bodies of cold-blooded living organisms may cause them to die. These species yield antifreeze proteins (AFPs) to live in subzero temperature environments. Additionally, AFPs are implemented in biotechnological, industrial, agricultural, and medical fields. Machine learning-based predictors were presented for AFP identification. However, more accurate predictors are still highly desirable for boosting the AFP prediction. This work presents a novel approach, named AFP-SPTS, for the correct prediction of AFPs. We explored the discriminative features with four schemes, namely, dipeptide deviation from the expected mean (DDE), reduced amino acid alphabet (RAAA), grouped dipeptide composition (GDPC), and a novel representative method, called pseudo-position-specific scoring matrix tri-slicing (PseTS-PSSM). Considering the advantages of ensemble learning strategy, we fused each feature vector into different combinations and trained the models with five machine learning algorithms, i.e., multilayer perceptron (MLP), extremely randomized tree (ERT), decision tree (DT), random forest (RF), and AdaBoost. Among all models, PseTS-PSSM + RAAA with an extremely randomized tree attained the best outcomes. The proposed predictor (AFP-SPTS) boosted the accuracies of AFPs in the literature by 1.82 and 4.1%.
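Of the feature schemes listed, grouped dipeptide composition (GDPC) is simple enough to sketch. The five physicochemical groups below follow a commonly used convention and are an assumption here; the abstract does not state which grouping the authors used, and the function name `gdpc` is illustrative. Each residue is mapped to its group, and the relative frequency of every ordered pair of adjacent groups gives a 25-dimensional feature vector.

```python
# Commonly used five-group partition of the 20 amino acids (an assumption,
# not confirmed by the abstract).
GROUPS = {
    "aliphatic": "GAVLMI",
    "aromatic": "FYW",
    "positive": "KRH",
    "negative": "DE",
    "uncharged": "STCPNQ",
}

def gdpc(sequence):
    """Return the 25 grouped-dipeptide frequencies for a protein sequence."""
    aa_to_group = {aa: g for g, members in GROUPS.items() for aa in members}
    names = list(GROUPS)
    counts = {(a, b): 0 for a in names for b in names}
    # Non-standard residues (e.g. 'X') are dropped before pairing.
    labels = [aa_to_group[aa] for aa in sequence.upper() if aa in aa_to_group]
    for a, b in zip(labels, labels[1:]):
        counts[(a, b)] += 1
    total = max(len(labels) - 1, 1)   # number of adjacent pairs
    return {pair: c / total for pair, c in counts.items()}
```

The resulting vector can be concatenated with other descriptors before being passed to a classifier such as an extremely randomized tree.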
Affiliation(s)
- Adnan Khan
- Qurtuba University of Science and Information Technology, Peshawar 5000, Khyber Pakhtunkhwa, Pakistan
- Jamal Uddin
- Qurtuba University of Science and Information Technology, Peshawar 5000, Khyber Pakhtunkhwa, Pakistan
- Farman Ali
- Sarhad University of Science and Information Technology, Mardan Campus, Peshawar 23200, Pakistan; Department of Elementary and Secondary Education Department, Government of Khyber Pakhtunkhwa, Peshawar 5000, Khyber Pakhtunkhwa, Pakistan
- Harish Kumar
- Department of Computer Science, College of Computer Science, King Khalid University, Abha 61421, Saudi Arabia
- Wajdi Alghamdi
- Department of Information Technology, Faculty of Computing and Information Technology, King AbdulAziz University, Jeddah 21589, Saudi Arabia
- Aftab Ahmad
- Department of Computer Science, Abdul Wali Khan University Mardan, Mardan 23200, Pakistan