1
Lu H, Zhang H, Zhong Y, Meng XY, Zhang MF, Qiu T. A machine learning model based on CHAT-23 for early screening of autism in Chinese children. Front Pediatr 2024; 12:1400110. [PMID: 39318617 PMCID: PMC11420024 DOI: 10.3389/fped.2024.1400110] [Received: 03/13/2024] [Accepted: 07/31/2024] [Indexed: 09/26/2024] [Open access]
Abstract
Introduction: Autism spectrum disorder (ASD) is a neurodevelopmental condition that significantly impacts the mental, emotional, and social development of children. Early screening for ASD typically involves the use of a series of questionnaires. With answers to these questionnaires, healthcare professionals can identify whether a child is at risk for developing ASD and refer them for further evaluation and diagnosis. CHAT-23, which contains 23 questions, is an effective screening test widely used in China for the early screening of ASD.
Methods: We collected clinical data from Wuxi, China. Each CHAT-23 question is treated as a feature for building machine learning models. We introduce machine learning methods into ASD screening, using the Max-Relevance and Min-Redundancy (mRMR) feature selection method to identify the most important of the 23 questions in the collected CHAT-23 questionnaires. Seven mainstream supervised machine learning models were built and experiments were conducted.
Results: Among the seven supervised machine learning models evaluated, the best-performing model achieved a sensitivity of 0.909 and a specificity of 0.922 when the number of features was reduced to 9. This demonstrates the model's ability to accurately identify children at risk for ASD, even with a more concise set of features.
Discussion: Our study focuses on the health of Chinese children, introducing machine learning methods to provide more accurate and effective early screening tests for autism. This approach not only enhances the early detection of ASD but also helps in refining the CHAT-23 questionnaire by identifying the most relevant questions for the diagnosis process.
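For readers unfamiliar with the two metrics reported in this abstract, the following is a minimal sketch of how sensitivity and specificity are computed from binary screening outcomes. The function name and the labels are hypothetical and purely illustrative; they are not the authors' code or data.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = at risk, 0 = not at risk)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    # Sensitivity: share of true cases that were flagged.
    # Specificity: share of non-cases that were correctly cleared.
    return tp / (tp + fn), tn / (tn + fp)


# Made-up labels for illustration only; not data from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

In a screening setting, sensitivity is usually the metric to protect first, since a missed case is not referred for further evaluation.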
Affiliation(s)
- Hengyang Lu
  - School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, China
  - Engineering Research Center of Intelligent Technology for Healthcare, Ministry of Education, Wuxi, China
- Heng Zhang
  - Department of Child Health Care, Affiliated Women’s Hospital of Jiangnan University, Wuxi, China
- Yi Zhong
  - School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, China
- Xiang-Yu Meng
  - School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, China
- Meng-Fei Zhang
  - School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, China
- Ting Qiu
  - Department of Child Health Care, Affiliated Women’s Hospital of Jiangnan University, Wuxi, China
2
Ejiyi CJ, Qin Z, Ukwuoma CC, Nneji GU, Monday HN, Ejiyi MB, Ejiyi TU, Okechukwu U, Bamisile OO. Comparative performance analysis of Boruta, SHAP, and Borutashap for disease diagnosis: A study with multiple machine learning algorithms. Network (Bristol, England) 2024:1-38. [PMID: 38511557 DOI: 10.1080/0954898x.2024.2331506] [Received: 08/12/2023] [Accepted: 03/13/2024] [Indexed: 03/22/2024]
Abstract
Interpretable machine learning models are instrumental in disease diagnosis and clinical decision-making, shedding light on relevant features. Notably, Boruta, SHAP (SHapley Additive exPlanations), and BorutaShap were employed for feature selection, each contributing to the identification of crucial features. These selected features were then utilized to train six machine learning algorithms, namely LR, SVM, ETC, AdaBoost, RF, and LGBM, using diverse medical datasets obtained from public sources after rigorous preprocessing. The performance of each feature selection technique was evaluated across multiple ML models, assessing accuracy, precision, recall, and F1-score metrics. Among these, SHAP showcased superior performance, achieving average accuracies of 80.17%, 85.13%, 90.00%, and 99.55% across diabetes, cardiovascular, statlog, and thyroid disease datasets, respectively. Notably, LGBM emerged as the most effective algorithm, boasting an average accuracy of 91.00% for most disease states. Moreover, SHAP enhanced the interpretability of the models, providing valuable insights into the underlying mechanisms driving disease diagnosis. This comprehensive study contributes significant insights into feature selection techniques and machine learning algorithms for disease diagnosis, benefiting researchers and practitioners in the medical field. Further exploration of feature selection methods and algorithms holds promise for advancing disease diagnosis methodologies, paving the way for more accurate and interpretable diagnostic models.
Affiliation(s)
- Chukwuebuka Joseph Ejiyi
  - School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, China
- Zhen Qin
  - School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, China
- Chiagoziem Chima Ukwuoma
  - School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, China
- Grace Ugochi Nneji
  - Software Engineering Department, Sino-British Collaborative Education, Chengdu University of Technology, Oxford Brookes University, Chengdu, China
- Happy Nkanta Monday
  - Software Engineering Department, Sino-British Collaborative Education, Chengdu University of Technology, Oxford Brookes University, Chengdu, China
- Thomas Ugochukwu Ejiyi
  - Department of Pure and Industrial Chemistry, University of Nigeria Nsukka, Enugu, Nigeria
- Olusola O Bamisile
  - Sichuan Industrial Internet Intelligent Monitoring and Application Engineering Technology Research Centre, Chengdu University of Technology, Chengdu, China
3
Sha M, Alqahtani A, Alsubai S, Dutta AK. Modified Meta Heuristic BAT with ML Classifiers for Detection of Autism Spectrum Disorder. Biomolecules 2023; 14:48. [PMID: 38254648 PMCID: PMC10813510 DOI: 10.3390/biom14010048] [Received: 11/20/2023] [Revised: 12/22/2023] [Accepted: 12/27/2023] [Indexed: 01/24/2024] [Open access]
Abstract
ASD (autism spectrum disorder) is a complex developmental and neurological disorder that impacts the social life of the affected person by disturbing their capability for interaction and communication. As it is a behavioural disorder, early treatment will improve the quality of life of ASD patients. Traditional screening is carried out with behavioural assessment through trained physicians, which is expensive and time-consuming. To resolve the issue, several conventional methods strive to achieve an effective ASD identification system, but are limited in their handling of large data sets, accuracy, and speed. Therefore, the proposed identification system employed the MBA (modified bat) algorithm based on ANN (artificial neural networks), modified ANN, DT (decision tree), and KNN (k-nearest neighbours) for the classification of ASD in children and adolescents. The BA (bat algorithm) is utilised for its automatic zooming capability, which improves the system's efficacy in finding solutions. Although BA is effective in identification, it still has drawbacks in speed and accuracy and can fall into local extrema. Therefore, the proposed identification system modifies the BA optimisation with random perturbation of trends and optimal orientation. The dataset utilised in the model is the Q-chat-10 dataset, which contains data for four age groups: toddlers, children, adolescents, and adults. To analyse the quality of the dataset, dataset evaluation mechanisms, such as the chi-squared statistic and p-value, are used. The evaluation signifies the relation of the dataset with respect to the proposed model. Further, the performance of the proposed detection system is examined with certain performance metrics to calculate its efficiency. The outcome revealed that the modified ANN classifier model attained an accuracy of 1.00, ensuring improved performance when compared with other state-of-the-art methods. Thus, the proposed model was intended to assist physicians and researchers in enhancing the diagnosis of ASD to improve the standard of life of ASD patients.
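The chi-squared statistic and p-value mentioned in this abstract can be illustrated with a short sketch. It assumes a 2x2 contingency table (one yes/no questionnaire item against the ASD label), uses Pearson's statistic without continuity correction, and uses made-up counts; the abstract does not say how the authors set up their test, so none of this should be read as their procedure.

```python
import math


def chi_squared(table):
    """Pearson chi-squared statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            if expected == 0:
                raise ValueError("table has an empty row or column")
            stat += (observed - expected) ** 2 / expected
    return stat


def p_value_df1(stat):
    """Upper-tail p-value of the chi-squared distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


# Hypothetical counts: rows = item answered yes/no, columns = ASD / non-ASD.
table = [[30, 10],
         [10, 30]]
stat = chi_squared(table)
p = p_value_df1(stat)
print(f"chi2={stat:.2f}, p={p:.2e}")
```

The closed form in `p_value_df1` holds only for one degree of freedom, which is the case for a 2x2 table; larger tables need the general chi-squared survival function.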
Affiliation(s)
- Mohemmed Sha
  - Department of Software Engineering, College of Computer Engineering and Sciences, Prince Sattam Bin Abdulaziz University, Al-Kharj 16278, Saudi Arabia
- Abdullah Alqahtani
  - Department of Software Engineering, College of Computer Engineering and Sciences, Prince Sattam Bin Abdulaziz University, Al-Kharj 16278, Saudi Arabia
- Shtwai Alsubai
  - Department of Computer Science, College of Computer Engineering and Sciences, Prince Sattam Bin Abdulaziz University, Al-Kharj 16278, Saudi Arabia
- Ashit Kumar Dutta
  - Department of Computer Science and Information Systems, College of Applied Sciences, Almaarefa University, Riyadh 11597, Saudi Arabia
4
Twala B, Molloy E. On effectively predicting autism spectrum disorder therapy using an ensemble of classifiers. Sci Rep 2023; 13:19957. [PMID: 37968315 PMCID: PMC10651853 DOI: 10.1038/s41598-023-46379-3] [Received: 01/13/2023] [Accepted: 10/31/2023] [Indexed: 11/17/2023] [Open access]
Abstract
An ensemble of classifiers combines several single classifiers to deliver a final prediction or classification decision. An increasingly provoking question is whether such an ensemble can outperform the single best classifier. If so, what form of ensemble learning system (also known as multiple classifier learning systems) yields the most significant benefits in the size or diversity of the ensemble? In this paper, the ability of ensemble learning to predict and identify factors that influence or contribute to autism spectrum disorder therapy (ASDT) for intervention purposes is investigated. Given that most interventions are short-term in nature, developing a robotic system that provides the best outcome and measurement of ASDT has never been so critical. The performance of five single classifiers is compared against several multiple classifier learning systems in exploring and predicting ASDT, using a dataset of behavioural data from robot-enhanced therapy and standard human treatment based on 3000 sessions and 300 h, recorded from 61 autistic children. Experimental results show statistically significant differences in performance among the single classifiers for ASDT prediction, with decision trees as the most accurate classifier. The results further show multiple classifier learning systems (MCLS) achieving better performance for ASDT prediction (especially those ensembles with three core classifiers). Additionally, the results show bagging and boosting ensemble learning to be robust when predicting ASDT, with multi-stage design as the most dominant architecture. It also appears that eye contact and social interaction are the most critical contributing factors to the ASDT problem among children.
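As a minimal sketch of the general idea of combining single classifiers into one decision, the snippet below applies plain majority voting to the outputs of three classifiers. Majority voting is swapped in here as the simplest combiner; the abstract names bagging, boosting, and multi-stage designs, which this sketch does not implement. The function name and the predictions are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one list by majority vote per sample."""
    if not predictions:
        raise ValueError("need at least one classifier's predictions")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all classifiers must predict the same number of samples")
    # zip(*predictions) groups the votes cast for each sample across classifiers.
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


# Made-up outputs of three hypothetical classifiers on four samples.
clf_a = [1, 0, 1, 1]
clf_b = [1, 1, 0, 1]
clf_c = [0, 0, 1, 1]
combined = majority_vote([clf_a, clf_b, clf_c])
print(combined)
```

With an odd number of binary classifiers, as here, no ties can occur; other configurations need an explicit tie-breaking rule.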
Affiliation(s)
- Bhekisipho Twala
  - Office of the Deputy Vice-Chancellor (Digital Transformation), Tshwane University of Technology, Private Bag x680, Pretoria, 001, South Africa
- Eamon Molloy
  - Waterford Institute of Technology, School of Science & Computing, Waterford, Ireland
5
Awaji B, Senan EM, Olayah F, Alshari EA, Alsulami M, Abosaq HA, Alqahtani J, Janrao P. Hybrid Techniques of Facial Feature Image Analysis for Early Detection of Autism Spectrum Disorder Based on Combined CNN Features. Diagnostics (Basel) 2023; 13:2948. [PMID: 37761315 PMCID: PMC10527645 DOI: 10.3390/diagnostics13182948] [Received: 08/10/2023] [Revised: 09/07/2023] [Accepted: 09/11/2023] [Indexed: 09/29/2023] [Open access]
Abstract
Autism spectrum disorder (ASD) is a complex neurodevelopmental disorder characterized by difficulties in social communication and repetitive behaviors. The exact causes of ASD remain elusive and likely involve a combination of genetic, environmental, and neurobiological factors. Doctors often face challenges in accurately identifying ASD early due to its complex and diverse presentation. Early detection and intervention are crucial for improving outcomes for individuals with ASD. Early diagnosis allows for timely access to appropriate interventions, leading to better social and communication skills development. Artificial intelligence techniques, particularly facial feature extraction using machine learning algorithms, display promise in aiding the early detection of ASD. By analyzing facial expressions and subtle cues, AI models identify patterns associated with ASD features. This study developed various hybrid systems to diagnose facial feature images for an ASD dataset by combining convolutional neural network (CNN) features. The first approach utilized pre-trained VGG16, ResNet101, and MobileNet models. The second approach employed a hybrid technique that combined CNN models (VGG16, ResNet101, and MobileNet) with XGBoost and RF algorithms. The third strategy involved diagnosing ASD using XGBoost and RF based on features of the VGG16-ResNet101, ResNet101-MobileNet, and VGG16-MobileNet models. Notably, the hybrid RF algorithm that utilized features from the VGG16-MobileNet models demonstrated superior performance, reaching an AUC of 99.25%, an accuracy of 98.8%, a precision of 98.9%, a sensitivity of 99%, and a specificity of 99.1%.
Affiliation(s)
- Bakri Awaji
  - Department of Computer Science, College of Computer Science and Information Systems, Najran University, Najran 6646, Saudi Arabia
- Ebrahim Mohammed Senan
  - Department of Artificial Intelligence, Faculty of Computer Science and Information Technology, Alrazi University, Sana’a, Yemen
- Fekry Olayah
  - Department of Information System, College of Computer Science and Information Systems, Najran University, Najran 6646, Saudi Arabia
- Eman A. Alshari
  - Department of Computer Science and Information Technology, Thamar University, Dhamar 87246, Yemen
  - Department of Artificial Intelligence, Faculty of Engineering and Smart Computing, Modern Specialized University, Sana’a, Yemen
- Mohammad Alsulami
  - Department of Computer Science, College of Computer Science and Information Systems, Najran University, Najran 6646, Saudi Arabia
- Hamad Ali Abosaq
  - Department of Computer Science, College of Computer Science and Information Systems, Najran University, Najran 6646, Saudi Arabia
- Jarallah Alqahtani
  - Department of Computer Science, College of Computer Science and Information Systems, Najran University, Najran 6646, Saudi Arabia
- Prachi Janrao
  - Thakur College of Engineering and Technology, Kandivali(E), Mumbai 400101, India