1. Lucas C, Torres-Guzman R, James AJ, Corlew S, Stone A, Powell ME, Golinko M, Pontell ME. Machine Learning for Automatic Detection of Velopharyngeal Dysfunction: A Preliminary Report. J Craniofac Surg 2024:00001665-990000000-01509. [PMID: 38709082 DOI: 10.1097/scs.0000000000010147] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2023] [Accepted: 02/16/2024] [Indexed: 05/07/2024]
Abstract
BACKGROUND Even after palatoplasty, the incidence of velopharyngeal dysfunction (VPD) can reach 30%; however, these estimates arise from high-income countries (HICs) where speech-language pathologists (SLPs) are part of standardized cleft teams. The VPD burden in low- and middle-income countries (LMICs) is unknown. This study aims to develop a machine-learning model that can detect the presence of VPD using audio samples alone. METHODS Case and control audio samples were obtained from institutional and publicly available sources. A machine-learning model was built in Python. RESULTS The initial 110 audio samples used to train and test the model were retested after format conversion and file deidentification. Each sample was tested 5 times, yielding a precision of 100%. Sensitivity was 92.73% (95% CI: 82.41%-97.98%) and specificity was 98.18% (95% CI: 90.28%-99.95%). One hundred thirteen prospective samples, which had not yet interacted with the model, were then tested. Precision was again 100%, with a sensitivity of 88.89% (95% CI: 78.44%-95.41%) and a specificity of 66% (95% CI: 51.23%-78.79%). DISCUSSION VPD affects nearly 100% of patients with unrepaired overt soft palatal clefts and up to 30% of patients who have undergone palatoplasty. VPD can render patients unintelligible, thereby accruing significant psychosocial morbidity. The true burden of VPD in LMICs is unknown and likely exceeds estimates from HICs. A phone-based machine-learning screening model could expand access to diagnostic and potentially therapeutic modalities for the innumerable patients worldwide who suffer from VPD.
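The reported sensitivity and specificity are consistent with simple counts, for example 51 of 55 cases (92.73%) and 54 of 55 controls (98.18%). These counts are inferred here from the percentages and are not stated in the abstract. The following is a minimal Python sketch, assuming those counts and assuming the intervals are exact (Clopper-Pearson) binomial intervals, of how such figures can be computed. The function names are illustrative.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def clopper_pearson(k: int, n: int, alpha: float = 0.05) -> tuple[float, float]:
    """Exact two-sided (1 - alpha) interval for the proportion k/n, found by bisection."""

    def bisect(candidate_too_small) -> float:
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if candidate_too_small(mid):
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else bisect(lambda p: 1 - binom_cdf(k - 1, n, p) < alpha / 2)
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else bisect(lambda p: binom_cdf(k, n, p) > alpha / 2)
    return lower, upper


# Hypothetical counts inferred from the reported percentages (assumption, not from the abstract).
true_pos, n_cases = 51, 55      # 51 / 55 cases detected
true_neg, n_controls = 54, 55   # 54 / 55 controls correctly rejected

sensitivity = true_pos / n_cases
specificity = true_neg / n_controls
sens_ci = clopper_pearson(true_pos, n_cases)
spec_ci = clopper_pearson(true_neg, n_controls)

print(f"sensitivity {sensitivity:.2%}, 95% CI {sens_ci[0]:.2%} to {sens_ci[1]:.2%}")
print(f"specificity {specificity:.2%}, 95% CI {spec_ci[0]:.2%} to {spec_ci[1]:.2%}")
```

The bisection avoids a dependency on a statistics library; with SciPy available, the same bounds could be obtained from beta-distribution quantiles.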
Affiliation(s)
- Claiborne Lucas: Department of General Surgery, Prisma Health Greenville, Greenville, SC
- Andrew J James: Department of Plastic Surgery, Vanderbilt University Medical Center, Nashville, TN
- Scott Corlew: Blavatnik Institute of Global Health & Social Medicine, Program in Global Surgery and Social Change, Harvard Medical School, Boston Children's Hospital, Boston, MA
- Amy Stone: Department of Otolaryngology-Head and Neck Surgery, Vanderbilt University Medical Center
- Maria E Powell: Department of Otolaryngology-Head and Neck Surgery, Vanderbilt University Medical Center
- Michael Golinko: Department of Plastic Surgery, Vanderbilt University Medical Center, Nashville, TN; Division of Pediatric Plastic Surgery, Monroe Carell Jr. Children's Hospital, Nashville, TN
- Matthew E Pontell: Department of Plastic Surgery, Vanderbilt University Medical Center, Nashville, TN; Division of Pediatric Plastic Surgery, Monroe Carell Jr. Children's Hospital, Nashville, TN
2. Huqh MZU, Abdullah JY, AL-Rawas M, Husein A, Ahmad WMAW, Jamayet NB, Genisa M, Yahya MRB. Development of Artificial Neural Network-Based Prediction Model for Evaluation of Maxillary Arch Growth in Children with Complete Unilateral Cleft Lip and Palate. Diagnostics (Basel) 2023; 13:3025. [PMID: 37835768 PMCID: PMC10572375 DOI: 10.3390/diagnostics13193025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2023] [Revised: 09/14/2023] [Accepted: 09/20/2023] [Indexed: 10/15/2023]
Abstract
INTRODUCTION Cleft lip and palate (CLP) are the most common congenital craniofacial deformities that can cause a variety of dental abnormalities in children. The purpose of this study was to predict the maxillary arch growth and to develop a neural network logistic regression model for both UCLP and non-UCLP individuals. METHODS This study utilizes a novel method incorporating many approaches, such as the bootstrap method, a multi-layer feed-forward neural network, and ordinal logistic regression. A dataset was created based on the following factors: socio-demographic characteristics such as age and gender, as well as cleft type and category of malocclusion associated with the cleft. Training data were used to create a model, whereas testing data were used to validate it. The study is separated into two phases: phase one involves the use of a multilayer neural network and phase two involves the use of an ordinal logistic regression model to analyze the underlying association between cleft and the factors chosen. RESULTS The findings of the hybrid technique using ordinal logistic regression are discussed, where category acts as both a dependent variable and as the study's output. The ordinal logistic regression was used to classify the dependent variables into three categories. The suggested technique performs exceptionally well, as evidenced by a Predicted Mean Square Error (PMSE) of 2.03%. CONCLUSION The outcome of the study suggests that there is a strong association between gender, age, and cleft. The difference in width and length of the maxillary arch in UCLP is mainly related to the severity of the cleft and facial growth pattern.
Affiliation(s)
- Mohamed Zahoor Ul Huqh: Orthodontic Unit, School of Dental Sciences, Health Campus, Universiti Sains Malaysia, Kubang Kerian, Kota Bharu 16150, Malaysia
- Johari Yap Abdullah: Craniofacial Imaging Lab, School of Dental Sciences, Health Campus, Universiti Sains Malaysia, Kubang Kerian, Kota Bharu 16150, Malaysia
- Matheel AL-Rawas: Prosthodontic Unit, School of Dental Sciences, Health Campus, Universiti Sains Malaysia, Kubang Kerian, Kota Bharu 16150, Malaysia
- Adam Husein: Prosthodontic Unit, School of Dental Sciences, Health Campus, Universiti Sains Malaysia, Kubang Kerian, Kota Bharu 16150, Malaysia
- Wan Muhamad Amir W Ahmad: Department of Biostatistics, School of Dental Sciences, Health Campus, Universiti Sains Malaysia, Kubang Kerian, Kota Bharu 16150, Malaysia
- Nafij Bin Jamayet: Division of Restorative Dentistry (Prosthodontics), School of Dentistry, International Medical University, Bukit Jalil, Kuala Lumpur 57000, Malaysia
- Maya Genisa: Biomedical Programme, Faculty of Pascasarjana, YARSI University, Jakarta 10510, Indonesia
- Mohd Rosli Bin Yahya: Oral & Maxillofacial Department, Hospital Raja Perempuan Zainab II, Kota Bharu 15586, Malaysia
3. Huqh MZU, Abdullah JY, Wong LS, Jamayet NB, Alam MK, Rashid QF, Husein A, Ahmad WMAW, Eusufzai SZ, Prasadh S, Subramaniyan V, Fuloria NK, Fuloria S, Sekar M, Selvaraj S. Clinical Applications of Artificial Intelligence and Machine Learning in Children with Cleft Lip and Palate-A Systematic Review. International Journal of Environmental Research and Public Health 2022; 19:10860. [PMID: 36078576 PMCID: PMC9518587 DOI: 10.3390/ijerph191710860] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/30/2022] [Accepted: 08/22/2022] [Indexed: 05/03/2023]
Abstract
OBJECTIVE The objective of this systematic review was (a) to explore the current clinical applications of AI/ML (artificial intelligence and machine learning) techniques in diagnosis and treatment prediction in children with CLP (cleft lip and palate), and (b) to create a qualitative summary of the results of the studies retrieved. MATERIALS AND METHODS An electronic search was carried out using databases such as PubMed, Scopus, and the Web of Science Core Collection. Two reviewers searched the databases separately and concurrently. The initial search was conducted on 6 July 2021. The publishing period was unrestricted; however, the search was limited to articles involving human participants and published in English. Combinations of Medical Subject Headings (MeSH) phrases and free text terms were used as search keywords in each database. The following data were extracted from the methods and results sections of the selected papers: the number of AI training datasets used to train the intelligent system, as well as their conditional properties. The problems studied included unilateral CLP, bilateral CLP, unilateral cleft lip and alveolus, unilateral cleft lip, hypernasality, dental characteristics, and sagittal jaw relationship in children with CLP. RESULTS Based on the predefined search strings with accompanying database keywords, a total of 44 articles were found in Scopus, PubMed, and Web of Science search results. After reading the full articles, 12 papers were included for systematic analysis. CONCLUSIONS Artificial intelligence provides an advanced technology that can be employed in AI-enabled computerized programming software for accurate landmark detection, rapid digital cephalometric analysis, clinical decision-making, and treatment prediction. In children with corrected unilateral cleft lip and palate, ML can help detect cephalometric predictors of future need for orthognathic surgery.
Affiliation(s)
- Mohamed Zahoor Ul Huqh: Orthodontic Unit, School of Dental Sciences, Health Campus, Universiti Sains Malaysia, Kubang Kerian, Kota Bharu 16150, Malaysia
- Johari Yap Abdullah (corresponding author): Craniofacial Imaging Lab, School of Dental Sciences, Health Campus, Universiti Sains Malaysia, Kubang Kerian, Kota Bharu 16150, Malaysia
- Ling Shing Wong (corresponding author): Faculty of Health and Life Sciences, INTI International University, Nilai 71800, Malaysia
- Nafij Bin Jamayet: Division of Clinical Dentistry (Prosthodontics), School of Dentistry, International Medical University, Bukit Jalil, Kuala Lumpur 57000, Malaysia
- Mohammad Khursheed Alam: Orthodontic Division, Preventive Dentistry Department, College of Dentistry, Jouf University, Sakaka 72345, Saudi Arabia
- Qazi Farah Rashid: Prosthodontic Unit, School of Dental Sciences, Health Campus, Universiti Sains Malaysia, Kubang Kerian, Kota Bharu 16150, Malaysia
- Adam Husein: Prosthodontic Unit, School of Dental Sciences, Health Campus, Universiti Sains Malaysia, Kubang Kerian, Kota Bharu 16150, Malaysia
- Wan Muhamad Amir W. Ahmad: Department of Biostatistics, School of Dental Sciences, Health Campus, Universiti Sains Malaysia, Kubang Kerian, Kota Bharu 16150, Malaysia
- Sumaiya Zabin Eusufzai: Department of Biostatistics, School of Dental Sciences, Health Campus, Universiti Sains Malaysia, Kubang Kerian, Kota Bharu 16150, Malaysia
- Somasundaram Prasadh: National Dental Center Singapore, 5 Second Hospital Avenue, Singapore 168938, Singapore
- Mahendran Sekar: Department of Pharmaceutical Chemistry, Faculty of Pharmacy and Health Sciences, Royal College of Medicine Perak, Universiti Kuala Lumpur, Ipoh 30450, Malaysia
- Siddharthan Selvaraj (corresponding author): Faculty of Dentistry, AIMST University, Bedong 08100, Malaysia
4. Young K, Sweeney T, Vos RR, Mehendale F, Daffern H. Evaluation of noise excitation as a method for detection of hypernasality. Applied Acoustics. Acoustique Applique. Angewandte Akustik 2022; 190:108639. [PMID: 35300323 PMCID: PMC8872831 DOI: 10.1016/j.apacoust.2022.108639] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2020] [Revised: 01/07/2022] [Accepted: 01/11/2022] [Indexed: 06/14/2023]
Abstract
Hypernasality is a disorder where excess nasal resonance is perceived during speech, often as a result of abnormal coupling between the oral and nasal tracts known as velopharyngeal insufficiency (VPI). The most common cause of VPI is a cleft palate, which affects around 1 in 1650 babies, around ⅓ of whom have persistent speech problems after surgery. Current equipment-based assessment methods are invasive and require expert knowledge, and perceptual assessment methods are limited by the availability of expert listeners and differing interpretations of assessment scales. Spectral analysis of hypernasality within the academic community has resulted in potentially useful spectral indicators, but these are highly variable, vowel specific, and not commonly used within clinical practice. Previous works by others have developed noise excitation technologies for the measurement of oral tract transfer functions using resonance measurement devices (RMD). These techniques provide an opportunity to investigate the structural system abnormalities which lead to hypernasality, without the need for invasive measurement equipment. Thus, the work presented in this study adapts these techniques for the detection of hypernasality. These adaptations include augmentation of the hardware and development of the software, so as to be suitable for transfer function measurement at the nostrils rather than the mouth (nRMD). The new method was tested with a single participant trained in hypernasal production, producing 'normal' and hypernasal vowels, and the recordings validated through a listening test by an expert listener and calculation of nasalance values using a nasality microphone. These validation stages indicated the reliability of the captured data, and analysis of the nRMD measurements indicated the presence of a systematic difference in the frequency range 2 to 2.5 kHz between normal and hypernasal speech. 
Further investigation is warranted to determine the generalisability of these findings across speakers, and to investigate the origins of differences manifesting in the transfer functions between conditions. This will provide new insights into the effects of nasal tract coupling on voice acoustics, which could in turn lead to the development of useful new tools to support clinicians in their work with hypernasality.
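Nasalance, used above as a validation measure, is conventionally the nasal acoustic level expressed as a percentage of the combined nasal and oral level. The sketch below is a minimal illustration, assuming two time-aligned sample lists from separate nasal and oral microphones and using RMS amplitude as the level. The function names and toy signals are hypothetical, and the study's own computation may differ.

```python
from math import pi, sin, sqrt


def rms(samples: list[float]) -> float:
    """Root-mean-square amplitude of a block of samples."""
    return sqrt(sum(s * s for s in samples) / len(samples))


def nasalance_percent(nasal: list[float], oral: list[float]) -> float:
    """Nasal level as a percentage of the combined nasal and oral level."""
    nasal_level, oral_level = rms(nasal), rms(oral)
    total = nasal_level + oral_level
    if total == 0:
        raise ValueError("both channels are silent")
    return 100 * nasal_level / total


# Toy signals (assumption): a 200 Hz tone, four times weaker in the nasal channel.
times = [i / 8000 for i in range(800)]
oral_channel = [1.0 * sin(2 * pi * 200 * t) for t in times]
nasal_channel = [0.25 * sin(2 * pi * 200 * t) for t in times]

score = nasalance_percent(nasal_channel, oral_channel)
print(f"nasalance: {score:.1f}%")
```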
Affiliation(s)
- Kat Young: AudioLab, Department of Electronic Engineering, University of York, UK
- Rebecca R. Vos: Speech and Audio Processing, Department of Electrical and Electronic Engineering, Imperial College London, UK
- Felicity Mehendale: Global Cleft Lip and Palate Research Programme, Global Health Research Centre, Usher Institute, University of Edinburgh, UK
- Helena Daffern: AudioLab, Department of Electronic Engineering, University of York, UK
5. Girish K, Pushpavathi M, Abraham A, Vikram C. Automatic speech processing software – New sensitive tool for the assessment of nasality: A preliminary study. Journal of Cleft Lip Palate and Craniofacial Anomalies 2022. [DOI: 10.4103/jclpca.jclpca_22_21] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/04/2022]
6. Mathad VC, Scherer N, Chapman K, Liss JM, Berisha V. A Deep Learning Algorithm for Objective Assessment of Hypernasality in Children With Cleft Palate. IEEE Trans Biomed Eng 2021; 68:2986-2996. [PMID: 33566756 PMCID: PMC9023650 DOI: 10.1109/tbme.2021.3058424] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/06/2022]
Abstract
OBJECTIVES Evaluation of hypernasality requires extensive perceptual training by clinicians and extending this training on a large scale internationally is untenable; this compounds the health disparities that already exist among children with cleft. In this work, we present the objective hypernasality measure (OHM), a speech-based algorithm that automatically measures hypernasality in speech, and validate it relative to a group of trained clinicians. METHODS We trained a deep neural network (DNN) on approximately 100 hours of a publicly available healthy speech corpus to detect the presence of nasal acoustic cues generated through the production of nasal consonants and nasalized phonemes in speech. Importantly, this model does not require any clinical data for training. The posterior probabilities of the deep learning model were aggregated at the sentence and speaker levels to compute the OHM. RESULTS The results showed that the OHM was significantly correlated with perceptual hypernasality ratings from the Americleft database (r = 0.797, p < 0.001) and the New Mexico Cleft Palate Center (NMCPC) database (r = 0.713, p < 0.001). In addition, we evaluated the relationship between the OHM and articulation errors; the sensitivity of the OHM in detecting the presence of very mild hypernasality; and established the internal reliability of the metric. Further, the performance of the OHM was compared with a DNN regression algorithm directly trained on the hypernasal speech samples. SIGNIFICANCE The results indicate that the OHM is able to measure the severity of hypernasality on par with Americleft-trained clinicians on this dataset.
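The aggregation and validation steps described above can be illustrated with a small sketch. The abstract does not state how the posterior probabilities are combined, so simple averaging is assumed here, and every name and number below is a hypothetical toy value rather than data from the study. The final step computes a Pearson correlation of the kind reported against perceptual ratings.

```python
from math import sqrt
from statistics import mean


def sentence_score(frame_posteriors: list[float]) -> float:
    """Average the per-frame nasality posteriors of one sentence (assumed aggregation)."""
    return mean(frame_posteriors)


def speaker_score(sentences: list[list[float]]) -> float:
    """Average the sentence-level scores of one speaker (assumed aggregation)."""
    return mean(sentence_score(s) for s in sentences)


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


# Hypothetical per-frame posteriors for three speakers, two sentences each.
speakers = {
    "spk_a": [[0.1, 0.2, 0.1], [0.2, 0.2]],
    "spk_b": [[0.4, 0.5, 0.6], [0.5, 0.5]],
    "spk_c": [[0.8, 0.9, 0.7], [0.9, 0.7]],
}
# Hypothetical perceptual hypernasality ratings for the same speakers.
ratings = {"spk_a": 0, "spk_b": 1, "spk_c": 3}

scores = {name: speaker_score(sents) for name, sents in speakers.items()}
r = pearson_r([scores[n] for n in speakers], [ratings[n] for n in speakers])
print(scores, f"r = {r:.3f}")
```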
7. Dhillon H, Chaudhari PK, Dhingra K, Kuo RF, Sokhi RK, Alam MK, Ahmad S. Current Applications of Artificial Intelligence in Cleft Care: A Scoping Review. Front Med (Lausanne) 2021; 8:676490. [PMID: 34395471 PMCID: PMC8355556 DOI: 10.3389/fmed.2021.676490] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 03/20/2021] [Accepted: 06/30/2021] [Indexed: 01/30/2023]
Abstract
Objective: This scoping review aims to identify the various areas and current status of the application of artificial intelligence (AI) for aiding individuals with cleft lip and/or palate. Introduction: Cleft lip and/or palate contributes significantly toward the global burden on the healthcare system. Artificial intelligence is a technology that can help individuals with cleft lip and/or palate, especially those in areas with limited access to receive adequate care. Inclusion Criteria: Studies that used artificial intelligence to aid the diagnosis, treatment, or its planning in individuals with cleft lip and/or palate were included. Methodology: A search of the PubMed, Embase, and IEEE Xplore databases was conducted using the search terms artificial intelligence and cleft lip and/or palate. Gray literature was searched using Google Scholar. The study was conducted according to the PRISMA-ScR guidelines. Results: The initial search identified 458 results, which were screened based on title and abstracts. After the screening, removal of duplicates, and a full-text reading of selected articles, 26 publications were included. They explored the use of AI in cleft lip and/or palate to aid in decisions regarding diagnosis, treatment, especially speech therapy, and prediction. Conclusion: There is active interest and immense potential for the use of artificial intelligence in cleft lip and/or palate. Most studies currently focus on speech in cleft palate. Multi-center studies that include different populations, with collaboration amongst academicians and researchers, can further develop the technology.
Affiliation(s)
- Harnoor Dhillon: Centre for Dental Education and Research, All India Institute of Medical Sciences, New Delhi, India
- Prabhat Kumar Chaudhari: Centre for Dental Education and Research, All India Institute of Medical Sciences, New Delhi, India
- Kunaal Dhingra: Centre for Dental Education and Research, All India Institute of Medical Sciences, New Delhi, India
- Rong-Fu Kuo: Medical Device Innovation Centre, National Cheng Kung University, Tainan, Taiwan
- Ramandeep Kaur Sokhi: Centre for Dental Education and Research, All India Institute of Medical Sciences, New Delhi, India
- Shandar Ahmad: School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi, India
8. Ozkan HB, Kulak Kayikci ME, Gunaydin RO, Ozgur FF. Comparing the Temporal Aspects of Velopharyngeal Closure in Children with and without Cleft Palate. Folia Phoniatr Logop 2021; 74:153-166. [PMID: 34274924 DOI: 10.1159/000517296] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2020] [Accepted: 05/18/2021] [Indexed: 11/19/2022]
Abstract
INTRODUCTION Children with cleft palate exhibit differences in the 4 temporal components of nasalization (nasal onset and offset intervals, nasal consonant duration, and total speech duration), with various patterns having been noted based on different languages. Thus, the current study aimed to examine the temporal aspects of velopharyngeal closure in children with and without cleft palate; this is the first study to do so in the Turkish language. METHODS This study evaluated and compared the 4 temporal characteristics of velopharyngeal closure in children (aged 6-10 years) with (n = 28) and without (n = 28) cleft palate using nonword consonant and vowel speech samples, including the bilabial nasal-to-stop combination /mp/ and the velar nasal-to-stop combination /ŋk/. Acoustic data were recorded using a nasometer, after which acoustic waveforms were examined to determine the 4 temporal components of nasalization. Flexible nasoendoscopy was then used to evaluate velopharyngeal closure patterns. RESULTS With regard to the 4 closure patterns, significant differences in the nasal offset interval (F4-25 = 10.213, p = 0.04; p < 0.05) and the nasal consonant duration ratio (F4-25 = 12.987, p = 0.02; p < 0.05) were observed for only /ampa/. The coronal closure pattern showed the longest closure duration (0.74 s). Children with cleft palate showed prolonged temporal parameters in all 4 characteristics, reflecting oral-nasal resonance imbalances. In particular, the low vowel sound /a/ was significantly more prolonged than the high vowel sounds /i/ and /u/. CONCLUSIONS The examined temporal parameters offer more accurate characterizations of velopharyngeal closure, thereby allowing more accurate clinical assessments and more appropriate treatment procedures. Children with cleft palate showed longer nasalization durations than those without. Thus, the degree of hypernasality in children with cleft palate may affect the temporal aspects of nasalization.
Affiliation(s)
- Hilal Burcu Ozkan: Department of Audiology, Faculty of Health Sciences, Hacettepe University, Ankara, Turkey
- Mavis Emel Kulak Kayikci: Department of Speech and Language Therapy, Faculty of Health Sciences, Hacettepe University, Ankara, Turkey
- Riza Onder Gunaydin: Department of Otorhinolaryngology, Faculty of Medicine, Hacettepe University, Ankara, Turkey
- Fatma Figen Ozgur: Department of Plastic Reconstructive and Aesthetic Surgery, Faculty of Medicine, Hacettepe University, Ankara, Turkey
9. A Role for Artificial Intelligence in the Classification of Craniofacial Anomalies. J Craniofac Surg 2021; 32:967-969. [PMID: 33405463 DOI: 10.1097/scs.0000000000007369] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 11/26/2022]
Abstract
Development of an objective algorithm to diagnose and assess craniofacial conditions has the potential to facilitate early diagnosis, especially for care providers with limited craniofacial expertise. Deep learning, a branch of artificial intelligence, can automatically analyze and categorize disease without human assistance. Convolutional neural networks (CNN) have excelled in utilizing medical images to automatically classify disease. In this study, the authors developed CNN models to detect and classify non-syndromic craniosynostosis (CS) using 2D images. The authors created an annotated data set of labeled CS (normal, metopic, sagittal, and unicoronal) conditions using standard clinical photography from the image repository at their center. The authors extended this data set by adding photographic images of children with craniofacial conditions from the internet. A total of 1076 images were used in this study. The authors developed a CNN model using a pre-trained ResNet-50 model to classify the data as metopic, sagittal, and unicoronal. The CS ResNet-50 model achieved an overall testing accuracy of 90.6%. The sensitivity and precision were: 100% and 100% for metopic, 93.3% and 100% for sagittal, and 66.7% and 100% for unicoronal, respectively. The CNN model performed with promising accuracy. These results support the idea that deep learning has a role in diagnosis of craniofacial conditions. Using standard 2D clinical photography, such systems can provide automated screening and detection of these conditions. In the future, ML may be applied to prediction and assessment of surgical outcomes, or as an open-source remote diagnostic resource.
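Per-class sensitivity and precision of this kind are read off a confusion matrix: sensitivity divides the diagonal entry by its row total (true class), and precision divides it by its column total (predicted class). The sketch below uses a hypothetical confusion matrix chosen only to mirror the pattern of the reported per-class figures. The counts are not from the study and do not reproduce its overall accuracy.

```python
LABELS = ["normal", "metopic", "sagittal", "unicoronal"]

# Rows are true classes, columns are predicted classes (hypothetical counts).
CONFUSION = [
    [10, 0, 0, 0],   # normal
    [0, 4, 0, 0],    # metopic
    [1, 0, 14, 0],   # sagittal: one image missed
    [1, 0, 0, 2],    # unicoronal: one image missed
]


def per_class_metrics(matrix: list[list[int]], labels: list[str]) -> dict[str, dict[str, float]]:
    """Sensitivity (recall) and precision for every class of a square confusion matrix."""
    metrics = {}
    for i, label in enumerate(labels):
        true_pos = matrix[i][i]
        actual = sum(matrix[i])                     # row total: images truly in this class
        predicted = sum(row[i] for row in matrix)   # column total: images assigned to it
        metrics[label] = {
            "sensitivity": true_pos / actual if actual else float("nan"),
            "precision": true_pos / predicted if predicted else float("nan"),
        }
    return metrics


def accuracy(matrix: list[list[int]]) -> float:
    """Fraction of all images on the diagonal."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / sum(sum(row) for row in matrix)


results = per_class_metrics(CONFUSION, LABELS)
for label, m in results.items():
    print(f"{label}: sensitivity {m['sensitivity']:.1%}, precision {m['precision']:.1%}")
print(f"overall accuracy {accuracy(CONFUSION):.1%}")
```

In this toy matrix a class can keep 100% precision while its sensitivity drops, because the missed images are assigned to a different class and lower that class's precision instead.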
10. Li P, Jiang S. Analysis of the characteristics of English part of speech based on unsupervised machine learning and image recognition model. Journal of Intelligent & Fuzzy Systems 2020. [DOI: 10.3233/jifs-179960] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/15/2022]
Abstract
If there are more external interference factors in the process of intelligent recognition in English, the recognition accuracy will be greatly reduced. It is of great academic value and application significance to deeply study feature recognition of English part-of-speech and realize automatic image processing of English recognition. Based on unsupervised machine learning and image recognition technology, this study combines the actual factors of English recognition to set the corresponding influencing factors and proposes a reliable method to identify multi-body rotating characters. This method utilizes the principle of the periodic characteristics of the trajectory rotation on the feature space. Moreover, this study conducts a comparative analysis of recognition accuracy by comparative experiments. In addition, this paper analyzes the recognition principles of 4 fonts in detail. The research results show that the proposed method has certain effects and can provide theoretical reference for subsequent related research.
Affiliation(s)
- Pengpeng Li: Cangzhou Normal University, Cangzhou, Hebei, China
- Shuai Jiang: Cangzhou Normal University, Cangzhou, Hebei, China
11. Saxon M, Tripathi A, Jiao Y, Liss J, Berisha V. Robust Estimation of Hypernasality in Dysarthria with Acoustic Model Likelihood Features. IEEE/ACM Transactions on Audio, Speech, and Language Processing 2020; 28:2511-2522. [PMID: 33748328 PMCID: PMC7978228 DOI: 10.1109/taslp.2020.3015035] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 06/12/2023]
Abstract
Hypernasality is a common characteristic symptom across many motor-speech disorders. For voiced sounds, hypernasality introduces an additional resonance in the lower frequencies and, for unvoiced sounds, there is reduced articulatory precision due to air escaping through the nasal cavity. However, the acoustic manifestation of these symptoms is highly variable, making hypernasality estimation very challenging, both for human specialists and automated systems. Previous work in this area relies on either engineered features based on statistical signal processing or machine learning models trained on clinical ratings. Engineered features often fail to capture the complex acoustic patterns associated with hypernasality, whereas metrics based on machine learning are prone to overfitting to the small disease-specific speech datasets on which they are trained. Here we propose a new set of acoustic features that capture these complementary dimensions. The features are based on two acoustic models trained on a large corpus of healthy speech. The first acoustic model aims to measure nasal resonance from voiced sounds, whereas the second acoustic model aims to measure articulatory imprecision from unvoiced sounds. To demonstrate that the features derived from these acoustic models are specific to hypernasal speech, we evaluate them across different dysarthria corpora. Our results show that the features generalize even when training on hypernasal speech from one disease and evaluating on hypernasal speech from another disease (e.g., training on Parkinson's disease, evaluation on Huntington's disease), and when training on neurologically disordered speech but evaluating on cleft palate speech.
Affiliation(s)
- Michael Saxon: Arizona State Univ., Sch. of Elect., Comput., & Energy Eng., Tempe, Arizona, USA
- Ayush Tripathi: Arizona State Univ., Sch. of Elect., Comput., & Energy Eng., Tempe, Arizona, USA
- Yishan Jiao: Arizona State Univ., Sch. of Elect., Comput., & Energy Eng., Tempe, Arizona, USA
- Julie Liss: Arizona State Univ., Sch. of Elect., Comput., & Energy Eng., Tempe, Arizona, USA
- Visar Berisha: Arizona State Univ., Sch. of Elect., Comput., & Energy Eng., Tempe, Arizona, USA
12. Zhang J, Yang S, Wang X, Tang M, Yin H, He L. Automatic hypernasality grade assessment in cleft palate speech based on the spectral envelope method. Biomed Tech (Berl) 2020; 65:73-86. [PMID: 31525154 DOI: 10.1515/bmt-2018-0181] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 09/16/2018] [Accepted: 05/07/2019] [Indexed: 02/05/2023]
Abstract
Due to velopharyngeal incompetence, airflow overflows from the oral cavity to the nasal cavity, which results in hypernasality. Hypernasality greatly reduces speech intelligibility and affects the daily communication of patients with cleft palate. Accurate assessment of hypernasality grades can provide assisted diagnosis for speech-language pathologists (SLPs) in clinical settings. Utilizing a support vector machine (SVM), this paper classifies speech recordings into four grades (normal, mild, moderate and severe hypernasality) based on vocal tract characteristics. Linear prediction (LP) analysis is widely used to model the vocal tract. Glottal source information may be included in the LP-based spectrum. The stabilized weighted linear prediction (SWLP) method, which imposes the temporal weights on the closed-phase interval of the glottal cycle, is a more robust approach for modeling the vocal tract. The extended weighted linear prediction (XLP) method weights each lagged speech signal separately, which achieves a finer time scale on the spectral envelope than the SWLP method. Tested speech recordings were collected from 60 subjects with cleft palate and 20 control subjects, and included a total of 4640 Mandarin syllables. The experimental results showed that the spectral envelope of normal speech decreases faster than that of hypernasal speech in the high-frequency part. The experimental results also indicate that the SWLP- and XLP-based methods have smaller correlation coefficients between normal and hypernasal speech than the LP method. Thus, the SWLP and XLP methods have better ability to distinguish hypernasal from normal speech than the LP method. The classification accuracies of the four hypernasality grades using the SWLP and XLP methods range from 83.86% to 97.47%. The selection of the model order and the size of the weight function are also discussed in this paper.
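Conventional LP analysis, the baseline the abstract compares against, fits an all-pole model by solving the autocorrelation normal equations, commonly with the Levinson-Durbin recursion. The sketch below shows plain LP only, on a toy signal. It does not implement the SWLP or XLP weighting, and the function names are illustrative rather than taken from the paper.

```python
def autocorrelation(x: list[float], max_lag: int) -> list[float]:
    """Autocorrelation r[0..max_lag] of a finite signal (autocorrelation method)."""
    n = len(x)
    return [sum(x[t] * x[t + lag] for t in range(n - lag)) for lag in range(max_lag + 1)]


def levinson_durbin(r: list[float], order: int) -> tuple[list[float], float]:
    """Solve the LP normal equations.

    Returns predictor coefficients a[1..order], with the convention
    x_hat[n] = sum_k a[k] * x[n - k], and the final prediction error energy.
    """
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for stage i.
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        err *= 1 - k * k
    return a[1:], err


def lpc(x: list[float], order: int) -> tuple[list[float], float]:
    """LP coefficients and residual energy of a signal frame."""
    return levinson_durbin(autocorrelation(x, order), order)


# Toy frame (assumption): a decaying exponential, i.e. a single real pole at 0.5.
frame = [0.5**n for n in range(50)]
coeffs, residual_energy = lpc(frame, 2)
print(coeffs, residual_energy)
```

For this one-pole toy frame the first coefficient recovers the pole and the second is essentially zero; real speech frames would normally be windowed first and analysed at a higher model order.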
Affiliation(s)
- Jing Zhang: College of Electrical Engineering, Sichuan University, Chengdu 610065, China
- Sen Yang: College of Electrical Engineering, Sichuan University, Chengdu 610065, China
- Xiyue Wang: College of Electrical Engineering, Sichuan University, Chengdu 610065, China
- Ming Tang: College of Electrical Engineering, Sichuan University, Chengdu 610065, China
- Heng Yin: West China Hospital of Stomatology, Sichuan University, Chengdu 610041, China
- Ling He: College of Electrical Engineering, Sichuan University, Chengdu 610065, China
13
Dubey AK, Prasanna SRM, Dandapat S. Detection and assessment of hypernasality in repaired cleft palate speech using vocal tract and residual features. J Acoust Soc Am 2019; 146:4211. [PMID: 31893680 DOI: 10.1121/1.5134433] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 05/07/2023]
Abstract
The presence of hypernasality in repaired cleft palate (CP) speech is a consequence of velopharyngeal insufficiency. The coupling of the nasal tract with the oral tract adds nasal formant and antiformant pairs to the hypernasal speech spectrum. As a result, the spectral and linear prediction (LP) residual characteristics of hypernasal speech deviate from those of normal speech. In this work, the vocal tract constriction feature, the peak to side-lobe ratio feature, and spectral moment features augmented by low-order cepstral coefficients are used to capture the spectral and residual deviations for hypernasality detection. The first feature captures the low-frequency prominence in speech due to the presence of nasal formants, the second feature captures the undesirable signal components in the residual signal due to the nasal antiformants, and the third feature captures information about formants and antiformants in the spectrum along with the spectral envelope. The combination of the three features gives normal versus hypernasal speech detection accuracies of 87.76%, 91.13%, and 93.70% for the /a/, /i/, and /u/ vowels, respectively, and hypernasality severity detection accuracies of 80.13% and 81.25% for the /i/ and /u/ vowels, respectively. The speech data were collected from 30 control children and 30 children with repaired CP between the ages of 7 and 12.
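To make the idea of spectral moment features concrete, the sketch below treats the normalized power spectrum of a frame as a distribution over frequency and computes its first four moments (centroid, spread, skewness, kurtosis). This is a generic textbook formulation, assumed here for illustration; the abstract does not specify the exact definition the authors used. The two-tone test frame and the function names are likewise hypothetical.

```python
import cmath
import math


def power_spectrum(frame):
    """Naive DFT power for bins 0..N/2 (adequate for short frames)."""
    n = len(frame)
    powers = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame))
        powers.append(abs(coeff) ** 2)
    return powers


def spectral_moments(frame, sample_rate):
    """Centroid (Hz), spread (Hz), skewness and kurtosis of the spectrum."""
    powers = power_spectrum(frame)
    total = sum(powers)
    freqs = [k * sample_rate / len(frame) for k in range(len(powers))]
    weights = [p / total for p in powers]  # spectrum as a distribution
    centroid = sum(f * w for f, w in zip(freqs, weights))
    variance = sum((f - centroid) ** 2 * w for f, w in zip(freqs, weights))
    spread = math.sqrt(variance)
    skewness = sum((f - centroid) ** 3 * w
                   for f, w in zip(freqs, weights)) / spread ** 3
    kurtosis = sum((f - centroid) ** 4 * w
                   for f, w in zip(freqs, weights)) / variance ** 2
    return centroid, spread, skewness, kurtosis


# Illustrative frame: two equal-amplitude tones at 500 Hz and 1500 Hz.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 500 * t / sample_rate)
         + math.sin(2 * math.pi * 1500 * t / sample_rate)
         for t in range(256)]
centroid, spread, skewness, kurtosis = spectral_moments(frame, sample_rate)
```

For this symmetric two-tone frame the centroid sits midway between the tones. Extra low-frequency energy, such as a nasal formant, would pull the centroid down and change the higher moments.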
Affiliation(s)
- Akhilesh Kumar Dubey: Department of Electronics and Electrical Engineering, Indian Institute of Technology Guwahati, Guwahati, Assam 781039, India
- S R Mahadeva Prasanna: Department of Electronics and Electrical Engineering, Indian Institute of Technology Guwahati, Guwahati, Assam 781039, India
- S Dandapat: Department of Electronics and Electrical Engineering, Indian Institute of Technology Guwahati, Guwahati, Assam 781039, India
14
Wang X, Yang S, Tang M, Yin H, Huang H, He L. HypernasalityNet: Deep recurrent neural network for automatic hypernasality detection. Int J Med Inform 2019; 129:1-12. [PMID: 31445242 DOI: 10.1016/j.ijmedinf.2019.05.023] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 09/12/2018] [Revised: 04/03/2019] [Accepted: 05/22/2019] [Indexed: 10/26/2022]
Abstract
BACKGROUND Patients with cleft palate are unable to produce adequate velopharyngeal closure, which results in hypernasal speech. In the clinic, hypernasal speech is assessed subjectively by speech-language pathologists. Automatic hypernasal speech detection can aid diagnosis by speech-language pathologists and clinicians. OBJECTIVES This study aims to develop a Long Short-Term Memory (LSTM) based Deep Recurrent Neural Network (DRNN) system to detect hypernasal speech in patients with cleft palate, and thereby to aid diagnosis for clinical operation and speech therapy. The feature mining and classification abilities of the LSTM-DRNN system are also explored. METHODS The speech recordings comprise 14,544 Mandarin vowels. Speech data were collected from 144 children (72 children with hypernasality and 72 controls) aged 5-12 years. This work proposes an LSTM-based DRNN system for automatic hypernasal speech detection, since an LSTM-DRNN can learn short-time dependences of hypernasal speech. Vocal tract based features are fed into the LSTM-DRNN to achieve deep mining of features. To verify the feature mining ability of the LSTM-DRNN, features projected by the LSTM-DRNN are fed into shallow classifiers in place of the two fully connected layers and softmax layer that follow it. For comparison, features that have not been projected by the LSTM-DRNN are fed directly into the shallow classifiers. Hypernasality-sensitive vowels (/a/, /i/, and /u/) are analyzed for the first time. RESULTS The LSTM-DRNN based hypernasal speech detection method reaches higher detection accuracy than shallow classifiers, since the LSTM-DRNN mines features along the time axis and network depth simultaneously. The proposed LSTM-DRNN based hypernasality detection system reaches a highest accuracy of 93.35%. The analysis of hypernasality-sensitive vowels indicates that /i/ and /u/ are the vowels most sensitive to hypernasality. CONCLUSIONS The results show that the LSTM-DRNN has robust feature mining and classification ability. This is the first work that applies the LSTM-DRNN technique to automatically detect hypernasality in cleft palate speech. The experimental results demonstrate the potential of deep learning for pathological speech detection.
Affiliation(s)
- Xiyue Wang: College of Electrical Engineering and Information Technology, Sichuan University, 610065, China.
- Sen Yang: College of Electrical Engineering and Information Technology, Sichuan University, 610065, China.
- Ming Tang: College of Electrical Engineering and Information Technology, Sichuan University, 610065, China.
- Heng Yin: Hospital of Stomatology, Sichuan University, 610065, China.
- Hua Huang: College of Electrical Engineering and Information Technology, Sichuan University, 610065, China.
- Ling He: College of Electrical Engineering and Information Technology, Sichuan University, 610065, China.
15
Saxon M, Liss J, Berisha V. Objective measures of plosive nasalization in hypernasal speech. Proceedings of the ... IEEE International Conference on Acoustics, Speech, and Signal Processing. ICASSP (Conference) 2019; 2019:6520-6524. [PMID: 31929763 DOI: 10.1109/icassp.2019.8682339] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 11/06/2022]
Abstract
Hypernasal speech is a common symptom across several neurological disorders; however, it has a variable acoustic signature, making it difficult to quantify acoustically or perceptually. In this paper, we propose the nasal cognate distinctiveness features as an objective proxy for hypernasal speech. Our method is motivated by the observation that incomplete velopharyngeal closure changes the acoustics of the resultant speech such that the alveolar stops /t/ and /d/ map to the alveolar nasal /n/ and the bilabial stops /b/ and /p/ map to the bilabial nasal /m/. We propose a new family of features based on likelihood ratios between the plosives and their respective nasal cognates. These features are based on an acoustic model that is trained only on healthy speech, and are evaluated on a set of 75 speakers diagnosed with different dysarthria subtypes and exhibiting varying levels of hypernasality. Our results show that the family of features compares favorably with the clinical perception of speech-language pathologists subjectively evaluating hypernasality.
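The likelihood-ratio idea in this abstract can be sketched with a toy model. The example below scores frames of an intended plosive against a plosive model and its nasal cognate model, and averages the log-likelihood ratio; lower scores mean the frames look more like the nasal. The diagonal Gaussian models, their two-dimensional parameters, and the sample frames are all hypothetical stand-ins, not the acoustic model used by the authors.

```python
import math


def diag_gauss_loglik(x, mean, var):
    """Log-likelihood of vector x under a diagonal-covariance Gaussian."""
    return sum(-0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def mean_log_likelihood_ratio(frames, plosive_model, nasal_model):
    """Average of log p(frame | plosive) - log p(frame | nasal cognate)."""
    total = 0.0
    for frame in frames:
        total += (diag_gauss_loglik(frame, *plosive_model)
                  - diag_gauss_loglik(frame, *nasal_model))
    return total / len(frames)


# Hypothetical (mean, variance) models for /b/ and its nasal cognate /m/,
# standing in for models trained on healthy speech.
plosive_b = ([1.0, -0.5], [0.5, 0.5])
nasal_m = ([-1.0, 0.8], [0.5, 0.5])

# Made-up feature frames for an intended /b/.
clear_frames = [[0.9, -0.4], [1.2, -0.7], [0.8, -0.3]]
nasalized_frames = [[-0.8, 0.6], [-1.1, 0.9], [-0.7, 0.5]]

clear_score = mean_log_likelihood_ratio(clear_frames, plosive_b, nasal_m)
nasalized_score = mean_log_likelihood_ratio(nasalized_frames, plosive_b, nasal_m)
```

A positive score indicates that the frames are better explained by the plosive model, while a negative score indicates drift toward the nasal cognate, which is the direction the abstract associates with incomplete velopharyngeal closure.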
Affiliation(s)
- Michael Saxon: School of Electrical, Computer, and Energy Engineering, Arizona State University
- Julie Liss: Department of Speech and Hearing Sciences, Arizona State University
- Visar Berisha: School of Electrical, Computer, and Energy Engineering, Arizona State University; Department of Speech and Hearing Sciences, Arizona State University
16
Salman HE, Yazicioglu Y. Flow-induced vibration analysis of constricted artery models with surrounding soft tissue. J Acoust Soc Am 2017; 142:1913. [PMID: 29092565 DOI: 10.1121/1.5005622] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 06/07/2023]
Abstract
Arterial stenosis is a vascular pathology which leads to serious cardiovascular diseases. Blood flow through a constriction generates sound and vibration due to fluctuating turbulent pressures. The generated vibro-acoustic waves propagate through surrounding soft tissues, reach the skin surface, and may provide valuable insight for noninvasive diagnostic purposes. Motivated by these phenomena, the vibration of constricted arteries is investigated using computational models. The flow-induced pressure field in an artery is modeled as broadband harmonic pressure loading, based on previous studies in the literature, and applied to the inner artery wall. Harmonic analysis is performed to determine radial velocity responses on the outer surface of the models. Results indicate that stenosis severities higher than 70% lead to a significant increase in response amplitudes, especially at high frequencies between 250 and 600 Hz. The findings agree well with experimental and theoretical results in the literature with respect to bending mode frequencies, amplitude scales, and the mainly excited frequency ranges. Artery vibration is sensitive to the phase behavior of the pressure loading, but this effect becomes less significant in the presence of surrounding tissue. As the surrounding tissue thickness increases, radial velocity response amplitudes decrease, but the effect of changes in tissue elastic modulus is more pronounced.
Affiliation(s)
- Huseyin Enes Salman: Department of Mechanical Engineering, Middle East Technical University, Dumlupinar Street Number 1, 06800, Ankara, Turkey
- Yigit Yazicioglu: Department of Mechanical Engineering, Middle East Technical University, Dumlupinar Street Number 1, 06800, Ankara, Turkey