1
Lou Q, Wang X, Chen Y, Wang G, Jiang L, Liu Q. Subjective and Objective Evaluation of Speech in Adult Patients With Repaired Cleft Palate. J Craniofac Surg 2023; 34:e551-e556. [PMID: 36949035] [DOI: 10.1097/scs.0000000000009301]
Abstract
OBJECTIVE To explore the speech outcomes of adult patients with repaired cleft palate through subjective perceptual evaluation and objective acoustic analysis, and to compare the pronunciation characteristics of speakers with complete velopharyngeal closure (VPC) and speakers with velopharyngeal insufficiency (VPI). PARTICIPANTS AND INTERVENTION Subjective evaluation indicators included speech intelligibility, nasality, and consonant missing rate. For objective acoustic analysis, speech samples were normalized, and the acoustic parameters included normalized vowel formants, voice onset time, and the analysis of 3-dimensional spectrogram and spectrum. Analyses were carried out on speech samples produced by 4 groups of speakers: (a) speakers with velopharyngeal competence after palatorrhaphy (n=38); (b) speakers with velopharyngeal incompetence after palatorrhaphy (n=70); (c) adult patients with cleft palate (n=65); and (d) typical speakers (n=30). RESULTS There was a strong negative correlation between VPC grade and speech intelligibility (ρ=-0.933) and a strong positive correlation between VPC grade and nasality (ρ=0.813). In the subjective evaluation, the speech level of VPI patients was significantly lower than that of VPC patients and normal adults. Although the nasality and consonant missing rate of VPC patients were significantly higher than those of normal adults, the speech intelligibility of VPC patients did not differ significantly from that of normal adults. In the acoustic analysis, patients with VPI still performed poorly compared with patients with VPC. CONCLUSIONS The speech function of adult cleft palate patients is affected by abnormal palatal structure and poor articulation habits. In the subjective evaluation, there was no significant difference in speech level between VPC patients and normal adults, whereas VPI patients differed significantly from normal adults. The acoustic parameters differed between the 2 groups after cleft palate repair. The state of velopharyngeal closure after cleft palate repair affects the patient's speech.
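As an aside for readers reproducing analyses of this kind: rank correlations such as the ρ values reported in this abstract are typically computed with Spearman's method. A minimal sketch in Python, using invented toy data (not the study's):

```python
from scipy.stats import spearmanr

# Invented toy data (NOT the study's): velopharyngeal closure (VPC)
# grade for eight speakers (higher grade = worse closure) and their
# speech intelligibility scores in percent.
vpc_grade = [1, 1, 2, 2, 3, 3, 4, 4]
intelligibility = [98, 95, 90, 88, 80, 75, 60, 55]

# Spearman's rho is the Pearson correlation of the ranks, so a
# strongly monotone decreasing relation yields rho close to -1.
rho, p_value = spearmanr(vpc_grade, intelligibility)
```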
Affiliation(s)
- Qun Lou
- Department of Oral and Maxillofacial Surgery, Shanghai Ninth People's Hospital, College of Stomatology, Shanghai Jiao Tong University School of Medicine, National Clinical Research Center for Oral Diseases, Shanghai Key Laboratory of Stomatology and Shanghai Research Institute of Stomatology, Manufacturing bureau road, Shanghai, China
2
Young K, Sweeney T, Vos RR, Mehendale F, Daffern H. Evaluation of noise excitation as a method for detection of hypernasality. Appl Acoust 2022; 190:108639. [PMID: 35300323] [PMCID: PMC8872831] [DOI: 10.1016/j.apacoust.2022.108639]
Abstract
Hypernasality is a disorder in which excess nasal resonance is perceived during speech, often as a result of abnormal coupling between the oral and nasal tracts known as velopharyngeal insufficiency (VPI). The most common cause of VPI is a cleft palate, which affects around 1 in 1650 babies, around one-third of whom have persistent speech problems after surgery. Current equipment-based assessment methods are invasive and require expert knowledge, and perceptual assessment methods are limited by the availability of expert listeners and by differing interpretations of assessment scales. Spectral analysis of hypernasality within the academic community has produced potentially useful spectral indicators, but these are highly variable, vowel-specific, and not commonly used in clinical practice. Previous work by others has developed noise-excitation technologies for the measurement of oral tract transfer functions using resonance measurement devices (RMD). These techniques provide an opportunity to investigate the structural abnormalities that lead to hypernasality, without the need for invasive measurement equipment. The work presented in this study therefore adapts these techniques for the detection of hypernasality, augmenting the hardware and developing the software so as to be suitable for transfer function measurement at the nostrils rather than the mouth (nRMD). The new method was tested with a single participant trained in hypernasal production, producing 'normal' and hypernasal vowels, and the recordings were validated through a listening test by an expert listener and by calculation of nasalance values using a nasality microphone. These validation stages indicated the reliability of the captured data, and analysis of the nRMD measurements indicated a systematic difference in the frequency range 2 to 2.5 kHz between normal and hypernasal speech.
Further investigation is warranted to determine the generalisability of these findings across speakers, and to investigate the origins of differences manifesting in the transfer functions between conditions. This will provide new insights into the effects of nasal tract coupling on voice acoustics, which could in turn lead to the development of useful new tools to support clinicians in their work with hypernasality.
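The nasalance validation step mentioned in this abstract relies on the standard nasalance score: nasal acoustic energy as a percentage of nasal-plus-oral energy. A minimal sketch, assuming simultaneously recorded, equal-length nasal- and oral-channel signals (the function name is ours):

```python
import numpy as np

def nasalance(nasal, oral):
    """Nasalance score (%): RMS energy of the nasal channel relative to
    the summed nasal + oral RMS energies, mirroring what a dual-channel
    nasality-microphone setup measures. Both inputs are sample arrays
    covering the same utterance."""
    n_rms = np.sqrt(np.mean(np.asarray(nasal, dtype=float) ** 2))
    o_rms = np.sqrt(np.mean(np.asarray(oral, dtype=float) ** 2))
    return 100.0 * n_rms / (n_rms + o_rms)
```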
Affiliation(s)
- Kat Young
- AudioLab, Department of Electronic Engineering, University of York, UK
- Rebecca R. Vos
- Speech and Audio Processing, Department of Electrical and Electronic Engineering, Imperial College London, UK
- Felicity Mehendale
- Global Cleft Lip and Palate Research Programme, Global Health Research Centre, Usher Institute, University of Edinburgh, UK
- Helena Daffern
- AudioLab, Department of Electronic Engineering, University of York, UK
3
Fu J, Yang S, He F, He L, Li Y, Zhang J, Xiong X. Sch-net: a deep learning architecture for automatic detection of schizophrenia. Biomed Eng Online 2021; 20:75. [PMID: 34344372] [PMCID: PMC8336375] [DOI: 10.1186/s12938-021-00915-2]
Abstract
BACKGROUND Schizophrenia is a chronic and severe mental disease that strongly affects patients' daily life and work. Clinically, schizophrenia with negative symptoms is often misdiagnosed, and the diagnosis depends on the experience of clinicians. There is an urgent need for an objective and effective method to diagnose schizophrenia with negative symptoms. Recent studies have shown that impaired speech can serve as an indicator for diagnosing schizophrenia. The literature on schizophrenic speech detection has mainly been based on feature engineering, in which effective feature extraction is difficult because of the variability of speech signals. METHODS This work designs a novel Sch-net neural network based on a convolutional neural network; it is the first work on end-to-end schizophrenic speech detection using deep learning techniques. Sch-net adds two components, skip connections and a convolutional block attention module (CBAM), to the convolutional backbone architecture. The skip connections enrich the information used for classification by merging low- and high-level features, while the CBAM highlights effective features by assigning learnable weights. The proposed Sch-net combines the advantages of the two components and avoids manual feature extraction and selection. RESULTS We validate Sch-net through ablation experiments on a schizophrenic speech data set containing 28 patients with schizophrenia and 28 healthy controls, together with comparisons against models based on feature engineering and deep neural networks. The experimental results show that Sch-net performs well on the schizophrenic speech detection task, achieving 97.68% accuracy on the schizophrenic speech data set. To further verify the generalization of the model, Sch-net was tested on the open-access LANNA children's speech database for specific language impairment (SLI) detection, where it achieves 99.52% accuracy in classifying patients with SLI and healthy controls. Our code will be available at https://github.com/Scu-sen/Sch-net . CONCLUSIONS Extensive experiments show that the proposed Sch-net can provide aided information for the diagnosis of schizophrenia and specific language impairment.
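The CBAM mentioned in this abstract combines channel and spatial attention; its channel-attention branch can be sketched in plain NumPy as below. This illustrates the published CBAM design generically, not the authors' Sch-net code, and the weight matrices are untrained placeholders:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(fmap, w1, w2):
    """CBAM-style channel attention on a (C, H, W) feature map:
    spatial average- and max-pooling give two channel descriptors,
    a shared two-layer MLP (w1: C -> C/r, w2: C/r -> C) scores them,
    and a sigmoid gate reweights the channels."""
    avg_desc = fmap.mean(axis=(1, 2))              # (C,)
    max_desc = fmap.max(axis=(1, 2))               # (C,)
    mlp = lambda v: w2 @ np.maximum(w1 @ v, 0.0)   # shared ReLU MLP
    gate = sigmoid(mlp(avg_desc) + mlp(max_desc))  # (C,) values in (0, 1)
    return fmap * gate[:, None, None]
```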
Affiliation(s)
- Jia Fu
- College of Biomedical Engineering, Sichuan University, Chengdu, China
- Sen Yang
- College of Biomedical Engineering, Sichuan University, Chengdu, China
- Fei He
- College of Biomedical Engineering, Sichuan University, Chengdu, China
- Ling He
- College of Biomedical Engineering, Sichuan University, Chengdu, China
- Yuanyuan Li
- Mental Health Center, West China Hospital of Sichuan University, Chengdu, China
- Jing Zhang
- College of Biomedical Engineering, Sichuan University, Chengdu, China
- Xi Xiong
- School of Cybersecurity, Chengdu University of Information Technology, Chengdu, China
4
Fu J, He F, Yin H, He L. Automatic detection of pharyngeal fricatives in cleft palate speech using acoustic features based on the vocal tract area spectrum. Comput Speech Lang 2021. [DOI: 10.1016/j.csl.2021.101203]
5
Speech Outcomes Comparison Between Adult Velopharyngeal Insufficiency and Patients With Unrepaired Cleft Palate. J Craniofac Surg 2021; 32:655-659. [PMID: 33705003] [DOI: 10.1097/scs.0000000000006994]
Abstract
OBJECTIVE This study compared the speech outcomes of adult velopharyngeal insufficiency patients and adult cleft palate (ACP) patients, and explored whether these 2 types of patients differ at the phonological level. METHODS Perceptual evaluation was used to assess speech intelligibility, hypernasality, and compensatory articulation in 89 adult patients with velopharyngeal insufficiency and 35 adult patients with unrepaired cleft palate. Each group was divided into complete cleft palate and incomplete cleft palate (including submucous cleft palate). Phonological differences were compared between the 2 groups of patients and between the 2 types of cleft palate. RESULTS Mean speech intelligibility was 43.04% in the velopharyngeal insufficiency group and 32.87% in the ACP group, a significant difference by t test (t = 2.916, P < 0.01); there was no significant difference in speech intelligibility between the 2 types of cleft palate. There was also a significant difference between the 2 groups in the distribution of hypernasality degree by chi-square test (χ² = 31.650, P < 0.01). Compensatory articulation was present in 74.3% of ACP patients (26/35) and 47.2% of velopharyngeal insufficiency patients (42/89), a significant difference in incidence (χ² = 7.446, P < 0.01). CONCLUSIONS Adult patients with unrepaired cleft palate present even worse speech intelligibility and hypernasality than velopharyngeal insufficiency patients after cleft palate repair, regardless of cleft type. Additionally, patients in the ACP group have a higher incidence of compensatory articulation than those in the velopharyngeal insufficiency group. In the sequenced treatment of cleft lip and palate, evaluation and treatment of speech disorders cannot be ignored.
6
Saxon M, Tripathi A, Jiao Y, Liss J, Berisha V. Robust Estimation of Hypernasality in Dysarthria with Acoustic Model Likelihood Features. IEEE/ACM Trans Audio Speech Lang Process 2020; 28:2511-2522. [PMID: 33748328] [PMCID: PMC7978228] [DOI: 10.1109/taslp.2020.3015035]
Abstract
Hypernasality is a common characteristic symptom across many motor-speech disorders. For voiced sounds, hypernasality introduces an additional resonance in the lower frequencies and, for unvoiced sounds, there is reduced articulatory precision due to air escaping through the nasal cavity. However, the acoustic manifestation of these symptoms is highly variable, making hypernasality estimation very challenging, both for human specialists and automated systems. Previous work in this area relies on either engineered features based on statistical signal processing or machine learning models trained on clinical ratings. Engineered features often fail to capture the complex acoustic patterns associated with hypernasality, whereas metrics based on machine learning are prone to overfitting to the small disease-specific speech datasets on which they are trained. Here we propose a new set of acoustic features that capture these complementary dimensions. The features are based on two acoustic models trained on a large corpus of healthy speech. The first acoustic model aims to measure nasal resonance from voiced sounds, whereas the second acoustic model aims to measure articulatory imprecision from unvoiced sounds. To demonstrate that the features derived from these acoustic models are specific to hypernasal speech, we evaluate them across different dysarthria corpora. Our results show that the features generalize even when training on hypernasal speech from one disease and evaluating on hypernasal speech from another disease (e.g., training on Parkinson's disease, evaluation on Huntington's disease), and when training on neurologically disordered speech but evaluating on cleft palate speech.
Affiliation(s)
- Michael Saxon
- Arizona State Univ., Sch. of Elect., Comput., & Energy Eng., Tempe, Arizona, USA
- Ayush Tripathi
- Arizona State Univ., Sch. of Elect., Comput., & Energy Eng., Tempe, Arizona, USA
- Yishan Jiao
- Arizona State Univ., Sch. of Elect., Comput., & Energy Eng., Tempe, Arizona, USA
- Julie Liss
- Arizona State Univ., Sch. of Elect., Comput., & Energy Eng., Tempe, Arizona, USA
- Visar Berisha
- Arizona State Univ., Sch. of Elect., Comput., & Energy Eng., Tempe, Arizona, USA
7
Zhang J, Yang S, Wang X, Tang M, Yin H, He L. Automatic hypernasality grade assessment in cleft palate speech based on the spectral envelope method. Biomed Tech (Berl) 2020; 65:73-86. [PMID: 31525154] [DOI: 10.1515/bmt-2018-0181]
Abstract
Due to velopharyngeal incompetence, airflow overflows from the oral cavity into the nasal cavity, which results in hypernasality. Hypernasality greatly reduces speech intelligibility and affects the daily communication of patients with cleft palate. Accurate assessment of hypernasality grades can provide assisted diagnosis for speech-language pathologists (SLPs) in clinical settings. Utilizing a support vector machine (SVM), this paper classifies speech recordings into four grades (normal, mild, moderate, and severe hypernasality) based on vocal tract characteristics. Linear prediction (LP) analysis is widely used to model the vocal tract, but glottal source information may leak into the LP-based spectrum. The stabilized weighted linear prediction (SWLP) method, which imposes temporal weights emphasizing the closed-phase interval of the glottal cycle, is a more robust approach to modeling the vocal tract. The extended weighted linear prediction (XLP) method weights each lagged speech signal separately, achieving a finer time scale on the spectral envelope than the SWLP method. The tested speech recordings were collected from 60 subjects with cleft palate and 20 control subjects, comprising a total of 4640 Mandarin syllables. The experimental results show that the spectral envelope of normal speech falls off faster than that of hypernasal speech in the high-frequency region. The results also indicate that the SWLP- and XLP-based methods yield smaller correlation coefficients between normal and hypernasal speech than the LP method, and thus distinguish hypernasal from normal speech better. The classification accuracies for the four hypernasality grades using the SWLP and XLP methods range from 83.86% to 97.47%. The selection of the model order and the size of the weight function are also discussed in this paper.
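For orientation, the baseline LP spectral envelope that SWLP and XLP refine can be computed with the standard autocorrelation method. A compact Python sketch (conventional LP only, not the SWLP/XLP weighting; the order and FFT size are illustrative choices):

```python
import numpy as np
from scipy.linalg import solve_toeplitz
from scipy.signal import freqz

def lp_envelope(frame, order=12, nfft=512):
    """Conventional LP spectral envelope via the autocorrelation
    method: window the frame, solve the Toeplitz normal equations for
    the predictor coefficients, and evaluate the all-pole spectrum
    1/|A(e^jw)| on an nfft-point frequency grid."""
    x = frame * np.hamming(len(frame))
    r = np.correlate(x, x, mode="full")[len(x) - 1 : len(x) + order]
    a_pred = solve_toeplitz((r[:order], r[:order]), r[1 : order + 1])
    a = np.concatenate(([1.0], -a_pred))  # A(z) = 1 - sum_k a_k z^-k
    w, h = freqz([1.0], a, worN=nfft)     # all-pole frequency response
    return w, 20.0 * np.log10(np.abs(h) + 1e-12)  # envelope in dB
```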
Affiliation(s)
- Jing Zhang
- College of Electrical Engineering, Sichuan University, Chengdu 610065, China
- Sen Yang
- College of Electrical Engineering, Sichuan University, Chengdu 610065, China
- Xiyue Wang
- College of Electrical Engineering, Sichuan University, Chengdu 610065, China
- Ming Tang
- College of Electrical Engineering, Sichuan University, Chengdu 610065, China
- Heng Yin
- West China Hospital of Stomatology, Sichuan University, Chengdu 610041, China
- Ling He
- College of Electrical Engineering, Sichuan University, Chengdu 610065, China
8
Dubey AK, Prasanna SRM, Dandapat S. Detection and assessment of hypernasality in repaired cleft palate speech using vocal tract and residual features. J Acoust Soc Am 2019; 146:4211. [PMID: 31893680] [DOI: 10.1121/1.5134433]
Abstract
The presence of hypernasality in repaired cleft palate (CP) speech is a consequence of velopharyngeal insufficiency. The coupling of the nasal tract with the oral tract adds nasal formant and antiformant pairs to the hypernasal speech spectrum. This addition alters the spectral and linear prediction (LP) residual characteristics of hypernasal speech relative to normal speech. In this work, the vocal tract constriction feature, the peak to side-lobe ratio feature, and spectral moment features augmented by low-order cepstral coefficients are used to capture the spectral and residual deviations for hypernasality detection. The first feature captures the low-frequency prominence in speech due to the presence of nasal formants, the second captures the undesirable signal components in the residual signal due to the nasal antiformants, and the third captures information about formants and antiformants in the spectrum along with the spectral envelope. The combination of the three features gives normal-versus-hypernasal detection accuracies of 87.76%, 91.13%, and 93.70% for the /a/, /i/, and /u/ vowels, respectively, and hypernasality severity detection accuracies of 80.13% and 81.25% for the /i/ and /u/ vowels, respectively. The speech data were collected from 30 typically developing children and 30 children with repaired CP, aged 7 to 12 years.
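Spectral moment features of the kind listed in this abstract treat the magnitude spectrum as a distribution over frequency and take its first four moments. A generic sketch (not the authors' exact feature extraction):

```python
import numpy as np

def spectral_moments(frame, fs):
    """First four moments of the magnitude spectrum treated as a
    distribution over frequency: centroid, spread (std), skewness,
    and kurtosis of a windowed speech frame sampled at fs Hz."""
    mag = np.abs(np.fft.rfft(frame * np.hamming(len(frame))))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / fs)
    p = mag / mag.sum()                    # normalize to a distribution
    centroid = np.sum(freqs * p)
    spread = np.sqrt(np.sum((freqs - centroid) ** 2 * p))
    skewness = np.sum((freqs - centroid) ** 3 * p) / spread ** 3
    kurtosis = np.sum((freqs - centroid) ** 4 * p) / spread ** 4
    return centroid, spread, skewness, kurtosis
```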
Affiliation(s)
- Akhilesh Kumar Dubey
- Department of Electronics and Electrical Engineering, Indian Institute of Technology Guwahati, Guwahati, Assam 781039, India
- S R Mahadeva Prasanna
- Department of Electronics and Electrical Engineering, Indian Institute of Technology Guwahati, Guwahati, Assam 781039, India
- S Dandapat
- Department of Electronics and Electrical Engineering, Indian Institute of Technology Guwahati, Guwahati, Assam 781039, India
9
Wang X, Yang S, Tang M, Yin H, Huang H, He L. HypernasalityNet: Deep recurrent neural network for automatic hypernasality detection. Int J Med Inform 2019; 129:1-12. [PMID: 31445242] [DOI: 10.1016/j.ijmedinf.2019.05.023]
Abstract
BACKGROUND Cleft palate patients are unable to produce adequate velopharyngeal closure, which results in hypernasal speech. In the clinic, hypernasal speech is assessed subjectively by speech-language pathologists. Automatic hypernasal speech detection can provide aided diagnoses for speech-language pathologists and clinicians. OBJECTIVES This study aims to develop a Long Short-Term Memory (LSTM) based deep recurrent neural network (DRNN) system to detect hypernasal speech from cleft palate patients, and thus to support clinical operation and speech therapy. The feature mining and classification abilities of the LSTM-DRNN system are also explored. METHODS The speech material comprises 14,544 Mandarin vowel recordings, collected from 144 children (72 children with hypernasality and 72 controls) aged 5-12 years. This work proposes an LSTM-based DRNN system to achieve automatic hypernasal speech detection, since the LSTM-DRNN can learn the short-time dependencies of hypernasal speech. Vocal-tract-based features are fed into the LSTM-DRNN to achieve deep feature mining. To verify the feature mining ability of the LSTM-DRNN, features projected by the LSTM-DRNN are fed into shallow classifiers in place of the final two fully connected layers and softmax layer, and the features without the LSTM-DRNN projection are fed directly into the shallow classifiers as a comparison. Hypernasality-sensitive vowels (/a/, /i/, and /u/) are analyzed for the first time. RESULTS The LSTM-DRNN based hypernasal speech detection method reaches higher detection accuracy than the shallow classifiers, since the LSTM-DRNN mines features along the time axis and through the network depth simultaneously. The proposed system reaches a highest accuracy of 93.35%. The analysis of hypernasality-sensitive vowels indicates that the vowels /i/ and /u/ are the most sensitive to hypernasal speech. CONCLUSIONS The results show that the LSTM-DRNN has robust feature mining and classification abilities. This is the first work to apply the LSTM-DRNN technique to automatically detect hypernasality in cleft palate speech. The experimental results demonstrate the potential of deep learning for pathological speech detection.
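The short-time dependency modeling attributed to the LSTM in this abstract comes from its gated cell update; one time step can be written out in NumPy as follows. This is the generic LSTM recurrence, not the paper's trained network, and the weight shapes are illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, U, b):
    """One step of the generic LSTM recurrence with gate blocks
    stacked as [input, forget, candidate, output]. Shapes: x (D,),
    h and c (H,), W (4H, D), U (4H, H), b (4H,)."""
    H = h.shape[0]
    z = W @ x + U @ h + b
    i, f = sigmoid(z[:H]), sigmoid(z[H:2 * H])
    g, o = np.tanh(z[2 * H:3 * H]), sigmoid(z[3 * H:])
    c_new = f * c + i * g            # forget old content, write new
    h_new = o * np.tanh(c_new)       # gated hidden state / output
    return h_new, c_new
```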
Affiliation(s)
- Xiyue Wang
- College of Electrical Engineering and Information Technology, Sichuan University, 610065, China.
- Sen Yang
- College of Electrical Engineering and Information Technology, Sichuan University, 610065, China.
- Ming Tang
- College of Electrical Engineering and Information Technology, Sichuan University, 610065, China.
- Heng Yin
- Hospital of Stomatology, Sichuan University, 610065, China.
- Hua Huang
- College of Electrical Engineering and Information Technology, Sichuan University, 610065, China.
- Ling He
- College of Electrical Engineering and Information Technology, Sichuan University, 610065, China.
10
He L, Zhang J, Liu Q, Zhang J, Yin H, Lech M. Automatic detection of glottal stop in cleft palate speech. Biomed Signal Process Control 2018. [DOI: 10.1016/j.bspc.2017.07.027]
11
He L, Liu Y, Yin H, Zhang J, Zhang J, Zhang J. Automatic initial and final segmentation in cleft palate speech of Mandarin speakers. PLoS One 2017; 12:e0184267. [PMID: 28926572] [PMCID: PMC5604964] [DOI: 10.1371/journal.pone.0184267]
Abstract
Speech unit segmentation is an important pre-processing step in the analysis of cleft palate speech. In Mandarin, a syllable is composed of two parts: an initial and a final. In cleft palate speech, resonance disorders occur on the finals and the voiced initials, while articulation disorders occur on the unvoiced initials. Thus, initials and finals are the minimum speech units that can reflect the characteristics of cleft palate speech disorders. In this work, an automatic initial/final segmentation method is proposed as a pre-processing step for cleft palate speech signal processing. The tested cleft palate speech utterances were collected from the Cleft Palate Speech Treatment Center in the Hospital of Stomatology, Sichuan University, which treats the largest number of cleft palate patients in China. The cleft palate speech data include 824 speech segments, and the control samples contain 228 speech segments. First, syllables are extracted from the speech utterances; the proposed syllable extraction method avoids a training stage and achieves good performance for both voiced and unvoiced speech. The syllables are then classified as having "quasi-unvoiced" or "quasi-voiced" initials, and separate initial/final segmentation methods are proposed for these two syllable types. Moreover, a two-step segmentation method is proposed: the rough locations of the syllable and initial/final boundaries are refined in the second segmentation step to improve the robustness of the segmentation accuracy. The experiments show that the initial/final segmentation accuracies for syllables with quasi-unvoiced initials are higher than for those with quasi-voiced initials. For the cleft palate speech, the mean time error is 4.4 ms for syllables with quasi-unvoiced initials and 25.7 ms for syllables with quasi-voiced initials, and the correct segmentation accuracy P30 over all syllables is 91.69%. For the control samples, P30 over all syllables is 91.24%.
Affiliation(s)
- Ling He
- School of Electrical Engineering and Information, Sichuan University, Chengdu, China
- Yin Liu
- School of Electrical Engineering and Information, Sichuan University, Chengdu, China
- Heng Yin
- Department of Cleft Lip and Palate, Hospital of Stomatology, Sichuan University, Chengdu, China
- Junpeng Zhang
- School of Electrical Engineering and Information, Sichuan University, Chengdu, China
- Jing Zhang
- School of Electrical Engineering and Information, Sichuan University, Chengdu, China
- * E-mail: (JZ); (JZ)
- Jiang Zhang
- School of Electrical Engineering and Information, Sichuan University, Chengdu, China
- * E-mail: (JZ); (JZ)
12
Bettens K, Wuyts FL, Jonckheere L, Platbrood S, Van Lierde K. Influence of gender and age on the Nasality Severity Index 2.0 in Dutch-speaking Flemish children and adults. Logoped Phoniatr Vocol 2016; 42:133-140. [PMID: 27841710] [DOI: 10.1080/14015439.2016.1245781]
Abstract
This study aimed to explore the influence of gender and age on the Nasality Severity Index 2.0 (NSI 2.0), an instrumental multiparametric index for determining hypernasality, and to establish reference values for this new index. The influence of gender and age on the NSI 2.0 was explored in 80 Flemish-speaking children (4-12 years; 40 boys, 40 girls) and 60 Flemish-speaking adults (18-60 years; 30 men, 30 women) without resonance disorders by determining its incorporated acoustic parameters: nasalance of the vowel /u/ and of an oral text, measured with a Nasometer, and the voice low tone to high tone ratio (VLHR) of the vowel /i/ with a cutoff frequency of 4.47 × F0. The equation is NSI 2.0 = 13.20 - (0.0824 × nasalance /u/ (%)) - (0.26 × nasalance oral text (%)) - (0.242 × VLHR /i/ (dB)). No effect of gender or age on the NSI 2.0 was found in children. However, significant differences were found between adult men and women for the NSI 2.0 and for the nasalance of /u/ and the oral text, and an interaction effect between gender and age was found for these parameters. Consequently, separate reference values for the NSI 2.0 were established for children, adult men, and adult women. Based on these reference scores, deviation of a patient's NSI 2.0 score can be defined, which can determine the need for (additional) intervention. Further research can explore the possible influence of language on the index.
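The reported equation can be wrapped directly in a small function; this sketch simply transcribes the coefficients from the abstract (argument names are ours), so it is only as authoritative as the equation above:

```python
def nsi_2_0(nasalance_u, nasalance_text, vlhr_i):
    """Nasality Severity Index 2.0 as reported in the abstract:
    nasalance values in percent, vlhr_i = VLHR of /i/ (cutoff
    4.47 x F0) in dB. The negative coefficients mean that higher
    nasalance or VLHR pushes the score lower."""
    return 13.20 - 0.0824 * nasalance_u - 0.26 * nasalance_text - 0.242 * vlhr_i
```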
Affiliation(s)
- Kim Bettens
- Department of Speech, Language and Hearing Sciences, Ghent University, Ghent, Belgium
- Floris L Wuyts
- Department of Speech, Language and Hearing Sciences, Ghent University, Ghent, Belgium; Department of Biomedical Physics, University of Antwerp, Antwerp, Belgium
- Lisa Jonckheere
- Department of Speech, Language and Hearing Sciences, Ghent University, Ghent, Belgium
- Shanah Platbrood
- Department of Speech, Language and Hearing Sciences, Ghent University, Ghent, Belgium
- Kristiane Van Lierde
- Department of Speech, Language and Hearing Sciences, Ghent University, Ghent, Belgium; Department of Speech-Language Pathology and Audiology, University of Pretoria, Pretoria, South Africa
13
Bettens K, Wuyts FL, D'haeseleer E, Luyten A, Meerschman I, Van Crayelynghe C, Van Lierde KM. Short-term and long-term test-retest reliability of the Nasality Severity Index 2.0. J Commun Disord 2016; 62:1-11. [PMID: 27175827] [DOI: 10.1016/j.jcomdis.2016.05.001]
Abstract
PURPOSE The Nasality Severity Index 2.0 (NSI 2.0) forms a new, multiparametric approach to the assessment of hypernasality. To enable clinical implementation of this index, its short- and long-term test-retest reliability was explored. METHODS In 40 normal-speaking adults (mean age 32 y, SD 11, range 18-56 y) and 29 normal-speaking children (mean age 8 y, SD 2, range 4-12 y), the acoustic parameters included in the NSI 2.0 (i.e., nasalance of the vowel /u/ and an oral text, and the voice low tone to high tone ratio (VLHR) of the vowel /i/) were obtained twice at the same test moment and again during a second assessment two weeks later. After determination of the NSI 2.0, a comprehensive set of statistical measures was applied to determine its reliability. RESULTS Long-term variability of the NSI 2.0 and its parameters was slightly higher than the short-term variability, both in adults and in children. Overall, a difference of 2.82 for adults and 2.68 for children between two consecutive measurements can be interpreted as a genuine change. With an ICC of 0.84 in adults and 0.77 in children, the NSI 2.0 additionally shows excellent relative consistency. No statistically significant difference in test-retest reliability was found between adults and children. CONCLUSION Reliable test-retest measurements of the NSI 2.0 can be performed. Consequently, the NSI 2.0 can be applied in clinical practice, where successive NSI 2.0 scores can be reliably compared and interpreted. LEARNING OUTCOMES The reader will be able to describe and discuss both the short-term and long-term test-retest reliability of the Nasality Severity Index 2.0, a new multiparametric approach to hypernasality, and its parameters. Based on this information, the NSI 2.0 can be applied in clinical practice, in which successive NSI 2.0 scores, e.g., before and after surgery or speech therapy, can be compared and interpreted.
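The "genuine change" thresholds quoted in this abstract are smallest-detectable-change values derived from test-retest variability. One common formulation, SDC95 = 1.96 × √2 × SEM with SEM = SD(differences)/√2, reduces to 1.96 × SD of the paired differences. A sketch (the function name is ours; this is a generic formula, not necessarily the authors' exact computation):

```python
import numpy as np

def smallest_detectable_change(test_scores, retest_scores):
    """SDC95 from paired test-retest scores: with
    SEM = SD(differences) / sqrt(2), SDC95 = 1.96 * sqrt(2) * SEM,
    which simplifies to 1.96 * SD of the paired differences."""
    d = np.asarray(test_scores, float) - np.asarray(retest_scores, float)
    return 1.96 * d.std(ddof=1)
```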
Affiliation(s)
- Kim Bettens
- Department of Speech, Language and Hearing Sciences, Ghent University, Ghent, Belgium.
- Floris L Wuyts
- Department of Speech, Language and Hearing Sciences, Ghent University, Ghent, Belgium; Biomedical Physics, University of Antwerp, Antwerp, Belgium
- Evelien D'haeseleer
- Department of Speech, Language and Hearing Sciences, Ghent University, Ghent, Belgium
- Anke Luyten
- Department of Speech, Language and Hearing Sciences, Ghent University, Ghent, Belgium
- Iris Meerschman
- Department of Speech, Language and Hearing Sciences, Ghent University, Ghent, Belgium
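The reliability statistics the entry above reports (an ICC for relative consistency and a threshold of about 2.8 points for a genuine change) can be reproduced from test-retest data. Below is a minimal sketch in Python, assuming a two-way random-effects ICC(2,1) and the common smallest-detectable-change formula 1.96 × √2 × SEM; the simulated scores are hypothetical illustration, not data from the study:

```python
import numpy as np

def icc_2_1(scores):
    """Two-way random-effects single-measure ICC(2,1).

    scores: array of shape (n_subjects, k_sessions).
    """
    n, k = scores.shape
    grand = scores.mean()
    row_means = scores.mean(axis=1)   # per-subject means
    col_means = scores.mean(axis=0)   # per-session means
    ss_total = ((scores - grand) ** 2).sum()
    ss_rows = k * ((row_means - grand) ** 2).sum()  # between subjects
    ss_cols = n * ((col_means - grand) ** 2).sum()  # between sessions
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

def smallest_detectable_change(scores, icc):
    """SDC = 1.96 * sqrt(2) * SEM, with SEM = SD * sqrt(1 - ICC)."""
    sem = scores.std(ddof=1) * np.sqrt(1.0 - icc)
    return 1.96 * np.sqrt(2.0) * sem

# Hypothetical test-retest data: 40 speakers measured in two sessions.
rng = np.random.default_rng(0)
true_scores = rng.normal(80.0, 10.0, 40)
scores = np.column_stack([true_scores + rng.normal(0.0, 2.0, 40),
                          true_scores + rng.normal(0.0, 2.0, 40)])
icc = icc_2_1(scores)
sdc = smallest_detectable_change(scores, icc)
```

With two sessions per speaker, an ICC above roughly 0.75 is conventionally read as good-to-excellent relative consistency, and an observed change smaller than the SDC falls within measurement error rather than indicating a genuine change.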
14
Bettens K, Wuyts FL, Van Lierde KM. Instrumental assessment of velopharyngeal function and resonance: a review. J Commun Disord 2014; 52:170-183. [PMID: 24909583 DOI: 10.1016/j.jcomdis.2014.05.004] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/07/2014] [Revised: 04/14/2014] [Accepted: 05/16/2014] [Indexed: 06/03/2023]
Abstract
UNLABELLED The purpose of this literature review is to describe and discuss instrumental techniques for assessing velopharyngeal function in order to diagnose velopharyngeal disorders and resonance characteristics. Both direct and indirect assessment techniques are addressed: nasopharyngoscopy, videofluoroscopy, magnetic resonance imaging (MRI), cephalometric radiographic analysis, computed tomography (CT), ultrasound, and acoustic and aerodynamic measurements. Despite the multiple instrumental assessments available to detect and define velopharyngeal dysfunction, no ideal technique is yet available. A combination of different quantitative parameters may therefore offer a more reliable determination of resonance disorders; these multidimensional approaches are described and discussed. The combination of quantitative measurement techniques and perceptual evaluation of nasality will probably remain necessary to provide sufficient information for appropriate decisions concerning the diagnosis and treatment of resonance disorders. LEARNING OUTCOMES The reader will be able to describe and discuss currently available instrumental techniques for assessing the velopharyngeal mechanism and its functioning in order to diagnose velopharyngeal disorders, and to explain the possible advantages of combining several complementary measurement techniques.
Affiliation(s)
- Kim Bettens
- Department of Speech, Language and Hearing Sciences, Ghent University, Ghent, Belgium.
- Floris L Wuyts
- Department of Speech, Language and Hearing Sciences, Ghent University, Ghent, Belgium; Biomedical Physics, University of Antwerp, Antwerp, Belgium
15
Vijayalakshmi P, Reddy MR, O'Shaughnessy D. Acoustic analysis and detection of hypernasality using a group delay function. IEEE Trans Biomed Eng 2007; 54:621-9. [PMID: 17405369 DOI: 10.1109/tbme.2006.889191] [Citation(s) in RCA: 53] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]
Abstract
In this paper, we describe a group delay-based signal processing technique for the analysis and detection of hypernasal speech. Our preliminary acoustic analysis of nasalized vowels shows that, even though additional resonances are introduced at various frequency locations, the introduction of a new resonance in the low-frequency region (around 250 Hz) is consistent. This observation is further confirmed by a perceptual analysis carried out on vowel sounds modified by introducing different nasal resonances, and by an acoustic analysis of hypernasal speech. Based on this, subsequent experiments focus only on the low-frequency region. The additive property of the group delay function can be exploited to resolve two closely spaced formants. However, when the formants are very close and have considerably wider bandwidths, as in hypernasal speech, the group delay function also fails to resolve them. To overcome this, we suggest a band-limited approach to estimate the locations of the formants. Using the band-limited group delay spectrum, we define a new acoustic measure for the detection of hypernasality. Experiments are carried out on the phonemes /a/, /i/, and /u/ uttered by 33 hypernasal speakers and 30 normal speakers. Using the group delay-based acoustic measure, the performance on a hypernasality detection task is found to be 100% for /a/, 88.78% for /i/, and 86.66% for /u/. The effectiveness of this acoustic measure is further cross-verified on speech data collected in an entirely different recording environment.
Affiliation(s)
- P Vijayalakshmi
- Biomedical Engineering Division, Indian Institute of Technology, Madras 600 036, India.
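The band-limited group delay analysis described in the entry above can be sketched as follows. This is an illustrative reconstruction in Python, not the authors' implementation; the 400 Hz band edge and the synthetic resonance used in the example are assumptions chosen to cover the ~250 Hz region the paper highlights:

```python
import numpy as np

def group_delay_spectrum(frame, n_fft=1024):
    """Group delay via the identity tau(k) = Re(conj(X) * Y) / |X|^2,
    where Y is the DFT of n * x[n]."""
    n = np.arange(len(frame))
    X = np.fft.rfft(frame, n_fft)
    Y = np.fft.rfft(n * frame, n_fft)
    denom = np.abs(X) ** 2 + 1e-12  # guard against near-zero spectral magnitudes
    return (X.real * Y.real + X.imag * Y.imag) / denom

def low_band_peak_hz(frame, fs, fmax=400.0, n_fft=1024):
    """Band-limited search: frequency of the strongest group-delay peak
    below fmax, the region where a consistent nasal resonance is reported."""
    gd = group_delay_spectrum(frame, n_fft)
    freqs = np.fft.rfftfreq(n_fft, 1.0 / fs)
    band = freqs <= fmax
    return freqs[band][np.argmax(gd[band])]

# Synthetic example: a damped sinusoid imitating a ~250 Hz nasal resonance.
fs = 8000
n = np.arange(512)
frame = 0.98 ** n * np.sin(2.0 * np.pi * 250.0 * n / fs)
peak = low_band_peak_hz(frame, fs)
```

Restricting the search to the low band sidesteps the resolution failure the abstract describes for closely spaced, wide-bandwidth formants: a consistently strong low-band group-delay peak around 250 Hz, absent in oral vowels, is the cue the detection measure builds on.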