1. Offrede T, Mooshammer C, Fuchs S. Breathing and Speech Adaptation: Do Speakers Adapt Toward a Confederate Talking Under Physical Effort? Journal of Speech, Language, and Hearing Research (JSLHR) 2024; 67:3914-3930. [PMID: 38241692 DOI: 10.1044/2023_jslhr-23-00113] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/21/2024]
Abstract
PURPOSE This study investigated whether speakers adapt their breathing and speech (fundamental frequency [fo]) to a prerecorded confederate who is sitting or moving under different levels of physical effort and who is either speaking or not. Following Paccalin and Jeannerod (2000), we would expect breathing rate to change in the direction of the confederate's, even if the participant is physically inactive. This might in turn affect their speech acoustics. METHOD We recorded the speech and respiration of 22 native German speakers. They produced solo and synchronous read speech in interaction with a confederate who appeared on a prerecorded video. There were three within-subject experimental conditions: the confederate (a) sitting, (b) biking with light effort, or (c) biking with heavier effort. RESULTS During speech, the confederate's inhalation amplitude and fo increased with physical effort, as expected. Her breath cycle duration changed differently, probably because of read speech constraints. Overall, the only adaptation the participants showed was higher fo with increase in the confederate's physical effort during synchronous, but not solo, speech. Additionally, they produced shallower inhalations when observing the confederate biking in silence, as compared to the condition without movement. Crucially, the participants' acoustic and breathing data showed large interindividual variability. CONCLUSIONS Our findings indicate that, in this paradigm, convergence only took place on fo during synchronous speech and that this phonetic adaptation happened independently from any speech breathing adaptation. They also suggest that participants may adapt their quiet breathing while watching a person performing physical exercise but that the mechanism is more complex than that explained previously.
Affiliation(s)
- Susanne Fuchs
  - Leibniz-Centre General Linguistics (ZAS), Berlin, Germany
2. Bendtsen LØM, Kolborg N, Pedersen SG, Jørkov APS, Iwarsson J. Injection Laryngoplasty of Unilateral Vocal Fold Paralysis Evaluated With Pause and Speech Measurements. J Voice 2024:S0892-1997(24)00206-6. [PMID: 39003211 DOI: 10.1016/j.jvoice.2024.06.026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/10/2024] [Revised: 06/25/2024] [Accepted: 06/25/2024] [Indexed: 07/15/2024]
Abstract
OBJECTIVE The purpose of this study was to examine a number of pause and speech measurements in patients with unilateral vocal fold paralysis, before and after injection laryngoplasty. The non-invasive measurements were selected to investigate and explain the treatment effect on connected speech in these patients. STUDY DESIGN Retrospective study with a repeated-measures design. METHOD Voice recordings of 24 patients with unilateral vocal fold paralysis, made before and after injection laryngoplasty under local anesthesia, were analyzed retrospectively with the computer program Praat. Measurements examined were number of pauses, average pause duration, pause ratio (expressing the amount of pausing during a reading-aloud task), number of breath groups, average duration of breath groups, articulation rate, speaking rate, maximum phonation time, and Voice Handicap Index. RESULTS Injection laryngoplasty significantly improved the number of pauses, pause ratio, number of breath groups, average duration of breath groups, articulation rate, speaking rate, maximum phonation time, and Voice Handicap Index. Maximum phonation time before treatment correlated with several pause and speech measurements. CONCLUSION The results showed that treatment with injection laryngoplasty had a clear effect on several pause and speech measurements and that these measurements correlated with maximum phonation time, but not with Voice Handicap Index.
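The abstract above names several timing measures without giving formulas. As a rough illustration only, the sketch below computes them under commonly used conventions, which are assumptions here and not necessarily the definitions used in the study: pause ratio as total pause time divided by total reading time, speaking rate as syllables per second including pauses, and articulation rate as syllables per second excluding pauses. The function name `pause_measures`, the `min_pause` threshold of 0.25 s, and the input format are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PauseMeasures:
    number_of_pauses: int
    average_pause_duration: float  # seconds
    pause_ratio: float             # pause time / total time
    speaking_rate: float           # syllables per second, pauses included
    articulation_rate: float       # syllables per second, pauses excluded


def pause_measures(
    pauses: List[Tuple[float, float]],
    total_duration: float,
    syllable_count: int,
    min_pause: float = 0.25,  # assumed threshold, not taken from the study
) -> PauseMeasures:
    """Compute pause and rate measures from (start, end) pause intervals in seconds."""
    if total_duration <= 0:
        raise ValueError("total_duration must be positive")

    # Keep only silent intervals long enough to count as pauses.
    kept = [(start, end) for start, end in pauses if end - start >= min_pause]
    pause_time = sum(end - start for start, end in kept)
    if pause_time >= total_duration:
        raise ValueError("pause time must be shorter than total duration")

    average = pause_time / len(kept) if kept else 0.0
    return PauseMeasures(
        number_of_pauses=len(kept),
        average_pause_duration=average,
        pause_ratio=pause_time / total_duration,
        speaking_rate=syllable_count / total_duration,
        articulation_rate=syllable_count / (total_duration - pause_time),
    )


if __name__ == "__main__":
    # Made-up example: a 10 s reading with 34 syllables and three silent intervals.
    result = pause_measures([(1.0, 1.5), (3.0, 4.0), (5.0, 5.1)], 10.0, 34)
    print(result)
```

In practice the pause intervals would come from an annotation tool such as Praat, and the choice of minimum pause duration can noticeably change every measure, so it should be reported alongside the results.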
Affiliation(s)
- Liv Øster Müller Bendtsen
  - Audiologopedics, Department of Nordic Studies and Linguistics, University of Copenhagen, Copenhagen, Denmark
- Nanna Kolborg
  - Audiologopedics, Department of Nordic Studies and Linguistics, University of Copenhagen, Copenhagen, Denmark
- Solveig Gunvor Pedersen
  - Audiologopedics, Department of Nordic Studies and Linguistics, University of Copenhagen, Copenhagen, Denmark; Department of Ear, Nose and Throat Surgery, Zealand University Hospital, Køge, Denmark
- Jenny Iwarsson
  - Audiologopedics, Department of Nordic Studies and Linguistics, University of Copenhagen, Copenhagen, Denmark
3. Sullivan L, Martin E, Allison KM. Effects of SPEAK OUT! & LOUD Crowd on Functional Speech Measures in Parkinson's Disease. American Journal of Speech-Language Pathology 2024; 33:1930-1951. [PMID: 38838243 DOI: 10.1044/2024_ajslp-23-00321] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/07/2024]
Abstract
PURPOSE This study investigated the effects of the SPEAK OUT! & LOUD Crowd therapy program on speaking rate, percent pause time, intelligibility, naturalness, and communicative participation in individuals with Parkinson's disease (PD). METHOD Six adults with PD completed 12 individual SPEAK OUT! sessions across four consecutive weeks followed by group-based LOUD Crowd sessions for five consecutive weeks. Most therapy sessions were conducted via telehealth, with two participants completing the SPEAK OUT! portion in person. Speech samples were recorded at six time points: three baseline time points prior to SPEAK OUT!, two post-SPEAK OUT! time points, and one post-LOUD Crowd time point. Acoustic measures of speaking rate and percent pause time and listener ratings of speech intelligibility and naturalness were obtained for each time point. Participant self-ratings of communicative participation were also collected at pre- and posttreatment time points. RESULTS Results showed significant improvement in communicative participation scores at a group level following completion of the SPEAK OUT! & LOUD Crowd treatment program. Two participants showed a significant decrease in speaking rate and increase in percent pause time following treatment. Changes in intelligibility and naturalness were not statistically significant. CONCLUSIONS These findings provide preliminary support for the effectiveness of the SPEAK OUT! & LOUD Crowd treatment program in improving communicative participation for people with mild-to-moderate hypokinetic dysarthria secondary to PD. This study is also the first to demonstrate positive effects of this treatment program for people receiving the therapy via telehealth.
Affiliation(s)
- Lauren Sullivan
  - Department of Communication Sciences and Disorders, Northeastern University, Boston, MA
- Elizabeth Martin
  - Department of Communication Sciences and Disorders, Northeastern University, Boston, MA
- Kristen M Allison
  - Department of Communication Sciences and Disorders, Northeastern University, Boston, MA
4. Kuhlmann LL, Iwarsson J. Effects of Speaking Rate on Breathing and Voice Behavior. J Voice 2024; 38:346-356. [PMID: 34711460 DOI: 10.1016/j.jvoice.2021.09.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2021] [Revised: 08/30/2021] [Accepted: 09/02/2021] [Indexed: 11/24/2022]
Abstract
OBJECTIVES The objective of this study was to investigate the effects of speaking rate (habitual and fast) and speech task (reading and spontaneous speech) on seven dependent variables: breath group size (in syllables), breath group duration (in seconds), lung volume at breath group initiation, lung volume at breath group termination, lung volume excursion for each breath group (in % vital capacity), lung volume excursion per syllable (in % vital capacity), and mean speaking fundamental frequency (fO). METHODS Ten women and seven men were included as subjects. Lung volume and breathing behaviors were measured by respiratory inductance plethysmography and fO was measured from audio recordings by the Praat software. Statistical significance was tested by analysis of variance. RESULTS For both reading and spontaneous speech, the group increased mean breath group size and breath group duration significantly in the fast speaking rate condition. The group significantly decreased lung volume excursion per syllable in fast speech. Females also showed a significant increase of fO in fast speech. The lung volume levels for initiation and termination of breath groups, as well as lung volume excursions in % vital capacity, showed great individual variations and no significant effects of rate. Significant effects of speech task were found for breath group size and lung volume excursion per syllable, where reading induced more syllables produced per breath group and less % VC spent per syllable as compared to spontaneous speech. Interaction effects showed that the increases in breath group size and breath group duration associated with fast rate were significantly larger in reading than in spontaneous speech. CONCLUSION Our data from 17 vocally untrained, healthy subjects showed great individual variations but still significant group effects regarding increased speaking rate, where the subjects seemed to spend less air per syllable and inhaled less often as a consequence of greater breath group sizes in fast speech. Subjects showed greater changes in breath group patterns as a consequence of fast speech in reading than in spontaneous speech, indicating that effects of speaking rate are dependent on the speech task.
Affiliation(s)
- Laura Lund Kuhlmann
  - Copenhagen Cleft Palate Center, Copenhagen University Hospital, Copenhagen, Denmark
- Jenny Iwarsson
  - Audiologopedics, Department of Nordic Studies and Linguistics, University of Copenhagen, Copenhagen, Denmark
5. Hancock AB, Hao G, Ni A, Liu H, Johnson LW. Gender Attributions by Cisgender and Gender Diverse Listeners Rating Vowels, Reading, and Monologues. J Voice 2023:S0892-1997(23)00288-6. [PMID: 37973434 DOI: 10.1016/j.jvoice.2023.09.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Revised: 09/12/2023] [Accepted: 09/12/2023] [Indexed: 11/19/2023]
Abstract
OBJECTIVES To determine if listeners' attributions of speakers' gender vary by linguistic context and/or the listeners' gender identity. METHODS Seventeen self-identified transgender adults assigned male at birth were audio-recorded prolonging /a/, reading sentences, and saying spontaneous monologues. Eighteen adults (10 cisgender and 8 gender-diverse individuals) listened and used a 1-5 scale (1: very masculine, 2: somewhat masculine, 3: androgynous, 4: somewhat feminine, and 5: very feminine) to rate the gender attribution of each speech sample. RESULTS The intra-rater reliability was moderate to excellent (0.62-1.00). Ratings by cisgender and gender-diverse listeners were not significantly different. Ratings were not significantly different between different speech contexts of vowel, reading, and spontaneous monologue speech samples. CONCLUSIONS Transwomen have many variables available to consider and use in their communication. The linguistic context (e.g., reading a speech versus spontaneous monologue) and the listener's gender do not appear to be highly influential factors in how listeners attribute gender.
Affiliation(s)
- Adrienne B Hancock
  - Department of Speech, Language, and Hearing Sciences, George Washington University, Washington, DC
- Grace Hao
  - Department of Communication Sciences and Disorders, College of Health and Sciences, North Carolina Central University, Durham, North Carolina
- Anpin Ni
  - Department of Communication Sciences and Disorders, College of Health and Sciences, North Carolina Central University, Durham, North Carolina
- Hengxin Liu
  - Department of Otolaryngology, Head and Neck Surgery, Beijing Children's Hospital Affiliated to Capital Medical University, National Center for Children's Health, Beijing, China; Beijing Key Laboratory for Pediatric Diseases of Otolaryngology Head and Neck Surgery, Beijing, China
- Leslie W Johnson
  - Department of Communication Sciences and Disorders, College of Health and Sciences, North Carolina Central University, Durham, North Carolina
6. Abbasi O, Kluger DS, Chalas N, Steingräber N, Meyer L, Gross J. Predictive coordination of breathing during intra-personal speaking and listening. iScience 2023; 26:107281. [PMID: 37520729 PMCID: PMC10372729 DOI: 10.1016/j.isci.2023.107281] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2023] [Revised: 05/04/2023] [Accepted: 06/30/2023] [Indexed: 08/01/2023] Open
Abstract
It has long been known that human breathing is altered during listening and speaking compared to rest: during speaking, inhalation depth is adjusted to the air volume required for the upcoming utterance. During listening, inhalation is temporally aligned to inhalation of the speaker. While evidence for the former is relatively strong, it is virtually absent for the latter. We address both phenomena using recordings of speech envelope and respiration in 30 participants during 14 min of speaking and listening to one's own speech. First, we show that inhalation depth is positively correlated with the total power of the speech envelope in the following utterance. Second, we provide evidence that inhalation during listening to one's own speech is significantly more likely at time points of inhalation during speaking. These findings are compatible with models that postulate alignment of internal forward models of interlocutors with the aim to facilitate communication.
Affiliation(s)
- Omid Abbasi
  - Institute for Biomagnetism and Biosignal Analysis, University of Münster, Münster, Germany
- Daniel S. Kluger
  - Institute for Biomagnetism and Biosignal Analysis, University of Münster, Münster, Germany
  - Otto-Creutzfeldt-Center for Cognitive and Behavioral Neuroscience, University of Münster, Münster, Germany
- Nikos Chalas
  - Institute for Biomagnetism and Biosignal Analysis, University of Münster, Münster, Germany
  - Otto-Creutzfeldt-Center for Cognitive and Behavioral Neuroscience, University of Münster, Münster, Germany
- Nadine Steingräber
  - Institute for Biomagnetism and Biosignal Analysis, University of Münster, Münster, Germany
- Lars Meyer
  - Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Joachim Gross
  - Institute for Biomagnetism and Biosignal Analysis, University of Münster, Münster, Germany
  - Otto-Creutzfeldt-Center for Cognitive and Behavioral Neuroscience, University of Münster, Münster, Germany
7. Gullsvåg M, Rodríguez-Aranda C. Effects of verbal tasks with varying difficulty on real-time respiratory airflow during speech generation in healthy young adults. Front Psychol 2023; 14:1150354. [PMID: 37397319 PMCID: PMC10309038 DOI: 10.3389/fpsyg.2023.1150354] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2023] [Accepted: 05/16/2023] [Indexed: 07/04/2023] Open
Abstract
Objective: Respiratory function is linked to sensory, affective, and cognitive processes and it is affected by environmental constraints such as cognitive demands. It is suggested that specific cognitive processes, such as working memory or executive functioning, may impact breathing. In turn, various lines of research have suggested a link between peak expiratory airflow (PEF) and cognitive function. However, there is scarce experimental support for these assertions, especially regarding spoken language. Therefore, the present investigation aims to evaluate whether breathing varies as a function of performing verbal naming tasks with different difficulty levels. Methods: Thirty healthy young adults (age M = 25.37 years) participated in the study. Participants were required to perform aloud five verbal tasks ordered by difficulty: reading single words, reading a text passage, object naming, and semantic and phonemic fluency. A pneumotachograph mask was employed to simultaneously acquire the verbal responses and three airflow parameters: duration, peak, and volume at both stages of the respiratory cycle (i.e., inspiration/expiration). Data were analyzed with one-way repeated measures MANOVA. Results: No significant differences were found between reading single words and object naming. In comparison, distinctive airflow requirements were found for reading a text passage, which were proportionally related to the number of pronounced words. The main finding of the study, however, concerns the data on verbal fluency tasks, which not only entailed higher inhaled airflow resources but also a significant PEF. Conclusion: Our data demonstrated that the most difficult tasks, namely semantic and phonemic verbal fluencies, which rely on semantic search, executive function, and fast lexical retrieval of words, were those requiring a substantial amount of inhaled airflow and displaying a high peak expiratory airflow. The present findings demonstrated for the first time a direct association between complex verbal tasks and PEF. Inconclusive data related to object naming and reading single words are discussed in light of the methodological challenges inherent to the assessment of speech breathing and cognition in this line of investigation.
8. Gravelin AC, Archer B, Oddo M, Whitfield JA. Reliability of a Linguistic Segmentation Procedure Specified by Systemic Functional Linguistics to Examine Extemporaneous Speech. Journal of Speech, Language, and Hearing Research (JSLHR) 2023; 66:1280-1290. [PMID: 37014996 DOI: 10.1044/2023_jslhr-22-00554] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
PURPOSE Extemporaneous speech tasks provide an ecologically valid sample to examine speech acoustics, but differing methodologies exist in the literature for segmentation. Therefore, the purpose of this study was to examine the utility and reliability of a segmentation approach for extemporaneous speech specified by systemic functional linguistics (SFL) and its potential research and clinical applications. METHOD Ten speakers without communication disorders served as participants in this study, and they responded to self-selected extemporaneous speaking prompts. Two expert analysts and one clinician analyst utilized a segmentation procedure specified by SFL to segment the extemporaneous speech samples into clauses and clause complexes. Intra- and interrater reliability were calculated for each analyst and pair of analysts. Acoustic measures of duration, speech rate, and intercomplex pause durations were calculated for each clause complex. RESULTS Analyses for both intra- and interrater reliability revealed high percent agreement that was significantly greater than chance for expert and clinician analysts and between each pair of analysts (p < .001). Acoustic analyses revealed expected variation in number and duration of spoken syllables of clause complexes between and within speakers. CONCLUSIONS The segmentation approach for extemporaneous speech specified by SFL is a reliable method for trained analysts that is informed by lexico-grammar and allows for acoustic measurement of speech production. It is also a reliable method for clinician analysts for speakers without communication disorders, and future work will investigate its utility for speakers with motor speech disorders. SUPPLEMENTAL MATERIAL https://doi.org/10.23641/asha.22357138.
Affiliation(s)
- Anna C Gravelin
  - Division of Communication Sciences and Disorders, West Virginia University, Morgantown
- Brent Archer
  - Department of Communication Sciences and Disorders, Bowling Green State University, OH
- Mary Oddo
  - Department of Communication Sciences and Disorders, Bowling Green State University, OH
- Jason A Whitfield
  - Department of Communication Sciences and Disorders, Bowling Green State University, OH
9. Nusseck M, Immerz A, Richter B, Traser L. Vocal Behavior of Teachers Reading with Raised Voice in a Noisy Environment. International Journal of Environmental Research and Public Health 2022; 19:ijerph19158929. [PMID: 35897294 PMCID: PMC9331438 DOI: 10.3390/ijerph19158929] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/23/2022] [Revised: 07/20/2022] [Accepted: 07/21/2022] [Indexed: 01/27/2023]
Abstract
(1) Objective: Teaching is a particularly voice-demanding occupation. Voice training provided during teachers’ education is often insufficient and thus teachers are at risk of developing voice disorders. Vocal demands during teaching are not only characterized by speaking for long durations but also by speaking in noisy environments. This provokes the so-called Lombard effect, which intuitively leads to an increase in voice intensity, pitch and phonation time in laboratory studies. However, this effect has not been thoroughly investigated in realistic teaching scenarios. (2) Methods: This study thus examined how 13 experienced, but vocally untrained, teachers behaved when reading in a noisy compared to quiet background environment. The quiet and noisy conditions were provided by a live audience either listening quietly or making noise by talking to each other. By using a portable voice accumulator, the fundamental frequency, sound pressure level of the voice and the noise as well as the phonation time were recorded in both conditions. (3) Results: The results showed that the teachers mainly responded according to the Lombard effect. In addition, analysis of phonation time revealed that they failed to increase inhalation time and appeared to lose articulation through the shortening of voiceless consonants in the noisy condition. (4) Conclusions: The teachers demonstrated vocally demanding behavior when speaking in the noisy condition, which can lead to vocal fatigue and cause dysphonia. The findings underline the necessity for specific voice training in teachers’ education, and the content of such training is discussed in light of the results.
10. Rechenberg L, Meurer EM, Melos M, Nienov OH, Corleta HVE, Capp E. Voice, Speech, and Clinical Aspects During Pregnancy: A Longitudinal Study. J Voice 2022:S0892-1997(22)00133-3. [PMID: 35662512 DOI: 10.1016/j.jvoice.2022.04.019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/26/2022] [Revised: 04/26/2022] [Accepted: 04/27/2022] [Indexed: 10/18/2022]
Abstract
BACKGROUND Pregnancy involves anatomical, physiological, and metabolic changes in a woman's body. However, the effects of these changes on the voice remain unclear, particularly regarding the clinical characteristics. OBJECTIVES We aimed to evaluate changes in vocal and speech acoustic measures and the relationship between them and clinical aspects in women during pregnancy. METHOD A prospective, longitudinal study was carried out with 41 low-risk, adult, pregnant women followed for prenatal care. Demographic and anthropometric data as well as lifestyle habits and health conditions were collected. Voice recordings of sustained vowels and of automatic and spontaneous speech were made in each trimester and analyzed with Praat to evaluate acoustic, aerodynamic, and articulatory measures. RESULTS There were no changes in fundamental frequency, jitter, shimmer, and harmonics-to-noise ratio during pregnancy. Maximum phonation time (MPT), pause rate, and pause duration decreased at the end of pregnancy. MPT was lower in sedentary pregnant women. The fundamental frequency peak rate was higher in eutrophic participants and lower in the third trimester in women with BMI ≥25 kg/m². Pause rate was higher in pregnant women with BMI ≥25 kg/m². There was no relationship between sleep quality, reflux, and vocal symptoms and acoustic and aerodynamic measures. CONCLUSIONS Differences were shown in MPT and temporal pause measurements during pregnancy. Acoustic measurements did not change. There was a relationship between acoustic and aerodynamic measures and clinical variables (BMI, physical activity, and body mass gain).
Affiliation(s)
- Leila Rechenberg
  - Graduate Program of Health Science: Obstetrics and Gynecology, School of Medicine, Universidade Federal do Rio Grande do Sul, Porto Alegre, Rio Grande do Sul, Brazil; Undergraduate Program of Speech and Language Therapy, Universidade Federal do Rio Grande do Sul, Porto Alegre, Rio Grande do Sul, Brazil; Department of Social and Preventive Dentistry, School of Dentistry, Universidade Federal do Rio Grande do Sul, Porto Alegre, Rio Grande do Sul, Brazil
- Eliséa Maria Meurer
  - Graduate Program of Health Science: Obstetrics and Gynecology, School of Medicine, Universidade Federal do Rio Grande do Sul, Porto Alegre, Rio Grande do Sul, Brazil
- Monica Melos
  - Graduate Program of Health Science: Obstetrics and Gynecology, School of Medicine, Universidade Federal do Rio Grande do Sul, Porto Alegre, Rio Grande do Sul, Brazil; Undergraduate Program of Speech and Language Therapy, Universidade Federal do Rio Grande do Sul, Porto Alegre, Rio Grande do Sul, Brazil
- Otto Henrique Nienov
  - Graduate Program of Health Science: Obstetrics and Gynecology, School of Medicine, Universidade Federal do Rio Grande do Sul, Porto Alegre, Rio Grande do Sul, Brazil
- Helena von Eye Corleta
  - Graduate Program of Health Science: Obstetrics and Gynecology, School of Medicine, Universidade Federal do Rio Grande do Sul, Porto Alegre, Rio Grande do Sul, Brazil; Department of Obstetrics and Gynecology, Hospital de Clínicas de Porto Alegre, School of Medicine, Universidade Federal do Rio Grande do Sul, Porto Alegre, Rio Grande do Sul, Brazil
- Edison Capp
  - Graduate Program of Health Science: Obstetrics and Gynecology, School of Medicine, Universidade Federal do Rio Grande do Sul, Porto Alegre, Rio Grande do Sul, Brazil; Department of Obstetrics and Gynecology, Hospital de Clínicas de Porto Alegre, School of Medicine, Universidade Federal do Rio Grande do Sul, Porto Alegre, Rio Grande do Sul, Brazil
11. Zhang H, Wiener S, Holt LL. Adjustment of cue weighting in speech by speakers and listeners: Evidence from amplitude and duration modifications of Mandarin Chinese tone. The Journal of the Acoustical Society of America 2022; 151:992. [PMID: 35232077 PMCID: PMC8846952 DOI: 10.1121/10.0009378] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/01/2021] [Revised: 01/07/2022] [Accepted: 01/10/2022] [Indexed: 06/14/2023]
Abstract
Speech contrasts are signaled by multiple acoustic dimensions, but these dimensions are not equally diagnostic. Moreover, the relative diagnosticity, or weight, of acoustic dimensions in speech can shift in different communicative contexts for both speech perception and speech production. However, the literature remains unclear on whether, and if so how, talkers adjust speech to emphasize different acoustic dimensions in the context of changing communicative demands. Here, we examine the interplay of flexible cue weights in speech production and perception across amplitude and duration, secondary non-spectral acoustic dimensions for phonated Mandarin Chinese lexical tone, across natural speech and whispering, which eliminates fundamental frequency contour, the primary acoustic dimension. Phonated and whispered Mandarin productions from native talkers revealed enhancement of both duration and amplitude cues in whispered, compared to phonated speech. When nonspeech amplitude-modulated noises modeled these patterns of enhancement, identification of the noises as Mandarin lexical tone categories was more accurate than identification of noises modeling phonated speech amplitude and duration cues. Thus, speakers exaggerate secondary cues in whispered speech and listeners make use of this information. Yet, enhancement is not symmetric among the four Mandarin lexical tones, indicating possible constraints on the realization of this enhancement.
Affiliation(s)
- Hui Zhang
  - Speech-Language-Hearing Center, School of Foreign Languages, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai 200240, China
- Seth Wiener
  - Department of Modern Languages, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, Pennsylvania 15213, USA
- Lori L Holt
  - Department of Psychology and Neuroscience Institute, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, Pennsylvania 15213, USA
12. The maturational gradient of infant vocalizations: Developmental stages and functional modules. Infant Behav Dev 2021; 66:101682. [PMID: 34920296 DOI: 10.1016/j.infbeh.2021.101682] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 08/21/2021] [Revised: 12/06/2021] [Accepted: 12/07/2021] [Indexed: 12/29/2022]
Abstract
Stage models have been influential in characterizing infant vocalizations in the first year of life. These models are basically descriptive and do not explain why certain types of vocal behaviors occur within a particular stage or why successive patterns of vocalization occur. This review paper summarizes and elaborates a theory of Developmental Functional Modules (DFMs) and discusses how maturational gradients in the DFMs explain age typical vocalizations as well as the transitions between successive stages or other static forms. Maturational gradients are based on biological processes that effect the reconfiguration and remodeling of the respiratory, laryngeal, and craniofacial systems during infancy. From a dynamic systems perspective, DFMs are part of a complex system with multiple degrees of freedom that can achieve stable performance with relatively few control variables by relying on principles such as synergies, self-organization, nonlinear performance, and movement variability.
13. Stipancic KL, Kuo YL, Miller A, Ventresca HM, Sternad D, Kimberley TJ, Green JR. The effects of continuous oromotor activity on speech motor learning: speech biomechanics and neurophysiologic correlates. Exp Brain Res 2021; 239:3487-3505. [PMID: 34524491 PMCID: PMC8599312 DOI: 10.1007/s00221-021-06206-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2021] [Accepted: 08/25/2021] [Indexed: 11/26/2022]
Abstract
Sustained limb motor activity has been used as a therapeutic tool for improving rehabilitation outcomes and is thought to be mediated by neuroplastic changes associated with activity-induced cortical excitability. Although prior research has reported enhancing effects of continuous chewing and swallowing activity on learning, the potential beneficial effects of sustained oromotor activity on speech improvements are not well documented. This exploratory study was designed to examine the effects of continuous oromotor activity on subsequent speech learning. Twenty neurologically healthy young adults engaged in periods of continuous chewing and speech after which they completed a novel speech motor learning task. The motor learning task was designed to elicit improvements in accuracy and efficiency of speech performance across repetitions of eight-syllable nonwords. In addition, transcranial magnetic stimulation was used to measure the cortical silent period (cSP) of the lip motor cortex before and after the periods of continuous oromotor behaviors. All repetitions of the nonword task were recorded acoustically and kinematically using a three-dimensional motion capture system. Productions were analyzed for accuracy and duration, as well as lip movement distance and speed. A control condition estimated baseline improvement rates in speech performance. Results revealed improved speech performance following 10 min of chewing. In contrast, speech performance following 10 min of continuous speech was degraded. There was no change in the cSP as a result of either oromotor activity. The clinical implications of these findings are discussed in the context of speech rehabilitation and neuromodulation.
Affiliation(s)
- Kaila L Stipancic
  - Department of Communicative Disorders and Sciences, University at Buffalo, Buffalo, NY, USA
- Yi-Ling Kuo
  - Department of Physical Therapy, Upstate Medical University, Syracuse, NY, USA
- Amanda Miller
  - Department of Communication Sciences and Disorders, MGH Institute of Health Professions, Boston, MA, USA
- Hayden M Ventresca
  - Department of Rehabilitation Sciences, MGH Institute of Health Professions, Building 79/96, 2nd Floor 13th Street, Boston, MA, 02129, USA
- Dagmar Sternad
  - Department of Biology, Northeastern University, Boston, MA, USA
- Teresa J Kimberley
  - Department of Rehabilitation Sciences, MGH Institute of Health Professions, Building 79/96, 2nd Floor 13th Street, Boston, MA, 02129, USA
- Jordan R Green
  - Department of Rehabilitation Sciences, MGH Institute of Health Professions, Building 79/96, 2nd Floor 13th Street, Boston, MA, 02129, USA
|
14
|
Vocal Acoustics and Aerodynamics During Scripted Reading Compared to Spontaneous Speech. J Voice 2021:S0892-1997(21)00118-1. [PMID: 34175170 DOI: 10.1016/j.jvoice.2021.03.022] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/23/2020] [Revised: 03/30/2021] [Accepted: 03/31/2021] [Indexed: 11/21/2022]
Abstract
BACKGROUND Examination of vocal acoustics and phonatory aerodynamics during connected speech provides a more ecologically valid approach to voice assessment than single phoneme measures. The purpose of the current investigation was to determine if differences exist in vocal acoustics and aerodynamics between reading and spontaneous speech tasks in patients with common voice disorders. METHODS The Emory University Institutional Review Board approved this retrospective study. The voice records of 100 patients (74 females and 26 males) diagnosed with benign voice disorders and referred for voice evaluation at the Emory Voice Center between November 2018 and March 2019 were analyzed. These consisted of reading a scripted passage (the Rainbow Passage) and spontaneous speech (describing how to make a peanut butter and jelly sandwich). Data collected included gender, voice diagnosis, mean fundamental frequency (F0), mean airflow during voicing, and mean inspiratory airflow (MIA). RESULTS Univariate analysis assessed normality of the data. Normally distributed variables were compared with paired t tests. Non-normal data were log transformed. The difference in mean F0 was not significant in the complete-case analysis (P = 0.053) but was significant for females in the gender-stratified analysis (mean difference = 4.68 Hz; 95% CI = 0.359, 9.0012; P = 0.03). Statistically significant differences in MIA were also found in women (P = 0.0001) and in men (P = 0.0003). The direction and range of change between scripted reading and the spontaneous speech tasks in all metrics varied widely. No consistent patterns were noted in gender, age and diagnosis across the parameters studied. However, clinically salient findings in the range of MIA were noted in a small subgroup of participants. CONCLUSIONS This study suggests that multiple testing stimuli for phonatory aerodynamic and acoustic outcomes measurement may be appropriate for use depending on the need and vocal challenges of the individual patient. Clinically, both structured reading and spontaneous speech provide valuable insight into the vocal capabilities of the patient.
|
15
|
Desjardins M, Verdolini Abbott K, Zhang Z. Computational simulations of respiratory-laryngeal interactions and their effects on lung volume termination during phonation: Considerations for hyperfunctional voice disorders. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2021; 149:3988. [PMID: 34241462 PMCID: PMC8186948 DOI: 10.1121/10.0005063] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 12/04/2020] [Revised: 04/11/2021] [Accepted: 05/07/2021] [Indexed: 05/05/2023]
Abstract
Glottal resistance plays an important role in airflow conservation, especially in the context of high vocal demands. However, it remains unclear if laryngeal strategies most effective in controlling airflow during phonation are consistent with clinical manifestations of vocal hyperfunction. This study used a previously validated three-dimensional computational model of the vocal folds coupled with a respiratory model to investigate which laryngeal strategies were the best predictors of lung volume termination (LVT) and how these strategies' effects were modulated by respiratory parameters. Results indicated that the initial glottal angle and vertical thickness of the vocal folds were the best predictors of LVT regardless of subglottal pressure, lung volume initiation, and breath group duration. The effect of vertical thickness on LVT increased with the subglottal pressure (highlighting the importance of monitoring loudness during voice therapy to avoid laryngeal compensation) and decreased with increasing vocal fold stiffness. A positive initial glottal angle required an increase in vertical thickness to complete a target utterance, especially when the respiratory system was taxed. Overall, findings support the hypothesis that laryngeal strategies consistent with hyperfunctional voice disorders are effective in increasing LVT, and that conservation of airflow and respiratory effort may represent underlying mechanisms in those disorders.
Affiliation(s)
- Maude Desjardins
  - Department of Communication Sciences and Disorders, University of Delaware, Tower at STAR 100 Discovery Boulevard, Newark, Delaware 19713-1325, USA
- Katherine Verdolini Abbott
  - Department of Communication Sciences and Disorders, University of Delaware, Tower at STAR 100 Discovery Boulevard, Newark, Delaware 19713-1325, USA
- Zhaoyan Zhang
  - Department of Head and Neck Surgery, University of California, Los Angeles, 31-24 Rehabilitation Center, 1000 Veteran Avenue, Los Angeles, California 90095-1794, USA
|
16
|
Nallanthighal VS, Mostaani Z, Härmä A, Strik H, Magimai-Doss M. Deep learning architectures for estimating breathing signal and respiratory parameters from speech recordings. Neural Netw 2021; 141:211-224. [PMID: 33915446 DOI: 10.1016/j.neunet.2021.03.029] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/30/2020] [Revised: 01/29/2021] [Accepted: 03/18/2021] [Indexed: 01/16/2023]
Abstract
Respiration is an essential and primary mechanism for speech production. We first inhale and then produce speech while exhaling. When we run out of breath, we stop speaking and inhale. Though this process is involuntary, speech production involves a systematic outflow of air during exhalation characterized by linguistic content and prosodic factors of the utterance. Thus speech and respiration are closely related, and modeling this relationship makes sensing respiratory dynamics directly from speech plausible; however, this approach is not well explored. In this article, we conduct a comprehensive study to explore techniques for sensing the breathing signal and breathing parameters from speech using deep learning architectures and address the challenges involved in establishing the practical purpose of this technology. Estimating the breathing pattern from speech would give us information about the respiratory parameters, thus enabling us to understand respiratory health from one's speech.
Affiliation(s)
- Venkata Srikanth Nallanthighal
  - Philips Research, Eindhoven, The Netherlands; Centre for Language Studies (CLS), Radboud University Nijmegen, The Netherlands
- Zohreh Mostaani
  - Idiap Research Institute, Martigny, Switzerland; Ecole polytechnique fédérale de Lausanne, Lausanne, Switzerland
- Aki Härmä
  - Philips Research, Eindhoven, The Netherlands
- Helmer Strik
  - Centre for Language Studies (CLS), Radboud University Nijmegen, The Netherlands
|
17
|
Wermke K, Sereschk N, May V, Salinger V, Sanchez MR, Shehata-Dieler W, Wirbelauer J. The Vocalist in the Crib: the Flexibility of Respiratory Behaviour During Crying in Healthy Neonates. J Voice 2021; 35:94-103. [DOI: 10.1016/j.jvoice.2019.07.004] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 03/27/2019] [Revised: 07/06/2019] [Accepted: 07/08/2019] [Indexed: 11/26/2022]
|
18
|
Grimley SJ, Ko CM, Morrell HER, Grace F, Bañuelos MS, Bautista BR, Chavez GN, Dalrymple ER, Green M, Gurning J, Heuerman AC, Huerta M, Marks M, Ov J, Overton-Harris P, Olson LE. The Need for a Neutral Speaking Period in Psychosocial Stress Testing. J PSYCHOPHYSIOL 2019. [DOI: 10.1027/0269-8803/a000228] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/23/2022]
Abstract
Tasks such as the Trier Social Stress Test, narrative recall, and some cognitive challenges require participants to speak in order to measure acute physiological responses to induced stress. Typically, the physiological measures during the stressed state are compared to a silent baseline period. This does not differentiate between stress that is induced by emotion and stress due to the physical act of vocalization. We modified a psychosocial stress task for 41 participants to add a period of neutral speaking. We hypothesized that there would be significant differences in physiological measures between the silent baseline and neutral speaking periods, and that these differences would explain a substantial proportion of the stress response traditionally attributed to emotion. Blood pressure, skin conductance level, respiration rate, salivary alpha-amylase, and high frequency heart rate variability showed significant changes during the neutral speaking period compared to a silent baseline, demonstrating the need for this control. Of the magnitude of physiological response which would have typically been attributed to emotion, 36–77% was due to vocalization alone. In stress-inducing tasks that require speaking, care should be taken in study design to account for the physiological impact of speech.
Affiliation(s)
- Sarah J. Grimley
  - Department of Biology, University of Redlands, CA, USA
  - Department of Psychology, University of Redlands, CA, USA
- Celine M. Ko
  - Department of Psychology, University of Redlands, CA, USA
- Fran Grace
  - Department of Religious Studies, University of Redlands, CA, USA
- Matthew Green
  - Department of Biology, University of Redlands, CA, USA
  - Department of Psychology, University of Redlands, CA, USA
- Anne C. Heuerman
  - Department of Biology, University of Redlands, CA, USA
  - Department of Religious Studies, University of Redlands, CA, USA
- Misael Huerta
  - Department of Biology, University of Redlands, CA, USA
- Megan Marks
  - Department of Biology, University of Redlands, CA, USA
- Jenny Ov
  - Department of Biology, University of Redlands, CA, USA
- Lisa E. Olson
  - Department of Biology, University of Redlands, CA, USA
|
19
|
van Mersbergen M, Vinney LA, Payne AE. Cognitive influences on perceived phonatory exertion using the Borg CR10. LOGOP PHONIATR VOCO 2019; 45:123-133. [PMID: 31190588 DOI: 10.1080/14015439.2019.1617895] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 10/26/2022]
Abstract
Objectives: The purpose of the present study was to examine the nature of the relationship between perceptions of vocal and mental (cognitive) effort during reading and speaking tasks. Methods: One hundred and four young, healthy adult participants were randomized into one of three groups. Each group performed a writing task meant to elicit low mental effort, high mental effort, or high mental effort followed by a period of relaxation. Participants then engaged in reading and speaking tasks, meant to elicit high (suppression of a prepotent desire to speak louder) or low (no suppression of a prepotent desire to speak louder) mental effort, and completed ratings of mental effort and vocal effort via adapted versions of the Borg CR10. Results: Findings indicate that ratings of perceived mental and vocal effort are related to one another, evidenced by strong correlations, and additional analyses reveal that mental effort might drive this relationship. Conclusions: Perceptions of vocal effort appear to mirror ratings of mental effort during tasks for which vocal activity is relatively stable but cognitive demands fluctuate. The possibility that perceptions of mental effort might influence perceptions of vocal effort should be considered when creating reliable and valid measures of vocal effort as well as when interpreting currently adapted measures of vocal effort in the clinical context.
Affiliation(s)
- Miriam van Mersbergen
  - School of Communication Sciences and Disorders, University of Memphis, Memphis, TN, USA
- Lisa A Vinney
  - Department of Communication Sciences and Disorders, Illinois State University, Normal, IL, USA
- Alexis E Payne
  - School of Communication Sciences and Disorders, University of Memphis, Memphis, TN, USA
|
20
|
Gilman M, Maira C, Hapner ER. Airflow Patterns of Running Speech in Patients With Voice Disorders. J Voice 2019; 33:277-283. [DOI: 10.1016/j.jvoice.2017.12.004] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 07/30/2017] [Accepted: 12/06/2017] [Indexed: 11/15/2022]
|
21
|
Lee J, Huber J, Jenkins J, Fredrick J. Language planning and pauses in story retell: Evidence from aging and Parkinson's disease. JOURNAL OF COMMUNICATION DISORDERS 2019; 79:1-10. [PMID: 30844602 DOI: 10.1016/j.jcomdis.2019.02.004] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 06/10/2018] [Revised: 01/04/2019] [Accepted: 02/22/2019] [Indexed: 06/09/2023]
Abstract
We examined if and how pauses during connected speech reflect cognitive processes underlying language formulation in typical aging and Parkinson's disease (PD), beyond respiratory and motor-speech mechanisms. The frequency of silent pauses was measured (a) in relation to different linguistic (independent clausal, subordinate clausal, phrasal, and atypical) boundaries and (b) proficiency measures of language production in young adults, older adults, and individuals with PD. At the group level, aging, but not PD, resulted in increased pausing at atypical linguistic locations. However, in both aging and PD, individuals' reduced production of syntactically complex sentences was associated with more frequent pausing at various typical prosodic (clausal or phrasal) boundaries. Frequency of pauses was not associated with individual performance in grammaticality of sentences and lexical-semantic production. Overall, the present study demonstrated that production of pauses during connected speech reflects cognitive processes underlying language production beyond respiratory-physiological processes of communication. Assessing production of pauses in connected speech may augment, but does not replace, assessment of language production in clinical practice.
Affiliation(s)
- Jiyeon Lee
  - Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, IN, USA
- Jessica Huber
  - Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, IN, USA
- Jessica Jenkins
  - Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, IN, USA
- Jennifer Fredrick
  - Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, IN, USA
|
22
|
Chappaz RDO, Barreto SDS, Ortiz KZ. Pneumo-phono-articulatory coordination assessment in dysarthria cases: a cross-sectional study. SAO PAULO MED J 2018; 136:216-221. [PMID: 29924290 PMCID: PMC9907744 DOI: 10.1590/1516-3180.2017.0320161217] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 12/13/2017] [Accepted: 12/16/2017] [Indexed: 02/11/2023] Open
Abstract
BACKGROUND Pneumo-phono-articulatory coordination is often impaired in dysarthric patients. Because all speech is produced upon exhalation, adequate respiratory support and coordination are essential for communication. Nevertheless, studies investigating respiratory parameters for speech are scarce. The objectives of the present study were to analyze and compare the numbers of words and syllables (universal measurement) per exhalation among healthy and dysarthric speakers, in different speech tasks. DESIGN AND SETTING A cross-sectional analytical study with a control group was conducted at the Department of Speech, Language and Hearing Sciences at UNIFESP. METHODS The study sample consisted of 62 individuals: 31 dysarthric patients and 31 healthy individuals matched for sex, age and education level. All participants performed number counting and text reading tests in which the numbers of words and syllables per exhalation were recorded. All measurements obtained from the two groups were compared. RESULTS Statistically significant differences between the dysarthric and healthy groups were found in the two tasks (counting of syllables and words per exhalation) (P < 0.001). In contrast, the performance of the dysarthric patients did not vary according to the task: reading and number counting in syllables/exhalation (P = 0.821) or words/exhalation (P = 0.785). CONCLUSIONS The mean numbers of words and syllables per exhalation among dysarthric subjects did not vary according to the speech task used but they clearly showed differences between dysarthric patients and normal healthy subjects. The study also made it possible to obtain preliminary data on the average numbers of words and syllables per expiration produced by healthy individuals during their speech production.
Affiliation(s)
- Rebeca de Oliveira Chappaz
  - Speech-Language Pathologist, Department of Speech, Language and Hearing Sciences, Escola Paulista de Medicina, Universidade Federal de São Paulo (EPM-UNIFESP), São Paulo (SP), Brazil.
- Simone dos Santos Barreto
  - MSc, PhD. Speech-Language Pathologist and Adjunct Professor III, Department of Specific Training in Speech, Language and Hearing Sciences, Instituto de Saúde de Nova Friburgo, Universidade Federal Fluminense (ISNF-UFF), Nova Friburgo (RJ), Brazil.
- Karin Zazo Ortiz
  - MSc, PhD. Speech-Language Pathologist and Associate Professor IV, Department of Speech, Language and Hearing Sciences, Escola Paulista de Medicina, Universidade Federal de São Paulo (EPM-UNIFESP), São Paulo (SP), Brazil.
|
23
|
Bari R, Adams RJ, Rahman M, Parsons MB, Buder EH, Kumar S. rConverse: Moment by Moment Conversation Detection Using a Mobile Respiration Sensor. PROCEEDINGS OF THE ACM ON INTERACTIVE, MOBILE, WEARABLE AND UBIQUITOUS TECHNOLOGIES 2018; 2:2. [PMID: 30417165 PMCID: PMC6223316 DOI: 10.1145/3191734] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 05/01/2017] [Accepted: 01/01/2018] [Indexed: 10/17/2022]
Abstract
Monitoring of in-person conversations has largely been done using acoustic sensors. In this paper, we propose a new method to detect moment-by-moment conversation episodes by analyzing breathing patterns captured by a mobile respiration sensor. Since breathing is affected by physical and cognitive activities, we develop a comprehensive method for cleaning, screening, and analyzing noisy respiration data captured in the field environment at individual breath cycle level. Using training data collected from a speech dynamics lab study with 12 participants, we show that our algorithm can identify each respiration cycle with 96.34% accuracy even in presence of walking. We present a Conditional Random Field, Context-Free Grammar (CRF-CFG) based conversation model, called rConverse, to classify respiration cycles into speech or non-speech, and subsequently infer conversation episodes. Our model achieves 82.7% accuracy for speech/non-speech classification and it identifies conversation episodes with 95.9% accuracy on lab data using a leave-one-subject-out cross-validation. Finally, the system is validated against audio ground-truth in a field study with 32 participants. rConverse identifies conversation episodes with 71.7% accuracy on 254 hours of field data. For comparison, the accuracy from a high-quality audio-recorder on the same data is 71.9%.
Affiliation(s)
- Rummana Bari
  - University of Memphis, Electrical and Computer Engineering, Memphis, TN, 38152, USA
- Roy J Adams
  - University of Massachusetts Amherst, Computer Science, Amherst, MA, USA
- Mahbubur Rahman
  - University of Memphis; now works at Samsung Research America, Mountain View, CA, USA
- Eugene H Buder
  - University of Memphis, Communication Science and Disorder, Memphis, TN, USA
- Santosh Kumar
  - University of Memphis, Computer Science, Memphis, TN, USA
|
24
|
Zhang Z. Compensation Strategies in Voice Production With Glottal Insufficiency. J Voice 2017; 33:96-102. [PMID: 29129663 DOI: 10.1016/j.jvoice.2017.10.002] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 08/15/2017] [Revised: 10/01/2017] [Accepted: 10/02/2017] [Indexed: 11/17/2022]
Abstract
OBJECTIVES This study evaluates potential compensation strategies under conditions of glottal insufficiency. METHODS Using a numerical respiratory-laryngeal model of voice production, voice production under conditions of glottal insufficiency is investigated across a large range of voice conditions, and compared with normal voice production. RESULTS This study shows that glottal insufficiency leads to increased noise production, reduced fundamental frequency range, and inability to produce very low-intensity voice. Glottal insufficiency also leads to significantly increased respiratory effort of phonation and difficulty in maintaining a normal breath group duration, which restricts high-intensity voice production and falsetto-like voice production. Although compensation strategies exist to alleviate these undesirable voice changes, they often require hyperfunctional laryngeal and respiratory muscle activities and thus are more likely to result in vocal fatigue. CONCLUSIONS The laryngeal and respiratory subsystems need to be considered as a whole to fully understand the effect of glottal insufficiency on voice production. Strategies that compensate for laryngeal weakness at the cost of compromising the normal function of the respiratory subsystem are undesirable and may impose additional constraints on voice production and the effectiveness of available compensation strategies.
Affiliation(s)
- Zhaoyan Zhang
  - Department of Head and Neck Surgery, University of California, Los Angeles, California
|
25
|
Włodarczak M, Heldner M. Respiratory Constraints in Verbal and Non-verbal Communication. Front Psychol 2017; 8:708. [PMID: 28567023 PMCID: PMC5434352 DOI: 10.3389/fpsyg.2017.00708] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 03/10/2017] [Accepted: 04/21/2017] [Indexed: 11/24/2022] Open
Abstract
In the present paper we address the old question of respiratory planning in speech production. We recast the problem in terms of speakers' communicative goals and propose that speakers try to minimize respiratory effort in line with the H&H theory. We analyze respiratory cycles coinciding with no speech (i.e., silence), short verbal feedback expressions (SFEs) as well as longer vocalizations in terms of parameters of the respiratory cycle and find little evidence for respiratory planning in feedback production. We also investigate the timing of speech and SFEs in the exhalation and contrast it with nods. We find that while speech is strongly tied to the exhalation onset, SFEs are distributed much more uniformly throughout the exhalation and are often produced on residual air. Given that nods, which do not have any respiratory constraints, tend to be more frequent toward the end of an exhalation, we propose a mechanism whereby respiratory patterns are determined by the trade-off between speakers' communicative goals and respiratory constraints.
Affiliation(s)
- Mattias Heldner
  - Department of Linguistics, Stockholm University, Stockholm, Sweden
|
26
|
Wiechern B, Liberty KA, Pattemore P, Lin E. Effects of asthma on breathing during reading aloud. SPEECH LANGUAGE AND HEARING 2017. [DOI: 10.1080/2050571x.2017.1322740] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 10/19/2022]
Affiliation(s)
- Beth Wiechern
  - School of Health Sciences, University of Canterbury, Christchurch, New Zealand
- Kathleen A. Liberty
  - School of Health Sciences, University of Canterbury, Christchurch, New Zealand
- Philip Pattemore
  - Department of Paediatrics, University of Otago, Christchurch, New Zealand
- Emily Lin
  - Department of Communication Disorders, University of Canterbury, Christchurch, New Zealand
|
27
|
Vocal Control: Is It Susceptible to the Negative Effects of Self-Regulatory Depletion? J Voice 2016; 30:638.e21-31. [DOI: 10.1016/j.jvoice.2015.07.016] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 05/21/2015] [Accepted: 07/29/2015] [Indexed: 11/18/2022]
|
28
|
Yunusova Y, Graham NL, Shellikeri S, Phuong K, Kulkarni M, Rochon E, Tang-Wai DF, Chow TW, Black SE, Zinman LH, Green JR. Profiling Speech and Pausing in Amyotrophic Lateral Sclerosis (ALS) and Frontotemporal Dementia (FTD). PLoS One 2016; 11:e0147573. [PMID: 26789001 PMCID: PMC4720472 DOI: 10.1371/journal.pone.0147573] [Citation(s) in RCA: 64] [Impact Index Per Article: 8.0] [Received: 10/13/2015] [Accepted: 01/05/2016] [Indexed: 11/18/2022] Open
Abstract
Objective This study examines reading aloud in patients with amyotrophic lateral sclerosis (ALS) and those with frontotemporal dementia (FTD) in order to determine whether differences in patterns of speaking and pausing exist between patients with primary motor vs. primary cognitive-linguistic deficits, and in contrast to healthy controls. Design 136 participants were included in the study: 33 controls, 85 patients with ALS, and 18 patients with either the behavioural variant of FTD (FTD-BV) or progressive nonfluent aphasia (FTD-PNFA). Participants with ALS were further divided into 4 non-overlapping subgroups—mild, respiratory, bulbar (with oral-motor deficit) and bulbar-respiratory—based on the presence and severity of motor bulbar or respiratory signs. All participants read a passage aloud. Custom-made software was used to perform speech and pause analyses, and this provided measures of speaking and articulatory rates, duration of speech, and number and duration of pauses. These measures were statistically compared in different subgroups of patients. Results The results revealed clear differences between patient groups and healthy controls on the passage reading task. A speech-based motor function measure (i.e., articulatory rate) was able to distinguish patients with bulbar ALS or FTD-PNFA from those with respiratory ALS or FTD-BV. Distinguishing the disordered groups proved challenging based on the pausing measures. Conclusions and Relevance This study demonstrated the use of speech measures in the identification of those with an oral-motor deficit, and showed the usefulness of performing a relatively simple reading test to assess speech versus pause behaviors across the ALS—FTD disease continuum. The findings also suggest that motor speech assessment should be performed as part of the diagnostic workup for patients with FTD.
Affiliation(s)
- Yana Yunusova
  - Department of Speech-Language Pathology, University of Toronto, Toronto, Ontario, Canada
  - Sunnybrook Research Institute, Toronto, Ontario, Canada
  - University Health Network—Toronto Rehabilitation Institute, Toronto, Ontario, Canada
- Naida L. Graham
  - Department of Speech-Language Pathology, University of Toronto, Toronto, Ontario, Canada
  - University Health Network—Toronto Rehabilitation Institute, Toronto, Ontario, Canada
- Sanjana Shellikeri
  - Department of Speech-Language Pathology, University of Toronto, Toronto, Ontario, Canada
- Kent Phuong
  - Department of Speech-Language Pathology, University of Toronto, Toronto, Ontario, Canada
- Elizabeth Rochon
  - Department of Speech-Language Pathology, University of Toronto, Toronto, Ontario, Canada
  - University Health Network—Toronto Rehabilitation Institute, Toronto, Ontario, Canada
- David F. Tang-Wai
  - Department of Medicine (Neurology), University of Toronto, Toronto, Ontario, Canada
  - Division of Neurology, Toronto Western Hospital, University Health Network, Toronto, Ontario, Canada
- Tiffany W. Chow
  - Department of Medicine (Neurology), University of Toronto, Toronto, Ontario, Canada
  - Rotman Research Institute, Toronto, Ontario, Canada
- Sandra E. Black
  - Department of Medicine (Neurology), University of Toronto, Toronto, Ontario, Canada
  - L.C. Campbell Cognitive Neurology Research Unit, Sunnybrook Health Sciences Centre, Toronto, Ontario, Canada
  - Sunnybrook Health Sciences Centre, University of Toronto, Toronto, Ontario, Canada
- Lorne H. Zinman
  - Department of Medicine (Neurology), University of Toronto, Toronto, Ontario, Canada
  - Sunnybrook Health Sciences Centre, University of Toronto, Toronto, Ontario, Canada
- Jordan R. Green
  - MGH Institute of Health Professions, Boston, Massachusetts, United States of America
|
29
|
Zhang Z. Respiratory Laryngeal Coordination in Airflow Conservation and Reduction of Respiratory Effort of Phonation. J Voice 2015; 30:760.e7-760.e13. [PMID: 26596845 DOI: 10.1016/j.jvoice.2015.09.015] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Received: 07/30/2015] [Accepted: 09/22/2015] [Indexed: 11/29/2022]
Abstract
OBJECTIVE This study evaluates the need for airflow conservation and the effect of glottal resistance on the respiratory effort of phonation under different phonation conditions. METHODS A computational model of the pressure-volume-flow relationship of the respiratory system is developed. RESULTS Simulations show that increasing the glottal resistance reduces the glottal airflow and allows phonation to be sustained for a longer breath group duration. For a given breath group duration, the reduced airflow also allows phonation to be sustained within a narrow range of lung volumes, thus lowering the overall respiratory effort. CONCLUSIONS This study shows that for breath group durations and subglottal pressures typical of normal conversational speech, airflow conservation or maintaining "effortless" respiratory support does not impose a stricter requirement on the glottal resistance than that required for initiating phonation. However, the need for airflow conservation and respiratory effort reduction becomes relevant when the target subglottal pressure and breath group duration increase, as in prolonged speech or singing, or in conditions of weakened pulmonary function. In those conditions, the glottal resistance is expected to increase proportionally with increasing subglottal pressure to limit airflow consumption and reduce respiratory effort.
Affiliation(s)
- Zhaoyan Zhang
  - UCLA School of Medicine, 31-24 Rehabilitation Center, 1000 Veteran Avenue, Los Angeles, California 90095-1794.
|
30
|
Rochet-Capellan A, Fuchs S. Take a breath and take the turn: how breathing meets turns in spontaneous dialogue. Philos Trans R Soc Lond B Biol Sci 2015; 369:20130399. [PMID: 25385777 DOI: 10.1098/rstb.2013.0399] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Indexed: 11/12/2022]
Abstract
Physiological rhythms are sensitive to social interactions and could contribute to defining social rhythms. Nevertheless, our knowledge of the implications of breathing in conversational turn exchanges remains limited. In this paper, we addressed the idea that breathing may contribute to timing and coordination between dialogue partners. The relationships between turns and breathing were analysed in unconstrained face-to-face conversations involving female speakers. No overall relationship between breathing and turn-taking rates was observed, as breathing rate was specific to the subjects' activity in dialogue (listening versus taking the turn versus holding the turn). A general inter-personal coordination of breathing over the whole conversation was not evident. However, specific coordinative patterns were observed in shorter time-windows when participants engaged in taking turns. The type of turn-taking had an effect on the respective coordination in breathing. Most of the smooth and interrupted turns were taken just after an inhalation, with specific profiles of alignment to partner breathing. Unsuccessful attempts to take the turn were initiated late in the exhalation phase and with no clear inter-personal coordination. Finally, breathing profiles at turn-taking were different than those at turn-holding. The results support the idea that breathing is actively involved in turn-taking and turn-holding.
Affiliation(s)
- Amélie Rochet-Capellan
  - GIPSA-Lab, Département Parole and Cognition, CNRS and Université de Grenoble, UMR: 5216, Grenoble, France
- Susanne Fuchs
  - Zentrum für Allgemeine Sprachwissenschaft (ZAS), 10117 Berlin, Germany
|
31
|
Gartner-Schmidt JL, Hirai R, Dastolfo C, Rosen CA, Yu L, Gillespie AI. Phonatory aerodynamics in connected speech. Laryngoscope 2015. [PMID: 26197727 DOI: 10.1002/lary.25458] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Indexed: 11/09/2022]
Abstract
OBJECTIVES/HYPOTHESIS 1) Present phonatory aerodynamic data for healthy controls (HCs) in connected speech; 2) contrast these findings between HCs and patients with nontreated unilateral vocal fold paralysis (UVFP); 3) present pre- and post-vocal fold augmentation outcomes for patients with UVFP; 4) contrast data from patients with postoperative laryngeal augmentation with data from HCs. STUDY DESIGN Retrospective, single-blinded. METHODS For phase I, 20 HC participants were recruited. For phase II, 20 patients with UVFP were age- and gender-matched to the 20 HC participants used in phase I. For phase III, 20 patients with UVFP represented a pre- and posttreatment cohort. For phase IV, 20 of the HC participants from phase I and 20 of the postoperative UVFP patients from phase III were used for direct comparison. Aerodynamic measures captured from a sample of the Rainbow Passage included number of breaths, mean phonatory airflow rate, total duration of passage, inspiratory airflow duration, and expiratory airflow duration. The Voice Handicap Index-10 (VHI-10) was also obtained before and after laryngeal augmentation. RESULTS All phonatory aerodynamic measures were significantly greater in patients with preoperative UVFP than in the HC group. Patients with laryngeal augmentation took significantly fewer breaths, had a lower mean phonatory airflow rate during voicing, and had a shorter inspiratory airflow duration than the preoperative UVFP group. None of the postoperative measures returned to HC values. Significant improvement in VHI-10 scores after laryngeal augmentation was also found. CONCLUSIONS Methodology described in this study improves upon existing aerodynamic voice assessment by capturing characteristics germane to UVFP patient complaints and measuring change before and after laryngeal augmentation in connected speech. LEVEL OF EVIDENCE 4.
Affiliation(s)
- Ryoji Hirai
  - Department of Otolaryngology-Head and Neck Surgery, Nihon University School of Medicine, Tokyo, Japan
- Clark A Rosen
  - University of Pittsburgh Voice Center, Department of Otolaryngology
- Lan Yu
  - Center for Research on Health Care Data Center, University of Pittsburgh School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania, U.S.A
|
32
|
Accuracy of perceptual and acoustic methods for the detection of inspiratory loci in spontaneous speech. Behav Res Methods 2013; 44:1121-8. [PMID: 22362007 DOI: 10.3758/s13428-012-0194-0] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Indexed: 11/08/2022]
Abstract
The present study investigates the accuracy of perceptually and acoustically determined inspiratory loci in spontaneous speech for the purpose of identifying breath groups. Sixteen participants were asked to talk about simple topics in daily life at a comfortable speaking rate and loudness while connected to a pneumotach and audio microphone. The locations of inspiratory loci were determined on the basis of the aerodynamic signal, which served as a reference for loci identified perceptually and acoustically. Signal detection theory was used to evaluate the accuracy of the methods. The results showed that the greatest accuracy in pause detection was achieved (1) perceptually, on the basis of agreement between at least two of three judges, and (2) acoustically, using a pause duration threshold of 300 ms. In general, the perceptually based method was more accurate than was the acoustically based method. Inconsistencies among perceptually determined, acoustically determined, and aerodynamically determined inspiratory loci for spontaneous speech should be weighed in selecting a method of breath group determination.
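The acoustic method described in this abstract can be sketched as follows: silent gaps between speech intervals are flagged as candidate inspiratory loci when they reach a duration threshold, and the flags are scored against a reference using signal detection measures. This is a minimal sketch, not the authors' procedure. The function names, the input format, the per-pause unit of analysis, and the log-linear correction are all assumptions; only the 300 ms threshold comes from the abstract.

```python
from statistics import NormalDist


def pauses_from_intervals(speech_intervals):
    """Return (start, end) silent gaps between consecutive speech intervals, in seconds."""
    gaps = []
    for (_, prev_end), (next_start, _) in zip(speech_intervals, speech_intervals[1:]):
        if next_start > prev_end:
            gaps.append((prev_end, next_start))
    return gaps


def classify_pauses(pauses, threshold_s=0.300):
    """Flag a pause as a candidate inspiratory locus if it lasts at least threshold_s."""
    return [(end - start) >= threshold_s for start, end in pauses]


def score_against_reference(predicted, reference):
    """Return (d_prime, hit_rate, false_alarm_rate) for boolean flags per pause.

    reference marks pauses where an inspiration truly occurred (for example,
    from an airflow signal). A log-linear correction (assumed here) keeps the
    rates away from 0 and 1 so the z-transform stays finite.
    """
    hits = sum(p and r for p, r in zip(predicted, reference))
    misses = sum((not p) and r for p, r in zip(predicted, reference))
    false_alarms = sum(p and (not r) for p, r in zip(predicted, reference))
    correct_rejections = sum((not p) and (not r) for p, r in zip(predicted, reference))
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate), hit_rate, fa_rate


# Made-up example: four speech intervals, three gaps, one true inspiration.
intervals = [(0.0, 1.0), (1.1, 2.0), (2.5, 3.0), (3.35, 4.0)]
flags = classify_pauses(pauses_from_intervals(intervals))
d_prime, hit_rate, fa_rate = score_against_reference(flags, [False, True, False])
```

Rerunning the scoring step over a range of `threshold_s` values is one way to compare candidate thresholds, which is the kind of comparison the abstract reports.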
|
33
|
Feenaughty L, Tjaden K, Benedict RHB, Weinstock-Guttman B. Speech and pause characteristics in multiple sclerosis: a preliminary study of speakers with high and low neuropsychological test performance. CLINICAL LINGUISTICS & PHONETICS 2013; 27:134-51. [PMID: 23294227 PMCID: PMC5554953 DOI: 10.3109/02699206.2012.751624] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Indexed: 05/23/2023]
Abstract
This preliminary study investigated how cognitive-linguistic status in multiple sclerosis (MS) is reflected in two speech tasks (i.e. oral reading, narrative) that differ in cognitive-linguistic demand. Twenty individuals with MS were selected to comprise High and Low performance groups based on clinical tests of executive function and information processing speed and efficiency. Ten healthy controls were included for comparison. Speech samples were audio-recorded and measures of global speech timing were obtained. Results indicated predicted differences in global speech timing (i.e. speech rate and pause characteristics) for speech tasks differing in cognitive-linguistic demand, but the magnitude of these task-related differences was similar for all speaker groups. Findings suggest that assumptions concerning the cognitive-linguistic demands of reading aloud as compared to spontaneous speech may need to be re-considered for individuals with cognitive impairment. Qualitative trends suggest that additional studies investigating the association between cognitive-linguistic and speech motor variables in MS are warranted.
Affiliation(s)
- Lynda Feenaughty
  - Department of Communicative Disorders and Sciences, University at Buffalo, Buffalo, NY 14214, USA.
|
34
|
Che WC, Wang YT, Lu HJ, Green JR. Respiratory changes during reading in Mandarin-speaking adolescents with prelingual hearing impairment. Folia Phoniatr Logop 2011; 63:275-80. [PMID: 21372590 DOI: 10.1159/000324211] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/19/2022]
Abstract
OBJECTIVE Most people with severe to profound hearing impairment (SHI) exhibit speech breathing changes, but little is known about the breath group (BG) structure for this population. The purpose of this study was to investigate whether, compared with speakers with normal hearing, Mandarin-speaking adolescents with prelingual SHI take inspirations more often at syntactically inappropriate positions and differ in temporal BG characteristics. PATIENTS AND METHODS Forty participants, 20 speakers with prelingual SHI and 20 normal-hearing controls matched for age, sex and education level, were recruited. While wearing a circumferentially vented mask connected to a pneumotachograph, the subjects read three passages. The airflow signal was used to locate inspiratory loci in the speech samples. Temporal parameters of BG structure were derived from the acoustic signal. RESULTS The SHI group, compared to the control group, had significantly (1) more inspiratory loci at inappropriate and minor syntactic boundaries; (2) fewer syllables per BG, slower speaking rate, longer inter-BG pauses, and longer noninspiratory pauses, but comparable inspiratory duration, expiration duration, and BG duration. CONCLUSION The slower speaking rate within BGs and longer inter-BG pauses mainly account for the respiratory changes in Mandarin-speaking adolescents with prelingual SHI.
Affiliation(s)
- Wei-Chun Che
  - School of Dentistry, National Yang-Ming University, Taipei, Taiwan, ROC
|
35
|
Accuracy of perceptually based and acoustically based inspiratory loci in reading. Behav Res Methods 2010; 42:791-7. [PMID: 20805602 DOI: 10.3758/brm.42.3.791] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 11/08/2022]
Abstract
Investigations of speech often involve the identification of inspiratory loci in continuous recordings of speech. The present study investigates the accuracy of perceptually determined and acoustically determined inspiratory loci. While wearing a circumferentially vented mask connected to a pneumotach, 16 participants read two passages. The perceptually determined and acoustically determined inspiratory loci were compared with the actual loci of inspiration, which were determined aerodynamically. The results showed that (1) agreement across all three judges was the most accurate of the approaches considered here for detecting inspiratory loci based on listening; (2) the most accurate pause duration threshold for detecting inspiratory loci was 250 msec; and (3) the perceptually based breath-group determination was more accurate than the acoustically based determination of pause duration. Inconsistencies among perceptually determined, acoustically determined, and aerodynamically determined inspiratory loci are not negligible and, therefore, need to be considered when researchers design experiments on breath groups in speech.
|