51
|
Kaburagi T, Kawai K, Abe S. Analysis of voice source characteristics using a constrained polynomial representation of voice source signals. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2007; 121:745-8. [PMID: 17348497 DOI: 10.1121/1.2359234] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/14/2023]
Abstract
To analyze the characteristics of voice source signals from speech, a model is presented in the form of a polynomial function by expanding the definition of the Rosenberg model. In combination with the all-pole assumption of the vocal-tract filter, methods are described for pitch-synchronous speech analysis and for a temporal search of the glottal opening and closing instants. Because the source and filter models are both linear, the parameter estimation problem can be conveniently solved. In addition, the temporal search method can refine the locations of the glottal events and improve the accuracy of the parameter estimation. Analyses of non-nasalized voiced speech are conducted using an electroglottographic device from which the initial estimate of the temporal information is given.
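For orientation, the Rosenberg model that this abstract extends is commonly written as a piecewise polynomial glottal pulse. The sketch below uses the widely cited form with a cubic opening phase and a quadratic closing phase. It is not the constrained polynomial of Kaburagi et al., whose coefficients the abstract does not give, and the function name and sample-count parameters are illustrative assumptions.

```python
def rosenberg_pulse(n_open, n_close, n_total, amplitude=1.0):
    """Return one period of a Rosenberg-type glottal flow pulse.

    n_open  : samples in the opening phase (hypothetical parameter name)
    n_close : samples in the closing phase
    n_total : samples in the whole pitch period (closed phase fills the rest)
    """
    if n_open <= 0 or n_close <= 0 or n_open + n_close > n_total:
        raise ValueError("phases must be positive and fit inside the period")

    pulse = []
    for n in range(n_total):
        if n < n_open:
            # Opening phase: cubic polynomial rising smoothly from 0 to 1.
            x = n / n_open
            g = 3.0 * x**2 - 2.0 * x**3
        elif n < n_open + n_close:
            # Closing phase: quadratic polynomial falling from 1 towards 0.
            x = (n - n_open) / n_close
            g = 1.0 - x**2
        else:
            # Closed phase: no glottal flow.
            g = 0.0
        pulse.append(amplitude * g)
    return pulse


pulse = rosenberg_pulse(n_open=40, n_close=20, n_total=100)
```

Because each phase is a polynomial in time, the pulse is linear in its polynomial coefficients, which is the property the abstract relies on when it says the estimation problem can be conveniently solved.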
|
52
|
Pouplier M. Tongue kinematics during utterances elicited with the SLIP technique. LANGUAGE AND SPEECH 2007; 50:311-341. [PMID: 17974322 DOI: 10.1177/00238309070500030201] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Indexed: 05/25/2023]
Abstract
In recent years, there has been an increasing number of instrumental investigations into the nature of speech production errors, prompted by the concern that decades of transcription-based speech error data may be tainted by perceptual biases. While all of these instrumental studies suggest that errors are not, as previously thought, necessarily all-or-none, it is unclear what implications these studies have for phonological encoding as a cognitive process. Due to their repetition-based design, the ill-formed errors obtained in these studies may be articulation errors rather than cognitive planning errors. The present study reports for the first time tongue movement data collected during an error elicitation study based on the SLIP technique, which has traditionally been hypothesized to elicit errors at the phonological planning level. Results indicate that tongue kinematics during errors in the present task are comparable to those found in errorful utterances in repetition tasks. The findings are interpreted within a dynamic model of speech production as errors in phasing between the interacting consonant gestures.
|
53
|
Tack JW, Rakhorst G, van der Houwen EB, Mahieu HF, Verkerke GJ. In vitro evaluation of a double-membrane–based voice-producing element for laryngectomized patients. Head Neck 2007; 29:665-74. [PMID: 17252591 DOI: 10.1002/hed.20560] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Indexed: 11/11/2022] Open
Abstract
BACKGROUND A sound generator based on a double-membrane design that fits into a regular tracheoesophageal shunt valve may improve voice quality after total laryngectomy in patients rehabilitated with surgical voice prostheses. METHODS Voice-producing element (VPE) prototypes were manufactured using medical grade biocompatible materials and tested in vitro under physiological conditions. RESULTS Basic sound, containing multiple harmonics, was successfully produced under physiologic air pressure and airflow conditions. The fundamental frequency and sound pressure level (SPL) are controlled by changing the driving pressure, thus enabling sufficient intonation for day-to-day speech. The obtained frequency range (190-350 Hz) is appropriate for producing a female voice. The low noise-to-harmonics ratio (mean 0.15) and the efficiency of sound production (5.5 × 10⁻⁵ at 80 dB(A) and 0.15 m microphone distance) are comparable to those of normal vocal folds. CONCLUSIONS Functional restoration of the voice after laryngectomy with a double-membrane VPE appears to be a feasible concept for female laryngectomized patients with a hypotonic or atonic pharyngoesophageal segment.
|
54
|
Liu J, Liu S. [Features and clinical application of acoustic parameters in voice]. SHENG WU YI XUE GONG CHENG XUE ZA ZHI = JOURNAL OF BIOMEDICAL ENGINEERING = SHENGWU YIXUE GONGCHENGXUE ZAZHI 2006; 23:919-22. [PMID: 17002139] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/12/2023]
Abstract
In the study of laryngeal phonatory function, methods of acoustic evaluation and phonatory detection have become a focus for otorhinolaryngologists and speech pathologists. A great number of acoustic parameters have been designed and used. This article discusses objective indexes that reflect the functional condition of the vocal cords.
|
55
|
Katz WF, Bharadwaj SV, Stettler MP. Influences of electromagnetic articulography sensors on speech produced by healthy adults and individuals with aphasia and apraxia. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2006; 49:645-59. [PMID: 16787902 DOI: 10.1044/1092-4388(2006/047)] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 05/10/2023]
Abstract
PURPOSE This study examined whether the intraoral transducers used in electromagnetic articulography (EMA) interfere with speech and whether there is an added risk of interference when EMA systems are used to study individuals with aphasia and apraxia. METHOD Ten adult talkers (5 individuals with aphasia/apraxia, 5 controls) produced 12 American English vowels in /hVd/ words, the fricative-vowel (FV) words (/si/, /su/, /ʃi/, /ʃu/), and the sentence "She had your dark suit in greasy wash water all year," in EMA sensors-on and sensors-off conditions. Segmental durations, vowel formant frequencies, and fricative spectral moments were measured to address possible acoustic effects of sensor placement. A perceptual experiment examined whether FV words produced in the sensors-on condition were less identifiable than those produced in the sensors-off condition. RESULTS EMA sensors caused no consistent acoustic effects across all talkers, although significant within-subject effects were noted for a small subset of the talkers. The perceptual results revealed some instances of sensor-related intelligibility loss for FV words produced by individuals with aphasia and apraxia. CONCLUSIONS The findings support previous suggestions that acoustic screening procedures be used to protect articulatory experiments from those individuals who may show consistent effects of having devices placed on intraoral structures. The findings further suggest that studies of fricatives produced by individuals with aphasia and apraxia may require additional safeguards to ensure that results are not adversely affected by intraoral sensor interference.
|
56
|
Saito M, Imagawa H, Sakakibara KI, Tayama N, Nibu KI, Amatsu M. High-speed digital imaging and electroglottography of tracheoesophageal phonation by Amatsu's method. Acta Otolaryngol 2006; 126:521-5. [PMID: 16698703 DOI: 10.1080/00016480500415613] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Indexed: 10/25/2022]
Abstract
BACKGROUND Our previous findings have indicated that the thyropharyngeal muscles form a retropharyngeal prominence during alaryngeal phonation via the TE fistula. This prominence forms a so-called 'neoglottis', which is thought to function as the vibratory source. To better understand the mechanism of TE phonation, we analyzed the vibration of the neoglottis using electroglottography (EGG) and a high-speed digital imaging system. PATIENTS AND METHODS Two volunteers who use TE phonation for their daily speech communication were subjected to this study. The vibrations of the neoglottis were recorded simultaneously as EGG and high-speed imaging with acoustic signals. RESULTS The vibrations of the neoglottis, recorded by means of high-speed digital imaging, were exactly synchronized with the waveforms of the acoustic signals and EGG. CONCLUSIONS These results further confirm the neoglottis as the source of vibration during tracheoesophageal (TE) phonation.
|
57
|
McLeod S. Australian adults' production of /n/: an EPG investigation. CLINICAL LINGUISTICS & PHONETICS 2006; 20:99-107. [PMID: 16428225 DOI: 10.1080/02699200400026496] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 05/06/2023]
Abstract
Images of tongue/palate contact for the nasal phoneme /n/ were created using the electropalatograph (EPG). Seven typical Australian adults with no history of hearing or communication difficulty produced syllables containing /n/ paired with five vowels. The majority of productions were symmetrical, had contact with the alveolar ridge, and showed lateral bracing along the sides of the palate; however, there were notable exceptions. There was a wide range of inter- and intra-participant variation in the visual representations of the maximum point of contact as well as in measures of total palate contact (TPC) and centre of gravity (COG). It is suggested that descriptions of acceptable production of /n/ should provide a range of tongue/palate contact patterns.
|
58
|
Watterson T, Lewis K, Brancamp T. Comparison of Nasalance scores obtained with the Nasometer 6200 and the Nasometer II 6400. Cleft Palate Craniofac J 2006; 42:574-9. [PMID: 16149843 DOI: 10.1597/04-017.1] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.8] [Indexed: 11/22/2022] Open
Abstract
OBJECTIVE This study was designed to compare nasalance scores obtained with the old Nasometer 6200 and the new Nasometer II 6400, and to evaluate test-retest reliability of nasalance scores on each machine. DESIGN Nasalance scores were obtained for 60 subjects reading each of two stimuli. Each subject read each stimulus two times on one machine; the headgear was removed and replaced and each stimulus was read a third time. The same procedure was then repeated with the second machine. Within machines, nasalance scores were compared for repeated stimuli with and without headgear change. The first reading of each stimulus with each machine was used to compare nasalance scores across machines. PARTICIPANTS The subjects were 60 adults with normal speech ranging in age from 19 to 59 years. MAIN OUTCOME MEASURES The main outcome measures were the 12 nasalance scores obtained for each of 60 subjects. RESULTS For both passages, there was a significant difference in nasalance scores between the old Nasometer and the Nasometer II; however, the actual variability that could be attributed to a difference between machines was small. Most of the variability between machines could be explained as within-subject performance variability and variability associated with headgear change. There was no significant difference in repeated scores within machines with or without headgear change. CONCLUSIONS For clinical purposes, care should be exercised when comparing nasalance scores between the old Nasometer and the Nasometer II.
|
59
|
Fulop SA, Fitz K. Algorithms for computing the time-corrected instantaneous frequency (reassigned) spectrogram, with applications. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2006; 119:360-71. [PMID: 16454291 DOI: 10.1121/1.2133000] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.3] [Indexed: 05/06/2023]
Abstract
A modification of the spectrogram (log magnitude of the short-time Fourier transform) to more accurately show the instantaneous frequencies of signal components was first proposed in 1976 [Kodera et al., Phys. Earth Planet. Inter. 12, 142-150 (1976)], and has been considered or reinvented a few times since but never widely adopted. This paper presents a unified theoretical picture of this time-frequency analysis method, the time-corrected instantaneous frequency spectrogram, together with detailed implementable algorithms comparing three published techniques for its computation. The new representation is evaluated against the conventional spectrogram for its superior ability to track signal components. The lack of a uniform framework for either mathematics or implementation details which has characterized the disparate literature on the schemes has been remedied here. Fruitful application of the method is shown in the realms of speech phonation analysis, whale song pitch tracking, and additive sound modeling.
|
60
|
Chen F, Zhang YT. Loudness normalization for cochlear implant using pulse-rate modulation to convey Mandarin tonal information: a model-based study. CONFERENCE PROCEEDINGS : ... ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL CONFERENCE 2006; 2006:1236-1239. [PMID: 17946451 DOI: 10.1109/iembs.2006.259368] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/25/2023]
Abstract
Cochlear implant (CI) devices employ electrical pulsatile stimulation of the auditory nerves (AN) to restore partial hearing to a profoundly deafened person. To improve speech perception for CI users who speak a tonal language such as Mandarin, it has been suggested that the pulse rate be modulated according to Mandarin tonal patterns to convey the tonal information. However, recent psychological experiments have found that pulse-rate modulation produces an accompanying variation in perceived loudness. The purpose of this paper is to introduce an amplitude compensation scheme that normalizes loudness perception when the pulse rate is modulated to convey Mandarin tonal information. Based on an integrate-and-fire AN model, a loudness perception model and a pitch perception model were implemented. Results of the model-based simulation showed that, with the proposed amplitude compensation scheme, the estimated loudness was normalized while the Mandarin tonal information could still be efficiently transmitted. It is believed that, when the proposed electrical pulsatile stimulation incorporating both pulse-rate modulation and amplitude compensation is integrated with present CI devices, it would more efficiently enhance speech identification for cochlear implantees who speak tonal languages such as Mandarin.
|
61
|
Dawes KS, Kelly SW. An instrument for the non-invasive assessment of lip function during speech. Med Eng Phys 2005; 27:523-35. [PMID: 15990069 DOI: 10.1016/j.medengphy.2004.11.007] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Received: 08/31/2004] [Revised: 10/29/2004] [Accepted: 11/24/2004] [Indexed: 11/20/2022]
Abstract
This paper describes the development of an instrument using infrared light as a non-invasive means of detecting lip opening, the extent of the opening and also the forward protrusion and backward movement of the lips during speech. The design criteria were to build a simple stand alone means of assessing lip function, which could also link to the group's commercially available Super Nasal-Oral Ratiometry System (SNORS+). SNORS+ allows objective assessment of the function and co-ordination of key articulators, with lip function previously monitored using a video camera. Synchronised tests were carried out using the new Lip Function Monitor and the video camera simultaneously, in order to verify that the signals produced related directly to the activity of the mouth. A small trial was then conducted to show that the system provides reproducible results throughout a range of 'normal' subjects. These subjects were of different gender and race to create a sample group within which there was a variety of lip sizes and face shapes. Technical aspects of the instrument and trial results are presented here. These suggest that the simple visual output and feedback of the instrument will prove useful in the assessment and management of speech disorders.
|
62
|
Larkin M. Computerised system translates “subvocal speech” into action. Lancet Neurol 2004; 3:262. [PMID: 15132141 DOI: 10.1016/s1474-4422(04)00754-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/26/2022]
|
63
|
Smith BE, Patil Y, Guyette TW, Brannan TS, Cohen M. Pressure-Flow Measurements for Selected Oral Sound Segments Produced by Normal Children and Adolescents: A Basis for Clinical Testing. J Craniofac Surg 2004; 15:247-54; discussion 254. [PMID: 15167242 DOI: 10.1097/00001665-200403000-00017] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/26/2022] Open
Abstract
Despite advances in surgery, a significant number of patients who undergo cleft palate repair have residual velopharyngeal insufficiency. Maxillary advancement may also result in velopharyngeal openings during speech. Instrumental approaches providing objective measures of palatal function assisting in the accurate diagnosis of these patients include pressure-flow measurements of velopharyngeal valving during speech. There is little information to guide clinicians in interpreting pressure-flow data when testing pediatric patients, however. The primary purpose of this study was to develop a method for categorizing pressure-flow data used in the diagnosis of children and adolescents with suspected velopharyngeal insufficiency. This prospective study involved 56 male and female subjects 5 to 18 years of age. Subjects had normal speech and resonance at the time of testing, no history of speech therapy, no upper respiratory infections or allergies at the time of testing, and no orofacial anomalies. Subjects repeated oral syllables and the word "hamper" after an examiner. Mean pressures, airflows, and velopharyngeal orifice areas were obtained for each utterance produced by each subject. A discriminant function analysis was performed to determine whether data could be grouped by age, gender, or utterance type. Results indicated significant differences in data for age groups 5 to 8 years, 9 to 13 years, and 14 to 18 years. There were no significant differences between data for male subjects versus female subjects or for different utterance types. Pressures generally decreased, whereas airflows and orifice areas increased with age. Results for 14 to 18 year olds were like those for adults. Using these data, a categorization scheme for velopharyngeal function was proposed for use in clinical testing.
|
64
|
Gomes GF, Vargas JVC, Filho EDM. Utilization of temperature distribution in expiratory speaking flow as a new parameter for speech production analysis. J Med Eng Technol 2004; 28:22-31. [PMID: 14660182 DOI: 10.1080/0309190032000112298] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/11/2023]
Abstract
A new instrument with potential use for speech production analysis is utilized in this study to measure the temperature and velocity of the expiratory speaking flow outside the oral cavity. From a physical point of view, the temperature patterns of individuals with healthy voices are expected to be different from individuals with breathy voices, since their air flow patterns are different: during breathy speech production, the glottis does not close completely, and the leakage of warm air through the glottis increases the extent of the hotter-than-ambient temperature field outside the oral cavity. The instrument is a pipe through which the tested individual breathes out while producing a sustained vowel. A tap water heat exchanger keeps the pipe wall at a temperature level considerably lower than the body temperature. The temperature gradient along the pipe centreline is measured and related to the average air velocity at the oral cavity. The measurements were performed in 30 male and 30 female subjects without vocal complaints. The objective of this initial investigation was to evaluate the possibility of establishing patterns of normality for the temperature distribution outside the oral cavity in expiratory speaking flow. In the experiments, all the temperature measurements increased as the expiratory air flow of the individual increased during speech production, therefore the instrument results agree with the physical behavior predicted by fluid mechanics and heat transfer principles. The collected data allowed for the construction of charts with two distinct normalized temperature distributions outside the oral cavity, for male and female individuals, respectively. These charts have the potential for future utilization in a follow-up study for comparison with similar measurements obtained with individuals with vocal fold pathologies, aiming to eventually produce a reliable new instrument for early detection of vocal problems through a non-invasive procedure.
|
65
|
Searl JP. Comparison of transducers and intraoral placement options for measuring lingua-palatal contact pressure during speech. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2003; 46:1444-1456. [PMID: 14700367 DOI: 10.1044/1092-4388(2003/112)] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 05/24/2023]
Abstract
Two studies were completed that focused on instrumentation and procedural issues associated with measurement of lingua-palatal contact pressure (LPCP) during speech. In the first experiment, physical features and response characteristics of 2 miniature pressure transducers (Entran EPI-BO and Precision Measurement 60S) were evaluated to identify a transducer suitable for measuring LPCP during speech. The 2 transducers were comparable in terms of physical dimensions and most response characteristics. However, the Entran device was less affected by air temperature fluctuations, making it the more attractive option for speech LPCP measurement. In a second experiment, 3 methods of placing the Entran device in the mouth were compared. The 3 adhesion methods evaluated were (a) taping a transducer to the hard palate, (b) surface mounting on a mold of the palate, and (c) flush mounting on a mold of the palate. Directly taping the transducer to the alveolar ridge was the least acceptable option, as it resulted in changes in other aspects of speech production (consonant duration and centroid frequency of the burst/frication), suggesting that articulation was unduly altered. Direct taping was also rated as least acceptable by the speakers. Surface and flush mounting resulted in fewer changes in speech aerodynamic and acoustic parameters of /t/ and /s/ compared to the tape condition. Listener ratings also indicated less articulatory disturbance in the surface and flush mounting conditions compared to the tape condition. Surface mounting was technically easier than flush mounting, and it allows for rapid repositioning of the transducer if needed.
|
66
|
Gülmezoğlu MB, Barkana A. Effect of the losses in the vocal tract on determination of the area function. Biomed Mater Eng 2003; 13:159-66. [PMID: 12775906] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/02/2023]
Abstract
In this work, the cross-sectional areas of the vocal tract are determined for the lossy and lossless cases by using the pole-zero models obtained from the electrical equivalent circuit model of the vocal tract and the system identification method. The cross-sectional areas are used to compare the lossy and lossless cases. In the lossy case, the internal losses due to wall vibration, heat conduction, air friction and viscosity are considered; that is, the complex poles and zeros obtained from the models are used directly. In the lossless case, by contrast, only the imaginary parts of these poles and zeros are used. The vocal tract shapes obtained for the lossy case are close to the actual ones.
|
67
|
Wilhelm FH, Handke EM, Roth WT. Detection of speaking with a new respiratory inductive plethysmography system. BIOMEDICAL SCIENCES INSTRUMENTATION 2003; 39:136-41. [PMID: 12724882] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/02/2023]
Abstract
The LifeShirt system, a garment with integrated sensors connected to a handheld computer, allows recording of a wide variety of clinically important cardiorespiratory data continuously for extended periods outside the laboratory or clinic. The device includes sensors for assessment of physical activity and posture since both can affect physiological activation and need to be controlled. Speaking is another potential confounding factor in the interpretation of physiological data. Auditory speech recording is problematic because it can pick up sources other than the person's voice (external microphone) or is obtrusive (throat microphone). The abdominal and thoracic calibrated respiratory inductive plethysmography (RIP) sensors integrated in the LifeShirt system might be an adequate alternative for detecting speech. In a laboratory experiment we determined respiratory parameters indicative of speech. Eighteen subjects were instructed to sit quietly, write, and speak continuously, for 4 min each. Nine parameters were derived from the RIP signals and averaged over each minute. In addition, nine variability parameters were computed as their coefficients of breath-by-breath variation. Inspiratory/expiratory time (IE-ratio) best distinguished speaking from writing with 98% correct classification at a cutoff criterion of 0.52. This criterion was equally successful in distinguishing speaking from sitting quietly. Discriminant analyses indicated that linear combinations of IE-ratio and a variety of other parameters did not reliably improve classification accuracy across tasks and replications. These results demonstrate the high efficacy of RIP-derived IE-ratio for speech detection and suggest that auditory recording is not necessary for detection of speech in ambulatory assessment.
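The single-parameter rule reported above can be sketched as a threshold on the inspiratory/expiratory time ratio. The 0.52 cutoff comes from the abstract; the direction of the comparison is an assumption (speech typically pairs a short inspiration with a long expiration, which would lower the ratio), and the function names are hypothetical. The study applied the criterion to values averaged over each minute, which this sketch does not reproduce.

```python
IE_CUTOFF = 0.52  # cutoff criterion reported in the abstract


def ie_ratio(inspiratory_time_s, expiratory_time_s):
    """Inspiratory time divided by expiratory time for one breath."""
    if expiratory_time_s <= 0:
        raise ValueError("expiratory time must be positive")
    return inspiratory_time_s / expiratory_time_s


def is_speaking(inspiratory_time_s, expiratory_time_s, cutoff=IE_CUTOFF):
    """Label a breath as speech when its IE-ratio falls below the cutoff.

    ASSUMPTION: the abstract gives the cutoff but not which side of it
    corresponds to speaking; 'below' is chosen here for illustration.
    """
    return ie_ratio(inspiratory_time_s, expiratory_time_s) < cutoff


# Short breath in, long phrase out: ratio 0.5 / 3.0, below the cutoff.
speech_like = is_speaking(0.5, 3.0)
# Quiet breathing with similar phases: ratio 1.5 / 2.0, above the cutoff.
quiet_like = is_speaking(1.5, 2.0)
```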
|
68
|
Buder EH, Strand EA. Quantitative and graphic acoustic analysis of phonatory modulations: the modulogram. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2003; 46:475-490. [PMID: 14700387 DOI: 10.1044/1092-4388(2003/039)] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 05/24/2023]
Abstract
A method is presented for analyzing phonatory instabilities that occur as modulations of fundamental frequency (f0) and sound pressure level (SPL) on the order of 0.2 to 20 cycles per second. Such long-term phonatory instabilities, including but not limited to traditional notions of tremor, are distinct from cycle-to-cycle perturbation such as jitter or shimmer. For each of the 2 parameters (f0, in Hz, and SPL, in dB), 3 frequency domains are proposed: (a) flutter (10-20 Hz), (b) tremor (2-10 Hz), and (c) wow (0.2-2.0 Hz), yielding 6 types of instability. Analyses were implemented using fast Fourier transforms (FFTs) with domain-specific analysis parameters. Outputs include a graphic display in the form of a set of low-frequency spectrograms (the "modulogram") and quantitative measures of the frequencies, magnitudes, durations, and sinusoidal form of the instabilities. An index of a given instability is developed by combining its duration and average modulation magnitude into a single quantity. Performance of the algorithms was assessed by analyzing test signals with known degrees of modulation, and a range of applications was reviewed to provide a rationale for use of modulograms in phonatory assessment.
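The six instability types follow from crossing the two parameters with the three modulation bands. A minimal sketch of that lookup is given below. The band limits are those stated in the abstract; assigning a shared edge (2 Hz, 10 Hz) to the higher band is an assumption, and the function names are hypothetical. It does not reproduce the FFT-based modulogram analysis itself.

```python
# (name, lower limit in Hz, upper limit in Hz), as listed in the abstract.
BANDS = (
    ("wow", 0.2, 2.0),
    ("tremor", 2.0, 10.0),
    ("flutter", 10.0, 20.0),
)


def modulation_band(freq_hz):
    """Return the band name for a modulation frequency, or None if outside 0.2-20 Hz.

    ASSUMPTION: a frequency exactly on a shared edge goes to the higher band;
    the top edge (20 Hz) is included in 'flutter'.
    """
    for name, low, high in BANDS:
        if low <= freq_hz < high:
            return name
    if freq_hz == BANDS[-1][2]:
        return BANDS[-1][0]
    return None


def instability_type(parameter, freq_hz):
    """Combine the modulated parameter ('f0' or 'SPL') with its band name."""
    if parameter not in ("f0", "SPL"):
        raise ValueError("parameter must be 'f0' or 'SPL'")
    band = modulation_band(freq_hz)
    return None if band is None else f"{parameter} {band}"


label = instability_type("f0", 5.0)
```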
|
69
|
Verneuil A, Berry DA, Kreiman J, Gerratt BR, Ye M, Berke GS. Modeling measured glottal volume velocity waveforms. Ann Otol Rhinol Laryngol 2003; 112:120-31. [PMID: 12597284 DOI: 10.1177/000348940311200204] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.5] [Indexed: 11/15/2022]
Abstract
The source-filter theory of speech production describes a glottal energy source (volume velocity waveform) that is filtered by the vocal tract and radiates from the mouth as phonation. The characteristics of the volume velocity waveform, the source that drives phonation, have been estimated, but never directly measured at the glottis. To accomplish this measurement, constant temperature anemometer probes were used in an in vivo canine constant pressure model of phonation. A 3-probe array was positioned supraglottically, and an endoscopic camera was positioned subglottically. Simultaneous recordings of airflow velocity (using anemometry) and glottal area (using stroboscopy) were made in 3 animals. Glottal airflow velocities and areas were combined to produce direct measurements of glottal volume velocity waveforms. The anterior and middle parts of the glottis contributed significantly to the volume velocity waveform, with less contribution from the posterior part of the glottis. The measured volume velocity waveforms were successfully fitted to a well-known laryngeal airflow model. A noninvasively measured volume velocity waveform holds promise for future clinical use.
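Combining velocity and area into a volume velocity waveform amounts, in its simplest form, to a sample-by-sample product. The sketch below assumes a single representative velocity per time sample; the abstract does not say how the three probe signals were weighted across the anterior, middle, and posterior glottis, so that step is left out. Names and units are illustrative.

```python
def volume_velocity(velocities_cm_per_s, areas_cm2):
    """Multiply time-aligned velocity and area samples to get flow in cm^3/s.

    ASSUMPTION: one mean velocity applies to the whole glottal area at each
    sample; the study's actual combination of probe signals is not given.
    """
    if len(velocities_cm_per_s) != len(areas_cm2):
        raise ValueError("velocity and area series must be the same length")
    return [v * a for v, a in zip(velocities_cm_per_s, areas_cm2)]


# Three time samples across an opening glottis (made-up illustrative values).
flow = volume_velocity([0.0, 1000.0, 2000.0], [0.0, 0.05, 0.1])
```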
|
70
|
Abstract
Distinguishing between vocal changes that occur with normal aging and those that are associated with disease is an important goal of research in voice. Several acoustic measures have been used in an attempt to illuminate the integrity of the vocal mechanism, including harmonics-to-noise ratio (HNR), jitter, and fundamental frequency (F0). HNR is a measure that quantifies the amount of additive noise in the voice signal; jitter reflects the periodicity of vocal fold vibration. In this study, measures of HNR, jitter and F0 were used to compare vocal function in three groups of normally speaking women: young adults, middle-aged adults, and elderly adults. Significant differences in HNR emerged between the elderly women and the other two groups. F0 differences were also apparent between the elderly group and the two younger groups; there were no significant differences in jitter among the three groups. HNR was found to be a more sensitive index of vocal function than jitter. The significant lowering of HNR evident in the elderly speakers may be attributable in part to medications taken by the majority of these elderly subjects.
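Two of the measures named above have simple textbook definitions, sketched here for reference. The abstract does not state which jitter variant or HNR algorithm the study used, so the formulas below (local jitter as the mean absolute difference between consecutive periods relative to the mean period, and HNR as a decibel energy ratio) are common definitions chosen as assumptions.

```python
import math


def local_jitter_percent(periods_s):
    """Mean absolute difference of consecutive periods, as % of the mean period."""
    if len(periods_s) < 2:
        raise ValueError("need at least two glottal periods")
    diffs = [abs(a - b) for a, b in zip(periods_s, periods_s[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_period = sum(periods_s) / len(periods_s)
    return 100.0 * mean_diff / mean_period


def hnr_db(harmonic_energy, noise_energy):
    """Harmonics-to-noise ratio in dB from harmonic and noise energies."""
    if harmonic_energy <= 0 or noise_energy <= 0:
        raise ValueError("energies must be positive")
    return 10.0 * math.log10(harmonic_energy / noise_energy)


# Perfectly periodic voice: no cycle-to-cycle variation, so zero jitter.
steady = local_jitter_percent([0.005, 0.005, 0.005, 0.005])
# Harmonic energy 100 times the noise energy corresponds to 20 dB.
ratio = hnr_db(100.0, 1.0)
```

Under these definitions a lower HNR means relatively more noise in the signal, which is the direction of change the abstract reports for the elderly group.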
|
71
|
Zagólski O, Składzień J, Carlson E, Modrzejewski M, Strek P. [Language tests in electroglottography]. OTOLARYNGOLOGIA POLSKA 2002; 56:327-31. [PMID: 12162022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/26/2023]
Abstract
An electroglottograph (Fourcin Laryngograph Processor) was used to determine whether there was a significant difference in the degree of irregularity in vocal fold vibration (% Irregularity) or Contact Quotient (Qx) in the vocal fold vibratory cycle (Lx) between different phonatory tasks and between men and women. The most recent software package, 'Speech Studio', was used to analyse the data. The participants were 24 healthy subjects, 16 females and 8 males, aged from 22 to 71. The tasks consisted of sustained vowels (a, o, u), short words (Ala, Ola, Ula), and a short phrase in Polish, spoken at comfortable pitch and volume. The average age of females and males did not differ significantly on t-testing. The results showed no statistically significant differences in mean Qx values between phonatory tasks on ANOVA testing. However, the mean values of % Irregularity differed significantly between tasks. The % Irregularity obtained for the phrase was significantly different from the values obtained for the sustained vowels and single words. The values for the latter two did not differ significantly. The differences between male and female Qx and % Irregularity values were not significant on t-testing, except for % Irregularity for the word 'Ola', which was greater in men. Our findings show that sustained vowel phonation, single words, or phrases may all be used to calculate Qx. % Irregularity of vocal fold vibration is, however, task dependent.
|
72
|
Weinrich M, Mccall D, Boser KI, Virata T. Narrative and procedural discourse production by severely aphasic patients. Neurorehabil Neural Repair 2002; 16:249-74. [PMID: 12234088 DOI: 10.1177/154596802401105199] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.5] [Indexed: 11/17/2022]
Abstract
Five chronically aphasic subjects were trained on a computerized iconographic communication system (C-VIC). Their performance in producing single sentences, scripts, and narratives was assessed using both spoken English and C-VIC. The requisite vocabulary and the narrative complexity of the target productions were controlled. Subject performance using C-VIC indicates that the ability to construct discourse at the macrostructural level is largely intact. Despite significant improvements in spoken production after C-VIC training, especially at the single sentence level, the subjects' spoken discourse remains severely impaired by their failures at the microlinguistic level. These results point to the limits of currently available approaches to the remediation of aphasia and suggest avenues for future research.
|
73
|
Abstract
The measurement of parameters from the output acoustic pressure waveform during speech has been a common activity in speech science laboratories for a number of decades. The widespread availability of personal computers with more than adequate processing capability means that speech analysis is now accessible to many more users. Indeed, some highly comprehensive speech analysis software packages are available for PCs as freeware. However, the results gained from speech analysis are not always a function only of the speech input itself: the chosen measurement technique can introduce some often surprising pitfalls. This paper explores commonly applied speech analysis techniques and focuses particularly on some potential pitfalls and their consequences.
|
74
|
Abstract
In this contribution, a method is presented for the measurement of vocal tract resonances. The technique uses non-invasive acoustic excitation of the vocal tract and fast, robust detection. The method is an alternative to linear predictive coding (LPC) analysis for patients with voice and speech disorders. Sweep signals are emitted and recorded simultaneously from the small end of a tube placed in front of the mouth opening. The use of a pressure sensor and a velocity sensor provides a direct measurement of the vocal tract impedance at the mouth (VTMI). For selected sustained German vowels, and some consonants, a comparison of results from LPC analysis and VTMI measurements is given. The results indicate good agreement in the frequency range from 500 to 5,000 Hz. The feasibility of the VTMI method for diagnostic and therapeutic applications is the subject of current research.
|
75
|
Pasanisi E, Bacciu A, Vincenti V, Guida M, Berghenti MT, Barbot A, Panu F, Bacciu S. Comparison of speech perception benefits with SPEAK and ACE coding strategies in pediatric Nucleus CI24M cochlear implant recipients. Int J Pediatr Otorhinolaryngol 2002; 64:159-63. [PMID: 12049828 DOI: 10.1016/s0165-5876(02)00075-7] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.7] [Indexed: 11/26/2022]
Abstract
Nine congenitally deaf children who received a Nucleus CI24M cochlear implant and who were fitted with the SPrint speech processor participated in this study. All subjects were initially programmed with the SPEAK coding strategy and then converted to the ACE strategy. Speech perception was evaluated before and after conversion to the new coding strategy using word and Common Phrase speech recognition tests in both the presence and absence of noise. In quiet conditions, the mean percent correct scores for words were 68.8% with SPEAK and 91% with ACE; for phrases the percentage was 66.6% with SPEAK and 85.5% with ACE. In the presence of noise (at +10 dB signal-to-noise ratio), the mean percent correct scores for words were 43.3% with SPEAK compared to 84.4% with ACE; for phrases the percentage was 41.1% with SPEAK and 82.2% with ACE. Statistical analysis revealed significant improvement in open-set speech recognition with ACE compared to SPEAK. Preliminary data suggest that converting children from SPEAK to the ACE strategy improves their performance. Subjects showed significant improvements for open-set word and sentence recognition in quiet as well as in noise when ACE was used in comparison with SPEAK. The greatest improvements were obtained when tests were presented in the presence of noise.
|