1
Ekström AG, Edlund J. Sketches of chimpanzee (Pan troglodytes) hoo's: vowels by any other name? Primates 2024; 65:81-88. [PMID: 38110671 PMCID: PMC10884057 DOI: 10.1007/s10329-023-01107-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/20/2023]
Abstract
In human speech, the close back rounded vowel /u/ (the vowel in "boot") is articulated with the tongue arched toward the dorsal boundary of the hard palate, with the pharyngeal cavity open. Acoustic and perceptual properties of chimpanzee (Pan troglodytes) hoo's are similar to those of the human vowel /u/. However, the vocal tract morphology of chimpanzees likely limits their phonetic capabilities, so that it is unlikely, or even impossible, that their articulation is comparable to that of a human. To determine how qualities of the vowel /u/ may be achieved given the chimpanzee vocal tract, we calculated transfer functions of the vocal tract area for tube models of vocal tract configurations in which vocal tract length, length and area of a laryngeal air sac simulacrum, length of lip protrusion, and area of lip opening were systematically varied. The method described is principally acoustic; we make no claim as to the actual shape of the chimpanzee vocal tract during call production. Nonetheless, we demonstrate that it may be possible to achieve the acoustic and perceptual qualities of back vowels without a reconfigured human vocal tract. The results, while tentative, suggest that the production of hoo's by chimpanzees, while achieving comparable vowel-like qualities to the human /u/, may involve articulatory gestures that are beyond the range of the human articulators. The purpose of this study was to (1) stimulate further simulation research on great ape articulation, and (2) show that apparently vowel-like phenomena in nature are not necessarily indicative of evolutionary continuity per se.
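As a rough illustration of the tube-model reasoning in this abstract, the sketch below computes the resonances of a single uniform tube closed at one end and open at the other (the standard quarter-wavelength approximation). This is not the authors' model, which also varies an air sac simulacrum and the lip opening area; the function name, the speed-of-sound value, and the tube lengths are assumptions chosen for illustration only.

```python
def tube_resonances(length_cm, n_formants=3, speed_of_sound_cm_s=35000.0):
    """Resonances (Hz) of a uniform tube closed at one end and open at the other.

    Quarter-wavelength approximation: F_k = (2k - 1) * c / (4 * L).
    """
    return [
        (2 * k - 1) * speed_of_sound_cm_s / (4.0 * length_cm)
        for k in range(1, n_formants + 1)
    ]


# Lengthening the tube (for example by lip protrusion) lowers every resonance.
# The lengths are illustrative values in cm, not measurements of any species.
for length in (15.0, 17.5, 20.0):
    print(length, [round(f) for f in tube_resonances(length)])
```

A model with several tube sections of different areas would need a full transfer-function calculation; the single-tube case only shows the direction of the length effect.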
Affiliation(s)
- Axel G Ekström
- Division of Speech, Music and Hearing, KTH Royal Institute of Technology, Stockholm, Sweden.
- Jens Edlund
- Division of Speech, Music and Hearing, KTH Royal Institute of Technology, Stockholm, Sweden.
2
Abstract
The tongue is one of the organs most central to human speech. Here, the evolution and species-unique properties of the human tongue are traced, with reference to the apparent articulatory behavior of extant non-human great apes and to fossil findings from early hominids, from the point of view of articulatory phonetics, the science of human speech production. Increased lingual flexibility provided the possibility of mapping articulatory targets, possibly via exaptation of manual-gestural mapping capacities evident in extant great apes. The emergence of the human-specific tongue, its properties, and its morphology were crucial to the evolution of human articulate speech.
3
Strömbergsson S, Götze J, Edlund J, Nilsson Björkenstam K. Simulating Speech Error Patterns Across Languages and Different Datasets. Lang Speech 2022; 65:105-142. [PMID: 33637011 PMCID: PMC8886306 DOI: 10.1177/0023830920987268] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
Children's speech acquisition is influenced by universal and language-specific forces. Some speech error patterns (or phonological processes) in children's speech are observed in many languages, but the same error pattern may have different effects in different languages. We aimed to explore phonological effects of the same speech error patterns across different languages, target audiences and discourse modes, using a novel method for large-scale corpus investigation. As an additional aim, we investigated the face validity of five different phonological effect measures by relating them to subjective ratings of assumed effects on intelligibility, as provided by practicing speech-language pathologists. Six frequently attested speech error patterns were simulated in authentic corpus data: backing, fronting, stopping, /r/-weakening, cluster reduction and weak syllable deletion, with each simulation resulting in a "misarticulated" version of the original corpus. Phonological effects were quantified using five separate metrics of phonological complexity and distance from expected target forms. Using Swedish child-speech data as a reference, phonological effects were compared between this reference and a) child speech in Norwegian and English, and b) data representing different modes of discourse (spoken/written) and target audiences (adults/children) in Swedish. Of the speech error patterns, backing (the one atypical pattern of those included) was found to cause the most detrimental effects, across languages as well as across modes and speaker ages. However, none of the measures reflects intuitive rankings as provided by clinicians regarding effects on intelligibility, thus corroborating earlier reports that phonological competence is not translatable into levels of intelligibility.
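The idea of simulating an error pattern on a transcribed corpus can be sketched as a simple phoneme substitution. The mappings below follow the usual clinical definitions of fronting, backing and stopping, but the specific phoneme pairs, the function names, and the "proportion changed" measure are assumptions for illustration; they are not the five metrics or the implementation used in the study.

```python
# Hypothetical substitution tables for three of the six error patterns.
FRONTING = {"k": "t", "g": "d", "ŋ": "n"}            # velars become alveolars
BACKING = {"t": "k", "d": "g", "n": "ŋ"}             # alveolars become velars
STOPPING = {"f": "p", "v": "b", "s": "t", "z": "d"}  # fricatives become stops


def apply_pattern(phonemes, mapping):
    """Return a 'misarticulated' copy of a phoneme sequence."""
    return [mapping.get(p, p) for p in phonemes]


def proportion_changed(target, produced):
    """Share of phonemes that differ from the target (a crude distance)."""
    changed = sum(t != p for t, p in zip(target, produced))
    return changed / len(target)


word = ["k", "a", "t"]  # illustrative transcription of "cat"
for name, mapping in (("fronting", FRONTING), ("backing", BACKING), ("stopping", STOPPING)):
    produced = apply_pattern(word, mapping)
    print(name, produced, proportion_changed(word, produced))
```

Patterns that delete segments, such as cluster reduction and weak syllable deletion, would need an alignment-based distance rather than this position-by-position comparison.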
Affiliation(s)
- Sofia Strömbergsson
- Sofia Strömbergsson, SLP, Division of Speech and Language Pathology, CLINTEC, Karolinska Institutet, F67, Karolinska University Hospital, Huddinge, Stockholm, SE-141 86, Sweden.
- Jana Götze
- Department of Clinical Science, Intervention and Technology (CLINTEC), Karolinska Institutet (KI), Sweden
- Jens Edlund
- Department of Speech, Music and Hearing, KTH Royal Institute of Technology, Sweden
4
Strömbergsson S, Holm K, Edlund J, Lagerberg T, McAllister A. Audience Response System-Based Evaluation of Intelligibility of Children's Connected Speech - Validity, Reliability and Listener Differences. J Commun Disord 2020; 87:106037. [PMID: 32846287 DOI: 10.1016/j.jcomdis.2020.106037] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/18/2019] [Revised: 07/18/2020] [Accepted: 07/23/2020] [Indexed: 06/11/2023]
Abstract
PURPOSE: We assessed audience response systems (ARS)-based evaluation of intelligibility, with a view to finding a valid and reliable intelligibility measure that is accessible to non-trained participants. In addition, we investigated potential listener differences between pediatric speech and language pathologists (SLPs) and untrained adults. METHOD: Sixteen one-minute samples of connected speech were compiled, collected from 14 children with a speech sound disorder (SSD) and two children with typical speech. Sixteen SLPs and 13 untrained adults participated in a series of ARS listening sessions, where they were fitted with headphones and hand controls, and instructed to click a button whenever they did not understand the child speaking. Listeners' button clicks were registered and, for each speech sample, totaled into an (un)intelligibility index. The proportion of syllables perceived correctly, based on orthographic listener transcripts, was used as a reference score of intelligibility. RESULTS: The ARS-based intelligibility scores correlated strongly with the intelligibility reference score. Reliability was high across listener groups and weaker for single listeners. No significant difference was found between the evaluations of SLPs and untrained adults. CONCLUSIONS: ARS-based evaluation offers a valid and reliable measure of intelligibility, of particular value in research as a practical tool for collecting input from listeners without experience or knowledge of SSDs. We stress that the ARS design presupposes a listener panel, and that evaluations obtained from individual listeners are predictably inadequate in terms of reliability.
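The scoring described in this abstract (clicks totaled per sample, then compared with a transcription-based reference score) can be sketched as follows. All data, names, and the choice of a plain Pearson correlation are assumptions for illustration; the abstract does not specify how the index was normalized or which correlation statistic was used.

```python
from math import sqrt


def unintelligibility_index(clicks_by_listener):
    """Total the button clicks of all listeners for one speech sample."""
    return sum(clicks_by_listener)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: clicks per listener for three samples, and the proportion
# of syllables transcribed correctly for the same samples.
clicks = {"sample_a": [0, 1, 0], "sample_b": [3, 4, 2], "sample_c": [7, 9, 8]}
reference = {"sample_a": 0.95, "sample_b": 0.70, "sample_c": 0.40}

names = sorted(clicks)
index = [unintelligibility_index(clicks[name]) for name in names]
r = pearson(index, [reference[name] for name in names])
print(index, round(r, 3))
```

Because the index counts moments of not understanding, a valid measure should correlate negatively with the proportion of syllables perceived correctly.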
Affiliation(s)
- Sofia Strömbergsson
- Division of Speech and Language Pathology, Department of Clinical Science, Intervention and Technology (CLINTEC), Karolinska Institutet (KI), SE-141 86 Stockholm, Sweden.
- Katarina Holm
- Division of Speech and Language Pathology, Department of Clinical Science, Intervention and Technology (CLINTEC), Karolinska Institutet (KI), SE-141 86 Stockholm, Sweden
- Jens Edlund
- Speech Music & Hearing/Språkbanken Tal, KTH Royal Institute of Technology, SE-100 44 Stockholm, Sweden
- Tove Lagerberg
- Institute of Neuroscience and Physiology, Division of Speech and Language Pathology, University of Gothenburg, The Sahlgrenska Academy, Box 452, SE-405 30 Gothenburg, Sweden
- Anita McAllister
- Division of Speech and Language Pathology, Department of Clinical Science, Intervention and Technology (CLINTEC), Karolinska Institutet (KI), SE-141 86 Stockholm, Sweden; Functional Area Speech and Language Pathology, Karolinska University Hospital, SE-141 86 Stockholm, Sweden
5
Abstract
The perception of gaze plays a crucial role in human-human interaction. Gaze has been shown to matter for a number of aspects of communication and dialogue, especially for managing the flow of the dialogue and participant attention, for deictic referencing, and for the communication of attitude. When developing embodied conversational agents (ECAs) and talking heads, modeling and delivering accurate gaze targets is crucial. Traditionally, systems communicating through talking heads have been displayed to the human conversant using 2D displays, such as flat monitors. This approach introduces severe limitations for accurate communication of gaze, since 2D displays are associated with several powerful effects and illusions, most importantly the Mona Lisa gaze effect, where the gaze of the projected head appears to follow the observer regardless of viewing angle. We describe the Mona Lisa gaze effect and its consequences in the interaction loop, and propose a new approach for displaying talking heads using a 3D projection surface (a physical model of a human head) as an alternative to the traditional flat surface projection. We investigate and compare the accuracy of the perception of gaze direction and the Mona Lisa gaze effect in 2D and 3D projection surfaces in a five-subject gaze perception experiment. The experiment confirms that a 3D projection surface completely eliminates the Mona Lisa gaze effect and delivers very accurate gaze direction that is independent of the observer's viewing angle. Based on the data collected in this experiment, we rephrase the formulation of the Mona Lisa gaze effect. The data, when reinterpreted, confirm the predictions of the new model for both 2D and 3D projection surfaces. Finally, we discuss the requirements on different spatially interactive systems in terms of gaze direction, and propose new applications and experiments for interaction in human-ECA and human-robot settings made possible by this technology.
Affiliation(s)
- Jens Edlund
- KTH Royal Institute of Technology, Stockholm, Sweden
- Jonas Beskow
- KTH Royal Institute of Technology, Stockholm, Sweden
6
Abstract
Evaluation of methods and techniques for conversational and multimodal spoken dialogue systems is complex, as is gathering data for the modeling and tuning of such techniques. This article describes MushyPeek, an experiment framework that allows us to manipulate the audiovisual behavior of interlocutors in a setting similar to face-to-face human-human dialogue. The setup connects two subjects to each other over a Voice over Internet Protocol (VoIP) telephone connection and simultaneously provides each of them with an avatar representing the other. We present a first experiment which inaugurates, exemplifies, and validates the framework. The experiment corroborates earlier findings on the use of gaze and head pose gestures in turn-taking.
Affiliation(s)
- Jens Edlund
- KTH Speech, Music & Hearing, Stockholm, Sweden.
7
Abstract
This paper investigates prosodic aspects of turn-taking in conversation with a view to improving the efficiency of identifying relevant places at which a machine can legitimately begin to talk to a human interlocutor. It examines the relationship between interaction control, the communicative function of which is to regulate the flow of information between interlocutors, and its phonetic manifestation. Specifically, the listener's perception of such interaction control phenomena is modelled. Algorithms for automatic online extraction of prosodic phenomena liable to be relevant for interaction control, such as silent pauses and intonation patterns, are presented and evaluated in experiments using Swedish map task data. We show that the automatically extracted prosodic features can be used to avoid many of the places where current dialogue systems run the risk of interrupting their users, as well as to identify suitable places to take the turn.
Affiliation(s)
- Jens Edlund
- KTH Speech, Music and Hearing, Stockholm, Sweden.
8
9
Abstract
A method for measuring masticatory efficiency is described as well as a mathematical formula expressing the masticatory efficiency as an index. A silicone compound was used as test material. The masticatory efficiency index was calculated as the mean of the best four out of five consecutive measurements of masticatory efficiency calculated from a specially designed formula.
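The averaging step described here (the mean of the best four out of five consecutive measurements) can be written out as a short sketch. The abstract does not give the underlying efficiency formula, so the input values are placeholders, and treating "best" as "highest" is an assumption.

```python
def masticatory_efficiency_index(measurements):
    """Mean of the best four out of five consecutive efficiency measurements.

    Assumes a higher value means better efficiency, so the single lowest
    measurement is discarded before averaging.
    """
    if len(measurements) != 5:
        raise ValueError("expected exactly five measurements")
    best_four = sorted(measurements, reverse=True)[:4]
    return sum(best_four) / 4


# Placeholder values, not data from the study.
print(masticatory_efficiency_index([10, 20, 30, 40, 50]))
```

Dropping the weakest trial makes the index less sensitive to a single poor chewing sequence than a plain mean of all five.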
10
Edlund J, Hansson T, Petersson A, Willmar K. Sagittal splitting of the mandibular ramus. Electromyography and radiologic follow-up study of temporomandibular joint function in 44 patients. Scand J Plast Reconstr Surg 1979; 13:437-43. [PMID: 542814 DOI: 10.3109/02844317909013094] [Citation(s) in RCA: 51] [Impact Index Per Article: 1.1] [Indexed: 12/23/2022]
Abstract
A follow-up study was performed on 44 patients who underwent sagittal splitting of the mandibular ramus for correction of a mandibular protrusion. The study included clinical examination, electromyography and a masticatory efficiency test, as well as radiography of the temporomandibular joint. The maximum opening capacity and protrusion of the mandible decreased one to three years after the operation. The activity of the temporal muscle decreased in rest position after the operation. Masticatory efficiency was unchanged. The position of the condyle in the fossa was unchanged postoperatively, while a posterior and superior condylar movement occurred during the fixation period. Normalization of the condylar position tended to occur one year after the operation. In 37 of 86 condyles, a double contour was seen on the posterosuperior margin of the condyle one year after the operation. A possible mechanism behind the development of the new condylar bone layer is discussed.
11
Badersten A, Attström R, Edlund J, Jönsson G, Kroneng M. [Dental hygienist education in Malmö. Patients' view on treatment by dental hygienists]. Tandlakartidningen 1974; 66:1306-8. [PMID: 4533837] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/11/2023]
12
Granath LE, Bladh E, Edlund J. Amalgam specimen technic. I. Studies on comparable specimens controlled by complexometric titration of mercury content. J Dent Res 1967; 46:417-23. [PMID: 4960428 DOI: 10.1177/00220345670460021801] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.0] [Indexed: 01/13/2023]