1. Wu EQ, Tang Z, Yao Y, Qiu XY, Deng PY, Xiong P, Song A, Zhu LM, Zhou M. Scalable Gamma-Driven Multilayer Network for Brain Workload Detection Through Functional Near-Infrared Spectroscopy. IEEE Transactions on Cybernetics 2022; 52:12464-12478. PMID: 34705661. DOI: 10.1109/tcyb.2021.3116964.
Abstract
This work proposes a scalable gamma non-negative matrix network (SGNMN), which uses Poisson randomized Gamma factor analysis to obtain the neurons of the first layer of the network. These neurons follow a Gamma distribution whose shape parameter infers the neurons of the next layer and their related weights. Upsampled connection weights follow a Dirichlet distribution, while downsampled hidden units follow a Gamma distribution. The parameters of SGNMN are learned by performing up-down sampling on each layer. Experimental results indicate that the width and depth of SGNMN are closely related, and that a reasonable network structure for accurately detecting brain fatigue through functional near-infrared spectroscopy can be obtained by jointly considering network width, depth, and parameters.
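As a concrete reference point, the sketch below implements the plain Poisson-likelihood NMF that the first SGNMN layer generalizes; it is a simplification, not the authors' Gamma-prior up-down sampler, and the fNIRS-like input dimensions are invented for illustration.

```python
import numpy as np

def poisson_nmf(X, k, n_iter=200, eps=1e-9):
    """Multiplicative updates for KL (Poisson-likelihood) NMF: X ~ Poisson(W @ H).
    SGNMN instead places Gamma priors on the factors and infers them by
    up-down sampling across layers; this is only the factor-analysis core."""
    rng = np.random.default_rng(0)
    W = rng.gamma(1.0, 1.0, size=(X.shape[0], k))   # basis (channels x factors)
    H = rng.gamma(1.0, 1.0, size=(k, X.shape[1]))   # activations (factors x time)
    for _ in range(n_iter):
        W *= ((X / (W @ H + eps)) @ H.T) / (H.sum(axis=1) + eps)
        H *= (W.T @ (X / (W @ H + eps))) / (W.sum(axis=0)[:, None] + eps)
    return W, H

# Hypothetical input: 32 fNIRS channels x 600 time samples, nonnegative.
X = np.abs(np.random.default_rng(1).normal(size=(32, 600)))
W, H = poisson_nmf(X, k=8)
```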
2. Kröger BJ. Computer-Implemented Articulatory Models for Speech Production: A Review. Frontiers in Robotics and AI 2022; 9:796739. PMID: 35494539. PMCID: PMC9040071. DOI: 10.3389/frobt.2022.796739.
Abstract
Modeling speech production and speech articulation is still an evolving research topic. Some current core questions are: What is the underlying (neural) organization for controlling speech articulation? How can speech articulators like the lips and tongue, and their movements, be modeled in an efficient but also biologically realistic way? How can high-quality articulatory-acoustic models be developed to enable high-quality articulatory speech synthesis? Computer modeling thus helps, on the one hand, to uncover the underlying biological as well as acoustic-articulatory concepts of speech production, while further modeling efforts, on the other hand, bring us closer to the goal of high-quality articulatory-acoustic speech synthesis grounded in more detailed knowledge of vocal tract acoustics and speech articulation. Currently, articulatory models cannot reach the quality level of corpus-based speech synthesis. Moreover, biomechanical and neuromuscular approaches are complex and still not usable for sentence-level speech synthesis. This paper lists many computer-implemented articulatory models and provides criteria for dividing articulatory models into different categories. A major recent research question, namely how to control articulatory models in a neurobiologically adequate manner, is discussed in detail. It is concluded that there is a strong need to further develop articulatory-acoustic models, both to test quantitative neurobiologically based control concepts for speech articulation and to uncover the remaining details of human articulatory and acoustic signal generation. Furthermore, these efforts may help approach the goal of establishing high-quality articulatory-acoustic as well as neurobiologically grounded speech synthesis.
3. Woo J, Xing F, Prince JL, Stone M, Gomez AD, Reese TG, Wedeen VJ, El Fakhri G. A deep joint sparse non-negative matrix factorization framework for identifying the common and subject-specific functional units of tongue motion during speech. Medical Image Analysis 2021; 72:102131. PMID: 34174748. PMCID: PMC8316408. DOI: 10.1016/j.media.2021.102131.
Abstract
Intelligible speech is produced by creating varying internal local muscle groupings, i.e., functional units, that are generated in a systematic and coordinated manner. There are two major challenges in characterizing and analyzing functional units. First, due to the complex and convoluted nature of tongue structure and function, it is of great importance to develop a method that can accurately decode complex muscle coordination patterns during speech. Second, it is challenging to keep identified functional units comparable across subjects due to their substantial variability. In this work, to address these challenges, we develop a new deep learning framework to identify common and subject-specific functional units of tongue motion during speech. Our framework hinges on joint deep graph-regularized sparse non-negative matrix factorization (NMF) using motion quantities derived from displacements measured by tagged magnetic resonance imaging. More specifically, we transform NMF with sparse and graph regularizations into modular architectures akin to deep neural networks by unfolding the Iterative Shrinkage-Thresholding Algorithm (ISTA), learning interpretable building blocks and an associated weighting map. We then apply spectral clustering to the common and subject-specific weighting maps, from which we jointly determine the common and subject-specific functional units. Experiments with simulated datasets show that the proposed method achieved clustering performance on par with or better than the comparison methods. Experiments with in vivo tongue motion data show that the proposed method can determine the common and subject-specific functional units with increased interpretability and decreased size variability.
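The unfolding step can be made concrete with a short sketch: each "layer" below is one ISTA iteration for the non-negative sparse coding subproblem, computing the weighting map H for fixed building blocks W. This is a minimal numpy illustration under assumed inputs; in the paper's learned version the dictionary and thresholds become trainable network parameters.

```python
import numpy as np

def unfolded_ista_nmf(X, W, n_layers=10, lam=0.1):
    """K unfolded ISTA iterations for min_{H>=0} ||X - W H||^2 + lam * ||H||_1.
    Each loop iteration corresponds to one layer of the unrolled network."""
    L = np.linalg.norm(W, 2) ** 2          # Lipschitz constant of the gradient
    H = np.zeros((W.shape[1], X.shape[1]))
    for _ in range(n_layers):
        grad = W.T @ (W @ H - X)           # gradient of the data-fit term
        H = H - grad / L                   # gradient step
        H = np.maximum(H - lam / L, 0.0)   # non-negative soft threshold
    return H

rng = np.random.default_rng(0)
X = np.abs(rng.normal(size=(60, 300)))    # hypothetical motion features
W = np.abs(rng.normal(size=(60, 8)))      # fixed building blocks for the sketch
H = unfolded_ista_nmf(X, W)
```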
Affiliation(s)
- Jonghye Woo: Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
- Fangxu Xing: Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
- Jerry L Prince: Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
- Maureen Stone: Department of Neural and Pain Sciences, University of Maryland School of Dentistry, Baltimore, MD 21201, USA
- Arnold D Gomez: Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, MD 21218, USA
- Timothy G Reese: Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02129, USA
- Van J Wedeen: Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02129, USA
- Georges El Fakhri: Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
4. Gick B, Mayer C, Chiu C, Widing E, Roewer-Després F, Fels S, Stavness I. Quantal biomechanical effects in speech postures of the lips. Journal of Neurophysiology 2020; 124:833-843. PMID: 32727259. DOI: 10.1152/jn.00676.2019.
Abstract
The unique biomechanical and functional constraints on human speech make it a promising area for research investigating modular control of movement. The present article illustrates how a modular control approach to speech can provide insights relevant to understanding both motor control and observed variation across languages. We specifically explore the robust typological finding that languages produce different degrees of labial constriction using distinct muscle groupings and concomitantly distinct lip postures. Research has suggested that these lip postures exploit biomechanical regions of nonlinearity between neural activation and movement, also known as quantal regions, to allow movement goals to be realized despite variable activation signals. We present two sets of computer simulations showing that these labial postures can be generated under the assumption of modular control and that the corresponding modules are biomechanically robust: first to variation in the activation levels of participating muscles, and second to interference from surrounding muscles. These results provide support for the hypothesis that biomechanical robustness is an important factor in selecting the muscle groupings used for speech movements and provide insight into the neurological control of speech movements and how biomechanical and functional constraints govern the emergence of speech motor modules. We anticipate that future experimental work guided by biomechanical simulation results will provide new insights into the neural organization of speech movements.

NEW & NOTEWORTHY: This article provides additional evidence that speech motor control is organized in a modular fashion and that biomechanics constrain the kinds of motor modules that may emerge. It also suggests that speech can be a fruitful domain for the study of modularity and that a better understanding of speech motor modules will be useful for speech research. Finally, it suggests that biomechanical modeling can serve as a useful complement to experimental work when studying modularity.
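A toy simulation conveys the quantal idea the paper tests at full biomechanical scale: where the activation-to-posture map plateaus, activation noise barely moves the posture. Everything below (the sigmoid map, the noise level) is an illustrative assumption, not the article's model.

```python
import numpy as np

# Toy quantal map: lip aperture saturates at high muscle activation, so the
# posture is insensitive to activation noise on the plateau.
def aperture(a):
    return 1.0 / (1.0 + np.exp(8.0 * (a - 0.5)))   # sigmoid, arbitrary scale

rng = np.random.default_rng(0)
noise = rng.normal(0.0, 0.05, size=10_000)
for a0, label in [(0.5, "steep (non-quantal)"), (0.9, "plateau (quantal)")]:
    spread = aperture(a0 + noise).std()             # posture variability
    print(f"activation {a0:.1f} [{label}]: posture s.d. = {spread:.4f}")
```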
Affiliation(s)
- Bryan Gick: Department of Linguistics, University of British Columbia, Vancouver, British Columbia, Canada
- Connor Mayer: Department of Linguistics, University of California, Los Angeles, Los Angeles, California
- Chenhao Chiu: Graduate Institute of Linguistics, National Taiwan University, Taipei, Taiwan
- Erik Widing: Department of Computer Science, University of Saskatchewan, Saskatoon, Saskatchewan, Canada
- Sidney Fels: Department of Electrical and Computer Engineering, University of British Columbia, Vancouver, British Columbia, Canada
- Ian Stavness: Department of Computer Science, University of Saskatchewan, Saskatoon, Saskatchewan, Canada
5. Shao Y, Hayward V, Visell Y. Compression of dynamic tactile information in the human hand. Science Advances 2020; 6:eaaz1158. PMID: 32494610. PMCID: PMC7159916. DOI: 10.1126/sciadv.aaz1158.
Abstract
A key problem in the study of the senses is to describe how sense organs extract perceptual information from the physics of the environment. We previously observed that dynamic touch elicits mechanical waves that propagate throughout the hand. Here, we show that these waves produce an efficient encoding of tactile information. The computation of an optimal encoding of thousands of naturally occurring tactile stimuli yielded a compact lexicon of primitive wave patterns that sparsely represented the entire dataset, enabling touch interactions to be classified with an accuracy exceeding 95%. The primitive tactile patterns reflected the interplay of hand anatomy with wave physics. Notably, similar patterns emerged when we applied efficient encoding criteria to spiking data from populations of simulated tactile afferents. This finding suggests that the biomechanics of the hand enables efficient perceptual processing by effecting a preneuronal compression of tactile information.
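In the same spirit as the paper's efficient-encoding analysis, the sketch below learns a sparse dictionary of primitive patterns from vibration-like snippets with scikit-learn; the data here are random placeholders, whereas the study used thousands of recordings of natural touch interactions.

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

# Hypothetical stand-in data: rows are vibration snippets sensed across the
# hand (random noise here, purely to make the pipeline runnable).
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 128))

# Learn a compact lexicon of primitive patterns with a sparsity penalty.
dico = MiniBatchDictionaryLearning(n_components=40, alpha=1.0, random_state=0)
codes = dico.fit_transform(X)            # sparse coefficients per snippet
print("fraction of nonzero coefficients:", (codes != 0).mean())
```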
Affiliation(s)
- Yitian Shao: Department of Electrical and Computer Engineering, Media Arts and Technology Program, Department of Mechanical Engineering, and California NanoSystems Institute, University of California, Santa Barbara, Santa Barbara, CA, USA
- Vincent Hayward: Sorbonne Université, Institut des Systèmes Intelligents et de Robotique, F-75005 Paris, France; Centre for the Study of the Senses, School of Advanced Study, University of London, London, UK; Actronika SAS, Paris, France
- Yon Visell (corresponding author): Department of Electrical and Computer Engineering, Media Arts and Technology Program, Department of Mechanical Engineering, and California NanoSystems Institute, University of California, Santa Barbara, Santa Barbara, CA, USA
6. Woo J, Xing F, Prince JL, Stone M, Reese TG, Wedeen VJ, El Fakhri G. Identifying the Common and Subject-specific Functional Units of Speech Movements via a Joint Sparse Non-negative Matrix Factorization Framework. Proceedings of SPIE 2020; 11313:113131S. PMID: 32454553. PMCID: PMC7243345.
Abstract
The tongue is capable of producing intelligible speech because of the successful orchestration of muscle groupings, i.e., functional units, of its highly complex musculature over time. Because of the different motions that tongues produce, functional units are transitional structures that transform muscle activity into surface tongue geometry, and they vary significantly from one subject to another. In order to compare and contrast the location and size of functional units in the presence of such substantial inter-person variability, it is essential to study both common and subject-specific functional units in a group of people carrying out the same speech task. In this work, a new normalization technique is presented to simultaneously identify the common and subject-specific functional units of the tongue as tracked by tagged magnetic resonance imaging. To achieve our goal, a joint sparse non-negative matrix factorization framework is used, which learns a set of building blocks and subject-specific as well as common weighting matrices from motion quantities extracted from displacements. A spectral clustering technique is then applied to the subject-specific and common weighting matrices to determine the subject-specific functional units for each subject and the common functional units across subjects. Our experimental results using in vivo tongue motion data show that our approach is able to identify the common and subject-specific functional units with reduced size variability of tongue motion during speech.
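The final clustering step can be sketched directly: voxels are grouped by the similarity of their NMF weighting profiles. Below is a minimal scikit-learn version, assuming a hypothetical (components x voxels) weighting matrix; the paper applies the same idea to both the common and each subject-specific map.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

def functional_units(H, n_units=4):
    """Cluster voxels by the similarity of their NMF weighting profiles."""
    V = (H / (np.linalg.norm(H, axis=0, keepdims=True) + 1e-9)).T
    affinity = V @ V.T                      # cosine similarity, in [0, 1]
    labels = SpectralClustering(
        n_clusters=n_units, affinity="precomputed", random_state=0
    ).fit_predict(np.clip(affinity, 0.0, 1.0))
    return labels                           # one functional-unit label per voxel

# Hypothetical common weighting map: 8 components over 500 voxels.
H_common = np.abs(np.random.default_rng(0).normal(size=(8, 500)))
common_units = functional_units(H_common)
```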
Affiliation(s)
- Jonghye Woo: Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
- Fangxu Xing: Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
- Jerry L. Prince: Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
- Maureen Stone: Department of Neural and Pain Sciences, University of Maryland School of Dentistry, Baltimore, MD 21201, USA
- Timothy G. Reese: Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02129, USA
- Van J. Wedeen: Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02129, USA
- Georges El Fakhri: Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
7. Parrell B, Ramanarayanan V, Nagarajan S, Houde J. The FACTS model of speech motor control: Fusing state estimation and task-based control. PLoS Computational Biology 2019; 15:e1007321. PMID: 31479444. PMCID: PMC6743785. DOI: 10.1371/journal.pcbi.1007321.
Abstract
We present a new computational model of speech motor control: the Feedback-Aware Control of Tasks in Speech, or FACTS, model. FACTS employs a hierarchical state feedback control architecture to control a simulated vocal tract and produce intelligible speech. The model includes higher-level control of speech tasks and lower-level control of speech articulators. The task controller is modeled as a dynamical system governing the creation of desired constrictions in the vocal tract, after Task Dynamics. Both the task and articulatory controllers rely on an internal estimate of the current state of the vocal tract to generate motor commands. This estimate is derived, based on an efference copy of the applied controls, from a forward model that predicts both the next vocal tract state and the expected auditory and somatosensory feedback. A comparison between predicted feedback and actual feedback is then used to update the internal state estimate. FACTS is able to qualitatively replicate many characteristics of the human speech system: the model is robust to noise in both the sensory and motor pathways, is relatively unaffected by a loss of auditory feedback but is more significantly impacted by the loss of somatosensory feedback, and responds appropriately to externally imposed alterations of auditory and somatosensory feedback. The model also replicates previously hypothesized trade-offs between reliance on auditory and somatosensory feedback, and shows for the first time how this relationship may be mediated by acuity in each sensory domain. These results have important implications for our understanding of the speech motor control system in humans.
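A scalar toy version of this estimation loop is sketched below: a forward model predicts the next state from an efference copy, auditory and somatosensory predictions are compared with noisy feedback, and fixed gains apply the correction. The plant, observation maps, and gains are all invented for illustration; the published model is hierarchical and nonlinear.

```python
import numpy as np

A, B = 0.95, 0.10          # toy linear plant: x' = A x + B u
C_aud, C_som = 1.0, 1.0    # toy auditory / somatosensory observation maps
K_aud, K_som = 0.3, 0.5    # feedback gains (acuity-dependent in the paper)

x_true, x_hat, target = 0.0, 0.0, 1.0
rng = np.random.default_rng(0)
for t in range(50):
    u = 2.0 * (target - x_hat)                  # task-level control on the estimate
    x_true = A * x_true + B * u + rng.normal(0, 0.01)    # plant with motor noise
    x_pred = A * x_hat + B * u                  # forward model (efference copy)
    y_aud = C_aud * x_true + rng.normal(0, 0.05)         # noisy auditory feedback
    y_som = C_som * x_true + rng.normal(0, 0.02)         # noisy somatosensory feedback
    x_hat = x_pred + K_aud * (y_aud - C_aud * x_pred) \
                   + K_som * (y_som - C_som * x_pred)    # correction step
print(f"final state {x_true:.3f}, estimate {x_hat:.3f}, target {target}")
```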
Affiliation(s)
- Benjamin Parrell: Department of Communication Sciences and Disorders, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
- Vikram Ramanarayanan: Department of Otolaryngology - Head and Neck Surgery, University of California, San Francisco, San Francisco, California, United States of America; Educational Testing Service R&D, San Francisco, California, United States of America
- Srikantan Nagarajan: Department of Otolaryngology - Head and Neck Surgery, University of California, San Francisco, San Francisco, California, United States of America; Department of Radiology and Biomedical Imaging, University of California, San Francisco, San Francisco, California, United States of America
- John Houde: Department of Otolaryngology - Head and Neck Surgery, University of California, San Francisco, San Francisco, California, United States of America
8. Woo J, Xing F, Prince JL, Stone M, Green JR, Goldsmith T, Reese TG, Wedeen VJ, El Fakhri G. Differentiating post-cancer from healthy tongue muscle coordination patterns during speech using deep learning. The Journal of the Acoustical Society of America 2019; 145:EL423. PMID: 31153323. PMCID: PMC6530633. DOI: 10.1121/1.5103191.
Abstract
The ability to differentiate post-cancer from healthy tongue muscle coordination patterns is necessary for the advancement of speech motor control theories and for the development of therapeutic and rehabilitative strategies. A deep learning approach is presented to classify the two groups using muscle coordination patterns from magnetic resonance imaging (MRI). The proposed method uses tagged-MRI to track the tongue's internal tissue points and atlas-driven non-negative matrix factorization to reduce the dimensionality of the deformation fields. A convolutional neural network applied to the classification task yields an accuracy of 96.90%, offering potential for the development of therapeutic and rehabilitative strategies for speech-related disorders.
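For concreteness, a small PyTorch classifier in the spirit of the final stage is sketched below; the input size (a 32 x 32 single-channel grid of NMF-reduced features) and the architecture are assumptions, not the paper's reported network.

```python
import torch
import torch.nn as nn

# Two-class (post-cancer vs healthy) CNN over dimensionality-reduced features.
model = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 8 * 8, 2),                # logits for the two groups
)
x = torch.randn(4, 1, 32, 32)                # a dummy mini-batch
logits = model(x)
loss = nn.CrossEntropyLoss()(logits, torch.tensor([0, 1, 0, 1]))
loss.backward()                              # one optimizer step would follow
```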
Affiliation(s)
- Jonghye Woo: Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts 02114, USA
- Fangxu Xing: Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts 02114, USA
- Jerry L Prince: Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, Maryland 21218, USA
- Maureen Stone: Department of Pain and Neural Sciences, University of Maryland Dental School, Baltimore, Maryland 21201, USA
- Jordan R Green: Department of Communication Sciences and Disorders, MGH Institute of Health Professions, Boston, Massachusetts 02129, USA
- Tessa Goldsmith: Department of Speech, Language and Swallowing Disorders, Massachusetts General Hospital, Boston, Massachusetts 02114, USA
- Timothy G Reese: Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts 02114, USA
- Van J Wedeen: Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts 02114, USA
- Georges El Fakhri: Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts 02114, USA
9. Woo J, Prince JL, Stone M, Xing F, Gomez AD, Green JR, Hartnick CJ, Brady TJ, Reese TG, Wedeen VJ, El Fakhri G. A Sparse Non-Negative Matrix Factorization Framework for Identifying Functional Units of Tongue Behavior From MRI. IEEE Transactions on Medical Imaging 2019; 38:730-740. PMID: 30235120. PMCID: PMC6422735. DOI: 10.1109/tmi.2018.2870939.
Abstract
Muscle coordination patterns of lingual behaviors are synergies generated by deforming local muscle groups in a variety of ways. Functional units are functional muscle groups of local structural elements within the tongue that compress, expand, and move in a cohesive and consistent manner. Identifying the functional units using tagged magnetic resonance imaging (MRI) sheds light on the mechanisms of normal and pathological muscle coordination patterns, yielding improvements in surgical planning, treatment, and rehabilitation procedures. In this paper, to mine this information, we propose a matrix factorization and probabilistic graphical model framework to produce building blocks and their associated weighting map using motion quantities extracted from tagged-MRI. Our tagged-MRI acquisition and accurate voxel-level tracking provide previously unavailable internal tongue motion patterns, thus revealing the inner workings of the tongue during speech and other lingual behaviors. We then employ spectral clustering on the weighting map to identify the cohesive regions defined by the tongue motion, which may involve multiple or undocumented regions. To evaluate our method, we perform a series of experiments. We first use two-dimensional images and synthetic data to demonstrate the accuracy of our method. We then use three-dimensional synthetic and in vivo tongue motion data, using protrusion and simple speech tasks, to identify subject-specific and data-driven functional units of the tongue in localized regions.
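The factorization step can be summarized in a few lines: Euclidean NMF with an L1 penalty on the weighting map, solved by multiplicative updates. This is a minimal sketch of the generic sparse-NMF core, assuming a nonnegative motion-feature matrix; the paper embeds it in a probabilistic graphical model before the spectral-clustering stage.

```python
import numpy as np

def sparse_nmf(X, k, lam=0.1, n_iter=300, eps=1e-9):
    """Multiplicative updates for min_{W,H >= 0} ||X - W H||_F^2 + lam * sum(H).
    The L1 term on H encourages each voxel to load on few building blocks."""
    rng = np.random.default_rng(0)
    W = rng.random((X.shape[0], k))
    H = rng.random((k, X.shape[1]))
    for _ in range(n_iter):
        W *= (X @ H.T) / (W @ H @ H.T + eps)
        H *= (W.T @ X) / (W.T @ W @ H + lam + eps)
    return W, H

# Hypothetical motion features: 60 feature dimensions over 400 voxels.
X = np.abs(np.random.default_rng(1).normal(size=(60, 400)))
W, H = sparse_nmf(X, k=6)
```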
Affiliation(s)
- Jonghye Woo: Department of Radiology, Massachusetts General Hospital and Harvard Medical School
- Jerry L. Prince: Department of Electrical and Computer Engineering, Johns Hopkins University
- Fangxu Xing: Department of Radiology, Massachusetts General Hospital and Harvard Medical School
- Arnold D. Gomez: Department of Electrical and Computer Engineering, Johns Hopkins University
- Thomas J. Brady: Department of Radiology, Massachusetts General Hospital and Harvard Medical School
- Timothy G. Reese: Department of Radiology, Massachusetts General Hospital and Harvard Medical School
- Van J. Wedeen: Department of Radiology, Massachusetts General Hospital and Harvard Medical School
- Georges El Fakhri: Department of Radiology, Massachusetts General Hospital and Harvard Medical School
10. Mackevicius EL, Bahle AH, Williams AH, Gu S, Denisenko NI, Goldman MS, Fee MS. Unsupervised discovery of temporal sequences in high-dimensional datasets, with applications to neuroscience. eLife 2019; 8:e38471. PMID: 30719973. PMCID: PMC6363393. DOI: 10.7554/elife.38471.
Abstract
Identifying low-dimensional features that describe large-scale neural recordings is a major challenge in neuroscience. Repeated temporal patterns (sequences) are thought to be a salient feature of neural dynamics, but are not succinctly captured by traditional dimensionality reduction techniques. Here, we describe a software toolbox—called seqNMF—with new methods for extracting informative, non-redundant, sequences from high-dimensional neural data, testing the significance of these extracted patterns, and assessing the prevalence of sequential structure in data. We test these methods on simulated data under multiple noise conditions, and on several real neural and behavioral data sets. In hippocampal data, seqNMF identifies neural sequences that match those calculated manually by reference to behavioral events. In songbird data, seqNMF discovers neural sequences in untutored birds that lack stereotyped songs. Thus, by identifying temporal structure directly from neural data, seqNMF enables dissection of complex neural circuits without relying on temporal references from stimuli or behavioral outputs.
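The convolutive factorization behind seqNMF is compact enough to write out: each factor is a (neurons x lags) template whose activations are convolved in time. The sketch below implements only the reconstruction, with made-up sizes; seqNMF adds multiplicative updates plus a cross-orthogonality penalty that removes redundant sequences.

```python
import numpy as np

def conv_nmf_reconstruct(W, H):
    """Convolutive NMF reconstruction: X_hat[:, t] = sum_l W[:, :, l] @ H[:, t - l].
    W: (n_neurons, n_factors, n_lags); H: (n_factors, n_time)."""
    n_neurons, n_factors, n_lags = W.shape
    n_time = H.shape[1]
    X_hat = np.zeros((n_neurons, n_time))
    for l in range(n_lags):
        X_hat[:, l:] += W[:, :, l] @ H[:, : n_time - l]  # shift H right by l bins
    return X_hat

# Hypothetical sizes: 50 neurons, 3 sequence factors, 20-bin templates.
rng = np.random.default_rng(0)
W = rng.random((50, 3, 20))
H = rng.random((3, 1000))
X_hat = conv_nmf_reconstruct(W, H)
```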
Affiliation(s)
- Emily L Mackevicius: McGovern Institute for Brain Research, Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, United States
- Andrew H Bahle: McGovern Institute for Brain Research, Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, United States
- Alex H Williams: Neurosciences Program, Stanford University, Stanford, United States
- Shijie Gu: McGovern Institute for Brain Research, Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, United States; School of Life Sciences and Technology, ShanghaiTech University, Shanghai, China
- Natalia I Denisenko: McGovern Institute for Brain Research, Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, United States
- Mark S Goldman: Center for Neuroscience, Department of Neurobiology, Physiology and Behavior, University of California, Davis, Davis, United States; Department of Ophthalmology and Vision Science, University of California, Davis, Davis, United States
- Michale S Fee: McGovern Institute for Brain Research, Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, United States
11. Ramanarayanan V, Tilsen S, Proctor M, Töger J, Goldstein L, Nayak KS, Narayanan S. Analysis of speech production real-time MRI. Computer Speech & Language 2018. DOI: 10.1016/j.csl.2018.04.002.
12. Lee E, Xing F, Ahn S, Reese TG, Wang R, Green JR, Atassi N, Wedeen VJ, El Fakhri G, Woo J. Magnetic resonance imaging based anatomical assessment of tongue impairment due to amyotrophic lateral sclerosis: A preliminary study. The Journal of the Acoustical Society of America 2018; 143:EL248. PMID: 29716267. PMCID: PMC5895467. DOI: 10.1121/1.5030134.
Abstract
Amyotrophic Lateral Sclerosis (ALS) is a neurological disorder that impairs tongue function for speech and swallowing. A widely used diffusion tensor imaging (DTI) analysis pipeline is employed to quantify differences in tongue fiber myoarchitecture between controls and ALS patients. The pipeline uses both high-resolution magnetic resonance imaging (hMRI) and DTI: hMRI is used to delineate tongue muscles, while DTI provides indices that reveal fiber connectivity within and between muscles. Preliminary results from five controls and two patients show quantitative differences between the groups. This work has the potential to provide insights into the detrimental effects of ALS on speech and swallowing.
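One of the standard DTI indices such a pipeline reports is fractional anisotropy (FA), computed per voxel from the eigenvalues of the diffusion tensor; a minimal implementation of the textbook formula:

```python
import numpy as np

def fractional_anisotropy(evals):
    """FA = sqrt(3/2 * sum((l_i - MD)^2) / sum(l_i^2)) for eigenvalues l1..l3."""
    l1, l2, l3 = evals
    md = (l1 + l2 + l3) / 3.0                              # mean diffusivity
    num = (l1 - md) ** 2 + (l2 - md) ** 2 + (l3 - md) ** 2
    den = l1**2 + l2**2 + l3**2
    return np.sqrt(1.5 * num / den) if den > 0 else 0.0

print(fractional_anisotropy((1.7e-3, 0.3e-3, 0.3e-3)))    # elongated voxel: FA ~ 0.8
```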
Affiliation(s)
- Euna Lee: Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts 02114, USA
- Fangxu Xing: Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts 02114, USA
- Sung Ahn: Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts 02114, USA
- Timothy G Reese: Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts 02129, USA
- Ruopeng Wang: Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts 02129, USA
- Jordan R Green: Department of Communication Sciences and Disorders, MGH Institute of Health Professions, Boston, Massachusetts 02129, USA
- Nazem Atassi: Department of Neurology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts 02114, USA
- Van J Wedeen: Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts 02129, USA
- Georges El Fakhri: Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts 02114, USA
- Jonghye Woo: Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts 02114, USA
13. MUPET-Mouse Ultrasonic Profile ExTraction: A Signal Processing Tool for Rapid and Unsupervised Analysis of Ultrasonic Vocalizations. Neuron 2017; 94:465-485.e5. PMID: 28472651. DOI: 10.1016/j.neuron.2017.04.005.
Abstract
Vocalizations play a significant role in social communication across species. Analyses in rodents have used a limited number of spectro-temporal measures to compare ultrasonic vocalizations (USVs), which limits the ability to address repertoire complexity in the context of behavioral states. Using an automated and unsupervised signal processing approach, we report the development of MUPET (Mouse Ultrasonic Profile ExTraction) software, an open-access MATLAB tool that provides data-driven, high-throughput analyses of USVs. MUPET measures, learns, and compares syllable types and provides an automated time stamp of syllable events. Using USV data from a large mouse genetic reference panel and open-source datasets produced in different social contexts, MUPET analyzes the fine details of syllable production and repertoire use. MUPET thus serves as a new tool for USV repertoire analyses, with the capability to be adapted for use with other species.
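A crude sketch of the first stage of this kind of tool, detecting candidate syllables as runs of supra-threshold ultrasonic-band energy, is shown below; the sampling rate, band limits, and threshold are illustrative choices, and MUPET's actual detection and syllable modeling are more elaborate.

```python
import numpy as np
from scipy.signal import spectrogram

def detect_usv_syllables(x, fs=250_000, fmin=35_000, fmax=110_000):
    """Flag time bins whose ultrasonic-band energy exceeds a noise threshold,
    then return (onset, offset) times of contiguous supra-threshold runs."""
    f, t, S = spectrogram(x, fs=fs, nperseg=512, noverlap=384)
    band = S[(f >= fmin) & (f <= fmax)].sum(axis=0)     # in-band energy per bin
    active = band > band.mean() + 2.0 * band.std()      # simple threshold
    changes = np.flatnonzero(np.diff(np.r_[0, active.astype(int), 0]))
    return [(t[a], t[b - 1]) for a, b in zip(changes[::2], changes[1::2])]

x = np.random.default_rng(0).normal(size=250_000)       # 1 s of synthetic noise
print(detect_usv_syllables(x))
```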
14. Ramanarayanan V, Van Segbroeck M, Narayanan SS. Directly data-derived articulatory gesture-like representations retain discriminatory information about phone categories. Computer Speech & Language 2016; 36:330-346. PMID: 26688612. DOI: 10.1016/j.csl.2015.03.004.
Abstract
How the speech production and perception systems evolved in humans remains a mystery. Previous research suggests that human auditory systems are able, and have possibly evolved, to preserve maximal information about the speaker's articulatory gestures. This paper takes an initial step towards answering the complementary question of whether speakers' articulatory mechanisms have also evolved to produce sounds that can be optimally discriminated by the listener's auditory system. To this end, we explicitly model, using computational methods, the extent to which derived representations of "primitive movements" of speech articulation can be used to discriminate between broad phone categories. We extract interpretable spatio-temporal primitive movements as recurring patterns in a data matrix of human speech articulation, i.e., the trajectories of vocal tract articulators over time. Specifically, we propose a weakly supervised learning method that finds a part-based representation of the data in terms of recurring basis trajectory units (or primitives) and their corresponding activations over time. For each phone interval, we then derive a feature representation that captures the co-occurrences between the activations of the various bases over different time-lags. We show that this feature, derived entirely from activations of these primitive movements, achieves greater discrimination than conventional features on an interval-based phone classification task. We discuss the implications of these findings for furthering our understanding of speech signal representations and the links between speech production and perception systems.
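The co-occurrence feature is straightforward to sketch: for each phone interval, accumulate basis-by-basis co-activation at a range of time-lags and vectorize. The array sizes below are placeholders, not the paper's settings.

```python
import numpy as np

def lag_cooccurrence(H, max_lag=5):
    """Feature for one phone interval: co-occurrence of basis activations
    across time-lags. H is (n_bases x n_frames) of primitive activations."""
    feats = []
    for lag in range(max_lag + 1):
        a, b = H[:, : H.shape[1] - lag], H[:, lag:]
        feats.append((a @ b.T).ravel())      # basis-by-basis co-activation
    return np.concatenate(feats)

H = np.random.default_rng(0).random((10, 40))   # dummy activations
phi = lag_cooccurrence(H)                       # would feed a phone classifier
print(phi.shape)                                # (6 * 10 * 10,) = (600,)
```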
Affiliation(s)
- Vikram Ramanarayanan: Signal Analysis and Interpretation Lab, University of Southern California, Los Angeles, CA 90089
- Maarten Van Segbroeck: Signal Analysis and Interpretation Lab, University of Southern California, Los Angeles, CA 90089
- Shrikanth S Narayanan: Signal Analysis and Interpretation Lab, University of Southern California, Los Angeles, CA 90089
15.
16. Parmiggiani A, Randazzo M, Maggiali M, Metta G, Elisei F, Bailly G. Design and Validation of a Talking Face for the iCub. International Journal of Humanoid Robotics 2015. DOI: 10.1142/s0219843615500267.
Abstract
Recent developments in human–robot interaction show how the ability to communicate with people in a natural way is of great importance for artificial agents. The implementation of facial expressions has been found to significantly increase the interaction capabilities of humanoid robots. For speech, displaying a correct articulation with sound is mandatory to avoid audiovisual illusions like the McGurk effect (leading to comprehension errors) as well as to enhance the intelligibility in noisy conditions. This work describes the design, construction and testing of an animatronic talking face developed for the iCub robot. This talking head has an articulated jaw and four independent lip movements actuated by five motors. It is covered by a specially designed elastic tissue cover whose hemlines at the lips are attached to the motors via connecting linkages. The mechanical design and the control scheme have been evaluated by speech intelligibility in noise (SPIN) perceptual tests that demonstrate an absolute 10% intelligibility gain provided by the jaw and lip movements over the audio-only display.
Affiliation(s)
- Alberto Parmiggiani: iCub Facility, Fondazione Istituto Italiano di Tecnologia, Via Morego 30, 16163 Genoa, Italy
- Marco Randazzo: iCub Facility, Fondazione Istituto Italiano di Tecnologia, Via Morego 30, 16163 Genoa, Italy
- Marco Maggiali: iCub Facility, Fondazione Istituto Italiano di Tecnologia, Via Morego 30, 16163 Genoa, Italy
- Giorgio Metta: iCub Facility, Fondazione Istituto Italiano di Tecnologia, Via Morego 30, 16163 Genoa, Italy
- Frederic Elisei: GIPSA-lab, Speech & Cognition Dept., CNRS/Univ. Grenoble Alpes, France
- Gerard Bailly: GIPSA-lab, Speech & Cognition Dept., CNRS/Univ. Grenoble Alpes, France
17. Gibert G, Olsen KN, Leung Y, Stevens CJ. Transforming an embodied conversational agent into an efficient talking head: from keyframe-based animation to multimodal concatenation synthesis. Computational Cognitive Science 2015; 1:7. PMID: 27980889. PMCID: PMC5125409. DOI: 10.1186/s40469-015-0007-8.
Abstract
Background: Virtual humans have become part of our everyday life (movies, internet, and computer games). Even though they are becoming more and more realistic, their speech capabilities are usually limited, being incoherent and/or asynchronous with the corresponding acoustic signal.

Methods: We describe a method to convert a virtual human avatar (animated through key frames and interpolation) into a more naturalistic talking head. Speech articulation cannot be accurately replicated by interpolation between key frames; talking heads with good speech capabilities are derived from real speech production data. Motion capture data are commonly used to provide accurate facial motion for the visible speech articulators (jaw and lips) synchronously with acoustics. To access tongue trajectories (a partially occluded speech articulator), electromagnetic articulography (EMA) is often used. We recorded a large database of phonetically balanced English sentences with synchronous EMA, motion capture data, and acoustics. An articulatory model was computed on this database to recover missing data and to provide 'normalized' animation (i.e., articulatory) parameters. In addition, semi-automatic segmentation was performed on the acoustic stream. A dictionary of multimodal Australian English diphones was created, composed of the variation of the articulatory parameters between all successive stable allophones.

Results: The avatar's facial key frames were converted into articulatory parameters steering its speech articulators (jaw, lips, and tongue). The speech production database was used to drive the Embodied Conversational Agent (ECA) and to enhance its speech capabilities. A text-to-auditory-visual-speech synthesizer was created based on the MaryTTS software and on the diphone dictionary derived from the speech production database.

Conclusions: We describe a method to transform an ECA with a generic tongue model and keyframe animation into a talking head that displays naturalistic tongue, jaw, and lip motions. Thanks to a multimodal speech production database, a text-to-auditory-visual-speech synthesizer drives the ECA's facial movements, enhancing its speech capabilities.
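As a rough illustration of the concatenation step, the sketch below stitches hypothetical diphone trajectories of articulatory parameters with a short linear crossfade at each join; the dictionary contents, units, and smoothing are all invented, and the described synthesizer performs unit selection and smoothing far more carefully.

```python
import numpy as np

# Toy diphone dictionary: label -> (frames x parameters) trajectory.
rng = np.random.default_rng(0)
diphones = {"h-e": rng.random((12, 6)), "e-l": rng.random((10, 6)),
            "l-o": rng.random((14, 6))}

def concatenate(units, xfade=3):
    """Join successive diphone trajectories with a linear crossfade."""
    out = diphones[units[0]].copy()
    for name in units[1:]:
        nxt = diphones[name]
        w = np.linspace(0.0, 1.0, xfade)[:, None]          # crossfade weights
        out[-xfade:] = (1 - w) * out[-xfade:] + w * nxt[:xfade]
        out = np.vstack([out, nxt[xfade:]])
    return out                                             # frames x parameters

traj = concatenate(["h-e", "e-l", "l-o"])
```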
Affiliation(s)
- Guillaume Gibert: The MARCS Institute, University of Western Sydney, Locked Bag 1797, Penrith, NSW 2751, Australia; INSERM U846, 18 avenue Doyen Lépine, 69500 Bron, France; Stem Cell and Brain Research Institute, 69500 Bron, France; Université de Lyon, Université Lyon 1, 69003 Lyon, France
- Kirk N Olsen: The MARCS Institute, University of Western Sydney, Locked Bag 1797, Penrith, NSW 2751, Australia
- Yvonne Leung: The MARCS Institute, University of Western Sydney, Locked Bag 1797, Penrith, NSW 2751, Australia
- Catherine J Stevens: The MARCS Institute, University of Western Sydney, Locked Bag 1797, Penrith, NSW 2751, Australia
18. Determining functional units of tongue motion via graph-regularized sparse non-negative matrix factorization. Medical Image Computing and Computer-Assisted Intervention (MICCAI) 2014. PMID: 25485373. DOI: 10.1007/978-3-319-10470-6_19.
Abstract
Tongue motion during speech and swallowing involves synergies of locally deforming regions, or functional units. Motion clustering during tongue motion can be used to reveal the tongue's intrinsic functional organization. A novel matrix factorization and clustering method for tissues tracked using tagged magnetic resonance imaging (tMRI) is presented. Functional units are estimated using a graph-regularized sparse non-negative matrix factorization framework, learning latent building blocks and the corresponding weighting map from motion features derived from tissue displacements. Spectral clustering using the weighting map is then performed to determine the coherent regions, i.e., functional units, defined by the tongue motion. Two-dimensional image data are used to verify that the proposed algorithm clusters the different types of images accurately. Three-dimensional tMRI data from five subjects carrying out simple non-speech/speech tasks are analyzed to show how the proposed approach defines a subject/task-specific functional parcellation of the tongue in localized regions.
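A common formulation of the factorization this paper builds on adds a graph-Laplacian term to sparse NMF. The multiplicative updates below follow that generic recipe (cf. graph-regularized NMF) under assumed inputs; they are a sketch, not the paper's exact algorithm.

```python
import numpy as np

def graph_sparse_nmf(X, A, k, lam=0.1, gamma=0.5, n_iter=300, eps=1e-9):
    """Updates for min_{W,H>=0} ||X - WH||^2 + lam*sum(H) + gamma*tr(H L H^T),
    where L = D - A is the Laplacian of a voxel-adjacency matrix A, so that
    neighboring voxels receive similar weighting profiles."""
    D = np.diag(A.sum(axis=1))
    rng = np.random.default_rng(0)
    W = rng.random((X.shape[0], k))
    H = rng.random((k, X.shape[1]))
    for _ in range(n_iter):
        W *= (X @ H.T) / (W @ H @ H.T + eps)
        H *= (W.T @ X + gamma * H @ A) / (W.T @ W @ H + gamma * H @ D + lam + eps)
    return W, H

# Toy data: 100-dim motion features over 200 voxels with a chain-graph adjacency.
X = np.random.default_rng(1).random((100, 200))
A = np.eye(200, k=1) + np.eye(200, k=-1)
W, H = graph_sparse_nmf(X, A, k=5)
```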
19.
Affiliation(s)
- Bryan Gick: Department of Linguistics, University of British Columbia, Vancouver, BC, Canada; Haskins Laboratories, New Haven, CT, USA
- Ian Stavness: Department of Computer Science, University of Saskatchewan, Saskatoon, SK, Canada