1
Badin P, Sawallis TR, Tabain M, Lamalle L. Bilinguals from Larynx to Lips: Exploring Bilingual Articulatory Strategies with Anatomic MRI Data. Language and Speech 2024:238309231224790. PMID: 38680040. DOI: 10.1177/00238309231224790.
Abstract
The goal of this article is to illustrate the use of MRI for exploring bi- and multi-lingual articulatory strategies. One male and one female speaker recorded sets of static midsagittal MRIs of the whole vocal tract, producing vowels as well as consonants in various vowel contexts in either the male's two or the female's three languages. Both speakers were native speakers of English (American and Australian English, respectively), and both were fluent L2 speakers of French. In addition, the female speaker was a heritage speaker of Croatian. Articulatory contours extracted from the MRIs were subsequently used at three progressively more compact and abstract levels of analysis. (1) Direct comparison of overlaid contours was used to assess whether phones analogous across L1 and L2 are similar or dissimilar, both overall and in specific vocal tract regions. (2) Consonant contour variability along the vocal tract due to vowel context was determined using dispersion ellipses and used to explore the variable resistance to coarticulation for non-analogous rhotics and analogous laterals in Australian, French, and Croatian. (3) Articulatory modeling was used to focus on specific articulatory gestures (tongue position and shape, lip protrusion, laryngeal height, etc.) and then to explore the articulatory strategies in the speakers' interlanguages for production of the French front rounded vowel series. This revealed that the Australian and American speakers used different strategies to produce the non-analogous French vowel series. We conclude that MRI-based articulatory data constitute a very rich and underused source of information that amply deserves applications to the study of L2 articulation and bilingual and multi-lingual speech.
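Analysis level (2) rests on dispersion ellipses fitted to contour points across vowel contexts. As a rough sketch of how such an ellipse can be derived from a 2D point cloud (an illustration only, not the authors' code; the function and parameter names are assumptions):

```python
import numpy as np

def dispersion_ellipse(points, n_std=2.0):
    """Fit a dispersion ellipse to 2D points (e.g. one tongue-contour
    landmark tracked across vowel contexts).

    Returns (centre, axis_lengths, angle_deg): the mean position, the
    semi-axis lengths at n_std standard deviations, and the orientation
    of the major axis in degrees.
    """
    pts = np.asarray(points, dtype=float)
    centre = pts.mean(axis=0)
    cov = np.cov(pts, rowvar=False)          # 2x2 covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    order = eigvals.argsort()[::-1]          # put the major axis first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    axis_lengths = n_std * np.sqrt(eigvals)  # semi-axes of the ellipse
    angle_deg = np.degrees(np.arctan2(eigvecs[1, 0], eigvecs[0, 0]))
    return centre, axis_lengths, angle_deg
```

The spread of such ellipses along the vocal tract is one way to quantify a consonant's resistance to coarticulation.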
Affiliation(s)
- Pierre Badin
- Institute of Engineering, Université Grenoble Alpes, CNRS, Grenoble INP, GIPSA-lab, France
- Marija Tabain
- Department of Languages and Linguistics, La Trobe University, Australia
- Laurent Lamalle
- Université Grenoble Alpes and CHU de Grenoble, Inserm US 17, CNRS UMS 3552, UMS IRMaGe, France
2
Shahid MS, French AP, Valstar MF, Yakubov GE. Research in methodologies for modelling the oral cavity. Biomed Phys Eng Express 2024;10:032001. PMID: 38350128. DOI: 10.1088/2057-1976/ad28cc.
Abstract
The paper aims to explore the current state of understanding surrounding in silico oral modelling. This involves exploring methodologies, technologies and approaches pertaining to the modelling of the whole oral cavity: both internally and externally visible structures that may be relevant or appropriate to oral actions. Such a model could be referred to as a 'complete model', which includes consideration of a full set of facial features (i.e. not only the mouth) as well as synergistic stimuli such as audio and facial thermal data. 3D modelling technologies capable of accurately and efficiently capturing a complete representation of the mouth for an individual have broad applications in the study of oral actions, due to their cost-effectiveness and time efficiency. This review delves into the field of clinical phonetics to classify oral actions pertaining to both speech and non-speech movements, identifying how the various vocal organs play a role in the articulatory and masticatory process. Vitally, it provides a summary of 12 articulatory recording methods, forming a tool researchers can use to identify which recording method is appropriate for their work. After addressing the cost and resource-intensive limitations of existing methods, a new system of modelling is proposed that leverages external-to-internal correlation modelling techniques to create more efficient models of the oral cavity. The vision is that the outcomes will be applicable to a broad spectrum of oral functions related to physiology, health and wellbeing, including speech, oral processing of foods and dental health. The applications may span from speech correction to designing foods for the ageing population, whilst in the dental field information about a patient's oral actions could feed into a personalised dental treatment plan.
Affiliation(s)
- Andrew P French
- School of Computer Science, University of Nottingham, NG8 1BB, United Kingdom
- School of Biosciences, University of Nottingham, LE12 5RD, United Kingdom
- Michel F Valstar
- School of Computer Science, University of Nottingham, NG8 1BB, United Kingdom
- Gleb E Yakubov
- School of Biosciences, University of Nottingham, LE12 5RD, United Kingdom
3
Belyk M, Carignan C, McGettigan C. An open-source toolbox for measuring vocal tract shape from real-time magnetic resonance images. Behav Res Methods 2024;56:2623-2635. PMID: 37507650. PMCID: PMC10990993. DOI: 10.3758/s13428-023-02171-9.
Abstract
Real-time magnetic resonance imaging (rtMRI) is a technique that provides high-contrast videographic data of human anatomy in motion. Applied to the vocal tract, it is a powerful method for capturing the dynamics of speech and other vocal behaviours by imaging structures internal to the mouth and throat. These images provide a means of studying the physiological basis for speech, singing, expressions of emotion, and swallowing that are otherwise not accessible for external observation. However, taking quantitative measurements from these images is notoriously difficult. We introduce a signal processing pipeline that produces outlines of the vocal tract from the lips to the larynx as a quantification of the dynamic morphology of the vocal tract. Our approach performs simple tissue classification, but constrained to a researcher-specified region of interest. This combination facilitates feature extraction while retaining the domain-specific expertise of a human analyst. We demonstrate that this pipeline generalises well across datasets covering behaviours such as speech, vocal size exaggeration, laughter, and whistling, as well as producing reliable outcomes across analysts, particularly among users with domain-specific expertise. With this article, we make this pipeline available for immediate use by the research community, and further suggest that it may contribute to the continued development of fully automated methods based on deep learning algorithms.
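The pipeline's key combination is simple tissue classification constrained to a researcher-specified region of interest. A minimal, hypothetical illustration of that idea (not the toolbox's actual implementation; the threshold heuristic and names are assumptions) might look like:

```python
import numpy as np

def classify_tissue(frame, roi_mask, threshold=None):
    """Binary air/tissue classification of one rtMRI frame, restricted
    to a researcher-drawn region of interest (ROI).

    frame     : 2D array of MRI pixel intensities
    roi_mask  : boolean array, True inside the ROI
    threshold : intensity cut-off; if None, a crude two-class split is
                used (midpoint between the darkest-quartile and
                brightest-quartile means inside the ROI)
    """
    vals = frame[roi_mask]
    if threshold is None:
        lo = vals[vals <= np.quantile(vals, 0.25)].mean()  # air-like pixels
        hi = vals[vals >= np.quantile(vals, 0.75)].mean()  # tissue-like pixels
        threshold = 0.5 * (lo + hi)
    tissue = np.zeros_like(frame, dtype=bool)
    tissue[roi_mask] = frame[roi_mask] > threshold  # classify only inside ROI
    return tissue
```

Restricting the classification to the ROI is what lets the analyst's domain expertise (where the vocal tract is) steer an otherwise fully automatic step.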
Affiliation(s)
- Michel Belyk
- Department of Psychology, Edge Hill University, Ormskirk, UK.
- Christopher Carignan
- Department of Speech Hearing and Phonetic Sciences, University College London, London, UK
- Carolyn McGettigan
- Department of Speech Hearing and Phonetic Sciences, University College London, London, UK
4
Ruthven M, Peplinski AM, Adams DM, King AP, Miquel ME. Real-time speech MRI datasets with corresponding articulator ground-truth segmentations. Sci Data 2023;10:860. PMID: 38042857. PMCID: PMC10693552. DOI: 10.1038/s41597-023-02766-z.
Abstract
The use of real-time magnetic resonance imaging (rt-MRI) of speech is increasing in clinical practice and speech science research. Analysis of such images often requires segmentation of articulators and the vocal tract, and the community is turning to deep-learning-based methods to perform this segmentation. While there are publicly available rt-MRI datasets of speech, these do not include ground-truth (GT) segmentations, a key requirement for the development of deep-learning-based segmentation methods. To begin to address this barrier, this work presents rt-MRI speech datasets of five healthy adult volunteers with corresponding GT segmentations and velopharyngeal closure patterns. The images were acquired using standard clinical MRI scanners, coils and sequences to facilitate acquisition of similar images in other centres. The datasets include manually created GT segmentations of six anatomical features including the tongue, soft palate and vocal tract. In addition, this work makes code and instructions to implement a current state-of-the-art deep-learning-based method to segment rt-MRI speech datasets publicly available, thus providing the community and others with a starting point for developing such methods.
Affiliation(s)
- Matthieu Ruthven
- Clinical Physics, Barts Health NHS Trust, West Smithfield, London, EC1A 7BE, UK
- School of Biomedical Engineering & Imaging Sciences, King's College London, King's Health Partners, St Thomas' Hospital, London, SE1 7EH, UK
- David M Adams
- Clinical Physics, Barts Health NHS Trust, West Smithfield, London, EC1A 7BE, UK
- Andrew P King
- School of Biomedical Engineering & Imaging Sciences, King's College London, King's Health Partners, St Thomas' Hospital, London, SE1 7EH, UK
- Marc Eric Miquel
- Clinical Physics, Barts Health NHS Trust, West Smithfield, London, EC1A 7BE, UK.
- Digital Environment Research Institute (DERI), Empire House, 67-75 New Road, Queen Mary University of London, London, E1 1HH, UK.
- Advanced Cardiovascular Imaging, Barts NIHR BRC, Queen Mary University of London, London, EC1M 6BQ, UK.
5
Mofakham AA, Helenbrook BT, Erath BD, Ferro AR, Ahmed T, Brown DM, Ahmadi G. Influence of two-dimensional expiratory airflow variations on respiratory particle propagation during pronunciation of the fricative [f]. J Aerosol Sci 2023;173:106179. PMID: 37069899. PMCID: PMC10088289. DOI: 10.1016/j.jaerosci.2023.106179.
Abstract
Propagation of respiratory particles, potentially containing viable viruses, plays a significant role in the transmission of respiratory diseases (e.g., COVID-19) from infected people. Particles are produced in the upper respiratory system and exit the mouth during expiratory events such as sneezing, coughing, talking, and singing. The importance of considering speaking and singing as vectors of particle transmission has been recognized by researchers. Recently, in a companion paper, dynamics of expiratory flow during fricative utterances were explored, and significant variations of airflow jet trajectories were reported. This study focuses on respiratory particle propagation during fricative productions and the effect of airflow variations on particle transport and dispersion as a function of particle size. The commercial ANSYS-Fluent computational fluid dynamics (CFD) software was employed to quantify the fluid flow and particle dispersion from a two-dimensional mouth model of sustained fricative [f] utterance as well as a horizontal jet flow model. The fluid velocity field and particle distributions estimated from the mouth model were compared with those of the horizontal jet flow model. The significant effects of the airflow jet trajectory variations on the pattern of particle transport and dispersion during fricative utterances were studied. Distinct differences between the estimations of the horizontal jet model for particle propagation and those of the mouth model were observed. The importance of considering the vocal tract geometry, and the failure of a horizontal jet model to properly estimate the expiratory airflow and respiratory particle propagation during the production of fricative utterances, were emphasized.
Affiliation(s)
- Amir A Mofakham
- Department of Mechanical and Aerospace Engineering, Clarkson University, Potsdam, NY 13699, United States of America
- Brian T Helenbrook
- Department of Mechanical and Aerospace Engineering, Clarkson University, Potsdam, NY 13699, United States of America
- Byron D Erath
- Department of Mechanical and Aerospace Engineering, Clarkson University, Potsdam, NY 13699, United States of America
- Andrea R Ferro
- Department of Civil and Environmental Engineering, Clarkson University, Potsdam, NY 13699, United States of America
- Tanvir Ahmed
- Department of Mechanical and Aerospace Engineering, Clarkson University, Potsdam, NY 13699, United States of America
- Deborah M Brown
- Joint Educational Programs, Trudeau Institute, Saranac Lake, NY 12983, United States of America
- Goodarz Ahmadi
- Department of Mechanical and Aerospace Engineering, Clarkson University, Potsdam, NY 13699, United States of America
6
Willett FR, Kunz E, Fan C, Avansino D, Wilson G, Choi EY, Kamdar F, Hochberg LRH, Druckmann S, Shenoy K, Henderson J. A high-performance speech neuroprosthesis. bioRxiv 2023:2023.01.21.524489. PMID: 36711591. PMCID: PMC9882398. DOI: 10.1101/2023.01.21.524489.
Abstract
Speech brain-computer interfaces (BCIs) have the potential to restore rapid communication to people with paralysis by decoding neural activity evoked by attempted speaking movements into text or sound. Early demonstrations, while promising, have not yet achieved accuracies high enough for communication of unconstrained sentences from a large vocabulary. Here, we demonstrate the first speech-to-text BCI that records spiking activity from intracortical microelectrode arrays. Enabled by these high-resolution recordings, our study participant, who can no longer speak intelligibly due to amyotrophic lateral sclerosis (ALS), achieved a 9.1% word error rate on a 50-word vocabulary (2.7 times fewer errors than the prior state-of-the-art speech BCI) and a 23.8% word error rate on a 125,000-word vocabulary (the first successful demonstration of large-vocabulary decoding). Our BCI decoded speech at 62 words per minute, which is 3.4 times faster than the prior record for any kind of BCI and begins to approach the speed of natural conversation (160 words per minute). Finally, we highlight two aspects of the neural code for speech that are encouraging for speech BCIs: spatially intermixed tuning to speech articulators that makes accurate decoding possible from only a small region of cortex, and a detailed articulatory representation of phonemes that persists years after paralysis. These results show a feasible path forward for using intracortical speech BCIs to restore rapid communication to people with paralysis who can no longer speak.
7
Masapollo M, Nittrouer S. Interarticulator Speech Coordination: Timing Is of the Essence. J Speech Lang Hear Res 2023;66:901-915. PMID: 36827516. DOI: 10.1044/2022_jslhr-22-00594.
Abstract
PURPOSE: In skilled speech production, sets of articulators, such as the jaw, tongue, and lips, work cooperatively to achieve task-specific movement goals, despite rampant contextual variation. Efforts to understand these functional units, termed coordinative structures, have focused on identifying the essential control parameters responsible for allowing articulators to achieve these goals, with some research focusing on temporal parameters (relative timing of movements) and other research focusing on spatiotemporal parameters (phase angle of movement onset for one articulator, relative to another). Here, both types of parameters were investigated and compared in detail.
METHOD: Ten talkers recorded nonsense, disyllabic /tV#Cat/ utterances using electromagnetic articulography, with alternating V (/ɑ/-/ɛ/) and C (/t/-/d/), across variation in rate (fast-slow) and stress (first syllable stressed-unstressed). Two measures were obtained: (a) the timing of tongue-tip raising onset for medial C, relative to jaw opening-closing cycles, and (b) the angle of tongue-tip raising onset, relative to the jaw phase plane.
RESULTS: Results showed that any manipulation that shortened the jaw opening-closing cycle reduced both the relative timing and phase angle of the tongue-tip movement onset, but relative timing of tongue-tip movement onset scaled more consistently with jaw opening-closing across rate and stress variation.
CONCLUSION: These findings suggest the existence of an intrinsic timing mechanism (or "central clock") that is the primary control parameter for coordinative structures, with online compensation then allowing these structures to achieve their goals spatially.
SUPPLEMENTAL MATERIAL: https://doi.org/10.23641/asha.22144259
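The two measures above (relative timing within the jaw cycle, and phase angle on the jaw's position-velocity phase plane) can be sketched numerically. This is a hypothetical illustration only; the signal names and the normalisation assumption are ours, not the study's analysis code:

```python
import numpy as np

def onset_phase_angle(jaw_pos, jaw_vel, onset_idx):
    """Phase angle (degrees) of tongue-tip raising onset, read off the
    jaw's position-velocity phase plane. Both signals are assumed to be
    normalised so the phase-plane orbit is roughly circular."""
    return float(np.degrees(np.arctan2(jaw_vel[onset_idx], jaw_pos[onset_idx])))

def relative_timing(onset_t, cycle_start_t, cycle_dur):
    """Tongue-tip onset time as a fraction of the jaw opening-closing cycle."""
    return (onset_t - cycle_start_t) / cycle_dur
```

On a sinusoidal jaw cycle, an onset a quarter-cycle in sits at a phase angle of -90 degrees and a relative timing of 0.25, which is the sense in which the two measures describe the same event in different coordinates.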
Affiliation(s)
- Matthew Masapollo
- Department of Speech, Language, and Hearing Sciences, University of Florida, Gainesville
- Susan Nittrouer
- Department of Speech, Language, and Hearing Sciences, University of Florida, Gainesville
8
Nair NP, Sharma V, Dixit A, Kaushal D, Soni K, Choudhury B, Goyal A. Future Solutions for Voice Rehabilitation in Laryngectomees: A Review of Technologies Based on Electrophysiological Signals. Indian J Otolaryngol Head Neck Surg 2022;74:5082-5090. PMID: 36742837. PMCID: PMC9895460. DOI: 10.1007/s12070-021-02765-9.
Abstract
Loss of voice is a serious concern for a laryngectomee and should be addressed before planning the procedure; patients must be educated about voice rehabilitation options before surgery. Even though many devices are in use, each has its limitations. We searched for probable future technologies for voice rehabilitation in laryngectomees, in order to familiarise the ENT fraternity with them. We performed a bibliographic search using title/abstract searches and Medical Subject Headings (MeSH) where appropriate, of Medline, CINAHL, EMBASE, Web of Science and Google Scholar, for publications from January 1985 to January 2020. Results with scope for the development of a device for speech rehabilitation were included in the review. A total of 1036 articles were identified and screened; after careful scrutiny, 40 articles were included in this study. The silent speech interface is one topic that is being studied extensively. It is based on various electrophysiological biosignals such as non-audible murmur, electromyography, ultrasound characteristics of the vocal folds, optical imaging of the lips and tongue, electro-articulography and electroencephalography. Electromyographic signals have been studied in laryngectomised patients. The silent speech interface may be the answer for the future of voice rehabilitation in laryngectomees. However, all these technologies are at a primitive stage, with the potential to mature into a speech device.
Affiliation(s)
- Vidhu Sharma
- Department of Otorhinolaryngology, AIIMS, Jodhpur, Rajasthan 342005, India
- Abhinav Dixit
- Department of Physiology, AIIMS, Jodhpur, Rajasthan 342005, India
- Darwin Kaushal
- Department of Otorhinolaryngology, AIIMS, Bilaspur, Himachal Pradesh, India
- Kapil Soni
- Department of Otorhinolaryngology, AIIMS, Jodhpur, Rajasthan 342005, India
- Bikram Choudhury
- Department of Otorhinolaryngology, AIIMS, Jodhpur, Rajasthan 342005, India
- Amit Goyal
- Department of Otorhinolaryngology, AIIMS, Jodhpur, Rajasthan 342005, India
9
Belyk M, McGettigan C. Real-time magnetic resonance imaging reveals distinct vocal tract configurations during spontaneous and volitional laughter. Philos Trans R Soc Lond B Biol Sci 2022;377:20210511. PMID: 36126659. PMCID: PMC9489295. DOI: 10.1098/rstb.2021.0511.
Abstract
A substantial body of acoustic and behavioural evidence points to the existence of two broad categories of laughter in humans: spontaneous laughter that is emotionally genuine and somewhat involuntary, and volitional laughter that is produced on demand. In this study, we tested the hypothesis that these are also physiologically distinct vocalizations, by measuring and comparing them using real-time magnetic resonance imaging (rtMRI) of the vocal tract. Following Ruch and Ekman (Ruch and Ekman 2001 In Emotions, qualia, and consciousness (ed. A Kaszniak), pp. 426-443), we further predicted that spontaneous laughter should be relatively less speech-like (i.e. less articulate) than volitional laughter. We collected rtMRI data from five adult human participants during spontaneous laughter, volitional laughter and spoken vowels. We report distinguishable vocal tract shapes during the vocalic portions of these three vocalization types, where volitional laughs were intermediate between spontaneous laughs and vowels. Inspection of local features within the vocal tract across the different vocalization types offers some additional support for Ruch and Ekman's predictions. We discuss our findings in light of a dual pathway hypothesis for the neural control of human volitional and spontaneous vocal behaviours, identifying tongue shape and velum lowering as potential biomarkers of spontaneous laughter to be investigated in future research. This article is part of the theme issue 'Cracking the laugh code: laughter through the lens of biology, psychology and neuroscience'.
Affiliation(s)
- Michel Belyk
- Department of Psychology, Edge Hill University, Ormskirk L39 4QP, UK
- Department of Speech, Hearing and Phonetic Sciences, University College London, London WC1N 1PF, UK
- Carolyn McGettigan
- Department of Speech, Hearing and Phonetic Sciences, University College London, London WC1N 1PF, UK
10
Kröger BJ. Computer-Implemented Articulatory Models for Speech Production: A Review. Front Robot AI 2022;9:796739. PMID: 35494539. PMCID: PMC9040071. DOI: 10.3389/frobt.2022.796739.
Abstract
Modeling speech production and speech articulation is still an evolving research topic. Some current core questions are: What is the underlying (neural) organization for controlling speech articulation? How should speech articulators like the lips and tongue, and their movements, be modeled in an efficient but also biologically realistic way? How can high-quality articulatory-acoustic models be developed that lead to high-quality articulatory speech synthesis? On the one hand, computer modeling will help us unfold the underlying biological as well as acoustic-articulatory concepts of speech production; on the other hand, further modeling efforts will help us reach the goal of high-quality articulatory-acoustic speech synthesis based on more detailed knowledge of vocal tract acoustics and speech articulation. Currently, articulatory models are not able to reach the quality level of corpus-based speech synthesis. Moreover, biomechanical and neuromuscular approaches are complex and still not usable for sentence-level speech synthesis. This paper lists many computer-implemented articulatory models and provides criteria for dividing articulatory models into different categories. A recent major research question, namely how to control articulatory models in a neurobiologically adequate manner, is discussed in detail. It can be concluded that there is a strong need to further develop articulatory-acoustic models in order to test quantitative neurobiologically based control concepts for speech articulation and to uncover the remaining details in human articulatory and acoustic signal generation. Furthermore, these efforts may help us approach the goal of establishing high-quality articulatory-acoustic as well as neurobiologically grounded speech synthesis.
11
Ahmed T, Wendling HE, Mofakham AA, Ahmadi G, Helenbrook BT, Ferro AR, Brown DM, Erath BD. Variability in expiratory trajectory angles during consonant production by one human subject and from a physical mouth model: Application to respiratory droplet emission. Indoor Air 2021;31:1896-1912. PMID: 34297885. PMCID: PMC8447379. DOI: 10.1111/ina.12908.
Abstract
The COVID-19 pandemic has highlighted the need to improve understanding of droplet transport during expiratory emissions. While historical emphasis has been placed on violent events such as coughing and sneezing, the recognition of asymptomatic and presymptomatic spread has identified the need to consider other modalities, such as speaking. Accurate prediction of infection risk produced by speaking requires knowledge of both the droplet size distributions that are produced, as well as the expiratory flow fields that transport the droplets into the surroundings. This work demonstrates that the expiratory flow field produced by consonant productions is highly unsteady, exhibiting extremely broad inter- and intra-consonant variability, with mean ejection angles varying from ≈+30° to -30°. Furthermore, implementation of a physical mouth model to quantify the expiratory flow fields for fricative pronunciation of [f] and [θ] demonstrates that flow velocities at the lips are higher than previously predicted, reaching 20-30 m/s, and that the resultant trajectories are unstable. Because both large and small droplet transport are directly influenced by the magnitude and trajectory of the expirated air stream, these findings indicate that prior investigations of the flow dynamics during speech have largely underestimated the fluid penetration distances that can be achieved for particular consonant utterances.
Affiliation(s)
- Tanvir Ahmed
- Department of Mechanical and Aeronautical Engineering, Clarkson University, Potsdam, New York, USA
- Hannah E. Wendling
- Department of Mechanical and Aeronautical Engineering, Clarkson University, Potsdam, New York, USA
- Amir A. Mofakham
- Department of Mechanical and Aeronautical Engineering, Clarkson University, Potsdam, New York, USA
- Goodarz Ahmadi
- Department of Mechanical and Aeronautical Engineering, Clarkson University, Potsdam, New York, USA
- Brian T. Helenbrook
- Department of Mechanical and Aeronautical Engineering, Clarkson University, Potsdam, New York, USA
- Andrea R. Ferro
- Department of Civil and Environmental Engineering, Clarkson University, Potsdam, New York, USA
- Deborah M. Brown
- Joint Educational Programs, Trudeau Institute, Saranac Lake, New York, USA
- Byron D. Erath
- Department of Mechanical and Aeronautical Engineering, Clarkson University, Potsdam, New York, USA
12
Isaieva K, Laprie Y, Leclère J, Douros IK, Felblinger J, Vuissoz PA. Multimodal dataset of real-time 2D and static 3D MRI of healthy French speakers. Sci Data 2021;8:258. PMID: 34599194. PMCID: PMC8486854. DOI: 10.1038/s41597-021-01041-3.
Abstract
The study of articulatory gestures has a wide spectrum of applications, notably in speech production and recognition. Sets of phonemes, as well as their articulation, are language-specific; however, existing MRI databases mostly include English speakers. In our present work, we introduce a dataset acquired with MRI from 10 healthy native French speakers. A corpus consisting of synthetic sentences was used to ensure a good coverage of the French phonetic context. A real-time MRI technology with temporal resolution of 20 ms was used to acquire vocal tract images of the participants speaking. The sound was recorded simultaneously with MRI, denoised and temporally aligned with the images. The speech was transcribed to obtain phoneme-wise segmentation of sound. We also acquired static 3D MR images for a wide list of French phonemes. In addition, we include annotations of spontaneous swallowing.
Measurement(s): vocal tract images; speech
Technology type(s): magnetic resonance imaging; microphone device
Sample characteristic (organism): Homo sapiens
Machine-accessible metadata file describing the reported data: 10.6084/m9.figshare.16404453
Affiliation(s)
- Karyna Isaieva
- Université de Lorraine, INSERM, IADI, Nancy, F-54000, France.
- Yves Laprie
- Université de Lorraine, CNRS, Inria, LORIA, Nancy, F-54000, France
- Justine Leclère
- Université de Lorraine, INSERM, IADI, Nancy, F-54000, France
- Oral Medicine Department, University Hospital of Reims, 45 rue Cognacq-Jay, 51092 Reims Cedex, France
- Ioannis K Douros
- Université de Lorraine, INSERM, IADI, Nancy, F-54000, France
- Université de Lorraine, CNRS, Inria, LORIA, Nancy, F-54000, France
- Jacques Felblinger
- Université de Lorraine, INSERM, IADI, Nancy, F-54000, France
- CIC-IT, INSERM, CHRU de Nancy, Nancy, F-54000, France
13
Temporal Convolution Network Based Joint Optimization of Acoustic-to-Articulatory Inversion. Applied Sciences (Basel) 2021. DOI: 10.3390/app11199056.
Abstract
Articulatory features have proved effective in speech recognition and speech synthesis. However, acquiring articulatory features has always been difficult, and a lightweight and accurate articulatory inversion model is therefore of significant value. In this study, we propose a novel temporal convolutional network-based acoustic-to-articulatory inversion system. The acoustic feature is converted into a high-dimensional hidden-space feature map through temporal convolution, with frame-level feature correlations taken into account. Meanwhile, we construct a two-part target function combining the prediction's root mean square error (RMSE) and the sequences' Pearson correlation coefficient (PCC) to jointly optimize the performance of the inversion model from both aspects. We also analyse the impact of the weight between the two parts on the final performance of the inversion model. Extensive experiments show that our temporal convolutional network (TCN) model outperformed a bidirectional long short-term memory model by 1.18 mm in RMSE and 0.845 in PCC with 14 model parameters when optimizing evenly with the RMSE and PCC aspects.
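The two-part target function described above can be sketched as follows. This is an illustrative reconstruction from the abstract, not the paper's code: the exact weighting scheme, sign conventions, and names are assumptions.

```python
import numpy as np

def joint_objective(pred, target, alpha=0.5):
    """Two-part target for acoustic-to-articulatory inversion: a weighted
    sum of RMSE (lower is better) and 1 - Pearson correlation (lower is
    better), so minimising the combined score optimises pointwise
    accuracy and trajectory shape jointly.

    alpha : weight on the RMSE term; (1 - alpha) weights the PCC term
    """
    pred = np.asarray(pred, dtype=float).ravel()
    target = np.asarray(target, dtype=float).ravel()
    rmse = np.sqrt(np.mean((pred - target) ** 2))   # articulator-position error
    pcc = np.corrcoef(pred, target)[0, 1]           # trajectory-shape agreement
    return alpha * rmse + (1.0 - alpha) * (1.0 - pcc)
```

With alpha = 0.5 the two aspects are weighted evenly, matching the "optimizing evenly" condition the abstract reports; sweeping alpha is one way to study the weight's impact on the final model.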
14
Lim Y, Toutios A, Bliesener Y, Tian Y, Lingala SG, Vaz C, Sorensen T, Oh M, Harper S, Chen W, Lee Y, Töger J, Monteserin ML, Smith C, Godinez B, Goldstein L, Byrd D, Nayak KS, Narayanan SS. A multispeaker dataset of raw and reconstructed speech production real-time MRI video and 3D volumetric images. Sci Data 2021; 8:187. [PMID: 34285240 PMCID: PMC8292336 DOI: 10.1038/s41597-021-00976-x] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 02/15/2021] [Accepted: 06/22/2021] [Indexed: 12/11/2022]
Abstract
Real-time magnetic resonance imaging (RT-MRI) of human speech production is enabling significant advances in speech science, linguistics, bio-inspired speech technology development, and clinical applications. Easy access to RT-MRI is, however, limited, and comprehensive datasets with broad access are needed to catalyze research across numerous domains. The imaging of the rapidly moving articulators and dynamic airway shaping during speech demands high spatio-temporal resolution and robust reconstruction methods. Further, while reconstructed images have been published, to date there is no open dataset providing raw multi-coil RT-MRI data from an optimized speech production experimental setup. Such datasets could enable new and improved methods for dynamic image reconstruction, artifact correction, feature extraction, and direct extraction of linguistically relevant biomarkers. The present dataset offers a unique corpus of 2D sagittal-view RT-MRI videos with synchronized audio for 75 participants performing linguistically motivated speech tasks, alongside the corresponding public-domain raw RT-MRI data. The dataset also includes 3D volumetric vocal tract MRI during sustained speech sounds and high-resolution static anatomical T2-weighted upper airway MRI for each participant.
Affiliation(s)
- Yongwan Lim
- Ming Hsieh Department of Electrical and Computer Engineering, Viterbi School of Engineering, University of Southern California, Los Angeles, California, USA
- Asterios Toutios
- Ming Hsieh Department of Electrical and Computer Engineering, Viterbi School of Engineering, University of Southern California, Los Angeles, California, USA
- Yannick Bliesener
- Ming Hsieh Department of Electrical and Computer Engineering, Viterbi School of Engineering, University of Southern California, Los Angeles, California, USA
- Ye Tian
- Ming Hsieh Department of Electrical and Computer Engineering, Viterbi School of Engineering, University of Southern California, Los Angeles, California, USA
- Sajan Goud Lingala
- Ming Hsieh Department of Electrical and Computer Engineering, Viterbi School of Engineering, University of Southern California, Los Angeles, California, USA
- Colin Vaz
- Ming Hsieh Department of Electrical and Computer Engineering, Viterbi School of Engineering, University of Southern California, Los Angeles, California, USA
- Tanner Sorensen
- Department of Linguistics, Dornsife College of Letters, Arts and Sciences, University of Southern California, Los Angeles, California, USA
- Miran Oh
- Department of Linguistics, Dornsife College of Letters, Arts and Sciences, University of Southern California, Los Angeles, California, USA
- Sarah Harper
- Department of Linguistics, Dornsife College of Letters, Arts and Sciences, University of Southern California, Los Angeles, California, USA
- Weiyi Chen
- Ming Hsieh Department of Electrical and Computer Engineering, Viterbi School of Engineering, University of Southern California, Los Angeles, California, USA
- Yoonjeong Lee
- Department of Linguistics, Dornsife College of Letters, Arts and Sciences, University of Southern California, Los Angeles, California, USA
- Johannes Töger
- Ming Hsieh Department of Electrical and Computer Engineering, Viterbi School of Engineering, University of Southern California, Los Angeles, California, USA
- Mairym Lloréns Monteserin
- Department of Linguistics, Dornsife College of Letters, Arts and Sciences, University of Southern California, Los Angeles, California, USA
- Caitlin Smith
- Department of Linguistics, Dornsife College of Letters, Arts and Sciences, University of Southern California, Los Angeles, California, USA
- Bianca Godinez
- Department of Linguistics, California State University Long Beach, Long Beach, California, USA
- Louis Goldstein
- Department of Linguistics, Dornsife College of Letters, Arts and Sciences, University of Southern California, Los Angeles, California, USA
- Dani Byrd
- Department of Linguistics, Dornsife College of Letters, Arts and Sciences, University of Southern California, Los Angeles, California, USA
- Krishna S Nayak
- Ming Hsieh Department of Electrical and Computer Engineering, Viterbi School of Engineering, University of Southern California, Los Angeles, California, USA
- Shrikanth S Narayanan
- Ming Hsieh Department of Electrical and Computer Engineering, Viterbi School of Engineering, University of Southern California, Los Angeles, California, USA
- Department of Linguistics, Dornsife College of Letters, Arts and Sciences, University of Southern California, Los Angeles, California, USA
15
Fu J, He F, Yin H, He L. Automatic detection of pharyngeal fricatives in cleft palate speech using acoustic features based on the vocal tract area spectrum. COMPUT SPEECH LANG 2021. [DOI: 10.1016/j.csl.2021.101203] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/22/2022]
16
Lynn E, Narayanan SS, Lammert AC. Dark tone quality and vocal tract shaping in soprano song production: Insights from real-time MRI. JASA EXPRESS LETTERS 2021; 1:075202. [PMID: 34291230 PMCID: PMC8273971 DOI: 10.1121/10.0005109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2021] [Accepted: 05/10/2021] [Indexed: 06/13/2023]
Abstract
Tone quality termed "dark" is an aesthetically important property of Western classical voice performance and has been associated with lowered formant frequencies, lowered larynx, and widened pharynx. The present study uses real-time magnetic resonance imaging with synchronous audio recordings to investigate dark tone quality in four professionally trained sopranos with enhanced ecological validity and a relatively complete view of the vocal tract. Findings differ from traditional accounts, indicating that labial narrowing may be the primary driver of dark tone quality across performers, while many other aspects of vocal tract shaping are shown to differ significantly in a performer-specific way.
Affiliation(s)
- Elisabeth Lynn
- Department of Biomedical Engineering, Worcester Polytechnic Institute, Worcester, Massachusetts 01690, USA
- Shrikanth S Narayanan
- Signal Analysis and Interpretation Laboratory, University of Southern California, Los Angeles, California 90089, USA
- Adam C Lammert
- Department of Biomedical Engineering, Worcester Polytechnic Institute, Worcester, Massachusetts 01690, USA
17
A deep neural network based correction scheme for improved air-tissue boundary prediction in real-time magnetic resonance imaging video. COMPUT SPEECH LANG 2021. [DOI: 10.1016/j.csl.2020.101160] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/23/2022]
18
Harper S, Goldstein L, Narayanan S. Variability in individual constriction contributions to third formant values in American English /ɹ/. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2020; 147:3905. [PMID: 32611162 PMCID: PMC7297543 DOI: 10.1121/10.0001413] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/31/2019] [Revised: 05/23/2020] [Accepted: 05/28/2020] [Indexed: 06/11/2023]
Abstract
Although substantial variability is observed in the articulatory implementation of the constriction gestures involved in /ɹ/ production, studies of articulatory-acoustic relations in /ɹ/ have largely ignored the potential for subtle variation in the implementation of these gestures to affect salient acoustic dimensions. This study examines how variation in the articulation of American English /ɹ/ influences the relative sensitivity of the third formant to variation in palatal, pharyngeal, and labial constriction degree. Simultaneously recorded articulatory and acoustic data from six speakers in the USC-TIMIT corpus were analyzed to determine how variation in the implementation of each constriction across tokens of /ɹ/ relates to variation in third formant values. Results show that third formant values are differentially affected by constriction degree for the different constrictions used to produce /ɹ/. Additionally, interspeaker variation is observed in the relative effect of different constriction gestures on third formant values, most notably in a division between speakers exhibiting relatively equal effects of palatal and pharyngeal constriction degree on F3 and speakers exhibiting a stronger palatal effect. This division among speakers mirrors interspeaker differences in mean constriction length and location, suggesting that individual differences in /ɹ/ production lead to variation in articulatory-acoustic relations.
Affiliation(s)
- Sarah Harper
- Department of Linguistics, University of Southern California, Los Angeles, California 90089, USA
- Louis Goldstein
- Department of Linguistics, University of Southern California, Los Angeles, California 90089, USA
- Shrikanth Narayanan
- Signal Analysis and Interpretation Laboratory, Ming Hsieh Department of Electrical Engineering, University of Southern California, Los Angeles, California 90089, USA
19
Abstract
Emotional speech production has previously been studied using fleshpoint tracking data in speaker-specific experimental setups. The present study introduces a real-time magnetic resonance imaging database of emotional speech production from 10 speakers and presents articulatory analyses of emotional expression in speech using the database. Midsagittal vocal tract parameters (midsagittal distances and vocal tract length) were parameterized based on a two-dimensional grid-line system, using image segmentation software. The principal feature analysis technique was applied to the grid-line system in order to find the major movement locations. Results reveal both speaker-dependent and speaker-independent variation patterns. For example, sad speech, a low-arousal emotion, tends to show a smaller opening for low vowels in the front cavity than the high-arousal emotions do, and does so more consistently than in other regions of the vocal tract. Happiness shows a significantly shorter vocal tract length than anger and sadness in most speakers. Further details of speaker-dependent and speaker-independent articulatory variation in emotional expression, and their implications, are described.
20
Serrurier A, Badin P, Lamalle L, Neuschaefer-Rube C. Characterization of inter-speaker articulatory variability: A two-level multi-speaker modelling approach based on MRI data. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2019; 145:2149. [PMID: 31046321 DOI: 10.1121/1.5096631] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 05/04/2018] [Accepted: 03/14/2019] [Indexed: 06/09/2023]
Abstract
Speech communication relies on articulatory and acoustic codes shared between speakers and listeners despite inter-individual differences in morphology and idiosyncratic articulatory strategies. This study addresses the long-standing problem of characterizing and modelling speaker-independent articulatory strategies and inter-speaker articulatory variability. It explores a multi-speaker modelling approach based on two levels: statistically-based linear articulatory models, which capture the speaker-specific articulatory variability on the one hand, are in turn controlled by a speaker model, which captures the inter-speaker variability on the other hand. A low dimensionality speaker model is obtained by taking advantage of the inter-speaker correlations between morphology and strategy. To validate this approach, contours of the vocal tract articulators were manually segmented on midsagittal MRI data recorded from 11 French speakers uttering 62 vowels and consonants. Using these contours, multi-speaker models with 14 articulatory components and two morphology and strategy components led to overall variance explanations of 66%-69% and root-mean-square errors of 0.36-0.38 cm obtained in leave-one-out procedure over the speakers. Results suggest that inter-speaker variability is more related to the morphology than to the idiosyncratic strategies and illustrate the adaptation of the articulatory components to the morphology.
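The statistically-based linear articulatory models this abstract describes can be illustrated with a PCA-style sketch. The data layout (one row of flattened contour coordinates per item) and function names are assumptions for illustration, not the study's actual pipeline:

```python
import numpy as np

def fit_linear_articulatory_model(contours, n_components):
    """Fit a statistically-based linear model to flattened contour data.

    contours: (n_items, n_coords) array, one row per vowel/consonant item,
    each row the concatenated (x, y) coordinates of the segmented contours.
    Returns the mean shape, the principal articulatory components, the
    per-item control parameters, and the fraction of variance explained."""
    mean = contours.mean(axis=0)
    centered = contours - mean
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]      # basis deformation shapes
    scores = centered @ components.T    # control parameters per item
    var_explained = (s[:n_components] ** 2).sum() / (s ** 2).sum()
    return mean, components, scores, var_explained

def reconstruct(mean, components, scores):
    """Rebuild contours from the model: mean shape plus weighted components."""
    return mean + scores @ components
```

In the two-level scheme, a second model of the same form would then be fitted across speakers to the speaker-specific means and components, capturing the morphology and strategy dimensions.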
Affiliation(s)
- Antoine Serrurier
- Clinic for Phoniatrics, Pedaudiology & Communication Disorders, University Hospital and Medical Faculty of the RWTH Aachen University, Aachen, Germany
- Pierre Badin
- Université Grenoble Alpes, CNRS, Grenoble INP, GIPSA-lab, Grenoble, France
- Laurent Lamalle
- Inserm US 17-CNRS UMS 3552- Université Grenoble Alpes & CHU Grenoble Alpes, UMS IRMaGe, Grenoble, France
- Christiane Neuschaefer-Rube
- Clinic for Phoniatrics, Pedaudiology & Communication Disorders, University Hospital and Medical Faculty of the RWTH Aachen University, Aachen, Germany
21
Sorensen T, Toutios A, Goldstein L, Narayanan S. Task-dependence of articulator synergies. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2019; 145:1504. [PMID: 31067947 PMCID: PMC6910022 DOI: 10.1121/1.5093538] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 05/18/2018] [Revised: 02/15/2019] [Accepted: 02/19/2019] [Indexed: 06/09/2023]
Abstract
In speech production, the motor system organizes articulators such as the jaw, tongue, and lips into synergies whose function is to produce speech sounds by forming constrictions at the phonetic places of articulation. The present study tests whether synergies for different constriction tasks differ in terms of inter-articulator coordination. The test is conducted on utterances [ɑpɑ], [ɑtɑ], [ɑiɑ], and [ɑkɑ] with a real-time magnetic resonance imaging biomarker that is computed using a statistical model of the forward kinematics of the vocal tract. The present study is the first to estimate the forward kinematics of the vocal tract from speech production data. Using the imaging biomarker, the study finds that the jaw contributes least to the velar stop for [k], more to pharyngeal approximation for [ɑ], still more to palatal approximation for [i], and most to the coronal stop for [t]. Additionally, the jaw contributes more to the coronal stop for [t] than to the bilabial stop for [p]. Finally, the study investigates how this pattern of results varies by participant. The study identifies differences in inter-articulator coordination by constriction task, which support the claim that inter-articulator coordination differs depending on the active articulator synergy.
Affiliation(s)
- Tanner Sorensen
- Signal Analysis and Interpretation Laboratory, Ming Hsieh Department of Electrical Engineering, University of Southern California, Los Angeles, California 90089, USA
- Asterios Toutios
- Signal Analysis and Interpretation Laboratory, Ming Hsieh Department of Electrical Engineering, University of Southern California, Los Angeles, California 90089, USA
- Louis Goldstein
- Department of Linguistics, University of Southern California, Los Angeles, California 90089, USA
- Shrikanth Narayanan
- Signal Analysis and Interpretation Laboratory, Ming Hsieh Department of Electrical Engineering, University of Southern California, Los Angeles, California 90089, USA
22
Kim YC. Fast upper airway magnetic resonance imaging for assessment of speech production and sleep apnea. PRECISION AND FUTURE MEDICINE 2018. [DOI: 10.23838/pfm.2018.00100] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 02/02/2023]
23
Lin J, Xia J, Zhao HS, Hou R, Talukder M, Yu L, Guo JY, Li JL. Lycopene Triggers Nrf2-AMPK Cross Talk to Alleviate Atrazine-Induced Nephrotoxicity in Mice. JOURNAL OF AGRICULTURAL AND FOOD CHEMISTRY 2018; 66:12385-12394. [PMID: 30360616 DOI: 10.1021/acs.jafc.8b04341] [Citation(s) in RCA: 67] [Impact Index Per Article: 11.2] [Indexed: 06/08/2023]
Abstract
Atrazine (ATR), an environmentally persistent and bioaccumulative herbicide, has been associated with environmental nephrosis. Lycopene (LYC) exhibits important nephroprotective properties, but there are limited data on the specific underlying mechanism. The primary objective of this study was to explore the therapeutic effect of LYC on ATR-induced nephrotoxicity in mice. The mice were divided randomly into 6 groups and treated as follows: control group (C), 5 mg/kg LYC group (L), 50 mg/kg ATR group (A1), 200 mg/kg ATR group (A2), 50 mg/kg ATR plus 5 mg/kg LYC group (A1+L), and 200 mg/kg ATR plus 5 mg/kg LYC group (A2+L), by oral gavage administration for 21 days. We found that pretreatment with LYC significantly suppressed ATR-induced renal tubular epithelial cell swelling. Furthermore, LYC mitigated ATR-induced dysregulation of oxidative stress markers by reducing MDA and H2O2 levels and increasing SOD, GPx, and CAT concentrations and Nrf2 activation. Moreover, LYC activated autophagic flux, as shown by detectable changes in autophagy-related genes (Beclin-1 and ATGs) and proteins (p62/SQSTM1), by the formation of autophagic vacuoles (AV), and by LC3 aggregation, in parallel with AMPK activation (pAMPK/AMPK). ATR up-regulated nuclear factor erythroid 2-related factor 2 (Nrf2) expression and Nrf2-regulated redox genes, including NAD(P)H quinone oxidoreductase-1 (NQO1) and heme oxygenase-1 (HO-1), whereas LYC down-regulated these genes. In addition, LYC suppressed ATR-induced activation of autophagy (increased LC3II/LC3I, ATGs, Beclin-1, and p62, in parallel with increased AMPK activation). Collectively, our findings identify cross talk between AMPK-activated autophagy and the Nrf2 signaling pathway in LYC-mediated nephroprotection against ATR-induced toxicity in the mouse kidney.
Affiliation(s)
- Jia Lin
- College of Veterinary Medicine, Key Laboratory of the Provincial Education Department of Heilongjiang for Common Animal Disease Prevention and Treatment, and Heilongjiang Key Laboratory for Laboratory Animals and Comparative Medicine, Northeast Agricultural University, Harbin 150030, P.R. China
- Jun Xia
- College of Veterinary Medicine, Key Laboratory of the Provincial Education Department of Heilongjiang for Common Animal Disease Prevention and Treatment, and Heilongjiang Key Laboratory for Laboratory Animals and Comparative Medicine, Northeast Agricultural University, Harbin 150030, P.R. China
- Hua-Shan Zhao
- College of Veterinary Medicine, Key Laboratory of the Provincial Education Department of Heilongjiang for Common Animal Disease Prevention and Treatment, and Heilongjiang Key Laboratory for Laboratory Animals and Comparative Medicine, Northeast Agricultural University, Harbin 150030, P.R. China
- Rui Hou
- College of Veterinary Medicine, Key Laboratory of the Provincial Education Department of Heilongjiang for Common Animal Disease Prevention and Treatment, and Heilongjiang Key Laboratory for Laboratory Animals and Comparative Medicine, Northeast Agricultural University, Harbin 150030, P.R. China
- Milton Talukder
- College of Veterinary Medicine, Key Laboratory of the Provincial Education Department of Heilongjiang for Common Animal Disease Prevention and Treatment, and Heilongjiang Key Laboratory for Laboratory Animals and Comparative Medicine, Northeast Agricultural University, Harbin 150030, P.R. China
- Department of Physiology and Pharmacology, Faculty of Animal Science and Veterinary Medicine, Patuakhali Science and Technology University, Barishal 8210, Bangladesh
- Lei Yu
- College of Veterinary Medicine, Key Laboratory of the Provincial Education Department of Heilongjiang for Common Animal Disease Prevention and Treatment, and Heilongjiang Key Laboratory for Laboratory Animals and Comparative Medicine, Northeast Agricultural University, Harbin 150030, P.R. China
- Jian-Ying Guo
- College of Veterinary Medicine, Key Laboratory of the Provincial Education Department of Heilongjiang for Common Animal Disease Prevention and Treatment, and Heilongjiang Key Laboratory for Laboratory Animals and Comparative Medicine, Northeast Agricultural University, Harbin 150030, P.R. China
- Jin-Long Li
- College of Veterinary Medicine, Key Laboratory of the Provincial Education Department of Heilongjiang for Common Animal Disease Prevention and Treatment, and Heilongjiang Key Laboratory for Laboratory Animals and Comparative Medicine, Northeast Agricultural University, Harbin 150030, P.R. China
24
Ramanarayanan V, Tilsen S, Proctor M, Töger J, Goldstein L, Nayak KS, Narayanan S. Analysis of speech production real-time MRI. COMPUT SPEECH LANG 2018. [DOI: 10.1016/j.csl.2018.04.002] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Indexed: 12/27/2022]
25
Oh M, Lee Y. ACT: An Automatic Centroid Tracking tool for analyzing vocal tract actions in real-time magnetic resonance imaging speech production data. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2018; 144:EL290. [PMID: 30404513 PMCID: PMC6192793 DOI: 10.1121/1.5057367] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 07/10/2018] [Revised: 08/28/2018] [Accepted: 09/11/2018] [Indexed: 06/08/2023]
Abstract
Real-time magnetic resonance imaging (MRI) speech production data have expanded the understanding of vocal tract actions. This letter presents an Automatic Centroid Tracking tool, ACT, which obtains both spatial and temporal information characterizing multi-directional articulatory movement. ACT auto-segments an articulatory object composed of connected pixels in a real-time MRI video, by finding its intensity centroids over time and returns kinematic profiles including direction and magnitude information of the object. This letter discusses the utility of ACT, which outperforms other similar object tracking techniques, by demonstrating its successful online tracking of vertical larynx movement. ACT can be deployed generally for dynamic image processing and analysis.
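The intensity-centroid computation at the core of a tool like ACT might be sketched as below. This assumes pre-segmented per-frame object masks and hypothetical function names; ACT's automatic segmentation of connected pixels is not reproduced here:

```python
import numpy as np

def intensity_centroid(frame, mask):
    """Intensity-weighted centroid (row, col) of a segmented object.

    frame: 2D array of pixel intensities; mask: boolean array marking the
    connected pixels of the articulatory object in this frame."""
    weights = frame * mask
    total = weights.sum()
    rows, cols = np.indices(frame.shape)
    return (rows * weights).sum() / total, (cols * weights).sum() / total

def track_centroids(video, masks):
    """Per-frame centroid positions plus frame-to-frame displacement
    vectors, giving direction and magnitude of the object's movement."""
    positions = np.array([intensity_centroid(f, m) for f, m in zip(video, masks)])
    return positions, np.diff(positions, axis=0)
```

For vertical larynx tracking, the row component of the displacement vectors would give the kinematic profile of raising and lowering.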
Affiliation(s)
- Miran Oh
- Department of Linguistics, University of Southern California, Los Angeles, California 90089, USA
- Yoonjeong Lee
- Department of Linguistics, University of Southern California, Los Angeles, California 90089, USA
26
Abstract
Speech motor actions are performed quickly, while simultaneously maintaining a high degree of accuracy. Are speed and accuracy in conflict during speech production? Speed-accuracy tradeoffs have been shown in many domains of human motor action, but have not been directly examined in the domain of speech production. The present work seeks evidence for Fitts' law, a rigorous formulation of this fundamental tradeoff, in speech articulation kinematics by analyzing USC-TIMIT, a real-time magnetic resonance imaging data set of speech production. A theoretical framework for considering Fitts' law with respect to models of speech motor control is elucidated. Methodological challenges in seeking relationships consistent with Fitts' law are addressed, including the operational definitions and measurement of key variables in real-time MRI data. Results suggest the presence of speed-accuracy tradeoffs for certain types of speech production actions, with wide variability across syllable position, and substantial variability also across subjects. Coda consonant targets immediately following the syllabic nucleus show the strongest evidence of this tradeoff, with correlations as high as 0.72 between speed and accuracy. A discussion is provided concerning the potentially limited applicability of Fitts' law in the context of speech production, as well as the theoretical context for interpreting the results.
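Fitts' law, the tradeoff tested in this abstract, is compact enough to state directly. The sketch below uses the classic formulation MT = a + b · log2(2A/W); the coefficient values are illustrative defaults, not values fitted to the speech data discussed here:

```python
import math

def index_of_difficulty(amplitude, width):
    """Fitts' index of difficulty in bits: ID = log2(2A / W), where A is the
    movement amplitude and W the target width (the accuracy demand)."""
    return math.log2(2.0 * amplitude / width)

def movement_time(amplitude, width, a=0.05, b=0.1):
    """Fitts' law prediction MT = a + b * ID; a and b are empirically fitted
    constants (the values here are placeholders for illustration)."""
    return a + b * index_of_difficulty(amplitude, width)
```

Under this formulation, halving the target width W (demanding more accuracy) adds one bit of difficulty and hence a fixed increment b to the predicted movement time.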
27
Whalen DH, Chen WR, Tiede MK, Nam H. Variability of articulator positions and formants across nine English vowels. JOURNAL OF PHONETICS 2018; 68:1-14. [PMID: 30034052 PMCID: PMC6053058 DOI: 10.1016/j.wocn.2018.01.003] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 05/13/2023]
Abstract
Speech, though communicative, is quite variable both in articulation and acoustics, and it has often been claimed that articulation is more variable. Here we compared variability in articulation and acoustics for 32 speakers in the x-ray microbeam database (XRMB; Westbury, 1994). Variability in tongue, lip and jaw positions for nine English vowels (/u, ʊ, æ, ɑ, ʌ, ɔ, ε, ɪ, i/) was compared to that of the corresponding formant values. The domains were made comparable by creating three-dimensional spaces for each: the first three principal components from an analysis of a 14-dimensional space for articulation, and an F1xF2xF3 space for acoustics. More variability occurred in the articulation than the acoustics for half of the speakers, while the reverse was true for the other half. Individual tokens were further from the articulatory median than the acoustic median for 40-60% of tokens across speakers. A separate analysis of three non-low front vowels (/ε, ɪ, i/, for which the XRMB system provides the most direct articulatory evidence) did not differ from the omnibus analysis. Speakers tended to be either more or less variable consistently across vowels. Across speakers, there was a positive correlation between articulatory and acoustic variability, both for all vowels and for just the three non-low front vowels. Although the XRMB is an incomplete representation of articulation, it nonetheless provides data for direct comparisons between articulatory and acoustic variability that have not been reported previously. The results indicate that articulation is not more variable than acoustics, that speakers had relatively consistent variability across vowels, and that articulatory and acoustic variability were related for the vowels themselves.
Affiliation(s)
- D H Whalen
- Haskins Laboratories
- City University of New York
- Yale University
28

29
Pattem AK, Illa A, Afshan A, Ghosh PK. Optimal sensor placement in electromagnetic articulography recording for speech production study. COMPUT SPEECH LANG 2018. [DOI: 10.1016/j.csl.2017.07.008] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Indexed: 11/17/2022]
30
Töger J, Sorensen T, Somandepalli K, Toutios A, Lingala SG, Narayanan S, Nayak K. Test-retest repeatability of human speech biomarkers from static and real-time dynamic magnetic resonance imaging. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2017; 141:3323. [PMID: 28599561 PMCID: PMC5436977 DOI: 10.1121/1.4983081] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 05/06/2023]
Abstract
Static anatomical and real-time dynamic magnetic resonance imaging (RT-MRI) of the upper airway is a valuable method for studying speech production in research and clinical settings. The test-retest repeatability of quantitative imaging biomarkers is an important parameter, since it limits the effect sizes and intragroup differences that can be studied. Therefore, this study aims to present a framework for determining the test-retest repeatability of quantitative speech biomarkers from static MRI and RT-MRI, and apply the framework to healthy volunteers. Subjects (n = 8, 4 females, 4 males) are imaged in two scans on the same day, including static images and dynamic RT-MRI of speech tasks. The inter-study agreement is quantified using intraclass correlation coefficient (ICC) and mean within-subject standard deviation (σe). Inter-study agreement is strong to very strong for static measures (ICC: min/median/max 0.71/0.89/0.98, σe: 0.90/2.20/6.72 mm), poor to strong for dynamic RT-MRI measures of articulator motion range (ICC: 0.26/0.75/0.90, σe: 1.6/2.5/3.6 mm), and poor to very strong for velocities (ICC: 0.21/0.56/0.93, σe: 2.2/4.4/16.7 cm/s). In conclusion, this study characterizes repeatability of static and dynamic MRI-derived speech biomarkers using state-of-the-art imaging. The introduced framework can be used to guide future development of speech biomarkers. Test-retest MRI data are provided free for research use.
Affiliation(s)
- Johannes Töger
- Ming Hsieh Department of Electrical Engineering, University of Southern California, 3740 McClintock Avenue, EEB 400, Los Angeles, California 90089-2560, USA
- Tanner Sorensen
- Ming Hsieh Department of Electrical Engineering, University of Southern California, 3740 McClintock Avenue, EEB 400, Los Angeles, California 90089-2560, USA
- Krishna Somandepalli
- Ming Hsieh Department of Electrical Engineering, University of Southern California, 3740 McClintock Avenue, EEB 400, Los Angeles, California 90089-2560, USA
- Asterios Toutios
- Ming Hsieh Department of Electrical Engineering, University of Southern California, 3740 McClintock Avenue, EEB 400, Los Angeles, California 90089-2560, USA
- Sajan Goud Lingala
- Ming Hsieh Department of Electrical Engineering, University of Southern California, 3740 McClintock Avenue, EEB 400, Los Angeles, California 90089-2560, USA
- Shrikanth Narayanan
- Ming Hsieh Department of Electrical Engineering, University of Southern California, 3740 McClintock Avenue, EEB 400, Los Angeles, California 90089-2560, USA
- Krishna Nayak
- Ming Hsieh Department of Electrical Engineering, University of Southern California, 3740 McClintock Avenue, EEB 400, Los Angeles, California 90089-2560, USA
31
Poddar S, Jacob M. Dynamic MRI Using SmooThness Regularization on Manifolds (SToRM). IEEE TRANSACTIONS ON MEDICAL IMAGING 2016; 35:1106-15. [PMID: 26685228 PMCID: PMC5334465 DOI: 10.1109/tmi.2015.2509245] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.0] [Indexed: 05/25/2023]
Abstract
We introduce a novel algorithm to recover real-time dynamic MR images from highly under-sampled k-t space measurements. The proposed scheme models the images in the dynamic dataset as points on a smooth, low-dimensional manifold in high-dimensional space. We propose to exploit the non-linear and non-local redundancies in the dataset by posing its recovery as a manifold-smoothness regularized optimization problem. A navigator acquisition scheme is used to determine the structure of the manifold, or equivalently the associated graph Laplacian matrix. The estimated Laplacian matrix is then used to recover the dataset from undersampled measurements. The utility of the proposed scheme is demonstrated by comparisons with state-of-the-art methods in multi-slice real-time cardiac and speech imaging applications.
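The manifold-smoothness idea can be shown in a toy closed-form version. This assumes fully sampled noisy frames and a known similarity matrix W; the real SToRM scheme recovers undersampled k-t data with a measurement operator, which this sketch does not attempt:

```python
import numpy as np

def graph_laplacian(weights):
    """Graph Laplacian L = D - W from a symmetric frame-similarity matrix W
    (in SToRM, W is estimated from navigator acquisitions)."""
    return np.diag(weights.sum(axis=1)) - weights

def storm_denoise(frames, weights, lam=1.0):
    """Toy manifold-smoothness recovery: minimize over X
        ||X - Y||_F^2 + lam * tr(X L X^T),
    where Y holds noisy frames as columns (pixels x frames). Frames that are
    neighbours on the manifold (large W_ij) are pulled toward each other.
    Closed form: X = Y (I + lam * L)^(-1)."""
    lap = graph_laplacian(weights)
    n = weights.shape[0]
    return frames @ np.linalg.inv(np.eye(n) + lam * lap)
```

Because L has zero row sums, the regularization redistributes intensity between similar frames without changing their overall sum, and larger `lam` pulls neighbouring frames closer together.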
32
Toutios A, Narayanan SS. Advances in real-time magnetic resonance imaging of the vocal tract for speech science and technology research. APSIPA TRANSACTIONS ON SIGNAL AND INFORMATION PROCESSING 2016; 5:e6. [PMID: 27833745 PMCID: PMC5100697 DOI: 10.1017/atsip.2016.5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Indexed: 06/06/2023]
Abstract
Real-time magnetic resonance imaging (rtMRI) of the moving vocal tract during running speech production is an important emerging tool for speech production research, providing dynamic information about a speaker's upper airway from the entire midsagittal plane or any other scan plane of interest. There have been several advances in the development of speech rtMRI and corresponding analysis tools, and in their application to domains such as phonetics and phonological theory, articulatory modeling, and speaker characterization. An important recent development has been the open release of a database that includes speech rtMRI data from five male and five female speakers of American English, each producing 460 phonetically balanced sentences. The purpose of the present paper is to give an overview of, and outlook on, the advances in rtMRI as a tool for speech research and technology development.
Affiliation(s)
- Asterios Toutios
- Signal Analysis and Interpretation Laboratory (SAIL), University of Southern California (USC), 3740 McClintock Avenue, Los Angeles, CA 90089, USA
- Shrikanth S Narayanan
- Signal Analysis and Interpretation Laboratory (SAIL), University of Southern California (USC), 3740 McClintock Avenue, Los Angeles, CA 90089, USA
33
34
Lingala SG, Zhu Y, Kim YC, Toutios A, Narayanan S, Nayak KS. A fast and flexible MRI system for the study of dynamic vocal tract shaping. Magn Reson Med 2016; 77:112-125. [PMID: 26778178 DOI: 10.1002/mrm.26090] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/24/2015] [Revised: 11/06/2015] [Accepted: 11/24/2015] [Indexed: 11/07/2022]
Abstract
PURPOSE The aim of this work was to develop and evaluate an MRI-based system for the study of dynamic vocal tract shaping during speech production that provides high spatial and temporal resolution. METHODS The proposed system utilizes (a) custom eight-channel upper-airway coils with high sensitivity to upper-airway regions of interest, (b) two-dimensional golden-angle spiral gradient-echo acquisition, (c) on-the-fly view-sharing reconstruction, and (d) off-line temporal finite-difference constrained reconstruction. The system also provides simultaneous, noise-cancelled, temporally aligned audio. The system was evaluated in three healthy volunteers and one tongue cancer patient with a broad range of speech tasks. RESULTS We report spatiotemporal resolutions of 2.4 × 2.4 mm² every 12 ms for single-slice imaging and 2.4 × 2.4 mm² every 36 ms for three-slice imaging, which reflects roughly 7-fold acceleration over Nyquist sampling. The system demonstrates improved temporal fidelity in capturing rapid vocal tract shaping for tasks such as producing consonant clusters and beat-boxing sounds. Novel acoustic-articulatory analysis was also demonstrated. CONCLUSION A synergistic combination of custom coils, spiral acquisitions, and constrained reconstruction enables visualization of rapid speech with high spatiotemporal resolution in multiple planes. Magn Reson Med 77:112-125, 2017. © 2016 Wiley Periodicals, Inc.
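The on-the-fly view-sharing reconstruction mentioned in the abstract can be sketched with a toy model: each repetition acquires one interleave of k-space, and a frame is formed from the most recent copy of every interleave. This sketch uses cycling Cartesian-style interleaves and integer placeholder data purely for illustration (the paper uses golden-angle spirals and gridded k-space; all names and numbers below are hypothetical):

```python
import numpy as np

n_interleaves = 8          # interleaves needed for one fully sampled frame
n_trs = 32                 # total repetitions acquired
samples_per_leaf = 4

# Each TR acquires one interleave, cycling 0, 1, ..., 7, 0, 1, ...
# The "data" is just the TR index, so freshness is easy to inspect.
acquired = [(tr % n_interleaves, np.full(samples_per_leaf, tr, float))
            for tr in range(n_trs)]

def view_share(tr_index):
    """Frame at tr_index: the most recent copy of every interleave seen so far."""
    frame = {}
    for leaf, data in acquired[: tr_index + 1]:
        frame[leaf] = data          # later TRs overwrite older copies
    return frame

frame = view_share(20)
# Which TR contributed each interleave of this frame:
newest_tr = {leaf: int(d[0]) for leaf, d in frame.items()}
```

The frame at TR 20 draws its eight interleaves from TRs 13-20, i.e. the temporal footprint equals one full set of interleaves, which is the trade-off view-sharing makes between frame rate and true temporal resolution.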
Affiliation(s)
- Sajan Goud Lingala
- Electrical Engineering, University of Southern California, Los Angeles, CA
- Yinghua Zhu
- Electrical Engineering, University of Southern California, Los Angeles, CA
- Asterios Toutios
- Electrical Engineering, University of Southern California, Los Angeles, CA
- Krishna S Nayak
- Electrical Engineering, University of Southern California, Los Angeles, CA
35
Li M, Kim J, Lammert A, Ghosh PK, Ramanarayanan V, Narayanan S. Speaker verification based on the fusion of speech acoustics and inverted articulatory signals. COMPUT SPEECH LANG 2015; 36:196-211. [PMID: 28496292 DOI: 10.1016/j.csl.2015.05.003] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/23/2022]
Abstract
We propose a practical feature-level and score-level fusion approach that combines acoustic and estimated articulatory information for both text-independent and text-dependent speaker verification. From a practical point of view, we study how to improve speaker verification performance by combining dynamic articulatory information with conventional acoustic features. For text-independent speaker verification, we find that concatenating articulatory features obtained from measured speech production data with conventional Mel-frequency cepstral coefficients (MFCCs) improves performance dramatically. However, since directly measuring articulatory data is not feasible in many real-world applications, we also experiment with estimated articulatory features obtained through acoustic-to-articulatory inversion. We explore both feature-level and score-level fusion methods and find that overall system performance is significantly enhanced even with estimated articulatory features. Such a performance boost could be due to the inter-speaker variation information embedded in the estimated articulatory features. Since the dynamics of articulation contain important information, we also include inverted articulatory trajectories in text-dependent speaker verification. We demonstrate that the articulatory constraints introduced by inverted articulatory features help to reject wrong-password trials and improve performance after score-level fusion. We evaluate the proposed methods on the X-ray Microbeam database and the RSR 2015 database, respectively, for the two tasks. Experimental results show more than 15% relative equal error rate reduction for both speaker verification tasks.
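The two fusion levels the abstract contrasts differ only in where the streams are combined. A minimal sketch, with synthetic vectors and cosine scoring standing in for MFCC/articulatory features and the paper's verification back-ends (the fusion weight alpha is a hypothetical choice, not a value from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)

def cosine(a, b):
    """Cosine similarity as a stand-in verification score."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Synthetic enrollment vectors for the "acoustic" and "articulatory" streams.
enroll_ac, enroll_ar = rng.normal(size=20), rng.normal(size=8)
# A same-speaker test trial: enrollment plus a little noise.
test_ac = enroll_ac + 0.1 * rng.normal(size=20)
test_ar = enroll_ar + 0.1 * rng.normal(size=8)

# Feature-level fusion: concatenate the streams, then score once.
feat_score = cosine(np.concatenate([enroll_ac, enroll_ar]),
                    np.concatenate([test_ac, test_ar]))

# Score-level fusion: score each stream separately, then combine with weight alpha.
alpha = 0.7
score_fused = (alpha * cosine(enroll_ac, test_ac)
               + (1 - alpha) * cosine(enroll_ar, test_ar))
```

Feature-level fusion lets the back-end model cross-stream correlations; score-level fusion keeps the streams' models independent and only needs a combination weight, which is why both are worth exploring when one stream (here, inverted articulation) is noisier than the other.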
Affiliation(s)
- Ming Li
- Sun Yat-Sen University-Carnegie Mellon University Joint Institute of Engineering, Sun Yat-Sen University, China; Sun Yat-Sen University-Carnegie Mellon University Shunde International Joint Research Institute, Shunde, China; School of Mobile Information Engineering, Sun Yat-Sen University, China
- Jangwon Kim
- Signal Analysis and Interpretation Laboratory, University of Southern California, Los Angeles, USA
- Adam Lammert
- Signal Analysis and Interpretation Laboratory, University of Southern California, Los Angeles, USA
- Prasanta Kumar Ghosh
- Department of Electrical Engineering, Indian Institute of Science (IISc), Bangalore, India
- Vikram Ramanarayanan
- Signal Analysis and Interpretation Laboratory, University of Southern California, Los Angeles, USA
- Shrikanth Narayanan
- Signal Analysis and Interpretation Laboratory, University of Southern California, Los Angeles, USA
36
Gibert G, Olsen KN, Leung Y, Stevens CJ. Transforming an embodied conversational agent into an efficient talking head: from keyframe-based animation to multimodal concatenation synthesis. COMPUTATIONAL COGNITIVE SCIENCE 2015; 1:7. [PMID: 27980889 PMCID: PMC5125409 DOI: 10.1186/s40469-015-0007-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 05/22/2015] [Accepted: 08/30/2015] [Indexed: 12/04/2022]
Abstract
Background Virtual humans have become part of our everyday life (movies, the internet, and computer games). Even though they are becoming more and more realistic, their speech capabilities are usually limited, and are often incoherent and/or asynchronous with the corresponding acoustic signal. Methods We describe a method to convert a virtual human avatar (animated through key frames and interpolation) into a more naturalistic talking head. Speech articulation cannot be accurately replicated by interpolation between key frames; talking heads with good speech capabilities are instead derived from real speech production data. Motion capture data are commonly used to provide accurate facial motion for the visible speech articulators (jaw and lips) synchronous with acoustics. To access tongue trajectories (a partially occluded speech articulator), electromagnetic articulography (EMA) is often used. We recorded a large database of phonetically balanced English sentences with synchronous EMA, motion capture data, and acoustics. An articulatory model was computed on this database to recover missing data and to provide ‘normalized’ animation (i.e., articulatory) parameters. In addition, semi-automatic segmentation was performed on the acoustic stream. A dictionary of multimodal Australian English diphones was created; it is composed of the variation of the articulatory parameters between all successive stable allophones. Results The avatar’s facial key frames were converted into articulatory parameters steering its speech articulators (jaw, lips, and tongue). The speech production database was used to drive the Embodied Conversational Agent (ECA) and to enhance its speech capabilities. A Text-To-Auditory Visual Speech synthesizer was created based on the MaryTTS software and on the diphone dictionary derived from the speech production database.
Conclusions We describe a method to transform an ECA with a generic tongue model and key-frame animation into a talking head that displays naturalistic tongue, jaw, and lip motions. Thanks to a multimodal speech production database, a Text-To-Auditory Visual Speech synthesizer drives the ECA’s facial movements, enhancing its speech capabilities.
Affiliation(s)
- Guillaume Gibert
- The MARCS Institute, University of Western Sydney, Locked Bag 1797, Penrith, NSW 2751, Australia; INSERM U846, 18 avenue Doyen Lépine, 69500 Bron, France; Stem Cell and Brain Research Institute, 69500 Bron, France; Université de Lyon, Université Lyon 1, 69003 Lyon, France
- Kirk N Olsen
- The MARCS Institute, University of Western Sydney, Locked Bag 1797, Penrith, NSW 2751, Australia
- Yvonne Leung
- The MARCS Institute, University of Western Sydney, Locked Bag 1797, Penrith, NSW 2751, Australia
- Catherine J Stevens
- The MARCS Institute, University of Western Sydney, Locked Bag 1797, Penrith, NSW 2751, Australia