1
Perry JL, Snodgrass TD, Gilbert IR, Sutton BP, Baylis AL, Weidler EM, Tse RW, Ishman SL, Sitzman TJ. Establishing a Clinical Protocol for Velopharyngeal MRI and Interpreting Imaging Findings. Cleft Palate Craniofac J 2024; 61:748-758. [PMID: 36448363 PMCID: PMC10243551 DOI: 10.1177/10556656221141188] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Indexed: 12/02/2022]
Abstract
Traditional imaging modalities used to assess velopharyngeal insufficiency (VPI) do not allow for direct visualization of the underlying velopharyngeal (VP) structures and musculature, which could impact surgical planning. This limitation can be overcome via structural magnetic resonance imaging (MRI), the only current imaging tool that provides direct visualization of salient VP structures. MRI has been used extensively in research; however, it has had limited clinical use. Factors that restrict clinical use of VP MRI include limited access to optimized VP MRI protocols and uncertainty regarding how to interpret VP MRI findings. The purpose of this paper is to outline a framework for establishing a novel VP MRI scan protocol and to detail the process of interpreting scans of the velopharynx at rest and during speech tasks. Additionally, this paper includes common scan parameters needed to allow for visualization of the velopharynx and techniques for the elicitation of speech during scans.
Affiliation(s)
- Jamie L Perry
  - Department of Communication Sciences and Disorders, East Carolina University, Greenville, NC, USA
- Taylor D Snodgrass
  - Department of Communication Sciences and Disorders, East Carolina University, Greenville, NC, USA
- Imani R Gilbert
  - Department of Communication Sciences and Disorders, East Carolina University, Greenville, NC, USA
- Bradley P Sutton
  - Bioengineering Department, University of Illinois at Urbana Champaign, Urbana, IL, USA
- Adriane L Baylis
  - Department of Plastic and Reconstructive Surgery, Nationwide Children's Hospital and The Ohio State University College of Medicine, Columbus, OH, USA
- Erica M Weidler
  - Division of Plastic Surgery, Phoenix Children's Hospital, Phoenix, AZ, USA
- Raymond W Tse
  - Division of Craniofacial and Plastic Surgery, Department of Surgery, Seattle Children's Hospital, University of Washington, Seattle, WA, USA
- Stacey L Ishman
  - Division of HealthVine, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Thomas J Sitzman
  - Division of Plastic Surgery, Phoenix Children's Hospital, Phoenix, AZ, USA
2
Rusho RZ, Ahmed AH, Kruger S, Alam W, Meyer D, Howard D, Story B, Jacob M, Lingala SG. Prospectively accelerated dynamic speech magnetic resonance imaging at 3 T using a self-navigated spiral-based manifold regularized scheme. NMR IN BIOMEDICINE 2024:e5135. [PMID: 38440911 DOI: 10.1002/nbm.5135] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2023] [Revised: 02/05/2024] [Accepted: 02/06/2024] [Indexed: 03/06/2024]
Abstract
This work develops and evaluates a self-navigated variable density spiral (VDS)-based manifold regularization scheme to prospectively improve dynamic speech magnetic resonance imaging (MRI) at 3 T. Short readout duration spirals (1.3-ms long) were used to minimize sensitivity to off-resonance. A custom 16-channel speech coil was used for improved parallel imaging of vocal tract structures. The manifold model leveraged similarities between frames sharing similar vocal tract postures without explicit motion binning. The self-navigating capability of VDS was leveraged to learn the Laplacian structure of the manifold. Reconstruction was posed as a sensitivity-encoding-based nonlocal soft-weighted temporal regularization scheme. Our approach was compared with view-sharing, low-rank, temporal finite difference, and extra dimension-based sparsity reconstruction constraints. Undersampling experiments were conducted on five volunteers performing repetitive and arbitrary speaking tasks at different speaking rates. Quantitative evaluation in terms of mean square error over moving edges was performed in a retrospective undersampling experiment on one volunteer. For prospective undersampling, blinded image quality evaluation in the categories of alias artifacts, spatial blurring, and temporal blurring was performed by three experts in voice research. Region of interest analysis at articulator boundaries was performed in both experiments to assess articulatory motion. Improved performance with manifold reconstruction constraints was observed over existing constraints. With prospective undersampling, a spatial resolution of 2.4 × 2.4 mm²/pixel and a temporal resolution of 17.4 ms/frame for single-slice imaging, and 52.2 ms/frame for concurrent three-slice imaging, were achieved. We demonstrated implicit motion binning by analyzing the mechanics of the Laplacian matrix.
Manifold regularization demonstrated superior image quality scores in reducing spatial and temporal blurring compared with all other reconstruction constraints. While it exhibited faint (nonsignificant) alias artifacts that were similar to temporal finite difference, it provided statistically significant improvements compared with the other constraints. In conclusion, the self-navigated manifold regularized scheme enabled robust high spatiotemporal resolution dynamic speech MRI at 3 T.
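The abstract above refers to learning the Laplacian structure of a manifold from navigator data, so that frames with similar vocal tract postures are softly linked. A minimal sketch of that general idea follows. It assumes a Gaussian similarity kernel and a `sigma` width, neither of which is stated in the abstract, and the function name `graph_laplacian` and the sample navigator values are hypothetical. It is not the authors' reconstruction code.

```python
import math

def graph_laplacian(navigators, sigma=1.0):
    """Return (W, L) with L = D - W for a list of per-frame navigator vectors.

    W[i][j] = exp(-||x_i - x_j||^2 / sigma^2) for i != j (assumed Gaussian kernel),
    W[i][i] = 0, and D is the diagonal matrix of row sums of W.
    """
    n = len(navigators)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d2 = sum((a - b) ** 2 for a, b in zip(navigators[i], navigators[j]))
            W[i][j] = W[j][i] = math.exp(-d2 / sigma ** 2)
    # Off-diagonal entries are -W; the diagonal holds each frame's total weight.
    L = [[-W[i][j] for j in range(n)] for i in range(n)]
    for i in range(n):
        L[i][i] = sum(W[i])
    return W, L

# Hypothetical navigator samples for four frames:
# frames 0 and 2 share one posture, frames 1 and 3 share another.
frames = [[0.0, 0.1], [1.0, 0.9], [0.05, 0.1], [1.0, 1.0]]
W, L = graph_laplacian(frames, sigma=0.5)
print(round(W[0][2], 3), round(W[0][1], 3))
```

Frames with similar navigators receive weights near 1 and dissimilar frames receive weights near 0, which is one way to picture binning that happens implicitly through the weights rather than through explicit sorting of frames.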
Affiliation(s)
- Rushdi Zahid Rusho
  - Roy J. Carver Department of Biomedical Engineering, University of Iowa, Iowa City, Iowa, USA
- Abdul Haseeb Ahmed
  - Department of Electrical and Computer Engineering, University of Iowa, Iowa City, Iowa, USA
- Stanley Kruger
  - Department of Radiology, University of Iowa, Iowa City, Iowa, USA
- Wahidul Alam
  - Roy J. Carver Department of Biomedical Engineering, University of Iowa, Iowa City, Iowa, USA
- David Meyer
  - Janette Ogg Voice Research Center, Shenandoah University, Winchester, Virginia, USA
- David Howard
  - Department of Electronic Engineering, Royal Holloway, University of London, London, UK
- Brad Story
  - Department of Speech, Language, and Hearing Sciences, University of Arizona, Tucson, Arizona, USA
- Mathews Jacob
  - Department of Electrical and Computer Engineering, University of Iowa, Iowa City, Iowa, USA
- Sajan Goud Lingala
  - Roy J. Carver Department of Biomedical Engineering, University of Iowa, Iowa City, Iowa, USA
  - Department of Radiology, University of Iowa, Iowa City, Iowa, USA
3
Belyk M, Carignan C, McGettigan C. An open-source toolbox for measuring vocal tract shape from real-time magnetic resonance images. Behav Res Methods 2024; 56:2623-2635. [PMID: 37507650 PMCID: PMC10990993 DOI: 10.3758/s13428-023-02171-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 06/14/2023] [Indexed: 07/30/2023]
Abstract
Real-time magnetic resonance imaging (rtMRI) is a technique that provides high-contrast videographic data of human anatomy in motion. Applied to the vocal tract, it is a powerful method for capturing the dynamics of speech and other vocal behaviours by imaging structures internal to the mouth and throat. These images provide a means of studying the physiological basis for speech, singing, expressions of emotion, and swallowing that are otherwise not accessible for external observation. However, taking quantitative measurements from these images is notoriously difficult. We introduce a signal processing pipeline that produces outlines of the vocal tract from the lips to the larynx as a quantification of the dynamic morphology of the vocal tract. Our approach performs simple tissue classification, but constrained to a researcher-specified region of interest. This combination facilitates feature extraction while retaining the domain-specific expertise of a human analyst. We demonstrate that this pipeline generalises well across datasets covering behaviours such as speech, vocal size exaggeration, laughter, and whistling, as well as producing reliable outcomes across analysts, particularly among users with domain-specific expertise. With this article, we make this pipeline available for immediate use by the research community, and further suggest that it may contribute to the continued development of fully automated methods based on deep learning algorithms.
Affiliation(s)
- Michel Belyk
  - Department of Psychology, Edge Hill University, Ormskirk, UK
- Christopher Carignan
  - Department of Speech Hearing and Phonetic Sciences, University College London, London, UK
- Carolyn McGettigan
  - Department of Speech Hearing and Phonetic Sciences, University College London, London, UK
4
Jin R, Li Y, Shosted RK, Xing F, Gilbert I, Perry JL, Woo J, Liang ZP, Sutton BP. Optimization of 3D dynamic speech MRI: Poisson-disc undersampling and locally higher-rank reconstruction through partial separability model with regional optimized temporal basis. Magn Reson Med 2024; 91:61-74. [PMID: 37677043 PMCID: PMC10847962 DOI: 10.1002/mrm.29812] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2023] [Revised: 07/09/2023] [Accepted: 07/12/2023] [Indexed: 09/09/2023]
Abstract
PURPOSE: To improve the spatiotemporal qualities of images and dynamics of speech MRI through an improved data sampling and image reconstruction approach. METHODS: For data acquisition, we used a Poisson-disc random undersampling scheme that reduced the undersampling coherence. For image reconstruction, we proposed a novel locally higher-rank partial separability model. This reconstruction model represented the oral and static regions using separate low-rank subspaces, thereby preserving their distinct temporal signal characteristics. The regional optimized temporal basis was determined from the regional-optimized virtual coil approach. Overall, we achieved a better spatiotemporal image reconstruction quality with the potential of reducing total acquisition time by 50%. RESULTS: The proposed method was demonstrated through several 2-mm isotropic, 64 mm total thickness, dynamic acquisitions with 40 frames per second and compared to the previous approach using a global subspace model along with other k-space sampling patterns. Individual timeframe images and temporal profiles of speech samples were shown to illustrate the ability of the Poisson-disc undersampling pattern to reduce total acquisition time. Temporal information in the sagittal and coronal directions was also shown to illustrate the effectiveness of the locally higher-rank operator and regional optimized temporal basis. To compare the reconstruction qualities of different regions, voxel-wise temporal SNR analyses were performed. CONCLUSION: Poisson-disc sampling combined with a locally higher-rank model and a regional-optimized temporal basis can drastically improve the spatiotemporal image quality and provide a 50% reduction in overall acquisition time.
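Poisson-disc undersampling, as named in the abstract above, selects sample locations at random while keeping every pair of samples at least a minimum distance apart. The sketch below illustrates that property with naive dart throwing on a hypothetical 32 × 32 phase-encode grid. The function name `poisson_disc_mask`, the grid size, the minimum distance, and the trial count are all assumptions made for illustration, and the authors' actual sampling pattern may differ (for example in density weighting).

```python
import math
import random

def poisson_disc_mask(ny, nz, min_dist, n_trials=2000, seed=0):
    """Pick (ky, kz) grid locations so that no two are closer than min_dist.

    Naive dart throwing: draw a random grid point and keep it only if it is
    at least min_dist away from every point accepted so far.
    """
    rng = random.Random(seed)
    picked = []
    for _ in range(n_trials):
        cand = (rng.randrange(ny), rng.randrange(nz))
        if all(math.dist(cand, p) >= min_dist for p in picked):
            picked.append(cand)
    return picked

samples = poisson_disc_mask(ny=32, nz=32, min_dist=3.0)
fraction = len(samples) / (32 * 32)
print(f"{len(samples)} of {32 * 32} locations sampled ({fraction:.1%})")
```

The minimum-distance constraint spreads samples more evenly than purely uniform random selection, which is the usual motivation for this family of patterns. Dart throwing is simple but slow for large grids, and faster constructions exist.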
Affiliation(s)
- Riwei Jin
  - Department of Bioengineering, University of Illinois Urbana-Champaign, Urbana, Illinois, USA
  - Beckman Institute for Advanced Science and Technology, University of Illinois Urbana-Champaign, Urbana, Illinois, USA
- Yudu Li
  - Beckman Institute for Advanced Science and Technology, University of Illinois Urbana-Champaign, Urbana, Illinois, USA
  - National Center for Supercomputing Applications, University of Illinois Urbana-Champaign, Urbana, Illinois, USA
- Ryan K Shosted
  - Department of Linguistics, University of Illinois Urbana-Champaign, Urbana, Illinois, USA
- Fangxu Xing
  - Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital/Harvard Medical School, Boston, Massachusetts, USA
- Imani Gilbert
  - Department of Communication Sciences and Disorders, East Carolina University, Greenville, North Carolina, USA
- Jamie L Perry
  - Department of Communication Sciences and Disorders, East Carolina University, Greenville, North Carolina, USA
- Jonghye Woo
  - Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital/Harvard Medical School, Boston, Massachusetts, USA
- Zhi-Pei Liang
  - Beckman Institute for Advanced Science and Technology, University of Illinois Urbana-Champaign, Urbana, Illinois, USA
  - National Center for Supercomputing Applications, University of Illinois Urbana-Champaign, Urbana, Illinois, USA
  - Department of Electrical and Computer Engineering, University of Illinois Urbana-Champaign, Urbana, Illinois, USA
- Bradley P Sutton
  - Department of Bioengineering, University of Illinois Urbana-Champaign, Urbana, Illinois, USA
  - Beckman Institute for Advanced Science and Technology, University of Illinois Urbana-Champaign, Urbana, Illinois, USA
  - National Center for Supercomputing Applications, University of Illinois Urbana-Champaign, Urbana, Illinois, USA
  - Carle Illinois College of Medicine, University of Illinois Urbana-Champaign, Urbana, Illinois, USA
5
Kuwabara MS, Sitzman TJ, Szymanski KA, Perry JL, Miller JH, Cornejo P. The Pediatric Neuroradiologist's Practical Guide to Capture and Evaluate Pre- and Postoperative Velopharyngeal Insufficiency. AJNR Am J Neuroradiol 2023; 45:9-15. [PMID: 38164545 PMCID: PMC10756579 DOI: 10.3174/ajnr.a8055] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2023] [Accepted: 10/03/2023] [Indexed: 01/03/2024]
Abstract
Up to 30% of children with cleft palate will develop a severe speech disorder known as velopharyngeal insufficiency. Management of velopharyngeal insufficiency typically involves structural and functional assessment of the velum and pharynx by endoscopy and/or videofluoroscopy. These methods cannot provide direct evaluation of underlying velopharyngeal musculature. MR imaging offers an ideal imaging method, providing noninvasive, high-contrast, high-resolution imaging of soft-tissue anatomy. Furthermore, focused-speech MR imaging techniques can evaluate the function of the velum and pharynx during sustained speech production, providing critical physiologic information that supplements anatomic findings. The use of MR imaging for velopharyngeal evaluation is relatively novel, with limited literature describing its use in clinical radiology. Here we provide a practical approach to perform and interpret velopharyngeal MR imaging examinations. This article discusses the velopharyngeal MR imaging protocol, methods for interpreting velopharyngeal anatomy, and examples illustrating its clinical applications. This knowledge will provide radiologists with a new, noninvasive tool to offer to referring specialists.
Affiliation(s)
- Michael S Kuwabara
  - Radiology Department, Phoenix Children's Hospital, Phoenix, Arizona
- Thomas J Sitzman
  - Plastic Surgery Division, Phoenix Children's Hospital, Phoenix, Arizona
- Kathryn A Szymanski
  - Creighton University School of Medicine, Phoenix Regional Campus, Phoenix, Arizona
- Jamie L Perry
  - Department of Communication Sciences and Disorders, East Carolina University, Greenville, North Carolina
- Jeffrey H Miller
  - Radiology Department, Phoenix Children's Hospital, Phoenix, Arizona
- Patricia Cornejo
  - Radiology Department, Phoenix Children's Hospital, Phoenix, Arizona
6
Pitkanen VV, Geneid A, Saarikko AM, Hakli S, Alaluusua SA. Diagnosing and Managing Velopharyngeal Insufficiency in Patients With Cleft Palate After Primary Palatoplasty. J Craniofac Surg 2023:00001665-990000000-01192. [PMID: 37955448 DOI: 10.1097/scs.0000000000009822] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2023] [Accepted: 09/06/2023] [Indexed: 11/14/2023]
Abstract
Velopharyngeal insufficiency (VPI) after palatoplasty is caused by improper anatomy preventing velopharyngeal closure and manifests as hypernasal resonance, audible nasal emissions, weak pressure consonants, compensatory articulation, reduced speech loudness, and nostril or facial grimacing. A multidisciplinary team using multimodal instruments (speech analysis, nasoendoscopy, videofluoroscopy, nasometry, and magnetic resonance imaging) to evaluate velopharyngeal function should manage these patients. Careful monitoring of velopharyngeal function by a speech pathologist remains paramount for early identification of VPI, and the perceptual assessment should follow a standardized protocol. The greatest methodological problem in cleft lip and palate (CLP) studies has been the use of highly variable speech samples, making comparison of published results impossible. It is hoped that ongoing international collaborative efforts to standardize procedures for collection and analysis of perceptual data will help resolve this issue. Speech therapy is the mainstay treatment for velopharyngeal mislearning and compensatory articulation, but it cannot improve hypernasality, nasal emissions, or weak pressure consonants, and surgery is the definitive treatment for VPI. Although many surgical methods are available, there are no conclusive data to guide procedure choice. The goal of this review article is to present established diagnostic and management techniques for VPI.
Affiliation(s)
- Veera V Pitkanen
  - Cleft and Craniofacial Center, Department of Plastic Surgery, Helsinki University Hospital and University of Helsinki
- Ahmed Geneid
  - Department of Otolaryngology and Phoniatrics-Head and Neck Surgery, Helsinki University Hospital and University of Helsinki, Helsinki
- Anne M Saarikko
  - Cleft and Craniofacial Center, Department of Plastic Surgery, Helsinki University Hospital and University of Helsinki
- Sanna Hakli
  - Department of Otolaryngology and Phoniatrics, Oulu University Hospital and PEDEGO Research Unit and Medical Research Center Oulu, University of Oulu, Oulu, Finland
- Suvi A Alaluusua
  - Cleft and Craniofacial Center, Department of Plastic Surgery, Helsinki University Hospital and University of Helsinki
7
Isaieva K, Odille F, Laprie Y, Drouot G, Felblinger J, Vuissoz PA. Super-Resolved Dynamic 3D Reconstruction of the Vocal Tract during Natural Speech. J Imaging 2023; 9:233. [PMID: 37888339 PMCID: PMC10607793 DOI: 10.3390/jimaging9100233] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2023] [Revised: 10/13/2023] [Accepted: 10/17/2023] [Indexed: 10/28/2023]
Abstract
MRI is the gold standard modality for speech imaging. However, it remains relatively slow, which complicates imaging of fast movements. Thus, MRI of the vocal tract is often performed in 2D. While 3D MRI provides more information, the quality of such images is often insufficient. The goal of this study was to test the applicability of super-resolution algorithms for dynamic vocal tract MRI. In total, 25 sagittal slices of 8 mm with an in-plane resolution of 1.6 × 1.6 mm² were acquired consecutively using a highly-undersampled radial 2D FLASH sequence. The volunteers were reading a text in French with two different protocols. The slices were aligned using the simultaneously recorded sound. The super-resolution strategy was used to reconstruct 1.6 × 1.6 × 1.6 mm³ isotropic volumes. The resulting images were less sharp than the native 2D images but demonstrated a higher signal-to-noise ratio. It was also shown that the super-resolution allows for eliminating inconsistencies, leading to regular transitions between the slices. Additionally, it was demonstrated that using visual stimuli and shorter text fragments improves the inter-slice consistency and the super-resolved image sharpness. Therefore, with a correct speech task choice, the proposed method allows for the reconstruction of high-quality dynamic 3D volumes of the vocal tract during natural speech.
Affiliation(s)
- Karyna Isaieva
  - IADI, Université de Lorraine, U1254 INSERM, F-54000 Nancy, France
- Freddy Odille
  - IADI, Université de Lorraine, U1254 INSERM, F-54000 Nancy, France
  - CIC-IT 1433, CHRU de Nancy, INSERM, Université de Lorraine, F-54000 Nancy, France
- Yves Laprie
  - LORIA, Université de Lorraine, CNRS, INRIA, F-54000 Nancy, France
- Guillaume Drouot
  - CIC-IT 1433, CHRU de Nancy, INSERM, Université de Lorraine, F-54000 Nancy, France
- Jacques Felblinger
  - IADI, Université de Lorraine, U1254 INSERM, F-54000 Nancy, France
  - CIC-IT 1433, CHRU de Nancy, INSERM, Université de Lorraine, F-54000 Nancy, France
- Pierre-André Vuissoz
  - IADI, Université de Lorraine, U1254 INSERM, F-54000 Nancy, France
8
Perry JL, Gilbert IR, Xing F, Jin R, Kuehn DP, Shosted RK, Woo J, Liang ZP, Sutton BP. Preliminary Development of an MRI Atlas for Application to Cleft Care: Findings and Future Recommendations. Cleft Palate Craniofac J 2023:10556656231183385. [PMID: 37335134 DOI: 10.1177/10556656231183385] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 06/21/2023]
Abstract
OBJECTIVE: To introduce a highly innovative imaging method to study the complex velopharyngeal (VP) system and introduce the potential future clinical applications of a VP atlas in cleft care. DESIGN: Four healthy adults participated in a 20-min dynamic magnetic resonance imaging scan that included a high-resolution T2-weighted turbo-spin-echo 3D structural scan and five custom dynamic speech imaging scans. Subjects repeated a variety of phrases while in the scanner as real-time audio was captured. SETTING: Multisite institution and clinical setting. PARTICIPANTS: Four adult subjects with normal anatomy were recruited for this study. MAIN OUTCOME: Establishment of a 4D atlas constructed from dynamic VP MRI data. RESULTS: Three-dimensional dynamic magnetic resonance imaging was successfully used to obtain high quality dynamic speech scans in an adult population. Scans could be re-sliced in various imaging planes. Subject-specific MR data were then reconstructed and time-aligned to create a velopharyngeal atlas representing the averaged physiological movements across the four subjects. CONCLUSIONS: The current preliminary study examined the feasibility of developing a VP atlas for potential clinical applications in cleft care. Our results indicate excellent potential for the development and use of a VP atlas for assessing VP physiology during speech.
Affiliation(s)
- Jamie L Perry
  - Department of Communication Sciences and Disorders, East Carolina University, Greenville, NC, USA
- Imani R Gilbert
  - Department of Communication Sciences and Disorders, East Carolina University, Greenville, NC, USA
- Fangxu Xing
  - Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital/Harvard Medical School, Boston, MA, USA
- Riwei Jin
  - Department of Bioengineering, University of Illinois at Urbana-Champaign, Urbana, IL, USA
  - Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, IL, USA
- David P Kuehn
  - Department of Speech and Hearing Science, University of Illinois Urbana-Champaign, Urbana, IL, USA
- Ryan K Shosted
  - Department of Linguistics, University of Illinois at Urbana-Champaign, Urbana, IL, USA
- Jonghye Woo
  - Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital/Harvard Medical School, Boston, MA, USA
- Zhi-Pei Liang
  - Department of Bioengineering, University of Illinois at Urbana-Champaign, Urbana, IL, USA
  - Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, IL, USA
- Bradley P Sutton
  - Department of Bioengineering, University of Illinois at Urbana-Champaign, Urbana, IL, USA
  - Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, IL, USA
9
Erattakulangara S, Kelat K, Meyer D, Priya S, Lingala SG. Automatic Multiple Articulator Segmentation in Dynamic Speech MRI Using a Protocol Adaptive Stacked Transfer Learning U-NET Model. Bioengineering (Basel) 2023; 10:623. [PMID: 37237693 DOI: 10.3390/bioengineering10050623] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2023] [Revised: 05/11/2023] [Accepted: 05/19/2023] [Indexed: 05/28/2023]
Abstract
Dynamic magnetic resonance imaging has emerged as a powerful modality for investigating upper-airway function during speech production. Analyzing the changes in the vocal tract airspace, including the position of soft-tissue articulators (e.g., the tongue and velum), enhances our understanding of speech production. The advent of various fast speech MRI protocols based on sparse sampling and constrained reconstruction has led to the creation of dynamic speech MRI datasets on the order of 80-100 image frames/second. In this paper, we propose a stacked transfer learning U-NET model to segment the deforming vocal tract in 2D mid-sagittal slices of dynamic speech MRI. Our approach leverages (a) low- and mid-level features and (b) high-level features. The low- and mid-level features are derived from models pre-trained on labeled open-source brain tumor MR and lung CT datasets, and an in-house airway labeled dataset. The high-level features are derived from labeled protocol-specific MR images. The applicability of our approach to segmenting dynamic datasets is demonstrated in data acquired from three fast speech MRI protocols: Protocol 1: 3 T-based radial acquisition scheme coupled with a non-linear temporal regularizer, where speakers were producing French speech tokens; Protocol 2: 1.5 T-based uniform density spiral acquisition scheme coupled with a temporal finite difference (FD) sparsity regularization, where speakers were producing fluent speech tokens in English; and Protocol 3: 3 T-based variable density spiral acquisition scheme coupled with manifold regularization, where speakers were producing various speech tokens from the International Phonetic Alphabet (IPA). Segmentations from our approach were compared to those from an expert human user (a vocologist), and the conventional U-NET model without transfer learning. Segmentations from a second expert human user (a radiologist) were used as ground truth.
Evaluations were performed using the quantitative DICE similarity metric, the Hausdorff distance metric, and segmentation count metric. This approach was successfully adapted to different speech MRI protocols with only a handful of protocol-specific images (e.g., of the order of 20 images), and provided accurate segmentations similar to those of an expert human.
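The Dice similarity and Hausdorff distance metrics named in the abstract above can be illustrated on small binary masks represented as sets of pixel coordinates. This is a minimal sketch using the standard textbook definitions. The function names and the two toy masks are hypothetical, and the segmentation count metric mentioned in the abstract is not covered here.

```python
import math

def dice(a, b):
    """Dice similarity: 2|A ∩ B| / (|A| + |B|) for two sets of pixel coordinates."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty masks are treated as identical
    return 2 * len(a & b) / (len(a) + len(b))

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty sets of pixel coordinates."""
    def directed(p, q):
        # Largest distance from any point in p to its nearest point in q.
        return max(min(math.dist(x, y) for y in q) for x in p)
    return max(directed(a, b), directed(b, a))

# Toy 2 x 2 masks, offset by one pixel along the second axis.
automatic = {(0, 0), (0, 1), (1, 0), (1, 1)}
reference = {(0, 1), (1, 1), (0, 2), (1, 2)}

print(dice(automatic, reference))       # 0.5
print(hausdorff(automatic, reference))  # 1.0
```

Dice rewards overlap in area, while the Hausdorff distance reports the worst-case boundary disagreement, so the two metrics can rank segmentations differently.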
Affiliation(s)
- Subin Erattakulangara
  - Roy J. Carver Department of Biomedical Engineering, University of Iowa, Iowa City, IA 52242, USA
- Karthika Kelat
  - Roy J. Carver Department of Biomedical Engineering, University of Iowa, Iowa City, IA 52242, USA
- David Meyer
  - Janette Ogg Voice Research Center, Shenandoah University, Winchester, VA 22601, USA
- Sarv Priya
  - Department of Radiology, University of Iowa, Iowa City, IA 52242, USA
- Sajan Goud Lingala
  - Roy J. Carver Department of Biomedical Engineering, University of Iowa, Iowa City, IA 52242, USA
  - Department of Radiology, University of Iowa, Iowa City, IA 52242, USA
10
Feng L. 4D Golden-Angle Radial MRI at Subsecond Temporal Resolution. NMR IN BIOMEDICINE 2023; 36:e4844. [PMID: 36259951 PMCID: PMC9845193 DOI: 10.1002/nbm.4844] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/15/2022] [Revised: 09/29/2022] [Accepted: 10/13/2022] [Indexed: 05/14/2023]
Abstract
Intraframe motion blurring, as a major challenge in free-breathing dynamic MRI, can be reduced if high temporal resolution can be achieved. To address this challenge, this work proposes a highly accelerated 4D (3D + time) dynamic MRI framework with subsecond temporal resolution that does not require explicit motion compensation. The method combines standard stack-of-stars golden-angle radial sampling and tailored GRASP-Pro (Golden-angle RAdial Sparse Parallel imaging with imProved performance) reconstruction. Specifically, 4D dynamic MRI acquisition is performed continuously without motion gating or sorting. The k-space centers in stack-of-stars radial data are organized to guide estimation of a temporal basis, with which GRASP-Pro reconstruction is employed to enforce joint low-rank subspace and sparsity constraints. This new basis estimation strategy is the new feature proposed for subspace-based reconstruction in this work to achieve high temporal resolution (e.g., subsecond/3D volume). It does not require sequence modification to acquire additional navigation data, it is compatible with commercially available stack-of-stars sequences, and it does not need an intermediate reconstruction step. The proposed 4D dynamic MRI approach was tested in abdominal motion phantom, free-breathing abdominal MRI, and dynamic contrast-enhanced MRI (DCE-MRI). Our results have shown that GRASP-Pro reconstruction with the new basis estimation strategy enables highly-accelerated 4D dynamic imaging at subsecond temporal resolution (with five spokes or less for each dynamic frame per image slice) for both free-breathing non-DCE-MRI and DCE-MRI. In the abdominal phantom, better image quality with lower root mean square error and higher structural similarity index was achieved using GRASP-Pro compared with standard GRASP. 
With the ability to acquire each 3D image in less than 1 s, intraframe respiratory blurring can be intrinsically reduced for body applications with our approach, which eliminates the need for explicit motion detection and motion compensation.
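Golden-angle radial sampling, as referenced in the abstract above, rotates each successive spoke by a fixed increment of roughly 111.246° (180° divided by the golden ratio), so that any contiguous group of spokes covers k-space fairly uniformly. The sketch below generates spoke angles and groups them into frames of five spokes, matching the "five spokes or less" figure in the abstract. The function names and the total spoke count are hypothetical, and this does not reproduce the GRASP-Pro reconstruction itself.

```python
import math

# Golden angle for radial MRI: 180 degrees divided by the golden ratio.
GOLDEN_ANGLE_DEG = 180.0 / ((1 + math.sqrt(5)) / 2)  # about 111.246 degrees

def spoke_angles(n_spokes):
    """Angle in degrees of each spoke, wrapped to [0, 180)."""
    return [(i * GOLDEN_ANGLE_DEG) % 180.0 for i in range(n_spokes)]

def group_spokes(angles, spokes_per_frame):
    """Split consecutive spokes into non-overlapping dynamic frames."""
    last_start = len(angles) - spokes_per_frame
    return [angles[i:i + spokes_per_frame]
            for i in range(0, last_start + 1, spokes_per_frame)]

angles = spoke_angles(20)          # hypothetical total of 20 spokes
frames = group_spokes(angles, 5)   # five spokes per dynamic frame
for k, frame in enumerate(frames):
    print(k, [round(a, 1) for a in frame])
```

Because the increment is irrational relative to 180°, no spoke angle repeats, which is why the same continuously acquired data can be regrouped retrospectively into frames of different lengths.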
Affiliation(s)
- Li Feng
  - Biomedical Engineering and Imaging Institute and Department of Radiology, Icahn School of Medicine at Mount Sinai, New York, New York, USA
11
Jin R, Shosted RK, Xing F, Gilbert IR, Perry JL, Woo J, Liang ZP, Sutton BP. Enhancing linguistic research through 2-mm isotropic 3D dynamic speech MRI optimized by sparse temporal sampling and low-rank reconstruction. Magn Reson Med 2023; 89:652-664. [PMID: 36289572 PMCID: PMC9712260 DOI: 10.1002/mrm.29486] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 03/18/2022] [Revised: 08/17/2022] [Accepted: 09/16/2022] [Indexed: 12/13/2022]
Abstract
PURPOSE: To enable a more comprehensive view of articulations during speech through near-isotropic 3D dynamic MRI with high spatiotemporal resolution and large vocal-tract coverage. METHODS: Using partial separability model-based low-rank reconstruction coupled with a sparse acquisition of both spatial and temporal models, we are able to achieve near-isotropic resolution 3D imaging with a high frame rate. The total acquisition time of the speech acquisition is shortened by introducing a sparse temporal sampling that interleaves one temporal navigator with four randomized phase and slice-encoded imaging samples. Memory and computation time are improved through compressing coils based on the region of interest for low-rank constrained reconstruction with an edge-preserving spatial penalty. RESULTS: The proposed method has been evaluated through experiments on several speech samples, including a standard reading passage. A near-isotropic 1.875 × 1.875 × 2 mm³ spatial resolution, 64-mm through-plane coverage, and a 35.6-fps temporal resolution are achieved. Investigations and analysis on specific speech samples support novel insights into nonsymmetric tongue movement, velum raising, and coarticulation events with adequate visualization of rapid articulatory movements. CONCLUSION: Three-dimensional dynamic images of the vocal tract structures during speech with high spatiotemporal resolution and axial coverage are capable of enhancing linguistic research, enabling visualization of soft tissue motions that is not possible with other modalities.
Affiliation(s)
- Riwei Jin
- Department of Bioengineering, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA; Beckman Institute for Advanced Science and Technology, University of Illinois Urbana Champaign, Urbana, IL
| | - Ryan K. Shosted
- Department of Linguistics, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
| | - Fangxu Xing
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital/Harvard Medical School, Boston, Massachusetts 02114, USA
| | - Imani R. Gilbert
- Department of Communication Sciences and Disorders, East Carolina University, Greenville, North Carolina 27858, USA
| | - Jamie L. Perry
- Department of Communication Sciences and Disorders, East Carolina University, Greenville, North Carolina 27858, USA
| | - Jonghye Woo
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital/Harvard Medical School, Boston, Massachusetts 02114, USA
| | - Zhi-Pei Liang
- Beckman Institute for Advanced Science and Technology, University of Illinois Urbana Champaign, Urbana, IL; Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
| | - Bradley P. Sutton
- Department of Bioengineering, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA; Beckman Institute for Advanced Science and Technology, University of Illinois Urbana Champaign, Urbana, IL
| |
|
12
|
Kröger BJ. Computer-Implemented Articulatory Models for Speech Production: A Review. Front Robot AI 2022; 9:796739. [PMID: 35494539 PMCID: PMC9040071 DOI: 10.3389/frobt.2022.796739] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2021] [Accepted: 02/21/2022] [Indexed: 11/24/2022] Open
Abstract
Modeling speech production and speech articulation is still an evolving research topic. Some current core questions are: What is the underlying (neural) organization for controlling speech articulation? How to model speech articulators like lips and tongue and their movements in an efficient but also biologically realistic way? How to develop high-quality articulatory-acoustic models leading to high-quality articulatory speech synthesis? Thus, on the one hand, computer modeling will help us to unfold underlying biological as well as acoustic-articulatory concepts of speech production, and on the other hand, further modeling efforts will help us to reach the goal of high-quality articulatory-acoustic speech synthesis based on more detailed knowledge of vocal tract acoustics and speech articulation. Currently, articulatory models are not able to reach the quality level of corpus-based speech synthesis. Moreover, biomechanical and neuromuscular based approaches are complex and still not usable for sentence-level speech synthesis. This paper lists many computer-implemented articulatory models and provides criteria for dividing articulatory models into different categories. A recent major research question, i.e., how to control articulatory models in a neurobiologically adequate manner, is discussed in detail. It can be concluded that there is a strong need to further develop articulatory-acoustic models in order to test quantitative neurobiologically based control concepts for speech articulation as well as to uncover the remaining details in human articulatory and acoustic signal generation. Furthermore, these efforts may help us to approach the goal of establishing high-quality articulatory-acoustic as well as neurobiologically grounded speech synthesis.
|
13
|
Voskuilen L, Schoormans J, Gurney-Champion OJ, Balm AJM, Strijkers GJ, Smeele LE, Nederveen AJ. Dynamic MRI of swallowing: real-time volumetric imaging at 12 frames per second at 3 T. Magnetic Resonance Materials in Physics, Biology and Medicine 2021; 35:411-419. [PMID: 34779971 PMCID: PMC9188511 DOI: 10.1007/s10334-021-00973-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2021] [Revised: 10/10/2021] [Accepted: 10/18/2021] [Indexed: 11/25/2022]
Abstract
Objective Dysphagia or difficulty in swallowing is a potentially hazardous clinical problem that needs regular monitoring. Real-time 2D MRI of swallowing is a promising radiation-free alternative to the current clinical standard: videofluoroscopy. However, aspiration may be missed if it occurs outside this single imaged slice. We therefore aimed to image swallowing in 3D real time at 12 frames per second (fps). Materials and methods At 3 T, three 3D real-time MRI acquisition approaches were compared to the 2D acquisition: an aligned stack-of-stars (SOS), and a rotated SOS with a golden-angle increment and with a tiny golden-angle increment. The optimal 3D acquisition was determined by computer simulations and phantom scans. Subsequently, five healthy volunteers were scanned and swallowing parameters were measured. Results Although the rotated SOS approaches resulted in better image quality in simulations, in practice, the aligned SOS performed best due to the limited number of slices. The four swallowing phases could be distinguished in 3D real-time MRI, even though the spatial blurring was stronger than in 2D. The swallowing parameters were similar between 2D and 3D. Conclusion At a spatial resolution of 2-by-2-by-6 mm with seven slices, swallowing can be imaged in 3D real time at a frame rate of 12 fps. Supplementary Information The online version contains supplementary material available at 10.1007/s10334-021-00973-6.
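To make the distinction between aligned and rotated stack-of-stars concrete, the sketch below generates radial spoke angles per slice. The golden angle (180°/φ ≈ 111.25°) and the tiny golden angle family 180°/(φ + N − 1) are standard definitions; the choice N = 7 and the particular per-slice offset used for the rotated variant are assumptions for illustration, since the abstract does not state them.

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # golden ratio

def golden_angle_deg(n=1):
    """n=1 gives the usual golden angle; n>1 gives a 'tiny' golden angle."""
    return 180.0 / (PHI + n - 1)

def spoke_angles(n_spokes, n_slices, increment_deg, rotate_slices):
    """Spoke angles (degrees, modulo 180) for every slice of a stack-of-stars.
    Aligned SOS: all slices share the same angles.
    Rotated SOS: each slice is offset by a fraction of the increment
    (this specific offset rule is an assumption, not the paper's)."""
    angles = []
    for s in range(n_slices):
        offset = s * increment_deg / n_slices if rotate_slices else 0.0
        angles.append([(k * increment_deg + offset) % 180.0 for k in range(n_spokes)])
    return angles

aligned = spoke_angles(5, 7, golden_angle_deg(1), rotate_slices=False)
rotated = spoke_angles(5, 7, golden_angle_deg(1), rotate_slices=True)
tiny = golden_angle_deg(7)  # hypothetical choice of N
```

Rotating the spokes between slices spreads samples more evenly through 3D k-space, which is consistent with the better simulated image quality reported; the abstract notes that with only seven slices the aligned variant nevertheless performed best in practice.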
Affiliation(s)
- Luuk Voskuilen
- Department of Head and Neck Oncology and Surgery, Netherlands Cancer Institute, Antoni van Leeuwenhoek, Plesmanlaan 121, 1066 CX, Amsterdam, The Netherlands; Department of Radiology and Nuclear Medicine, Amsterdam University Medical Centers, University of Amsterdam, Cancer Center Amsterdam, Amsterdam, The Netherlands; Academic Centre for Dentistry Amsterdam and Academic Medical Center, University of Amsterdam and VU University Amsterdam, Amsterdam, The Netherlands
| | - Jasper Schoormans
- Biomedical Engineering and Physics, Amsterdam University Medical Centers, University of Amsterdam, Amsterdam, The Netherlands
| | - Oliver J Gurney-Champion
- Department of Radiology and Nuclear Medicine, Amsterdam University Medical Centers, University of Amsterdam, Cancer Center Amsterdam, Amsterdam, The Netherlands
| | - Alfons J M Balm
- Department of Head and Neck Oncology and Surgery, Netherlands Cancer Institute, Antoni van Leeuwenhoek, Plesmanlaan 121, 1066 CX, Amsterdam, The Netherlands; Department of Oral and Maxillofacial Surgery, Amsterdam University Medical Centers, University of Amsterdam, Amsterdam, The Netherlands; Robotics and Mechatronics, Faculty of EEMCS, TechMed Center, University of Twente, Enschede, The Netherlands
| | - Gustav J Strijkers
- Biomedical Engineering and Physics, Amsterdam University Medical Centers, University of Amsterdam, Amsterdam, The Netherlands
| | - Ludi E Smeele
- Department of Head and Neck Oncology and Surgery, Netherlands Cancer Institute, Antoni van Leeuwenhoek, Plesmanlaan 121, 1066 CX, Amsterdam, The Netherlands; Department of Oral and Maxillofacial Surgery, Amsterdam University Medical Centers, University of Amsterdam, Amsterdam, The Netherlands
| | - Aart J Nederveen
- Department of Radiology and Nuclear Medicine, Amsterdam University Medical Centers, University of Amsterdam, Cancer Center Amsterdam, Amsterdam, The Netherlands
| |
|
14
|
Xing F, Jin R, Gilbert IR, Perry JL, Sutton BP, Liu X, El Fakhri G, Shosted RK, Woo J. 4D magnetic resonance imaging atlas construction using temporally aligned audio waveforms in speech. The Journal of the Acoustical Society of America 2021; 150:3500. [PMID: 34852570 PMCID: PMC8580575 DOI: 10.1121/10.0007064] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/30/2021] [Revised: 09/16/2021] [Accepted: 10/15/2021] [Indexed: 06/13/2023]
Abstract
Magnetic resonance (MR) imaging is becoming an established tool in capturing articulatory and physiological motion of the structures and muscles throughout the vocal tract and enabling visual and quantitative assessment of real-time speech activities. Although motion capture speed has been regularly improved by the continual developments in high-speed MR technology, quantitative analysis of multi-subject group data remains challenging due to variations in speaking rate and imaging time among different subjects. In this paper, a workflow of post-processing methods that matches different MR image datasets within a study group is proposed. Each subject's recorded audio waveform during speech is used to extract temporal domain information and generate temporal alignment mappings from their matching pattern. The corresponding image data are resampled by deformable registration and interpolation of the deformation fields, achieving inter-subject temporal alignment between image sequences. A four-dimensional dynamic MR speech atlas is constructed using aligned volumes from four human subjects. Similarity tests between subject and target domains using the squared error, cross correlation, and mutual information measures all show an overall score increase after spatiotemporal alignment. The amount of image variability in atlas construction is reduced, indicating a quality increase in the multi-subject data for groupwise quantitative analysis.
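The abstract describes deriving temporal alignment mappings from each subject's audio waveform but does not name the matching algorithm. As one plausible illustration (a swapped-in technique, not necessarily the authors' method), dynamic time warping (DTW) on a one-dimensional audio feature produces a frame-to-frame mapping between two speakers with different speaking rates. The feature sequences below are made up for the example.

```python
import numpy as np

def dtw_path(a, b):
    """Classic DTW between two 1-D feature sequences.
    Returns the total alignment cost and the list of matched index pairs."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    # Backtrack from the end to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = (cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1])
        k = int(np.argmin(steps))
        if k == 0:
            i, j = i - 1, j - 1
        elif k == 1:
            i -= 1
        else:
            j -= 1
    path.reverse()
    return float(cost[n, m]), path

# Hypothetical audio envelopes: the second speaker is slower (stretched in time).
fast = [0, 1, 2, 1, 0]
slow = [0, 0, 1, 2, 2, 1, 0]
total_cost, path = dtw_path(fast, slow)
```

The resulting `path` plays the role of a temporal alignment mapping: each pair says which frame of one subject's image sequence corresponds to which frame of the other's, after which image volumes could be resampled onto a common time axis.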
Affiliation(s)
- Fangxu Xing
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital/Harvard Medical School, Boston, Massachusetts 02114, USA
| | - Riwei Jin
- Department of Bioengineering, University of Illinois at Urbana-Champaign, Champaign, Illinois 61801, USA
| | - Imani R Gilbert
- Department of Communication Sciences and Disorders, East Carolina University, Greenville, North Carolina 27858, USA
| | - Jamie L Perry
- Department of Communication Sciences and Disorders, East Carolina University, Greenville, North Carolina 27858, USA
| | - Bradley P Sutton
- Department of Bioengineering, University of Illinois at Urbana-Champaign, Champaign, Illinois 61801, USA
| | - Xiaofeng Liu
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital/Harvard Medical School, Boston, Massachusetts 02114, USA
| | - Georges El Fakhri
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital/Harvard Medical School, Boston, Massachusetts 02114, USA
| | - Ryan K Shosted
- Department of Linguistics, University of Illinois at Urbana-Champaign, Champaign, Illinois 61801, USA
| | - Jonghye Woo
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital/Harvard Medical School, Boston, Massachusetts 02114, USA
| |
|
15
|
Isaieva K, Laprie Y, Leclère J, Douros IK, Felblinger J, Vuissoz PA. Multimodal dataset of real-time 2D and static 3D MRI of healthy French speakers. Sci Data 2021; 8:258. [PMID: 34599194 PMCID: PMC8486854 DOI: 10.1038/s41597-021-01041-3] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 01/26/2021] [Accepted: 08/25/2021] [Indexed: 12/28/2022] Open
Abstract
The study of articulatory gestures has a wide spectrum of applications, notably in speech production and recognition. Sets of phonemes, as well as their articulation, are language-specific; however, existing MRI databases mostly include English speakers. In our present work, we introduce a dataset acquired with MRI from 10 healthy native French speakers. A corpus consisting of synthetic sentences was used to ensure a good coverage of the French phonetic context. A real-time MRI technology with temporal resolution of 20 ms was used to acquire vocal tract images of the participants speaking. The sound was recorded simultaneously with MRI, denoised and temporally aligned with the images. The speech was transcribed to obtain phoneme-wise segmentation of sound. We also acquired static 3D MR images for a wide list of French phonemes. In addition, we include annotations of spontaneous swallowing.
Measurement(s): Vocal tract images; Speech
Technology Type(s): Magnetic Resonance Imaging; Microphone Device
Sample Characteristic (Organism): Homo sapiens
Machine-accessible metadata file describing the reported data: 10.6084/m9.figshare.16404453
Affiliation(s)
- Karyna Isaieva
- Université de Lorraine, INSERM, IADI, Nancy, F-54000, France.
| | - Yves Laprie
- Université de Lorraine, CNRS, Inria, LORIA, Nancy, F-54000, France
| | - Justine Leclère
- Université de Lorraine, INSERM, IADI, Nancy, F-54000, France; Oral Medicine Department, University Hospital of Reims, 45 rue Cognacq-Jay, 51092, Reims, Cedex, France
| | - Ioannis K Douros
- Université de Lorraine, INSERM, IADI, Nancy, F-54000, France; Université de Lorraine, CNRS, Inria, LORIA, Nancy, F-54000, France
| | - Jacques Felblinger
- Université de Lorraine, INSERM, IADI, Nancy, F-54000, France; CIC-IT, INSERM, CHRU de Nancy, Nancy, F-54000, France
| | | |
|
16
|
Zhao Z, Lim Y, Byrd D, Narayanan S, Nayak KS. Improved 3D real-time MRI of speech production. Magn Reson Med 2021; 85:3182-3195. [PMID: 33452722 DOI: 10.1002/mrm.28651] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 06/27/2020] [Revised: 10/29/2020] [Accepted: 11/26/2020] [Indexed: 01/21/2023]
Abstract
PURPOSE To provide 3D real-time MRI of speech production with improved spatio-temporal sharpness using randomized, variable-density, stack-of-spiral sampling combined with a 3D spatio-temporally constrained reconstruction. METHODS We evaluated five candidate (k, t) sampling strategies using a previously proposed gradient-echo stack-of-spiral sequence and a 3D constrained reconstruction with spatial and temporal penalties. Regularization parameters were chosen by expert readers based on qualitative assessment. We experimentally determined the effect of spiral angle increment and kz temporal order. The strategy yielding highest image quality was chosen as the proposed method. We evaluated the proposed and original 3D real-time MRI methods in 2 healthy subjects performing speech production tasks that invoke rapid movements of articulators seen in multiple planes, using interleaved 2D real-time MRI as the reference. We quantitatively evaluated tongue boundary sharpness in three locations at two speech rates. RESULTS The proposed data-sampling scheme uses a golden-angle spiral increment in the kx-ky plane and variable-density, randomized encoding along kz. It provided a statistically significant improvement in tongue boundary sharpness score (P < .001) in the blade, body, and root of the tongue during normal and 1.5-times speeded speech. Qualitative improvements were substantial during natural speech tasks of alternating high, low tongue postures during vowels. The proposed method was also able to capture complex tongue shapes during fast alveolar consonant segments. Furthermore, the proposed scheme allows flexible retrospective selection of temporal resolution. CONCLUSION We have demonstrated improved 3D real-time MRI of speech production using randomized, variable-density, stack-of-spiral sampling with a 3D spatio-temporally constrained reconstruction.
Affiliation(s)
- Ziwei Zhao
- Ming Hsieh Department of Electrical and Computer Engineering, Viterbi School of Engineering, University of Southern California, Los Angeles, CA, USA
| | - Yongwan Lim
- Ming Hsieh Department of Electrical and Computer Engineering, Viterbi School of Engineering, University of Southern California, Los Angeles, CA, USA
| | - Dani Byrd
- Department of Linguistics, Dornsife College of Letters, Arts and Sciences, University of Southern California, Los Angeles, CA, USA
| | - Shrikanth Narayanan
- Ming Hsieh Department of Electrical and Computer Engineering, Viterbi School of Engineering, University of Southern California, Los Angeles, CA, USA; Department of Linguistics, Dornsife College of Letters, Arts and Sciences, University of Southern California, Los Angeles, CA, USA
| | - Krishna S Nayak
- Ming Hsieh Department of Electrical and Computer Engineering, Viterbi School of Engineering, University of Southern California, Los Angeles, CA, USA
| |
|
17
|
Huttinga NRF, Bruijnen T, van den Berg CAT, Sbrizzi A. Nonrigid 3D motion estimation at high temporal resolution from prospectively undersampled k-space data using low-rank MR-MOTUS. Magn Reson Med 2020; 85:2309-2326. [PMID: 33169888 PMCID: PMC7839760 DOI: 10.1002/mrm.28562] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 06/30/2020] [Revised: 09/30/2020] [Accepted: 09/30/2020] [Indexed: 12/25/2022]
Abstract
Purpose With the recent introduction of the MR‐LINAC, an MR‐scanner combined with a radiotherapy LINAC, MR‐based motion estimation has become of increasing interest to (retrospectively) characterize tumor and organs‐at‐risk motion during radiotherapy. To this end, we introduce low‐rank MR‐MOTUS, a framework to retrospectively reconstruct time‐resolved nonrigid 3D+t motion fields from a single low‐resolution reference image and prospectively undersampled k‐space data acquired during motion. Theory Low‐rank MR‐MOTUS exploits spatiotemporal correlations in internal body motion with a low‐rank motion model, and inverts a signal model that relates motion fields directly to a reference image and k‐space data. The low‐rank model reduces the degrees‐of‐freedom, memory consumption, and reconstruction times by assuming a factorization of space‐time motion fields in spatial and temporal components. Methods Low‐rank MR‐MOTUS was employed to estimate motion in 2D/3D abdominothoracic scans and 3D head scans. Data were acquired using golden‐ratio radial readouts. Reconstructed 2D and 3D respiratory motion fields were, respectively, validated against time‐resolved and respiratory‐resolved image reconstructions, and the head motion against static image reconstructions from fully sampled data acquired right before and right after the motion. Results Results show that 2D+t respiratory motion can be estimated retrospectively at 40.8 motion fields per second, 3D+t respiratory motion at 7.6 motion fields per second and 3D+t head‐neck motion at 9.3 motion fields per second. The validations show good consistency with image reconstructions. Conclusions The proposed framework can estimate time‐resolved nonrigid 3D motion fields, which allows characterization of drifts and intra‐ and inter‐cycle patterns in breathing motion during radiotherapy, and could form the basis for real‐time MR‐guided radiotherapy.
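The Theory paragraph says the low-rank model factorizes space-time motion fields into spatial and temporal components, which reduces the degrees of freedom. A minimal numerical sketch of that idea follows; the sizes, the rank, and the random factors are all hypothetical and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical problem size: 500 voxels, 3 displacement components each,
# 200 time points, and a rank-3 motion model.
n_voxels, n_frames, rank = 500, 200, 3

spatial = rng.standard_normal((3 * n_voxels, rank))  # spatial components
temporal = rng.standard_normal((n_frames, rank))     # temporal components

# Space-time motion field: each column is the full 3-D displacement field
# at one time point, built as a product of the two small factors.
motion = spatial @ temporal.T

full_dof = motion.size                      # unknowns without the model
lowrank_dof = spatial.size + temporal.size  # unknowns with the factorization
rank_est = np.linalg.matrix_rank(motion)
```

In this toy setting the factorized form needs 5,100 unknowns instead of 300,000, which illustrates why such a model can lower memory consumption and reconstruction time when the underlying motion is strongly correlated across space and time.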
Affiliation(s)
- Niek R F Huttinga
- Department of Radiotherapy, Division of Imaging & Oncology, University Medical Center Utrecht, Utrecht, The Netherlands; Computational Imaging Group for MR Diagnostics & Therapy, Center for Image Sciences, University Medical Center Utrecht, Utrecht, The Netherlands
| | - Tom Bruijnen
- Department of Radiotherapy, Division of Imaging & Oncology, University Medical Center Utrecht, Utrecht, The Netherlands; Computational Imaging Group for MR Diagnostics & Therapy, Center for Image Sciences, University Medical Center Utrecht, Utrecht, The Netherlands
| | - Cornelis A T van den Berg
- Department of Radiotherapy, Division of Imaging & Oncology, University Medical Center Utrecht, Utrecht, The Netherlands; Computational Imaging Group for MR Diagnostics & Therapy, Center for Image Sciences, University Medical Center Utrecht, Utrecht, The Netherlands
| | - Alessandro Sbrizzi
- Department of Radiotherapy, Division of Imaging & Oncology, University Medical Center Utrecht, Utrecht, The Netherlands; Computational Imaging Group for MR Diagnostics & Therapy, Center for Image Sciences, University Medical Center Utrecht, Utrecht, The Netherlands
| |
|
18
|
Clifford B, Gu Y, Liu Y, Kim K, Huang S, Li Y, Lam F, Liang ZP, Yu X. High-Resolution Dynamic 31P-MR Spectroscopic Imaging for Mapping Mitochondrial Function. IEEE Trans Biomed Eng 2020; 67:2745-2753. [PMID: 32011244 PMCID: PMC7384926 DOI: 10.1109/tbme.2020.2969892] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 12/23/2022]
Abstract
OBJECTIVE To enable non-invasive dynamic metabolic mapping in rodent model studies of mitochondrial function using 31P-MR spectroscopic imaging (MRSI). METHODS We developed a novel method for high-resolution dynamic 31P-MRSI. The method synergistically integrates physics-based models of spectral structures, biochemical modeling of molecular dynamics, and subspace learning to capture spatiospectral variations. Fast data acquisition was achieved using rapid spiral trajectories and sparse sampling of (k, t, T)-space; image reconstruction was accomplished using a low-rank tensor-based framework. RESULTS The proposed method provided high-resolution dynamic metabolic mapping in rat hindlimb at spatial and temporal resolutions of 4[Formula: see text]2 mm3 and 1.28 s, respectively. This allowed for in vivo mapping of the time-constant of phosphocreatine resynthesis, a well established index of mitochondrial oxidative capacity. Multiple rounds of in vivo experiments were performed to demonstrate reproducibility, and in vitro experiments were used to validate the accuracy of the estimated metabolite maps. CONCLUSIONS A new model-based method is proposed to achieve high-resolution dynamic 31P-MRSI. The proposed method's ability to delineate metabolic heterogeneity was demonstrated in rat hindlimb. SIGNIFICANCE Abnormal mitochondrial metabolism is a key cellular dysfunction in many prevalent diseases such as diabetes and heart disease; however, current understanding of mitochondrial function is mostly gained from studies on isolated mitochondria under nonphysiological conditions. The proposed method has the potential to open new avenues of research by allowing in vivo and longitudinal studies of mitochondrial dysfunction in disease development and progression.
Affiliation(s)
- Bryan Clifford
- Department of Electrical and Computer Engineering and the Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign
| | - Yuning Gu
- Department of Biomedical Engineering and the Case Center for Imaging Research, Case Western Reserve University
| | - Yuchi Liu
- Department of Biomedical Engineering and the Case Center for Imaging Research, Case Western Reserve University
| | - Kihwan Kim
- Department of Biomedical Engineering and the Case Center for Imaging Research, Case Western Reserve University
| | - Sherry Huang
- Department of Biomedical Engineering and the Case Center for Imaging Research, Case Western Reserve University
| | - Yudu Li
- Department of Electrical and Computer Engineering and the Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign
| | - Fan Lam
- Department of Bioengineering and the Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign
| | - Zhi-Pei Liang
- Department of Electrical and Computer Engineering and the Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA
| | - Xin Yu
- Departments of Biomedical Engineering, Radiology, and Physiology and Biophysics, as well as the Case Center for Imaging Research, Case Western Reserve University, Cleveland, OH 44106-7207 USA
| |
|
19
|
Martin J, Ruthven M, Boubertakh R, Miquel ME. Realistic Dynamic Numerical Phantom for MRI of the Upper Vocal Tract. J Imaging 2020; 6:86. [PMID: 34460743 PMCID: PMC8320850 DOI: 10.3390/jimaging6090086] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2020] [Revised: 08/08/2020] [Accepted: 08/24/2020] [Indexed: 11/16/2022] Open
Abstract
Dynamic and real-time MRI (rtMRI) of human speech is an active field of research, with interest from both the linguistics and clinical communities. At present, different research groups are investigating a range of rtMRI acquisition and reconstruction approaches to visualise the speech organs. Similar to other moving organs, it is difficult to create a physical phantom of the speech organs to optimise these approaches; therefore, the optimisation requires extensive scanner access and imaging of volunteers. As previously demonstrated in cardiac imaging, realistic numerical phantoms can be useful tools for optimising rtMRI approaches and reduce reliance on scanner access and imaging volunteers. However, currently, no such speech rtMRI phantom exists. In this work, a numerical phantom for optimising speech rtMRI approaches was developed and tested on different reconstruction schemes. The novel phantom comprised a dynamic image series and corresponding k-space data of a single mid-sagittal slice with a temporal resolution of 30 frames per second (fps). The phantom was developed based on images of a volunteer acquired at a frame rate of 10 fps. The creation of the numerical phantom involved the following steps: image acquisition, image enhancement, segmentation, mask optimisation, through-time and spatial interpolation and finally the derived k-space phantom. The phantom was used to: (1) test different k-space sampling schemes (Cartesian, radial and spiral); (2) create lower frame rate acquisitions by simulating segmented k-space acquisitions; (3) simulate parallel imaging reconstructions (SENSE and GRAPPA). This demonstrated how such a numerical phantom could be used to optimise images and test multiple sampling strategies without extensive scanner access.
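Use (2) in the abstract, creating lower frame rate acquisitions by simulating segmented k-space acquisitions, can be illustrated with a short sketch. It assumes a Cartesian grid and an interleaved assignment of phase-encode lines to segments; both choices, and the function names, are assumptions made for this example rather than details from the paper.

```python
import numpy as np

def segmented_kspace(frames, n_segments):
    """Simulate a segmented acquisition from a (T, Ny, Nx) image series.
    Each output frame gathers interleaved ky lines from `n_segments`
    consecutive input frames, so the frame rate drops by that factor."""
    k = np.fft.fftshift(np.fft.fft2(frames, axes=(-2, -1)), axes=(-2, -1))
    t_out = frames.shape[0] // n_segments
    out = np.zeros((t_out,) + frames.shape[1:], dtype=complex)
    for t in range(t_out):
        for s in range(n_segments):
            # Segment s contributes every n_segments-th line, starting at s.
            out[t, s::n_segments, :] = k[t * n_segments + s, s::n_segments, :]
    return out

def reconstruct(kspace):
    """Inverse FFT reconstruction of each frame."""
    return np.fft.ifft2(np.fft.ifftshift(kspace, axes=(-2, -1)), axes=(-2, -1))

# Sanity check on a static object: segmentation should not change the image.
rng = np.random.default_rng(0)
image = rng.random((16, 16))
static_series = np.tile(image, (6, 1, 1))   # 6 identical frames
low_rate = segmented_kspace(static_series, n_segments=3)
recon = reconstruct(low_rate)
```

For a moving object the segments come from different motion states, so the same procedure would produce the blurring and ghosting that a phantom of this kind is meant to help study.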
Affiliation(s)
- Joe Martin
- MR Physics, Guy’s and St Thomas’ NHS Foundation Trust, St Thomas’s Hospital, London SE1 7EH, UK
| | - Matthieu Ruthven
- Clinical Physics, Barts Health NHS Trust, St Bartholomew’s Hospital, London EC1A 7BE, UK
| | - Redha Boubertakh
- Singapore Bioimaging Consortium (SBIC), Singapore 138667, Singapore
| | - Marc E. Miquel
- Clinical Physics, Barts Health NHS Trust, St Bartholomew’s Hospital, London EC1A 7BE, UK
- Centre for Advanced Cardiovascular Imaging, NIHR Barts Biomedical Research Centre (BRC), William Harvey Research Institute, Queen Mary University of London, London EC1M 6BQ, UK
| |
|
20
|
Chen W, Lee NG, Byrd D, Narayanan S, Nayak KS. Improved real-time tagged MRI using REALTAG. Magn Reson Med 2020; 84:838-846. [PMID: 31872918 PMCID: PMC7180094 DOI: 10.1002/mrm.28144] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/16/2019] [Revised: 11/13/2019] [Accepted: 12/02/2019] [Indexed: 12/29/2022]
Abstract
OBJECTIVES To evaluate a novel method for real-time tagged MRI with increased tag persistence using phase sensitive tagging (REALTAG), demonstrated for speech imaging. METHODS Tagging is applied as a brief interruption to a continuous real-time spiral acquisition. REALTAG is implemented using a total tagging flip angle of 180° and a novel frame-by-frame phase sensitive reconstruction to remove smooth background phase while preserving the sign of the tag lines. Tag contrast-to-noise ratio of REALTAG and conventional tagging (total flip angle of 90°) is simulated and evaluated in vivo. The ability to extend tag persistence is tested during the production of vowel-to-vowel transitions by American English speakers. RESULTS REALTAG resulted in a doubling of contrast-to-noise ratio at each time point and increased tag persistence by more than 1.9-fold. The tag persistence was 1150 ms with contrast-to-noise ratio >6 at 1.5T, providing 2 mm in-plane resolution, 179 frames/s, with 72.6 ms temporal window width, and phase sensitive reconstruction. The new imaging window is able to capture internal tongue deformation over word-to-word transitions in natural speech production. CONCLUSION Tag persistence is substantially increased in intermittently tagged real-time MRI by using the improved REALTAG method. This makes it possible to capture longer motion patterns in the tongue, such as cross-word vowel-to-vowel transitions, and provides a powerful new window to study tongue biomechanics.
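The reported doubling of contrast when the total tagging flip angle goes from 90° to 180° can be checked with a simple model. The sketch assumes a two-pulse, 1-1 SPAMM-style preparation (two equal pulses separated by a gradient) and ignores relaxation; the abstract does not specify the preparation, so both are assumptions. Under that model the longitudinal magnetization after tagging is M0·(cos²α − sin²α·cos φ), where α is the per-pulse flip angle and φ the spatial tagging phase.

```python
import math

def mz_after_tagging(total_flip_deg, phase_rad, m0=1.0):
    """Longitudinal magnetization after two equal pulses (each half the total
    flip angle) with a tagging gradient in between; relaxation is ignored."""
    alpha = math.radians(total_flip_deg / 2)
    return m0 * (math.cos(alpha) ** 2 - math.sin(alpha) ** 2 * math.cos(phase_rad))

def tag_contrast(total_flip_deg, n=361):
    """Peak-to-trough range of Mz over one spatial tagging period."""
    values = [mz_after_tagging(total_flip_deg, 2 * math.pi * i / (n - 1))
              for i in range(n)]
    return max(values) - min(values)

conventional = tag_contrast(90)    # Mz swings between 0 and M0
phase_sensitive = tag_contrast(180)  # Mz swings between -M0 and +M0
ratio = phase_sensitive / conventional
```

The 180° case only yields the larger range if the sign of the magnetization is preserved, which is why a phase-sensitive reconstruction is needed; a magnitude reconstruction would fold the negative lobes back onto the positive ones.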
Affiliation(s)
- Weiyi Chen
- Ming Hsieh Department of Electrical and Computer Engineering, Viterbi School of Engineering, University of Southern California, Los Angeles, California, USA
| | - Nam Gyun Lee
- Department of Biomedical Engineering, Viterbi School of Engineering, University of Southern California, Los Angeles, California, USA
| | - Dani Byrd
- Department of Linguistics, Dornsife College of Letters, Arts and Sciences, University of Southern California, Los Angeles, California, USA
| | - Shrikanth Narayanan
- Ming Hsieh Department of Electrical and Computer Engineering, Viterbi School of Engineering, University of Southern California, Los Angeles, California, USA
- Department of Linguistics, Dornsife College of Letters, Arts and Sciences, University of Southern California, Los Angeles, California, USA
| | - Krishna S. Nayak
- Ming Hsieh Department of Electrical and Computer Engineering, Viterbi School of Engineering, University of Southern California, Los Angeles, California, USA
- Department of Biomedical Engineering, Viterbi School of Engineering, University of Southern California, Los Angeles, California, USA
| |
|
21
|
Pua Schleif E, Pelland CM, Ellis C, Fang X, Leierer SJ, Sutton BP, Kuehn DP, Blemker SS, Perry JL. Identifying Predictors of Levator Veli Palatini Muscle Contraction During Speech Using Dynamic Magnetic Resonance Imaging. Journal of Speech, Language, and Hearing Research: JSLHR 2020; 63:1726-1735. [PMID: 32539646 PMCID: PMC7839028 DOI: 10.1044/2020_jslhr-20-00013] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/12/2020] [Revised: 03/18/2020] [Accepted: 03/23/2020] [Indexed: 06/11/2023]
Abstract
Purpose The purpose of this study was to identify predictors of levator veli palatini (LVP) muscle shortening and maximum contraction velocity in adults with normal anatomy. Method Twenty-two Caucasian English-speaking adults with normal speech and resonance were recruited. Participants included 11 men and 11 women (M = 22.8 years, SD = 4.1) with normal anatomy. Static magnetic resonance images were obtained using a three-dimensional static imaging protocol. Midsagittal and oblique coronal planes were established for visualization of the velum and LVP muscle at rest. Dynamic magnetic resonance images were obtained in the oblique coronal plane during production of "ansa." Amira 6.0.1 Visualization and Volume Modeling Software and MATLAB were used to analyze images and calculate LVP shortening and maximum contraction velocity. Results Significant predictors (p < .05) of maximum LVP shortening during velopharyngeal closure included mean extravelar length, LVP origin-to-origin distance, velar thickness, pharyngeal depth, and velopharyngeal ratio. Significant predictors (p < .05) of maximum contraction velocity during velopharyngeal closure included mean extravelar length, intravelar length, LVP origin-to-origin distance, and velar thickness. Conclusions This study identified six velopharyngeal variables that predict LVP muscle function during real-time speech. These predictors should be considered among children and individuals with repaired cleft palate in future studies.
|
22
|
Ong F, Zhu X, Cheng JY, Johnson KM, Larson PEZ, Vasanawala SS, Lustig M. Extreme MRI: Large-scale volumetric dynamic imaging from continuous non-gated acquisitions. Magn Reson Med 2020; 84:1763-1780. [PMID: 32270547 DOI: 10.1002/mrm.28235] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 09/11/2019] [Revised: 02/05/2020] [Accepted: 02/06/2020] [Indexed: 12/30/2022]
Abstract
PURPOSE To develop a framework to reconstruct large-scale volumetric dynamic MRI from rapid continuous and non-gated acquisitions, with applications to pulmonary and dynamic contrast-enhanced (DCE) imaging. THEORY AND METHODS The problem considered here requires recovering 100 gigabytes of dynamic volumetric image data from a few gigabytes of k-space data, acquired continuously over several minutes. This reconstruction is vastly under-determined, heavily stressing computing resources as well as memory management and storage. To overcome these challenges, we leverage intrinsic three-dimensional (3D) trajectories, such as 3D radial and 3D cones, with ordering that incoherently cover time and k-space over the entire acquisition. We then propose two innovations: (a) A compressed representation using multiscale low-rank matrix factorization that constrains the reconstruction problem, and reduces its memory footprint. (b) Stochastic optimization to reduce computation, improve memory locality, and minimize communications between threads and processors. We demonstrate the feasibility of the proposed method on DCE imaging acquired with a golden-angle ordered 3D cones trajectory and pulmonary imaging acquired with a bit-reversed ordered 3D radial trajectory. We compare it with "soft-gated" dynamic reconstruction for DCE and respiratory-resolved reconstruction for pulmonary imaging. RESULTS The proposed technique shows transient dynamics that are not seen in gating-based methods. When applied to datasets with irregular, or non-repetitive motions, the proposed method displays sharper image features. CONCLUSIONS We demonstrated a method that can reconstruct massive 3D dynamic image series in the extreme undersampling and extreme computation setting.
Affiliation(s)
- Frank Ong
- Electrical Engineering, Stanford University, Stanford, CA, USA.,Electrical Engineering and Computer Sciences, University of California, Berkeley, CA, USA
| | - Xucheng Zhu
- UC Berkeley-UCSF Graduate Program in Bioengineering, University of California, San Francisco, CA, USA.,Department of Radiology and Biomedical Imaging, University of California, San Francisco, CA, USA
| | - Joseph Y Cheng
- Department of Radiology, Stanford University, Stanford, CA, USA
| | - Kevin M Johnson
- Medical Physics, University of Wisconsin, Madison, WI, USA.,Department of Radiology, University of Wisconsin, Madison, WI, USA
| | - Peder E Z Larson
- Radiology and Biomedical Imaging, University of California, San Francisco, CA, USA
| | | | - Michael Lustig
- Electrical Engineering and Computer Sciences, University of California, Berkeley, CA, USA
|
23
|
Erin O, Gilbert HB, Tabak AF, Sitti M. Elevation and Azimuth Rotational Actuation of an Untethered Millirobot by MRI Gradient Coils. IEEE T ROBOT 2019. [DOI: 10.1109/tro.2019.2934712] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Indexed: 11/11/2022]
|
24
|
Abstract
Purpose
Speech production is a complex 3-dimensional (3D) process, and yet most of what is known about it is derived from 2D midsagittal data. The relatively recent development of safe 3D imaging technologies (including magnetic resonance imaging and ultrasound) provides new opportunities to revisit and reformulate what is already known and to push the boundaries of current knowledge still further. A particularly useful imaging modality for this purpose is 3D/4D ultrasound, which until very recently was not well suited for studies in speech research. This technical report presents an overview of what 3D/4D ultrasound can contribute to speech research, with a focus on 2 demonstrations.
Conclusion
The 1st demonstration illustrates how 3D/4D ultrasound makes it possible to image certain vocal tract anatomical structures and planes that conventional 2D ultrasound is not capable of imaging. The 2nd demonstration illustrates how 3D/4D ultrasound can be combined with static 3D magnetic resonance imaging to provide new insight into the temporal pervasiveness and spatial extensiveness of lateral contact between the tongue and palate–teeth during speech production.
Affiliation(s)
- Steven M. Lulich
- Department of Speech & Hearing Sciences, Indiana University, Bloomington
| | - William G. Pearson
- Department of Cellular Biology and Anatomy, Medical College of Georgia, Augusta
|
25
|
Chen W, Byrd D, Narayanan S, Nayak KS. Intermittently tagged real-time MRI reveals internal tongue motion during speech production. Magn Reson Med 2019; 82:600-613. [PMID: 30919494 PMCID: PMC6510652 DOI: 10.1002/mrm.27745] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 11/27/2018] [Revised: 02/11/2019] [Accepted: 02/28/2019] [Indexed: 12/17/2022]
Abstract
PURPOSE To demonstrate a tagging method compatible with RT-MRI for the study of speech production. METHODS Tagging is applied as a brief interruption to a continuous real-time spiral acquisition. Tagging can be initiated manually by the operator, cued to the speech stimulus, or be automatically applied with a fixed frequency. We use a standard 2D 1-3-3-1 binomial SPAtial Modulation of Magnetization (SPAMM) sequence with 1 cm spacing in both in-plane directions. Tag persistence in tongue muscle is simulated and validated in vivo. The ability to capture internal tongue deformations is tested during speech production of American English diphthongs in native speakers. RESULTS We achieved an imaging window of 650-800 ms at 1.5T, with imaging signal to noise ratio ≥ 17 and tag contrast to noise ratio ≥ 5 in human tongue, providing 36 frames/s temporal resolution and 2 mm in-plane spatial resolution with real-time interactive acquisition and view-sharing reconstruction. The proposed method was able to capture tongue motion patterns and their relative timing with adequate spatiotemporal resolution during the production of American English diphthongs and consonants. CONCLUSION Intermittent tagging during real-time MRI of speech production is able to reveal the internal deformations of the tongue. This capability will allow new investigations of valuable spatiotemporal information on the biomechanics of the lingual subsystems during speech without reliance on binning speech utterance repetition.
Affiliation(s)
- Weiyi Chen
- Ming Hsieh Department of Electrical and Computer Engineering, Viterbi School of Engineering, University of Southern California, Los Angeles, California, USA
| | - Dani Byrd
- Department of Linguistics, Dornsife College of Letters, Arts and Sciences, University of Southern California, Los Angeles, California, USA
| | - Shrikanth Narayanan
- Ming Hsieh Department of Electrical and Computer Engineering, Viterbi School of Engineering, University of Southern California, Los Angeles, California, USA
- Department of Linguistics, Dornsife College of Letters, Arts and Sciences, University of Southern California, Los Angeles, California, USA
| | - Krishna S. Nayak
- Ming Hsieh Department of Electrical and Computer Engineering, Viterbi School of Engineering, University of Southern California, Los Angeles, California, USA
|
26
|
Ruthven M, Freitas AC, Boubertakh R, Miquel ME. Application of radial GRAPPA techniques to single- and multislice dynamic speech MRI using a 16-channel neurovascular coil. Magn Reson Med 2019; 82:948-958. [PMID: 31016802 DOI: 10.1002/mrm.27779] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 01/28/2019] [Revised: 03/07/2019] [Accepted: 03/29/2019] [Indexed: 12/11/2022]
Abstract
PURPOSE To investigate: (1) the feasibility of using through-time radial GeneRalized Autocalibrating Partially Parallel Acquisitions (rGRAPPA) and hybrid radial GRAPPA (h-rGRAPPA) in single- and multislice dynamic speech MRI; (2) whether single-slice dynamic speech MRI at a rate of 15 frames per second (fps) or higher and with adequate image quality can be achieved using these radial GRAPPA techniques. METHODS Seven healthy adult volunteers were imaged at 3T using a 16-channel neurovascular coil and 2 spoiled gradient echo sequences (radial trajectory, field of view = 192 × 192 mm², acquired pixel size = 2.4 × 2.4 mm²). One sequence imaged a single slice at 16.8 fps, the other imaged 2 interleaved slices at 7.8 fps per slice. Image sets were reconstructed using rGRAPPA and h-rGRAPPA, and their image quality was compared using the root mean square error, structural similarity index, and visual assessments. RESULTS Image quality deteriorated when fewer than 170 calibration frames were used in the rGRAPPA reconstruction. rGRAPPA image sets demonstrated: (1) in 97% of cases, a similar image quality to h-rGRAPPA image sets reconstructed using a k-space segment size of 4, (2) in 98% of cases, a better image quality than h-rGRAPPA image sets reconstructed using a k-space segment size of 32. CONCLUSION This study confirmed: (1) the feasibility of using rGRAPPA and h-rGRAPPA in single- and multislice dynamic speech MRI, (2) that single-slice speech imaging at a frame rate higher than 15 fps and with adequate image quality can be achieved using these radial GRAPPA techniques.
Affiliation(s)
- Matthieu Ruthven
- Clinical Physics, Barts Health NHS Trust, St Bartholomew's Hospital, London, United Kingdom
| | - Andreia C Freitas
- William Harvey Research Institute, Queen Mary University of London, Charterhouse Square, London, United Kingdom.,ISR-Lisboa/LARSyS and Department of Bioengineering, Instituto Superior Técnico - Universidade de Lisboa, Lisbon, Portugal
| | - Redha Boubertakh
- Clinical Physics, Barts Health NHS Trust, St Bartholomew's Hospital, London, United Kingdom.,William Harvey Research Institute, Queen Mary University of London, Charterhouse Square, London, United Kingdom
| | - Marc E Miquel
- Clinical Physics, Barts Health NHS Trust, St Bartholomew's Hospital, London, United Kingdom.,William Harvey Research Institute, Queen Mary University of London, Charterhouse Square, London, United Kingdom
|
27
|
Dabbaghchian S, Arnela M, Engwall O, Guasch O. Reconstruction of vocal tract geometries from biomechanical simulations. INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN BIOMEDICAL ENGINEERING 2019; 35:e3159. [PMID: 30242981 PMCID: PMC6587943 DOI: 10.1002/cnm.3159] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 02/17/2018] [Revised: 09/10/2018] [Accepted: 09/17/2018] [Indexed: 06/08/2023]
Abstract
Medical imaging techniques are usually utilized to acquire the vocal tract geometry in 3D, which may then be used, e.g., for acoustic/fluid simulation. As an alternative, such a geometry may also be acquired from a biomechanical simulation, which allows the anatomy and/or articulation to be altered to study a variety of configurations. In a biomechanical model, each physical structure is described by its geometry and its properties (such as mass, stiffness, and muscles). In such a model, the vocal tract itself does not have an explicit representation, since it is a cavity rather than a physical structure. Instead, its geometry is defined implicitly by all the structures surrounding the cavity, and such an implicit representation may not be suitable for visualization or for acoustic/fluid simulation. In this work, we propose a method to reconstruct the vocal tract geometry at each time step during the biomechanical simulation. Complexity of the problem, which arises from model alignment artifacts, is addressed by the proposed method. In addition to the main cavity, other small cavities, including the piriform fossa, the sublingual cavity, and the interdental space, can be reconstructed. These cavities may appear or disappear depending on the position of the larynx, the mandible, and the tongue. To illustrate our method, various static and temporal geometries of the vocal tract are reconstructed and visualized. As a proof of concept, the reconstructed geometries of three cardinal vowels are further used in an acoustic simulation, and the corresponding transfer functions are derived.
Affiliation(s)
- Saeed Dabbaghchian
- Department of Speech, Music, and Hearing, KTH Royal Institute of Technology, Stockholm, Sweden
| | - Marc Arnela
- GTM Grup de recerca en Tecnologies Mèdia, La Salle, Universitat Ramon Llull, Barcelona, Spain
| | - Olov Engwall
- Department of Speech, Music, and Hearing, KTH Royal Institute of Technology, Stockholm, Sweden
| | - Oriol Guasch
- GTM Grup de recerca en Tecnologies Mèdia, La Salle, Universitat Ramon Llull, Barcelona, Spain
|
28
|
Kim YC. Fast upper airway magnetic resonance imaging for assessment of speech production and sleep apnea. PRECISION AND FUTURE MEDICINE 2018. [DOI: 10.23838/pfm.2018.00100] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 02/02/2023] Open
|
29
|
Lim Y, Zhu Y, Lingala SG, Byrd D, Narayanan S, Nayak KS. 3D dynamic MRI of the vocal tract during natural speech. Magn Reson Med 2018; 81:1511-1520. [PMID: 30390319 DOI: 10.1002/mrm.27570] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 06/18/2018] [Revised: 09/25/2018] [Accepted: 09/26/2018] [Indexed: 12/19/2022]
Abstract
PURPOSE To develop and evaluate a technique for 3D dynamic MRI of the full vocal tract at high temporal resolution during natural speech. METHODS We demonstrate 2.4 × 2.4 × 5.8 mm³ spatial resolution, 61-ms temporal resolution, and a 200 × 200 × 70 mm³ FOV. The proposed method uses 3D gradient-echo imaging with a custom upper-airway coil, a minimum-phase slab excitation, stack-of-spirals readout, pseudo golden-angle view order in kx-ky, linear Cartesian order along kz, and spatiotemporal finite difference constrained reconstruction, with 13-fold acceleration. This technique is evaluated using in vivo vocal tract airway data from 2 healthy subjects acquired on a 1.5T scanner, 1 with synchronized audio, with 2 tasks during production of natural speech, and via comparison with interleaved multislice 2D dynamic MRI. RESULTS This technique captured known dynamics of vocal tract articulators during natural speech tasks including tongue gestures during the production of consonants "s" and "l" and of consonant-vowel syllables, and was additionally consistent with 2D dynamic MRI. Coordination of lingual (tongue) movements for consonants is demonstrated via volume-of-interest analysis. Vocal tract area function dynamics revealed critical lingual constriction events along the length of the vocal tract for consonants and vowels. CONCLUSION We demonstrate feasibility of 3D dynamic MRI of the full vocal tract, with spatiotemporal resolution adequate to visualize lingual movements for consonants and vocal tract shaping during natural productions of consonant-vowel syllables, without requiring multiple repetitions.
Affiliation(s)
- Yongwan Lim
- Ming Hsieh Department of Electrical Engineering, Viterbi School of Engineering, University of Southern California, Los Angeles, California
| | - Yinghua Zhu
- Ming Hsieh Department of Electrical Engineering, Viterbi School of Engineering, University of Southern California, Los Angeles, California
| | - Sajan Goud Lingala
- Department of Biomedical Engineering, College of Engineering, University of Iowa, Iowa City, Iowa
| | - Dani Byrd
- Department of Linguistics, Dornsife College of Letters, Arts and Sciences, University of Southern California, Los Angeles, California
| | - Shrikanth Narayanan
- Ming Hsieh Department of Electrical Engineering, Viterbi School of Engineering, University of Southern California, Los Angeles, California
| | - Krishna Shrinivas Nayak
- Ming Hsieh Department of Electrical Engineering, Viterbi School of Engineering, University of Southern California, Los Angeles, California
|
30
|
Ramanarayanan V, Tilsen S, Proctor M, Töger J, Goldstein L, Nayak KS, Narayanan S. Analysis of speech production real-time MRI. COMPUT SPEECH LANG 2018. [DOI: 10.1016/j.csl.2018.04.002] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Indexed: 12/27/2022]
|
31
|
Perry JL, Mason K, Sutton BP, Kuehn DP. Can Dynamic MRI Be Used to Accurately Identify Velopharyngeal Closure Patterns? Cleft Palate Craniofac J 2017; 55:499-507. [PMID: 29554453 DOI: 10.1177/1055665617735998] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Indexed: 11/16/2022] Open
Abstract
BACKGROUND Dynamic magnetic resonance imaging (MRI) has been proposed as a non-invasive, child-friendly, reproducible, and repeatable imaging method providing a 3-dimensional view of the velopharyngeal structures and function during speech. However, the value of dynamic MRI as compared to imaging methods such as nasopharyngoscopy is not well understood. The aim of this study was to compare the ability of nasopharyngoscopy and dynamic MRI to accurately identify velopharyngeal closure patterns among adults without cleft palate. METHODS Participants included 34 healthy adults with normal anatomy between 19 and 33 years of age (mean = 23 years; SD = 4.1 years). Participants underwent dynamic MRI and nasopharyngoscopy studies and comparisons were performed to determine the intra- and inter-rater reliability for accurately determining closure pattern. The MRI acquisition was a dynamic acquisition of a 2D plane. RESULTS Strong inter- (κ = .824; P < .001) and intra-rater (Rater 1: κ = 0.879, P < .001, 94% agreement between ratings; Rater 2 with 100% agreement) agreement was observed for the identification of closure pattern using nasopharyngoscopy. Inter-rater agreement for ratings using MRI demonstrated moderate agreement (κ = .489; P < .004). Examining point agreement revealed only 27 of the 33 ratings of MRI showed agreement (80%). CONCLUSION This demonstrates that inter-rater reliability for determining closure patterns from nasopharyngoscopy is good; however, ratings using MRI were less reliable at determining closure patterns. It is likely that future improvements in dynamic imaging with MRI to enable 3D visualizations are needed for improved diagnostic accuracy for assessing velopharyngeal closure patterns.
Affiliation(s)
- Jamie L Perry
- Department of Communication Sciences and Disorders, East Carolina University, Greenville, NC, USA
| | - Kazlin Mason
- Department of Communication Sciences and Disorders, East Carolina University, Greenville, NC, USA
| | - Bradley P Sutton
- Department of Bioengineering, Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Champaign, IL, USA
| | - David P Kuehn
- Department of Speech and Hearing Science, University of Illinois at Urbana-Champaign, Champaign, IL, USA
|
32
|
Perry JL, Kollara L, Kuehn DP, Sutton BP, Fang X. Examining age, sex, and race characteristics of velopharyngeal structures in 4- to 9-year old children using magnetic resonance imaging. Cleft Palate Craniofac J 2017; 55:21-34. [PMID: 33948051 DOI: 10.1177/1055665617718549] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Indexed: 11/15/2022] Open
Abstract
Objective The purpose of this study was to quantify the growth of the various craniofacial and velopharyngeal structures and examine sex and race effects. Methods Eighty-five healthy children (53 White and 32 Black) with normal velopharyngeal anatomy between 4 and 9 years of age who met the inclusion criteria and successfully completed the MRI scans were included in the study. Results Developmental normative mean values for selected craniometric and velopharyngeal variables by race and sex are reported. Cranial variables (face height, nasion to sella, sella to basion, palate height, palate width) and velopharyngeal variables (levator muscle length, angle of origin, sagittal angle, velar length, velar thickness, velar knee to posterior pharyngeal wall, and posterior nasal spine to levator muscle) demonstrated a trend toward a decrease in angle measures and increase in linear measures as age increased (with the exception of PNS to levator muscle). Only hard palate width and levator muscle length showed a significant sex effect. However, two cranial and six velopharyngeal variables showed a significant race effect. The interactions between sex, race, and age were not statistically significant across all variables, with the exception of posterior nasal spine to posterior pharyngeal wall. Conclusion Findings established a large age and race-specific normative reference for craniometric and velopharyngeal variables. Data reveal minimal sexual dimorphism in the variables used in the present study; however, significant racial effects were observed.
Affiliation(s)
- Jamie L Perry
- Department of Communication Sciences and Disorders, East Carolina University, Greenville, NC, USA
| | - Lakshmi Kollara
- Department of Communication Sciences and Disorders, East Carolina University, Greenville, NC, USA
| | - David P Kuehn
- Department of Speech and Hearing Science, University of Illinois at Urbana-Champaign, Champaign, IL, USA
| | - Bradley P Sutton
- Department of Bioengineering, University of Illinois at Urbana-Champaign, Urbana, IL, USA
| | - Xiangming Fang
- Department of Biostatistics, East Carolina University, Greenville, NC, USA
|
33
|
Woo J, Xing F, Stone M, Green J, Reese TG, Brady TJ, Wedeen VJ, Prince JL, El Fakhri G. Speech Map: A Statistical Multimodal Atlas of 4D Tongue Motion During Speech from Tagged and Cine MR Images. COMPUTER METHODS IN BIOMECHANICS AND BIOMEDICAL ENGINEERING-IMAGING AND VISUALIZATION 2017; 7:361-373. [PMID: 31328049 DOI: 10.1080/21681163.2017.1382393] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Indexed: 01/13/2023]
Abstract
Quantitative measurement of functional and anatomical traits of 4D tongue motion in the course of speech or other lingual behaviors remains a major challenge in scientific research and clinical applications. Here, we introduce a statistical multimodal atlas of 4D tongue motion using healthy subjects, which enables a combined quantitative characterization of tongue motion in a reference anatomical configuration. This atlas framework, termed Speech Map, combines cine- and tagged-MRI in order to provide both the anatomic reference and motion information during speech. Our approach involves a series of steps including (1) construction of a common reference anatomical configuration from cine-MRI, (2) motion estimation from tagged-MRI, (3) transformation of the motion estimations to the reference anatomical configuration, and (4) computation of motion quantities such as Lagrangian strain. Using this framework, the anatomic configuration of the tongue appears motionless, while the motion fields and associated strain measurements change over the time course of speech. In addition, to form a succinct representation of the high-dimensional and complex motion fields, principal component analysis is carried out to characterize the central tendencies and variations of motion fields of our speech tasks. Our proposed method provides a platform to quantitatively and objectively explain the differences and variability of tongue motion by illuminating internal motion and strain that have so far been intractable. The findings are used to understand how tongue function for speech is limited by abnormal internal motion and strain in glossectomy patients.
Affiliation(s)
- Jonghye Woo
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02114, USA
| | - Fangxu Xing
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02114, USA
| | - Maureen Stone
- Department of Neural and Pain Sciences, University of Maryland Dental School, Baltimore, MD 21201, USA
| | - Jordan Green
- Department of Communication Sciences and Disorders, MGH Institute of Health Professions, Boston, MA 02129, USA
| | - Timothy G Reese
- Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02129, USA
| | - Thomas J Brady
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02114, USA
| | - Van J Wedeen
- Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02129, USA
| | - Jerry L Prince
- Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
| | - Georges El Fakhri
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02114, USA
|
34
|
Sinko K, Gruber M, Jagsch R, Roesner I, Baumann A, Wutzl A, Denk-Linnert DM. Assessment of nasalance and nasality in patients with a repaired cleft palate. Eur Arch Otorhinolaryngol 2017; 274:2845-2854. [PMID: 28299425 PMCID: PMC5486565 DOI: 10.1007/s00405-017-4506-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 11/25/2016] [Accepted: 02/10/2017] [Indexed: 01/27/2023]
Abstract
In patients with a repaired cleft palate, nasality is typically diagnosed by speech language pathologists. In addition, there are various instruments to objectively diagnose nasalance. To explore the potential of nasalance measurements after cleft palate repair by NasalView®, we correlated perceptual nasality and instrumentally measured nasalance of eight speech items and determined the relationship between sensitivity and specificity of the nasalance measures by receiver-operating characteristics (ROC) analyses and AUC (area under the curve) computation for each single test item and specific item groups. We recruited patients with a primarily repaired cleft palate receiving speech therapy during follow-up. During a single day visit, perceptive and instrumental assessments were obtained in 36 patients and analyzed. The individual perceptual nasality was assigned to one of four categories; the corresponding instrumental nasalance measures for the eight specific speech items were expressed on a metric scale (1-100). With reference to the perceptual diagnoses, we observed 3 nasal and one oral test item with high sensitivity. However, the specificity of the nasality indicating measures was rather low. The four best speech items with the highest sensitivity provided scores ranging from 96.43 to 100%, while the averaged sensitivity of all eight items was below 90%. We conclude that perceptive evaluation of nasality remains state of the art. For clinical follow-up, instrumental nasalance assessment can objectively document subtle changes by analysis of four speech items only. Further studies are warranted to determine the applicability of instrumental nasalance measures in the clinical routine, using discriminative items only.
Affiliation(s)
- Klaus Sinko
- Department of Cranio-, Maxillofacial and Oral Surgery, Medical University, Waehringer Guertel 18-20, 1090, Vienna, Austria.
| | - Maike Gruber
- Department of Cranio-, Maxillofacial and Oral Surgery, Medical University, Waehringer Guertel 18-20, 1090, Vienna, Austria
| | - Reinhold Jagsch
- Faculty of Psychology, Institute of Clinical Psychology, University of Vienna, Vienna, Austria
| | - Imme Roesner
- Division of Phonatrics-Logopedics, Department of Otorhinolaryngology, Medical University, Vienna, Austria
| | - Arnulf Baumann
- Department of Cranio-, Maxillofacial and Oral Surgery, Medical University, Waehringer Guertel 18-20, 1090, Vienna, Austria
| | - Arno Wutzl
- Department of Cranio-, Maxillofacial and Oral Surgery, Medical University, Waehringer Guertel 18-20, 1090, Vienna, Austria
| | - Doris-Maria Denk-Linnert
- Division of Phonatrics-Logopedics, Department of Otorhinolaryngology, Medical University, Vienna, Austria
|
35
|
Töger J, Sorensen T, Somandepalli K, Toutios A, Lingala SG, Narayanan S, Nayak K. Test-retest repeatability of human speech biomarkers from static and real-time dynamic magnetic resonance imaging. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2017; 141:3323. [PMID: 28599561 PMCID: PMC5436977 DOI: 10.1121/1.4983081] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 05/06/2023]
Abstract
Static anatomical and real-time dynamic magnetic resonance imaging (RT-MRI) of the upper airway is a valuable method for studying speech production in research and clinical settings. The test-retest repeatability of quantitative imaging biomarkers is an important parameter, since it limits the effect sizes and intragroup differences that can be studied. Therefore, this study aims to present a framework for determining the test-retest repeatability of quantitative speech biomarkers from static MRI and RT-MRI, and apply the framework to healthy volunteers. Subjects (n = 8, 4 females, 4 males) are imaged in two scans on the same day, including static images and dynamic RT-MRI of speech tasks. The inter-study agreement is quantified using intraclass correlation coefficient (ICC) and mean within-subject standard deviation (σe). Inter-study agreement is strong to very strong for static measures (ICC: min/median/max 0.71/0.89/0.98, σe: 0.90/2.20/6.72 mm), poor to strong for dynamic RT-MRI measures of articulator motion range (ICC: 0.26/0.75/0.90, σe: 1.6/2.5/3.6 mm), and poor to very strong for velocities (ICC: 0.21/0.56/0.93, σe: 2.2/4.4/16.7 cm/s). In conclusion, this study characterizes repeatability of static and dynamic MRI-derived speech biomarkers using state-of-the-art imaging. The introduced framework can be used to guide future development of speech biomarkers. Test-retest MRI data are provided free for research use.
Affiliation(s)
- Johannes Töger
  - Ming Hsieh Department of Electrical Engineering, University of Southern California, 3740 McClintock Avenue, EEB 400, Los Angeles, California 90089-2560, USA
- Tanner Sorensen
  - Ming Hsieh Department of Electrical Engineering, University of Southern California, 3740 McClintock Avenue, EEB 400, Los Angeles, California 90089-2560, USA
- Krishna Somandepalli
  - Ming Hsieh Department of Electrical Engineering, University of Southern California, 3740 McClintock Avenue, EEB 400, Los Angeles, California 90089-2560, USA
- Asterios Toutios
  - Ming Hsieh Department of Electrical Engineering, University of Southern California, 3740 McClintock Avenue, EEB 400, Los Angeles, California 90089-2560, USA
- Sajan Goud Lingala
  - Ming Hsieh Department of Electrical Engineering, University of Southern California, 3740 McClintock Avenue, EEB 400, Los Angeles, California 90089-2560, USA
- Shrikanth Narayanan
  - Ming Hsieh Department of Electrical Engineering, University of Southern California, 3740 McClintock Avenue, EEB 400, Los Angeles, California 90089-2560, USA
- Krishna Nayak
  - Ming Hsieh Department of Electrical Engineering, University of Southern California, 3740 McClintock Avenue, EEB 400, Los Angeles, California 90089-2560, USA
36
Lingala SG, Zhu Y, Lim Y, Toutios A, Ji Y, Lo WC, Seiberlich N, Narayanan S, Nayak KS. Feasibility of through-time spiral generalized autocalibrating partial parallel acquisition for low latency accelerated real-time MRI of speech. Magn Reson Med 2017; 78:2275-2282. [PMID: 28185301 DOI: 10.1002/mrm.26611] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 08/01/2016] [Revised: 12/08/2016] [Accepted: 12/27/2016] [Indexed: 12/11/2022]
Abstract
PURPOSE: To evaluate the feasibility of through-time spiral generalized autocalibrating partial parallel acquisition (GRAPPA) for low-latency accelerated real-time MRI of speech. METHODS: Through-time spiral GRAPPA (spiral GRAPPA), a fast linear reconstruction method, is applied to spiral (k-t) data acquired from an eight-channel custom upper-airway coil. Fully sampled data were retrospectively down-sampled to evaluate spiral GRAPPA at undersampling factors R = 2 to 6. Pseudo-golden-angle spiral acquisitions were used for prospective studies. Three subjects were imaged while performing a range of speech tasks that involved rapid articulator movements, including fluent speech and beat-boxing. Spiral GRAPPA was compared with view sharing and with a parallel imaging and compressed sensing (PI-CS) method. RESULTS: Spiral GRAPPA captured spatiotemporal dynamics of vocal tract articulators at undersampling factors ≤4. Spiral GRAPPA at 18 ms/frame and 2.4 mm²/pixel outperformed view sharing in depicting rapidly moving articulators. Spiral GRAPPA and PI-CS provided equivalent temporal fidelity. Reconstruction latency per frame was 14 ms for view sharing and 116 ms for spiral GRAPPA, using a single processor. Spiral GRAPPA kept up with the MRI data rate of 18 ms/frame with eight processors. PI-CS required 17 minutes to reconstruct 5 seconds of dynamic data. CONCLUSION: Spiral GRAPPA enabled 4-fold accelerated real-time MRI of speech with a low reconstruction latency. This approach is applicable to a wide range of speech RT-MRI experiments that benefit from real-time feedback while visualizing rapid articulator movement. Magn Reson Med 78:2275-2282, 2017. © 2017 International Society for Magnetic Resonance in Medicine.
Affiliation(s)
- Sajan Goud Lingala
  - Ming Hsieh Department of Electrical Engineering, University of Southern California, Los Angeles, California, USA
- Yinghua Zhu
  - Ming Hsieh Department of Electrical Engineering, University of Southern California, Los Angeles, California, USA
- Yongwan Lim
  - Ming Hsieh Department of Electrical Engineering, University of Southern California, Los Angeles, California, USA
- Asterios Toutios
  - Ming Hsieh Department of Electrical Engineering, University of Southern California, Los Angeles, California, USA
- Yunhua Ji
  - Department of Biomedical Engineering, University of Southern California, Los Angeles, California, USA
- Wei-Ching Lo
  - Biomedical Engineering, Case Western Reserve University, Cleveland, Ohio, USA
- Nicole Seiberlich
  - Biomedical Engineering, Case Western Reserve University, Cleveland, Ohio, USA
- Shrikanth Narayanan
  - Ming Hsieh Department of Electrical Engineering, University of Southern California, Los Angeles, California, USA
- Krishna S Nayak
  - Ming Hsieh Department of Electrical Engineering, University of Southern California, Los Angeles, California, USA
37
Burdumy M, Traser L, Burk F, Richter B, Echternach M, Korvink JG, Hennig J, Zaitsev M. One-second MRI of a three-dimensional vocal tract to measure dynamic articulator modifications. J Magn Reson Imaging 2016; 46:94-101. [PMID: 27943448 DOI: 10.1002/jmri.25561] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 08/26/2016] [Accepted: 11/08/2016] [Indexed: 11/10/2022] Open
Abstract
PURPOSE: To enable three-dimensional (3D) vocal tract imaging of dynamic singing or speech tasks at voxel sizes of 1.6 × 1.6 × 1.3 mm³ at 1.3 s per image. MATERIALS AND METHODS: A Stack-of-Stars method was implemented and enhanced to allow for fast and efficient k-space sampling of the box-shaped vocal tract using a 3 Tesla MRI system. Images were reconstructed off-line using compressed sensing, leading to the abovementioned spatial and temporal resolutions. To validate spatial resolution, a phantom with holes of defined sizes was measured. The applicability of the imaging method was validated in an eight-subject study of amateur singers who were required to sustain phonation at a constant pitch, past their comfortable expiratory level. A segmentation of the vocal tract over all phonation time steps was done for one subject. Anatomical distances (larynx position and pharynx width) were calculated and compared for all subjects. RESULTS: Analysis of the phantom study revealed that the imaging method could provide at least 1.6 mm isotropic resolution. Visual inspection of the segmented vocal tract during phonation showed modifications of the lips, tongue, and larynx position in all three dimensions. The mean larynx position per subject amounted to 52-85 mm, deviating up to 5% over phonation time. The pharynx width parameter was 32-181 mm² on average per subject, deviating up to 16% over phonation time. Visual inspection of the parameter course revealed no common compensation strategy for long sustained phonation. CONCLUSION: The results of both phantom and in vivo measurements show the applicability of the fast 3D imaging method for voice research and indicate that modifications in all three dimensions can be observed and quantified. LEVEL OF EVIDENCE: 2. Technical Efficacy: Stage 1. J. Magn. Reson. Imaging 2017;46:94-101.
Affiliation(s)
- Michael Burdumy
  - University Medical Center Freiburg, Department of Radiology, Medical Physics, Freiburg, Germany
  - University Medical Center Freiburg, Institute of Musicians' Medicine, Freiburg, Germany
- Louisa Traser
  - University Medical Center Freiburg, Institute of Musicians' Medicine, Freiburg, Germany
  - Department of Oto-Rhino-Laryngology, Head and Neck Surgery, University Medical Center, Freiburg, Germany
- Fabian Burk
  - University Medical Center Freiburg, Institute of Musicians' Medicine, Freiburg, Germany
- Bernhard Richter
  - University Medical Center Freiburg, Institute of Musicians' Medicine, Freiburg, Germany
- Matthias Echternach
  - University Medical Center Freiburg, Institute of Musicians' Medicine, Freiburg, Germany
- Jan G Korvink
  - Institute of Microstructure Technology, Karlsruhe Institute of Technology, Karlsruhe, Germany
- Jürgen Hennig
  - University Medical Center Freiburg, Department of Radiology, Medical Physics, Freiburg, Germany
- Maxim Zaitsev
  - University Medical Center Freiburg, Department of Radiology, Medical Physics, Freiburg, Germany
38
Perry JL, Kuehn DP, Sutton BP, Fang X. Velopharyngeal Structural and Functional Assessment of Speech in Young Children Using Dynamic Magnetic Resonance Imaging. Cleft Palate Craniofac J 2016; 54:408-422. [PMID: 27031268 DOI: 10.1597/15-120] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Indexed: 11/22/2022] Open
Abstract
OBJECTIVE: The purpose of this study was to demonstrate a novel method for examining the velopharyngeal mechanism using static and dynamic magnetic resonance imaging (MRI) during sentence-level production in young children with normal anatomy. This study examined whether velopharyngeal events occurring in the midsagittal plane are correlated with muscle events occurring along the plane of velopharyngeal closure. Adenoid involvement in velopharyngeal function was also explored. METHODS: A high-resolution, T2-weighted turbo-spin-echo three-dimensional anatomical scan was used to acquire static velopharyngeal data, and a fast-gradient echo fast low angle shot multishot spiral technique (15.8 frames per second) was used to acquire dynamic data on 11 children between 4 and 9 years old. RESULTS: Changes in velar knee height from rest to the bilabial /p/ production were strongly correlated with changes in the velar configuration (r = 0.680, P = .021) and levator muscle contraction (r = 0.703, P = .016). Velar configuration was highly correlated with levator muscle changes (r = 0.685, P = .020). Mean alpha angle during bilabial /p/ production was 176°, which demonstrated that subjects achieve velopharyngeal closure at or just below the palatal plane. Subjects with a larger adenoid pad used significantly less (r = -0.660, P = .027) levator muscle contraction compared with individuals with smaller adenoids. CONCLUSIONS: This study demonstrates a potentially useful technique in dynamic MRI that does not rely on cyclic repetitions or sustained phonation. This study lends support to the clinical potential of dynamic MRI methods for cleft palate management.