Alderete J. Simon Fraser University Speech Error Database (SFUSED) Cantonese: Methods, design, and usage. Front Psychol 2024;15:1270433. [PMID: 38333059 PMCID: PMC10851145 DOI: 10.3389/fpsyg.2024.1270433]
[Received: 07/31/2023] [Accepted: 01/08/2024] [Indexed: 02/10/2024]
Abstract
The Simon Fraser University Speech Error Database (SFUSED) is a multi-purpose database of speech errors based on audio recordings. The motivation for SFUSED Cantonese, a component of this database, is to create a linguistically rich data set for exploring language production processes in Cantonese, an under-studied language. We describe in detail the methods used to collect, analyze, and explore the database, including details of team workflows, time budgets, data quality, and explicit linguistic and processing assumptions. In addition to showing how to use the database, this account supports future research by providing a template for investigating additional under-studied languages, and it offers a fresh perspective on the benefits and drawbacks of collecting speech error data from spontaneous speech. All of the data and supporting materials are available as open access data sets.