1
Understanding why infant-directed speech supports learning: A dynamic attention perspective. DEVELOPMENTAL REVIEW 2022. [DOI: 10.1016/j.dr.2022.101047]
2
Bi H, Zare S, Kania U, Yan R. A systematic review of studies on connected speech processing: Trends, key findings, and implications. Front Psychol 2022; 13:1056827. [DOI: 10.3389/fpsyg.2022.1056827] [Received: 09/29/2022] [Accepted: 11/07/2022]
Abstract
Connected speech processing (CSP) is of great significance to individuals’ language and cognitive development. It is crucial not only for the clinical detection and treatment of developmental disorders but also for foreign/second language instruction. However, despite the importance of this field, there is a clear lack of systematic reviews that summarize the key findings of previous studies. To this end, by searching the scientific databases PsycInfo, Scopus, PubMed, ERIC, Taylor and Francis, and Web of Science, the present study identified 128 core CSP articles with high reference value according to PRISMA guidance. Quantitative analysis and qualitative comparative synthesis yielded the following results: (1) the number of studies on CSP published per year showed an upward trend; however, most focused on English, whereas studies on other languages were comparatively rare; (2) CSP was found to be affected by multiple factors, among which speech speed, semantics, word frequency, and phonological awareness were most frequently investigated; (3) a deficit in CSP capacity was widely recognized as a significant predictor and indicator of developmental disorders; (4) more studies were carried out on connected speech production than on perception; and (5) almost no longitudinal studies have been conducted among either native or non-native speakers. Therefore, future research is needed to explore the developmental trajectory of CSP skills in typically developing language learners and in speakers with cognitive disorders over different periods of time. It is also necessary to deepen the understanding of the processing mechanism beyond their performance and of the role played by phonological awareness and lexical representations in CSP.
3
Muñetón-Ayala M, De Vega M, Ochoa-Gómez JF, Beltrán D. The Brain Dynamics of Syllable Duration and Semantic Predictability in Spanish. Brain Sci 2022; 12:brainsci12040458. [PMID: 35447989 PMCID: PMC9030985 DOI: 10.3390/brainsci12040458] [Received: 02/02/2022] [Revised: 03/02/2022] [Accepted: 03/04/2022]
Abstract
This study examines the neural dynamics underlying the prosodic (duration) and semantic dimensions of Spanish sentence perception. Specifically, we investigated whether adult listeners are aware of changes in the duration of a pretonic syllable of words that were either semantically predictable or unpredictable from the preceding sentential context. Participants listened to the sentences with instructions to make prosodic or semantic judgments while their EEG was recorded. For both accuracy and RTs, the results revealed an interaction between duration and semantics. ERP analysis revealed an interaction among task, duration, and semantics, showing that both processes share neural resources. There was an enhanced negativity for semantic processing (N400) and an extended positivity associated with anomalous duration. Source estimation for the N400 component revealed activations in the frontal gyrus for the semantic contrast and in the parietal postcentral gyrus for the duration contrast in the metric task, while activation in the sub-lobar insula was observed for the semantic task. The source of the late positive components was located in the posterior cingulate. Hence, the ERP data support the idea that semantic and prosodic levels are processed by similar neural networks, and that the two linguistic dimensions influence each other during the decision-making stage in the metric and semantic judgment tasks.
Affiliation(s)
- Mercedes Muñetón-Ayala
  - Programa de Filología Hispánica, Facultad de Comunicaciones y Filología, Universidad de Antioquia, Calle 70 N° 52-21, Medellín 050010, Colombia
- Manuel De Vega
  - Instituto Universitario de Neurociencia, Universidad de la Laguna, 38200 Tenerife, Spain
- John Fredy Ochoa-Gómez
  - Programa de Bioingeniería, Facultad de Ingeniería, Universidad de Antioquia, Medellín 050010, Colombia
  - Laboratorio de Neurofisiología, GRUNECO-GNA, Universidad de Antioquia, Medellín 050010, Colombia
- David Beltrán
  - Instituto Universitario de Neurociencia, Universidad de la Laguna, 38200 Tenerife, Spain
  - Departamento de Psicología Básica, Universidad Nacional de Educación a Distancia, 28040 Madrid, Spain
4
Honbolygó F, Kóbor A, Hermann P, Kettinger ÁO, Vidnyánszky Z, Kovács G, Csépe V. Expectations about word stress modulate neural activity in speech-sensitive cortical areas. Neuropsychologia 2020; 143:107467. [PMID: 32305299 DOI: 10.1016/j.neuropsychologia.2020.107467] [Received: 08/02/2019] [Revised: 03/06/2020] [Accepted: 04/12/2020]
Abstract
A recent dual-stream model of language processing proposed that the postero-dorsal stream performs predictive sequential processing of linguistic information via hierarchically organized internal models. However, it remains unexplored whether the prosodic segmentation of linguistic information involves predictive processes. Here, we addressed this question by investigating the processing of word stress, a major component of speech segmentation, using probabilistic repetition suppression (RS) modulation as a marker of predictive processing. In an event-related acoustic fMRI RS paradigm, we presented pairs of pseudowords having the same (Rep) or different (Alt) stress patterns, in blocks with varying Rep and Alt trial probabilities. We found that the BOLD signal was significantly lower for Rep than for Alt trials, indicating RS in the posterior and middle superior temporal gyrus (STG) bilaterally, and in the anterior STG in the left hemisphere. Importantly, the magnitude of RS was modulated by repetition probability in the posterior and middle STG. These results reveal the predictive processing of word stress in the STG areas and raise the possibility that word stress processing is related to the dorsal "where" auditory stream.
Affiliation(s)
- Ferenc Honbolygó
  - Brain Imaging Centre, Research Centre for Natural Sciences, Budapest, Hungary; Institute of Psychology, Eötvös Loránd University, Budapest, Hungary
- Andrea Kóbor
  - Brain Imaging Centre, Research Centre for Natural Sciences, Budapest, Hungary
- Petra Hermann
  - Brain Imaging Centre, Research Centre for Natural Sciences, Budapest, Hungary
- Ádám Ottó Kettinger
  - Brain Imaging Centre, Research Centre for Natural Sciences, Budapest, Hungary; Department of Nuclear Techniques, Budapest University of Technology and Economics, Budapest, Hungary
- Zoltán Vidnyánszky
  - Brain Imaging Centre, Research Centre for Natural Sciences, Budapest, Hungary
- Gyula Kovács
  - Brain Imaging Centre, Research Centre for Natural Sciences, Budapest, Hungary; Department of Biological Psychology and Cognitive Neuroscience, Institute of Psychology, Friedrich Schiller University Jena, Jena, Germany
- Valéria Csépe
  - Brain Imaging Centre, Research Centre for Natural Sciences, Budapest, Hungary; Faculty of Modern Philology and Social Sciences, University of Pannonia, Veszprém, Hungary
5
Kimball AE, Yiu LK, Watson DG. Word Recall is Affected by Surrounding Metrical Context. LANGUAGE, COGNITION AND NEUROSCIENCE 2019; 35:383-392. [PMID: 33015217 PMCID: PMC7531771 DOI: 10.1080/23273798.2019.1665190] [Received: 01/31/2018] [Accepted: 08/25/2019]
Abstract
It has been claimed that English has a metrical structure, or rhythm, in which stressed and unstressed syllables alternate. In previous research, regular, alternating patterns have been shown to facilitate online language comprehension. Extending these findings to downstream processing would lead to the prediction that metrical regularity enhances memory. Research from the memory literature, however, indicates that regular patterns are less salient and therefore less well remembered, and also that strings of similar sounds are harder to remember. This work suggests that, like lists of words with similar sounds, lists of words with similar metrical patterns are less accurately remembered than comparable metrically irregular patterns. The present study tests these conflicting predictions by examining the effects of metrical regularity in a recall task. We find that words are better recalled when they do not match their metrical context, suggesting that a regular metrical structure may not be beneficial in all contexts.
Affiliation(s)
- Loretta K Yiu
  - Department of Human Centered Design and Engineering, University of Washington
- Duane G Watson
  - Department of Psychology and Human Development, Vanderbilt University
6
Calandruccio L, Wasiuk PA, Buss E, Leibold LJ, Kong J, Holmes A, Oleson J. The effect of target/masker fundamental frequency contour similarity on masked-speech recognition. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2019; 146:1065. [PMID: 31472562 PMCID: PMC6690832 DOI: 10.1121/1.5121314] [Received: 03/18/2019] [Revised: 07/19/2019] [Accepted: 07/23/2019]
Abstract
Greater informational masking is observed when the target and masker speech are more perceptually similar. Fundamental frequency (f0) contour, or the dynamic movement of f0, is thought to provide cues for segregating target speech presented in a speech masker. Most of the data demonstrating this effect have been collected using digitally modified stimuli. Less work has been done exploring the role of f0 contour for speech-in-speech recognition when all of the stimuli have been produced naturally. The goal of this project was to explore the importance of target and masker f0 contour similarity by manipulating the speaking style of talkers producing the target and masker speech streams. Sentence recognition thresholds were evaluated for target and masker speech that was produced with flat, normal, or exaggerated speaking styles; performance was also measured in speech-spectrum-shaped noise and for conditions in which the stimuli were processed through an ideal binary mask. Results confirmed that similarities in f0 contour depth elevated speech-in-speech recognition thresholds; however, when the target and masker had similar contour depths, targets with normal f0 contours were more resistant to masking than targets with flat or exaggerated contours. Differences in energetic masking across stimuli cannot account for these results.
Affiliation(s)
- Lauren Calandruccio
  - Department of Psychological Sciences, Case Western Reserve University, Cleveland, Ohio 44106, USA
- Peter A Wasiuk
  - Department of Psychological Sciences, Case Western Reserve University, Cleveland, Ohio 44106, USA
- Emily Buss
  - Department of Otolaryngology/Head and Neck Surgery, University of North Carolina, Chapel Hill, North Carolina 27599, USA
- Lori J Leibold
  - Boys Town National Research Hospital, Omaha, Nebraska 68131, USA
- Jessica Kong
  - Department of Psychological Sciences, Case Western Reserve University, Cleveland, Ohio 44106, USA
- Ann Holmes
  - Department of Psychological Sciences, Case Western Reserve University, Cleveland, Ohio 44106, USA
- Jacob Oleson
  - Department of Biostatistics, University of Iowa, Iowa City, Iowa 52246, USA
7
Räsänen O, Kakouros S, Soderstrom M. Is infant-directed speech interesting because it is surprising? - Linking properties of IDS to statistical learning and attention at the prosodic level. Cognition 2018; 178:193-206. [PMID: 29885600 DOI: 10.1016/j.cognition.2018.05.015] [Received: 10/26/2017] [Revised: 05/15/2018] [Accepted: 05/21/2018]
Abstract
The exaggerated intonation and special rhythmic properties of infant-directed speech (IDS) have been hypothesized to attract infants' attention to the speech stream. However, there has been little work actually connecting the properties of IDS to models of attentional processing or perceptual learning. A number of such attention models suggest that surprising or novel perceptual inputs attract attention, where novelty can be operationalized as the statistical (un)predictability of the stimulus in the given context. Since prosodic patterns such as F0 contours are accessible to young infants, who are also known to be adept statistical learners, the present paper investigates the hypothesis that F0 contours in IDS are less predictable than those in adult-directed speech (ADS), given previous exposure to both speaking styles, thereby potentially tapping into basic attentional mechanisms of the listeners, much as the relative probabilities of other linguistic patterns are known to modulate attentional processing in infants and adults. Computational modeling analyses with naturalistic IDS and ADS from matched speakers and contexts show that IDS intonation has lower overall temporal predictability even when the F0 contours of both speaking styles are normalized to have equal means and variances. A closer analysis reveals that IDS intonation tends to be less predictable at the end of short utterances, whereas ADS exhibits more stable average predictability patterns across the full extent of the utterances. The difference between IDS and ADS persists even when the proportion of IDS and ADS exposure is varied substantially, simulating different relative amounts of IDS heard in different family and cultural environments. Exposure to IDS is also found to be more efficient for predicting ADS intonation contours in new utterances than exposure to an equal amount of ADS. This indicates that the more variable prosodic contours of IDS also generalize to ADS, and may therefore enhance prosodic learning in infancy. Overall, the study suggests that one reason behind infant preference for IDS could be its higher information value at the prosodic level, as measured by the amount of surprisal in the F0 contours. This provides the first formal link between the properties of IDS and the models of attentional processing and statistical learning in the brain. However, this finding does not rule out the possibility that other differences between IDS and ADS also play a role.
Affiliation(s)
- Okko Räsänen
  - Dept. Signal Processing and Acoustics, Aalto University, P.O. Box 12200, 00076 AALTO, Finland
- Sofoklis Kakouros
  - Dept. Signal Processing and Acoustics, Aalto University, P.O. Box 12200, 00076 AALTO, Finland
- Melanie Soderstrom
  - Department of Psychology, University of Manitoba, P404 Duff Roblin Building, Winnipeg, MB R3T 2N2, Canada
8
Kakouros S, Salminen N, Räsänen O. Making predictable unpredictable with style - Behavioral and electrophysiological evidence for the critical role of prosodic expectations in the perception of prominence in speech. Neuropsychologia 2018; 109:181-199. [PMID: 29247667 DOI: 10.1016/j.neuropsychologia.2017.12.011] [Received: 02/22/2017] [Revised: 12/04/2017] [Accepted: 12/05/2017]
Abstract
Perceptual prominence of linguistic units such as words has previously been connected to the concepts of predictability and attentional orientation. One hypothesis is that low-probability prosodic or lexical content is perceived as prominent due to the surprisal and high information value associated with the stimulus. However, the existing behavioral studies have used stimulus manipulations that follow or violate typical linguistic patterns present in the listeners' native language, i.e., assuming that the listeners have already established a model for acceptable prosodic patterns in the language. In the present study, we investigated whether prosodic expectations and the resulting subjective impression of prominence are affected by brief statistical adaptation to suprasegmental acoustic features in speech, including cases where the prosodic patterns do not necessarily follow language-typical marking for prominence. We first exposed listeners to five minutes of speech with uneven distributions of falling and rising fundamental frequency (F0) trajectories on sentence-final words, and then tested their judgments of prominence on a set of new utterances. The results show that the probability of the F0 trajectory affects the perception of prominence, with a less frequent F0 trajectory making a word more prominent independently of the absolute direction of F0 change. In the second part of the study, we conducted EEG measurements on a new set of subjects listening to similar utterances with predominantly rising or falling F0 on sentence-final words. Analysis of the resulting event-related potentials (ERP) reveals a significant difference in N200 and N400 ERP-component amplitudes between standard and deviant prosody, again independently of the F0 direction and the underlying lexical content. Since the N400 has previously been associated with semantic processing of stimuli, this suggests that listeners implicitly track probabilities at the suprasegmental level and that the predictability of a prosodic pattern during a word has an impact on the semantic processing of the word. Overall, the study suggests that prosodic markers for prominence are at least partially driven by the statistical structure of recently perceived speech, and therefore prominence perception could be based on statistical learning mechanisms similar to those observed in early word learning, but in this case operating at the level of suprasegmental acoustic features.
Affiliation(s)
- Sofoklis Kakouros
  - Department of Signal Processing and Acoustics, Aalto University, P.O. Box 12200, FI-00076, Finland
- Nelli Salminen
  - Department of Signal Processing and Acoustics, Aalto University, P.O. Box 12200, FI-00076, Finland; Aalto Behavioral Laboratory, Aalto Neuroimaging, Aalto University, FI-00076, Finland
- Okko Räsänen
  - Department of Signal Processing and Acoustics, Aalto University, P.O. Box 12200, FI-00076, Finland
9
Falk S, Kello CT. Hierarchical organization in the temporal structure of infant-direct speech and song. Cognition 2017; 163:80-86. [PMID: 28292666 DOI: 10.1016/j.cognition.2017.02.017] [Received: 05/03/2016] [Revised: 02/01/2017] [Accepted: 02/28/2017]
Abstract
Caregivers alter the temporal structure of their utterances when talking and singing to infants compared with adult communication. The present study tested whether temporal variability in infant-directed registers serves to emphasize the hierarchical temporal structure of speech. Fifteen German-speaking mothers sang a play song and told a story to their 6-month-old infants, or to an adult. Recordings were analyzed using a recently developed method that determines the degree of nested clustering of temporal events in speech. Events were defined as peaks in the amplitude envelope, and clusters of various sizes related to periods of acoustic speech energy at varying timescales. Infant-directed speech and song clearly showed greater event clustering compared with adult-directed registers, at multiple timescales of hundreds of milliseconds to tens of seconds. We discuss the relation of this newly discovered acoustic property to temporal variability in linguistic units and its potential implications for parent-infant communication and infants learning the hierarchical structures of speech and language.
Affiliation(s)
- Simone Falk
  - Ludwig-Maximilians-University, Munich, Germany; Laboratoire Parole et Langage, UMR 7309, CNRS / Aix-Marseille University, Aix-en-Provence, France; Laboratoire Phonétique et Phonologie, UMR 7018, CNRS / Université Sorbonne Nouvelle Paris-3, Paris, France