1. Chow HM, Ma YK, Tseng CH. Social and communicative not a prerequisite: Preverbal infants learn an abstract rule only from congruent audiovisual dynamic pitch-height patterns. J Exp Child Psychol 2024;248:106046. PMID: 39241321. DOI: 10.1016/j.jecp.2024.106046.
Abstract
Learning in the everyday environment often requires the flexible integration of relevant multisensory information. Previous research has demonstrated preverbal infants' capacity to extract an abstract rule from audiovisual temporal sequences matched in temporal synchrony. Interestingly, this capacity was recently reported to be modulated by crossmodal correspondence beyond spatiotemporal matching (e.g., consistent facial emotional expressions or articulatory mouth movements matched with sound). To investigate whether such modulatory influence extends to non-social and non-communicative stimuli, we conducted a critical test using audiovisual stimuli free of social information: visually upward (or downward) moving objects paired with a tone of congruent (ascending) or incongruent (descending) pitch. East Asian infants (8-10 months old) from a metropolitan area in Asia demonstrated successful abstract rule learning in the congruent audiovisual condition and weaker learning in the incongruent condition. This implies that preverbal infants use crossmodal dynamic pitch-height correspondence to integrate multisensory information before rule extraction. It also confirms that preverbal infants are ready to use non-social, non-communicative information to serve cognitive functions such as rule extraction in a multisensory context.
Affiliation(s)
- Hiu Mei Chow
- Department of Psychology, St. Thomas University, Fredericton, New Brunswick E3B 5G3, Canada
- Yuen Ki Ma
- Department of Psychology, The University of Hong Kong, Pokfulam, Hong Kong
- Chia-Huei Tseng
- Research Institute of Electrical Communication, Tohoku University, Sendai, Miyagi 980-0812, Japan
2. Lee MS, Lee GE, Lee SH, Lee JH. Emotional responses of Korean and Chinese women to Hangul phonemes according to the gender of an artificial intelligence voice. Front Psychol 2024;15:1357975. PMID: 39135868. PMCID: PMC11317464. DOI: 10.3389/fpsyg.2024.1357975.
Abstract
Introduction: This study aimed to explore the arousal and valence that people experience in response to Hangul phonemes based on the gender of an AI speaker's voice, through a comparison of Korean and Chinese cultures.
Methods: To achieve this, 42 Hangul phonemes, combining three Korean vowels and 14 Korean consonants, were used to explore cultural differences in arousal, valence, and the six foundational emotions based on the gender of an AI speaker. A total of 136 Korean and Chinese women were recruited and randomly assigned to one of two voice-gender conditions (male or female).
Results and discussion: This study revealed significant differences in arousal levels between Korean and Chinese women when exposed to male voices. Specifically, Chinese women exhibited clear differences in emotional perceptions of male and female voices in response to voiced consonants. These results confirm that arousal and valence may differ with articulation type and vowel due to cultural differences, and that voice gender can affect perceived emotions. This principle can serve as evidence for sound symbolism and has practical implications for voice gender and branding in AI applications.
Affiliation(s)
- Min-Sun Lee
- Department of Psychology, Chung-Ang University, Seoul, Republic of Korea
- Gi-Eun Lee
- Institute of Cultural Diversity Content, Chung-Ang University, Seoul, Republic of Korea
- San Ho Lee
- Department of European Language and Cultures, Chung-Ang University, Seoul, Republic of Korea
- Jang-Han Lee
- Department of Psychology, Chung-Ang University, Seoul, Republic of Korea
3. Passi A, Arun SP. The Bouba-Kiki effect is predicted by sound properties but not speech properties. Atten Percept Psychophys 2024;86:976-990. PMID: 36525201. PMCID: PMC7615921. DOI: 10.3758/s13414-022-02619-8.
Abstract
Humans robustly associate spiky shapes with words like "Kiki" and round shapes with words like "Bouba." According to a popular explanation, this is because the mouth assumes an angular shape while speaking "Kiki" and a rounded shape for "Bouba." Alternatively, this effect could reflect more general associations between shape and sound that are not specific to mouth shape or the articulatory properties of speech. These possibilities can be distinguished using unpronounceable sounds: the mouth-shape hypothesis predicts no Bouba-Kiki effect for these sounds, whereas the generic shape-sound hypothesis predicts a systematic effect. Here, we show that the Bouba-Kiki effect is present for a variety of unpronounceable sounds, ranging from reversed words and real object sounds (n = 45 participants) to pure tones (n = 28). The effect was strongly correlated with the mean frequency of a sound across both spoken and reversed words. The effect was not systematically predicted by subjective ratings of pronounceability or by mouth aspect ratios measured from video. Thus, the Bouba-Kiki effect is explained by simple shape-sound associations rather than by speech properties.
Affiliation(s)
- Ananya Passi
- Undergraduate Programme, Indian Institute of Science, Bengaluru, India
- S P Arun
- Centre for Neuroscience, Indian Institute of Science, Bengaluru, India
4. Sidhu DM, Athanasopoulou A, Archer SL, Czarnecki N, Curtin S, Pexman PM. The maluma/takete effect is late: No longitudinal evidence for shape sound symbolism in the first year. PLoS One 2023;18:e0287831. PMID: 37943758. PMCID: PMC10635456. DOI: 10.1371/journal.pone.0287831.
Abstract
The maluma/takete effect refers to an association between certain language sounds (e.g., /m/ and /o/) and round shapes, and other language sounds (e.g., /t/ and /i/) and spiky shapes. This is an example of sound symbolism and stands in opposition to the arbitrariness of language. It is still unknown when sensitivity to sound symbolism emerges. In the present series of studies, we first confirmed that the classic maluma/takete effect would be observed in adults using our novel 3-D object stimuli (Experiments 1a and 1b). We then conducted the first longitudinal test of the maluma/takete effect, testing infants at 4, 8, and 12 months of age (Experiment 2). Sensitivity to sound symbolism was measured with a looking-time preference task, in which infants were shown images of a round and a spiky 3-D object while hearing either a round- or spiky-sounding nonword. We did not detect a significant difference in looking time based on nonword type. We also collected a series of individual difference measures, including measures of vocabulary, movement ability, and babbling. Analyses of these measures revealed that 12-month-olds who babbled more showed greater sensitivity to sound symbolism. Finally, in Experiment 3, parents took home round or spiky 3-D printed objects to present to 7- to 8-month-old infants, paired with either congruent or incongruent nonwords. This language experience had no effect on subsequent measures of sound symbolism sensitivity. Taken together, these studies demonstrate that sound symbolism is elusive in the first year and shed light on the mechanisms that may contribute to its eventual emergence.
Affiliation(s)
- David M. Sidhu
- Department of Psychology, Carleton University, Ottawa, Canada
- Angeliki Athanasopoulou
- School of Languages, Linguistics, Literatures, and Cultures, University of Calgary, Calgary, Canada
- Suzanne Curtin
- Department of Child and Youth Studies, Brock University, St. Catharines, Canada
- Penny M. Pexman
- Department of Psychology, University of Calgary, Calgary, Canada
5. Shang N, Styles SJ. Implicit Association Test (IAT) Studies Investigating Pitch-Shape Audiovisual Cross-modal Associations Across Language Groups. Cogn Sci 2023;47:e13221. PMID: 36607162. PMCID: PMC10078355. DOI: 10.1111/cogs.13221.
Abstract
Previous studies have shown that Chinese speakers and non-Chinese speakers exhibit different patterns of cross-modal congruence for the lexical tones of Mandarin Chinese, depending on which features of the pitch they attend to. But is this pattern of language-specific listening a conscious cultural strategy or an automatic processing effect? If automatic, does it also apply when the same pitch contours no longer sound like speech? Implicit Association Tests (IATs) provide an indirect measure of cross-modal association. In a series of IAT studies conducted with participants from three language backgrounds (Chinese-dominant bilinguals, balanced Chinese bilinguals, and English speakers with no Chinese experience), we found language-specific congruence effects for Mandarin lexical tones but not for matched sine-wave stimuli. That is, for linguistic stimuli, non-Chinese speakers showed advantages for pitch-height congruence (high-pointy, low-curvy), whereas no congruence effects were found for Chinese speakers. For non-linguistic stimuli, all participant groups showed advantages for pitch-height congruence. The present findings suggest that non-lexical-tone congruence (high-pointy, low-curvy) is a basic congruence pattern, and that acquiring a language with lexical tone can alter this perception.
Affiliation(s)
- Nan Shang
- School of Foreign Studies, Northwestern Polytechnical University
- Suzy J Styles
- Psychology, School of Social Sciences, Nanyang Technological University; Centre for Research and Development on Learning, Nanyang Technological University
6. Wong LS, Kwon J, Zheng Z, Styles SJ, Sakamoto M, Kitada R. Japanese Sound-Symbolic Words for Representing the Hardness of an Object Are Judged Similarly by Japanese and English Speakers. Front Psychol 2022;13:830306. PMID: 35369145. PMCID: PMC8965287. DOI: 10.3389/fpsyg.2022.830306.
Abstract
Contrary to the assumption of arbitrariness in modern linguistics, sound symbolism, a non-arbitrary relationship between sounds and meanings, does exist. Sound symbolism, including the “Bouba–Kiki” effect, implies the universality of such relationships: individuals from different cultural and linguistic backgrounds can relate sound-symbolic words to referents in similar ways, although the extent of these similarities remains to be fully understood. Here, we examined whether subjects from different countries could similarly infer surface texture properties from words that sound-symbolically represent hardness in Japanese. We prepared Japanese sound-symbolic words whose novelty was manipulated by a genetic algorithm (GA). Japanese speakers in Japan and English speakers in both Singapore and the United States rated these words on surface texture properties (hardness, warmness, and roughness) as well as familiarity. The results show that hardness-related words were rated as harder and rougher than softness-related words, regardless of novelty and country. Multivariate analyses of the ratings classified the hardness-related words along the hardness-softness dimension at over 80% accuracy, regardless of country. Multiple regression analyses revealed that the number of the speech sounds /g/ and /k/ predicted the ratings of surface texture properties in the non-Japanese countries, suggesting a systematic relationship between the phonetic features of a word and the perceptual quality it represents across culturally and linguistically diverse samples.
Affiliation(s)
- Li Shan Wong
- Division of Psychology, School of Social Sciences, Nanyang Technological University, Singapore, Singapore
- Jinhwan Kwon
- Faculty of Education, Kyoto University of Education, Kyoto, Japan
- Zane Zheng
- Department of Psychology, Lasell University, Newton, MA, United States
- Suzy J Styles
- Division of Psychology, School of Social Sciences, Nanyang Technological University, Singapore, Singapore
- Maki Sakamoto
- Department of Informatics, Graduate School of Informatics and Engineering, The University of Electro-Communications, Chofu, Japan
- Ryo Kitada
- Division of Psychology, School of Social Sciences, Nanyang Technological University, Singapore, Singapore; Graduate School of Intercultural Studies, Kobe University, Kobe, Japan
7. Shen YC, Chen YC, Huang PC. Seeing Sounds: The Role of Vowels and Consonants in Crossmodal Correspondences. Iperception 2022;13:20416695221084724. PMID: 35321530. PMCID: PMC8935407. DOI: 10.1177/20416695221084724.
Abstract
Crossmodal correspondences refer to the tendency for certain feature dimensions in different sensory modalities to be associated with each other. Here, we investigated the crossmodal correspondences between speech sounds and visual shapes. Specifically, we tested whether the classification dimensions of English vowels (front–central–back) and consonants (voiced–voiceless, sonorant–obstruent, and stop–continuant) correspond to visual shapes along a bipolar rounded–angular dimension. We adapted eighteen meaningless pseudowords, corresponding to either the round or the sharp concept, from a previous study. On each trial, participants heard one of the pseudowords and saw a rounded shape and an angular shape presented side by side on the monitor; they judged which shape provided a better match to the spoken pseudoword. A logistic regression was conducted to assess how well the classification dimensions of phonemes predicted variations in the sound–shape matchings. The results demonstrated that the sound–shape matchings were predictable from the front–central–back dimension of vowels and the voiced–voiceless and stop–continuant dimensions of consonants. Hence, sound–shape matching is underpinned by contrasting dimensions in both vowels and consonants, demonstrating crossmodal correspondences at the phonetic level.
Affiliation(s)
- Yang-Chen Shen
- Department of Psychology, National Cheng Kung University, Tainan
- Yi-Chuan Chen
- Department of Medicine, Mackay Medical College, New Taipei City
- Pi-Chun Huang
- Department of Psychology, National Cheng Kung University, Tainan
8.
Abstract
The present study aimed to investigate whether or not the so-called "bouba-kiki" effect is mediated by speech-specific representations. Sine-wave versions of naturally produced pseudowords were used as auditory stimuli in an implicit association task (IAT) and an explicit cross-modal matching (CMM) task to examine cross-modal shape-sound correspondences. A group of participants trained to hear the sine-wave stimuli as speech was compared to a group that heard them as non-speech sounds. Sound-shape correspondence effects were observed in both groups and tasks, indicating that speech-specific processing is not fundamental to the "bouba-kiki" phenomenon. Effects were similar across groups in the IAT, while in the CMM task the speech-mode group showed a stronger effect compared with the non-speech group. This indicates that, while both tasks reflect auditory-visual associations, only the CMM task is additionally sensitive to associations involving speech-specific representations.
9. Liew K, Lindborg P, Rodrigues R, Styles SJ. Cross-Modal Perception of Noise-in-Music: Audiences Generate Spiky Shapes in Response to Auditory Roughness in a Novel Electroacoustic Concert Setting. Front Psychol 2018;9:178. PMID: 29515494. PMCID: PMC5826189. DOI: 10.3389/fpsyg.2018.00178.
Abstract
Noise has become integral to electroacoustic music aesthetics. In this paper, we define noise as sound that is high in auditory roughness and examine its effect on cross-modal mapping between sound and visual shape. To preserve the ecological validity of contemporary music aesthetics, we developed Rama, a novel interface for presenting experimentally controlled blocks of electronically generated sounds that varied systematically in roughness, and for actively collecting data from audience interaction. These sounds were embedded as musical drones within the overall sound design of a multimedia performance with live musicians. Audience members listened to these sounds and collectively voted to create the shape of a visual graphic, presented as part of the audio-visual performance. The results from the concert setting were replicated in a controlled laboratory environment to corroborate the findings. Results show a consistent effect of auditory roughness on shape design, with rougher sounds corresponding to spikier shapes. We discuss the implications and evaluate the audience interface.
Affiliation(s)
- Kongmeng Liew
- School of Art, Design and Media, Nanyang Technological University, Singapore, Singapore
- Graduate School of Human and Environmental Studies, Kyoto University, Kyoto, Japan
- PerMagnus Lindborg
- School of Art, Design and Media, Nanyang Technological University, Singapore, Singapore
- Soundislands, Singapore, Singapore
- Ruth Rodrigues
- Raffles Arts Institute, Raffles Institution, Singapore, Singapore
- Suzy J. Styles
- Division of Psychology, School of Social Sciences, Nanyang Technological University, Singapore, Singapore