1
Khanzhyn D, van Heuven WJB, Rataj K. The impact of spatial and verbal working memory load on semantic relatedness judgements. Psychon Bull Rev 2024; 31:781-789. [PMID: 37723334 PMCID: PMC11061018 DOI: 10.3758/s13423-023-02323-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 06/12/2023] [Indexed: 09/20/2023]
Abstract
Studies using a relatedness judgement task have found differences between prime-target word pairs that vary in the degree of semantic relatedness. However, the influence of working memory load on semantic processing in this task and the role of the type of working memory task have not yet been investigated. The present study therefore investigated for the first time the effect of working memory load (low vs. high) and working memory type (verbal vs. spatial) on semantic relatedness judgements. Semantically strongly related (e.g., hip - KNEE), weakly related (e.g., muscle - KNEE) and unrelated (e.g., office - KNEE) Polish word pairs were presented in an experiment involving a dual working memory and semantic relatedness task. The data revealed that, relative to semantically unrelated word pairs, responses were faster for strongly related pairs but slower for weakly related pairs. Importantly, the verbal working memory task decreased facilitation for strongly related pairs and increased inhibition for weakly related pairs relative to the spatial working memory task. Furthermore, working memory load impacted only weakly related pairs in the verbal but not in the spatial working memory task. These results show that working memory type and load influence semantic relatedness judgements, but the direction and size of the impact depend on the strength of semantic relations.
Affiliation(s)
- Dmytro Khanzhyn
  - Faculty of English, Adam Mickiewicz University, Poznań, Poland
- Karolina Rataj
  - Faculty of English, Adam Mickiewicz University, Poznań, Poland
2
Gatti D, Marelli M, Rinaldi L. Predicting Hand Movements With Distributional Semantics: Evidence From Mouse-Tracking. Cogn Sci 2024; 48:e13399. [PMID: 38196167 DOI: 10.1111/cogs.13399] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2022] [Revised: 12/12/2023] [Accepted: 12/19/2023] [Indexed: 01/11/2024]
Abstract
Although mouse-tracking has been taken as a real-time window on different aspects of human decision-making processes, whether purely semantic information affects response conflict at the level of motor output as measured through mouse movements is still unknown. Here, across two experiments, we investigated the effects of semantic knowledge by predicting participants' performance in a standard keyboard task and in a mouse-tracking task through distributional semantics, a usage-based modeling approach to meaning. In Experiment 1, participants were shown word pairs and were required to perform a two-alternative forced choice task selecting either the more abstract or the more concrete word, using standard keyboard presses. In Experiment 2, participants performed the same task, yet this time response selection was achieved by moving the computer mouse. Results showed that the involvement of semantic components in the task at hand is observable using both standard reaction times (Experiment 1) as well as using indexes extracted from mouse trajectories (Experiment 2). In particular, mouse trajectories reflected the response conflict and its temporal evolution, with a larger deviation for increasing word semantic relatedness. These findings support the validity of mouse-tracking as a method to detect deep and implicit decision-making features. Additionally, by demonstrating that a usage-based model of meaning can account for the different degrees of cognitive conflict associated with task achievement, these findings testify the impact of the human semantic memory on decision-making processes.
Affiliation(s)
- Daniele Gatti
  - Department of Brain and Behavioral Sciences, University of Pavia
- Marco Marelli
  - Department of Psychology, University of Milano-Bicocca
  - NeuroMI, Milan Center for Neuroscience
- Luca Rinaldi
  - Department of Brain and Behavioral Sciences, University of Pavia
  - Cognitive Psychology Unit, IRCCS Mondino Foundation
3
Gastmann F, Poarch GJ. Cross-language activation during word recognition in child second-language learners and the role of executive function. J Exp Child Psychol 2022; 221:105443. [PMID: 35623309 DOI: 10.1016/j.jecp.2022.105443] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/08/2021] [Revised: 03/18/2022] [Accepted: 03/30/2022] [Indexed: 10/18/2022]
Abstract
We investigated lexical retrieval processes in 4- to 6-year-old German-English bilinguals by exploring cross-language activation during second-language (L2) word recognition of cognates and noncognates in semantically related and unrelated contexts in young learners of English. Both button presses (reaction times and accuracies) and eye-tracking data (percentage looks to target) yielded a significant cognate facilitation effect, indicating that the children's performance was boosted by cognate words. Nonetheless, the degree of phonological overlap of cognates did not modulate their performance. Moreover, a semantic interference effect was found in the children's eye movement data. However, in these young L2 learners, cognate status exerted a comparatively stronger impact on L2 word recognition than semantic relatedness. Finally, correlational analyses on the cognate and noncognate performance and the children's executive function yielded a significant positive correlation between noncognate performance and their inhibitory control, suggesting that noncognate processing depended to a greater extent on inhibitory control than cognate processing.
Affiliation(s)
- Freya Gastmann
  - Department of Language, Literature and Culture, TU Dortmund University, 44227 Dortmund, Germany
- Gregory J Poarch
  - Department of English Language & Culture, University of Groningen, 9712 EK Groningen, The Netherlands
4
Pützer A, Wolf OT. Odours as context cues of emotional memories - The role of semantic relatedness. Acta Psychol (Amst) 2021; 219:103377. [PMID: 34293594 DOI: 10.1016/j.actpsy.2021.103377] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2021] [Revised: 07/08/2021] [Accepted: 07/14/2021] [Indexed: 11/26/2022]
Abstract
Odours constitute effective context cues, facilitating memory retrieval. Identifying factors which modulate the effectiveness of olfactory context cues can advance the understanding of processes underlying this effect. We hypothesized that the interplay of subjective stress and semantic relatedness between the odour and the learning material would modulate the effectiveness of an olfactory context cue. We further explored the effect of the odorant Hedione, which is a ligand for a putative human pheromone receptor (VN1R1). To this end, 120 participants watched a video of a stressful episode in which visual objects were present, that were either manipulated in the video (central objects) or not (peripheral objects). Participants rated their subjective stress afterwards. After 24 h, recognition and spatial memory of the objects in the video were tested. Ambient during encoding and recall was an odour related to the episode, an unrelated odour, Hedione or no odour. As a result, we observed a narrowing of recognition memory with increased subjective stress elicited by the video - but only if a semantically related odour was ambient. Moreover, higher subjective stress predicted enhanced spatial memory in the no odour condition, but not in presence of a semantically related or unrelated odour. When exposed to Hedione, higher subjective stress predicted impaired recognition and spatial memory of peripheral objects. Our findings stress the importance of considering semantic relatedness between the olfactory context and the encoded episode when applying odours as context cues for emotional or stressful memories.
5
Vega-Mendoza M, Pickering MJ, Nieuwland MS. Concurrent use of animacy and event-knowledge during comprehension: Evidence from event-related potentials. Neuropsychologia 2021; 152:107724. [PMID: 33347913 DOI: 10.1016/j.neuropsychologia.2020.107724] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/26/2019] [Revised: 10/02/2020] [Accepted: 12/08/2020] [Indexed: 10/22/2022]
Abstract
In two ERP experiments, we investigated whether readers prioritize animacy over real-world event-knowledge during sentence comprehension. We used the paradigm of Paczynski and Kuperberg (2012), who argued that animacy is prioritized based on the observations that the 'related anomaly effect' (reduced N400s for context-related anomalous words compared to unrelated words) does not occur for animacy violations, and that animacy violations but not relatedness violations elicit P600 effects. Participants read passive sentences with plausible agents (e.g., The prescription for the mental disorder was written by the psychiatrist) or implausible agents that varied in animacy and semantic relatedness (schizophrenic/guard/pill/fence). In Experiment 1 (with a plausibility judgment task), plausible sentences elicited smaller N400s relative to all types of implausible sentences. Crucially, animate words elicited smaller N400s than inanimate words, and related words elicited smaller N400s than unrelated words, but Bayesian analysis revealed substantial evidence against an interaction between animacy and relatedness. Moreover, at the P600 time-window, we observed more positive ERPs for animate than inanimate words and for related than unrelated words at anterior regions. In Experiment 2 (without judgment task), we observed an N400 effect with animacy violations, but no other effects. Taken together, the results of our experiments fail to support a prioritized role of animacy information over real-world event-knowledge, but they support an interactive, constraint-based view on incremental semantic processing.
Affiliation(s)
- Mariana Vega-Mendoza
  - Department of Psychology, School of Philosophy, Psychology and Language Sciences, University of Edinburgh, Edinburgh, Scotland, UK; Department of Business Administration, Technology and Social Sciences. Engineering Psychology, Luleå University of Technology, Luleå, Sweden
- Martin J Pickering
  - Department of Psychology, School of Philosophy, Psychology and Language Sciences, University of Edinburgh, Edinburgh, Scotland, UK
- Mante S Nieuwland
  - Department of Psychology, School of Philosophy, Psychology and Language Sciences, University of Edinburgh, Edinburgh, Scotland, UK; Max Planck Institute for Psycholinguistics, Nijmegen, the Netherlands; Donders Institute for Cognition, Brain and Behaviour, Nijmegen, the Netherlands; Heinrich-Heine-University, Düsseldorf, Germany
6
Zarubin VC, Phillips TK, Robertson E, Bolton Swafford PG, Bunge T, Aguillard D, Martsberger C, Mickley Steinmetz KR. Contributions of Arousal, Attention, Distinctiveness, and Semantic Relatedness to Enhanced Emotional Memory: An Event-Related Potential and Electrocardiogram Study. Affect Sci 2020; 1:172-185. [PMID: 36043208 PMCID: PMC9382952 DOI: 10.1007/s42761-020-00012-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 12/09/2019] [Accepted: 07/17/2020] [Indexed: 06/15/2023]
Abstract
Enhanced emotional memory (EEM) describes memory benefits for emotional items, traditionally attributed to impacts of arousal at encoding; however, attention, semantic relatedness, and distinctiveness likely also contribute in various ways. The current study manipulated arousal, semantic relatedness, and distinctiveness while recording changes in event-related potentials and heart rate during memory encoding. Trials were classified as remembered or forgotten by immediate recall performance. Negative images were remembered significantly better than neutral, and related neutral images were remembered significantly better than unrelated neutral images. Higher P300 and late positive potential (LPP) amplitudes were associated with memory for negative images as compared with related neutral images, suggesting that negative images received additional attentional processing at encoding, and that this cannot be accounted for only by the inherent relatedness of negative stimuli. No encoding benefits were found for related neutral images though they were better remembered than unrelated neutral images, indicating retrieval dynamics impacted memory. When image types were intermixed, greater heart rate changes occurred, and negative and unrelated neutral images received increased elaborative processing as compared with related neutral images, perhaps due to the prioritization of encoding resources. These results suggest encoding and retrieval processes contribute to EEM, with emotional items benefiting additively.
Affiliation(s)
- Vanessa C. Zarubin
  - University of California at Davis, Center for Neuroscience, Sacramento, CA USA
- Taylor Bunge
  - Department of Psychology, Wofford College, Spartanburg, SC USA
7
Jiang MYC, Jong MSY, Tse CS, Chai CS. Examining the Effect of Semantic Relatedness on the Acquisition of English Collocations. J Psycholinguist Res 2020; 49:199-222. [PMID: 31768805 DOI: 10.1007/s10936-019-09680-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
This study examines whether semantic relatedness facilitates or impedes the acquisition of English collocations by conducting two experiments respectively on Chinese undergraduates. Each experiment was composed of a reading session, a productive test, and a receptive test. Experiment 1 began with the reading session of 28 paired-up words and their collocations (in sentence context). Those words were counterbalanced between two randomly selected groups by cross-matching on semantic relatedness. Results of the productive test revealed that the participants scored significantly higher on test items that were semantically related than the randomly cross-paired counterparts. However, for the receptive test, the participants performed significantly better on semantically unrelated items. Experiment 2 was similar to Experiment 1 except that the word pairs selected were only semantically related and did not have any shared morphemes. Experiment 2 also revealed consistent results. The results of the two experiments consistently illustrate that semantic relatedness may exert a facilitatory effect on language output but an inhibitory effect on the process of language input.
Affiliation(s)
- Michael Yi-Chao Jiang
  - Department of Curriculum and Instruction, The Chinese University of Hong Kong, Hong Kong S.A.R., China
- Morris Siu-Yung Jong
  - Department of Curriculum and Instruction, The Chinese University of Hong Kong, Hong Kong S.A.R., China
- Chi-Shing Tse
  - Department of Educational Psychology, The Chinese University of Hong Kong, Hong Kong S.A.R., China
- Ching-Sing Chai
  - Department of Curriculum and Instruction, The Chinese University of Hong Kong, Hong Kong S.A.R., China
8
Heo GE, Xie Q, Song M, Lee JH. Combining entity co-occurrence with specialized word embeddings to measure entity relation in Alzheimer's disease. BMC Med Inform Decis Mak 2019; 19:240. [PMID: 31801521 PMCID: PMC6894106 DOI: 10.1186/s12911-019-0934-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 11/11/2022]
Abstract
Background Extracting useful information from biomedical literature plays an important role in the development of modern medicine. In natural language processing, there have been rigorous attempts to find meaningful relationships between entities automatically by co-occurrence-based methods. It has been increasingly important to understand whether relationships exist, and if so how strong, between any two entities extracted from a large number of texts. One of the defining methods is to measure semantic similarity and relatedness between two entities. Methods We propose a hybrid ranking method that combines a co-occurrence approach considering both direct and indirect entity pair relationship with specialized word embeddings for measuring the relatedness of two entities. Results We evaluate the proposed ranking method comparatively with other well-known methods such as co-occurrence, Word2Vec, COALS (Correlated Occurrence Analog to Lexical Semantics), and random indexing by calculating top-ranked entities related to Alzheimer’s disease. In addition, we analyze gene, pathway, and gene–phenotype relationships. Overall, the proposed method tends to find more hidden relationships than the other methods. Conclusion Our proposed method is able to select more useful related entities that not only highly co-occur but also have more indirect relations for the target entity. In pathway analysis, our proposed method shows superior performance at identifying (functional) cross clustering and higher-level pathways. Our proposed method, resulting from phenotype analysis, has an advantage in identifying the common genotype relating to phenotypes from biological literature.
Affiliation(s)
- Go Eun Heo
  - Department of Library and Information Science, Yonsei University, 50 Yonsei-ro Seodaemun-gu, Seoul, 03722, Republic of Korea
- Qing Xie
  - Department of Library and Information Science, Yonsei University, 50 Yonsei-ro Seodaemun-gu, Seoul, 03722, Republic of Korea
- Min Song
  - Department of Library and Information Science, Yonsei University, 50 Yonsei-ro Seodaemun-gu, Seoul, 03722, Republic of Korea
- Jeong-Hoon Lee
  - Department of Creative IT Engineering, POSTECH, 77 Cheongam-ro Nam-gu, Pohang, Gyeongbuk, 37673, Republic of Korea
9
Abstract
BACKGROUND Literature Based Discovery (LBD) produces more potential hypotheses than can be manually reviewed, making automatically ranking these hypotheses critical. In this paper, we introduce the indirect association measures of Linking Term Association (LTA), Minimum Weight Association (MWA), and Shared B to C Set Association (SBC), and compare them to Linking Set Association (LSA), concept embeddings vector cosine, Linking Term Count (LTC), and direct co-occurrence vector cosine. Our proposed indirect association measures extend traditional association measures to quantify indirect rather than direct associations while preserving valuable statistical properties. RESULTS We perform a comparison between several different hypothesis ranking methods for LBD, and compare them against our proposed indirect association measures. We intrinsically evaluate each method's performance using its ability to estimate semantic relatedness on standard evaluation datasets. We extrinsically evaluate each method's ability to rank hypotheses in LBD using a time-slicing dataset based on co-occurrence information, and another time-slicing dataset based on SemRep extracted-relationships. Precision and recall curves are generated by ranking term pairs and applying a threshold at each rank. CONCLUSIONS Results differ depending on the evaluation methods and datasets, but it is unclear if this is a result of biases in the evaluation datasets or if one method is truly better than another. We conclude that LTC and SBC are the best suited methods for hypothesis ranking in LBD, but there is value in having a variety of methods to choose from.
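The rank-and-threshold evaluation mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' evaluation code: the function name `precision_recall_at_ranks`, the toy scores, and the gold-standard set are hypothetical and serve only to show how one (precision, recall) point arises at each rank cutoff.

```python
def precision_recall_at_ranks(scored_pairs, gold):
    """Return one (precision, recall) point per rank cutoff.

    scored_pairs: list of (term_pair, score) tuples (hypothetical input format).
    gold: set of term pairs treated as true discoveries (assumed to be given).
    """
    # Rank term pairs from highest to lowest score.
    ranked = sorted(scored_pairs, key=lambda item: item[1], reverse=True)
    hits = 0
    curve = []
    for k, (pair, _score) in enumerate(ranked, start=1):
        if pair in gold:
            hits += 1
        precision = hits / k                       # correct among the top k
        recall = hits / len(gold) if gold else 0.0  # share of gold pairs found
        curve.append((precision, recall))
    return curve


# Toy data, invented for illustration only.
scored = [(("a", "b"), 0.9), (("a", "c"), 0.5), (("a", "d"), 0.1)]
gold = {("a", "b"), ("a", "d")}
curve = precision_recall_at_ranks(scored, gold)
```

Plotting the recall values against the precision values in `curve` gives the kind of curve the abstract refers to; how ties in the scores should be broken is left open here.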
Affiliation(s)
- Sam Henry
  - Department of Computer Science, Virginia Commonwealth University, 601 W. Main St. Rm 435, Richmond, 23284 USA
- Bridget T. McInnes
  - Department of Computer Science, Virginia Commonwealth University, 601 W. Main St. Rm 435, Richmond, 23284 USA
10
Ack Baraly KT, Morand A, Fusca L, Davidson PSR, Hot P. Semantic relatedness and distinctive processing may inflate older adults' positive memory bias. Mem Cognit 2019; 47:1431-43. [PMID: 31254177 DOI: 10.3758/s13421-019-00943-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 11/08/2022]
Abstract
Emotional stimuli are often more semantically interrelated and relatively distinct than neutral stimuli. These factors can enhance memory for emotional stimuli in young adults, but their effects in older adults-and on the age-related positive memory bias-remain unknown. In the present article, we tested whether item relatedness and distinctiveness affect emotional memory in young adults (Exps. 1 and 2) and the positive memory bias in older adults (Exp. 2). In both experiments, participants studied positive, negative, and neutral pictures and performed free recall after 1 min and 45 min. To manipulate relatedness, the neutral pictures were either as highly interrelated as the emotional pictures ("related neutral") or lower in semantic relatedness ("unrelated neutral"). To manipulate distinctiveness, we had participants process the emotional pictures in either a relatively distinct manner (mixed condition), by studying emotional and neutral pictures at the same time, or in a nondistinctive manner (unmixed condition), by studying and recalling each picture category separately. Overall, higher semantic relatedness (i.e., related-neutral vs. unrelated-neutral pictures) increased memory in both age groups. Distinctiveness did not affect memory in young adults, but it did alter the positive memory bias in older adults. Older adults recalled more positive than negative pictures when the pictures were processed in mixed sets, but not when they were processed in unmixed sets. These findings were consistent across both test delays. This suggests that previous reports, which were often based on mixed designs in which item interrelatedness was not controlled, may have overestimated the size and/or robustness of the positivity bias in older adults.
11
Smolka E, Eulitz C. Psycholinguistic measures for German verb pairs: Semantic transparency, semantic relatedness, verb family size, and age of reading acquisition. Behav Res Methods 2018; 50:1540-62. [PMID: 29916042 DOI: 10.3758/s13428-018-1052-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 11/08/2022]
Abstract
A central issue in visual and spoken word recognition is the lexical representation of complex words-in particular, whether the lexical representation of complex words depends on semantic transparency: Is a complex verb like understand lexically represented as a whole word or via its base stand, given that its meaning is not transparent from the meanings of its parts? To study this issue, a number of stimulus characteristics are of interest that are not yet available in public databases of German. This article provides semantic association ratings, lexical paraphrases, and vector-based similarity measures for German verbs, measuring (a) the semantic transparency between 1,259 complex verbs and their bases, (b) the semantic relatedness between 1,109 verb pairs with 432 different bases, and (c) the vector-based similarity measures of 846 verb pairs. Additionally, we include the verb regularity of all verbs and two counts of verb family size for 184 base verbs, as well as estimates of age of acquisition and age of reading for 200 verbs. Together with lemma and type frequencies from public lexical databases, all measures can be downloaded along with this article. Statistical analyses indicate that verb family size, morphological complexity, frequency, and verb regularity affect the semantic transparency and relatedness ratings as well as the age of acquisition estimates, indicating that these are relevant variables in psycholinguistic experiments. Although lexical paraphrases, vector-based similarity measures, and semantic association ratings may deliver complementary information, the interrater reliability of the semantic association ratings for each verb pair provides valuable information when selecting stimuli for psycholinguistic experiments.
12
Henry S, McQuilkin A, McInnes BT. Association measures for estimating semantic similarity and relatedness between biomedical concepts. Artif Intell Med 2018; 93:1-10. [PMID: 30197305 DOI: 10.1016/j.artmed.2018.08.006] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 09/01/2017] [Revised: 03/08/2018] [Accepted: 08/24/2018] [Indexed: 12/26/2022]
Abstract
Association measures quantify how much more often a term pair is observed to co-occur than would be predicted by chance. This is based both on the terms' individual occurrence frequencies and on their mutual co-occurrence frequencies. One application of association scores is estimating semantic relatedness, which is critical for many natural language processing applications, such as clustering of biomedical and clinical documents and the development of biomedical terminologies and ontologies. In this paper we propose a method of generating association scores between biomedical concepts to estimate semantic relatedness. We use co-occurrence statistics between Unified Medical Language System (UMLS) concepts to account for lexical variation at the synonymous level, and introduce a process of concept expansion that exploits hierarchical information from the UMLS to account for lexical variation at the hyponymous level. State-of-the-art results are achieved on several standard evaluation datasets, and an in-depth analysis of hyper-parameters is presented.
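As a rough illustration of comparing observed co-occurrence with chance co-occurrence, the sketch below computes pointwise mutual information (PMI), one widely used association measure. The abstract does not state which measures the paper evaluates, so PMI is chosen here purely as an example, and the function name and counts are hypothetical.

```python
import math


def pmi(n_xy, n_x, n_y, n_total):
    """Pointwise mutual information from raw counts (illustrative only).

    n_xy: number of contexts in which both terms occur together.
    n_x, n_y: number of contexts containing each term individually.
    n_total: total number of contexts counted.
    """
    if n_xy == 0:
        return float("-inf")  # the pair never co-occurs
    p_xy = n_xy / n_total                         # observed co-occurrence probability
    p_chance = (n_x / n_total) * (n_y / n_total)  # expected under independence
    return math.log2(p_xy / p_chance)


# Invented counts: the pair co-occurs five times more often than chance predicts.
score = pmi(n_xy=50, n_x=100, n_y=100, n_total=1000)
```

A score near zero means the pair co-occurs about as often as independence would predict; larger positive scores indicate stronger association.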
Affiliation(s)
- Sam Henry
  - Virginia Commonwealth University, Richmond, VA, United States
- Alex McQuilkin
  - Virginia Commonwealth University, Richmond, VA, United States
13
Zhu Y, Yan E, Wang F. Semantic relatedness and similarity of biomedical terms: examining the effects of recency, size, and section of biomedical publications on the performance of word2vec. BMC Med Inform Decis Mak 2017; 17:95. [PMID: 28673289 PMCID: PMC5496182 DOI: 10.1186/s12911-017-0498-1] [Citation(s) in RCA: 44] [Impact Index Per Article: 6.3] [Received: 02/20/2017] [Accepted: 06/28/2017] [Indexed: 11/10/2022]
Abstract
BACKGROUND Understanding semantic relatedness and similarity between biomedical terms has a great impact on a variety of applications such as biomedical information retrieval, information extraction, and recommender systems. The objective of this study is to examine word2vec's ability in deriving semantic relatedness and similarity between biomedical terms from large publication data. Specifically, we focus on the effects of recency, size, and section of biomedical publication data on the performance of word2vec. METHODS We download abstracts of 18,777,129 articles from PubMed and 766,326 full-text articles from PubMed Central (PMC). The datasets are preprocessed and grouped into subsets by recency, size, and section. Word2vec models are trained on these subsets. Cosine similarities between biomedical terms obtained from the word2vec models are compared against reference standards. The performance of models trained on different subsets is compared to examine recency, size, and section effects. RESULTS Models trained on recent datasets did not boost the performance. Models trained on larger datasets identified more pairs of biomedical terms than models trained on smaller datasets in the relatedness task (from 368 at the 10% level to 494 at the 100% level) and the similarity task (from 374 at the 10% level to 491 at the 100% level). The model trained on abstracts produced results that have higher correlations with the reference standards than the one trained on article bodies (i.e., 0.65 vs. 0.62 in the similarity task and 0.66 vs. 0.59 in the relatedness task). However, the latter identified more pairs of biomedical terms than the former (i.e., 344 vs. 498 in the similarity task and 339 vs. 503 in the relatedness task). CONCLUSIONS Increasing the size of dataset does not always enhance the performance. Increasing the size of datasets can result in the identification of more relations of biomedical terms even though it does not guarantee better precision. As summaries of research articles, compared with article bodies, abstracts excel in accuracy but lose in coverage of identifiable relations.
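The comparison of cosine similarities against a reference standard can be sketched in plain Python as below. This is an illustration under assumptions, not the study's pipeline: the abstract does not say which correlation coefficient was used, so Spearman rank correlation is assumed here, and the vectors and ratings are invented toy values rather than word2vec output.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Toy embeddings and human ratings, invented for illustration only.
pairs = [([1.0, 0.0], [1.0, 0.1]), ([1.0, 0.0], [0.5, 0.5]), ([1.0, 0.0], [0.0, 1.0])]
model_scores = [cosine(u, v) for u, v in pairs]
human_ratings = [3.8, 2.1, 0.4]
rho = spearman(model_scores, human_ratings)
```

Term pairs missing from a model's vocabulary cannot be scored at all, which is why the abstract reports the number of identified pairs alongside the correlations.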
Affiliation(s)
- Yongjun Zhu
  - Healthcare Policy and Research, Weill Cornell Medicine, Cornell University, New York, NY, USA
- Erjia Yan
  - College of Computing and Informatics, Drexel University, Philadelphia, PA, USA
- Fei Wang
  - Healthcare Policy and Research, Weill Cornell Medicine, Cornell University, New York, NY, USA
14
Guillaume F, Baier S, Bourgeois M, Tinard S. Format change and semantic relatedness effects on the ERP correlates of recognition: old pairs, new pairs, different stories. Exp Brain Res 2016; 235:1007-1019. [PMID: 28032139 DOI: 10.1007/s00221-016-4859-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 02/19/2016] [Accepted: 12/18/2016] [Indexed: 10/20/2022]
Abstract
In this event-related potential (ERP) study, we investigated the effects of format change and semantic relatedness in a recognition task using pairs composed of a word and a line drawing. The semantic relatedness of the pairs (related: rabbit-carrot; unrelated: duck-artichoke) influenced their associative properties and corresponding distinctiveness, while format change refers to the switching of an item from the verbal form to the line drawing form between study and recognition (e.g., the word "egg" is associated with a drawing of a hen at study, and a line drawing of an egg is associated with the word "hen" at test). Study-test format change thus prevents visual matching while maintaining conceptual matching. While the N300 potential was only modulated by the semantic relatedness of the pair, both factors modulated recognition performance and corresponding ERP old/new effects with larger mid-frontal N400 old/new effect (300-500 ms) and larger parietal old/new effect (500-800 ms) in the same compared to the different-format condition, as well as for related compared to unrelated pairs. Furthermore, the semantic relatedness of correctly recognized old pairs modulated the anterior N400 while it modulated the posterior N400 for correctly rejected pairs. These results suggest that semantic relatedness and familiarity related to the amount of change between study and test present distinct ERP signatures in the N400 window. They suggest also that the distinctiveness and the ease of the retrieval of the pair could be determining for the parietal old/new effect.
Affiliation(s)
- Fabrice Guillaume
- Aix Marseille Université, Laboratoire de Psychologie Cognitive (CNRS UMR 7290), Fédération de recherche 3C (Cerveau, Cognition, Comportement), Bâtiment 9 Case D, 3 place Victor Hugo, 13003, Marseille Cedex 3, France.
- Sophia Baier
- Université de Nice Sophia Antipolis, LAPCOS (EA 7278), 3 Bd François Mitterrand, 06357, Nice, France
- Mélanie Bourgeois
- Aix Marseille Université, Laboratoire de Psychologie Cognitive (CNRS UMR 7290), Fédération de recherche 3C (Cerveau, Cognition, Comportement), Bâtiment 9 Case D, 3 place Victor Hugo, 13003, Marseille Cedex 3, France
- Sophie Tinard
- Aix Marseille Université, Laboratoire de Psychologie Cognitive (CNRS UMR 7290), Fédération de recherche 3C (Cerveau, Cognition, Comportement), Bâtiment 9 Case D, 3 place Victor Hugo, 13003, Marseille Cedex 3, France
|
15
|
Abstract
BACKGROUND Semantic relatedness is a measure that quantifies the strength of a semantic link between two concepts. Often, it can be efficiently approximated with methods that operate on words, which represent these concepts. Approximating semantic relatedness between texts and concepts represented by these texts is an important part of many text and knowledge processing tasks of crucial importance in the ever-growing domain of biomedical informatics. The problem with most state-of-the-art methods for calculating semantic relatedness is their dependence on highly specialized, structured knowledge resources, which makes these methods poorly adaptable for many usage scenarios. On the other hand, the domain knowledge in the Life Sciences has become more and more accessible, but mostly in its unstructured form - as texts in large document collections, which makes its use more challenging for automated processing. In this paper we present tESA, an extension to the well-known Explicit Semantic Analysis (ESA) method. RESULTS In our extension we use two separate sets of vectors, corresponding to different sections of the articles from the underlying corpus of documents, as opposed to the original method, which only uses a single vector space. We present an evaluation of Life Sciences domain-focused applicability of both tESA and domain-adapted Explicit Semantic Analysis. The methods are tested against a set of standard benchmarks established for the evaluation of biomedical semantic relatedness quality. Our experiments show that the proposed method achieves results comparable with or superior to the current state-of-the-art methods. Additionally, a comparative discussion of the results obtained with tESA and ESA is presented, together with a study of the adaptability of the methods to different corpora and their performance with different input parameters. CONCLUSIONS Our findings suggest that combined use of the semantics from different sections of the documents of scientific corpora (i.e. extending the original ESA methodology with the use of title vectors) may be used to enhance the performance of distributional semantic relatedness measures, which can be observed in the largest reference datasets. We also present the impact of the proposed extension on the size of distributional representations.
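To make the idea of document-based term vectors concrete, here is a minimal Python sketch, not the published tESA method. It represents a term by its raw counts across a toy corpus, once over document bodies and once over titles, and compares two terms by cosine similarity. The corpus `DOCS`, the function names, and the raw-count weighting are all assumptions made for illustration; the abstract does not specify how the two vector sets are weighted or combined.

```python
import math
from collections import Counter

# Hypothetical toy corpus: each document has a title and a body.
DOCS = [
    {"title": "heart attack risk", "body": "myocardial infarction damages heart muscle"},
    {"title": "heart muscle disease", "body": "cardiomyopathy weakens heart muscle"},
    {"title": "bone fracture care", "body": "a fracture is a broken bone"},
]

def concept_vector(term, field):
    """Return one weight per document: the raw count of `term` in `field`."""
    return [Counter(doc[field].split())[term] for doc in DOCS]

def cosine(u, v):
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def relatedness(term_a, term_b, field="body"):
    """Relatedness of two terms in the vector space built from `field`."""
    return cosine(concept_vector(term_a, field), concept_vector(term_b, field))

body_score = relatedness("heart", "muscle", field="body")
title_score = relatedness("heart", "muscle", field="title")
```

In this toy corpus "heart" and "muscle" co-occur in the same bodies but only partly in the same titles, so the two vector spaces give different scores for the same pair.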
Affiliation(s)
- Maciej Rybinski
- Departamento LCC, University of Malaga, Campus Teatinos, Malaga, 29010, Spain
|
16
|
Pakhomov SVS, Eberly L, Knopman D. Characterizing cognitive performance in a large longitudinal study of aging with computerized semantic indices of verbal fluency. Neuropsychologia 2016; 89:42-56. [PMID: 27245645 DOI: 10.1016/j.neuropsychologia.2016.05.031] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Received: 10/21/2015] [Revised: 05/26/2016] [Accepted: 05/27/2016] [Indexed: 11/22/2022]
Abstract
A computational approach for estimating several indices of performance on the animal category verbal fluency task was validated and examined in a large longitudinal study of aging. The performance indices included the traditional verbal fluency score, size of semantic clusters, density of repeated words, as well as measures of semantic and lexical diversity. Change over time in these measures was modeled using mixed effects regression in several groups of participants, including those that remained cognitively normal throughout the study (CN) and those that were diagnosed with mild cognitive impairment (MCI) or Alzheimer's disease (AD) dementia at some point subsequent to the baseline visit. The results of the study show that, with the exception of mean cluster size, the indices showed significantly greater declines in the MCI and AD dementia groups as compared to CN participants. Examination of associations between the indices and cognitive domains of memory, attention and visuospatial functioning showed that the traditional verbal fluency scores were associated with declines in all three domains, whereas semantic and lexical diversity measures were associated with declines only in the visuospatial domain. Baseline repetition density was associated with declines in memory and visuospatial domains. Examination of lexical and semantic diversity measures in subgroups with high vs. low attention scores (but normal functioning in other domains) showed that the performance of individuals with low attention was influenced more by word frequency than by the strength of semantic relatedness between words. These findings suggest that various automatically computed semantic indices may be used to examine various aspects of cognitive performance affected by dementia.
|
17
|
Cameron D, Kavuluru R, Rindflesch TC, Sheth AP, Thirunarayan K, Bodenreider O. Context-driven automatic subgraph creation for literature-based discovery. J Biomed Inform 2015; 54:141-57. [PMID: 25661592 PMCID: PMC4888806 DOI: 10.1016/j.jbi.2015.01.014] [Citation(s) in RCA: 51] [Impact Index Per Article: 5.7] [Received: 09/16/2014] [Revised: 01/21/2015] [Accepted: 01/25/2015] [Indexed: 01/29/2023]
Abstract
BACKGROUND Literature-based discovery (LBD) is characterized by uncovering hidden associations in non-interacting scientific literature. Prior approaches to LBD include use of: (1) domain expertise and structured background knowledge to manually filter and explore the literature, (2) distributional statistics and graph-theoretic measures to rank interesting connections, and (3) heuristics to help eliminate spurious connections. However, manual approaches to LBD are not scalable and purely distributional approaches may not be sufficient to obtain insights into the meaning of poorly understood associations. While several graph-based approaches have the potential to elucidate associations, their effectiveness has not been fully demonstrated. A considerable degree of a priori knowledge, heuristics, and manual filtering is still required. OBJECTIVES In this paper we implement and evaluate a context-driven, automatic subgraph creation method that captures multifaceted complex associations between biomedical concepts to facilitate LBD. Given a pair of concepts, our method automatically generates a ranked list of subgraphs, which provide informative and potentially unknown associations between such concepts. METHODS To generate subgraphs, the set of all MEDLINE articles that contain either of the two specified concepts (A, C) are first collected. Then binary relationships or assertions, which are automatically extracted from the MEDLINE articles, called semantic predications, are used to create a labeled directed predications graph. In this predications graph, a path is represented as a sequence of semantic predications. The hierarchical agglomerative clustering (HAC) algorithm is then applied to cluster paths that are bounded by the two concepts (A, C). HAC relies on implicit semantics captured through Medical Subject Heading (MeSH) descriptors, and explicit semantics from the MeSH hierarchy, for clustering. 
Paths that exceed a threshold of semantic relatedness are clustered into subgraphs based on their shared context. Finally, the automatically generated clusters are provided as a ranked list of subgraphs. RESULTS The subgraphs generated using this approach facilitated the rediscovery of 8 out of 9 existing scientific discoveries. In particular, they directly (or indirectly) led to the recovery of several intermediates (or B-concepts) between A- and C-terms, while also providing insights into the meaning of the associations. Such meaning is derived from predicates between the concepts, as well as the provenance of the semantic predications in MEDLINE. Additionally, by generating subgraphs on different thematic dimensions (such as Cellular Activity, Pharmaceutical Treatment and Tissue Function), the approach may enable a broader understanding of the nature of complex associations between concepts. Finally, in a statistical evaluation to determine the interestingness of the subgraphs, it was observed that an arbitrary association is mentioned in only approximately 4 articles in MEDLINE on average. CONCLUSION These results suggest that leveraging the implicit and explicit semantics provided by manually assigned MeSH descriptors is an effective representation for capturing the underlying context of complex associations, along multiple thematic dimensions in LBD situations.
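The clustering step described in METHODS can be sketched in Python as threshold-based agglomerative clustering. This is an illustrative sketch only, not the authors' implementation: each path is represented by a hypothetical set of MeSH descriptors, relatedness is taken to be Jaccard overlap, and average linkage is used. All three choices are assumptions, as are the path names and descriptor sets. The ranking of the resulting subgraphs is not shown.

```python
def jaccard(a, b):
    """Overlap of two descriptor sets, in [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster_similarity(c1, c2, contexts):
    """Average linkage: mean pairwise Jaccard between members of two clusters."""
    pairs = [(i, j) for i in c1 for j in c2]
    return sum(jaccard(contexts[i], contexts[j]) for i, j in pairs) / len(pairs)

def agglomerate(contexts, threshold):
    """Merge the most similar pair of clusters until none exceeds `threshold`."""
    clusters = [[name] for name in contexts]
    while len(clusters) > 1:
        best, best_pair = -1.0, None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                s = cluster_similarity(clusters[x], clusters[y], contexts)
                if s > best:
                    best, best_pair = s, (x, y)
        if best <= threshold:
            break  # no remaining pair is related enough to merge
        x, y = best_pair
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters

# Hypothetical paths between concepts A and C, each with its descriptor context.
paths = {
    "p1": {"Calcium", "Muscle Contraction"},
    "p2": {"Calcium", "Muscle Contraction", "Ion Channels"},
    "p3": {"Blood Platelets", "Platelet Aggregation"},
}
subgraphs = agglomerate(paths, threshold=0.5)
```

With these inputs, p1 and p2 share two of three descriptors and are merged, while p3 shares none and stays in its own cluster.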
Affiliation(s)
- Delroy Cameron
- Ohio Center of Excellence in Knowledge-Enabled Computing (Kno.e.sis), Wright State University, Dayton, OH 45435, USA.
- Ramakanth Kavuluru
- Division of Biomedical Informatics, University of Kentucky, Lexington, KY 40506, USA
- Amit P Sheth
- Ohio Center of Excellence in Knowledge-Enabled Computing (Kno.e.sis), Wright State University, Dayton, OH 45435, USA
- Krishnaprasad Thirunarayan
- Ohio Center of Excellence in Knowledge-Enabled Computing (Kno.e.sis), Wright State University, Dayton, OH 45435, USA
|
18
|
Battal Merlet L, Morel S, Blanchet A, Lockman H, Kostova M. Effect of semantic coherence on episodic memory processes in schizophrenia. Psychiatry Res 2014; 220:752-9. [PMID: 25240943 DOI: 10.1016/j.psychres.2014.08.034] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 12/05/2013] [Revised: 07/15/2014] [Accepted: 08/18/2014] [Indexed: 10/24/2022]
Abstract
Schizophrenia is associated with severe episodic retrieval impairment. The aim of this study was to investigate the possibility that schizophrenia patients could improve their familiarity and/or recollection processes by manipulating the semantic coherence of to-be-learned stimuli and using deep encoding. Twelve schizophrenia patients and 12 healthy controls of comparable age, gender, and educational level undertook an associative recognition memory task. The stimuli consisted of pairs of words that were either related or unrelated to a given semantic category. The process dissociation procedure was used to calculate the estimates of familiarity and recollection processes. Both groups showed enhanced memory performances for semantically related words. However, in healthy controls, semantic relatedness led to enhanced recollection, while in schizophrenia patients, it induced enhanced familiarity. The familiarity estimates for related words were comparable in both groups, indicating that familiarity could be used as a compensatory mechanism in schizophrenia patients.
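The abstract does not give the equations used, so the following Python sketch assumes the standard formulation of the process dissociation procedure: the probability of an "old" response is R + F(1 - R) under inclusion instructions and F(1 - R) under exclusion instructions, which gives R = inclusion - exclusion and F = exclusion / (1 - R). The function name and the example proportions are hypothetical, and the variant applied to associative recognition in this study may differ.

```python
def process_dissociation(p_inclusion, p_exclusion):
    """Estimate recollection (R) and familiarity (F) from response proportions.

    Assumes the standard model:
        p_inclusion = R + F * (1 - R)
        p_exclusion = F * (1 - R)
    """
    recollection = p_inclusion - p_exclusion
    if recollection >= 1.0:
        # F is undefined when every response is attributed to recollection.
        raise ValueError("familiarity is undefined when recollection is 1")
    familiarity = p_exclusion / (1.0 - recollection)
    return recollection, familiarity

# Hypothetical proportions, not data from the study.
r, f = process_dissociation(p_inclusion=0.8, p_exclusion=0.3)
```

For these made-up proportions the estimates come out at roughly R = 0.5 and F = 0.6.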
Affiliation(s)
- Lâle Battal Merlet
- Université Paris 8, Laboratoire de Psychopathologie et Neuropsychologie, EA 2027, 2 rue de la Liberté, 93526 Saint-Denis Cedex, France; University Malaya, Psychological Medicine Department, Kuala Lumpur, Malaysia.
- Shasha Morel
- Université Paris 8, Laboratoire de Psychopathologie et Neuropsychologie, EA 2027, 2 rue de la Liberté, 93526 Saint-Denis Cedex, France; Université François-Rabelais, UMR-CNRS 7295, CeRCA, Tours, France
- Alain Blanchet
- Université Paris 8, Laboratoire de Psychopathologie et Neuropsychologie, EA 2027, 2 rue de la Liberté, 93526 Saint-Denis Cedex, France
- Hazlin Lockman
- University Malaya, Psychological Medicine Department, Kuala Lumpur, Malaysia
- Milena Kostova
- Université Paris 8, Laboratoire de Psychopathologie et Neuropsychologie, EA 2027, 2 rue de la Liberté, 93526 Saint-Denis Cedex, France
|
19
|
McInnes BT, Pedersen T. Evaluating semantic similarity and relatedness over the semantic grouping of clinical term pairs. J Biomed Inform 2015; 54:329-36. [PMID: 25523466 DOI: 10.1016/j.jbi.2014.11.014] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 06/04/2014] [Revised: 11/11/2014] [Accepted: 11/13/2014] [Indexed: 11/20/2022]
Abstract
INTRODUCTION This article explores how measures of semantic similarity and relatedness are impacted by the semantic groups to which the concepts they are measuring belong. Our goal is to determine if there are distinctions between homogeneous comparisons (where both concepts belong to the same group) and heterogeneous ones (where the concepts are in different groups). Our hypothesis is that the similarity measures will be significantly affected since they rely on hierarchical is-a relations, whereas relatedness measures should be less impacted since they utilize a wider range of relations. In addition, we also evaluate the effect of combining different measures of similarity and relatedness. Our hypothesis is that these combined measures will more closely correlate with human judgment, since they better reflect the rich variety of information humans use when assessing similarity and relatedness. METHOD We evaluate our method on four reference standards. Three of the reference standards were annotated by human judges for relatedness and one was annotated for similarity. RESULTS We found significant differences in the correlation of semantic similarity and relatedness measures with human judgment, depending on which semantic groups were involved. We also found that combining a definition based relatedness measure with an information content similarity measure resulted in significant improvements in correlation over individual measures. AVAILABILITY The semantic similarity and relatedness package is an open source program available from http://umls-similarity.sourceforge.net/. The reference standards are available at http://www.people.vcu.edu/~btmcinnes/downloads.html.
|
20
|
Guo X, Yu Q, Alm CO, Calvelli C, Pelz JB, Shi P, Haake AR. From spoken narratives to domain knowledge: mining linguistic data for medical image understanding. Artif Intell Med 2014; 62:79-90. [PMID: 25174882 DOI: 10.1016/j.artmed.2014.08.001] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 06/02/2013] [Revised: 07/29/2014] [Accepted: 08/10/2014] [Indexed: 10/24/2022]
Abstract
OBJECTIVES Extracting useful visual clues from medical images allowing accurate diagnoses requires physicians' domain knowledge acquired through years of systematic study and clinical training. This is especially true in the dermatology domain, a medical specialty that requires physicians to have image inspection experience. Automating or at least aiding such efforts requires understanding physicians' reasoning processes and their use of domain knowledge. Mining physicians' references to medical concepts in narratives during image-based diagnosis of a disease is an interesting research topic that can help reveal experts' reasoning processes. It can also be a useful resource to assist with design of information technologies for image use and for image case-based medical education systems. METHODS AND MATERIALS We collected data for analyzing physicians' diagnostic reasoning processes by conducting an experiment that recorded their spoken descriptions during inspection of dermatology images. In this paper we focus on the benefit of physicians' spoken descriptions and provide a general workflow for mining medical domain knowledge based on linguistic data from these narratives. The challenge of a medical image case can influence the accuracy of the diagnosis as well as how physicians pursue the diagnostic process. Accordingly, we define two lexical metrics for physicians' narratives--lexical consensus score and top N relatedness score--and evaluate their usefulness by assessing the diagnostic challenge levels of corresponding medical images. We also report on clustering medical images based on anchor concepts obtained from physicians' medical term usage. These analyses are based on physicians' spoken narratives that have been preprocessed by incorporating the Unified Medical Language System for detecting medical concepts. 
RESULTS The image rankings based on lexical consensus score and on top 1 relatedness score are well correlated with those based on challenge levels (Spearman correlation>0.5 and Kendall correlation>0.4). Clustering results are largely improved based on our anchor concept method (accuracy>70% and mutual information>80%). CONCLUSIONS Physicians' spoken narratives are valuable for the purpose of mining the domain knowledge that physicians use in medical image inspections. We also show that the semantic metrics introduced in the paper can be successfully applied to medical image understanding and allow discussion of additional uses of these metrics.
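The rank correlations reported in RESULTS can be illustrated with a short Python sketch of the Spearman and Kendall coefficients. It assumes there are no tied values (Spearman via the rank-difference formula, Kendall as tau-a). The two rankings below are invented for the example and are not data from the study.

```python
from itertools import combinations

def ranks(values):
    """Rank positions (1 = smallest); assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result

def spearman(x, y):
    """Spearman's rho via the rank-difference formula (no ties)."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))

def kendall(x, y):
    """Kendall's tau-a: (concordant - discordant) pairs over all pairs."""
    n = len(x)
    score = 0
    for i, j in combinations(range(n), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        score += (s > 0) - (s < 0)
    return score / (n * (n - 1) / 2)

# Hypothetical rankings of five images by a lexical metric and by challenge level.
metric_rank = [1, 2, 3, 4, 5]
challenge_rank = [1, 2, 4, 3, 5]
rho = spearman(metric_rank, challenge_rank)
tau = kendall(metric_rank, challenge_rank)
```

With one swapped pair out of ten, rho is about 0.9 and tau about 0.8. For real data with ties, a tie-aware implementation would be needed.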
Affiliation(s)
- Xuan Guo
- College of Computing & Information Sciences, Rochester Institute of Technology, 20 Lomb Memorial Drive, Rochester, NY 14623, USA.
- Qi Yu
- College of Computing & Information Sciences, Rochester Institute of Technology, 20 Lomb Memorial Drive, Rochester, NY 14623, USA
- Cecilia Ovesdotter Alm
- College of Liberal Arts, Rochester Institute of Technology, 92 Lomb Memorial Drive, Rochester, NY 14623, USA
- Cara Calvelli
- College of Health Sciences & Technology, Rochester Institute of Technology, 90 Lomb Memorial Drive, Rochester, NY 14623, USA
- Jeff B Pelz
- Center for Imaging Science, Rochester Institute of Technology, 54 Lomb Memorial Drive, Rochester, NY 14623, USA
- Pengcheng Shi
- College of Computing & Information Sciences, Rochester Institute of Technology, 20 Lomb Memorial Drive, Rochester, NY 14623, USA
- Anne R Haake
- College of Computing & Information Sciences, Rochester Institute of Technology, 20 Lomb Memorial Drive, Rochester, NY 14623, USA
|