1
Machine vision situations: Tracing distributed agency. OPEN RESEARCH EUROPE 2024; 3:132. [PMID: 38655131 PMCID: PMC11036037 DOI: 10.12688/openreseurope.16112.2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/27/2024] [Indexed: 04/26/2024]
Abstract
This article proposes a new method for tracing and examining agency in heterogeneous assemblages, focusing on the role of machine vision technologies in creative works. We introduce the concept of the "machine vision situation" and define it as the moment in which machine vision technologies come into play and make a difference to the course of events. By taking situations as the unit of analysis, we identify moments at which machine vision technologies take part in actions without reducing them to either tools or protagonists, instead allowing for more complex agential entanglements between human and non-human actors. Grounded in an interdisciplinary theoretical framework, this article demonstrates how an analytical unit such as the machine vision situation is a valuable method for tracing how agency is distributed. We illustrate this by applying the method to three kinds of creative works (narratives, digital games, and artworks), revealing key aspects of distributed agency and calling attention to the excess, complications, and messy entanglements that might otherwise be overlooked in analyses of agential assemblages. The machine vision situation is shown to be a flexible unit of analysis that can be productively incorporated in both quantitative and qualitative studies and applied to other contexts in which human and non-human agencies interact.
2
A search tool based on language modelling developed for The Index of Middle English Prose. OPEN RESEARCH EUROPE 2024; 3:197. [PMID: 38274893 PMCID: PMC10808851 DOI: 10.12688/openreseurope.16590.1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 03/07/2024] [Indexed: 01/27/2024]
Abstract
Non-standardised early vernaculars present a problem for search tools due to the high degree of variation. The challenge lies in the variation found in orthography, syntax, and lexicon between titles, incipits, and explicits in manuscript copies of the same work. Traditional search methods relying on exact string matching or regular expressions fail to address these variations comprehensively. This project presents a web-based search tool specifically designed to handle linguistic and textual variation. The software is made available as a part of the Index of Middle English Prose (IMEP). The search tool addresses the issue of variation by utilizing a database of incipits and explicits, character-based n-gram language models (LMs) built with the Stanford Research Institute Language Modelling (SRILM) toolkit, and a fuzzy search script (IMEP: FSS) written in Python. The tool optimizes for recall, retrieving multiple potential matches for a search string, without attempting to identify the 'correct' one. The search process involves looking up exact matches in the database while simultaneously using the fuzzy search script to evaluate the incipits and explicits against a model of the search string, followed by a match of the search string against models of the incipits and explicits. This two-step process shortens the processing time, which would otherwise be unreasonably long, because while using SRILM to match the search string against each incipit or explicit in the IMEP for precision could be time-consuming, running a first step where all texts are matched against a single LM built from the search string allows for faster processing. A web application, built using Django and Docker, combines the results of the direct database lookup and the fuzzy search script, presenting them as a list with exact matches followed by fuzzy matches ordered by increasing model perplexity. The tool is made available Open Access and can be adapted to other datasets.
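The first step of the search described above (scoring every incipit or explicit against a single character n-gram model built from the search string, then ordering by increasing perplexity) can be illustrated with a minimal sketch. This is not the IMEP: FSS script and it does not use SRILM; it is a pure-Python stand-in with add-one smoothing, and the function names, the trigram order, and the sample strings are all assumptions made for illustration.

```python
import math
from collections import Counter


def char_ngrams(text, n):
    """Return the character n-grams of text, padded at both ends."""
    padded = "^" * (n - 1) + text.lower() + "$"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def train_char_lm(text, n=3):
    """Build a tiny character n-gram model from a single string (hypothetical helper)."""
    grams = Counter(char_ngrams(text, n))
    contexts = Counter()
    for gram, count in grams.items():
        contexts[gram[:-1]] += count
    vocab = set(text.lower()) | {"$"}
    return {"n": n, "grams": grams, "contexts": contexts, "vocab": vocab}


def perplexity(lm, text):
    """Perplexity of text under lm, using add-one smoothing (an assumption, not SRILM's default)."""
    grams = char_ngrams(text, lm["n"])
    v = len(lm["vocab"]) + 1  # one extra slot for characters unseen in training
    log_prob = 0.0
    for gram in grams:
        p = (lm["grams"][gram] + 1) / (lm["contexts"][gram[:-1]] + v)
        log_prob += math.log(p)
    return math.exp(-log_prob / len(grams))


def fuzzy_rank(query, candidates, n=3):
    """Order candidates by increasing perplexity under a model of the query."""
    lm = train_char_lm(query, n)
    return sorted(candidates, key=lambda c: perplexity(lm, c))


# Invented sample strings, not taken from the IMEP database.
query = "here begynneth the boke"
candidates = [
    "In the name of god amen",
    "Here bigynneth the book",
    "here begynneth the boke",
]
ranked = fuzzy_rank(query, candidates)
```

In this toy setting a spelling variant scores closer to the query than an unrelated incipit, which is the recall-oriented behaviour the abstract describes. The second step (scoring the search string against a model of each shortlisted text) would reuse the same two helpers with the roles of query and candidate swapped.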
3
Who are the "Heroes of CRISPR"? Public science communication on Wikipedia and the challenge of micro-notability. PUBLIC UNDERSTANDING OF SCIENCE (BRISTOL, ENGLAND) 2024:9636625241229923. [PMID: 38419208 DOI: 10.1177/09636625241229923] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/02/2024]
Abstract
Wikipedia's influence in shaping public perceptions of science underscores the significance of scientists being recognized on the platform, as it can impact their careers. Although Wikipedia offers guidelines for determining when a scientist qualifies for their own article, it currently lacks guidance regarding whether a scientist should be acknowledged in articles related to the innovation processes to which they have contributed. To explore how Wikipedia addresses this issue of scientific "micro-notability," we introduce a digital method called Name Edit Analysis, enabling us to quantitatively and qualitatively trace mentions of scientists within Wikipedia's articles. We study two CRISPR-related Wikipedia articles and find dynamic negotiations of micro-notability as well as a surprising tension between Wikipedia's principle of safeguarding against self-promotion and the scholarly norm of "due credit." To reconcile this tension, we propose that Wikipedians and scientists collaborate to establish specific micro-notability guidelines that acknowledge scientific contributions while preventing excessive self-promotion.
4
A Computational Approach to Hand Pose Recognition in Early Modern Paintings. J Imaging 2023; 9:120. [PMID: 37367468 DOI: 10.3390/jimaging9060120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2023] [Revised: 05/25/2023] [Accepted: 06/11/2023] [Indexed: 06/28/2023]
Abstract
Hands represent an important aspect of pictorial narration but have rarely been addressed as an object of study in art history and digital humanities. Although hand gestures play a significant role in conveying emotions, narratives, and cultural symbolism in the context of visual art, a comprehensive terminology for the classification of depicted hand poses is still lacking. In this article, we present the process of creating a new annotated dataset of pictorial hand poses. The dataset is based on a collection of European early modern paintings, from which hands are extracted using human pose estimation (HPE) methods. The hand images are then manually annotated based on art historical categorization schemes. From this categorization, we introduce a new classification task and perform a series of experiments using different types of features, including our newly introduced 2D hand keypoint features, as well as existing neural network-based features. This classification task represents a new and complex challenge due to the subtle and contextually dependent differences between depicted hands. The presented computational approach to hand pose recognition in paintings represents an initial attempt to tackle this challenge, which could potentially advance the use of HPE methods on paintings, as well as foster new research on the understanding of hand gestures in art.
5
Characterizing the visualization design space of distant and close reading of poetic rhythm. Front Big Data 2023; 6:1167708. [PMID: 37346813 PMCID: PMC10280022 DOI: 10.3389/fdata.2023.1167708] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2023] [Accepted: 05/16/2023] [Indexed: 06/23/2023]
Abstract
Metrical and rhythmical poetry analysis is founded on the systematic statistical analysis and comparison of sonic devices (e.g., rhythmic patterns) that emerge from a combination of pre-established aesthetic and structural rules and the poet's ability and creative genius to convey a given message while adhering to those constraints. These rhythmical patterns, which have traditionally been obtained by means of a careful close reading of the poems, in a process known as "scansion," can now be obtained and made visible by automatic means. However, the visualization literature is still scarce on approaches that allow an insightful close and distant reading of the rhythmical patterns in a poetry corpus. In this work, we report our initial efforts in characterizing the visualization design space of distant and close reading of poetic rhythm. By employing a digital version of a corpus of 11,268 verses originally written by the Spanish poet and playwright Federico García-Lorca (1898-1936), we crafted several prototypical visualizations representative of the inherent complexity of the problem, which we expect to employ in future user studies and which we share here with the rest of the community to foster further discussion around this topic.
6
Impact of Old Age on an Occupation's Image Over 210 Years: An Age Premium for Doctors, Lawyers, and Soldiers. J Appl Gerontol 2023; 42:1345-1355. [PMID: 37092180 DOI: 10.1177/07334648231155025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/25/2023]
Abstract
As older adults continue to contribute to the labor force, it is critical that perceptions of them reflect these contributions. We explore whether portraying older adults based on their occupational roles instead of their age is linked to more positive sentiments and test the possibility of an age premium. We created the largest historical corpus of American English, a 600-million-word dataset with over 150,000 texts, spanning 210 years (1810-2019). Top descriptors (N = 675,213) of nouns related to age, occupation, and age × occupation over 21 decades were compiled and rated for valence (negative-positive) on a 5-point scale. Occupational role-based framing was associated with more positive portrayals than age-based framing. Positive portrayals of older lawyers increased by 22.6% over 210 years. Older doctors (-1.4%) and older soldiers (-10.7%) experienced a decline in positive portrayals, though sentiments toward older doctors, lawyers, and soldiers remained more positive than those toward older adults.
7
Quantifying the scientific revolution. EVOLUTIONARY HUMAN SCIENCES 2023; 5:e19. [PMID: 37587945 PMCID: PMC10426016 DOI: 10.1017/ehs.2023.6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2022] [Revised: 02/20/2023] [Accepted: 02/21/2023] [Indexed: 08/18/2023]
Abstract
The Scientific Revolution represents a turning point in the history of humanity. Yet it remains ill-understood, partly because of a lack of quantification. Here, we leverage large datasets of individual biographies (N = 22,943) and present the first estimates of scientific production during the late medieval and early modern period (1300-1850). Our data reveal striking differences across countries, with England and the United Provinces being much more creative than other countries, suggesting that economic development has been key in generating the Scientific Revolution. In line with recent results in behavioural sciences, we show that scientific creativity and economic development are associated with other kinds of creative activities in philosophy, literature, music and the arts, as well as with inclusive institutions and ascetic religiosity, suggesting a common underlying mindset associated with long-term orientation and exploration. Finally, we investigate the interplay between economic development and cultural transmission (the so-called 'Republic of Letters') using partially observed Markov models imported from population biology. Surprisingly, the role of horizontal transmission (from one country to another) seems to have been marginal. Beyond the case of science, our results suggest that economic development is an important factor in the evolution of aspects of human culture.
8
Cor Deep and the Sacrobosco Dataset: Detection of Visual Elements in Historical Documents. J Imaging 2022; 8:285. [PMID: 36286379 PMCID: PMC9605005 DOI: 10.3390/jimaging8100285] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/29/2022] [Revised: 09/24/2022] [Accepted: 10/03/2022] [Indexed: 11/05/2022]
Abstract
Recent advances in object detection facilitated by deep learning have led to numerous solutions in a myriad of fields ranging from medical diagnosis to autonomous driving. However, historical research is yet to reap the benefits of such advances. This is generally due to the low number of large, coherent, and annotated datasets of historical documents, as well as the overwhelming focus on Optical Character Recognition to support the analysis of historical documents. In this paper, we highlight the importance of visual elements, in particular illustrations in historical documents, and offer a public multi-class historical visual element dataset based on the Sphaera corpus. Additionally, we train an image extraction model based on YOLO architecture and publish it through a publicly available web-service to detect and extract multi-class images from historical documents in an effort to bridge the gap between traditional and computational approaches in historical studies.
9
A Platform to Develop and Apply Digital Methods for Empirical Bioethics Research: Mixed Methods Design and Development Study. JMIR Form Res 2022; 6:e28558. [PMID: 35511234 PMCID: PMC9121222 DOI: 10.2196/28558] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2021] [Revised: 03/23/2021] [Accepted: 11/05/2021] [Indexed: 11/13/2022]
Abstract
Background The rise of digital methods and computational tools has opened up the possibility of collecting and analyzing data from novel sources, such as discussions on social media. At the same time, these methods and tools introduce a dependence on technology, often resulting in a need for technical skills and expertise. Researchers from various disciplines engage in empirical bioethics research, and software development and similar skills are not usually part of their background. Therefore, researchers often depend on technical experts to develop and apply digital methods, which can create a bottleneck and hinder the broad use of digital methods in empirical bioethics research. Objective This study aimed to develop a research platform that would offer researchers the means to better leverage implemented digital methods, and that would simplify the process of developing new methods. Methods This study used a mixed methods approach to design and develop a research platform prototype. I combined established methods from user-centered design, rapid prototyping, and agile software development to iteratively develop the platform prototype. In collaboration with two other researchers, I tested and extended the platform prototype in situ by carrying out a study using the prototype. Results The resulting research platform prototype provides three digital methods, which are composed of functional components. This modular concept allows researchers to use existing methods for their own experiments and combine implemented components into new methods. Conclusions The platform prototype illustrates the potential of the modular concept and empowers researchers without advanced technical skills to carry out experiments using digital methods and develop new methods. However, more work is needed to bring the prototype to a production-ready state.
10
Diversity of COVID-19 News Media Coverage across 17 Countries: The Influence of Cultural Values, Government Stringency and Pandemic Severity. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2021; 18:11768. [PMID: 34831524 PMCID: PMC8620484 DOI: 10.3390/ijerph182211768] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 09/11/2021] [Revised: 10/26/2021] [Accepted: 10/27/2021] [Indexed: 11/17/2022]
Abstract
The current media studies of COVID-19 devote asymmetrical attention to social media; in contrast, newspapers have received comparatively less attention. Newspapers are an integral source of current information that are syndicated and amplified by social media to a wide global audience. This is one of the first known studies to operationalize news media diversity and examine its association with cultural values during the pandemic. We tracked the global diversity of COVID-19 coverage in a news media database of 12 billion words, collated from 28 million articles over 7000 news websites, across 8 months. Media diversity was measured weekly by the number of unique descriptors of 10 target terms of the pandemic (e.g., COVID-19, coronavirus) and normalized by the corpus size for the respective countries per week. Government Stringency was taken from the Oxford COVID-19 Government Response Tracker and cultural scores were taken from Hofstede's Cultural Values global database. Results showed that Media Diversity Rate increased 6.7 times over 8 months, from the baseline period (October-December 2019) to during the pandemic (January-May 2020). Mixed effects modelling revealed that higher COVID-19 prevalence rates and governmental stringency predicted this increase. Interestingly, collectivist cultures are linked to more diverse media coverage during COVID-19. It is possible that news outlets in collectivist societies are motivated to present a diverse array of topics given the impact of COVID-19 on every segment of society. Of broader significance, we provided a framework to design targeted public health communications that are culturally nuanced.
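The diversity measure described above (unique descriptors of the target terms per country and week, normalized by that week's corpus size) can be sketched roughly as follows. The record layout, the function name, the per-million-words scaling, and the toy data are assumptions for illustration only; the abstract does not specify them.

```python
from collections import defaultdict


def diversity_rate(records, corpus_sizes, per=1_000_000):
    """Unique descriptors per (country, week), scaled per `per` corpus words.

    records: iterable of (country, week, descriptor) tuples (hypothetical layout).
    corpus_sizes: dict mapping (country, week) to the word count of that slice.
    """
    unique = defaultdict(set)
    for country, week, descriptor in records:
        unique[(country, week)].add(descriptor.lower())
    return {
        key: len(descriptors) * per / corpus_sizes[key]
        for key, descriptors in unique.items()
    }


# Toy data, invented for this example.
records = [
    ("SG", 1, "deadly"),
    ("SG", 1, "novel"),
    ("SG", 1, "Deadly"),  # duplicate after lowercasing, counted once
    ("SG", 2, "deadly"),
    ("SG", 2, "novel"),
    ("SG", 2, "global"),
    ("SG", 2, "mysterious"),
]
corpus_sizes = {("SG", 1): 2_000_000, ("SG", 2): 2_000_000}
rates = diversity_rate(records, corpus_sizes)
```

Comparing the mean rate of a baseline period with that of a later period would then give a ratio of the kind reported in the abstract.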
11
Societal Narratives on Caregivers in Asia. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2021; 18:11241. [PMID: 34769759 PMCID: PMC8583461 DOI: 10.3390/ijerph182111241] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 09/15/2021] [Revised: 10/03/2021] [Accepted: 10/07/2021] [Indexed: 11/17/2022]
Abstract
Although there has been an increase in awareness of the struggles experienced by caregivers, discourse on caregiving remains confined mostly to academia, policy circles or the family unit. There have been suggestions that public discourse on informal caregiving dwells overwhelmingly on the outsize toll it takes on the health of caregivers. However, few studies have examined societal narratives on caregivers, a gap our study aims to fill. We leveraged an online media database of 12 billion words collated from over 30 million articles to explore societal narratives on caregivers in six Asian countries. Computational linguistics and statistical analysis were applied to study the content of narratives on caregivers. The prevalence of societal narratives on caregivers was highest in Singapore, five times higher than in Sri Lanka, which evidenced the lowest prevalence. Findings reveal that the inadequacies of institutional care as well as the need to train and empower caregivers are pressing issues that need to be prioritized on the policy agenda in Asia. Of broader significance, the diverse capabilities across Asia present opportunities for cross-country learning and capacity-building.
12
A Methodology for Semantic Enrichment of Cultural Heritage Images Using Artificial Intelligence Technologies. J Imaging 2021; 7:121. [PMID: 34460757 PMCID: PMC8404920 DOI: 10.3390/jimaging7080121] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/15/2021] [Revised: 07/16/2021] [Accepted: 07/18/2021] [Indexed: 11/17/2022]
Abstract
Cultural heritage images are among the primary media for communicating and preserving the cultural values of a society. The images represent concrete and abstract content and symbolise the social, economic, political, and cultural values of the society. However, an enormous amount of the value embedded in the images is left unexploited, partly due to the absence of methodological and technical solutions to capture, represent, and exploit the latent information. With the emergence of new technologies and the availability of cultural heritage images in digital formats, the methodology followed to semantically enrich and utilise such resources becomes a vital factor in supporting users' needs. This paper presents a methodology proposed to unearth the cultural information communicated via cultural digital images by applying Artificial Intelligence (AI) technologies (such as Computer Vision (CV) and semantic web technologies). To this end, the paper presents a methodology that enables efficient analysis and enrichment of a large collection of cultural images covering all the major phases and tasks. The proposed method is applied and tested using a case study on cultural image collections from the Europeana platform. The paper further presents the analysis of the case study, the challenges, the lessons learned, and promising future research areas on the topic.
13
Embodiment in 18th Century Depictions of Human-Machine Co-Creativity. Front Robot AI 2021; 8:662036. [PMID: 34262945 PMCID: PMC8273262 DOI: 10.3389/frobt.2021.662036] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/31/2021] [Accepted: 06/08/2021] [Indexed: 11/16/2022]
Abstract
Artificial intelligence has a rich history in literature; fiction has shaped how we view artificial agents and their capacities in the real world. This paper looks at embodied examples of human-machine co-creation from the literature of the Long 18th Century (1650-1850), examining how older depictions of creative machines could inform and inspire modern day research. The works are analyzed from the perspective of design fiction with special focus on the embodiment of the systems and the creativity exhibited by them. We find that the chosen examples highlight the importance of recognizing the environment as a major factor in human-machine co-creative processes and that some of the works seem to precede current examples of artificial systems reaching into our everyday lives. The examples present embodied interaction in a positive, creativity-oriented way, but also highlight ethical risks of human-machine co-creativity. Modern day perceptions of artificial systems and creativity can be limited to some extent by the technologies available; fictitious examples from centuries past allow us to examine such limitations using a Design Fiction approach. We conclude by deriving four guidelines for future research from our fictional examples: 1) explore unlikely embodiments; 2) think of situations, not systems; 3) be aware of the disjunction between action and appearance; and 4) consider the system as a situated moral agent.
14
Biodigital Philosophy, Technological Convergence, and Postdigital Knowledge Ecologies. POSTDIGITAL SCIENCE AND EDUCATION 2021. [PMCID: PMC7797699 DOI: 10.1007/s42438-020-00211-7] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Indexed: 04/13/2023]
Abstract
New technological ability is leading postdigital science, where biology as digital information, and digital information as biology, are now dialectically interconnected. In this article we firstly explore a philosophy of biodigitalism as a new paradigm closely linked to bioinformationalism. Both involve the mutual interaction and integration of information and biology, which leads us into discussion of biodigital convergence. As a unified ecosystem, this allows us to resolve problems that isolated disciplinary capabilities cannot, creating new knowledge ecologies within a constellation of technoscience. To illustrate our arrival at this historical flash point via several major epistemological shifts in the post-war period, we venture a tentative typology. The convergence between biology and information reconfigures all levels of theory and practice, and even critical reason itself now requires a biodigital interpretation oriented towards ecosystems and coordinated Earth systems. In this understanding, neither the digital humanities, the biohumanities, nor the posthumanities sit outside of biodigitalism. Instead, posthumanism is but one form of biodigitalism that mediates the biohumanities and the digital humanities, no longer preoccupied with the tradition of the subject, but with the constellation of forces shaping the future of human ontologies. This heralds a new biopolitics which brings the philosophy of race, class, gender, and intelligence, into a compelling dialog with genomics and information.
15
Addressing the Covid-19 Burden on Medical Education and Training: The Role of Telemedicine and Tele-Education During and Beyond the Pandemic. Front Public Health 2020; 8:589669. [PMID: 33330333 PMCID: PMC7728659 DOI: 10.3389/fpubh.2020.589669] [Citation(s) in RCA: 86] [Impact Index Per Article: 21.5] [Received: 07/31/2020] [Accepted: 11/06/2020] [Indexed: 12/22/2022]
Abstract
Medical students are the future of sustainable health systems that are severely under pressure during COVID-19. The disruption in medical education and training has adversely impacted traditional medical education and medical students and is likely to have long-term implications beyond COVID-19. In this article, we present a comprehensive analysis of the existing structural and systemic challenges applicable to medical students and teaching/training programs and the impact of COVID-19 on medical students and education. Use of technologies such as telemedicine or remote education platforms can minimize increased mental health risks to this population. An overview of challenges during and beyond the COVID-19 pandemic is also discussed, and targeted recommendations to address acute and systemic issues in medical education and training are presented. During the transition from conventional in-person or classroom teaching to tele-delivery of educational programs, medical students have to navigate various social, economic and cultural factors which interfere with their personal and academic lives. This is especially relevant for those from vulnerable, underprivileged or minority backgrounds. Students from vulnerable backgrounds are affected by environmental factors such as their own or family members' unemployment, lack of or inequity in provision of and access to educational technologies and remote delivery platforms, and increased levels of mental health stressors due to prolonged isolation and self-quarantine measures. Technologies for remote education and training delivery, as well as sustained and increased delivery of general well-being and mental health services to medical students, especially to those at high risk, are pivotal to our response to COVID-19 and beyond.
16
Sentiment Analysis of Children and Youth Literature: Is There a Pollyanna Effect? Front Psychol 2020; 11:574746. [PMID: 33071913 PMCID: PMC7541694 DOI: 10.3389/fpsyg.2020.574746] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 06/21/2020] [Accepted: 08/17/2020] [Indexed: 11/13/2022]
Abstract
If the words of natural human language possess a universal positivity bias, as assumed by Boucher and Osgood’s (1969) famous Pollyanna hypothesis and computationally confirmed for large text corpora in several languages (Dodds et al., 2015), then children and youth literature (CYL) should also show a Pollyanna effect. Here we tested this prediction applying an unsupervised vector space model-based sentiment analysis tool called SentiArt (Jacobs, 2019) to two CYL corpora, one in English (372 books) and one in German (500 books). Pitching our analysis at the sentence level, and assessing semantic as well as lexico-grammatical information, both corpora show the Pollyanna effect and thus add further evidence to the universality hypothesis. The results of our multivariate sentiment analyses provide interesting testable predictions for future scientific studies of literature.
17
How We Do Things With Words: Analyzing Text as Social and Cultural Data. Front Artif Intell 2020; 3:62. [PMID: 33733179 PMCID: PMC7861331 DOI: 10.3389/frai.2020.00062] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 04/16/2020] [Accepted: 07/15/2020] [Indexed: 11/13/2022]
Abstract
In this article we describe our experiences with computational text analysis involving rich social and cultural concepts. We hope to achieve three primary goals. First, we aim to shed light on thorny issues not always at the forefront of discussions about computational text analysis methods. Second, we hope to provide a set of key questions that can guide work in this area. Our guidance is based on our own experiences and is therefore inherently imperfect. Still, given our diversity of disciplinary backgrounds and research practices, we hope to capture a range of ideas and identify commonalities that resonate for many. This leads to our final goal: to help promote interdisciplinary collaborations. Interdisciplinary insights and partnerships are essential for realizing the full potential of any computational text analysis involving social and cultural concepts, and the more we bridge these divides, the more fruitful we believe our work will be.
18
Sentiment Analysis for Words and Fiction Characters From the Perspective of Computational (Neuro-)Poetics. Front Robot AI 2019; 6:53. [PMID: 33501068 PMCID: PMC7805775 DOI: 10.3389/frobt.2019.00053] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 12/10/2018] [Accepted: 06/24/2019] [Indexed: 11/13/2022]
Abstract
Two computational studies provide different sentiment analyses for text segments (e.g., "fearful" passages) and figures (e.g., "Voldemort") from the Harry Potter books (Rowling, 1997, 1998, 1999, 2000, 2003, 2005, 2007) based on a novel simple tool called SentiArt. The tool uses vector space models together with theory-guided, empirically validated label lists to compute the valence of each word in a text by locating its position in a 2d emotion potential space spanned by the words of the vector space model. After testing the tool's accuracy with empirical data from a neurocognitive poetics study, it was applied to compute emotional figure and personality profiles (inspired by the so-called "big five" personality theory) for main characters from the book series. The results of comparative analyses using different machine-learning classifiers (e.g., AdaBoost, Neural Net) show that SentiArt performs very well in predicting the emotion potential of text passages. It also produces plausible predictions regarding the emotional and personality profile of fiction characters which are correctly identified on the basis of eight character features, and it achieves a good cross-validation accuracy in classifying 100 figures into "good" vs. "bad" ones. The results are discussed with regard to potential applications of SentiArt in digital literary, applied reading and neurocognitive poetics studies such as the quantification of the hybrid hero potential of figures.
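The idea of scoring a word by its position relative to label lists in a vector space can be illustrated with a small sketch. This is not the SentiArt implementation: the scoring rule (mean cosine similarity to positive labels minus mean cosine similarity to negative labels), the function names, and the three-dimensional toy vectors are assumptions, and only the valence dimension is covered.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def valence(word, vectors, pos_labels, neg_labels):
    """Assumed scoring rule: closeness to positive labels minus closeness to negative ones."""
    w = vectors[word]
    pos = sum(cosine(w, vectors[label]) for label in pos_labels) / len(pos_labels)
    neg = sum(cosine(w, vectors[label]) for label in neg_labels) / len(neg_labels)
    return pos - neg


# Toy embeddings invented for this example; a real study would load a trained model.
vectors = {
    "joy": [1.0, 0.1, 0.0],
    "love": [0.9, 0.2, 0.1],
    "fear": [-1.0, 0.1, 0.0],
    "hate": [-0.9, 0.2, 0.1],
    "friend": [0.8, 0.3, 0.2],
    "curse": [-0.7, 0.4, 0.1],
}
pos_labels = ["joy", "love"]
neg_labels = ["fear", "hate"]

friend_score = valence("friend", vectors, pos_labels, neg_labels)
curse_score = valence("curse", vectors, pos_labels, neg_labels)
```

Averaging such word scores over a passage, or over the words co-occurring with a character's name, would give passage-level or character-level profiles of the kind the abstract mentions.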
|
19
|
[Not Available]. BERICHTE ZUR WISSENSCHAFTSGESCHICHTE 2018; 41:333-336. [PMID: 32495431 DOI: 10.1002/bewi.201801941] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
|
20
|
Quantifying the Beauty of Words: A Neurocognitive Poetics Perspective. Front Hum Neurosci 2017; 11:622. [PMID: 29311877 PMCID: PMC5742167 DOI: 10.3389/fnhum.2017.00622] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 10/23/2017] [Accepted: 12/07/2017] [Indexed: 11/16/2022] Open
Abstract
In this paper I would like to pave the way for future studies in Computational Stylistics and (Neuro-)Cognitive Poetics by describing procedures for predicting the subjective beauty of words. A set of eight tentative word features is computed via Quantitative Narrative Analysis (QNA), and a novel metric for quantifying word beauty, the aesthetic potential, is proposed. Application of machine learning algorithms fed with these QNA data shows that a classifier of the decision tree family learns very well to split words into beautiful vs. ugly ones. The results shed light on surface and semantic features theoretically relevant for affective-aesthetic processes in literary reading and generate quantitative predictions for neuroaesthetic studies of verbal materials.
|
21
|
Abstract
Previous studies have shown that it is possible to detect macroscopic patterns of cultural change over periods of centuries by analyzing large textual time series, specifically digitized books. This method promises to empower scholars with a quantitative and data-driven tool to study culture and society, but its power has been limited by the use of data from books and simple analytics based essentially on word counts. This study addresses these problems by assembling a vast corpus of regional newspapers from the United Kingdom, incorporating very fine-grained geographical and temporal information that is not available for books. The corpus spans 150 years and is formed by millions of articles, representing 14% of all British regional outlets of the period. Simple content analysis of this corpus allowed us to detect specific events, like wars, epidemics, coronations, or conclaves, with high accuracy, whereas the use of more refined techniques from artificial intelligence enabled us to move beyond counting words by detecting references to named entities. These techniques allowed us to observe both a systematic underrepresentation and a steady increase of women in the news during the 20th century and the change of geographic focus for various concepts. We also estimate the dates when electricity overtook steam and trains overtook horses as a means of transportation, both around the year 1900, along with observing other cultural transitions. We believe that these data-driven approaches can complement the traditional method of close reading in detecting trends of continuity and change in historical corpora.
|
22
|
Less Citation, Less Dissemination: The Case of French Psychoanalysis. Front Psychol 2016; 7:1729. [PMID: 27857704 PMCID: PMC5093306 DOI: 10.3389/fpsyg.2016.01729] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 08/18/2016] [Accepted: 10/19/2016] [Indexed: 11/23/2022] Open
|
23
|
Abstract
Researchers effectively trust the work of others any time they use software tools or custom software. In this article I explore this notion of trusting others, using Digital Humanities as a focus, and drawing on my own experience. Software is inherently flawed and limited, so its use in scholarship demands better practices and terminology for reviewing research software and describing development processes. It is also important to make research software engineers and their work more visible, both for the purposes of review and credit.
|
24
|
Links that speak: the global language network and its association with global fame. Proc Natl Acad Sci U S A 2014; 111:E5616-22. [PMID: 25512502 DOI: 10.1073/pnas.1410931111] [Citation(s) in RCA: 76] [Impact Index Per Article: 7.6] [Indexed: 11/18/2022] Open
Abstract
Languages vary enormously in global importance because of historical, demographic, political, and technological forces. However, beyond simple measures of population and economic power, there has been no rigorous quantitative way to define the global influence of languages. Here we use the structure of the networks connecting multilingual speakers and translated texts, as expressed in book translations, multiple language editions of Wikipedia, and Twitter, to provide a concept of language importance that goes beyond simple economic or demographic measures. We find that the structure of these three global language networks (GLNs) is centered on English as a global hub and around a handful of intermediate hub languages, which include Spanish, German, French, Russian, Portuguese, and Chinese. We validate the measure of a language's centrality in the three GLNs by showing that it exhibits a strong correlation with two independent measures of the number of famous people born in the countries associated with that language. These results suggest that the position of a language in the GLN contributes to the visibility of its speakers and the global popularity of the cultural content they produce.
|