1
Díaz Berenguer A, Da Y, Bossa MN, Oveneke MC, Sahli H. Causality-driven multivariate stock movement forecasting. PLoS One 2024; 19:e0302197. [PMID: 38662755 PMCID: PMC11045085 DOI: 10.1371/journal.pone.0302197] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/05/2024] [Accepted: 03/30/2024] [Indexed: 04/28/2024] Open
Abstract
Our study aims to investigate the interdependence between international stock markets and sentiments from financial news in stock forecasting. We adopt the Temporal Fusion Transformers (TFT) to incorporate intra and inter-market correlations and the interaction between the information flow, i.e. causality, of financial news sentiment and the dynamics of the stock market. The current study distinguishes itself from existing research by adopting Dynamic Transfer Entropy (DTE) to establish an accurate information flow propagation between stock and sentiments. DTE has the advantage of providing time series that mine information flow propagation paths between certain parts of the time series, highlighting marginal events such as spikes or sudden jumps, which are crucial in financial time series. The proposed methodological approach involves the following elements: a FinBERT-based textual analysis of financial news articles to extract sentiment time series, the use of the Transfer Entropy and corresponding heat maps to analyze the net information flows, the calculation of the DTE time series, which are considered as co-occurring covariates of stock Price, and TFT-based stock forecasting. The Dow Jones Industrial Average index of 13 countries, along with daily financial news data obtained through the New York Times API, are used to demonstrate the validity and superiority of the proposed DTE-based causality method along with TFT for accurate stock Price and Return forecasting compared to state-of-the-art time series forecasting methods.
Affiliation(s)
- Abel Díaz Berenguer
  - Department of Electronics and Informatics (ETRO), Vrije Universiteit Brussel (VUB), Brussels, Belgium
- Yifei Da
  - Department of Electronics and Informatics (ETRO), Vrije Universiteit Brussel (VUB), Brussels, Belgium
- Matías Nicolás Bossa
  - Department of Electronics and Informatics (ETRO), Vrije Universiteit Brussel (VUB), Brussels, Belgium
- Hichem Sahli
  - Department of Electronics and Informatics (ETRO), Vrije Universiteit Brussel (VUB), Brussels, Belgium
  - Interuniversity Microelectronics Centre (IMEC), Heverlee, Belgium
2
Singh C, Askari A, Caruana R, Gao J. Augmenting interpretable models with large language models during training. Nat Commun 2023; 14:7913. [PMID: 38036543 PMCID: PMC10689442 DOI: 10.1038/s41467-023-43713-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/28/2023] [Accepted: 11/17/2023] [Indexed: 12/02/2023] Open
Abstract
Recent large language models (LLMs), such as ChatGPT, have demonstrated remarkable prediction performance for a growing array of tasks. However, their proliferation into high-stakes domains and compute-limited settings has created a burgeoning need for interpretability and efficiency. We address this need by proposing Aug-imodels, a framework for leveraging the knowledge learned by LLMs to build extremely efficient and interpretable prediction models. Aug-imodels use LLMs during fitting but not during inference, allowing complete transparency and often a speed/memory improvement of greater than 1000x for inference compared to LLMs. We explore two instantiations of Aug-imodels in natural-language processing: Aug-Linear, which augments a linear model with decoupled embeddings from an LLM and Aug-Tree, which augments a decision tree with LLM feature expansions. Across a variety of text-classification datasets, both outperform their non-augmented, interpretable counterparts. Aug-Linear can even outperform much larger models, e.g. a 6-billion parameter GPT-J model, despite having 10,000x fewer parameters and being fully transparent. We further explore Aug-imodels in a natural-language fMRI study, where they generate interesting interpretations from scientific data.
Affiliation(s)
- Armin Askari
  - University of California, Berkeley, Berkeley, CA, USA
3
Adhikari S, Thapa S, Naseem U, Lu HY, Bharathy G, Prasad M. Explainable hybrid word representations for sentiment analysis of financial news. Neural Netw 2023; 164:115-123. [PMID: 37148607 DOI: 10.1016/j.neunet.2023.04.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/27/2021] [Revised: 01/30/2023] [Accepted: 04/10/2023] [Indexed: 05/08/2023]
Abstract
Due to the increasing interest of people in the stock and financial market, the sentiment analysis of news and texts related to the sector is of utmost importance. This helps the potential investors in deciding what company to invest in and what are their long-term benefits. However, it is challenging to analyze the sentiments of texts related to the financial domain, given the enormous amount of information available. The existing approaches are unable to capture complex attributes of language such as word usage, including semantics and syntax throughout the context, and polysemy in the context. Further, these approaches failed to interpret the models' predictability, which is obscure to humans. Models' interpretability to justify the predictions has remained largely unexplored and has become important to engender users' trust in the predictions by providing insight into the model prediction. Accordingly, in this paper, we present an explainable hybrid word representation that first augments the data to address the class imbalance issue and then integrates three embeddings to involve polysemy in context, semantics, and syntax in a context. We then fed our proposed word representation to a convolutional neural network (CNN) with attention to capture the sentiment. The experimental results show that our model outperforms several baselines of both classic classifiers and combinations of various word embedding models in the sentiment analysis of financial news. The experimental results also show that the proposed model outperforms several baselines of word embeddings and contextual embeddings when they are separately fed to a neural network model. Further, we show the explainability of the proposed method by presenting the visualization results to explain the reason for a prediction in the sentiment analysis of financial news.
Affiliation(s)
- Surabhi Adhikari
  - Department of Computer Science and Engineering, Delhi Technological University, New Delhi, India
- Surendrabikram Thapa
  - Department of Computer Science, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA
- Usman Naseem
  - School of Computer Science, University of Sydney, Sydney, Australia
- Hai Ya Lu
  - School of Computer Science, University of Technology Sydney, Sydney, Australia
- Gnana Bharathy
  - School of Computer Science, University of Technology Sydney, Sydney, Australia
- Mukesh Prasad
  - School of Computer Science, University of Technology Sydney, Sydney, Australia
4
Xiao Q, Ihnaini B. Stock trend prediction using sentiment analysis. PeerJ Comput Sci 2023; 9:e1293. [PMID: 37547393 PMCID: PMC10403218 DOI: 10.7717/peerj-cs.1293] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/22/2022] [Accepted: 02/23/2023] [Indexed: 08/08/2023]
Abstract
These days, the vast amount of data generated on the Internet is a new treasure trove for investors. They can utilize text mining and sentiment analysis techniques to reflect investors' confidence in specific stocks in order to make the most accurate decision. Most previous research just sums up the text sentiment score on each natural day and uses such aggregated score to predict various stock trends. However, the natural day aggregated score may not be useful in predicting different stock trends. Therefore, in this research, we designed two different time divisions: 0:00 on day t to 0:00 on day t+1, and 9:30 on day t to 9:30 on day t+1, to study how tweets and news from the different periods can predict the next-day stock trend. 260,000 tweets and 6,000 news from Service stocks (Amazon, Netflix) and Technology stocks (Apple, Microsoft) were selected to conduct the research. The experimental result shows that opening hours division (9:30 on day t to 9:30 on day t+1) outperformed natural hours division (0:00 on day t to 0:00 on day t+1).
Affiliation(s)
- Qianyi Xiao
  - Department of Computer Science, Wenzhou Kean University, Wenzhou, Zhejiang, China
- Baha Ihnaini
  - Department of Computer Science, Wenzhou Kean University, Wenzhou, Zhejiang, China
5
Kaplan H, Weichselbraun A, Braşoveanu AMP. Integrating Economic Theory, Domain Knowledge, and Social Knowledge into Hybrid Sentiment Models for Predicting Crude Oil Markets. Cognit Comput 2023; 15:1-17. [PMID: 37362197 PMCID: PMC10027267 DOI: 10.1007/s12559-023-10129-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2022] [Accepted: 02/19/2023] [Indexed: 06/28/2023]
Abstract
For several decades, sentiment analysis has been considered a key indicator for assessing market mood and predicting future price changes. Accurately predicting commodity markets requires an understanding of fundamental market dynamics such as the interplay between supply and demand, which are not considered in standard affective models. This paper introduces two domain-specific affective models, CrudeBERT and CrudeBERT+, that adapt sentiment analysis to the crude oil market by incorporating economic theory with common knowledge of the mentioned entities and social knowledge extracted from Google Trends. To evaluate the predictive capabilities of these models, comprehensive experiments were conducted using dynamic time warping to identify the model that best approximates WTI crude oil futures price movements. The evaluation included news headlines and crude oil prices between January 2012 and April 2021. The results show that CrudeBERT+ outperformed RavenPack, BERT, FinBERT, and early CrudeBERT models during the 9-year evaluation period and within most of the individual years that were analyzed. The success of the introduced domain-specific affective models demonstrates the potential of integrating economic theory with sentiment analysis and external knowledge sources to improve the predictive power of financial sentiment analysis models. The experiments also confirm that CrudeBERT+ has the potential to provide valuable insights for decision-making in the crude oil market.
Affiliation(s)
- Himmet Kaplan
  - University of Applied Sciences of the Grisons, Chur, Switzerland
- Albert Weichselbraun
  - University of Applied Sciences of the Grisons, Chur, Switzerland
  - webLyzard technology, Vienna, Austria
6
Suzuki M, Sakaji H, Hirano M, Izumi K. Constructing and analyzing domain-specific language model for financial text mining. Inf Process Manag 2023. [DOI: 10.1016/j.ipm.2022.103194] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/12/2022]
7
A Lexicon Enhanced Collaborative Network for targeted financial sentiment analysis. Inf Process Manag 2023. [DOI: 10.1016/j.ipm.2022.103187] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
8
Du K, Xing F, Cambria E. Incorporating Multiple Knowledge Sources for Targeted Aspect-based Financial Sentiment Analysis. ACM Transactions on Management Information Systems 2023. [DOI: 10.1145/3580480] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/31/2023]
Abstract
Combining symbolic and subsymbolic methods has become a promising strategy as research tasks in AI grow increasingly complicated and require higher levels of understanding. Targeted Aspect-based Financial Sentiment Analysis (TABFSA) is an example of such complicated tasks, as it involves processes like information extraction, information specification, and domain adaptation. However, little is known about the design principles of such hybrid models leveraging external lexical knowledge. To fill this gap, we define anterior, parallel, and posterior knowledge integration and propose incorporating multiple lexical knowledge sources strategically into the fine-tuning process of pre-trained transformer models for TABFSA. Experiments on the FiQA Task 1 and SemEval 2017 Task 5 datasets show that the knowledge-enabled models systematically improve upon their plain deep learning counterparts, and some outperform state-of-the-art results reported in terms of aspect sentiment analysis error. We discover that parallel knowledge integration is the most effective and domain-specific lexical knowledge is more important according to our ablation analysis.
Affiliation(s)
- Kelvin Du
  - School of Computer Science and Engineering, Nanyang Technological University, Singapore
- Frank Xing
  - Department of Information Systems and Analytics, National University of Singapore, Singapore
- Erik Cambria
  - School of Computer Science and Engineering, Nanyang Technological University, Singapore
9
Liapis CM, Karanikola A, Kotsiantis S. Investigating Deep Stock Market Forecasting with Sentiment Analysis. Entropy (Basel, Switzerland) 2023; 25:219. [PMID: 36832586 PMCID: PMC9955765 DOI: 10.3390/e25020219] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2022] [Revised: 01/14/2023] [Accepted: 01/20/2023] [Indexed: 06/18/2023]
Abstract
When forecasting financial time series, incorporating relevant sentiment analysis data into the feature space is a common assumption to increase the capacities of the model. In addition, deep learning architectures and state-of-the-art schemes are increasingly used due to their efficiency. This work compares state-of-the-art methods in financial time series forecasting incorporating sentiment analysis. Through an extensive experimental process, 67 different feature setups consisting of stock closing prices and sentiment scores were tested on a variety of different datasets and metrics. In total, 30 state-of-the-art algorithmic schemes were used over two case studies: one comparing methods and one comparing input feature setups. The aggregated results indicate, on the one hand, the prevalence of a proposed method and, on the other, a conditional improvement in model efficiency after the incorporation of sentiment setups in certain forecast time frames.
10
Costola M, Hinz O, Nofer M, Pelizzon L. Machine learning sentiment analysis, COVID-19 news and stock market reactions. Research in International Business and Finance 2023; 64:101881. [PMID: 36687319 PMCID: PMC9842392 DOI: 10.1016/j.ribaf.2023.101881] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/30/2021] [Revised: 12/23/2022] [Accepted: 01/06/2023] [Indexed: 06/17/2023]
Abstract
The recent COVID-19 pandemic represents an unprecedented worldwide event to study the influence of related news on the financial markets, especially during the early stage of the pandemic when information on the new threat came rapidly and was complex for investors to process. In this paper, we investigate whether the flow of news on COVID-19 had an impact on forming market expectations. We analyze 203,886 online articles dealing with COVID-19 and published on three news platforms (MarketWatch.com, NYTimes.com, and Reuters.com) in the period from January to June 2020. Using machine learning techniques, we extract the news sentiment through a financial market-adapted BERT model that enables recognizing the context of each word in a given item. Our results show that there is a statistically significant and positive relationship between sentiment scores and S&P 500 market. Furthermore, we provide evidence that sentiment components and news categories on NYTimes.com were differently related to market returns.
Affiliation(s)
- Loriana Pelizzon
  - Leibniz Institute for Financial Research SAFE, Frankfurt, Germany
  - Ca' Foscari University of Venice, Italy
11
A novel selective learning based transformer encoder architecture with enhanced word representation. Appl Intell 2022. [DOI: 10.1007/s10489-022-03865-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
12
Abstract
Using sentiment information in the analysis of financial markets has attracted much attention. Natural language processing methods can be used to extract market sentiment information from texts such as news articles. The objective of this paper is to extract financial market sentiment information from news articles and use the estimated sentiment scores to predict the price direction of the stock market index Standard & Poor’s 500. To achieve the best possible performance in sentiment classification, state-of-the-art bidirectional encoder representations from transformers (BERT) models are used. The pretrained transformer networks are fine-tuned on a labeled financial text dataset and applied to news articles from known providers of financial news content to predict their sentiment scores. The generated sentiment scores for the titles of the given news articles, for the (text) content of said news articles, and for the combined title-content consideration are posited against past time series information of the stock market index. To forecast the price direction of the stock market index, the predicted sentiment scores are used in a simple strategy and as features for a random forest classifier. The results show that sentiment scores based on news content are particularly useful for stock price direction prediction.
13
Cerchiello P, Nicola G, Rönnqvist S, Sarlin P. Assessing Banks' Distress Using News and Regular Financial Data. Front Artif Intell 2022; 5:871863. [PMID: 35719688 PMCID: PMC9200951 DOI: 10.3389/frai.2022.871863] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2022] [Accepted: 04/28/2022] [Indexed: 11/13/2022] Open
Abstract
In this paper, we focus our attention on leveraging the information contained in financial news to enhance the performance of a bank distress classifier. The news information should be analyzed and inserted into the predictive model in the most efficient way and this task deals with the issues related to Natural Language interpretation and to the analysis of news media. Among the different models proposed for such purpose, we investigate a deep learning approach. The methodology is based on a distributed representation of textual data obtained from a model (Doc2Vec) that maps the documents and the words contained within a text onto a reduced latent semantic space. Afterwards, a second supervised feed forward fully connected neural network is trained combining news data distributed representations with standard financial figures in input. The goal of the model is to classify the corresponding banks in distressed or tranquil state. The final aim is to comprehend both the improvement of the predictive performance of the classifier and to assess the importance of news data in the classification process. This to understand if news data really bring useful information not contained in standard financial variables.
Affiliation(s)
- Paola Cerchiello
  - Department of Economics and Management, University of Pavia, Pavia, Italy
  - Correspondence: Paola Cerchiello
- Giancarlo Nicola
  - Department of Economics and Management, University of Pavia, Pavia, Italy
- Samuel Rönnqvist
  - Turku Centre for Computer Science, Data Mining Lab, TurkuNLP, University of Turku, Turku, Finland
14
Colasanto F, Grilli L, Santoro D, Villani G. BERT’s sentiment score for portfolio optimization: a fine-tuned view in Black and Litterman model. Neural Comput Appl 2022; 34:17507-17521. [PMID: 35669537 PMCID: PMC9150638 DOI: 10.1007/s00521-022-07403-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2022] [Accepted: 05/04/2022] [Indexed: 11/25/2022]
Abstract
In financial markets, sentiment analysis on natural language sentences can improve forecasting. Many investors rely on information extracted from newspapers or their feelings. Therefore, this information is expressed in their language. Sentiment analysis models classify sentences (or entire texts) with their polarity (positive, negative, or neutral) and derive a sentiment score. In this paper, we use this sentiment (polarity) score to improve the forecasting of stocks and use it as a new “view” in the Black and Litterman model. This score is related to various events (both positive and negative) that have affected some stocks. The sentences used to determine the scores are taken from articles published in Financial Times (an international financial newspaper). To improve the forecast using this average sentiment score, we use a Monte Carlo method to generate a series of possible paths for several trading hours after the article was published to discretize (or approximate) the Wiener measure, which is applied to the paths and returning an exact price as results. Finally, we use the price determined in this way to calculate a yield to be used as views in a new type of “dynamic” portfolio optimization, based on hourly prices. We compare the results by applying the views obtained, disregarding the sentiment and leaving the initial portfolio unchanged.
Affiliation(s)
- Luca Grilli
  - Department of Economics, Management and Territory, University of Foggia, Foggia, Italy
- Domenico Santoro
  - Department of Economics and Finance, University of Bari, Bari, Italy
- Giovanni Villani
  - Department of Economics and Finance, University of Bari, Bari, Italy
15
Anbaee Farimani S, Vafaei Jahan M, Milani Fard A, Tabbakh SRK. Investigating the informativeness of technical indicators and news sentiment in financial market price prediction. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.108742] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 10/18/2022]
16
Sinha A, Kedas S, Kumar R, Malo P. SEntFiN 1.0: Entity‐aware sentiment analysis for financial news. J Assoc Inf Sci Technol 2022. [DOI: 10.1002/asi.24634] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/11/2022]
Affiliation(s)
- Ankur Sinha
  - Production and Quantitative Methods, IIM Ahmedabad, India
- Rishu Kumar
  - Production and Quantitative Methods, IIM Ahmedabad, India
- Pekka Malo
  - Department of Information and Service Economy, Aalto University, Espoo, Finland
17
18
A Multi-Method Survey on the Use of Sentiment Analysis in Multivariate Financial Time Series Forecasting. Entropy 2021; 23:e23121603. [PMID: 34945909 PMCID: PMC8700726 DOI: 10.3390/e23121603] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 11/03/2021] [Revised: 11/25/2021] [Accepted: 11/26/2021] [Indexed: 11/28/2022]
Abstract
In practice, time series forecasting involves the creation of models that generalize data from past values and produce future predictions. Moreover, regarding financial time series forecasting, it can be assumed that the procedure involves phenomena partly shaped by the social environment. Thus, the present work is concerned with the study of the use of sentiment analysis methods in data extracted from social networks and their utilization in multivariate prediction architectures that involve financial data. Through an extensive experimental process, 22 different input setups using such extracted information were tested, over a total of 16 different datasets, under the schemes of 27 different algorithms. The comparisons were structured under two case studies. The first concerns possible improvements in the performance of the forecasts in light of the use of sentiment analysis systems in time series forecasting. The second, having as a framework all the possible versions of the above configuration, concerns the selection of the methods that perform best. The results, as presented by various illustrations, indicate, on the one hand, the conditional improvement of predictability after the use of specific sentiment setups in long-term forecasts and, on the other, a universal predominance of long short-term memory architectures.
19
Fine-Grained Implicit Sentiment in Financial News: Uncovering Hidden Bulls and Bears. Electronics 2021. [DOI: 10.3390/electronics10202554] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/16/2022]
Abstract
The field of sentiment analysis is currently dominated by the detection of attitudes in lexically explicit texts such as user reviews and social media posts. In objective text genres such as economic news, indirect expressions of sentiment are common. Here, a positive or negative attitude toward an entity must be inferred from connotational or real-world knowledge. To capture all expressions of subjectivity, a need exists for fine-grained resources and approaches for implicit sentiment analysis. We present the SENTiVENT corpus of English business news that contains token-level annotations for target spans, polar spans, and implicit polarity (positive, negative, or neutral investor sentiment, respectively). We both directly annotate polar expressions and induce them from existing schema-based event annotations to obtain event-implied implicit sentiment tuples. This results in a large dataset of 12,400 sentiment–target tuples in 288 fully annotated articles. We validate the created resource with an inter-annotator agreement study and a series of coarse- to fine-grained supervised deep-representation-learning experiments. Agreement scores show that our annotations are of substantial quality. The coarse-grained experiments involve classifying the positive, negative, and neutral polarity of known polar expressions and, in clause-based experiments, the detection of positive, negative, neutral, and no-polarity clauses. The gold coarse-grained experiments obtain decent performance (76% accuracy and 63% macro-F1) and clause-based detection shows decreased performance (65% accuracy and 57% macro-F1) with the confusion of neutral and no-polarity. The coarse-grained results demonstrate the feasibility of implicit polarity classification as operationalized in our dataset. 
In the fine-grained experiments, we apply the grid tagging scheme unified model for <polar span, target span, polarity> triplet extraction, which obtains state-of-the-art performance on explicit sentiment in user reviews. We observe a drop in performance on our implicit sentiment corpus compared to the explicit benchmark (22% vs. 76% F1). We find that the current models for explicit sentiment are not directly portable to our implicit task: the larger lexical variety within implicit opinion expressions causes lexical data scarcity. We identify common errors and discuss several recommendations for implicit fine-grained sentiment analysis. Data and source code are available.
20
Daudert T. Exploiting textual and relationship information for fine-grained financial sentiment analysis. Knowl Based Syst 2021. [DOI: 10.1016/j.knosys.2021.107389] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 11/30/2022]
21
Subba B, Kumari S. A heterogeneous stacking ensemble based sentiment analysis framework using multiple word embeddings. Comput Intell 2021. [DOI: 10.1111/coin.12478] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 12/01/2022]
Affiliation(s)
- Basant Subba
  - Department of CSE, National Institute of Technology Hamirpur, Hamirpur, India
- Simpy Kumari
  - Department of CSE, National Institute of Technology Hamirpur, Hamirpur, India
22
Goldberg DM, Zaman N, Brahma A, Aloiso M. Are mortgage loan closing delay risks predictable? A predictive analysis using text mining on discussion threads. J Assoc Inf Sci Technol 2021. [DOI: 10.1002/asi.24559] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2022]
Affiliation(s)
- David M. Goldberg
  - Department of Management Information Systems, Fowler College of Business, San Diego State University, San Diego, California, USA
- Nohel Zaman
  - Department of Information Systems and Business Analytics, College of Business, Loyola Marymount University, Los Angeles, California, USA
- Arin Brahma
  - Department of Information Systems and Business Analytics, College of Business, Loyola Marymount University, Los Angeles, California, USA
- Mariano Aloiso
  - Department of Information Systems and Business Analytics, College of Business, Loyola Marymount University, Los Angeles, California, USA
23
Abstract
The analysis of news in the financial context has gained a prominent interest in the last years. This is because of the possible predictive power of such content especially in terms of associated sentiment/mood. In this paper, we focus on a specific aspect of financial news analysis: how the covered topics modify according to space and time dimensions. To this purpose, we employ a modified version of topic model LDA, the so-called Structural Topic Model (STM), that takes into account covariates as well. Our aim is to study the possible evolution of topics extracted from two well known news archive—Reuters and Bloomberg—and to investigate a causal effect in the diffusion of the news by means of a Granger causality test. Our results show that both the temporal dynamics and the spatial differentiation matter in the news contagion.
24
25
26
Tsai MF, Wang CJ, Chien PC. Discovering Finance Keywords via Continuous-Space Language Models. ACM Transactions on Management Information Systems 2016. [DOI: 10.1145/2948072] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 10/21/2022]
Abstract
The growing amount of public financial data makes it increasingly important to learn how to discover valuable information for financial decision making. This article proposes an approach to discovering financial keywords from a large number of financial reports. In particular, we apply the continuous bag-of-words (CBOW) model, a well-known continuous-space language model, to the textual information in 10-K financial reports to discover new finance keywords. In order to capture word meanings to better locate financial terms, we also present a novel technique to incorporate syntactic information into the CBOW model. Experimental results on four prediction tasks using the discovered keywords demonstrate that our approach is effective for discovering predictability keywords for post-event volatility, stock volatility, abnormal trading volume, and excess return predictions. We also analyze the discovered keywords that attest to the ability of the proposed method to capture both syntactic and contextual information between words. This shows the success of this method when applied to the field of finance.