Financial Event Extraction Using Wikipedia-Based Weak Supervision

Liat Ein-Dor, Ariel Gera, Orith Toledo-Ronen, Alon Halfon, Benjamin Sznajder

Extraction of financial and economic events from text has previously been done mostly using rule-based methods, with more recent works employing machine learning techniques. This work is in line with the latter approach, leveraging relevant Wikipedia sections to extract weak labels for sentences describing economic events. Whereas previous weakly supervised approaches required a knowledge base of such events, or corresponding financial figures, our approach requires no such additional data…
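
The weak-labeling idea in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that a sentence inherits an event label from keywords in the title of the Wikipedia section it appears in; the keyword table, the label names, and the functions `weak_label` and `label_sentences` are hypothetical and are not taken from the paper.

```python
# Hypothetical keyword-to-label table; the paper's actual section
# selection and event types are not given in the abstract above.
SECTION_KEYWORDS = {
    "acquisition": "merger_acquisition",
    "merger": "merger_acquisition",
    "bankruptcy": "bankruptcy",
    "lawsuit": "legal",
    "litigation": "legal",
}


def weak_label(section_title):
    """Return a weak event label for a section title, or None if no keyword matches."""
    title = section_title.lower()
    for keyword, label in SECTION_KEYWORDS.items():
        if keyword in title:
            return label
    return None


def label_sentences(sections):
    """Yield (sentence, label) pairs for sentences in sections with a matching title.

    `sections` is an iterable of (section_title, list_of_sentences) pairs.
    Sentences in sections without a matching keyword are skipped.
    """
    for title, sentences in sections:
        label = weak_label(title)
        if label is None:
            continue
        for sentence in sentences:
            yield sentence, label


example_sections = [
    ("Mergers and acquisitions", ["In 2015 the company bought a rival."]),
    ("Early life", ["The founder was born in 1960."]),
]
labeled = list(label_sentences(example_sections))
```

Labels obtained this way are noisy, since not every sentence in a matching section describes an event, which is why such supervision is called weak.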

FEED: A Chinese Financial Event Extraction Dataset Constructed by Distant Supervision

FEED, a large-scale Chinese financial event extraction dataset, is released. It consists of 31,748 documents on five financial event types, derived from Chinese financial portals, and covers cases where event arguments are scattered across multiple sentences and where one document contains multiple events.

Extracting Fine-Grained Economic Events from Business News

It is shown that single-token triggers do not provide sufficient discriminative information for a fine-grained event detection setup in a closed domain such as economics, since many classes have a large degree of lexico-semantic and contextual overlap.

TDJEE: A Document-Level Joint Model for Financial Event Extraction

A relation-aware Transformer-based Document-level Joint Event Extraction model (TDJEE) is proposed, which encodes relations between words into the context and uses a modified Transformer to capture document-level information for filling event arguments.

Effective Use of Graph Convolution Network and Contextual Sub-Tree for Commodity News Event Extraction

This paper proposes an effective use of Graph Convolutional Networks with a pruned dependency parse tree, termed contextual sub-tree, for better event extraction in commodity news.

Event detection in finance using hierarchical clustering algorithms on news and tweets

A real-time, domain-specific, clustering-based event-detection approach is presented that integrates textual information from traditional newswires and from microblogging platforms. It is effective in extracting meaningful information from real-world events and in spotting hot events in the financial sphere.

CoFiF Plus: A French Financial Narrative Summarisation Corpus

CoFiF Plus is presented, the first French financial narrative summarisation dataset, providing a comprehensive set of financial texts written in French and composed of 1,703 reports from the most capitalised companies in France, covering a time frame from 1995 to 2021.

Unsupervised Extraction of Market Moving Events with Neural Attention

The authors' experiments suggest that the weights indeed skew the global set of events towards those categories that are more relevant to explaining the price change; this effect reflects the performance of the network on stock prediction.

An Annotated Commodity News Corpus for Event Extraction

This is the first corpus that is annotated with elements which are crucial for event extraction from commodity news, which can then be used for commodity price prediction.

Utilizing coarse-grained data in low-data settings for event extraction

This work investigates the feasibility of integrating coarse-grained data (document or sentence labels), which is far easier to obtain, instead of annotating more documents, and utilizes a multi-task model with two auxiliary tasks: document and sentence binary classification.

FiNER: Financial Numeric Entity Recognition for XBRL Tagging

It is shown that subword fragmentation of numeric expressions harms BERT’s performance, allowing word-level BiLSTMs to perform better, and two simple and effective solutions are proposed that replace numeric expressions with pseudo-tokens reflecting original token shapes and numeric magnitudes.
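
As a rough illustration of the shape-based variant, the sketch below replaces each numeric token with a pseudo-token in which every digit becomes `X`, so that a subword tokenizer sees one unit instead of several fragments. The regular expression, the bracketed token format, and the function names `shape_token` and `mask_numbers` are assumptions made for this example, not the paper's implementation, and the magnitude-based variant is not covered.

```python
import re

# Assumed definition of a numeric token: digits with optional thousands
# separators and an optional decimal part, e.g. "53.2" or "1,250".
NUMERIC = re.compile(r"^\d[\d,]*(\.\d+)?$")


def shape_token(token):
    """Map a numeric token to a bracketed shape pseudo-token; leave other tokens unchanged."""
    if NUMERIC.match(token):
        return "[" + re.sub(r"\d", "X", token) + "]"
    return token


def mask_numbers(tokens):
    """Apply shape_token to every token in a pre-tokenized sentence."""
    return [shape_token(t) for t in tokens]


masked = mask_numbers(["Revenue", "rose", "to", "53.2", "million"])
# masked == ["Revenue", "rose", "to", "[XX.X]", "million"]
```

In practice each pseudo-token would also have to be added to the model's vocabulary so that it is not split again; that step is omitted here.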



Economic Event Detection in Company-Specific News Text

A dataset and supervised classification approach for economic event detection in English news articles shows satisfactory results for most event types, with the linear kernel SVM outperforming the other experimental set-ups.

Semantics-based information extraction for detecting economic events

The Semantics-Based Pipeline for Economic Event Detection (SPEED), focusing on extracting financial events from news articles and annotating these with meta-data at a speed that enables real-time use, is proposed.

A Survey of event extraction methods from text for decision support systems

An Overview of Event Extraction from Text

This literature survey reviews text mining techniques that are employed for various event extraction purposes and provides general guidelines on how to choose a particular event extraction technique depending on the user, the available content, and the scenario of use.

Semantic Frames to Predict Stock Price Movement

This work introduces a novel tree representation, and uses it to train predictive models with tree kernels using support vector machines, and shows that features derived from semantic frame parsing have significantly better performance across years on the polarity task.

Open domain event extraction from twitter

TwiCal, the first open-domain event-extraction and categorization system for Twitter, is described, and a novel approach for discovering important event categories and classifying extracted events based on latent variable models is presented.

Using Structured Events to Predict Stock Price Movement: An Empirical Investigation

This work proposes to adapt Open IE technology for event-based stock price movement prediction, extracting structured events from large-scale public news without manual effort; the approach outperforms bag-of-words-based baselines and previous systems trained on S&P 500 stock historical data.

Ontology-Based Information and Event Extraction for Business Intelligence

BEECON is the first ontology-based system for business documents analysis that is able to detect 41 different types of business events from unstructured sources of information.

A brief introduction to weakly supervised learning

This article reviews some research progress in weakly supervised learning, focusing on three typical types of weak supervision: incomplete supervision, where only a subset of training data is given with labels; inexact supervision, where the training data are given with only coarse-grained labels; and inaccurate supervision, where the given labels are not always ground-truth.