Zero-Revelation RegTech: Detecting Risk through Linguistic Analysis of Corporate Emails and News

Authors: Sanjiv Ranjan Das, Seoyoung Kim, Bhushan Kothari
Journal: Accounting Technology & Information Systems eJournal
In this paper, we demonstrate how an applied linguistics platform may be used to parse corporate email content and news to assess factors predicting escalating risk or the gradual shifting of other critical characteristics within the firm before they are eventually manifested in observable data and financial outcomes. We find that email content and news articles meaningfully predict increased risk and potential malaise. We also find that other structural characteristics, such as the average… 
NLP Analytics in Finance with DoRe: A French 250M Tokens Corpus of Corporate Annual Reports
This paper reports on the construction of the DoRe corpus, which is designed to be as modular as possible to allow maximum reuse across tasks pertaining to Economics, Finance and Regulation, and on the spectrum of possible uses of this new resource for NLP applications.
A Stochastic Time Series Model for Predicting Financial Trends using NLP
A novel deep learning model called ST-GAN, or Stochastic Time-series Generative Adversarial Network, is proposed that analyzes both financial news texts and financial numerical data to predict stock trends.
Regulatory technology: replacing law with computer code
In the UK both the Bank of England and the Financial Conduct Authority have recently carried out experiments using new digital technology for regulatory purposes. The idea is to replace rules written


Tweets and Trades: The Information Content of Stock Microblogs
Microblogging forums (e.g., Twitter) have become a vibrant online platform for exchanging stock-related information. Using methods from computational linguistics, we analyse roughly 250,000 stock-
More than Words: Quantifying Language to Measure Firms' Fundamentals
We examine whether a simple quantitative measure of language can be used to predict individual firms' accounting earnings and stock returns. Our three main findings are: (1) the fraction of negative
Is All That Talk Just Noise? The Information Content of Internet Stock Message Boards
It is found that stock messages posted on Yahoo! Finance and Raging Bull help predict market volatility, and that disagreement among the posted messages is associated with increased trading volume.
When is a Liability not a Liability? Textual Analysis, Dictionaries, and 10-Ks
Previous research uses negative word counts to measure the tone of a text. We show that word lists developed for other disciplines misclassify common words in financial text. In a large sample of 10
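The tone measure mentioned in this entry, a count of negative words relative to document length, can be sketched as follows. This is a minimal illustration only: the word list `NEGATIVE_WORDS` and the function name `negative_fraction` are hypothetical and are not taken from any published dictionary or from the paper itself.

```python
import re

# Hypothetical, tiny word list for illustration only.
# It is NOT a published finance dictionary.
NEGATIVE_WORDS = {"loss", "losses", "decline", "impairment", "litigation", "adverse"}


def negative_fraction(text, negative_words=NEGATIVE_WORDS):
    """Return the share of alphabetic tokens in `text` found in `negative_words`."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in negative_words)
    return hits / len(tokens)


if __name__ == "__main__":
    sample = "The firm reported a loss and a decline in revenue"
    print(negative_fraction(sample))
```

Because the score depends entirely on the word list passed in, swapping a general-purpose list for a finance-specific one changes which words count as negative, which is the sensitivity the entry above describes.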
News or Noise? Internet Postings and Stock Prices
The anecdotal evidence is growing that postings in Internet financial forums affect stock prices, either because the postings contain new information or because they represent successful attempts to
Text and Context: Language Analytics in Finance
This monograph surveys the technology and empirics of text analytics in finance. I present various tools of information extraction and basic text analytics. I survey a range of techniques of
Which News Moves Stock Prices? A Textual Analysis
A basic tenet of financial economics is that asset prices change in response to unexpected fundamental information. Since Roll's (1988) provocative presidential address that showed little relation
Measuring Readability in Financial Disclosures
It is reported that 10-K document file size provides a simple readability proxy that outperforms the Fog Index, does not require document parsing, facilitates replication, and is correlated with alternative readability constructs.
Yahoo! for Amazon: Sentiment Extraction from Small Talk on the Web
A methodology for extracting small investor sentiment from stock message boards is developed, which comprises different classifier algorithms coupled together by a voting scheme, with accuracy levels similar to those of widely used Bayes classifiers.
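The idea of coupling several classifiers through a voting scheme can be sketched as below. This is a generic majority vote written under stated assumptions, not the authors' implementation: the function `vote`, the `fallback` label, and the three toy keyword classifiers are all hypothetical.

```python
from collections import Counter


def vote(message, classifiers, fallback="unclassified"):
    """Label `message` by strict majority among `classifiers`.

    Each classifier is a callable mapping a message to a label.
    If no label is chosen by more than half of them, return `fallback`.
    """
    labels = [classify(message) for classify in classifiers]
    label, count = Counter(labels).most_common(1)[0]
    return label if count > len(labels) / 2 else fallback


# Hypothetical toy classifiers, for illustration only.
def keyword_buy(message):
    return "buy" if "buy" in message.lower() else "sell"


def keyword_strong(message):
    return "buy" if "strong" in message.lower() else "sell"


def keyword_upgrade(message):
    return "buy" if "upgrade" in message.lower() else "hold"


if __name__ == "__main__":
    ensemble = [keyword_buy, keyword_strong, keyword_upgrade]
    print(vote("Strong quarter, time to buy", ensemble))
```

Requiring a strict majority means a message on which the classifiers disagree is left unclassified rather than forced into a label, which trades coverage for fewer false positives.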
Financial text mining: Supporting decision making using web 2.0 content
Deep penetration of personal computers, data communication networks, and the Internet has created a massive platform for data collection, dissemination, storage, and retrieval that presents unique challenges in comprehending the meaning and implication of financial documents and investment strategies from a large collection of Web-based textual data.