Form 10-Q Itemization

@inproceedings{Zhang2021Form1I,
  title={Form 10-Q Itemization},
  author={Yanci Zhang and Tianming Du and Yujie Sun and Lawrence Donohue and Rui Dai},
  booktitle={Proceedings of the 30th ACM International Conference on Information \& Knowledge Management},
  year={2021}
}
  • Yanci Zhang, Tianming Du, Yujie Sun, Lawrence Donohue, Rui Dai
  • Published 23 April 2021
  • Computer Science, Economics
  • Proceedings of the 30th ACM International Conference on Information & Knowledge Management
The quarterly financial statement, or Form 10-Q, is one of the most frequently required filings for US public companies to disclose financial and other important business information. Due to the massive volume of 10-Q filings and the enormous variations in the reporting format, it has been a long-standing challenge to retrieve item-specific information from 10-Q filings that lack machine-readable hierarchy. This paper presents a solution for itemizing 10-Q files by complementing a rule-based… 
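
The abstract is cut off before it describes the method, so the authors' actual rules are not shown here. As a rough illustration of what a rule-based itemization pass over plain-text filings might look like, the sketch below splits text at lines that begin with an "Item N." heading. The pattern `ITEM_HEADING`, the function `split_items`, and the sample text are all hypothetical and are not taken from the paper. The sketch also ignores that item numbers repeat across Part I and Part II of a 10-Q.

```python
import re

# Hypothetical heading rule: a line starting with "Item 1.", "ITEM 1A:", "Item 2 -", etc.
# Real filings vary far more than this single pattern allows.
ITEM_HEADING = re.compile(
    r"^[ \t]*item[ \t]+(\d+[a-z]?)[ \t]*[.:\-]",
    re.IGNORECASE | re.MULTILINE,
)


def split_items(text):
    """Return (item_label, body) pairs in order of appearance."""
    matches = list(ITEM_HEADING.finditer(text))
    sections = []
    for i, match in enumerate(matches):
        # A section runs from the end of its heading marker to the next heading.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        label = match.group(1).upper()
        body = text[match.end():end].strip()
        sections.append((label, body))
    return sections


# Made-up sample text, for illustration only.
sample = """PART I
Item 1. Financial Statements
Balance sheet goes here.
Item 2. Management's Discussion and Analysis
Revenue increased.
PART II
ITEM 1A. Risk Factors
No material changes.
"""

sections = split_items(sample)
for label, body in sections:
    print(label, "->", body.splitlines()[0])
```

A single regular expression like this breaks down quickly on filings with inconsistent heading styles, which is the kind of format variation the abstract names as the core difficulty.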

A News-based Machine Learning Model for Adaptive Asset Pricing
TLDR
The paper proposes a new asset pricing model, the News Embedding UMAP Selection (NEUS) model, to explain and predict stock returns from financial news, and reports significantly better fit and predictive power than the Fama-French 5-factor model.
Bidding via Clustering Ads Intentions: an Efficient Search Engine Marketing System for E-commerce
TLDR
The paper introduces the end-to-end structure of the bidding system for search engine marketing at Walmart e-commerce, which handles tens of millions of bids each day, and discusses how it was arrived at as a production-efficient solution.
U-Net Convolutional Network for Recognition of Vessels and Materials in Chemistry Lab
TLDR
A UNet convolutional network was applied to recognition of vessels and materials in chemistry lab using the recent Vector-LabPics dataset, which contains 2187 images of materials within mostly transparent vessels in a chemistry lab and other general settings, labeled with 13 classes.
AFTer-UNet: Axial Fusion Transformer UNet for Medical Image Segmentation
TLDR
This paper proposes Axial Fusion Transformer UNet (AFTer-UNet), which takes advantage of both convolutional layers’ capability of extracting detailed features and transformers’ strength in long-sequence modeling, and which has fewer parameters and takes less GPU memory to train than previous transformer-based models.
An Efficient Group-based Search Engine Marketing System for E-Commerce
TLDR
The paper introduces the development and deployment process of the bidding system for search engine marketing on Walmart.com, shows the real-world performance of state-of-the-art deep learning methods, and describes how the production-optimal solutions were found.
High-Dimensional Estimation, Basis Assets, and the Adaptive Multi-Factor Model
TLDR
The paper proposes a new algorithm, the Groupwise Interpretable Basis Selection (GIBS) algorithm, to estimate a new Adaptive Multi-Factor (AMF) asset pricing model, implied by the recently developed Generalized Arbitrage Pricing Theory, which relaxes the convention that the number of risk-factors is small.
Clustering Structure of Microstructure Measures
TLDR
This paper builds a clustering model of market microstructure measures that are popular for predicting stock returns at a 10-second frequency, aiming to predict more accurately with a limited number of predictors, which removes noise and makes the model more interpretable.
