DeepCPCFG: Deep Learning and Context Free Grammars for End-to-End Information Extraction

F. C. Chua and Nigel P. Duffy
We address the challenge of extracting structured information from business documents without detailed annotations. We propose Deep Conditional Probabilistic Context Free Grammars (DeepCPCFG) to parse two-dimensional complex documents and use Recursive Neural Networks to create an end-to-end system for finding the most probable parse that represents the structured information to be extracted. This system is trained end-to-end with scanned documents as input and only relational-records as labels…
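To make "finding the most probable parse" concrete, here is a minimal sketch of Viterbi CKY over a one-dimensional token sequence with a toy probabilistic grammar. It is not the DeepCPCFG system: the paper parses two-dimensional documents and scores rules with neural networks, whereas this sketch uses fixed, hand-picked rule probabilities. The function name `viterbi_cky`, the grammar symbols (`LineItem`, `Desc`, `Rest`, `Qty`, `Price`), and all probabilities are hypothetical and chosen only for illustration.

```python
import math


def viterbi_cky(tokens, lexical, binary, start):
    """Return (log_prob, tree) for the most probable parse rooted at `start`.

    lexical: {(nonterminal, token): prob}        (hypothetical toy scores)
    binary:  {(parent, left, right): prob}       (hypothetical toy scores)
    Returns None when no parse rooted at `start` covers all tokens.
    """
    n = len(tokens)
    if n == 0:
        return None
    # best[(i, j)][A] = (log_prob, tree) for the best A spanning tokens[i:j]
    best = {(i, j): {} for i in range(n) for j in range(i + 1, n + 1)}

    # Width-1 spans: apply lexical rules A -> token.
    for i, tok in enumerate(tokens):
        cell = best[(i, i + 1)]
        for (a, word), p in lexical.items():
            if word == tok:
                score = math.log(p)
                if a not in cell or score > cell[a][0]:
                    cell[a] = (score, (a, tok))

    # Wider spans: try every split point and every binary rule A -> B C.
    for width in range(2, n + 1):
        for i in range(0, n - width + 1):
            j = i + width
            cell = best[(i, j)]
            for k in range(i + 1, j):
                left, right = best[(i, k)], best[(k, j)]
                for (a, b, c), p in binary.items():
                    if b in left and c in right:
                        score = math.log(p) + left[b][0] + right[c][0]
                        if a not in cell or score > cell[a][0]:
                            cell[a] = (score, (a, left[b][1], right[c][1]))

    return best[(0, n)].get(start)


# Toy invoice-line grammar (all symbols and numbers are illustrative).
lexical = {
    ("Desc", "widget"): 1.0,
    ("Qty", "2"): 0.6,
    ("Price", "2"): 0.4,
    ("Price", "9.99"): 0.9,
    ("Qty", "9.99"): 0.1,
}
binary = {
    ("LineItem", "Desc", "Rest"): 1.0,
    ("Rest", "Qty", "Price"): 0.8,
    ("Rest", "Price", "Qty"): 0.2,
}

result = viterbi_cky(["widget", "2", "9.99"], lexical, binary, start="LineItem")
```

Here the token "2" could be a quantity or a price, and the dynamic program resolves the ambiguity by keeping, for each span and symbol, only the highest-scoring derivation. The root entry of the chart then holds the best tree, which is the kind of structured record an extraction system would read off.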


Kleister: A novel task for Information Extraction involving Long Documents with Complex Layout
A new task, Kleister, is introduced with two new datasets to encourage progress on deeper and more complex Information Extraction (IE), and a pipeline method is proposed as a text-only baseline with different Named Entity Recognition architectures (Flair, BERT, RoBERTa).
Unsupervised Latent Tree Induction with Deep Inside-Outside Recursive Auto-Encoders
DIORA is introduced, a fully unsupervised method for discovering syntax that simultaneously learns representations for constituents within the induced tree; it outperforms previously reported results for unsupervised binary constituency parsing on the benchmark WSJ dataset.
Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks
The Tree-LSTM is introduced, a generalization of LSTMs to tree-structured network topologies that outperforms all existing systems and strong LSTM baselines on two tasks: predicting the semantic relatedness of two sentences and sentiment classification.
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
A new language representation model, BERT, is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers; it can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
Torch-Struct: Deep Structured Prediction Library
This work introduces Torch-Struct, a library for structured prediction designed to take advantage of and integrate with vectorized, auto-differentiation based frameworks, and includes a broad collection of probabilistic structures accessed through a simple and flexible distribution-based API.
CUTIE: Learning to Understand Documents with Convolutional Universal Text Information Extractor
The proposed model, Convolutional Universal Text Information Extractor (CUTIE), applies convolutional neural networks to gridded texts, where texts are embedded as features with semantic connotations, and aims to harness information from both the semantic meaning and the spatial distribution of texts in documents.
Tree-structured decoding with doubly-recurrent neural networks
A novel neural network architecture specifically tailored to tree-structured decoding, which maintains separate depth and width recurrent states and combines them to obtain hidden states for every node in the tree, and exhibits desirable invariance properties over sequential architectures.
Learning nongenerative grammatical models for document analysis
This approach models document layout as a grammar and performs a global search for the optimal parse based on a grammatical cost function. The technique is applied to two document image analysis tasks: page layout structure extraction and mathematical expression interpretation.
Parsing 2-Dimensional Language
2-Dimensional Context-Free Grammar (2D-CFG) for 2-dimensional input text is introduced and efficient parsing algorithms for 2D-CFG are presented. In 2D-CFG, a grammar rule’s right hand side symbols…
Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data
This work presents iterative parameter estimation algorithms for conditional random fields and compares the performance of the resulting models to HMMs and MEMMs on synthetic and natural-language data.