BERTgrid: Contextualized Embedding for 2D Document Representation and Understanding

@article{Denk2019BERTgridCE,
  title={BERTgrid: Contextualized Embedding for 2D Document Representation and Understanding},
  author={Timo I. Denk and Christian Reisswig},
  journal={ArXiv},
  year={2019},
  volume={abs/1909.04948}
}
For understanding generic documents, information like font sizes, column layout, and generally the positioning of words may carry semantic information that is crucial for solving a downstream document intelligence task. [...] The contextualized embedding vectors are retrieved from a BERT language model. We use BERTgrid in combination with a fully convolutional network on a semantic instance segmentation task for extracting fields from invoices. We demonstrate its performance on tabulated line item [...]
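
The abstract describes placing contextualized word vectors into a 2D grid aligned with the document layout and feeding that grid to a fully convolutional network. The sketch below illustrates only that grid-construction step; it is not the authors' implementation. It assumes Hugging Face transformers and torch, a hypothetical build_bertgrid helper, OCR-provided words with normalized bounding boxes (mocked here), and an arbitrary 64x64 grid resolution chosen purely for illustration.

# Minimal BERTgrid-style sketch (illustrative, not the paper's code).
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()


def build_bertgrid(words, boxes, grid_h=64, grid_w=64):
    # Encode the OCR words as one sequence; word_ids() maps subtokens back
    # to the word they came from (requires a fast tokenizer).
    enc = tokenizer(words, is_split_into_words=True,
                    return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]   # (seq_len, hidden_size)

    grid = torch.zeros(hidden.shape[-1], grid_h, grid_w)
    for tok_idx, w_idx in enumerate(enc.word_ids(batch_index=0)):
        if w_idx is None:                  # skip [CLS] / [SEP]
            continue
        x0, y0, x1, y1 = boxes[w_idx]      # box normalized to [0, 1]
        c0 = int(x0 * grid_w)
        r0 = int(y0 * grid_h)
        c1 = max(int(x1 * grid_w), c0 + 1)
        r1 = max(int(y1 * grid_h), r0 + 1)
        # Later subtokens of the same word simply overwrite earlier ones;
        # good enough for a sketch.
        grid[:, r0:r1, c0:c1] = hidden[tok_idx].unsqueeze(-1).unsqueeze(-1)
    return grid


# Toy usage with two invoice-like words and made-up boxes (x0, y0, x1, y1).
words = ["Total", "42.00"]
boxes = [(0.10, 0.80, 0.25, 0.85), (0.70, 0.80, 0.85, 0.85)]
print(build_bertgrid(words, boxes).shape)   # torch.Size([768, 64, 64])

A downstream segmentation network would consume the resulting (hidden_size, grid_h, grid_w) tensor as an image-like feature map.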
