TableFormer: Table Structure Understanding with Transformers

Ahmed Samy Nassar, Nikolaos Livathinos, Maksym Lysak, Peter W. J. Staar
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Tables organize valuable content in a concise and compact representation. This content is extremely valuable for systems such as search engines and Knowledge Graphs, since it enhances their predictive capabilities. Unfortunately, tables come in a large variety of shapes and sizes. Furthermore, they can have complex column/row-header configurations, multiline rows, a variety of separation lines, missing entries, etc. As such, the correct identification of the table structure from…

Deep learning for table detection and structure recognition: A survey

The goals of this survey are to provide a profound comprehension of the major developments in the field of Table Detection, offer insight into the different methodologies, and provide a systematic taxonomy of the different approaches.

SEMv2: Table Separation Line Detection Based on Conditional Convolution

This work proposes an accurate table structure recognizer, termed SEMv2 (SEM: Split, Embed and Merge), and designs the "split" stage in a top-down manner that first detects each table separation line instance and then dynamically predicts the table separation line mask for each instance.

FETA: Towards Specializing Foundation Models for Expert Task Applications

This paper proposes a first-of-its-kind FETA benchmark built around the task of teaching FMs to understand technical documentation by learning to match graphical illustrations to their corresponding language descriptions, and provides multiple baselines and an analysis of popular FMs on FETA.

Aligning benchmark datasets for table structure recognition

This work shows that aligning these benchmarks with a single model architecture, the Table Transformer, improves model performance significantly, and shows through ablations over the modification steps that canonicalization of the table annotations has a significantly positive effect on performance.

Toward a Unified Framework for Unsupervised Complex Tabular Reasoning

A framework for unsupervised complex tabular reasoning (UCTR), which generates sufficient and diverse synthetic data with complex logic for tabular reasoning tasks, assuming no human-annotated data at all, and can substantially boost supervised performance in low-resourced domains as a data augmentation technique.

Delivering Document Conversion as a Cloud Service with High Throughput and Responsiveness

The requirements, design, and implementation choices of the document conversion service are outlined, and the best-performing method achieves sustained throughput of over one million PDF pages per hour on 3072 CPU cores across 192 nodes.
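
As a rough back-of-the-envelope reading of these figures, the sketch below derives per-core and per-node rates from the numbers quoted in the summary. It assumes, purely for illustration, that the load is spread evenly over all cores and that throughput is exactly one million pages per hour (the summary says "over"), so the results are indicative bounds rather than measurements from the paper.

```python
# Figures quoted in the summary above.
pages_per_hour = 1_000_000   # reported as "over one million", so a lower bound
cpu_cores = 3072
nodes = 192

# Derived quantities, assuming an even spread of work (an illustrative assumption).
cores_per_node = cpu_cores // nodes                        # 16
pages_per_core_hour = pages_per_hour / cpu_cores           # about 325.5
pages_per_node_hour = pages_per_hour / nodes               # about 5208.3
seconds_per_page_per_core = 3600 / pages_per_core_hour     # about 11.06

print(f"cores per node:            {cores_per_node}")
print(f"pages per core per hour:   {pages_per_core_hour:.1f}")
print(f"pages per node per hour:   {pages_per_node_hour:.1f}")
print(f"core-seconds per page:     {seconds_per_page_per_core:.2f}")
```

Under these assumptions, each page costs at most roughly 11 core-seconds of processing on average.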

DocILE 2023 Teaser: Document Information Localization and Extraction

The DocILE 2023 competition will run the first major benchmark for the tasks of Key Information Localization and Extraction and Line Item Recognition from business documents, and comes with the largest publicly available dataset for KILE and LIR.

Relative Layout Matching for Document Data Extraction

This thesis explores the field of business document information extraction, emphasizing one-shot learning systems that improve their performance by utilizing a database of previously processed documents by proposing a novel representation-learning approach.

Business Document Information Extraction: Towards Practical Benchmarks

There is a lack of relevant datasets and benchmarks for Document IE on semi-structured business documents as their content is typically legally protected or sensitive, and potential sources of available documents including synthetic data are discussed.

Improving Table Structure Recognition with Visual-Alignment Sequential Coordinate Modeling

An end-to-end sequential modeling framework for table structure recognition called VAST, which contains a novel coordinate sequence decoder triggered by the representation of the non-empty cell from the logical structure decoder and an auxiliary visual-alignment loss to enforce the logical representation to contain more local visual details, which helps produce better cell bounding boxes.



Global Table Extractor (GTE): A Framework for Joint Table Identification and Cell Structure Recognition Using Visual Context

This work presents Global Table Extractor, a vision-guided systematic framework for joint table detection and cell structure recognition, which could be built on top of any object detection model, and GTE-Cell, a new hierarchical cell detection network that leverages table styles.

Image-based table recognition: data, model, and evaluation

The largest publicly available table recognition dataset PubTabNet is developed, containing 568k table images with corresponding structured HTML representation, and a new Tree-Edit-Distance-based Similarity (TEDS) metric for table recognition is proposed, which more appropriately captures multi-hop cell misalignment and OCR errors than the pre-established metric.
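
To make the idea behind TEDS concrete, here is a minimal sketch that scores two small table trees as 1 - EditDist(Ta, Tb) / max(|Ta|, |Tb|). It is a simplification, not the reference implementation: it uses unit costs for every insert, delete, and relabel, ignores cell text entirely (the published metric also accounts for cell content), and uses a plain memoized recursion over ordered forests that is only practical for small trees. The helper names `node`, `size`, `forest_distance`, and `teds` are made up for this example.

```python
from functools import lru_cache

# A tree is a (label, children) pair, where children is a tuple of trees.
# A forest is a tuple of trees. Tuples keep everything hashable for caching.

def node(label, *children):
    """Build a tree node (hypothetical helper for this sketch)."""
    return (label, tuple(children))

def size(forest):
    """Count all nodes in a forest."""
    return sum(1 + size(children) for _, children in forest)

@lru_cache(maxsize=None)
def forest_distance(f1, f2):
    """Unit-cost edit distance between two ordered forests."""
    if not f1:
        return size(f2)          # insert everything that remains in f2
    if not f2:
        return size(f1)          # delete everything that remains in f1
    (label1, kids1), (label2, kids2) = f1[-1], f2[-1]
    # Delete the root of the rightmost tree in f1; its children move up.
    delete = forest_distance(f1[:-1] + kids1, f2) + 1
    # Insert the root of the rightmost tree in f2.
    insert = forest_distance(f1, f2[:-1] + kids2) + 1
    # Match the two rightmost roots, relabeling if their labels differ.
    match = (forest_distance(f1[:-1], f2[:-1])
             + forest_distance(kids1, kids2)
             + (label1 != label2))
    return min(delete, insert, match)

def teds(tree_a, tree_b):
    """Structure-only similarity in [0, 1]; 1.0 means identical trees."""
    distance = forest_distance((tree_a,), (tree_b,))
    return 1.0 - distance / max(size((tree_a,)), size((tree_b,)))

two_cells = node("table", node("tr", node("td"), node("td")))
three_cells = node("table", node("tr", node("td"), node("td"), node("td")))

print(teds(two_cells, two_cells))
print(teds(two_cells, three_cells))
```

In this example the two trees differ by one inserted `td` node, so the distance is 1 and the larger tree has 5 nodes, which gives a score of 1 - 1/5.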

Parsing Table Structures in the Wild

This paper aims to establish a practical table structure parsing system for real-world scenarios where tabular input images are taken or scanned with severe deformation, bending, or occlusions, and proposes an approach named Cycle-CenterNet, built on top of CenterNet, with a novel cycle-pairing module to simultaneously detect and group tabular cells into structured tables.

TableNet: Deep Learning Model for End-to-end Table Detection and Tabular Data Extraction from Scanned Document Images

The proposed TableNet is a novel end-to-end deep learning model that exploits the interdependence between the twin tasks of table detection and table structure recognition to segment out the table and column regions.

Complicated Table Structure Recognition

A novel graph neural network is proposed that takes table cells as input and recognizes table structures by predicting relations among cells; it is highly effective for complicated tables and outperforms state-of-the-art baselines on a benchmark dataset and a newly constructed dataset.

ReS2TIM: Reconstruct Syntactic Structures from Table Images

This paper presents a novel framework to convert a table image into its syntactic representation through the relationships between its cells, and builds a cell relationship network to predict the neighbors of each cell in four directions.
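
The kind of output such a representation describes can be illustrated with a small sketch. The code below does not reproduce the paper's network, which predicts neighbors from a table image; it only derives the four-direction neighbor sets from already-known row and column spans, to show what a cell-neighbor encoding of a table looks like. The `Cell` class, its fields, and the function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Cell:
    """A table cell with a known grid position (hypothetical structure)."""
    name: str
    row: int
    col: int
    rowspan: int = 1
    colspan: int = 1

# Offsets (row delta, column delta) for the four directions.
DIRECTIONS = {"left": (0, -1), "right": (0, 1), "top": (-1, 0), "bottom": (1, 0)}

def build_grid(cells):
    """Map every occupied (row, col) grid slot to the cell covering it."""
    grid = {}
    for cell in cells:
        for r in range(cell.row, cell.row + cell.rowspan):
            for c in range(cell.col, cell.col + cell.colspan):
                grid[(r, c)] = cell
    return grid

def neighbors(cells):
    """Return, for each cell name, the names of adjacent cells per direction."""
    grid = build_grid(cells)
    result = {cell.name: {d: set() for d in DIRECTIONS} for cell in cells}
    for (r, c), cell in grid.items():
        for direction, (dr, dc) in DIRECTIONS.items():
            other = grid.get((r + dr, c + dc))
            if other is not None and other.name != cell.name:
                result[cell.name][direction].add(other.name)
    return result

# A header spanning two columns, with two body cells beneath it.
table = [
    Cell("header", row=0, col=0, colspan=2),
    Cell("a", row=1, col=0),
    Cell("b", row=1, col=1),
]
relations = neighbors(table)
print(relations["header"]["bottom"])
print(relations["a"]["right"])
```

A spanning cell can have several neighbors in one direction, which is why each direction holds a set rather than a single cell.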

DeepTabStR: Deep Learning based Table Structure Recognition

A novel method for the analysis of tabular structures in document images using the potential of deformable convolutional networks using the famous Page-Object Detection dataset, and a new image-based table structure recognition dataset, TabStructDB2, comprising 1081 tables densely labeled with row and column information.

TGRNet: A Table Graph Reconstruction Network for Table Structure Recognition

An end-to-end trainable table graph reconstruction network (TGRNet) for table structure recognition that has two main branches, a cell detection branch and a cell logical location branch, to jointly predict the spatial location and the logical location of different cells.

ICDAR 2021 Competition on Scientific Table Image Recognition to LaTeX

The dataset, tasks, participants' methods, and results of the ICDAR 2021 Competition on Scientific Table Image Recognition to LaTeX are discussed; the datasets and ground truth specification, the performance evaluation metrics used, and the final results are presented.

GFTE: Graph-based Financial Table Extraction

A standard Chinese dataset named FinTab is published, which contains more than 1,600 financial tables of diverse kinds and their corresponding structure representation in JSON, and a novel graph-based convolutional neural network model named GFTE is proposed as a baseline for future comparison.