TableFormer: Table Structure Understanding with Transformers
@article{Nassar2022TableFormerTS, title={TableFormer: Table Structure Understanding with Transformers}, author={Ahmed Samy Nassar and Nikolaos Livathinos and Maksym Lysak and Peter W. J. Staar}, journal={2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)}, year={2022}, pages={4604-4613} }
Tables organize valuable content in a concise and compact representation. This content is extremely valuable for systems such as search engines, Knowledge Graphs, etc., since they enhance their predictive capabilities. Unfortunately, tables come in a large variety of shapes and sizes. Furthermore, they can have complex column/row-header configurations, multiline rows, a variety of separation lines, missing entries, etc. As such, the correct identification of the table-structure from…
12 Citations
Deep learning for table detection and structure recognition: A survey
- Computer Science, ArXiv
- 2022
The goals of this survey are to provide a profound comprehension of the major developments in the field of Table Detection, offer insight into the different methodologies, and provide a systematic taxonomy of the different approaches.
SEMv2: Table Separation Line Detection Based on Conditional Convolution
- Computer Science, ArXiv
- 2023
This work proposes an accurate table structure recognizer, termed SEMv2 (SEM: Split, Embed and Merge), and designs the "split" stage in a top-down manner that detects each table separation line instance first and then dynamically predicts the table separation line mask for each instance.
FETA: Towards Specializing Foundation Models for Expert Task Applications
- Computer Science, ArXiv
- 2022
This paper proposes a first-of-its-kind FETA benchmark built around the task of teaching FMs to understand technical documentation by learning to match graphical illustrations to their corresponding language descriptions, and provides multiple baselines and an analysis of popular FMs on FETA.
Aligning benchmark datasets for table structure recognition
- Computer Science
- 2023
This work shows that aligning these benchmarks with a single model architecture, the Table Transformer, improves model performance significantly, and shows through ablations over the modification steps that canonicalization of the table annotations has a significantly positive effect on performance.
Toward a Unified Framework for Unsupervised Complex Tabular Reasoning
- Computer Science, ArXiv
- 2022
A framework for unsupervised complex tabular reasoning (UCTR), which generates sufficient and diverse synthetic data with complex logic for tabular reasoning tasks, assuming no human-annotated data at all, and can substantially boost the supervised performance in low-resourced domains as a data augmentation technique.
Delivering Document Conversion as a Cloud Service with High Throughput and Responsiveness
- Computer Science, 2022 IEEE 15th International Conference on Cloud Computing (CLOUD)
- 2022
The requirements, design, and implementation choices of the document conversion service are outlined, and the best-performing method achieves sustained throughput of over one million PDF pages per hour on 3072 CPU cores across 192 nodes.
DocILE 2023 Teaser: Document Information Localization and Extraction
- Computer Science, ECIR
- 2023
The DocILE 2023 competition will run the first major benchmark for the tasks of Key Information Localization and Extraction and Line Item Recognition from business documents, and comes with the largest publicly available dataset for KILE and LIR.
Relative Layout Matching for Document Data Extraction
- Computer Science
- 2022
This thesis explores the field of business document information extraction, emphasizing one-shot learning systems that improve their performance by utilizing a database of previously processed documents, and proposes a novel representation-learning approach.
Business Document Information Extraction: Towards Practical Benchmarks
- Computer Science, CLEF
- 2022
There is a lack of relevant datasets and benchmarks for Document IE on semi-structured business documents as their content is typically legally protected or sensitive, and potential sources of available documents including synthetic data are discussed.
Improving Table Structure Recognition with Visual-Alignment Sequential Coordinate Modeling
- Computer Science, ArXiv
- 2023
An end-to-end sequential modeling framework for table structure recognition called VAST, which contains a novel coordinate sequence decoder triggered by the representation of the non-empty cell from the logical structure decoder and an auxiliary visual-alignment loss to enforce the logical representation to contain more local visual details, which helps produce better cell bounding boxes.
References
SHOWING 1-10 OF 37 REFERENCES
Global Table Extractor (GTE): A Framework for Joint Table Identification and Cell Structure Recognition Using Visual Context
- Computer Science, 2021 IEEE Winter Conference on Applications of Computer Vision (WACV)
- 2021
This work presents Global Table Extractor, a vision-guided systematic framework for joint table detection and cell structure recognition, which could be built on top of any object detection model, and GTE-Cell, a new hierarchical cell detection network that leverages table styles.
Image-based table recognition: data, model, and evaluation
- Computer Science, ECCV
- 2020
The largest publicly available table recognition dataset PubTabNet is developed, containing 568k table images with corresponding structured HTML representation, and a new Tree-Edit-Distance-based Similarity (TEDS) metric for table recognition is proposed, which more appropriately captures multi-hop cell misalignment and OCR errors than the pre-established metric.
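The TEDS metric mentioned above scores a predicted table against the ground truth as one minus the tree edit distance between their HTML trees, divided by the node count of the larger tree. The following is a minimal sketch of that idea, not the PubTabNet reference implementation. The names `node`, `size`, `forest_dist`, and `teds` are hypothetical, and the sketch assumes a unit cost for every insert, delete, or relabel; the published metric additionally compares cell text with a normalized string edit distance, which is omitted here.

```python
from functools import lru_cache

# A tree is a pair (label, children), where children is a tuple of trees.
# Tuples are used so that forests are hashable and can be memoized.
def node(label, *children):
    return (label, tuple(children))

def size(tree):
    """Number of nodes in the tree."""
    return 1 + sum(size(child) for child in tree[1])

@lru_cache(maxsize=None)
def forest_dist(f, g):
    """Ordered forest edit distance with unit costs (simplifying assumption)."""
    if not f and not g:
        return 0
    if not g:
        # Delete the rightmost root of f; its children take its place.
        v = f[-1]
        return forest_dist(f[:-1] + v[1], g) + 1
    if not f:
        # Insert the rightmost root of g.
        w = g[-1]
        return forest_dist(f, g[:-1] + w[1]) + 1
    v, w = f[-1], g[-1]
    delete = forest_dist(f[:-1] + v[1], g) + 1
    insert = forest_dist(f, g[:-1] + w[1]) + 1
    # Match v with w: compare their subtrees and the remaining forests,
    # paying 1 if the labels differ.
    match = (forest_dist(v[1], w[1])
             + forest_dist(f[:-1], g[:-1])
             + (v[0] != w[0]))
    return min(delete, insert, match)

def teds(a, b):
    """Structure-only TEDS: 1 - EditDist(a, b) / max(|a|, |b|)."""
    return 1.0 - forest_dist((a,), (b,)) / max(size(a), size(b))

# Hypothetical example: a 2x2 table versus a prediction missing one cell.
truth = node("table",
             node("tr", node("td"), node("td")),
             node("tr", node("td"), node("td")))
pred = node("table",
            node("tr", node("td"), node("td")),
            node("tr", node("td")))
score = teds(truth, pred)
```

The memoized recursion is exponential in the worst case and is only meant for small illustrative trees; a production evaluation would use an optimized tree edit distance algorithm.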
Parsing Table Structures in the Wild
- Computer Science, 2021 IEEE/CVF International Conference on Computer Vision (ICCV)
- 2021
This paper aims to establish a practical table structure parsing system for real-world scenarios where tabular input images are taken or scanned with severe deformation, bending or occlusions, and proposes an approach named Cycle-CenterNet on top of CenterNet with a novel cycle-pairing module to simultaneously detect and group tabular cells into structured tables.
TableNet: Deep Learning Model for End-to-end Table Detection and Tabular Data Extraction from Scanned Document Images
- Computer Science, 2019 International Conference on Document Analysis and Recognition (ICDAR)
- 2019
The proposed TableNet is a novel end-to-end deep learning model that exploits the interdependence between the twin tasks of table detection and table structure recognition to segment out the table and column regions.
Complicated Table Structure Recognition
- Computer Science, ArXiv
- 2019
A novel graph neural network that takes table cells as input and recognizes table structures by predicting relations among cells is proposed, which is highly effective for complicated tables and outperforms state-of-the-art baselines on a benchmark dataset and a newly constructed dataset.
ReS2TIM: Reconstruct Syntactic Structures from Table Images
- Computer Science, 2019 International Conference on Document Analysis and Recognition (ICDAR)
- 2019
This paper presents a novel framework to convert a table image into its syntactic representation through the relationships between its cells, and builds a cell relationship network to predict the neighbors of each cell in four directions.
DeepTabStR: Deep Learning based Table Structure Recognition
- Computer Science, 2019 International Conference on Document Analysis and Recognition (ICDAR)
- 2019
A novel method for the analysis of tabular structures in document images using deformable convolutional networks and the famous Page-Object Detection dataset, and a new image-based table structure recognition dataset, TabStructDB2, comprising 1081 tables densely labeled with row and column information.
TGRNet: A Table Graph Reconstruction Network for Table Structure Recognition
- Computer Science, 2021 IEEE/CVF International Conference on Computer Vision (ICCV)
- 2021
An end-to-end trainable table graph reconstruction network (TGRNet) for table structure recognition that has two main branches, a cell detection branch and a cell logical location branch, to jointly predict the spatial location and the logical location of different cells.
ICDAR 2021 Competition on Scientific Table Image Recognition to LaTeX
- Computer Science, ICDAR
- 2021
The dataset, tasks, participants’ methods, and results of the ICDAR 2021 Competition on Scientific Table Image Recognition to LaTeX are discussed; the datasets and ground truth specification are described, along with the performance evaluation metrics used, and the final results are presented.
GFTE: Graph-based Financial Table Extraction
- Computer Science, ICPR Workshops
- 2020
A standard Chinese dataset named FinTab is published, which contains more than 1,600 financial tables of diverse kinds and their corresponding structure representation in JSON, and a novel graph-based convolutional neural network model named GFTE is proposed as a baseline for future comparison.