CORNET: A neurosymbolic approach to learning conditional table formatting rules by example

Mukul Singh, José Pablo Cambronero Sánchez, Sumit Gulwani, Vu Le, Carina Negreanu, Mohammad Raza, Gust Verbruggen
Spreadsheets are widely used for table manipulation and presentation. Stylistic formatting of these tables is an important property for both presentation and analysis. As a result, popular spreadsheet software, such as Excel, supports automatically formatting tables based on data-dependent rules. Unfortunately, writing these formatting rules can be challenging for users as that requires knowledge of the underlying rule language and data logic. In this paper, we present CORNET, a neuro…


Neural Formatting for Spreadsheet Tables

This paper proposes CellGAN, a neural formatting model for learning and recommending formats of spreadsheet tables in a self-supervised fashion, based on a novel conditional generative adversarial network (cGAN) architecture.

Synchromesh: Reliable code generation from pre-trained language models

A framework is proposed for substantially improving the reliability of pre-trained models for code generation; CSD and TST are observed to yield substantial complementary gains in prediction accuracy and in effectively preventing run-time errors.

TaBERT: Pretraining for Joint Understanding of Textual and Tabular Data

TaBERT is a pretrained LM that jointly learns representations for NL sentences and (semi-)structured tables; it achieves new best results on the challenging weakly-supervised semantic parsing benchmark WikiTableQuestions, while performing competitively on the text-to-SQL dataset Spider.

Automating string processing in spreadsheets using input-output examples

The design of a string programming/expression language that supports restricted forms of regular expressions, conditionals and loops is described, along with an algorithm, based on several novel concepts, for synthesizing a desired program in this language from input-output examples.

BUSTLE: Bottom-up program-Synthesis Through Learning-guided Exploration

A new synthesis approach that leverages learning to guide a bottom-up search over programs, training a model to prioritize compositions of intermediate values during search, conditioned on a given set of input-output examples.

TaPas: Weakly Supervised Table Parsing via Pre-training

TaPas is presented, an approach to question answering over tables without generating logical forms that outperforms or rivals semantic parsing models, improving state-of-the-art accuracy on SQA and performing on par with the state of the art on WikiSQL and WikiTQ, but with a simpler model architecture.

Learning Natural Programs from a Few Examples in Real-Time

A novel, real-time, ML-based program ranking algorithm that enables synthesis of natural, user-intended, personalized programs and makes two key technical contributions: a new technique to embed programs in a vector space, making them amenable to ML formulations, and a novel formulation that interleaves program search with ranking, enabling real-time synthesis of accurate user-intended programs.

CodeBERT: A Pre-Trained Model for Programming and Natural Languages

This work develops CodeBERT with Transformer-based neural architecture, and trains it with a hybrid objective function that incorporates the pre-training task of replaced token detection, which is to detect plausible alternatives sampled from generators.

A Hybrid Probabilistic Approach for Table Understanding

This paper introduces an end-to-end system for table understanding, the process of capturing the relational structure of data in tables, with a hybrid, neuro-symbolic approach, combining embedded representations learned from thousands of tables with probabilistic constraints that capture regularities in how humans organize tables.

FlashExtract: a framework for data extraction by examples

This work presents FlashExtract, a general framework for extracting relevant data from semi-structured documents using examples, and describes instantiations of the framework in three different domains: text files, webpages, and spreadsheets.