Learning Decision Lists

  • R. Rivest
  • Published 1 November 1987
  • Computer Science, Mathematics
  • Machine Learning
This paper introduces a new representation for Boolean functions, called decision lists, and shows that they are efficiently learnable from examples. More precisely, this result is established for k-DL – the set of decision lists with conjunctive clauses of size at most k at each decision. Since k-DL properly includes other well-known techniques for representing Boolean functions such as k-CNF (formulae in conjunctive normal form with at most k literals per clause), k-DNF (formulae in disjunctive… 
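To make the representation concrete, here is a minimal sketch of evaluating a k-decision list: an ordered sequence of (conjunction, label) pairs, where the first conjunction that fires decides the output and a final default pair (with an empty, always-true test) ends the list. The encoding and the example list below are illustrative, not taken from Rivest's paper.

```python
def eval_decision_list(dlist, x):
    """dlist: ordered list of (test, label) pairs, where each test is a set
    of literals (variable, polarity); x: dict mapping variable -> bool."""
    for test, label in dlist:
        # an empty test (the default clause) is vacuously true
        if all(x[var] == polarity for var, polarity in test):
            return label
    raise ValueError("a decision list must end with an empty (default) test")

# A 2-DL over variables a, b, c: every test is a conjunction of <= 2 literals.
dl = [
    ({("a", True), ("b", False)}, 1),  # if a and not b -> 1
    ({("c", True)}, 0),                # elif c -> 0
    (set(), 1),                        # else -> 1
]

print(eval_decision_list(dl, {"a": True, "b": False, "c": True}))  # -> 1
print(eval_decision_list(dl, {"a": False, "b": True, "c": True}))  # -> 0
```

The order of the pairs matters: the first example satisfies both the first and second tests, but the earlier rule wins, which is exactly what distinguishes a decision list from an unordered rule set.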

Decision list compression by mild random restrictions

It is proved that decision lists of small width can always be approximated by decision lists of small size, and sharp bounds are obtained.

Learning Conjunctions of Horn Clauses

An algorithm is presented for learning the class of Boolean formulas that are expressible as conjunctions of Horn clauses, using equivalence queries and membership queries to produce a formula that is logically equivalent to the unknown formula to be learned.

Research Note on Decision Lists

In his article “Learning Decision Lists,” Rivest proves that (k-DNF ∪ k-CNF) is a proper subset of k-DL. The proof is based on the following incorrect claim: “… if a function f has a prime implicant

Computing Optimal Decision Sets with SAT

By finding optimal solutions for decision sets, a type of model with unordered rules, it is shown that one can build decision set classifiers that are almost as accurate as the best heuristic methods, but far more concise, and hence more explainable.
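The ordered/unordered distinction above can be sketched in a few lines: in a decision set the rules have no priority, so whenever several rules fire on the same input they must agree, or the prediction is ambiguous and some tie-breaking policy is needed. The rule encoding below is illustrative, not the paper's SAT formulation.

```python
def eval_decision_set(rules, x, default=0):
    """rules: UNORDERED collection of (test, label) pairs; each test is a
    set of literals (variable, polarity); x: dict mapping variable -> bool."""
    fired = {label for test, label in rules
             if all(x[v] == pol for v, pol in test)}
    if len(fired) > 1:
        # overlapping rules disagree -- a decision list would simply take
        # the first one, but a decision set has no "first"
        raise ValueError("ambiguous prediction: " + str(sorted(fired)))
    return fired.pop() if fired else default

rules = [
    ({("a", True)}, 1),   # if a -> 1
    ({("b", True)}, 0),   # if b -> 0   (no order between the two rules)
]

print(eval_decision_set(rules, {"a": True, "b": False}))   # -> 1
print(eval_decision_set(rules, {"a": False, "b": False}))  # -> 0 (default)
```

An input with a = b = True raises the ambiguity error, which is why optimal decision-set construction must either forbid such overlaps or fix a resolution policy.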

Algebraic Characterizations of Small Classes of Boolean Functions

Algebraic characterizations are derived for some "small" classes of Boolean functions, all of which have depth-3 AC⁰ circuits, namely k-term DNF, k-DNF, k-decision lists, decision trees of bounded rank, and DNF.

SAT-Based Rigorous Explanations for Decision Lists

This paper shows that computing explanations for DLs is computationally hard, and proposes propositional encodings for computing abductive explanations and contrastive explanations of DLs and investigates the practical efficiency of a MARCO-like approach for enumerating explanations.

Almost Optimal Testers for Concise Representations

  • N. Bshouty
  • Mathematics, Computer Science
  • Electron. Colloquium Comput. Complex.
  • 2019
Improved and almost optimal testers are given for several classes of Boolean functions on inputs that have concise representation in the uniform and distribution-free model and can be approximated by functions that have a small number of relevant variables.

Research note on decision lists

A counterexample is shown to the claim that (k-DNF ∪ k-CNF) is a proper subset of k-DL, and a stronger theorem is proved from which Rivest's theorem follows as a corollary.

On domain-partitioning induction criteria: worst-case bounds for the worst-case based

What Circuit Classes Can Be Learned with Non-Trivial Savings?

A new perspective on distribution-free PAC learning problems is suggested, inspired by a surge of recent research in complexity theory, in which the goal is to determine whether and how much of a savings over a naive 2^n runtime can be achieved.
Computational limitations on learning from examples

It is shown for various classes of concept representations that these cannot be learned feasibly in a distribution-free sense unless R = NP, and relationships between learning of heuristics and finding approximate solutions to NP-hard optimization problems are given.

Induction of Decision Trees

This paper summarizes an approach to synthesizing decision trees that has been used in a variety of systems, and describes one such system, ID3, in detail.

A theory of the learnable

This paper regards learning as the phenomenon of knowledge acquisition in the absence of explicit programming, and gives a precise methodology for studying this phenomenon from a computational viewpoint.

Classifying learnable geometric concepts with the Vapnik-Chervonenkis dimension

It is shown that the essential condition for distribution-free learnability is finiteness of the Vapnik-Chervonenkis dimension, a simple combinatorial parameter of the class of concepts to be learned.

Generating Production Rules from Decision Trees

This paper describes a technique for transforming such trees to small sets of production rules, a common formalism for expressing knowledge in expert systems, and provides a way of combining different decision trees for the same classification domain.
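The core of the tree-to-rules transformation can be sketched as follows: every root-to-leaf path becomes one production rule whose condition is the conjunction of the tests along that path. The tree encoding and names below are illustrative, not Quinlan's; the paper's further step of simplifying and merging the extracted rules is omitted.

```python
def tree_to_rules(node, path=()):
    """node: ('leaf', label) or ('split', var, left, right), where the left
    branch means var is False and the right branch means var is True.
    Returns a list of (condition, label) rules, one per root-to-leaf path."""
    if node[0] == "leaf":
        return [(path, node[1])]
    _, var, left, right = node
    return (tree_to_rules(left, path + ((var, False),)) +
            tree_to_rules(right, path + ((var, True),)))

# A toy tree: if not outlook_sunny -> play; elif not windy -> play; else stay.
tree = ("split", "outlook_sunny",
        ("leaf", "play"),
        ("split", "windy", ("leaf", "play"), ("leaf", "stay")))

for condition, label in tree_to_rules(tree):
    print(condition, "->", label)
```

Each printed rule is independent of the others, which is what makes the flat rule set easier to inspect, edit, or merge with rules extracted from a different tree than the original nested structure.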


This paper derives formal relationships between n, c and the probability of ambiguous predictions by examining three modeling languages under binary classification tasks: perceptrons, Boolean formulae, and Boolean networks.

Occam's razor

On the learnability of Boolean formulae

The goals are to prove results and develop general techniques that shed light on the boundary between the classes of expressions that are learnable in polynomial time and those that are apparently not, and to employ the distribution-free model of learning.

Some NP-complete set-covering problems

  • Unpublished manuscript, 1976