The Singleton Fallacy: Why Current Critiques of Language Models Miss the Point

@article{Sahlgren2021TheSF,
  title={The Singleton Fallacy: Why Current Critiques of Language Models Miss the Point},
  author={Magnus Sahlgren and Fredrik Carlsson},
  journal={Frontiers in Artificial Intelligence},
  year={2021},
  volume={4}
}
This paper discusses the current critique of neural network-based Natural Language Understanding solutions known as language models. We argue that much of the current debate revolves around an argumentation error that we refer to as the singleton fallacy: the assumption that a concept (in this case, language, meaning, or understanding) refers to a single and uniform phenomenon, which in the current debate is assumed to be unobtainable by (current) language models. By contrast, we argue…

The Rediscovery Hypothesis: Language Models Need to Meet Linguistics

It is shown that language models that are significantly compressed but perform well on their pretraining objectives retain good scores when probed for linguistic structures, which supports the rediscovery hypothesis and leads to the second contribution of this paper: an information-theoretic framework that relates language modeling objectives with linguistic information.

The Linguistic Blind Spot of Value-Aligned Agency, Natural and Artificial

The value-alignment problem for artificial intelligence (AI) asks how we can ensure that the ‘values’ (i.e., objective functions) of artificial systems are aligned with the values of humanity. In this…

Epistemic Defenses against Scientific and Empirical Adversarial AI Attacks

This transdisciplinary analysis suggests that employing distinct explanation-anchored, trust-disentangled and adversarial strategies is one possible principled complementary epistemic defense against SEA AI attacks, albeit with caveats yielding incentives for future work.

Grounding the Vector Space of an Octopus: Word Meaning from Raw Text

  • Anders Søgaard
  • Education
    Minds and Machines
  • 2023
Most, if not all, philosophers agree that computers cannot learn what words refer to from raw text alone. While many attacked Searle’s Chinese Room thought experiment, no one seemed to question this…

Understanding models understanding language

A non-technical discussion is presented of techniques for grounding Transformer models, giving them referential semantics even in the absence of supervision; the approach Landgrebe and Smith advocate, namely manual specification of formal grammars that associate linguistic expressions with logical forms, is also discussed.

Do Large Language Models know what humans know?

This work tests the viability of the language exposure hypothesis by assessing whether models exposed to large quantities of human language develop evidence of Theory of Mind, and suggests that while statistical learning from language exposure may in part explain how humans develop Theory of Mind, other mechanisms are also responsible.

References

Why Can Computers Understand Natural Language?

The conception of language implied in the technique of word embeddings, which supported the recent development of deep neural network models in computational linguistics, is drawn out.

What BERT Is Not: Lessons from a New Suite of Psycholinguistic Diagnostics for Language Models

A suite of diagnostics drawn from human language experiments is introduced, which allows targeted questions to be asked about the information used by language models for generating predictions in context, and is applied to the popular BERT model.

The Distributional Hypothesis

There is a correlation between distributional similarity and meaning similarity, which allows us to use the former to estimate the latter. One can then pose two very basic questions concerning the distributional hypothesis: what kind of distributional properties we should look for, and what, if any, the differences are between different kinds of distributional properties.
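
As a minimal sketch of how this estimation works in practice, the snippet below builds co-occurrence count vectors from a toy corpus and compares words by cosine similarity, using distributional similarity as a stand-in for meaning similarity. The corpus, window size, and word choices are illustrative assumptions, not taken from the paper.

  from collections import Counter, defaultdict
  import math

  corpus = [
      "the cat drinks milk",
      "the dog drinks water",
      "the cat chases the dog",
  ]
  window = 2  # symmetric context window; an arbitrary choice for this sketch

  # Count context words within the window around each target word.
  cooc = defaultdict(Counter)
  for sentence in corpus:
      tokens = sentence.split()
      for i, w in enumerate(tokens):
          for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
              if j != i:
                  cooc[w][tokens[j]] += 1

  def cosine(a, b):
      """Cosine similarity between two sparse count vectors."""
      dot = sum(a[k] * b[k] for k in a if k in b)
      na = math.sqrt(sum(v * v for v in a.values()))
      nb = math.sqrt(sum(v * v for v in b.values()))
      return dot / (na * nb) if na and nb else 0.0

  # Words occurring in similar contexts ('cat', 'dog') score higher than
  # words occurring in dissimilar contexts ('cat', 'milk').
  print(cosine(cooc["cat"], cooc["dog"]))
  print(cosine(cooc["cat"], cooc["milk"]))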

Experience Grounds Language

It is posited that the present success of representation learning approaches trained on large text corpora can be deeply enriched by the parallel tradition of research on the contextual and social nature of language.

Distributional Structure

It is discussed how each language can be described in terms of a distributional structure, i.e., in terms of the occurrence of parts relative to other parts, and how this description is complete without intrusion of other features such as history or meaning.

Right for the Wrong Reasons: Diagnosing Syntactic Heuristics in Natural Language Inference

There is substantial room for improvement in NLI systems, and the HANS dataset, which contains many examples where the heuristics fail, can motivate and measure progress in this area.

Climbing towards NLU: On Meaning, Form, and Understanding in the Age of Data

It is argued that a system trained only on form has a priori no way to learn meaning, and a clear understanding of the distinction between form and meaning will help guide the field towards better science around natural language understanding.

Minds, brains, and programs

  • J. Searle
  • Philosophy
    Behavioral and Brain Sciences
  • 1980
Only a machine could think, and only very special kinds of machines, namely brains and machines with internal causal powers equivalent to those of brains; no program by itself is sufficient for thinking.

GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding

A benchmark of nine diverse NLU tasks, an auxiliary dataset for probing models' understanding of specific linguistic phenomena, and an online platform for evaluating and comparing models are presented; the benchmark favors models that can represent linguistic knowledge in a way that facilitates sample-efficient learning and effective knowledge transfer across tasks.

Linguistic Knowledge and Transferability of Contextual Representations

It is found that linear models trained on top of frozen contextual representations are competitive with state-of-the-art task-specific models in many cases, but fail on tasks requiring fine-grained linguistic knowledge.
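
A minimal sketch of such a linear probe is given below, assuming a Hugging Face transformers encoder (bert-base-uncased) and a toy, hypothetical probing task; the encoder stays frozen and only a scikit-learn logistic regression is trained on its pooled representations.

  import torch
  from transformers import AutoTokenizer, AutoModel
  from sklearn.linear_model import LogisticRegression

  # Frozen encoder: its parameters are never updated during probing.
  tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
  encoder = AutoModel.from_pretrained("bert-base-uncased")
  encoder.eval()

  def embed(sentences):
      """Mean-pooled contextual representations, computed without gradients."""
      with torch.no_grad():
          batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
          hidden = encoder(**batch).last_hidden_state      # (batch, seq, dim)
          mask = batch["attention_mask"].unsqueeze(-1)     # ignore padding tokens
          return ((hidden * mask).sum(1) / mask.sum(1)).numpy()

  # Toy probing task with hypothetical labels: does the sentence contain a past-tense verb?
  train_sents = ["she walked home", "he eats lunch", "they visited Rome", "we play chess"]
  train_labels = [1, 0, 1, 0]

  # The probe itself is just a linear classifier over the frozen features.
  probe = LogisticRegression(max_iter=1000).fit(embed(train_sents), train_labels)
  print(probe.predict(embed(["the cat slept all day"])))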