Heaps' law
Known as: Heaps law, Herdan's law
In linguistics, Heaps' law (also called Herdan's law) is an empirical law which describes the number of distinct words in a document (or set of…
Wikipedia
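The summary above is truncated, so the following is only an illustrative sketch of how the law is usually checked in practice. It assumes the standard power-law form V(n) = K * n^beta, where V(n) is the number of distinct words among the first n tokens; that form and the names vocabulary_growth and fit_heaps are assumptions for illustration, not taken from this page. The fit is an ordinary least-squares line in log-log space.

```python
import math


def vocabulary_growth(tokens):
    """Return (n, V(n)) pairs: tokens read so far and distinct tokens seen so far."""
    seen = set()
    growth = []
    for n, token in enumerate(tokens, start=1):
        seen.add(token)
        growth.append((n, len(seen)))
    return growth


def fit_heaps(growth):
    """Fit log V = log K + beta * log n by least squares; return (K, beta).

    Assumes at least two points with distinct n and positive V.
    """
    xs = [math.log(n) for n, _ in growth]
    ys = [math.log(v) for _, v in growth]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx
    k = math.exp(mean_y - beta * mean_x)
    return k, beta
```

A typical use would be fit_heaps(vocabulary_growth(text.lower().split())) on a reasonably long text. Whitespace tokenization is a simplification, and the fitted values depend on the tokenizer and on the corpus.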
[Chart: topic mentions per year, 1977-2017]
Related topics
6 relations
Clustering high-dimensional data
Feature hashing
Language model
Text corpus
Broader (1)
Computational linguistics
[Chart: related mentions per year, 1956-2018. Series: Heaps' law, Text corpus, Language model, Computational linguistics, Clustering high-dimensional data, Zipf's law]
Papers overview
Semantic Scholar uses AI to extract papers important to this topic.
2016
Verifying Heaps' law using Google Books Ngram data
Vladimir V. Bochkarev, Eduard Yu. Lerner, Anna V. Shevlyakova
ArXiv
This article is devoted to the verification of the empirical Heaps law in European languages using Google Books Ngram corpus data…
2016
Herdan-Heaps Law Corresponds to Lotka's Law: A Dynamic Perspective from Simon's Model
Yifei Zhang, Shi Shan, Chunning Yan, Junping Qiu, Li Xiong
Journal of Quantitative Linguistics
Herdan-Heaps law and Lotka’s law are two important laws in linguistics and many other fields, which are often found to coexist in…
2014
Information density, Heaps' Law, and perception of factiness in news
Miriam Boon
LTCSS@ACL
Seeking information online can be an exercise in time wasted wading through repetitive, verbose text with little actual content…
2014
Scaling laws and fluctuations in the statistics of word frequencies
Martin Gerlach, Eduardo G. Altmann
ArXiv
In this paper we combine statistical analysis of large text databases and simple stochastic models to explain the appearance of…
Review
2012
Information-Theoretic Models of Natural Language
Łukasz Dębowski, Jana Kazimierza, Wolfgang Hilberg
The relaxed Hilberg conjecture is a proposition about natural language which states that mutual information between two adjacent…
2008
Discovery of Power-Laws in Chemical Space
Ryan W. Benz, Sanjay Joshua Swamidass, Pierre Baldi
Journal of Chemical Information and Modeling
Power-law distributions have been observed in a wide variety of areas. To our knowledge however, there has been no systematic…
2007
Untangling Herdan's law and Heaps' law: Mathematical and informetric arguments
Leo Egghe
JASIST
and this formula represents the formulation of Herdan’s law: The logarithm of vocabulary size divided by the logarithm of text…
2004
Effect of Feature Smoothing Methods in Text Classification Tasks
David Vilar, Hermann Ney, Alfons Juan-Císcar, Enrique Vidal
PRIS
The number of features to be considered in a text classification system is given by the size of the vocabulary and this is…
2003
Combination of a hidden tag model and a traditional n-gram model: a case study in Czech speech recognition
Pavel Krbec, Petr Podveský, Jan Hajic
INTERSPEECH
A speech recognition system targeting high inflective languages is described that combines the traditional trigram language model…
2001
Zipf and Heaps Laws' Coefficients Depend on Language
Alexander F. Gelbukh, Grigori Sidorov
CICLing
We observed that the coefficients of two important empirical statistical laws of language – Zipf law and Heaps law – are…