Scaling in words on Twitter

@article{Boknyi2019ScalingIW,
  title={Scaling in words on Twitter},
  author={Eszter Bok{\'a}nyi and D{\'a}niel Kondor and G{\'a}bor Vattay},
  journal={Royal Society Open Science},
  year={2019},
  volume={6}
}
Scaling properties of language are a useful tool for understanding generative processes in texts. We investigate the scaling relations in citywise Twitter corpora coming from the metropolitan and micropolitan statistical areas of the United States. We observe a slightly superlinear urban scaling with the city population for the total volume of the tweets and words created in a city. We then find that a certain core vocabulary follows the same scaling relationship as the bulk text, but most…
2 Citations
Uncovering the behaviour of road accidents in urban areas
TLDR
It is observed that minor and serious accidents are more frequent in urban areas, whereas fatal accidents are more likely in rural areas, and the number of accidents in an urban area depends on population size superlinearly, with this superlinear behaviour becoming stronger for lower degrees of severity.
Storywrangler: A massive exploratorium for sociolinguistic, cultural, socioeconomic, and political timelines using Twitter
TLDR
The method of tracking dynamic changes in n-grams can be extended to any temporally evolving corpus, and example use cases including social amplification, the sociotechnical dynamics of famous individuals, box office success, and social unrest are presented.

References

SHOWING 1-10 OF 72 REFERENCES
Diffusion of Lexical Change in Social Media
TLDR
Using a latent vector autoregressive model to aggregate across thousands of words, high-level patterns in diffusion of linguistic change over the United States are identified and support for prior arguments that focus on geographical proximity and population size is offered.
Two Regimes in the Frequency of Words and the Origins of Complex Lexicons: Zipf’s Law Revisited*
TLDR
It is made evident that word frequency as a function of the rank follows two different exponents, ≈ -1 for the first regime and ≈ -2 for the second.
Race, Religion and the City: Twitter Word Frequency Patterns Reveal Dominant Demographic Dimensions in the United States
TLDR
The findings here validate the concept of demography being represented in OSN language use and show that the traits observed are inherently present in the word frequencies without any previous assumptions about the dataset.
Scaling laws and fluctuations in the statistics of word frequencies
TLDR
This paper combines statistical analysis of large text databases and simple stochastic models to explain the appearance of scaling laws in the statistics of word frequencies and reports a new scaling of the fluctuations around this average (fluctuation scaling analysis).
The Social Dynamics of Language Change in Online Networks
TLDR
A data set of several million Twitter users is used to show that language change can be viewed as a form of social influence and test whether specific types of social network connections are more influential than others, and finds that tie strength plays an important role.
Languages cool as they expand: Allometric scaling and the decreasing need for new words
TLDR
The annual growth fluctuations of word use have a decreasing trend as the corpus size increases, indicating a slowdown in linguistic evolution following language expansion.
Beyond Word Frequency: Bursts, Lulls, and Scaling in the Temporal Distributions of Words
TLDR
Recurrence patterns of words are well described by a stretched exponential distribution of recurrence times, an empirical scaling that cannot be anticipated from Zipf's law and has implications for other overt manifestations of collective human dynamics.
Cursing in English on Twitter
TLDR
This paper examines the characteristics of cursing activity on a popular social media platform - Twitter - involving the analysis of about 51 million tweets and about 14 million users to explore a set of questions that have been recognized as crucial for understanding cursing in offline communications.
The scaling of human interactions with city size
TLDR
It is shown that both the total number of contacts and the total communication activity grow superlinearly with city population size, according to well-defined scaling relations and resulting from a multiplicative increase that affects most citizens.