The OKCupid dataset: A very large public dataset of dating site users

@inproceedings{Kirkegaard2016TheOD,
  title={The OKCupid dataset: A very large public dataset of dating site users},
  author={Emil Ole William Kirkegaard and Julius Daugbjerg Bjerrek{\ae}r},
  year={2016}
}
A very large dataset (N=68,371, 2,620 variables) from the dating site OKCupid is presented and made publicly available for use by others. As an example of the analyses one can do with the dataset, a cognitive ability test is constructed from 14 suitable items. To validate the dataset and the test, the relationship of cognitive ability to religious beliefs and political interest/participation is examined. Cognitive ability is found to be negatively related to all measures of religious belief… 

Intelligence and Religiosity among Dating Site Users

We sought to assess whether previous findings regarding the relationship between cognitive ability and religiosity could be replicated in a large dataset of online daters (maximum n = 67k). We found

Self-reported criminal and anti-social behavior on a dating site: the importance of cognitive ability

The relationship between criminal and antisocial (CAS) behaviors and cognitive ability (CA) was examined in a large online sample of dating site users (complete sample n = 68,371). 12 question items

The Negative Intelligence–Religiosity Relation: New and Confirming Evidence

The new analysis showed that the correlation between intelligence and religious beliefs in college and noncollege samples ranged from −.20 to −.23. One possible interpretation of the intelligence–religiosity relation (IRR) is that intelligent people are more likely to use an analytic style (i.e., approach problems more rationally).

Not Just a Preference: Reducing Biased Decision-making on Dating Websites

As dating websites are becoming an essential part of how people meet intimate and romantic partners, it is vital to design these systems to be resistant to, or at least not to amplify, bias and

Linguistic features in names and social status: an exploratory study of 1,890 Danish first names

It is concluded that it is possible to train fairly accurate social status predictors from subtle linguistic patterns in names, and it's possible that humans might pick up on such cues to inform social perception when limited data is available.

What the Machine Saw: some questions on the ethics of computer vision and machine learning to investigate human remains trafficking

Analysing the data obtained when 'scraping' image or text relevant to cultural property trafficking of any kind involves the use of machine learning and neural network analysis, the ethics of which are themselves complicated.

The overlapping geography of cognitive ability and chronotype.

It was found that male sex, younger age, residence in a more populous locale, higher cognitive ability and more westward position within the same time zone were associated with later chronotype, but the effects of population on chronotype and of latitude on cognitive ability were only present in the USA.

Archiving information from geotagged tweets to promote reproducibility and comparability in social media research

This paper presents a practical solution to sharing social media data with the help of a social science data archive and archived and documented tweet IDs and additional information to improve reproducibility of the initial research while also attending to ethical and legal considerations, and taking into account Twitter’s terms of service in particular.

Finding the traces of behavioral and cognitive processes in big data and naturally occurring datasets

It is argued that big data and naturally occurring datasets are most powerfully used to supplement, not supplant, traditional experimental paradigms in order to understand human behavior and cognition, and emerging ethical issues related to the collection, sharing, and use of these powerful datasets are highlighted.

References

The Relation Between Intelligence and Religiosity

A meta-analysis of 63 studies showed a significant negative association between intelligence and religiosity. The association was stronger for college students and the general population than for

(Un)Available upon Request: Field Experiment on Researchers' Willingness to Share Supplementary Materials

Results of a field experiment in which two hundred e-mails were sent to authors of recent articles in economics that had promised to send the interested reader supplementary material, such as alternative econometric specifications, “upon request,” found authors of published articles were much more likely to share than authors of working papers.

Match makers and deal breakers: analyses of assortative mating in newlywed couples.

Assortative mating in a newlywed sample showed strong similarity in age, religiousness, and political orientation, but little similarity in matrix reasoning, self- and spouse-rated personality, emotional experience and expression, and attachment.

Reassessment of Jewish Cognitive Ability: Within Group Analyses Based on Parental Fluency in Hebrew or Yiddish

The most influential study on the differences in intellectual ability between Jews and gentiles may be Backman’s (1972) analysis of group differences in intellectual ability using the Project Talent

Linear and nonlinear associations between general intelligence and personality in Project TALENT.

It is concluded that nonlinear models can provide incremental detail regarding personality and intelligence associations and how research on intellectually gifted samples may provide a unique way of understanding them.

Behavior Problems and Timing of Menarche: A Developmental Longitudinal Biometrical Analysis Using the NLSY-Children Data

In the major part of this study, menarche timing (MT) was used to moderate the developmental trajectory of behavior problems (BP) within a genetically-informed design; the results match previous empirical results in important ways and also extend them.