Identifying Real or Fake Articles: Towards better Language Modeling

@inproceedings{Badaskar2008IdentifyingRO,
  title={Identifying Real or Fake Articles: Towards better Language Modeling},
  author={Sameer Badaskar and Sachin Agarwal and Shilpa Arora},
  booktitle={IJCNLP},
  year={2008}
}
This paper frames the problem of identifying good features for improving conventional language models, such as trigram models, as a classification task. The idea is to use various syntactic and semantic features extracted from text to distinguish real-world articles from articles generated by sampling a trigram language model. High accuracy on this classification task implies that the extracted features capture those aspects of the language that a trigram…
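
To make the "fake article" side of the classification task concrete, the following is a minimal sketch of sampling text from a trigram language model. It is not the authors' implementation: the toy corpus, the function names `train_trigram` and `sample_sentence`, and the use of unsmoothed maximum-likelihood counts are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict

# Hypothetical sentence-boundary markers (not taken from the paper).
BOS, EOS = "<s>", "</s>"


def train_trigram(sentences):
    """Count how often each word follows every two-word context."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = [BOS, BOS] + sentence.split() + [EOS]
        for w1, w2, w3 in zip(tokens, tokens[1:], tokens[2:]):
            counts[(w1, w2)][w3] += 1
    return counts


def sample_sentence(counts, rng, max_len=30):
    """Sample one sentence word by word, weighting by trigram counts."""
    context = (BOS, BOS)
    words = []
    while len(words) < max_len:
        followers = counts.get(context)
        if not followers:
            break  # unseen context: stop rather than guess
        candidates = list(followers)
        weights = [followers[w] for w in candidates]
        word = rng.choices(candidates, weights=weights)[0]
        if word == EOS:
            break
        words.append(word)
        context = (context[1], word)
    return " ".join(words)


# Toy "real" corpus, assumed purely for illustration.
corpus = [
    "the market rose sharply today",
    "the market fell sharply today",
    "the council approved the budget today",
]

model = train_trigram(corpus)
fake = sample_sentence(model, random.Random(0))
print(fake)
```

Every trigram in a sampled sentence is locally plausible, since it was observed in the training text, yet nothing constrains the sentence as a whole. A classifier trained to separate such samples from real articles is therefore pushed toward features that capture longer-range syntactic or semantic structure, which is the kind of information the abstract says a trigram model lacks.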