The Physics of Text: Ontological Realism in Information Extraction

@inproceedings{Russell2016ThePO,
  title={The Physics of Text: Ontological Realism in Information Extraction},
  author={Stuart J. Russell and Ole Torp Lassen and Justin Uang and Wei Wang},
  booktitle={AKBC@NAACL-HLT},
  year={2016}
}
We propose an approach to extracting information from text based on the hypothesis that text sometimes describes the world. The hypothesis is embodied in a generative probability model that describes (1) possible worlds and the facts they might contain, (2) how an author chooses facts to express, and (3) how those facts are expressed in text. Given text, information extraction is done by computing a posterior over the worlds that might have generated it. As a by-product, this unsupervised…
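To make the three-part generative story concrete, here is a minimal toy sketch in Python. It is not the authors' model: the two candidate facts, the probabilities, the sentence templates, and every function name are hypothetical choices made purely for illustration. It assumes each fact holds independently, that the author mentions each fact at most once, and that sentences matching no template are ignored. The posterior over worlds is computed by brute-force enumeration, which is only feasible for a toy universe like this one.

```python
from itertools import product

# Hypothetical toy universe (illustrative assumption, not from the paper).
FACTS = ("capital(France, Paris)", "capital(France, Lyon)")

# (1) World model: prior probability that each fact holds.
PRIOR = {FACTS[0]: 0.5, FACTS[1]: 0.5}

# (2) Author model: probability of expressing a fact, given its truth value.
P_MENTION_TRUE = 0.6    # author chooses to express a true fact
P_MENTION_FALSE = 0.05  # author mistakenly asserts a false fact

# (3) Text model: each expressed fact is rendered by one template, chosen uniformly.
TEMPLATES = {
    FACTS[0]: ("Paris is the capital of France.", "France's capital is Paris."),
    FACTS[1]: ("Lyon is the capital of France.", "France's capital is Lyon."),
}


def world_prior(world):
    """P(world), treating facts as independent."""
    p = 1.0
    for fact, holds in world.items():
        p *= PRIOR[fact] if holds else 1.0 - PRIOR[fact]
    return p


def text_likelihood(text, world):
    """P(text | world), where text is a set of sentences."""
    p = 1.0
    for fact, holds in world.items():
        p_mention = P_MENTION_TRUE if holds else P_MENTION_FALSE
        realized = [s for s in TEMPLATES[fact] if s in text]
        if not realized:
            p *= 1.0 - p_mention                      # fact was not expressed
        elif len(realized) == 1:
            p *= p_mention / len(TEMPLATES[fact])     # expressed via one template
        else:
            return 0.0  # this toy model expresses each fact at most once
    return p


def posterior(text):
    """Posterior over all worlds given the text, by exhaustive enumeration."""
    worlds = [dict(zip(FACTS, values))
              for values in product([True, False], repeat=len(FACTS))]
    joint = {tuple(w.items()): world_prior(w) * text_likelihood(text, w)
             for w in worlds}
    z = sum(joint.values())
    if z == 0.0:
        raise ValueError("text has zero probability under this toy model")
    return {world: p / z for world, p in joint.items()}


def fact_marginal(post, fact):
    """Posterior probability that a single fact holds."""
    return sum(p for world, p in post.items() if dict(world)[fact])


if __name__ == "__main__":
    observed = {"Paris is the capital of France."}
    post = posterior(observed)
    for fact in FACTS:
        print(fact, round(fact_marginal(post, fact), 3))
```

In this sketch, observing a sentence about Paris raises the posterior for the Paris fact, and the absence of any sentence about Lyon lowers the posterior for the Lyon fact, since the author model would have been fairly likely to mention it had it been true. Extraction then amounts to reading off the facts with high posterior probability.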