Towards Wide-Coverage Semantic Interpretation


Wide-coverage and robust NLP techniques have long seemed to go hand in hand with shallow analyses. This was certainly true a couple of years ago, but the state of the art in stochastic approaches has advanced considerably, and there are now sophisticated parsers available that achieve high coverage and produce accurate syntactic analyses. It seems we have finally reached a stage in NLP where we can apply well-known techniques of formal and computational semantics on a larger scale, and obtain a detailed semantic analysis from a wide-coverage parser. A proof of concept of this idea was demonstrated in [BCS04], with a coverage of over 95% on newspaper texts. In this paper we discuss further developments in this work: generating semantic representations for sentences or small texts, showing how we can calculate the background knowledge required for reasoning, and performing inferences using state-of-the-art theorem provers and model builders.

The semantic representation language that we use is a first-order language. We argue that, given the current state of automated deduction, any language with more expressive power (such as second- or higher-order logic) cannot be used efficiently to perform inference tasks. For first-order logic, however, highly sophisticated inference tools are available, and we use them in our work. Despite the tradition in formal semantics of using higher-order logics, first-order logic is able to cover a (perhaps surprisingly) large variety of interesting natural language phenomena. The language we adopt is the one developed in Discourse Representation Theory (DRT), which comes with a translation to first-order logic; it is described in Section 2. The choice of DRT is motivated by its impressive theoretical coverage of linguistic phenomena [KR93, VdS92].

Next, of course, we need a grammar formalism suitable for computational semantics (i.e., one that is able to produce fine-grained syntactic analyses).

