Learn More
This paper describes a number of log-linear parsing models for an automatically extracted lexicalized grammar. The models are " full " parsing models in the sense that probabilities are defined for complete parses, rather than for independent events derived by decomposing the parse tree. Discriminative training is used to estimate the models, which requires(More)
This paper describes the role of supertagging in a wide-coverage CCG parser which uses a log-linear model to select an analysis. The supertagger reduces the derivation space over which model estimation is performed, reducing the space required for discriminative training. It also dramatically increases the speed of the parser. We show that large increases(More)
This paper shows how to construct semantic representations from the derivations produced by a wide-coverage CCG parser. Unlike the dependency structures returned by the parser itself , these can be used directly for semantic interpretation. We demonstrate that well-formed semantic representations can be produced for over 97% of the sentences in unseen WSJ(More)
Named Entity Linking (nel) grounds entity mentions to their corresponding node in a Knowledge Base (kb). Recently, a number of systems have been proposed for linking entity mentions in text to Wikipedia pages. Such systems typically search for candidate entities and then disambiguate them, returning either the best candidate or nil. However, comparison has(More)
A key question facing the parsing community is how to compare parsers which use different grammar formalisms and produce different output. Evaluating a parser on the same resource used to create it can lead to non-comparable accuracy scores and an over-optimistic view of parser performance. In this paper we evaluate a CCG parser on DepBank, and demonstrate(More)
Automated conversion has allowed the development of wide-coverage corpora for a variety of grammar formalisms without the expense of manual annotation. Analysing new languages also tests formalisms, exposing their strengths and weaknesses. We present Chinese CCGbank, a 760,000 word corpus annotated with Combinatory Categorial Grammar (CCG) derivations,(More)