Learn More
Most current dependency parsers presuppose that input words have been morphologically disambiguated using a part-of-speech tagger before parsing begins. We present a transition-based system for joint part-of-speech tagging and labeled dependency parsing with non-projective trees. Experimental evaluation on Chinese, Czech, English and German shows consistent(More)
We describe improvements to our 2008 system that result in a top-performing summa-rization system. The motivating ideas are (1) improve sentence boundary detection to avoid damaging errors in preprocessing; (2) prune sentences that are unlikely to work well in a summary; (3) leverage sentence position to improve update summarization; (4) focus on(More)
Most of the known stochastic sentence generators use syntactically annotated corpora, performing the projection to the surface in one stage. However, in full-fledged text generation, sentence realization usually starts from semantic (predicate-argument) structures. To be able to deal with semantic structures, stochastic generators require semantically(More)
This demonstration presents a high-performance syntactic and semantic dependency parser. The system consists of a pipeline of modules that carry out the to-kenization, lemmatization, part-of-speech tagging, dependency parsing, and semantic role labeling of a sentence. The sys-tem's two main components draw on improved versions of a state-of-the-art(More)
Joint morphological and syntactic analysis has been proposed as a way of improving parsing accuracy for richly inflected languages. Starting from a transition-based model for joint part-of-speech tagging and dependency parsing , we explore different ways of integrating morphological features into the model. We also investigate the use of rule-based(More)
Transition-based dependency parsers are often forced to make attachment decisions at a point when only partial information about the relevant graph configuration is available. In this paper, we describe a model that takes into account complete structures as they become available to rescore the elements of a beam, combining the advantages of transition-based(More)
We present a novel, count-based approach to obtaining inter-lingual word representations based on inverted indexing of Wikipedia. We present experiments applying these representations to 17 datasets in document classification, POS tagging, dependency parsing, and word alignment. Our approach has the advantage that it is simple, computationally efficient and(More)
– Abstract Les corpus sont très utiles pour de nombreuses tâches dans le domaine du traitement automa-tique des langues naturelles. Les corpus annotés syntaxiquement sont devenus une ressource importante en TAL. Ils sont couramment utilisés, par exemple comme banc d'essai pour la génération, l'analyse et la désambigu¨ısation sémantique, et comme source pour(More)