SUMSS: a wide-field radio imaging survey of the southern sky – II. The source catalogue
This paper is the second in a series describing the Sydney University Molonglo Sky Survey (SUMSS) being carried out at 843 MHz with the Molonglo Observatory Synthesis Telescope (MOST). The survey…
Wide-Coverage Efficient Statistical Parsing with CCG and Log-Linear Models
This article describes a number of log-linear parsing models for an automatically extracted lexicalized grammar and develops a new model and efficient parsing algorithm which exploits all derivations, including CCG's nonstandard derivations.
Learning multilingual named entity recognition from Wikipedia
Evaluating Entity Linking with Wikipedia
From distributional to semantic similarity
- J. Curran
- Computer Science
This dissertation describes how to extract contexts from a corpus of over 2 billion words and introduces a new context-weighted approximation algorithm with bounded complexity in context vector size that significantly reduces the system runtime with only a minor performance penalty.
Linguistically Motivated Large-Scale NLP with C&C and Boxer
An NLP system that is based on syntactic and semantic formalisms from theoretical linguistics, and that is used to analyse the entire Gigaword corpus in less than 5 days using only 18 processors, represents a breakthrough in NLP technology.
Parsing the WSJ Using CCG and Log-Linear Models
A parallel implementation of the L-BFGS optimisation algorithm is described; it runs on a Beowulf cluster, allowing the complete Penn Treebank to be used for estimation. A new efficient parsing algorithm for CCG, which maximises expected recall of dependencies, is also developed.
Improvements in Automatic Thesaurus Extraction
An approximation algorithm is proposed, based on canonical attributes and coarse- and fine-grained matching, that reduces the time complexity and execution time of thesaurus extraction with only a marginal performance penalty.
The Importance of Supertagging for Wide-Coverage CCG Parsing
This paper describes the role of supertagging in a wide-coverage CCG parser which uses a log-linear model to select an analysis and shows that large increases in speed can be obtained by tightly integrating the supertagger with the CCG grammar and parser.
Named Entity Recognition in Wikipedia
- Dominic Balasuriya, Nicky Ringland, J. Nothman, T. Murphy, J. Curran
- Computer Science
- PWNLP@IJCNLP
- 7 August 2009
This first NER evaluation on a Wikipedia gold standard (WG) corpus finds that an automatic annotation of Wikipedia has high agreement with WG and, when used as training data, outperforms newswire models by up to 7.7%.