David J. Weir

Learn More
We consider the structural descriptions produced by various grammatical formalisms in terms of the complexity of the paths and the relationship between paths in the sets of structural descriptions that each system can generate. In considering the relationships between formalisms, we show that it is useful to abstract away from the details of the formalism,(More)
This work investigates the variation in a word’s distributionally nearest neighbours with respect to the similarity measure used. We identify one type of variation as being the relative frequency of the neighbour words with respect to the frequency of the target word. We then demonstrate a three-way connection between relative frequency of similar words, a(More)
This work is concerned with distinguishing different semantic relations which exist between distributionally similar words. We compare a novel approach based on training a linear Support Vector Machine on pairs of feature vectors with state-of-the-art methods based on distributional similarity. We show that the new supervised approach does better even when(More)
There is currently considerable interest among computational linguists in grammatical formalisms with highly restricted generative power. This paper concerns the relationship between the class of string languages generated by several such formalisms, namely, combinatory categorial grammars, head grammars, linear indexed grammars, and tree adjoining(More)
Automatic classification of sentiment is important for numerous applications such as opinion mining, opinion summarization, contextual advertising, and market analysis. Typically, sentiment classification has been modeled as the problem of training a binary classifier using reviews annotated for positive or negative sentiment. However, sentiment is(More)
We have attempted to quantify the influence of clinical, radiological and prosthetic design factors upon flexion following knee replacement. Our study examined the outcome following 101 knee replacements performed in two prospective randomized trials using similar cruciate retaining implants. Multivariate analyses, after adjusting for age, sex, diagnosis(More)
In this paper we present a scheme to extend a recognition algorithm for Context-Free Grammars (CFG) that can be used to derive polynomial-time recognition algorithms for a set of formalisms that generate a superset of languages generated by CFG. We describe the scheme by developing a Cocke-Kasami-Younger (CKY)-like pure bottom-up recognition algorithm for(More)
Techniques that exploit knowledge of distributional similarity between words have been proposed in many areas of Natural Language Processing. For example, in language modeling, the sparse data problem can be alleviated by estimating the probabilities of unseen co-occurrences of events from the probabilities of seen co-occurrences of similar events. In other(More)
We describe a sentiment classification method that is applicable when we do not have any labeled data for a target domain but have some labeled data for multiple other domains, designated as the source domains. We automatically create a sentiment sensitive thesaurus using both labeled and unlabeled data from multiple source domains to find the association(More)