Robert Lothian

Feature selection for unsupervised tasks is particularly challenging, especially when dealing with text data. The increase in online documents and email communication creates a need for tools that can operate without the supervision of the user. In this paper we look at novel feature selection techniques that address this need. A distributional similarity ...
Latent Semantic Indexing (LSI) has been shown to be effective in recovering from synonymy and polysemy in text retrieval applications. However, since LSI ignores class labels of training documents, LSI-generated representations are not as effective in classification tasks. To address this limitation, a process called ‘sprinkling’ is presented. Sprinkling is ...
This paper looks at feature selection for ordinal text classification. Typical applications are sentiment and opinion classification, where classes have relationships based on an ordinal scale. We show that standard feature selection using Information Gain (IG) fails to identify discriminatory features, particularly when they are distributed over multiple ...
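For readers unfamiliar with the baseline mentioned above, the following is a minimal sketch of standard Information Gain for a single term, computed as the class entropy minus the entropy remaining after splitting documents on the presence of the term. It is the textbook measure only, not the ordinal-aware selection the paper proposes; the function names, the toy documents, and the rating labels are all illustrative assumptions.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(docs, labels, term):
    """Standard IG of `term`: H(C) - P(t) H(C|t) - P(not t) H(C|not t).

    `docs` is a list of sets of words; `labels` holds one class per document.
    """
    with_t = [y for d, y in zip(docs, labels) if term in d]
    without_t = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    return (
        entropy(labels)
        - len(with_t) / n * entropy(with_t)
        - len(without_t) / n * entropy(without_t)
    )


# Hypothetical toy corpus: ratings 5 and 1 stand in for two ordinal classes.
docs = [{"great", "film"}, {"great", "film"}, {"poor", "film"}, {"poor", "film"}]
labels = [5, 5, 1, 1]

ig_great = information_gain(docs, labels, "great")  # term separates the classes
ig_film = information_gain(docs, labels, "film")    # term occurs in every document
```

Note that this measure treats the classes as unordered categories: a term spread across neighbouring ratings is scored the same as one spread across distant ratings.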
We present a novel approach to mine word similarity in Textual Case Based Reasoning. We exploit indirect associations of words, in addition to direct ones, for estimating their similarity. If word A co-occurs with word B, we say A and B share a first order association between them. If A co-occurs with B in some documents, and B with C in some others, then A ...
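The distinction between direct and indirect associations can be illustrated with a small sketch. It assumes that the truncated sentence goes on to link A and C through the shared word B (a second-order association); that reading, the function names, and the choice to exclude directly co-occurring words from the indirect set are assumptions made for illustration, not the authors' method.

```python
from itertools import combinations


def first_order(docs):
    """Map each word to the set of words it co-occurs with in some document."""
    assoc = {}
    for doc in docs:
        for a, b in combinations(sorted(set(doc)), 2):
            assoc.setdefault(a, set()).add(b)
            assoc.setdefault(b, set()).add(a)
    return assoc


def second_order(assoc):
    """Map each word to words reachable through exactly one intermediate word.

    Assumption: the word itself and its direct co-occurrences are excluded,
    so the result contains only purely indirect associations.
    """
    result = {}
    for a, neighbours in assoc.items():
        reachable = set()
        for b in neighbours:
            reachable |= assoc[b]
        result[a] = reachable - neighbours - {a}
    return result


# Hypothetical toy corpus: "car" and "motor" never appear together,
# but both co-occur with "engine".
docs = [["car", "engine"], ["engine", "motor"]]
direct = first_order(docs)
indirect = second_order(direct)
```

Here `direct["car"]` contains only `"engine"`, while `indirect["car"]` contains `"motor"`, which is the kind of link that a purely first-order similarity measure would miss.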
Problem solving with experiences that are recorded in text form requires a mapping from text to structured cases, so that case comparison can provide informed feedback for reasoning. One of the challenges is to acquire an indexing vocabulary to describe cases. We explore the use of machine learning and statistical techniques to automate aspects of this ...
The Robert Gordon University (RGU) participated in the Opinion Retrieval Task of the TREC 2007 Blog Track. At the core of the system we developed is a set of training documents labeled with respect to opinion. These documents are used to train a classifier that classifies the documents relevant to the given TREC topics. However, a major ...
OBJECTIVE: Patterns of successive saccades and fixations (scan paths) that are made while viewing images are often spatially restricted in schizophrenia, but the relation with cannabis-induced psychosis has not been examined. We used higher-order statistical methods to examine spatiotemporal characteristics of scan paths to determine whether viewing ...