Tamara Polajnar

Extracting general or intermediate-level terms is a relevant problem that has received little attention in the literature. Current approaches to term extraction rely on contrastive corpora to identify domain-specific terms, which makes them better suited to specialised terms that are rarely used outside the domain. In this work, we propose an alternative…
Datasets that are subjectively labeled by a number of experts are becoming more common in tasks such as biological text annotation, where class definitions are necessarily somewhat subjective. Standard classification and regression models are not suited to multiple labels, and typically a preprocessing step (normally assigning the majority class) is…
In recent years, following rapid developments in Semantic Web and Knowledge Management research, ontologies have become increasingly in demand in Natural Language Processing. An increasing number of systems use ontologies either internally, for modelling the domain of the application, or as data structures that hold the output resulting from the work of the…
Distributional semantic models (DSMs) have been effective at representing semantics at the word level, and research has recently moved on to building distributional representations for larger segments of text. In this paper, we introduce novel ways of applying context selection and normalisation to vary model sparsity and the range of values of the DSM…
Several compositional distributional semantic methods use tensors to model multi-way interactions between vectors. Unfortunately, the size of the tensors can make their use impractical in large-scale implementations. In this paper, we investigate whether we can match the performance of full tensors with low-rank approximations that use a fraction of the…
Support Vector Machines (SVMs), which are non-parametric and deterministic, achieve high levels of performance in text classification. This article offers a much-needed evaluation of the Gaussian Process (GP) classifier, a non-parametric probabilistic analogue of SVMs that has rarely been applied to text classification. We provide an extensive experimental…
When undergoing medical treatment that involves extended hospital stays, children have frequently been found to develop an interest in their condition and the course of treatment. A natural means of searching for related information would be to use a web search engine. The medical domain, however, imposes several key challenges on young and…
The Internet plays an important role in people’s daily lives. This is true not only for adults but also for children; however, current web search engines are designed with adult users and their cognitive abilities in mind. Consequently, children face considerable barriers when using these information systems. In this work, we demonstrate the use of…
This article introduces RELPRON, a large data set of subject and object relative clauses, for the evaluation of methods in compositional distributional semantics. RELPRON targets an intermediate level of grammatical complexity between content-word pairs and full sentences. The task involves matching terms, such as “wisdom,” with representative properties,…