• Publications
Learning to Classify Text Using Support Vector Machines: Methods, Theory, and Algorithms by Thorsten Joachims
TLDR
A theory for automatic learning of text categorization models that has been repeatedly shown to be very successful and is based on a rather rough linguistic generalization of a language-dependent task: topic text classification (TC).
Exploiting Syntactic and Shallow Semantic Kernels for Question Answer Classification
TLDR
The experiments suggest that syntactic information helps tasks such as question/answer classification and that shallow semantics makes a remarkable contribution when a reliable set of predicate argument structures (PASs) can be extracted, e.g. from answers.
Structured Lexical Similarity via Convolution Kernels on Dependency Trees
TLDR
This paper defines efficient and powerful kernels for measuring the similarity between dependency structures whose lexical nodes have partly or completely different surface forms, and confirms the benefit of semantic smoothing for dependency kernels.
Semantic Kernels for Text Classification Based on Topological Measures of Feature Similarity
TLDR
A new approach to the design of semantic smoothing kernels for text classification that implicitly encode a superconcept expansion in a semantic network using well-known measures of term similarity.
Automatic induction of FrameNet lexical units
TLDR
This paper investigates the applicability of distributional and WordNet-based models to the task of lexical unit induction, i.e. the expansion of FrameNet with new lexical units, and shows a good level of accuracy and coverage, especially when the models are combined.
Tree Kernels for Semantic Role Labeling
TLDR
Several kernel functions are proposed to model parse tree properties in kernel-based machines such as perceptrons or support vector machines; tree kernels provide a general and easily portable feature engineering method that is applicable to a large family of natural language processing tasks.
Classification of musical genre: a machine learning approach
TLDR
This work investigates the impact of machine learning algorithms on the development of automatic music classification models that aim to capture genre distinctions, first creating a medium-sized collection of examples for widely recognized genres and then evaluating the performance of different learning algorithms.
Complex Linguistic Features for Text Classification: A Comprehensive Study
TLDR
Phrases, word senses and syntactic relations derived by Natural Language Processing techniques were observed to be ineffective at increasing retrieval accuracy.
KeLP at SemEval-2016 Task 3: Learning Semantic Relations between Questions and Answers
TLDR
This paper describes the KeLP system participating in the SemEval-2016 Community Question Answering (cQA) task, which outperforms all the other systems with respect to all the other challenge metrics.
Parsing engineering and empirical robustness
TLDR
An empirical definition of robustness based on the notion of performance is proposed, and a framework for controlling parser robustness in the design phase is presented.