Biographies or Blenders: Which Resource Is Best for Cross-Domain Sentiment Analysis?
This study proposes an algorithm for automatically estimating performance loss in cross-domain sentiment classification, and presents and validates several measures of domain similarity designed specifically for the sentiment classification task.
Does the Version of the Penn World Tables Matter? An Analysis of the Relationship between Growth and Volatility
The Penn World Tables (PWT) are an important data source for cross-country comparisons in economics. The PWT have undergone several revisions over time. This paper documents how countries' output
The Tree Ensemble Layer: Differentiability meets Conditional Computation
This work introduces a new layer for neural networks, composed of an ensemble of differentiable decision trees, and provides an open-source TensorFlow implementation with a Keras API.
Conditional Random Fields vs. Hidden Markov Models in a biomedical Named Entity Recognition task
This paper applies two popular sequence labeling approaches, Hidden Markov Models (HMMs) and Conditional Random Fields (CRFs), to Named Entity Recognition (NER), and exploits different strategies to construct biomedical Named Entity (NE) recognizers that take into account the special properties of each approach.
Biomedical Named Entity Recognition: A Poor Knowledge HMM-Based Approach
This paper presents a Hidden Markov Model (HMM)-based biomedical NER system that uses only parts of speech as an additional feature, both to tackle the nonuniform distribution among biomedical entity classes and to give the system additional information about entity boundaries.
Agent Prioritization for Autonomous Navigation
This work proposes a system to rank agents around an autonomous vehicle (AV) in real time, and shows the utility of combining learned features, via a convolutional neural network, with engineered features designed to capture domain knowledge.
Accelerating Gradient Boosting Machines
This work proposes an Accelerated Gradient Boosting Machine (AGBM) by incorporating Nesterov’s acceleration techniques into the design of GBM and designs a “corrected pseudo residual” that serves as a new target for fitting a weak learner, in order to perform the z-update.
TF Boosted Trees: A Scalable TensorFlow Based Framework for Gradient Boosting
TF Boosted Trees (TFBT) is a new open-sourced framework for the distributed training of gradient boosted trees. It is based on TensorFlow, and its distinguishing features include a novel
Do Neighbours Help? An Exploration of Graph-based Algorithms for Cross-domain Sentiment Classification
This paper analyses two existing methods for cross-domain sentiment classification, an optimisation problem and a ranking algorithm, and concludes that graph domain representations offer a competitive solution to the domain adaptation problem.
Compact multi-class boosted trees
Two extensions to the standard tree boosting algorithm, layer-by-layer boosting and vector-valued boosting, are described. They allow individual trees to be used as multiclass classifiers, rather than requiring one tree per class, and drastically reduce the model size required for multiclass problems.