Iulian Serban

Learn More
Theano is a Python library that allows to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. Since its introduction, it has been one of the most used CPU and GPU mathematical compilers - especially in the machine learning community - and has shown steady performance improvements. Theano is being actively(More)
This paper introduces the Ubuntu Dialogue Corpus, a dataset containing almost 1 million multi-turn dialogues, with a total of over 7 million utterances and 100 million words. This provides a unique resource for research into building dialogue managers based on neural language models that can make use of large amounts of unlabeled data. The dataset has both(More)
We investigate the task of building open domain, conversational dialogue systems based on large dialogue corpora using generative models. Generative models produce system responses that are autonomously generated word-by-word, opening up the possibility for realistic, flexible interactions. In support of this goal, we extend the recently proposed(More)
Sequential data often possesses hierarchical structures with complex dependencies between sub-sequences, such as found between the utterances in a dialogue. To model these dependencies in a generative framework, we propose a neural networkbased generative architecture, with stochastic latent variables that span a variable number of time steps. We apply the(More)
We investigate evaluation metrics for endto-end dialogue systems where supervised labels, such as task completion, are not available. Recent works in end-to-end dialogue systems have adopted metrics from machine translation and text summarization to compare a model’s generated response to a single target response. We show that these metrics correlate very(More)
Over the past decade, large-scale supervised learning corpora have enabled machine learning researchers to make substantial advances. However, to this date, there are no large-scale questionanswer corpora available. In this paper we present the 30M Factoid QuestionAnswer Corpus, an enormous question answer pair corpus produced by applying a novel neural(More)
We introduce the multiresolution recurrent neural network, which extends the sequence-to-sequence framework to model natural language generation as two parallel discrete stochastic processes: a sequence of high-level coarse tokens, and a sequence of natural language tokens. There are many ways to estimate or learn the high-level coarse tokens, but we argue(More)
In this paper, we analyze neural network-based dialogue systems trained in an end-to-end manner using an updated version of the recent Ubuntu Dialogue Corpus, a dataset containing almost 1 million multi-turn dialogues, with a total of over 7 million utterances and 100 million words1. This dataset is interesting because of its size, long context lengths, and(More)
A central challenge for implementing quantum computing in the solid state is decoupling the qubits from the intrinsic noise of the material. We investigate the implementation of quantum gates for a paradigmatic, non-Markovian model: a single-qubit coupled to a two-level system that is exposed to a heat bath. We systematically search for optimal pulses using(More)