The Ubuntu Dialogue Corpus: A Large Dataset for Research in Unstructured Multi-Turn Dialogue Systems

@inproceedings{Lowe2015TheUD,
  title={The Ubuntu Dialogue Corpus: A Large Dataset for Research in Unstructured Multi-Turn Dialogue Systems},
  author={Ryan Lowe and Nissan Pow and Iulian Serban and Joelle Pineau},
  booktitle={SIGDIAL Conference},
  year={2015}
}
This paper introduces the Ubuntu Dialogue Corpus, a dataset containing almost 1 million multi-turn dialogues, with a total of over 7 million utterances and 100 million words. This provides a unique resource for research into building dialogue managers based on neural language models that can make use of large amounts of unlabeled data. The dataset has both the multi-turn property of conversations in the Dialog State Tracking Challenge datasets, and the unstructured nature of interactions from…
This paper has highly influenced 44 other papers.
This paper has 256 citations.
This paper has been referenced on Twitter 35 times.

Citations

Publications citing this paper.

LSTM based Conversation Models

Citations per Year (figure: yearly citation counts, 2015 to 2019)
Semantic Scholar estimates that this publication has 256 citations based on the available data.
