• Publications
CiteSeerX: A Scholarly Big Dataset
TLDR
We propose an approach to CiteSeerX metadata cleaning that incorporates information from an external data source that is substantially cleaner than the entire set.
  • 46
  • 6
Understanding User Satisfaction with Intelligent Assistants
TLDR
Voice-controlled intelligent personal assistants, such as Cortana, Google Now, Siri and Alexa, are increasingly becoming a part of users' daily lives, especially on mobile devices.
  • 100
  • 5
Using Prerequisites to Extract Concept Maps from Textbooks
TLDR
We present a framework for constructing a specific type of knowledge graph, a concept map, from textbooks; concept maps are widely used in the learning sciences.
  • 48
  • 5
Concept Hierarchy Extraction from Textbooks
TLDR
We propose a method for extracting concept hierarchies from textbooks based on the knowledge in Wikipedia.
  • 35
  • 5
Predicting User Satisfaction with Intelligent Assistants
TLDR
We propose an automatic method to predict user satisfaction with intelligent assistants that exploits all the interaction signals, including voice commands and physical touch gestures on the device.
  • 77
  • 4
Detecting Good Abandonment in Mobile Search
TLDR
This paper proposes a solution to the problem of detecting good abandonment in mobile search, using gesture interactions, such as reading times and touch actions, as signals for differentiating between good and bad abandonment.
  • 46
  • 4
Towards building a scholarly big data platform: Challenges, lessons and opportunities
TLDR
We introduce a Big Data platform that provides various services for harvesting scholarly information and enabling efficient scholarly applications including citation recommendation and collaborator discovery.
  • 37
  • 3
Near duplicate detection in an academic digital library
TLDR
This paper describes an investigation into applying scalable, state-of-the-art duplicate detection algorithms, simhash and shingling, to detect near-duplicate documents in the CiteSeerX digital library.
  • 27
  • 3
Unsupervised Ranking for Plagiarism Source Retrieval: Notebook for PAN at CLEF 2013
TLDR
We describe a strategy for source retrieval that uses an unsupervised ranking method to rank the results returned by a search engine by their similarity to the query document, and that retrieves only documents that are likely to be sources of plagiarism.
  • 19
  • 3
Measuring User Satisfaction on Smart Speaker Intelligent Assistants Using Intent Sensitive Query Embeddings
TLDR
Intelligent assistants are increasingly being used on smart speaker devices, such as Amazon Echo, Google Home, Apple HomePod, and Harman Kardon Invoke with Cortana.
  • 16
  • 2