In crowdsourcing systems, the interests of contributing participants and system stakeholders are often not fully aligned. Participants seek to learn, be entertained, and perform easy tasks, which offer them instant gratification; system stakeholders want users to complete more difficult tasks, which bring higher value to the crowdsourced application. We…
The purpose of this paper is to begin a conversation about the importance and role of confidence estimation in knowledge bases (KBs). KBs are never perfectly accurate, yet without confidence reporting their users are likely to treat them as if they were, possibly with serious real-world consequences. We define a notion of confidence based on the probability…
We employ universal schema for slot filling and cold start. In universal schema, we allow each surface pattern from raw text and each type defined in the ontology (i.e., the TACKBP slots) to represent a relation. We use matrix factorization to discover implications among surface patterns and target slots. First, we identify mentions of entities from the whole…
Large-scale author coreference, the problem of ascribing research papers to real-world authors in bibliographic databases, is critical for mining the scientific community. However, traditional pairwise approaches, which measure coreference similarity between pairs of author mentions, scale poorly to large databases; and streaming approaches, which lack the…
Guiding principles for selecting the best crowdsourcing methodology for a given information gathering task remain insufficient. This paper contributes additional experimental evidence and analysis to this problem. Our work focuses on a subset of crowdsourcing problems we term expert tasks: tasks that require specific domain knowledge. We experiment with…
Cross-sectional imaging has long been employed to examine swallowing in both the sagittal and axial planes. However, data regarding temporal swallow measures in the upright and supine positions are sparse, and none have employed the MBS impairment profile (MBSImP). We report temporal swallow measures, physiologic variables, and swallow safety of upright and…
Many modern clustering methods scale well to a large number of data items, N, but not to a large number of clusters, K. This paper introduces PERCH, a new non-greedy algorithm for online hierarchical clustering that scales to both massive N and K, a problem setting we term extreme clustering. Our algorithm efficiently routes new data points to the leaves of…