Faiza Khan Khattak

We propose a general scheme for quality-controlled labeling of large-scale data using multiple labels from the crowd and a "few" ground truth labels from an expert in the field. Expert-labeled instances are used to assign weights to the expertise of each crowd labeler and to the difficulty of each instance. Ground truth labels for all instances are then…
This paper is about an ongoing project in which we hypothesize that infant colic has causes that can be illuminated by digging into a large corpus of pediatric notes collected at the New York Presbyterian Hospital. Our ultimate goal is to conduct a large-scale study to understand infant colic, and potentially other conditions, through Machine Learning on…
Crowd-labeling emerged from the need to label large-scale and complex data, a tedious, expensive, and time-consuming task. However, obtaining good-quality labels from a crowd and integrating them remains an unresolved problem. To address this challenge, we propose a new framework that automatically combines and boosts bulk crowd labels supported by limited…
Crowd-labeling emerged from the need to label large-scale and complex data, a tedious, expensive, and time-consuming task. One of the main challenges in the crowd-labeling task is to control for, or determine in advance, the proportion of low-quality or malicious labelers. If that proportion grows too high, there is often a phase transition leading to a steep,…