Detection of abusive language in user-generated online content has become an issue of increasing importance in recent years. Most current commercial methods make use of blacklists and regular expressions; however, these measures fall short when contending with more subtle, less ham-fisted examples of hate speech. In this work, we develop a machine learning…
Unlabeled samples can be intelligently selected for labeling to minimize classification error. In many real-world applications, a large number of unlabeled samples arrive in a streaming manner, making it impossible to maintain all the data in a candidate pool. In this work, we focus on binary classification problems and study selective labeling in data…
CAPTCHAs (completely automated public Turing test to tell computers and humans apart) are in common use today as a method for performing automated human verification online. The most popular type of CAPTCHA is the text recognition variety. However, many of the existing printed text CAPTCHAs have been broken by web-bots and are hence vulnerable to attack. We…
Many large Internet websites are accessed by users anonymously, without requiring registration or logging in. However, to provide personalized service, these sites build anonymous, yet persistent, user models based on repeated user visits. Cookies, issued when a web browser first visits a site, are typically employed to anonymously associate a website visit…
In this paper, we explore the potential of handwriting for use in CAPTCHAs. A synthetic handwriting generation method is presented, where the generated text lines need to be as close as possible to human handwriting without being writer-specific. The primary application of such a synthetic generator is in the design of handwritten CAPTCHAs (Completely…
Cancelable biometric systems are gaining in popularity for use in person authentication for applications where the privacy and security of biometric templates are important considerations. A variety of approaches have been proposed in the literature. In this work, we have chosen two techniques (one registration-based and one registration-free) and performed a…
Automated recognition of unconstrained handwriting continues to be a challenging research task. In contrast to the traditional role of handwriting recognition in applications such as postal automation and bank check reading, in this paper, we explore the use of handwriting recognition in designing CAPTCHAs for cyber security. CAPTCHAs (Completely Automatic…
Interactive websites use text-based Captchas to prevent unauthorized automated interactions. These Captchas must be easy for humans to decipher while being difficult to crack by automated means. In this work, we present a framework for the systematic study of Captchas along these two competing objectives. We begin by abstracting a set of distortions that…
Online services that allow users to contribute content and interact remotely over the internet in some manner are common today. Many of these services, like spam control for blogs and email account sign-up, require that they be accessed only by humans and not machines (automated scripts or bots). One method of differentiating between humans and bots is by…