Achint Oommen Thomas

Detection of abusive language in user-generated online content has become an issue of increasing importance in recent years. Most current commercial methods make use of blacklists and regular expressions; however, these measures fall short when contending with more subtle, less ham-fisted examples of hate speech. In this work, we develop a machine learning ...
Unlabeled samples can be intelligently selected for labeling to minimize classification error. In many real-world applications, a large number of unlabeled samples arrive in a streaming manner, making it impossible to maintain all the data in a candidate pool. In this work, we focus on binary classification problems and study selective labeling in data ...
Many large Internet websites are accessed by users anonymously, without requiring registration or logging in. However, to provide personalized service, these sites build anonymous, yet persistent, user models based on repeated user visits. Cookies, issued when a web browser first visits a site, are typically employed to anonymously associate a website visit ...
In this paper, we explore the potential of handwriting for use in CAPTCHAs. A synthetic handwriting generation method is presented, where the generated textlines need to be as close as possible to human handwriting without being writer-specific. The primary application of such a synthetic generator is in the design of handwritten CAPTCHAs (Completely ...
Large-scale retrieval of handwritten documents has primarily focused on searching for a query text in the OCR’ed transcription of the document images, which provides a limited view of the complete search process. Recent research advances have led to a number of content-based retrieval techniques which expand the search scope to document content level ...
Automated recognition of unconstrained handwriting continues to be a challenging research task. In contrast to the traditional role of handwriting recognition in applications such as postal automation and bank check reading, in this paper, we explore the use of handwriting recognition in designing CAPTCHAs for cyber security. CAPTCHAs (Completely Automatic ...
Cancelable biometric systems are gaining popularity for person authentication in applications where the privacy and security of biometric templates are important considerations. A variety of approaches have been proposed in the literature. In this work, we have chosen two techniques (one registration-based and one registration-free) and performed a ...
Online services that allow users to contribute content and interact remotely over the internet in some manner are common today. Many of these services, like spam control for blogs and email account sign-up, require that they be accessed only by humans and not machines (automated scripts or bots). One method of differentiating between humans and bots is by ...
Interactive websites use text-based Captchas to prevent unauthorized automated interactions. These Captchas must be easy for humans to decipher while being difficult to crack by automated means. In this work, we present a framework for the systematic study of Captchas along these two competing objectives. We begin by abstracting a set of distortions that ...