Fidelis Assis

This paper discusses the design decisions underlying the CRM114 Discriminator software, how it can be configured as a spam filter, and what we may glean from the preliminary TREC 2005 results. Unlike most other filters, CRM114 is not a fixed-purpose antispam filter; rather, it is a general-purpose language meant to expedite the creation of text filters. The [...]
Spam filtering is a text categorization task that has attracted significant attention due to the rapidly growing volume of junk email on the Internet. While current best-practice systems use Naive Bayes filtering and other probabilistic methods, we propose using a statistical but non-probabilistic classifier based on the Winnow algorithm. The feature [...]
OSBF-Lua is a C module for the Lua language which implements a Bayesian classifier enhanced with Orthogonal Sparse Bigrams (OSB) for feature extraction and Exponential Differential Document Count (EDDC) for feature selection. These two techniques, combined with the new training method introduced for TREC 2006, produce a highly accurate filter, yet very [...]
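As a rough illustration of the Orthogonal Sparse Bigrams idea named in the OSBF-Lua abstract, the sketch below pairs each token with each of the tokens that follow it inside a sliding window and records how many tokens were skipped in between. It is not taken from OSBF-Lua: the function name `osb_features`, the default window of 5 tokens, and the `<skip>` placeholder are all assumptions made for the example.

```python
from typing import List


def osb_features(tokens: List[str], window: int = 5, skip: str = "<skip>") -> List[str]:
    """Return sparse bigram features for a token sequence.

    Each token is paired with each of the next (window - 1) tokens.
    Skipped positions are replaced by a placeholder so that pairs at
    different distances become distinct features.
    NOTE: window size and placeholder are illustrative assumptions.
    """
    features = []
    for i, left in enumerate(tokens):
        for dist in range(1, window):
            j = i + dist
            if j >= len(tokens):
                break  # ran off the end of the text
            gap = [skip] * (dist - 1)  # one placeholder per skipped token
            features.append(" ".join([left, *gap, tokens[j]]))
    return features


if __name__ == "__main__":
    for feature in osb_features("buy cheap pills now".split(), window=3):
        print(feature)
```

Each resulting feature string would then be counted per class by whatever classifier sits on top; the feature selection step (EDDC) and the training method mentioned in the abstract are not modeled here.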