We deal in this paper with anomaly-based host intrusion detection using system call traces produced by a host's kernel. In addition to the sequences, we leverage system call arguments, contextual information and domain level knowledge to produce clusters for each individual system call. These clusters are then used to rewrite process sequences of system calls obtained from kernel logs. The new sequences are then fed to a naïve Bayes supervised classifier (SC2.2) that builds class conditional probabilities from Markov modeling of system call sequences. The results of our proposed two-stage (that is clustering followed by classification) intrusion detection system on the 1999 DARPA dataset from the MIT Lincoln Lab show significant performance improvements in terms of false positive rate, while maintaining a high detection rate when compared with other classifiers. The two-stage classifier fares also better than classification alone with SC2.2 on system calls without arguments and contextual knowledge.