A Simple Local n-gram Ensemble for Authorship Verification

The authorship verification task requires deciding whether a given test document was written by the same author as the documents in a training set. For my attempt, I tested a simple voting ensemble of local (character) n-gram methods, using a grid search to choose parameters. The result is a method that requires little preconfiguration and can be applied to any language with a concept of characters. The method itself is quite fast; however, training is slow with the large number of attempted parameter…
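
To make the idea of a voting ensemble over character n-gram profiles concrete, here is a minimal sketch in Python. It is not the paper's implementation: the function names, the relative-difference dissimilarity, the profile sizes, and the thresholds are all assumptions chosen for illustration. In practice, the (n, profile size, threshold) triples would be the kind of parameters a grid search selects.

```python
from collections import Counter


def char_ngram_profile(text, n, size):
    """Map the `size` most frequent character n-grams to relative frequencies."""
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    if not grams:
        return {}
    counts = Counter(grams)
    return {g: c / len(grams) for g, c in counts.most_common(size)}


def dissimilarity(profile_a, profile_b):
    """Relative-difference distance between two profiles, scaled to [0, 1].

    Assumption: this particular measure is illustrative, not taken from the paper.
    """
    keys = set(profile_a) | set(profile_b)
    if not keys:
        return 0.0
    total = 0.0
    for g in keys:
        fa = profile_a.get(g, 0.0)
        fb = profile_b.get(g, 0.0)
        # Each term lies in [0, 4]; it is 4 when the n-gram is missing from one profile.
        total += ((fa - fb) / ((fa + fb) / 2)) ** 2
    return total / (4 * len(keys))


def ensemble_vote(known_text, unknown_text, configs):
    """Majority vote over hypothetical (n, profile_size, threshold) configurations.

    Each configuration votes "same author" when the dissimilarity between the
    known and unknown profiles is at or below its threshold.
    """
    votes = 0
    for n, size, threshold in configs:
        known = char_ngram_profile(known_text, n, size)
        unknown = char_ngram_profile(unknown_text, n, size)
        if dissimilarity(known, unknown) <= threshold:
            votes += 1
    return votes * 2 > len(configs)


# Hypothetical configurations; real values would come from a parameter search.
CONFIGS = [(2, 50, 0.5), (3, 50, 0.5), (4, 50, 0.5)]
```

Calling `ensemble_vote(known_text, unknown_text, CONFIGS)` returns `True` when more than half of the configurations vote "same author". Because the sketch only slices strings into character sequences, it carries over to any language whose text can be split into characters.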
