Flexible Pattern Matching with Gap-Length and One-Off Conditions
Most of the current algorithms for mining frequent patterns assume that two object subdescriptions are similar if they are equal, but in many real-world problems some other ways to evaluate the similarity are used. Recently, three algorithms (ObjectMiner, STreeDC-Miner and STreeNDC-Miner) for mining frequent patterns allowing similarity functions different from the equality have been proposed. For searching frequent patterns, ObjectMiner and STreeDC-Miner use a pruning property called Downward Closure property, which should be held by the similarity function. For similarity functions that do not meet this property, the STreeNDC-Miner algorithm was proposed. However, for searching frequent patterns, this algorithm explores all subsets of features, which could be very expensive. In this work, we propose a frequent similar pattern mining algorithm for similarity functions that do not meet the Downward Closure property, which is faster than STreeNDC-Miner and loses fewer frequent similar patterns than ObjectMiner and STreeDC-Miner. Also we show the quality of the set of frequent similar patterns computed by our algorithm with respect to the quality of the set of frequent similar patterns computed by the other algorithms, in a supervised classification context.