Distinguishing sequential patterns are sequential patterns that occur much more frequently in a target group of sequences (associated with a given phenomenon of interest) than in a contrasting group of sequences. They are useful for many machine learning tasks, as well as for explaining and characterizing the phenomenon underlying the target group. However, previous studies on mining distinguishing sequential patterns did not consider hierarchical relationships among the elements of sequences. To fill this gap, this paper investigates the mining of distinguishing sequential patterns in the presence of concept hierarchies among sequence elements. We call the resulting patterns concept-hierarchy aware distinguishing sequential patterns (hDSPs). After describing the challenges of mining hDSPs, we present hDSP-Miner, a mining method with effective pruning techniques. Our empirical study on real-world protein sequences demonstrates that hDSP-Miner is effective and efficient, and that it can discover more novel distinguishing sequential patterns than previous algorithms for mining distinguishing sequential patterns.
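To make the definition concrete, the following is a minimal sketch of checking whether a single candidate pattern is distinguishing when pattern items may be concepts from a hierarchy. It is not hDSP-Miner: it performs no pattern enumeration or pruning and ignores any gap constraints. The hierarchy `PARENT`, the toy sequences, and the thresholds `alpha` and `beta` (one possible way to formalize "much higher frequency") are assumptions made only for illustration.

```python
from typing import Dict, List, Optional, Sequence

# Hypothetical child -> parent concept hierarchy (illustrative only).
PARENT: Dict[str, str] = {
    "A": "hydrophobic", "V": "hydrophobic",
    "K": "charged", "D": "charged",
}

def generalizes(concept: str, element: str, parent: Dict[str, str]) -> bool:
    """True if `concept` equals `element` or is one of its ancestors."""
    node: Optional[str] = element
    while node is not None:
        if node == concept:
            return True
        node = parent.get(node)  # climb one level; None at the root
    return False

def matches(pattern: Sequence[str], sequence: Sequence[str],
            parent: Dict[str, str]) -> bool:
    """Greedy subsequence test where each pattern item may be a concept."""
    i = 0
    for element in sequence:
        if i < len(pattern) and generalizes(pattern[i], element, parent):
            i += 1
    return i == len(pattern)

def support(pattern: Sequence[str], sequences: List[str],
            parent: Dict[str, str]) -> float:
    """Fraction of sequences that contain the pattern."""
    if not sequences:
        return 0.0
    return sum(matches(pattern, s, parent) for s in sequences) / len(sequences)

def is_distinguishing(pattern: Sequence[str], target: List[str],
                      contrast: List[str], parent: Dict[str, str],
                      alpha: float, beta: float) -> bool:
    """Assumed criterion: frequent in target (>= alpha), rare in contrast (<= beta)."""
    return (support(pattern, target, parent) >= alpha
            and support(pattern, contrast, parent) <= beta)

# Toy data (made up for this sketch, not taken from the paper).
target = ["AKD", "VKA", "AVD"]
contrast = ["KDA", "DKV", "KKD"]

concrete = is_distinguishing(["A", "K"], target, contrast, PARENT, 0.6, 0.2)
generalized = is_distinguishing(["hydrophobic", "charged"], target, contrast,
                                PARENT, 0.6, 0.2)
```

On this toy data the element-level pattern `["A", "K"]` fails the target threshold, while the generalized pattern `["hydrophobic", "charged"]` passes both thresholds. That is the kind of pattern a method unaware of the hierarchy would miss.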
- Jufo level 1
ASJC Scopus subject areas
- Management Information Systems
- Information Systems and Management
- Artificial Intelligence