Unexpected Attributed Subgraphs: a Mining Algorithm
Abstract
Graphs are ubiquitous in real-world data, with applications ranging from the study of social interactions to bioinformatics and the modelling of physical systems. These real-world graphs are typically sparse, possibly large, and frequently carry additional information in the form of attributes, which makes them complex objects to understand. Graph summarization techniques can facilitate the discovery of hidden patterns in the underlying data by providing an interesting subset of the interactions and available attributes, which we broadly call a pattern. However, determining what counts as interesting in this context is not straightforward. We address this challenge by designing an interestingness measure based on the information-theoretic measure of Unexpectedness, which links the concepts of relevance and Kolmogorov complexity. We then design a pattern mining algorithm that summarizes the initial data as a set of unexpected patterns, that is, patterns whose observed complexity is lower than their expected complexity. Experiments on five real-world datasets, with comparisons against state-of-the-art methods, show that our method yields a small number of diverse patterns that provide a human-readable summary of the initial attributed graph. Our summaries quantitatively outperform attribute-only and interaction-only baselines as well as other pattern mining methods, reinforcing the need for methods that deal with attributed graphs. Finally, we visualize summaries extracted with our method to qualitatively validate their readability.
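As a rough illustration of the selection criterion described above, the following minimal sketch treats unexpectedness as the difference between an expected and an observed complexity, both measured in bits, and keeps only patterns for which that difference is positive. The `Pattern` class, its field names, the function names, and the numeric values are all hypothetical and chosen for illustration; they are not taken from the paper, and the sketch does not show how the complexities themselves would be estimated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pattern:
    """A candidate pattern with two complexity estimates (hypothetical fields)."""
    name: str
    expected_bits: float  # complexity expected for the pattern (assumed given)
    observed_bits: float  # complexity of the pattern as observed (assumed given)


def unexpectedness(p: Pattern) -> float:
    """Drop between expected and observed complexity, in bits."""
    return p.expected_bits - p.observed_bits


def unexpected_patterns(patterns):
    """Keep patterns with a positive complexity drop, largest drop first."""
    kept = [p for p in patterns if unexpectedness(p) > 0]
    return sorted(kept, key=unexpectedness, reverse=True)


# Toy values, made up for illustration only.
candidates = [
    Pattern("A", expected_bits=40.0, observed_bits=12.0),
    Pattern("B", expected_bits=15.0, observed_bits=18.0),
    Pattern("C", expected_bits=30.0, observed_bits=25.0),
]
summary = unexpected_patterns(candidates)
```

Under these toy values, pattern B is discarded because it is more complex than expected, while A and C are retained and ordered by the size of their complexity drop.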
Domains
Artificial Intelligence [cs.AI]
Origin: Files produced by the author(s)