ISSN: 1796-2048
Volume: 4, Issue: 5, Date: October 2009

A Multimodal Data Mining Framework for Revealing Common Sources of Spam Images
Chengcui Zhang, Wei-Bang Chen, Xin Chen, Richa Tiwari, Lin Yang, and Gary Warner
Page(s): 313-320

This paper proposes a multimodal framework that clusters spam images so that images from the
same spam source are grouped together. Identifying the common sources of spam images
provides evidence for tracking spam gangs. To this end, text recognition and visual feature
extraction are performed on each image. A two-level clustering method is then applied: at the
first level, images with visually similar illustrations are grouped together; at the second level,
the resulting clusters are refined using the textual clues contained in the spam images, where available.
Our experimental results show the effectiveness of the proposed framework.
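
As a rough illustration of the two-level idea described above, the following Python sketch groups
images by visual features first and then splits each visual group by recognized text. It is not
the authors' implementation: the feature vectors, the greedy threshold clustering, the Jaccard text
similarity, the threshold values, and all names (two_level_cluster, greedy_cluster, the "visual"
and "tokens" fields) are assumptions made only for this example. Images with no recognized text
are kept together within their visual group, which is also an assumption.

```python
from math import sqrt


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def jaccard(a, b):
    """Jaccard similarity between two token sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def greedy_cluster(items, is_similar):
    """Put each item into the first cluster whose first member is similar to it.

    A deliberately simple stand-in for whatever clustering algorithm is used.
    """
    clusters = []
    for item in items:
        for cluster in clusters:
            if is_similar(cluster[0], item):
                cluster.append(item)
                break
        else:
            clusters.append([item])
    return clusters


def two_level_cluster(images, visual_threshold=0.2, text_threshold=0.5):
    """Hypothetical two-level clustering: visual grouping, then text refinement."""
    # Level 1: group images whose visual feature vectors are close.
    visual_clusters = greedy_cluster(
        images,
        lambda a, b: euclidean(a["visual"], b["visual"]) <= visual_threshold,
    )

    refined = []
    for cluster in visual_clusters:
        with_text = [img for img in cluster if img["tokens"]]
        without_text = [img for img in cluster if not img["tokens"]]
        # Level 2: split each visual cluster by textual similarity, where text exists.
        refined.extend(
            greedy_cluster(
                with_text,
                lambda a, b: jaccard(a["tokens"], b["tokens"]) >= text_threshold,
            )
        )
        # Assumption: images without recognized text stay together in their visual group.
        if without_text:
            refined.append(without_text)
    return refined


# Toy data: "visual" is an invented 2-D feature vector, "tokens" the recognized words.
images = [
    {"id": "a", "visual": (0.10, 0.90), "tokens": {"cheap", "pills", "online"}},
    {"id": "b", "visual": (0.12, 0.88), "tokens": {"cheap", "pills", "now"}},
    {"id": "c", "visual": (0.11, 0.90), "tokens": {"stock", "alert", "buy"}},
    {"id": "d", "visual": (0.90, 0.10), "tokens": set()},
]

clusters = [[img["id"] for img in c] for c in two_level_cluster(images)]
print(clusters)
```

In this toy data, images a, b, and c look alike and land in one first-level group; the second
level then separates c because its text shares no words with a and b. Image d is visually
distinct and has no text, so it remains in a group of its own.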

Index Terms
spam image, clustering, multimodal analysis, botnet, computer forensics