JOURNAL OF SOFTWARE (JSW)

ISSN : 1796-217X

Volume : 4 Issue : 1 Date : February 2009

**Fuzzy Clustering Algorithm based on Factor Analysis and its Application to Mail Filtering**

Jingtao Sun, Qiuyu Zhang, and Zhanting Yuan

Page(s): 58-64

Full Text: PDF (165 KB)

**Abstract**

To address the shortcomings of the dynamic clustering algorithm based on the fuzzy equivalence matrix, we propose a fuzzy clustering algorithm based on factor analysis, which incorporates dimensionality reduction by the factor analysis method. The algorithm preprocesses the sample set before fuzzy clustering, which broadens the range of practical problems that the dynamic clustering algorithm can handle. Applied to e-mail filtering, the algorithm shows a strong capability for generalization and abstraction. We also ran an experiment on a data set of our own selection. The results show that the algorithm achieves a recall rate of 87.3% in mail filtering, which is higher than that of SVM (80.1%), Naïve Bayes (61.7%), and KNN (73.2%). The experiments indicate that the new algorithm has a better recall rate and error rate than these methods.
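The clustering stage described above can be illustrated with a minimal Python sketch of fuzzy clustering through a fuzzy equivalence relation. It is not the authors' implementation. It assumes the samples have already been reduced to low-dimensional factor scores (the factor analysis step is not shown), and the similarity measure, the function names, and the threshold `lam` are all hypothetical choices made for illustration.

```python
def similarity_matrix(samples):
    """Build a reflexive, symmetric fuzzy similarity matrix.

    Assumption: similarity = 1 - (Manhattan distance / largest distance).
    The abstract does not state which similarity measure is used.
    """
    n = len(samples)
    dists = [[sum(abs(x - y) for x, y in zip(samples[i], samples[j]))
              for j in range(n)] for i in range(n)]
    dmax = max(max(row) for row in dists) or 1.0  # avoid dividing by zero
    return [[1.0 - d / dmax for d in row] for row in dists]


def max_min_compose(a, b):
    """Max-min composition of two square fuzzy relations."""
    n = len(a)
    return [[max(min(a[i][k], b[k][j]) for k in range(n))
             for j in range(n)] for i in range(n)]


def transitive_closure(r):
    """Square the relation repeatedly until it stops changing.

    For a reflexive relation this yields its transitive closure,
    which is a fuzzy equivalence relation.
    """
    while True:
        squared = max_min_compose(r, r)
        if squared == r:
            return r
        r = squared


def lambda_cut_clusters(closure, lam):
    """Group samples whose closure value is at least lam."""
    n = len(closure)
    assigned = [False] * n
    clusters = []
    for i in range(n):
        if assigned[i]:
            continue
        members = [j for j in range(n) if closure[i][j] >= lam]
        for j in members:
            assigned[j] = True
        clusters.append(members)
    return clusters


# Hypothetical factor scores for four messages (two per group).
scores = [[0.0, 0.1], [0.1, 0.0], [5.0, 5.1], [5.1, 5.0]]
closure = transitive_closure(similarity_matrix(scores))
clusters = lambda_cut_clusters(closure, lam=0.5)
```

Varying `lam` between 0 and 1 moves the partition from a single cluster toward one cluster per sample, so the number of clusters does not have to be fixed in advance.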

**Index Terms**

factor analysis, fuzzy clustering, fuzzy equivalence relation, spam filtering
