ISSN : 1796-203X
Volume : 1    Issue : 4    Date : July 2006

Empirical Analysis of Attribute-Aware Recommender System Algorithms Using Synthetic Data
Karen H. L. Tso and Lars Schmidt-Thieme
Page(s): 18-29

As the number of online shoppers grows rapidly, the demand for recommender systems on
e-commerce sites is increasing, especially as the number of users and products offered
online continues to rise dramatically. There has been much ongoing research on
recommender systems and on recommendation algorithms that could optimize
recommendation quality. However, adequate public datasets of users and products have
always been in demand for better evaluation of recommender system algorithms. Yet the amount of
public data, especially data containing adequate content information (attributes), is limited. When
evaluating recommendation algorithms, it is important to observe the behavior of the algorithm as
the characteristics of the data vary. Synthetic data allow systematic changes to be applied
to the data, which cannot be done with real-life data. Although synthetic data for
recommender systems have been studied, artificial data with attribute information have rarely been
investigated. In this paper, we review public and synthetic data that are applied in the field of
recommender systems. A synthetic data generation methodology that considers attributes is also
discussed. Furthermore, we present empirical evaluations of existing attribute-aware
recommendation algorithms and other state-of-the-art algorithms using a real-life dataset as well as
variable synthetic data to observe their behavior as the characteristics of the data vary. In particular,
the informativeness of attributes is further investigated using both real-life datasets with
augmented attribute sets and synthetic datasets with attributes. We show that a
reasonably good overview of the behavior of attribute-aware algorithms can be obtained by using
synthetic data, compared with results obtained on real-life datasets.

Index Terms
synthetic data, recommender systems, collaborative filtering, content-based filtering, attribute-aware