Nanchang, China May 22 - 24, 2009

WISA 2009

Second International Symposium on

Web Information Systems and Applications

Proceedings of the 2nd International Symposium on Web Information Systems and Applications (WISA 2009)

Nanchang, China, May 22-24, 2009

Editors: Fei Yu, Jiexian Zeng, and Guangxue Yue

AP Catalog Number: AP-PROC-CS-09CN001

ISBN: 978-952-5726-00-8 (Print), 978-952-5726-01-5 (CD-ROM)

Page(s): 68-71

Highly Accurate Distributed Classification of Web Documents

JingKuan Song, Hui Gao, LianLi Gao, Yan Fu

Abstract

With the rapid growth of the Internet, building an efficient and accurate text classifier that can handle the vast volume of online documents is both a scientific challenge and a pressing economic need. This paper presents a distributed model for efficient web document classification. In this model, the distributed text classifiers are trained serially, with weights on the training instances that are set adaptively according to previous classification performance. Building on the distributed model, we also propose Unequal Bagging (UBagging), an improved bagging technique for text classifiers. Experimental results show that our approach achieves higher classification accuracy than traditional centralized text classifiers while requiring less memory and computation time.
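
The abstract does not state the weight update rule or the details of UBagging, so the following is only a minimal sketch of the general idea of training classifiers serially with adaptively reweighted training instances. It assumes an AdaBoost-style update and one-feature threshold classifiers (decision stumps) over term counts; each round stands in for one distributed classifier. The names `train_stump`, `train_serial`, and `predict`, and the toy data, are hypothetical and not taken from the paper.

```python
import math

def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, polarity) with lowest weighted error."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({row[f] for row in X}):
            for pol in (1, -1):
                # Sum the weights of the instances this stump misclassifies.
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if (pol if row[f] >= thr else -pol) != yi)
                if best is None or err < best[0]:
                    best = (err, f, thr, pol)
    return best

def stump_predict(stump, row):
    _, f, thr, pol = stump
    return pol if row[f] >= thr else -pol

def train_serial(X, y, rounds=3):
    """Train classifiers one after another, reweighting instances each round."""
    n = len(X)
    w = [1.0 / n] * n                      # start with uniform instance weights
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        if stump[0] >= 0.5:                # no better than chance: stop early
            break
        err = max(stump[0], 1e-10)         # avoid division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Assumed AdaBoost-style rule: misclassified instances gain weight.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, row):
    """Weighted vote of all trained classifiers; labels are +1 and -1."""
    score = sum(alpha * stump_predict(s, row) for alpha, s in ensemble)
    return 1 if score >= 0 else -1

# Toy term-count vectors (hypothetical): two features, labels in {+1, -1}.
docs = [[3, 0], [2, 1], [0, 2], [1, 3]]
labels = [1, 1, -1, -1]
model = train_serial(docs, labels)
predictions = [predict(model, d) for d in docs]
```

In a distributed setting, each trained classifier could be held on a separate node and only the instance weights passed along between rounds; that arrangement is likewise an assumption of this sketch.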

Index Terms

Text classification; Bagging; Distributed environment; Decision tree; Neural network

Copyright © 2009 ACADEMY PUBLISHER. All rights reserved.