

Proceedings of 2009 International Symposium on Computer Science and Computational Technology (ISCSCT 2009)

Huangshan, China, December 26-28, 2009

Editors: Fei Yu, Guangxue Yue, Jian Shu, Yun Liu

AP Catalog Number: AP-PROC-CS-09CN005

ISBN: 978-952-5726-07-7 (Print), 978-952-5726-08-4 (CD-ROM)

Page(s): 30-34

Efficiently Methods for Embedded Frequent Subtree Mining on Biological Data

Wei Liu, Ling Chen, and Lan Zheng



As a technology built on databases, statistics, and AI, data mining gives biological research a useful tool for analyzing information. Two key factors influence the performance of biological data mining approaches: the large scale of biological data and the high similarity among the patterns mined. In this paper, we present an efficient algorithm named IRTM for mining frequent subtrees embedded in biological data. We also propose a string encoding method for representing trees, and a scope-list for extending all substrings for frequency testing. The IRTM algorithm adopts a vertical mining approach and uses several pruning techniques to further reduce computational time and space cost. Experimental results show that IRTM achieves a significant performance improvement over previous work.
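
The abstract does not spell out IRTM's encoding or scope-list layout, so the following is only a minimal sketch under assumed conventions that are common in vertical tree mining (for example, in TreeMiner-style algorithms): a preorder label string with a "-1" backtrack marker, and a scope (l, u) per node, where l is the node's preorder index and u is the preorder index of its rightmost descendant. All function names and the toy database are hypothetical, and no pruning is shown. It illustrates how an embedded (ancestor-descendant) occurrence can be tested from scopes alone.

```python
from collections import defaultdict

# A tree is a (label, children) pair. This layout is assumed for illustration only.

def encode(tree):
    """Preorder string encoding: emit a label on entry, '-1' when backtracking to the parent."""
    label, children = tree
    out = [label]
    for child in children:
        out.extend(encode(child))
        out.append("-1")
    return out

def scopes(tree):
    """Return [(label, (l, u))] in preorder, where u is the index of the rightmost descendant."""
    result = []

    def visit(node):
        label, children = node
        idx = len(result)
        result.append(None)            # reserve this node's preorder slot
        for child in children:
            visit(child)
        result[idx] = (label, (idx, len(result) - 1))

    visit(tree)
    return result

def scope_lists(db):
    """Vertical representation: label -> list of (tree_id, scope)."""
    lists = defaultdict(list)
    for tid, tree in enumerate(db):
        for label, scope in scopes(tree):
            lists[label].append((tid, scope))
    return lists

def is_descendant(anc, desc):
    """True if the node with scope `desc` lies strictly below the node with scope `anc`."""
    return anc[0] < desc[0] and desc[1] <= anc[1]

def embedded_pair_support(lists, a, b):
    """Number of trees in which some a-labelled node has a b-labelled descendant."""
    tids = set()
    for ta, sa in lists.get(a, []):
        for tb, sb in lists.get(b, []):
            if ta == tb and is_descendant(sa, sb):
                tids.add(ta)
    return len(tids)

# Toy database of three labelled trees (made-up data, not from the paper).
db = [
    ("A", [("B", [("C", [])]), ("D", [])]),
    ("A", [("C", [])]),
    ("B", [("A", [("D", [])])]),
]
lists = scope_lists(db)
```

In the first toy tree, C is a grandchild of A, so the pair (A, C) counts as an embedded occurrence even though it is not a direct parent-child edge. Because the test only compares scopes, support can be computed by joining scope-lists without traversing the trees again.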

Index Terms

Embedded Frequent Subtree; Scope-List; Biological Data

Copyright © 2009 ACADEMY PUBLISHER. All rights reserved.