JOURNAL OF COMPUTERS (JCP)
ISSN : 1796-203X
Volume : 3    Issue : 10    Date : October 2008

Conversation Extraction in Dynamic Text Message Stream
Le Wang, Yan Jia, and Yingwen Chen
Page(s): 86-93

Abstract
Text message streams produced by instant messengers and Internet Relay Chat pose
interesting and challenging problems for information technologies. Extracting the
conversations in such chat message streams benefits information management and knowledge
discovery. However, the messages in a text message stream are usually very short and
incomplete, and monitoring thousands of continuous chat sessions demands efficiency. Many
existing text mining methods struggle under these conditions. This paper focuses on
conversation extraction in dynamic text message streams. We design a dynamic representation
for messages that combines text content information with linguistic features of the message
stream. A memory structure of reversed maximal similar relationship is developed to allow
renewable assignments when grouping messages into conversations. Finally, based on these
methods, we propose a double time window algorithm to extract conversations in dynamic text
message streams. Experiments on a real dataset show that our method outperforms two baseline
methods introduced in a recent related paper by about 47% and 15%, respectively, in terms of
F measure.

Index Terms
text message, conversation extraction, content similarity, linguistic feature
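
Illustrative sketch
The abstract describes grouping short chat messages into conversations using content
similarity under a time constraint. The Python sketch below illustrates only that general
idea with a single time window and Jaccard word overlap. It is not the authors' double time
window algorithm, and it does not model their dynamic representation or the reversed maximal
similar relationship structure. All names (Message, Conversation, extract_conversations) and
the window and threshold values are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    timestamp: float  # seconds since the start of the stream (assumed unit)
    author: str
    text: str


@dataclass
class Conversation:
    messages: list = field(default_factory=list)
    vocabulary: set = field(default_factory=set)

    def add(self, msg, tokens):
        self.messages.append(msg)
        self.vocabulary |= tokens


def tokenize(text):
    """Lowercase the text and split it into a set of words, dropping edge punctuation."""
    return {w.strip(".,!?;:").lower() for w in text.split()} - {""}


def jaccard(a, b):
    """Jaccard similarity of two token sets; 0.0 if either set is empty."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def extract_conversations(stream, window=120.0, threshold=0.15):
    """Assign each message to the most similar conversation still inside the window.

    A conversation is a candidate only if its latest message is at most `window`
    seconds older than the incoming one. If no candidate reaches `threshold`,
    the message starts a new conversation.
    """
    conversations = []
    for msg in sorted(stream, key=lambda m: m.timestamp):
        tokens = tokenize(msg.text)
        best, best_score = None, threshold
        for conv in conversations:
            if msg.timestamp - conv.messages[-1].timestamp > window:
                continue  # conversation has gone quiet for too long
            score = jaccard(tokens, conv.vocabulary)
            if score >= best_score:
                best, best_score = conv, score
        if best is None:
            best = Conversation()
            conversations.append(best)
        best.add(msg, tokens)
    return conversations


if __name__ == "__main__":
    stream = [
        Message(0, "alice", "anyone know how to install python"),
        Message(10, "bob", "what time is the football match"),
        Message(20, "carol", "install python with the package manager"),
        Message(30, "dave", "the football match starts at eight"),
    ]
    for i, conv in enumerate(extract_conversations(stream)):
        print(i, [m.author for m in conv.messages])
```

Once a message is assigned in this sketch, the assignment is final. The renewable
assignments mentioned in the abstract would instead allow an earlier grouping decision to be
revised as later messages arrive.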