International Journal of Recent Trends in Engineering (IJRTE)

ISSN 1797-9617

Volume 1, Number 1, May 2009

Issue on Computer Science

Page(s): 498-500

English to Tamil Transliteration using WEKA

Vijaya MS, Ajith VP, Shivapratap G, and Soman KP


Machine transliteration has become an important supporting tool for machine translation and cross-language information retrieval, especially when proper names and technical terms are involved. The performance of both tasks depends heavily on accurate transliteration of named entities, so a transliteration model must preserve the phonetic structure of words as closely as possible. In this paper, the transliteration problem is modeled as a classification problem, and a C4.5 decision tree classifier is trained in the WEKA environment using features extracted from a parallel corpus. The technique was demonstrated for English-to-Tamil transliteration and produced exact Tamil transliterations for 84.82% of English names. The model also generated possible equivalent transliterations, and transliteration accuracy increased when the top five ranked transliterations were considered.
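
The classification framing described above can be illustrated with a minimal sketch. The abstract does not specify the features, segmentation, or alignment procedure, so everything here is an assumption: the function names, the `PAD` symbol, the context window size `k`, and the one-to-one hand-aligned toy pair are all hypothetical and are not taken from the paper's corpus. The sketch only shows how each source unit, together with its neighbours, could become one training instance whose class label is the aligned target unit. Training the decision tree itself is not shown.

```python
from typing import List, Tuple

PAD = "_"  # hypothetical padding symbol for positions outside the word

Instance = Tuple[Tuple[str, ...], str]  # (context-window features, class label)


def window_features(source: List[str], i: int, k: int = 2) -> Tuple[str, ...]:
    """Return the source unit at position i with k units of context on each side."""
    padded = [PAD] * k + source + [PAD] * k
    # Position i in the source sits at index i + k in the padded list,
    # so the window of width 2k + 1 centred on it starts at index i.
    return tuple(padded[i : i + 2 * k + 1])


def build_instances(
    aligned_pairs: List[Tuple[List[str], List[str]]], k: int = 2
) -> List[Instance]:
    """Turn one-to-one aligned (source, target) unit sequences into instances."""
    instances: List[Instance] = []
    for source, target in aligned_pairs:
        if len(source) != len(target):
            raise ValueError("source and target must be aligned one-to-one")
        for i, label in enumerate(target):
            instances.append((window_features(source, i, k), label))
    return instances


# Toy, hand-aligned example (an assumption for illustration only).
toy_pairs = [(["r", "a", "m"], ["ர", "ா", "ம்"])]
instances = build_instances(toy_pairs, k=1)

for features, label in instances:
    print(features, "->", label)
```

Each resulting row is a fixed-length vector of nominal attributes plus a class label, which is the shape of input a decision tree learner expects. In this framing, a whole-word transliteration is obtained by predicting a label for every source position and concatenating the predictions.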

Index Terms

Transliteration, Sequence labeling, alignment, decision tree, WEKA

Published by Academy Publisher in cooperation with the ACEEE

© Copyright 2009 ACADEMY PUBLISHER. All rights reserved.