International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 141 - Issue 1
Published: May 2016
Authors: Jumi Sarmah, Shikhar Kr. Sarma
10.5120/ijca2016909488
Jumi Sarmah, Shikhar Kr. Sarma. Decision Tree based Supervised Word Sense Disambiguation for Assamese. International Journal of Computer Applications. 141, 1 (May 2016), 42-48. DOI=10.5120/ijca2016909488
@article{ 10.5120/ijca2016909488,
author = { Jumi Sarmah and Shikhar Kr. Sarma },
title = { Decision Tree based Supervised Word Sense Disambiguation for Assamese },
journal = { International Journal of Computer Applications },
year = { 2016 },
volume = { 141 },
number = { 1 },
pages = { 42-48 },
doi = { 10.5120/ijca2016909488 },
publisher = { Foundation of Computer Science (FCS), NY, USA }
}
%0 Journal Article
%D 2016
%A Jumi Sarmah
%A Shikhar Kr. Sarma
%T Decision Tree based Supervised Word Sense Disambiguation for Assamese
%J International Journal of Computer Applications
%V 141
%N 1
%P 42-48
%R 10.5120/ijca2016909488
%I Foundation of Computer Science (FCS), NY, USA
Word Sense Disambiguation (WSD) aims to automatically disambiguate words that have multiple senses in a context. A sense denotes a meaning of a word, and words that have several meanings in a context are referred to as ambiguous words. WSD is vital to many important Natural Language Processing tasks such as MT, IR, TC, and SP. This paper proposes a supervised Machine Learning approach, the Decision Tree (DT), for the Word Sense Disambiguation task in the Assamese language. A Decision Tree is a decision model with a flow-chart-like tree structure in which each internal node denotes a test, each branch represents the result of a test, and each leaf holds a sense label. J48, a Java implementation of the C4.5 decision tree algorithm, is used for the experiments. A few polysemous words, with different real occurrences in Assamese text and manual sense annotation, were collected as the training and test dataset. The DT algorithm produces an average F-measure of 0.611 when 10-fold cross-validation is performed on 10 Assamese ambiguous words.
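The tree structure described in the abstract, and the F-measure used to score it, can be illustrated with a minimal Python sketch. This is not the authors' J48 setup: the tree below is written by hand, whereas C4.5 induces the tree from annotated data using gain ratio. The English target word "bank", the context features `prev_word` and `next_word`, the sense labels, and the `default` fallback are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Leaf:
    sense: str  # sense label held by the leaf


@dataclass
class Node:
    feature: str                              # test applied at this internal node
    branches: Dict[str, Union["Node", Leaf]]  # one branch per test outcome
    default: str                              # hypothetical fallback for unseen outcomes


def classify(tree, context):
    """Walk from the root to a leaf, following the branch that matches each test."""
    while isinstance(tree, Node):
        outcome = context.get(tree.feature)
        if outcome not in tree.branches:
            return tree.default
        tree = tree.branches[outcome]
    return tree.sense


def f_measure(gold, predicted, sense):
    """F-measure for one sense: harmonic mean of precision and recall."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == sense and p == sense for g, p in pairs)
    fp = sum(g != sense and p == sense for g, p in pairs)
    fn = sum(g == sense and p != sense for g, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hand-written toy tree for the hypothetical ambiguous word "bank".
TOY_TREE = Node(
    feature="prev_word",
    branches={
        "river": Leaf("bank/shore"),
        "the": Node(
            feature="next_word",
            branches={
                "account": Leaf("bank/finance"),
                "erosion": Leaf("bank/shore"),
            },
            default="bank/finance",
        ),
    },
    default="bank/finance",
)

# Toy annotated occurrences (invented, not taken from the paper's dataset).
contexts = [
    {"prev_word": "river", "next_word": "was"},
    {"prev_word": "the", "next_word": "account"},
    {"prev_word": "the", "next_word": "erosion"},
    {"prev_word": "a", "next_word": "loan"},
]
gold = ["bank/shore", "bank/finance", "bank/shore", "bank/finance"]
predicted = [classify(TOY_TREE, c) for c in contexts]
```

In a 10-fold cross-validation such as the one reported, the annotated occurrences of each word would be split into ten parts, a tree would be trained on nine and scored on the held-out part, and the per-fold scores would be averaged.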