FLEAT VI has ended
Wednesday, August 12 • 4:00pm - 4:25pm
CANCELLED A Stochastic Learning Approach for English Countability Prediction


This session has been cancelled.  Some nouns can be treated as either countable or uncountable depending on context. However, novice learners of English, especially ESL learners, often find it difficult to tell which treatment applies. This study applies machine learning techniques to clarify when a noun should be treated as countable and when it should not. A stochastic learning model, such as the well-known Bayesian network, is used for this purpose. It is trained to estimate the probability that countable or uncountable nouns co-occur with other words. Native English texts taken from the British National Corpus are converted into Latent Semantic Index (LSI) vectors with part-of-speech tags and used to train the model. An LSI is an index number assigned to a word; different words receive the same index when they are used in the same or similar contexts and are therefore considered to have the same or similar meaning. LSI vectors thus make the word space more compact and reduce ambiguity. The results are expected to show high probabilities for certain combinations of specific word types; in other words, the probability becomes high in some contexts. A high probability means that it is easy to determine whether a noun should be treated as countable or uncountable in that context; conversely, a low probability means that the determination is difficult. This information is useful for both teachers and students. Furthermore, the trained model can be applied to machine translation systems. There are several related works. Baldwin and Bond (2003) proposed a clustering method to estimate countability from corpus data. Nagata et al. (2006) proposed a method to determine the countability of nouns from their contexts. However, these studies used deterministic models and could not estimate the probability of countability. We believe this probability is important because it indicates how difficult it is to determine countability in a given context, which can be used in teaching and learning English.
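As a rough illustration of estimating a countability probability from context words, the sketch below uses a naive Bayes classifier, which is one of the simplest Bayesian networks. It is not the model from this study: the training examples are invented toy data rather than British National Corpus text, raw context words stand in for LSI vectors and part-of-speech tags, and the names `TRAIN`, `train`, and `predict_proba` are hypothetical.

```python
import math
from collections import Counter, defaultdict

# Invented toy data (an assumption, not from the BNC):
# each item is (context words seen near a noun, countability label).
TRAIN = [
    (["a", "red"], "countable"),
    (["two", "fresh"], "countable"),
    (["many", "small"], "countable"),
    (["much", "fresh"], "uncountable"),
    (["some", "hot"], "uncountable"),
    (["little", "cold"], "uncountable"),
]


def train(examples):
    """Count labels and per-label context words."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for words, label in examples:
        label_counts[label] += 1
        for w in words:
            word_counts[label][w] += 1
            vocab.add(w)
    return label_counts, word_counts, vocab


def predict_proba(model, words):
    """Return P(label | context words) under naive Bayes with add-one smoothing."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    log_scores = {}
    for label, n in label_counts.items():
        denom = sum(word_counts[label].values()) + len(vocab)
        score = math.log(n / total)  # log prior
        for w in words:
            if w in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][w] + 1) / denom)
        log_scores[label] = score
    # Normalize log scores into probabilities that sum to 1.
    top = max(log_scores.values())
    exp_scores = {k: math.exp(v - top) for k, v in log_scores.items()}
    z = sum(exp_scores.values())
    return {k: v / z for k, v in exp_scores.items()}


model = train(TRAIN)
print(predict_proba(model, ["many"]))
print(predict_proba(model, ["fresh"]))
```

In this toy data, "many" appears only with countable nouns, so the countable probability is pushed above one half, while "fresh" appears equally with both classes and leaves the two probabilities balanced. A balanced result of this kind corresponds to the hard-to-determine contexts that the abstract describes.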


Junko Tanaka

Kobe University

Wednesday August 12, 2015 4:00pm - 4:25pm EDT
CGIS South S020 (Belfer Case Study Room) 1730 Cambridge St, Cambridge, MA