Automated scoring of L2 spoken English with random forests

 The following paper has been published in the Journal of Pan-Pacific Association of Applied Linguistics (it can be downloaded here).

  • Yuichiro Kobayashi & Mariko Abe (2016). Automated scoring of L2 spoken English with random forests. Journal of Pan-Pacific Association of Applied Linguistics, 20(1), 55-73.
  • Abstract
    • The purpose of the present study is to assess second language (L2) spoken English using automated scoring techniques. Automated scoring aims to classify a large set of learners' oral performance data into a small number of discrete oral proficiency levels. In automated scoring, objectively measurable features such as the frequencies of lexical and grammatical items are generally used as explanatory variables to predict oral proficiency levels, which serve as the criterion variable. This study uses the NICT JLE Corpus, a corpus of 1,281 Japanese EFL learners' speech productions coded into nine oral proficiency levels (Izumi, Uchimoto, & Isahara, 2004). The nine oral proficiency levels were used as the criterion variable, and the linguistic features analyzed in Biber (1988) as the explanatory variables. We employed random forests (Breiman, 2001), a powerful method for text classification and feature extraction, to predict oral proficiency. Using the out-of-bag error estimate, the random forests model correctly classified 60.11% of the productions. Compared with the baseline accuracy of always choosing the most frequent level (37.63%), the model improved prediction accuracy by 22.48 percentage points. The Pearson product-moment correlation coefficient between the model's predictions and human scoring was 0.85. The predictors that most clearly discriminated among oral proficiency levels were, in order of strength, tokens, types, and the frequency of nouns.
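
The pipeline described in the abstract (random forest classification with an out-of-bag accuracy estimate, a most-frequent-level baseline, a Pearson correlation with the human scores, and feature-importance ranking) can be sketched as follows. This is a hedged illustration, not the authors' actual code: the NICT JLE Corpus and the Biber (1988) feature counts are not publicly bundled here, so synthetic feature vectors stand in for the real data, and the feature count (67) is an assumed placeholder.

```python
# Sketch of the automated-scoring setup, using scikit-learn's random forest.
# Synthetic data stands in for the NICT JLE Corpus features; only the learner
# count (1,281) and the number of proficiency levels (9) come from the paper.
import numpy as np
from scipy.stats import pearsonr
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n_learners, n_features, n_levels = 1281, 67, 9  # n_features is a placeholder

X = rng.random((n_learners, n_features))           # stand-in feature frequencies
y = rng.integers(1, n_levels + 1, size=n_learners)  # stand-in proficiency levels

# oob_score=True yields the out-of-bag accuracy estimate used in the study.
clf = RandomForestClassifier(n_estimators=500, oob_score=True, random_state=0)
clf.fit(X, y)
print(f"OOB accuracy: {clf.oob_score_:.4f}")

# Baseline: always predict the most frequent level.
baseline = np.bincount(y).max() / n_learners
print(f"Baseline accuracy: {baseline:.4f}")

# Pearson correlation between OOB-predicted levels and the (human) labels.
oob_pred = clf.classes_[np.argmax(clf.oob_decision_function_, axis=1)]
r, _ = pearsonr(y, oob_pred)
print(f"Pearson r with human scores: {r:.2f}")

# Feature importances rank the predictors; in the study, tokens, types,
# and noun frequency were the strongest.
top = np.argsort(clf.feature_importances_)[::-1][:3]
print("Top feature indices:", top)
```

With the real corpus features in place of the random matrix, `clf.oob_score_` corresponds to the 60.11% accuracy reported, and `r` to the 0.85 correlation; on this synthetic data both hover near chance level.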