Heat map with hierarchical clustering: Multivariate visualization method for corpus-based language studies

 The following paper has been published in the journal 『国立国語研究所論集』 (NINJAL Research Papers). It is open access, so it can be downloaded freely. In addition to explaining the method, the paper includes R scripts.

  • Yuichiro Kobayashi (2016). Heat map with hierarchical clustering: Multivariate visualization method for corpus-based language studies. NINJAL Research Papers, 11, 25-36.
  • Abstract
    • An advantage of corpus-based language studies is that global descriptions of linguistic texts can be obtained by examining a broad range of linguistic features. However, multivariate statistical techniques are required to analyze the multiple linguistic features found in a number of texts. This study compared the strengths and weaknesses of several multivariate statistical techniques, thereby demonstrating the effectiveness of using heat map with hierarchical clustering as a powerful method for visualizing multivariate data. Explanations are also provided for how these techniques can be used in the R programming language as well as indicating how the results obtained can be interpreted.
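
To give a rough idea of what a heat map with hierarchical clustering looks like in R, here is a minimal sketch using only base R. It is not the script from the paper: the data are random numbers invented for illustration, the text and feature names are hypothetical, and the choice of Euclidean distance with Ward's method is an assumption made here.

```r
# Toy data (invented for illustration): frequencies of four
# linguistic features in six texts.
set.seed(1)
freq <- matrix(runif(24, min = 0, max = 50), nrow = 6,
               dimnames = list(paste0("text", 1:6),
                               c("nouns", "verbs", "pronouns", "modals")))

# Standardize each feature (column) so that features on
# different scales can be compared in a single color scale.
z <- scale(freq)

# Cluster the texts (rows) and the features (columns) separately.
# Euclidean distance and Ward's method are assumed choices.
row_hc <- hclust(dist(z), method = "ward.D2")
col_hc <- hclust(dist(t(z)), method = "ward.D2")

# Draw the heat map with both dendrograms attached.
# scale = "none" because the data were already standardized above.
heatmap(z,
        Rowv = as.dendrogram(row_hc),
        Colv = as.dendrogram(col_hc),
        scale = "none",
        margins = c(7, 5))
```

In the resulting plot, each cell shows the standardized frequency of one feature in one text, and the dendrograms on the left and top show which texts and which features group together. Other distance measures or linkage methods can be swapped in through `dist()` and `hclust()`.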