Introduction

Learning analytics researchers and practitioners often face the problem of making sense of large amounts of data, including textual data. Among the available analysis techniques, approaches for extracting the key topics and themes contained in a document collection are becoming increasingly popular. Of the many ways to extract topics and themes from text corpora, Latent Semantic Analysis (LSA), Probabilistic Latent Semantic Analysis (pLSA), and Latent Dirichlet Allocation (LDA) are among the most widely used.

The overall goal of the proposed tutorial is to give learning analytics researchers an overview of, and hands-on experience with, different techniques for extracting latent topics from text corpora. Focusing on the practical use of existing topic modeling techniques, the tutorial aims to help participants gain practical skills in developing topic models for their own document corpora using the R programming language and its topic modeling libraries. After the tutorial, participants should be able to answer questions such as: “What are the important topics in the document collection? How many documents discuss each of the discovered topics? What words best describe each of the discovered topics?”

UPDATE

Hello everyone,

Thank you very much for your participation today. It was very exciting for us! We really enjoyed the discussions and the many ideas for how topic modeling can be applied.

The slides are available here. If you have any questions or ideas, please drop us an email at sjoksimo@sfu.ca or v.kovanovic@ed.ac.uk.

Thanks once again and have a great LAK15!!

Srećko and Vitomir