Introduction to Natural Language Processing in Education and applications for formative assessment

25th April 2022 : 12:45 - 14:00

Category: Seminar

Research Group: Quantitative Methods Hub

Speaker: Owen Henkel (Department of Education, University of Oxford)

Location: Online & in-person - Seminar Room E, Department of Education, or via Zoom

Convener: Ariel Lindorff

Audience: Public

All are welcome to join in person. Register to join online via Zoom.

Over the past five years, the broader field of Natural Language Processing (NLP) has undergone a renaissance, driven largely by the emergence of pre-trained, word-embedding-based language models such as BERT and GPT-3. This has led to significant improvements in a variety of core NLP challenges such as sentiment analysis, machine translation, and transcription, the latter two of which have reached human-level performance. While educational applications of NLP have been a topic of research for decades, the limitations of previous NLP techniques meant that most successful applications were restricted to narrow domains. However, recent advances in NLP mean that challenges once considered prohibitively complex, such as interactive chatbots, speech recognition, or automatic grading of complex open-ended responses, may now be tractable.

My specific research focus is the potential of NLP to assist in the formative assessment of basic literacy in low- and middle-income countries (LMICs). In many LMICs, it is challenging to conduct high-quality formative assessments of children’s literacy due to a variety of factors. As a result, large-scale standardized assessments, which typically consist of silently reading passages and then answering multiple-choice questions, have become the de facto method for nations to assess students’ literacy. This is a problem both because the assessment format is poorly suited to assessing basic literacy, and because the assessments are conducted infrequently and on a small sample of students, meaning the results cannot be used at the classroom level to improve instruction. In the past, more effective approaches to formative literacy assessment (e.g., oral reading, story-retell, short-answer questions) were rarely used because they were substantially more difficult and time-consuming to administer and grade.

However, given the recent advances in NLP and the proliferation of publicly available pre-trained language models, it appears feasible to partially automate the administration and scoring of formative literacy assessments. To test this, I am collaborating with a school network in Ghana to conduct a series of literacy assessments with approximately 500 of their primary school students. Students’ responses will be graded by a mix of experts and crowd workers and will be used to train language models to score student responses in the same way human raters would. The results can be used in conjunction with the school network’s pre-existing reading achievement and student demographic data to investigate both the predictive and convergent validity of open-ended questions compared to traditional measures of reading ability, as well as the models’ performance relative to human raters.

Anticipated Agenda

Recent advances in Natural Language Processing (20 min)
Implications for and potential applications in Education (20 min)
NLP and formative literacy assessment: current research and initial findings (30 min)
Questions/discussion (20 min)