Doctoral theses of the School of Science are available in Aaltodoc, the open access repository maintained by Aalto University.
Public defence in Computer Science, M.Sc. Charles Koutcheme
Public defence from the Aalto University School of Science, Department of Computer Science.
Title of the thesis: Towards Automated Programming Feedback with Open-Weight Language Models (title changed on 19.1.2026)
Thesis defender: Charles Koutcheme
Opponent: Professor Teemu Roos, University of Helsinki, Finland
Custos: Professor Mikko Kivelä, Aalto University School of Science
Effectively supporting the growing number of students learning to program remains a longstanding challenge. While educators' instruction and guidance are irreplaceable, their ability to provide timely, personalized feedback cannot scale in large introductory programming classes.
Recently, advances in large language models (LLMs) have opened the possibility of deploying AI learning assistants. However, the best-known models are accessible only via APIs, raising serious concerns about cost, institutional control, and long-term dependency. Open-weight models, which are freely downloadable, offer a lesser-known alternative pathway.
This dissertation investigates how open-weight language models can be integrated into educational feedback systems. The thesis focuses particularly on small language models, which can be deployed on limited hardware or even directly on students' personal computers, enabling timely, personalized, adjustable support. The work addresses three central challenges: enabling models to correct student programs in educationally meaningful ways, automatically evaluating feedback quality without human annotation, and improving smaller models' capabilities without expensive human-labeled datasets.
The dissertation makes three contributions addressing these challenges, bridging three disciplines: programming education, software engineering research, and language modeling research. First, it combines language models with classical program repair techniques, showing how models can fix incorrect student code while preserving the correct portions, and how deterministic repair rules can be transferred into probabilistic models. Second, it proposes two frameworks for evaluating feedback quality: an LLM-as-a-Judge approach that leverages larger models to score feedback, and a repair-as-proxy method that uses repair performance to estimate feedback proficiency. A central finding is that program repair capability, which can be measured automatically, correlates well with feedback quality. Third, it introduces reinforcement learning and preference optimization methods to improve smaller models' pedagogical abilities without human annotations.
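To make the repair-as-proxy idea concrete, the sketch below shows one way such an evaluation loop could look. It is a minimal illustration under assumed details, not the dissertation's actual implementation: the student submission, the model-proposed repair, and the test cases are invented examples, and a real system would sandbox the executed code rather than run it directly.

```python
# Minimal sketch of a repair-as-proxy evaluation loop (illustrative only).
# A model-proposed repair of a student's program is scored by the fraction
# of instructor test cases it passes; this pass rate serves as an automatic
# proxy for the model's feedback proficiency, with no human annotation.

from typing import Callable

# Invented example: a student's buggy solution and a model-proposed repair.
student_code = """
def average(numbers):
    return sum(numbers) / len(numbers) + 1  # bug: spurious "+ 1"
"""

proposed_repair = """
def average(numbers):
    return sum(numbers) / len(numbers)
"""

# Invented instructor test cases: (input, expected output).
test_cases = [
    ([1, 2, 3], 2.0),
    ([10], 10.0),
    ([0, 0], 0.0),
]

def pass_rate(code: str) -> float:
    """Execute `code` and return the fraction of test cases it passes.

    NOTE: `exec` on untrusted code is unsafe; a real system would run
    submissions in an isolated sandbox with time and memory limits.
    """
    namespace: dict = {}
    exec(code, namespace)
    fn: Callable = namespace["average"]
    passed = sum(1 for args, expected in test_cases if fn(args) == expected)
    return passed / len(test_cases)

if __name__ == "__main__":
    print(f"student submission: {pass_rate(student_code):.2f}")    # 0.00
    print(f"proposed repair:    {pass_rate(proposed_repair):.2f}")  # 1.00
```

In this toy setup, a model whose proposed repairs raise the pass rate from 0.00 to 1.00 would be scored as a capable repairer, and the thesis's finding suggests such a score is also informative about the quality of the feedback the model can generate.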
Empirical evaluations show that small, open-weight models can approach the feedback quality of larger proprietary systems. These findings are particularly relevant to practitioners interested in building sustainable, accessible AI-powered feedback systems and educational tools.
Keywords: Programming education, automated feedback, language models, program repair, computing education research, machine learning in education
Contact information:
Thesis available for public display 7 days prior to the defence at .