Publications

A Comparative Study of BERT-Based Models for Teacher Classification in Physical Education

Authors

Martin-Hoz, Laura; Yanes-Luis, Samuel; Huerta Cejudo, Jeronimo; Gutierrez-Reina, Daniel; Franco Álvarez, Evelia

External publication

No

Journal

Electronics

Scope

Article

Nature

Scientific

JCR Quartile

SJR Quartile

Publication date

28/09/2025

ISI

001593572900001

Abstract

Assessing teaching behavior is essential for improving instructional quality, particularly in Physical Education, where classroom interactions are fast-paced and complex. Traditional evaluation methods such as questionnaires, expert observations, and manual discourse analysis are often limited by subjectivity, high labor costs, and poor scalability. These challenges underscore the need for automated, objective tools to support pedagogical assessment. This study explores and compares the use of Transformer-based language models for the automatic classification of teaching behaviors from real classroom transcriptions. A dataset of over 1300 utterances was compiled and annotated according to the teaching styles proposed in the circumplex approach (Autonomy Support, Structure, Control, and Chaos), along with an additional category for messages in which no style could be identified (Unidentified Style). To address class imbalance and enhance linguistic variability, data augmentation techniques were applied. Eight pretrained BERT-based Transformer architectures were evaluated, spanning several pretraining strategies and architectural variants. BETO achieved the highest performance, with an accuracy of 0.78, a macro-averaged F1-score of 0.72, and a weighted F1-score of 0.77, and showed particular strength in identifying challenging utterances labeled as Chaos and Autonomy Support. Other BERT-based models trained purely on Spanish text corpora, such as DistilBERT, also achieved competitive performance, with accuracy above 0.73 and an F1-score of 0.68. These results demonstrate the potential of Transformer-based models for objective and scalable teacher behavior classification, and the findings support the feasibility of using pretrained language models to develop scalable, AI-driven systems for classroom behavior classification and pedagogical feedback.
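
The following is a minimal sketch of the kind of pipeline the abstract describes: a Spanish BERT model (the public BETO checkpoint on Hugging Face) configured for five-way utterance classification over the circumplex categories. The checkpoint name, hyperparameters, and example utterance are illustrative assumptions, not details taken from the paper; in the study the model would first be fine-tuned on the annotated transcriptions.

```python
# Sketch: 5-way utterance classification with BETO via Hugging Face transformers.
# Label set mirrors the categories named in the abstract; everything else here
# (checkpoint, max_length, example sentence) is an illustrative assumption.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

LABELS = ["Autonomy Support", "Structure", "Control", "Chaos", "Unidentified Style"]
id2label = dict(enumerate(LABELS))
label2id = {name: idx for idx, name in id2label.items()}

# Public BETO checkpoint (Spanish BERT). Loading the base weights only shows the
# pipeline shape; a checkpoint fine-tuned on the annotated utterances is needed
# for meaningful predictions.
MODEL_NAME = "dccuchile/bert-base-spanish-wwm-cased"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME, num_labels=len(LABELS), id2label=id2label, label2id=label2id
)
model.eval()

# Classify a single (hypothetical) classroom utterance.
utterance = "Podéis elegir el ejercicio que prefiráis para calentar."
inputs = tokenizer(utterance, truncation=True, max_length=128, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(id2label[int(logits.argmax(dim=-1))])
```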

Keywords

natural language processing; transformers; classroom behavior classification

Universidad Loyola members