INTRODUCTION: Studying how classroom layout and activities affect the learning outcomes of students with different demographics is difficult because it is hard to gather accurate information on the minute-by-minute progression of every class in a course. Furthermore, such a study needs a large volume of data, so data gathering must be automated. OBJECTIVES: To develop a machine learning model, trained on classroom images, that can accurately label the classroom layout and activity in many thousands of images far faster and more cheaply than a human annotator. METHODS: Transfer learning allows preexisting computer vision models to be retrained on a smaller, more specific dataset while still achieving highly accurate results. RESULTS: For classroom layout, the final model achieved an accuracy of 97% on a four-category classification task. For classroom activity, after experimentation with several model versions designed to work with very small sample sizes, the best model achieved an accuracy of 86.17%. CONCLUSION: In addition to showing that using computer vision to determine human activities is possible, albeit more difficult than determining the layout of inanimate objects such as classroom desks, the study shows the differences between self-supervised learning techniques and data augmentation techniques as ways to overcome the problem of small training datasets.
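The transfer-learning idea named under METHODS can be illustrated with a minimal sketch. It is not the study's model: the "backbone" here is a fixed random projection standing in for a pretrained vision network, the toy "images" are short synthetic vectors, and every name and size (`BACKBONE`, `extract_features`, `train_head`, `INPUT_DIM`, `FEATURE_DIM`) is a hypothetical chosen for illustration. Only the four-category output comes from the abstract. The point it shows is the division of labour: the reused feature extractor stays frozen, and only a small new classification head is fitted to the small labeled dataset.

```python
import math
import random

NUM_CLASSES = 4   # the abstract reports a four-category layout classifier
INPUT_DIM = 6     # hypothetical size of a flattened toy "image"
FEATURE_DIM = 8   # hypothetical size of the backbone's feature vector

# Stand-in for a pretrained backbone (assumption): fixed weights, never updated.
_rng = random.Random(0)
BACKBONE = [[_rng.uniform(-1.0, 1.0) for _ in range(INPUT_DIM)]
            for _ in range(FEATURE_DIM)]
backbone_before = [row[:] for row in BACKBONE]  # snapshot to confirm it stays frozen


def extract_features(x):
    """Frozen feature extractor: tanh of a fixed projection, plus a bias input."""
    feats = [math.tanh(sum(w * v for w, v in zip(row, x))) for row in BACKBONE]
    return feats + [1.0]


def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_probs(head, f):
    return softmax([sum(w * v for w, v in zip(row, f)) for row in head])


def mean_loss(head, feats, labels):
    """Average cross-entropy of the head on precomputed features."""
    return -sum(math.log(predict_probs(head, f)[y])
                for f, y in zip(feats, labels)) / len(labels)


def train_head(feats, labels, epochs=200, lr=0.1):
    """Fit only the new softmax head with full-batch gradient descent."""
    head = [[0.0] * (FEATURE_DIM + 1) for _ in range(NUM_CLASSES)]
    n = len(labels)
    for _ in range(epochs):
        grad = [[0.0] * (FEATURE_DIM + 1) for _ in range(NUM_CLASSES)]
        for f, y in zip(feats, labels):
            probs = predict_probs(head, f)
            for c in range(NUM_CLASSES):
                err = probs[c] - (1.0 if c == y else 0.0)
                for j, v in enumerate(f):
                    grad[c][j] += err * v / n
        for c in range(NUM_CLASSES):
            for j in range(FEATURE_DIM + 1):
                head[c][j] -= lr * grad[c][j]
    return head


# Small synthetic labeled set: five noisy samples around one prototype per class.
data_rng = random.Random(1)
prototypes = [[data_rng.uniform(-1.0, 1.0) for _ in range(INPUT_DIM)]
              for _ in range(NUM_CLASSES)]
images, labels = [], []
for label, proto in enumerate(prototypes):
    for _ in range(5):
        images.append([p + data_rng.gauss(0.0, 0.05) for p in proto])
        labels.append(label)

feats = [extract_features(x) for x in images]  # backbone is applied, not trained
untrained_head = [[0.0] * (FEATURE_DIM + 1) for _ in range(NUM_CLASSES)]
initial_loss = mean_loss(untrained_head, feats, labels)
head = train_head(feats, labels)
final_loss = mean_loss(head, feats, labels)
```

In a real pipeline the stand-in backbone would be replaced by a network pretrained on a large image corpus, loaded through a deep learning framework, with its final layer swapped for a four-way classifier.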