Incorporating Generative Artificial Intelligence (GenAI), especially Large Language Models (LLMs), into educational settings presents valuable opportunities to boost the efficiency of educators and enrich the learning experiences of students. Much of the current use of LLMs by educators has relied on conversational user interfaces (CUIs), such as chat windows, for functions like generating educational materials or offering feedback to learners. The ability to engage in real-time conversations with LLMs, which can enhance educators' domain knowledge across various subjects, has been of high value. However, reliance on CUIs also presents challenges to the widespread, ethical, and effective adoption of LLMs. Firstly, educators need a degree of expertise, including tool familiarity, AI literacy, and prompting skills, to use CUIs effectively, which can be a barrier to adoption. Secondly, the open-ended design of CUIs makes them exceptionally powerful, which raises ethical concerns, particularly when they are used for high-stakes decisions like grading. Additionally, there are risks related to privacy and intellectual property, stemming from the potential unauthorised sharing of sensitive information. Finally, CUIs are designed for short, synchronous interactions and often struggle and hallucinate when given complex, multi-step tasks (e.g., providing individual rubric-based feedback at scale). To address these challenges, we explored the benefits of moving away from using LLMs via CUIs and towards applications with user-friendly interfaces that leverage LLMs through API calls. We first propose a framework for pedagogically sound and ethically responsible incorporation of GenAI into educational tools, emphasizing a human-centred design.
We then illustrate the application of our framework to the design and implementation of a novel tool called Feedback Copilot, which enables instructors to provide students with personalized qualitative feedback on their assignments in classes of any size. An evaluation involving the generation of feedback from two distinct variations of the Feedback Copilot tool, using numerically graded assignments from 338 students, demonstrates the viability and effectiveness of our approach. Our findings have significant implications for GenAI application researchers, educators seeking to leverage accessible GenAI tools, and educational technologists aiming to transcend the limitations of conversational AI interfaces, thereby charting a course for the future of GenAI in education.