Abstract

This paper details the development of a new evaluation framework for a text-based Conversational Agent (CA). A CA is an intelligent system that handles spoken and/or text-based conversations between a machine and a human. In general, the lack of evaluation frameworks for CAs hinders their development. The purpose of evaluating any system is to verify its functionality and to guide its further development. A specific CA, ArabChat, was chosen to test the proposed framework. ArabChat is a rule-based CA that uses a pattern-matching technique to handle users' Arabic text-based conversations. The evaluation framework proposed and developed in this paper is independent of the natural language used. It is based on the exchange of specific information, called "Information Requirements", between ArabChat and the user. These information items are tagged for each rule in the applied domain and should appear in the user's utterance (conversation). In a real-world experiment, ArabChat was deployed at the Applied Science University in Jordan as an information point advisor for native Arabic-speaking students, first to evaluate ArabChat itself and then to evaluate the proposed evaluation framework.
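The "Information Requirements" idea described in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that each rule carries a tuple of required terms and that a requirement counts as present when it occurs as a substring of the utterance. The names `Rule`, `missing_requirements`, and `rule_fits_utterance` are hypothetical and are not taken from the paper, whose actual matching and scoring procedure may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: a name plus the information requirements tagged on it."""
    name: str
    information_requirements: tuple[str, ...]


def missing_requirements(rule: Rule, utterance: str) -> list[str]:
    """Return the tagged requirements that do not occur in the utterance.

    Assumption: a plain, case-insensitive substring test stands in for
    whatever matching the real framework performs.
    """
    text = utterance.casefold()
    return [req for req in rule.information_requirements
            if req.casefold() not in text]


def rule_fits_utterance(rule: Rule, utterance: str) -> bool:
    """True when every tagged requirement is present in the utterance."""
    return not missing_requirements(rule, utterance)


if __name__ == "__main__":
    # Illustrative rule and utterances; not data from the experiment.
    fees_rule = Rule("registration_fees", ("fees", "registration"))
    print(missing_requirements(fees_rule, "What are the registration fees?"))
    print(missing_requirements(fees_rule, "When does registration open?"))
```

Because the check operates on plain Unicode strings, the same sketch would apply to Arabic utterances without modification, which is in keeping with the framework's stated language independence.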

Highlights

  • Several terms are used for a system that can handle user conversations, such as Conversational Agent (CA), dialog system, and chatterbot

  • Unserious users are those who merely try to trick ArabChat, say something funny, or use impolite words in their utterances. Manual checking showed that 8.267% of users' conversations fell into the second category, which might reveal the existence of unserious users who negatively affected the evaluation result

  • CA conversations vary among users even for closed applied domains


Summary

A General Evaluation Framework for Text Based Conversational Agent

Abstract: This paper details the development of a new evaluation framework for a text-based Conversational Agent (CA). The lack of evaluation frameworks for CAs hinders their development. ArabChat is a rule-based CA that uses a pattern-matching technique to handle users' Arabic text-based conversations. The evaluation framework proposed and developed in this paper is independent of the natural language used. It is based on the exchange of specific information, called "Information Requirements", between ArabChat and the user. These information items are tagged for each rule in the applied domain and should appear in the user's utterance (conversation). In a real-world experiment, ArabChat was deployed at the Applied Science University in Jordan as an information point advisor for native Arabic-speaking students, to evaluate both ArabChat and the proposed evaluation framework.

INTRODUCTION
THE SELECTED CASE STUDY ARABCHAT
THE “ARABCHAT EVALUATOR”
The “Information Requirements”
The “ArabChat Evaluator” framework
EVALUATION ASPECTS RESULTS Evaluation result
THE EVALUATION
Evaluation results and discussion
SUMMARY
