Information technology for verifying answers in an intelligent automated knowledge control system

K T Kuzma

doi:10.31649/1997-9266-2020-151-4-58-66

Abstract

This paper examines how testing systems verify answers to open-ended questions (short text answers, essays). It analyzes automated systems that implement this process and identifies the limitations of applying them to knowledge assessment in technical disciplines. It justifies the relevance of fuzzy string comparison methods for verifying answers submitted in free text form. The functional structure of the module that verifies such answers in an intelligent automated knowledge control system is proposed as a step-by-step structural diagram of how an input answer is checked. Computational algorithms are given for each processing stage and were implemented in C#. The first step normalizes the words of the answer and of the etalon (the correct answer to the question, stored in the database). The result of this step is two string arrays: the first holds the words of the answer, the second the words of the etalon (words shorter than four characters are not included). The second step repeatedly calls a function that finds the length of the longest common subsequence (LCS) between words of the answer array and words of the etalon array. A block diagram of the LCS calculation procedure, based on the recursive algorithm proposed by Hirschberg, is presented. Comparing each word of the input answer with all words of the etalon, including synonyms, makes it possible to find the LCS length even when the word order in the answer and the etalon differs, which is an advantage of the proposed approach. In the third step, the overall similarity indicator of the answer and the etalon is calculated as the sum of the LCS lengths of the individual words.
The last step formulates the verification result from the value of the similarity indicator (thresholds are set depending on requirements: high level of coincidence, 50 %; sufficient, 30 %; low, 10 %). The proposed information technology was tested on answers given in free text form, and the results were compared with Levenshtein distance and latent semantic analysis. The proposed technology gives the best result when checking answers that use synonymous words. For 50 answer variants of different lengths (from 10 to 200 characters), the share of false results was 4 %. Based on the test, the recommended length of the answer and the etalon was set at a maximum of 200 characters, which provides higher accuracy. Directions for future research are identified: increasing the efficiency of the algorithm by introducing an additional processing stage that determines the overall degree of similarity of the answer and the etalon based on the Jaccard coefficient; implementing the automated intelligent knowledge control system on client-server technology; and mapping the test result onto a relative scale for assessing the level of knowledge.
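
The four steps described in the abstract can be illustrated with a minimal Python sketch. It is not the author's C# implementation, and several details are assumptions made only for illustration: normalization is reduced to lowercasing and stripping punctuation; the LCS length is computed with a plain row-by-row dynamic-programming routine rather than Hirschberg's recursive algorithm; the similarity indicator is converted to a percentage by dividing the summed LCS lengths by the larger of the two total word lengths; synonym handling is omitted; and the names `normalize`, `lcs_length`, `similarity`, and `verdict`, as well as the "rejected" label below 10 %, are hypothetical.

```python
import re


def normalize(text, min_len=4):
    """Step 1 (assumed form): lowercase, split into words, drop short words."""
    words = re.findall(r"\w+", text.lower())
    return [w for w in words if len(w) >= min_len]


def lcs_length(a, b):
    """Step 2: length of the longest common subsequence of two words.

    Simple dynamic programming that keeps one row at a time; this is a
    stand-in for the Hirschberg-based procedure named in the abstract.
    """
    prev = [0] * (len(b) + 1)
    for ch in a:
        curr = [0]
        for j, other in enumerate(b, start=1):
            if ch == other:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def similarity(answer, etalon):
    """Step 3: sum of per-word LCS lengths, scaled to a percentage.

    Each answer word is compared with every etalon word, so word order
    does not matter. The scaling denominator is an assumption.
    """
    answer_words = normalize(answer)
    etalon_words = normalize(etalon)
    if not answer_words or not etalon_words:
        return 0.0
    total = sum(
        max(lcs_length(word, ref) for ref in etalon_words)
        for word in answer_words
    )
    answer_chars = sum(len(w) for w in answer_words)
    etalon_chars = sum(len(w) for w in etalon_words)
    return 100.0 * total / max(answer_chars, etalon_chars)


def verdict(score):
    """Step 4: map the similarity percentage to the levels in the abstract."""
    if score >= 50:
        return "high"
    if score >= 30:
        return "sufficient"
    if score >= 10:
        return "low"
    return "rejected"  # label for scores below 10 % is an assumption


if __name__ == "__main__":
    answer = "The stack is a LIFO data structure"
    etalon = "Stack is a data structure of LIFO type"
    print(verdict(similarity(answer, etalon)))  # prints "high"
```

Because every answer word is matched against all etalon words, the example answer still scores in the high band even though its word order differs from the etalon.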
