Abstract

The article examines a system in which a large number of short texts are generated. In this system, participants create strategic planning documents in which key performance indicators are defined. The formulations of these indicators form a data set of short texts. Within the system, there is a pressing need to build and keep up to date a classifier based on this set. A solution to this problem is presented using the fuzzy interactive clustering method. The method allows an expert to cluster sets of short texts by giving feedback on the results of each step of interactive clustering. The feedback collection procedure does not require the expert to have any special knowledge of how the neural network works; feedback is collected in the form of human-readable feedback matrices. This approach has advantages over clustering methods that require tuning algorithm metaparameters not directly related to the clustering results. Another important advantage of the proposed method is the ability to cluster data sets from language domains that differ from the domain on which the language model was trained, thanks to the proposed method for extending the language model's vocabulary. This property makes the algorithm usable in narrowly specialized domains, as well as in domains for which a full-fledged text corpus for training a language model cannot be obtained.
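To make the idea of fuzzy memberships corrected by a human-readable feedback matrix more concrete, here is a minimal sketch. It is not the authors' algorithm: the membership formula is the standard fuzzy c-means one, while the text-by-cluster feedback matrix (+1 "belongs", -1 "does not belong", 0 "no opinion"), the `apply_feedback` rule, the `strength` parameter, and the 2-D toy vectors standing in for text embeddings are all assumptions made for illustration.

```python
import math


def fuzzy_memberships(points, centroids, m=2.0):
    """Standard fuzzy c-means membership of each point in each cluster."""
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        dists = [math.dist(p, c) for c in centroids]
        if any(d == 0.0 for d in dists):
            # The point coincides with a centroid: full membership there.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            memberships.append([h / sum(hits) for h in hits])
            continue
        memberships.append(
            [1.0 / sum((d_i / d_k) ** exponent for d_k in dists) for d_i in dists]
        )
    return memberships


def apply_feedback(memberships, feedback, strength=0.9):
    """Hypothetical update rule: scale memberships by expert marks, renormalise.

    feedback[i][k] is +1, -1 or 0; strength must stay below 1 so that
    every scaled value remains positive.
    """
    adjusted = []
    for u_row, f_row in zip(memberships, feedback):
        row = [u * (1.0 + strength * f) for u, f in zip(u_row, f_row)]
        total = sum(row)
        adjusted.append([v / total for v in row])
    return adjusted


# Toy stand-ins for embeddings of three short texts and two cluster centres.
points = [(0.0, 0.0), (1.0, 0.0), (0.5, 0.0)]
centroids = [(0.0, 0.0), (1.0, 0.0)]

u = fuzzy_memberships(points, centroids)

# The expert only comments on the ambiguous third text.
feedback = [[0, 0], [0, 0], [1, -1]]
adjusted = apply_feedback(u, feedback)

print(u[2])         # memberships of the third text before feedback
print(adjusted[2])  # memberships after the expert's marks
```

In this sketch the expert only fills in a small table of marks and never touches the parameters of the model, which is the property the abstract emphasises. A real system would re-estimate the clusters after each round of feedback rather than only rescaling memberships.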
