Abstract

As the volume of available data keeps growing, manual processing becomes infeasible and gives way to machine learning models. Modern algorithms handle basic tasks well, provided that sufficient training data is available. However, many current tasks are considerably more complex and highly specialized, which limits the amount of available training data and hinders the performance of fully automatic systems. This paper presents an approach to automated fact extraction from collections of raw text documents that is adapted to the lack of training data. It discusses the integration of rule-based approaches for specific knowledge domains with generalized, domain-independent machine learning models pre-trained on large volumes of data. The proposed approach, based on the active learning methodology, seeks to reduce the expert labor required to efficiently generate templates of extractable facts without compromising the system’s performance. The paper also demonstrates the proposed fact extraction method on the task of finding target audience information in unstructured raw descriptions of online courses.
