Abstract

This article describes an approach to information support for the synthesis of new technical systems. The method consists of organizing an ontology of the subject area and populating it with data from a corpus of Russian-language patents. An algorithm is presented for constructing a graph of the structural elements of technical objects from previously extracted primary semantic SAO (Subject–Action–Object) structures. The extracted structures are preprocessed by detecting homogeneous sentence members and generating additional case forms, and the prepared SAO objects are then linked into a single graph. Linking proceeds by sequentially mapping the subject and object actants to a set of anchor points drawn from a common vocabulary of terms and then recording the relation (predicate) between the identified points. Data extraction by the system was evaluated: the F1 score is 63% under strict evaluation and 79% under non-strict evaluation, where the non-strict evaluation considers only whether the SAO root elements were extracted correctly. The extracted data is then converted into a subject domain ontology. The ontology schema covers the structural elements of technical objects and the relationships between them, as well as supporting information about the invention. The initial content of the ontology is based on processing 11,200 patent documents for inventions. The existing schema already supports retrieving useful information about alternative structural components and the connections between them, for example by finding all elements of a structure in a given invention or by tracing relationships. The results suggest that the proposed approach is promising. The authors see further research in improving the existing data extraction method and extending the ontology.
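
The linking step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the English sample triples, and the naive substring matching are all assumptions made for illustration, and the case-form generation and handling of homogeneous sentence members that the abstract mentions for Russian text are omitted.

```python
from collections import defaultdict


def anchor_terms(actant, vocabulary):
    """Hypothetical helper: return the vocabulary terms found in an actant phrase.

    Uses naive lowercase substring matching, which is an assumption of this sketch.
    """
    text = actant.lower()
    return [term for term in vocabulary if term in text]


def build_graph(sao_triples, vocabulary):
    """Link SAO triples into one graph keyed by (subject term, object term).

    Each edge stores the set of relations (predicates) seen between the two
    anchor points.
    """
    graph = defaultdict(set)
    for subject, action, obj in sao_triples:
        for s in anchor_terms(subject, vocabulary):
            for o in anchor_terms(obj, vocabulary):
                graph[(s, o)].add(action)
    return dict(graph)


# Illustrative data only; not taken from the patent corpus.
vocabulary = ["housing", "shaft", "bearing"]
triples = [
    ("the housing", "contains", "a shaft and a bearing"),
    ("the shaft", "rests on", "the bearing"),
]
graph = build_graph(triples, vocabulary)
```

In this sketch an object phrase naming two components yields two edges from the same subject, which loosely mirrors the idea of splitting homogeneous sentence members before linking.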
