Abstract

To address the problem of information support for the synthesis of new technical solutions, a method for extracting structured data from a corpus of Russian-language patents is presented. Information support here means the key features of an invention, such as the structural elements of the technical object and the relationships between them. The data source is the main claim of a device patent. The unit of extraction is the Subject-Action-Object (SAO) semantic structure, which describes the structural elements semantically. The extraction method is based on shallow parsing and claim segmentation, and takes the conventions of patent writing into account. The excessive length of claim sentences and the specificity of patent language often make it difficult to use off-the-shelf data extraction tools efficiently. The full processing pipeline comprises four steps: segmentation of the claim sentences; extraction of primary SAO structures; construction of a graph of the structural elements of the invention; and integration of the data into the domain ontology. This article deals with the first two steps. Segmentation is carried out according to a number of heuristic rules, and several natural language processing tools are used to reduce analysis errors. The primary SAO elements are extracted taking into account the valences of a predefined semantic group of verbs, as well as information about the type of the segment being processed. The result of the work is an organized domain ontology, which can be used to find alternative designs for nodes in a technical object. The second part of the article covers an algorithm for constructing the graph of structural elements of an individual technical object, an assessment of the effectiveness of the system, the organization of the ontology, and the results.
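
The first two steps, claim segmentation and primary SAO extraction, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not list the actual heuristic rules or the verb group, so the delimiters in `SEGMENT_DELIMITERS`, the phrases in `CONNECTION_VERBS`, and the names `SAO`, `segment_claim` and `extract_sao` are all assumptions made for illustration. English text stands in for the Russian claims, and plain string matching stands in for shallow parsing.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SAO:
    """One Subject-Action-Object triple extracted from a claim segment."""
    subject: str
    action: str
    obj: str


# Hypothetical segment boundaries; the real heuristic rules are not given
# in the abstract.
SEGMENT_DELIMITERS = re.compile(
    r"\s*(?:;|,\s*wherein|,\s*characterized in that)\s*"
)

# Hypothetical stand-in for the "predefined semantic group of verbs".
CONNECTION_VERBS = ("is connected to", "is mounted on", "contains")


def segment_claim(claim: str) -> List[str]:
    """Split one long claim sentence into shorter segments."""
    parts = SEGMENT_DELIMITERS.split(claim.strip().rstrip("."))
    return [p.strip() for p in parts if p.strip()]


def extract_sao(segment: str) -> Optional[SAO]:
    """Return the first SAO triple found in a segment, or None."""
    for verb in CONNECTION_VERBS:
        subject, sep, obj = segment.partition(f" {verb} ")
        if sep:  # the verb phrase occurs in this segment
            return SAO(subject.strip(), verb, obj.strip())
    return None


claim = (
    "The device contains a housing; the shaft is mounted on the housing, "
    "wherein the rotor is connected to the shaft."
)
segments = segment_claim(claim)
triples = [extract_sao(s) for s in segments]
```

Here each segment yields at most one triple. A system of the kind described would instead rely on verb valences and on the segment type to decide which noun phrases fill the subject and object slots.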
