Abstract

Linguistic analysis grounded in the accumulated experience of computational linguistics simplifies the processing of large volumes of text and opens up new opportunities for automating document processing. However, finding suitable tools, adapting them to Russian-language texts, and integrating them with one another is difficult, which hinders their use both in research and in industrial systems. We present TAWT, an open-source Java framework that provides convenient tools and data structures for the main stages of text analysis and meets modern requirements for performance, reliability, project build tooling, and so on. Examples of automating several technical documentation preparation tasks demonstrate the use of the framework. TAWT can be useful to developers of research tools or applied software who want to implement new functions or improve the quality of text processing, as well as to developers of automated tools that reduce routine work with documentation.
