Abstract

With the rise of the Internet and the artificial intelligence boom, natural language processing has advanced rapidly across many fields and industries, including the power industry. Intelligent document understanding, a subfield of natural language understanding, uses artificial intelligence to give machines the ability to understand natural language. It has long been a focus of both research and industry, and it is a core problem in intelligent semantic interaction. Data augmentation, a common way to expand a sample set, is an important technique in text and image processing. Its core purpose is to extract from limited data the value that would otherwise require much more data, and it is widely used across deep learning. This paper reviews the development of data augmentation and discusses five text augmentation approaches with their representative techniques: back translation, random word replacement, non-core word replacement, augmentation based on context information, and augmentation based on generative language models. It also analyzes the effectiveness of text augmentation from the perspectives of regularization, transfer learning, model robustness, and manifolds, with the goal of improving the accuracy and effectiveness of document understanding in the power industry.
