Abstract

This paper presents a heuristic approach to refining a corpus using regular expressions and discusses its possible applications in the field of opinion mining. A corpus (plural: corpora) is a collection of linguistic data. The proposed work is based on a corpus of reviews, specifically product reviews, which are stored as HTML files readily available from popular review sites such as Cnet.com. The revolution in information technology has opened a new era in the development of the language industries. Versatile technological development, together with the availability of translations in different languages, has led to the use of such corpora in specific machine learning methods as well as in various automatic translation applications. The prime objective of researchers and naive users alike is a rapidly developed machine learning system that is both exact and effective. Creating an exact dataset is often very tedious, however, because accurate corpora suited to a given research task are scarce. We therefore propose an algorithm for creating a corpus for the opinion mining research field.
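As a rough illustration of what regular-expression-based refinement of an HTML review file might look like, the following is a minimal sketch and not the algorithm proposed in the paper. The function name `refine_review_html`, the specific patterns, and the sample markup are all assumptions made for this example; the abstract does not state which expressions are actually used.

```python
import re
from html import unescape

# Hypothetical patterns: the abstract does not list the paper's actual expressions.
SCRIPT_STYLE = re.compile(r"<(script|style)\b[^>]*>.*?</\1\s*>", re.IGNORECASE | re.DOTALL)
COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)
TAG = re.compile(r"<[^>]+>")
WHITESPACE = re.compile(r"\s+")


def refine_review_html(html_text: str) -> str:
    """Reduce a review page to plain text using regular expressions only."""
    text = SCRIPT_STYLE.sub(" ", html_text)  # drop script/style blocks with their contents
    text = COMMENT.sub(" ", text)            # drop HTML comments
    text = TAG.sub(" ", text)                # drop remaining tags, keep their inner text
    text = unescape(text)                    # decode entities such as &amp;
    return WHITESPACE.sub(" ", text).strip() # collapse runs of whitespace


# Invented sample markup, standing in for a downloaded review page.
sample = (
    "<html><head><style>p{color:red}</style></head><body>"
    "<p>Great camera &amp; lens, <b>poor</b> battery.</p>"
    "<!-- ad --><script>track()</script></body></html>"
)
print(refine_review_html(sample))
```

Regular expressions are brittle on malformed or deeply nested markup, so a sketch like this suits a heuristic first pass over review pages rather than general HTML parsing.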
