Abstract
Mining textual data has become an essential task. In many text mining applications, side information is associated with the text documents. This side information includes document origin information, links present in the document, user-access behavior retrieved from web logs, and other kinds of non-textual attributes. Such side information may contain explanatory data that is beneficial for text clustering. Conventional text clustering methods perform quite well but do not consider these attributes. Because side information can carry meaningful data, incorporating it into the clustering process can improve the quality of the clusters. However, not all side information is important, so it must be incorporated carefully. This paper proposes an effective clustering technique that adds only the important side information to the clustering process and determines whether incorporating it improves cluster quality. The clustering technique is also extended to the classification problem.
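The selection step described above can be illustrated with a small sketch. The abstract does not say how the importance of a side attribute is measured, so the Gini-style concentration score, the function names `attribute_gini` and `select_side_attributes`, the `threshold` value, and the toy data below are all assumptions made for illustration, not the authors' method. The idea is that an attribute concentrated in one text-based cluster is likely to be informative, while one spread evenly across clusters is not.

```python
from collections import Counter, defaultdict


def attribute_gini(cluster_of, side_attrs):
    """Score each side attribute by how concentrated it is in one cluster.

    cluster_of: dict doc_id -> cluster label (e.g. from a text-only clustering pass)
    side_attrs: dict doc_id -> set of side attributes attached to that document
    Returns {attribute: sum of squared cluster fractions}; 1.0 means the
    attribute occurs in a single cluster only.
    """
    counts = defaultdict(Counter)  # attribute -> Counter of cluster labels
    for doc, attrs in side_attrs.items():
        for attr in attrs:
            counts[attr][cluster_of[doc]] += 1

    scores = {}
    for attr, per_cluster in counts.items():
        total = sum(per_cluster.values())
        scores[attr] = sum((n / total) ** 2 for n in per_cluster.values())
    return scores


def select_side_attributes(cluster_of, side_attrs, threshold=0.8):
    """Keep only attributes whose concentration score reaches the threshold.

    The threshold of 0.8 is an arbitrary illustrative choice.
    """
    scores = attribute_gini(cluster_of, side_attrs)
    return {attr for attr, score in scores.items() if score >= threshold}


# Hypothetical toy data: four documents, two text-based clusters.
cluster_of = {"d1": 0, "d2": 0, "d3": 1, "d4": 1}
side_attrs = {
    "d1": {"site:sports", "lang:en"},
    "d2": {"site:sports", "lang:en"},
    "d3": {"site:finance", "lang:en"},
    "d4": {"site:finance", "lang:en"},
}

selected = select_side_attributes(cluster_of, side_attrs)
print(sorted(selected))
```

In this toy data the two `site:` attributes each occur in a single cluster and are kept, while `lang:en` is split evenly across both clusters and is discarded. Whether adding the selected attributes actually improves cluster quality would still need to be checked against a quality measure, as the abstract notes.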