Abstract

Background/Objectives: Authorship attribution is a text classification task. It identifies the author of a given set of texts based on the author's writing style.
Methods/Statistical Analysis: Various methods, including Decision Tree, K-Nearest Neighbor, Naive Bayes and Support Vector Machine, have been used to find text patterns in a text database. The classification of text patterns is deterministic, whereas attributing authorship to a text is non-deterministic. This paper presents a method that recognizes text patterns from authorship features through successive processing phases: preprocessing, feature extraction, feature selection, classification, and finally identification of the author.
Findings: Authorship attribution can be applied to a range of tasks such as scientific analysis, plagiarism detection and authorship recognition. The area has been explored for more than a century, but the results achieved so far have been unsatisfactory. A range of challenges has been identified, including data collection, tokenization of the content, application of natural language tools, suitability of classification methods, and identification of the features that discriminate one writer from others. From the existing analysis, it can be concluded that the reported results are specific to the individual setting studied and may not carry over to other authorship attribution problems. From the results obtained, the word unigram feature achieved the best performance compared with all other features, for all classifiers. Among the classifiers, Support Vector Machine achieved the best result when compared with the Decision Tree, K-Nearest Neighbor and Naive Bayes classifiers.
Application/Improvements: The authorship attribution method is applied to determine the authorship of text in a vernacular language, in this case Telugu.
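
The phases named in the abstract (preprocessing, feature extraction, feature selection, classification) can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes word unigram counts as features, a simple "keep the most frequent words" rule for feature selection, and a multinomial Naive Bayes classifier (one of the classifiers the abstract compares) in place of the Support Vector Machine, so that the example needs only the Python standard library. The function names, the `top_k` parameter and the toy documents are hypothetical.

```python
import math
import string
from collections import Counter, defaultdict

# ASCII punctuation plus the danda marks used in Indic scripts (assumption).
PUNCT = string.punctuation + "\u0964\u0965"

def tokenize(text):
    """Preprocessing: lowercase, split on whitespace, strip punctuation."""
    tokens = (tok.strip(PUNCT) for tok in text.lower().split())
    return [tok for tok in tokens if tok]

def select_vocabulary(docs, top_k):
    """Feature selection: keep the top_k most frequent word unigrams."""
    counts = Counter(tok for doc in docs for tok in tokenize(doc))
    return {word for word, _ in counts.most_common(top_k)}

def train(docs, authors, top_k=1000):
    """Feature extraction: per-author unigram counts over the vocabulary."""
    vocab = select_vocabulary(docs, top_k)
    word_counts = defaultdict(Counter)
    doc_counts = Counter(authors)
    for doc, author in zip(docs, authors):
        word_counts[author].update(t for t in tokenize(doc) if t in vocab)
    return vocab, word_counts, doc_counts

def predict(model, text):
    """Classification: multinomial Naive Bayes with add-one smoothing."""
    vocab, word_counts, doc_counts = model
    tokens = [t for t in tokenize(text) if t in vocab]
    total_docs = sum(doc_counts.values())
    best_author, best_score = None, -math.inf
    for author in sorted(doc_counts):
        total = sum(word_counts[author].values())
        score = math.log(doc_counts[author] / total_docs)
        for tok in tokens:
            score += math.log((word_counts[author][tok] + 1) / (total + len(vocab)))
        if score > best_score:
            best_author, best_score = author, score
    return best_author

# Toy, made-up training data for illustration only.
docs = [
    "the river flows quietly past the village",
    "the quiet river and the old village road",
    "markets crashed as investors sold shares",
    "investors fear the markets will fall further",
]
authors = ["author_a", "author_a", "author_b", "author_b"]
model = train(docs, authors, top_k=50)
print(predict(model, "the village by the river"))
```

Whitespace tokenization is used here rather than a word-character regular expression because it never splits inside a word, which matters for scripts such as Telugu that attach combining vowel signs to consonants. A real system would use a far larger corpus and could substitute any of the other classifiers in the comparison for the Naive Bayes step.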
