Abstract

Stylometric authorship attribution is one of the essential approaches in text mining. The present research proposes a stylometric method called Stylometric Authorship Ranking Attribution (SARA) that addresses two common problems, processing time and prediction accuracy, without relying on the opinion of a domain expert. The method uses the attributes that have proven most effective in stylometric authorship prediction: frequent word bag counts, whether of single words, word pairs, or word trios. These attributes provide stronger evidence of an author's writing style for the proposed authorship recognition and prediction technique. The experiments show that the proposed method produces superior prediction accuracy and even yields a completely correct result at the final stage of the experimental tests, within the scope of the dataset.
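As an illustration of the frequent word bag counts mentioned in the abstract, the following minimal Python sketch counts the most frequent single words, word pairs, and word trios in a text. It is not the authors' implementation: the function names (`tokenize`, `frequent_ngrams`, `style_profile`), the regular-expression tokenizer, and the `top_k` cutoff are all assumptions made for this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and extract word tokens.

    Assumption: a simple regex tokenizer; the paper does not specify one.
    """
    return re.findall(r"[a-z']+", text.lower())


def frequent_ngrams(text, n, top_k=5):
    """Return the top_k most frequent word n-grams as (ngram, count) pairs."""
    tokens = tokenize(text)
    # Zip n shifted copies of the token list to form consecutive n-grams.
    ngrams = zip(*(tokens[i:] for i in range(n)))
    return Counter(ngrams).most_common(top_k)


def style_profile(text, top_k=5):
    """Collect frequent single words, pairs, and trios (hypothetical helper)."""
    return {n: frequent_ngrams(text, n, top_k) for n in (1, 2, 3)}


sample = "the cat sat on the mat and the cat sat on the rug"
profile = style_profile(sample, top_k=2)
for n, entries in profile.items():
    print(n, entries)
```

Profiles built this way for texts of known authorship could then be compared with the profile of a disputed text. How SARA ranks or weights such counts is not described in this excerpt, so that step is left out.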

Highlights

  • Data mining is the analysis of observational data sets to find unsuspected relationships and to summarize the data in novel ways that are both understandable and useful to the data owner [1]

  • Text preprocessing tasks, including data selection, classification, and feature extraction, normally convert the documents into intermediate forms suited to the particular mining purpose

  • The results showed that the Stylometric Authorship Balanced Attribution (SABA) method produces the best prediction accuracy and even yields a completely correct result during the final stage of the test [10]


Summary

Introduction

Data mining is the analysis of observational data sets to find unsuspected relationships and to summarize the data in novel ways that are both understandable and useful to the data owner [1]. Data mining requires structured data, whereas text mining aims to discover patterns in unstructured data [4]. Text mining has gained growing attention in recent years because of the large quantities of text data created by social network, web, and other information-centric applications. There is a pressing need to design techniques and algorithms that can effectively process a broad range of text applications [1]. Another major issue is the dependency on multilingual text refinement, which creates further problems. The upcoming sections of this research review the latest methods and approaches in one subfield of text mining, concerned with literary text corpora and the writing style of their authors, before presenting the details of the proposed method.

