Abstract

[...] in data mining are increasing over time. Today's world runs on the internet, and nearly everything is available online, which also opens the door to criminal and malicious activity. Establishing the identity of the author of available content has therefore become a necessity. This content largely takes the form of text data. Authorship analysis is the statistical study of the linguistic and computational characteristics of documents written by individuals. This paper reviews various methods for authorship analysis and identification on a given set of texts. Research in authorship analysis and identification will likely continue, and even grow, over the coming decades. In this article, we present our vision of future authorship analysis and identification with high performance, along with a solution for behavioral feature extraction from a set of text documents. [...] adapted for the automatic analysis of the text. Source Code Author Profiles (SCAP) is a new, highly accurate approach to authorship identification for source code (programming code). Another section covers text categorization based on keywords that may appear on their own or as paired sequences such as computer+science or genetic+algorithm. In the discriminative syntactic tree approach, mining is performed directly on a given set of syntactic trees. The remainder of this paper covers the research work done and the challenges in feature extraction and classification.
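To make the SCAP idea concrete, the following is a minimal sketch, not the implementation reviewed in this paper. It assumes the commonly described form of the method: each author is represented by the set of their most frequent byte-level n-grams, and an unknown sample is attributed to the author whose profile shares the most n-grams with it. The function names, the n-gram length `n`, and the profile size `top_l` are illustrative choices, not values taken from the paper.

```python
from collections import Counter


def ngram_profile(text: str, n: int = 3, top_l: int = 1500) -> set[bytes]:
    """Return the top_l most frequent byte-level n-grams of text (assumed profile form)."""
    data = text.encode("utf-8")
    counts = Counter(data[i:i + n] for i in range(len(data) - n + 1))
    return {gram for gram, _ in counts.most_common(top_l)}


def profile_intersection(a: set[bytes], b: set[bytes]) -> int:
    """Similarity score: the number of n-grams the two profiles have in common."""
    return len(a & b)


def attribute(unknown: str, known_by_author: dict[str, str],
              n: int = 3, top_l: int = 1500) -> str:
    """Pick the author whose profile overlaps most with the unknown sample."""
    target = ngram_profile(unknown, n, top_l)
    scores = {
        author: profile_intersection(target, ngram_profile(source, n, top_l))
        for author, source in known_by_author.items()
    }
    return max(scores, key=scores.get)


# Hypothetical toy corpus: one concatenated code sample per author.
samples = {
    "alice": "for (int i = 0; i < n; i++) { total += values[i]; }\n" * 3,
    "bob": "def total(values):\n    return sum(v for v in values)\n" * 3,
}
guess = attribute("for (int j = 0; j < m; j++) { sum += items[j]; }", samples)
```

Because the profile is built from raw bytes, this kind of sketch needs no language-specific parser, which is one reason n-gram profiles are attractive for source code. In practice `n` and `top_l` would have to be tuned on held-out samples.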
