In this age of Artificial Intelligence, disputes over the ownership of web content are increasing. In forensic studies and plagiarism cases, determining the author of a document plays a crucial role. The task of Authorship Attribution (AA) involves learning the style of writers from available documents in order to predict the authorship of unknown documents. Hence, capturing writing style is both essential and challenging. In this work, key features are systematically extracted, multiple deep-learning models are trained on various aspects of the text, and two novelties are proposed: Distilled and Fused Style Embedding (DFSE) and a multi-kernel ensemble model whose kernels are weighted based on Z-scores. The proposed model outperforms the baseline model and many contemporary models on in-genre and multi-topic datasets.