Abstract

Hadith is a collection of texts containing sayings of the Prophet Muhammad, which, along with accounts of his daily practice, constitute the second major source of legislation for Muslims after the Holy Koran. The Hadith collection comprises thousands of texts transmitted over the years by many narrators with varying degrees of credibility. Hadith scholars must assess the authenticity of each Hadith in order to classify it as Sahih (fully authentic and accepted) or Daif (rejected). Automatic Hadith classification has been addressed in the literature; however, the reported results vary and are not directly comparable, as no dataset has been made available for benchmarking. In addition, no previous work has utilised deep-learning (DL) approaches for Hadith classification. This work contributes by 1) collecting and publicly releasing a benchmark dataset of almost 4,000 Hadith texts to facilitate future research, 2) exploring DL model performance on binary Hadith classification tasks, and 3) benchmarking traditional machine learning models against DL models. Our best results were obtained with an ARBERT DL model, which achieved an accuracy of 91.56%.

Keywords

Hadith classification; deep learning; Classical Arabic; machine learning; Hadith science; Hadith authenticity

