Abstract

Research in Natural Language Processing (NLP) has surged in recent years, particularly in the
areas of text summarization and machine translation. Evaluation metrics such as ROUGE and BLEU, which
rely on N-gram matching, are widely used to assess the quality of generated text. However, these metrics
often struggle when applied to data sourced from the internet, such as social media posts, because of the
prevalence of phonological errors: exact N-gram matching counts a misspelled word as a mismatch. This study
identifies the sources and frequency of phonological errors and addresses the question of whether evaluation
metrics should take them into account. Data from Twitter, a platform known for such errors, was collected
and studied alongside existing literature on the subject. The study proposes enhancing existing metrics by
integrating edit distance algorithms such as Levenshtein or Damerau-Levenshtein. By accounting for
phonological errors during evaluation, this approach aims to improve the accuracy and reliability of
evaluation in NLP tasks such as machine translation. The ultimate goal is to contribute to more sensitive
and reliable evaluation metrics in these fields.
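
As a rough illustration of the proposed direction, the sketch below shows one way an edit distance could be folded into an N-gram style metric. It is not the authors' implementation: the abstract does not specify how the distance is integrated, so the function names (`tokens_match`, `fuzzy_unigram_precision`), the greedy matching strategy, and the `max_ratio` threshold are all assumptions made for this example. It covers only a clipped unigram precision, not full BLEU or ROUGE, and uses plain Levenshtein distance rather than Damerau-Levenshtein.

```python
def levenshtein(a: str, b: str) -> int:
    """Levenshtein distance via dynamic programming (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def tokens_match(cand: str, ref: str, max_ratio: float) -> bool:
    """Hypothetical rule: two tokens match if their length-normalized
    edit distance does not exceed max_ratio (an assumed threshold)."""
    if cand == ref:
        return True
    return levenshtein(cand, ref) / max(len(cand), len(ref)) <= max_ratio


def fuzzy_unigram_precision(candidate: list[str], reference: list[str],
                            max_ratio: float = 0.34) -> float:
    """Clipped unigram precision in which near-identical tokens also count.
    Each reference token can be consumed by at most one candidate token.
    Matching is greedy, which is a simplification chosen for this sketch."""
    if not candidate:
        return 0.0
    remaining = list(reference)
    matched = 0
    for tok in candidate:
        if tok in remaining:  # prefer an exact match when one exists
            remaining.remove(tok)
            matched += 1
            continue
        for ref_tok in remaining:
            if tokens_match(tok, ref_tok, max_ratio):
                remaining.remove(ref_tok)
                matched += 1
                break
    return matched / len(candidate)


reference = "see you tomorrow at the station".split()
candidate = "see u tomorow at the station".split()

exact = fuzzy_unigram_precision(candidate, reference, max_ratio=0.0)
fuzzy = fuzzy_unigram_precision(candidate, reference, max_ratio=0.34)
print(f"exact: {exact:.3f}  fuzzy: {fuzzy:.3f}")
```

With the threshold set to 0.0 the function reduces to exact unigram matching, so "tomorow" is penalized as a completely different word. With the assumed threshold of 0.34 it is accepted as a match for "tomorrow", while "u" is still rejected for "you" because two of three characters differ. This shows why any such threshold would need to be tuned against real data before the score could be trusted.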

