Abstract

Complex Word Identification (CWI) is a key component of lexical text simplification. This paper proposes a new approach to CWI on websites, based on tracking what web users copy to their clipboards. Users may copy words that they are unfamiliar with, or that make the text difficult to understand, in order to search for more information online. Accordingly, this study tests the hypothesis that word copying on a website is an indicator of word complexity. Copied words on a sample website are compared with uncopied words using three simple word complexity indicators: number of syllables, number of characters, and general word frequency. By all three indicators, copied words are more likely to be evaluated as complex than uncopied words, and words that are copied more frequently are more likely to be evaluated as complex than words that are copied less frequently. Consequently, word copying on a website can be considered a novel CWI indicator. Unlike traditional CWI indicators, which are based on static word features, this indicator targets complex words based on dynamic user behavior, so simplifying the words it identifies might be particularly helpful to readers. Further work should evaluate this word-copying indicator in complete CWI and text simplification implementations.
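
As a rough illustration of the comparison described above, the following sketch computes the three indicators (syllables, characters, general frequency) for a group of copied words and a group of uncopied words and averages them per group. It is not the paper's implementation: the vowel-group syllable heuristic, the toy frequency table `WORD_FREQ`, the word lists, and all function names are assumptions made for this example.

```python
import re
from statistics import mean

# Hypothetical frequency table (occurrences in some general corpus).
# These numbers are made up for illustration, not taken from the paper.
WORD_FREQ = {"the": 1000, "word": 120, "text": 90,
             "hypothesis": 4, "simplification": 1}


def count_syllables(word: str) -> int:
    """Rough heuristic: count runs of consecutive vowels (at least 1)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def indicators(word: str, freq_table: dict) -> dict:
    """Return the three simple complexity indicators for one word."""
    return {
        "syllables": count_syllables(word),
        "characters": len(word),
        "frequency": freq_table.get(word.lower(), 0),
    }


def mean_indicators(words: list, freq_table: dict) -> dict:
    """Average each indicator over a group of words."""
    rows = [indicators(w, freq_table) for w in words]
    return {key: mean(row[key] for row in rows) for key in rows[0]}


# Hypothetical groups: words users copied vs. words nobody copied.
copied = ["hypothesis", "simplification"]
uncopied = ["the", "word", "text"]

copied_stats = mean_indicators(copied, WORD_FREQ)
uncopied_stats = mean_indicators(uncopied, WORD_FREQ)
```

Under the hypothesis, the copied group would be expected to show more syllables, more characters, and a lower general frequency than the uncopied group. A real analysis would need an actual clipboard log, a proper frequency corpus, and a statistical test rather than a comparison of means.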
