Abstract

Since its debut in The New York Times, Wordle, a five-letter word-guessing game, has quickly gained worldwide popularity in various languages. Statistics such as the number of hard-mode players and the distribution of scores have been collected from Twitter. In this work, we analyze these statistics with newly developed mathematical models, which reveal underlying correlations among attributes of the solution word, the number of players, the number of attempts, and so on. We refer to our models collectively as GUESS, a name derived from the models developed for analyzing Wordle. First, we propose an AutoregrESSive Integrated Moving Average (ARIMA) model, in which the date serves as the time index and the number of reported results as the time series. It yields a prediction interval of [8502.06, 14372.14] for the number of reported results on March 1, 2023. The model offers a reasonable explanation for the daily variation in the number of reported results and a reliable prediction interval for future values. Second, we build a Gradient Boosting Decision Tree (GBDT) regression model based on correlation analysis, taking the word attributes that correlate with the score percentages as independent variables. The data are shuffled to ensure that both the training and test sets contain various types of data. Third, we propose a ClUster Analysis Model based on K-means++ to classify the difficulty of solution words. Extensive cluster analysis shows that solution words with higher word frequency or initial-letter rank are easier to guess, whereas solution words with a higher word repetition rate are more difficult. Cross-validation tests show that our classification is highly accurate. Finally, we conduct a sensitivity analysis of our model, which reveals its robustness to parameters. In addition, we summarize the strengths and weaknesses of our approach. Our results are summarized in the conclusion.
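To make the clustering step concrete, the following is a minimal sketch of K-means++ seeding followed by standard Lloyd iterations, written in plain Python. It is not the authors' implementation. The two features (a scaled word frequency and a scaled repetition rate), the nine sample points, the choice of three clusters, and the function names `kmeans_pp_init` and `kmeans` are all assumptions made for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_init(points, k, rng):
    """K-means++ seeding: the first center is drawn uniformly; each later
    center is drawn with probability proportional to its squared distance
    from the nearest center chosen so far."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        weights = [min(sq_dist(p, c) for c in centers) for p in points]
        if sum(weights) == 0:
            # All points coincide with existing centers; fall back to uniform.
            centers.append(rng.choice(points))
        else:
            centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers


def kmeans(points, k, seed=0, max_iter=100):
    """Lloyd's algorithm started from K-means++ seeds.
    Returns (labels, centers)."""
    rng = random.Random(seed)
    centers = kmeans_pp_init(points, k, rng)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers


# Hypothetical, hand-made feature vectors: (word frequency, repetition rate),
# both scaled to [0, 1]. These are NOT data from the paper.
points = [
    (0.90, 0.00), (0.85, 0.05), (0.80, 0.00),
    (0.50, 0.20), (0.45, 0.25), (0.55, 0.20),
    (0.10, 0.60), (0.15, 0.55), (0.05, 0.60),
]

labels, centers = kmeans(points, k=3)
for point, label in zip(points, labels):
    print(point, "-> cluster", label)
```

The cluster indices are arbitrary, so mapping them to difficulty levels would need an extra step, for example ordering clusters by their mean word frequency. In practice a library implementation such as scikit-learn's `KMeans` with `init="k-means++"` would be the usual choice.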
