Abstract

This paper introduces our publicly available datasets for tourism demand prediction, intended for future experiments and comparisons. Most previous work on tourism demand forecasting relies on coarse-grained analysis (at the level of countries or regions); very few studies, and consequently very few datasets, address fine-grained tourism analysis (at the level of attractions and points of interest). In this article, we present fine-grained enriched datasets for two types of attractions: (I) indoor attractions (27 museums and galleries in the U.K.) and (II) outdoor attractions (76 U.S. national parks). Each attraction is enriched with the official number of visits, social media reviews and environmental data. The complete analysis of prediction results, the methodology and models used, feature performance analysis, anomalies, etc. are available in our original paper, “Fine-grained tourism prediction: Impact of social and environmental features” [2].

Highlights

  • The outdoor dataset consists of climate, social media and official data for 76 national parks in the United States

  • This paper aims to introduce our publicly available datasets in the area of tourism demand prediction for future experiments and comparisons

  • We present our fine-grained enriched datasets for two types of attractions: (I) indoor attractions (27 museums and galleries in the U.K.) and (II) outdoor attractions (76 U.S. national parks), enriched with the official number of visits, social media reviews and environmental data for each of them


Summary

Outdoor dataset

We downloaded the monthly total number of visitors for each national park from the U.S. National Park Service website (https://irma.nps.gov/Stats/) for the period January 1996 to February 2016. We treat this dataset as the ground truth for tourism analysis. We collected social media data from TripAdvisor (http://www.tripadvisor.com/), the largest travel website, with more than 570 million reviews and 455 million average monthly unique visitors. The monthly aggregated number of reviews and the average rating scores of reviewers were collected for the period January 2011 to September 2016.
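
Because the official visitor counts and the TripAdvisor reviews cover different periods, any joint analysis is limited to the months present in both. The following is a minimal sketch of that alignment, assuming each monthly series is held as a dictionary keyed by (year, month). The function names and this data layout are illustrative assumptions and do not describe the published file format.

```python
# Periods stated in the text, as ((start_year, start_month), (end_year, end_month)).
VISITS_PERIOD = ((1996, 1), (2016, 2))   # official monthly visitor counts
REVIEWS_PERIOD = ((2011, 1), (2016, 9))  # monthly TripAdvisor reviews and ratings


def month_range(start, end):
    """Return every (year, month) pair from start to end, inclusive."""
    months = []
    year, month = start
    while (year, month) <= end:
        months.append((year, month))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return months


def overlap(period_a, period_b):
    """Return the months covered by both periods (empty if they do not overlap)."""
    start = max(period_a[0], period_b[0])
    end = min(period_a[1], period_b[1])
    return month_range(start, end) if start <= end else []


def align(visits, reviews):
    """Join two {(year, month): value} series on the months present in both."""
    shared = sorted(visits.keys() & reviews.keys())
    return {m: (visits[m], reviews[m]) for m in shared}


common_months = overlap(VISITS_PERIOD, REVIEWS_PERIOD)
```

With the periods above, `common_months` runs from January 2011 to February 2016, which is 62 months. `align` can then be applied to one park's visitor and review series to obtain paired observations for those months.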

Indoor dataset
Dataset feature distributions
Dataset feature statistics
Findings
Conflict of Interest