Abstract
Timely estimation of the distribution of socioeconomic attributes and their movement is crucial for academic, administrative, and marketing purposes. In this study, assuming that personal attributes affect human behavior and movement, we predict these attributes from location information. First, to test this conjecture, we predict the socioeconomic characteristics of individuals with supervised learning methods (logistic Lasso regression, Gaussian Naive Bayes, random forest, XGBoost, LightGBM, and support vector machine), using survey data we collected on personal attributes and the frequency of visits to specific facilities. We find that gender, a crucial attribute, can be predicted from locations about as accurately as existing studies predict it from other sources such as social networking services. Second, we apply the model trained on the survey data to actual GPS log data to check how our approach performs in a real-world setting. Although the approach does not perform as well as it does on the survey data, the results suggest that gender can be inferred from a GPS log.
Highlights
Recent technological developments in portable devices such as smartphones and car navigation systems enable us to use people’s location information for academic, administrative, and marketing purposes [12, 14].
The remainder of this paper proceeds as follows. In Sect. 2, we describe the two datasets used in this study: our survey data and actual GPS log data collected by other researchers.
The classifiers predicted gender with about 80% accuracy on average. XGBoost performed best, with an accuracy of 0.8463, an F score of 0.8498, and a ROC AUC of 0.9202, which is comparable to the accuracy reported in existing studies.
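To make the three reported metrics concrete, the following is a minimal sketch of how accuracy, F score, and ROC AUC can be computed for a binary gender classifier. It is not the authors' evaluation code. The toy labels and scores, the 0.5 decision threshold, the encoding of one gender as class 1, and the reading of "F score" as the F1 score are all assumptions made for illustration.

```python
# Hypothetical toy data: 1 and 0 encode the two gender classes (assumed encoding).
# `scores` stands in for a classifier's predicted probability of class 1.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]

# Assumed decision threshold of 0.5 to turn scores into hard predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]


def accuracy(truth, pred):
    """Share of samples whose predicted label matches the true label."""
    return sum(1 for t, p in zip(truth, pred) if t == p) / len(truth)


def f1_score(truth, pred):
    """Harmonic mean of precision and recall for class 1 (assumed F1)."""
    tp = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(truth, score):
    """Probability that a random class-1 sample outscores a random class-0
    sample, counting ties as one half (the rank-based definition of AUC)."""
    pos = [s for t, s in zip(truth, score) if t == 1]
    neg = [s for t, s in zip(truth, score) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


print("accuracy:", accuracy(y_true, y_pred))
print("F1:", f1_score(y_true, y_pred))
print("ROC AUC:", roc_auc(y_true, scores))
```

Accuracy and the F score depend on the chosen threshold, whereas ROC AUC is computed from the ranking of the scores alone. This is why a classifier can have a ROC AUC noticeably higher than its accuracy, as with the XGBoost figures above.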
Summary
Recent technological developments in portable devices such as smartphones and car navigation systems enable us to use people’s location information for academic, administrative, and marketing purposes [12, 14]. Border-control agencies of European countries use this kind of information to control immigrants and refugees; Germany and Denmark amended domestic laws to authorize their agencies to extract data from the cellphones of asylum seekers, and similar bills were proposed in Belgium and Austria. Information on the distribution of personal socioeconomic attributes such as gender, age, and education in a specific area is necessary for administrators to make suitable policies for their areas and for companies to determine the location of new stores or products. For anyone other than the companies that own such raw data, however, it is difficult to ascertain the distribution of personal attributes.