Abstract

Understanding the fine-grained urban function of different regions is essential for both city managers and residents, for example in strategy design, tourism recommendation, and business site selection. The large volume of mobile network data collected over the past several years makes fine-grained urban function identification possible, since it carries useful information about urban functions. However, challenges remain: (i) there is no prior knowledge about whether App usage patterns relate to urban functional regions; (ii) the collected data are very noisy, and data from different cellular towers have different noise levels. It is therefore difficult to extract unique patterns that identify urban functional regions. This article proposes a fine-grained urban functional region identification system that uses mobile App usage data from cellular towers. To address challenge (i), we first extract three key variables for each cellular tower: App number, user number, and traffic. Then, we design a Davies–Bouldin index (DBI)-based filtering method to automatically select the most distinguishable features for multiclass classification. To address challenge (ii), we first reduce cellular tower level noise with a clustering-based method that selects the most representative cellular tower data. The data from these cellular towers share similar patterns within the same urban functional region and show different patterns across different urban functional regions. Then, we reduce feature level noise with a Fourier transform-based method that reconstructs the features from several key frequency components, which preserves the most important information and removes unnecessary noise. We evaluate our system and the selected features with three representative supervised learning models, all of which achieve more than 95% classification accuracy.
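The DBI-based filtering step can be illustrated with a minimal sketch. It assumes that each candidate feature is scored by the Davies–Bouldin index computed over the labelled functional regions, and that a lower index marks a more distinguishable feature; the function names `davies_bouldin` and `rank_features`, the toy data, and the use of region labels as clusters are assumptions for illustration, not the article's implementation.

```python
from math import dist


def davies_bouldin(points, labels):
    """Davies-Bouldin index of labelled points (lower = better separated).

    Raises ZeroDivisionError if two group centroids coincide.
    """
    groups = {}
    for p, lab in zip(points, labels):
        groups.setdefault(lab, []).append(p)

    centroids, scatters = {}, {}
    for lab, pts in groups.items():
        dim = len(pts[0])
        c = tuple(sum(p[d] for p in pts) / len(pts) for d in range(dim))
        centroids[lab] = c
        # Scatter: mean distance of the group's points to its centroid.
        scatters[lab] = sum(dist(p, c) for p in pts) / len(pts)

    labs = list(groups)
    total = 0.0
    for i in labs:
        # Worst-case similarity of group i to any other group.
        total += max(
            (scatters[i] + scatters[j]) / dist(centroids[i], centroids[j])
            for j in labs
            if j != i
        )
    return total / len(labs)


def rank_features(samples, labels):
    """Return feature indices ordered from most to least distinguishable."""
    n_features = len(samples[0])
    scores = {
        f: davies_bouldin([(row[f],) for row in samples], labels)
        for f in range(n_features)
    }
    return sorted(scores, key=scores.get)


# Hypothetical toy data: feature 0 separates the two regions, feature 1 does not.
samples = [(1.0, 5.0), (1.2, 1.0), (9.0, 4.8), (9.2, 1.1)]
labels = ["residential", "residential", "office", "office"]
print(rank_features(samples, labels))
```

In this sketch a filter would keep the first few indices returned by `rank_features`; how many to keep is a tuning choice the abstract does not specify.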

Highlights

  • Understanding fine-grained urban function for different regions is essential for both city managers and residents in terms of strategy design, tourism recommendation, business site selection, etc

  • Challenges remain: (i) there is no prior knowledge about the existence of App usage patterns relating to urban functional regions; (ii) collected data are very noisy and data from different cellular towers have different noise levels. Therefore, it is difficult to extract unique patterns to identify urban functional regions. This article proposes a fine-grained urban functional region identification system, which utilizes mobile App usage data from cellular towers

  • To address challenge (ii), we first reduce cellular tower level noise by designing a clustering-based method to select the most representative cellular tower data. The data from these cellular towers share similar patterns for the same urban functional region and different patterns between different urban functional regions. Then, we reduce feature level noise by designing a Fourier transform-based method to reconstruct the features with several key frequency components, which preserves the most important information and removes the unnecessary noise
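The Fourier transform-based reconstruction mentioned in the last highlight can be sketched as follows. The sketch assumes that the "key frequency components" are the constant term plus the few frequencies with the largest magnitude; that selection rule, the function names, and the toy signal are assumptions for illustration rather than the article's method. A plain O(n²) DFT is used so the example needs only the standard library.

```python
import cmath
import math


def dft(x):
    """Discrete Fourier transform (direct O(n^2) evaluation)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse discrete Fourier transform."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def reconstruct(signal, n_components):
    """Rebuild a real signal from its mean and its strongest frequencies."""
    n = len(signal)
    spectrum = dft(signal)
    # Rank the positive frequencies by magnitude and keep the strongest ones.
    positive = range(1, n // 2 + 1)
    keep = sorted(positive, key=lambda k: abs(spectrum[k]), reverse=True)
    filtered = [0j] * n
    filtered[0] = spectrum[0]  # constant (mean) term
    for k in keep[:n_components]:
        filtered[k] = spectrum[k]
        filtered[-k] = spectrum[-k]  # mirrored bin keeps the output real
    return [v.real for v in idft(filtered)]


# Hypothetical hourly feature: a daily cycle plus a weaker fast oscillation.
hours = range(24)
daily = [10 + 5 * math.cos(2 * math.pi * t / 24) for t in hours]
noisy = [d + 0.5 * math.cos(2 * math.pi * 7 * t / 24) for d, t in zip(daily, hours)]
smooth = reconstruct(noisy, n_components=1)
print(max(abs(s - d) for s, d in zip(smooth, daily)))
```

With one component kept, the weaker oscillation is discarded and the reconstruction matches the daily cycle up to floating-point error.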

Summary

Data Preprocess and Visualization

To obtain region-level information that is useful for urban function identification, we first discretize the logs into small time chunks 1, 2, ..., N, where N is the number of time chunks. Then, within each chunk i, we aggregate three variables of the mobile usage logs from the same cellular tower c: the total number of unique Apps used (ac[i]), the total number of unique connected users (uc[i]), and the total amount of traffic consumed (fc[i]). These are the variables from which we extract features for urban function identification. The result is a set of mobile App fingerprint logs, each of which contains a cellular tower ID (c), a time chunk index (i), and the three values ac[i], uc[i], and fc[i].

[...] 7, where corrAij, corrUij, and corrFij denote the correlation coefficients of App number, user number, and traffic between two different days, respectively. The mobile App fingerprint on Monday is highly correlated with those on the other days for all three variables (App number, user number, and traffic). That does not affect the user number and App number too much, since a subscriber uses multiple Apps and an App is used by multiple subscribers. The similarity between days shows that the information for the entire week is highly redundant, so the Monday data can represent the whole week. This reduces the data dimension to one-seventh of the original.
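The aggregation into per-tower, per-chunk fingerprints and the day-to-day correlation check can be sketched as below. The log layout `(timestamp_s, tower_id, user_id, app_id, traffic_bytes)`, the function names, and the sample values are hypothetical; the Pearson coefficient is one common choice of correlation coefficient and is assumed here, since the summary does not name the one used.

```python
from collections import defaultdict


def build_fingerprints(logs, chunk_seconds):
    """Aggregate usage logs into (unique Apps, unique users, traffic) per (tower, chunk).

    logs: iterable of (timestamp_s, tower_id, user_id, app_id, traffic_bytes).
    Chunk indices start at 1, matching the 1, 2, ..., N convention in the text.
    """
    apps = defaultdict(set)
    users = defaultdict(set)
    traffic = defaultdict(float)
    for ts, tower, user, app, nbytes in logs:
        key = (tower, int(ts // chunk_seconds) + 1)
        apps[key].add(app)
        users[key].add(user)
        traffic[key] += nbytes
    return {key: (len(apps[key]), len(users[key]), traffic[key]) for key in traffic}


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical logs for one tower, aggregated into one-hour chunks.
logs = [
    (0, "c1", "u1", "a1", 100),
    (10, "c1", "u1", "a2", 50),
    (20, "c1", "u2", "a1", 25),
    (3700, "c1", "u1", "a1", 10),
]
fingerprints = build_fingerprints(logs, chunk_seconds=3600)
print(fingerprints[("c1", 1)])

# Comparing the same variable on two days: a value near 1 suggests redundancy.
monday_users = [12, 30, 45, 28]
tuesday_users = [14, 33, 47, 30]
print(pearson(monday_users, tuesday_users))
```

Applying `pearson` to each of the three variables for every pair of days would give one correlation value per variable and day pair, which is the kind of comparison the paragraph above relies on.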

Definition of Region and Functional Region
System Architecture
Pattern Identification
Representative Filtering
Raw Feature Extraction
Frequency
Experiment Setup
Offline Training of Machine
System Performance
Influence of Pattern Identification
Influence of Feature Selection
Computational Complexity Analysis
Findings
Related Work