Knowing the behavioral patterns of city residents is of great value in formulating and adjusting urban planning strategies, such as road planning, commercial development, and pedestrian flow control. Because cell phones are now nearly ubiquitous, the behavior of city residents can be inferred indirectly from users' call records. However, the behavior of large-scale user populations over long periods tends to be widely dispersed, so patterns are hard to discover and hard to explain. In this paper, we design and implement a human behavior pattern analysis system for massive mobile communication data, built on sequential data modeling and visual analysis techniques. Because residents' behavioral patterns are difficult to capture directly from call records, we construct base station trajectories from users' cell phone call records and mine users' long-term trajectories for latent behavioral patterns. Since users with similar activity characteristics exhibit similar base station trajectories, we exploit the analogy between text sequences and base station trajectory sequences and apply the word embedding method from natural language processing to build a Cell2vec model that identifies the semantics of base stations in a city. To obtain group behavior patterns from the base station trajectories of many users, we propose a user clustering method based on users' regional mobility preferences and project the results with the t-distributed Stochastic Neighbor Embedding (t-SNE) algorithm to expose the clustering structure of large-scale cell phone users in a low-dimensional space.
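As a rough illustration of the trajectory-as-sentence analogy described above, the following minimal sketch orders each user's call records by time to form a base station trajectory and then extracts (center, context) station pairs of the kind a skip-gram word embedding model is trained on. The record layout, the station identifiers, the function names, and the window size are assumptions made for illustration; this is not the paper's Cell2vec implementation.

```python
from collections import defaultdict

# Hypothetical call records: (user_id, timestamp, base_station_id).
records = [
    ("u1", 3, "C"), ("u1", 1, "A"), ("u1", 2, "B"),
    ("u2", 1, "A"), ("u2", 2, "B"),
]

def build_trajectories(records):
    """Group records by user and order each user's base stations by time."""
    by_user = defaultdict(list)
    for user, ts, station in records:
        by_user[user].append((ts, station))
    return {u: [s for _, s in sorted(v)] for u, v in by_user.items()}

def skipgram_pairs(trajectory, window=1):
    """Return (center, context) station pairs within a sliding window."""
    pairs = []
    for i, center in enumerate(trajectory):
        lo = max(0, i - window)
        hi = min(len(trajectory), i + window + 1)
        pairs.extend((center, trajectory[j]) for j in range(lo, hi) if j != i)
    return pairs

trajectories = build_trajectories(records)
training_pairs = skipgram_pairs(trajectories["u1"], window=1)
```

Pairs like these would then be fed to an embedding model so that base stations appearing in similar trajectory contexts receive similar vectors.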
To address the difficulty of interpreting user behavior patterns, we design a visual analysis model that conveys both group and regional semantics and is organized around the spatial and temporal characteristics of user behavior. Within it, the clustering view uses the distance between scatter points to encode the similarity between users, which helps analysts explore the behavioral characteristics of user groups.
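One way to picture the user similarity that the scatter distances encode is sketched below: each user is summarized by the share of visits to each region type, and two users are compared by cosine distance. The region labels, the function names, and the choice of cosine distance are assumptions made for illustration; the abstract does not specify how preferences are represented or compared.

```python
import math
from collections import Counter

# Hypothetical region types assigned to base stations.
REGION_TYPES = ["residential", "business", "leisure"]

def preference_vector(visited_types):
    """Share of a user's visits that fall in each region type."""
    counts = Counter(visited_types)
    total = sum(counts[t] for t in REGION_TYPES)
    if total == 0:
        return [0.0] * len(REGION_TYPES)
    return [counts[t] / total for t in REGION_TYPES]

def cosine_distance(a, b):
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        return 1.0
    return 1.0 - dot / norm

commuter = preference_vector(["residential", "business", "residential", "leisure"])
homebody = preference_vector(["residential", "residential"])
distance = cosine_distance(commuter, homebody)
```

Users with a small distance under such a measure would be expected to land close together after a low-dimensional projection, which is the property the clustering view relies on.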