A Statistical Synthetic Population Calibration for Activity-Based Model with Incomplete Census Data

Abstract

A synthetic population generator is the core component of microsimulation in activity-based travel demand models. The synthetic population is typically used as the set of agents whose activity-travel decisions are simulated. Traditionally, household sample survey data are used to synthesize the population. The resulting estimates can be biased by factors such as small sample sizes and inaccurate household sample data. To address this issue, a statistical maximum-likelihood method is proposed that calibrates the synthetic population using roadside observations (link counts). The statistical performance of the proposed method is evaluated on an illustrative network and on a real network with census and household sample survey data. Multiday link counts are simulated from the (true) activity-based model parameters and synthetic population. Tests are carried out assuming different numbers of observations and different levels of observation variation. The results illustrate the efficiency of model calibration based on link counts and its potential for large and complex applications.
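
The idea of calibrating against link counts by maximum likelihood can be illustrated with a deliberately small sketch. It is not the paper's model: it assumes, purely for illustration, that each observed link count is Poisson distributed with mean `theta * base_flow`, where `base_flow` is the flow produced by an uncalibrated synthetic population and `theta` is a single hypothetical population scale factor. All names and numbers below are made up.

```python
import math

def poisson_log_likelihood(theta, counts, base_flows):
    """Poisson log-likelihood of multiday link counts for a scale factor theta.

    counts[d][l] is the observed count on link l on day d; base_flows[l] is the
    simulated flow on link l at theta = 1. Terms that do not depend on theta
    (the log-factorials) are dropped.
    """
    ll = 0.0
    for day in counts:
        for observed, flow in zip(day, base_flows):
            mean = theta * flow
            ll += observed * math.log(mean) - mean
    return ll

def mle_scale(counts, base_flows):
    """Closed-form maximiser under the Poisson assumption:
    total observed count divided by total simulated count at theta = 1."""
    total_observed = sum(sum(day) for day in counts)
    total_simulated = len(counts) * sum(base_flows)
    return total_observed / total_simulated

# Hypothetical data: three links observed on two days.
base_flows = [100.0, 250.0, 80.0]
counts = [[118, 305, 97], [122, 296, 94]]

theta_hat = mle_scale(counts, base_flows)
print(round(theta_hat, 6))
```

With more days of counts the estimate of `theta` tightens, which is the kind of effect the abstract's tests on the number of observations and observation variation are concerned with.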

Similar Papers
  • Research Article
  • Cite Count: 10
  • 10.1007/s12205-015-0691-7
A simulated annealing algorithm for the creation of synthetic population in activity-based travel demand model
  • Nov 25, 2015
  • KSCE Journal of Civil Engineering
  • Jooyoung Kim + 1 more

  • Research Article
  • Cite Count: 10
  • 10.1016/j.trb.2015.05.004
Statistical approach for activity-based model calibration based on plate scanning and traffic counts data
  • May 23, 2015
  • Transportation Research Part B: Methodological
  • Treerapot Siripirote + 3 more

  • Dissertation
  • Cite Count: 3
  • 10.14264/uql.2020.822
Population synthesis for travel demand modelling in Australian capital cities
  • Jun 8, 2020
  • Poh Ping Lim

Microsimulation analysis in travel demand modelling provides an important basis to investigate travel behaviour in a spatial context. The disaggregated nature of the model design is well suited to represent complex travel behaviour and simulate spatial interactions. An integral part of building a microsimulation travel demand model is to obtain a comprehensive set of spatial microdata for the entire model region in small geographies. Data at this level of detail have been collected in Australian censuses. However, detailed geocoded microdata is generally restricted in access due to privacy reasons. Population synthesis techniques have been developed as viable alternatives to supplement the lack of completeness in spatial microdata for microsimulation analysis. These techniques are used to generate synthetic microdata that are statistically sound enough for microsimulation while preserving the privacy of the actual population. Several synthetic population generators have been developed for use in microsimulation travel demand models in recent decades. In Australia, the use of population synthesis techniques to create synthetic populations remains a challenging and time-consuming task that often hinders the progress of further development in microsimulation travel demand models. The primary focus of this thesis is to establish a reproducible population synthesis routine for creating synthetic microdata that can be readily fed into an activity-based model to simulate travel activity schedules at household and person level in Australian capital cities. The aim is to ease the process of preparing the necessary microdata for microsimulation and incentivise further development of microsimulation modelling in travel demand. The main approach adopted for the synthesis routine is based on the Iterative Proportional Updates (IPU) algorithm, a modified Iterative Proportional Fitting (IPF) procedure. 
IPU differs from the standard IPF procedure in that distributions at the household and person level are iteratively fitted simultaneously. The IPU procedure is solely based on a mathematical algorithm. Therefore, the accuracy of the generated synthetic population relies on the quality and integrity of the input data. In this thesis, two new heuristic procedures were formulated for data treatments before and after IPU using Australian census data. The procedure proposed for data treatment before the synthesis routine ensures the consistency of the input data, whereas the procedure proposed for data treatment after the synthesis routine extends under-synthesised estimates to a near-complete synthetic population. In this research study, complete sets of synthetic populations at the household and person level were generated for Greater Sydney, Greater Melbourne and Greater Brisbane in the smallest geographical units available. These generated datasets have been extensively validated and benchmarked against actual aggregate census data to evaluate their representativeness in small geographies. The performance statistics for all three cities have consistently displayed an excellent fit, with a high level of confidence in matching the synthesised to the actual data. Multiple experiments have been conducted to test the efficacy and robustness of the IPU algorithm. The treated post-synthesised estimates have also been validated and shown to further increase the accuracy of the synthesised estimates. A case analysis is presented to illustrate the application of the generated synthetic population in policy analysis. In a broader context, this thesis contributes by setting up an efficient and replicable population synthesis routine that can be included in a standard methodological toolbox for transport researchers and for mainstream social scientists to produce synthetic populations using Australian census data. 
The lack of implementation details or transparency in validations of existing population synthesis procedures often imposes the need to redevelop a new synthesis routine whenever a synthetic population is required for microsimulation analysis. This research intends to alleviate the cumbersome and costly process of building synthetic microdata from scratch by presenting a practical pathway to building synthetic populations for microsimulation analysis.
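
The thesis abstract above builds on Iterative Proportional Fitting. As a rough illustration of the basic idea only, the following sketch runs plain two-dimensional IPF on a small seed table. It is not IPU and not the thesis routine: the function name `ipf`, the seed table and the target margins are hypothetical, and the household/person simultaneous fitting that distinguishes IPU is not shown.

```python
def ipf(seed, row_targets, col_targets, iterations=100, tol=1e-9):
    """Plain two-dimensional Iterative Proportional Fitting.

    Alternately rescales rows and columns of a seed table until its margins
    match the target margins. Targets are assumed to share the same total.
    """
    table = [row[:] for row in seed]  # work on a copy of the seed
    for _ in range(iterations):
        # Row step: scale each row so that it sums to its target.
        for i, target in enumerate(row_targets):
            row_sum = sum(table[i])
            if row_sum > 0:
                table[i] = [value * target / row_sum for value in table[i]]
        # Column step: scale each column so that it sums to its target.
        for j, target in enumerate(col_targets):
            col_sum = sum(row[j] for row in table)
            if col_sum > 0:
                for row in table:
                    row[j] *= target / col_sum
        # Columns are exact after the column step, so check the rows.
        max_error = max(abs(sum(table[i]) - t) for i, t in enumerate(row_targets))
        if max_error < tol:
            break
    return table

# Hypothetical example: survey cross-tabulation (seed) and census margins.
seed = [[1.0, 2.0], [3.0, 4.0]]
fitted = ipf(seed, row_targets=[40.0, 60.0], col_targets=[35.0, 65.0])
for row in fitted:
    print([round(value, 3) for value in row])
```

The fitted table keeps the association pattern of the seed while matching both sets of margins, which is why the quality of the input data matters so much for the result.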

  • Book Chapter
  • Cite Count: 11
  • 10.5772/intechopen.93827
Recent Progress in Activity-Based Travel Demand Modeling: Rising Data and Applicability
  • Jul 28, 2021
  • Atousa Tajaddini + 3 more

Over 30 years have passed since activity-based travel demand models (ABMs) emerged to overcome the limitations of the preceding models, which had dominated the field for over 50 years. Activity-based models are valuable tools for transportation planning and analysis, detailing the tour and mode-restricted nature of household and individual travel choices. Nevertheless, no single approach has emerged as a dominant method, and research continues to improve ABM features to make them more accurate, robust, and practical. This paper describes the state of the art and practice, including ongoing ABM research covering both demand and supply considerations. Despite the substantial developments, ABM’s abilities in reflecting behavioral realism are still limited. Possible solutions to address this issue include increasing the accuracy of the primary data, improving the integrity of ABMs across days of the week, and tackling uncertainty by integrating demand and supply. Opportunities exist to test the feasibility of spatial transferability of ABMs to new geographical contexts, along with expanding the applicability of ABMs in transportation policy-making.

  • Conference Article
  • Cite Count: 4
  • 10.1109/jcsse.2016.7748838
Generating synthetic population at individual and household levels with aggregate data
  • Jul 1, 2016
  • Natthaporn Watthanasutthi + 1 more

Population synthesis is a process that creates data records of individual persons and households with associated attributes that closely resemble the real population. It is the basis of microsimulation models for various applications such as urban planning, crime modeling and epidemiology. This work aims to create a synthetic Thai population at the provincial scale. Our synthetic population generator is based on the synthetic reconstruction method, which is most suitable where only aggregate census data are available, as in Thailand. With available census tabulations from various government agencies, the generator is configured to combine 16 tabulations at individual and household levels using conditional probabilities. The order of the conditional probabilities is designed according to dependencies between the attributes and the difference in resolution of the data from multiple sources. The main contribution of this work is the method to generate complex household types. Many family-related attributes are used to create family relationships among individuals. Then, families and individuals are assigned to households according to household statistics. The generator is evaluated by creating a synthetic population of Phitsanulok, a province with 835,555 individuals and 296,807 households located in 18 municipality areas of 9 districts. The aggregated tabulations of the synthetic population are compared to the original ones. The results show that the distributions of their aggregated attributes are very close to the source data. Therefore, the synthetic population is a good approximation of the real population.
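
Sampling attributes in a chain of conditional probabilities, as in the synthetic reconstruction method described above, can be sketched as follows. This is only a two-attribute toy: the attribute names, categories and probabilities are invented for illustration, and the paper's 16 tabulations and household formation step are not represented.

```python
import random

def sample_categorical(rng, distribution):
    """Draw one category from a {category: probability} mapping."""
    categories = list(distribution)
    weights = [distribution[c] for c in categories]
    return rng.choices(categories, weights=weights, k=1)[0]

def synthesize(n, p_age, p_status_given_age, seed=0):
    """Create n synthetic persons by sampling age first, then status given age."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    people = []
    for _ in range(n):
        age = sample_categorical(rng, p_age)
        status = sample_categorical(rng, p_status_given_age[age])
        people.append({"age": age, "status": status})
    return people

# Hypothetical tabulations: a marginal for age and a conditional for status.
p_age = {"child": 0.2, "adult": 0.6, "senior": 0.2}
p_status_given_age = {
    "child": {"student": 0.9, "other": 0.1},
    "adult": {"employed": 0.7, "student": 0.1, "other": 0.2},
    "senior": {"employed": 0.1, "other": 0.9},
}

population = synthesize(5000, p_age, p_status_given_age)
child_share = sum(p["age"] == "child" for p in population) / len(population)
print(round(child_share, 3))
```

Comparing shares such as `child_share` with the source tabulation mirrors, on a small scale, the evaluation by aggregated tabulations that the abstract reports.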

  • Conference Article
  • Cite Count: 1
  • 10.2495/ut080591
Extending activity-based models of travel demand to represent activity-travel behaviour of children: some descriptive results
  • Aug 15, 2008
  • T Arentze + 1 more

This paper on the activity and travel behavior of children is from the proceedings of the 14th International Conference on Urban Transport and the Environment in the 21st Century, which was held in Malta in 2008. The authors compare descriptive findings of an activity-based model of the travel behavior of children to data from heads of households. They begin with a brief review of the existing studies on activity-travel schedules of children, then describe the results of an analysis of an existing dataset (the MON survey of approximately 30,000 households in the Netherlands). The authors conclude that the activity-travel behavior of children differs significantly from the behavior of heads of households. Children tend to participate in fewer activities, conduct these activities closer to home, and travel less by car. As children get older and less dependent on parents, their activity-travel patterns become more similar to those of heads of households, although participation in leisure activities remains higher. The authors stress that any model of children's activity-travel patterns must take into consideration their interdependency with adult activity-travel behavior.

  • Abstract
  • Cite Count: 1
  • 10.23889/ijpds.v8i3.2285
SynthEco - A multi-layered digital ecosystem for analysing complex human behaviour in context
  • Sep 18, 2023
  • International Journal of Population Data Science
  • Antonia Gieschen + 6 more

Introduction & Background: Human behaviour is multi-faceted and complex, with different dimensions interacting and impacting each other and individuals operating in an environmental context. In order to understand this behaviour better, the combination of data from different sources is useful to uncover some of those interactions and complexities. We present a multi-layered digital ecosystem based on a data platform providing statistically representative synthetic population derived from census data at different geo-spatial granularity, which we call SynthEco. This platform is enriched with individual data stemming from cohorts and cross-sectional surveys and geo-scanning of different layers of socio-environmental actors and conditions to create a complex digital ecosystem.
Objectives & Approach: The objective of SynthEco is to allow for the analysis of behaviour, as well as health and wellbeing outcomes, through the integration of cohort and cross-sectional data into a geospatially anchored synthetic population embedded into environmental data which is forming the backdrop. We demonstrate the use of this platform on the example of Montreal, Canada. The synthetic population is first generated from census data using iterative proportional fitting, which allows for the creation of a population data set that is artificial yet statistically representative for a given geospatial granularity, such as a city. Each individual household is assigned a geospatial location, which allows for the consideration of their surrounding environment including enterprises or institutions such as schools, hospitals and the local food environment. Through fuzzy matching and statistical extrapolation, different cohort and cross-sectional survey data are then merged to individual records, in order to describe them in more detail. This includes health, as well as financial wellbeing or social environment descriptors.
Relevance to Digital Footprints: There are two important points made through the presented work in relation to Digital Footprints data: the first is the technical approach to merging multiple datasets describing different dimensions of interacting human characteristics and behaviour by anchoring them into a synthetic population through fuzzy record matching. The second is the consideration of a spatial dimension when describing human behaviour. This is especially important when describing behaviour within local environments, such as the interaction with local food outlets.
Results: Recent work in this context includes an analysis of the food environment in Montreal, Canada. It introduces a way of utilising the synthetic population to predict the healthfulness of their local environment in terms of healthy food outlets, as well as providing a platform for the analysis of food environment surveillance and intervention simulations. For this purpose, the healthfulness of different census tract regions in Montreal is calculated to identify food deserts, food swamps, as well as healthy areas as defined through the Modified Retail Food Environment Index. We test different machine learning approaches to then predict these healthfulness scores using census variables from the synthetic population in their respective census tract, achieving accuracy scores of around 0.53 to 0.60. This demonstrates that census data has some limited predictive power in explaining the healthiness of the local food environment, which could be especially relevant for situations in which no information on the retailers is available to local policy makers. Future work can extend this approach to also include further data describing the population, stemming from the integrated cohorts and survey data, which could improve the prediction accuracy or help in identifying areas of concern.
Conclusions & Implications: The presented SynthEco platform views individuals as agents nested within modular systems of systems, trying to capture both internal systems and processes as well as environmental ones within which individuals are operating. The platform thus enables the application of computational systems modelling for the analysis of individual human behaviour in contexts. As demonstrated through the example of using SynthEco in the context of healthier food environments, the approach is especially relevant to practitioners and policy makers interested in local intervention strategies and identifying areas for targeted policy related to different dimensions of health and wellbeing.

  • Research Article
  • Cite Count: 1
  • 10.1016/j.trpro.2015.03.005
A Reproducibility Analysis of Synthetic Population Generation
  • Jan 1, 2015
  • Transportation Research Procedia
  • Jooyoung Kim + 1 more

  • Book Chapter
  • Cite Count: 1
  • 10.4018/978-1-4666-4920-0.ch009
Activity-Based Travel Demand Forecasting Using Micro-Simulation
  • Jan 1, 2014
  • Qiong Bao + 4 more

Activity-based models of travel demand employ in most cases a micro-simulation approach, thereby inevitably including a stochastic error that is caused by the statistical distributions of random components. As a result, running a transport micro-simulation model several times with the same input will generate different outputs. In order to take the variation of outputs in each model run into account, a common approach is to run the model multiple times and to use the average value of the results. The question then becomes: What is the minimum number of model runs required to reach a stable result? In this chapter, systematic experiments are carried out by using the FEATHERS, an activity-based micro-simulation modeling framework currently implemented for Flanders (Belgium). Six levels of geographic detail are taken into account, which are building block level, subzone level, zone level, superzone level, province level, and the whole Flanders. Three travel indices (i.e., the average daily number of activities per person, the average daily number of trips per person, and the average daily distance travelled per person), as well as their corresponding segmentations with respect to socio-demographic variables, transport mode alternatives, and activity types are calculated by running the model 100 times. The results show that application of the FEATHERS at a highly aggregated level only requires limited model runs. However, when a more disaggregated level is considered (the degree of the aggregation here not only refers to the size of the geographical scale, but also to the detailed extent of the index), a larger number of model runs is needed to ensure confidence of a certain percentile of zones at this level to be stable. The values listed in this chapter can be consulted as a reference for those who plan to use the FEATHERS framework, while for the other activity-based models the methodology proposed in this chapter can be repeated.
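
The question of how many model runs are needed for a stable average can be approximated with a standard confidence-interval argument. The sketch below is not the procedure used in the chapter: it assumes run outputs are roughly independent and normally distributed, and it uses a hypothetical helper `required_runs` that asks how many runs make the 95% confidence half-width of the mean smaller than a chosen relative tolerance. The pilot values are invented.

```python
import math
import statistics

def required_runs(pilot_outputs, rel_tol=0.01, z=1.96):
    """Estimate the number of runs so that z * sd / sqrt(n) <= rel_tol * mean.

    pilot_outputs holds one index value (for example, average daily trips per
    person) from each of a few pilot runs with identical inputs.
    """
    mean = statistics.mean(pilot_outputs)
    sd = statistics.stdev(pilot_outputs)  # sample standard deviation
    runs = (z * sd / (rel_tol * mean)) ** 2
    return max(1, math.ceil(runs))

# Hypothetical pilot results from five runs of a micro-simulation.
pilot = [3.0, 3.1, 2.9, 3.05, 2.95]
runs_needed = required_runs(pilot, rel_tol=0.01)
print(runs_needed)
```

Because the spread of an index grows as the zones or segments get smaller, the same calculation applied at a more disaggregated level tends to ask for more runs, which is consistent with the pattern the chapter reports.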

  • Research Article
  • Cite Count: 77
  • 10.1007/s11116-017-9840-9
A time-use activity-pattern recognition model for activity-based travel demand modeling
  • Nov 20, 2017
  • Transportation
  • Mohammad Hesam Hafezi + 2 more

This study develops a new comprehensive pattern recognition modeling framework that leverages activity data to derive clusters of homogeneous daily activity patterns, for use in activity-based travel demand modeling. The pattern recognition model is applied to time use data from the large Halifax STAR household travel diary survey. Several machine learning techniques not previously employed in travel behavior analysis are used within the pattern recognition modeling framework. Pattern complexity of activity sequences in the dataset was recognized using the FCM algorithm, and resulted in identification of twelve unique clusters of homogeneous daily activity patterns. We then analysed inter-dependencies in each identified cluster and characterized the cluster memberships through their socio-demographic attributes using the CART classifier. Based on the socio-demographic characteristics of individuals we were able to correctly identify which cluster individuals belonged to, and also predict various information related to their activities, such as start time, duration, travel distance, and travel mode, for use in activity-based travel demand modeling. To execute the pattern recognition model, the 24-h activity patterns are split into 288 three dimensional 5 min intervals. Each interval includes information on activity types, duration, start time, location, and travel mode if applicable. Results from aggregated statistical evaluation and Kolmogorov–Smirnov tests indicate that there is heterogeneous diversity among identified clusters in terms of temporal distribution, and substantial differences in a variety of socio-demographic variables. The homogeneous clusters identified in this study may be used to more accurately predict the scheduling behavior of specific population groups in activity-based modeling, and hence to improve prediction of the times and locations of their travel demands. 
Finally, the results of this study are expected to be implemented within the activity-based travel demand model, Scheduler for Activities, Locations, and Travel (SALT).
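
The split of a 24-hour day into 288 five-minute intervals follows from 24 × 60 / 5 = 288. A minimal sketch of that discretisation is given below. It records only the activity type per interval, whereas the study's intervals also carry duration, start time, location and travel mode; the `discretize` function, the default activity and the example diary are hypothetical.

```python
INTERVAL_MIN = 5
N_INTERVALS = 24 * 60 // INTERVAL_MIN  # 288 intervals per day

def discretize(episodes, default="home"):
    """Turn (activity, start_minute, end_minute) episodes into 288 slots.

    Minutes are counted from midnight. A slot takes the activity of any
    episode that overlaps it; later episodes overwrite earlier ones.
    """
    slots = [default] * N_INTERVALS
    for activity, start_min, end_min in episodes:
        first = start_min // INTERVAL_MIN
        # Ceiling division so a partly covered final slot is included.
        last = min(N_INTERVALS, -(-end_min // INTERVAL_MIN))
        for k in range(first, last):
            slots[k] = activity
    return slots

# Hypothetical diary: work 09:00-17:00, shopping 17:00-17:30.
diary = [("work", 540, 1020), ("shop", 1020, 1050)]
day = discretize(diary)
print(len(day), day.count("work"), day.count("shop"))  # 288 96 6
```

Sequences of this form are the kind of fixed-length input that clustering methods can compare across individuals.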

  • Research Article
  • Cite Count: 256
  • 10.1080/12265934.2013.835118
Activity-based models of travel demand: promises, progress and prospects
  • Sep 10, 2013
  • International Journal of Urban Sciences
  • Soora Rasouli + 1 more

Because almost two decades have passed since the introduction of activity-based models of travel demand, this seems the right time to evaluate the progress made in the development and application of these models. This invited paper discusses the initial promises of activity-based models as an alternative to four-step and tour-based models, summarizes the progress made, and identifies unsolved issues that require further research.

  • Book Chapter
  • 10.1007/978-981-32-9042-6_20
Activity-Based Travel Demand Models to Evaluate Transport Policies
  • Oct 25, 2019
  • Pranav Padhye + 3 more

Transportation planning plays a critical role in shaping the economic health and quality of life of the general public. A good deal of the demand for transport is concentrated in a few hours of the day, in particular sections of urban areas where congestion occurs during specific peak periods. Hence, modelling this travel demand from the transportation point of view is necessary. There are two basic approaches to travel demand modelling: traditional four-stage travel demand modelling and activity-based travel demand modelling. According to transport department data for Bangalore city collected in 2012, there are 41.86 lakh two-wheelers, 11.8 lakh cars and 5.91 lakh transport vehicles. The shares are 69% two-wheelers, 22% LMVs, 5% HTVs and 4% other vehicles. Earlier research has found that activity-based modelling is more efficient for evaluating transport policies than traditional four-step modelling, particularly for cities like Bangalore that have a large vehicle population. Here, an attempt has been made to develop activity-based travel demand models for a selected zone of Bangalore city. Bangalore city has been divided into three major areas and further into 47 zones. The data have been collected through an individual person survey covering parameters that are influential in developing person tours. The collected data are then analysed using SPSS software, and models are developed considering parameters such as age, gender, monthly income, distance of travel, daily travel cost and vehicle ownership. Simultaneously, the zonal public transport policies have been studied to understand the norms regarding transport, such as quality of transport, pricing, financing and parking facilities. The results obtained in the form of models are compared with the traditional models and are used to evaluate the public transport policies. 
Also, the factors influencing the trips of each individual have been studied and the effects of those factors are analysed. The results obtained are found satisfactory in terms of the R² value and other testing parameters. Transport policies are selected, and models are linked to the policies to evaluate them. The study concludes with the effective linking of the models to the policies, which will help the authorities to put them into practice.

  • Research Article
  • Cite Count: 4
  • 10.3141/1981-06
Analysis of New Starts Project by Using Tour-Based Model of San Francisco, California
  • Jan 1, 2006
  • Transportation Research Record: Journal of the Transportation Research Board
  • Joel Freedman + 2 more

Activity-based models are increasingly attractive as alternatives to traditional trip-based travel demand forecasting models because of growing dissatisfaction with the internal consistency, aggregation bias, and lack of detail of trip-based approaches. New policy analysis requirements demand that forecasting models represent travel choices and the contexts in which these travel choices are made with ever-increasing geographic, temporal, and behavioral detail. Activity-based models can incorporate this detail and can provide decision makers with more precise insights into potential outcomes of transportation and land use investment and development strategies. The model of San Francisco, California, is a tour-based microsimulation model that forecasts daily activity patterns for individual San Francisco residents and has been used in transportation planning practice since 2000. The San Francisco model uses the daily activity pattern approach, first introduced by Bowman and Ben-Akiva, within a disaggregate microsimulation framework. This paper describes an application of the San Francisco model to the proposed new Central Subway project in downtown San Francisco. This is the first application of an activity-based travel demand model in the United States to a major infrastructure project in support of a submission to FTA for project funding through the New Starts program. To enable the submittal of a New Starts request, software was developed to collapse the microsimulation output of the tour and trip mode choice models into a format compatible with the FTA SUMMIT program. SUMMIT was then successfully used to summarize and analyze user benefits accruing to the project and to prepare an acceptable New Starts submittal.

  • Research Article
  • Cite Count: 6
  • 10.3141/2412-04
Modeling Context-Sensitive, Dynamic Activity Travel Behavior by Linking Short- and Long-Term Responses to Accumulated Stress
  • Jan 1, 2014
  • Transportation Research Record: Journal of the Transportation Research Board
  • Ifigenia Psarra + 3 more

Whereas existing activity-based models of travel demand simulate activity travel patterns for a typical day, dynamic models simulate behavioral response to endogenous or exogenous change along various time horizons. Prior research predominantly addressed a specific kind of change, which usually affected a specific time horizon. In contrast, the current study aims to develop a dynamic model of activity travel decisions that links short- and long-term adaptation decisions in a hierarchical manner. Specifically, this study focuses on the bottom-up process of influence, in which problems with rescheduling on a daily basis may induce a long-term change. The authors assume that travelers will first explore short-term adjustments of their habitual activity travel patterns so as to cope with change and increasing stress. Only when travelers recognize that such adaptation strategies are ineffective will they consider long-term decisions. The proposed framework integrates three key concepts: aspiration, activation, and expected utility. Moreover, both rational and emotional mechanisms are taken into account. The study demonstrates model properties by using numerical simulation. Individual travelers are represented as agents, each with their cognition of the environment, habits, preferences, and aspirations. The results offer insight into the dynamics of traveler learning–adaptation and into the evolution of long-term decisions.

  • Research Article
  • Cite Count: 13
  • 10.1016/j.compenvurbsys.2016.11.003
An unconstrained statistical matching algorithm for combining individual and household level geo-specific census and survey data
  • Nov 28, 2016
  • Computers, Environment and Urban Systems
  • Mohammad-Reza Namazi-Rad + 4 more
