Commuting Flow Generation

Yan Luo

Individual mobility, both within and between cities, significantly affects many aspects of society, including well-being, the spread of epidemics, and environmental quality. Commuting flow, a crucial and predictable component of human movement, is typically monitored by national statistical agencies. However, official statistics alone are insufficient, as they may be out of date or incomplete.

When mobility flow information is unavailable for a region of interest, traditional methods resort to mathematical-physical models, such as the gravity model and the radiation model, to generate the required data. The gravity model is widely used to predict flow patterns, but it depends on adjustable parameters that vary from region to region, and it is known to have analytical inconsistencies. The radiation model introduces a stochastic process to capture local mobility decisions, which allows commuting and mobility fluxes to be derived analytically from the population distribution alone. It is interpretable by design and addresses many of the problems of the gravity model. However, it has limitations of its own: (1) It relies on a restricted set of variables, usually just the population of each location and the distance between locations, and neglects crucial information about the geographical landscape such as land use, points of interest (POIs), and the transportation network. (2) It assumes that flows take place in a homogeneous, idealized space, whereas real human mobility is shaped by transportation conditions, topography, land use, and the way these factors are distributed. As a result, the distance that can be covered in a fixed travel time varies with the direction of travel from the origin, and the model fails to capture the structure and the greater variability of real flows.
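As a point of reference, the two classical models can be sketched in a few lines. The sketch below uses the standard textbook forms: a power-law gravity model, and the parameter-free radiation model, in which the flow from i to j depends on the origin population m_i, the destination population n_j, and the population s_ij living within distance r_ij of the origin (excluding both endpoints). The function names, the default exponents, and the use of straight-line distance are assumptions made for illustration; none of them is taken from this study.

```python
import math


def gravity_flow(m_i, n_j, r_ij, k=1.0, alpha=1.0, beta=1.0, gamma=2.0):
    """Power-law gravity model: k * m_i^alpha * n_j^beta / r_ij^gamma.

    k, alpha, beta and gamma are the adjustable parameters that have to be
    re-fitted for each region (the defaults here are placeholders).
    """
    return k * (m_i ** alpha) * (n_j ** beta) / (r_ij ** gamma)


def radiation_flows(populations, coords, total_out):
    """Radiation model flows between every ordered pair of locations.

    populations: population of each location
    coords:      (x, y) position of each location (Euclidean distance assumed)
    total_out:   total number of commuters leaving each location (T_i)
    Returns a matrix flows[i][j]; the diagonal is left at zero.
    """
    n = len(populations)
    flows = [[0.0] * n for _ in range(n)]
    for i in range(n):
        m_i = populations[i]
        for j in range(n):
            if i == j:
                continue
            n_j = populations[j]
            r_ij = math.dist(coords[i], coords[j])
            # s_ij: population inside the circle of radius r_ij centred on i,
            # excluding the origin i and the destination j themselves.
            s_ij = sum(
                populations[k]
                for k in range(n)
                if k != i and k != j and math.dist(coords[i], coords[k]) <= r_ij
            )
            flows[i][j] = (
                total_out[i] * m_i * n_j / ((m_i + s_ij) * (m_i + n_j + s_ij))
            )
    return flows


if __name__ == "__main__":
    pops = [100, 50, 200]
    xy = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
    commuters = [10.0, 5.0, 20.0]
    for row in radiation_flows(pops, xy, commuters):
        print([round(v, 3) for v in row])
```

Both functions take only population and distance as input, which makes limitation (1) concrete: nothing about land use, POIs, or the road network can enter the prediction.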

Addressing these challenges requires more detailed input data and more flexible models. A rich set of geographical features can be extracted from freely available online data, and powerful nonlinear models such as deep artificial neural networks can capture complex spatial relationships, including the spatial heterogeneity described above. Existing work such as Deep Gravity has improved the gravity model with deep learning techniques, but it does not overcome the fundamental limitations of the gravity model that the radiation model addresses.

To overcome these issues, this study introduces a transformer-based deep learning framework that neuralizes the radiation model. Specifically, we show that transformers are a more suitable backbone for flow generation than graph neural networks (GNNs), the mainstream backbone of existing deep learning-based solutions. Furthermore, we design a positional encoding module tailored to geographical relationships: it uses radial basis functions to encode relative and absolute distances between locations, and it uses spatial heterogeneity as a supervisory signal to guide model training. Our experiments on mobility flows in three US states show that our framework significantly outperforms traditional mathematical-physical models and state-of-the-art deep learning models, especially in densely populated regions. Our model also generalizes well: it generates realistic flows in geographical areas for which no training data are available, including across state boundaries. Finally, we show how the flows generated by our model can be explained in terms of geographic features. Interpreting the model's predictions with explainable AI techniques, we identify significant differences among the three states considered.
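To illustrate the idea behind a radial-basis-function distance encoding, the sketch below expands a scalar distance into a vector of Gaussian kernel responses, one per centre. This is only a minimal illustration: the text states that radial basis functions are used, but the Gaussian kernel, the evenly spaced centres, the shared width, and the names `make_centers` and `rbf_encode` are all assumptions, not the module described in this study.

```python
import math


def make_centers(d_min, d_max, num):
    """Evenly spaced RBF centres between d_min and d_max (assumed layout)."""
    step = (d_max - d_min) / (num - 1)
    return [d_min + k * step for k in range(num)]


def rbf_encode(distance, centers, width):
    """Encode one scalar distance as Gaussian RBF features.

    Each feature is exp(-(distance - c)^2 / (2 * width^2)) for a centre c,
    so it is close to 1 when the distance is near that centre and decays
    smoothly towards 0 further away.
    """
    return [
        math.exp(-((distance - c) ** 2) / (2.0 * width ** 2)) for c in centers
    ]


if __name__ == "__main__":
    # Hypothetical setup: distances in km, 5 centres covering 0 to 40 km.
    centers = make_centers(0.0, 40.0, 5)
    origin, destination = (0.0, 0.0), (6.0, 8.0)
    d = math.dist(origin, destination)  # pairwise distance between locations
    print([round(v, 4) for v in rbf_encode(d, centers, width=10.0)])
```

The resulting vector varies smoothly with distance, so it can be passed to a neural network as a positional feature in place of a single raw number. The same function applies to any scalar distance, whatever reference point is chosen.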