Forecasting urban flows is strategically important for traffic management, land use, public safety, and related applications. City managers can anticipate traffic congestion, deploy traffic resources in advance, and ease congestion. Business owners can identify crowded regions or potential investment areas to gain greater commercial benefit. Members of the public can adjust their travel plans in advance, avoid peak travel periods, and choose a more convenient way to travel. From the perspective of people’s travel mode, urban flows comprise crowd flow, traffic flow, and public transit flow.
However, urban flows prediction is not an easy problem. First, the main factors affecting urban flows can be classified into four groups.
Daily flow activity patterns: These are the main patterns of urban flows, including weekday commutes, trips to and from school, and other activities repeated daily.
Anomalies of flow activity patterns: Although daily activity patterns dominate urban flows, our main concern is anomalies in these patterns in certain areas, such as an abnormal increase in urban traffic, because they may lead to traffic congestion and public security problems. In response, traffic deployment can be improved to make travel more convenient, and emergency security measures can be strengthened to maintain public order. Urban events or activities also affect flows in the city. For example, when temporary traffic control is imposed on an area because of road construction, flows in that area decrease correspondingly.
Weather: Weather also affects urban flows. Under heavy rain, smog, and other extreme conditions, the number of people going out may decrease, whereas it may increase when the weather is sunny.
Holidays: During holidays such as the National Day holiday and the Spring Festival holiday, there may be surging cross-regional flows, which recur yearly. Holidays also affect urban flows on the dates near them, so the fluctuation of crowd flows lasts for a period of time.
Second, the datasets we can obtain are almost all spatial-temporal data, such as mobile phone data, taxi trajectory data, and metro/bus card swiping data, all of which exhibit temporal dependence and spatial correlation. How to deal with spatial-temporal data is itself a valuable and fundamental problem.
In addition, predicting urban flows requires choosing the level of forecasting: in space, the citywide, regional, or street level; in time, the next 15 minutes, the next hour, or the next 24 hours of urban flows in each region. Different prediction levels require different precision and different processing methods. Apart from the objective factors mentioned above, there are also unmeasured subjective travel intentions of the population, which are very difficult to study and on which no breakthrough has yet been made.
The contribution of this paper is threefold. First, urban flows prediction from spatial-temporal data is systematically reviewed. Second, we divide the typical and representative methods for urban flows prediction into five categories and mainly analyze the deep learning-based methods. Third, some public spatial-temporal datasets for urban flows forecasting are shared to facilitate research.
The rest of this paper is organized as follows. In Section 2, we partition the multi-source spatial-temporal data preparation process into three groups. In Section 3, we choose spatial-temporal dynamic data as a case study for the urban flows prediction task. In Section 4, we analyze and compare some well-known and state-of-the-art flows prediction methods in detail, classifying them into five categories: statistics-based, traditional machine learning-based, deep learning-based, reinforcement learning-based and transfer learning-based methods. Finally, we give open challenges of urban flows prediction and an outlook on the future of this field.
Multi-source urban data must be processed and prepared before further analysis. In this section, we partition the preparation process into three groups, described below.
2.1. Spatial-temporal datasets
The datasets used for urban flows prediction are mostly spatial-temporal data. According to their spatial and temporal characteristics, we can divide the datasets into three categories: spatial-temporal static data, spatial static temporal dynamic data, and spatial-temporal dynamic data. According to the data type, spatial-temporal datasets consist of point data and network data [1]. The spatial-temporal datasets are illustrated in Table 1.
Table 1: Types of spatial-temporal datasets.
2.2. Map decomposition
In cities, a large amount of data is spatial-temporal, such as traffic data including bicycle renting and returning data, taxi track data, and metro card swiping data. These data have time and space properties and change constantly, so we need to represent and measure them at the time and space level. To process spatial-temporal data better, we should first decompose the city map. One decomposition method is grid-based decomposition. For example, a DNN-based prediction model was proposed for spatial-temporal data [2], which can capture both temporal and spatial properties. The authors defined a grid map based on longitude and latitude and partitioned Beijing into an M*N grid map. For example, there is an entertainment region (i, j) that lies at the ith row and the jth column in the grid map, as shown in Figure 1. This is a good way of dividing a city into grid regions. They then used inflow and outflow to measure the crowd flows in a region. Another method is road network-based map decomposition, in which vehicles’ GPS trajectories are mapped onto the city road network. Unlike the grid-based method, it can take full advantage of the road network’s information and apply classical clustering approaches for further refinement. However, it is not as convenient and simple as grid-based decomposition.
Figure 1: Grid-based map segmentation in Beijing. The black region is an entertainment area (Happy Valley Beijing).
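As a minimal sketch of grid-based decomposition, the function below maps a GPS point to its grid cell (i, j). The bounding box, the grid size, and the convention that rows are counted from the northern edge are assumptions made for illustration; they are not taken from [2].

```python
from typing import Optional, Tuple


def to_grid_cell(lat: float, lon: float,
                 lat_min: float, lat_max: float,
                 lon_min: float, lon_max: float,
                 n_rows: int, n_cols: int) -> Optional[Tuple[int, int]]:
    """Return the (row, column) of the grid cell containing a point.

    Rows are counted from the northern edge and columns from the western
    edge (an assumed convention). Points outside the box return None.
    """
    if not (lat_min <= lat <= lat_max and lon_min <= lon <= lon_max):
        return None
    cell_height = (lat_max - lat_min) / n_rows
    cell_width = (lon_max - lon_min) / n_cols
    # Clamp so that points exactly on the south/east border fall in the last cell.
    i = min(int((lat_max - lat) / cell_height), n_rows - 1)
    j = min(int((lon - lon_min) / cell_width), n_cols - 1)
    return i, j


# Hypothetical bounding box split into a 4 x 4 grid.
print(to_grid_cell(7.0, 1.0, 0.0, 8.0, 0.0, 8.0, 4, 4))
```

Once every trajectory point carries a cell index, per-region statistics such as inflow and outflow can be computed by simple counting.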
2.3. Dealing with data problems
When dealing with spatial-temporal data, we may face many data problems. For example, missing data, data imbalance and data uncertainty may arise alone or in combination. These conditions reduce the accuracy and efficiency of the analysis, so we review some methods for overcoming them.
2.3.1. Data missing
Due to sensor failures, communication errors, and other human factors, spatial-temporal data are often missing. Missing data often have a negative impact on subsequent data analysis, so it is necessary to study this problem. At present, the main approach is to fill in the missing values. For example, Lee et al. [3] proposed a factorial hidden Markov model to recover missing values. Hoang et al. [4] divided a city into low-level regions based on the road network and grouped adjacent regions with similar crowd flow patterns using graph clustering. It is a novel and effective solution, but it is unclear how to define similar crowd flow patterns accurately. To consider temporal and spatial correlation, Yi et al. [5] proposed a spatial-temporal multi-view-based learning (ST-MVL) method to collectively fill missing values in a collection of geo-sensory time series data. This method performs well because it combines empirical statistical models, namely Inverse Distance Weighting and Simple Exponential Smoothing, with User-based and Item-based Collaborative Filtering. Hence, we can conclude that missing values in spatial-temporal datasets exhibit implicit spatiotemporal correlation.
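The two empirical statistical models named above can be sketched as follows. This is only an illustration of Inverse Distance Weighting and an exponential-smoothing style estimate, assuming planar sensor coordinates; the equal-weight blend in `fill_missing` is a placeholder assumption and is not the learned combination used by ST-MVL [5], and the collaborative filtering views are omitted.

```python
import math


def idw_estimate(target_xy, neighbors, power=2.0):
    """Inverse Distance Weighting over (xy, value) readings of nearby sensors."""
    num = 0.0
    den = 0.0
    for (x, y), value in neighbors:
        dist = math.hypot(x - target_xy[0], y - target_xy[1])
        if dist == 0.0:
            return float(value)  # a co-located sensor gives the value directly
        weight = dist ** -power  # closer sensors receive larger weights
        num += weight * value
        den += weight
    if den == 0.0:
        raise ValueError("no neighbor readings available")
    return num / den


def ses_estimate(history, beta=0.5):
    """Exponentially weighted average of a sensor's own past readings.

    `history` is ordered oldest to newest; None marks a missing reading.
    """
    num = 0.0
    den = 0.0
    for lag, value in enumerate(reversed(history), start=1):
        if value is None:
            continue
        weight = beta * (1.0 - beta) ** (lag - 1)  # recent readings count more
        num += weight * value
        den += weight
    if den == 0.0:
        raise ValueError("no past readings available")
    return num / den


def fill_missing(target_xy, neighbors, history):
    # Assumption: a plain average of the spatial and temporal views.
    return 0.5 * idw_estimate(target_xy, neighbors) + 0.5 * ses_estimate(history)
```

The spatial view uses readings from other sensors at the same timestamp, while the temporal view uses the same sensor at other timestamps, which is why the two estimates complement each other.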
2.3.2. Data imbalance
Spatiotemporal data imbalance is mainly manifested in two aspects: data distribution imbalance and data label imbalance. For the problem of imbalanced data distribution, Zheng et al. [6] proposed a semi-supervised learning algorithm to deal with the sparse training data caused by the lack of air monitoring stations. For the problem of imbalanced data labels, Beckmann et al. [7] studied a KNN-based undersampling method for data balancing. Wang et al. [8] used a K-labelsets ensemble method based on mutual information and joint entropy to deal with imbalanced data. Gong et al. [9] presented an ensemble method using random undersampling and ROSE sampling to solve the imbalanced classification problem. Thus, when facing a data imbalance problem, it is advisable to first determine whether the imbalance lies in the data distribution or in the data labels, and then apply the corresponding methods.
2.3.3. Data uncertainty
In the actual deployment of machine learning algorithms, in order to better explain the model and effectively deal with the risk caused by data uncertainty, researchers have proposed adopting uncertainty quantification to alleviate the problem of data uncertainty [10, 11]. Bayesian Deep Learning [12] is an uncertainty quantification technique that can learn the weight distribution of networks. Quantifying predictive uncertainty in neural networks is a challenging problem. Lakshminarayanan et al. [13] proposed a model that incorporates uncertainty into the loss function and is optimized directly through the backpropagation (BP) algorithm. Rangapuram et al. [14] combined state space models with deep learning for probabilistic time series forecasting; this method retains properties of state space models such as interpretability. Recently, more and more uncertainty quantification research has emerged in the traffic forecasting field, and it may become a novel direction for traffic forecasting research.
In this section, we choose spatial-temporal dynamic data (trajectory data) as a case study for the urban flows prediction task, because it is one of the most widely studied spatial-temporal data types in urban flows forecasting and is closely related to our topic of urban crowd flow and traffic flow prediction.
In the spatial-temporal data mining field, spatial-temporal (ST) data types can be divided into four categories [15], which differ from the classification in Table 1 above: (i) event data, which occur at point locations and times (e.g., a concert or a car accident), (ii) trajectory data, which refer to the trajectories of moving objects (e.g., humans, vehicles, animals), (iii) point reference data, where a continuous spatial-temporal field is measured at moving ST reference sites (e.g., surface temperature measured by weather balloons), and (iv) raster data, whose measurements of an ST field are collected at fixed ST grids (e.g., air quality of Earth’s surface collected by ground-based sensors). ST event data and ST point reference data are point data, while trajectory data and ST raster data are network data. They can therefore be merged into the categories of spatial-temporal datasets shown in Table 1. The sources of trajectory data can be classified into four main categories: human mobility, transportation vehicle mobility, animal mobility and natural phenomena [16]. In this paper, the urban flows prediction task mainly aims at estimating and predicting human mobility and transportation vehicle mobility. Before starting data mining tasks for urban flows prediction, we should preprocess these spatial-temporal dynamic trajectory data. The raw location traces are often collected by smartphones with GPS and WiFi or by taxis equipped with a GPS sensor. Spatial-temporal trajectory data preprocessing methods include extracting the history of place visits, data filtering and statistics, trajectory compression, trajectory segmentation and map matching [16, 17].
Extracting the history of place visits. Analysis of location traces shows that trajectory data consist of transitions and stay points. Using a grid clustering algorithm [17], stay regions can be generated from the set of stay points based on a given radius, so that the history of places that users visited can be extracted. As demonstrated in Figure 2, if we set the minimum duration of a stay point to 5 min, the bus station (P4) can be found in this trajectory, and a stay region with a radius of about 1 km can be generated around the school (P9). Based on these places with timestamp information, applications such as travel recommendation, business location selection and travel time estimation become possible.
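Stay point extraction can be sketched with the commonly used distance-and-time threshold rule: a run of consecutive points that stays within a given distance of its first point for at least a minimum duration is summarized as one stay point. The 200 m distance threshold and the tuple layout below are assumptions for illustration; the 5 min duration follows the example above. The grid clustering step that turns stay points into stay regions [17] is not shown.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    radius = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))


def detect_stay_points(points, dist_thresh_m=200.0, time_thresh_s=300.0):
    """points: list of (lat, lon, t_seconds) sorted by time.

    Returns a list of (mean_lat, mean_lon, t_arrive, t_leave).
    """
    stays = []
    i, n = 0, len(points)
    while i < n:
        j = i + 1
        # Extend the run while points remain close to the anchor points[i].
        while j < n and haversine_m(points[i][0], points[i][1],
                                    points[j][0], points[j][1]) <= dist_thresh_m:
            j += 1
        if points[j - 1][2] - points[i][2] >= time_thresh_s:
            cluster = points[i:j]
            mean_lat = sum(p[0] for p in cluster) / len(cluster)
            mean_lon = sum(p[1] for p in cluster) / len(cluster)
            stays.append((mean_lat, mean_lon, points[i][2], points[j - 1][2]))
            i = j  # continue after the stay
        else:
            i += 1
    return stays
```

Points that are not absorbed into any stay point are the transitions of the trajectory.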
Data filtering and statistics. Because of sensor errors or other technical issues, trajectory data contain incomplete instances and outliers. To forecast urban flows well, we need to filter out these incomplete instances and outliers before the prediction task. The filtering step is important for avoiding biased estimates of prediction performance. To understand the distribution of trajectories, we also need to compute some statistics, such as the mean, the median, the ratio between locations and transitions, and the number of places visited by a user during a given recording interval.
Figure 2: Stay points and regions in trajectories.
Figure 3: Time-interval and stay points trajectory segmentation.
Trajectory compression. Suppose we collect a time-stamped location every second, or even more frequently, for each moving object. This costs a great deal of communication, computing and storage resources. To collect and leverage these data efficiently, we need to compress the trajectory data. There are two major categories of trajectory compression methods. One is offline compression, such as the Douglas-Peucker algorithm [18], which approximates the original trajectory with line segments and recursively splits them until the approximation error falls below a specified bound. The other is online compression, such as the Sliding Window algorithm [19] and the Open Window algorithm [20], which transmit trajectory data in a timely manner. They are window-based algorithms that fit the trajectory points in a sliding window with a valid line segment and keep expanding the window until the fitting error exceeds a specified error bound.
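The offline case can be illustrated with a compact recursive Douglas-Peucker sketch. It assumes the trajectory has already been projected to planar (x, y) coordinates and ignores timestamps; `epsilon` is the error bound in the same units as the coordinates.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the line segment from a to b (planar coordinates)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, epsilon):
    """Return a subset of `points` whose polyline stays within `epsilon` of the original."""
    if len(points) < 3:
        return list(points)
    max_dist, index = 0.0, 0
    for k in range(1, len(points) - 1):
        dist = point_segment_distance(points[k], points[0], points[-1])
        if dist > max_dist:
            max_dist, index = dist, k
    if max_dist <= epsilon:
        # Every intermediate point is close enough: keep only the end points.
        return [points[0], points[-1]]
    # Otherwise split at the farthest point and simplify both halves.
    left = douglas_peucker(points[:index + 1], epsilon)
    right = douglas_peucker(points[index:], epsilon)
    return left[:-1] + right


track = [(0, 0), (1, 0.05), (2, 0), (3, 3), (4, 0)]
print(douglas_peucker(track, 0.1))  # [(0, 0), (2, 0), (3, 3), (4, 0)]
```

Because the algorithm needs the whole trajectory to find the farthest point, it suits offline compression; the window-based online algorithms trade some compression quality for the ability to emit points as data arrive.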
Trajectory segmentation. In order to classify or cluster trajectories and mine more useful knowledge, we need to segment trajectories before the mining tasks. There are three common types of trajectory segmentation methods: time interval-based, trajectory shape-based and semantic meaning-based methods. In the first type, a trajectory is divided into segments based on time (wherever the gap exceeds a given threshold, or at equal time intervals), as illustrated in Figure 3. Because the time gap between p2 and p3 is larger than the given time interval, the trajectory can be divided into two segments (p1p2 and p3p7). In the second type, a trajectory is partitioned at the turning points where the heading direction changes by more than a threshold [16]. The last type is based on the semantic meaning of points in a trajectory. For example, in the travel speed estimation task, stay points are often removed from the GPS trajectories because they may be locations where a taxi is waiting for passengers [21]. In Figure 3, the trajectory points in the dotted box can be removed because they are stay points (p4p5p6), and the trajectory can be divided into two segments (p1p3 and p6p7). To find a walk-based segmentation [22, 23], we also need to take human mobility patterns into account and carry out further research on semantic meaning-based trajectory segmentation.
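The time interval-based rule can be sketched in a few lines. The point labels and timestamps below are hypothetical values chosen to mimic the situation described for Figure 3, where the gap between p2 and p3 exceeds the threshold.

```python
def segment_by_time_gap(points, max_gap_s):
    """Split a trajectory wherever two consecutive timestamps are too far apart.

    points: list of (label, t_seconds) sorted by time.
    """
    segments = []
    current = []
    for point in points:
        if current and point[1] - current[-1][1] > max_gap_s:
            segments.append(current)  # close the segment before the gap
            current = []
        current.append(point)
    if current:
        segments.append(current)
    return segments


# Hypothetical timestamps: only the p2 -> p3 gap is longer than 5 minutes.
track = [("p1", 0), ("p2", 60), ("p3", 900), ("p4", 960),
         ("p5", 1020), ("p6", 1080), ("p7", 1140)]
for segment in segment_by_time_gap(track, max_gap_s=300):
    print([label for label, _ in segment])
```

Semantic segmentation can reuse the same loop by replacing the gap test with a predicate such as "this point belongs to a stay point".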
Map matching. There are two major ways to categorize map matching methods. One is based on the additional information used, and the other is based on the range of sampling points considered. Under the first criterion, methods can be divided into four groups: geometric [24], topological [25, 26], probabilistic [27, 28] and other advanced methods [29, 30, 31]. Under the second criterion, methods can be divided into two categories: local and global methods. Local methods aim to find a locally optimal match based on distance and orientation similarity. Global methods [32, 33] try to match an entire trajectory with a road network.
Urban flows prediction is one of the spatial-temporal prediction tasks in the urban computing field. Related research includes air quality prediction [34, 35, 36], traffic flow prediction [37, 38, 39], and travel demand prediction [40, 41, 42]. With the rapid development of super first-tier cities and the emergence of new first-tier cities, urban flows prediction has become more and more important in traffic management and public safety. From the perspective of spatial scale, urban flows prediction can be divided into three categories: citywide-level, region-level and road-level [4, 43, 44]. From the perspective of temporal scale, urban flows prediction can be divided into three categories: short-term, mid-term and long-term flows prediction [44, 45]. The problem of urban flows prediction thus involves both spatial relations and temporal dependencies.
To solve the urban flows prediction problem, many novel methods have been proposed in recent years. The major methods can be classified into five categories: statistics-based methods, traditional machine learning methods, deep learning-based methods, reinforcement learning methods and transfer learning methods.
4.1. Statistics-based methods
Among statistics-based methods, ARMA (Autoregressive Moving Average) [46] is a fundamental time series prediction method. Its integrated variant, ARIMA (Autoregressive Integrated Moving Average) [47], is also very popular in time-series prediction problems. The Hyndman-Khandakar algorithm can be used for automatic ARIMA modelling in R [48]. The default procedure contains two steps [49]: (i) the number of differences d is determined using repeated KPSS tests; (ii) the values of p and q are then chosen by minimising the AICc (the Akaike Information Criterion, AIC, with a correction for small sample sizes) after differencing the data d times. Rather than considering every possible combination of p and q, the algorithm uses a stepwise search to traverse the model space. The forecasting process is summarized in Figure 5. As another extension of the ARMA method, the Seasonal Auto-Regressive Integrated Moving Average (SARIMA) method [50] can capture intrinsic correlations in time series data and is especially suited to modeling the seasonal, stochastic time series that often occur in traffic flow data. Although these classical time-series methods can capture temporal dependencies in time series data, they cannot describe the spatial influence in urban flows prediction problems.
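The two quantities used in this procedure, differencing and the AICc, can be sketched as follows. The AICc expression assumes the usual definition for ARIMA models, in which the parameter count is p + q + k + 1 with k = 1 when a constant is included; the log-likelihood is taken as given, since fitting the model itself (and the KPSS test) is outside the scope of this sketch.

```python
def difference(series, d=1):
    """Apply first-order differencing d times."""
    out = list(series)
    for _ in range(d):
        out = [later - earlier for earlier, later in zip(out, out[1:])]
    return out


def aicc(log_likelihood, p, q, n_obs, with_constant=True):
    """Corrected AIC of an ARIMA(p, d, q) model fitted on n_obs differenced points."""
    k = 1 if with_constant else 0
    n_params = p + q + k + 1  # AR terms, MA terms, constant, error variance
    aic = -2.0 * log_likelihood + 2.0 * n_params
    denom = n_obs - n_params - 1
    if denom <= 0:
        raise ValueError("too few observations for the AICc correction")
    return aic + 2.0 * n_params * (n_params + 1) / denom


print(difference([1, 4, 9, 16], d=2))  # [2, 2]
```

In a stepwise search, candidate (p, q) pairs neighbouring the current best model would each be fitted and compared with `aicc`, and the search stops when no neighbour yields a lower value.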
4.2. Traditional machine learning methods
The support vector regression (SVR) model is often used in traffic flow prediction. For example, SVR with an RBF kernel multiplied by a seasonal kernel has been used in traffic flow forecasting with high prediction accuracy and computational efficiency [51]. As a non-parametric, data-driven method, an enhanced K-nearest neighbor (KNN) algorithm has been applied to short-term traffic flow prediction by identifying similar traffic patterns [52]. Zhu et al. studied a linear conditional Gaussian (LCG) Bayesian network (BN) model for short-term traffic flow prediction, which considers spatial-temporal characteristics as well as speed information [53]. To estimate the number of people who moved between cells, Akagi et al. [54] developed a probabilistic model based on collective graphical models, which takes movements to remote cells into account. As presented in Figure 4, the proposed method is unsupervised and only needs spatiotemporal population data as input. Liu et al. [55] developed a graph processing framework-based traffic estimation (GPTE) approach, which can capture traffic correlation from taxi data and enable advanced traffic estimation at city scale based on graph-parallel processing. From these traditional machine learning-based methods, we can conclude that such models mainly focus on short-term traffic flow prediction and achieve high prediction accuracy. However, with the explosion of traffic data caused by the increase in traffic sensors and the rapid development of intelligent transport systems in recent years, traditional machine learning-based methods are limited in mining the deep, implicit spatial-temporal correlations in big traffic data.
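The idea of predicting from similar historical traffic patterns can be sketched with a plain KNN forecaster. This is a basic illustration only, assuming a univariate flow series and Euclidean distance between windows; it is not the enhanced algorithm of [52].

```python
def knn_forecast(history, window, k=3):
    """Predict the next value as the mean successor of the k most similar past windows."""
    if len(history) <= window:
        raise ValueError("history must be longer than the window")
    query = history[-window:]
    candidates = []
    # Slide over every past window that has a known successor.
    for start in range(len(history) - window):
        past = history[start:start + window]
        successor = history[start + window]
        dist = sum((a - b) ** 2 for a, b in zip(past, query)) ** 0.5
        candidates.append((dist, successor))
    candidates.sort(key=lambda c: c[0])
    nearest = candidates[:k]
    return sum(value for _, value in nearest) / len(nearest)


# Hypothetical flow counts with a repeating pattern 1, 2, 3.
print(knn_forecast([1, 2, 3, 1, 2, 3, 1, 2], window=2, k=2))  # 3.0
```

Such a forecaster needs no training phase, which is part of its appeal for short-term prediction, but its cost grows with the length of the history searched.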
4.3. Deep learning-based methods
Deep learning-based methods are becoming the mainstream for traffic spatial-temporal tasks. Enabled by big data and strong computing power, the success of deep learning in many application scenarios has motivated plenty of deep learning-based methods in different areas, such as CNNs in computer vision [56, 57] and RNNs in sequence learning tasks [58, 59]. In the next part, we introduce the deep learning-based methods in detail.
In previous research, there are many papers on human mobility prediction based on users’ historical location trajectory data [60, 61, 17]. They aim at providing context-aware services and other location-based services for users. However, such data may expose the privacy of users, so they may be unavailable because of privacy protection policies. Compared with the human mobility prediction problem, to forecast urban crowd flows we can obtain more related datasets at the crowd scale rather than the individual scale, e.g., taxi trajectory data, public transportation system data (metro or bus card swiping data), bike-sharing data, road network data, weather data and so on. Urban crowd flow forecasting is also of great importance to traffic management and public safety.
Figure 4: The task of estimating people flow between cells. Input: population of each grid cell at each time; Output: the number of people who move between cells over time. The map is divided by cells based on latitude and longitude.
Figure 5: General process for forecasting using an ARIMA model.
From the perspectives of people’s travel mode and urban flow type, we can divide urban flows into three categories: crowd flow, traffic flow, and public transit flow. Crowd flow can usually be derived from users’ phone signals and vehicles’ GPS trajectories, separately or in combination. Traffic flow can mainly be estimated from taxi trajectory data. Public transit flow refers to the movement of passengers measured by public transit card swiping data or bike-sharing data. To measure urban flows accurately, all flow types need to be prepared and analyzed together. However, in practice, it is hard to obtain data on all urban flow types of a city, and there are complex relationships among these flows. Next, we review and analyze the three types of urban flows separately.
4.3.1. Crowd flow prediction
In recent years, many researchers have focused on citywide-level traffic flow prediction [4, 62] to forecast citywide crowd flow. Before turning to deep learning models, we first give the related definitions of the crowd flow prediction problem, following [2].
Region [2]. There are many definitions of a region in terms of different scales and semantic meanings. Most studies partition a city into an I × J grid map based on longitude and latitude, where each grid cell is called a region, as shown in Figure 1.
Inflow/Outflow [2]. Let $P$ be a collection of trajectories at the $t$th time interval. For a grid cell $(i, j)$ that lies at the $i$th row and the $j$th column, the inflow and outflow of the crowds at the time interval $t$ are defined respectively as

$$x_t^{in,i,j} = \sum_{Tr \in P} \left| \{ k > 1 \mid g_{k-1} \notin (i,j) \wedge g_k \in (i,j) \} \right|$$

$$x_t^{out,i,j} = \sum_{Tr \in P} \left| \{ k \geq 1 \mid g_k \in (i,j) \wedge g_{k+1} \notin (i,j) \} \right|$$

where $Tr: g_1 \rightarrow g_2 \rightarrow \cdots \rightarrow g_{|Tr|}$ is a trajectory in $P$, and $g_k$ is the geospatial coordinate; $g_k \in (i,j)$ means the point $g_k$ lies within grid $(i,j)$, and vice versa; $|\cdot|$ denotes the cardinality of a set. At the $t$th time interval, inflow and outflow in all $I \times J$ regions can be denoted as a tensor $X_t \in \mathbb{R}^{2 \times I \times J}$, where $(X_t)_{0,i,j} = x_t^{in,i,j}$ and $(X_t)_{1,i,j} = x_t^{out,i,j}$.
Prediction Target [2]. Given the historical observations $\{X_t \mid t = 0, \ldots, n-1\}$, predict $X_n$.
In 2016, a deep neural network-based prediction model called DeepST was proposed, which can capture spatial and temporal properties to predict citywide crowd flows [2]. The architecture of DeepST contains two parts: a time sequence part and an external factors part. In the historical observations, the time series exhibits temporal closeness, period and seasonal trend properties. The external factors carry information related to crowd flows prediction, e.g., day of the week, weekday/weekend and meteorological conditions. Convolution layers are employed to capture the temporal closeness, period and seasonal trend properties, and their outputs are fused and then passed through three sequential convolutional layers. Finally, this result is fused with the output of the external factors, captured by fully-connected layers, to obtain the prediction target $X_n$. The authors also built a real-time flow forecasting system (called UrbanFlow) based on the DeepST model. In 2017 and 2018, a novel deep learning-based model, ST-ResNet, was presented, as shown in Figure 6. The city is partitioned with a grid-based method to forecast the crowd flows in each and every region of a city [43, 45].
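The inflow/outflow definition of [2] given earlier in this subsection amounts to counting cell-boundary crossings, which can be sketched as follows. The sketch assumes that each trajectory in the time interval has already been mapped to a sequence of grid cells (row, column); the function name and data layout are illustrative.

```python
def inflow_outflow(trajectories, n_rows, n_cols):
    """Count crossings into and out of each cell for one time interval.

    trajectories: list of trajectories, each a list of (row, col) cells.
    Returns [inflow, outflow], i.e. a 2 x I x J nested list.
    """
    inflow = [[0] * n_cols for _ in range(n_rows)]
    outflow = [[0] * n_cols for _ in range(n_rows)]
    for trajectory in trajectories:
        for prev, cur in zip(trajectory, trajectory[1:]):
            if prev != cur:
                # Leaving prev counts as outflow there, entering cur as inflow.
                outflow[prev[0]][prev[1]] += 1
                inflow[cur[0]][cur[1]] += 1
    return [inflow, outflow]


paths = [[(0, 0), (0, 0), (0, 1), (1, 1)], [(1, 1), (0, 1)]]
x_t = inflow_outflow(paths, 2, 2)
print(x_t[0])  # inflow:  [[0, 2], [0, 1]]
print(x_t[1])  # outflow: [[1, 1], [0, 1]]
```

Stacking the resulting tensors over consecutive intervals gives the image-like sequence that convolutional models such as DeepST take as input.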
The model outperforms other classical time-series and deep learning prediction methods. Much like the DeepST model, ST-ResNet adds residual units and uses only one fusion component. The fusion step uses a parameter-matrix-based fusion method,

$$X_{Res} = W_c \circ X_c + W_p \circ X_p + W_q \circ X_q$$

where $X_c$, $X_p$ and $X_q$ denote the outputs of the closeness, period and trend components, $\circ$ is the Hadamard product (i.e., element-wise multiplication), and $W_c$, $W_p$ and $W_q$ are the learnable parameters that adjust the degrees affected by closeness, period and trend, respectively. The fused output of the three components is then merged with the output of the external component. To model citywide dependencies, ST-ResNet employs residual learning, which has been demonstrated to be very effective for training very deep neural networks of over 1000 layers [43, 63]. The residual unit used in ST-ResNet is shown in Figure 7. However, in the short-term crowd flows prediction problem, the residual network structure of ST-ResNet can be removed to obtain better performance, because it is not necessary to capture the spatial dependencies of regions far away from the target region. ST-ResNet also needs a large amount of data for training, so it does not perform well when little data is available [44]. Some researchers choose the ST-ResNet model as a baseline for further urban crowd flows prediction tasks [43, 44], because it outperformed earlier deep learning-based methods.
Figure 6: The architecture of ST-ResNet. (The original ST-ResNet architecture is available in [43]).
Figure 7: Residual Unit. (The original Residual Unit is available in [43]).
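The parameter-matrix-based fusion step can be sketched as a weighted element-wise sum of the closeness, period and trend outputs. The sketch uses plain nested lists for a single I × J channel and fixed weights; in ST-ResNet the weight matrices are learned during training, and all values below are hypothetical.

```python
def hadamard(weights, values):
    """Element-wise product of two equally shaped 2-D lists."""
    return [[w * v for w, v in zip(w_row, v_row)]
            for w_row, v_row in zip(weights, values)]


def parametric_fusion(x_c, x_p, x_q, w_c, w_p, w_q):
    """Fuse closeness, period and trend outputs with per-cell weights."""
    parts = [hadamard(w_c, x_c), hadamard(w_p, x_p), hadamard(w_q, x_q)]
    n_rows, n_cols = len(x_c), len(x_c[0])
    return [[sum(part[i][j] for part in parts) for j in range(n_cols)]
            for i in range(n_rows)]


# Hypothetical 1 x 2 region map: the first cell is dominated by closeness,
# the second by trend.
fused = parametric_fusion(
    x_c=[[1.0, 2.0]], x_p=[[3.0, 4.0]], x_q=[[5.0, 6.0]],
    w_c=[[1.0, 0.0]], w_p=[[0.5, 0.5]], w_q=[[0.0, 1.0]],
)
print(fused)  # [[2.5, 8.0]]
```

Having one weight per cell and per component lets each region rely on closeness, period and trend to a different degree, which a single scalar weight per component could not express.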
4.3.2. Traffic flow prediction
Traffic flow prediction plays an important role in urban flow forecasting and is an active research field [64, 65, 66, 67], because taxi GPS data are relatively easy to access and represent citizens’ traffic behaviors without the limitation of fixed lines. For example, Jiang et al. [68] developed a deep learning framework that transforms geospatial data into images and uses a Convolutional Neural Network (CNN) and residual networks for traffic prediction. Wu et al. [39] proposed a novel model that combines a CNN and an RNN to capture spatial-temporal features and learns the importance of past traffic flow using an attention mechanism. This method makes full use of the temporal and spatial characteristics of traffic flow to improve prediction performance. Since traffic flow is affected by complex factors such as temporal relationships, spatial correlation, and other external factors (weather and events), predicting it accurately is challenging. Zhang et al. [69] studied a multitask deep learning framework to simultaneously forecast the node flow and edge flow in spatial-temporal networks. The model outperforms 11 baselines and shows strong performance in traffic flow forecasting.
4.3.3. Public transit flow prediction
As an important component of the urban public transportation system, the metro has been rapidly deployed in cities because of its large capacity, high speed and high reliability, and it has attracted a large number of passengers [70, 71, 72, 73]. Therefore, accurate metro passenger flow forecasting not only helps the metro management department manage passenger travel demand and optimize metro dispatch, but also helps passengers choose their travel time and travel mode. Liu et al. [70] proposed an end-to-end deep learning model, named DeepPF, to forecast metro inbound and outbound passenger flow. They combine all the influencing factors, such as temporal dependencies, spatial characteristics, metro operation properties and external environment factors, to predict short-term metro passenger flow. The experiments show that the model has good prediction performance and can be applied to general conditions. This indicates that it is feasible and effective to use deep learning methods to capture the temporal and spatial characteristics of metro data and predict metro passenger flow. Ma et al. [71] analyzed the spatial-temporal characteristics of metro data and then developed a parallel framework comprising a convolutional neural network (CNN) and a bi-directional long short-term memory network (BLSTM) to forecast metro passenger flow. The model was evaluated on Beijing metro network data and outperformed traditional statistical methods. In recent years, bike-sharing has become more popular in urban transportation because it provides a flexible transport mode and reduces greenhouse gas emissions. Chai et al. [74] proposed a novel multi-graph CNN method to predict bike flow at the station level. This method offers a novel graph neural network perspective for studying traffic prediction.
4.4. Reinforcement learning-based methods
Reinforcement learning methods are usually applied to traffic flow optimization problems. Traffic congestion is a difficult urban problem that may lead to travel delays, increased fuel consumption and air pollution. Hence, it is necessary to optimize traffic flow, apply traffic control in advance, and alleviate traffic congestion. To tackle these challenges, Erwin et al. [75] proposed a reinforcement learning-based method that optimizes traffic flow by using Q-learning to learn policies dictating the maximum driving speed allowed on a highway. The model takes traffic prediction into account and controls traffic flow proactively, which can further help alleviate traffic congestion. Another example is the coordinated passenger inflow control problem on an urban rail transit line in Shanghai. In order to reduce the frequency of metro passengers being stranded and to ensure public safety, Jiang et al. [76] presented a reinforcement learning-based method to study the metro passenger inflow control strategy in peak hours. These methods aim at finding better optimization strategies using reinforcement learning algorithms, in order to further optimize current traffic flow and improve the efficiency of intelligent transportation systems. However, few articles combine deep learning and reinforcement learning to predict and optimize traffic flow, which makes this a promising research direction for the future.
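The Q-learning update underlying such speed-limit policies can be sketched in tabular form. The state names, speed-limit actions and reward below are hypothetical and only illustrate the update rule; they do not reproduce the state or reward design of [75].

```python
def q_update(q_table, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.9):
    """One tabular Q-learning step:
    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a)).
    """
    best_next = max(q_table.get((next_state, a), 0.0) for a in actions)
    old_value = q_table.get((state, action), 0.0)
    new_value = old_value + alpha * (reward + gamma * best_next - old_value)
    q_table[(state, action)] = new_value
    return new_value


# Hypothetical example: actions are speed limits in km/h, states are coarse
# congestion levels, and the reward would come from a traffic simulator.
speed_limits = [80, 100]
q = {("free", 100): 2.0}
q_update(q, "congested", 80, reward=1.0, next_state="free",
         actions=speed_limits, alpha=0.5, gamma=0.5)
print(q[("congested", 80)])  # 1.0
```

Repeating this update over many simulated episodes yields a table from which the controller picks, for each congestion state, the speed limit with the highest estimated value.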
4.5. Transfer learning-based methods
Using transfer learning methods to predict crowd flows is a novel direction for urban flows forecasting in a data-scarce city [77, 78]. A novel network for spatial-temporal prediction with region representation was developed in [77], as shown in Figure 8. The model’s objective is to minimize the squared error between the predicted value ŷt and the real value yt.
This method focuses on finding inter-city region pairs that share similar patterns and then transfers knowledge from a data-adequate city (source city) to a data-scarce city (target city). It needs less data to predict crowd flows than other deep learning-based methods and can take full advantage of knowledge from the source city. However, the model only performs well between two cities with similar patterns, and it is not easy to find the matching function between source regions and target regions.
Figure 8: Deep Spatial-temporal Neural Network with Region Representations. (The original model is available in [77]).
With the continued development of machine learning algorithms, especially deep learning methods, and the growing number of open urban spatial-temporal datasets, many research works on urban flows forecasting have emerged in recent years. To give researchers a clear outline of research progress in the field and help them with further research, we summarize the classic and representative works on urban flows forecasting from the last five years. We list the recent works mentioned in Section 4 above in Table 2.
In Table 2, we organize these works along three dimensions: the tasks they address, the datasets they use, and the methods they apply. The abbreviations of some methods are given in Table 3. We then also list some open datasets to help researchers dig further into the field.
From Table 2, we can find that deep learning is still frequently used to solve many urban flows forecast tasks, such as crowd flow prediction and traffic flow prediction. Because more public transit data are becoming available for research under privacy protection agreements, there are also some public transit flow forecast papers using advanced machine learning algorithms. The works listed in Table 2 also give researchers good examples for dealing with urban spatial-temporal flow forecast problems. Learning from these previous works, we can conclude that you can use statistics-based methods and traditional machine learning algorithms when addressing short-term traffic flow prediction tasks. If you want to capture the temporal dependency and spatial correlation of urban flows simultaneously and further improve forecast performance, you can try deep learning-based methods, or even a hybrid deep learning-based method; these methods have shown excellent results in spatial-temporal flow forecast tasks. Recently, reinforcement learning-based methods have often been applied to urban traffic flow optimization problems. In a few papers, we also see methods that combine traffic flow prediction using deep learning with traffic flow optimization using reinforcement learning, which is a promising direction for urban flows study. Finally, if you have very little traffic data at hand, consider using a transfer learning approach. Compared with data-hungry deep learning methods, it requires only a small amount of data from the target domain because it can take advantage of the knowledge learned from the source domain.
In this article, we have reviewed how spatial-temporal data can be used to forecast urban flows. In dealing with these datasets and choosing suitable analysis methods, some challenges still remain. The challenges of urban spatial-temporal flows prediction are three-fold.
Table 2: A summary of previous works about urban flows prediction.
Dealing with multiple influence factors. When we design our model, we may face the following questions: what are the main factors affecting urban spatial-temporal flows, and what are the minor but necessary factors for our given task? For example, in the crowd flows prediction problem, we need to consider spatial and temporal factors, as well as other factors such as weather, holidays, social events, traffic accidents and traffic jams. The main factors affecting urban flows discussed in the Introduction of this paper may provide some inspiration.
Finding suitable data fusion methods. To improve our model's performance, we often need more datasets from various domains (traffic, weather, social) to train the deep learning-based model. However, we then face two problems: how to choose suitable datasets for our tasks and how to fuse these heterogeneous data as model input. Hence, investigating cross-domain data fusion methods is non-trivial but would speed up the research.
Limitations in data sparsity. In practical scenarios, some data are missing or corrupted due to sensor errors, transmission failures and storage loss. These incomplete or inaccurate flow data reduce prediction accuracy. We can fill the missing values using missing value filling methods. Transfer learning may also be a good choice for dealing with the data sparsity challenge.
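As one simple example of missing value filling, the sketch below fills gaps in a single flow series by linear interpolation between the nearest observed readings. Linear interpolation is chosen here only for illustration and is not a method prescribed by this survey; the function name fill_missing_linear and the use of None to mark missing readings are assumptions. Methods that also exploit spatial neighbours, such as ST-MVL [5], are generally more suitable for geo-sensory data.

```python
def fill_missing_linear(series):
    """Fill None entries by linear interpolation between observed neighbours.

    Gaps at the start or end of the series copy the nearest observed value.
    """
    filled = list(series)
    known = [i for i, value in enumerate(filled) if value is not None]
    if not known:
        raise ValueError("series has no observed values")

    # Leading and trailing gaps: repeat the first / last observed value.
    for i in range(known[0]):
        filled[i] = filled[known[0]]
    for i in range(known[-1] + 1, len(filled)):
        filled[i] = filled[known[-1]]

    # Interior gaps: weight the two surrounding observations by distance.
    for left, right in zip(known, known[1:]):
        span = right - left
        for i in range(left + 1, right):
            weight = (i - left) / span
            filled[i] = filled[left] * (1 - weight) + filled[right] * weight
    return filled
```

For example, fill_missing_linear([None, 10, None, 30, None]) returns [10, 10, 20.0, 30, 30].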
Table 3: Abbreviations used in Table 2 and their corresponding terminologies (in order which they appear in Table 2).
To help other researchers participate further and produce more valuable work, we collect and organize several related open datasets on the urban spatial-temporal forecast topic. The links to these datasets are:
• NYC taxi data: https://www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page .
• NYC bike data: https://www.citibikenyc.com/system-data .
• San Francisco taxi data: https://crawdad.org/crawdad/epfl/mobility/20090224/ .
• Weather and events data: https://www.wunderground.com/ .
• UK traffic flow datasets: http://data.gov.uk/dataset/highways-england-network-journey-time-and-traffic-flow-data .
• Traffic flow data is available from the Illinois Department of Transportation: http://www.travelmidwest.com/ .
• Weather and climate data: https://www.ncdc.noaa.gov/data-access .
• Chicago bike sharing data: https://www.divvybikes.com/system-data .
• NSW POI data: https://sdi.nsw.gov.au/catalog/search/resource/details.page?uuid=%7BC41F6FE5-1C56-4556-9EC6-EC9BD7094BBB%7D .
• Road network data: http://networkrepository.com/road.php .
In this paper, we attempt to provide an overview of the field of urban spatial-temporal flows prediction during 2014-2019, which plays an increasingly significant role in urban computing research and is closely related to traffic management, land use and public safety. However, we are only able to cover a small fraction of the work in this rapidly growing area of research. We still hope that this paper can provide a ladder for further research on urban spatial-temporal flows prediction. Because most methods for the urban flows prediction problem are data-driven, we need to pay more attention to the data. This paper helps the reader identify problems with given spatial-temporal datasets and choose suitable preprocessing or prediction methods for urban flows prediction problems. Although the field of urban flows prediction has achieved much, many open challenges remain, such as how to fuse data from multiple sources simultaneously, how to decide which influence factors are key for a given problem, and how to effectively solve the data sparsity problem. Finally, we hope to see more generative and practical urban flows prediction methods for solving urban challenges.
This research was supported by the National Natural Science Foundation of China (No. 61773324).
[1] Y. Zheng, Urban Computing, MIT Press, 2019.
[2] J. Zhang, Y. Zheng, D. Qi, R. Li, X. Yi, Dnn-based prediction model for spatio-temporal data, in: Proceedings of the 24th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM, 2016, p. 92.
[3] D. Lee, D. Kulic, Y. Nakamura, Missing motion data recovery using factorial hidden markov models, in: 2008 IEEE International Conference on Robotics and Automation, IEEE, 2008, pp. 1722–1728.
[4] M. X. Hoang, Y. Zheng, A. K. Singh, Fccf: Forecasting citywide crowd flows based on big data, in: Proceedings of the 24th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM, 2016, p. 6.
[5] X. Yi, Y. Zheng, J. Zhang, T. Li, St-mvl: Filling missing values in geo-sensory time series data, in: Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 2016, pp. 2704–2710.
[6] Y. Zheng, F. Liu, H.-P. Hsieh, U-air: When urban air quality inference meets big data, in: Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, 2013, pp. 1436–1444.
[7] M. Beckmann, N. F. Ebecken, B. S. P. de Lima, A knn undersampling approach for data balancing, Journal of Intelligent Learning Systems and Applications 7 (04) (2015) 104.
[8] R. Wang, S. Kwong, Y. Jia, Z. Huang, L. Wu, Mutual information based k-labelsets ensemble for multi-label classification, in: 2018 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), IEEE, 2018, pp. 1–7.
[9] J. Gong, H. Kim, Rhsboost: Improving classification performance in imbalance data, Computational Statistics & Data Analysis 111 (2017) 1–13.
[10] A. Khosravi, S. Nahavandi, D. Creighton, A. F. Atiya, Comprehensive review of neural network-based prediction intervals and new advances, IEEE Transactions on Neural Networks 22 (9) (2011) 1341–1356.
[11] E. Begoli, T. Bhattacharya, D. Kusnezov, The need for uncertainty quantification in machine-assisted medical decision making, Nature Machine Intelligence 1 (1) (2019) 20.
[12] H. Wang, D.-Y. Yeung, Towards bayesian deep learning: A framework and some existing methods, IEEE Transactions on Knowledge and Data Engineering 28 (12) (2016) 3395–3408.
[13] B. Lakshminarayanan, A. Pritzel, C. Blundell, Simple and scalable predictive uncertainty estimation using deep ensembles, in: Advances in Neural Information Processing Systems, 2017, pp. 6402–6413.
[14] S. S. Rangapuram, M. W. Seeger, J. Gasthaus, L. Stella, Y. Wang, T. Januschowski, Deep state space models for time series forecasting, in: Advances in Neural Information Processing Systems, 2018, pp. 7796–7805.
[15] G. Atluri, A. Karpatne, V. Kumar, Spatio-temporal data mining: A survey of problems and methods, ACM Computing Surveys 51 (4) (2018) 1–37.
[16] Y. Zheng, Trajectory data mining: An overview, ACM Transactions on Intelligent Systems and Technology (TIST) 6 (3) (2015) 1–41.
[17] T. M. T. Do, O. Dousse, M. Miettinen, D. Gatica-Perez, A probabilistic kernel method for human mobility prediction with smartphones, Pervasive and Mobile Computing 20 (2015) 13–28.
[18] D. H. Douglas, T. K. Peucker, Algorithms for the reduction of the number of points required to represent a digitized line or its caricature, Classics in Cartography: Reflections on Influential Articles from Cartographica (2011) 15–28.
[19] E. Keogh, S. Chu, D. Hart, M. Pazzani, An online algorithm for segmenting time series, in: Proceedings of the IEEE International Conference on Data Mining, IEEE, 2001, pp. 289–296.
[20] N. Meratnia, A. Rolf, Spatiotemporal compression techniques for moving point objects, in: Proceedings of the International Conference on Extending Database Technology, Springer, 2004, pp. 765–782.
[21] J. Yuan, Y. Zheng, X. Xie, G. Sun, T-drive: Enhancing driving directions with taxi drivers’ intelligence, IEEE Transactions on Knowledge and Data Engineering 25 (1) (2012) 220–232.
[22] Y. Zheng, Y. Chen, Q. Li, X. Xie, W.-Y. Ma, Understanding transportation modes based on gps data for web applications, ACM Transactions on the Web 4 (1) (2010) 1–36.
[23] Y. Zheng, L. Liu, L. Wang, X. Xie, Learning transportation mode from raw gps data for geographic applications on the web, in: Proceedings of the 17th International Conference on World Wide Web, ACM, 2008, pp. 247–256.
[24] J. S. Greenfeld, Matching gps observations to locations on a digital map, in: Proceedings of the 81st Annual Meeting of the Transportation Research Board, 2002, pp. 164–173.
[25] W. Chen, M. Yu, Z. Li, Y. Chen, Integrated vehicle navigation system for urban applications, in: Proceedings of International Conference Global Navigation Satellite System, 2003, pp. 15–22.
[26] H. Yin, O. Wolfson, A weight-based map matching method in moving objects databases, in: Proceedings of the International Conference on Scientific & Statistical Database Management, IEEE, 2004, pp. 437–438.
[27] M. A. Quddus, R. B. Noland, W. Y. Ochieng, A high accuracy fuzzy logic based map matching algorithm for road transport, Journal of Intelligent Transportation Systems 10 (3) (2006) 103–115.
[28] O. Pink, B. Hummel, A statistical approach to map matching using road network geometry, topology and vehicular motion constraints, in: Proceedings of the 11th International IEEE Conference on Intelligent Transportation, IEEE, 2008, pp. 862–867.
[29] P. Newson, J. Krumm, Hidden markov map matching through noise and sparseness, in: Proceedings of the 17th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM, 2009, pp. 336–343.
[30] J. Yuan, Y. Zheng, C. Zhang, X. Xie, G.-Z. Sun, An interactive-voting based map matching algorithm, in: Proceedings of the 2010 Eleventh International Conference on Mobile Data Management, IEEE Computer Society, 2010, pp. 43–52.
[31] Y. Lou, C. Zhang, Y. Zheng, X. Xie, W. Wang, Y. Huang, Map-matching for low-sampling-rate gps trajectories, in: Proceedings of the 17th ACM SIGSPATIAL International Conference on Advances In Geographic Information Systems, ACM, 2009, pp. 352–361.
[32] H. Alt, A. Efrat, G. Rote, C. Wenk, Matching planar maps, Journal of Algorithms 49 (2) (2003) 262–283.
[33] S. Brakatsoulas, D. Pfoser, R. Salas, C. Wenk, On map-matching vehicle tracking data, in: Proceedings of the 31st International Conference on Very Large Data Bases, VLDB Endowment, 2005, pp. 853–864.
[34] X. Yi, J. Zhang, Z. Wang, T. Li, Y. Zheng, Deep distributed fusion network for air quality prediction, in: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2018, pp. 965–973.
[35] X. Li, L. Peng, Y. Hu, J. Shao, T. Chi, Deep learning architecture for air quality predictions, Environmental Science and Pollution Research 23 (22) (2016) 22408–22417.
[36] X. Li, L. Peng, X. Yao, S. Cui, Y. Hu, C. You, T. Chi, Long short-term memory neural network for air pollutant concentration predictions: Method development and evaluation, Environmental Pollution 231 (2017) 997–1004.
[37] S. Du, T. Li, X. Gong, Z. Yu, Y. Huang, S.-J. Horng, A hybrid method for traffic flow forecasting using multimodal deep learning, arXiv preprint arXiv:1803.02099.
[38] N. G. Polson, V. O. Sokolov, Deep learning for short-term traffic flow prediction, Transportation Research Part C: Emerging Technologies 79 (2017) 1–17.
[39] Y. Wu, H. Tan, L. Qin, B. Ran, Z. Jiang, A hybrid deep learning based traffic flow prediction method and its understanding, Transportation Research Part C: Emerging Technologies 90 (2018) 166–180.
[40] D. Wang, Y. Yang, S. Ning, Deepstcl: A deep spatio-temporal convlstm for travel demand prediction, in: 2018 International Joint Conference on Neural Networks, IEEE, 2018, pp. 1–8.
[41] J. Ke, H. Zheng, H. Yang, X. M. Chen, Short-term forecasting of passenger demand under on-demand ride services: A spatio-temporal deep learning approach, Transportation Research Part C: Emerging Technologies 85 (2017) 591–608.
[42] H. Yao, F. Wu, J. Ke, X. Tang, Y. Jia, S. Lu, P. Gong, J. Ye, Deep multi-view spatial-temporal network for taxi demand prediction, arXiv preprint arXiv:1802.08714.
[43] J. Zhang, Y. Zheng, D. Qi, R. Li, X. Yi, T. Li, Predicting citywide crowd flows using deep spatio-temporal residual networks, Artificial Intelligence 259 (2018) 147–166.
[44] W. Jin, Y. Lin, Z. Wu, H. Wan, Spatio-temporal recurrent convolutional networks for citywide short-term crowd flows prediction, in: Proceedings of the 2nd International Conference on Compute and Data Analysis, ACM, 2018, pp. 28–35.
[45] J. Zhang, Y. Zheng, D. Qi, Deep spatio-temporal residual networks for citywide crowd flows prediction, in: Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017, pp. 1655–1661.
[46] S. E. Said, D. A. Dickey, Testing for unit roots in autoregressive-moving average models of unknown order, Biometrika 71 (3) (1984) 599–607.
[47] B. Williams, P. Durvasula, D. Brown, Urban freeway traffic flow prediction: application of seasonal autoregressive integrated moving average and exponential smoothing models, Transportation Research Record: Journal of the Transportation Research Board (1644) (1998) 132–141.
[48] R. J. Hyndman, Y. Khandakar, Automatic time series for forecasting: The forecast package for R, Monash University, Department of Econometrics and Business Statistics, 2007.
[49] R. J. Hyndman, G. Athanasopoulos, Forecasting: Principles and practice, Otexts.com.
[50] N. Zhang, Y. Zhang, H. Lu, Seasonal autoregressive integrated moving average and support vector machine models: prediction of short-term traffic flow on freeways, Transportation Research Record 2215 (1) (2011) 85–92.
[51] M. Lippi, M. Bertini, P. Frasconi, Short-term traffic flow forecasting: An experimental comparison of time-series analysis and supervised learning, IEEE Transactions on Intelligent Transportation Systems 14 (2) (2013) 871–882.
[52] F. G. Habtemichael, M. Cetin, Short-term traffic flow rate forecasting based on identifying similar traffic patterns, Transportation research Part C: emerging technologies 66 (2016) 61–78.
[53] Z. Zhu, B. Peng, C. Xiong, L. Zhang, Short-term traffic flow prediction with linear conditional gaussian bayesian network, Journal of Advanced Transportation 50 (6) (2016) 1111–1123.
[54] Y. Akagi, T. Nishimura, T. Kurashima, H. Toda, A fast and accurate method for estimating people flow from spatiotemporal population data, in: IJCAI, 2018, pp. 3293–3300.
[55] Z. Liu, P. Zhou, Z. Li, M. Li, Think like a graph: Real-time traffic estimation at city-scale, IEEE Transactions on Mobile Computing PP (99) (2018) 1–1.
[56] Y. LeCun, L. Bottou, Y. Bengio, P. Haffner, Gradient-based learning applied to document recognition, Proceedings of the IEEE 86 (11) (1998) 2278–2324.
[57] A. Krizhevsky, I. Sutskever, G. E. Hinton, Imagenet classification with deep convolutional neural networks, in: Advances in Neural Information Processing Systems, 2012, pp. 1097–1105.
[58] R. J. Williams, D. Zipser, A learning algorithm for continually running fully recurrent neural networks, Neural Computation 1 (2) (1989) 270–280.
[59] I. Sutskever, O. Vinyals, Q. V. Le, Sequence to sequence learning with neural networks, in: Advances in Neural Information Processing Systems, 2014, pp. 3104–3112.
[60] S. Jiang, G. A. Fiore, Y. Yang, J. Ferreira Jr, E. Frazzoli, M. C. González, A review of urban computing for mobile phone traces: Current methods, challenges and opportunities, in: Proceedings of the 2nd ACM SIGKDD International Workshop on Urban Computing, ACM, 2013, pp. 1–9.
[61] F. Calabrese, L. Ferrari, V. D. Blondel, Urban sensing using mobile phone network data: a survey of research, ACM Computing Surveys 47 (2) (2015) 1–20.
[62] Y. Li, Y. Zheng, H. Zhang, L. Chen, Traffic prediction in a bike-sharing system, in: Proceedings of the 23rd SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM, 2015, pp. 33:1–33:10.
[63] K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, in: Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778.
[64] J. A. Deri, F. Franchetti, J. M. Moura, Big data computation of taxi movement in new york city, in: 2016 IEEE International Conference on Big Data (Big Data), IEEE, 2016, pp. 2616–2625.
[65] X. Zhan, Y. Zheng, X. Yi, S. V. Ukkusuri, Citywide traffic volume estimation using trajectory data, IEEE Transactions on Knowledge and Data Engineering 29 (2) (2017) 272–285.
[66] Z. Zhao, W. Chen, X. Wu, P. C. Chen, J. Liu, Lstm network: a deep learning approach for short-term traffic forecast, IET Intelligent Transport Systems 11 (2) (2017) 68–75.
[67] Z. Duan, Y. Yang, K. Zhang, Y. Ni, S. Bajgain, Improved deep hybrid networks for urban traffic flow prediction using trajectory data, IEEE Access 6 (2018) 31820–31827.
[68] W. Jiang, L. Zhang, Geospatial data to images: A deep-learning framework for traffic forecasting, Tsinghua Science and Technology 24 (1) (2019) 52–64.
[69] J. Zhang, Y. Zheng, J. Sun, D. Qi, Flow prediction in spatio-temporal networks based on multitask deep learning, IEEE Transactions on Knowledge and Data Engineering (2019) 1–1.
[70] Y. Liu, Z. Liu, R. Jia, Deeppf: A deep learning based architecture for metro passenger flow prediction, Transportation Research Part C: Emerging Technologies 101 (2019) 18–34.
[71] X. Ma, J. Zhang, B. Du, C. Ding, L. Sun, Parallel architecture of convolutional bi-directional lstm neural networks for network-wide metro ridership prediction, IEEE Transactions on Intelligent Transportation Systems (2018) 1–11.
[72] Y. Ning, Y. Huang, J. Li, Q. Liu, D. Yang, W. Zheng, H. Liu, St-drn: Deep residual networks for spatio-temporal metro stations crowd flows forecast, in: Proceedings of 2018 International Joint Conference on Neural Networks (IJCNN), IEEE, 2018, pp. 1–8.
[73] D. Nam, H. Kim, J. Cho, R. Jayakrishnan, A model based on deep learning for predicting travel mode choice, in: Proceedings of the Transportation Research Board 96th Annual Meeting Transportation Research Board, Washington, DC, USA, 2017, pp. 8–12.
[74] D. Chai, L. Wang, Q. Yang, Bike flow prediction with multi-graph convolutional networks, in: Proceedings of the 26th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM, 2018, pp. 397–400.
[75] E. Walraven, M. T. Spaan, B. Bakker, Traffic flow optimization: A reinforcement learning approach, Engineering Applications of Artificial Intelligence 52 (2016) 203–212.
[76] Z. Jiang, W. Fan, W. Liu, B. Zhu, J. Gu, Reinforcement learning approach for coordinated passenger inflow control of urban rail transit in peak hours, Transportation Research Part C: Emerging Technologies 88 (2018) 1–16.
[77] L. Wang, X. Geng, X. Ma, F. Liu, Q. Yang, Crowd flow prediction by deep spatio-temporal transfer learning, arXiv preprint arXiv:1802.00386.
[78] B. Wang, Z. Yan, J. Lu, G. Zhang, T. Li, Road traffic flow prediction using deep transfer learning, in: Data Science and Knowledge Engineering for Sensing Decision Support: Proceedings of the 13th International FLINS Conference (FLINS 2018), Vol. 11, World Scientific, 2018, pp. 331–338.