Grey Models for Short-Term Queue Length Predictions for Adaptive Traffic Signal Control

2019·Arxiv

Abstract

Traffic congestion at a signalized intersection greatly reduces the travel time reliability in urban areas. Adaptive signal control system (ASCS) is the most advanced traffic signal technology that regulates the signal phasing and timings considering the traffic patterns in real-time in order to reduce traffic congestion. Real-time prediction of traffic queue length can be used to adjust the signal phasing and timings for different traffic movements at a signalized intersection with ASCS. The accuracy of a queue length prediction model varies with many factors, such as the stochastic nature of the vehicle arrival rates at an intersection, time of the day, weather and driver characteristics. In addition, accurate queue length prediction for multilane, undersaturated and saturated traffic scenarios at signalized intersections is challenging. Thus, the objective of this study is to develop short-term queue length prediction models for signalized intersections that can be leveraged by adaptive traffic signal control systems using four variations of Grey systems: (i) the first order single variable Grey model (GM(1,1)); (ii) GM(1,1) with Fourier error corrections (EGM); (iii) the Grey Verhulst model (GVM), and (iv) GVM with Fourier error corrections (EGVM). The advantage of the Grey models is that they facilitate fast processing, as they do not require the large amount of data that artificial intelligence models need, and that they are able to adapt to stochastic changes, unlike statistical models. We have conducted a case study using queue length data from five intersections with adaptive traffic signal control on a calibrated roadway network in Lexington, South Carolina. Grey models were compared with linear and nonlinear time series models and a long short-term memory (LSTM) neural network. Based on our analyses, we found that EGVM reduces the prediction error over the closest competing models (i.e., LSTM and Additive Autoregressive (AAR) time series models) in predicting average and maximum queue lengths by 40% and 42%, respectively, in terms of Root Mean Squared Error (RMSE), and 51% and 50%, respectively, in terms of Mean Absolute Error (MAE).

Keywords: Grey systems, time series, long short-term memory neural network, queue length prediction

1. Introduction

Traffic congestion at a signalized intersection negatively impacts the travel time reliability in urban areas (Qi et al. (2016), Ma et al. (2018)). Adaptive signal control systems (ASCS) are the most advanced technology that regulates the phasing as well as red, yellow and green timings considering the traffic patterns (i.e., the arrival rate of vehicles at a signalized intersection from different approaches) in real-time to reduce traffic congestion (Radin et al. (2018)). Major benefits of ASCS include: (i) real-time distribution of green timings based on the arrival rate of vehicles for all traffic movements; and (ii) reduction of travel times through intersections by ensuring progression through green signal timing window (Radin et al. (2018)).

Real-time prediction of traffic queue length can be used to adjust the green timing for different traffic movements. Existing systems mainly use inductive loop detectors to detect queue lengths. Inductive loop detectors are installed on the roadway pavement (Tiaprasert et al. (2015)). There are several disadvantages of the loop detector-based sensors: (i) low coverage area (only cover a small length of a traffic lane); (ii) detection susceptibility to environmental conditions; and (iii) high cost for deployment and maintenance. Emerging connected vehicle technology can overcome the challenges of existing queue length estimation methods by providing real-time information to the traffic signal control using Vehicle-to-Infrastructure (V2I) wireless communication (Tiaprasert et al. (2015)).

In a connected vehicle environment, information on the arrival rate of vehicles for all movements at a signalized intersection is available via V2I communication. However, these arrival rates are stochastic in nature and depend on different factors, such as the time of the day, weather and driving characteristics (Yang et al. (2018)). These factors adversely affect the performance of queue length prediction models and reduce the prediction accuracy significantly. Moreover, accurate queue length predictions for multilane scenarios and robustness of the predictions for both undersaturated and saturated roadway traffic scenarios at a signalized intersection are challenging (Zhan et al. (2015)).

Recent studies use statistical and data-driven models for predicting queue length at signalized intersections (Tiaprasert et al. (2015), Comert (2016)). Data-driven models, such as recurrent neural network (RNN) based time series models, require a large amount of training data covering different roadway traffic scenarios (such as single lane and multilane roadways) to achieve a high accuracy, which also increases the computational resources needed for real-time applications. The advantage of RNN models is that, after training, they can capture the stochastic roadway traffic pattern. On the other hand, although statistical models do not require a large amount of data for training, their parameters need to be re-estimated as the traffic patterns change, which reduces the applicability of statistical models for real-world applications (Comert (2016)).

Recently, Grey models (GM) have become popular for traffic data prediction, as these models do not assume any underlying distribution for the data generation process, are able to handle autocorrelated observations, and have a low computational cost (Bezuglov and Comert (2016)). Furthermore, GM requires a small sample size to update its parameters (as few as four data points) (Liu et al. (2010)). A study by An et al. showed that the accuracy of the first order single variable Grey Model (GM(1,1)) is higher than that of back propagation neural network (NN) and radial basis function NN models in predicting monthly average daily traffic volume (An et al. (2012)). Similarly, Gao et al. found that the GM(1,1) prediction accuracy for average hourly traffic volumes surpasses the performance of support vector machines (SVM) and artificial neural networks (Gao et al. (2010)). However, there is no study that uses Grey models for predicting traffic queue length using connected vehicle data for ASCS. In addition, the advantages of the Grey models are that they do not require a large amount of data and are able to adapt to stochastic changes in the arrival rate of vehicles at a signalized intersection.

The objective of this study is to develop a robust short-term queue length prediction model for adaptive traffic signal control systems using four variations of Grey Systems: (i) the basic Grey model (GM(1,1)); (ii) GM(1,1) with Fourier error corrections (EGM); (iii) the Grey Verhulst model (GVM), and (iv) GVM with Fourier error corrections (EGVM). Grey models are evaluated using queue length data from five signalized intersections with adaptive traffic signal controls in Lexington, South Carolina. To evaluate the performance of the different variations of GMs, we compared them to existing linear and nonlinear time series prediction models, including a long short-term memory (LSTM) model.

The rest of the paper is organized as follows. Section 2 presents related work focusing on queue length estimations and predictions at signalized intersections. Section 3 focuses on the Grey models and covers GM(1,1), the Grey Verhulst model, and two variations of these methods to improve their prediction accuracy. Section 4 presents the compared time series models and detailed numerical experiments to evaluate the prediction performance of the Grey models. Finally, Section 5 summarizes the findings and addresses possible future research directions.

2. Related Work

There have been many studies focusing on queue length estimations and predictions at signalized intersections. Different studies have used different types of models and inputs for estimating or predicting queue lengths at intersections. Below we segment the literature into prediction and estimation studies.

2.1. Queue Length Prediction

In one of the most recent studies, Li et al. developed a queue length prediction model for multi-lane signalized intersections. The authors used the Lighthill-Whitham-Richards shockwave theory and Robertson’s platoon dispersion model to predict the arrival of vehicles 5 seconds in advance for each lane and integrated the predictions of different lanes using a Kalman filter. The authors achieved an average RMSE of 2.33 vehicles, MAE of 1.82 vehicles, and MAPE of 16.12% for maximum queue length prediction. However, this model does not consider several aspects of real-world traffic flow that affect queue lengths, such as lane changing, heterogeneous traffic and dynamic correction of travel times (Li et al. (2018)). Zeng et al. developed a queue length prediction model using stochastic fluid theory. The authors used the two-fluid theory for considering road traffic and congested traffic for predicting queue lengths. The average relative prediction error of the model is 24.7% for the single lane scenario and 38.2% for the multilane scenario. This model also struggles with the multilane scenario because of lane changing (Zeng et al. (2017)).

2.2. Queue Length Estimation

The studies reviewed in this subsection are limited to estimating current queue lengths, but they are also relevant to prediction because the estimation models can be leveraged to create prediction models. The estimation models can be divided into three categories: statistical, analytical, and data-driven models.

2.2.1. Statistical models

Comert developed stochastic models and formulated the analytical expressions of estimators, which were used for estimating queue length from probe vehicle data (e.g., location, time, and count). The developed models estimate cycle-to-cycle queue lengths within 5% error. However, this paper does not deal with predicting future queue lengths (Comert (2016)). Hao et al. developed seven Bayesian network models for estimating cycle-by-cycle queue lengths for seven different traffic scenarios. The input to the models is mobile traffic sensor data collected between the upstream and downstream of an intersection. Hao et al. proved that the stochastic approach at low penetration rates is more robust compared to deterministic approaches. However, this model suffers from the lack of availability of actual ground truth data, since the model predicts queue length distribution by cycle, but in the real world only a queue is observed at a certain instant (Hao et al. (2014)). Zhan et al. developed a lane-based real-time queue length estimation method using license plate data. The developed model includes a Gaussian Process based interpolation method and a car following model for reconstructing the equivalent cumulative arrival-departure curve of each lane and estimating queue lengths. The RMSE and MAE of queue length estimation are below 3.2 vehicles and 2.4 vehicles (approximately 16 m and 12 m based on average vehicle length), respectively. However, this model also has some limitations: lane changing effects are not considered and the model may infer incorrect arrival times (Zhan et al. (2015)).

2.2.2. Analytical models

Hao and Ban developed a queue length estimation method to solve the long queue problem using short vehicle trajectories from mobile sensors (Hao and Ban (2015)). The method is based on vehicle trajectory reconstruction models to estimate the missing acceleration/deceleration process. Their method was able to reduce the mean absolute error for long queue length estimation from 3.79 vehicles to 1.61 vehicles (approximately 18.95 m to 8.05 m based on average vehicle length). However, Hao and Ban do not deal with predicting future queue lengths. Moreover, this model is inapplicable for multi-lane intersections and heavily congested scenarios. It also requires the input data to be high precision and low sampling rate (Hao and Ban (2015)). Wang et al. developed a queue estimation method for signalized intersections using data from both probe vehicles and point detectors. The authors used shockwave theory to model the queue dynamics. The models showed mean absolute percent error (MAPE) of 17.09% and 12.28% for 2 different scenarios. However, this model has some limitations in estimating queue length when there is residual queue at the intersection (Wang et al. (2017)). Tiaprasert et al. proposed a queue length estimation model using connected vehicle technology for adaptive signal control (Tiaprasert et al. (2015)). Tiaprasert et al. applied a discrete wavelet transform (DWT) to queue estimation in order to make it robust against randomness in penetration ratio. The authors showed that the queue length estimation algorithm works in both undersaturated and saturated traffic conditions, which is essential for applying it in adaptive signal control (Tiaprasert et al. (2015)).

2.2.3. Data-driven models

An et al. developed a real-time queue length estimation model including a breakpoint misidentification checking process and two input-output models (upstream-based and local-based), and used event-based data as input. The model was able to improve on the generic breakpoint model, as the maximum queue length estimation MAE was found to be 10.88 m compared to 32.2 m. However, because the model needs to be trained with ground truth data for parameter estimation, it has two limitations related to parameter estimation: the validity of the parameters for different time periods and the transferability of the parameters among intersections (An et al. (2018)). Gao et al. proposed a cycle-by-cycle queue length estimation model, which is a weighted combination of two submodels: shockwave sensing and back propagation neural network sensing. The input to the model is connected vehicle data. The authors showed that their model has a higher accuracy than probability distribution models for low penetrations of connected vehicles, with 85% accuracy for low penetration rates and 95% accuracy for high penetration rates. This model also performs well for both undersaturated and saturated conditions, which is crucial for adaptive signal control. However, it suffers from the data requirements for training the back propagation neural network model (Gao et al. (2019)).

From the review of literature, it is evident that queue length prediction has some research gaps, which include accuracy for multilane scenarios and robustness for both undersaturated and saturated scenarios (to be effective for adaptive signal control). Through our development and evaluation of Grey models, we will investigate these gaps in the literature.

3. Grey Systems for Queue Length Prediction

The Grey Systems theory was developed by Ju-Long (1982) and has since become the preferred method to study and model systems in which the structure or operation mechanism is not completely known. Grey System theory has been applied mainly in the area of finance (Kayacan et al. (2010)). Its application in transportation is limited; examples include prediction of average speed, travel time, number of accidents, and pavement degradation (An et al. (2018), Gao et al. (2019), Liu et al. (2014), Bezuglov and Comert (2016)). According to the Grey Systems theory, the unknown parameters of the system are represented by discrete or continuous Grey numbers. The theory introduces a number of properties and operations on the Grey numbers, such as the core of the number, its degree of Greyness, and whitenization of the Grey number. The latter operation generally describes the preference of the number towards the range of its possible values (Liu et al. (2014)). In order to model time series, the theory suggests a family of Grey models, where the basic one is the first order Grey model with one variable, which will be referred to as GM(1,1). The principles and estimation of GM(1,1) are briefly discussed here; readers are referred to Ju-Long (1989) for additional information.

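Suppose that $X^{(0)} = (x^{(0)}(1), x^{(0)}(2), \ldots, x^{(0)}(n))$ denotes a sequence of nonnegative observations of a stochastic process (i.e., average and maximum queue lengths) and $X^{(1)} = (x^{(1)}(1), x^{(1)}(2), \ldots, x^{(1)}(n))$ an accumulation of sequence of queue lengths, computed as in Eq. (1). If the data contains missing values, sequence of identical observations, or zeros (no queues during parts of green phases), one can introduce very low Gaussian noise.

$$x^{(1)}(k) = \sum_{i=1}^{k} x^{(0)}(i), \quad k = 1, 2, \ldots, n \tag{1}$$

Then Eq. (2) defines the original form of the GM(1,1).

$$x^{(0)}(k) + a\,x^{(1)}(k) = b \tag{2}$$

Let $Z^{(1)} = (z^{(1)}(2), z^{(1)}(3), \ldots, z^{(1)}(n))$ be a mean sequence of $X^{(1)}$ calculated by formula Eq. (3) and defined for $k = 2, 3, \ldots, n$.

$$z^{(1)}(k) = \frac{1}{2}\left(x^{(1)}(k) + x^{(1)}(k-1)\right) \tag{3}$$

Eq. (4) gives the basic form of GM(1,1).

$$x^{(0)}(k) + a\,z^{(1)}(k) = b \tag{4}$$

If $\hat{a} = (a, b)^{T}$ and

$$Y = \begin{bmatrix} x^{(0)}(2) \\ x^{(0)}(3) \\ \vdots \\ x^{(0)}(n) \end{bmatrix}, \qquad B = \begin{bmatrix} -z^{(1)}(2) & 1 \\ -z^{(1)}(3) & 1 \\ \vdots & \vdots \\ -z^{(1)}(n) & 1 \end{bmatrix}$$

then, as in Liu and Lin (2006), the least squares estimate of the GM(1,1) model is $\hat{a} = (B^{T}B)^{-1}B^{T}Y$ and Eq. (5) is the whitenization equation of the GM(1,1) model.

$$\frac{dx^{(1)}}{dt} + a\,x^{(1)} = b \tag{5}$$

Suppose that $\hat{x}^{(0)}(k)$ and $\hat{x}^{(1)}(k)$ represent the time response sequence (the forecast) and the accumulated time response sequence of GM(1,1) at time k, respectively. Then, the accumulated time response sequence can be obtained by solving Eq. (5):

$$\hat{x}^{(1)}(k+1) = \left(x^{(0)}(1) - \frac{b}{a}\right)e^{-ak} + \frac{b}{a}, \quad k = 1, 2, \ldots, n \tag{6}$$

According to the definition in Eq. (1), the restored values of $\hat{x}^{(0)}(k+1)$ are calculated as $\hat{x}^{(1)}(k+1) - \hat{x}^{(1)}(k)$:

$$\hat{x}^{(0)}(k+1) = \left(1 - e^{a}\right)\left(x^{(0)}(1) - \frac{b}{a}\right)e^{-ak} \tag{7}$$

Eq. (7) gives the method to produce forecasts for all k in 2, 3, ..., n. However, for longer time series, a rolling GM(1,1) is preferred. The rolling model observes a window of a few sequential data points in the series: $x^{(0)}(k+1), x^{(0)}(k+2), \ldots, x^{(0)}(k+w)$, where $w \geq 4$ is the window size. Then, the model forecasts one or more future data points: $\hat{x}^{(0)}(k+w+1), \hat{x}^{(0)}(k+w+2)$. The process repeats for the next k value.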
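The rolling GM(1,1) procedure described above can be illustrated with a short sketch. It is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the function names (`gm11_fit`, `gm11_forecast`, `rolling_gm11`) are hypothetical, the formulas in the comments are the standard GM(1,1) expressions, and the case of a near-zero development coefficient `a` (e.g., a window of identical values, which the text handles by adding low Gaussian noise) is not treated.

```python
import numpy as np

def gm11_fit(x0):
    """Least squares estimate of (a, b) in x0(k) + a*z1(k) = b."""
    x0 = np.asarray(x0, dtype=float)
    x1 = np.cumsum(x0)                       # accumulated sequence, Eq. (1)
    z1 = 0.5 * (x1[1:] + x1[:-1])            # mean sequence for k = 2..n, Eq. (3)
    B = np.column_stack((-z1, np.ones_like(z1)))
    Y = x0[1:]
    (a, b), *_ = np.linalg.lstsq(B, Y, rcond=None)
    return a, b

def gm11_forecast(x0, steps=1):
    """Forecast the next `steps` raw values that follow the window x0."""
    x0 = np.asarray(x0, dtype=float)
    n = len(x0)
    a, b = gm11_fit(x0)
    # Restored response: x0_hat(k+1) = (1 - e^a) * (x0(1) - b/a) * e^(-a*k).
    # k = n gives the first value beyond the window.
    k = np.arange(n, n + steps)
    return (1.0 - np.exp(a)) * (x0[0] - b / a) * np.exp(-a * k)

def rolling_gm11(series, window=4):
    """One-step-ahead rolling forecasts; entry i predicts series[i + window]."""
    series = np.asarray(series, dtype=float)
    preds = [gm11_forecast(series[i:i + window], steps=1)[0]
             for i in range(len(series) - window)]
    return np.array(preds)

if __name__ == "__main__":
    demo = 10.0 * 1.2 ** np.arange(8)        # synthetic, steadily growing series
    print(rolling_gm11(demo, window=4))
    print(demo[4:])
```

On a steadily growing synthetic series such as the one in the demo, the one-step forecasts track the next observations closely, which matches the remark that GM(1,1) suits steady growth or decline.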

3.1. The Grey Verhulst Model (GVM)

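The response sequence Eq. (7) implies that the basic GM(1,1) works best when the time series demonstrates a steady growth or decline and may not perform well when the data has oscillations or saturated sigmoid sequences. For the latter case, the Grey Verhulst model (GVM) is generally used (Liu et al. (2010)). The basic form of the GVM is presented in Eq. (8).

$$x^{(0)}(k) + a\,z^{(1)}(k) = b\left(z^{(1)}(k)\right)^{2} \tag{8}$$

Eq. (9) provides the whitenization formula of GVM. In practice, it represents the assumed structure of the data generation process.

$$\frac{dx^{(1)}}{dt} + a\,x^{(1)} = b\left(x^{(1)}\right)^{2} \tag{9}$$

Similar to the GM(1,1), the least squares estimate is applied to find $\hat{a} = (a, b)^{T} = (B^{T}B)^{-1}B^{T}Y$, where $Y = [x^{(0)}(2), \ldots, x^{(0)}(n)]^{T}$ and the $k$-th row of $B$ is $[-z^{(1)}(k), \ (z^{(1)}(k))^{2}]$ for $k = 2, \ldots, n$.

The forecasts $\hat{x}^{(0)}(k+1) = \hat{x}^{(1)}(k+1) - \hat{x}^{(1)}(k)$ are calculated using Eq. (10), which follows from the solution of Eq. (9), $\hat{x}^{(1)}(k+1) = a\,x^{(0)}(1) / \left(b\,x^{(0)}(1) + (a - b\,x^{(0)}(1))e^{ak}\right)$.

$$\hat{x}^{(0)}(k+1) = \frac{a\,x^{(0)}(1)\left(a - b\,x^{(0)}(1)\right)\left(1 - e^{a}\right)e^{a(k-1)}}{\left(b\,x^{(0)}(1) + (a - b\,x^{(0)}(1))e^{ak}\right)\left(b\,x^{(0)}(1) + (a - b\,x^{(0)}(1))e^{a(k-1)}\right)} \tag{10}$$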
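A minimal sketch of the GVM forecast may help make the procedure concrete. It assumes the standard Grey Verhulst formulation: the least squares fit of $x^{(0)}(k) + a z^{(1)}(k) = b (z^{(1)}(k))^2$, the logistic accumulated response $\hat{x}^{(1)}(k+1) = a x^{(0)}(1) / (b x^{(0)}(1) + (a - b x^{(0)}(1)) e^{ak})$, and restoration by differencing. The function name `gvm_forecast` is hypothetical and the code is an illustration, not the authors' implementation.

```python
import numpy as np

def gvm_forecast(x0, steps=1):
    """Grey Verhulst forecast of the next `steps` raw values after window x0."""
    x0 = np.asarray(x0, dtype=float)
    n = len(x0)
    x1 = np.cumsum(x0)                        # accumulated sequence
    z1 = 0.5 * (x1[1:] + x1[:-1])             # mean sequence, k = 2..n
    # Least squares fit of x0(k) = -a*z1(k) + b*z1(k)^2
    B = np.column_stack((-z1, z1 ** 2))
    Y = x0[1:]
    (a, b), *_ = np.linalg.lstsq(B, Y, rcond=None)
    c = a - b * x0[0]

    def x1_hat(k):
        # Accumulated response x1_hat(k+1); equals x0(1) at k = 0
        return a * x0[0] / (b * x0[0] + c * np.exp(a * k))

    k = np.arange(n, n + steps)               # first k beyond the window is n
    return x1_hat(k) - x1_hat(k - 1)          # restored raw forecasts

if __name__ == "__main__":
    # Synthetic saturating (sigmoid) accumulated series, for illustration only
    k = np.arange(6)
    x1_true = 5.0 / (0.05 + 0.45 * np.exp(-0.5 * k))
    x0_true = np.diff(x1_true, prepend=0.0)
    print(gvm_forecast(x0_true[:5]), x0_true[5])
```

The quadratic term is what lets the model follow a series that builds up and then saturates, which an exponential GM(1,1) response cannot do.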

3.2. Error Corrections to Grey Models

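The accuracy of the Grey models can be improved by a few methods. Suppose that $\varepsilon^{(0)} = (\varepsilon^{(0)}(2), \ldots, \varepsilon^{(0)}(n))$, with $\varepsilon^{(0)}(k) = x^{(0)}(k) - \hat{x}^{(0)}(k)$, is the error sequence of $X^{(0)}$. If all errors are positive, then a remnant GM(1,1) model can be built (Liu et al. (2010)). When the errors can be positive or negative, the error sequence can be expressed using Fourier series (Tan and Chang (1996)) as in Eq. (11).

$$\varepsilon^{(0)}(k) \cong \frac{1}{2}a_{0} + \sum_{i=1}^{z}\left[a_{i}\cos\left(\frac{2\pi i}{T}k\right) + b_{i}\sin\left(\frac{2\pi i}{T}k\right)\right], \quad k = 2, 3, \ldots, n \tag{11}$$

where $T = n - 1$ and $z = \lfloor (n-1)/2 \rfloor - 1$. If Eq. (11) is written in matrix form as $\varepsilon^{(0)} \cong PC$, where $P$ is the matrix of constant, cosine and sine terms and $C = (a_{0}, a_{1}, b_{1}, \ldots, a_{z}, b_{z})^{T}$ is the vector of coefficients, then $C \cong (P^{T}P)^{-1}P^{T}\varepsilon^{(0)}$. As a result, the predicted value of the time series must be corrected according to Eq. (12):

$$\hat{x}_{f}^{(0)}(k) = \hat{x}^{(0)}(k) + \hat{\varepsilon}^{(0)}(k) \tag{12}$$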
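The Fourier error correction can be sketched as follows. This is a minimal illustration under assumptions that the text does not spell out: errors are taken as observed minus fitted values for k = 2..n, the period is T = n - 1, the number of harmonics is z = floor((n - 1)/2) - 1, and the coefficients are found by least squares. The names `fourier_design` and `fit_fourier_correction` are hypothetical.

```python
import numpy as np

def fourier_design(k, T, z):
    """Rows [1/2, cos(2*pi*1*k/T), sin(2*pi*1*k/T), ..., cos(.z.), sin(.z.)]."""
    k = np.asarray(k, dtype=float)
    cols = [np.full_like(k, 0.5)]
    for i in range(1, z + 1):
        w = 2.0 * np.pi * i * k / T
        cols += [np.cos(w), np.sin(w)]
    return np.column_stack(cols)

def fit_fourier_correction(x0, x0_hat):
    """Fit the error series of a Grey model and return a function eps_hat(k).

    x0 and x0_hat hold observed and fitted values for k = 1..n;
    the first entry is not used because errors start at k = 2.
    """
    x0 = np.asarray(x0, dtype=float)
    x0_hat = np.asarray(x0_hat, dtype=float)
    n = len(x0)
    eps = x0[1:] - x0_hat[1:]                 # assumed sign: observed - fitted
    T = n - 1
    z = max((n - 1) // 2 - 1, 0)              # z = 0 leaves only the constant term
    P = fourier_design(np.arange(2, n + 1), T, z)
    C, *_ = np.linalg.lstsq(P, eps, rcond=None)

    def eps_hat(k_new):
        return fourier_design(np.atleast_1d(k_new), T, z) @ C

    return eps_hat

if __name__ == "__main__":
    n = 9
    k_all = np.arange(1, n + 1)
    observed = 1.0 + 2.0 * np.cos(2 * np.pi * k_all / (n - 1))   # synthetic errors
    eps_hat = fit_fourier_correction(observed, np.zeros(n))
    print(eps_hat(n + 1))   # correction to add to the forecast for k = n + 1
```

With the sign convention above, the corrected forecast is the Grey model forecast plus `eps_hat(k)`. Note that with a window of only four points the sketch gives z = 0, so the correction reduces to a constant offset.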

4. Numerical Experiments

In this section, numerical results are presented regarding the performance of the applied Grey models (GMs). This section also presents linear and nonlinear time series models and their parameter values to compare the prediction results with Grey models.

4.1. Data Description

In order to evaluate the performance of the Grey System models, a case study has been performed. A calibrated microsimulation model has been developed in VISSIM of the US 378 (Sunset Drive) corridor in Lexington, South Carolina. A portion of the corridor has been chosen for analysis that includes five signalized intersections. All the signalized intersections operate under adaptive signal control. Centracs Adaptive traffic signal controller has been used in this study. Centracs Adaptive is an improved version of the original ACS Lite controller developed by Econolite. Traffic data and travel times have been collected for afternoon peak period and the VISSIM model has been calibrated to this data. As we are interested in queue lengths, a congested scenario is required in order to study the patterns of queue buildups and progressions. The first intersection is a T-intersection, while the other four are 4-way intersections. Along with the five intersections, there are 33 driveways on this corridor, which creates disruptions and stop-go conditions. These can contribute to the queue length patterns at intersections. A screenshot of the VISSIM simulation environment is shown in Fig. 1, including the detectors and queue counters placed at intersections.

Fig. 1. US 378 Lexington, South Carolina corridor with intersections (1 to 5) simulated in VISSIM

Table 1. Queue Counter Information

In order to get the queue length data, queue counters are placed at each intersection. Each queue counter corresponds to one lane group. A lane group is a group of lanes that allow traffic to move simultaneously. For example, a through lane and a right-turn lane can be in the same lane group. However, a through lane and a left-turn lane may not be in the same lane group, as the two lanes may not allow traffic to flow simultaneously. There are 31 queue counters in total for five intersections. By running the simulation, we have collected the average and maximum queue length data for each queue counter at each intersection. Please note that some queue counters correspond to multiple lanes. For this reason, we have divided the dataset into two segments: average queue length data and maximum queue length data. The intersections are numbered from west to east. Intersection 1 is a T-intersection, so it requires the fewest queue counters (i.e., 4). The queue length data has been collected per second. The information about queue counters is given in Table 1.

Fig. 2. Comparison of average (Avg) and maximum (Max) queue length densities for different Queue Counters (QC1,QC4, and QC12)

Average and maximum queue length data of all 31 counters is collected for 1 hour from 4 different simulation runs. From Table 1, it can be observed that different queue counters yield different types of queue length patterns based on number of lanes, intersection, signal phasing and timings, etc. For example, in the case of queue counter number 6 (denoted as QC6), the number of lanes is 1, so the average and maximum queue lengths are the same. However, for QC4, the number of lanes is 3, so there will be variations between the maximum and average queue lengths. The difference between the average and maximum queue length of QC4 (3 lanes) and the variation of average queue lengths among 3 different queue counters, QC12 (1 lane), QC1 (2 lanes) and QC4 (3 lanes), are shown in Fig. 2. From Fig. 2, it can be observed that the variation in average queue length is higher than that in the maximum queue length, which indicates that one lane is more congested than the other lanes in the lane group. It can also be observed that the queue buildup for QC12 is more severe at certain times and takes time to dissipate. On the other hand, QC1 and QC4 have a more distributed queue accumulation and dissipation due to the higher number of lanes.

The presence of autocorrelation within time series data assists prediction if the models can capture it. Although several other (hidden or unobserved) covariates would influence the response variable of interest, we can simply use historical data to predict future values. These conditions constitute the main motivation behind Grey system models. The autocorrelations can be shown simply using autocorrelation functions (ACF) and partial ACF, or formal statistical tests. As an example, for QC4 average queue lengths, Fig. 3 shows the presence of negative autocorrelation in the data. The partial ACF plot also reveals that the values become insignificant after 2 significant lags, which suggests that the order of the autoregressive (AR) component in the time series to be fit is low (e.g., AR(1) to AR(3)). A formal Durbin-Watson test results in a p-value of 0.056, which rejects the null hypothesis of no autocorrelation only marginally (at the 10% but not at the 5% significance level). Although other parts of the data used may fail to reject the null hypothesis, the queue length data from our experiments show autocorrelations.
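As a minimal sketch of these diagnostics, the sample ACF and the Durbin-Watson statistic can be computed directly with NumPy. The function names are hypothetical, the input series is synthetic, and only the test statistic is computed here (not the p-value reported in the text, which requires the distribution of the statistic under the null hypothesis).

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    xc = x - x.mean()
    denom = np.sum(xc ** 2)
    return np.array([np.sum(xc[lag:] * xc[:len(x) - lag]) / denom
                     for lag in range(max_lag + 1)])

def durbin_watson(resid):
    """Durbin-Watson statistic: sum of squared successive differences
    divided by the sum of squares. Values near 2 suggest no lag-1
    autocorrelation, values toward 0 positive and toward 4 negative
    autocorrelation."""
    resid = np.asarray(resid, dtype=float)
    return np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)

if __name__ == "__main__":
    alternating = np.array([1.0, -1.0] * 4)   # synthetic, strongly negative lag-1 correlation
    print(sample_acf(alternating, 2))
    print(durbin_watson(alternating))
```

For the alternating demo series, the lag-1 autocorrelation is strongly negative and the Durbin-Watson statistic is well above 2, the same qualitative pattern as the negative autocorrelation described for QC4.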

Fig. 3. Example of Autocorrelation functions (ACF) of queue length time series data on Queue Counter 4 (QC4)

4.2. Linear and Non-linear Methods for Comparison

Based on the above discussions, time series models can be good forecasting model candidates. For a fair comparison with the GM models, we considered the following linear and nonlinear time series models, which are adopted from Di Narzo et al. (2015), and a long short-term memory (LSTM) model. All these models are fit to the first 67% of each queue counter dataset and tested on the remaining 33%; the data comprise 124 1-hour series (31 queue counters from 4 simulation runs).

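Eq. (13) presents the linear model (autoregressive AR(3), denoted as LINEAR).

$$x_{t} = \phi_{0} + \sum_{i=1}^{3}\phi_{i}\,x_{t-i} + \varepsilon_{t} \tag{13}$$

where $x_{t}$ denotes average or maximum queue length observations at time $t$, $\phi_{0}$ is the intercept, $\phi_{i}$ are the weights of previous observations, and $\varepsilon_{t}$ is white noise. Eq. (14) shows the logistic smooth transition autoregressive model (LSTAR).

$$x_{t} = \left(\phi_{1,0} + \sum_{i=1}^{m}\phi_{1,i}\,x_{t-i}\right)\left(1 - G(z_{t}, \gamma, th)\right) + \left(\phi_{2,0} + \sum_{i=1}^{m}\phi_{2,i}\,x_{t-i}\right)G(z_{t}, \gamma, th) + \varepsilon_{t} \tag{14}$$

where $x_{t}$ denotes average or maximum queue length at time $t$,

$$G(z_{t}, \gamma, th) = \frac{1}{1 + e^{-\gamma (z_{t} - th)}}$$

is the logistic transition function with smoothing parameter $\gamma$, $z_{t} = x_{t-\delta}$ is the transition variable, $\delta$ is the delay of the transition variable, and $th$ is the threshold value. Eq. (15) gives the neural network nonlinear autoregressive model (NNETS).

$$x_{t} = \beta_{0} + \sum_{j=1}^{D}\beta_{j}\,g\left(\gamma_{0j} + \sum_{i=1}^{m}\gamma_{ij}\,x_{t-i}\right) + \varepsilon_{t} \tag{15}$$

where $m$ denotes the embedding dimension, $D$ is the number of hidden units of the neural network, $g$ is the activation function, and $\beta_{j}$ and $\gamma_{ij}$ represent the weights. Eq. (16) presents the additive nonlinear autoregressive model (AAR).

$$x_{t} = \mu + \sum_{i=1}^{m}s_{i}\left(x_{t-i}\right) + \varepsilon_{t} \tag{16}$$

where $s_{i}$ represents nonparametric univariate smoothing functions, each of which depends on a delayed observation. Splines from the Gaussian family are fitted in the form of $x_{t} \sim s(x_{t-1}) + \cdots + s(x_{t-m})$. Different numbers of layers (i.e., $m$ values), which categorize the models, are fitted and, based on their Akaike information criterion (AIC) values, the best models are selected at each run. Similar analysis and justification can be found in Bezuglov and Comert (2016). The models that have been used for comparison are: (1) LINEAR (AR) m = 3, (2) LSTAR m = 3, (3) NNETS D = 4, m = 4 (25 batch size), (4) AAR model m = 3 and (5) LSTM model (200 epochs, 5 LSTM neurons, ReLU activation and 2 lag steps). An arbitrary example of the resulting fitted models on queue length data is presented in Table 2.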
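To make the simplest comparison model concrete, the sketch below fits an AR(3) model with an intercept by ordinary least squares and evaluates one-step-ahead predictions on a held-out segment. This is an assumption-laden illustration: the paper adopts its time series models from Di Narzo et al. (2015), whereas this sketch uses plain NumPy; the function names are hypothetical; and the series is synthetic, split into 2400 training and 1200 test observations to mirror the per-series split used in the experiments.

```python
import numpy as np

def lag_matrix(x, m):
    """Design matrix with rows [1, x[t-1], ..., x[t-m]] for t = m..len(x)-1."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    cols = [np.ones(n - m)] + [x[m - i:n - i] for i in range(1, m + 1)]
    return np.column_stack(cols)

def fit_ar(x, m=3):
    """Least squares AR(m) fit; returns [intercept, weight lag 1, ..., weight lag m]."""
    x = np.asarray(x, dtype=float)
    phi, *_ = np.linalg.lstsq(lag_matrix(x, m), x[m:], rcond=None)
    return phi

def predict_ar(x, phi):
    """One-step-ahead predictions for x[m:], each using the m preceding observations."""
    m = len(phi) - 1
    return lag_matrix(x, m) @ phi

if __name__ == "__main__":
    t = np.arange(3600)
    series = 5.0 + np.sin(0.3 * t)             # synthetic stand-in for a 1-hour series
    train, test = series[:2400], series[2400:]
    phi = fit_ar(train, m=3)
    pred = predict_ar(test, phi)
    print(np.max(np.abs(pred - test[3:])))     # largest one-step prediction error
```

Unlike the rolling Grey models, the coefficients here are estimated once on the training segment and then held fixed over the test segment.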

Table 2. Parameters for nonlinear models trained on QC31 max queue lengths

4.3. Results

This section describes the findings related to queue length predictions using different models and comparison with the Grey models. The results of our analyses are presented in the following subsections.

4.3.1. Overall comparison

Fig. 4 demonstrates average and maximum queue length prediction errors in terms of RMSE $= \sqrt{\frac{1}{n}\sum_{k=1}^{n}\left(x^{(0)}(k) - \hat{x}^{(0)}(k)\right)^{2}}$ and MAE $= \frac{1}{n}\sum_{k=1}^{n}\left|x^{(0)}(k) - \hat{x}^{(0)}(k)\right|$ for the compared models. The simple GM model and the GM model with error corrections do not perform well in predicting queue length data. This is mostly due to the traffic signal generating periodic data. GM model performance can improve if queue lengths are predicted cycle-by-cycle without considering zero queue length. GVM and EGVM are able to capture periodicity with quadratic structures and thus predict with higher accuracy than the GM and EGM models. As shown in Fig. 4, LSTM, LINEAR, LSTAR, NNETS, and AAR show similar performance. Their prediction accuracy is much higher than that of the GM and EGM models, but lower than that of the GVM and EGVM models. All models show slightly worse accuracy for maximum queue length prediction compared to average queue length prediction because the maximum queue lengths exhibit more randomness.
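For reference, the two error measures can be computed as in the following minimal sketch, which uses the standard definitions of RMSE and MAE. The function names are hypothetical and the numbers in the demo are synthetic.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean squared error between observed and predicted queue lengths."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

def mae(actual, predicted):
    """Mean absolute error between observed and predicted queue lengths."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(actual - predicted)))

if __name__ == "__main__":
    observed = [12.0, 15.0, 9.0, 0.0]          # synthetic queue lengths in meters
    forecast = [11.0, 17.0, 9.5, 1.0]
    print(rmse(observed, forecast), mae(observed, forecast))
```

RMSE penalizes occasional large misses more heavily than MAE, which is one reason the two measures can rank models differently for the more variable maximum queue lengths.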

Fig. 4 exhibits the following average errors: Avg QL RMSE = [22.83, 21.25, 4.10, 2.94, 5.01, 6.78, 5.08, 4.96, 5.71], Max QL RMSE = [21.09, 19.98, 4.86, 3.69, 6.40, 11.73, 7.77, 6.47, 13.10], Avg QL MAE = [4.43, 4.30, 1.42, 0.91, 1.95, 2.64, 2.06, 1.85, 2.44], and Max QL MAE = [3.96, 3.90, 1.71, 1.10, 2.22, 4.56, 3.23, 2.22, 7.49]. Results show that GVM and EGVM models are able to achieve an error-bound of terms of RMSE and 1 m in terms of MAE for both average and maximum queue lengths. Compared models are able to achieve 2 m error bound in terms of MAE for average queue length. However, for maximum queue lengths, the error bound increases to 8 m in terms of MAE. Therefore, GVM and EGVM models are more accurate and robust across all scenarios and error types.
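The error reductions quoted in the abstract can be checked against these averages with a short calculation. The sketch assumes, as an interpretation rather than something the text states, that the first four entries of each list are GM, EGM, GVM and EGVM (so EGVM is the fourth entry, the smallest value in every list) and that the remaining five entries are the compared models, with the closest competitor being the smallest of those five.

```python
# Average errors (in meters) as listed for Fig. 4; the model order is an assumption.
errors = {
    "avg_rmse": [22.83, 21.25, 4.10, 2.94, 5.01, 6.78, 5.08, 4.96, 5.71],
    "max_rmse": [21.09, 19.98, 4.86, 3.69, 6.40, 11.73, 7.77, 6.47, 13.10],
    "avg_mae": [4.43, 4.30, 1.42, 0.91, 1.95, 2.64, 2.06, 1.85, 2.44],
    "max_mae": [3.96, 3.90, 1.71, 1.10, 2.22, 4.56, 3.23, 2.22, 7.49],
}

EGVM_INDEX = 3  # assumed position of EGVM in each list

def reduction_vs_closest(values):
    """Percent error reduction of EGVM relative to the best non-Grey model."""
    egvm = values[EGVM_INDEX]
    closest = min(values[EGVM_INDEX + 1:])
    return 100.0 * (closest - egvm) / closest

reductions = {name: round(reduction_vs_closest(v), 1) for name, v in errors.items()}
print(reductions)
# approximately 40.7, 42.3, 50.8 and 50.5 percent
```

Under this reading, the computed reductions are within about one percentage point of the 40%, 42%, 51% and 50% figures reported in the abstract.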

Fig. 5 presents the comparison between the performance of EGVM, LSTM, and LINEAR models in predicting average queue lengths of QC4 as an example. We observe that LSTM overestimates when there are any abrupt changes in queue lengths. On the other hand, EGVM is able to capture sudden changes in queue length. LINEAR model shows almost similar behavior to LSTM. The reason is that the LSTM model that we have used in this study is a basic model with minimum features (univariate single-step prediction). Moreover, LSTM is a data intensive model but limited (1 hour) data has been included in our study.

Fig. 4. Comparison between the performance of different models in predicting queue length

Lastly, Table 3 provides the computational times per 3600 observations across all data: the training time on 2400 observations and the testing time on 1200 observations. GM models do not require any training time and are updated with a small window (4 past observations). LSTM requires more time to learn from the data. Considering both accuracy and computational time, EGVM is the best option. For robust and adaptive prediction with low computational times and low sample sizes, GVM and EGVM models provide accurate predictions of queue length.

Table 3. Average computational times (in seconds) of different models

Fig. 5. Comparison between the performance of EGVM, LSTM, and LINEAR models in predicting average queue lengths of QC4 for 1-step predictions

4.3.2. Model performance comparison (single lane vs multilane)

As stated in the literature review, we found that two major challenges of the queue length prediction models are their prediction capability for multilane scenarios compared to single lane scenarios and their performance in undersaturated and saturated scenarios. Fig. 6 shows all queue length prediction results for single lane scenarios and Fig. 7 shows all queue length prediction results for multilane scenarios. Overall, all model performances degrade in multilane scenarios due to many factors; e.g., lane changing behavior of arriving vehicles. However, the EGVM model is still able to maintain a reasonable accuracy for multilane scenarios compared to single lane scenarios. The average RMSE of EGVM model for multilane scenario is 3.55 m compared to 1.88 m for single lane scenarios. The average MAE of EGVM model for multilane scenarios is 1.10 m, compared to 0.44 m for single lane scenarios. These errors indicate that the EGVM model can be used for both single lane and multilane scenarios.

4.3.3. Model performance comparison (undersaturated vs saturated)

Within multilane scenarios, we also investigated one queue counter that is operating in saturated conditions, QC2, and another queue counter that is operating in undersaturated conditions, QC8. The comparison of RMSE and MAE values is shown in Fig. 8. From Fig. 8, we observed that the EGVM model has shown similar performance compared to other models for QC2. However, it has significantly better performance compared to other models for QC8. The RMSE and MAE for QC8 are lower than those for QC2, which is expected, as the congested scenario will create operational issues, such as residual queue and spillback, which could decrease the accuracy of the model. Therefore, the EGVM model can predict queue length with high accuracy in undersaturated conditions while maintaining accuracy comparable to other models for saturated (or congested) conditions.

Fig. 6. Prediction performances on single lane scenarios

Fig. 7. Prediction performances on multi-lane scenarios

5. Conclusions

This study shows the effectiveness of Grey Systems in queue length prediction. The EGVM model provides the most accurate queue length predictions for different traffic conditions in both single lane and multilane scenarios. The EGVM model can predict accurately for both undersaturated and saturated conditions, which establishes the efficacy of the model for predicting queue length and for using it as an input to adaptive signal control systems. The EGVM model is identified as the best model because it outperforms the compared models for average and maximum queue length prediction. The analysis showed that GVM models could provide approximately 1 meter precision in queue length prediction. Both GVM models provide more accurate predictions than LSTM using only a fraction of the input data (4 vs 2400 observations) and require very low computational times due to the absence of a training phase. This study also showed that the simple GM(1,1), even with error correction, failed to produce competitive results compared to other linear and nonlinear time series prediction models. One limitation of this study is that the models depend on the accuracy of the historical queue length estimations (i.e., ground truth). The literature review indicates that accurate queue length estimation is not a trivial task, so this work needs to be combined with a queue length estimation framework for effective utilization. Future work should also include the following: (1) mid-term and long-term forecasts, (2) modifications to the basic Grey systems equations and a study on the applicability of multivariable Grey models, and (3) model structures that capture seasonal behavior.
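To illustrate why a Grey Verhulst forecast needs so little data and no training phase, the sketch below fits the standard textbook Grey Verhulst equation, x0(k) + a z1(k) = b z1(k)^2, to a four-observation window and produces a one-step-ahead forecast. It is a minimal sketch under that assumption and may differ from the implementation used in this study; the function name and the sample window are hypothetical, and the Fourier error correction of EGVM is not included.

```python
import math
from itertools import accumulate

def gvm_one_step_forecast(window):
    """One-step-ahead forecast from a short window with the standard Grey Verhulst model."""
    x0 = [float(v) for v in window]
    n = len(x0)
    x1 = list(accumulate(x0))                                 # accumulated series (1-AGO)
    z1 = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]     # background values
    y = x0[1:]

    # Least-squares fit of y = a * u + b * v with u = -z1 and v = z1^2,
    # solved through the 2x2 normal equations (det is zero for degenerate windows).
    u = [-z for z in z1]
    v = [z * z for z in z1]
    suu = sum(p * p for p in u)
    suv = sum(p * q for p, q in zip(u, v))
    svv = sum(q * q for q in v)
    suy = sum(p * t for p, t in zip(u, y))
    svy = sum(q * t for q, t in zip(v, y))
    det = suu * svv - suv * suv
    a = (suy * svv - suv * svy) / det
    b = (suu * svy - suv * suy) / det

    def x1_hat(k):
        # Time response of the Verhulst equation; k = 0 reproduces x0[0].
        return a * x0[0] / (b * x0[0] + (a - b * x0[0]) * math.exp(a * k))

    # Forecast of the next raw value is the difference of accumulated estimates.
    return x1_hat(n) - x1_hat(n - 1)

# Hypothetical four-observation window of queue lengths in meters.
window = [2.0, 3.0, 4.0, 4.5]
print(gvm_one_step_forecast(window))
```

In a rolling setup, the window would slide forward by one observation at each step and the two parameters would be re-estimated, which is a closed-form computation rather than an iterative training procedure.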

Fig. 8. Performance of Grey System and other models for undersaturated and saturated scenarios

Acknowledgments

This material is based on a study supported by the Center for Connected Multimodal Mobility (USDOT Tier 1 University Transportation Center) Grant headquartered at Clemson University, Clemson, South Carolina, USA. The authors would also like to acknowledge the U.S. Department of Homeland Security (DHS) Summer Research Team Program Follow-On and National Science Foundation (NSF, No. 1719501) grants. Any opinions, findings, conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the Center for Connected Multimodal Mobility, USDOT, DHS, or NSF, and the U.S. Government assumes no liability for the contents or use thereof.

References

An, C., Wu, Y.-J., Xia, J., Huang, W., 2018. Real-time queue length estimation using event-based advance detector data. Journal of Intelligent Transportation Systems 22 (4), 277–290.

An, Y., Cui, H., Zhao, X., 2012. Exploring gray system in traffic volume prediction based on tall database. International Journal of Digital Content Technology and its Applications 6 (5).

Bezuglov, A., Comert, G., 2016. Short-term freeway traffic parameter prediction: Application of grey system theory models. Expert Systems with Applications 62, 284–292.

Comert, G., 2016. Queue length estimation from probe vehicles at isolated intersections: Estimators for primary parameters. European Journal of Operational Research 252 (2), 502–521.

Di Narzo, A. F., Aznarte, J. L., Stigler, M., 2015. R package: tsDyn manual.

Gao, K., Han, F., Dong, P., Xiong, N., Du, R., 2019. Connected vehicle as a mobile sensor for real time queue length at signalized intersections. Sensors 19 (9), 2059.

Gao, S., Zhang, Z., Cao, C., 2010. Road traffic freight volume forecasting using support vector machine. In: International Conference on Computer Science and Computational Technology (ISCSCT’10). pp. 329–332.

Hao, P., Ban, X., 2015. Long queue estimation for signalized intersections using mobile data. Transportation Research Part B: Methodological 82, 54–73.

Hao, P., Ban, X. J., Guo, D., Ji, Q., 2014. Cycle-by-cycle intersection queue length distribution estimation using sample travel times. Transportation Research Part B: Methodological 68, 185–204.

Ju-Long, D., 1982. Control problems of grey systems. Systems & Control Letters 1 (5), 288–294.

Julong, D., 1989. Introduction to grey system theory. The Journal of Grey System 1 (1), 1–24.

Kayacan, E., Ulutas, B., Kaynak, O., 2010. Grey system theory-based models in time series prediction. Expert Systems with Applications 37 (2), 1784–1789.

Li, B., Cheng, W., Li, L., 2018. Real-time prediction of lane-based queue lengths for signalized intersections. Journal of Advanced Transportation 2018.

Liu, S., Lin, Y., 2006. Grey information: theory and practical applications. Springer Science & Business Media.

Liu, S., Lin, Y., Forrest, J. Y. L., 2010. Grey systems: theory and applications. Vol. 68. Springer Science & Business Media.

Liu, W., Qin, Y., Dong, H., Yang, Y., Tian, Z., 2014. Highway passenger traffic volume prediction of cubic exponential smoothing model based on grey system theory. In: 2nd International Conference on Soft Computing in Information Communication Technology. Atlantis Press.

Ma, C., Hao, W., Wang, A., Zhao, H., 2018. Developing a coordinated signal control system for urban ring road under the vehicle-infrastructure connected environment. IEEE Access 6, 52471–52478.

Qi, L., Zhou, M., Luan, W., 2016. Impact of driving behavior on traffic delay at a congested signalized intersection. IEEE Transactions on Intelligent Transportation Systems 18 (7), 1882–1893.

Radin, S., Chajka-Cadin, L., Futcher, E., Badgley, J., Mittleman, J., et al., 2018. Federal Highway Administration research and technology evaluation final report: Adaptive signal control. Tech. rep., United States. Federal Highway Administration. Office of Corporate Research.

Tan, C., Chang, S., 1996. Residual correction method of Fourier series to GM(1,1) model. In: Proceedings of the First National Conference on Grey Theory and Applications, Kaohsiung, Taiwan. pp. 93–101.

Tiaprasert, K., Zhang, Y., Wang, X. B., Zeng, X., 2015. Queue length estimation using connected vehicle technology for adaptive signal control. IEEE Transactions on Intelligent Transportation Systems 16 (4), 2129–2140.

Wang, Z., Cai, Q., Wu, B., Zheng, L., Wang, Y., 2017. Shockwave-based queue estimation approach for undersaturated and oversaturated signalized intersections using multi-source detection data. Journal of Intelligent Transportation Systems 21 (3), 167–178.

Yang, G., Tian, Z., Xu, H., Wang, Z., Wang, D., 2018. Impacts of traffic flow arrival pattern on the necessary queue storage space at metered on-ramps. Transportmetrica A: Transport Science 14 (7), 543–561.

Zeng, X., Zhan, J., Yang, L., Xiong, Q., Chen, Y., 2017. Research on the queue length prediction model with consideration for stochastic fluid. In: International Symposium for Intelligent Transportation and Smart City. Springer, pp. 51–63.

Zhan, X., Li, R., Ukkusuri, S. V., 2015. Lane-based real-time queue length estimation using license plate recognition data. Transportation Research Part C: Emerging Technologies 57, 85–102.