FEASIBILITY OF USING V2I SENSING PROBE DATA FOR REAL-TIME MONITORING OF MULTI-CLASS VEHICULAR TRAFFIC VOLUMES IN UNMEASURED ROAD LOCATIONS

Portions of dynamic traffic volumes consisting of multiple vehicle classes can be accurately monitored without vehicle detectors using vehicle-to-infrastructure (V2I) communication systems. This opens the possibility of online monitoring of total traffic volumes for multiple vehicle classes without any advanced vehicle detectors. To evaluate this prospect, this article presents a method of monitoring dynamic multi-class vehicular traffic volumes in a road location where road-side equipment (RSE) for V2I communication is in operation. The proposed method aims to estimate dynamic total traffic volume data for multiple vehicle classes using the V2I sensing probe volume (i.e. partial vehicular traffic volumes) collected through the RSE. An experimental study was conducted using real-world V2I sensing probe volume data. The results showed that traffic volumes for vehicle types I and II (i.e. cars and heavy vehicles, respectively) can be effectively monitored with average errors of 6.69% and 10.89%, respectively, when the penetration rates of the in-vehicle V2I device for the two vehicle types average 0.384 and 0.537, respectively. The performance of the method in terms of detection error is comparable to that of widely used vehicle detectors. Therefore, V2I sensing probe data for multi-vehicle classes can complement the functions of vehicle detectors because the penetration rate of in-vehicle V2I devices is currently high.


INTRODUCTION
Monitoring real-time traffic volumes is essential for dynamic traffic control and operation in modern intelligent transportation systems (ITSs). Hence, various vehicle detectors, ranging from conventional inductive loops to advanced radar sensing, are densely deployed and utilised for monitoring real-time traffic volumes. However, the field installation, operation and management of vehicle detection systems require extensive funding and resources to ensure reliability and accuracy of the measured traffic information. In addition, the widely used vehicle detectors (e.g. loop and image processing types) still have deficiencies when used to classify vehicle types (e.g. cars, buses and trucks) from the empirical perspectives of transportation practitioners.
To address the above issues effectively, we conducted a literature review on academic investigations regarding advanced data sources [1][2][3]. Note that the literature review in this study is focused on academic research in which real-life advanced data are used to infer temporal vehicular traffic volumes for a road location with no vehicle detector. Two types of probe data have been used for the dynamic estimation of traffic volumes: cellular phone (CP) [1,2] and car navigation data [3]. Academic trials for inferring the hourly vehicular volume using CP call counts and the related probability of crossing inter-cell boundaries were reported in [1,2]. These two studies showed that CP call data could be utilised to determine hourly traffic volumes, although
operated for V2I communication. In relation to this research aim, a novel method was developed to convert V2I sensing probe volumes (i.e. partial vehicular volumes) into overall vehicular traffic volumes. An experimental verification for demonstrating the feasibility of the method was performed. Based on the analysis results, some findings and further research directions related to the possible monitoring of traffic volumes with multiple vehicle classes using V2I probe data in the present and near-future era are discussed.

Approach concept
The spatial-temporal evolution of traffic volume states serves as an initial deterministic system (i.e. a chaotic system) [4][5][6]. In other words, the evolution of traffic volume states naturally shows intensive and wide variations in ergodic and non-periodic manners. This is one of the main reasons that vehicular traffic volumes are directly measured by vehicle detectors rather than indirectly inferred by estimation techniques. For instance, geostatistical analysis techniques (e.g. kriging interpolation) fail to reliably estimate road traffic volume [7,10]. In addition, according to our literature review, the dynamic behaviour of traffic volume makes it difficult to directly infer reliable temporal traffic volumes using indirect probe data (e.g. mobile phone data), despite the fact that the PR of this data source is very high. Therefore, direct probe data closely related to vehicle volumes should be employed to explain the dynamic behaviour of traffic volumes and guarantee the reliability of estimations.
V2I communication is conducted between in-vehicle devices (e.g. RF OBU terminals) and RSE. This implies that in-vehicle devices in service can be considered as moving probes that can be used to measure partial vehicle volumes at RSE locations when the PR of in-vehicle devices is not 100%. Based on this self-evident fact, it can be assumed that the V2I probe volume (i.e. partial vehicle volume) represents a portion of the total traffic volume, or is at least directly correlated with this volume, considering random sampling variations. If this assumption is reasonable, the total traffic volume at an RSE location can then be calculated using the relationship between the V2I probe volume and total vehicle volume as measured from locations close to the RSE location. This assumption is also supported
the average estimation error was 20%. Point-to-point vehicle trajectory data collected from car navigation systems were also employed by Chang and Yoon to monitor dynamic traffic volumes [3]. They demonstrated that temporal traffic volumes at 5-min intervals can be directly monitored with 5.69% average error at a 14.91% penetration rate (PR) of vehicle global positioning system (GPS) devices.
Despite these notable achievements, two current issues should be effectively addressed in relation to the actual application of ITSs from the perspectives of traffic engineers and field staff. Real-time traffic volumes with multiple classes of vehicles should be monitored by other methods as an alternative to vehicle detectors. Advanced probe data sources (e.g. CP and vehicle GPS), used as the input dataset of a method, should be stably supported because online monitoring reliability is one of the crucial requirements of vehicle detection systems. Unfortunately, the CP call and vehicle GPS probe counting data do not include any direct clues for identifying the vehicle type (e.g. cars, buses and trucks) in many cases. These data sources usually belong to private businesses and are thus linked to serious obstacles, such as high cost and privacy policies. In addition, transmission delays are expected in the process of transferring data between the private sector and advanced transportation data centres.
Fortunately, vehicle-to-infrastructure (V2I) communication systems based on dedicated short-range communication (DSRC) have been widely introduced and utilised for electronic toll collection and section-based traffic speed monitoring in modern ITSs. The V2I system essentially solves the aforementioned obstacles inherent in private sector data because sensing probe data collected through this V2I system include direct information about vehicle types, and monitoring stability and direct access to the sensing data are guaranteed in real time. In addition, in-vehicle V2I devices (e.g. 5.8 GHz radio frequency (RF) onboard unit (OBU) terminals) have a high PR in many developed countries. These facts offer a practical opportunity to directly infer real-time traffic volumes of multiple vehicle classes for any road location where vehicle detectors are not installed.
This research aims to verify the feasibility of using V2I sensing probe data for obtaining the dynamic traffic volumes of multiple vehicle classes in road locations where roadside equipment (RSE) is
the zero probe volume values simultaneously. The filtering method is designed based on the fact that the RV distribution (RVD) of temporal probe volumes is greater than or at least equal to that of temporal vehicle volumes, as shown in Figure 1. Thus, unnecessary variations in probe volumes can be removed using the RVDs of the temporal probe volumes and temporal vehicle volumes.
To compute the temporal RV values, the time-series values and average values for the vehicle and probe volumes are defined as follows: A set consisting of three RSE locations is defined as l={tg,up,dn} for the target RSE (tg), along with the upstream (up) and downstream (dn) of tg. As mentioned before, the RV of the probe volumes (i.e. $R_p^l$) is greater than that of the vehicle volumes (i.e. $R_v^l$) (Figure 1). Thus, $r_p^l(i)$ can be modified such that it becomes similar to $r_v^l(i)$ using the standard deviations of $R_p^l$ and $R_v^l$. Let $\sigma_x^l$ be the standard deviation of the RVD of $R_x^l$, where x={v,p} and l≠tg. Let $P_a^l=[p_a^l(t),p_a^l(t-1),\ldots,p_a^l(t-d)]$ be a time series of the adjusted probe volumes for l. Based on these
by research [3,7,10], in which vehicle-GPS probe volume data are employed directly to estimate vehicular traffic volumes along road sections. Importantly, in-vehicle V2I devices are widely utilised for electronic toll collection, with high market share. Accordingly, V2I sensing probe data intrinsically include direct information about multi-vehicle classes. For this reason, V2I sensing probe data provide key information to effectively address the uncertainty problem during direct monitoring of real-time traffic volumes with multi-vehicle classes.
Based on the aforementioned concepts, a method of producing dynamic traffic volumes for multi-vehicle classes at RSE locations using V2I sensing probe volume data is proposed in this study. The method consists of two steps: filtering and converting the probe volume data. In the filtering step, the temporal probe volumes are tuned into suitable data by eliminating random noise effects. In the conversion step, a filtered probe volume at an RSE location where an estimation is desired is expanded into a vehicle volume value using the optimal relationship between the filtered probe volume and vehicle volume. Integrating the two steps can efficiently solve the problem of estimating the vehicle volume by minimising the number of uncertainties that unavoidably arise when addressing this problem.

Filtering method
The V2I sensing probe volume is a random sample with a given PR of in-vehicle V2I devices. Naturally, the temporal development of the probe volume (i.e. partial vehicle volume) shows a wider relative variation (RV) than that of the total vehicle volume when the PR (0.0-1.0) of the in-vehicle V2I devices with respect to the vehicle volume is less than 1.0. This sampling variability significantly increases when the vehicle volume or PR values are low. Zero probe volumes can also occur even when the vehicle volumes are not low and the PR values are high. Because of this sampling variability, estimation failures (i.e. over- and underestimation problems) are inevitable, especially at turning points when raw probe volume data are directly employed to produce estimations of vehicle volumes without a filtering or imputation process. In addition, such estimation failures caused by sampling variability are closely related to the reliability of the estimation.
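The size of this sampling variability can be illustrated with a short calculation. The sketch below rests on a simplifying assumption that is not stated in the source: each passing vehicle independently carries an in-vehicle V2I device with probability PR, so the probe volume in one interval is binomially distributed. The function names are hypothetical; the volumes and PRs are the extremes reported in the case study.

```python
import math


def probe_relative_sd(vehicle_volume, pr):
    """Relative standard deviation (SD / mean) of the probe count.

    Assumes a binomial probe count: mean = v * PR, variance = v * PR * (1 - PR).
    """
    if vehicle_volume <= 0 or not 0.0 < pr <= 1.0:
        raise ValueError("need vehicle_volume > 0 and 0 < pr <= 1")
    return math.sqrt((1.0 - pr) / (vehicle_volume * pr))


def zero_probe_probability(vehicle_volume, pr):
    """Probability that no equipped vehicle passes in the interval."""
    return (1.0 - pr) ** vehicle_volume


# Lowest and highest 5-minute volumes reported for vehicle types I and II,
# paired with the average PR of each type.
for label, volume, pr in [
    ("type I, low volume", 12, 0.384),
    ("type I, high volume", 540, 0.384),
    ("type II, low volume", 2, 0.537),
    ("type II, high volume", 105, 0.537),
]:
    print(
        f"{label}: relative SD = {probe_relative_sd(volume, pr):.3f}, "
        f"P(zero probes) = {zero_probe_probability(volume, pr):.3g}"
    )
```

Under this assumption the relative spread shrinks with the square root of the expected probe count, which is consistent with the narrower PR range the case study reports at higher volumes.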
To address these problems efficiently, the filtering method assesses the reduction of unnecessary random variations and conducts an interpolation of
be the independent and dependent variables, respectively.
The temporal evolution of vehicular volume states shows intensive variations ergodically and non-periodically [4][5][6], that is, vehicle volumes vary steeply at turning points (Figure 1). In this case, undesirable results (e.g. repetitive over- and underestimations and even negative estimations) can occur because of the failure of the directionality of a proper relationship between P and V when linear regression is used. To address these estimation failures effectively, a power curve for nonlinear curve fitting (ranging from logarithmic and linear to positive exponential fittings) is used to identify a suitable relationship between P and V (Figure 2). The power curve is defined as follows:
$$V = \alpha P^{\beta}$$
where α (>0.0) and β (>0.0) are the coefficient and exponent, respectively, of P. For an optimal curve fitting that minimises the total estimation error, the individual errors between the observations and estimations can be expressed as
$$v_i = \hat{\alpha}\, p_i^{\hat{\beta}} + \epsilon_i, \quad i \in K$$
where α̂ and β̂ are the optimal α and β values, respectively, and $\epsilon_i$ is the estimation error for observation i (i.e. $v_i$). Based on these considerations, a minimisation problem for determining an optimal curve is defined over the total estimation error, subject to α̂ > 0 and β̂ > 0: α̂ is greater than zero, as vehicle volumes are greater than zero, and β̂ is greater than zero, as vehicle volumes do not decrease with increasing probe
definitions, each element of $P_a^l$ is computed with $\bar{p}^l$ and an adjustment factor ($f^l$) (i.e. the ratio of $\sigma_v^l$ to $\sigma_p^l$) using Equation 2. In this manner, each probe volume (i.e. $p^l(i)$) is tuned into $p_a^l(i)$ by removing any unnecessary random sampling variation.
With regard to tg, it is impossible to calculate the value of an adjustment factor (i.e. $f^{tg}=\sigma_v^{tg}/\sigma_p^{tg}$), because $V^{tg}$ is not measured. Therefore, it is assumed that the adjustment factor value for tg is more similar to that of l when $\bar{p}^{tg}$ (i.e. the average of the elements of $P^{tg}$) is closer to $\bar{p}^l$ (l≠tg). This is supported by the reasoning that the PR values ($=\bar{p}^l/\bar{v}^l$) of the in-vehicle V2I devices for the three locations are similar, within acceptable differences, despite the fact that $\bar{v}^{tg}$ cannot be estimated.
To determine the adjustment factor value for tg ($f^{tg}$), let $\bar{p}_{max}$ and $\bar{p}_{min}$ be $\max\{\bar{p}^l\}$ and $\min\{\bar{p}^l\}$ (l≠tg), respectively. Let $f_{max}$ and $f_{min}$ be the adjustment factor values associated with $\bar{p}_{max}$ and $\bar{p}_{min}$, respectively. Based on these considerations, the similarity between $\bar{p}^{tg}$ and $\bar{p}^l$ (l≠tg) can be effectively employed to combine the two adjustment factors (i.e. $f_{max}$ and $f_{min}$) into one adjustment factor (i.e. $f^{tg}$) for tg. $f^{tg}$ is determined by the inverse of the similarity in Equation 3. Finally, an adjusted probe volume for tg at (t) (i.e. $p_a^{tg}(t)$) is computed with a determined value of $f^{tg}$ as
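The filtering step can be sketched in code. Equations 1 to 3 are not reproduced in this text, so every functional form below is an assumption made for illustration only: the RV is taken as the deviation from the series mean divided by that mean, the adjusted probe volume as the mean plus the RV shrunk by the adjustment factor, and the target factor as an inverse-distance weighting of f_max and f_min. All function names and sample numbers are hypothetical.

```python
from statistics import mean, pstdev


def relative_variation(series):
    """Assumed RV: deviation from the series mean, relative to that mean."""
    avg = mean(series)
    if avg == 0:
        raise ValueError("series mean must be non-zero")
    return [(x - avg) / avg for x in series]


def adjustment_factor(vehicle, probe):
    """f = sigma_v / sigma_p, the ratio named in the text (l != tg only)."""
    sigma_v = pstdev(relative_variation(vehicle))
    sigma_p = pstdev(relative_variation(probe))
    return sigma_v / sigma_p if sigma_p > 0 else 1.0


def adjust(probe, factor):
    """Assumed form of the adjustment: shrink each RV by the factor."""
    avg = mean(probe)
    return [avg * (1.0 + factor * r) for r in relative_variation(probe)]


def target_factor(p_tg_mean, p_max_mean, f_max, p_min_mean, f_min):
    """Assumed inverse-distance blend: the closer mean gets more weight."""
    d_max = abs(p_tg_mean - p_max_mean)
    d_min = abs(p_tg_mean - p_min_mean)
    if d_max + d_min == 0:
        return (f_max + f_min) / 2.0
    return (d_min * f_max + d_max * f_min) / (d_max + d_min)


# Made-up 5-minute series for one neighbouring location.
vehicle_up = [100, 120, 110, 90, 80]
probe_up = [30, 55, 40, 25, 50]
f_up = adjustment_factor(vehicle_up, probe_up)
print(f"f_up = {f_up:.3f}")
print("adjusted:", [round(p, 1) for p in adjust(probe_up, f_up)])
```

With these assumed forms the adjusted series keeps the original mean while its RV spread matches that of the vehicle volumes, and a zero probe volume is pulled towards the mean rather than left at zero.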

Conversion method
A nonlinear relationship between $P_a^l$ and $V^l$ (l≠tg) was employed in this study to convert the adjusted probe volume (i.e. $p_a^{tg}(t)$) to the vehicle volume ($\hat{v}^{tg}(t)$) for the target location (tg) at time interval (t). To determine an optimal fitting, $P_a^l$ and $V^l$ (l≠tg) were used as explanatory and dependent variables,
market share of RF OBUs exceeds 40%. The RF OBU probe volume data used for the V2I sensing probe data in the case study were collected from the RSE located at three tollgates of the testbed through the DSRC system of Korea Expressway Corporation (KEC). The vehicle volume data were monitored at the same tollgates in the toll collection system of KEC. As such, the RF OBU probe volume is a direct part of the vehicle volume at a tollgate. In addition, for analysing multiple vehicle classes, the vehicles were categorised into two types: vehicle type I pertains to cars whereas vehicle type II includes buses and trucks. Figure 4 illustrates the PR of the probe volume to the vehicle volume for the two types of vehicles, where PR=[probe volume / vehicle volume]. On average, the PR values for vehicle types I and II reach 0.384 and 0.537, respectively. Note that these average PRs are very high from the standpoint of statistical sampling rate. The result reveals a wide variation for the two vehicle types despite the high PRs. The range of variation becomes narrower when the vehicle volume increases as the sampling variability decreases.
The PRs of vehicle type I vary widely from 0.19 to 0.60 in the vehicle volume regime of v<100, but the variation decreases (ranging from 0.29 to 0.48) in the vehicle volume regime of 100≤v. The PRs of vehicle type II, in spite of the average exceeding 0.5, vary extensively from 0.11 to 1.0 in the vehicle volume regime of v<40, while the variation decreases, ranging from 0.26 to 0.77, in the vehicle volume regime of 40≤v. Thus, the behaviour of the PR (i.e. the sample size) relative to the degree of vehicle volume (i.e. population) is close to a mixed state, and it shows a closed boundary condition. This indicates that the probe volume for estimating the vehicle volume should be adjusted to eliminate the sampling variability, even when it is collected at a high PR of in-vehicle V2I devices. However, the PR trends for the two types are constant on average, even when the vehicle volume decreases. This implies that the V2I sensing probe volume can be effectively employed to obtain the total vehicle volume through a method that expands the probe volume into the vehicle volume when the sampling variability included in the probe volume is eliminated.
To thoroughly examine the performance capabilities of the method presented in this paper, the following three performance measures were carefully
volumes. Once the optimal values of α̂ and β̂ for a best-fit curve are determined by solving the minimisation problem, a vehicle volume (i.e. $\hat{v}^{tg}(t)$) for the target RSE location is directly obtained as
$$\hat{v}^{tg}(t) = \hat{\alpha}\,\bigl[p_a^{tg}(t)\bigr]^{\hat{\beta}}$$
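A minimal sketch of the conversion step follows. The source does not state how its minimisation problem is solved, so this example substitutes a plainly different, simpler technique: ordinary least squares on the log-transformed data, which minimises errors in log space rather than in vehicle units and does not enforce β > 0. The function names and the sample numbers are hypothetical.

```python
import math


def fit_power_curve(probe, vehicle):
    """Fit v = alpha * p**beta by linear least squares on (log p, log v).

    Pairs with a non-positive value are skipped because their logarithm
    is undefined. This is a stand-in for the paper's own optimisation.
    """
    pairs = [
        (math.log(p), math.log(v))
        for p, v in zip(probe, vehicle)
        if p > 0 and v > 0
    ]
    if len(pairs) < 2:
        raise ValueError("need at least two positive observations")
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    if sxx == 0:
        raise ValueError("probe volumes must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    beta = sxy / sxx
    alpha = math.exp(mean_y - beta * mean_x)
    return alpha, beta


def convert(adjusted_probe, alpha, beta):
    """Expand an adjusted probe volume into an estimated vehicle volume."""
    return alpha * adjusted_probe ** beta


# Made-up adjusted probe volumes and vehicle volumes from the neighbouring
# locations (l != tg), pooled into one training set.
probe_train = [12.0, 25.0, 48.0, 90.0, 150.0]
vehicle_train = [35.0, 68.0, 122.0, 230.0, 395.0]
alpha_hat, beta_hat = fit_power_curve(probe_train, vehicle_train)
print(f"alpha = {alpha_hat:.3f}, beta = {beta_hat:.3f}")
print(f"estimate for p_a = 60: {convert(60.0, alpha_hat, beta_hat):.1f}")
```

A constrained nonlinear solver would follow the stated problem more closely; the log-log fit is used here only because it needs no dependencies beyond the standard library.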

Study design
To demonstrate the feasibility of using V2I sensing probe volume data for directly monitoring vehicular traffic volumes with multiple vehicle classes, a case study was conducted using real-world data. The test data were collected from three tollgates located on the Seoul External Beltway in South Korea (Figure 3). The road section between the upstream and downstream locations includes two junctions and six interchanges. All road locations have four lanes. The distances from the target location to the upstream and downstream locations are 11.6 km and 24.5 km, respectively. These unfavourable testbed conditions are desirable for verifying the feasibility of the proposed method from the standpoint of practical applications. Two types of real-world data (i.e. V2I sensing probe volume data and vehicle volume data) for two types of vehicles were collected with a 5-minute aggregation on 1-5 September 2019. In South Korea, RF OBUs are widely utilised for monitoring section-based traffic speed between two RSE locations and electronic tolling at tollgates, and the
the model parameters by using a single parameter. This is a significant advantage in practical applications where the estimation accuracy needs to be guaranteed. Despite this consideration, the performance of the method relies heavily on the d value (i.e. the embedding size of the time series) in terms of estimation accuracy. Hence, for a given dataset, the d value plays a critical role in reducing any unnecessary temporal variation in the probe volumes, which in turn influences our understanding of the relationship between the probe volume data and vehicle volume data.
The effects of the d value on the estimation accuracy of the two types of vehicles are shown by the MAPE in Figure 5. The target time span is 18 h (06:00-24:00) per day. This analysis period is important given the dynamic traffic control and operational strategies of the ITS. Regarding vehicle type I, the estimation error decreases steeply (d=2→4) and then remains (d=5→11) in the optimal error space, after which it increases (d=12→20) with small variations. For vehicle type II, the error curve decreases exponentially (d=2→6) and then remains (d=7→16) within the optimal error space, after which it increases (d=17→20). Thus, the estimation error decreases to the optimal error space, and then gradually increases as the d value increases. This concave error behaviour indirectly indicates that the locality of the temporal development of the probe or vehicle volume exists regardless of whether the boundary condition is obvious. This also implies that the relationship between the probe and vehicle volumes can be determined within an acceptable margin of error.
selected and applied, excluding the average performance measures. The volumes of the two vehicle types (I and II) vary widely (from 12 to 540 and 2 to 105, respectively) (Figure 4). In this case, the absolute percentage error (APE) and relative percentage error (RPE) provide a useful basis for comparison [5][6][7]. The APE and RPE have shortcomings when used with low traffic volumes, as the RV of a low traffic volume is higher than that of a high traffic volume. To compensate for the deficiencies of the two measures, a straight error for lane (SEL, vehicles per lane) was also employed, which can be useful in practice from the viewpoints of traffic control and operation. Using these performance measures, various analyses were conducted: hit rate and statistical analyses. Additionally, the mean of the APEs (MAPE) was used as the total performance measure to determine the optimal parameter values (i.e. the d value) of the proposed method. APE (%), RPE (%) and SEL (veh) are expressed in terms of $y_i$ and $\hat{y}_i$, the observed and estimated values of sample i, respectively, and l, the number of lanes.
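The three measures and the hit rate can be sketched as follows. The exact expressions are not reproduced in this text, so the forms below are assumptions: the APE is the usual absolute error relative to the observation, while the RPE and SEL are taken as signed (estimate minus observation), a sign convention chosen here only because signed RPE values and ± SEL bounds appear in the results. All function names and sample numbers are hypothetical.

```python
def ape(observed, estimated):
    """Absolute percentage error (%)."""
    return abs(estimated - observed) / observed * 100.0


def rpe(observed, estimated):
    """Relative percentage error (%), signed; the sign is an assumption."""
    return (estimated - observed) / observed * 100.0


def sel(observed, estimated, lanes):
    """Straight error for lane (vehicles per lane), signed by assumption."""
    return (estimated - observed) / lanes


def mape(observed_series, estimated_series):
    """Mean of the APEs over all samples (%)."""
    errors = [ape(y, y_hat) for y, y_hat in zip(observed_series, estimated_series)]
    return sum(errors) / len(errors)


def hit_rate(errors, bound):
    """Share (%) of errors whose magnitude is within +/- bound."""
    hits = sum(1 for e in errors if abs(e) <= bound)
    return hits / len(errors) * 100.0


# Made-up 5-minute observations and estimations at a four-lane location.
observed = [120.0, 95.0, 40.0, 15.0]
estimated = [112.0, 101.0, 47.0, 11.0]
rpes = [rpe(y, y_hat) for y, y_hat in zip(observed, estimated)]
print(f"MAPE = {mape(observed, estimated):.2f}%")
print(f"hit rate within RPE +/-10% = {hit_rate(rpes, 10.0):.1f}%")
print("SEL:", [sel(y, y_hat, 4) for y, y_hat in zip(observed, estimated)])
```

The low-volume samples in this toy series show why the SEL complements the percentage measures: an error of four vehicles at a volume of 15 is a large percentage but only one vehicle per lane.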

Results and findings
The model presented in this study was developed based on the integration of the filtering and conversion steps. The model was designed to minimise
standard deviation of the RPD (SDRPD) of the raw probe volume is 36.9%, whereas that of the filtered probe volume is 14.6%. The adjustment gain is also as high as 60.4% [=(36.9-14.6)/36.9·100]. Specifically, the temporal raw probe volumes for vehicle type II reveal intensive variations, similar to the temporal variation in signalised traffic volumes [5,8]. The SDRPDs of the raw and filtered probe volumes are 72.3% and 34.6%, respectively. In addition, the zero probe volumes are replaced with suitable values. Similar to vehicle type I, the adjustment gain for type II reaches 52.2%. The results clearly indicate that the temporal variation in the raw probe volumes can be successfully adjusted to become similar to that of the vehicle volumes, at least in this case. Thus, undesirable estimations can be effectively prevented by filtering out unnecessary random sample variations.
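The variation figures quoted above rest on the relative percentage difference, RPD=[p(t+1)-p(t)]/p(t)·100, its standard deviation (SDRPD) and the adjustment gain. The sketch below shows one way to compute them. Two choices are assumptions, since the source does not specify them: the population standard deviation is used, and intervals with p(t)=0 are skipped because the RPD is undefined there. The function names and the toy series are hypothetical.

```python
from statistics import pstdev


def rpd_series(volumes):
    """RPD (%) between consecutive intervals, skipping p(t) = 0."""
    return [
        (nxt - cur) / cur * 100.0
        for cur, nxt in zip(volumes, volumes[1:])
        if cur != 0
    ]


def sdrpd(volumes):
    """Standard deviation of the RPD series (population SD, by assumption)."""
    return pstdev(rpd_series(volumes))


def adjustment_gain(sdrpd_raw, sdrpd_filtered):
    """Reduction of the SDRPD achieved by filtering, in percent."""
    return (sdrpd_raw - sdrpd_filtered) / sdrpd_raw * 100.0


# Reported SDRPDs for vehicle type I: raw 36.9, filtered 14.6.
print(f"gain (type I) = {adjustment_gain(36.9, 14.6):.1f}%")

# Made-up raw and smoothed probe series, for illustration only.
raw = [10, 15, 8, 14, 9, 13]
smoothed = [11, 12, 11, 12, 11, 12]
print(f"SDRPD raw = {sdrpd(raw):.1f}, smoothed = {sdrpd(smoothed):.1f}")
```

Applied to the reported type I values, the gain function reproduces the 60.4% stated in the text.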
Most importantly, the optimal error space for each vehicle type is very stable, with a minimal error of +0.5%. This indicates that the best or second-best parameter values can be analysed and determined within the margin of error on a weekly or even a monthly basis in advance. This is a crucial advantage from the perspective of field staff who do not have a broad range of experience in calibrating and modifying an advanced ITS model. In addition, the optimal d values of 7 and 11 for vehicle types I and II, respectively, were selected for a deeper analysis. Figure 6 shows a comparison of the temporal variations in the raw and filtered probe volumes for one day. Extreme variations at turning points, which can result in undesirable estimations, are filtered, eliminating unnecessary sampling variations in terms of the relative percentage difference (RPD), where RPD=[p(t+1)-p(t)]/p(t)·100. For vehicle type I, the
The analysis results for the two vehicle types according to the two traffic-volume regimes are summarised in Table 1. For all volume regimes, it can be seen that the accuracy of the method proposed in this study is at least comparable to those of modern vehicle detectors. Note that the vehicle counting detection errors for the inductive loop, laser scanner, weigh-in-motion (WIM) piezoelectric and WIM quartz detectors with 5-minute data aggregation were reported to be 10.6, 24.1, 7.4 and 17.6% in terms of the MAPE, respectively [9]. In all regimes of the two vehicle types, the performance capabilities of the proposed method in terms of the MAPE are also comparable to those of short-term forecasting studies of motorway traffic flows [6] and signalised traffic flows [5,8].
The worst performances for the APE and RPE are observed in the low-volume regime, excluding the SEL, as shown in Figures 8 and 9, especially for vehicle type II. The APEs exceed 20% in many cases, which is undesirable from a forecasting standpoint.
Figure 7 displays the relationships between the probe and vehicle volumes before and after filtering. The relationships between the two variables for the two vehicle types are effectively improved in terms of R² after the filtering process. Thus, the R² values of vehicle types I and II increase from 0.953 and 0.895 to 0.991 and 0.953, respectively. These results indicate that the filtered probe volumes for vehicle types I and II statistically explain 99.1% and 95.3% of the vehicle volumes, respectively. This result appears to be statistically acceptable despite the fact that R² tends to increase when the number of observations increases. The effect of filtering random variations is also distinguished when the vehicle volumes are low. Therefore, the explanatory power of the probe volumes is remarkably improved, which in turn is related to a more reliable understanding of the relationship between the probe and vehicle volumes during the conversion process.
Moreover, the worst cases, with rates of -35.59% or +49.15%, occur late at night, although they are tolerable for SELs within ±5.0 vehicles (Figure 9b). It should be noted that the prediction performances of the proposed method in low-volume regimes are comparable to those of pattern selection-based short-term predictions [5][6] in terms of the MAPE. The hit rate performance within the RPE±10% is depicted in Figure 10. It can be seen that the performance of the proposed method in terms of the APE is comparable to the required detection accuracy for modern vehicle detectors [9]. The R² values for vehicle types I and II are 0.983 and 0.952, respectively.
This indicates that the predicted traffic volumes for vehicle types I and II statistically explain 98.3% and 95.2% of the actual traffic volumes, respectively. Hence, the proposed method of directly estimating the traffic volumes for multiple vehicle classes can be a feasible complementary approach to the functions of vehicle detectors when V2I sensing probe data with a significant PR are available.
Note that the marginal error of detection by vehicle detectors should not exceed 20% relative to the actual traffic volume [1]. In spite of these undesirable performances, the estimations for the low-volume regime for vehicle types I and II are acceptable with the maximal SELs of 5.56 and 2.15, respectively, which are less than or equal to one vehicle per minute in practice.
The hit rate within the RPE±10.0% does not reach 90.0% in any regime for the two vehicle types; however, the hit rate within the RPE±20.0% is greater than 90.0% in the case of the heavy-volume regime. The hit rate within the SEL±10 vehicles is as high as 95.76% and even reaches 100.0% at times for all regimes of the two vehicle types. In contrast, for the heavy-volume regime with vehicle type I, the APE values are less than 20.0% in most cases (Figure 9a), where the temporal variation of the estimations is in good agreement with that of the observations (Figure 8a). In addition, the hit rate within the RPE±20.0% for the
This indicates that the estimation capability of the proposed method is at least comparable to the detection capabilities of modern vehicle detectors [9] based on the R² statistics and monitoring accuracy. Therefore, the direct estimation of real-time traffic volumes for multiple vehicle classes is a promising approach to solve the current hindrances associated with the vehicle detection infrastructure and traffic volume surveillance. In addition, the method presented here is instantly applicable when real-time V2I sensing probe volume data are available. Despite the meaningful results, there are other potential approaches for directly monitoring real-time traffic volumes in unmeasured road locations using advanced methods and data. Further research on improving the performance of the method proposed in this study should be conducted. Moreover, GPS-enabled V2I OBU probe volume data can be used effectively for monitoring spatially unconstrained real-time vehicle traffic volumes.

ACKNOWLEDGMENTS
This research was supported by a grant (22TL-RP-C148672-05) from the Transportation and Logistics Research Program (TLRP) funded by the Ministry of Land, Infrastructure and Transport of the Korean government.

CONCLUSION
The measurement of real-time traffic volumes for multiple types of vehicles is crucial for the operation of advanced traffic control systems. Thus, various vehicle detectors are densely deployed and utilised in modern ITSs. However, this causes difficulties because vehicle detection systems require extensive costs and resources to ensure accuracy of the monitored information. This is a current issue that needs to be effectively addressed in relation to modern ITSs. This challenge explains the motivation behind this study.
V2I systems supported by DSRC are being widely introduced and utilised for section-based speed monitoring and electronic toll collection, with a high OBU PR at present. This provides a promising opportunity for effectively monitoring vehicular traffic volumes with multi-vehicle classes. In this study, a method for directly monitoring the real-time traffic volumes of multi-vehicle classes using V2I sensing probe volume data was developed. To demonstrate the feasibility of this method, a case study was conducted using real-world RF-OBU probe volumes and actual vehicle volumes.
The analysis results are favourable in terms of the R² statistics and average error (%) when the PRs of the RF OBU devices for vehicle types I and II (i.e. cars and heavy vehicles, respectively) are 0.38 and 0.54, on average. The R² values of the