Factors Influencing Driving Time in Public Transport – A Multiple Regression Analysis

Deviations in driving time (DT), or significant variations, occur frequently on urban public transport (PT) lines, except in subsystems with separate routes. DT variability is the main reason for disturbances in operation, leading to unstable and unreliable transport service. Moreover, it also causes variability in total user travel time, which is one of the main parameters of transport service quality. Identifying and quantifying factors that influence PT vehicle DT characteristics is significant for designing advanced prediction and passenger information systems and prioritising investments to reduce bus travel time and improve the scheduling process, and thus the level of transport service quality. An analysis of the elements of the route and other static elements of the line that influence DT was carried out in this paper. A model for determining and quantifying influential factors and methodologies for collecting all necessary data was created. The multiple regression model, developed as a result of the conducted multivariate statistical analysis using the specialised SPSS software, was applied to the selected representative set of lines in a real urban PT system. The created regression model explains between 18.2% and 97.4% of the variance of average, minimum and maximum DT and its deviation in the peak and off-peak periods.


INTRODUCTION
Constant economic development and improved quality of life lead to an increase in the value of time and value of reliability of the service provided by the PT system [1]. Numerous studies conducted over the last few decades have shown that the reliability of the service provided by the transport system represents a decisive factor in users' travel behaviour [2], mode selection [3][4][5][6], as well as the selection of a specific line within the PT system [7][8].
The starting hypothesis of the paper is that characteristics of the route and other elements of the line structure influence the characteristics of vehicle driving time (DT), and the level of this influence can be quantified. The joint effects of these factors have not been addressed in the literature. Therefore, to fill this research gap we developed a multiple regression model which allows for a more sophisticated investigation of the relationships of a set of independent variables and quantification of their influence on the dependent variable. The defined regression model contains a set of 15 independent variables. The model is tested for 5 different dependent variables in the two peak and two off-peak periods separately. The 5 dependent variables are used to describe the main characteristics of DT.
We define vehicle DT as the sum of driving times between PT stops and unplanned stopping times due to traffic conditions and other external factors. The sum of DT and dwell time (sum of times spent on passenger boarding and alighting at each stop) equals vehicle trip time (TT).
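The two definitions above can be written compactly. The notation is an assumption introduced here only for illustration: a trip serves $n$ stops, $t_{k-1,k}$ is the driving time between stops $k-1$ and $k$ (including unplanned stopping), and $d_k$ is the dwell time at stop $k$. Whether the terminal stops contribute to the dwell sum is not stated in the text, so the index range of the second sum is also an assumption.

```latex
\mathrm{DT} = \sum_{k=2}^{n} t_{k-1,k},
\qquad
\mathrm{TT} = \mathrm{DT} + \sum_{k=1}^{n} d_k
```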
A comprehensive analysis of route elements and other static and dynamic line elements which have an impact on driving time is conducted. The analysis of the factors affecting DT is based on each stop-to-stop distance (SSD) along a line. The SSD-level approach broadens the set of factors that can be analysed, excludes vehicle dwell time at stops, and isolates DT at the specific SSD. In this way, we reduce the influence of the line/section length so that other elements of the line structure and route characteristics can be analysed. The type of rights-of-way (ROW) and the width of the street profile between two stops are among the line route characteristics covered by the analysis as well.
This approach was not observed in the analysed literature, which is presented in the following chapter. The literature review is focused on the factors affecting DT and TT and the methods used for defining the influence level of these factors. The next chapter offers the definition and description of potential factors (independent variables) which might have an impact on DT value and variability. The third part presents the methodology used for collecting all data. The data were collected for a set of selected lines from the Belgrade PT system. Finally, the paper provides the results of the conducted multivariate statistical analysis used to develop the model for identifying statistically significant factors which influence the selected characteristics of DT as the dependent variable. The description also contains the quantification of the identified factors' influence on the set of selected lines in the PT system in Belgrade.

LITERATURE REVIEW
Abkowitz and Engelstein [9] were among the first authors who analysed average DT and DT variability on PT lines using empirical data. Before these two authors, other researchers also analysed the factors affecting DT and TT and tried to find suitable corrective actions to reach higher reliability. However, these researchers were restricted to using simulation models to describe the real system's operation [3,10]. The use of simulation models was primarily caused by the high costs of field research in the PT system.
In the research [9], the authors used multivariate correlation to analyse various structural and operational factors of a PT line and their influence on TT average value and standard deviation on a particular line section. The main hypothesis was that the recorded TT values were independent between the two subsequent route sections. The authors defined the independent variables including section length, number of signalised intersections, number of stops along the section, parking of vehicles along the route, number of passenger boardings and alightings at stops, the period in a day, route direction, and TT deviation on the previous section. The model showed that the average TT on a particular section was significantly influenced by the section length, the total number of passenger boardings and alightings and the number of signalised intersections, while it was influenced to a smaller degree by parking, the period in a day and route direction. Levinson [11] believes that DT variability and time losses have a joint impact on the variability of TT. In general, DT variability is proportional to DT itself, as well as to line length. One of the main conclusions of his research is that TT, and consequently PT vehicle speed, is mainly influenced by the vehicle frequency at stops, dwell time at stops, as well as acceleration and deceleration time. Strathman and Hopper [12] divided the causes of poor accuracy in the system into two groups: internal and external causes. The internal causes included driver behaviour/experience, planned schedule sensitivity (regarding the defined headways, TT and terminal time) and line structure (in terms of its length, number of stops, distribution of boardings and alightings along the line). The external factors involved traffic flow volume, traffic accidents, signal plans at signalised intersections, weather conditions and parking along the line route. 
According to [13], the most influential elements that explain headway variability along a route are the irregularity of dispatching, scheduled frequency, route length, passenger demand and associated dwell times, and the number of stops.
On the other hand, [14] presented a correlation model for predicting the total TT in the PT system based on the influence of the route elements. Using multiple regression, the authors created a model for calculating the total vehicle TT on a section. The model defined the dependence of the total TT along the section on the section length, traffic flow volume, number of stops and number of signalised intersections. Several factors were also included in the conducted multiple regression as independent variables but were not involved in the final model due to their insignificant impact on the output. These factors were the number of passenger boardings and alightings, the existence of lanes for each turn at intersections and green light duration within the cycle at signalised intersections. The authors did not provide a detailed description of data collection and processing or the final form of data used for testing the model. The question is how the traffic flow volume on the section and the number of stops were expressed. Traffic flow volume is a value that changes between the intersections along a section, so it is not clear how the authors quantified the value along the whole section. Most commonly, the number of stops is linearly proportional to the section length, so the representative value of this factor is usually the unit number of stops (stops/km). Based on the paper's description of the independent variables, it cannot be concluded whether the authors used the same approach.
According to [15], TT variability can be observed from three different perspectives: (1) TT variability between two vehicles on the same route at the same time, which is mainly caused by different waiting times at signalised intersections and pedestrian crossings, different driving styles etc.; (2) TT variability realised during different periods in a day, caused by the influence of other traffic, as well as different transport demand intensity along a line during the day; (3) TT variability realised on different days, which is represented by the different TTs of the same departures (the same departure times) realised on different days in a week. The authors stated that day-to-day variability was influenced by the same factors mentioned in the previous two perspectives. Ricard [16] studied the impact of different variables on TT probability distribution. The main goal of their study was to measure the reliability of PT services. Line type, direction, number of stops, distance and region type were the main route characteristics used. Mazloumi [17] examined TT variability in a PT system for days in a week, focusing primarily on two dimensions of variability. First, the extent of variability and its shape were examined by analysing TT distribution on the selected line. Next, multiple regression analysis was used to determine the factors affecting two measures of TT variability (standard deviation and difference between the defined percentages of representative distribution). In addition to section length, which has been proven by other authors to significantly affect TT variability, the numbers of stops and signalised intersections are also considered to be significant influencing factors. The authors defined two factors used in the analysis: the number of stops and the number of signalised intersections per unit of length of the analysed section.
According to [17], drivers who depart earlier from the starting point, at which departure times are defined by schedule, have longer TT than drivers who depart on time or later than the scheduled time. The average time and standard deviation of the difference between the scheduled departure time and realised departure time are the values included in the analysis as potential influential factors. The authors' analysis of the factors influencing TT variability also involved land use (residential areas and non-residential areas). Furthermore, the analysis included weather conditions (average rainfall and its standard deviation) as an influential factor. Values of the analysed dependent and independent variables were provided for four characteristic periods in a day: two peak periods (morning and afternoon) and two off-peak periods. The authors used the method of backward stepwise selection to define the relationship between the defined dependent and included independent variables. The results of the analysis showed that line length, as expected, had the highest positive effect on TT variability. The number of signalised intersections had a greater influence than the number of stops at the observed section. The following table shows a comparison of influential factors (independent variables), dependent variables, characteristics of the dependent variable and the method used within the analysed literature.
Having in mind that the analysed section lengths were long and that they contained a great number of stops, it was logical to expect that section length would have a significant influence on TT variability. The percentage of the influence might have been lower if the authors had analysed shorter sections or each distance between stops along the line. This would have enabled a detailed analysis of DT characteristics and elimination of the impact of dwell time at stops. According to the authors, the difference between the scheduled departure time and realised departure time is an indicator which has a significant influence on TT variability. However, TT variability itself also affects the difference between the two mentioned departure times. It is not obvious how the authors collected and analysed the obtained data, since departure delay on a section can be caused by TT variability in the previous section, which may have been in turn caused by a different factor. There is a significant interdependence between the analysed independent and dependent variables in successive sections, which was not considered and analysed in the paper. Also, the authors of the paper did not pay any particular attention to the line route characteristics.
On the other hand, [18] compared vehicle reliability in sections with exclusive PT vehicle lanes and sections without them. The conducted analyses showed that there was a higher level of reliability in the sections with exclusive PT vehicle lanes. This level was improved by 6% to 18%. Higher percentage values refer to peak periods in a day. The same authors presented three new performance parameters in terms of PT reliability. They also tested the influence of line length and planned headway on the values of the defined reliability parameters. The authors concluded that line length had a significant influence on all three reliability parameters, while the defined headway did not influence TT variability. Furthermore, to make a precise comparison of all spatial bus priority schemes, [19] classified bus lanes based on separation methods into segregated and non-segregated bus lanes.
Feng [20] studied vehicle travel time on the stop-to-stop segment level. They researched the joint impact of several variables (bus stop location, signal delay and traffic conditions) on vehicle TT.

SELECTION OF POTENTIAL FACTORS
The potential factors are divided into two groups based on the characteristics of the line (i.e. of its stop-to-stop distances, SSDs) which they represent. The first group involves the factors of the line structure, while the second group includes the PT operation characteristics.

Structural characteristics
The literature review showed that section length [9, 12-14, 16-18, 20] has the greatest influence on both total TT and its variability. To potentially decrease its influence level and analyse other elements of the line structure or route characteristics, section length was divided into its smallest elements, the SSDs. Consequently, the planned analysis and identification of potential factors are based on the analysis of the factors affecting DT for each SSD along a line. Analysing each SSD separately broadens the set of factors that can be analysed, excludes vehicle dwell time at stops, and isolates DT at the specific SSD.
Two line route characteristics which can potentially influence DT between two stops on a line are the type of rights-of-way (ROW) and the width of the street profile. It is very rare for the ROW type to change between two subsequent stops. Therefore, it is possible to express the ROW type for the complete length of the observed SSD using one value. Each SSD is assigned a descriptive characteristic of the ROW type simplified in comparison to the theoretical one [21]. The route characteristic in terms of the degree of independence in this research is described by four ROW types (presented in Table 2). This characteristic is expected to have a negative influence on DT characteristics. In other words, total DT and its variability are expected to be lower for an SSD with a higher level of ROW independence.
Street (road) profile width also represents a potentially significant influential factor, particularly at SSDs with the C1 or C2 types. It can be best represented using the number of traffic lanes (in the observed direction) available to PT vehicles. Using microsimulations in their analysis, [22] found that additional lanes for PT vehicles increased reliability. They used DT variability as a reliability indicator. This route characteristic, similarly to the previous one, is expected to have a negative impact on total DT and its variability. In other words, an increase in the number of traffic lanes should decrease total DT and its variability, i.e. the presence of a bus lane increases bus speed [23].
The number of signalised intersections is a factor whose influence has been studied by almost all analysed authors [9,12,14,17,20]. In addition to the number of signalised intersections, the number of unsignalised intersections is included in this analysis. Due to the potentially different influence levels, unsignalised intersections are divided into two groups: unsignalised intersections with priority and unsignalised intersections without priority.
While collecting data regarding the number and type of intersections, it was noticed that unsignalised pedestrian crossings should be separated and observed as an individual influential factor, while signalised pedestrian crossings should be considered together with signalised intersections. There are two reasons for isolating unsignalised pedestrian crossings. The first reason is the fact that the existence of a pedestrian crossing is not conditioned by the existence of an intersection. The second reason is the fact that there is not an equal number of pedestrian crossings at each unsignalised intersection in the observed direction.
The influence of the same number of signalised or unsignalised intersections can be different. To have comprehensive data about intersections, the intersections with left or right turns were separately identified. Vehicle turns affect vehicle time losses in the intersection zone, and consequently, they affect DT at the observed SSD. This particularly refers to the left turns. The numbers of left turns and right turns at the observed SSD are included in the analysis as potentially influential factors. They are expected to have a positive influence on the total DT and its variability. The intersections where vehicles continue straight forward were not included as an additional factor. Their expected influence is minimal, and it is considered to have been already involved in the previously described factors (number of signalised and number of unsignalised intersections).
The existence of parking along the line route is another factor which can influence DT characteristics [12]. Vehicles manoeuvring to enter or exit a parking place can affect the traffic flow, and therefore PT vehicles as well. This influence is expected to be particularly significant for routes with the C1 or C2 type, especially where the street profile is minimal (the number of traffic lanes has the value of 1 or perhaps 2). We express the influence of parking as a binary value, the existence or non-existence of parking along the observed SSD, without detailed characteristics about the number of parking places or type of parking (reverse or forward parking, inline or angular). It is expected to have a positive influence on the dependent variable in the statistical analysis.
SSD direction is described using two values: toward and away from the central city zone. It is assumed that SSD direction can be a significant factor which, in combination with the observation period, represents the transport network load [9,16]. For example, there is a high probability that DT at a specific SSD in the direction toward the city during the morning peak will be greater than during the afternoon. In addition, two SSDs which differ only in the direction can have different values of DT and its variability if they are analysed during one (same) characteristic period in a day.

Operating characteristics
In addition to the detailed consideration of structural elements, a high-quality multivariate analysis has to include certain operating characteristics as influential factors. Firstly, DT is a dynamic element, and its value or its variability value has to be dependent on other dynamic characteristics. Secondly, many structural characteristics can fully represent the analysed SSD only in combination with operating characteristics. For the statistical analysis, four characteristic periods in a day were selected: morning peak (MP), morning off-peak (MOP), afternoon peak (AP) and afternoon off-peak (AOP). Some authors divided the operating period into smaller successive periods. This is not the case in our research. The characteristic periods were intentionally selected to avoid the transition period between two subsequent periods. Transition periods were excluded from the analysis for two reasons: gradual changes cannot be simply expressed using a single value, and the stochasticity and non-homogeneity of the environment are particularly pronounced during the transition period.
Vehicle headway is the line operating characteristic included in the analysis primarily due to the opposing attitudes of the analysed authors. On the one hand, [12] states that the sensitivity of the designed schedule (in terms of the defined headway, TT and terminal time) influences the realised TT, and consequently the transport service reliability. On the other hand, [18] concludes that the headway value does not affect vehicle TT variability, and consequently has no influence on reliability. The headway values, which are defined and quantified as potential influential factors, are represented by the planned values and expressed in minutes for each line and each defined period. It is assumed that the headway value has no significant influence on the DT characteristics.
Vehicle frequency was introduced as an operating characteristic to represent the load of the observed SSD and to isolate the distances with similar or identical structural characteristics using the load as the main criterion. There is a significant number of SSDs used by more than one line in the PT network. However, the data on the number of vehicles operating on these lines is much more important. The load of a specific SSD is expressed by the number of vehicles serving it during the observed hour regardless of the line they operate on. It is assumed that SSD load will have a positive influence on DT characteristics, i.e. the increase in vehicle frequency will increase DT and its variability.

METHODOLOGICAL PROCEDURE
Our specific methodology (Figure 1) is based on a systemic approach and data from the real system. The preliminary step, or the input for the complete methodological procedure, was defining potential influential factors. The selected influential factors have a direct impact on further steps in the methodological procedure, particularly the steps related to the direct identification of structural and operating characteristics.
The first step was determining the SSDs along the selected lines, their lengths and direction. This was done from the existing databases. The second step in the methodology is determining the other structural characteristics of the selected lines. The method of field research was applied to record the values since there is no cadastre. It included visiting the route and recording the previously defined structural elements. After the field research was conducted, the following (third) step involved determining the values of each element for each SSD processed in the previous step. The values were determined using the logical analysis of survey forms, selecting the data about routes, parking and number of lanes, as well as summarising the registered intersections for each SSD separately. The following step included the validation of the collected data and the creation of a database about the structural characteristics. In the following step, this database was integrated with the database of operating characteristics.
The topic of this paper has become popular and interesting for researchers dealing with PT due to the development and application of information technologies in the PT sector. Modern information technologies have contributed to the development of PT vehicle monitoring and management systems. By registering the history of vehicle movement, this system provides comprehensive and thorough databases on PT system operation.
The defined research methodology envisages four steps in determining the operating characteristics. The first step involves selecting data from the monitoring system databases for the defined period and selected lines. This study had access to bus AVL data provided by the KentKart vehicle management system in Belgrade. The system archives and analyses the vehicle data, thus enabling high-quality monitoring and control. With a GPS device and a communication device, every vehicle sends two types of UDP (User Datagram Protocol) data packages to the control centre: a standard data package, which is sent every 30 seconds and every 2 hours, and a forced data package, which is sent at predefined events. The predefined events are: driver login/logout, trip start, terminal stop in/out, and stop in/out. The vehicle arrival time and departure time are recorded in the AVL system. Arrival time is the time at which a bus first enters the stop circle and departure time is the time when a bus leaves the stop circle. Vehicle DT for one SSD is the time interval between the departure time from stop k-1 and the arrival time at stop k.
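The DT definition above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the study: the record layout `(stop_id, arrival, departure)`, the function name `driving_times` and the sample timestamps are all assumptions made for the example.

```python
from datetime import datetime


def driving_times(stop_events):
    """Return the driving time (seconds) for each SSD of a single trip.

    stop_events is a list of (stop_id, arrival, departure) tuples in route
    order; arrival/departure are datetime objects (hypothetical layout).
    DT for the SSD (k-1, k) = arrival at stop k - departure from stop k-1.
    """
    result = []
    for previous, current in zip(stop_events, stop_events[1:]):
        prev_stop, _, prev_departure = previous
        curr_stop, curr_arrival, _ = current
        dt_seconds = (curr_arrival - prev_departure).total_seconds()
        result.append(((prev_stop, curr_stop), dt_seconds))
    return result


# Invented sample trip with three stops.
trip = [
    ("A", datetime(2020, 1, 6, 7, 0, 0), datetime(2020, 1, 6, 7, 0, 30)),
    ("B", datetime(2020, 1, 6, 7, 2, 10), datetime(2020, 1, 6, 7, 2, 40)),
    ("C", datetime(2020, 1, 6, 7, 5, 0), datetime(2020, 1, 6, 7, 5, 20)),
]
ssd_dt = driving_times(trip)
```

Because each DT starts at the departure from one stop circle and ends at the arrival in the next, the dwell time spent inside the stop circles never enters the result.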
To obtain accurate values which will represent the real operation of PT vehicles, the selected data are analysed, logically checked and filtered. The second step includes determining characteristic periods for a PT line. This step involves determining the number of characteristic periods and their borders based on the collected data about the line operation history. The values of realised DT during the previously defined characteristic periods are selected from the data set formed in the first step. In the third step, selection is [...]
[Figure 1 residue. 1. Analysis and filtering: selecting the period; extraction of data for the selected lines; analysis and filtering of the selected trips; analysis and filtering at the SSD level]
The following, fourth step involved determining the headway and frequencies for each of the selected lines and each described characteristic period based on the planned schedule. To express headway for the SSD on the observed line during the characteristic period using one value, the average value of the planned headway was calculated. The vehicle frequency realised at the observed SSD is expressed by the number of vehicles which operate at this distance during the observed hour, regardless of the line they are engaged on. This means that this value represents the joint frequency of all lines. The following step is the consolidation of the conducted research on both structural characteristics and operating characteristics, i.e. it involves the joining of the created databases.
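The two operating characteristics described above, the average planned headway and the joint frequency of all lines on an SSD, can be sketched as follows. The function names, the dictionary layout and all numbers are assumptions made for illustration; they do not come from the Belgrade data.

```python
from collections import defaultdict


def joint_frequency(line_frequencies, line_ssds):
    """Sum vehicles per hour over every line that serves each SSD."""
    load = defaultdict(int)
    for line, ssds in line_ssds.items():
        for ssd in ssds:
            load[ssd] += line_frequencies[line]
    return dict(load)


def average_headway(planned_headways_min):
    """Average planned headway (minutes) within one characteristic period."""
    return sum(planned_headways_min) / len(planned_headways_min)


# Invented example: two lines sharing the SSD between stops B and C.
line_frequencies = {"line_1": 10, "line_2": 6}  # vehicles per hour
line_ssds = {
    "line_1": [("A", "B"), ("B", "C")],
    "line_2": [("B", "C"), ("C", "D")],
}
load = joint_frequency(line_frequencies, line_ssds)
mean_headway = average_headway([6, 6, 5, 7])
```

In this sketch the shared SSD carries the sum of both lines' frequencies, while headway stays a per-line value.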
Most commonly, the created final database of the structural and operating characteristics of selected PT lines cannot be directly used in some of the multivariate analysis models. Some data from the database have to be adjusted for further use in the statistical analysis. Adjusting the data is performed by coding, which envisages marking data about the defined factors using numbers or letters.
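As an illustration of such coding, the sketch below turns the descriptive attributes of one SSD into numeric values. It is an assumption-laden example rather than the coding scheme of the paper: the field names are invented, and only the ROW labels C1 and C2 appear in the text, so `ROW_3` and `ROW_4` are placeholders for the two remaining types.

```python
# C1 and C2 are named in the text; ROW_3 and ROW_4 are placeholder labels.
ROW_TYPES = ["C1", "C2", "ROW_3", "ROW_4"]


def code_ssd(raw):
    """Convert descriptive SSD attributes into numeric codes."""
    # One 0/1 indicator per ROW type; exactly one of them equals 1.
    coded = {f"row_{t}": int(raw["row_type"] == t) for t in ROW_TYPES}
    # Binary factors: parking present / absent, direction toward / away.
    coded["parking"] = 1 if raw["parking"] else 0
    coded["direction"] = 1 if raw["direction"] == "toward_centre" else 0
    # Count factors are kept as integers.
    coded["lanes"] = int(raw["lanes"])
    return coded


raw = {"row_type": "C2", "parking": True,
       "direction": "toward_centre", "lanes": 2}
coded = code_ssd(raw)
```

Since every SSD has exactly one ROW type, the four indicators always sum to one, which matters for any regression model that also includes an intercept.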

DATA CODING
The first step of the multivariate analysis involves defining the number and form of the independent and dependent variables for one SSD. Table 3 shows the dependent and independent variables used, their form and explanation, as well as the data type and unit measures.
The standard multiple regression model with the dependent variable $Y$ and independent variables $X_1, X_2, X_3, \ldots, X_k$ is given in the following form:
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \cdots + \beta_k X_k + \varepsilon$$
where $\beta_0$ represents an intercept or a constant, $\beta_1, \beta_2, \beta_3, \ldots, \beta_k$ are slope coefficients or regression parameters of the independent variables $X_1, X_2, X_3, \ldots, X_k$, and $\varepsilon$ is the random error of the multiple regression model. The model describes a linear relationship between the dependent variable and $k$ independent variables. The dependent variables' values differ for each of the characteristic operating periods. Of the independent variables, only headway and frequency also change between periods, while the other independent variables, which represent structural characteristics of distances between stops, have fixed values. In other words, their values do not depend on the analysed period.
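The study estimated the model in SPSS. Purely to illustrate how the coefficients of such a linear model are obtained, the following sketch solves the ordinary least-squares normal equations in plain Python. The function `fit_ols`, the two predictors (SSD length and number of signalised intersections) and the data are assumptions made for the example; the sample values were generated from known coefficients so that the fit can be checked.

```python
def fit_ols(X, y):
    """Estimate [b0, b1, ..., bk] by solving the normal equations (X'X) b = X'y."""
    rows = [[1.0] + list(r) for r in X]  # prepend the intercept column
    p = len(rows[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(p)]
    # Gaussian elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda idx: abs(A[idx][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for row in range(col + 1, p):
            factor = A[row][col] / A[col][col]
            for c in range(col, p + 1):
                A[row][c] -= factor * A[col][c]
    # Back substitution.
    beta = [0.0] * p
    for i in reversed(range(p)):
        s = sum(A[i][j] * beta[j] for j in range(i + 1, p))
        beta[i] = (A[i][p] - s) / A[i][i]
    return beta


# Invented data: DT = 20 + 0.1 * length_m + 15 * signalised_intersections.
X = [(300, 1), (450, 0), (500, 2), (250, 1), (600, 3)]
y = [65.0, 65.0, 100.0, 60.0, 125.0]
beta = fit_ols(X, y)
```

With noise-free data the estimates recover the generating coefficients; with real observations the residuals play the role of the random error term.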
Additionally, we assume that characteristic periods have a significant impact on both the existence and the level of influence of other independent variables on the value of dependent variables. Therefore, a total number of twenty independent calculations were performed to analyse the influence of the independent variables on each of the defined dependent variables separately during each of the characteristic periods.
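The count of twenty follows from pairing five dependent variables with four characteristic periods. The short sketch below enumerates the pairs; the label `SD` for the untransformed standard deviation is an assumption, while the other labels (AVERAGE, MIN, MAX, SDLOG and the period abbreviations) are those used in the paper.

```python
from itertools import product

# "SD" is an assumed label for the untransformed standard deviation.
DEPENDENT = ["AVERAGE", "MIN", "MAX", "SD", "SDLOG"]
PERIODS = ["MP", "MOP", "AP", "AOP"]

# One regression run per (dependent variable, period) pair.
runs = list(product(DEPENDENT, PERIODS))
```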
It should be underlined that during the preliminary evaluation of the model it was noticed that the selected multivariate analysis was not applicable for explaining the standard DT deviation. In the cases where the distribution is positively asymmetrical, i.e. it has a long right tail (as is the case for DT), it is recommended to apply log transformation of the determined standard deviation values. Therefore, the fifth dependent variable was introduced into the analysis, and it represents the log-transformed standard deviation (SDLOG) and is included in the table representing data coding. The same approach was applied by [17] in their study.
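A minimal sketch of this transformation is given below. The paper does not state the logarithm base or whether the sample or population standard deviation was used, so the natural logarithm and the sample standard deviation are assumptions, as are the function name and the DT values.

```python
import math
import statistics


def sd_and_sdlog(driving_times_s):
    """Return (SD, SDLOG) for the DT observations of one SSD and period.

    Assumes at least two observations that are not all identical;
    math.log raises ValueError when the standard deviation is zero.
    """
    sd = statistics.stdev(driving_times_s)  # sample standard deviation
    return sd, math.log(sd)                 # natural logarithm (assumed base)


# Invented right-skewed sample: one unusually long run among typical ones.
sd, sdlog = sd_and_sdlog([100, 110, 105, 180, 95])
```

The transformation compresses large standard deviations more than small ones, which shortens the long right tail before the regression is fitted.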

RESULTS
The defined methodology and the developed model were verified on a set of 10 urban lines that adequately represent the PT system in Belgrade: 8 bus lines, one tram line and one trolleybus line (Figure 2 and Table 4). These lines are considered to be representative primarily because of their share in the Belgrade PT system. According to the planned weekday schedules, the selected lines account for 16% of all departures and more than 15% of the total gross transport work.
Joining the two databases provides the final database on the structure and operation of the 10 selected lines containing 382 SSDs each having 42 records. Three records are used for identification, 19 are the independent variables' values, while 20 of them represent different values of the 5 dependent variables per characteristic period.
To analyse the impact of all independent variables on the dependent variable for each characteristic period, standard (direct) multiple regression was performed in the SPSS software. As part of the multiple regression procedure, SPSS performs the collinearity diagnostics of the independent variables. It often indicates problems with multicollinearity which might not be seen in the correlation matrix. The diagnostics result is shown by the values of Tolerance and VIF (variance inflation factor). The coefficient values indicate multicollinearity among the route types, which means that one of the selected types should be excluded from the analysis, i.e. from the set of the independent variables involved in the multiple regression model [24]. Since the C1 route type has no level of independence and is the most common in the PT network in Belgrade, this route type was excluded from further analysis as an independent variable and is treated as the implied (reference) type. It is more meaningful to analyse the route types with a certain level of independence, which can be expected to affect DT characteristics along with other independent variables. After excluding the C1 independent variable from the analysis, the collinearity diagnostics of the remaining independent variables provided VIF and Tolerance values within the acceptable limits. This indicates that the problem of multicollinearity was no longer present.
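For reference, the two diagnostics have standard definitions, reproduced here in LaTeX. $R_j^2$ denotes the coefficient of determination obtained when the $j$-th independent variable is regressed on all the other independent variables. The symbols $D_{r,i}$ for the 0/1 indicator of ROW type $r$ on SSD $i$ are an assumption introduced for this illustration.

```latex
\mathrm{Tolerance}_j = 1 - R_j^2,
\qquad
\mathrm{VIF}_j = \frac{1}{\mathrm{Tolerance}_j} = \frac{1}{1 - R_j^2}

% If every SSD i is assigned exactly one of the four ROW types, then
\sum_{r=1}^{4} D_{r,i} = 1 \quad \text{for every } i
```

When all four indicators enter a model that also has an intercept, each indicator is an exact linear combination of the others and the constant, so $R_j^2 = 1$, the tolerance is zero and the VIF is unbounded. Dropping one indicator, here the one for C1, removes this exact dependence.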

Regression model evaluation
In the table for evaluating the regression model, Model Summary (Table 5), "R Square" represents the coefficient of determination, which indicates what part of the variance of the dependent variable is explained by the model. The indicator "Adjusted R Square" is the value of the coefficient of determination corrected for the number of independent variables and the sample size, and it provides a better estimate of the coefficient of determination in the population [24,25]. The analysis of the adjusted R Square shows a strong and very strong correlation between the defined independent variables and the dependent variable. The selected independent variables explain the values of AVERAGE and MIN excellently. Within all characteristic periods, the created regression model explains more than 96% of the variance of the MIN value. These values are high but also expected, given that PT vehicles have to obey the speed limit, which, along with structural elements, clearly defines the minimum DT at a specific SSD. For the AVERAGE, the values of the adjusted R Square also confirm the quality of the model. The lowest coefficient value, 0.77, is obtained for the AP period, which is still a high value. In the AOP period, the defined regression model explains more than 90% of the variance of the AVERAGE value.
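The correction applied by the adjusted R Square can be illustrated with a short sketch. It uses the standard formula with n observations and k independent variables; the function name and the numeric inputs are hypothetical and are not taken from Table 5.

```python
def adjusted_r_squared(r_squared: float, n: int, k: int) -> float:
    """Adjusted R^2 for a model with k predictors fitted on n observations."""
    if n - k - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    # Penalise R^2 for the degrees of freedom used by the k predictors.
    return 1.0 - (1.0 - r_squared) * (n - 1) / (n - k - 1)


# Hypothetical inputs for illustration only (not values from the paper).
print(round(adjusted_r_squared(0.50, n=101, k=10), 4))
```

With a large sample relative to the number of predictors, the adjusted value stays close to R Square; the gap widens as k approaches n.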
The created regression model also provides very good results regarding the third characteristic of DT: its maximum value at SSD. In contrast to the MIN, the MAX values are not limited, which makes them more difficult to predict and explain. Nevertheless, the regression model manages to explain more than 39% of the dependent variable in the AP period, which can be considered the most complex and consequently the most difficult to model. In other periods, the coefficient of determination related to the MAX as a dependent variable has the values of 0.658, 0.724 and 0.756 for the MP, MOP and AOP periods, respectively.
The complexity of the AP period is best reflected in the value of the coefficient of determination when the SD is the dependent variable. The defined regression model cannot be applied to explain this DT characteristic in the AP period, since the value of the adjusted coefficient of determination is 0.182, which means that the model explains only 18.2% of the variance of the dependent variable. In the other periods of the day, the regression model manages to explain the SD in the range from 41.9% (MP period) to 50.4% (AOP period).
To improve the level of explanation provided by the regression model, the values of standard deviations were transformed using a base-10 logarithm. The values of the dependent variable SD were transformed to better suit the regression model, and a new dependent variable, SDLOG, was created. When SDLOG is the dependent variable, the output of the regression model evaluation has higher values, as expected. This means that the model can also be applied to explain the standard deviation values, provided that these values are transformed beforehand. The regression model explains the SDLOG dependent variable in the AP period about twice as well (36.7%) as it explains the SD variable. During the other characteristic periods, the model can explain more than 51% of the dependent variable.
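The transformation described above can be sketched as follows. The function names are hypothetical, and the handling of non-positive values is an assumption made here, since the text does not say how (or whether) zero deviations occurred in the data.

```python
import math


def log10_transform(sd_values):
    """Return log10 of each standard deviation (the SDLOG values)."""
    transformed = []
    for sd in sd_values:
        if sd <= 0:
            # Assumption: log10 is undefined here, so such values are rejected.
            raise ValueError("log10 is undefined for non-positive values")
        transformed.append(math.log10(sd))
    return transformed


def back_transform(sdlog_values):
    """Map SDLOG values back to the original SD scale."""
    return [10.0 ** value for value in sdlog_values]


# Hypothetical SD values for illustration only.
sdlog = log10_transform([12.0, 45.0, 230.0])
print(sdlog)
print(back_transform(sdlog))
```

Because the model is fitted on the logarithmic scale, any fitted or predicted SDLOG value has to be passed through the inverse transformation before it is read as a deviation in the original units.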
The statistical significance of the model is evaluated based on the output provided by the SPSS software in the ANOVA tables. These are the results of testing the null hypothesis that R square in the population equals 0. In all the presented cases, the model reaches statistical significance (Sig. = 0.000, which means that p<0.0005).
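The null hypothesis that R square in the population equals 0 is conventionally tested with the overall F-statistic. The expression below is the standard form of that test, given as a reminder rather than as a formula quoted from the paper; n denotes the number of observations and k the number of independent variables.

```latex
F = \frac{R^2 / k}{\left(1 - R^2\right) / (n - k - 1)},
\qquad
F \sim F_{k,\; n-k-1} \quad \text{under } H_0 : R^2_{\text{population}} = 0
```

Since significance values are displayed to three decimal places, a reported value of 0.000 stands for any p-value that rounds to zero at that precision, which is why it is read as p<0.0005.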
The results of the evaluation confirm that the route and other structural elements of the line influence DT characteristics. The next step in the analysis of the regression model involves the identification of the independent variables which have a significant influence and quantification of the level of influence.

Evaluation of independent variables
Since the regression model was proven to be significant, the next step is to determine the significance of each of the β_i coefficients. This shows how much each independent variable contributes to the prediction of the dependent variable. To compare the variables, standardised coefficients are observed. They are given in the Coefficients table within the SPSS software, and their values are converted to the same scale to make them comparable. We apply the t-test to determine the significance. The calculated value of the t-statistic (the ratio of the non-standardised coefficient and its standard error) is compared to the tabulated t-value for (n-k-1) degrees of freedom at a significance level of α=0.05. If p≤0.05, it can be concluded that the coefficient is statistically significantly different from 0, i.e. that the corresponding independent variable has a significant impact on the dependent variable and that its presence in the regression model is justified. The values of the t-statistic and the corresponding p-values are given in the "t" and "Sig." columns in Table 6. In addition to these values, the table also shows the standardised (Beta) and non-standardised (B) values of the regression coefficient of each independent variable.
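The quantities used in this evaluation can be sketched in a few lines. The function names and all numeric inputs are hypothetical; the critical value is passed in as a parameter because it has to be looked up for (n-k-1) degrees of freedom (it is roughly 1.97 for a two-sided test at α=0.05 with a few hundred degrees of freedom).

```python
def t_statistic(b: float, std_error: float) -> float:
    """Ratio of the non-standardised coefficient B to its standard error."""
    return b / std_error


def standardised_beta(b: float, sd_x: float, sd_y: float) -> float:
    """Rescale B by the standard deviations of the predictor and the outcome."""
    return b * sd_x / sd_y


def is_significant(t_value: float, t_critical: float) -> bool:
    """Two-sided test: compare |t| with the tabulated critical value."""
    return abs(t_value) >= t_critical


# Hypothetical numbers for illustration only (not values from Table 6).
t_value = t_statistic(b=12.0, std_error=4.0)
print(t_value, is_significant(t_value, t_critical=1.97))
print(standardised_beta(b=12.0, sd_x=0.5, sd_y=30.0))
```

The standardised coefficient expresses the change in the dependent variable, in standard deviations, per one standard deviation change in the predictor, which is what makes coefficients of variables measured in different units comparable.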
Similarly to other authors' results, the length has the most significant influence on the AVERAGE. The contribution of this particular regression model is the confirmation of the influence of the route ROW type on the AVERAGE. The presented results show that the coefficient of the C2 type has a negative sign, which confirms the starting hypothesis that an increase in the degree of independence decreases the value of AVERAGE. A similar influence could be expected for the ROW type B, but the data used in this model did not confirm the starting hypothesis, at least not regarding the DT. The reason can be found in the fact that, out of all lines included in the analysis, only tram line 6 has this ROW type. Its advantages are not sufficiently used due to a large number of signalised intersections, significant route overlapping with other tram lines and left turns of the vehicles on other tram lines (vehicle overtaking is not possible). Also, the sample of SSDs with route type B might be very small compared to the total population, which prevents determining the significance. In addition to the length of the SSD and the C2 ROW type, DT is also influenced by all intersection types. The number of traffic lanes has statistical significance only in two morning periods, while its influence in the MOP period is just slightly above the significance threshold. The influence of direction on the AVERAGE is also period-specific, which supports the idea of performing the multivariate analysis for each characteristic period separately: direction has a significant influence only in the MP period. Here the Beta coefficient has a positive sign, which means that the direction "toward" increases the AVERAGE value.
The evaluation of the independent variables has shown that the planned headway and planned joint vehicle frequency have no statistically significant impact. The t-test shows that the following structural elements also do not have statistical significance: number of pedestrian crossings and right turns. The existence of parking on the line route is statistically significant only in the MOP period when it has a positive coefficient as envisaged, which means that the existence of parking increases the AVERAGE.
The model evaluation presented in the previous section shows that the coefficient of determination has the highest values in the case of MIN as the dependent variable. These results should be interpreted with caution, since the length of the SSD is the most influential independent variable and has a crucial role in the MIN value. As in the previous analysis, the direction has statistical significance only during the MP period, when it increases the value of MIN. In contrast to the AVERAGE, the MIN is significantly influenced by the ROW type B. Unexpectedly, this influence has a positive sign. The reason for such an influence of this route type has already been explained. It is additionally confirmed by the statistical significance of the joint frequency during the MP period, since it also increases the MIN, similarly to the ROW type B. Some additional differences can be seen in the statistical significance of right turns with a positive sign (which is expected), and the lack of influence of the number of traffic lanes on the MIN. In addition, the number of signalised intersections without priority has a significant influence on the dependent variable, with a positive sign, except during the MP period.
Given that the DT distribution at an SSD is positively skewed, i.e. it has a long right tail, it should be expected that the independent variables affecting the AVERAGE will also have a significant influence on the MAX. The independent variables with acceptable statistical significance are almost identical in both cases. The difference is reflected in the level of influence of each independent variable. In this case, the length is once again the dominant independent variable, but with a considerably lower influence than in the case of the AVERAGE. The values of the standardised coefficients of the ROW type C2 and all defined intersections show that these independent variables have a stronger influence on the MAX than on the AVERAGE. Alongside length, the number of signalised intersections has the most significant influence on the MAX.
The number of traffic lanes has a statistically significant influence during the AP period as well, in addition to the two morning periods. The existence of parking along the route increases the MAX in both morning periods.
The analysis of the model evaluation showed that the defined regression model could not be applied to explain the SD observed as a DT characteristic. Therefore, the analyses of the evaluation of the independent variables presented in the table refer to the case when the dependent variable is the value of SDLOG.
Deviation is the only DT characteristic which is not dominantly influenced by the length of the SSD. The DT deviation is most significantly influenced by the number of signalised intersections. The stated influence is expectedly positive. In contrast to the previous independent variable, the C2 route type has a negative influence on the SDLOG value, which means that the realised DT at the SSD with a separate yellow lane has a significantly smaller deviation. This fact additionally confirms the starting hypothesis that the route type has a significant influence on DT characteristics.
Furthermore, the regression model quality is also reflected in the evaluation of the influence of the independent variable "TOWARD" on the SDLOG. This independent variable has statistical significance only during the peak periods (morning and afternoon). In the MP period, the Beta coefficient is positive (vehicles moving toward the city centre have a greater DT deviation), while the same coefficient is negative in the AP period (vehicles moving toward the city centre have a smaller DT deviation).
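When reading the signs and sizes of coefficients for SDLOG, it helps to recall how a linear model on a base-10 logarithmic scale maps back to the original scale. The derivation below is a general property of such models and uses the non-standardised coefficients; the symbols β_0, x_i and ε follow the usual regression notation and are assumptions of this sketch rather than notation quoted from the paper.

```latex
\log_{10}(SD) = \beta_0 + \sum_{i} \beta_i x_i + \varepsilon
\;\Longrightarrow\;
SD = 10^{\beta_0} \cdot \prod_{i} 10^{\beta_i x_i} \cdot 10^{\varepsilon}

% A one-unit increase in x_i, with the other variables held constant,
% multiplies SD by the factor 10^{\beta_i}:
\frac{SD\,(x_i + 1)}{SD\,(x_i)} = 10^{\beta_i}
```

A negative coefficient therefore corresponds to a factor below 1 (a proportional reduction in the deviation), and a positive coefficient to a factor above 1, so the effects on the original scale are multiplicative rather than additive.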

CONCLUSION
Based on the starting hypotheses stating that the route characteristics and other elements of the line structure influence DT characteristics and that the influence level can be quantified, the methodology was defined and applied to the selected PT lines in Belgrade.
The results of the evaluation of the defined regression model showed that the selected independent variables explained the dependent variables (DT characteristics) to a significant degree. The influence was also quantified, which determined the significance of each independent variable in the regression model for each of the defined DT characteristics separately. The analyses conducted for each of the characteristic operating periods show that, apart from affecting DT characteristics, the characteristic periods significantly affect both the presence and the level of influence of the defined independent variables on the dependent variable.
The most significant independent variables are: length of the SSD, C2 ROW type, number of traffic lanes, all types of intersections and number of left turns.
Driving direction has a pronounced influence on all dependent variables, but only during the morning and afternoon peak periods. On the other hand, the B ROW type and the number of right turns affect the MIN.
The number of pedestrian crossings is the only independent variable whose significant influence was not registered. The results of the applied regression model also showed that operating characteristics (headway and joint frequency) did not have a significant influence on DT characteristics.
The results show that the created regression model successfully explains the correlation between the defined independent variables and the dependent variables representing the basic DT characteristics. The regression model can be applied regardless of the presence of modern vehicle management systems in the PT system. We assume that the regression model is also suitable for predicting DT characteristics. The model can also be applied to improve and adjust the operation of the existing systems for the prediction of vehicle driving time, i.e. prediction of vehicle arrival time at stops. The practical importance of the research thus lies in supporting the improvement and development of advanced systems for predicting arrival time, which represent a significant added value to existing passenger information systems in PT. The results of the conducted research can also be applied to improve investment policy by determining priorities for reducing the total travel time of vehicles and PT users. At the operational level, the practical application of the study is reflected in the improvement of the scheduling process, especially in terms of quality planning of PT vehicle round trip time for characteristic periods.