gouse

Predicting transit time using ML and AI techniques

Abstract:

Transportation is essential to the contemporary economy. In today's fast-paced world, where everyone is short of time, people want to know transit durations so they can plan better. As a result, research communities have given intelligent transit systems a lot of attention. Both traffic engineers and users of the highway network depend on accurate transit times [1]. This work explores the transit time of goods vehicles on highways between given start and destination points. Transit time is the time a goods vehicle takes to travel from its source station to its destination, measured in hours, days, or months. Many factors can influence transit time, including geographical location, mode of transit, season, wind direction, and terrain (hill roads versus normal roads).

In this experiment we try to predict transit time from those influencing factors so that importers and exporters have the most accurate timelines possible and can plan their deliveries and usage accordingly. For this analysis we used the random forest and linear regression ML algorithms.

Keywords: Machine Learning, Linear Regression, Random Forest, Model Validation.

Objectives:

Predicting transit times is crucial for transportation. Accurate trip estimation could lower transportation expenses. Trip time prediction is also crucial for building sophisticated transit information systems. Linear regression and random forest regression are two of the various techniques employed for this task. Each is described in the paragraphs below.

A. LINEAR REGRESSION

Linear regression is a linear model that establishes the relationship between a dependent variable y (the target) and one or more independent variables X (the inputs). In this dataset we treat transit time as y (target) and 13 other variables as X (inputs): Temperature, Humidity, Pressure, Visibility, Vehicle age, Wind speed, Load in tons, Bearing, Weather condition, Service status, Destination, Road type, and Wind direction.
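As a minimal sketch of the idea above, the following fits a linear model by ordinary least squares with NumPy. The data is synthetic and the three numeric columns are hypothetical stand-ins for inputs such as load or wind speed; it is not the dataset or the code used in this work.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical numeric inputs (stand-ins for features such as load or wind speed).
n_samples = 200
X = rng.uniform(0.0, 1.0, size=(n_samples, 3))

# Assumed "true" relationship, used only to generate illustrative data.
true_intercept = 2.0
true_weights = np.array([5.0, -1.5, 0.5])
y = true_intercept + X @ true_weights + rng.normal(0.0, 0.1, size=n_samples)

# Add a column of ones so the intercept is estimated along with the weights.
X_design = np.column_stack([np.ones(n_samples), X])

# Ordinary least squares: choose coefficients that minimise squared residuals.
coef, *_ = np.linalg.lstsq(X_design, y, rcond=None)
intercept, weights = coef[0], coef[1:]

# Predictions are a weighted sum of the inputs plus the intercept.
y_pred = X_design @ coef
print("intercept:", round(float(intercept), 3))
print("weights:", np.round(weights, 3))
```

Categorical inputs such as Weather condition or Road type would need to be converted to numbers (for example by one-hot encoding) before a fit like this; how that was done here is not stated, so it is left out of the sketch.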

B. RANDOM FOREST

Random Forest (RF) is an ensemble supervised machine learning method that can be applied to categorical or numerical datasets as a classifier or a regressor. To develop an RF model, multiple random samples are drawn from the training dataset with replacement over several iterations, and a decision tree is trained on each of them. Each trained decision tree then returns the target variable's value for every new record in the test dataset. For regression, the final result is the average of the values predicted by all the trees. Because it reduces decision tree variance, random forest is resistant to noisy data and overfitting, and it is expected to have higher accuracy than individual decision trees. When a large dataset is available, RF usually works accurately and efficiently. It can also handle a large number of variables as model inputs. These characteristics make the random forest model an excellent choice.
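The bootstrap-and-average procedure described above can be sketched by hand with scikit-learn decision trees. This is an illustration on synthetic data, with the tree count and the data-generating formula chosen arbitrarily. It covers only the steps named in the paragraph; full random forest implementations also randomise the features considered at each split.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(42)

# Synthetic data (an assumption for illustration): nonlinear target plus noise.
X = rng.uniform(0.0, 1.0, size=(300, 2))
y = 10.0 * np.sin(2.0 * np.pi * X[:, 0]) + 3.0 * X[:, 1] + rng.normal(0.0, 0.5, 300)

X_train, y_train = X[:240], y[:240]
X_test, y_test = X[240:], y[240:]

n_trees = 25
trees = []
for i in range(n_trees):
    # Bootstrap sample: draw training row indices with replacement.
    idx = rng.integers(0, len(X_train), size=len(X_train))
    tree = DecisionTreeRegressor(random_state=i)
    tree.fit(X_train[idx], y_train[idx])
    trees.append(tree)

# Each tree predicts every test record; the ensemble output is the average.
per_tree = np.stack([t.predict(X_test) for t in trees])  # (n_trees, n_test)
forest_pred = per_tree.mean(axis=0)


def mse(pred):
    """Mean squared error against the held-out targets."""
    return float(np.mean((pred - y_test) ** 2))


mean_tree_mse = float(np.mean([mse(p) for p in per_tree]))
forest_mse = mse(forest_pred)
print("average single-tree MSE:", round(mean_tree_mse, 3))
print("averaged-ensemble MSE:", round(forest_mse, 3))
```

The error of the averaged prediction can never exceed the average error of the individual trees, which is the variance-reduction effect the paragraph refers to.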

Methodology:

Python was used as the coding tool, and historic transit data was used to train the models. The algorithms used were linear regression and random forest.

Dataset:

We used historic transit data and considered the following features:

Transit time: The total time taken to complete the trip.

Service status: Shows whether the trip was completed on time or exceeded its schedule and, if it exceeded, by how many days (30, 60, or 90).

Weather condition: Weather conditions play an important role because they affect transit time. Examples include heavy rain, snow, fog, unfavourable weather, and cloudy conditions.

The different variables are shown below:

[Figure: Variables]

Method: We run random forest and linear regression models on the dataset to predict the likely transit time, which helps in planning a trip around the influencing factors. The models are trained with a random forest regressor and a linear regression model.
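A workflow of this kind might look like the following scikit-learn sketch. The data is synthetic, with a deliberately nonlinear target, and the 80/20 split, the seed values, and the number of trees are assumptions made for illustration; none of these settings come from the experiment itself.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(7)

# Synthetic stand-in data: the target depends nonlinearly on the first input.
n = 600
X = rng.uniform(0.0, 1.0, size=(n, 3))
y = 50.0 * (X[:, 0] - 0.5) ** 2 + 5.0 * X[:, 1] + rng.normal(0.0, 0.3, n)

# Hold out 20% of the records for evaluation (an assumed split).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

models = {
    "linear_regression": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
}

# Fit each model on the training split and score it on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = r2_score(y_test, model.predict(X_test))

for name, score in scores.items():
    print(f"{name}: R-squared = {score:.3f}")
```

On data with a strongly nonlinear relationship like this, a straight-line fit captures little of the variation while the tree ensemble captures most of it, which is one common reason for a gap between the two models.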

Analysis and Forecasting:

To start with, we did a correlation analysis on the different variables available in the travel time data. The correlation matrix shows which variables have more impact on the dependent variable. We also examined other regression algorithms and took random forest as the best-suited regression model.

Correlation:
We did a correlation analysis to find the relationship between pairs of attributes, which helped us identify redundant data. The correlation figure in this section shows the impact of the available variables on the dependent variable, Travel Time. We found that variables such as Destination, Load in Tons, and Weather Condition have a direct correlation with Travel Time. Pressure, Visibility, Temperature, and Humidity have a high correlation with the Weather Condition variable. The Service status variable has a high correlation with the Road type variable. The Wind direction variable has a high correlation with Wind Speed and Travel Time.
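For reference, the Pearson correlation coefficient that underlies an analysis like the one above can be computed as in this short sketch. The column names and values are made up for illustration and are not taken from the dataset.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up example values; the column names are hypothetical.
columns = {
    "load_tons": [5, 10, 15, 20, 25],
    "wind_speed": [12, 7, 9, 14, 8],
    "travel_time": [11, 19, 32, 41, 49],
}

target = "travel_time"
correlations = {
    name: pearson(values, columns[target])
    for name, values in columns.items()
    if name != target
}

# Rank inputs by the strength of their linear relationship with the target.
for name, r in sorted(correlations.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: r = {r:+.3f}")
```

Two inputs that correlate strongly with each other carry overlapping information, which is how a correlation matrix helps flag redundant variables.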

[Figure: Correlations]

Exploratory Data Analysis: The charts below show the exploratory data analysis done for the variables available in the dataset.

[Figure: Bar charts]

Model 1, Regression: The regression summary tables below describe the model summary, accuracy, and predictor weights. According to the model, the R-squared value is 0.428 and the adjusted R-squared value is 0.427.
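The relationship between R-squared and adjusted R-squared can be written as a small function. The sample count used in the example call is hypothetical, since the number of records is not stated here; only the R-squared value and the count of 13 predictors come from the text.

```python
def adjusted_r2(r2, n_samples, n_features):
    """Adjusted R-squared: penalises R-squared for the number of predictors."""
    dof = n_samples - n_features - 1
    if dof <= 0:
        raise ValueError("need more samples than predictors plus one")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / dof


# R-squared of 0.428 with 13 predictors; n_samples=1000 is an assumed value.
example = adjusted_r2(0.428, n_samples=1000, n_features=13)
print(round(example, 4))
```

The adjustment shrinks as the number of records grows relative to the number of predictors, so a gap as small as 0.428 versus 0.427 is what one would expect when there are far more records than predictors.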

[Figure: Results]

Model 2, Random Forest Regressor: We first trained a linear regression model, which gave an accuracy of 42%, and then compared it with a random forest regressor, which gave more accurate results at 90%. Compared with the older linear regression technique, the random forest was more accurate by 48 percentage points. We also observed that the root mean squared error (RMSE) decreased rapidly to a healthy level. The reported error metrics are: Mean Absolute Error: 0.42916096051959735; Mean Squared Error: 10.277677122295003; Root Mean Squared Error: 3.205881645085327.
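The three error metrics listed above are related as in the following sketch. The true and predicted values are made up for illustration; only the two reported figures at the end are taken from the text.

```python
import math


def regression_errors(y_true, y_pred):
    """Return (MAE, MSE, RMSE) for paired true and predicted values."""
    errors = [t - p for t, p in zip(y_true, y_pred)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    return mae, mse, rmse


# Made-up transit times (true versus predicted), for illustration only.
y_true = [10.0, 12.0, 8.0, 15.0]
y_pred = [9.0, 12.5, 8.0, 18.0]
mae, mse, rmse = regression_errors(y_true, y_pred)
print(mae, mse, round(rmse, 4))

# Consistency check on the reported figures: RMSE is the square root of MSE.
reported_mse = 10.277677122295003
reported_rmse = 3.205881645085327
consistent = math.isclose(math.sqrt(reported_mse), reported_rmse, rel_tol=1e-6)
print("reported RMSE matches sqrt(MSE):", consistent)
```

RMSE is never smaller than MAE, and a gap as wide as the one reported (about 0.43 versus 3.21) typically indicates that a small number of large errors dominate the squared-error terms.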

Conclusion:

Compared with the other algorithms we examined, such as linear regression (accuracy: 42 percent) and its variants, random forest (accuracy: 90 percent) gives the best result. Predicted transit-time information lets road users organize their travel schedule both before and during a trip. It helps reduce transport operating costs and environmental impact. Accurate travel time information also helps the delivery industry improve service quality by delivering on time. However, the development of travel time estimation and prediction suffers from a shortage of traffic datasets and from heavy interference from the transport environment. This paper provides a review of travel-time studies that covers the variables of travel time, the measurement of travel time, methodologies for travel-time prediction and estimation, research difficulties, some relationships between other variables and travel time found in field data, and potential solutions for travel-time prediction studies.

References:

  1. Abbott-Jard M, Shah H, Bhaskar A. 2013. Empirical evaluation of Bluetooth and Wifi scanning for road transport. Australasian Transport Research Forum (ATRF), 36th Edition. 14.

  2. Abdollahi M, Khaleghi T, Yang K. 2020. An integrated feature learning approach using deep learning for travel time prediction. Expert Systems with Applications 139(4):112864 DOI 10.1016/j.eswa.2019.112864.

  3. Abduljabbar R, Dia H, Liyanage S, Bagloee SA. 2019. Applications of artificial intelligence in transport: an overview. Sustainability 11(1):189 DOI 10.3390/su11010189.

  4. Achar A, Bharathi D, Kumar BA, Vanajakshi L. 2019. Bus arrival time prediction: a spatial Kalman filter approach. IEEE Transactions on Intelligent Transportation Systems 21(3):1298–1307 DOI 10.1109/TITS.2019.2909314.

  5. J.W.C. Van Lint, Online learning solutions for freeway travel time prediction, IEEE Trans. Intell. Transp. Syst. 9 (2008) 38–47.

  6. G. Huisken, E.C. van Berkum, A comparative analysis of short-range travel time prediction methods, 82nd Annual Meeting of the Transportation Research Board, 2003.

  7. U. Mori, A. Mendiburu, M. Álvarez, J.A. Lozano, A review of travel time estimation and forecasting for advanced traveller information systems, Transp. A Transp. Sci. 11 (2015) 119–157.

  8. H.B. Celikoglu, Flow-based freeway travel-time estimation: a comparative evaluation within dynamic path loading, IEEE Trans. Intell. Transp. Syst. 14 (2013) 772–781.

  9. L. Li, X. Chen, Z. Li, L. Zhang, Freeway travel-time estimation based on temporal–spatial queueing model, IEEE Trans. Intell. Transp. Syst. 14 (2013) 1536–1541.

  10. F. Soriguera, F. Robuste, Requiem for freeway travel time estimation methods based on blind speed interpolations between point measurements, IEEE Trans. Intell. Transp. Syst. 12 (2010) 291–297.

  11. J.W.C. Van Lint, Reliable travel time prediction for freeways, Netherlands TRAIL Res. School (2004).

  12. J.W.C. Van Lint, C. Van Hinsbergen, Short-term traffic and travel time prediction models, Artif. Intell. Appl. to Crit. Transp. Issues 22 (2012) 22–41.

  13. E.J. Schmitt, H. Jula, On the limitations of linear models in predicting travel times, in: 2007 IEEE Intelligent Transportation Systems Conference, IEEE, 2007, pp. 830–835.

  14. M. Papageorgiou, I. Papamichail, A. Messmer, Y. Wang, Traffic simulation with METANET, in: Fundamentals of Traffic Simulation, Springer, 2010, pp. 399–430.

  15. P. Edara, R. Rahmani, H. Brown, C. Sun, Traffic Impact Assessment of Moving Work Zone Operations, Smart Work Zone Deployment Initiative (2017).

  16. N.B. Taylor, The CONTRAM dynamic traffic assignment model, Netw. Spat. Econ. 3 (2003) 297–322; L. Du, S. Peeta, Y.H. Kim, An adaptive information fusion model to predict the short-term link travel time distribution in dynamic traffic networks, Transp. Res. Part B Methodol. 46 (2012) 235–252.

  17. D. Laoide-Kemp, M. O’Mahony, Dealing with latency effects in travel time prediction on motorways, Transp. Eng. (2020) 100009.

  18. E. Castillo, M. Nogal, J.M. Menendez, S. Sanchez-Cambronero, P. Jimenez, Stochastic demand dynamic traffic models using generalized beta-Gaussian Bayesian networks, IEEE Trans. Intell. Transp. Syst. 13 (2011) 565–581.

  19. D. Billings, J.-S. Yang, Application of the ARIMA models to urban roadway travel time prediction-a case study, in: 2006 IEEE International Conference on Systems, Man and Cybernetics, IEEE, 2006, pp. 2529–2534

  20. W. Qiao, A. Haghani, M. Hamedi, A nonparametric model for short-term travel time prediction using bluetooth data, J. Intell. Transp. Syst. 17 (2013) 165–175.

  21. B. Yu, X. Song, F. Guan, Z. Yang, B. Yao, k-Nearest neighbor model for multiple-time-step prediction of short-term traffic condition, J. Transp. Eng. 142 (2016) 4016018.

  22. J. Zhao, Y. Gao, J. Tang, L. Zhu, J. Ma, Highway travel time prediction using sparse tensor completion tactics and K-nearest neighbor pattern matching method, J. Adv. Transp. 2018 (2018).

  23. D. Nikovski, N. Nishiuma, Y. Goto, H. Kumazawa, Univariate short-term prediction of road travel times, in: Proceedings. 2005 IEEE Intelligent Transportation Systems, 2005, IEEE, 2005, pp. 1074–1079.

  24. A. Simroth, H. Zahle, Travel time prediction using floating car data applied to logistics planning, IEEE Trans. Intell. Transp. Syst. 12 (2010) 243–253.

  25. C.-H. Wu, J.-M. Ho, D.-T. Lee, Travel-time prediction with support vector regression, IEEE Trans. Intell. Transp. Syst. 5 (2004) 276–281.

  26. G. Leshem, Y. Ritov, Traffic flow prediction using adaboost algorithm with random forests as a weak learner, in: Proceedings of World Academy of Science, Engineering and Technology, Citeseer, 2007, pp. 193–198.

  27. Y. Liu, Y. Wang, X. Yang, L. Zhang, Short-term travel time prediction by deep learning: a comparison of different LSTM-DNN models, in: 2017 IEEE 20th International Conference on Intelligent Transportation Systems (ITSC), IEEE, 2017, pp. 1–8.

  28. J. Zhao, Y. Gao, Y. Qu, H. Yin, Y. Liu, H. Sun, Travel time prediction: based on gated recurrent unit method and data fusion, IEEE Access 6 (2018) 70463–70472.

  29. X. Zeng, 2011. Dynamically predicting corridor travel time under incident conditions using a neural network approach.

  30. M. Yildirimoglu, N. Geroliminis, Experienced travel time prediction for congested freeways, Transp. Res. Part B Methodol. 53 (2013) 45–63.

  31. H. Chen, H.A. Rakha, Prediction of dynamic freeway travel times based on vehicle trajectory construction, in: 2012 15th International IEEE Conference on Intelligent Transportation Systems, IEEE, 2012, pp. 576–581.

  32. M. Wang, Q. Ma, Dynamic prediction method of route travel time based on interval velocity measurement system, in: Proceedings of 2014 IEEE International Conference on Service Operations and Logistics, and Informatics, IEEE, 2014, pp. 172–176

  33. S.-K.S. Fan, C.-J. Su, H.-T. Nien, P.-F. Tsai, C.-Y. Cheng, Using machine learning and big data approaches to predict travel time based on historical and real-time data from Taiwan electronic toll collection, Soft Comput. 22 (2018) 5707–5718

  34. N.-E. El Faouzi, R. Billot, S. Bouzebda, Motorway travel time prediction based on toll data and weather effect integration, IET Intell. Transp. Syst. 4 (2010) 338–345.

  35. C. Kamga, M.A. Yazıcı, Temporal and weather related variation patterns of urban travel time: considerations and caveats for value of travel time, value of variability, and mode choice studies, Transp. Res. Part C. 45 (2014) 4–16
