Smart Public Transit - Transit Hub

This project addresses urban transportation and congestion by building analytical tools that help customers and transit agencies reduce uncertainty and optimize transit operations. We address this problem on three fronts: data analytics; planning and analysis tools for understanding and projecting the impact of transportation choices; and scalable data stores that enable cities to operate their own data lakes and analytics engines.

Data Analytics

We focus on data analytics to understand bottlenecks and improve operational reliability. We start by collecting multimodal data about transit operations, traffic, public events, and congestion from the cities of Nashville and Chattanooga. We then analyze this data to understand the causes of transit delays and to provide tools that help the community and transit operators with both long-term planning and short-term delays.

Some results from this work are listed below.

Deep Learning Based Anomaly Detection

Non-recurring traffic congestion is caused by temporary disruptions such as accidents, sports games, and adverse weather. We use historical traffic speed, jam factor (a traffic congestion indicator), and event data collected over a year in Nashville, TN to train a multi-layered deep neural network. The traffic dataset contains over 900 million records. The trained network is then used to classify real-time data and identify anomalous operations. Our model reaches an accuracy of 98.73 percent when identifying traffic congestion caused by football games, which compares favorably with traditional statistical and machine learning approaches. Our approach first encodes the traffic across a region as an image. The image data from different timestamps is then fused with event- and time-related data. Next, a crossover operator is used as a data augmentation method to generate training datasets with more balanced classes. Finally, we use receiver operating characteristic (ROC) analysis to tune the sensitivity of the classifier.
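To illustrate the ROC-based tuning step, the sketch below shows one common way to choose a decision threshold: sweep the classifier scores and keep the cutoff that maximizes Youden's J statistic (true positive rate minus false positive rate). The scores, labels, and function names are hypothetical, and the criterion is an assumption, since the text above does not say which one was used for this work.

```python
def roc_points(scores, labels):
    """Return (threshold, tpr, fpr) for each distinct score used as a cutoff."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for threshold in sorted(set(scores), reverse=True):
        # A sample is flagged as anomalous when its score reaches the cutoff.
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((threshold, tp / positives, fp / negatives))
    return points


def pick_threshold(scores, labels):
    """Pick the cutoff with the largest Youden's J = TPR - FPR (assumed criterion)."""
    best = max(roc_points(scores, labels), key=lambda p: p[1] - p[2])
    return best[0]


# Hypothetical classifier scores and ground truth (1 = event-driven congestion).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 1, 0, 0, 0]
threshold = pick_threshold(scores, labels)
```

Lowering the threshold makes the classifier more sensitive at the cost of more false alarms; a different criterion, such as a fixed false positive budget, would fit the same `roc_points` output.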

Slides: DxNAT: Deep Neural Networks for Explaining Non-recurring Traffic Congestion

Surrogate Data Sensing

Data generated by GPS-equipped transit vehicles can provide surrogate sensing of traffic conditions in the city. We propose a multivariate predictive multi-model approach called SpeedPro. It identifies similar clusters of operation from historical data that includes the real-time position of a probe vehicle, weather, and driver identifier, and then employs different models to estimate the traffic speed in real time as a function of current weather and transit vehicle speed. The work was published at the SmartSys 2017 workshop. See the following slides: SpeedPro: A Predictive Multi-Model Approach for Urban Traffic Speed Estimation
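The multi-model idea can be sketched as follows: assign each observation to its nearest cluster of operating conditions, then apply a model fitted only on that cluster's history. This is a minimal illustration and not the SpeedPro implementation. The centroids are assumed to be given (the clustering step itself is omitted), a single rainfall feature stands in for the real feature set, and simple linear regression stands in for whatever models the published work uses.

```python
from collections import defaultdict


def nearest(centroids, features):
    """Index of the centroid closest to a feature vector (squared Euclidean)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((c - f) ** 2 for c, f in zip(centroids[i], features)),
    )


def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b (needs at least two distinct xs)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / var_x
    return a, mean_y - a * mean_x


def train(history, centroids):
    """Fit one bus-speed -> traffic-speed line per cluster."""
    groups = defaultdict(list)
    for features, bus_speed, traffic_speed in history:
        groups[nearest(centroids, features)].append((bus_speed, traffic_speed))
    return {k: fit_line(*zip(*pairs)) for k, pairs in groups.items()}


def estimate(models, centroids, features, bus_speed):
    """Estimate traffic speed with the model of the matching cluster."""
    a, b = models[nearest(centroids, features)]
    return a * bus_speed + b


# Hypothetical data: features = (rain in mm,), speeds in mph.
centroids = [(0.0,), (10.0,)]  # "dry" and "wet" operating conditions
history = [
    ((0.0,), 20.0, 30.0), ((1.0,), 30.0, 45.0),   # dry observations
    ((9.0,), 20.0, 25.0), ((11.0,), 30.0, 35.0),  # wet observations
]
models = train(history, centroids)
dry_estimate = estimate(models, centroids, (0.5,), 24.0)
wet_estimate = estimate(models, centroids, (8.0,), 24.0)
```

The same bus speed maps to different traffic speed estimates depending on which cluster the current conditions fall into, which is the point of keeping separate models.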

Understanding Delays and Optimizing Schedule

The on-time arrival performance of buses at stops is a critical metric that both riders and city planners use to evaluate the reliability of a transit system. Identifying the bottlenecks in a transit network, where delays are often abnormal, is the first step toward schedule optimization. We built a prescriptive analytics mechanism that identifies historical bus delay patterns and locates the bottlenecks in the transit network by measuring transit performance.
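As a simple illustration of measuring on-time performance, the sketch below computes the share of on-time arrivals per stop and flags stops that fall below a target. The tolerance window (up to 1 minute early, up to 5 minutes late), the 80 percent target, and the sample data are assumptions made for the example and are not taken from the project.

```python
from collections import defaultdict


def on_time_rate(arrivals, early_tol=60, late_tol=300):
    """Share of arrivals per stop within [-early_tol, +late_tol] seconds of schedule."""
    counts = defaultdict(lambda: [0, 0])  # stop -> [on_time, total]
    for stop, delay_s in arrivals:
        counts[stop][1] += 1
        if -early_tol <= delay_s <= late_tol:
            counts[stop][0] += 1
    return {stop: on / total for stop, (on, total) in counts.items()}


def bottlenecks(rates, min_rate=0.8):
    """Stops whose on-time rate is below the target, in alphabetical order."""
    return sorted(stop for stop, rate in rates.items() if rate < min_rate)


# Hypothetical (stop, delay in seconds) records; negative means early.
arrivals = [
    ("A", 30), ("A", 120),
    ("B", 400), ("B", 600), ("B", 100),
    ("C", -90), ("C", 0),
]
rates = on_time_rate(arrivals)
flagged = bottlenecks(rates)
```

In practice the records would be grouped further, for example by route, direction, and time of day, so that a bottleneck can be tied to a specific part of the timetable.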

Transit performance is affected by various factors, such as travel demand, traffic conditions, and weather. These stochastic factors make it very difficult to optimize timetables to match actual transit operation. To better understand the factors affecting delays, we built a system called DelayRadar. It uses multivariate linear regression and random forest models to analyze traffic and weather data and predict transit travel time. Further, we created a robust delay prediction algorithm that uses multiple data sources and combines clustering analysis with Kalman filters. We also developed a novel route segmentation mechanism that handles the issue of data sparsity. You can read more about it in these slides.
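To show how a Kalman filter can track delay from noisy observations, here is a minimal one-dimensional sketch. It covers only the filtering part (the clustering step is omitted), and it assumes a random-walk model in which the delay on a segment persists between observations. The noise variances and the sample measurements are hypothetical and are not values from the project.

```python
def kalman_step(estimate, variance, measurement, process_var, measurement_var):
    """One predict/update cycle of a scalar Kalman filter (random-walk model)."""
    # Predict: the delay is assumed to persist, but uncertainty grows.
    variance = variance + process_var
    # Update: blend prediction and measurement according to their uncertainty.
    gain = variance / (variance + measurement_var)
    estimate = estimate + gain * (measurement - estimate)
    variance = (1 - gain) * variance
    return estimate, variance


def track_delay(measurements, initial=0.0, initial_var=100.0,
                process_var=4.0, measurement_var=25.0):
    """Filter a sequence of observed delays (seconds) and return all estimates."""
    estimate, variance = initial, initial_var
    estimates = []
    for z in measurements:
        estimate, variance = kalman_step(
            estimate, variance, z, process_var, measurement_var
        )
        estimates.append(estimate)
    return estimates


# Hypothetical delays in seconds reported by successive buses on one segment.
observed = [60.0, 75.0, 50.0, 80.0, 70.0]
smoothed = track_delay(observed)
```

A larger `measurement_var` makes the filter trust each new bus report less and smooths the estimate more; a larger `process_var` lets it react faster to genuine changes in delay.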

To understand the impact of events, we built multi-task deep neural networks that use contextual features (e.g., scheduled sports games and forecasted weather conditions) to make context-aware predictions of the expected travel delay, as well as the likelihood of accidents on the bus routes. Compared with existing models that rely solely on static and historical data, using scheduled and predicted contextual information can provide a better estimate of transit system performance. Furthermore, the multi-task deep neural network architecture allows faster training and prediction and reduces the possibility of overfitting, which improves prediction accuracy. To learn more, read the SmartComp2018 paper and the following poster.

Finally, we use the long-term delay models to formulate an optimization problem for improving the fixed-line transit schedule. See the following slides.

Mobilytics Gym

As part of the work to improve the efficiency of public transit and urban transportation in general, we also build solutions that educate the community on the benefits of public transit. To that end, we built a simulation framework that evaluates the effect of personal transportation choices and helps cities evaluate the impact of incentive policies in nudging commuters towards alternate modes of travel, such as bike and car-share options. We leverage MATSim, an agent-based simulation framework, to integrate agent preference models that capture the altruistic behavior of an agent in addition to their disutility proportional to travel time and cost. These models are learned in a data-driven manner and can be used to evaluate the sensitivity of an agent to system-level disutility and to monetary incentives given, e.g., by the transportation authority. The framework provides a standardized environment for evaluating the effectiveness of any particular incentive policy of a city in nudging its residents towards alternate modes of transportation. We show the effectiveness of the approach and provide analysis using a case study from the Metropolitan Nashville area.
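The kind of agent preference model described above can be sketched as a disutility that combines travel time, monetary cost, incentives, and an altruism-weighted system-level cost, followed by a probabilistic mode choice. This is an illustrative sketch only: it does not use MATSim, the multinomial logit choice rule is an assumption, and every weight and number is hypothetical rather than learned from data.

```python
import math


def disutility(mode, altruism, time_weight=0.5):
    """Personal disutility plus an altruism-weighted share of the system cost."""
    personal = (
        time_weight * mode["time_min"]
        + mode["cost"]
        - mode.get("incentive", 0.0)
    )
    return personal + altruism * mode["system_cost"]


def choice_probabilities(modes, altruism):
    """Multinomial logit over negative disutility (assumed choice rule)."""
    scores = {name: -disutility(m, altruism) for name, m in modes.items()}
    top = max(scores.values())  # subtract the max for numerical stability
    weights = {name: math.exp(s - top) for name, s in scores.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


# Hypothetical trip options; system_cost stands for congestion imposed on others.
modes = {
    "car": {"time_min": 20.0, "cost": 6.0, "system_cost": 8.0},
    "bus": {"time_min": 35.0, "cost": 2.0, "system_cost": 1.0},
}
selfish = choice_probabilities(modes, altruism=0.0)
altruistic = choice_probabilities(modes, altruism=1.0)

# The same selfish agent, offered a hypothetical incentive of 4 for taking the bus.
incentivized_modes = dict(modes, bus=dict(modes["bus"], incentive=4.0))
nudged = choice_probabilities(incentivized_modes, altruism=0.0)
```

Comparing `selfish`, `altruistic`, and `nudged` shows the two levers the text describes: an agent can shift towards the bus either because it weighs system-level disutility or because a monetary incentive lowers its personal cost.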

Read more about it in this paper.


  1. M. Wilbur, C. Samal, J. P. Talusan, K. Yasumoto, and A. Dubey, Time-dependent Decentralized Routing using Federated Learning, in 2020 IEEE 23rd International Symposium on Real-Time Distributed Computing (ISORC), 2020.
  2. S. Basak, A. Dubey, and B. P. Leao, Analyzing the Cascading Effect of Traffic Congestion Using LSTM Networks, in IEEE Big Data, Los Angeles, CA, 2019.
  3. F. Sun, A. Dubey, J. White, and A. Gokhale, Transit-hub: a smart public transportation decision support system with multi-timescale analytical services, Cluster Computing, vol. 22, no. Suppl 1, pp. 2239–2254, Jan. 2019.
  4. S. Basak, F. Sun, S. Sengupta, and A. Dubey, Data-Driven Optimization of Public Transit Schedule, in Big Data Analytics - 7th International Conference, BDA 2019, Ahmedabad, India, 2019, pp. 265–284.
  5. S. Nannapaneni and A. Dubey, Towards demand-oriented flexible rerouting of public transit under uncertainty, in Proceedings of the Fourth Workshop on International Science of Smart City Operations and Platforms Engineering, SCOPE@CPSIoTWeek 2019, Montreal, QC, Canada, 2019, pp. 35–40.
  6. C. Samal, A. Dubey, and L. J. Ratliff, Mobilytics-Gym: A Simulation Framework for Analyzing Urban Mobility Decision Strategies, in IEEE International Conference on Smart Computing, SMARTCOMP 2019, Washington, DC, USA, 2019, pp. 283–291.
  7. A. Oruganti, S. Basak, F. Sun, H. Baroud, and A. Dubey, Modeling and Predicting the Cascading Effects of Delay in Transit Systems, in Transportation Research Board Annual Meeting, 2019.
  8. F. Sun, A. Dubey, C. Samal, H. Baroud, and C. Kulkarni, Short-Term Transit Decision Support System Using Multi-task Deep Neural Networks, in 2018 IEEE International Conference on Smart Computing, SMARTCOMP 2018, Taormina, Sicily, Italy, June 18-20, 2018, pp. 155–162.
  9. C. Samal, A. Dubey, and L. J. Ratliff, Mobilytics: An Extensible, Modular and Resilient Mobility Platform, in 2018 IEEE International Conference on Smart Computing, SMARTCOMP 2018, Taormina, Sicily, Italy, June 18-20, 2018, pp. 356–361.
  10. C. Samal, L. Zheng, F. Sun, L. J. Ratliff, and A. Dubey, Towards a Socially Optimal Multi-Modal Routing Platform, CoRR, vol. abs/1802.10140, 2018.
  11. C. Samal, F. Sun, and A. Dubey, SpeedPro: A Predictive Multi-Model Approach for Urban Traffic Speed Estimation, in 2017 IEEE International Conference on Smart Computing, SMARTCOMP 2017, Hong Kong, China, May 29-31, 2017, pp. 1–6.
  12. F. Sun, C. Samal, J. White, and A. Dubey, Unsupervised Mechanisms for Optimizing On-Time Performance of Fixed Schedule Transit Vehicles, in 2017 IEEE International Conference on Smart Computing, SMARTCOMP 2017, Hong Kong, China, May 29-31, 2017, pp. 1–8.
  13. A. Oruganti, F. Sun, H. Baroud, and A. Dubey, DelayRadar: A multivariate predictive model for transit systems, in 2016 IEEE International Conference on Big Data, BigData 2016, Washington DC, USA, December 5-8, 2016, pp. 1799–1806.
  14. S. Shekhar et al., A Smart Decision Support System for Public Transit Operations, in Internet of Things and Data Analytics Handbook, 2016.
  15. F. Sun, Y. Pan, J. White, and A. Dubey, Real-Time and Predictive Analytics for Smart Public Transportation Decision Support System, in 2016 IEEE International Conference on Smart Computing, SMARTCOMP 2016, St Louis, MO, USA, May 18-20, 2016, pp. 1–8.
  16. A. Dubey, M. Sturm, M. Lehofer, and J. Sztipanovits, Smart City Hubs: Opportunities for Integrating and Studying Human CPS at Scale, in Workshop on Big Data Analytics in CPS: Enabling the Move from IoT to Real-Time Control, 2015.