Why This Matters

Public transit agencies increasingly collect detailed operational data through sensors and ticketing systems, but this data is often sparse and noisy. Optimizing each prediction task separately ignores the correlation between occupancy and delay. This work shows how multitask learning can exploit patterns shared by the two tasks to improve prediction accuracy, which is particularly valuable when training data for a specific route is limited. The resulting predictions remain useful in practice despite data quality problems.

What We Did

This paper develops a multitask learning approach for predicting both occupancy and delay in public transit systems from sparse and noisy Automatic Passenger Counter (APC) data. The method uses separate neural network models for occupancy and delay prediction while sharing learned representations between the two tasks. Careful data preprocessing handles missing values and sensor noise, and the shared representations let each task benefit from patterns learned for the other.
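
To make the idea of sharing learned representations concrete, here is a minimal sketch in plain Python. It is not the paper's architecture: the layer sizes, the feature vector, the binary framing of both tasks, and the loss weights `w_occ` and `w_delay` are all assumptions chosen for illustration. It shows one shared hidden layer feeding two task-specific heads, with the two task losses combined into a single training objective.

```python
import math
import random

def dense(x, weights, bias):
    """Affine layer: one output value per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def relu(values):
    return [max(0.0, v) for v in values]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def init_layer(n_out, n_in, rng):
    """Small random weights and zero biases (illustrative initialisation)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out

def forward(x, params):
    """Shared hidden representation feeding two task-specific heads."""
    shared = relu(dense(x, *params["shared"]))
    p_occ = sigmoid(dense(shared, *params["occupancy_head"])[0])
    p_delay = sigmoid(dense(shared, *params["delay_head"])[0])
    return p_occ, p_delay

def bce(p, y, eps=1e-9):
    """Binary cross-entropy for a single prediction."""
    return -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))

def joint_loss(x, y_occ, y_delay, params, w_occ=0.5, w_delay=0.5):
    """Weighted sum of both task losses; weights are hypothetical."""
    p_occ, p_delay = forward(x, params)
    return w_occ * bce(p_occ, y_occ) + w_delay * bce(p_delay, y_delay)

rng = random.Random(0)
params = {
    "shared": init_layer(4, 3, rng),          # used by both tasks
    "occupancy_head": init_layer(1, 4, rng),  # occupancy only
    "delay_head": init_layer(1, 4, rng),      # delay only
}

x = [0.2, 0.7, 0.1]  # hypothetical stop-level features
loss = joint_loss(x, y_occ=1, y_delay=0, params=params)
```

Because both heads read the same `shared` output, minimizing `joint_loss` would update the shared weights using signal from both tasks, which is the mechanism that lets a data-poor task borrow structure from a related one.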

Key Results

The multitask learning models outperform single-task baselines across test scenarios, improving delay prediction by as much as 6% in F1 score while producing equivalent or better results for occupancy. The approach handles data sparsity and noise through careful preprocessing and shared representation learning. The results show that training jointly on both tasks improves predictions, giving transit agencies better forecasts for planning and operations.

Cite This Paper

@inproceedings{Zulqarnain2023,
  author = {Zulqarnain, Ammar and Gupta, Samir and Talusan, Jose Paolo and Pugliese, Philip and Mukhopadhyay, Ayan and Dubey, Abhishek},
  booktitle = {2023 IEEE International Conference on Smart Computing (SMARTCOMP)},
  title = {Addressing APC Data Sparsity in Predicting Occupancy and Delay of Transit Buses: A Multitask Learning Approach},
  year = {2023},
  acceptance = {31},
  abstract = {Public transit is a vital mode of transportation in urban areas, and its efficiency is crucial for the daily commute of millions of people. To improve the reliability and predictability of transit systems, researchers have developed separate single-task learning models to predict the occupancy and delay of buses at the stop or route level. However, these models provide a narrow view of delay and occupancy at each stop and do not account for the correlation between the two. We propose a novel approach that leverages broader generalizable patterns governing delay and occupancy for improved prediction. We introduce a multitask learning toolchain that takes into account General Transit Feed Specification feeds, Automatic Passenger Counter data, and contextual temporal and spatial information. The toolchain predicts transit delay and occupancy at the stop level, improving the accuracy of the predictions of these two features of a trip given sparse and noisy data. We also show that our toolchain can adapt to fewer samples of new transit data once it has been trained on previous routes/trips as compared to state-of-the-art methods. Finally, we use actual data from Chattanooga, Tennessee, to validate our approach. We compare our approach against the state-of-the-art methods and we show that treating occupancy and delay as related problems improves the accuracy of the predictions. We show that our approach improves delay prediction significantly by as much as 6% in F1 scores while producing equivalent or better results for occupancy.},
  contribution = {lead},
  keywords = {multitask learning, transit prediction, occupancy forecasting, delay prediction, sparse data, automated passenger counting, machine learning for transit, operational predictions}
}
Quick Info
Year 2023
Keywords
multitask learning, transit prediction, occupancy forecasting, delay prediction, sparse data, automated passenger counting, machine learning for transit, operational predictions
Research Areas
transit, ML for CPS