Why This Matters

Public transit agencies need accurate occupancy forecasts for operational planning, but ridership data on many routes is sparse and heavily imbalanced: most trips have low occupancy and only a few have high occupancy. A GCN-based approach suits this setting because it exploits the graph structure of the transit network, learning patterns shared across stops and routes that generalize better where individual routes have little data.

What We Did

This paper proposes a graph convolutional network (GCN) framework for bus ridership forecasting that addresses data sparsity and class imbalance in public transit occupancy prediction. The bus network is modeled as a graph built from its stops and routes, which lets the model capture spatio-temporal dependencies and learn patterns specific to each route. Data augmentation and focal loss are used to handle the heavy-tailed occupancy distribution, in which high-occupancy trips are rare.
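
To illustrate why focal loss helps with rare high-occupancy trips, here is a minimal sketch in plain Python. It is not the paper's implementation: the three occupancy classes, the example logits, and the values of `gamma` and `alpha` are assumptions chosen only for illustration. It uses the standard form FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t), which reduces to cross-entropy when gamma is 0.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Standard categorical cross-entropy for a single example."""
    return -math.log(softmax(logits)[target])


def focal_loss(logits, target, gamma=2.0, alpha=1.0):
    """Focal loss for a single example.

    gamma and alpha are illustrative defaults, not values from the paper.
    The factor (1 - p_t) ** gamma shrinks the loss of well-classified examples.
    """
    p_t = softmax(logits)[target]
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


# Hypothetical logits over three occupancy classes: 0 = low, 1 = medium, 2 = high.
easy = [4.0, 0.5, 0.1]  # true class 0 (low), predicted confidently
hard = [2.0, 0.5, 1.0]  # true class 2 (high), but the model prefers "low"

for name, logits, target in [("easy", easy, 0), ("hard", hard, 2)]:
    ce = cross_entropy(logits, target)
    fl = focal_loss(logits, target)
    print(f"{name}: cross-entropy={ce:.4f} focal={fl:.4f} ratio={fl / ce:.4f}")
```

The printed ratio shows how much of the cross-entropy loss survives the focal weighting: the confidently classified low-occupancy example keeps only a small fraction, while the misclassified high-occupancy example keeps most of its loss, so training focuses more on the rare class.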

Key Results

Evaluation on real-world data from WeGo Public Transit in Nashville shows that the GCN approach significantly outperforms traditional baselines, including random forest and XGBoost. The gains are largest for high-occupancy events, which matter most for preventing overcrowding and maintaining service quality.

Cite This Paper

@inproceedings{samir2024smartcomp,
  author = {Gupta, Samir and Khanna, Agrima and Talusan, Jose Paolo and Said, Anwar and Freudberg, Dan and Mukhopadhyay, Ayan and Dubey, Abhishek},
  booktitle = {2024 IEEE International Conference on Smart Computing (SMARTCOMP)},
  title = {A Graph Neural Network Framework for Imbalanced Bus Ridership Forecasting},
  year = {2024},
  acceptance = {32.9},
  month = {jun},
  abstract = {Public transit systems are paramount in lowering carbon emissions and reducing urban congestion for environmental sustainability. However, overcrowding has adverse effects on the quality of service, passenger experience, and overall efficiency of public transit causing a decline in the usage of public transit systems. Therefore, it is crucial to identify and forecast potential windows of overcrowding to improve passenger experience and encourage higher ridership. Predicting ridership is a complex task, due to the inherent noise of collected data and the sparsity of overcrowding events. Existing studies in predicting public transit ridership consider only a static depiction of bus networks. We address these issues by first applying a data processing pipeline that cleans noisy data and engineers several features for training. Then, we address sparsity by converting the network to a dynamic graph and using a graph convolutional network, incorporating temporal, spatial, and auto-regressive features, to learn generalizable patterns for each route.  Finally, since conventional loss functions like categorical cross-entropy have limitations in addressing class imbalance inherent in ridership data, our proposed approach uses focal loss to refine the prediction focus on less frequent yet task-critical overcrowding instances. Our experiments, using real-world data from our partner agency, show that the proposed approach outperforms existing state-of-the-art baselines in terms of accuracy and robustness.},
  contribution = {lead},
  keywords = {ridership forecasting, graph neural networks, public transit, occupancy prediction, data imbalance, spatio-temporal modeling},
  month_numeric = {6}
}
Quick Info
Year 2024
Keywords
ridership forecasting, graph neural networks, public transit, occupancy prediction, data imbalance, spatio-temporal modeling
Research Areas
transit, ML for CPS