Why This Matters

Scientific workflows running on distributed infrastructure face intermittent hardware, network, and software faults that can compromise the reproducibility and validity of experiments. This work integrates formal verification methods with workflow execution to detect problems early and enable automatic recovery. Formally specifying workflow properties allows them to be checked rigorously against the implementation, avoiding error-prone manual monitoring.

What We Did

This paper uses runtime verification to design reliable execution frameworks for scientific workflows, combining execution tracking with online fault checking. Integrating runtime verification with workflow execution allows specified conditions to be verified periodically while the workflow runs. The framework detects anomalies by monitoring vital health parameters and provides participants with pre-specified conditions for fault tolerance.
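As a rough illustration of verifying conditions while a workflow runs, the sketch below checks a set of named conditions after every step and stops at the first violation. It is not the paper's framework: the names `Condition`, `Monitor`, `run_workflow`, and `make_step`, the state keys `elapsed_s` and `memory_mb`, and the thresholds are all hypothetical, and step durations are simulated rather than measured.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

State = Dict[str, float]


@dataclass
class Condition:
    """A named property that must hold in every observed state (hypothetical)."""
    name: str
    holds: Callable[[State], bool]


@dataclass
class Monitor:
    """Checks all conditions against a state and records violations."""
    conditions: List[Condition]
    violations: List[str] = field(default_factory=list)

    def verify(self, state: State) -> List[str]:
        failed = [c.name for c in self.conditions if not c.holds(state)]
        self.violations.extend(failed)
        return failed


def run_workflow(steps: List[Callable[[State], None]], monitor: Monitor) -> bool:
    """Run steps in order; verify after each one and stop on the first violation."""
    state: State = {"elapsed_s": 0.0, "memory_mb": 0.0}
    for step in steps:
        step(state)
        if monitor.verify(state):
            return False  # a real system would trigger a recovery action here
    return True


def make_step(duration_s: float, memory_mb: float) -> Callable[[State], None]:
    """Build a simulated step that reports its duration and memory use."""
    def step(state: State) -> None:
        state["elapsed_s"] += duration_s
        state["memory_mb"] = memory_mb
    return step


# Assumed limits for illustration only: a 10 s deadline and a 512 MB memory cap.
monitor = Monitor([
    Condition("deadline", lambda s: s["elapsed_s"] <= 10.0),
    Condition("memory", lambda s: s["memory_mb"] <= 512.0),
])
ok = run_workflow(
    [make_step(4.0, 200.0), make_step(7.0, 300.0), make_step(1.0, 100.0)],
    monitor,
)
print(ok, monitor.violations)  # False ['deadline']
```

The second step pushes the simulated elapsed time to 11 s, so the deadline condition fails and the third step never runs. Checking after each step keeps the monitor simple; a periodic, timer-driven check would catch violations inside long-running steps at the cost of extra coordination.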

Key Results

The framework integrates runtime verification with workflow execution and demonstrates both anomaly detection and enforcement of timing constraints. The results show that workflow properties can be monitored effectively during execution with minimal performance overhead. Together, these capabilities enable reliable scientific workflow execution with automatic detection of timing violations and anomalies.

Cite This Paper

@article{Dubey2009,
  author = {Dubey, Abhishek and Mehrotra, Rajat and Abdelwahed, Sherif and Tantawi, Asser N.},
  journal = {{SIGMETRICS} Performance Evaluation Review},
  title = {Performance modeling of distributed multi-tier enterprise systems},
  year = {2009},
  number = {2},
  pages = {9--11},
  volume = {37},
  bibsource = {dblp computer science bibliography, https://dblp.org},
  biburl = {https://dblp.org/rec/bib/journals/sigmetrics/DubeyMAT09},
  contribution = {lead},
  doi = {10.1145/1639562.1639566},
  file = {:Dubey2009-Performance_modeling_of_distributed_multi-tier_enterprise_systems.pdf:PDF},
  keywords = {runtime verification, workflow execution, fault tolerance, formal methods, scientific computing, monitoring},
  project = {cps-middleware},
  tag = {platform},
  timestamp = {Tue, 06 Nov 2018 00:00:00 +0100},
  url = {https://doi.org/10.1145/1639562.1639566}
}

Quick Info

Year: 2009

Keywords

runtime verification, workflow execution, fault tolerance, formal methods, scientific computing, monitoring

Research Areas

middleware, scalable AI