Model Predictive Analysis for Autonomic Workflow Management in Large-scale Scientific Computing Environments

S. Nordstrom, Abhishek Dubey, T. Keskinpala, R. Datta, S. Neema, T. Bapty

Fourth IEEE International Workshop on Engineering of Autonomic and Autonomous Systems (EASe'07) 2007

Why This Matters

Scientific computing workflows often consist of hundreds of interdependent tasks that must be executed across shared cluster resources, and a failure on a single node can stall an entire workflow. The innovation lies in applying model-based predictive analysis to determine workflow feasibility dynamically and to guide reallocation decisions. This lets a system decide which workflows should continue executing and which are predicted to fail.

What We Did

This paper develops model predictive analysis techniques for autonomic workflow management in large-scale scientific computing environments. The authors present WorkflowML, a modeling language for specifying job dependencies, data flows, and synchronization constraints, and develop a lookahead algorithm that predicts workflow execution failures. By predicting which jobs cannot complete under the current failure conditions, the approach enables proactive workflow modification to avoid stalled states.
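The idea of predicting which jobs cannot complete can be illustrated with a minimal sketch. It is not the paper's lookahead algorithm and does not use WorkflowML; it assumes a workflow reduced to a plain dependency map, a hypothetical job-to-node `placement`, and a set of failed nodes. All names here (`predict_blocked_jobs`, `deps`, `placement`, `failed_nodes`, `completed`) are illustrative assumptions.

```python
from collections import deque


def predict_blocked_jobs(deps, placement, failed_nodes, completed=frozenset()):
    """Return the set of jobs predicted not to complete.

    deps:         job -> set of jobs it depends on (assumed acyclic)
    placement:    job -> node the job is assigned to (hypothetical mapping)
    failed_nodes: nodes currently known to have failed
    completed:    jobs that already finished and are therefore unaffected
    """
    # Unfinished jobs assigned to a failed node cannot run.
    blocked = {
        job
        for job, node in placement.items()
        if node in failed_nodes and job not in completed
    }

    # Build the reverse edges: prerequisite -> jobs that wait on it.
    dependents = {}
    for job, prereqs in deps.items():
        for prereq in prereqs:
            dependents.setdefault(prereq, set()).add(job)

    # Look ahead: anything downstream of a blocked job is blocked too.
    queue = deque(blocked)
    while queue:
        job = queue.popleft()
        for child in dependents.get(job, ()):
            if child not in blocked and child not in completed:
                blocked.add(child)
                queue.append(child)
    return blocked
```

A workflow would be predicted to stall whenever the returned set is non-empty; a real implementation would also need to account for synchronization constraints and data availability, which this sketch leaves out.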

Key Results

The paper demonstrates model-driven workflow analysis using a simplified workflow model with synchronization, sequence, and data dependencies. The lookahead algorithm successfully predicts workflow stall conditions and can identify alternative execution paths that avoid predicted failures. Experimental simulations show that the approach can improve overall workflow completion compared to executing all workflows regardless of failure predictions.
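One way to picture the reallocation step is the sketch below. It is an illustration only, not the method evaluated in the paper: it assumes each job may list hypothetical alternative nodes in `candidates`, and it simply moves jobs off failed nodes onto the first healthy alternative. The function and parameter names are assumptions made for this example.

```python
def replan(placement, candidates, failed_nodes):
    """Move jobs off failed nodes where a healthy alternative exists.

    placement:    job -> currently assigned node
    candidates:   job -> ordered list of nodes the job could run on
    failed_nodes: nodes currently known to have failed

    Returns (new_placement, unplaceable), where unplaceable lists the jobs
    that sit on a failed node and have no healthy alternative.
    """
    new_placement = {}
    unplaceable = []
    for job, node in placement.items():
        if node not in failed_nodes:
            # The current assignment is still usable; keep it.
            new_placement[job] = node
            continue
        healthy = [n for n in candidates.get(job, []) if n not in failed_nodes]
        if healthy:
            # Take the first healthy alternative, in the order given.
            new_placement[job] = healthy[0]
        else:
            unplaceable.append(job)
    return new_placement, unplaceable
```

If `unplaceable` comes back non-empty, the workflow is still predicted to stall after reallocation, which is the case where continuing execution may not be worthwhile.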

@inproceedings{Nordstrom2007,
  author = {Nordstrom, S. and Dubey, Abhishek and Keskinpala, T. and Datta, R. and Neema, S. and Bapty, T.},
  booktitle = {Fourth IEEE International Workshop on Engineering of Autonomic and Autonomous Systems (EASe'07)},
  title = {Model Predictive Analysis for Autonomic Workflow Management in Large-scale Scientific Computing Environments},
  year = {2007},
  month = {mar},
  pages = {37--42},
  abstract = {In large scale scientific computing, proper planning and management of computational resources lead to higher system utilizations and increased scientific productivity. Scientists are increasingly leveraging the use of business process management techniques and workflow management tools to balance the needs of the scientific analyses with the availability of computational resources. However, the advancements in productivity from execution of workflows in a large scale computing environments are often thwarted by runtime resource failures. This paper presents our initial work toward autonomic model based fault analysis in workflow based environments},
  category = {conference},
  contribution = {colab},
  doi = {10.1109/EASE.2007.18},
  file = {:Nordstrom2007-Model_predictive_analysis_for_autonomicworkflow_management_in_large-scale_scientific_computing_environments.pdf:PDF},
  keywords = {workflow modeling, predictive analysis, scientific computing, job dependencies, fault prediction, workflow management, lookahead algorithm},
  month_numeric = {3}
}
