Why This Matters

When hardware fails in a large-scale embedded system, manual reconfiguration is infeasible; the system must autonomously adapt its task allocation to keep operating with diminished resources. GHOST contributes a systematic optimization process for finding healing actions that satisfy both system-designer and user requirements. This lets autonomous systems make informed recovery decisions rather than simply failing or restarting.

What We Did

This paper presents the GHOST (Guided Healing and Optimization Search Technique) algorithm for automated healing of large-scale embedded system structures. When a resource fault is detected, GHOST iteratively applies transformation criteria to find alternative system models that can continue operating with reduced resources. The approach defines three performance criteria (throughput, robustness, and similarity to the original system) and uses weighted optimization to balance these competing objectives during healing.
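To make the weighted balancing concrete, here is a minimal sketch that scores candidate system models with a weighted sum of the three criteria. It is an illustration under stated assumptions, not the paper's formulation: the `Candidate` class, the `weighted_score` and `pick_best` helpers, the normalization of each criterion to [0, 1], and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A hypothetical candidate system model; criteria assumed normalized to [0, 1]."""
    name: str
    throughput: float
    robustness: float
    similarity: float  # closeness to the original system structure


def weighted_score(candidate, weights):
    # Plain weighted sum; the paper may combine criteria differently.
    return (weights["throughput"] * candidate.throughput
            + weights["robustness"] * candidate.robustness
            + weights["similarity"] * candidate.similarity)


def pick_best(candidates, weights):
    # Return the candidate with the highest weighted score.
    return max(candidates, key=lambda c: weighted_score(c, weights))


# Two made-up candidates: one preserves the layout, one rebalances for speed.
candidates = [
    Candidate("keep_layout", throughput=0.5, robustness=0.5, similarity=1.0),
    Candidate("rebalance", throughput=1.0, robustness=0.5, similarity=0.5),
]

throughput_first = {"throughput": 0.75, "robustness": 0.0, "similarity": 0.25}
similarity_first = {"throughput": 0.25, "robustness": 0.0, "similarity": 0.75}

print(pick_best(candidates, throughput_first).name)  # rebalance
print(pick_best(candidates, similarity_first).name)  # keep_layout
```

Shifting weight from throughput to similarity flips the selected candidate, which is the kind of trade-off the weights are meant to expose.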

Key Results

GHOST handles both resource reduction cases (where the pool of available resources shrinks) and resource failure cases (where persistent faults block specific resources). The algorithm demonstrates healing transformations on hierarchical system structures, showing how tasks and management responsibilities can be redistributed to maintain system functionality. Examples illustrate how the optimization process trades throughput maximization against structural similarity to the original design.
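The resource failure case can be sketched as a simple reassignment problem. The example below is a greedy stand-in, not GHOST's search procedure: tasks on failed resources move to the least-loaded healthy resource with spare capacity, and every other task stays put to preserve similarity. The `heal` function, the flat (non-hierarchical) task map, and the capacity model are all assumptions made for illustration.

```python
def heal(assignment, failed, capacity):
    """Reassign tasks away from failed resources.

    assignment: dict mapping task -> resource
    failed:     set of resources blocked by persistent faults
    capacity:   dict mapping resource -> maximum number of tasks
    """
    healthy = [r for r in capacity if r not in failed]
    load = {r: 0 for r in healthy}
    healed, orphans = {}, []

    # Tasks on healthy resources keep their placement (preserves similarity).
    for task, resource in assignment.items():
        if resource in failed:
            orphans.append(task)
        else:
            healed[task] = resource
            load[resource] += 1

    # Greedily place each orphaned task on the least-loaded resource with room.
    for task in orphans:
        options = [r for r in healthy if load[r] < capacity[r]]
        if not options:
            raise RuntimeError(f"no healthy resource has capacity for {task}")
        target = min(options, key=lambda r: load[r])
        healed[task] = target
        load[target] += 1
    return healed


original = {"t1": "A", "t2": "A", "t3": "B", "t4": "C"}
capacity = {"A": 2, "B": 2, "C": 3}

healed = heal(original, failed={"A"}, capacity=capacity)
unchanged = sum(1 for t in original if healed[t] == original[t])
print(healed)     # {'t3': 'B', 't4': 'C', 't1': 'B', 't2': 'C'}
print(unchanged)  # 2
```

The count of unchanged placements gives a crude similarity measure; a fuller treatment would also weigh throughput and robustness when choosing among healed configurations.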

Full Abstract

Cite This Paper

@inproceedings{Dubey2006,
  author = {Dubey, Abhishek and Nordstrom, S. and Keskinpala, T. and Neema, S. and Bapty, T.},
  booktitle = {Third IEEE International Workshop on Engineering of Autonomic and Autonomous Systems (EASE'06)},
  title = {Verifying Autonomic Fault Mitigation Strategies in Large Scale Real-Time Systems},
  year = {2006},
  month = mar,
  pages = {129--140},
  abstract = {In large-scale real-time systems, many problems associated with self-management are exacerbated by the addition of time deadlines. In these systems, any autonomic behavior must not only be functionally correct but must also not violate properties of liveness, safety, and bounded-time responsiveness. In this paper we present and analyze a real-time reflex engine for providing fault mitigation capability to large-scale real-time systems. We also present a semantic domain for analyzing and verifying the properties of such systems along with the framework of real-time reflex engines.},
  category = {conference},
  contribution = {lead},
  doi = {10.1109/EASE.2006.24},
  file = {:Dubey2006-Verifying_autonomic_fault_mitigation_strategies_in_large_scale_real-time_systems.pdf:PDF},
  issn = {2168-1872},
  keywords = {system healing, resource adaptation, fault recovery, optimization, hierarchical systems, autonomous healing, embedded systems},
  month_numeric = {3}
}
Quick Info
Year 2006
Keywords
system healing, resource adaptation, fault recovery, optimization, hierarchical systems, autonomous healing, embedded systems
Research Areas
CPS, scalable AI, emergency