Why This Matters

Large-scale embedded systems need autonomous fault detection and recovery because manual intervention is often infeasible. This is especially true in space missions, where communication latency and physical inaccessibility limit how operators can respond. The reflex and healing approach offers a systematic way to build fault management into the system architecture, so that a system can keep operating reliably despite failures. The work's main contribution is a hierarchical form of fault management intended to scale to complex distributed systems.

What We Did

This paper presents a reflex and healing architecture for software health management in complex embedded systems such as those used in space missions. The framework employs hierarchical reflex engines that detect discrepancies through monitoring, diagnose faults using Timed Fault Propagation Graphs, and execute fault mitigation strategies using state machines. The architecture supports both reactive and proactive healing actions coordinated across multiple hierarchical levels.
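
The detect-then-mitigate loop described above can be sketched in a few lines. This is a minimal illustration and not the paper's implementation: the `ReflexEngine` class, its `monitor` callback, the three states, and the retry budget are all hypothetical names and choices made for this example, and the Timed Fault Propagation Graph diagnosis step is left out entirely.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Callable, List


class State(Enum):
    NOMINAL = auto()
    MITIGATING = auto()
    ESCALATED = auto()


@dataclass
class ReflexEngine:
    """Hypothetical reflex engine: a monitor plus a small mitigation state machine."""
    name: str
    monitor: Callable[[float], bool]  # returns True when a reading is discrepant
    max_retries: int = 2              # assumed retry budget before escalating
    state: State = State.NOMINAL
    retries: int = 0
    log: List[str] = field(default_factory=list)

    def step(self, reading: float) -> State:
        if not self.monitor(reading):
            # Healthy reading: return to nominal and reset the retry budget.
            self.state = State.NOMINAL
            self.retries = 0
            return self.state
        if self.retries < self.max_retries:
            # Discrepancy detected: try a local mitigation action first.
            self.retries += 1
            self.state = State.MITIGATING
            self.log.append(f"{self.name}: local mitigation attempt {self.retries}")
        else:
            # Local attempts exhausted: hand the fault to the next level up.
            self.state = State.ESCALATED
            self.log.append(f"{self.name}: escalating to parent manager")
        return self.state


# Assumed example monitor: flag any temperature reading above 80.0.
engine = ReflexEngine("local-0", monitor=lambda temp: temp > 80.0)
states = [engine.step(r) for r in (70.0, 85.0, 90.0, 95.0, 60.0)]
```

Keeping the monitor as a plain callback separates detection from the mitigation state machine, which loosely mirrors the separation of monitoring and mitigation in the description above.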

Key Results

The paper demonstrates a three-level hierarchical management structure (global, regional, and local) with reflex engines at each level that can detect and respond to faults autonomously. The framework successfully integrates with the ARINC-653 avionics standard, enabling applicability to safety-critical real-time systems. Case studies show how the architecture enables both fault isolation and coordinated recovery actions across system components.
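
One way to picture the three-level structure is as a chain of managers in which a fault is handled at the lowest level that knows how to deal with it and is otherwise passed upward. The sketch below assumes this escalation rule for illustration; the `Manager` class and the fault names are hypothetical and do not come from the paper or from ARINC-653.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Manager:
    """Hypothetical manager at one level of the hierarchy."""
    level: str
    handles: Set[str]                    # fault types this level can mitigate
    parent: Optional["Manager"] = None   # next level up, if any
    handled: List[str] = field(default_factory=list)

    def report(self, fault: str) -> str:
        """Return the name of the level that handled the fault."""
        if fault in self.handles:
            self.handled.append(fault)
            return self.level
        if self.parent is not None:
            # Not handled here: forward to the next level up.
            return self.parent.report(fault)
        return "unhandled"


# Assumed fault types, chosen only to illustrate escalation.
global_mgr = Manager("global", {"subsystem_outage"})
regional_mgr = Manager("regional", {"node_crash"}, parent=global_mgr)
local_mgr = Manager("local", {"task_deadline_miss"}, parent=regional_mgr)

faults = ("task_deadline_miss", "node_crash", "subsystem_outage", "unknown")
resolved_at = {f: local_mgr.report(f) for f in faults}
```

In this sketch every fault enters at the local level, so faults that can be handled locally never reach the regional or global managers.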

Cite This Paper

@inproceedings{Dubey2009e,
  author = {Dubey, Abhishek and Mahadevan, Nagbhushan and Kereskenyi, Robert},
  booktitle = {International workshop on software health management. IEEE conference on space mission challenges for information technology},
  title = {Reflex and healing architecture for software health management},
  year = {2009},
  category = {workshop},
  contribution = {lead},
  file = {:Dubey2009e-Reflex_and_healing_architecture_for_software_health_management.pdf:PDF},
  keywords = {software health management, fault detection, fault diagnosis, reflex engines, hierarchical architecture, mitigation strategies, real-time systems}
}

Quick Info

Year: 2009
Keywords: software health management, fault detection, fault diagnosis, reflex engines, hierarchical architecture, mitigation strategies, real-time systems
Research Areas: CPS, emergency, middleware