cureIT - Adaptive Immunity for Software: Making Systems and Services Autonomously Self-Healing


Duration
01.10.2019 - 31.03.2025

Software has become a central part of nearly all economic activity, and our daily lives increasingly depend on complex software-intensive systems and services. Failures in these systems can therefore affect thousands or even millions of people and cause massive damage.

Despite significant investments in software testing, much of our software is still plagued by failures. One reason is that the existing techniques for software testing are mainly aimed at checking that the conditions corresponding to known or anticipated problems do not occur. However, the complexity of modern software makes it impossible to anticipate all problems that could be encountered.

The main goal of the cureIT project is to significantly increase the dependability, robustness, and resilience of today's software systems by addressing the faults that remain after thorough testing. We do this by developing new methods and techniques that help software engineers create so-called self-healing software systems: systems that can autonomously detect unanticipated faults during execution, diagnose their causes, and recover from them.

To achieve this goal, we build on the notion of an artificial immune system. Similar to the human immune system, which recognizes and takes care of unanticipated "foreign bodies" or infections, an artificial immune system will recognize and handle unanticipated faults that could have negative effects. In particular, the project will address the following challenges: (1) Techniques that detect failures by learning a system's normal behavior and recognizing when the system behaves abnormally. (2) Adaptive learning techniques that enable early recognition of failures similar to ones that have been seen before. (3) Cost-effective techniques for diagnosing the root causes of a failure and for containing its impact, both inside and outside the system. (4) Techniques for systematically evaluating whether self-healing software functions correctly.
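As an illustration of challenge (1), the following is a minimal sketch of negative selection, a classic technique from the artificial immune system literature. It is not taken from the cureIT project; the function names, the two-dimensional feature space normalized to [0, 1], and the radius parameters are all assumptions made for this example. Random candidate detectors are kept only if they do not match any sample of normal ("self") behavior, and a new observation is flagged as abnormal when it falls close to a surviving detector.

```python
import math
import random


def train_detectors(normal_samples, n_detectors, self_radius, seed=0,
                    max_tries=100_000):
    """Generate detectors that do NOT match any normal ("self") sample.

    Hypothetical helper: samples are points in the unit hypercube, and a
    candidate is discarded if it lies within self_radius of a normal sample.
    """
    rng = random.Random(seed)
    dim = len(normal_samples[0])
    detectors = []
    tries = 0
    while len(detectors) < n_detectors and tries < max_tries:
        tries += 1
        candidate = tuple(rng.random() for _ in range(dim))
        # Negative selection: keep only candidates far from all normal data.
        if all(math.dist(candidate, s) > self_radius for s in normal_samples):
            detectors.append(candidate)
    return detectors


def is_anomalous(sample, detectors, detector_radius):
    """Flag a sample as abnormal if any detector lies within detector_radius."""
    return any(math.dist(sample, d) <= detector_radius for d in detectors)


if __name__ == "__main__":
    # Assumed "normal" behavior: a small cluster of (made-up) metric readings.
    normal = [(0.2 + 0.01 * i, 0.2 + 0.01 * j)
              for i in range(5) for j in range(5)]
    detectors = train_detectors(normal, n_detectors=1000, self_radius=0.1,
                                seed=42)
    for point in [(0.22, 0.23), (0.9, 0.9)]:
        print(point, "abnormal" if is_anomalous(point, detectors, 0.1)
              else "normal")
```

In this sketch, choosing a detector radius no larger than the self radius guarantees that no training sample is ever flagged, because every surviving detector is farther than the self radius from all of them. Coverage of the abnormal region depends on the number of detectors and is only probabilistic.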

The results thus far include a research agenda for self-healing software systems based on artificial immune systems (AISs), a survey of the main approaches to modeling AISs together with a prototype implementation of AIS-based anomaly detection, a self-healing smart office exemplar, and a method for systematically evaluating self-healing software systems using chaos engineering.