In the security industry we often hear terms such as "elastic", "self-healing", "cutting edge" and "threat intelligence" thrown around. These terms are used to describe methods and strategies for operating IT systems and functions in the information security arena. What I have found is that these terms can mean different things to different people. A recent discussion prompted me to outline what "self-healing" in computing really means to me and to the Jigsaw team.
Terms used in the IT field often originate in marketing material or technical documents. One problem is that many terms get bent, twisted or morphed into things they were never intended to mean. This became abundantly clear to me while discussing self-healing technologies at a recent cloud computing event. In this post I will discuss several methods of self-healing and expand on what I think the term means. Remember, though, that this is my take and your definition may differ. That does not mean my definition is correct or yours is wrong, just that one person's idea of self-healing may differ from another's.
Self-Healing Hardware
We are all aware of RAID arrays and other data-healing technologies. Even simple backups, if automated, could in some cases be considered self-healing hardware solutions, because the intent is to keep a copy of your data on backup hardware. To me, though, true hardware self-healing is a typical active-standby arrangement: the systems exchange heartbeats to detect when one piece of hardware fails, and then fail over, healing the outage by activating the other hardware. This is my definition, but again, yours may differ.
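To make the active-standby idea concrete, here is a minimal sketch in Python of heartbeat-based failure detection. It is not how any particular product does it. The class name, the node names and the timeout are all assumptions chosen for illustration, and a real system would receive heartbeats over a network link instead of through a method call.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class HeartbeatFailover:
    """Tracks heartbeats from the active node and promotes the standby on timeout."""

    timeout: float                 # seconds of silence tolerated (assumed value)
    active: str = "node-a"         # hypothetical node names
    standby: str = "node-b"
    last_beat: float = field(default_factory=time.monotonic)

    def beat(self, now: Optional[float] = None) -> None:
        """Record a heartbeat received from the active node."""
        self.last_beat = time.monotonic() if now is None else now

    def check(self, now: Optional[float] = None) -> bool:
        """Swap roles if the active node has been silent longer than the timeout.

        Returns True when a failover happened during this check.
        """
        now = time.monotonic() if now is None else now
        if now - self.last_beat > self.timeout:
            self.active, self.standby = self.standby, self.active
            # Give the newly active node a full timeout window of its own.
            self.last_beat = now
            return True
        return False
```

In use, the standby side would call `check()` on a timer and `beat()` whenever a heartbeat arrives. The `now` parameter exists only so the logic can be exercised without waiting on a real clock.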
Let's Talk Cloud Computing and Self-Healing
In my daily work at Jigsaw, I always build self-healing capabilities into our systems. As a managed security provider, we have to provide access to data at wire speed. As soon as we detect malicious activity, our system has to respond and inoculate all of our unaffected customers so that they are not vulnerable to the condition that just took place. This requires that we build cloud computing platforms that can heal themselves, so that our customers always get the data they need to feed the Jigsaw or other manufacturers' appliances that protect their networks. So how do we self-heal our environments? We use several strategies to keep our systems operating efficiently and correctly. Below are some examples.
Self-Healing Examples and Techniques We Employ
Monitoring - Our monitoring system does not just observe; it can also take action when something occurs. For example, suppose we have two web servers configured. If the main web server no longer responds when the monitoring server tries to connect, the monitoring server can make a DNS change to point to the standby server, thereby healing the condition in which the primary server is offline, being rebooted or being maintained by our administrators. In our environment, everything begins with monitoring, and there are several tools available that we use to monitor our environments.
Heartbeats - As mentioned earlier, our systems may exchange heartbeats so that they know the status of other systems in our environment. Missed heartbeats may trigger data replication, backups or some other change, such as a DNS change that redirects valid traffic to a working system.
Round Robin DNS - While not specifically a self-healing method, round robin DNS distributes load by returning several server addresses for a single name. It is mentioned here because in many cases it can be implemented as part of a self-healing strategy.
Cloud distributed file systems and other systems can utilize self-healing technologies, and this is exactly what we do at Jigsaw. We highly suggest learning about the technologies used to ensure self-healing within cloud computing platforms.
Webmin - An administration tool that includes self-healing and monitoring capabilities, with automated responses to conditions.
HTCondor - A grid computing program that includes service monitoring of other components.
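The monitoring scenario described above, where a monitor repoints DNS at a standby web server once the primary stops answering, might look something like the following Python sketch. It is an illustration under stated assumptions, not our production tooling: `FailoverMonitor`, `tcp_alive` and the `update_dns` callback are hypothetical names, and the actual DNS change depends entirely on your DNS provider, so it is left as a function you supply.

```python
import socket
from typing import Callable


def tcp_alive(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class FailoverMonitor:
    """Probes a primary server and repoints DNS to a standby after repeated failures."""

    def __init__(
        self,
        primary: str,
        standby: str,
        port: int,
        update_dns: Callable[[str], None],   # hypothetical hook into your DNS provider
        probe: Callable[[str, int], bool] = tcp_alive,
        max_failures: int = 3,               # assumed threshold to avoid flapping
    ) -> None:
        self.primary = primary
        self.standby = standby
        self.port = port
        self.update_dns = update_dns
        self.probe = probe
        self.max_failures = max_failures
        self.failures = 0
        self.current = primary

    def poll(self) -> str:
        """Run one probe and return the address DNS currently points at."""
        if self.current != self.primary:
            # Already failed over; failing back is left to an administrator.
            return self.current
        if self.probe(self.primary, self.port):
            self.failures = 0
        else:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.update_dns(self.standby)
                self.current = self.standby
        return self.current
```

A scheduler would call `poll()` every few seconds. Requiring several consecutive failures before acting is a design choice made here so that a single dropped connection does not trigger a DNS change; keep in mind that clients only notice the change after the record's TTL expires.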