The evolution of the cyber threat landscape highlights the growing need for organizations to strengthen their ability to identify, analyze, and evaluate cyber risks before they evolve into security incidents. Although the terms “patch management” and “vulnerability management” are often used as if they were interchangeable, they are not. The confusion arises because applying patches is one of the many tools available in our arsenal for mitigating cyber risks.
Benefits and Risks of Patching (and Patch Management)
Before deciding whether to install a patch, it is important that we understand the associated benefits and risks. Is patching worth the effort? The most obvious reason for patching, and what organizations usually think of first, is the need to fix security flaws in either the OS or the applications. However, this is not the only benefit of patching promptly and correctly. Many vendors release patches to improve their applications’ stability. These improvements make a strong case for rolling out patches in the ICS environment, where the stability and uptime of critical devices are of the utmost importance. Lastly, patches can also resolve specific bugs or flaws in certain applications, which further strengthens the business case for patching.
However, alongside the benefits, there are equally weighty risks that argue against patching. This is closely related to how risk is perceived within the IT side of the business as opposed to the OT side. Within IT, the benefits outweigh the risks, as loss of data is considered a bigger concern than network downtime. For OT, on the other hand, system uptime is of great importance. How the two sides of an organization (IT/OT) view risk versus reward is vastly different, and we will examine that in the next section. Focusing on ICS environments, where reliability is a key factor, a major risk is taking down a critical network or component because of a malformed or corrupt patch. In addition, patching is very time-consuming and in some cases a full-time job, considering that over 15 new vulnerabilities are discovered every day.
Another factor to consider is the cost of testing the released patches. Again, the IT and the OT sides of the business have different cost factors to consider. Both sides have to build development labs in which patches can be tested before being rolled out to production systems. In the OT world, you have to buy hardware that mimics the real production systems, unlike the IT world, where you can mimic production systems with virtual environments. The logistics and costs of replicating OT production systems far outweigh those for IT systems. Furthermore, IT can use automated patch management solutions that vastly reduce the number of staff and hours required to test all those patches. Unfortunately, this is not the case with OT. Patches have to be tested on each individual device, and OT teams will most probably have to rely on vendor specialists to deliver the updates. This results in a much higher cost-to-benefit ratio than on the IT side.
The last thing we need to consider is vendor end-of-life (EOL) product cycles. This is not as much of a risk on the IT side of the organization. Testing and upgrading the OS, for example, is a lot easier with solutions such as virtual environments. Couple that with the reduced concern about uptime, and EOL issues are much easier to deal with in IT than in OT. In OT environments, some production systems have been around for twenty years or more. In most cases, they have probably never been upgraded or patched. Asking OT people to take the risk of patching a system that has been working flawlessly for decades, for the benefit of making it harder to hack, is a hard sell. The adage “it isn’t if you get hacked but when you get hacked” helps to highlight the risk of doing nothing. Unfortunately, many OT organizations still believe that it can’t or won’t happen to them.
However, the risk associated with “when we get hacked” should be looked at in greater detail: the probability of an unexpected, uncontrolled system shutdown should be weighed against the alternative of performing controlled, manual, segmented patching.
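One way to make this weighing concrete is a simple expected-cost comparison. The sketch below is only an illustration: the `Scenario` class, the probabilities, the outage durations and the hourly cost are all hypothetical values chosen for the example, not figures from any real assessment.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    """One way downtime could occur over the planning period (hypothetical model)."""
    name: str
    probability: float     # chance the scenario occurs in the period, 0.0 to 1.0
    downtime_hours: float  # outage length if the scenario occurs
    cost_per_hour: float   # cost of one hour of lost production


def expected_cost(s: Scenario) -> float:
    """Expected loss = probability x downtime x hourly cost."""
    return s.probability * s.downtime_hours * s.cost_per_hour


# Illustrative numbers only; substitute figures from your own risk assessment.
unpatched = Scenario("uncontrolled shutdown after compromise", 0.10, 72.0, 50_000.0)
patched = Scenario("controlled, segmented patch window", 1.00, 4.0, 50_000.0)

for s in (unpatched, patched):
    print(f"{s.name}: expected cost {expected_cost(s):,.0f}")
```

With these assumed inputs the planned patch window comes out cheaper, but the result is entirely driven by the numbers fed in. A fuller model would also add a scenario for a patch that fails and takes the device down.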
IT vs. OT
To understand the differences in risk perception and prioritization between IT and OT, it is useful to review how these two worlds view the CIA triad (confidentiality, integrity, availability). For the IT side, confidentiality has the highest priority. Losing something valuable such as customer or staff personal data could be catastrophic for any organization and could entail financial losses, reputational damage and regulatory penalties. Integrity is the second-highest concern for IT organizations. Branding and customer retention could be massively affected if an organization has to admit that it has been breached and that data or intellectual property has been stolen. An incident affecting integrity could also result in financial losses, such as fines or even the loss of business-as-usual revenue from unhappy customers. The last concern is availability. Organizations always strive to maintain system availability, especially on customer-facing systems. However, should a system go down, the mean time to repair (MTTR) is a lot shorter in IT environments than in OT organizations. Rebuilding a system from a virtual backup is a lot simpler than having a physical device removed from production and replaced with a new one, which usually involves vendor specialists and increases both the cost and the downtime.
On the other hand, for OT organizations, availability has the highest priority. This is completely understandable, as the cost of system downtime, even a short one, could run to millions of dollars or euros. Moreover, such downtime may have a significant impact on society as a whole. Just imagine how many households would be affected by an electric grid outage. Further, OT systems going out of production may hamper other organizations or industries, since the interconnections and interdependencies between products and services are very strong. Integrity has the second-highest priority, as it does for IT and for the same reasons: branding, loss of revenue and fines. Last in the priority list is confidentiality, although it should not be seen as a minor concern. Indeed, the loss of sensitive or secret data through industrial espionage can have even more dire consequences for the organization than the loss of personal data. Despite all these differences, IT and OT do share common ground, and that’s safety. But this is not the only similarity. Looking at the illustration below of systems and solutions specific to each side of the organization, we realize that they overlap in many areas such as asset discovery, vulnerability assessment, policy management, change detection, configuration assessment and log management.
With organizations converging OT and IT and essentially having both entities reporting under one technical umbrella, it is easy to understand the benefits of using an IT historian SIEM tool to analyze all the OT data. IT already has a centralized operations team and the tools in place to be able to quickly identify potential malicious patterns of interest and alert the OT team. The OT team then would have to deal with a single event rather than having to cope with the proverbial noise, thus reducing the headcount and the associated costs.
What Can Be Done If We Cannot Patch?
Having analyzed the differences between IT and OT and the risks and benefits of patching in ICS environments, we are in a better position to understand why there would be a need to avoid patching in certain circumstances. If we can’t patch, what else can be done? It all starts by acknowledging that patch management is a subset of vulnerability management. Vulnerability management is not a stand-alone scan-and-patch function. It’s a holistic function that takes a proactive view of managing the daunting task of addressing identified vulnerabilities in deployed hardware devices and software. Vulnerability management is more than just getting alerts whenever your infrastructure needs a patch applied. Vulnerability management is about making informed decisions and properly prioritizing what vulnerabilities to mitigate and how. This is achieved by embedding internal hooks for telemetry into all systems of interest as well as external hooks for threat intelligence from all sources. Based on these considerations, this is what ICS organizations should do as a bare minimum if they are not in a position to patch.
- Asset analysis or discovery to know what you have in your environment in order to protect it. This process could raise one basic security question: do we actually need all these assets, or are we spending time trying to secure things that are not required?
- Perimeter protection to fortify your organization against both physical and digital intrusion. This could include anything from firewalls to access controls.
- Segmentation, which comes with many benefits when trying both to defend against lateral movements and to contain a security incident so as not to harm the entire organization.
- Log management, which is not to be used as an IDS tool or to detect changes but rather as a tool designed to look for movement within the organization to detect potential attacks.
- Vulnerability assessment to determine potential weak points and to identify the vulnerability risk posture of each asset. Once the vulnerability scan is complete, a score is attached to each vulnerability based on the skills required to exploit the vulnerability and the privileges gained upon successful exploitation. The easier the vulnerability is to exploit and the higher the privilege gained, the higher the risk score will be.
- File Integrity Monitoring (FIM) to be able to monitor actual changes taking place within the ICS organization. While the previous steps mentioned predominantly cover external threats and monitoring, FIM takes a look inside the organization and tries to correlate movement with actual changes to reduce the noise further and raise the alarm.
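The scoring rule described in the vulnerability assessment item above (easier to exploit and higher privilege gained means a higher risk score) can be sketched in a few lines. This is a minimal illustration under stated assumptions: the ordinal scales, the multiplication used to combine them, and the asset and vulnerability identifiers are all hypothetical and do not reproduce any particular scanner's scoring model.

```python
from dataclasses import dataclass

# Hypothetical ordinal scales: a higher number means more risk.
EASE_OF_EXPLOIT = {"expert": 1, "intermediate": 2, "novice": 3}
PRIVILEGE_GAINED = {"info_leak": 1, "user": 2, "admin": 3}


@dataclass(frozen=True)
class Finding:
    asset: str
    vuln_id: str           # placeholder identifier, not a real CVE
    skill_required: str    # key into EASE_OF_EXPLOIT
    privilege_gained: str  # key into PRIVILEGE_GAINED


def risk_score(f: Finding) -> int:
    """Combine both factors; the score rises with ease and with privilege."""
    return EASE_OF_EXPLOIT[f.skill_required] * PRIVILEGE_GAINED[f.privilege_gained]


def prioritize(findings: list) -> list:
    """Return findings ordered from highest to lowest risk score."""
    return sorted(findings, key=risk_score, reverse=True)


findings = [
    Finding("plc-07", "VULN-001", "expert", "admin"),
    Finding("hmi-01", "VULN-002", "novice", "admin"),
    Finding("historian-02", "VULN-003", "intermediate", "user"),
]

for f in prioritize(findings):
    print(f.asset, f.vuln_id, risk_score(f))
```

When patching is not possible, a ranking like this can help decide which assets to segment or monitor more closely first.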
How Tripwire Helps
The Tripwire portfolio consists of best-of-breed solutions that can either work together or be used as standalone solutions, offering customers maximum flexibility. Tripwire’s File Integrity Manager is a file integrity and configuration assessment solution that is able to identify and alert on all changes that take place within an organization's network, providing detailed change information such as who made the change, when the change happened and what the actual change was by means of a side-by-side comparison report. Tripwire IP360 is a vulnerability management solution that not only discovers assets on the network but also scans them against a known vulnerability database of over 130 thousand unique tests. IP360 is also unique in that it is able to prioritize the highest-risk vulnerabilities first with its own scoring algorithm. This method of prioritization is very powerful, as it helps reduce the number of vulnerabilities an organization should focus on, thus facilitating an organization's patch management strategy. Tripwire LogCenter could be considered a smart historian that provides complete, secure, reliable log collection and highlights events of interest in a sea of data. ICS-specific asset discovery and inventory solutions have been designed to capture, process and alert on all the same critical elements that Tripwire Enterprise and IP360 do, but from a data packet analysis perspective, i.e., agentless “on the wire” traffic. Having all the solutions integrated would provide not only the security measures recommended as an absolute minimum (including patch management) but also processes that could reduce the amount of data required to detect potential threats. Having the ability to use both agent-based and agentless technologies would ensure that organizations are able to include all devices within their infrastructure with no concern about endpoint stability, warranties and varieties.
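As a generic illustration of the change detection idea behind file integrity monitoring, the sketch below hashes every file under a directory, stores that as a baseline, and later reports what was added, removed or modified. It is not how any Tripwire product is implemented; the `snapshot` and `compare` helpers and the file names are hypothetical, and the sketch only detects what changed, not who changed it or when.

```python
import hashlib
import tempfile
from pathlib import Path


def snapshot(root: Path) -> dict[str, str]:
    """Map each file path (relative to root) to the SHA-256 of its contents."""
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            key = path.relative_to(root).as_posix()
            digests[key] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def compare(baseline: dict[str, str], current: dict[str, str]) -> dict[str, list[str]]:
    """Report files added, removed or modified since the baseline."""
    common = baseline.keys() & current.keys()
    return {
        "added": sorted(current.keys() - baseline.keys()),
        "removed": sorted(baseline.keys() - current.keys()),
        "modified": sorted(p for p in common if baseline[p] != current[p]),
    }


# Demo on a throwaway directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "controller.cfg").write_text("setpoint=70\n")
    (root / "firmware.bin").write_bytes(b"\x00\x01")
    baseline = snapshot(root)

    (root / "controller.cfg").write_text("setpoint=95\n")  # modified
    (root / "firmware.bin").unlink()                       # removed
    (root / "unexpected.exe").write_bytes(b"MZ")           # added
    report = compare(baseline, snapshot(root))

print(report)
```

In practice the baseline would be stored somewhere the monitored device cannot alter, since a baseline an attacker can rewrite offers little assurance.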
If an organization deems a device critical, regardless of what the device actually is, then it should be monitored and reported on – no excuses! That's the philosophy behind patch management.