Earlier this year, I had the opportunity to present at S4x17 in Miami on the topic of deep packet inspection (DPI) technologies and the ways in which you could evaluate products that tout DPI features. At first glance, I thought, “Sure, no problem. How hard could it be?” It turns out that this is a difficult problem for a few reasons.
The difficulties come in a few forms. Most companies do not hand out their codebase or document how their DPI implementation is written, so a true comparison would require black-box testing, including fuzz testing, of each device. At the same time, different protocols have different complexities and idiosyncrasies that a DPI implementation may or may not handle, and vendors may add vendor-specific functions or ship implementations that only loosely adhere to the protocol specification, all of which should be taken into account when developing a DPI implementation.
These examples are only a small set of elements to consider when comparing products. I hope to go through a methodology, or at least a starting framework, that can take some qualitative ideas and convert them into a quantitative scheme for comparison.
To begin with, we must understand the concepts of the control plane and data plane. The data plane encompasses commands in which the HMI reads pressure or temperature data from a PLC or writes to a specific register at an interval. This is the actual process data or ladder logic running on your PLC. Meanwhile, the control plane is the set of actions that can update the firmware or stop the controller entirely. Its commands act on the underlying OS that runs the PLC, akin to executing a Windows update on your Windows PC. The distinction matters because many PLCs use the same network stream for firmware updates that you would use for reading data.
Essentially, the control plane and data plane traffic traverse the same TCP connection. This is why DPI is needed: a user certainly wants their HMI to receive data, but not at the cost of leaving their PLC open to firmware updates from an unauthenticated source. Without a DPI firewall, you cannot differentiate between the data plane and control plane messages. There are multiple approaches to implementing a DPI firewall, including a full-protocol implementation, a signature-based approach, a proxy-based approach, or machine learning. Some are more useful than others.
Incoming rant paragraph! A signature-based approach is a terrible mechanism to use in an industrial control system (ICS) world. A signature-based system is only a reactive mechanism. The signatures are built from an already discovered vulnerability; this means the attack vector is already out in the wild and could affect running systems. Signatures provide a shallow inspection and require signature database updates. (Internet access on the plant floor? No no.) A signature is typically made for a specific vulnerability. If one byte changes (malware mutation) in the attack vector, you have to build a new signature to mitigate it.
In addition, this approach is effectively building a blacklist rather than a whitelist. Blacklisting is a poor approach. Think of it as building a wall that is 6ft tall. Then the attacker comes in with a 7ft ladder and hops over the wall. So then you build the wall 8ft to stop the 7ft ladder, but the attacker has already built a 10ft ladder and hops over. You get the picture. A proactive approach is more effective.
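To make the one-byte-mutation point concrete, here is a minimal sketch that contrasts a byte-pattern signature with a whitelist built from the protocol specification. The signature bytes, the payloads, and the function names are invented for illustration and do not describe any real exploit or product. The whitelist covers only two Modbus read requests, whose request PDUs are five bytes long.

```python
# Hypothetical signature: a byte pattern copied from one previously seen bad PDU.
SIGNATURES = [bytes.fromhex("5a0041414141")]

# Whitelist: function code -> the only request PDU length that is accepted.
# FC3 (Read Holding Registers) and FC4 (Read Input Registers) are both 5 bytes.
ALLOWED_REQUESTS = {0x03: 5, 0x04: 5}


def signature_blocks(pdu: bytes) -> bool:
    """Blacklist: block only if a known-bad byte pattern appears in the PDU."""
    return any(sig in pdu for sig in SIGNATURES)


def whitelist_blocks(pdu: bytes) -> bool:
    """Whitelist: block everything that is not an explicitly allowed request."""
    if not pdu:
        return True
    return ALLOWED_REQUESTS.get(pdu[0]) != len(pdu)


original = bytes.fromhex("5a0041414141")   # made-up payload the signature was built from
mutated = bytes.fromhex("5a0041414142")    # same payload with the last byte changed
read_request = bytes.fromhex("0300000002")  # FC3: read 2 registers from address 0

for name, pdu in [("original", original), ("mutated", mutated), ("read", read_request)]:
    print(name, "signature blocks:", signature_blocks(pdu),
          "| whitelist blocks:", whitelist_blocks(pdu))
```

In this sketch the mutated payload slips past the signature, while the whitelist rejects both variants without ever having seen them and still lets the legitimate read through.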
The depth of an implementation is more important than breadth. A firewall vendor may say they support 500 protocols, but to what extent? Validating a single byte in one protocol does not mean ‘you support DPI’ for that protocol.
With the terms and signature rant out of the way, I would like to discuss some items to consider when coming up with a grading scheme for comparing products. How do we actually compare products? What elements are the MOST important for a DPI implementation? When you think about the top 80-90% of ICS-CERT vulnerabilities, they all fall under the same category: the malware/packet-fuzzing/buffer-overflow/poor implementation grouping.
What that boils down to is this question: does the packet structure for a given protocol adhere to the protocol specification? For example, if a Modbus FC3 message should be X bytes long and nothing else, is it? If the packet does not adhere, then drop it.
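As a rough illustration of that question, the sketch below checks a Modbus/TCP FC3 (Read Holding Registers) request against the layout in the public Modbus specification: a 7-byte MBAP header followed by a 5-byte PDU, with a register quantity between 1 and 125. The function name `is_valid_fc3_request` is hypothetical, and the sketch assumes the frame has already been extracted from a reassembled TCP stream. It covers the request only, not the response.

```python
import struct

MBAP_LEN = 7             # transaction id (2) + protocol id (2) + length (2) + unit id (1)
FC3_REQUEST_PDU_LEN = 5  # function code (1) + starting address (2) + quantity (2)


def is_valid_fc3_request(frame: bytes) -> bool:
    """Return True only if frame is exactly one well-formed FC3 request."""
    # The frame must be exactly the expected size: no trailing or missing bytes.
    if len(frame) != MBAP_LEN + FC3_REQUEST_PDU_LEN:
        return False
    _tid, proto_id, length, _unit, func, _addr, qty = struct.unpack(">HHHBBHH", frame)
    if proto_id != 0:                      # protocol id is always 0 for Modbus
        return False
    if length != 1 + FC3_REQUEST_PDU_LEN:  # length field counts unit id + PDU
        return False
    if func != 0x03:
        return False
    return 1 <= qty <= 125                 # quantity range allowed for FC3


good = bytes.fromhex("000100000006010300000002")      # read 2 registers from address 0
padded = good + b"\x41" * 8                            # extra bytes appended
zero_qty = bytes.fromhex("000100000006010300000000")  # quantity of 0 is out of range

for name, frame in [("good", good), ("padded", padded), ("zero_qty", zero_qty)]:
    print(name, "->", "forward" if is_valid_fc3_request(frame) else "drop")
```

Anything that is not an exact match is dropped, which is the whitelist behaviour described above.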
With all of the above in mind, it makes sense to outline elements that have to be in a well-structured DPI implementation, or essentially the elements that you could not leave out!
- (Sanity Check) This is the ability for the DPI engine to understand the entirety of a protocol and all the permutations of the packet structures. If it is an EtherNet/IP CIP message with a length field on a ‘Get Attribute All’ acting on an ‘Analog Input object’, what does that mean? What are the allowed lengths, or how many bytes per field? The DPI implementation HAS to know the protocol inside and out, especially if it is to catch 80-90% of the ICS-CERT vulnerabilities.
- (Action Filter) The ability to allow/deny specific function codes, CIP services in EtherNet/IP, and DNP3 objects in DNP3, to name a few. This is the ability to differentiate between those control plane and data plane actions. It gives a user the ability to still monitor their temperature gauges or pressure and be assured that a firmware update action will be blocked.
- (State Checking) This investigates whether a response has a corresponding request. Has a response come back after an initial request? This ensures spurious responses are dropped.
- (Response Validation) If you look at DNP3, a majority of published vulnerabilities are actually attacks on the HMI rather than the PLC or RTU. This metric therefore captures the depth to which response messages, and not just requests, are validated.
- (Vendor Specific Support) Otherwise known as vendor specific validation. Think of Modbus FC 90 from Schneider (Unity) or PCCC/CSPv4 with Rockwell. If the DPI engine understands vendor-specific elements that your installation utilizes, then this is important as a metric.
- (Pipeline Support) This is where a TCP or UDP message contains multiple ICS protocol messages in its payload. Think of putting 4 Modbus messages in a single frame. The DPI implementation must be able to handle this and iterate through each message ensuring no write commands or firmware updates are embedded.
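To show how the Action Filter and Pipeline Support items can interact, here is a minimal sketch for Modbus/TCP. It splits one TCP payload into its individual messages using the MBAP length field, then drops the whole payload if any message carries a function code outside a read-only whitelist. The names `split_pipelined`, `verdict`, and `READ_ONLY_FUNCTION_CODES` are hypothetical, the whitelist is only an example policy, and the sketch assumes the payload has already been reassembled so that no message is split across TCP segments.

```python
import struct

MBAP_LEN = 7  # transaction id (2) + protocol id (2) + length (2) + unit id (1)

# Example policy: only the four standard read function codes are allowed.
READ_ONLY_FUNCTION_CODES = {0x01, 0x02, 0x03, 0x04}


def split_pipelined(payload: bytes) -> list:
    """Split a TCP payload into Modbus/TCP messages; raise ValueError if malformed."""
    frames = []
    offset = 0
    while offset < len(payload):
        if len(payload) - offset < MBAP_LEN:
            raise ValueError("truncated MBAP header")
        _tid, proto_id, length = struct.unpack_from(">HHH", payload, offset)
        # The length field counts the unit id plus a PDU of 1 to 253 bytes.
        if proto_id != 0 or not 2 <= length <= 254:
            raise ValueError("invalid MBAP header")
        end = offset + 6 + length
        if end > len(payload):
            raise ValueError("length field runs past the payload")
        frames.append(payload[offset:end])
        offset = end
    return frames


def verdict(payload: bytes) -> str:
    """Return 'allow' only if every pipelined message passes the action filter."""
    try:
        frames = split_pipelined(payload)
    except ValueError:
        return "drop"
    if not frames:
        return "drop"
    for frame in frames:
        function_code = frame[MBAP_LEN]  # first PDU byte follows the MBAP header
        if function_code not in READ_ONLY_FUNCTION_CODES:
            return "drop"
    return "allow"


read = bytes.fromhex("000100000006010300000002")   # FC3 read request
write = bytes.fromhex("00020000000601060001000a")  # FC6 Write Single Register

print(verdict(read * 3))          # three pipelined reads
print(verdict(read * 3 + write))  # a write hidden behind three reads
```

The design choice here is to judge the payload as a unit: inspecting only the first message would let the trailing write through.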
So, taking the above six ‘must haves,’ I translate this into a grading scheme of sorts.
What constitutes a 4 for Sanity Check? Or any value, for that matter? This is the harder part. We now have a general sense of the aspects to evaluate and a grading mechanism. Depending on the protocol, a ‘4’ would denote that not every field is validated. As an example, the engine might not validate the quantity of outputs in a Modbus FC15 (Write Multiple Coils) request, or perhaps it does not validate the quantities in all replies.
Along those lines, a ‘6’ would mean that the DPI engine validates every field and ensures the packet is an exact match of the protocol specification.
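One way to turn the six features into a single comparable number is a weighted average. The sketch below is only an illustration: the weights, the per-product scores, and the names `FEATURES` and `weighted_score` are all invented for this example, and nothing here reflects a real product or the actual grading values used in the talk.

```python
FEATURES = [
    "sanity_check", "action_filter", "state_checking",
    "response_validation", "vendor_specific", "pipeline_support",
]


def weighted_score(scores: dict, weights: dict) -> float:
    """Weighted average of per-feature scores; every feature must be present."""
    missing = [f for f in FEATURES if f not in scores or f not in weights]
    if missing:
        raise KeyError(f"missing features: {missing}")
    total_weight = sum(weights[f] for f in FEATURES)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(scores[f] * weights[f] for f in FEATURES) / total_weight


# Hypothetical scores for two made-up products.
product_x = {"sanity_check": 6, "action_filter": 5, "state_checking": 4,
             "response_validation": 4, "vendor_specific": 2, "pipeline_support": 5}
product_y = {"sanity_check": 4, "action_filter": 5, "state_checking": 5,
             "response_validation": 4, "vendor_specific": 4, "pipeline_support": 5}

equal = {f: 1 for f in FEATURES}
depth_first = dict(equal, sanity_check=3)  # assumption: Sanity Check counts triple

for label, weights in [("equal weights", equal), ("depth-first weights", depth_first)]:
    print(label, weighted_score(product_x, weights), weighted_score(product_y, weights))
```

With equal weights, product Y edges ahead; once Sanity Check is weighted more heavily, product X comes out on top. The ranking depends on what the evaluator considers most valuable, which is why the numbers stay subjective.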
As I was going through this exercise, I realized a few things that I alluded to earlier. Depth is hard to gauge without fine details of a product. Is the company willing to outline to what depth they validate the protocol in question? Numbers are subjective in the grading scheme, and each DPI Feature could be broken down into sub-categories for a more accurate picture.
Undoubtedly, you could also compare methods of DPI, such as a signature-based system vs. full protocol conformance or a proxy setup. In addition, the depth of each protocol’s DPI engine may differ from one to the next, so it would make sense to compare protocol to protocol, not product to product. Upon inspecting multiple products, the values of each may change in relation to one another, i.e. “product X can do this better than product Y, and to me, this is more valuable.”
Finally, there are probably other elements to consider. A DPI engine that supports ‘everything’ but provides a cryptic user interface is not very useful to a customer.
What actions are important if a DPI engine identifies an invalid frame?
- Drop the frame?
- Send a TCP reset (if TCP)?
- Generate an Event message?
Throughput and latency are not mentioned because this is about functionality, not how fast or slow the product is. However, they should at least be considered for protocols like GOOSE that require low latency. In general, remember that not all DPI implementations are created equal.
I see this as a first step to coming up with some mechanism to compare DPI products. In general, I would not rely on marketing speak to gauge product capabilities. The proof, as they say, is in the pudding.
About the Author: Erik Schweigert leads the Tofino Engineering team within Belden’s Industrial Cybersecurity platform. He developed the Modbus/TCP, OPC, and EtherNet/IP modules and directed the development of the DNP3 and IEC-60870-5-104 deep packet inspection modules for Tofino security products. His areas of expertise include industrial protocol analysis, network security, and secure software development. Schweigert graduated with a Bachelor of Science in Computer Science from Vancouver Island University.
Editor’s Note: The opinions expressed in this guest author article are solely those of the contributor, and do not necessarily reflect those of Tripwire, Inc.