One of the biggest problems with the IT / OT convergence in critical infrastructure is that much of the legacy hardware cannot simply be patched to an acceptable compliance level. Recently, Sean Tufts, the practice director for Industrial Control Systems (ICS) and Internet of Things (IoT) security at Optiv, offered his perspectives on where the industry has been, where it is going, and some of the progress being made to secure critical infrastructure.
Phil Labas: Tell me a little bit about your journey into cybersecurity and how you became the ICS and IoT practice director at Optiv.
Sean Tufts: It started with an unsolicited email of all things. The thing we hate most, the thing we are not supposed to click, turned out to lead me to cybersecurity. At the time, I was working at General Electric, and they had just bought this cool company called Wurldtech, which was one of the grandfathers of the Operational Technology (OT) space. They sent out a note asking for people who wanted to learn cybersecurity. I thought it sounded amazing. I was installing SCADA systems and was on the networking side, but I had never been as involved as I would have liked. They put me into a training environment where we learned about risk, reward, and business models. That was my first entry into cybersecurity.
PL: What keeps you in the OT space? There are lots of places to go, lots of little veins and little areas of focus. So what keeps you in the ICS space?
ST: It’s real, it’s tangible, and I can touch it. I can look at a process and know how they made it. There’s some pride in what I do. This is not meant to disparage the rest of the security industry, but this area is my focus and has real physical national security interests. If we look at an example from the peak of the pandemic, we actually saw this in real time. I had three toilet paper makers investing in cyber when they started to hit supply chain problems. They started reaching out because they knew that cybersecurity needed to be taken seriously. I care about that, and there’s a patriotism part of it, too. Our team talks about that a lot. One of the reasons we like this industry is because we have a say in the national interest, and that’s pretty fun.
PL: On the flip side, for the folks that are working in more of the traditional IT cybersecurity, there’s the idea of protecting intellectual property, although I totally understand where you’re coming from. As you’re talking to your toilet paper customers and the folks that are manufacturing shoes, what are some of the common challenges for organizations that have these ICS systems and that want to secure this type of infrastructure?
ST: It’s very similar. We get a little frustrated because, for example, we talked to a food company, and we planned to have the same consultant on the entire project. The executive at the food company said that we couldn’t do that because one place was a broth plant, and the other was a cracker plant. His logic was that they both used totally different processes. From our perspective, both plants were using the exact same equipment, so it was the same plant to us. From a risk perspective, some plants would be more dangerous to the environment or human safety than others, but how we solve these problems is the same.
You’re going to go into a different level of depth on a nuclear generator than you would on a solar farm. But the same mentality still exists. We need to migrate off legacy systems, and we need to have better policies, procedures, and training on the soft skills. On the technology side, there are a lot of new avenues. We can disconnect all these devices from the internet, and it’s not a challenge, but that’s shortsighted. There are business impacts to that. So, we need to make sure we surround those legacy and historically air-gapped environments with the right toolset and the right technology, and make sure it’s built to the right sensitivity.
PL: When you talk about patch management, both patch management and vulnerability management are probably as unique to OT as they are to other verticals. What are the specific challenges that you run into in ICS environments that you don’t see in other environments?
ST: The big one is the inability to patch. If I went to a credit card company and told them that we don’t think we can patch a particular machine for the next 25 years, they would throw me out of their office. However, that’s the reality of a lot of situations you run into in a food production plant. The best approach for a lot of our clients is to document why their environment and processes prohibit patching.
PL: Do you find that the organizations that you’re talking to have robust or well-developed Common Vulnerabilities and Exposures (CVE) plans?
ST: Some have really strong CVE programs on the IT side, and those haven’t bled over well into the OT side. I remember when the Schneider safety system hack happened, and that was a big impetus. Outside of the Oldsmar and the Colonial Pipeline incidents, the Schneider problem was probably one of the scariest because it affected the safety systems. Most people had no way to find out whether they had Schneider equipment in their environment other than paper and pencil or picking up the phone and calling an operator. We need to find ways to get a little more intelligence out of this community faster and better.
PL: You make a good point. We have these controls. We know that these things exist because they’re plugged in, they’re communicating in some way, and we’re attaching to them, but the people who are working with this equipment are looking at processes in an entirely different way. They’re looking at a manufacturing process, and they’re looking through the world in a different lens around what their machines are doing for them. It’s not a security lens. It’s more about ensuring uptime. But as a security professional, you’re looking at it through an entirely different set of glasses.
So, how do the people who are focused on making sure that the machine doesn’t physically break up-level the conversation to the folks that need to care and really hold the budget for this? Are there other recommendations for those folks to understand what an organization needs to do to protect itself?
ST: We spend a lot of time talking about adversaries and threats. In Clint Bodungen’s book, he speaks about how the first thing to realize is that the most common threat actor is the good-hearted employee who just wants to do his job and moves a workload to the cloud that shouldn’t have gone there. Alternatively, he opens up a firewall port or patches an old system that couldn’t support the patch. All those actions are perfectly normal business processes, but if you’re not careful and you don’t have really stringent management-of-change controls, you could get yourself into a lot of trouble that will impact production. That will impact uptime. That will impact health, safety, and environmental factors. The most important “first piece” is realizing that all of this could come to a grinding halt because of an IT problem that we create ourselves.
The worst problem is that these legacy systems are still allowed to operate, which makes it easier for us to trip alarms on our own stuff. It also gives a threat actor a lot of means and motive. Once they get in there, there are a lot fewer controls to go through than on the IT side, for example, compared to trying to find credit card or Social Security numbers. It’s a bigger free-for-all. The Florida water supply hack wasn’t an intelligent hack, but it was a potentially lethal problem. Luckily, the guy was awake at the controls, and that was the only thing that protected the system.
PL: When you start looking at an OT security program or an ICS security program, what are some things that come to mind around best practices? I want to get a clear view of whether an organization should protect their ICS environment and their OT environment from a framework standpoint like NIST, the MITRE framework, or something like that. Or is there something else that you would recommend to an organization about how they should look at the environment?
ST: The number-one thing is to suspend the word “environment” because this is a people problem. It’s not that we haven’t done the right things from a people-process perspective. Training programs, cross-training, having good disaster recovery efforts, testing those, having incident response plans, and testing them: those are all people initiatives that don’t have enough time or resources. There are a lot of components that go into that. But that’s the piece to me that screams to the front of the line, that is, the lack of time or resources. We talked to a client who is a global brand. They had 250 locations across the globe, and pretty much everything was on the shoulders of one guy who had been at the company for many years. And then, he retired. They relied on him for everything and had to start the entire risk assessment over because everything was pinned on one person. The entire generation that built most of the modern critical infrastructure is retiring, and it’s handing these tools over to a new generation that wants cloud-enabled tools. We have to meet that in the middle.
PL: Are you seeing organizations looking at the longstanding frameworks and wanting to continue to model the company the same way because they may be doing it that way on the IT side? Are you seeing bleed over onto the OT side with regards to that?
ST: I’ve seen a lot of that. There are a couple that are getting a little more momentum on things that were previously ignored. For a long time, it was IEC 62443, and it was anything NIST. NIST SP 800-82 has been a good standard for manufacturing that a lot of people have gotten really deep on, especially in NIST-centric organizations. I’ve seen a lot of people also leaning into MITRE ATT&CK for ICS. It’s been a really logical way to look at the problem. It’s speaking to a whole class of threat actors and exposing them to some of the weaknesses in these Programmable Logic Controllers (PLCs). That’s been a good migration point for a lot of threat-minded people who haven’t worked in automation environments before.
PL: Are there any OT and ICS frameworks that you like better than others, or do you consider them all good?
ST: It’s all good. The best method is to make sure you don’t leave an obvious glaring hole in your security. As you work through the assessment to detect all the weaknesses, there’s a lot underneath there that’s easy to skip over once you really start building a program. Any standard is good. Choose a framework. Have something to stand behind, and when it gets challenged, you can show that you looked at the components and how you matured. That’s the big piece.
PL: Can you share what you’ve been wrestling with lately?
ST: I’m wrestling with NERC CIP right now. It’s one of the more complex standards out there. There are parts of it I love, such as how it promotes the use of multi-factor authentication (MFA), but it isn’t fully current with all the cool new parts of MFA like velocity tagging or multiple credentials. The absence of those makes it sound like “check-the-box” security. However, when I look at how manufacturers have matured, it is clear that NERC CIP has moved the utilities in a positive direction for everybody from the biggest investor-owned utility (IOU) in New York City to the smallest rural co-op in Iowa. Unfortunately, we are not seeing that same security growth in the other industries.
PL: There’s a huge gap.
ST: Yeah, and you don’t see that in utilities. The problem is not a sweeping trend, so that’s been good. That’s why I’m wrestling with NERC: on one side, it could be so much better, but on the other side, it’s pretty efficient.
PL: I think the fact that it’s a regulation, as opposed to guidance, makes all the difference in the world. You mentioned earlier about the generational change with regards to the folks that are working in OT, the folks that first developed these controls, and the next generation that’s moving towards cloud adoption. Historically, it’s been pretty slow as opposed to other areas of business. Why do you think it’s been slow to adopt? Is it the generational turnover that’s going to allow it to accelerate, or is there another influencing factor?
ST: I think people just get uncomfortable if they can’t really touch the switch and they can’t see the router. Those pieces and parts are kind of the lifeblood for a large section of the workforce that built those systems. Just general distrust and fear is why we’ve been slow to get into it. Also, change is hard, and there’s not a ton of budget to spend and knock a production line offline for any amount of time to rebuild the route and switch network. You don’t get those time periods to do that. If a cloud adoption goes bad and you’ve ripped out some of the legacy route and switch environments, what are you left with? Now you’re in trouble. So, that six days goes to 12 or 20. Everyone’s just really waiting to see who goes first, and people are moving workloads to the cloud.
Another piece that’s going to drive it is the cost. It’s really expensive to re-segment your network, to bring all new hardware in, and to replace all that old legacy stuff. It’s a lot cheaper and easier to leverage more modern networking capabilities that in some cases are very cloud heavy. By necessity, we are going to have all these workloads in the cloud, and we’ll be accessing all our interfaces from a browser instead of a dedicated hardboard. I think it will be interesting to see which industry moves first. I think the automotive industry is a very interesting market to watch in this regard. They have incorporated so much automation and robotics, with a lot less human interaction in the manufacturing process.
PL: I appreciate your insight on that. When it comes to working with OT security vendors, what are some of the key elements that every company should consider when they’re having conversations with a potential OEM or a solution provider?
ST: I look for specifics on the skill sets where we’re trying to pull threats and visibility and assets out. It’s important to have the data presented in the way that you want to have it presented, to map it correctly, and to get a good diagram out of it. Is there a big search bar where you can just type one word and find all the equipment of a particular manufacturer? That is how you want to interface with the tool.
PL: So, it’s important that the organization cares enough about you to be able to speak your language and work the way you need to work.
PL: There have been some new initiatives instituted by President Biden to secure and protect critical infrastructure. What are your thoughts on those initiatives?
ST: They’re good. It’s the right way. We need to have that talking point from the executive branch. We need to be able to have people not mess with us. I’ve been closely following some of the requirements that have come out from the Transportation Security Administration (TSA), which oversees pipeline security, in the wake of the Colonial ransomware incident. There have been some interesting plays there. I didn’t expect maturing credential management in OT to be the first thing off the board. That’s been really interesting to me. I think the API 1164 standard is going to have a lot to say about where pipeline management goes in the future away from TSA. That’s been a fascinating turn for the environment to really focus on identity.
Another point I’ve been following is the convergence of cybersecurity and safety. Another security professional posited that we should treat this like a safety issue. Otherwise, we’re never going to get the funding. We need to actually make a dent in the security problem by treating it with the same seriousness as a physical safety problem. Some of the things that happened in the last 12 months have all been on that verge. There was a ransomware attack on a hospital in Germany that indirectly led to a patient death. It was reported that this was the first fatality caused by a ransomware attack. But can we sit in our chair and think that it is going to be the last? I don’t know.
PL: It is a good question. I like the idea of it being a safety issue, and I think that ties back to how it’s people problems and people initiatives that are the key component. I asked earlier about budgeting and how to have that conversation with the budget holder. How do you build that value statement to go upstream to request money for an intangible thing such as safety?
ST: That’s actually the exciting part. A lot of our current analytics capabilities, not to mention all the fun stuff like artificial intelligence and machine learning, are bringing it all to the table. I am not seeing anybody rushing into analytics programs and forcing that on a non-hardened environment. It’s just too dangerous. What I’m seeing right now is a trend towards unlocking analytics that we can run on that electric motor or predictive maintenance on that flow valve. By doing this, we can reduce our overall costs, make some investments, and make something valuable out of it. The winners and losers in the oil and gas markets are going to be determined by who can do more with the same amount of assets. And a part of that is going to be a digital component. Say you can get 2% to 3% more through a pipe because you avoided a motor dying, a valve accidentally shutting, or a cracked pipe. Having more intelligence to get there faster is going to affect who will be the winners and losers. There’s no capital. We need new projects, and we can do more with what we have.
PL: And then focusing on understanding your environment better will allow you to drive those efficiencies. And we just don’t have those optics into that right now.
PL: Very cool. Is there any area that we potentially haven’t covered that you think is important that you want to talk about?
ST: No, I’m happy we got to that data piece because that’s going to be a big unlocking factor, and it’s kind of “future state” stuff. That was fun.
PL: Thanks for taking the time to speak with me today.
ST: Likewise, it was a lot of fun speaking with you.