
I’ve been part of a lot of discussions about big data and its role in security. What is interesting to me is how much hope people have for big data being the savior of the security world. I don’t believe that’s going to happen anytime soon.

Why not? Let’s take a look at big data used in a different context. Amazon gathers a lot of information about its shoppers — what they’ve bought, who they are, where they live, and what things they have rated highly.

Even with all this information, I routinely get recommendations from Amazon to buy something I already own, or something I bought for someone else as a gift (which they should know because I had it gift-wrapped and sent to someone else’s address). These items were purchased directly from Amazon, so you’d think the recommendations would be much more accurate than they are.

Think about the implications here. I authenticated with them, so they know exactly who I am. I authorized them to track information about what I’m doing, buying, etc. They have a lot of historical and behavioral data about me and my purchase habits. When you put those things together, they ought to be almost 100% accurate about what they suggest, as well as what they avoid, based on my preferences. Don’t get me wrong: they do make some good suggestions, but I’d expect the hit rate to be a lot higher than it is.

The fact that Amazon’s recommendations are not very accurate, in spite of all the information I give them, makes me suspect that it’ll be quite a while before information security derives significant, sustainable, game-changing value out of big data.

In the meantime, we’ll get some value out of security analytics, but that value will be significantly limited if we don’t collect the right data on the front end. So, let’s think about what we need to collect so we can answer the questions we’ll have later. After all, a big data repository is a lot like a spreadsheet: general purpose until you decide what you want to do with it, and low value until you get the right data in to satisfy your use case. To get the right data in, we need to:

  • know what we are protecting;
  • know the value of what we are protecting;
  • know what “normal” looks like;
  • collect the current and historical state data and context to understand how things change over time;
  • know what questions we may want to answer about our security effectiveness (informed by clear policies, standards, and baselines);
  • select controls that provide the right data;
  • get those controls implemented; and
  • feed the data from the controls into the big data repository.

From there, we can start asking the questions and getting answers that drive action.  In the early days, this probably looks more like narrowing down the things we need to look at, rather than giving us the answers outright. That’s not bad, but it isn’t dramatic.
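To make that "narrowing down" idea a little more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not a description of any particular product: the `AssetHistory` record, the `triage` function, the 1-to-5 value scale, the sample assets, and the threshold of three standard deviations are all hypothetical. It assumes we already know what we are protecting, how much each asset is worth, and what "normal" looked like historically for one signal coming from a control (say, daily configuration changes).

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass
class AssetHistory:
    name: str             # what we are protecting (hypothetical asset name)
    value: int            # assumed relative business value, 1 (low) to 5 (high)
    history: list[float]  # past daily readings from a control; needs at least 2 points
    current: float        # today's reading from the same control


def triage(assets, threshold=3.0):
    """Return (name, score) pairs for assets that deviate from their own baseline.

    Deviation is measured in standard deviations from the historical mean,
    and the score weights that deviation by the asset's value, so unusual
    activity on valuable assets rises to the top of the list.
    """
    flagged = []
    for asset in assets:
        baseline = mean(asset.history)
        spread = stdev(asset.history)
        if spread == 0:
            # Perfectly flat history: any change at all counts as a deviation.
            deviation = 0.0 if asset.current == baseline else float("inf")
        else:
            deviation = abs(asset.current - baseline) / spread
        if deviation >= threshold:
            flagged.append((asset.name, deviation * asset.value))
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Made-up sample data, purely for illustration.
assets = [
    AssetHistory("payroll-db", 5, [2, 3, 2, 3, 2, 3], 12),
    AssetHistory("test-web", 1, [10, 12, 11, 9, 10, 11], 11),
    AssetHistory("build-server", 2, [0, 1, 0, 1, 0, 1], 6),
]

ranked = triage(assets)
for name, score in ranked:
    print(f"{name}: {score:.1f}")
```

Notice that the output is a shortlist of things worth a closer look, ordered by how unusual and how valuable they are. It does not tell us whether anything is actually wrong, which is the point: without the value and history fields collected up front, even this modest kind of triage is not possible.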

Don’t get me wrong: the promise of machine-to-machine learning and rapid, large-scale factoring of security data is real and compelling. It’s just not a silver bullet.

What about you? Are you getting value from big data for security? If so, I’d love to hear from you.




Title image courtesy of ShutterStock