How Much Data Do We Produce?

Let’s start with this basic concept: today, “data” is everything. Both personally and professionally, much of our lives has been converted into a bunch of zeroes and ones. Our reliance on data has never been greater and is certain to grow, especially with the explosion of the Internet of Things (IoT). And the amount of data – good, bad, junk – we produce continues to grow (at breakneck speed), taking up space on global networks (meaning that if you were able to control even a fraction of this data flow, you would be able to unleash a wicked DDoS attack). So how much data exactly is traveling – nearly at the speed of light – through the networks? According to a June 2016 Cisco white paper, we are in the “zettabyte era” in terms of global IP traffic. Great! What is a zettabyte?
Back to Basics

To unpack that question, we need to start with a few basics, the first being that humans have cognitive limitations. Our limitations become evident when trying to understand very large (or very small) numbers. We can use notations to represent large numbers, such as 1 ZB equalling 1 x 10^21 bytes. But does that notation mean anything to you? Denote one million as 1 x 10^6, and it may mean something to you, but that is because we have a better understanding of what “one million” means in practical terms. Let us conceptualize “one million” using dollars to create a reference point: if your salary is $50,000 a year, you work for 20 years, and you spend nothing, you would accumulate one million dollars. Now, using the table below, we will “scale up” your salary:
|Salary Base||Factor||Adjusted Yearly Salary||Years||Accumulation||Written Out|
|$50,000 per year||1||$50,000||20||$1 x 10^6||$1,000,000|
|$50,000 per year||10||$500,000||20||$1 x 10^7||$10,000,000|
|$50,000 per year||100||$5,000,000||20||$1 x 10^8||$100,000,000|
|$50,000 per year||1,000||$50,000,000||20||$1 x 10^9||$1,000,000,000|
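
The arithmetic behind the table can be sketched in a few lines of Python. This is only an illustration of the scaling; the function name `accumulation` is our own and hypothetical, and the figures are the ones from the table above.

```python
BASE_SALARY = 50_000  # dollars per year, from the table
YEARS = 20            # working years, from the table


def accumulation(factor: int, base: int = BASE_SALARY, years: int = YEARS) -> int:
    """Total saved after `years` at `base * factor` per year, spending nothing."""
    return base * factor * years


# Each tenfold increase in the factor adds one zero to the total.
for factor in (1, 10, 100, 1_000):
    yearly = BASE_SALARY * factor
    total = accumulation(factor)
    print(f"factor {factor:>5,}: ${yearly:,} per year -> ${total:,}")
```

Going from a million to a billion takes three of these tenfold steps; going from a billion (10^9) to a sextillion (10^21) would take twelve more.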
Conceptualizing a Zettabyte

We know what a billion (10^9) is, but what do we call something written as 10^21? That would be a sextillion. Do you feel better now that you have a name for it? We did not think so. Imagine for a moment we could capture – in a single snapshot – all of the global IP traffic in 2016, one zettabyte. What could we compare that to? Using the table below, we rewrote the figures in a comparative manner along with some examples to help you conceptualize what we are actually dealing with. Some notes: we will use 1.28 ZB in this example (some figures rounded and approximate), and for mathematical ease, we will be using decimal values (1,000) – not binary (1,024) – when writing out numbers in full. No need to fuss over this detail, and for all tech talkers, remember: more people speak “non-tech” than tech. Make your life, and their life, easier by avoiding jargon and cumbersome detail. Try to picture the following in your head:
|Quantity||Written Out||Comparison|
|128 gigabytes||128,000,000,000 bytes||About 32 movies in HD|
|1.28 zettabytes||1,280,000,000,000,000,000,000 bytes||Global IP traffic in 2016|
|128 metres||128,000,000,000 nanometres||Length of a football field with two extra end zones|
|1.28 terametres*||1,280,000,000,000,000,000,000 nanometres||Distance from Earth to Saturn|
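
The point of pairing bytes with nanometres is that both comparisons span the same jump in scale. A minimal sketch to check that, using decimal (power-of-ten) prefixes as noted above; the constant and variable names are our own:

```python
GIGA = 10**9
TERA = 10**12
ZETTA = 10**21
NANOMETRES_PER_METRE = 10**9

# Data side of the table, in bytes (integer math keeps the values exact).
small_bytes = 128 * GIGA            # 128 gigabytes
large_bytes = 128 * ZETTA // 100    # 1.28 zettabytes

# Distance side of the table, in nanometres.
small_nm = 128 * NANOMETRES_PER_METRE                  # 128 metres
large_nm = (128 * TERA // 100) * NANOMETRES_PER_METRE  # 1.28 terametres

print(f"{large_bytes:,} bytes")
print(f"{large_nm:,} nanometres")

# Both pairs are separated by the same factor: ten billion.
print(f"data ratio:     {large_bytes // small_bytes:,}")
print(f"distance ratio: {large_nm // small_nm:,}")
```

In other words, global IP traffic in 2016 relates to 32 HD movies roughly the way the distance to Saturn relates to a slightly stretched football field.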
Conceptualizing the Cybersecurity Alert Process

So now that we have a better grasp of the size of the data production and flow problem, we need to think about managing it. Unsurprisingly, when asked to identify their top incident response challenges, 36% of cybersecurity professionals surveyed said, “keeping up with the volume of security alerts.” If we hold on to the $20 trillion comparative, we could say our task would be to sift through $55 billion per day, trying to figure out how much of it is legit, how much has been stolen, how much has been laundered, and how much is funny money. Fun times! FBI Director James Comey in a 2014 interview with 60 Minutes gave a very useful description of the problem (in reference to cyberattacks originating from China):
“Actually, [they are] not that good. I liken them a bit to a drunk burglar. They're kicking in the front door, knocking over the vase, while they're walking out with your television set. They're just prolific. Their strategy seems to be: We'll just be everywhere all the time. And there's no way they can stop us.”

The key line is “we’ll just be everywhere all the time” because it is actually happening! From the same survey, 42% say their organizations ignore a significant number of security alerts because they cannot keep up with the volume. And of course, there is also an unintended danger of being overwhelmed: the feeling of crying wolf too many times. But perhaps the more worrying figures are: 34% say that between a quarter and half of the alerts are ignored, 20% say half to three-quarters of alerts are ignored, and 11% say more than three-quarters of security alerts are ignored! Mamma mia, that’s a lot of front doors kicked in where little is then done! Let’s go back again to the $20 trillion money comparative, where we have to sift through $55 billion per day. If we use the “ignore” figures above, the translation is: alerts tell us something funny is going on, but we are so overwhelmed, we do not bother to look at $15 billion worth of daily alerts. That’s a lot of money being left on the table.

Sadly, this issue is nothing new. Ignoring alerts seems as commonplace as the alerts themselves, and it gets worse: the Cisco 2017 Annual Cybersecurity Report reveals that fewer than half of legitimate alerts actually lead to some sort of correction and less than 1% of severe/critical alerts are ever investigated. In 2014, enterprises dealt with 10,000 alerts per day; in 2016, government departments dealt with 50,000 alerts per day; and who knows how many we will be dealing with by the end of 2017 due to the IoT explosion.
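
For readers who want to check the money comparative, here is a rough sketch of the arithmetic. It assumes the $20 trillion is spread evenly over a 365-day year; it does not reproduce how the $15 billion figure was derived, it only shows what share of the daily total that figure represents.

```python
ANNUAL_TOTAL = 20 * 10**12   # the $20 trillion comparative
DAYS_PER_YEAR = 365          # assumption: evenly spread over a non-leap year
IGNORED_DAILY = 15 * 10**9   # the $15 billion of daily alerts left unexamined

daily_total = ANNUAL_TOTAL / DAYS_PER_YEAR
ignored_share = IGNORED_DAILY / daily_total

print(f"daily total:   ${daily_total / 1e9:.1f} billion")
print(f"ignored share: {ignored_share:.0%}")
```

The daily total rounds to $55 billion, and $15 billion is a little over a quarter of it.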
Unfortunately, despite good tips, such as setting goals, getting the right information, and consolidating, we are still being overwhelmed because we have not addressed the “scale” issue. And oh yeah, did we mention that sometimes cybersecurity analysts may only be able to perform about 10 investigations per day? This is where artificial intelligence and machine learning are going to play a larger role (and why AI start-up firms focusing on cybersecurity issues may be in an incredible position to take advantage of the increasingly vulnerable state we are living in).
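
To put the scale issue in numbers, the sketch below estimates how many analysts it would take to look at every alert. It assumes, purely for illustration, that every alert needs exactly one investigation and that each analyst completes 10 per day; the function name `analysts_needed` is hypothetical.

```python
import math

INVESTIGATIONS_PER_ANALYST_PER_DAY = 10  # figure quoted above


def analysts_needed(alerts_per_day: int,
                    per_analyst: int = INVESTIGATIONS_PER_ANALYST_PER_DAY) -> int:
    """Analysts required if every alert gets one investigation (an assumption)."""
    return math.ceil(alerts_per_day / per_analyst)


for label, alerts in (("enterprise, 2014", 10_000),
                      ("government department, 2016", 50_000)):
    print(f"{label}: {alerts:,} alerts/day -> {analysts_needed(alerts):,} analysts")
```

Under these assumptions an enterprise would need 1,000 analysts and a government department 5,000, which is the gap that automation is expected to close.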
What Does It All Mean?

It means that we have a lot of work to do and that without artificial intelligence and machine learning to help us with our cybersecurity challenge – something which we think is really two challenges but one issue (hint: network security + information security = data security) – we are going down a dark road. If somebody were able to command and control just 1% of the global IP network traffic, the effects could be devastating. This idea may sound far-fetched, but perhaps it is not, especially when you consider how insecure IoT devices are (does your dishwasher come with a password?). The shift to mobile devices will not stop anytime soon either, meaning that more and more people will be connecting devices to WiFi networks that are inherently insecure. These challenges will not get easier, especially as we continue to produce data, and when hackers say they can compromise most targets in about 12 hours. Therefore, we need as many tools as possible (such as AI/ML), but we also need to be smart and honest about what we are dealing with. Cybersecurity is a technology problem, but it’s also a people problem, where we – the people – are still getting the basics wrong. Recognizing that we have cognitive limitations is an important step to getting ahead of the adversaries and nefarious actors.

About the Authors: Paul Ferrillo