In Part I, I described some structural problems in MITRE’s ATT&CK adversarial behavior framework. We looked at a couple of examples of techniques that vary greatly in their level of abstraction, as well as techniques that ought to be classified as parent and sub-technique. Both problems are born of the lack of hierarchical structure among techniques in ATT&CK. While ATT&CK remains a focused and relatively small catalogue of behavior, this isn’t necessarily a pressing problem: the framework is small enough for cybersecurity professionals to comprehend in its entirety and to understand by inference how the structure should look. However, the problem becomes more pressing as the framework grows and becomes harder for individuals to comprehend.
In this part, I want to look at the background of formal ontology, some basic concepts, its uses, its failures and successes, and how to think about ATT&CK as an adversarial behavior ontology. ATT&CK is tremendously valuable because it gives us a classification of our knowledge of adversarial behavior, which helps us communicate, collaborate, account for and reason about the domain in a scientific manner. A formal ontology is valuable for much the same reasons, but ontology goes beyond classification. It enables us to build large repositories of knowledge in a machine-readable language. As with any scientific endeavor, ATT&CK’s content will grow. And as it grows, the lack of structure makes working with it tougher, and it risks losing value. My goal here is to introduce successful principles and examples of applied formal ontology and discuss how ontology can be used to protect and expand the value of ATT&CK.
Ontology: A Short Overview
In a previous career, I was a philosopher specializing in the area of metaphysics called ontology. I worked on the ontology of, as ordinary language philosopher J. L. Austin put it, “moderate-sized specimens of dry goods”: familiar material things like tables, chairs and computers. I also worked on understanding how familiar entities that aren’t so clearly material things fit in the picture, things like shadows, knots, fictional characters, pieces of music, software and the like. Do shadows exist and if so, what are they? Are they something over and above the absence of light? If there were no light sources at all in the universe, would we say that there’s a shadow everywhere? That seems strange to say, so probably not. So, shadows are dependent upon a light source as well as something to block the light source. But does there need to be another surface for the shadow to exist on? Do shadows have volume or just area? Can two shadows overlap? If so, could they perfectly overlap? How do shadows figure into our best theories of vision? Perhaps shadows aren’t even real but it’s somehow convenient for us to talk as if they were. It turns out that an ontology of familiar entities is really, really hard.
A lot of ontology in academic philosophy begins with navel gazing and churns out perplexing philosophical questions and puzzles of questionable worth; however, not all ontology has uncertain practical application. Ontology is a field where practitioners try to get clear on what entities exist, how they should be categorized, what sorts of properties they instantiate and what sorts of relationships obtain between those entities. One of the first systematic ontologies was posited by Aristotle. At the top level, he thought everything must fall within a framework of categories of two main kinds, substances and accidents. The following tree from John Sowa shows a mapping of the main categories in Aristotle’s ontology, with child-to-parent edges meaning “is a sub-type of.”
This top-level ontology represents common relations and object categories that are applicable across a wide range of domain-specific ontologies. Historically, philosophers have spent considerable effort arguing over which top-level ontology best represents reality. Aristotle’s ontological ideas come mainly from two pieces of writing: the Metaphysics, from which metaphysics gets its name, and the Organon, a collection of works on logic and language. This connection between ontology, logic and language runs deep and is fundamental to applied ontology. (Sometimes people distinguish the term ‘ontology’ used in philosophy from the term ‘ontology’ used in other fields, where the former refers to examining the concepts of existence and being itself. But I think this distinction is unnecessary.)
Every field has and uses ontologies, whether formally or informally, explicitly or implicitly. Another way to think of an ontology is as a controlled vocabulary or, my preferred term, a regimented language for speaking in a domain of discourse; it can be thought of as a grammar that sets the rules for well-formed sentences within the domain. In this regard, an ontology is an agreement on how we use our words. A formal ontology is typically couched within a formal language, say, first-order logic, description logic, CycL, OWL or RDF, to name a few examples. Formalizing a domain’s ontology has several outcomes that are useful in industry: increased interoperability, coordination and collaboration; knowledge representation; automated reasoning and powerful query-language support; and reduced terminological bloat, among others. A domain need not be crisply defined in order to develop an ontology for it: Friend of a Friend (FOAF) is a well-known ontology about human relationships. If A is my friend and I like the content that she shares, and B is A’s friend, then we may infer that it’s likely I’ll like the content that B shares. It can be helpful to think about what people say and what inferences can be drawn about social networks even if we don’t have a precise definition of a friend.
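The friend-of-a-friend inference described above can be sketched in a few lines of Python. This is only an illustration of the reasoning pattern, not the FOAF vocabulary itself: the names `friends`, `likes_content_of` and `likely_liked_sources`, and the sample data, are all assumptions made up for this example.

```python
# Hypothetical data: who counts whom as a friend, and whose shared
# content each person already likes. Note that "friend" is never defined.
friends = {
    "me": {"A"},
    "A": {"B"},
    "B": set(),
}
likes_content_of = {"me": {"A"}}


def likely_liked_sources(person, friends, likes_content_of):
    """Return friends-of-friends whose content `person` will likely enjoy.

    Rule: if `person` is friends with X and likes X's content, then every
    friend of X is a candidate, excluding `person` and anyone whose
    content `person` already likes.
    """
    already_liked = likes_content_of.get(person, set())
    candidates = set()
    for friend in friends.get(person, set()):
        if friend in already_liked:
            candidates |= friends.get(friend, set())
    return candidates - already_liked - {person}


print(likely_liked_sources("me", friends, likes_content_of))  # {'B'}
```

The rule operates purely on the stated relationships, which is the point of the example: useful inferences can be drawn without first settling what a friend is.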
Formal Ontology: Industrial Failure
I’m not the first to suggest formal ontology as a practical application in industry. Applied ontology emerged in the early 1970s at Stanford University, where researchers in robotics referred to the formalization of commonsense knowledge as an ontology. A quick search reveals hundreds of scholarly articles on ontology in the engineering and IT world since. However, industry has too often been burnt by bad experiences with formal ontologies, and the term is unpopular. Formal ontologies typically fail because they never get off the ground or never gain a large enough community of users, so there’s no accumulation of established content or principle. Most of these articles start from scratch and don’t rely on or build upon existing work. This is the result of too many standards being built independently with no common methodology. That is why there are so few real-world examples of successful industrial use.
That’s not to say there are no examples of successful applications of ontology. Formal ontologies work well when backed aggressively by influential constituencies who can gather a critical mass of users and contributors, and MITRE possesses the qualities to fill this role. There are some notable successes, particularly in the fields of bioinformatics and geographic information science as well as within military and intelligence communities. Perhaps the most successful of these are the bioinformatics ontologies.
Formal Ontology: Bioinformatics Success
The bioinformatics ontology success story begins in 1990 with the Human Genome Project, an international scientific effort to map the sequence of nucleotide base pairs that make up human DNA from both physical and functional points of view. Because the human genome contains approximately three billion of these base pairs, there was a need to automate the organization, querying, analysis and annotation of an incredible amount of data. The Gene Ontology (GO) began in 1998 as a large collaborative effort to standardize the development and use of gene ontologies for a wide variety of organisms. It’s now funded by the National Human Genome Research Institute. GO provides a formal ontological framework for the consistent description of gene attributes and gene products in three main biological subdomains: molecular function, biological process and cellular component. Here’s a graphical representation of where the cell body membrane fits in the picture and its relationship with other cellular components:
(Note the key, which lists various logical, mereological, spatial-temporal and functional relationships which cellular components might have with each other.) In the early 2000s, Barry Smith, my former professor at the University at Buffalo, led the development of the Basic Formal Ontology (BFO) project, which constitutes the top-level ontological basis for a large collaboration of domain-specific ontologies, primarily in the life sciences, including GO. At heart, BFO is similar in many ways to Aristotle’s top-level ontology. At the same time, Smith initiated the Open Biomedical Ontologies (OBO) Foundry, an umbrella of interoperable life-science ontologies overseen by an international operations committee. The success of formal ontology in bioinformatics can be seen in the complete Linked Open Data graph from 2017 below, where most of the red circles represent a domain-specific life science ontology.
Compare to six years earlier in 2011 (with life sciences in light purple):
Today, formal ontology is ubiquitous in the life sciences and bioinformatics, and almost everyone in these fields collaborates via BFO standards as a means of communicating and resolving semantic and organizational differences.
ATT&CK and Formal Ontology Similarities
Let’s look at an entry on cardiac arrest from the Human Disease Ontology:
Note the similarities to the ATT&CK technique example from Part I. Each has a unique identifier, a name, a definition, some annotations and relations to other entities in the framework. This allows the entities declared in the ontology, and their interrelationships, to be precisely defined, which enables a highly structured model of our knowledge about that domain. This is much more than a taxonomy. A taxonomy is just a shell that can accommodate entities and their hierarchical relationships within a domain, whereas a formal ontology can also be populated by domain knowledge expressed in a regimented language or by formal semantics.
Formal ontology standards have large communities of practice with training resources and open source tools such as Protégé, a flexible ontology editor. OWL, or the Web Ontology Language, for example, includes serialization standards; defining an OWL ontology necessarily defines a standard, XML-based knowledge interchange format. OWL reasoners can find errors in ontologies and can draw inferences from rules and facts. For example, in GO, gene products are annotated to the most granular term and, by a transitivity principle, every annotation to a GO term implies the annotation to all of its ancestors. OWL also supports query languages for application-specific reasoning, transformation and reporting.
Within the last year, ATT&CK content has become available in STIX 2.0, a JSON-based language and serialization standard specific to cyber threat intelligence, built on principles fundamentally similar to those of OWL. Automation, realistic risk modeling and management, and the exchange of structured threat intelligence can be achieved with a combination of these standards and semantic reasoners. The philosophy behind ATT&CK recognizes the organizational advantage of relating techniques to tactics, platforms, groups and software with detection and mitigation annotations. To my mind, the ATT&CK framework is a burgeoning formal ontology of adversarial behavior.
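The transitivity principle mentioned above for GO annotations can be illustrated with a small Python sketch. It is a toy under stated assumptions: the term names and parent links in `parents`, the gene product name, and the helper functions `ancestors` and `propagate` are invented for illustration and are not taken from the real GO graph or from any GO tooling.

```python
# Hypothetical fragment of a term hierarchy: term -> set of direct parents.
# The links below are illustrative only, not the actual GO structure.
parents = {
    "plasma membrane": {"membrane", "cell periphery"},
    "membrane": {"cellular component"},
    "cell periphery": {"cellular component"},
    "cellular component": set(),
}


def ancestors(term, parents):
    """Collect every term reachable by following parent links upward."""
    seen = set()
    stack = list(parents.get(term, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def propagate(annotations, parents):
    """Expand each product's direct annotations with all implied ancestors."""
    implied = {}
    for product, terms in annotations.items():
        closure = set(terms)
        for term in terms:
            closure |= ancestors(term, parents)
        implied[product] = closure
    return implied


# A product annotated only to the most granular term...
annotations = {"gene product X": {"plasma membrane"}}
implied = propagate(annotations, parents)

# ...is implicitly annotated to every ancestor of that term as well.
print(sorted(implied["gene product X"]))
# ['cell periphery', 'cellular component', 'membrane', 'plasma membrane']
```

A technique hierarchy with explicit parent and sub-technique links could support the same kind of inference: evidence recorded against a sub-technique would automatically count toward its parent technique.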
Where a taxonomy classifies, an ontology specifies, and ATT&CK specifies. The framework becomes even more valuable if a critical mass of collaborators builds out interoperable ontologies for systems, networks, vulnerabilities, malicious software, adversarial groups and so on. This could lay the groundwork for creating a coherent science of cybersecurity.
While the formal ontological frameworks mentioned above aren't a guide on how to populate domain-specific knowledge or solve instances of the structural problems for ATT&CK, they do provide standardization, principles, mature tools and a collection of successful prototypes from other fields to take notes from. Just like in other successful applied ontology fields, these ideas can be adopted in whole or in part and in incremental steps as we continue to build and improve the ATT&CK framework.