Why is cyber threat detection so hard? The most obvious reason is that “threat” is too abstract to solve. It may seem obvious, but effective problem solving requires problem framing, so that everyone involved in the process clearly understands what the problem is and what it is not. Instead, we get distracted by vague and amorphous claims of AI outthinking humans or being “slightly” conscious, and we forget that problem-solving is more complicated than pip install tensorflow.
Of course, problem framing requires domain knowledge. You must know something about the problem you want to solve, or you will have to learn about it. Unfortunately, this takes time, and even then the problem may be too complicated to understand and articulate fully.
To explain, we tend to learn the most about complex problems only after we needed that knowledge. We see this reflected in traditional threat detection, which is rigid and has historically been implemented in security information and event management systems (SIEMs) with rules and knowledge bases such as signatures. These so-called expert systems manipulate information to detect known threats and alert users. They are called “expert” because they capture specialized knowledge, not because their performance generalizes to adjacent threats.
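To make the rigidity concrete, here is a minimal sketch of signature matching in Python. It is not how any particular SIEM works; the signature strings, alert names, and log lines are all hypothetical and chosen only to show that a rule recognizes exactly what it was written for and nothing adjacent.

```python
# Hypothetical signature set, for illustration only. Real products ship
# far larger, curated knowledge bases.
KNOWN_BAD_SIGNATURES = {
    "powershell -enc": "Encoded PowerShell command",
    "mimikatz.exe": "Credential dumping tool",
}


def match_signatures(event: str) -> list[str]:
    """Return the alert name for every signature found in the event text."""
    lowered = event.lower()
    return [
        alert
        for signature, alert in KNOWN_BAD_SIGNATURES.items()
        if signature in lowered
    ]


# The exact pattern the rule was written for is caught.
known = match_signatures("cmd.exe /c powershell -enc SQBFAFgA")

# A slightly reworded (hypothetical) variant of the same behavior is missed.
adjacent = match_signatures("cmd.exe /c powershell.exe -e SQBFAFgA")

print(known)
print(adjacent)
```

The rule is correct for the threat it encodes, but only someone who already knows the variant can write the next rule.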
While expert systems capture specialized knowledge, the knowledge itself is often rudimentary. The reason is that few problems contain unambiguous facts that are universal and completely formalizable. This is true for complex problems but especially true for the edges of complex problems. For example, threat hunters, who work on the edges of the threat detection problem, rarely know they are making decisions. These skilled experts are fluidly interacting with the problem and responding to patterns they recognize without considering what rules might apply, mainly because few explicit rules apply.
Explicit rules seem to be available only for more straightforward problems, which is why so-called expert systems such as SIEMs are, in practice, more akin to beginner systems. I vividly recall a typical “top talkers” response from cybersecurity analysts when asked what to look for when designing threat detection algorithms. This is not to poke fun at security professionals; quite the opposite. They are working on a complex problem whose aspects can be challenging to solve because they are tough to explain to others. The paradox is that for the problems where we need the most help, we are often in the weakest position to ask for it or give it. The lesson I’ve learned is to find beginners and have them tell you what they do. Then, find experts, and watch what they do.
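For readers unfamiliar with the term, a “top talkers” check simply ranks hosts by how much traffic they generate. The sketch below assumes a hypothetical list of (source address, bytes sent) flow records; the addresses and volumes are made up. It shows how little of the problem such a rule captures: it can be stated in one sentence and written in a few lines.

```python
from collections import Counter

# Hypothetical flow records: (source_ip, bytes_sent).
flows = [
    ("10.0.0.5", 1200),
    ("10.0.0.7", 300),
    ("10.0.0.5", 4800),
    ("10.0.0.9", 50),
    ("10.0.0.7", 900),
]


def top_talkers(records, n=2):
    """Return the n sources that sent the most bytes, largest first."""
    totals = Counter()
    for source_ip, sent in records:
        totals[source_ip] += sent
    return totals.most_common(n)


print(top_talkers(flows))
```

The ranking says which hosts are loud, not which are hostile, which is the gap between what is easy to explain and what experts actually look at.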
So, problem-solving requires problem specification. However, you can’t specify something you don’t know, nor can you learn something from someone who can’t tell you. This is the reason behind the adoption of machine learning in many domains, including cybersecurity. Machine learning relaxes the requirements of problem-solving: specifically, the requirement of knowing every detail about a problem, which traditional software development and expert systems (e.g., SIEMs) demand. There may be no other development like machine learning that promises this impact on problem-solving. Unfortunately, machine learning relaxes the requirements of problem-solving without removing them. Domain comprehension and problem framing are still required.
Figuring out where to start is often the most challenging part of problem-solving. We get weighed down by the size of problems and by not knowing which direction to take. When stuck, we wonder if we can get ourselves unstuck at all. If you don’t know where to start, consider picking the smallest and most accessible part of the problem and try to solve it with the smallest and most straightforward solution. Working toward simplicity is an effort to understand one aspect of the problem rather than being weighed down by every part of it. Try to find the minimum number of changes needed to reach a state where the problem no longer exists. The goal of problem-solving is not to find the most complex version of the problem to solve or to build the most complex solution, but rather to find the least complicated versions of both.
Of course, you can go too small with problem specification. You can make a problem so narrow that it becomes trivial. Trivial solutions to narrow problems do not reliably create customers. Moreover, while reductionism is a methodology for approaching complex problems, it is not a solution for complex problems by itself. In other words, reductionism at all costs has a cost, even for problems it is best suited for, like threat detection.
Reductionism is a good place to start, but it’s not the best place to finish. For example, point products are rather common in cybersecurity, including threat detection. These products provide a partial solution to a specific threat vector rather than addressing all the requirements that might otherwise be met with a multipurpose solution like a SIEM. Point products constrain a problem to achieve performance gains, but at the expense of adjacent parts of the whole problem. The result is an incomplete response that creates new problems, because partial solutions struggle at the edges of adjacent problems. A partial solution without the support of other partial solutions will be torn apart by the size of the underlying problem.
The point is that problem-solving should start small, but in practice you will oscillate between solving parts of a problem and the whole of the problem, between reductionism and holism. The reason is simple: practical problem solving requires understanding all aspects of a problem, the various levels of a problem, and how everything interacts. These are essential boundaries, and all of them need to be understood because boundaries tell you what to do and what not to do.
In part two of this series, we will discuss why threat detection is so hard despite the application of machine learning.