Solving for Negatives

How to make the most effective use of cybersecurity investigators,

and when ineffectively using them is still worth it

As members of a DFIR team, our job is to perform investigations. Whether we’re in-house dedicated teams or serving customers as consultants, we are hired to find answers to essential questions. However, there can be significant limitations on when those questions can be answered and on what it costs to answer them. Typically, investigators are asked to prove that something did happen or to analyze the cause behind a known event. But increasingly, as DFIR teams are brought into broader risk-management needs, we’re asked to prove whether anything happened at all. In this post, I want to explore the difference between "proving a positive" and "proving a negative," because the latter is often a source of incredible frustration and wasted effort for DFIR teams.

                    Can you prove that aliens don’t exist?

Note: A friend who reviewed this post mentioned that nothing in forensics can be "proven" with the same finality as in mathematics. It's more accurate to say that we establish confidence in our answers and convey our level of certainty. I'll still use "prove" colloquially throughout this post, but know that this is an informal use of the term.

All investigations begin with a problem statement

Whether we’re working on the helpdesk or performing forensic investigations, all investigations start with a problem statement. It could be something like, “Why isn’t our email working?” or “Our intrusion detection system detected traffic to a known bad host.” The investigation begins with a known starting point, such as the email server or the host where the traffic was detected.

A “prove the positive” investigation starts at this known problem point. The investigator establishes basic facts (is this claimed event happening?) and then collects some initial data, forming a hypothesis to explain the issue. They collect additional data to prove or disprove the theory, refining it in a cycle until the facts support a specific conclusion.

Skilled investigators with access to sufficient logs and data can usually identify the correct explanation within a reasonable time frame. They use multiple data points to establish the correct answer confidently. In most "prove the positive" investigations, a clear problem statement (Was this machine hacked?) and a defined period (Investigate starting from this detected process) lead to specific, confident answers.

For (much) more about this process, see “Building Secure and Reliable Systems: Chapter 15, Investigating.”

The difficulty of prove-the-negative problem statements

Sometimes, especially in large organizations with many legal, customer, or regulatory responsibilities, investigators are asked to prove that something didn’t happen. For example, an internal audit might find that data thought to be tightly controlled had wider access than intended, or someone might report a vulnerability in a product that could allow authentication bypass if others had discovered it first.

In these cases, someone might want to know how confident the company can be that no harm was done during the period of uncertainty. If the risk is serious, investigating to reassure concerned customers, regulators, or other stakeholders is critical for the organization. However, proving that nothing bad happened in the past is much more complex than proving a positive! In philosophy, proving a negative is not seen as completely impossible, but philosophers generally prefer to reframe such efforts as proving an opposite positive. In business, unlike academic philosophy, the burden of proving security always falls on the provider (us!).

Prove-the-negative investigations start with a problem statement like, “Someone could have abused something at some time; we need to ensure that didn’t happen.” These problem statements are usually more vague and cover longer periods than “prove the positive” investigations. While detecting malware on a workstation might involve logs from minutes or hours, checking if a previously unknown risk has ever been exploited might require examining weeks, months, or even years of logs.

Due to this wider timeframe, we’re more likely to encounter log gaps, visibility issues, or need to make trade-offs due to the sheer volume of data. For example, analyzing logs for a previously unknown vulnerability could require examining years of data across hundreds or thousands of systems. Such an investigation can quickly consume the entire incident response team and may need even more resources.

This type of investigation is different from “threat hunting” in subtle but important ways; for more on that, see “Threat Hunting” at the bottom of this post.

The resourcing problem

Imagine that you’re the chief of police in a small city. You have eight officers on duty whom you can assign to respond to the needs of the people. When their instructions are “Patrol the area and keep an eye out for trouble,” they can reasonably cover the entire area. The officers need only walk at a casual pace and watch for recently broken windows, blood on the sidewalks, or other direct signs that a crime has just occurred. They’re also available to any citizen who might come up and say, “Help! I have a specific claim of a crime that has been committed,” at which point your officers can investigate that claim. Your department can do its job and is reasonably staffed.

Now imagine that the Mayor has asked you to prove that no one in the business district has been robbed this year. Although no victim has made any specific claims of having been robbed and none of their functioning alarm systems have alerted, it turns out all the door locks had been vulnerable to easy bypass, and no one noticed until a locksmith changing a door lock pointed it out this week.

Your eight officers now must engage with every business owner in the city and find out how much money they have, how much they should have, how much inventory they have, and how much they should have. If any discrepancies are noted, they must investigate every employee and visitor of every business to determine who might have taken the opportunity to steal something. This analysis will be complicated because the business owners may have kept poor records.

Your officers are now utterly unavailable for new cases - and will be for quite some time. They cannot perform the standard community policing they’re expected to perform. What do you do now? Hire eight new officers to focus on the same job as the old ones? Work the existing officers sixteen hours a day instead of eight, and deal with the mistakes that diminishing returns will cause? Or back-burner all that other work and accept that your officers’ intense focus on this task will leave most other crimes uninvestigated?

These kinds of investigations have no stated hypothesis, expansive time horizons, and a great deal of tedious work with no particular suspicion that anything actually happened. Too many leaders, faced with the choice between working the team harder and having no answer to the question, will choose to work the staff overtime. After all, “We have to show that we tried.” More than any other type of case, this kind of instruction is a team morale-killer.

Navigating the resource constraints

The bad news is that, as our response team becomes practiced at prove-the-negative investigations, we become more likely to receive them in the future. When we show ourselves competent to provide reassurances about risk, this type of request quickly moves from “once in a while” to “part of a standard process.” Often, stakeholders are distant enough from the team, or non-technical enough, that the request seems simple, and our success at having done it before tends to reinforce that belief. This effect is especially strong when some team members aren’t currently involved in an investigation - off-call or otherwise not committed right this minute. These “available” team members are seen as idle and easy to commit to new work.

To avoid high levels of burnout, security teams should establish expectations about how often the level of intense effort necessary to prove a negative is justified. For more about how this antipattern and burnout are related, see “My Invisible Adversary: Burnout.” At about the 21-minute mark, I discuss the illusion that a team not actively investigating something must have ‘spare capacity’ to run an investigation at that moment.

The good news is that the best way to accomplish this is similar to existing best practices for your regular investigations: Careful scoping and clear communication about your capabilities, limitations, and the costs of a given investigation can help ensure that proving negatives remains a rare request used only when necessary, and not a routine request made because stakeholders don’t understand the costs underlying the work.

For intrusion investigations and other prove-the-positives, we take for granted that we know how much we should investigate and what success looks like: we’re investigating an event and will work until we are confident we can fully explain what caused it. While up-front scoping is still helpful, it can be overlooked without catastrophe.

For prove-the-negative investigations, these discussions MUST be had with the stakeholders asking for the work. For example, determining whether a vulnerability has been exploited begins with clearly outlining the questions we think are being asked and what to do if we can’t answer them. A non-exhaustive list of questions we might ask for a due-diligence look into “Has any unauthorized access occurred to this internal service?” might include:

- What exactly counts as “unauthorized access” for this service?
- What time period are we being asked to cover?
- Which systems and data sources are in scope, and do logs still exist for that entire period?
- What level of confidence would satisfy the stakeholders asking the question?
- What should we do if the data needed to answer the question simply doesn’t exist?

Once we have specific guidance about what’s expected, we must return to the stakeholders with solid estimates of:

- How long the investigation will take;
- How many investigators it will pull away from other work;
- Which data sources we’ll need, and whether they cover the period in question;
- And, most importantly, what answers we will actually be able to give, and with what confidence.

Proving a Negative means communicating well

In particular, the last item above is key. Thinking through what answers are actually helpful for leadership to know, and informing them about what answers we can give, is crucial to deciding whether any prove-the-negative investigation should occur at all. When trying to prove a negative, the best we can usually answer is, “We have no evidence that anything was abused.”

This answer is usually unsatisfying to all parties. There’s a reason that “We have no evidence of…” has become an industry trope and the butt of jokes, and that’s because it is not possible for anyone outside the team to know whether this means “We didn’t look so we didn’t find anything” or “We turned over every possible stone but can’t conclusively prove that a thing didn’t happen.”

When an investigation is needed, we should provide our stakeholders with a complete list of hypotheses for what might have happened and our level of confidence that those things did or didn’t happen. For this reason, it helps to keep a hypothesis list as your investigative output for a prove-the-negative investigation. Each hypothesis can be described in advance, along with what data is available to investigate it, whether any certainty can be obtained (and how much), and what conclusions the stakeholder can draw.
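As an illustration, a hypothesis list like the one described above could be captured in a very small data structure. This is a sketch, not a standard format: the field names, confidence labels, and example entries here are my own invention, chosen to show how each hypothesis carries its data sources and achievable certainty alongside it.

```python
from dataclasses import dataclass

@dataclass
class Hypothesis:
    """One entry in a prove-the-negative hypothesis list (illustrative structure)."""
    description: str            # what might have happened
    data_sources: list          # logs/telemetry available to test it
    achievable_confidence: str  # how much certainty this data can ever provide
    selected: bool = False      # did the stakeholders ask us to pursue it?
    finding: str = ""           # conclusion once investigated

# Hypothetical entries for "Has any unauthorized access occurred to this service?"
hypotheses = [
    Hypothesis(
        description="An external actor used the auth-bypass flaw",
        data_sources=["load balancer access logs (90 days)", "WAF alerts"],
        achievable_confidence="partial - logs older than 90 days are gone",
    ),
    Hypothesis(
        description="An insider accessed data outside their role",
        data_sources=["application audit log (2 years)"],
        achievable_confidence="high",
    ),
]

# Stakeholders review the list and select which hypotheses are worth the cost:
hypotheses[1].selected = True
for h in hypotheses:
    mark = "x" if h.selected else " "
    print(f"[{mark}] {h.description} -> {h.achievable_confidence}")
```

Because the achievable confidence is written down before any work begins, the stakeholder selection step becomes an explicit agreement about scope and exit criteria rather than an open-ended commitment.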

Once the hypothesis list is prepared, we can ask the stakeholders to select specific ones they’re interested in investigating, agree to the exit criteria for when they’ve been well-investigated, and begin answering those specific and more clearly scoped questions as the basis of our investigation.

Conclusions

I’d love to be able to say, “Investigators can’t prove negatives, so we should not be asked to try,” but real-world stakeholder needs for these investigations exist, and we exist to fulfill those needs. What I am advocating for is a much clearer understanding between the stakeholders asking, “Can you show that this didn’t happen?” and the teams capable of answering those questions. Investigating to prove the negative should be done sparingly and driven by a set of clear answers we want to provide - not simply to demonstrate a lot of effort as a proxy for taking the issue seriously!

Now is the time - before the next log4j occurs or the next burnt-out sysadmin leaves a taunting “Guess what I did before I left?” sticky note on their monitor - to begin writing templates for hypothesis-driven investigations. Work with executives and the legal department to scope how the team will respond to these questions when they arise and when the investigation should be closed. Define what kind of answers are acceptable and whether it’s acceptable not to investigate if those answers cannot be obtained. As a bonus, this process might help uncover a difference between the capability our leadership thinks we can provide and what we can provide, and we can begin planning to close those gaps!

Threat Hunting (Addendum)

One of the first questions a peer reviewer asked me was “Don’t you find threat hunting valuable? It starts with only a theory and no specific event, which is how you’ve described prove-the-negative investigations.” This is a great point that deserves some additional clarification. Threat Hunting begins with only a theory and pursues potential, but undetected, malicious activity. However, it’s different from a prove-the-negative investigation in some important ways.

First, threat hunting still theorizes a specific set of indicators that should exist and then searches for those indicators. While a prove-the-negative investigation might ask, “Has anything bad happened to this service?”, a threat hunt asks, “Has anyone used PowerShell commands with an NTLM hash in them to pass the hash?” - a much more defined and scoped question. When the hunt finds someone doing this, an investigation centers on whether that action was legitimate business activity or evidence of an attacker.
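The pass-the-hash example above can be sketched as a tiny indicator search. This is a toy, not a production detection: the log lines are fabricated, and the indicator is simply “a PowerShell command line containing a 32-hex-character token” (the length of an NT hash), which a real hunt would refine considerably. The point is only that the hunt starts from a concrete, testable indicator rather than an open-ended “anything bad.”

```python
import re

# Illustrative indicator: a 32-hex-character token, the length of an NT hash.
NTLM_HASH = re.compile(r"\b[0-9a-fA-F]{32}\b")

def suspicious(cmdline: str) -> bool:
    """Flag PowerShell command lines that carry a hash-length hex token."""
    return "powershell" in cmdline.lower() and bool(NTLM_HASH.search(cmdline))

# Fabricated command-line log entries for demonstration:
log = [
    "powershell.exe -Command Get-ChildItem C:\\Users",
    "powershell.exe sekurlsa::pth /user:admin /ntlm:a9fdfa038c4b75ebc76dc855dd74f0da",
    "cmd.exe /c whoami",
]

hits = [line for line in log if suspicious(line)]
```

Each hit then becomes the known starting point for a normal prove-the-positive investigation: was this legitimate administration, or an attacker passing the hash?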

Second, and critically, threat hunting is planned in advance, scoped, and generally run as a measured project during business hours. The teams performing the hunt are doing so as a project, not as surprise work on top of their other operational tasks. Their deadline is also typically flexible and non-urgent. This means the burnout incurred from a threat hunt is typically far less than from a sudden investigation whose results important people are eagerly expecting.