On the Relationship Between Cheese and Risk
In cybersecurity, there is a lot of talk of “defense in depth”, but sometimes it is unclear what people mean… or why you should care.
This post answers questions about the concept of “defense in depth”, with unexpected help from a famous cheese family.
What is defense in depth?
Like many concepts in cybersecurity, defense in depth is an extrapolation of a military strategy.
Defense in depth is a defensive strategy where terrain is given up to an attacker to prevent casualties and delay the attack. The delay ultimately enables the attacked party to mount a counterattack.
To better understand the metaphor from the physical world, we need to talk a bit more about the tactics.
Defense in Depth (DiD) Tactics
To implement defense in depth, especially when facing cavalry units, multiple levels of redundant defenses must be set up. When one line of defense is breached, the defenders fall back to a second line. When the second line falls, they fall back to a third, and so on. A force that relies on a single line of defense is vulnerable to exploitation (another military term used in the cyber realm). When fast-moving units exploit a breach in the line, they can attack from behind or cut the supply lines. The unit must rely on redundant defenses to avoid catastrophic failure. It is this concept that inspired what cybersecurity calls “defense in depth”.
Unfortunately, the term defense in depth may mean something different depending on the vendor you are talking to.
A network segregation equipment (firewall) vendor will talk about defense in depth in terms of setting up multiple perimeters. If one perimeter is breached, it will not compromise the entire network.
To another vendor, defense in depth will mean setting up multiple countermeasures to address the same risk, like having an AV solution and an EDR tool on the same machine.
Cybersecurity and Diminishing Returns
In the previous post in this risk management series, we looked at how risk is reduced by applying countermeasures. Investment in countermeasures follows a steep return curve: the first countermeasures remove most of the risk, while each additional one aimed at the residual risk yields diminishing returns.
So, why do we need a “belt and suspenders” strategy?
That’s where the cheese comes in.
Accident Causation
Industrial risk management explains incidents as a series of aligned failures using the Swiss cheese model of accident causation.
Swiss cheese is full of holes. However, with a thick enough block, you will not look through a hole and see daylight out the other side.
How is it possible that cheese full of holes blocks the view? When sufficiently thick, it is unlikely that all the holes in the cheese will align.
By stacking layers of protection, even if the layers are “full of holes”, an incident is unlikely to occur. In fact, it will only occur if all the holes in the Swiss cheese are lined up.
This is exactly what we try to achieve by using defense in depth.
Defense in depth is especially important in cybersecurity, where all countermeasures have known bypasses (there are a lot of holes in that cheese). Layering different types of defenses is how we reach a level of risk reduction we can live with.
Layered Risk Countermeasures: Preventing the Holes from Aligning
Does that mean your business should run multiple AV tools?
Unfortunately, that won’t do much. In order for multiple defenses to be effective, they need to have different types of failure modes. In other words, the holes in the cheese must not align.
There are classes of attacks that are failure modes for a majority (or the entirety) of AV brands. For example, many AV tools cannot detect living-off-the-land (LoL) malware because they are based on malicious binary detection. LoL attacks use known, good binaries that are often part of the operating system. Three different brands of AV might all miss an LoL attack.
A layered approach requires tools with different failure modes.
That is why using two passwords does not count as two-factor authentication, and why mounting two switches in the same rack does not provide redundancy against fire.
Rate of Countermeasure Failure
Based on the mathematics of risk, this makes a lot of sense.
If we consider that countermeasure 1 has a chance to fail of f1 and countermeasure 2 has a chance to fail of f2, the chance that both of them will fail (F) is the conjunction of these risks:
(1) F = f1 x f2
If we say that f1 is 1 in 100 and that f2 is 3 in 100, the chance that both of them will fail at the same time is 3 in 10,000 (0.03%).
The combination of both countermeasures is a massive improvement over the 1% failure rate of countermeasure 1 alone or the 3% of countermeasure 2 alone. Because risk = probability x impact, layering countermeasures reduces risk as well.
However, this only holds true if the two probabilities are statistically independent.
If the failures are correlated, the gain shrinks. If they are perfectly correlated, combining the countermeasures provides no gain at all over the better of the two. Because the calculation becomes much more complicated, people usually assume risks are statistically independent.
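To make the arithmetic concrete, here is a minimal Python sketch. It evaluates equation (1) for the example values, then shows how the result changes when the two failures are correlated. The function name `joint_failure`, the parameter `rho`, and the correlation value 0.3 are illustrative assumptions that do not come from the text above. The correlated case uses the standard formula for the joint probability of two yes/no events with correlation coefficient `rho`.

```python
import math


def joint_failure(f1: float, f2: float, rho: float = 0.0) -> float:
    """Probability that both countermeasures fail at the same time.

    rho is an assumed correlation coefficient between the two failure
    events; rho = 0 is the independent case of equation (1).
    """
    covariance = rho * math.sqrt(f1 * (1 - f1) * f2 * (1 - f2))
    joint = f1 * f2 + covariance
    # A joint probability can never exceed the smaller individual
    # probability, nor drop below max(0, f1 + f2 - 1).
    upper = min(f1, f2)
    lower = max(0.0, f1 + f2 - 1.0)
    if not (lower - 1e-12 <= joint <= upper + 1e-12):
        raise ValueError("rho is not achievable for these failure rates")
    return joint


f1, f2 = 0.01, 0.03

independent = joint_failure(f1, f2)             # equation (1): f1 x f2
partly_correlated = joint_failure(f1, f2, 0.3)  # hypothetical correlation

print(f"independent:     {independent:.4%}")
print(f"correlation 0.3: {partly_correlated:.4%}")
print(f"holes aligned:   {min(f1, f2):.4%}")
```

With these inputs the independent case gives 0.03%, while a correlation of 0.3 pushes the joint failure rate to roughly 0.54%, much closer to the 1% of countermeasure 1 alone. When the holes line up completely, the pair is no better than the stronger countermeasure on its own.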
The bottom line: choose cybersecurity solutions that address existing failure modes.
How Bad Can Risk Correlation Be?
Assuming that risks are independent when they are not can be very damaging.
Take, for example, a bank lending money for mortgages. The bank loans Alan $500,000. The bank anticipates a 10% gain in earnings over the course of the loan. However, there is also a 3% risk that Alan will default on the loan.
With this in mind, the expected value of that loan is defined by the following risk equation:
(2) EV = (100%-3%) x (10% x $500,000) + 3% x –$500,000 = $33,500
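Equation (2) can be checked with a few lines of Python. This is only a sketch: the function name `expected_value` and its parameter names are made up for illustration, and the model is the same simplified two-outcome one used above (the loan either earns its full gain or loses the full principal).

```python
def expected_value(principal: float, gain_rate: float, default_prob: float) -> float:
    """Expected value of one loan under the two-outcome model of equation (2)."""
    gain_if_repaid = gain_rate * principal   # 10% x $500,000
    loss_if_default = -principal             # the bank loses the full amount
    return (1 - default_prob) * gain_if_repaid + default_prob * loss_if_default


ev = expected_value(principal=500_000, gain_rate=0.10, default_prob=0.03)
print(f"Expected value: ${ev:,.0f}")  # Expected value: $33,500
```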
$33,500 is a pretty good return. However, there is still a big risk that Alan defaults on the loan, leaving the bank out $500,000. So, let’s say the bank bundles one hundred borrowers (like Alan) in some form of financial instrument (let’s call it a CDO for the fun of it).
If, and that’s a big if, the risk of default between the various borrowers is due to random chance (they get sick, their money is eaten by a badger, etc.), their defaults are statistically independent.
The probability of all the borrowers defaulting at the same time is vanishingly small. The bank is extremely unlikely to lose the value of all its loans (3%^100, which is roughly 5 x 10^-153).
The probability that the bank will lose any money on the bundle is also very small. The number of defaults in the bundle can be modeled with a binomial distribution, where we do 100 draws with a 3% chance of “success”. Since each non-default nets $50,000 but each default costs $500,000, at least ten borrowers must default on their loans for the bank to lose money (90 x $50,000 – 10 x $500,000 = –$500,000).
Plugging these numbers into a binomial distribution calculator shows a 0.0874% chance of losing any money. This is a solid investment for sure.
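For readers without a binomial calculator at hand, the same figures can be reproduced with the Python standard library. The sketch below assumes exactly the model described above: 100 independent loans, a 3% default chance each, a $50,000 gain per repaid loan and a $500,000 loss per default. The helper name `prob_at_least` and the constant names are illustrative.

```python
from math import comb

N_LOANS = 100
P_DEFAULT = 0.03
GAIN = 50_000    # earned on each loan that is repaid
LOSS = 500_000   # lost on each loan that defaults


def prob_at_least(k_min: int, n: int, p: float) -> float:
    """P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))


# Smallest number of defaults that turns the bundle's net result negative.
break_even = next(
    k for k in range(N_LOANS + 1) if (N_LOANS - k) * GAIN - k * LOSS < 0
)

p_any_loss = prob_at_least(break_even, N_LOANS, P_DEFAULT)
p_total_loss = P_DEFAULT**N_LOANS

print(f"Defaults needed to lose money: {break_even}")
print(f"Chance of losing any money:    {p_any_loss:.4%}")
print(f"Chance that every loan fails:  {p_total_loss:.2e}")
```

The script reports ten defaults as the break-even point and a chance of losing money of about 0.087%, in line with the figure quoted above.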
When Default Rates are Correlated
When default rates are correlated, catastrophic failure is possible.
When loan defaults depend on correlated events such as a rise in the interest rate, or when they can be affected by a global economic downturn or a downward turn in housing valuation, the chances of catastrophic failure are much higher. If loan defaults are perfectly correlated, the bank would have a 97% chance of making $5,000,000. However, the bank would also have a 3% chance of losing $50,000,000.
The bank’s expected return is the same in both scenarios. However, when risks are statistically independent, the investment can be incredibly safe. When risks are correlated, the same investment carries a real chance of catastrophic loss.
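The comparison between the two scenarios can be sketched in Python as follows. The constant names are illustrative, and the "perfectly correlated" case is modeled in the simplest possible way: either all 100 borrowers repay or all 100 default. The expected value of a sum does not depend on correlation, which is why both scenarios land on the same number.

```python
N_LOANS = 100
P_DEFAULT = 0.03
GAIN = 50_000    # earned on each loan that is repaid
LOSS = 500_000   # lost on each loan that defaults

# Independent defaults: the expected value of the bundle is the sum of the
# expected values of the individual loans.
ev_per_loan = (1 - P_DEFAULT) * GAIN - P_DEFAULT * LOSS
ev_independent = N_LOANS * ev_per_loan

# Perfectly correlated defaults: only two outcomes are possible.
best_case = N_LOANS * GAIN      # everyone repays
worst_case = -N_LOANS * LOSS    # everyone defaults
ev_correlated = (1 - P_DEFAULT) * best_case + P_DEFAULT * worst_case

print(f"Expected value, independent: ${ev_independent:,.0f}")
print(f"Expected value, correlated:  ${ev_correlated:,.0f}")
print(f"Chance of total loss, independent: {P_DEFAULT**N_LOANS:.2e}")
print(f"Chance of total loss, correlated:  {P_DEFAULT:.0%}")
```

Both expected values come out to $3,350,000. What changes is the shape of the risk: a total loss is practically impossible in the independent case, and a 3% event in the correlated one.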