It’s time we stop letting ransomware attacks succeed

Enough already.  It’s time to change our approach to ransomware.  What we’ve been doing isn’t working.  

While there is no guaranteed way to stop all attacks, most ransomware exploits that succeed do so because we let them.  We let them succeed by not denying them the ability to cause harm.  WE LET THEM SUCCEED.

What are we doing wrong?  We need to start by deconstructing an attack.  Every incident has three components that answer critical questions:

  1. Susceptibility: Am I immune? If not, do I have compensating controls?

  2. Exploitability: How easy is the attack to pull off? Can a “script kiddie” do it, or does it require the advanced capabilities of a nation state?

  3. Impact: What’s the outcome if an attack succeeds?

Under the NIST Cybersecurity Framework, Susceptibility and Exploitability controls generally fit under the “Identify” and “Protect” functions, while Impact mitigation aligns with our ability to “Detect,” “Respond,” and “Recover” from incidents once they begin.

To stop ransomware from establishing a foothold, we depend on controls that are never going to be perfect.  We rely on authentication and firewalls to keep bad actors out, antivirus signatures and proxies to keep out or contain malware, patching and basic hygiene to close vulnerabilities, and we train people to not click links in emails or open risky attachments.  

These reduce our Susceptibility and Exploitability, but not enough.  We see the proof of this every day when we learn of yet another attack. 

Focus on Mitigating Impact

That leaves us with mitigating Impact.  Most Impact controls focus on recovery, such as having a good backup.  Even if our backups work, restoring systems at scale is painful.  

Detection and mitigation controls tend to be in the form of designs and playbooks to keep an attack from spreading once begun.  We abandon the systems that were breached and try to limit the damage.  That sounds reasonable, unless the systems that were breached are important.

What’s missing is a way to mitigate the impact of an attack on each and every system.  It’s not hard to detect ransomware by looking for specific behavior; we just haven’t done it.  Nearly all ransomware that encrypts files exhibits a similar pattern: open a file, read it, encrypt it, write the encrypted version, and delete the original (this last step is skipped if the ransomware overwrites files in place).  Repeat in a loop as fast as possible to inflict maximum damage before being noticed.

That’s the key, because repeating behavior in a loop makes it detectable.
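To make that concrete, here is a minimal sketch of loop detection. It assumes some monitoring layer already hands us file events as (timestamp, operation, path); the class name, the event format, and the two thresholds are all hypothetical and chosen only for illustration. It counts how many distinct files were read and then written or deleted within a short window.

```python
from collections import deque


class RewriteLoopDetector:
    """Count files that are read and then written or deleted within a time window."""

    def __init__(self, window_seconds: float = 10.0, max_files: int = 50):
        self.window_seconds = window_seconds  # hypothetical tuning value
        self.max_files = max_files            # hypothetical tuning value
        self._recent_reads = {}               # path -> time of the last read
        self._rewrites = deque()              # (time, path) of read-then-modify events

    def observe(self, timestamp: float, op: str, path: str) -> bool:
        """Record one file event; return True if the pattern looks suspicious."""
        cutoff = timestamp - self.window_seconds
        if op == "read":
            self._recent_reads[path] = timestamp
        elif op in ("write", "delete"):
            read_at = self._recent_reads.pop(path, None)
            if read_at is not None and read_at >= cutoff:
                self._rewrites.append((timestamp, path))
        # Drop read-then-modify events that have aged out of the window.
        while self._rewrites and self._rewrites[0][0] < cutoff:
            self._rewrites.popleft()
        distinct_files = {p for _, p in self._rewrites}
        return len(distinct_files) >= self.max_files
```

A real detector would also need to expire old entries in the read table and would weigh this signal alongside others. The point is only that a tight read-modify loop across many files is easy to count.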

A Better Approach Requires Action

We can look in real time for large amounts of file reads and writes, CPU workloads indicating encryption, changes in file names or types, and changes in system “entropy” (randomness of data). We can look at the sequence of activities to minimize false alarms.  Behavior monitoring tools do some or all of this today.
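As one illustration of the entropy signal, the sketch below computes the Shannon entropy of a buffer in bits per byte. Encrypted output sits close to the maximum of 8, while ordinary text sits well below it. The function names and the 7.5 threshold are assumptions made for this example, not values taken from any particular product.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_encrypted(data: bytes, threshold: float = 7.5) -> bool:
    """Flag a buffer whose entropy meets a hypothetical threshold."""
    return shannon_entropy(data) >= threshold
```

Compressed files such as ZIP archives and JPEG images also score high, which is why entropy works best as one signal among several rather than as a verdict on its own.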

It’s been less clear what we should do once ransomware is suspected by automated tools.  Ransomware does its work quickly, so by the time we respond to a notification the system is well on its way to being lost.  

Some tools focus on the behavior of individual processes, so if a ransomware process is detected it can be stopped.  We don’t want to stop something critical, so we begin to whitelist processes that are exempt from supervision.  This starts us down a path of increasing complexity and maintenance cost that is both error prone and a potential opening for ransomware to exploit.  In most cases, this is a bad trade-off.

A better approach is to slow suspect behavior instead of stopping it, giving us time to notify a user and ask if the system should be permitted to proceed.  We can incrementally slow the system more and more while waiting.  Ransomware’s need to work quickly is mitigated, and we limit the damage it can do without the catastrophic side effects of hard-stopping processes if we have false alarms.
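To get a feel for how much this limits damage, here is a rough sketch of one possible slowdown schedule: an exponentially growing delay before each suspect write, capped at a maximum. The function names and the default values (50 ms base, doubling, 5 second cap) are assumptions for illustration only.

```python
def throttle_delay(step: int, base: float = 0.05, factor: float = 2.0, cap: float = 5.0) -> float:
    """Delay in seconds added before suspect write number `step` (0-based)."""
    if base <= 0 or cap <= 0:
        raise ValueError("base and cap must be positive")
    return min(cap, base * factor ** step)


def writes_within(seconds: float, **schedule) -> int:
    """How many throttled writes fit in `seconds`, ignoring the writes' own duration."""
    elapsed = 0.0
    step = 0
    while True:
        delay = throttle_delay(step, **schedule)
        if elapsed + delay > seconds:
            return step
        elapsed += delay
        step += 1
```

With these illustrative defaults, writes_within(60) comes to 17 writes in the first minute. A false alarm costs a legitimate job some waiting time, while real ransomware gets almost nowhere before a user can answer the prompt.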

With this approach, both our detection and mitigation can happen in the same place, because ransomware needs to access files on a disk.  Rather than looking at the behavior of individual processes, we can observe the behavior of the full system by “wrapping” the storage device drivers.  This is harder for a malicious process to work around.

If ransomware is detected, we use the same wrapper to increasingly slow write access while a user is notified.  We contain the damage ransomware does on each system, rather than only containing the ransomware from spreading to other systems.
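A real implementation would live at the driver level, but the shape of the idea can be sketched in user space. The example below wraps any object with a write(bytes) method so that every write passes a single checkpoint, where a detector callback is consulted and an escalating delay is applied. The class name, the callback, and the delay values are hypothetical stand-ins, not an existing API.

```python
import time


class GuardedStorage:
    """Wrap a writable binary object so every write passes one checkpoint."""

    def __init__(self, raw, is_suspicious, sleep=time.sleep,
                 base_delay: float = 0.05, max_delay: float = 5.0):
        self._raw = raw                      # the underlying writable object
        self._is_suspicious = is_suspicious  # hypothetical detector callback
        self._sleep = sleep                  # injectable so it can be tested
        self._base_delay = base_delay
        self._max_delay = max_delay
        self._delay = 0.0                    # 0.0 means writes are not slowed

    def write(self, data: bytes) -> int:
        if self._is_suspicious(data):
            # Escalate: start at base_delay, then double up to max_delay.
            self._delay = min(self._max_delay, max(self._base_delay, self._delay * 2))
        if self._delay:
            # Once slowed, every writer is slowed, not just the suspect one.
            self._sleep(self._delay)
        return self._raw.write(data)

    def release(self) -> None:
        """Call when the user confirms the activity is legitimate."""
        self._delay = 0.0
```

Because the slowdown applies to the storage path rather than to a named process, there is no whitelist to maintain, and nothing is killed if the alarm turns out to be false.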

Someone Needs to Build it

Who should implement this approach?  Ideally, we demand it of our malware detection vendors.  Scream at them, loudly.  Tell them that if they don’t implement this, you will find a vendor who will.  Their inaction can’t be your disaster.  If they don’t act, we should build our own solutions.   Making backups and waiting for an attack can’t be our response.

I honestly don’t believe we need another vendor providing malware solutions.  CTM, a cybersecurity research lab and incubator, is in the business of solving hard problems and building companies around those solutions.  This is a problem worth solving but not a stand-alone business worth building.  

CTM might build this and give it away free to end users if existing vendors don’t step up.  I’m that tired of letting these attacks succeed.

If you are tired of ransomware too, and don’t see a better solution coming from your existing vendors, let me know.  If there is enough interest, I’ll invest some resources to build this and make it available.  Sometimes you just do what’s needed instead of looking to monetize a market.

Anyone else tired of letting this happen to us?  Anyone else want a better solution?  Let me know.