A hard look at GuardDuty shortcomings
Is GuardDuty all you need for AWS threat detection? We’ve asked our friend Rami McCarthy to dive into GuardDuty’s performance and consider the potential place for Canary Infrastructure.
AWS GuardDuty is a belt-and-suspenders security control for your cloud. However, it’s often taken as a panacea for cloud threat detection. I wanted to dive deeper and run some concrete experiments. Read on for the results of adversarial simulation, a review of detection latency, and an analysis of projected S3 ransomware timing.
The conclusion? GuardDuty has coverage, cost, and efficacy gaps. These limitations make Canary Infrastructure a great complement, with a best-in-class signal-to-noise ratio, consistently low latency, and lower cost.
If you're interested in how Tracebit can help, click Book a demo above to schedule a call with one of our founders.
🏫 GuardDuty 101
GuardDuty’s role as a required control for PCI DSS and NIST.800-53.r5 and place in Scott Piper’s AWS Security Maturity Roadmap is evidence of its cornerstone role in AWS security. It provides a baseline threat detection capability with a mix of signature, heuristic, and machine-learning-based rules. Its strengths lie in:
- Turn-key setup
- Deep integration with a diverse set of AWS native log sources
- Use of bundled threat intelligence
The past two years have brought significantly expanded coverage to GuardDuty. AWS has launched Malware Protection for S3, EC2 Runtime Monitoring, ECS and Fargate support, and EKS Runtime Monitoring.
GuardDuty has a compelling pitch. It has detections for some of the top cloud risks, including:
- Anomalous S3 usage, indicating exfiltration of data (or ransomware)
- Workloads querying cryptocurrency-related domain names (cryptojacking)
- Stolen EC2 Instance credentials being used by a different AWS account or outside AWS entirely
🧐 Grading GuardDuty
In security, validating assumptions is crucial. Security is a Market for Silver Bullets, where both buyers and sellers have incomplete information. To cut through GuardDuty’s marketing, I focused on three elements: coverage, cost, and efficacy. To measure efficacy, I conducted multiple hands-on experiments to better approximate GuardDuty’s performance.
Coverage
When considering GuardDuty as the exclusive cloud threat detection tool, it is essential to check its coverage. GuardDuty supports fewer than a dozen services[1], while AWS is rapidly approaching 250. The addition of new SKUs over the past few years has turned much of GuardDuty's coverage into an optional add-on with additional costs. Rollouts of these new SKUs happen slowly, resulting in patchy regional coverage.[2]
In total, GuardDuty has 176 finding types. While more isn’t always more in threat detection, the infrequent addition of new findings, when paired with the slow expansion of service support, paints a picture of GuardDuty only covering a few core parts of AWS’s growing attack surface and complexity.
Cost
An effective security program maximizes return on investment (ROI). To do so, you need to accurately cost out the investment any given control introduces.
Projecting the cost of GuardDuty can be challenging due to the variety of factors involved. Rates are based on volume of analyzed data, from unpredictable sources like CloudTrail Events, VPC Flow Logs, or DNS Query Logs. The new SKUs add to this complexity, incorporating data from EKS Audit Logs and S3 Data Events.
GuardDuty costs can quickly spiral out of control, as illustrated by real-world experiences:
- Corey Quinn described a third-party vendor generating enough CloudTrail events to make GuardDuty the account’s most expensive service
- A Redditor saw monthly charges for S3 Protection creep up from $2.5k to $6.5k in a single account
- An EKS Protection user found that GuardDuty initially cost more than the cluster itself
While the cost of foundational GuardDuty features is non-negligible[3], they are worth enabling. Consider that, according to AWS, 90% of the 2000 largest customers do so. Unfortunately, the arithmetic for other features is very case-specific, the price is high, and the best projection model AWS offers involves using the 30-day free trial and checking the bill.
Efficacy
Will GuardDuty detect threats and reduce my risk?
This remaining question is at the heart of validating GuardDuty’s place in your security program. There are a few inherent tradeoffs to GuardDuty’s efficacy as a standalone threat detection tool. First, GuardDuty can miss environmental context. Regardless of the sensitivity of various resources and workloads, GuardDuty findings have a static risk rating. GuardDuty also tends to generate significant noise, including a high volume of Low findings in an account with standard traffic. This leads to a variety of standard guidance on tuning and handling the alert volume, including blanket exclusion of Low findings.[4] Finally, while GuardDuty’s machine learning models can occasionally filter out important findings, they bring along non-determinism and tend to struggle with low-volume, high-impact attacks[5].
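As a rough illustration of the blanket-exclusion guidance, the sketch below buckets findings by numeric severity and drops the Low ones. It assumes findings are plain dictionaries carrying a numeric `Severity` field, as in GuardDuty's finding JSON, and it assumes the severity bands are Low 1.0-3.9, Medium 4.0-6.9, High 7.0-8.9, and Critical 9.0-10.0; check the current GuardDuty documentation before relying on those thresholds. The function names are hypothetical.

```python
from collections import Counter


def severity_label(score: float) -> str:
    """Map a numeric severity to a band (thresholds are assumptions, see above)."""
    if score >= 9.0:
        return "Critical"
    if score >= 7.0:
        return "High"
    if score >= 4.0:
        return "Medium"
    return "Low"


def summarize(findings: list[dict]) -> Counter:
    """Count findings per severity band to see how much of the volume is Low."""
    return Counter(severity_label(f["Severity"]) for f in findings)


def drop_low(findings: list[dict]) -> list[dict]:
    """Blanket exclusion: keep everything that is not in the Low band."""
    return [f for f in findings if severity_label(f["Severity"]) != "Low"]
```

A filter like this cuts volume, but it cannot tell a Low finding on a sensitive workload from one on a sandbox, which is the environmental-context gap described above.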
Experiment 1: Adversarial Simulation using Stratus Red Team
To better understand GuardDuty’s efficacy, I started by simulating six different attack techniques using stratus-red-team. Attacks were orchestrated via VPN from diverse geolocations.
I focused on attacks related to data exfiltration, specifically testing the following Stratus attacks:
- Retrieve a High Number of Secrets Manager secrets (Batch)
- Retrieve a High Number of Secrets Manager secrets
- Retrieve And Decrypt SSM Parameters
- Exfiltrate EBS Snapshot by Sharing It
- Exfiltrate RDS Snapshot by Sharing
- S3 Ransomware through batch file deletion
This simulation resulted in zero GuardDuty alerts.[6]
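For readers who want to repeat the simulation, the six techniques can be scripted roughly as follows. This is a sketch rather than the harness used for the experiment: the technique IDs are my best reading of the Stratus Red Team catalogue and should be confirmed with `stratus list`, and the helper names are hypothetical. It defaults to a dry run so that nothing is detonated by accident.

```python
import shutil
import subprocess

# Assumed Stratus Red Team technique IDs for the six attacks listed above.
# Confirm each one with `stratus list` before detonating anything.
TECHNIQUES = [
    "aws.credential-access.secretsmanager-batch-retrieve-secrets",
    "aws.credential-access.secretsmanager-retrieve-secrets",
    "aws.credential-access.ssm-retrieve-securestring-parameters",
    "aws.exfiltration.ec2-share-ebs-snapshot",
    "aws.exfiltration.rds-share-snapshot",
    "aws.impact.s3-ransomware-batch-deletion",
]


def detonate_command(technique_id: str) -> list[str]:
    """Build the CLI invocation for one technique."""
    return ["stratus", "detonate", technique_id]


def run_all(dry_run: bool = True) -> list[list[str]]:
    """Return the commands; execute them only when dry_run is False."""
    commands = [detonate_command(t) for t in TECHNIQUES]
    if not dry_run:
        if shutil.which("stratus") is None:
            raise RuntimeError("stratus binary not found on PATH")
        for cmd in commands:
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    for cmd in run_all(dry_run=True):
        print(" ".join(cmd))
```

Only run this with `dry_run=False` against an account you are authorized to test, and clean up the created resources afterwards.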
Experiment 2: Detection Validation using amazon-guardduty-tester
I wanted to follow up the first experiment with an approach that could answer two questions:
- Does GuardDuty consistently detect actions in line with its findings?
- How quickly do GuardDuty findings appear?
To answer these questions, I turned to amazon-guardduty-tester, an AWS-published tool for generating GuardDuty findings. This tool uses the CDK to spin up a Kali instance in AWS and the necessary infrastructure to simulate attacks and trigger GuardDuty.
I attempted to trigger 12 different findings. Of the 11 tests that ran successfully, three did not produce the expected GuardDuty findings; it is unclear whether that points to an issue with GuardDuty or with the tester. The three were:
- Recon:EC2/Portscan
- Recon:IAMUser/TorIPCaller
- PenTest:S3/KaliLinux
The remaining 8 findings covered a variety of data sources and threat purposes. Four used DNS logs, two used VPC flow logs, one used data events, and one used management events.
- UnauthorizedAccess:EC2/TorClient
- CryptoCurrency:EC2/BitcoinTool.B!DNS
- Impact:EC2/MaliciousDomainRequest.Reputation
- Trojan:EC2/DNSDataExfiltration
- UnauthorizedAccess:EC2/SSHBruteForce (triggered 2 findings)
- UnauthorizedAccess:S3/MaliciousIPCaller.Custom
- Backdoor:EC2/C&CActivity.B!DNS (triggered 3 findings)
- Stealth:IAMUser/CloudTrailLoggingDisabled
The median detection latency was 15 minutes.
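A latency figure like this can be computed by pairing the time each test action was triggered with the time the matching finding appeared. The sketch below shows one way to do that; it is not necessarily how the measurements here were taken. The helper names are hypothetical, and which timestamp counts as "finding appeared" (for example, the finding's creation time) is an assumption you would need to settle for your own tests.

```python
from datetime import datetime, timedelta
from statistics import median


def detection_latency(triggered_at: datetime, finding_seen_at: datetime) -> timedelta:
    """Time between triggering an action and the finding becoming visible."""
    return finding_seen_at - triggered_at


def median_latency(pairs: list[tuple[datetime, datetime]]) -> timedelta:
    """Median latency over (triggered_at, finding_seen_at) pairs."""
    seconds = [detection_latency(t, f).total_seconds() for t, f in pairs]
    return timedelta(seconds=median(seconds))


def format_mmss(delta: timedelta) -> str:
    """Render a latency as minutes:seconds, e.g. 308 seconds becomes 5:08."""
    total = int(delta.total_seconds())
    return f"{total // 60}:{total % 60:02d}"
```

Reporting the median rather than the mean keeps one very slow finding from dominating the summary.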
S3 Ransomware Benchmarking
Going further, I wanted to analyze the growing trend of S3 ransomware[7] in the context of GuardDuty’s efficacy. AWS has reported a rise in this class of attack [pdf], which involves exfiltration of data from S3 followed by a ransom note, often paired with data deletion. As we saw in the previous experiment, there can be considerable latency before GuardDuty detections fire. Also, in the first experiment we saw that Stratus Red Team’s “S3 Ransomware through batch file deletion” technique didn’t trigger GuardDuty at all.
I wanted to offer a very simplistic benchmark of how quickly a naive attacker could exfiltrate data and then delete it. I avoided optimization, but chose to use s5cmd as an example of an efficient off-the-shelf tool. The prior benchmark of s5cmd also included download statistics.
I tried to match the experimental design, using a c5n.18xlarge instance in the same region as our test bucket. However, I also leveraged a large attached EBS volume, to better simulate the infrastructure needed to directly download a large volume of data.[8]
We can pair benchmarks with our test of GuardDuty to assess how GuardDuty might hold up against an S3 ransomware attack.[9]
It’s worth noting that detecting any malicious actions targeting S3 can be challenging due to the cost and volume of the requisite logs. These logs are also not available by default. For a deeper dive into S3 logging, check out ramimac.me/s3-logging.
To be generous, we can take the best-case latency of 5:08 (five minutes and eight seconds) observed for UnauthorizedAccess:S3/MaliciousIPCaller.Custom:
- Using our very conservative results, an attacker could exfiltrate 100 GB before the alert fires
- Using the s5cmd benchmark, that number is 1.324 TB
- Looking at s3p instead, it would be 2.464 TB
It’s clear that GuardDuty’s alert latency would limit response efficacy, even if you implemented auto-remediation.
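The arithmetic behind these figures can be sketched as follows. The throughput values are not measured here: they are back-calculated from the volumes quoted above and the 5:08 window, using decimal units (1 TB = 1000 GB). Projecting to the 15-minute median latency is an extrapolation that assumes the attacker sustains the same rate and has that much data to take.

```python
BEST_CASE_LATENCY_S = 5 * 60 + 8   # 5:08 -> 308 seconds
MEDIAN_LATENCY_S = 15 * 60         # 15 minutes -> 900 seconds

# Volumes quoted above for the 5:08 window, in TB.
QUOTED_TB = {"naive test": 0.100, "s5cmd benchmark": 1.324, "s3p": 2.464}


def implied_throughput_gb_s(volume_tb: float, window_s: float) -> float:
    """Throughput (GB/s) needed to move volume_tb within window_s."""
    return volume_tb * 1000 / window_s


def exfiltrated_tb(throughput_gb_s: float, window_s: float) -> float:
    """Data moved (TB) at a sustained throughput over window_s."""
    return throughput_gb_s * window_s / 1000


if __name__ == "__main__":
    for name, tb in QUOTED_TB.items():
        rate = implied_throughput_gb_s(tb, BEST_CASE_LATENCY_S)
        at_median = exfiltrated_tb(rate, MEDIAN_LATENCY_S)
        print(f"{name}: ~{rate:.2f} GB/s implied, ~{at_median:.2f} TB in 15 minutes")
```

Under these assumptions the quoted volumes correspond to roughly 0.32, 4.3, and 8 GB/s, and each volume grows by a factor of about 2.9 if the alert arrives at the median latency instead of the best case.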
🐦 Canary Infra: A Complementary Control
I asked at the outset: “Is GuardDuty all you need for AWS threat detection?”
Having broken down the cost, coverage, and efficacy limitations of GuardDuty, it’s clear that it leaves room to bolster your threat detection capabilities. That’s where canary infrastructure comes in.
Coverage
Canary infrastructure is flexible and extensible, allowing you to blanket your cloud far beyond GuardDuty’s ten supported services.
Cost
The usage-based billing model makes it cheap to deploy. Tracebit automates deployment and maintenance, making it turn-key to operate.
Efficacy
The latency of most canary alerts is bounded by CloudTrail delivery speed. Tracebit’s prior research demonstrates that the average CloudTrail delay is around two and a half minutes, substantially faster than the latencies observed in our GuardDuty test.
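To put a rough number on that gap, the sketch below compares the roughly two-and-a-half-minute CloudTrail delay with the two GuardDuty latencies reported earlier (the 5:08 best case and the 15-minute median). It mixes an average with a best case and a median, so treat the ratios as indicative only; the names are hypothetical.

```python
CANARY_AVG_S = 150  # "around two and a half minutes" of CloudTrail delay

GUARDDUTY_S = {
    "best case (5:08)": 5 * 60 + 8,
    "median (15 minutes)": 15 * 60,
}


def times_slower(guardduty_s: float, canary_s: float = CANARY_AVG_S) -> float:
    """How many times longer the GuardDuty latency is than the canary latency."""
    return guardduty_s / canary_s


if __name__ == "__main__":
    for label, seconds in GUARDDUTY_S.items():
        print(f"GuardDuty {label}: {times_slower(seconds):.2f}x the canary latency")
```

On these figures, the best-case GuardDuty finding took about twice as long as a typical canary alert would, and the median took six times as long.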
Canary infrastructure allows you to create high signal alerts, especially as Tracebit auto-filters false positives driven by security tooling. Alerts are deterministic, firing reliably, unlike GuardDuty’s anomaly detection.
Simply publicizing use of canary infrastructure has been proven to be “effective at impeding attacker forward progress.”
Footnotes