Andy Smith

CEO & Co-Founder

Thu, March 14, 2024

Canary Infra: Bringing Honeypots towards general adoption

There are few practices in security that red teamers and penetration testers alike fear more than honeypots. The idea of having to tiptoe around an environment is a jarring prospect for those used to being able to act with relative impunity.

This isn't just what this particular security vendor thinks, either: there is now empirical evidence that this approach makes attackers less effective at their jobs[1].

But if you speak with security teams you'll find many haven't implemented this approach. Why is that? We think the cost/benefit trade-off has never quite been there, but we also think that's changing.

I'm going to explore the different flavours of "honeypotting" or "deception" and present a new idea that we think is finally going to make this approach standard for teams of all sizes and stages of maturity: canary infra.

Traditional Honeypots

Honeypots are resources created to lure attackers to interact with them. In doing so the attacker reveals their presence and techniques. A traditional honeypot is commonly an application server offering and emulating one or more network services.

Example of an attacker's view of a SSH Honeypot

As an example (illustrated above), a popular service to emulate is SSH. For a long time defenders, researchers and hobbyists alike have written software to let an attacker think they've correctly guessed an SSH username and password. This can be used to alert defenders to the attacker's presence and to learn more about what they're looking for[2].

There are many traditional honeypots out there; one GitHub list references hundreds of open source projects.

Drawbacks of Traditional Honeypots for Intrusion Detection

The idea of a traditional honeypot has all the value that we've previously extolled - plugging detection gaps or even replacing costly detection engineering by producing low volume, highly actionable, high priority, low false positive alerts.

Why aren't they more popular, then? We think there are a bunch of factors at play here:

  • Alert Precision - Network based alerts can be difficult to action and the network can be noisy. It is not always useful to be alerted that IP 10.0.0.30 touched Honeypot A because it lacks additional context or could be a false positive.

  • Discoverability - Attackers may not always jump straight to scanning a network for interesting services to attack, so lures or breadcrumbs might need to be laid to increase the probability of them finding our honeypot. Even then, it's often only practical to deploy a small number of honeypot servers in a network.

  • Maintenance - These honeypots are applications running on operating systems, which require regular patching and vulnerability management.

  • Supply Chain Risk - Security software is just as vulnerable to supply chain attacks as any other software; arguably, it's a more compelling target. Environments in which it's most critical to detect intrusions are often also those in which you'd be most cautious about deploying third-party software.

  • Foothold Risk - As security practitioners we should know that any code could potentially be exploited. A honeypot being exploited and leveraged for an attack is the very definition of an 'own goal'.

  • Cost - Running servers costs money. Especially today, when microsegmentation is commonplace, the cost of running a honeypot in each network can be untenable.

  • Circumvention - Simulating network services will always be a cat and mouse game. There are tools out there[3] to let you detect and avoid honeypots altogether.

When you lay it out like this - it starts to become apparent why this security practice may not be so popular. Even though the benefits may be great, the costs are non-trivial.

Another way: Canary Infra

Canary Infra is harder to avoid

Something that stood out to us when considering these drawbacks alongside the modern cloud environments that we're looking to protect was that there are resources in cloud computing where all of these problems basically just... go away.

Take Amazon Web Services (AWS) as an example. Consider "serverless" resources such as S3 buckets, IAM roles, DynamoDB tables and Secrets Manager secrets against the above criteria:

  • Alert Precision - Any interactions with these resources produce a very rich audit trail (in AWS it's CloudTrail) that doesn't just provide an IP address but also session, user, role, user agent and more, making it much more actionable.

  • Discoverability - Because we're just using the actual cloud assets, we will show up in any API requests to list or search for these resource types. Also, because of cost (see below) it's affordable to scale out many of these resources, making their discovery more likely.

  • Maintenance - AWS is responsible for operating systems and software patches of these resources, most are completely transparent to us (e.g. when Log4Shell happened - Amazon paged their engineering team, not their customers).

  • Supply Chain Risk - As users of AWS we have already accepted the risk of their presence in our supply chain and AWS is responsible for upstream software (e.g. we trust AWS's due diligence on 3rd party software they ship to production).

  • Foothold Risk - There's no direct code execution here and AWS is heavily incentivised to keep it that way.

  • Cost - These resources are usage based, so either very cost effective (cents a month) or free to deploy.

  • Circumvention - Canary Infra is the real deal: a canary S3 bucket is 100% an S3 bucket. We just need to ensure that the few variables that go into creating it don't stand out, and we become very difficult to fingerprint.
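
To make the alert precision point concrete, here is a minimal Python sketch that turns a single CloudTrail record into a short alert summary. The field names (`eventName`, `sourceIPAddress`, `userAgent`, `userIdentity`, `requestParameters`) follow the CloudTrail record format. The sample record, the bucket name and the `summarise_event` helper are hypothetical, made up purely for illustration.

```python
import json

# Hypothetical CloudTrail record for a read against a canary S3 bucket.
# All values are invented; only the field names follow the CloudTrail format.
SAMPLE_RECORD = json.loads("""
{
  "eventTime": "2024-03-14T09:30:00Z",
  "eventSource": "s3.amazonaws.com",
  "eventName": "GetObject",
  "awsRegion": "eu-west-1",
  "sourceIPAddress": "10.0.0.30",
  "userAgent": "aws-cli/2.15.0",
  "userIdentity": {
    "type": "AssumedRole",
    "arn": "arn:aws:sts::111122223333:assumed-role/ci-deploy/session-1",
    "accountId": "111122223333"
  },
  "requestParameters": {"bucketName": "example-canary-bucket", "key": "backup.sql"}
}
""")


def summarise_event(record):
    """Pull out the context that makes a canary alert actionable."""
    identity = record.get("userIdentity", {})
    params = record.get("requestParameters") or {}
    return {
        "when": record.get("eventTime"),
        "action": f'{record.get("eventSource")}:{record.get("eventName")}',
        "who": identity.get("arn", "unknown"),
        "identity_type": identity.get("type", "unknown"),
        "source_ip": record.get("sourceIPAddress"),
        "user_agent": record.get("userAgent"),
        "resource": params.get("bucketName"),
    }


if __name__ == "__main__":
    print(json.dumps(summarise_event(SAMPLE_RECORD), indent=2))
```

Compared with "IP 10.0.0.30 touched Honeypot A", the summary also names the role and session that made the call. Note that S3 object-level operations such as `GetObject` are CloudTrail data events, which have to be enabled explicitly on a trail.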

For us this was when things started to get exciting - until now 'honeypotting' has had a bunch of value but has not been a 'no brainer'. With canary infra it actually starts to look like something everyone should be doing.

Historically the word 'honeypots' is associated with a 'late maturity' or even 'never' roadmap item, but when the risks and the costs are tiny and the benefits significant, it becomes something that could come at the very beginning of a security program. It certainly isn't something that needs to wait until other boxes have been checked.

Tracebit: Automated Canary Infra for Intrusion Detection

Example Tracebit Decoy Generation

Sounds great, what's the catch? Well there is one more thing - setting up infrastructure and monitoring still doesn't come for free.

Cloud computing may make it easier, cheaper and safer, but it does still require work, not least integrating it into the existing workflows (CI, CD, IaC) an organisation has already built. The microsegmentation approach we mentioned earlier can also mean tens, hundreds or thousands of unique accounts in which this canary infra needs to go.[4]

This is the final piece of the puzzle that we're tackling with Tracebit - there is still a lot of work here to do when considering deploying canary infra:

  • Tailoring to the environment - "honeypot-1" won't cut it. Most teams deploy naming schemes or standardise on particular settings, so it's important to blend in (or when standing out - do so consciously!)

  • Infra selection and design - plenty of services are suitable for this approach, but not all; choosing among them needs some deep thought, design and testing.

  • Integration - Today it's table stakes to manage infrastructure with infrastructure as code - any canary infra needs to be prepared to integrate in this way.

  • Monitoring - The promise of a honeypot is a highly actionable, high priority, low false positive alert that you can trust will go off at the most critical moment. Ensuring all the edge cases and failure cases are covered takes work!

  • Tuning - In theory a honeypot "should never be touched" except during a security event, but other security tools (Cloud Security Posture Management, Asset Inventory) will touch them. These interactions need to be tuned out to avoid creating noise.

  • Long term management - honeypot programs sometimes fail when treated as a 'one shot' project. The reality is that environments change all the time, and whatever canary infra is in place needs to account for that.
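
As a rough illustration of the tuning point, the Python sketch below keeps only those CloudTrail-style records that touch a canary bucket and do not come from known security tooling. The role names in `KNOWN_TOOL_ROLE_MARKERS`, the bucket names and the `canary_alerts` helper are all hypothetical. A real deployment would need its own list of expected principals and would cover more resource types than S3.

```python
# Hypothetical markers for roles used by expected tooling (CSPM, asset inventory).
KNOWN_TOOL_ROLE_MARKERS = (
    "assumed-role/cspm-scanner/",
    "assumed-role/asset-inventory/",
)


def is_expected_tooling(record):
    """True when the caller's ARN matches one of the known tooling roles."""
    arn = record.get("userIdentity", {}).get("arn", "")
    return any(marker in arn for marker in KNOWN_TOOL_ROLE_MARKERS)


def canary_alerts(records, canary_buckets):
    """Return records that touch a canary bucket and are not expected tooling."""
    alerts = []
    for record in records:
        params = record.get("requestParameters") or {}
        if params.get("bucketName") not in canary_buckets:
            continue  # not a canary resource, so not our concern here
        if is_expected_tooling(record):
            continue  # tuned out: a scanner we already know about
        alerts.append(record)
    return alerts


if __name__ == "__main__":
    sample = [
        {"userIdentity": {"arn": "arn:aws:sts::111122223333:assumed-role/cspm-scanner/scan"},
         "requestParameters": {"bucketName": "example-canary-bucket"}},
        {"userIdentity": {"arn": "arn:aws:sts::111122223333:assumed-role/ci-deploy/session-1"},
         "requestParameters": {"bucketName": "example-canary-bucket"}},
    ]
    for alert in canary_alerts(sample, {"example-canary-bucket"}):
        print("ALERT:", alert["userIdentity"]["arn"])
```

One design caveat: anything on the allowlist becomes a blind spot, so an attacker who compromises a tooling role would be tuned out too. Keeping the list short and specific limits that exposure.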

The exciting news is that all of this is possible in a cloud native world - by leveraging APIs and Infrastructure as Code it's possible to solve these problems in a repeatable, scalable way, and that's exactly what we're building with Tracebit!
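
As one small illustration of the tailoring problem, the Python sketch below proposes a decoy name that follows the most common prefix among existing resource names. This is an assumption-laden toy, not a description of how Tracebit generates decoys: the helper names, the `-` separator, the two-token prefix depth and the sample names are all hypothetical.

```python
from collections import Counter


def common_prefix(names, sep="-", depth=2):
    """Most frequent leading `depth` tokens among names, or None if none qualify."""
    prefixes = Counter(
        sep.join(name.split(sep)[:depth])
        for name in names
        if name.count(sep) >= depth  # need at least one token after the prefix
    )
    if not prefixes:
        return None
    return prefixes.most_common(1)[0][0]


def propose_decoy_name(existing, suffix_candidates, sep="-"):
    """Return the first candidate that fits the naming scheme and is unused."""
    prefix = common_prefix(existing, sep)
    for suffix in suffix_candidates:
        candidate = f"{prefix}{sep}{suffix}" if prefix else suffix
        if candidate not in existing:
            return candidate
    return None  # every candidate collides with a real resource


if __name__ == "__main__":
    existing = ["acme-prod-logs", "acme-prod-assets", "acme-dev-logs"]
    print(propose_decoy_name(existing, ["logs", "backups"]))
```

In practice the same idea would extend to tags, regions and other settings that could make a decoy stand out.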

Conclusion

We really believe that Canary Infra represents a step change in a space that has not seen significant innovation in a while, and that it is going to move us to a world where honeypots are the rule rather than the exception in a security team's stack.

It's exciting to imagine this world and the impact it's going to have on attackers' behavior.

If you'd like to learn more details about what we're doing, please reach out!


  1. One particular study: "Examining the Efficacy of Decoy-based and Psychological Cyber Deception" (2021).

  2. We're going to focus here on intrusion detection - honeypots inside the perimeter of an organisation that act as an alert for malicious activity. We'll discuss threat intelligence - honeypots on the internet - in an upcoming article.

  3. James Brine's Honeydet being a great recent addition.

  4. There is prior art here from Amazon - "How to detect suspicious activity in your AWS account by using private decoy resources" - but this is not a complete solution that can be scaled out to many use cases.

Thanks for reading this far, we hope you got something from this article. Tracebit is building a new kind of security product to be the ‘easy button’ for adding detections to cloud environments using decoys. If you’re interested to learn more about what that looks like please reach out (we currently support AWS, with Azure & GCP coming soon).
