Tracebit is a brand new approach to an idea that’s been around for a long time: honeypots.
Because the word ‘honeypot’ means a lot of different things to different people, the particular type of honeypot Tracebit deploys is called a ‘canary’, but we’ll dig a little here into the history of honeypots to give you the bigger picture.
What are honeypots, why do security teams use them for intrusion detection and why does Tracebit think they’re a big enough deal to base a whole company around them? Let’s jump in!
Why canaries, why honeypots?
Before diving into the what, let’s step back and ask: what problem are we trying to solve?
There was a time when a security team might be focused entirely on preventing a security incident (like a data breach) from happening. We think two key things have changed this approach:
- The number of publicly disclosed security breaches tells a compelling story: even the best-resourced security teams can be breached.
- The stack of technology upon which an organization is built has become too complex, and changes too often, for anyone to reason about it properly and say whether it’s ‘secure’ or not.
The result is an "assume breach" mentality: to protect a business, a large portion of a security team’s time is spent thinking about events that “should not happen” or, rather, “will only happen if an intrusion has occurred”.
It can be tempting to write this off and say that if you detect an intruder once they’re in, it’s already too late. Actually, there’s a sweet spot between an intrusion occurring (e.g. an engineer's credentials being stolen and used) and an actual business impact (e.g. personal data being exfiltrated).
If the team can detect the intrusion and respond quickly, they can stop the business impact and be celebrated for the heroes that they are!
Today, this practice is often called ‘detection engineering’.
False Positives in Detection Engineering
Detection sounds promising in theory, but in practice it’s nowhere near that easy. As many a detection engineer has learned, over time and at scale, events that seem like they “should not happen” actually happen fairly consistently, leading to false positives and wasted time.
To put it another way: if a false positive is a one-in-a-million event and there are billions of events, there will be thousands of false positives.
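To make that arithmetic concrete, here is a minimal sketch. The one-in-a-million rate comes from the sentence above; the volume of two billion events is an assumed figure chosen purely for illustration.

```python
def expected_false_positives(total_events: int, one_in: int) -> float:
    """Expected false positives if 1 in every `one_in` events is a false positive."""
    return total_events / one_in


# Assumed volume for illustration only: two billion events.
total_events = 2_000_000_000
# "One in a million", as in the text above.
one_in = 1_000_000

count = expected_false_positives(total_events, one_in)
print(f"Expected false positives: {count:,.0f}")
```

Even a rule that is right 99.9999% of the time still generates a steady stream of alerts for someone to triage at this volume.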
Some classic false positive edge cases might include:
- an Excel spreadsheet that reaches out to the internet: “Oh, yeah, the CFO actually really needs that Macro, I know, but they’ve always done it that way”
- an S3 bucket that suddenly has a huge amount of activity: “Yeah, the sales team need to trigger that batch job... no, it’s not possible to predict when they’ll run it... yes, I know 4AM on Christmas Eve was a suspicious time to run it”
- an application in production making requests on port 31337 to a server in Italy: “Ah, that’s the licensing check for a library that’s in the critical path for some data pipelines we run. Yes, they’ve put a fix on their backlog.”
Rules that seemed obvious and had the security team high-fiving suddenly become a pain: false positives, lost faith, disillusionment, despair...
But fear not: challenges like these are where honeypots for intrusion detection come in!
What is a honeypot?
Honeypots flip this problem on its head. Instead of hunting for the one-in-a-billion event that shouldn’t happen on a resource (like a workstation, S3 bucket or application server), they create a resource that should not be interacted with at all.
Because the resource is wholly created and owned by the person looking to do the detection, they can reason fully about what interactions with it are acceptable and significantly reduce the possibility of annoying false positive edge cases creeping in.
This is a beautifully elegant solution: rather than enduring the pain of working out which things “should not happen” on existing resources, we create resources where nothing should happen at all.
Putting honeypots into practice
Sounds great in theory, but how do we put this into practice?
As with any security work, this should focus on real risks to the business, so let’s start there.
Let’s take detection engineering for a company like Netflix as an example. They store huge volumes of data in S3 buckets. There are tight controls around this data but realistically that data needs to be accessible by a disparate set of applications and humans. As long as the data is accessible by applications and humans there is always some risk that it could land in the wrong hands - impacting their business (a costly incident response, lost customer trust, fines from regulators).
Let’s say we’re focused on protecting the S3 bucket ‘netflix-eu-viewer-data’.
So how could we use honeypots to protect this data? One risk we need to consider is that a Netflix employee’s data access is compromised. If someone were to compromise this access, they may start searching around the S3 buckets they have access to.
A simple but effective way to detect this would be to create honeypot (or, as we call them, canary) resources alongside the critical asset, with names that look just as interesting:
- netflix-eu-production-viewer-data-sensitive (honeypot/canary)
- netflix-eu-viewer-data
- netflix-unreleased-content-2024 (honeypot/canary)
Let’s return to the compromised access for a privileged user. They start slowly enumerating the data that they have access to. The real asset and the honeypot assets are all available to them. All three are just S3 buckets, so they all look interesting; at this point the name is all the attacker has to go on.
Where can they go from here and where do honeypots make the difference?
This example illustrates the value and impact of honeypotting: we can alert with confidence on any interaction, and detect behaviours that would otherwise be difficult or impossible to highlight.
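As a rough sketch of why alerting is so simple here, the detection logic reduces to a set membership check. The bucket names are the ones from the example above; the `AccessEvent` record, its fields and the sample events are hypothetical simplifications rather than a real log schema or Tracebit’s implementation.

```python
from dataclasses import dataclass

# Canary bucket names taken from the example above.
CANARY_BUCKETS = {
    "netflix-eu-production-viewer-data-sensitive",
    "netflix-unreleased-content-2024",
}


@dataclass(frozen=True)
class AccessEvent:
    """Hypothetical, simplified access log record (not a real log schema)."""
    principal: str
    action: str
    bucket: str


def canary_alerts(events):
    """Return every event that touches a canary bucket.

    No baselines or thresholds are needed: nothing legitimate should
    interact with a canary, so any hit is worth an alert.
    """
    return [e for e in events if e.bucket in CANARY_BUCKETS]


# Illustrative events: one legitimate read, one canary interaction.
events = [
    AccessEvent("alice", "GetObject", "netflix-eu-viewer-data"),
    AccessEvent("alice", "ListObjects", "netflix-unreleased-content-2024"),
]

for alert in canary_alerts(events):
    print(f"ALERT: {alert.principal} ran {alert.action} on {alert.bucket}")
```

Note that the rule never needs to know what normal activity on the real bucket looks like, which is exactly the part that causes false positives in conventional detections.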
If honeypots are so great, why isn’t everyone using them?
So we’ve hopefully made a strong case for this approach, and maybe you’ve got some ideas whirring around your head about how you might put it into practice. But we all know there’s no such thing as a free lunch, especially in security, so what’s the catch?
We’ve spent a lot of time thinking about this as well as putting it into practice and (you guessed it..) built a platform that solves many of these challenges. But as with any strategic decision it’s important to make it with your eyes open to the negatives as well as positives.
Here are some of our lessons learned from deploying honeypots at scale.
- Testing - like a fire alarm, these techniques by design don’t alert often, but when they do, you really want to know about it. Some alerts in your program may be triggered often enough that they are already de facto tested (impossible travel, anyone?). This shouldn’t happen with honeypots, so you need a plan for testing them.
- Relevance - it’s true that most cloud attacks today are quite noisy, enumerating every object they can reach, but it’s still worth making some effort to blend in (“honeypot-1” may well get skipped!).
- Freshness - a honeypot needs to match your system and organization. For most systems change is a constant, so if your honeypots don’t change they will begin to stand out. When deploying, it’s important to consider how you’ll keep them fresh.
- Distribution - it can be tempting to honeypot only in the most critical environments, but it’s worth doing this in a broader set of places. Most security teams we work with are mindful that Build/CI and Staging environments are commonly the staging point for further attacks.
- Security tools - the vast majority of false positives you hit with this technique are, ironically, caused by security tools! Fortunately, these can usually be filtered out by the security team that deploys them (our platform is able to filter these tools automatically).
- Internal Communication - this practice can catch internal as well as external threats. This is a whole other article in itself, but what we’ve found works best is light-touch internal communication: explain that some resources may be in place for detection and security, ask people to let the security team know if they think they’ve encountered one, and make clear that they should absolutely not go hunting for them! We think this sets the right tone and expectations.
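Two of these lessons, filtering out security tools and testing, can be sketched in a few lines. This is an illustration under assumptions, not a description of how any particular platform does it: the tool names in `KNOWN_SECURITY_TOOLS` and the `canary-self-test` principal are placeholders.

```python
# Placeholder identifiers for security tooling that is expected to touch
# every resource (for example a posture scanner). These names are hypothetical.
KNOWN_SECURITY_TOOLS = {"posture-scanner", "backup-auditor"}


def should_alert(principal: str, is_canary: bool) -> bool:
    """Alert on any canary interaction, except from allowlisted tooling."""
    return is_canary and principal not in KNOWN_SECURITY_TOOLS


def self_test() -> bool:
    """Fire-alarm style check of the alerting logic.

    A synthetic canary hit must raise an alert, and an allowlisted
    security tool touching a canary must not.
    """
    synthetic_hit_alerts = should_alert("canary-self-test", True)
    scanner_is_filtered = not should_alert("posture-scanner", True)
    return synthetic_hit_alerts and scanner_is_filtered


print("self-test passed" if self_test() else "self-test FAILED")
```

Keeping the allowlist small and explicit matters: every entry is an identity an attacker could abuse to touch canaries unnoticed. A real test plan would also exercise the full alerting path, not just the decision logic shown here.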
Finally, it’s also our belief that the time for honeypots has only recently come. Deploying a virtual server to emulate network traffic brings its own risks and complications, so the cost/benefit doesn’t necessarily make it a no-brainer. Cloud resources, however, are a much more compelling place to look, but we’ll save that for another article!
Next Steps
We hand this over to you! This should have given you some food for thought on deploying honeypots in your environments. Done right, the return on investment can be very high.