Sam Cox

CTO & Co-Founder

Fri, May 3, 2024

Canary AWS credentials: Beyond a token effort

Canary cloud credentials

For a malicious actor, AWS credentials represent the enticing prospect of access to an organization's cloud environment. They may be explicitly sought out and are likely to be used if found. Canary AWS credentials are credentials created, planted and monitored specifically to detect possible breaches of the systems in which they are placed. Unsurprisingly, this can be a valuable method of intrusion detection!

In this post we'll explore some of the technical considerations when building or deploying canary AWS credentials at scale, guided by the latest security research. There's at least one existing service in this space[1], but I'd invite you to consider the points below, as we think there's value in digging a little deeper no matter what approach you choose.

Considerations when deploying Canary AWS Credentials

  1. Anti-patterns
  2. Placement
  3. Short term vs Long term credentials
  4. Automated Deployment
  5. Identification
  6. Account Fingerprinting
  7. Permissions
  8. False Positives
  9. Conclusion

Anti-patterns

I don't believe there is a one-size-fits-all canary AWS credential strategy. As you'll see below, there's plenty of detail worth considering to avoid wasting effort on a deployment that fails to provide value or quickly becomes stale.

What I will confidently attest to is that a quick and dirty "spray and pray" approach is unlikely to provide value. That might look like:

  • selecting a handful of high value assets and manually deploying canary credentials to them
  • skipping solving for the long-term maintenance of the canary credential deployment (will you cycle a credential that triggers? how will you onboard new assets?)
  • placing canary credentials in places they'll never be discovered
  • placing canary credentials in places so discoverable they get discovered far too often (with a vague idea this might provide 'threat intel')

If these examples sound familiar then perhaps I can help you consider the variables necessary to take an approach that may not be perfect immediately but will at least allow for iteration and avoid wasted effort. Let's dive in!

Placement

As a starting point, it's worth having a picture of why and where you want to deploy the canaries. This can be guided by, for example:

  • your threat model
  • reports about threat actors' activity
  • the level of existing alerting and monitoring capabilities you have in place

This will likely be quite specific to your organization. Here are a few concrete examples:

Endpoint or workstation devices

AWS credentials can be placed on endpoint devices to detect activity associated with certain malware. For instance, Palo Alto's Unit 42 reported on the Kazuar backdoor which contains capabilities designed to steal ~/.aws/credentials files. If canary AWS credentials were planted in this location and subsequently used, this activity could be detected.
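As an illustrative sketch of what planting such a credential might involve (the profile name and placeholder values here are assumptions, not prescriptions), deploying to this location amounts to writing standard INI into ~/.aws/credentials:

```python
import configparser
import io
from pathlib import Path

def render_canary_profile(profile: str, key_id: str, secret: str) -> str:
    """Render an ~/.aws/credentials section containing a canary key pair."""
    config = configparser.ConfigParser()
    config[profile] = {
        "aws_access_key_id": key_id,
        "aws_secret_access_key": secret,
    }
    buf = io.StringIO()
    config.write(buf)
    return buf.getvalue()

def plant_canary(profile: str, key_id: str, secret: str) -> None:
    """Append the canary profile to ~/.aws/credentials, creating the file if absent."""
    creds_path = Path.home() / ".aws" / "credentials"
    creds_path.parent.mkdir(mode=0o700, exist_ok=True)
    with creds_path.open("a") as f:
        f.write(render_canary_profile(profile, key_id, secret))
```

A real deployment (e.g. via MDM, discussed later) would also need to handle pre-existing profiles and credential rotation rather than blindly appending.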

Docker Container hosts

Datadog recently reported on a threat actor compromising exposed Docker API endpoints and deploying cloud credential stealers designed to obtain AWS credentials from the Instance Metadata Service (IMDS) as well as several locations on disk. Sysdig previously reported a similar technique used by TeamTNT after exploiting a service hosted within Kubernetes.

CI or Build environments

CI environments represent an attractive target for malicious actors looking to steal code, inject payloads into builds or laterally move into cloud environments. In the Codecov and CircleCI breaches sensitive environment variables (including AWS access keys) were made available to attackers. In the latter example, the compromise was detected because of canary AWS credentials. CI systems could potentially also represent one of the earliest opportunities to detect supply chain compromise of third party dependencies. For instance, a supply chain attack compromising the Python package ctx led to environment variables being exfiltrated, including any AWS credentials available to the process.

Short term vs Long term credentials

An important, and sometimes overlooked, consideration is whether to issue short term or long term credentials as canaries.

Long term credentials

Long term credentials are those which are valid until they are otherwise revoked. These are access keys associated with IAM users. Long term credentials are stored within AWS and validated for each request.

Short term credentials

These are credentials which have an explicit expiration, determined at the point of issue. There are a number of different means of obtaining short term credentials (e.g. associated with assumed roles, federated users, etc). Short term credentials include an AWS_SESSION_TOKEN. These credentials can be valid for as long as 36 hours (the maximum for GetSessionToken and GetFederationToken; AssumeRole sessions are capped at 12 hours).
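As a sketch of one issuance path (boto3 assumed, helper names mine), sts:GetFederationToken can mint short term canary credentials for the full 36 hours:

```python
MAX_FEDERATION_SECONDS = 129_600  # 36 hours, the GetFederationToken maximum

def federation_params(name: str, hours: int) -> dict:
    """Build GetFederationToken parameters, clamping to API limits."""
    return {
        "Name": name[:32],  # federated-user names are limited to 32 chars
        "DurationSeconds": min(hours * 3600, MAX_FEDERATION_SECONDS),
    }

def issue_short_term_canary(name: str, hours: int = 36) -> dict:
    """Return AccessKeyId, SecretAccessKey, SessionToken and Expiration."""
    import boto3  # deferred so the helper above stays dependency-free
    sts = boto3.client("sts")
    return sts.get_federation_token(**federation_params(name, hours))["Credentials"]
```

Note that GetFederationToken must itself be called with long term IAM user credentials.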

Quotas

If building this yourself, there is a default quota of 5,000 IAM users per AWS account, and each IAM user can have two active access keys. This effectively caps the number of active long term credentials which can be issued from a single AWS account at 10,000.

Short term credentials operate quite differently - the session token they include is effectively an encrypted blob, produced by STS when the credentials are issued. This token can be used to authenticate a request to AWS but is not otherwise stored within AWS. This means that the number of active short term credentials is effectively unlimited.

Detection

A key requirement of canary credentials is to be able to detect when they are used. Primarily this relies on CloudTrail, which logs activity within AWS and includes various details about the principal which performed an action. We can watch CloudTrail for activity associated with the canary credentials to detect when they have been used.
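As a minimal sketch (boto3 assumed; function names are mine), the CloudTrail LookupEvents API can be polled for activity recorded against a specific key ID:

```python
def lookup_attributes(access_key_id: str) -> list:
    """LookupEvents filter matching activity performed with a specific access key."""
    return [{"AttributeKey": "AccessKeyId", "AttributeValue": access_key_id}]

def find_canary_activity(access_key_id: str) -> list:
    """Return CloudTrail events recorded for the canary key."""
    import boto3  # deferred; calling this requires valid AWS credentials
    ct = boto3.client("cloudtrail")
    events = []
    for page in ct.get_paginator("lookup_events").paginate(
        LookupAttributes=lookup_attributes(access_key_id)
    ):
        events.extend(page["Events"])
    return events
```

In practice you would more likely stream CloudTrail into EventBridge or your SIEM for near-real-time alerting; LookupEvents is shown for simplicity and only covers management events in the current region for the last 90 days.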

It's important to note however that CloudTrail will not log all API requests made. A simple example of this is a request like s3:GetObject - if data events are not enabled within CloudTrail then there won't be a record of that activity. In a previous blog post, I also showed that VPC endpoint policies can prevent certain activity from being logged in CloudTrail. Finally some events just aren't supported by CloudTrail - for instance sqs:ListQueues[2].

There is however one additional detection option for long term credentials. You can call iam:GetAccessKeyLastUsed or iam:GenerateCredentialReport to determine some very rudimentary details about an access key - including the timestamp, service name and region in which it was last used. This provides much less information than is available in CloudTrail, but it is a very effective method of determining whether credentials have been used at all. I speculate that these details are recorded by AWS' internal Runtime Authentication Service as an inherent part of authenticating the request, and so I consider it unlikely that there is any technique to evade this system while using the key itself.
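A sketch of that check with boto3 (the helper names are mine; LastUsedDate is omitted from GetAccessKeyLastUsed's response when a key has never been used):

```python
def has_triggered(access_key_last_used: dict) -> bool:
    """LastUsedDate is absent from the response when the key has never been used."""
    return "LastUsedDate" in access_key_last_used

def check_canary(access_key_id: str) -> bool:
    """Return True if the given long term canary key has ever been used."""
    import boto3  # deferred; calling this requires valid AWS credentials
    resp = boto3.client("iam").get_access_key_last_used(AccessKeyId=access_key_id)
    return has_triggered(resp["AccessKeyLastUsed"])
```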

One significant advantage of short term credentials from AWS' perspective is that they are stateless and do not need to be stored internally by their systems. It seems therefore unlikely that a similar "last used" mechanism would ever be implemented for short term credentials.

Detection Window

It's worth remembering that a canary can only trigger while the credentials are still valid. With short term credentials, this window can be up to 36 hours from the point at which they are issued. With long term credentials, it lasts until the access key is explicitly deactivated or deleted.

In certain cases this may make long term credentials preferable, but it is worth considering what your response to an alert will look like. For instance, if long term credentials issued two years ago trigger today, there is a very long window in which the system containing those credentials could potentially have been breached. In comparison, if short term credentials trigger, you know that the credentials were obtained and used within a short period of time (<36 hours), which could significantly focus the scope of investigation.

Of course, just because long term credentials last indefinitely, it doesn't mean they have to! You can cycle them much more frequently.

Plausibility

The use of canary credentials is a relatively well-known technique, so it's worth thinking about how plausible the credentials are in the context in which they're placed. Many organizations are trying to move away from the use of long term credentials where possible. If found on a USB device stored in an office safe, long term credentials may appear quite plausible. On the other hand, one set of long term credentials among many short term credentials on an Engineer's laptop might stand out as a potential canary in the context of your organization and be actively avoided by attackers.

Automated Deployment

It's important to think about how you intend to deploy or make available the canary credentials. This is highly dependent on where you intend to place them. For instance, when deploying on endpoints, can you make use of your Mobile Device Management (MDM) software to facilitate their deployment? Perhaps they need to be namespaced (e.g. under a different AWS CLI profile) to avoid conflicting with other credentials.

In a CI environment, can you automate issuing credentials ahead of each job, or placing them in the associated secret stores?

For virtual machines, can you automatically create them at launch time? William Bengston describes an advanced technique in which they could be returned via an IMDS proxy.

Identification

You will want to be able to identify which system may have been compromised when a particular credential is triggered. For instance, if deploying credentials to many endpoints, you'll likely want distinct credentials per device with at least the hostname associated with each one. When deploying canary credentials in CI, you may wish to associate them with a CI build identifier. This is critical to significantly narrow the scope of investigation if one triggers.

The simplest form of identification would be to ensure that the identifier (e.g. hostname) appears in the principal ARN - which in turn will appear in CloudTrail when the credentials are used. For instance, when using long term credentials, you could create an IAM user with the name including the host and issue an access key for that user. With short term credentials, you could pass the hostname as the RoleSessionName when assuming the role. I would caution however that you should not place sensitive information in the principal ARN - it's readily available to anyone with the credentials. Encryption could address this concern but comes with its own challenges. Restrictions on length and format of these various names also prevent placing much metadata in the principal ARN.
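For instance, with short term credentials, a hostname can be fitted into RoleSessionName's constraints (2-64 characters from [\w+=,.@-]) before assuming the role. A sketch, with boto3 assumed and the fallback name being my own invention:

```python
import re

def session_name_for(host_id: str) -> str:
    """Fit a host identifier into RoleSessionName's allowed alphabet and length."""
    cleaned = re.sub(r"[^\w+=,.@-]", "-", host_id)[:64]
    return cleaned if len(cleaned) >= 2 else "canary-host"

def issue_for_host(role_arn: str, host_id: str) -> dict:
    """Issue short term canary credentials whose principal ARN names the host."""
    import boto3  # deferred; calling this requires valid AWS credentials
    resp = boto3.client("sts").assume_role(
        RoleArn=role_arn,
        RoleSessionName=session_name_for(host_id),  # surfaces in the principal ARN
    )
    return resp["Credentials"]
```

Remember the session name is visible to whoever holds the credentials, so keep it non-sensitive; the tight length and format limits are one reason richer metadata belongs elsewhere.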

For this reason it may be useful to store some metadata about each credential issued (e.g. in DynamoDB). This allows you to store much more information about each credential, without concerns about it being exposed. It also allows you to easily see how many credentials you have active (e.g. indexed by device), without resorting to listing IAM users or hunting in CloudTrail.

One word of warning is that - at least for short term credentials - AWS Access Key IDs themselves are not unique. At high enough volumes, you will encounter duplicates (including concurrently valid ones!). You'll want to create your own unique identifier to place in the Principal ARN.
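A sketch of both ideas together - generating our own identifier and keying metadata on it (the table and attribute names are illustrative, boto3 assumed):

```python
import uuid

def new_canary_id() -> str:
    """A 32-char hex ID: fits RoleSessionName's constraints, and collisions are
    negligible with uuid4, unlike Access Key IDs."""
    return uuid.uuid4().hex

def record_canary(table_name: str, canary_id: str, metadata: dict) -> None:
    """Persist per-credential metadata (hostname, owner, expiry...) for response."""
    import boto3  # deferred; calling this requires valid AWS credentials
    boto3.resource("dynamodb").Table(table_name).put_item(
        Item={"canary_id": canary_id, **metadata}
    )
```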

Account fingerprinting

In October 2023, a fairly low-key piece of research significantly changed how teams should think about the impacts of account fingerprinting.

It's now possible to determine the AWS account ID associated with an AWS access key without making any requests to AWS, because the account ID is encoded within the access key itself. This means that if an attacker finds an access key, they can determine the associated account ID with no risk of detection. Previously, attackers relied on the CloudTrail evasion techniques discussed above, which could themselves be detected - itself a strong signal. That is no longer necessary: attackers can now determine the account ID stealthily.
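The decoding is only a few lines. This reconstruction follows the published technique - base32-decode the key ID after its four-character prefix, then extract bits 7-46 of the first six bytes:

```python
import base64

def account_id_from_key_id(access_key_id: str) -> str:
    """Recover the AWS account ID embedded in an access key ID, entirely offline."""
    trimmed = access_key_id[4:]              # drop the "AKIA"/"ASIA" prefix
    decoded = base64.b32decode(trimmed)      # key IDs use the base32 alphabet
    z = int.from_bytes(decoded[:6], "big")   # only the first 48 bits matter
    mask = 0x7FFFFFFFFF80                    # keep bits 7..46
    return str((z & mask) >> 7).zfill(12)    # account IDs are 12 digits
```

For example, the 20-character key ID ASIAY34FZKBOKMUTVV7A decodes to account ID 609629065308.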

Some tools have already built in capabilities to detect whether AWS credentials are associated with accounts known to issue canary credentials. Even if an account isn't in such a list, a sophisticated attacker could consider whether the credentials are plausible in the context in which they find them:

  • do they match the account in any other credentials found?
  • do they match the account ID contained in any ARNs in the environment?
  • are they included in a list of account IDs associated with your organization which they might have been able to obtain by other means?

Although I'm not aware of any public technique to determine an AWS Organization ID from an AWS Account ID, issuing credentials from an account owned by your Organization (rather than a third party) could make the credentials more plausible.

Permissions

When issuing canary credentials from your own AWS account, it's crucial to give very careful consideration to the permissions associated with the credentials. I would highly recommend that they are granted no permissions. It's tempting to think that granting some limited additional permissions might provide threat intelligence if the credentials are ever used, but I would be very wary of the additional risks associated with this.

It is not enough to rely on the "implicit deny" of not attaching any policies to the user or role associated with the credentials. One risk is that resource-based IAM policies within your account or others could permit the credentials access even though no such identity-based policy grants access. Inadvertently adding additional policies to the principal could also permit more access than intended. For this reason an explicit deny should be specified in the policy. Consider also using SCPs or Permission Boundaries with explicit denies against the principal as another failsafe.
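A sketch of what that explicit deny might look like as an inline policy on a canary IAM user (the Sid, policy name and user name are illustrative; boto3 assumed):

```python
import json

# Inline policy explicitly denying every action on every resource.
DENY_ALL_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "CanaryExplicitDeny",
            "Effect": "Deny",
            "Action": "*",
            "Resource": "*",
        }
    ],
}

def attach_deny_all(user_name: str) -> None:
    """Attach the explicit deny as an inline policy on a canary IAM user."""
    import boto3  # deferred; calling this requires valid AWS credentials
    boto3.client("iam").put_user_policy(
        UserName=user_name,
        PolicyName="canary-explicit-deny",
        PolicyDocument=json.dumps(DENY_ALL_POLICY),
    )
```

The same statement can be mirrored in an SCP or permissions boundary as a failsafe.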

It is not possible to deny the credentials from calling sts:GetCallerIdentity.

False positives

There's a trade-off between the discoverability of these credentials and the chance that they will be inadvertently triggered, which depends on where they're placed. For instance, threat actors may specifically target ~/.aws/credentials, so you probably want to place canaries there - but perhaps not as the default profile if it's on an Engineer's machine! On your CEO's or CFO's machine? That could be a very strong signal with a very low false positive rate (depending, of course, on your CEO or CFO!).

Conclusion

There are obviously quite a few factors which you can consider when it comes to canary credentials, each with its own trade-offs. My general recommendation would be:

  • invest in automating the deployment, regular replacement and tracking of credentials in specific places that you want to improve your detections

  • aim to do this in a way which requires no manual intervention as new resources are provisioned

  • abstract the details of issuing credentials away from the devices themselves - e.g. put it behind an API

  • issue granular credentials and associate enough metadata with them to enable an effective response

I think investing in this flexibility is more valuable than spending too much time up front deciding between long-term vs short-term credentials, principal names, and the accounts from which to issue them. These decisions can ultimately be changed fairly easily if you have automated and abstracted the deployment process.

Given the increasing research interest in fingerprinting canary credentials and unauthenticated information gathering about AWS resources, it's important to be able to adapt and renew credentials easily to maintain their long term effectiveness - rather than one day find that your manually and carefully placed credentials can be trivially identified!

I'm keen to hear about your own experience with canary credentials - please do reach out!


  1. Thinkst's Canary Tokens: https://canarytokens.org/ ↩︎

  2. As you can imagine, CloudTrail evasion is a topic that's under active research, e.g. some recent research from Nick Frichette ↩︎

Thanks for reading this far, we hope you got something from this article. We're building a product to operationalise and scale cloud canaries, allowing organizations to 'assume breach' with minimal set up and maintenance. If you're interested in learning more about what that looks like, please reach out (we currently support AWS, with Azure & GCP coming soon).
