An Intro to Designing Secure CI/CD Pipelines
In our previous post, we discussed the recent security incident at Codecov and the subsequent investigation at Mattermost. As a follow-up, we want to share some of the basic design principles, as well as a handful of more technical tips and tricks around CI/CD pipeline security, that helped Mattermost come out of the incident unscathed.
High-level design principles
If you’re working in the software industry, you’ve probably already heard and even made use of the following principles:
- The principle of least privilege
- Separation of privileges
- Defense in depth
These are common concepts with use-cases in all types of information systems. Their application to CI/CD might not be immediately obvious, however.
Least privilege
A component in a CI/CD pipeline, as in any other system, should only have access to the minimum number of resources it needs to complete its job.
Need to upload build artifacts to S3? Great, just inject the keys into the environment, but first make sure they only grant access to the one bucket you need. Need to clone a private Git repository over SSH? Check that the SSH key can’t access the rest of your GitHub organization.
Need to upload coverage information to Codecov? Make sure the same job in your pipeline can’t also access your S3 buckets and private repositories.
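To make the S3 case concrete, here is a minimal sketch of a policy that only allows uploads to a single bucket. The helper name `upload_only_policy`, the bucket name, and the prefix are made up for illustration; the policy document itself follows the standard AWS IAM JSON layout. Treat it as a starting point under those assumptions, not as the policy Mattermost uses.

```python
import json


def upload_only_policy(bucket: str, prefix: str = "") -> dict:
    """Build an IAM policy document that allows uploads to one bucket only."""
    # Hypothetical helper: bucket and prefix are whatever your job needs.
    resource = f"arn:aws:s3:::{bucket}/{prefix}*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Upload only: no read, list, or delete permissions.
                "Action": ["s3:PutObject"],
                "Resource": [resource],
            }
        ],
    }


# Example bucket name and prefix are placeholders.
policy = upload_only_policy("example-build-artifacts", "releases/")
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

Keys attached to a policy like this are of little use to an attacker who wants to read your other buckets, which is the whole point of least privilege.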
Separation of privileges
Closely related to the principle of least privilege, separation of privileges essentially means that one component (or CI job) should only do one thing–after all, if you do everything in the same job, the least privilege possible is, well, everything.
So make sure the task that uploads build artifacts is isolated from the task that clones the private repository, and especially from the one running Codecov.
On CircleCI, one of the CI platforms Mattermost heavily uses, both the principle of least privilege and separation of privileges are easily enforced with proper use of Workflows, Jobs, and Contexts. Similar concepts exist in most CI platforms.
Defense in depth
There are plenty of ways to implement defense in depth. The general idea is simply being a little bit overzealous when protecting your crown jewels, whatever they are. In a CI/CD pipeline, the most important task is usually building the release and signing it, so that’s also a task to pay careful attention to.
At Mattermost we made the call to split our pipelines into three: one set of pipelines running on CircleCI, one on gitlab.com, and one on a self-hosted GitLab instance. The CircleCI infrastructure is public, so it’s mostly used for community PRs and things that are essentially public anyway. At the other end of the spectrum is the internal GitLab instance, which sits behind a VPN, entirely shielded from the outside world, and builds releases. Code is mirrored between the three, making it accessible to all tiers.
Similar approaches, or more lightweight variations of them, might be easy to apply in your organization. Are you a maintainer of a major open source project and building everything on CircleCI? Consider creating a separate repository that hosts nothing but the release build configurations, then using that to split the pipelines between PR builds and release builds.
CircleCI Security Checklist
At Mattermost, CircleCI pipelines are the most exposed part of our CI infrastructure: that’s where pull requests are tested, so they run arbitrary untrusted code and publish their logs for community consumption. Because these pipelines are so exposed, it makes sense to pay special attention to how they are configured.
Here are a few useful tips for configuring CircleCI in a secure way.
Don’t pass secrets to forks
First and foremost, if your source repository is public and accepts PRs, check the options under CircleCI Project Settings / Advanced. If “Pass secrets to builds from forked pull requests” is enabled, your CI/CD pipeline is most likely leaking secrets to any malicious actor that opens a new PR.
The setting comes with a big warning label for a reason. Even if your environment variables aren’t configured with any sensitive keys, do not enable this: it also changes how CircleCI handles its internal keys and shared storage.
Make good use of Contexts
Contexts on CircleCI are a powerful security feature. They allow passing secrets only to specific Jobs and controlling who can access them. Label your Contexts clearly and use each Context for only one purpose. If your Context is particularly sensitive, limit the access to a specific group in your organization.
Proper application of the “separation of privileges” concept by splitting tasks into Jobs enables you to make better use of Contexts, and good use of Contexts means you’re applying the principle of least privilege correctly.
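One way to check the "one Context, one purpose" rule is to audit which Jobs request which Contexts. The sketch below assumes your CircleCI config has already been parsed into a plain dictionary (for example with a YAML parser) and that Jobs attach Contexts through the `context` key in a workflow's `jobs` list. The function names and the example config are hypothetical.

```python
from collections import defaultdict


def contexts_by_job(config: dict) -> dict:
    """Map each Context name to the set of Jobs that request it."""
    usage = defaultdict(set)
    for workflow in config.get("workflows", {}).values():
        if not isinstance(workflow, dict):
            continue  # skip non-workflow keys such as a "version" entry
        for entry in workflow.get("jobs", []):
            if isinstance(entry, str):
                continue  # bare Job name, no Context attached
            for job_name, options in entry.items():
                contexts = (options or {}).get("context", [])
                if isinstance(contexts, str):
                    contexts = [contexts]  # a single Context may be a string
                for ctx in contexts:
                    usage[ctx].add(job_name)
    return dict(usage)


def shared_contexts(config: dict) -> dict:
    """Return only the Contexts that more than one Job can read."""
    return {
        ctx: sorted(jobs)
        for ctx, jobs in contexts_by_job(config).items()
        if len(jobs) > 1
    }


# Hypothetical parsed config: "publish" can read the Codecov Context too.
example_config = {
    "workflows": {
        "build-and-release": {
            "jobs": [
                "test",
                {"coverage": {"context": "codecov-upload"}},
                {"publish": {"context": ["s3-artifacts", "codecov-upload"]}},
            ]
        }
    }
}

shared = shared_contexts(example_config)
print(shared)
```

A Context showing up in this report is not automatically a problem, but each entry is worth a second look: does every Job on the list really need those secrets?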
Know the difference between a PR build and a branch build
If you take the “don’t pass secrets to forks” advice, it’s also important to understand when it has an effect–and especially when it doesn’t.
CircleCI only has one way of identifying a forked PR build: if the build was triggered by a GitHub webhook and the trigger is a pull request event, the build is a PR build. If the build is triggered any other way, it’s no longer a PR build, even if the Git reference is the same:
- If an organization member manually triggers a CI build for a PR, it’s not a PR build
- If a bot triggers a build for a PR using the CircleCI REST API, it’s not a PR build
- Only if the build is triggered automatically by a GitHub PR using the official CircleCI integration is it a PR build
This is an important point because of a very significant implication: If a build is triggered manually or by custom automation, it always gets secrets injected into it. Members of an organization should always review the changes before manually triggering a build on a PR to ensure no malicious code is present.
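The rule is easy to get wrong, so here it is written out as a small decision function. This only models the behavior described above; it is not CircleCI's implementation, and the `Trigger` type and its field values are labels we invented for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trigger:
    """How a build was started (field values are illustrative labels)."""
    source: str      # "github_webhook", "manual", or "api"
    event: str       # e.g. "pull_request" or "push"
    from_fork: bool  # True if the code comes from a forked repository


def is_forked_pr_build(trigger: Trigger) -> bool:
    # Only a webhook-delivered pull request event counts as a PR build.
    return (
        trigger.source == "github_webhook"
        and trigger.event == "pull_request"
        and trigger.from_fork
    )


def secrets_injected(trigger: Trigger, pass_secrets_to_forks: bool = False) -> bool:
    """Model of whether a build gets secrets, per the rules in this section."""
    if is_forked_pr_build(trigger):
        return pass_secrets_to_forks
    # Manual and API-triggered builds are not PR builds, so they get secrets.
    return True


webhook_pr = Trigger("github_webhook", "pull_request", from_fork=True)
manual_run = Trigger("manual", "pull_request", from_fork=True)
bot_run = Trigger("api", "pull_request", from_fork=True)

print(secrets_injected(webhook_pr), secrets_injected(manual_run), secrets_injected(bot_run))
```

The same forked code is safe in the first case and dangerous in the other two, purely because of how the build was started.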
Even if your build contains no secrets that could be exfiltrated, non-PR builds can always access shared storage such as dependency caches.
Be careful with caches
…which gets us to our last point: caches. Famously one of the hard things in Computer Science, they’re no easier when it comes to your CI/CD pipelines. As the CircleCI documentation points out, caching dependencies in particular is a powerful tool for improving build times, but it comes with its risks.
In addition to the potential filesystem path and permission issues named in the documentation, there is a major security concern: cache poisoning.
Most dependency management systems, regardless of programming language, verify the integrity of your dependencies in one way or another when installing them: it could be by checking a hash or a cryptographic signature, or it could be by only downloading from trusted servers over HTTPS. Whatever the mechanism is, usually the verification happens when the dependencies are being downloaded. Once they’re already in your filesystem, the dependencies are assumed to be intact.
Dependency caching on CircleCI works by grabbing the downloaded dependencies from your Job executor’s filesystem and uploading them into an S3 bucket. When the cache is restored, the contents of the S3 bucket are simply dumped back into the filesystem. This bypasses the verification mechanisms of dependency management systems.
So what’s the problem? Malicious code in any Job can write to any of your caches and inject malware into a dependency. Another Job, even one in a completely separate build, could poison the caches of a sensitive build step, potentially leading to malware in a release binary or secrets being exfiltrated from the build environment.
The only way caches are isolated from each other is based on the PR build mechanism: forked PRs cannot access the caches of non-PR builds, or the other way around. This is also where the previous tips come in: passing secrets to PR builds must be turned off, and builds for PRs must not be manually triggered without first reviewing the changes. Of course, the safest bet is simply to not use caches at all in sensitive build steps.
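If you do keep a cache in a sensitive step, one possible mitigation is to re-verify the restored files yourself before using them. The sketch below compares every file in a restored directory against a manifest of SHA-256 digests. It is not a CircleCI feature: the function names are ours, and it assumes the manifest comes from somewhere the cache writer cannot touch, such as a file committed to the repository.

```python
import hashlib
from pathlib import Path


def hash_tree(root: Path) -> dict:
    """Return {relative path: SHA-256 hex digest} for every file under root."""
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            name = path.relative_to(root).as_posix()
            digests[name] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def verify_cache(root: Path, manifest: dict) -> list:
    """List files that are unexpected, missing, or modified versus the manifest."""
    actual = hash_tree(root)
    problems = []
    for name in sorted(set(actual) | set(manifest)):
        if name not in manifest:
            problems.append(f"unexpected: {name}")
        elif name not in actual:
            problems.append(f"missing: {name}")
        elif actual[name] != manifest[name]:
            problems.append(f"modified: {name}")
    return problems
```

A build step would call `verify_cache` right after restoring the cache and fail the Job if the returned list is not empty. Hashing every file costs some of the time the cache saved, so this is a trade-off rather than a free fix.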