One risk in deploying fleets of powerful and flexible clusters on constantly changing infrastructure like Kubernetes is that mistakes happen. Even minute manual errors that slip past review can have substantial impacts on the health and security of your clusters. Such mistakes, in the form of misconfigurations, are reportedly the leading cause of cloud breaches. And with everything that can go wrong in the containerized world, mistakes of this kind are virtually guaranteed to occur.
The question, then, is how developers and platform engineers can, under today’s accelerated development timeframes, minimize these errors — if not eliminate them entirely for the vast majority of common cases.
For many dev teams and platform engineers, an emerging solution is GitOps: the practice of using Git repositories as a single source of truth for configuration and deployment specs in your build pipeline. An “as-code” practice, GitOps gives developers declarative, version-controlled, peer-reviewed descriptions of the architecture that is deployed, and lets them use pull requests to propose any changes they hope to make to that architecture. During review, teams check the configuration itself and also ensure that any infrastructure-as-code changes adhere to company security and compliance policies.
For these reasons, GitOps is emerging as a best practice for many devops teams in the pursuit of delivering error-free code, faster. However, humans remain at the core of these new practices, and with humans comes fallibility. The next logical step, then, is to automate security and compliance checks — using policy-as-code to verify infrastructure-as-code changes.
Leveraging GitOps to reduce cloud-native errors
Every cloud-native developer or platform engineer knows the feeling of putting a cluster configuration in place that looks good, only to find out in peer review (or, even worse, after deploying it into production) that its behavior is less than ideal. In the GitOps model, peer review has replaced traditional strategies like change control review boards, and it prevents many of those eventualities.
Still, in today’s devops environments, manual configuration checks can become significant bottlenecks, and they create more work for people who are often already overburdened. Moreover, given the complexity and scale of platforms like Kubernetes, it can be a challenge for teams to manually apply security and compliance policies (which are typically stored in PDFs, wikis, or team members’ heads) to every proposed infrastructure-as-code change. In other words, peer review not only slows development but also still lets some errors through.
Using OPA to automate security and compliance checks
In the spirit of devops, we can both remove the “bottleneck factor” of change management and reduce the possibility of manual error by abstracting security and compliance policies and automating those checks in the GitOps process. In fact, that’s just what developers have been doing with the open source project Open Policy Agent, or OPA.
OPA is a domain-agnostic, general purpose policy engine that is becoming the de facto standard for creating cloud-native policy. In essence, OPA is a building block for creating and implementing consistent and flexible policies across the stack. It allows teams to translate the policies stored in PDFs, wikis, and people’s heads into policy-as-code and to enforce these policies directly. And OPA works in any cloud-native environment, including CI/CD pipelines.
OPA allows platform teams to automate configuration checks as well as security and compliance policies, in this case as part of the established GitOps process, leveraging a CI/CD pipeline tool like Jenkins or Spinnaker. When implemented in the CI/CD pipeline, OPA allows application developers to get immediate feedback on the worthiness of their production changes as they relate to configuration, security, and compliance requirements, before their colleagues begin to review them — and long before they ever get into production.
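As a rough illustration of such a pipeline step, the sketch below sends a proposed configuration to an OPA server through OPA's REST Data API and turns the answer into an exit code. It assumes OPA is running locally on its default port (8181); the policy path `pipeline/deny` and the helper names are hypothetical, and the policy is assumed to return a list of violation messages.

```python
import json
import urllib.request
from typing import Callable

# Assumption: an OPA server is reachable locally on its default port.
OPA_URL = "http://localhost:8181"
# Hypothetical policy path; the real one depends on your policy package names.
POLICY_PATH = "pipeline/deny"


def http_post_json(url: str, body: dict) -> dict:
    """Send a JSON POST request and decode the JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def violations(config: dict, post: Callable[[str, dict], dict] = http_post_json) -> list:
    """Ask OPA to evaluate the proposed config and return its deny messages."""
    # The Data API expects the document to evaluate under the "input" key.
    reply = post(f"{OPA_URL}/v1/data/{POLICY_PATH}", {"input": config})
    # A missing "result" means the decision is undefined (for example, a wrong
    # policy path). Fail closed rather than silently approving the change.
    if "result" not in reply:
        raise RuntimeError("OPA returned no decision for " + POLICY_PATH)
    return list(reply["result"])


def gate(config: dict, post: Callable[[str, dict], dict] = http_post_json) -> int:
    """Return an exit code: 0 lets the pipeline proceed, 1 blocks the change."""
    found = violations(config, post)
    for message in found:
        print(f"policy violation: {message}")
    return 1 if found else 0
```

A CI job in a tool like Jenkins could call `gate` on each changed manifest and fail the build on a nonzero result, so the developer sees the violation messages before any reviewer is involved. The `post` parameter exists only so the check can be exercised without a live OPA server.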
Indeed, when policy-as-code checks are automated in a GitOps process, infrastructure-as-code changes can only, by definition, reach production if and when they conform to said policies. In this way, OPA allows platform teams to truly “shift security left” to the earliest possible stages of development — and not leave security as a “rubber stamp on code” before deployment.
Why use policy-as-code in a GitOps process?
There are many reasons why developers and platform engineers integrate policy-as-code into their current GitOps process. For one, this strategy helps to accelerate application development and deployment, because it helps solve many of the change management hurdles that slow development pipelines.
Policy-as-code saves platform engineers and devops teams from having to manually review hundreds of lines of configuration code — a process that, by all accounts, machines can and should be doing. On the development side, policy-as-code helps app developers learn and understand what the company’s configuration, security, and compliance policies are and how to correctly abide by them. For example, a developer may not remember — or may have no reason to know — when deploying a load balancer onto Kubernetes in AWS is or is not sanctioned. Policy-as-code solves this problem automatically.
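To make the load balancer example concrete, here is a minimal sketch of what such a rule might check, written in Python purely for illustration (with OPA, the rule itself would be written in OPA's policy language, Rego). The function name and the idea of a per-namespace allowlist are assumptions, not a policy taken from any real company.

```python
def deny_unsanctioned_load_balancer(manifest: dict, sanctioned_namespaces: set) -> list:
    """Return violation messages for a Kubernetes manifest (empty list if none)."""
    # Only Service objects can request a load balancer.
    if manifest.get("kind") != "Service":
        return []
    if manifest.get("spec", {}).get("type") != "LoadBalancer":
        return []

    metadata = manifest.get("metadata", {})
    # Kubernetes places objects without an explicit namespace in "default".
    namespace = metadata.get("namespace", "default")
    # Hypothetical rule: load balancers are allowed only in listed namespaces.
    if namespace in sanctioned_namespaces:
        return []

    name = metadata.get("name", "<unnamed>")
    return [
        f"Service {name!r} in namespace {namespace!r} may not use type LoadBalancer"
    ]
```

The message returned to the developer names the object and the rule it broke, which is how the check doubles as documentation of the policy.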
OPA is also a significant time-saver when it comes to creating policy. Some companies already apply policies in their build pipelines, typically using a number of ad-hoc scripts; OPA enables teams instead to take those implicit policies and make them declarative and explicit. For instance, one company, prior to using OPA, wanted to import all of its existing security and corporate policies into its production pipeline, but given the coding hours involved, the project just wasn’t feasible. With OPA, the company was able to instead create new policies that were then enforceable across each of its cloud environments, saving significant time down the road.
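The difference between policies buried in scripts and policies stated declaratively can be sketched in a few lines. In this toy Python model, each rule is plain data that one small evaluator interprets; the rule IDs, the registry name `registry.example.com`, and the operator names are all hypothetical, and a real deployment would express the rules in OPA's own policy language instead.

```python
# Hypothetical policies, written as data instead of being scattered in scripts.
POLICIES = [
    {
        "id": "no-privileged",
        "path": "securityContext.privileged",
        "op": "not_equals",
        "value": True,
        "message": "containers must not run privileged",
    },
    {
        "id": "approved-registry",
        "path": "image",
        "op": "starts_with",
        "value": "registry.example.com/",
        "message": "images must come from registry.example.com",
    },
]

# The small, fixed set of checks that a rule is allowed to name.
OPERATORS = {
    "not_equals": lambda actual, expected: actual != expected,
    "starts_with": lambda actual, expected: isinstance(actual, str)
    and actual.startswith(expected),
}


def lookup(document: dict, dotted_path: str):
    """Follow a dotted path through nested dicts; return None if any key is absent."""
    current = document
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def evaluate(container: dict, policies: list = POLICIES) -> list:
    """Return one message per policy that the container spec fails."""
    failures = []
    for policy in policies:
        actual = lookup(container, policy["path"])
        if not OPERATORS[policy["op"]](actual, policy["value"]):
            failures.append(f'{policy["id"]}: {policy["message"]}')
    return failures
```

Because the rules are data, they can be reviewed, versioned, and reused across pipelines without anyone reading the evaluator's code.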
Just as importantly, policy-as-code is, of course, code, which means that these automated checks happen in a way that is already familiar and comfortable for the developers and engineers who are building and deploying cloud-native software. If a platform team makes declarative infrastructure-as-code changes to their cloud environment using a tool like Terraform, for example, the addition of policy-as-code checks in their GitOps process is seamless. In other words, policy-as-code allows devops teams to deal with security and compliance policy in their preferred medium: as software.
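For the Terraform case, one possible shape for such a check is to inspect the JSON form of a plan (as produced by `terraform show -json`) before it is applied. The Python sketch below reads the plan's `resource_changes` list; the specific rule (no SSH ingress open to `0.0.0.0/0` on an `aws_security_group`) and the function name are illustrative assumptions, not a policy from the article.

```python
def open_ssh_violations(plan: dict) -> list:
    """Flag planned security groups that expose port 22 to the whole internet."""
    messages = []
    for change in plan.get("resource_changes", []):
        if change.get("type") != "aws_security_group":
            continue
        # "after" is null when a resource is being deleted, so fall back to {}.
        after = change.get("change", {}).get("after") or {}
        for rule in after.get("ingress") or []:
            from_port = rule.get("from_port")
            to_port = rule.get("to_port")
            if from_port is None or to_port is None:
                continue
            covers_ssh = from_port <= 22 <= to_port
            open_to_world = "0.0.0.0/0" in (rule.get("cidr_blocks") or [])
            if covers_ssh and open_to_world:
                messages.append(f"{change.get('address')}: SSH is open to 0.0.0.0/0")
    return messages
```

Run against the plan in a pull-request build, a non-empty result would block the merge in the same way a failing unit test does.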
As Kubernetes and container-based strategies become the clear future of cloud-native development, devops teams are flocking to GitOps strategies to accelerate development timeframes and reduce the likelihood of cloud misconfigurations. In the same vein, it’s time for teams to adopt a similar “as-code” approach to policy. What they stand to gain is a single, scalable way to manage policy throughout the application lifecycle and distribute it across every pipeline, cluster, and cloud in the organization.
Tim Hinrichs is a co-founder of the Open Policy Agent project and CTO of Styra. Before that, he co-founded the OpenStack Congress project and was a software engineer at VMware. Tim spent the last 18 years developing declarative languages for different domains such as cloud computing, software-defined networking, configuration management, web security, and access control. He received his Ph.D. in Computer Science from Stanford University in 2008.
—
New Tech Forum provides a venue to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content.