Policy Enabled Kubernetes with Open Policy Agent
Addressing Common Concerns with Cloud Computing and DevOps
Organizations move to the public cloud, in large part, to address the common "infrastructure provisioning" concerns shared by all application teams. The cloud greatly reduces the "undifferentiated heavy lifting" of standing up servers, networking, and security that is needed just to deliver applications and features. Moving to containers and Kubernetes can be seen as the next evolution in allowing development teams to focus on their work, not on infrastructure.
Tackling these common needs is also a large part of a continued DevOps journey. In fact, DevOps is all about reducing variability and human error, increasing repeatability, and implementing practices underpinned by policies to deliver applications and features reliably and efficiently. All of this applies to Kubernetes as well. And we know that addressing common concerns for feature teams, through cloud computing and DevOps, is what enables application teams to deliver faster, which in turn enables businesses to move faster.
Common Application Concerns
As we embrace modern approaches to provisioning infrastructure, we also use patterns to address common concerns in building applications and delivering services and APIs. Much as Aspect-Oriented Programming (AOP) satisfied "cross-cutting concerns" a decade or so ago, we now address modern application design and construction by adopting patterns such as the Twelve-Factor methodology. Policy enablement is another common concern we can leverage to better manage applications and their associated environments.
What is Policy?
As stated on the Open Policy Agent documentation site:
“All organizations have policies. Policies are essential to the long-term success of organizations because they encode important knowledge about how to comply with legal requirements, work within technical constraints, avoid repeating mistakes, and so on.
In their simplest form, policies can be applied manually based on rules that are written down or conventions that are unspoken but permeate an organization’s culture. Policies may also be enforced with application logic or statically configured at deploy time.”
Simply put, policies are the boundaries within which we deliver applications and infrastructure. These boundaries drive the acceptance criteria for our deliverables, and our definition of done. We are measured, in part, by how well we meet the requirements of these policies, and by how effectively we enable our customers to stay within policy when they use our solutions.
Automated Policies to Satisfy Common Concerns
Part of a successful DevOps formula is making sure that we follow internal policies and procedures when pushing changes to computing environments. Not all policy enforcement happens in automated DevOps pipelines, though. For example, Cloud Custodian, an open source rules engine, is used to automate the implementation of policies that maintain a well-managed and secure cloud. These policies place guardrails around cloud usage without adversely affecting cloud users.
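To make this concrete, here is a minimal sketch of what such a guardrail can look like as a Cloud Custodian policy; the resource type and tag name are illustrative, not taken from any particular environment:

policies:
  - name: stop-unowned-ec2
    description: Stop EC2 instances that are missing an owner tag.
    resource: ec2
    filters:
      - "tag:owner": absent
    actions:
      - stop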
The Case for General Purpose Policy Enablement
The type of automated policy enforcement implemented by Cloud Custodian and similar tools should be considered for other application settings as well. Executing within prescribed policies is a common concern of cloud-native applications, and Open Policy Agent (OPA) offers a general-purpose approach to policy enablement.
According to the docs, Open Policy Agent (OPA) is:
“…a lightweight general-purpose policy engine that can be co-located with your service. You can integrate OPA as a sidecar, host-level daemon, or library.
Services offload policy decisions to OPA by executing queries. OPA evaluates policies and data to produce query results (which are sent back to the client). Policies are written in a high-level declarative language and can be loaded into OPA via the filesystem or well-defined APIs.”
Implementing Admission Control Policies in Kubernetes
Automated policy enforcement is also needed as we move to containers and container orchestration platforms. As the Tech Lead on our Enterprise Kubernetes Platform Team, I have been researching and developing patterns for managing policies in our clusters.
One of the policy control points within Kubernetes is admission control. With Kubernetes admission controllers, we can intercept requests to the Kubernetes API server before the objects they carry, which express the desired cluster state, are persisted to the etcd key/value store.
The pattern that I have investigated can be found here, with its companion GitHub repos here and here.
Implementing a Kubernetes Deployment Admission Controller
The OPA use case I will focus on is controlling where container images may be sourced from in Kubernetes Deployment manifests.
As part of a sound governance and compliance stance, it is important to understand, direct, and even control the image sources for your Kubernetes workloads. With OPA and Kubernetes validating admission controllers, event-driven, dynamically configured, automated policy-based decisions can prevent unwanted images from being deployed into your clusters. In this solution, an `opa` service is connected to a Kubernetes `ValidatingAdmissionWebhook`, and listens for Deployment `CREATE` and `UPDATE` events sourced by the Kubernetes API server.
The solution involves creating a Kubernetes object graph consisting of the following objects: a Namespace, a ServiceAccount with a ClusterRole and ClusterRoleBinding, a TLS Secret, a Service, a Deployment running the `opa` and `kube-mgmt` containers, policy ConfigMaps, and a ValidatingWebhookConfiguration.
In general operation, the OPA server is a RESTful server that exposes services to produce and consume both event data and policies. Since OPA is domain agnostic, any data can be sent to the OPA server, to be evaluated by any policy, as long as the policy matches the event data passed in.
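For instance, OPA's REST API can be exercised directly. A sketch, assuming an OPA server listening locally on port 8181 (as configured with --insecure-addr in the deployment below) and an illustrative policy path:

# load (or replace) a policy document
curl -X PUT --data-binary @deployment_create_whitelist.rego \
  http://localhost:8181/v1/policies/deployment-create-whitelist

# evaluate arbitrary input data against the loaded policies
curl -X POST -d '{"input": {"request": {"operation": "CREATE"}}}' \
  http://localhost:8181/v1/data/kubernetes/admission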
In my example solution, OPA policies are stored in the `opa` namespace as Kubernetes ConfigMap resources. The policies are loaded into the `opa` container by a sidecar workload known as `kube-mgmt`. `kube-mgmt` reads ConfigMaps as they are applied to the `opa` namespace, and compiles them to verify proper syntax.
Upon successful compilation, the policy is stored in the `opa` container by `kube-mgmt`. Additionally, `kube-mgmt` is configured to periodically pull resource metadata that the `opa` service might need to correctly evaluate API server events, in case an event does not contain all the data required by the logic defined in the evaluation policy. The resources that `kube-mgmt` scans are configured with container spec arguments in a Kubernetes deployment manifest.
Preparing the OPA Artifacts — Step by Step
First, we apply the namespace and auth resources for the solution, as seen below. The `opa` ServiceAccount uses the `opa` ClusterRoleBinding to bind to the `opa` ClusterRole for access to the permissions contained therein.
apiVersion: v1
kind: Namespace
metadata:
  name: opa
  labels:
    app: opa
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: opa
  namespace: opa
  labels:
    app: opa
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: opa
  labels:
    app: opa
rules:
  - apiGroups: [""]
    resources:
      - namespaces
    verbs:
      - get
      - list
      - watch
  - apiGroups: ["extensions"]
    resources:
      - ingresses
    verbs:
      - get
      - list
      - watch
  - apiGroups: ["apps"]
    resources:
      - deployments
    verbs:
      - get
      - list
      - watch
  - apiGroups: [""]
    resources:
      - configmaps
    verbs:
      - get
      - list
      - patch
      - watch
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: opa
  labels:
    app: opa
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: opa
subjects:
  - kind: ServiceAccount
    name: opa
    namespace: opa
Next, we build the OPA secrets and server config files, and apply the `opa-server` secret to the `opa` namespace:
openssl genrsa -out ca.key 2048
openssl req -x509 -new -nodes -key ca.key -days 100000 -out ca.crt -subj "/CN=admission_ca"
cat >server.conf <<EOF
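[ req ]
prompt = no
req_extensions = req_ext
distinguished_name = dn
[ dn ]
CN = opa.opa.svc
[ req_ext ]
subjectAltName = DNS:opa.opa.svc
EOF
# NOTE: the heredoc body above and the commands below are a sketch, assuming the
# standard OPA admission-control setup: a server certificate whose CN/SAN is
# opa.opa.svc (matching the opa Service in the opa namespace), stored in the
# opa-server TLS secret that the deployment below mounts. Exact fields may differ.
openssl genrsa -out server.key 2048
openssl req -new -key server.key -out server.csr -config server.conf
openssl x509 -req -in server.csr -CA ca.crt -CAkey ca.key -CAcreateserial \
  -out server.crt -days 100000 -extensions req_ext -extfile server.conf
kubectl create secret tls opa-server --cert=server.crt --key=server.key --namespace opa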
Next, we deploy the OPA solution containers, services, and default policy ConfigMap with the following command:

kubectl apply -f admission-controller.yaml

The YAML can be seen below:
kind: Service
apiVersion: v1
metadata:
  name: opa
  namespace: opa
  labels:
    app: opa
spec:
  selector:
    app: opa
  ports:
    - name: https
      protocol: TCP
      port: 443
      targetPort: 443
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: opa
  name: opa
  namespace: opa
spec:
  selector:
    matchLabels:
      app: opa
  replicas: 1
  template:
    metadata:
      labels:
        app: opa
      name: opa
    spec:
      serviceAccountName: opa
      containers:
        - name: opa
          image: //openpolicyagent-opa:0.9.1
          resources:
            limits:
              cpu: 500m
              memory: 512Mi
            requests:
              cpu: 500m
              memory: 512Mi
          args:
            - "run"
            - "--server"
            - "--tls-cert-file=/certs/tls.crt"
            - "--tls-private-key-file=/certs/tls.key"
            - "--addr=0.0.0.0:443"
            - "--insecure-addr=127.0.0.1:8181"
          volumeMounts:
            - readOnly: true
              mountPath: /certs
              name: opa-server
        - name: kube-mgmt
          image: //openpolicyagent-kube-mgmt:0.7
          resources:
            limits:
              cpu: 500m
              memory: 512Mi
            requests:
              cpu: 500m
              memory: 512Mi
          args:
            - "--replicate-cluster=v1/namespaces"
            - "--replicate=extensions/v1beta1/ingresses"
            - "--replicate=apps/v1/deployments"
      volumes:
        - name: opa-server
          secret:
            secretName: opa-server
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: opa-default-system-main
  namespace: opa
  labels:
    app: opa
data:
  main: |
    package system

    import data.kubernetes.admission

    main = {
      "apiVersion": "admission.k8s.io/v1beta1",
      "kind": "AdmissionReview",
      "response": response,
    }

    default response = {"allowed": true}

    response = {
      "allowed": false,
      "status": {
        "reason": reason,
      },
    } {
      reason = concat(", ", admission.deny)
      reason != ""
    }
The last part of this YAML applies a ConfigMap that contains the main OPA policy and default response. This policy is used as an entry point for policy evaluations, and returns `allowed: true` if no policies match the inbound data.
The Admission Controller Webhook
The OPA admission controller is a validating admission controller, and works as a server webhook. When a request is made to the Kubernetes API server to create an object that is under policy admission control, such as a Deployment resource, the webhook fires, and the `opa` and `kube-mgmt` containers work together to evaluate the API server event and resource data against policies to perform the admission review.
The YAML, seen below, sets up the webhook to listen for Kubernetes API server `CREATE` and `UPDATE` events for the included list of resources, regardless of API group or API version. The `namespaceSelector` at the bottom of the YAML file allows us to exclude certain sensitive namespaces from this validation solution.
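For example, a sensitive namespace can be excluded by labeling it to match the selector's NotIn expression (kube-system is used here purely as an illustration):

kubectl label namespace kube-system opa-webhook=ignore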
We `base64` encode the `ca.crt` file from the previous OpenSSL operations, and add it to `webhook-configuration.yaml`. This allows the webhook to securely communicate with the `opa` service.
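A one-liner along these lines produces the value for the caBundle field (flags vary slightly between Linux and macOS base64 implementations):

base64 ca.crt | tr -d '\n'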
Note: Unlike most centrally managed Kubernetes secrets, the OPA secrets are used only between the webhook and the `opa` service. As such, they are decoupled from the cluster secrets and CA, and can be regenerated as needed to reconfigure the OPA solution.
kind: ValidatingWebhookConfiguration
apiVersion: admissionregistration.k8s.io/v1beta1
metadata:
  name: opa-validating-webhook
  namespace: opa
  labels:
    app: opa
webhooks:
  - name: validating-webhook.openpolicyagent.org
    rules:
      - operations: ["CREATE", "UPDATE"]
        apiGroups: ["*"]
        apiVersions: ["*"]
        resources:
          - pods
          - services
          - replicasets
          - deployments
          - daemonsets
          - cronjobs
          - jobs
          - ingresses
          - roles
          - statefulsets
          - podtemplates
          - configmaps
          - secrets
    clientConfig:
      caBundle: ${base64 ca.crt}
      service:
        namespace: opa
        name: opa
    namespaceSelector:
      matchExpressions:
        - {key: opa-webhook, operator: NotIn, values: [ignore]}
In the `clientConfig` section, the `opa` namespace and service are referenced. `CREATE` and `UPDATE` operations will cause this webhook to fire.
Rego: The OPA Policy Language
Rego is OPA's native query language. It is similar to Datalog, but it also supports structured documents, like YAML and JSON. Policies that OPA uses to review resources are written in Rego, and usually saved as `*.rego` files.
Below is the `deployment_create_whitelist.rego` policy, which creates a whitelist of acceptable registries for the image property in a Kubernetes Deployment spec. The `deny[msg]` block is the entry point into this policy; it pulls data from the API server event.
package kubernetes.admission

import data.kubernetes.namespaces

deny[msg] {
    input.request.kind.kind = "Deployment"
    input.request.operation = "CREATE"
    registry = input.request.object.spec.template.spec.containers[_].image
    name = input.request.object.metadata.name
    namespace = input.request.object.metadata.namespace
    not reg_matches_any(registry, valid_deployment_registries)
    msg = sprintf("invalid deployment, namespace=%q, name=%q, registry=%q", [namespace, name, registry])
}

valid_deployment_registries = {registry |
    whitelist = ""
    registries = split(whitelist, ",")
    registry = registries[_]
}

reg_matches_any(str, patterns) {
    reg_matches(str, patterns[_])
}

reg_matches(str, pattern) {
    contains(str, pattern)
}
After data are collected from the event, they are compared to the whitelist via a call to the `reg_matches_any(…)` block. The call stack uses the `reg_matches(…)` block to check whether the registry variable value (from the container image property) contains a value from the registry whitelist. If a whitelisted value is not found, the policy evaluation responds with a `deny` and returns the reason constructed in the `msg` variable.
Note: Even though the webhook fires for `CREATE` and `UPDATE` API server events, the policy above only evaluates the JSON payloads of Deployment `CREATE` events. As a reminder, if an API server event is sent to OPA for evaluation and no matching policy can be found, OPA will respond with a status of `allowed: true`. This tells the API server to continue with the write operation to etcd.
Policies as ConfigMaps
The Rego files are stored in the Kubernetes cluster as ConfigMap resources. When a ConfigMap resource is created in the `opa` namespace, the `kube-mgmt` sidecar container reads the ConfigMap and compiles the policy. Upon successful compilation, the `kube-mgmt` sidecar annotates the ConfigMap with an ok status, as seen below.
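Creating such a ConfigMap is a single kubectl command; a sketch, with an illustrative ConfigMap name:

kubectl create configmap deployment-create-whitelist \
  --from-file=deployment_create_whitelist.rego --namespace opa

On success, the annotation takes the form:

openpolicyagent.org/policy-status: {"status":"ok"}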
Next, the `kube-mgmt` sidecar loads the policy contents from the ConfigMap into the `opa` container as a policy. Once the policy is installed in the `opa` container, it can be evaluated against Deployment resources, looking for a whitelisted registry in the image spec. If the registry in the image spec is not in the policy's whitelist, the deployment will fail, as seen below.
Error from server (invalid deployment, namespace="app-ns", name="app-name", registry="app-registry"): error when creating "app-deployment.yaml": admission webhook "validating-webhook.openpolicyagent.org" denied the request...
If the `kube-mgmt` sidecar cannot successfully compile the Rego policy file, it stamps the ConfigMap with a failure status annotation and does not load the policy into the `opa` container.
openpolicyagent.org/policy-status: {"status":"error","error":{"code":"invalid_parameter","message":"error(s) occurred while compiling module(s)","errors":[{"code":"rego_parse_error","message":"no match found","location":{"file":"opa/deployment-create-whitelist/main","row":5,"col":10},"details":{}}]}}
Additional Use Cases
Open Policy Agent can be used to evaluate the JSON payload of many API server events, and multiple policies can be used to evaluate the same API event. One of the core features of Kubernetes is how it selects resources, driven by labels. Furthermore, governance and compliance within a cluster can be driven by properly labeling resources. It makes perfect sense to use Open Policy Agent to evaluate API server event payloads to ensure that new and reconfigured objects are properly labeled. This ensures that no workloads are introduced into the cluster without the correct labeling scheme.
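A label-enforcement policy follows the same shape as the registry whitelist above. Here is a minimal sketch, where the required label names app and owner are illustrative assumptions:

package kubernetes.admission

# deny deployments that are missing any of the required labels
deny[msg] {
    input.request.kind.kind = "Deployment"
    required_labels = {"app", "owner"}
    provided_labels = {label | input.request.object.metadata.labels[label]}
    missing = required_labels - provided_labels
    count(missing) > 0
    msg = sprintf("deployment is missing required labels: %v", [missing])
}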
Under the covers, Open Policy Agent is a RESTful server that takes in data and the policies used to evaluate that data. Given its domain-agnostic nature, Open Policy Agent could be deployed into a Kubernetes cluster to provide services to other workloads that need data validation, beyond the use case of validating Kubernetes resources.
Tips
While working with Open Policy Agent, and Kubernetes Validating Admission Controllers, we discovered a few potential issues of which readers should be aware:
- Because of how Kubernetes resource deletions fire the `UPDATE` event, policies need to be carefully crafted to account for unwanted behavior in non-deletion `UPDATE` events.
- When configuring corporate proxy connections on nodes, `.svc` may need to be added to the `NO_PROXY` environment export to prevent erroneously routing `opa.opa.svc` calls outside of the cluster (see the export example below).
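For example, the node-level proxy exclusion might look like:

export NO_PROXY=$NO_PROXY,.svc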
Conclusion
Along with moving to Cloud Computing, moving to Kubernetes requires thoughtful design to ensure that governance, compliance, and security controls are included. Using policies to apply rules-based control of resources is a dynamic approach to managing Kubernetes configuration. Policy Enablement is a common concern among teams looking to automate the management of Kubernetes.
The domain-agnostic nature of Open Policy Agent makes it well suited for policy management and evaluation. In conjunction with Kubernetes validating admission controllers, Open Policy Agent can reduce the opportunity for unwanted resource configurations to enter Kubernetes clusters. And, as a RESTful server, Open Policy Agent could take on other data-validation roles in the enterprise.