Containers and Kubernetes have revolutionized the way many teams deploy cloud-native apps. Capturing issues in a deployment pipeline and using machine intelligence to find security risks has become easier since the launch of the Cloud Native Computing Foundation’s OpenTelemetry project. Using a widely adopted open source container orchestration tool like Kubernetes has many benefits, but it also introduces new attack vectors in a CI/CD pipeline. Telemetry standardization plays a key role in addressing common or known issues, but it doesn’t solve every challenge.
Key among those challenges is continuous security. By adding more layers and complexity to application environments, containers and Kubernetes create new opportunities for attackers and new threats for Kubernetes admins to address. And although Kubernetes provides certain built-in security features, those features are hardly enough to stop all attacks on their own.
The following is an overview of Kubernetes security essentials, including the main types of security risks that exist in a Kubernetes-based environment, why securing Kubernetes is harder than securing non-containerized applications, and security best practices that teams can follow for maximizing Kubernetes container security.
Kubernetes vulnerabilities
The main reason why securing Kubernetes is challenging is that Kubernetes is a sprawling platform composed of many parts. Each of those components carries its own security issues and risks.
Here's a rundown of the key parts of a Kubernetes environment and the most common security risks that affect them:
Containers: Containers can contain malicious code that was included in their container image. They can also be subject to misconfigurations that allow attackers to gain unauthorized access under certain conditions.
Host operating systems: Vulnerabilities or malicious code within the operating systems installed on Kubernetes nodes can provide attackers with a path into Kubernetes clusters.
Container runtimes: Kubernetes supports various container runtimes. All of them could potentially contain vulnerabilities that allow attackers to take control of individual containers, escalate attacks from one container to another, and even gain control of the Kubernetes environment itself.
Network layer: Kubernetes relies on internal networks to facilitate communication between nodes, pods, and containers. It also typically exposes applications to public networks so that they can be accessed over the Internet. Both network layers could allow attackers to gain access to the cluster, or, as before, escalate attacks from one part to another.
API: The Kubernetes API, which plays a central role in allowing components to communicate and apply configurations, could contain vulnerabilities or misconfigurations that enable attacks.
Kubectl (and other management tools): Kubectl, Dashboard, and other Kubernetes management tools might be subject to vulnerabilities that allow abuse on a Kubernetes cluster.
Built-in Kubernetes security features
Kubernetes offers native security functions to protect against the threats described above, or at least to mitigate the potential impact of a breach. The main security features offered by Kubernetes include:
Role-based access control (RBAC): Kubernetes allows admins to define what it calls Roles and ClusterRoles, which specify which users can access which resources within a namespace or an entire cluster. RBAC provides one way to regulate access to resources. Modern security best practices dictate that all tools that you are using for deployment orchestration offer RBAC support.
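Pod security policies and network policies: Admins can configure pod security policies and network policies, which restrict how containers and pods behave. For example, pod security policies can be used to prevent containers from running as the root user, and network policies can restrict communication between pods. Note that the PodSecurityPolicy API was deprecated in Kubernetes 1.21 and removed in 1.25; newer clusters enforce the same kinds of restrictions through Pod Security Admission and the Pod Security Standards.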
Network encryption: Kubernetes uses Transport Layer Security (TLS) to encrypt traffic between control plane components, such as calls to the API server, providing a safeguard against eavesdropping. Traffic between pods is not encrypted by default. TLS is a widely used standard that also secures HTTPS, email, and messaging platforms.
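To make the RBAC feature more concrete, here is a minimal sketch in Python that builds a Role and a RoleBinding as plain dictionaries and prints them as JSON, which kubectl accepts alongside YAML. The role name pod-reader, the namespace dev, the user jane, and the helper functions are all hypothetical, and role_allows is a deliberately simplified check, not how the API server evaluates rules.

```python
import json
from typing import Dict

RBAC_API_GROUP = "rbac.authorization.k8s.io"


def make_pod_reader_role(namespace: str, name: str = "pod-reader") -> Dict:
    """Build a namespaced Role that only permits reading pods."""
    return {
        "apiVersion": f"{RBAC_API_GROUP}/v1",
        "kind": "Role",
        "metadata": {"namespace": namespace, "name": name},
        "rules": [
            {
                "apiGroups": [""],  # "" is the core API group
                "resources": ["pods"],
                "verbs": ["get", "list", "watch"],
            }
        ],
    }


def bind_role_to_user(role: Dict, user: str) -> Dict:
    """Build a RoleBinding that grants the given Role to one user."""
    meta = role["metadata"]
    return {
        "apiVersion": f"{RBAC_API_GROUP}/v1",
        "kind": "RoleBinding",
        "metadata": {"namespace": meta["namespace"], "name": f"{meta['name']}-binding"},
        "subjects": [{"kind": "User", "name": user, "apiGroup": RBAC_API_GROUP}],
        "roleRef": {"kind": "Role", "name": meta["name"], "apiGroup": RBAC_API_GROUP},
    }


def role_allows(role: Dict, verb: str, resource: str) -> bool:
    """Simplified check: ignores API groups, wildcards, and resourceNames."""
    return any(
        verb in rule["verbs"] and resource in rule["resources"]
        for rule in role["rules"]
    )


if __name__ == "__main__":
    role = make_pod_reader_role("dev")
    print(json.dumps([role, bind_role_to_user(role, "jane")], indent=2))
```

Granting only the get, list, and watch verbs keeps the role read-only, which is usually the safer starting point before adding write permissions.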
While these built-in Kubernetes security functions provide layers of defense against certain attacks, they do not cover all threats. Because Kubernetes is primarily a declarative orchestration platform, it offers no native protections against the following types of attacks:
Malicious code or misconfigurations inside containers or container images: To scan for these, you would have to use a third-party container scanning tool.
Shadow IT deployments or changes: You don’t specifically need malicious code to cause security concerns. Simply not going through your company’s proper change management system and bypassing compliance will cause significant Kubernetes security challenges.
Security vulnerabilities on host operating systems: Again, you would have to scan for these using other tools. And although some Kubernetes distributions (like OpenShift) integrate SELinux or similar kernel-hardening frameworks to provide more security at the host level, this is not a feature of Kubernetes itself.
Container runtime vulnerabilities: As before, Kubernetes has no way to know or alert you if a vulnerability exists within your runtime, or if an attacker is trying to exploit a vulnerability in the runtime.
Abuse of the Kubernetes API: Beyond following any RBAC and security policy settings that you define, Kubernetes does nothing to detect or respond to API abuse.
Management tool vulnerabilities or misconfigurations: Kubernetes cannot guarantee that management tools (like Kubectl) are free of security problems. The same goes for your Helm chart deployments.
Kubernetes hardening best practices
Because the built-in security features of Kubernetes are limited, it's critical for teams to take extra steps to secure their clusters. The following are some best practices for getting the most out of the security features Kubernetes offers and leveraging external tools and strategies to provide more security.
Configure pod security and network policies
As noted above, security policies can be used to enforce restrictions for pods and networks. However, it's important to understand that these policies are not configured and enabled by default in most Kubernetes distributions. Even if they are turned on by default in your distribution, you will likely need to tailor them to your needs.
A critical first step for your security team is to harden Kubernetes by setting up and enforcing these policies in a way that reflects your team's needs. How strict the policies should be depends on how secure the cluster needs to be. For example, a production cluster is likely to have more restrictive policies (such as policies that prevent write access to resources and block all non-essential network traffic) than a cluster used internally for a development pipeline or for testing. In those clusters, very strict security policies are typically less important because the cluster will not be running mission-critical apps connected to the public Internet.
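As an illustration of a restrictive production baseline, the following Python sketch builds a default-deny NetworkPolicy manifest and audits a pod spec for containers that are still allowed to run as root. The function names and the sample pod are hypothetical; the manifest fields follow the networking.k8s.io/v1 API, and the root check looks only at the runAsNonRoot setting, which is one of several relevant controls.

```python
from typing import Dict, List


def default_deny_policy(namespace: str) -> Dict:
    """Build a NetworkPolicy that blocks all ingress and egress in a namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {
            "podSelector": {},  # an empty selector matches every pod in the namespace
            "policyTypes": ["Ingress", "Egress"],  # no allow rules listed, so both are denied
        },
    }


def containers_allowed_to_run_as_root(pod: Dict) -> List[str]:
    """Return names of containers for which runAsNonRoot is not effectively true.

    A container-level securityContext overrides the pod-level one.
    """
    spec = pod.get("spec", {})
    pod_level = spec.get("securityContext", {}).get("runAsNonRoot", False)
    offenders = []
    for container in spec.get("containers", []):
        container_level = container.get("securityContext", {}).get("runAsNonRoot")
        effective = pod_level if container_level is None else container_level
        if not effective:
            offenders.append(container["name"])
    return offenders


if __name__ == "__main__":
    sample_pod = {
        "spec": {
            "securityContext": {"runAsNonRoot": True},
            "containers": [
                {"name": "app"},
                {"name": "sidecar", "securityContext": {"runAsNonRoot": False}},
            ],
        }
    }
    print(default_deny_policy("production"))
    print(containers_allowed_to_run_as_root(sample_pod))
```

Starting from a deny-all policy and then adding narrowly scoped allow rules is generally easier to reason about than trying to enumerate everything that should be blocked.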
Kubernetes host security
Kubernetes is only as secure as the operating systems that power its nodes. Because Kubernetes cannot monitor or harden host operating systems, admins need to cover that ground themselves. This is of course required for on-premises hosts as well, regardless of whether you have containerized infrastructure or not.
It's a best practice to choose a host Linux distribution with a minimal footprint because extraneous operating system apps or services that are not necessary for Kubernetes increase the attack surface needlessly. You might even run Kubernetes on bare metal, with no hypervisor layer beneath the host operating system (as is common with IoT and edge systems). It's also a best practice to enable SELinux, AppArmor, or a similar security framework on the host system. These tools add another layer of protection against certain exploits against the host. Finally, user, group, and filesystem permissions should be properly configured on the host to ensure that only user accounts that should access the Kubernetes installation can.
Keep your runtime secure and up-to-date
No container runtime used in conjunction with Kubernetes is immune to security vulnerabilities. Therefore, one can never be certain the runtime is safe. However, you can mitigate the risk by keeping the runtime up-to-date.
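One way to keep track of runtime versions is to compare what each node reports against a minimum version you consider patched. The Python sketch below assumes node objects shaped like the output of the Kubernetes API, where status.nodeInfo.containerRuntimeVersion holds a string such as containerd://1.6.8. The minimum versions and node names are hypothetical placeholders, not real advisories.

```python
from typing import Dict, List, Tuple

# Hypothetical minimum acceptable versions; replace with values from real advisories.
MINIMUM_VERSIONS: Dict[str, Tuple[int, ...]] = {
    "containerd": (1, 6, 12),
    "cri-o": (1, 26, 0),
}


def parse_runtime_version(value: str) -> Tuple[str, Tuple[int, ...]]:
    """Split a string like 'containerd://1.6.8' into ('containerd', (1, 6, 8))."""
    name, _, version = value.partition("://")
    # Drop a leading "v" and any pre-release or build suffix such as "-rc1" or "+build".
    numeric = version.lstrip("v").split("-")[0].split("+")[0]
    parts = tuple(int(p) for p in numeric.split(".") if p.isdigit())
    return name, parts


def outdated_nodes(nodes: List[Dict], minimums: Dict[str, Tuple[int, ...]]) -> List[str]:
    """Return the names of nodes whose runtime is older than the known minimum."""
    stale = []
    for node in nodes:
        raw = node["status"]["nodeInfo"]["containerRuntimeVersion"]
        name, version = parse_runtime_version(raw)
        minimum = minimums.get(name)
        # Runtimes with no configured minimum are skipped rather than flagged.
        if minimum is not None and version < minimum:
            stale.append(node["metadata"]["name"])
    return stale


if __name__ == "__main__":
    nodes = [
        {"metadata": {"name": "node-a"},
         "status": {"nodeInfo": {"containerRuntimeVersion": "containerd://1.6.8"}}},
        {"metadata": {"name": "node-b"},
         "status": {"nodeInfo": {"containerRuntimeVersion": "containerd://1.7.2"}}},
    ]
    print(outdated_nodes(nodes, MINIMUM_VERSIONS))
```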
Leverage logging and auditing to improve security
Log data provides crucial insights into potential security breaches and is also critical for investigating past events. However, while Kubernetes provides facilities for generating log data, it provides few features for analyzing or interpreting that data for any purpose, least of all for security. You therefore need to adopt third-party tools to leverage Kubernetes log data as a basis for security operations.
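As a small illustration of what interpreting log data for security can look like, the Python sketch below counts API requests that were rejected with HTTP 403. It assumes the API server has been configured to write audit events as JSON lines in the audit.k8s.io/v1 Event format; the sample event and the helper name are hypothetical, and a real pipeline would do far more than count denials.

```python
import json
from collections import Counter
from typing import Iterable


def denied_requests(audit_lines: Iterable[str]) -> Counter:
    """Count forbidden (HTTP 403) API requests, grouped by (user, verb, resource)."""
    counts: Counter = Counter()
    for line in audit_lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        # Only completed responses carry a final status code.
        if event.get("stage") != "ResponseComplete":
            continue
        if event.get("responseStatus", {}).get("code") != 403:
            continue
        user = event.get("user", {}).get("username", "unknown")
        resource = event.get("objectRef", {}).get("resource", "unknown")
        counts[(user, event.get("verb", "unknown"), resource)] += 1
    return counts


if __name__ == "__main__":
    sample = json.dumps({
        "kind": "Event",
        "apiVersion": "audit.k8s.io/v1",
        "stage": "ResponseComplete",
        "verb": "list",
        "user": {"username": "system:serviceaccount:dev:ci"},
        "objectRef": {"resource": "secrets", "namespace": "dev"},
        "responseStatus": {"code": 403},
    })
    for key, count in denied_requests([sample]).most_common():
        print(key, count)
```

A sudden rise in denied requests from one service account is the kind of signal that is easy to miss in raw logs and easy to spot once aggregated.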
Sumo Logic helps with this process by making it easy to aggregate and interpret Kubernetes logs. By installing the Sumo Logic Kubernetes App, teams can put Kubernetes logs to work to detect anomalous activity on Kubernetes nodes and networks, and thus gain critical visibility into their Kubernetes environments.
With Sumo Logic, you can put all these pieces together to build end-to-end observability in Kubernetes.
Setup and Collection - The entire collection process can be set up with a single Helm chart. Fluent Bit, Fluentd, Prometheus, and Falco are deployed throughout the cluster to collect log, metric, event, and security data.
Enrichment - Once collected, the data flows into a centralized Fluentd pipeline for metadata enrichment. Data is enriched (tagged) with details about where it originated in the cluster: the service, deployment, namespace, node, pod, container, and their labels.
Sumo Logic - Finally, the data is sent to Sumo Logic via HTTP for storage, access, and most importantly analytics.
Note: Labels - When you create objects in Kubernetes, you can assign custom key-value pairs, called labels, to each of those objects. These labels can help you organize and track additional information about each object. For example, you might have a label that represents the application name, the environment the pod is running in, or which team owns the resource. These labels are entirely flexible and can be defined as you need them. Our Fluentd plugin ensures that those labels are captured along with the logs, giving continuity between the resources you have created and the log files they are producing.
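The enrichment step can be pictured as a lookup that attaches pod metadata and labels to each log record. The Python sketch below is only an illustration of that idea, not the Fluentd plugin's actual implementation; the record shape, the pod index, and the label. prefix are all assumptions made for the example.

```python
from typing import Dict, Tuple

PodIndex = Dict[Tuple[str, str], Dict]


def enrich_record(record: Dict, pod_index: PodIndex) -> Dict:
    """Return a copy of a log record tagged with node and label metadata.

    Records from pods that are not in the index are passed through unchanged.
    """
    enriched = dict(record)  # never mutate the caller's record
    pod = pod_index.get((record["namespace"], record["pod"]))
    if pod is None:
        return enriched
    enriched["node"] = pod["node"]
    for key, value in pod.get("labels", {}).items():
        enriched[f"label.{key}"] = value
    return enriched


if __name__ == "__main__":
    # Hypothetical index keyed by (namespace, pod name).
    pod_index = {
        ("shop", "cart-7d9f"): {
            "node": "node-a",
            "labels": {"app": "cart", "env": "production", "team": "payments"},
        }
    }
    record = {"namespace": "shop", "pod": "cart-7d9f", "message": "checkout started"}
    print(enrich_record(record, pod_index))
```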
Metadata enrichment
Unified metadata enrichment is critical to building context about the data in your cluster and the components' hierarchy. Standalone Prometheus or Fluentd deployments give some context about the data (node, container, and pod level information) but not valuable insight into the service, deployment, or namespace. Sumo Logic uses OpenTelemetry to unify data collection.
Using OpenTelemetry allows Sumo Logic to eventually unify on a single collection agent for logs, metrics, and traces. This avoids lock-in to a tool-specific agent (such as those used with Jaeger, Fluent Bit, Dynatrace, New Relic, etc.). By centralizing the collection method, Sumo Logic’s solution makes it possible to correlate data and trace causality across your Kubernetes infrastructure.
A key tenet of Kubernetes monitoring is consistent metadata tagging across logs, metrics, traces, and events; without it, it would be impossible to correlate data when troubleshooting. You can use this metadata when searching through your logs and metrics and use them together to have a unified experience when navigating your machine data.
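To show why consistent tagging matters, the following Python sketch groups log and metric records by a shared metadata key. It is a toy illustration under the assumption that both signal types carry namespace and pod fields; the record shapes and the correlate helper are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def correlate(
    logs: List[Dict],
    metrics: List[Dict],
    keys: Sequence[str] = ("namespace", "pod"),
) -> Dict[Tuple, Dict[str, List[Dict]]]:
    """Group logs and metrics that share the same values for the given metadata keys."""
    grouped: Dict[Tuple, Dict[str, List[Dict]]] = defaultdict(
        lambda: {"logs": [], "metrics": []}
    )
    for log in logs:
        grouped[tuple(log[k] for k in keys)]["logs"].append(log)
    for metric in metrics:
        grouped[tuple(metric[k] for k in keys)]["metrics"].append(metric)
    return dict(grouped)


if __name__ == "__main__":
    logs = [{"namespace": "shop", "pod": "cart-7d9f", "message": "out of memory"}]
    metrics = [{"namespace": "shop", "pod": "cart-7d9f", "name": "memory_bytes", "value": 512_000_000}]
    for key, signals in correlate(logs, metrics).items():
        print(key, len(signals["logs"]), "logs,", len(signals["metrics"]), "metrics")
```

If one collector tagged pods by name and another by IP address, this join would silently produce no matches, which is the failure mode that consistent metadata avoids.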
The namespace overview gives quick visibility into pods experiencing issues, in this case pods in a CrashLoopBackOff state. As many of you may already know from troubleshooting this common error, it is often caused by exhausted resources, such as memory. Correlating signals to find causality in this case is much simpler with OpenTelemetry and a single agent.
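For readers who want to see where that state lives in the Kubernetes API, the Python sketch below scans pod objects for containers waiting with the reason CrashLoopBackOff and reports why the previous run ended. The status field names follow the Pod API; the sample pod and the helper name are hypothetical.

```python
from typing import Dict, List


def crash_looping_containers(pods: List[Dict]) -> List[Dict]:
    """Return one entry per container that is waiting in CrashLoopBackOff."""
    findings = []
    for pod in pods:
        metadata = pod.get("metadata", {})
        for status in pod.get("status", {}).get("containerStatuses", []):
            waiting = status.get("state", {}).get("waiting") or {}
            if waiting.get("reason") != "CrashLoopBackOff":
                continue
            # lastState describes how the previous run ended, e.g. "OOMKilled".
            terminated = status.get("lastState", {}).get("terminated") or {}
            findings.append({
                "namespace": metadata.get("namespace"),
                "pod": metadata.get("name"),
                "container": status.get("name"),
                "restarts": status.get("restartCount", 0),
                "last_exit_reason": terminated.get("reason"),
            })
    return findings


if __name__ == "__main__":
    sample_pods = [{
        "metadata": {"namespace": "shop", "name": "cart-7d9f"},
        "status": {"containerStatuses": [{
            "name": "cart",
            "restartCount": 14,
            "state": {"waiting": {"reason": "CrashLoopBackOff"}},
            "lastState": {"terminated": {"reason": "OOMKilled", "exitCode": 137}},
        }]},
    }]
    for finding in crash_looping_containers(sample_pods):
        print(finding)
```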
Ingestion into Sumo Logic
There is tremendous value in having this data come to a single place. With metrics serving as the smoke detector, and logs enabling us to drill down to the root cause, unifying these data sources around a common metadata language enables us to easily correlate these signals. We can pivot from the metrics data about a cluster to the events data about a cluster to the logs data about an application.
Metadata enables us to build a hierarchical view of a cluster. By connecting pods to their services or grouping nodes by cluster, it becomes easier to explore the Kubernetes stack. By tapping into the auto-discovery capabilities inherent in Prometheus, we can ensure that the hierarchy visualized in Sumo Logic is accurate and up to date.
Rich metadata enables Sumo Logic to automate building out the Explorer hierarchy of the components present in your cluster and keep the Explorer up to date as pods are added and removed.
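The idea of deriving a hierarchy from metadata can be sketched in a few lines of Python. This is an illustration of the concept only and not how the Explorer is implemented; the flat record shape and the build_hierarchy helper are assumptions made for the example.

```python
from typing import Dict, List


def build_hierarchy(records: List[Dict]) -> Dict[str, Dict[str, Dict[str, List[str]]]]:
    """Nest flat metadata records as cluster -> namespace -> deployment -> pods."""
    tree: Dict[str, Dict[str, Dict[str, List[str]]]] = {}
    for record in records:
        namespaces = tree.setdefault(record["cluster"], {})
        deployments = namespaces.setdefault(record["namespace"], {})
        pods = deployments.setdefault(record["deployment"], [])
        if record["pod"] not in pods:  # the same pod may appear in many records
            pods.append(record["pod"])
    return tree


if __name__ == "__main__":
    records = [
        {"cluster": "prod", "namespace": "shop", "deployment": "cart", "pod": "cart-7d9f"},
        {"cluster": "prod", "namespace": "shop", "deployment": "cart", "pod": "cart-x2k1"},
        {"cluster": "prod", "namespace": "shop", "deployment": "web", "pod": "web-a1b2"},
    ]
    print(build_hierarchy(records))
```

Rebuilding the tree from the latest records is one simple way to keep such a view current as pods are added and removed.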
Tying together DevOps and SecOps
Development needs to happen with security in mind. Code needs to be constructed so that logs are instructive and useful. Code analysis is critical, as is unified observability. All teams need to access the same data, yet there is still a deep division between development and security teams. As systems become more distributed, these teams (and their data) need to come together.
Kubernetes is the perfect example of how teams can work together in distributed environments. With potentially hundreds of microservices running in an application, containerization and organized distributed systems that utilize Kubernetes become a necessity. Kubernetes is as distributed as it gets, and the architecture has a lot of built-in tooling that makes it easy to pull data from highly federated infrastructure.
DaemonSets, for example, enable standardized monitoring across nodes. Most deployments can collect the data, but fail by sending it to various, often disparate, backend tools for analysis.
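As a rough picture of how a DaemonSet standardizes collection, the Python sketch below builds a manifest that runs one collector pod per node and mounts the host's log directory read-only. The image name example.com/log-collector:1.0, the resource names, and the helper function are hypothetical; the manifest structure follows the apps/v1 DaemonSet API.

```python
import json
from typing import Dict


def log_collector_daemonset(namespace: str, image: str) -> Dict:
    """Build a DaemonSet that schedules one log collector pod on every node."""
    labels = {"app": "log-collector"}
    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {"name": "log-collector", "namespace": namespace, "labels": dict(labels)},
        "spec": {
            # The selector must match the pod template labels.
            "selector": {"matchLabels": dict(labels)},
            "template": {
                "metadata": {"labels": dict(labels)},
                "spec": {
                    "containers": [{
                        "name": "collector",
                        "image": image,
                        "volumeMounts": [
                            {"name": "varlog", "mountPath": "/var/log", "readOnly": True}
                        ],
                    }],
                    # hostPath exposes the node's own log directory to the pod.
                    "volumes": [{"name": "varlog", "hostPath": {"path": "/var/log"}}],
                },
            },
        },
    }


if __name__ == "__main__":
    manifest = log_collector_daemonset("monitoring", "example.com/log-collector:1.0")
    print(json.dumps(manifest, indent=2))
```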
The security team is only looking at compliance and threat data. But wouldn't it be useful for them to know when deployments happen? What are the metrics across the cluster? These are useful investigative tools.
The development team might just look at the logging output to troubleshoot their application, but it is also critical that they look at the performance metrics of their application running in production.
Finally, the IT ops team needs observability data for the cloud infrastructure to ensure smooth deployments, but also needs to understand the apps running in that infrastructure.
Security visibility is available at the cluster level alongside log, metric, and event data.
Your security and development team can take this further by providing data about security policies and controls and relevant events in the context of the Kubernetes mental model.
Learn more about Sumo Logic’s DevSecOps platform for Kubernetes.
Kubernetes Observability - Free ebook
Monitoring, troubleshooting and securing Kubernetes with Sumo Logic
Metadata Enrichment
Unified metadata enrichment is critical to building context about the data in your cluster and the hierarchy of the components present. Standalone Prometheus or Fluentd deployments give some context about the data (node, container, and pod level information) but not valuable insight into the service, deployment, or namespace. Sumo Logic uses Fluentd as a centralized metadata pipeline to query the API server and gain rich context about the data being passed into Sumo Logic.
By centralizing metadata enrichment, the Sumo Logic solution reduces the load on the Kubernetes API server and ensures consistent metadata tagging across logs, metrics, and events, without which it would be impossible to correlate data when troubleshooting. You can use this metadata when searching through your logs and your metrics and use them together to have a unified experience when navigating your machine data.
Tying together DevOps and SecOps
We can take this further by providing data about security-relevant events in the context of the Kubernetes mental model. Below we can see the top security rules triggered in the cluster overview. Zoom in and we see this same data for the service or namespace and so on.
By displaying security information within the natural hierarchies of Kubernetes, we can enable a consistent view across DevOps and SecOps to build closer and more efficient DevSecOps cooperation.
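A ranking of triggered rules like the one described above can be approximated with a short aggregation. The Python sketch below assumes Falco is emitting JSON events, each with a rule name and an output_fields map that includes k8s.ns.name; the sample events and the top_rules helper are hypothetical and are not how Sumo Logic computes its dashboards.

```python
import json
from collections import Counter
from typing import Iterable, List, Optional, Tuple


def top_rules(
    falco_lines: Iterable[str],
    namespace: Optional[str] = None,
    limit: int = 5,
) -> List[Tuple[str, int]]:
    """Rank triggered rules by count, optionally restricted to one namespace."""
    counts: Counter = Counter()
    for line in falco_lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        fields = event.get("output_fields") or {}
        if namespace is not None and fields.get("k8s.ns.name") != namespace:
            continue
        counts[event.get("rule", "unknown")] += 1
    return counts.most_common(limit)


if __name__ == "__main__":
    events = [
        {"rule": "Terminal shell in container", "priority": "Notice",
         "output_fields": {"k8s.ns.name": "shop", "k8s.pod.name": "cart-7d9f"}},
        {"rule": "Terminal shell in container", "priority": "Notice",
         "output_fields": {"k8s.ns.name": "ops", "k8s.pod.name": "agent-1"}},
        {"rule": "Write below etc", "priority": "Error",
         "output_fields": {"k8s.ns.name": "shop", "k8s.pod.name": "web-a1b2"}},
    ]
    lines = [json.dumps(e) for e in events]
    print(top_rules(lines))                    # cluster-wide view
    print(top_rules(lines, namespace="shop"))  # zoomed in to one namespace
```

Passing a namespace mirrors the zoom-in step: the same events are counted, just filtered to one level of the hierarchy.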
Kubernetes security, application security, and network security
Zooming out, we can also take our Kubernetes security data and insert it in our high-level security dashboards. Combining infrastructure security, network security, full-stack security, and Kubernetes security gives us comprehensive visibility into the entire security story.