There is quite a lot of confusion about labels and annotations in Kubernetes. Both are key-value pairs attached to objects, so what exactly is the difference, and why did the creators of Kubernetes introduce both of them? This article explores this question in detail.
It is interesting to note that both labels and annotations were present in Kubernetes right from the start. Looking at the official Kubernetes documentation can leave one perplexed; here is what it says about labels:
Labels are key/value pairs attached to objects such as Pods. They are intended to specify identifying attributes of objects that are meaningful and relevant to users but do not directly imply semantics to the core system. Labels can be used to organize and select subsets of objects. They can be attached to objects at creation time and subsequently added and modified at any time. Each object can have a set of key/value labels defined. Each key must be unique for a given object.
"metadata": {
"labels": {
"key1" : "value1",
"key2" : "value2"
}
}
Labels allow for efficient queries and watches and are ideal for use in UIs and CLIs. Non-identifying information should be recorded using annotations.
And here is what it says about annotations:
You can use Kubernetes annotations to attach arbitrary non-identifying metadata to objects. Clients such as tools and libraries can retrieve this metadata.
This is all a bit cryptic, but in a nutshell, it means that, broadly speaking, labels are used to identify objects, and annotations are used to attach properties to objects. Of course, there are nuances and overlaps, and we will go over the details in this article.
Summary of Key Differences Between Kubernetes Labels and Annotations
Consideration | Labels | Annotations |
---|---|---|
Communicate with built-in Kubernetes controllers? | Yes | No |
Identify objects and allow filtering and selecting? | Yes | No |
Communicate with non-built-in Kubernetes controllers? | Possible, but not the primary purpose, and essentially not used for that purpose in practice | Yes |
Provide information to humans and automated processes about Kubernetes objects? | Possible, but not the primary purpose | Yes |
Note that the official Kubernetes documentation recommends well-known labels and annotations. The best practice when deploying resources on Kubernetes is to follow these recommendations.
Practical Usage Examples of Kubernetes Labels vs Annotations
To illustrate the various use cases we will explore in this article, we’ll use a minikube cluster to run some commands. To create a local minikube cluster, you just need to install the latest versions of minikube, VirtualBox (or some other minikube backend), and kubectl. Once those tools are installed, run the following to create a minikube cluster:
$ minikube start
Once minikube has started, you can then run the following command to check that everything is running well:
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
minikube Ready control-plane 2m8s v1.27.4
Communicating with Built-in Kubernetes Controllers
The primary purpose of labels is to communicate with built-in Kubernetes controllers; indeed, this is why they were introduced in the first place. Built-in controllers use labels to identify the resources they manage, based on the presence and values of those labels.
A bit of clarification might be needed here. Built-in controllers are developed and released as part of Kubernetes; they are essential to the basic functions of Kubernetes. Other types of controllers are optional and installed as add-ons to a given Kubernetes cluster.
Let’s have a look at a simple example: a deployment. A deployment is a Kubernetes object that manages a number of identical pods, called replicas, through another Kubernetes object called a ReplicaSet. A Deployment object is managed by the Deployment controller, which creates a ReplicaSet object, which in turn is managed by the ReplicaSet controller. Each controller does its best to ensure that the actual state of its objects matches the desired state (in this case, the number of pods).
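The label plumbing behind this chain is easiest to see in a Deployment manifest. The sketch below is a hand-written equivalent of the deployment we create next (the labels that Kubernetes actually generates may differ slightly); the key point is that spec.selector.matchLabels must match the labels in the pod template:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: apache
spec:
  replicas: 2
  selector:
    matchLabels:
      app: apache        # the ReplicaSet selects pods carrying this label
  template:
    metadata:
      labels:
        app: apache      # must match the selector above
    spec:
      containers:
      - name: httpd
        image: httpd
```

If the selector and the template labels did not match, the API server would reject the Deployment, because the controller would otherwise create pods it could never find again.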
Let’s create a simple deployment, like this:
$ kubectl create deployment apache --image=httpd --replicas=2
deployment.apps/apache created
$ kubectl get deployment
NAME READY UP-TO-DATE AVAILABLE AGE
apache 2/2 2 2 32s
The Kubernetes Deployment controller creates a ReplicaSet:
$ kubectl get replicaset
NAME DESIRED CURRENT READY AGE
apache-55785dd485 2 2 2 59s
The ReplicaSet controller then creates and manages pods, identifying them by their labels. Note that the label app=apache was applied automatically when the deployment was created:
$ kubectl get pod --label-columns=app
NAME READY STATUS RESTARTS AGE APP
apache-55785dd485-7dlxx 1/1 Running 0 114s apache
apache-55785dd485-dbxgp 1/1 Running 0 114s apache
The ReplicaSet controller uses the pods’ labels to identify which pods are part of the replica set. For example, let’s change the app label of one of these pods from ‘apache’ to ‘another’:
$ kubectl label --overwrite pod apache-55785dd485-7dlxx app=another
pod/apache-55785dd485-7dlxx labeled
$ kubectl get pod --label-columns=app
NAME READY STATUS RESTARTS AGE APP
apache-55785dd485-7dlxx 1/1 Running 0 2m53s another
apache-55785dd485-9c7sh 1/1 Running 0 20s apache
apache-55785dd485-dbxgp 1/1 Running 0 2m53s apache
The ReplicaSet controller expects to find two pods matching its requirements (in this case, the label app with the value ‘apache’), but after our change, it sees only one pod with the correct label. This causes it to spin up a new replica with the app=apache label to maintain the desired state. This illustrates how Kubernetes built-in controllers use labels to identify resources. However, built-in controllers do not look at annotations at all: in the example above, you could add as many annotations as you want to the pods, and nothing would change for the ReplicaSet controller or the pods it controls.
Filtering and Selecting Objects
Another use for labels is closely related to the one we just saw: they let you instruct the Kubernetes API to filter and select Kubernetes objects based on the presence and values of certain labels.
To continue with the example started in the previous section, we can select the pods with a label app with a value of another, like so:
$ kubectl get pod -l app=another
NAME READY STATUS RESTARTS AGE
apache-55785dd485-7dlxx 1/1 Running 0 3m33s
We can also use labels to perform other operations, such as deleting an unwanted pod:
$ kubectl delete pod -l app=another
pod "apache-55785dd485-7dlxx" deleted
Finally, using labels can be a convenient way to view the combined logs of an application’s replicas without having to reference each pod’s full name in separate commands:
$ kubectl logs -l app=apache
AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using 10.244.0.25. Set the 'ServerName' directive globally to suppress this message
AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using 10.244.0.25. Set the 'ServerName' directive globally to suppress this message
[Sun Jul 28 08:50:52.754519 2024] [mpm_event:notice] [pid 1:tid 1] AH00489: Apache/2.4.62 (Unix) configured -- resuming normal operations
[Sun Jul 28 08:50:52.754608 2024] [core:notice] [pid 1:tid 1] AH00094: Command line: 'httpd -D FOREGROUND'
AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using 10.244.0.23. Set the 'ServerName' directive globally to suppress this message
AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using 10.244.0.23. Set the 'ServerName' directive globally to suppress this message
[Sun Jul 28 08:41:32.372255 2024] [mpm_event:notice] [pid 1:tid 1] AH00489: Apache/2.4.62 (Unix) configured -- resuming normal operations
[Sun Jul 28 08:41:32.372361 2024] [core:notice] [pid 1:tid 1] AH00094: Command line: 'httpd -D FOREGROUND'
This illustrates how we can instruct the Kubernetes API to select and filter based on the value of labels. Note that the presence of labels, regardless of value, can also be used for filtering:
$ kubectl get pod -l app
NAME READY STATUS RESTARTS AGE
apache-55785dd485-9c7sh 1/1 Running 0 4m20s
apache-55785dd485-dbxgp 1/1 Running 0 7m13s
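Beyond simple equality, the label selector syntax also supports set-based requirements. Objects that take a selector, such as ReplicaSets, can express these with matchExpressions; the fragment below is a hypothetical sketch (the tier label is invented for illustration):

```yaml
# Hypothetical selector fragment: matches pods whose app label is
# either "apache" or "another", and which carry a "tier" label
# with any value.
selector:
  matchExpressions:
  - key: app
    operator: In
    values: ["apache", "another"]
  - key: tier
    operator: Exists
```

The same set-based syntax works on the command line, for example kubectl get pod -l 'app in (apache,another)'.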
Communicating with Non-built-in Controllers
Non-built-in Kubernetes controllers are controllers that do not come as part of Kubernetes and are installed on top of an existing cluster. Ingress controllers are a typical example of non-built-in controllers because Kubernetes does not provide controllers for ingresses by default. Examples of such ingress controllers are the NGINX Ingress Controller and the AWS Load Balancer Controller.
Non-built-in controllers make use of annotations to pick up the configurations they need to apply to the corresponding Kubernetes objects. In other words, annotations carry configuration information.
As an example, let’s install the NGINX Ingress Controller and see how annotations are used to communicate with it. The NGINX Ingress Controller can be installed in several ways. Since we are using minikube, the simplest way is just to enable it using the command line (when using other types of Kubernetes clusters, you might need to install it using the official manifest file or Helm chart):
$ minikube addons enable ingress
After a few minutes, minikube should tell you that the NGINX Ingress Controller has been installed successfully. You can verify that it has been installed properly like this:
$ kubectl -n ingress-nginx get pod
NAME READY STATUS RESTARTS AGE
ingress-nginx-admission-create-blt2q 0/1 Completed 0 3m8s
ingress-nginx-admission-patch-qs8x2 0/1 Completed 0 3m8s
ingress-nginx-controller-7799c6795f-gcg52 1/1 Running 0 3m8s
Next, let’s create a Service for our previous Deployment, like this:
$ kubectl expose deployment/apache --port=80
service/apache exposed
$ kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
apache ClusterIP 10.108.167.187 <none> 80/TCP 3s
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 13m
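Services are another built-in example of label-based selection: the Service we just exposed routes traffic to every pod matching its selector, which kubectl expose copied from the deployment. Its spec looks roughly like this (a sketch; the generated manifest contains additional defaulted fields):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: apache
spec:
  selector:
    app: apache     # traffic is routed to pods carrying this label
  ports:
  - port: 80
    targetPort: 80
```

If a pod loses the app=apache label, as in the earlier experiment, the Service simply stops sending traffic to it.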
You can quickly test the apache deployment for connectivity by using a port forward, which listens locally on port 8080 for HTTP traffic and forwards it to port 80 on the service:
$ kubectl port-forward svc/apache 8080:80
Forwarding from 127.0.0.1:8080 -> 80
Now, in another terminal, use curl to request a response from apache:
$ curl http://localhost:8080
<html><body><h1>It works!</h1></body></html>
Now that we’ve tested connectivity to Apache, let’s create an Ingress to map incoming traffic from outside the cluster to internal cluster services. In the following example, we are mapping requests for the root path “/” to port 80 on the apache service. Save the following manifest as ingress.yaml:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: apache
  annotations:
    nginx.ingress.kubernetes.io/server-snippet: |
      return 200 "testing";
spec:
  ingressClassName: nginx
  rules:
  - http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: apache
            port:
              number: 80
Before deploying it, we’ll need to start a minikube tunnel:
$ minikube tunnel
Note: Run the above command in a separate terminal window as it needs to keep running.
Then, in your original terminal, continue with the following steps. First, enable snippet annotations in the NGINX Ingress Controller (with recent versions of the minikube ingress addon, the controller and its ConfigMap live in the ingress-nginx namespace, as we saw above):
$ kubectl -n ingress-nginx patch configmap ingress-nginx-controller --patch '{"data":{"allow-snippet-annotations": "true"}}'
Restart the ingress controller to apply the change, and wait for it to become ready:
$ kubectl -n ingress-nginx rollout restart deployment ingress-nginx-controller
$ kubectl -n ingress-nginx rollout status deployment ingress-nginx-controller
Now apply our ingress resource and verify that it is ready:
$ kubectl apply -f ingress.yaml
$ kubectl get ingresses
NAME CLASS HOSTS ADDRESS PORTS AGE
apache nginx * 80 3s
Test the ingress to see our custom response from the snippet:
$ curl http://localhost
testing
Please note the use of annotations in the ingress manifest file. The NGINX Ingress Controller reads these annotations to configure the ingress. Unlike labels, they are not used for identification, and the NGINX Ingress Controller does not use labels to decide how to configure the ingress.
Attaching Information and Metadata to Kubernetes Objects
Labels can provide information about Kubernetes objects, but typically, they do not include anything beyond the metadata used to identify such objects. Additional information must be attached in the form of annotations. In other words, using labels as a source of information is incidental to their primary purpose, which is to allow built-in Kubernetes controllers to identify objects. Let’s have a look at how to do that with Helm (you will need to install it if you have not done so already).
Let’s create a basic chart:
$ helm create tst
Create a values.yaml file in your current directory with the following content (these values will override the chart’s defaults):
podAnnotations:
  prometheus.io/scrape: "true"
  prometheus.io/port: "10254"
Now let’s install this Helm chart and then have a look at the annotations attached to the created pod:
$ helm install tst ./tst -f values.yaml
...
$ kubectl get pod -l app.kubernetes.io/name=tst -o jsonpath='{.items[*].metadata.annotations}'
{"prometheus.io/port":"10254","prometheus.io/scrape":"true"}
As you can see, annotations containing information/metadata about the pod have been attached to it. In this case, an appropriately configured Prometheus server can use these annotations to scrape metrics from this pod on port 10254. This example illustrates how metadata attached to a Kubernetes object can be used by an automated process to perform certain actions.
Similarly, annotations could be added with a human audience in mind. For example, annotations could show the owner of the Kubernetes object or which department to contact for support about this object.
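Such human-oriented annotations might look like the following sketch (the keys and values here are hypothetical; organizations typically define their own annotation prefix rather than using a well-known one):

```yaml
metadata:
  annotations:
    # Hypothetical organization-specific annotations for human readers
    example.com/owner: "platform-team"
    example.com/support-contact: "#platform-support"
```

Because annotations are ignored by built-in controllers, adding metadata like this is always safe: it changes nothing about how Kubernetes schedules or manages the object.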
Well-known Labels and Annotations
The official Kubernetes documentation lists many “well-known” labels and annotations. “Well-known,” in this case, means that Kubernetes and many tools across the Kubernetes ecosystem will recognize these labels and annotations as having a certain meaning.
Here are some examples of well-known labels:
- app.kubernetes.io/instance: A unique name to identify Kubernetes objects belonging to an application. The default Helm chart uses the release name to populate this label.
- app.kubernetes.io/name: The name of the application the object belongs to. The default Helm chart uses the chart name to populate this label. The same chart could be deployed multiple times with different release names, so it is possible to have objects with the same app.kubernetes.io/name labels but different values for the app.kubernetes.io/instance labels.
- app.kubernetes.io/managed-by: The name of the tool that manages this object. Once again, the default Helm chart sets this label to “Helm” to indicate that Helm manages this object.
- app.kubernetes.io/version: The version of the deployed application.
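Put together, the recommended labels on an object deployed by Helm might look like this (the values shown are illustrative):

```yaml
metadata:
  labels:
    app.kubernetes.io/name: apache            # application name
    app.kubernetes.io/instance: apache-prod   # release name
    app.kubernetes.io/version: "2.4.62"       # deployed application version
    app.kubernetes.io/managed-by: Helm        # tool managing the object
```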
Some examples of well-known annotations:
- cluster-autoscaler.kubernetes.io/safe-to-evict: Set to “true” to inform the Cluster Autoscaler that it is safe to terminate this pod if it runs on a node that needs to be terminated, even if other rules would normally prevent this action.
- kubernetes.io/description: A free-form text description of the object.
- kubernetes.io/enforce-mountable-secrets: When set to “true” on a ServiceAccount, pods running as that ServiceAccount may only reference the Secrets listed in the ServiceAccount’s secrets field; otherwise, Kubernetes will refuse to create the pod.
- service.kubernetes.io/topology-mode: When set to “Auto” on a Service, this annotation enables topology-aware routing, which takes the cluster’s network topology into account when deciding how to route traffic.
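As a concrete example, marking a pod as safe for the Cluster Autoscaler to evict is just a matter of adding the annotation to the pod’s metadata (the pod name and image below are purely illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: batch-worker          # hypothetical pod name
  annotations:
    cluster-autoscaler.kubernetes.io/safe-to-evict: "true"
spec:
  containers:
  - name: worker
    image: busybox            # illustrative image
    command: ["sleep", "3600"]
```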
As you can see once again, labels identify objects and are used by built-in controllers, while annotations contain informational data and configuration.
Best practices for using Labels and Annotations
When deciding which labels and annotations to add to a certain Kubernetes object, the list of “well-known” labels and annotations found in the official Kubernetes documentation should definitely be your first port of call.
Regarding labels specifically, you should at least consider the recommended ones. Usually, it is a good idea to keep things simple and purposeful by deliberately selecting which labels to add and not adding too many. In addition, you should anticipate how some labels could be useful in the future, such as a label indicating the version of the software being deployed.
You typically won’t have to guess what annotations to add—this choice will be dictated by which tools you are using. These tools will have documentation telling you which annotations to add and with which values in order to achieve certain goals. If you have some tools or scripts that your organization is developing internally, again, the required annotations will typically be documented by the team that developed the tool.
It will be a very rare occurrence to be confused as to whether you need to add a label or an annotation. Typically, when using a tool that is not part of Kubernetes, its documentation will tell you what you need to do. And you won’t use annotations for Kubernetes objects controlled by Kubernetes’ built-in controllers.
Conclusion
Labels and annotations are a common source of confusion, and the official Kubernetes documentation is rather cryptic when it comes to explaining the difference between the two.
The essence of this difference is that labels identify objects, while annotations carry non-identifying information, such as configurations. Labels are typically used by Kubernetes built-in controllers, and annotations are used by pretty much everything else. The use of labels by components other than built-in controllers is typically incidental to their primary use, which is to help built-in controllers identify and select Kubernetes objects.
A practical knowledge of the differences between labels and annotations can help engineers adhere to Kubernetes best practices. Using the well-known labels and annotations and applying instructions found in various documentation will eliminate any doubt about whether a certain key/value pair must be added as a label or annotation.