Glossary

Term      Definition
cri       Container Runtime Interface, the contract between k8s and the container runtime
crictl    CLI for inspecting and debugging CRI compatible container runtimes
ctr       Debugging tool for containerd
oci       Open Container Initiative, formalised the image-spec and runtime-spec specifications
nerdctl   Docker-like CLI experience for containerd

Certification tips

Bookmarks

https://kubernetes.io/docs/reference/kubectl/conventions/

kubectl imperative commands

The --dry-run=client flag previews the objects that would be sent to the cluster, without actually submitting them.

kubectl run nginx --image=nginx --dry-run=client -o yaml
kubectl create deployment --image=nginx nginx --replicas=4 --dry-run=client -o yaml > nginx-deployment.yaml
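
The generated manifest can then be reviewed, tweaked and submitted for real:

kubectl apply -f nginx-deployment.yaml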

Docker vs containerD

k8s was once tightly coupled to Docker, but decoupled itself by formalising the contract with the runtime as the Container Runtime Interface (CRI), which in turn leverages the Open Container Initiative (OCI) image and runtime specs.

containerd is an independent, self-installable binary with no dependency on Docker or broader Docker tooling. While it is a minimal runtime, the containerd project provides two troubleshooting utilities, ctr and nerdctl:

ctr

For basic containerd debugging.

ctr images pull docker.io/library/redis:alpine
ctr run docker.io/library/redis:alpine redis
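
A few more ctr subcommands for listing what the runtime knows about (note that ctr works against containerd namespaces, not Kubernetes namespaces):

ctr images list
ctr containers list
ctr tasks list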

nerdctl

Provides a Docker-compatible CLI for containerd, supporting Docker Compose and additional features such as:

  • Encrypted container images
  • Lazy Pulling
  • P2P image distribution
  • Image signing and verifying
  • Namespaces in Kubernetes
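
A couple of illustrative nerdctl commands (the image, name and port here are just examples); nerdctl can also target the k8s.io namespace that Kubernetes uses:

# run a container, Docker style
nerdctl run -d --name webserver -p 8080:80 nginx
# list the containers Kubernetes has created via containerd
nerdctl --namespace k8s.io ps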

crictl

Built by the Kubernetes community, crictl provides a runtime-agnostic CLI for inspecting and debugging CRI compatible container runtimes.

crictl pull busybox
crictl images
crictl ps -a
crictl exec -i -t 3e025dd50a72d956c4f14881fbb5b1080c9275674e95fb67f965f6478a957d60 ls
crictl logs 3e025dd50a72d956c4f1
crictl pods

crictl works through a default list of runtime endpoints (sockets) to connect to:

unix:///run/containerd/containerd.sock
unix:///run/crio/crio.sock
unix:///var/run/cri-dockerd.sock

These can be explicitly defined:

crictl --runtime-endpoint unix:///run/containerd/containerd.sock ps -a
export CONTAINER_RUNTIME_ENDPOINT=unix:///run/containerd/containerd.sock

etcd

The key-value store that backs all control plane state. The api-server is the only component that interacts with etcd directly.

Listens for clients on port 2379 (peers on 2380) and ships with the etcdctl CLI. etcd can be run as a systemd unit etcd.service:

ExecStart=/usr/local/bin/etcd \\
   --name ${ETCD_NAME} \\
   --cert-file=/etc/etcd/kubernetes.pem \\
   --key-file=/etc/etcd/kubernetes-key.pem \\
   --peer-cert-file=/etc/etcd/kubernetes.pem \\
   --peer-key-file=/etc/etcd/kubernetes-key.pem \\
   --trusted-ca-file=/etc/etcd/ca.pem \\
   --peer-trusted-ca-file=/etc/etcd/ca.pem \\
   --peer-client-cert-auth \\
   --client-cert-auth \\
   --initial-advertise-peer-urls https://${INTERNAL_IP}:2380 \\
   --listen-peer-urls https://${INTERNAL_IP}:2380 \\
   --listen-client-urls https://${INTERNAL_IP}:2379,https://127.0.0.1:2379 \\
   --advertise-client-urls https://${INTERNAL_IP}:2379 \\
   --initial-cluster-token etcd-cluster-0 \\
   --initial-cluster controller-0=https://${CONTROLLER0_IP}:2380,controller-1=https://${CONTROLLER1_IP}:2380 \\
   --initial-cluster-state new \\
   --data-dir=/var/lib/etcd

kubeadm on the other hand will create an etcd-master pod in the kube-system namespace, which can be explored with a kubectl exec:

kubectl exec etcd-master -n kube-system -- etcdctl get / --prefix --keys-only

High availability (HA) etcd is supported, see --initial-cluster CLI option.
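
In an HA setup, cluster membership can be listed with etcdctl member list; a sketch from inside the kubeadm etcd pod, using the certificate paths covered in the etcdctl section below:

kubectl exec etcd-master -n kube-system -- sh -c "ETCDCTL_API=3 etcdctl member list --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key"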

etcdctl

Major API changes were made in v3; etcdctl however supports both the older v2 and the newer v3 APIs.

The API version it targets is shown by etcdctl --version (v2) or etcdctl version (v3) and can be changed by setting the ETCDCTL_API env var.

export ETCDCTL_API=3
./etcdctl put key1 value1
./etcdctl get key1

For encrypted (TLS) communication between the etcdctl client and the etcd server, certificate files must be specified; these are available in the etcd-master pod at the following paths:

--cacert /etc/kubernetes/pki/etcd/ca.crt
--cert /etc/kubernetes/pki/etcd/server.crt
--key /etc/kubernetes/pki/etcd/server.key

A complete working exec example:

kubectl exec etcd-master -n kube-system -- sh -c "ETCDCTL_API=3 etcdctl get / --prefix --keys-only --limit=10 --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/server.crt  --key /etc/kubernetes/pki/etcd/server.key"

kube-apiserver

The gateway to interfacing with the control plane.

Running the service involves a ton of options; on a kubeadm cluster see the static pod manifest /etc/kubernetes/manifests/kube-apiserver.yaml.

Or the systemd unit definition /etc/systemd/system/kube-apiserver.service:

ExecStart=/usr/local/bin/kube-apiserver \\
   --advertise-address=${INTERNAL_IP} \\
   --allow-privileged=true \\
   --apiserver-count=3 \\
   --authorization-mode=Node,RBAC \\
   --bind-address=0.0.0.0 \\
   --enable-admission-plugins=Initializers,NamespaceLifecycle,NodeRestriction,LimitRanger,ServiceAccount,DefaultStorageClass,ResourceQuota \\
   --enable-swagger-ui=true \\
   --etcd-servers=https://127.0.0.1:2379 \\
   --event-ttl=1h \\
   --experimental-encryption-provider-config=/var/lib/kubernetes/encryption-config.yaml \\
   --runtime-config=api/all \\
   --service-account-key-file=/var/lib/kubernetes/service-account.pem \\
   --service-cluster-ip-range=10.32.0.0/24 \\
   --service-node-port-range=30000-32767 \\
   --v=2

kubeadm will provision a kube-apiserver-master pod in the kube-system namespace.
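
To see the options in effect on a running kubeadm cluster, inspect the static pod manifest or the process itself:

cat /etc/kubernetes/manifests/kube-apiserver.yaml
ps aux | grep kube-apiserver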

kube-controller-manager

Kubernetes ships with several built-in controllers that handle different aspects of maintaining the desired state of the cluster. These controllers run inside the kube-controller-manager, except for a few, like the cloud controller manager, which run separately in cloud environments.

  • Node Controller: Manages the lifecycle of nodes, detects and responds to node failures.
  • Replication Controller: Ensures the specified number of pod replicas are running at any time.
  • Deployment Controller: Manages deployments, ensuring that a desired state of replicas and pods is maintained, handles rolling updates and rollbacks.
  • ReplicaSet Controller: Manages ReplicaSets, ensuring the specified number of pod replicas for a ReplicaSet.
  • StatefulSet Controller: Manages StatefulSets, which are used for stateful applications, ensures stable network identities and persistent storage for pods.
  • DaemonSet Controller: Ensures a pod runs on all or specific nodes in the cluster, used for node-level tasks like logging or monitoring agents.
  • Job Controller: Manages batch jobs, ensuring they complete successfully, a job represents a finite task that runs to completion.
  • CronJob Controller: Manages CronJobs, which run Jobs on a time-based schedule.
  • Service Controller: Creates and manages load balancers in cloud environments when a Service of type LoadBalancer is created.
  • EndpointSlice Controller: Manages EndpointSlice objects for improving scalability and reliability over traditional Endpoints.
  • Horizontal Pod Autoscaler (HPA) Controller: Adjusts the number of pods in a deployment, replica set, or stateful set based on observed CPU/memory usage or custom metrics.
  • Vertical Pod Autoscaler (VPA) Controller: Recommends or automatically adjusts the CPU/memory resource requests and limits for pods.
  • Namespace Controller: Manages namespace lifecycle, ensures resources within a deleted namespace are also deleted.
  • ServiceAccount Controller: Creates default ServiceAccounts for new namespaces.
  • PersistentVolume Controller: Watches and manages PersistentVolumes and PersistentVolumeClaims, ensures dynamic provisioning of storage when required.
  • PersistentVolumeClaim Binder Controller: Binds PersistentVolumeClaims to appropriate PersistentVolumes.
  • Garbage Collector Controller: Automatically deletes resources that are no longer referenced, like dependent objects after their owner is deleted.
  • Token Controller: Manages secrets containing API tokens for service accounts.
  • CertificateSigningRequest (CSR) Controller: Handles the approval and management of certificate signing requests.
  • Ingress Controller (Not part of kube-controller-manager): Typically a separate component, manages HTTP/HTTPS traffic routing to services based on defined Ingress rules.
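
On a kubeadm cluster the controller manager also runs as a static pod; its options (including the --controllers flag that selects which of the above controllers are enabled) can be inspected in the same way as the api-server:

cat /etc/kubernetes/manifests/kube-controller-manager.yaml
ps aux | grep kube-controller-manager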

ReplicaSets

Maintain a stable set of replica Pods running at any given time. Usually, you define a Deployment and let that Deployment manage ReplicaSets automatically.

The selector is noteworthy: it can match labels against existing Pods. However, if there is an insufficient number of replicas, the template is used to spawn more. It is therefore crucial that the labels in the template match the selector.

apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: frontend
  labels:
    app: guestbook
    tier: frontend
spec:
  replicas: 3 # mandatory
  selector: # mandatory
    matchLabels:
      tier: frontend
  template:
    metadata:
      labels:
        tier: frontend
    spec:
      containers:
        - name: php-redis
          image: us-docker.pkg.dev/google-samples/containers/gke/gb-frontend:v5

A ReplicaSet can be scaled imperatively on the CLI:

kubectl scale --replicas=6 -f foo-replicaset.yaml
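
Or by targeting the live object directly (the ReplicaSet name here matches the example above):

kubectl scale replicaset frontend --replicas=6
kubectl get replicaset frontend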

Deployments

Enables declarative updates for Pods and ReplicaSets and provides support for higher order deploy scenarios, such as managed rolling updates, where a new version of Pods can be incrementally rolled out and validated while old Pods are gracefully decommissioned.
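
A minimal imperative walk-through of a rolling update and rollback (image tags are illustrative):

kubectl create deployment nginx --image=nginx:1.26 --replicas=3
kubectl set image deployment/nginx nginx=nginx:1.27
kubectl rollout status deployment/nginx
kubectl rollout history deployment/nginx
kubectl rollout undo deployment/nginx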

Services

Exposes an application running in your cluster behind a single outward-facing endpoint, even when the workload is split across multiple backends.

Given the dynamic and ephemeral nature of Pods (new instances, fewer instances, healthy, unhealthy), just keeping track of the Pod instances is complex, let alone managing and binding to their IP addresses.

As an example, suppose you have a set of Pods that each listen on TCP port 9376 and are labelled app.kubernetes.io/name=MyApp; here’s a NodePort Service to publish that TCP listener:

apiVersion: v1
kind: Service
metadata:
  name: das-service
spec:
  type: NodePort
  selector:
    app.kubernetes.io/name: MyApp
  ports:
    - targetPort: 9376
      port: 80
      nodePort: 30008

There are a few Service types, which control how (and whether) a Service is exposed outside the cluster:

  • ClusterIP: (default) assigns the service a cluster-internal IP, making it reachable only within the cluster
  • NodePort: Exposes the Service on each Node’s IP at a static port (30000-32767)
  • LoadBalancer: Exposes the Service externally using a cloud provider’s load balancer (Kubernetes itself does not ship a load balancing component)
  • ExternalName: Maps the Service to the contents of the externalName field (e.g. api.foo.bar.example).
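
A Service can also be created imperatively with kubectl expose; a sketch assuming a Deployment named myapp already exists (the selector is inherited from it):

kubectl expose deployment myapp --port=80 --target-port=9376 --type=NodePort
kubectl get service myapp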

Namespaces

Provide a mechanism for isolating groups of resources within a single cluster. Names of resources need to be unique within a namespace, but not across namespaces. Namespace-based scoping is applicable only for namespaced objects (e.g. Deployments, Services, etc) and not for cluster-wide objects (e.g. StorageClass, Nodes, PersistentVolumes, Ingresses etc).

I think this is amazing: Kubernetes itself runs on Kubernetes, the ultimate dogfooding. It installs itself under the following namespaces:

  • default: Kubernetes includes this namespace so that you can start using your new cluster without first creating a namespace.
  • kube-node-lease: This namespace holds Lease objects associated with each node. Node leases allow the kubelet to send heartbeats so that the control plane can detect node failure.
  • kube-public: This namespace is readable by all clients (including those not authenticated). This namespace is mostly reserved for cluster usage, in case that some resources should be visible and readable publicly throughout the whole cluster. The public aspect of this namespace is only a convention, not a requirement.
  • kube-system: The namespace for objects created by the Kubernetes system.
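
The system components themselves are visible in kube-system:

kubectl get pods --namespace kube-system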

Services within the same namespace can be resolved directly by their short DNS name:

mysql.connect('mydb')

However, connecting to another namespace (e.g. dev) requires a suffix:

mysql.connect('mydb.dev.svc.cluster.local')
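
The fully qualified form is <service>.<namespace>.svc.cluster.local, which can be verified from inside the cluster with a throwaway pod (a sketch, assuming a Service named mydb exists in the dev namespace):

kubectl run -it --rm tmp --image=busybox:1.36 --restart=Never -- nslookup mydb.dev.svc.cluster.local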

Working with namespaces

# List all namespaces
kubectl get namespaces --show-labels
# Set default namespace
kubectl config set-context --current --namespace=<insert-namespace-name-here>
# Validate it
kubectl config view --minify | grep namespace:

Creating a new namespace

Declaratively

apiVersion: v1
kind: Namespace
metadata:
  name: <insert-namespace-name-here>

Imperatively

kubectl create namespace <insert-namespace-name-here>

Namespace-less objects

# In a namespace
kubectl api-resources --namespaced=true
# Not in a namespace
kubectl api-resources --namespaced=false

Gems

  • RFC1035: Some resource types require their names to follow the DNS label standard, at most 63 characters, only lowercase alphanumeric characters or ‘-’, start with an alphabetic character and end with an alphanumeric character
  • Gateway API: An official Kubernetes project focused on L4 and L7 routing in Kubernetes. This project represents the next generation of Kubernetes Ingress, Load Balancing, and Service Mesh APIs. From the outset, it has been designed to be generic, expressive, and role-oriented.
  • Headless Services: Allow a client to connect directly to whichever Pod it prefers. Headless Services don’t configure routes and packet forwarding using virtual IP addresses and proxies; instead, they report the endpoint IP addresses of the individual Pods via internal DNS records, served through the cluster’s DNS service. A minimal example follows below.
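
A headless Service is declared by setting clusterIP to None; a minimal sketch (names and port are illustrative):

apiVersion: v1
kind: Service
metadata:
  name: mydb-headless
spec:
  clusterIP: None # no virtual IP; DNS returns the individual Pod IPs
  selector:
    app.kubernetes.io/name: mydb
  ports:
    - port: 3306
      targetPort: 3306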