- Glossary
- Certification tips
- Docker vs containerD
- etcd
- kube-apiserver
- kube-controller-manager
- ReplicaSets
- Deployments
- Services
- Namespaces
- Gems
Glossary
Term | Definition |
---|---|
cri | Container Runtime Interface, the contract between k8s and the container runtime |
crictl | CLI from the Kubernetes community for inspecting and debugging CRI compatible container runtimes |
ctr | Debugging tool for containerD |
oci | Open Container Initiative, formalised the specification of an image spec and a runtime spec |
nerdctl | Docker like CLI experience for containerD |
Certification tips
Bookmarks
https://kubernetes.io/docs/reference/kubectl/conventions/
kubectl imperative commands
The --dry-run=client flag previews the objects that would be sent to the cluster, without actually submitting them.
kubectl run nginx --image=nginx --dry-run=client -o yaml
kubectl create deployment --image=nginx nginx --replicas=4 --dry-run=client -o yaml > nginx-deployment.yaml
Docker vs containerD
k8s was once tightly coupled to Docker, but decoupled from it by formalising the container runtime interface (CRI), which in turn builds on the open container initiative (OCI) specifications.
containerD is an independent, self-installable binary with no ties to Docker or broader container tooling. While a minimal runtime, it does come with two troubleshooting utilities:
ctr
For basic containerD debugging.
ctr images pull docker.io/library/redis:alpine
ctr run docker.io/library/redis:alpine redis
nerdctl
Provides a Docker compatible CLI for containerD, supporting docker compose and more (a usage sketch follows the feature list):
- Encrypted container images
- Lazy Pulling
- P2P image distribution
- Image signing and verifying
- Namespaces in Kubernetes
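A minimal nerdctl sketch showing the Docker-like workflow (the image, container name, and port mapping are illustrative):
# Run a container in the background, publish a port, then list containers
nerdctl run -d --name webserver -p 8080:80 nginx:alpine
nerdctl ps
# docker compose compatible workflow (requires a compose.yaml in the working directory)
nerdctl compose up -d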
crictl
Built by the Kubernetes community, crictl provides a runtime-agnostic CLI for inspecting and debugging CRI compatible container runtimes.
crictl pull busybox
crictl images
crictl ps -a
crictl exec -i -t 3e025dd50a72d956c4f14881fbb5b1080c9275674e95fb67f965f6478a957d60 ls
crictl logs 3e025dd50a72d956c4f1
crictl pods
crictl works through a default list of sockets to bind to:
unix:///run/containerd/containerd.sock
unix:///run/crio/crio.sock
unix:///var/run/cri-dockerd.sock
These can be explicitly defined, either with the --runtime-endpoint flag or the CONTAINER_RUNTIME_ENDPOINT environment variable.
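For example, assuming containerd's default socket path from the list above:
# Point crictl at containerd explicitly, per invocation or via the environment
crictl --runtime-endpoint unix:///run/containerd/containerd.sock ps -a
export CONTAINER_RUNTIME_ENDPOINT=unix:///run/containerd/containerd.sock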
etcd
The key value pair (KVP) database that supports control plane storage needs. The api-server is the only component that interacts with etcd directly.
Runs on port 2379 and provides the etcdctl CLI. etcd can be run as a systemd unit etcd.service:
ExecStart=/usr/local/bin/etcd \\
--name ${ETCD_NAME} \\
--cert-file=/etc/etcd/kubernetes.pem \\
--key-file=/etc/etcd/kubernetes-key.pem \\
--peer-cert-file=/etc/etcd/kubernetes.pem \\
--peer-key-file=/etc/etcd/kubernetes-key.pem \\
--trusted-ca-file=/etc/etcd/ca.pem \\
--peer-trusted-ca-file=/etc/etcd/ca.pem \\
--peer-client-cert-auth \\
--client-cert-auth \\
--initial-advertise-peer-urls https://${INTERNAL_IP}:2380 \\
--listen-peer-urls https://${INTERNAL_IP}:2380 \\
--listen-client-urls https://${INTERNAL_IP}:2379,https://127.0.0.1:2379 \\
--advertise-client-urls https://${INTERNAL_IP}:2379 \\
--initial-cluster-token etcd-cluster-0 \\
--initial-cluster controller-0=https://${CONTROLLER0_IP}:2380,controller-1=https://${CONTROLLER1_IP}:2380 \\
--initial-cluster-state new \\
--data-dir=/var/lib/etcd
kubeadm on the other hand will create an etcd-master pod in the kube-system namespace, which can be explored with a kubectl exec:
kubectl exec etcd-master -n kube-system -- etcdctl get / --prefix --keys-only
High availability (HA) etcd is supported; see the --initial-cluster CLI option.
etcdctl
Major API changes were made in v3, however etcdctl supports both the v2 (and older) and v3 APIs. The API version it targets is shown by etcdctl --version and can be changed by setting the ETCDCTL_API env var.
export ETCDCTL_API=3
./etcdctl put key1 value1
./etcdctl get key1
For encrypted communication (TLS) between the etcdctl client and the etcd server, certificate files must be specified; these are available on the etcd-master pod at the following paths:
--cacert /etc/kubernetes/pki/etcd/ca.crt
--cert /etc/kubernetes/pki/etcd/server.crt
--key /etc/kubernetes/pki/etcd/server.key
A complete working exec example:
kubectl exec etcd-master -n kube-system -- sh -c "ETCDCTL_API=3 etcdctl get / --prefix --keys-only --limit=10 --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key"
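Similarly, a sketch for checking the members of an HA etcd cluster, reusing the same pod name and certificate paths as above:
kubectl exec etcd-master -n kube-system -- sh -c "ETCDCTL_API=3 etcdctl member list --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key"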
kube-apiserver
The gateway to interfacing with the control plane.
Running the service has a ton of options, see /etc/kubernetes/manifests/kube-apiserver.yaml or the systemd unit definition /etc/systemd/system/kube-apiserver.service:
ExecStart=/usr/local/bin/kube-apiserver \\
--advertise-address=${INTERNAL_IP} \\
--allow-privileged=true \\
--apiserver-count=3 \\
--authorization-mode=Node,RBAC \\
--bind-address=0.0.0.0 \\
--enable-admission-plugins=Initializers,NamespaceLifecycle,NodeRestriction,LimitRanger,ServiceAccount,DefaultStorageClass,ResourceQuota \\
--enable-swagger-ui=true \\
--etcd-servers=https://127.0.0.1:2379 \\
--event-ttl=1h \\
--experimental-encryption-provider-config=/var/lib/kubernetes/encryption-config.yaml \\
--runtime-config=api/all \\
--service-account-key-file=/var/lib/kubernetes/service-account.pem \\
--service-cluster-ip-range=10.32.0.0/24 \\
--service-node-port-range=30000-32767 \\
--v=2
kubeadm will provision a kube-apiserver-master pod in the kube-system namespace.
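To see exactly which flags a kubeadm cluster passes to the API server (the pod name suffix follows the node name, assumed here to be master):
kubectl describe pod kube-apiserver-master -n kube-system
cat /etc/kubernetes/manifests/kube-apiserver.yaml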
kube-controller-manager
Kubernetes ships with several built-in controllers that handle different aspects of maintaining the desired state of the cluster. These controllers run inside the kube-controller-manager, except for a few, such as the cloud controller manager, which may run separately in cloud environments. A sketch for inspecting the controller manager on a kubeadm cluster follows the list.
- Node Controller: Manages the lifecycle of nodes, detects and responds to node failures.
- Replication Controller: Ensures the specified number of pod replicas are running at any time.
- Deployment Controller: Manages deployments, ensuring that a desired state of replicas and pods is maintained, handles rolling updates and rollbacks.
- ReplicaSet Controller: Manages ReplicaSets, ensuring the specified number of pod replicas for a ReplicaSet.
- StatefulSet Controller: Manages StatefulSets, which are used for stateful applications, ensures stable network identities and persistent storage for pods.
- DaemonSet Controller: Ensures a pod runs on all or specific nodes in the cluster, used for node-level tasks like logging or monitoring agents.
- Job Controller: Manages batch jobs, ensuring they complete successfully, a job represents a finite task that runs to completion.
- CronJob Controller: Manages CronJobs, which run Jobs on a time-based schedule.
- Service Controller: Creates and manages load balancers in cloud environments when a Service of type LoadBalancer is created.
- EndpointSlice Controller: Manages EndpointSlice objects for improving scalability and reliability over traditional Endpoints.
- Horizontal Pod Autoscaler (HPA) Controller: Adjusts the number of pods in a deployment, replica set, or stateful set based on observed CPU/memory usage or custom metrics.
- Vertical Pod Autoscaler (VPA) Controller: Recommends or automatically adjusts the CPU/memory resource requests and limits for pods.
- Namespace Controller: Manages namespace lifecycle, ensures resources within a deleted namespace are also deleted.
- ServiceAccount Controller: Creates default ServiceAccounts for new namespaces.
- PersistentVolume Controller: Watches and manages PersistentVolumes and PersistentVolumeClaims, ensures dynamic provisioning of storage when required.
- PersistentVolumeClaim Binder Controller: Binds PersistentVolumeClaims to appropriate PersistentVolumes.
- Garbage Collector Controller: Automatically deletes resources that are no longer referenced, like dependent objects after their owner is deleted.
- Token Controller: Manages secrets containing API tokens for service accounts.
- CertificateSigningRequest (CSR) Controller: Handles the approval and management of certificate signing requests.
- Ingress Controller (Not part of kube-controller-manager): Typically a separate component, manages HTTP/HTTPS traffic routing to services based on defined Ingress rules.
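On a kubeadm cluster the controller manager itself runs as a static pod; a sketch for inspecting it (the pod name suffix follows the node name, assumed here to be master):
kubectl get pod kube-controller-manager-master -n kube-system
cat /etc/kubernetes/manifests/kube-controller-manager.yaml
# The --controllers flag (default "*") selects which of the above controllers run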
ReplicaSets
Maintain a stable set of replica Pods running at any given time. Usually, you define a Deployment and let that Deployment manage ReplicaSets automatically.
The selector is noteworthy: it can match labels against existing pods. However, if there are insufficient replicas, the template is used to spawn more, so it's crucial that the labels in the template match the selector.
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: frontend
  labels:
    app: guestbook
    tier: frontend
spec:
  replicas: 3 # mandatory
  selector: # mandatory
    matchLabels:
      tier: frontend
  template:
    metadata:
      labels:
        tier: frontend
    spec:
      containers:
        - name: php-redis
          image: us-docker.pkg.dev/google-samples/containers/gke/gb-frontend:v5
A ReplicaSet can be scaled imperatively on the CLI:
kubectl scale --replicas=6 -f foo-replicaset.yaml
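The same ReplicaSet (frontend, from the manifest above) can also be scaled by name, or edited in place:
kubectl scale replicaset frontend --replicas=6
kubectl edit replicaset frontend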
Deployments
Enables declarative updates for Pods and ReplicaSets and supports higher order deploy scenarios such as managed rolling updates, where a new version of pods is incrementally rolled out and validated while old pods are gracefully decommissioned.
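A sketch of a typical rolling update workflow using imperative commands (the deployment name and image tags are illustrative):
kubectl create deployment nginx --image=nginx:1.26 --replicas=3
kubectl set image deployment/nginx nginx=nginx:1.27 # trigger a rolling update
kubectl rollout status deployment/nginx # watch the incremental rollout
kubectl rollout history deployment/nginx
kubectl rollout undo deployment/nginx # roll back to the previous revision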
Services
Exposes an application running in your cluster behind a single outward-facing endpoint, even when the workload is split across multiple backends.
Given the dynamic and ephemeral nature of Pods (new instances, fewer instances, healthy, unhealthy), just keeping track of the Pod instances is complex, let alone managing and binding to their IP addresses.
For example, suppose you have a set of Pods that each listen on TCP port 9376 and are labelled app.kubernetes.io/name=MyApp; here’s a Service to publish that TCP listener:
apiVersion: v1
kind: Service
metadata:
  name: das-service
spec:
  type: NodePort
  selector:
    app.kubernetes.io/name: MyApp
  ports:
    - targetPort: 9376
      port: 80
      nodePort: 30008
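A roughly equivalent imperative sketch; note that kubectl create service sets the selector to app=das-service, so the manifest above remains the precise option:
kubectl create service nodeport das-service --tcp=80:9376 --node-port=30008 --dry-run=client -o yaml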
There are a few Service types, which control how (and whether) the Service is exposed outside the cluster:
- ClusterIP: (default) assigns the service a cluster-internal IP, making it reachable only within the cluster
- NodePort: Exposes the Service on each Node’s IP at a static port (30000-32767)
- LoadBalancer: Exposes the Service externally using an external load balancer (Kubernetes does not offer a load balancing component)
- ExternalName: Maps the Service to the contents of the externalName field (e.g. api.foo.bar.example).
Namespaces
Provide a mechanism for isolating groups of resources within a single cluster. Names of resources need to be unique within a namespace, but not across namespaces. Namespace-based scoping is applicable only for namespaced objects (e.g. Deployments, Services, etc) and not for cluster-wide objects (e.g. StorageClass, Nodes, PersistentVolumes, Ingresses etc).
I think this is amazing: Kubernetes itself runs on Kubernetes, the ultimate dogfooding. It installs itself under the following namespaces:
- default: Included so that you can start using your new cluster without first creating a namespace.
- kube-node-lease: Holds Lease objects associated with each node. Node leases allow the kubelet to send heartbeats so that the control plane can detect node failure.
- kube-public: Readable by all clients (including those not authenticated). Mostly reserved for cluster usage, in case some resources should be visible and readable publicly throughout the whole cluster. The public aspect of this namespace is only a convention, not a requirement.
- kube-system: The namespace for objects created by the Kubernetes system.
Resources within a namespace can be DNS resolved:
mysql.connect('mydb')
However, connecting to another namespace (e.g. dev) requires a suffix:
mysql.connect('mydb.dev.svc.cluster.local')
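A quick way to verify cross-namespace DNS from inside the cluster, assuming a Service named mydb exists in the dev namespace (the busybox image is illustrative):
kubectl run dns-test --rm -it --restart=Never --image=busybox:1.36 -- nslookup mydb.dev.svc.cluster.local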
Working with namespaces
# List all namespaces
kubectl get namespaces --show-labels
# Set default namespace
kubectl config set-context --current --namespace=<insert-namespace-name-here>
# Validate it
kubectl config view --minify | grep namespace:
Creating a new namespace
Declaratively
apiVersion: v1
kind: Namespace
metadata:
  name: <insert-namespace-name-here>
Imperatively
kubectl create namespace <insert-namespace-name-here>
Namespace-less objects
# In a namespace
kubectl api-resources --namespaced=true
# Not in a namespace
kubectl api-resources --namespaced=false
Gems
- RFC1035: Some resource types require their names to follow the DNS label standard: at most 63 characters, only lowercase alphanumeric characters or ‘-’, starting with an alphabetic character and ending with an alphanumeric character
- Gateway API: An official Kubernetes project focused on L4 and L7 routing in Kubernetes. This project represents the next generation of Kubernetes Ingress, Load Balancing, and Service Mesh APIs. From the outset, it has been designed to be generic, expressive, and role-oriented.
- Headless Services allows a client to connect to whichever Pod it prefers, directly. Services that are headless don’t configure routes and packet forwarding using virtual IP addresses and proxies; instead, headless Services report the endpoint IP addresses of the individual pods via internal DNS records, served through the cluster’s DNS service.
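A minimal headless Service sketch (names are illustrative); setting clusterIP to None is what makes it headless:
apiVersion: v1
kind: Service
metadata:
  name: das-headless
spec:
  clusterIP: None # headless: DNS returns the individual Pod IPs
  selector:
    app.kubernetes.io/name: MyApp
  ports:
    - port: 80
      targetPort: 9376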