Akash for MechCloud Academy

Understanding Kubernetes Architecture: A Beginner's Guide

Kubernetes, often referred to as K8s, is a powerful open-source platform for automating the deployment, scaling, and management of containerized applications. Its architecture is designed as a distributed system, with components spread across multiple nodes to ensure scalability and reliability. This guide will break down the Kubernetes architecture, explaining its core components and how they work together to manage containerized workloads effectively.

If you're looking to:

  • Understand the structure of a Kubernetes cluster
  • Learn about the roles of control plane and worker nodes
  • Explore how Kubernetes components interact
  • Gain insights into the workflows driving Kubernetes

This guide is for you. Let’s dive into the architecture of Kubernetes and make it simple to grasp.

What is Kubernetes Architecture?

Kubernetes operates as a distributed system, meaning it runs across multiple servers (virtual machines or bare-metal) that form a Kubernetes cluster. A cluster consists of two main types of nodes:

  • Control Plane Nodes: Responsible for managing the cluster and orchestrating containers.
  • Worker Nodes: Responsible for running the containerized applications.

The following sections will explore the components of each node type and their roles in the cluster.

Kubernetes Control Plane Components

The control plane is the brain of the Kubernetes cluster, managing its overall state and orchestrating workloads. It consists of several key components that work together to maintain the desired state of the cluster.

1. kube-apiserver

The kube-apiserver is the central hub of the Kubernetes cluster, exposing the Kubernetes API. It acts as the primary interface for all communication within the cluster, handling requests from users, CLI tools like kubectl, and other components.

Key responsibilities of the kube-apiserver:

  • API Management: Exposes the RESTful Kubernetes API for all cluster operations, supporting multiple API versions for compatibility.
  • Authentication and Authorization: Validates users and components using mechanisms like client certificates, bearer tokens, or RBAC policies.
  • Request Processing: Validates and processes API requests for Kubernetes objects (e.g., pods, services).
  • Coordination: Facilitates communication between control plane and worker node components.
  • Watch Mechanism: Allows components to monitor resource changes in real-time.

Security Note: The kube-apiserver must be secured with TLS to prevent unauthorized access. Tools like kubectl proxy, kubectl port-forward, and kubectl exec rely on the API server to interact with the cluster.
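To make the API server's role concrete, here is a rough sketch of how any client (kubectl included) talks to it: plain HTTPS requests against the REST API, plus a watch for change events. It assumes it runs inside a pod using the default in-cluster API address and a mounted service account token that is allowed to list and watch pods; adjust the address and credentials for your own setup.

```python
# Minimal sketch: talking to the kube-apiserver over its REST API.
# Assumes the usual in-cluster defaults (API address, mounted service
# account token and CA bundle); nothing here is specific to your cluster.
import json
import requests

API_SERVER = "https://kubernetes.default.svc"  # in-cluster default address
TOKEN_PATH = "/var/run/secrets/kubernetes.io/serviceaccount/token"
CA_PATH = "/var/run/secrets/kubernetes.io/serviceaccount/ca.crt"

with open(TOKEN_PATH) as f:
    headers = {"Authorization": f"Bearer {f.read().strip()}"}

# List pods in the default namespace (a plain GET against the REST API).
resp = requests.get(
    f"{API_SERVER}/api/v1/namespaces/default/pods",
    headers=headers,
    verify=CA_PATH,
)
for item in resp.json().get("items", []):
    print(item["metadata"]["name"])

# The watch mechanism: the same endpoint with ?watch=true streams change
# events (ADDED / MODIFIED / DELETED) as newline-delimited JSON objects.
with requests.get(
    f"{API_SERVER}/api/v1/namespaces/default/pods",
    headers=headers,
    verify=CA_PATH,
    params={"watch": "true"},
    stream=True,
) as watch:
    for line in watch.iter_lines():
        if line:
            event = json.loads(line)
            print(event["type"], event["object"]["metadata"]["name"])
```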

2. etcd

etcd is a distributed, strongly consistent key-value store that serves as the cluster’s database. It stores all the configuration data, state, and metadata for Kubernetes objects, such as pods, deployments, and secrets.

Key features of etcd:

  • Strong Consistency: Uses the Raft consensus algorithm to keep data in agreement across all etcd members, so every read reflects the latest committed write.
  • Distributed Design: Runs across multiple nodes for high availability and fault tolerance.
  • Watch Functionality: Allows components to monitor changes to objects in real-time via the Watch() API.
  • Storage Structure: Stores Kubernetes objects under the /registry key prefix (e.g., /registry/pods/default/nginx for a pod named nginx in the default namespace).

Fault Tolerance: The number of etcd nodes determines fault tolerance:

  • 3 nodes: Tolerates 1 failure
  • 5 nodes: Tolerates 2 failures
  • Formula: Fault tolerance = (n − 1) / 2, rounded down, where n is the number of nodes; this is why a 4-node cluster tolerates no more failures than a 3-node one (see the quick check below).
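The formula is quorum-based, which a few lines of arithmetic make obvious:

```python
# Fault tolerance of an etcd cluster: the cluster stays available as long as
# a majority (quorum) of members is reachable, so it tolerates
# floor((n - 1) / 2) failures for n members.
def etcd_fault_tolerance(n: int) -> int:
    return (n - 1) // 2

for n in (1, 2, 3, 4, 5, 7):
    print(f"{n} members -> quorum {n // 2 + 1}, tolerates {etcd_fault_tolerance(n)} failure(s)")

# Note that 4 members tolerate no more failures than 3, which is why
# odd-sized etcd clusters are recommended.
```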

3. kube-scheduler

The kube-scheduler is responsible for placing pods onto worker nodes based on resource requirements, constraints, and policies.

How the scheduler works:

  1. Filtering: Identifies nodes capable of running a pod based on requirements like CPU, memory, or taints.
  2. Scoring: Ranks filtered nodes using plugins to determine the best fit. If scores are equal, a node is chosen randomly.
  3. Binding: Creates a binding event in the API server to assign the pod to the selected node.

Key Points:

  • The scheduler uses a configurable parameter, percentageOfNodesToScore, to limit how many nodes are scored in large clusters; the default is adaptive to cluster size (roughly 50% at around 100 nodes, dropping toward a 5% floor for very large clusters).
  • Supports custom schedulers for specialized use cases.
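The filter → score → bind flow is easy to mimic with plain data structures. The sketch below is purely illustrative: the node fields, pod requests, and scoring rule are invented for the example and are far simpler than the real scheduler plugins.

```python
import random

# Illustrative-only node and pod data; the real scheduler works with full
# Node and Pod objects and many filter/score plugins.
nodes = [
    {"name": "node-a", "cpu_free": 2.0, "mem_free_gb": 4, "taints": []},
    {"name": "node-b", "cpu_free": 0.5, "mem_free_gb": 8, "taints": []},
    {"name": "node-c", "cpu_free": 4.0, "mem_free_gb": 16, "taints": ["dedicated"]},
]
pod = {"name": "web-1", "cpu_req": 1.0, "mem_req_gb": 2, "tolerations": []}

# 1. Filtering: keep only nodes that can run the pod at all.
def feasible(node, pod):
    fits = node["cpu_free"] >= pod["cpu_req"] and node["mem_free_gb"] >= pod["mem_req_gb"]
    tolerated = all(t in pod["tolerations"] for t in node["taints"])
    return fits and tolerated

candidates = [n for n in nodes if feasible(n, pod)]

# 2. Scoring: rank the remaining nodes; here we simply prefer the node with
#    the most free CPU left after placing the pod.
def score(node, pod):
    return node["cpu_free"] - pod["cpu_req"]

best = max(score(n, pod) for n in candidates)
top = [n for n in candidates if score(n, pod) == best]

# 3. Binding: pick one of the top-scoring nodes (randomly on ties) and
#    record the assignment, as the scheduler does via a Binding object.
chosen = random.choice(top)
print(f"pod {pod['name']} -> {chosen['name']}")
```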

4. kube-controller-manager

The kube-controller-manager runs multiple controllers that monitor the cluster’s state and ensure it matches the desired state defined in object manifests (e.g., deployments, replicasets).

Key controllers include:

  • Deployment Controller: Manages deployment rollouts and scaling.
  • ReplicaSet Controller: Ensures the desired number of pod replicas are running.
  • Node Controller: Monitors node health and status.
  • Endpoints Controller: Maintains endpoint objects for services.

Key Points:

  • Controllers run as infinite loops, continuously reconciling actual and desired states.
  • Custom controllers can be created for custom resources.
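At its core, every controller is the same reconcile loop: observe the actual state, compare it with the desired state, and act on the difference. The sketch below imitates a ReplicaSet-style controller against in-memory state; a real controller reads both states from the API server through watches.

```python
import time

# Desired and actual state kept in memory for illustration; a real controller
# reads both from the API server (spec vs. status) via watch events.
desired_replicas = 3
running_pods = ["web-1"]

def reconcile():
    """One pass of the control loop: converge actual state toward desired."""
    diff = desired_replicas - len(running_pods)
    if diff > 0:
        for _ in range(diff):
            name = f"web-{len(running_pods) + 1}"
            running_pods.append(name)               # stand-in for "create pod"
            print(f"created {name}")
    elif diff < 0:
        for _ in range(-diff):
            print(f"deleted {running_pods.pop()}")  # stand-in for "delete pod"

# Controllers run this loop continuously (driven by watch events and resyncs).
for _ in range(3):
    reconcile()
    time.sleep(0.1)
print("running:", running_pods)
```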

5. cloud-controller-manager

The cloud-controller-manager integrates Kubernetes with cloud provider APIs, enabling the provisioning of cloud-specific resources like load balancers, storage volumes, and node instances.

Key controllers:

  • Node Controller: Updates node metadata (e.g., labels, hostname) using cloud APIs.
  • Route Controller: Configures networking routes for pod communication.
  • Service Controller: Manages load balancers and IP assignments for services.

Example Use Cases:

  • Provisioning an AWS Elastic Load Balancer for a Kubernetes service.
  • Attaching cloud storage volumes to pods.
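Conceptually, the service controller's job for the first use case looks like the sketch below. The cloud client here is entirely made up; it only stands in for whatever provider SDK the real cloud-controller-manager would call.

```python
# Conceptual sketch of what the service controller inside the
# cloud-controller-manager does for Services of type LoadBalancer.
# FakeCloud is a hypothetical stand-in, not a real cloud SDK.
class FakeCloud:
    def __init__(self):
        self.load_balancers = {}

    def ensure_load_balancer(self, name, node_ports):
        # Idempotently create/update a load balancer and return its address.
        self.load_balancers.setdefault(name, f"198.51.100.{len(self.load_balancers) + 10}")
        return self.load_balancers[name]

cloud = FakeCloud()
services = [
    {"name": "web", "type": "LoadBalancer", "node_port": 30080, "status_ip": None},
    {"name": "db", "type": "ClusterIP"},
]

for svc in services:
    if svc["type"] != "LoadBalancer":
        continue  # only LoadBalancer services involve the cloud provider
    ip = cloud.ensure_load_balancer(svc["name"], [svc["node_port"]])
    svc["status_ip"] = ip  # the real controller writes this to the Service status
    print(f"service {svc['name']} -> external IP {ip}")
```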

Kubernetes Worker Node Components

Worker nodes are responsible for running containerized applications. Each worker node contains the following components:

1. kubelet

The kubelet is an agent running on every node, responsible for managing pods and their containers. It communicates with the API server to receive pod specifications (podSpecs) and ensures they are executed correctly.

Key responsibilities:

  • Pod Management: Creates, updates, and deletes containers based on podSpecs.
  • Health Checks: Runs liveness, readiness, and startup probes.
  • Volume Mounting: Configures storage volumes for pods.
  • Status Reporting: Reports node and pod status to the API server and collects resource usage metrics via cAdvisor.

Static Pods: Kubelet can manage pods directly from local files (static pods), bypassing the API server. This is used during cluster bootstrapping for control plane components.
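Probe handling is easier to picture as a small loop: run the check on a period and only act after several consecutive failures. The snippet below simulates a liveness probe with a fake check function; the threshold mirrors the common default of 3, while the period is shortened so the example runs quickly.

```python
import time

# Simulated liveness probe handling, roughly what the kubelet does per
# container: probe on a period, act only after `failure_threshold`
# consecutive failures.
def liveness_probe(tick: int) -> bool:
    # Stand-in for an HTTP GET / TCP / exec check; fails from tick 3 onward.
    return tick < 3

failure_threshold = 3   # consecutive failures before acting (common default)
period_seconds = 0.1    # shortened for the example; the usual default is 10s
failures = 0

for tick in range(10):
    if liveness_probe(tick):
        failures = 0
    else:
        failures += 1
        if failures >= failure_threshold:
            print(f"tick {tick}: liveness failed {failures}x -> restart container")
            failures = 0
    time.sleep(period_seconds)
```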

2. kube-proxy

The kube-proxy runs on every node and manages network rules to enable communication with Kubernetes services. It implements service discovery and load balancing for pods grouped under a service’s ClusterIP.

Key modes:

  • iptables (default): Uses iptables rules for traffic routing, selecting a backend pod at random for load balancing.
  • IPVS: Offers better performance for large clusters, supporting advanced load-balancing algorithms (e.g., round-robin, least connections).
  • nftables (alpha in v1.29): A more efficient successor to iptables.

How it works:

  • Monitors services and endpoints via the API server.
  • Configures network rules to route traffic to the correct pods.
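The practical difference between the modes is mostly how a backend pod gets picked for each new connection. The sketch below contrasts the random pick that iptables rules effectively perform with a round-robin pick like one of the IPVS algorithms; the endpoint IPs are made up.

```python
import itertools
import random

# Backend pod IPs behind one Service ClusterIP (made-up addresses).
endpoints = ["10.0.1.5", "10.0.2.7", "10.0.3.9"]

# iptables mode: each new connection is matched against probabilistic rules,
# which amounts to picking a backend at random.
def iptables_pick():
    return random.choice(endpoints)

# IPVS round-robin: cycle through the backends in order.
_rr = itertools.cycle(endpoints)
def ipvs_round_robin_pick():
    return next(_rr)

for i in range(6):
    print(f"conn {i}: iptables -> {iptables_pick():10s}  ipvs(rr) -> {ipvs_round_robin_pick()}")
```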

3. Container Runtime

The container runtime is the software responsible for running containers on each node. It handles tasks like pulling images, starting/stopping containers, and managing their lifecycle.

Key concepts:

  • Container Runtime Interface (CRI): A standardized API allowing Kubernetes to work with multiple runtimes (e.g., containerd, CRI-O, or Docker Engine via the cri-dockerd adapter).
  • Open Container Initiative (OCI): Defines standards for container formats and runtimes.

Example Workflow (using CRI-O):

  1. Kubelet sends a pod creation request to the container runtime via CRI.
  2. CRI-O pulls the container image from a registry.
  3. CRI-O generates an OCI-compliant runtime specification.
  4. The runtime (e.g., runc) starts the container.

Kubernetes Add-On Components

In addition to core components, Kubernetes supports add-ons to enhance functionality. Common add-ons include:

  • CNI Plugins: Enable pod networking and network policies (e.g., Calico, Flannel, Cilium).
  • CoreDNS: Provides DNS-based service discovery.
  • Metrics Server: Collects resource usage data for nodes and pods.
  • Kubernetes Dashboard: Offers a web-based UI for cluster management.

Key Kubernetes Objects

Kubernetes components manage various objects, including:

  • Pods: The smallest deployable units, containing one or more containers.
  • Deployments: Manage pod replicas and updates.
  • Services: Enable network access to pods via ClusterIP or load balancers.
  • ConfigMaps and Secrets: Store configuration data and sensitive information.
  • Custom Resources (CRDs): Extend Kubernetes with custom objects and controllers.
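To make the first of these concrete, here is a minimal Pod definition expressed as a Python dict, the same structure you would normally write as YAML and apply with kubectl; the pod and image names are just examples.

```python
import json

# A minimal Pod object: the smallest deployable unit, wrapping one container.
# Normally written as YAML and applied with kubectl; the names are examples.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "nginx", "labels": {"app": "nginx"}},
    "spec": {
        "containers": [
            {
                "name": "nginx",
                "image": "nginx:1.27",
                "ports": [{"containerPort": 80}],
            }
        ]
    },
}
print(json.dumps(pod, indent=2))  # the API server accepts JSON as well as YAML
```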

Conclusion

Kubernetes architecture is a well-coordinated system of components working together to manage containerized applications at scale. The control plane orchestrates the cluster, while worker nodes execute workloads. Understanding these components—kube-apiserver, etcd, kube-scheduler, kube-controller-manager, cloud-controller-manager, kubelet, kube-proxy, and container runtimes—provides a solid foundation for working with Kubernetes.

To deepen your knowledge, explore hands-on tutorials on Kubernetes objects or set up a cluster to experiment with these components in action.
