Mike

A Complete Noob’s Guide to Kubernetes

This article first appeared on the RestlessDev blog.



So you’re interested in deploying your app to Kubernetes but just don’t know how to get started. Been there. I was like you once, 3 days ago.

And now I’m qualified to tell you how to get it done. Well, semi-qualified.

What This Guide Is

This guide is a crash course in what you need to know to get started with Kubernetes for a typical web app. That includes getting the site running with traffic going to it, as well as setting up a database and a worker process for running non-web tasks in the background. It also covers some of the terminology you’ll need to know. Take what I say here with a grain of salt; I’m describing my own experience while attempting to simplify things where I can, so that Mere Mortals can get started with this very complicated and full-featured platform.

What This Guide Is Not

This guide won’t tell you how to deploy a Kubernetes cluster itself. That varies a lot depending on which cloud provider you’re using, or whether you’re setting it up on your own hardware. In my case I used Kube-Hetzner, which was very easy to use once I read through the GitHub readme and set up the right instance types for my desired cluster. It also doesn’t cover the process of containerizing your application. I used Podman to build my images, but it should work just as well with other containerization systems.

It also doesn’t get into maintenance of your cluster. To be honest, I’ll be learning that as well, as I’m just setting out on this journey myself.

Before you run, you need to crawl. This guide will get you crawling like a grad student. That’s good, right? They have to be pretty fast.

Some Background

Like others out there, I’ve been through many different iterations of what the current “best practice” (ick) is for deploying web apps. Bare-metal servers (or just “servers,” as we used to call them) gave way to virtual machines, which gave way to platforms like Heroku, which gave way to Amazon’s version of Heroku, Elastic Beanstalk. Then containerization hit, and every cloud provider had its own way to turn a Docker image into a running website. Recently I’ve been using Amazon’s ECS to deploy sites; it’s been pretty reliable once it was up and running, but I wasn’t always crazy about the added costs that seemed to pop up all over the place. A load balancer was how much? What’s a VPC and why is it so expensive? And don’t get me started on RDS pricing.

Kubernetes was something I had heard about, but it seemed so enterprise-y. I’m just a simple country boy, I don’t have an army of consultants to deploy my sites and keep them running. I ran through the minikube tutorial a few years back and got something simple running locally, but to go from that to an actual cluster seemed like a leap I wasn’t prepared to take.

Until last Thursday.

I’ve been working on something that just begged for a Real Deployment, and I wanted to get off of the vendor lock-in treadmill I’ve been on the last few years. Kubernetes answered the call.

Terminology

The first thing that I noticed about Kubernetes is all of the terminology you have to learn. It’s a huge platform, built to handle all different types of workloads with a lot of customization available. Fortunately, you don’t need to learn every little thing just to get started.

Here are the terms you really need to know.


Control Planes and Nodes

Control Planes are the brains of your cluster, and they’re what you actually talk to when using the kubectl command-line interface. They generally don’t need a ton of resources, because they don’t run anything other than the API server and other coordination and management software. Nodes are the part of the cluster that actually runs your containers, so the amount of memory, compute power, and storage they have does impact the capacity of your cluster. Many cloud providers offer their own Control Planes free of charge as long as you use their Nodes; if you’re already on their platform that can work out nicely and give you one less thing to maintain, but keep in mind that control planes don’t generally require very expensive instance types anyway. This isn’t the reason to choose one platform over another, in my opinion.

Kubectl

Kubectl is the command-line tool used to talk to your cluster. It has a rather simple and consistent interface, and once you get used to it, it’s pretty easy to get things done.
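
You can see that consistency in the command shapes themselves; almost everything is a verb, a resource type, and a few flags. A few illustrative examples (my-namespace and the resource names are placeholders):

kubectl get pods -n my-namespace
kubectl describe deployment my-app-api -n my-namespace
kubectl delete service my-app-api-service -n my-namespace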

Namespaces

Namespaces are how Kubernetes sorts your various resources into logical buckets. In my case, I have a single namespace for my application and everything lives in it. There is a default namespace, but that’s how we get ants. Be a mensch and create a new namespace please.
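
We’ll create ours from a YAML file in the tutorial below, but if you just want to poke around first, you can also create one imperatively:

kubectl create namespace my-namespace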

Deployments

Deployments are how you instruct Kubernetes to turn your images into actual running containers, called Pods. Each Deployment can run multiple Pods from the same image if you need load balancing. You can also use Deployments to launch other services, like databases and caching layers, from their own images. You basically use Deployments to describe what you want the cluster to look like, and then Kubernetes makes sure it happens.

Your running application will consist of several different Deployments; in my case I have one for my application API, one for my application cron runner, one for my database, and one for memcached.

Pods

“Pods” are Kubernetes-speak for running containers. You can connect to them directly to get a command-line interface for debugging, and you can access their logs to see whatever information you want. They should generally be treated as ephemeral, outside of any Volumes they use for persistent storage; Kubernetes will replace Pods with fresh versions whenever it needs to in order to maintain the integrity of your Deployments.

Volumes

Volumes are used to provide a place to store data that needs to persist over time. They get attached to a Pod (or to several Pods if you want) and will live beyond that Pod's life cycle. For things like databases this is essential.
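
For completeness, here’s roughly what a standalone volume request (a PersistentVolumeClaim) looks like. This is just a sketch; the names are placeholders, and the storageClassName depends entirely on your provider (hcloud-volumes is the Hetzner one I use later for the database):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-app-data
  namespace: my-namespace
spec:
  accessModes:
    - ReadWriteOnce      # mountable read-write by one node at a time
  storageClassName: hcloud-volumes
  resources:
    requests:
      storage: 10Gi

In this tutorial you won’t actually write one of these by hand; the database operator below requests its own storage.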

ConfigMaps and Secrets

ConfigMaps and Secrets are Kubernetes’ way of dealing with environment variables. ConfigMaps are a little easier to deal with because they’re stored as plain text and can be edited directly, so use them for environment variables that aren’t sensitive. Secrets are stored differently internally (base64-encoded, and not encrypted unless you configure encryption at rest), and Kubernetes decodes them and hands them to Pods as regular environment variables when needed. They’re the place for things like API keys that you want to keep safe. You can have as many ConfigMaps and Secrets as you want, each with its own name; they’re assigned to a Deployment in the Deployment's YAML file. In my case, my API and cron images are based on the same codebase, so I use the same ConfigMap and Secret for both.

If your images are in a private repository, you’ll also need a Secret to tell Kubernetes how to access them.

Services

Services are how you expose ports on your Pods, whether to the rest of the cluster or to the outside world. In my case I have a Service for my API but not for the cron runner, which just sorta does its thing in the background.

Ingress

Ingress is a layer on top of Services specifically for HTTP/HTTPS connections. In my case, it’s used for higher-level functions like getting certificates from Let’s Encrypt. You can also use it to route different paths to different Services and things like that, although I don’t use that.

ClusterIssuer

A ClusterIssuer is a cert-manager resource used when requesting SSL/TLS certificates; we’ll use one later to get certificates from Let’s Encrypt.


That’s it. Those are the only terms you need to know to deploy an application on Kubernetes.

Deploying an Application

Do you like YAML? I hope so, because you’re going to be using it a lot. Let’s dive in.

Throughout this tutorial, I’ll use the following placeholders to indicate things you should replace with your own variation:

  • <namespace>: The namespace of the application. To make it simple, just try to use a standard lowercase slug-type string. my-namespace would be fine.
  • <app-name>: Kubernetes lets you assign each resource a name, which needs to be unique within a namespace. For simplicity, I call my deployments things like my-app-api, my-app-cron, and my-app-db, and my services things like my-app-api-service. You’re free to call yours whatever you want, but if you go a different way, just make sure to put the right name in the right place.
  • <*-username> and <*-password>: You may need to reference different usernames and passwords in your code. I’ll replace the * with an identifier to help you know what’s expected.

I’d suggest creating a directory that you’ll use for storing all of your YAML files as you’ll want to reapply them whenever you make changes. Throughout the tutorial I’ll assume you’re in the directory with all of your files.

It’s also worth noting that the names of your YAML files aren’t important, but the content in them is. I happen to call them the same name as the resources they reference, but you might go a different way. It’s cool, we can still be friends.


Side note: We’re going to be using kubectl a lot below, and kubectl needs to know how to talk to your cluster. In my case, my deployment process output a file called kubeconfig.yaml that had all of the relevant information. This file can be supplied to kubectl either as an argument (--kubeconfig) or by setting an environment variable (KUBECONFIG). I’ve opted for the latter, which is why you won’t see this important detail in the examples.
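
For example, assuming kubeconfig.yaml is sitting in your working directory:

export KUBECONFIG="$PWD/kubeconfig.yaml"
kubectl get nodes   # quick sanity check that you can reach the cluster

If kubectl get nodes lists your control plane and nodes, you’re wired up correctly.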


1. Create the Namespace

First you’ll need to create your namespace. This is the bucket where everything else will live. To do this, make the following YAML:

namespace.yaml

apiVersion: v1
kind: Namespace
metadata:
  name: <namespace>

Now apply this YAML to your cluster with the following command:

kubectl apply -f namespace.yaml

We’re under way!
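
If you want proof that something happened, list your namespaces and look for yours:

kubectl get namespaces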

2. Deploy your Database

In this example, I’m using CloudNative-PG, a Kubernetes-oriented PostgreSQL distribution that has lots of nice features built in. (One caveat: the CloudNative-PG operator needs to be installed in your cluster before the Cluster resource below will mean anything to Kubernetes; see the CNPG documentation for instructions.)

First we need to create the Secret to store your database credentials.

<app-name>-db-secret.yaml

apiVersion: v1
kind: Secret
metadata:
  name: <app-name>-db-secret
  namespace: <namespace>
type: Opaque
stringData:
  username: <database-user>
  password: <database-password>

Please remember to quote your password if it contains characters that make YAML unhappy.
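
For example, a (hypothetical) password with a colon or hash in it should be wrapped in quotes so YAML treats it as a plain string:

  password: "sup3r:secret#1"   # quotes keep YAML from misparsing this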

Apply it like this:

kubectl apply -f <app-name>-db-secret.yaml

Now you should be able to deploy the database. You know the drill, another YAML.

<app-name>-db.yaml

apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: <app-name>-db
  namespace: <namespace>
spec:
  instances: 1
  primaryUpdateStrategy: unsupervised
  storage:
    storageClass: hcloud-volumes
    size: 20Gi
  bootstrap:
    initdb:
      database: <database-name>
      owner: <database-user>
      secret:
        name: <app-name>-db-secret

One thing to notice is that the kind of this YAML file is Cluster. This goes a little beyond the scope of this tutorial, but my understanding is that operators like CloudNative-PG can define their own resource types that extend beyond the built-in ones, which is what’s happening here with the apiVersion referring to the CloudNative-PG spec. Also note that the storageClass, hcloud-volumes, is specific to my Hetzner setup; use whichever storage class your provider offers.

Now apply it.

kubectl apply -f <app-name>-db.yaml

So one thing about databases is that you don’t really want them to be easy for bad guys to get to. With this deployment the database is running inside your cluster without an external IP address.

That said, you’ll occasionally need to connect to your database, whether to set it up initially, maintain it, or just run queries and see how it’s doing. For this purpose we’ll set up a separate SSH host that we can use to create a tunnel into the database. This kind of server is typically called an SSH Bastion, I think because it sounds cool and a little Medieval. But also because it’s a safe place to enter your private cluster.

ssh-bastion.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: ssh-bastion
  namespace: <namespace>
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ssh-bastion
  template:
    metadata:
      labels:
        app: ssh-bastion
    spec:
      containers:
      - name: sshd
        image: linuxserver/openssh-server
        env:
        - name: PUID
          value: "1000"
        - name: PGID
          value: "1000"
        - name: TZ
          value: "UTC"
        - name: PASSWORD_ACCESS
          value: "true"
        - name: SUDO_ACCESS
          value: "true"
        - name: DOCKER_MODS
          value: "linuxserver/mods:openssh-server-ssh-tunnel"
        - name: USER_NAME
          value: <bastion-user>
        - name: USER_PASSWORD
          value: <bastion-password>
        ports:
        - containerPort: 2222
---
apiVersion: v1
kind: Service
metadata:
  name: ssh-bastion
  namespace: <namespace>
spec:
  selector:
    app: ssh-bastion
  ports:
  - port: 2222
    targetPort: 2222
  type: LoadBalancer

You may have noticed we cheated a bit here and defined both a Deployment and a Service in the same YAML file, separated by a --- line. You can do that if you want.

kubectl apply -f ssh-bastion.yaml

Give it a minute to deploy, and let’s check out our progress.

First, let’s see our pods:

kubectl get pods -n <namespace>

You should see one for your database and another for your SSH bastion. Great work!

Now if we want to actually use our bastion, we need to figure out its external IP. To do that we need to look up the services.

kubectl get svc -n <namespace>

You should see a line for ssh-bastion with a value in the EXTERNAL-IP column. Grab that address.

Open up another terminal window and enter this command:

ssh -o PreferredAuthentications=password -o PubkeyAuthentication=no -p 2222 -N -L <open-local-port>:<app-name>-db-rw.<namespace>.svc.cluster.local:5432 <bastion-user>@<bastion-external-ip>

So yeah, there are a lot of placeholders there. <open-local-port> should be whatever local port you’ll use to connect to your cluster’s database; I’m assuming you’re already running PostgreSQL on your local machine, but if you aren’t, 5432 is fine. <bastion-external-ip> is that IP address I told you to grab earlier. The other placeholders you should already have.

As you’ll see, Kubernetes does a lot of magic with DNS inside your cluster. In this case, it creates a hostname for your read-write database at <app-name>-db-rw. Remember that for later when it comes to setting up your application’s ConfigMaps.
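
In fact, every Service gets a few DNS names of increasing specificity, which is why the short name works in the ConfigMap below while the SSH tunnel above uses the fully qualified form:

<app-name>-db-rw                                   # from Pods in the same namespace
<app-name>-db-rw.<namespace>                       # from other namespaces
<app-name>-db-rw.<namespace>.svc.cluster.local     # fully qualified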

Once you run this command, you’ll need to enter your <bastion-password>. It’ll look like your terminal hung. It hasn’t; it’s just opening up a tunnel for you and keeping it open.

Now you can go to your preferred PostgreSQL client and connect to your cluster database with the following parameters.

  • Hostname: localhost
  • Port: <open-local-port>
  • Username: <database-user>
  • Password: <database-password>
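
If you’d rather test from the command line, the same parameters work with psql, assuming you have it installed locally:

psql -h localhost -p <open-local-port> -U <database-user> <database-name>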

You can set up your application’s database and do any other configuration you need. When you’re done, kill that SSH tunnel and you’re good to go!

3. Deploy the Application

Now that we have the database working as planned, we can turn to the application.

My application consists of two parts, the API and a separate process that runs cron jobs on a regular schedule to monitor my system and tidy things up when needed. The actual website for my application is hosted outside of Kubernetes; it’s an SPA that will connect to this API to do its work.

Since the API and cron runner both share the same codebase, they have the same environment variables. Step 1 is to add the ConfigMap and Secret that the site code will use.

<app-name>-config.yaml

apiVersion: v1
kind: ConfigMap
metadata:
  name: <app-name>-config
  namespace: <namespace>
data:
  NODE_ENV: production
  LOG_LEVEL: info
  APP_DB_HOST: <app-name>-db-rw
  APP_DB_PORT: "5432"
  APP_DB_NAME: <database-name>
  PORT: "3000"

<app-name>-secret.yaml

apiVersion: v1
kind: Secret
metadata:
  name: <app-name>-secret
  namespace: <namespace>
type: Opaque
stringData:
  APP_DB_USERNAME: <database-user>
  APP_DB_PASSWORD: <database-password>

Now apply each of these:

kubectl apply -f <app-name>-config.yaml
kubectl apply -f <app-name>-secret.yaml

Quick Detour

Do you use a private image repository to store your images? If so, you need to tell Kubernetes how to access it, which you can do by adding a Secret. We’ll do this one inline, as the Docker Hub YAML format is a little wonky. (For those interested, you need to base64-encode a JSON blob and place it in the YAML.)

kubectl create secret docker-registry dockerhub-secret \
  --docker-server=https://index.docker.io/v1/ \
  --docker-username=<docker-username> \
  --docker-password=<docker-access-token> \
  --docker-email=<docker-email> \
  --namespace=<namespace>

With those in place, we’ll start by deploying the cron worker. (Remember to remove the imagePullSecrets key if you aren’t authenticating to Docker Hub)

<app-name>-cron.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: <app-name>-cron
  namespace: <namespace>
spec:
  replicas: 1
  selector:
    matchLabels:
      app: <app-name>-cron
  template:
    metadata:
      labels:
        app: <app-name>-cron
    spec:
      imagePullSecrets:
      - name: dockerhub-secret
      containers:
      - name: cron
        image: docker.io/<docker-user>/<app-name>-cron:<tag-name>
        envFrom:
        - configMapRef:
            name: <app-name>-config
        - secretRef:
            name: <app-name>-secret

Apply it like this:

kubectl apply -f <app-name>-cron.yaml

Finally, we’ll deploy the API:

<app-name>-api.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: <app-name>-api
  namespace: <namespace>
spec:
  replicas: 2
  selector:
    matchLabels:
      app: <app-name>-api
  template:
    metadata:
      labels:
        app: <app-name>-api
    spec:
      imagePullSecrets:
      - name: dockerhub-secret
      containers:
      - name: api
        image: docker.io/<docker-user>/<app-name>-api:<tag-name>
        ports:
        - containerPort: 3000
        envFrom:
        - configMapRef:
            name: <app-name>-config
        - secretRef:
            name: <app-name>-secret

You’ll notice that we have 2 replicas specified in this deployment; this will launch 2 separate Pods and load balance between them for a little redundancy. In this example, our API listens on port 3000 inside the container. You may want to replace this (and the other 3000s in this tutorial) with your own value if you do things differently.

And apply it like this:

kubectl apply -f <app-name>-api.yaml
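
Incidentally, if you later decide 2 replicas isn’t the right number, you can scale the Deployment without touching the YAML (though updating the file too keeps things honest):

kubectl scale deployment <app-name>-api --replicas=3 -n <namespace>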

Because this set of Pods will need to listen on a port, we’ll need to add a service. We can do that like this:

<app-name>-api-service.yaml

apiVersion: v1
kind: Service
metadata:
  name: <app-name>-api-service
  namespace: <namespace>
spec:
  selector:
    app: <app-name>-api
  ports:
    - protocol: TCP
      port: 3000      # The port exposed by the service
      targetPort: 3000  # The port your app listens on inside the container
  sessionAffinity: ClientIP

And apply it like this:

kubectl apply -f <app-name>-api-service.yaml
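
Before wiring up the Ingress in the next step, you can sanity-check the Service by forwarding a local port to it, no external exposure required:

kubectl port-forward svc/<app-name>-api-service 3000:3000 -n <namespace>

# then, in another terminal:
curl http://localhost:3000/   # or whatever path your API actually answers on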

4. Ingress and Domain Pointing

So our site is all deployed. Now we just have to make it visible to the world.

In my case I needed to do two things to make this happen. First, I needed to set up the ClusterIssuer for my cluster. (Remember, this is a cert-manager resource, so cert-manager has to be installed in your cluster for it to do anything.)

certissuer-prod.yaml

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    email: <email-address>
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-prod-account-key
    solvers:
      - http01:
          ingress:
            class: traefik

Apply it:

kubectl apply -f certissuer-prod.yaml

Finally, we need to add the Ingress.

ingress.yaml

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: <app-name>-ingress
  namespace: <namespace>
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  rules:
  - host: api.<app-domain>.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: <app-name>-api-service
            port:
              number: 3000
  tls:
    - hosts:
        - api.<app-domain>.com
      secretName: api-<app-domain>-tls

Apply it:

kubectl apply -f ingress.yaml

Whew, we’re finally almost done. Just one more step: We need to point our domain to the new site. First, though, we need an IP address to point it to.

Run the following command:

kubectl get ingress -n <namespace>

Under the ADDRESS column, you should see an IP address. Point an A record at that thang and you’re in business!
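
If HTTPS doesn’t work immediately, don’t panic; cert-manager needs a little time to complete the Let’s Encrypt challenge once your DNS record resolves. You can check on its progress like this (the Certificate resource is typically named after the secretName from your Ingress):

kubectl get certificate -n <namespace>
kubectl describe certificate api-<app-domain>-tls -n <namespace>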

Some More Tips

That wasn’t so bad, was it?

Kubectl is also really easy to use once you get the hang of it. For the most part, you can get lists on any part of your cluster like this:

kubectl get <entity> -n <namespace>

where <entity> can be pods, configmaps, secrets, deployments, svc, or whatever. If you want to inspect something on that list, you can use describe:

kubectl describe <entity> <entity-name> -n <namespace>

# example
kubectl describe pod my-app-api-iwuqe8 -n my-app

When working with pods, you can view their logs like this:

kubectl logs <pod-name> -n <namespace>

Remember that pods are not the same as deployments; you’ll need to get your actual pod instance names from kubectl get pods to connect to the specific instances for logs, exec, and other pod-specific things.
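
Two log flags worth knowing: -f streams the log as it’s written, and --previous shows output from the last instance of a container that crashed or was replaced:

kubectl logs -f <pod-name> -n <namespace>
kubectl logs --previous <pod-name> -n <namespace>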

You can get a bash shell on a pod with the following command:

kubectl exec -it <pod-name> -n <namespace> -- bash

If necessary, you can also replace bash above with any other command that’s available on your pod.

If you want to make a small edit to a ConfigMap, you can do it like this:

kubectl edit configmap <configmap-name> -n <namespace>

You can use a similar syntax for other entities, if needed.

You’ll also occasionally find yourself needing to restart your Pods after you update a ConfigMap or Secret, or after pushing an update to your image (if you use the latest tag rather than versioning each release, for example) that you want reflected in your Pods. No problemo. You can patch the Deployment (not the individual Pods) to trigger a rolling restart, like this:

kubectl patch deployment <deployment-name> -n <namespace> \
  -p "{\"spec\":{\"template\":{\"metadata\":{\"annotations\":{\"configmap-reload-timestamp\":\"$(date +%s)\"}}}}}"
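
If your kubectl is reasonably recent, there’s also a dedicated command that triggers the same kind of rolling restart without the annotation trick:

kubectl rollout restart deployment/<deployment-name> -n <namespace>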

If you’ve pushed up a new tag version of your image and you’d like to patch your deployments to use it, you can use this command:

kubectl set image deployment/<app-name>-api api=docker.io/<docker-user>/<app-name>-api:<new-tag> -n <namespace>

If you’d like to update your configuration after changing any of your YAML files, you can also just reapply them using kubectl apply -f like you have been. Just make sure you keep track of any changes you’ve made via edits or patches, so you don’t accidentally apply an older version of your configuration.
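
If you’re ever unsure whether a local file has drifted from what’s actually running, kubectl can show you the delta before you commit to anything:

kubectl diff -f <app-name>-config.yaml   # prints the changes apply would make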

Whew!

Ok, that should be enough for now. Hopefully it wasn’t too overwhelming.

I’ve found that Kubernetes’ way of conceptualizing an application, once I internalized it, was actually easier than ECS. Each piece serves a particular purpose, and there isn’t a lot of unnecessary boilerplate needed just to get things set up. Hopefully you’ll have a similar experience.

Til next time.
