Leapcell

Learning Go Engineering Practices from K8s

Map Read and Write

In Kubernetes, many modifications are first written to a channel and then applied by a single goroutine. Keeping the actual mutation single-threaded avoids concurrency issues, and it also decouples production from consumption.

However, when all we need is to protect a map, going through a channel actually performs worse than locking directly. Let's verify this with a benchmark.

writeToMapWithMutex modifies the map under a lock, while writeToMapWithChannel writes entries to a channel that is consumed by another goroutine.

package map_modify

import (
  "sync"
)

const mapSize = 1000
const numIterations = 100000

func writeToMapWithMutex() {
  m := make(map[int]int)
  var mutex sync.Mutex

  for i := 0; i < numIterations; i++ {
    mutex.Lock()
    m[i%mapSize] = i
    mutex.Unlock()
  }
}

func writeToMapWithChannel() {
  m := make(map[int]int)
  ch := make(chan struct {
    key   int
    value int
  }, 256)

  var wg sync.WaitGroup
  wg.Add(1) // Add before starting the goroutine so wg.Wait cannot race with it
  go func() {
    for {
      entry, ok := <-ch
      if !ok {
        wg.Done()
        return
      }
      m[entry.key] = entry.value
    }
  }()

  for i := 0; i < numIterations; i++ {
    ch <- struct {
      key   int
      value int
    }{i % mapSize, i}
  }
  close(ch)
  wg.Wait()
}
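The benchmark wrappers are not shown in the snippet above; a minimal sketch, assuming both functions live in the map_modify package, could look like this:

package map_modify

import "testing"

func BenchmarkMutex(b *testing.B) {
  for i := 0; i < b.N; i++ {
    writeToMapWithMutex()
  }
}

func BenchmarkChannel(b *testing.B) {
  for i := 0; i < b.N; i++ {
    writeToMapWithChannel()
  }
}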

Benchmark test:

go test -bench .

goos: windows
goarch: amd64
pkg: golib/examples/map_modify
cpu: Intel(R) Core(TM) i7-9700 CPU @ 3.00GHz
BenchmarkMutex-8             532           2166059 ns/op
BenchmarkChannel-8           186           6409804 ns/op

We can see that locking the map directly is roughly three times faster than going through a channel here. So when the modification itself is simple, prefer sync.Mutex to guard against concurrent writes instead of introducing a channel.

Always Design for Concurrency

K8s makes extensive use of channels to pass signals, so its own processing is never blocked by unfinished work in upstream or downstream components. This not only improves execution efficiency, but also keeps retries small when errors occur, because the work is broken down into small, idempotent steps.

Events such as Pod deletion, addition, and update can all be handled concurrently; there is no need to finish one before starting the next. So when a Pod is added, you can register multiple listeners for distribution: once the event is written to a listener's channel, distribution is considered done, and the reliability of the follow-up work is the responsibility of that executor. This way the current event never blocks concurrent processing.

type listener struct {
  eventObjs chan eventObj
}

// watch
//
//  @Description: Listen to the content that needs to be handled
func (l *listener) watch() chan eventObj {
  return l.eventObjs
}

// Event object, you can define the content to be passed as you like
type eventObj struct{}

var (
  listeners = make([]*listener, 0)
)

func distribute(obj eventObj) {
  for _, l := range listeners {
    // Directly distribute the event object here
    l.eventObjs <- obj
  }
}
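On the consuming side, a listener might be created and drained like this (a sketch; newListener and handleEvent are hypothetical helpers not in the original):

func newListener() *listener {
  // Buffered so distribute is less likely to block on a slow consumer
  l := &listener{eventObjs: make(chan eventObj, 64)}
  listeners = append(listeners, l)
  return l
}

func consume(l *listener) {
  for obj := range l.watch() {
    handleEvent(obj) // hypothetical handler; retries and idempotency live here
  }
}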

DeltaFIFO’s Deduplication of Delete Actions

func dedupDeltas(deltas Deltas) Deltas {
  n := len(deltas)
  if n < 2 {
    return deltas
  }
  a := &deltas[n-1]
  b := &deltas[n-2]
  if out := isDup(a, b); out != nil {
    deltas[n-2] = *out
    return deltas[:n-1]
  }
  return deltas
}

func isDup(a, b *Delta) *Delta {
  // If both are delete operations, merge one away
  if out := isDeletionDup(a, b); out != nil {
    return out
  }
  return nil
}
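The isDeletionDup helper is not shown above; in client-go it keeps only one of two consecutive deletions, preferring the delta with the real object over a DeletedFinalStateUnknown tombstone. A simplified sketch (Delta, Deleted, and DeletedFinalStateUnknown are client-go types omitted here):

func isDeletionDup(a, b *Delta) *Delta {
  if b.Type != Deleted || a.Type != Deleted {
    return nil
  }
  // Keep the delta that carries the real object rather than the tombstone
  if _, ok := b.Object.(DeletedFinalStateUnknown); ok {
    return a
  }
  return b
}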

Here, we can see that since events are managed by a separate queue, we can add deduplication logic to the queue individually.

Because the component is encapsulated within the package, external users do not see the internal complexity—they only need to continue handling subsequent events. The removal of one deletion event does not impact the overall logic.

Orthogonal Design Between Components

What is orthogonal design? It means that what each component does is independent from the others, and they can be freely composed without interdependence. For example, kube-scheduler is only responsible for assigning specific nodes to Pods, and after assigning, it does not directly pass the result to kubelet for operation. Instead, it stores the assignment in etcd through the api-server. In this way, it only depends on the api-server to deliver tasks.

Kubelet, in turn, listens only to tasks delivered by the api-server. It can therefore carry out the assignments made by kube-scheduler as well as handle Pod-deletion requests coming through the api-server. Because each component depends only on the api-server, the capabilities they provide independently multiply into what they can accomplish together. A purely illustrative sketch of this shape, where both sides depend only on a shared store interface, is shown below (none of these names are real Kubernetes APIs):
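
// Store stands in for the api-server: the only dependency both sides share.
type Store interface {
  SaveBinding(pod, node string) error
  WatchBindings(node string) <-chan string // pod names assigned to this node
}

// The scheduler only writes its decision to the store.
func schedule(s Store, pod, node string) error {
  return s.SaveBinding(pod, node)
}

// The kubelet only reacts to what the store tells it, regardless of who wrote it.
func runKubelet(s Store, node string) {
  for pod := range s.WatchBindings(node) {
    startPod(pod) // hypothetical helper that actually launches the Pod
  }
}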

Implementation of Timers

To trigger a task periodically, you can first expose an HTTP interface that runs the task logic, and then use a CronJob with the curl image to call it on a schedule.
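
A minimal handler sketch for the endpoint the CronJob below calls (the /api/test path and the params field simply mirror the curl command; adjust both to your service):

package main

import (
  "encoding/json"
  "log"
  "net/http"
)

func main() {
  http.HandleFunc("/api/test", func(w http.ResponseWriter, r *http.Request) {
    var req struct {
      Params int `json:"params"`
    }
    if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
      http.Error(w, err.Error(), http.StatusBadRequest)
      return
    }
    // Run the scheduled task logic here using req.Params.
    w.WriteHeader(http.StatusOK)
  })
  log.Fatal(http.ListenAndServe(":8080", nil))
}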

apiVersion: batch/v1
kind: CronJob
metadata:
  name: task
spec:
  schedule: '0 10 * * *'
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: task-curl
              image: curlimages/curl
              resources:
                limits:
                  cpu: '200m'
                  memory: '512Mi'
                requests:
                  cpu: '100m'
                  memory: '256Mi'
              args:
                - /bin/sh
                - -c
                - |
                  echo "Starting create task of CronJob"
resp=$(curl -H "Content-Type: application/json" -v -i -d '{"params": 1000}' http://service-name:port/api/test)
                  echo "$resp"
                  exit 0
          restartPolicy: Never
  successfulJobsHistoryLimit: 2
  failedJobsHistoryLimit: 3

Abstracting the Underlying Code

Kubernetes follows this approach in the design of CNI (Container Network Interface): it defines a set of rules that network plugins must follow. The purpose of CNI is to decouple network configuration from the container platform, so on a different platform you only need a different network plugin, while everything else about running containers stays the same. The platform only needs to know that a container was created and to hand the CNI plugin the configuration agreed upon in the specification; the plugin handles the rest of the networking.

Can we design pluggable components like CNI in business implementation?

Of course, we can. In business development the most common example is the database, which should be a detail the business logic uses only indirectly. Business logic does not need to know the table structure, the query language, or any other internal detail of the database; all it needs to know is that there are functions for querying and saving data. The database can then be hidden behind an interface.

If we need a different underlying database, we only have to change the database initialization in code. GORM has abstracted most of the driver logic for us, so at initialization we simply plug in a different driver and DSN, and the driver translates the statements we execute.
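
A small sketch of what that swap might look like with GORM's official drivers (the openDB wrapper is our own, not part of GORM):

import (
  "fmt"

  "gorm.io/driver/mysql"
  "gorm.io/driver/postgres"
  "gorm.io/gorm"
)

// openDB hides the concrete database behind *gorm.DB; callers never see the driver.
func openDB(dialect, dsn string) (*gorm.DB, error) {
  switch dialect {
  case "mysql":
    return gorm.Open(mysql.Open(dsn), &gorm.Config{})
  case "postgres":
    return gorm.Open(postgres.Open(dsn), &gorm.Config{})
  default:
    return nil, fmt.Errorf("unsupported dialect %q", dialect)
  }
}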

A good architecture should be organized around use cases, so that it can describe the use case completely, independent of the framework, tools, or runtime environment.

This is similar to how the primary goal of residential building design should be to meet the needs of habitation, not to insist on building the house with bricks. An architect should spend considerable effort ensuring the architecture allows users as much freedom as possible to choose building materials while still meeting their needs.

Because of the abstraction layer between gorm and the actual database, to implement user registration and login, you do not need to care whether the underlying database is MySQL or Postgres. You just describe that after user registration, user information will be stored, and for login, the corresponding password needs to be checked. Then, based on system reliability and performance requirements, you can flexibly choose the components to use during implementation.
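
For instance, the use case can be written against a narrow interface, with the GORM-backed implementation supplied later (all names here are illustrative):

// UserStore is the only thing the business logic knows about persistence.
type UserStore interface {
  Save(u User) error
  FindByName(name string) (User, error)
}

type User struct {
  Name         string
  PasswordHash string
}

// Register and Login describe the use case; they work with MySQL, Postgres,
// or an in-memory fake, as long as it satisfies UserStore.
func Register(s UserStore, name, hash string) error {
  return s.Save(User{Name: name, PasswordHash: hash})
}

func Login(s UserStore, name, hash string) (bool, error) {
  u, err := s.FindByName(name)
  if err != nil {
    return false, err
  }
  return u.PasswordHash == hash, nil
}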

Avoid Overengineering

Overengineering is often worse than insufficient engineering design.

In Kubernetes 0.4, one of its earliest releases, the official networking implementation used salt scripts on GCE to create bridges, while for other environments the recommended solutions were Flannel and OVS.

As Kubernetes evolved, Flannel became inadequate in some cases. Around 2015, Calico and Weave emerged in the community, which basically solved the network issues, so Kubernetes no longer needed to spend its own efforts on this. Therefore, CNI was introduced to standardize network plugins.

As we can see, Kubernetes was not designed perfectly from the beginning. Instead, as more problems appeared, new designs were continuously introduced to adapt to the changing environments.

Scheduler Framework

In kube-scheduler, the framework exposes extension points where plugins can be hooked in later. For example, to add a node-scoring plugin you only need to implement the ScorePlugin interface and register it into the framework's scorePlugins slice via the Registry. The result the scheduler returns is wrapped in a Status, which carries the error, the code, and the name of the plugin that caused the error.

Without such fixed extension points, the execution logic ends up scattered: there is no single place to hook new behavior in, so logic tends to get bolted on wherever it happens to fit.

With the framework abstraction, you only need to know which stage you want to add logic to, write the plugin, and register it. This also makes individual components easier to test, standardizes how each component is developed, and when reading the source you only have to look at the part you want to modify or understand.

Here is a simplified code example:

// ScorePlugin is the extension-point interface a scoring plugin implements.
type ScorePlugin interface {
  Score(node string, pod string) int
}

type Framework struct {
  sync.Mutex
  scorePlugins []ScorePlugin
}

func (f *Framework) RegisterScorePlugin(plugin ScorePlugin) {
  f.Lock()
  defer f.Unlock()
  f.scorePlugins = append(f.scorePlugins, plugin)
}

func (f *Framework) runScorePlugins(node string, pod string) int {
  var score int
  for _, plugin := range f.scorePlugins {
    score += plugin.Score(node, pod) // Here, if plugins have different weights, you can multiply by a weight
  }
  return score
}

This centralized approach also makes it easier to add unified handling logic for similar components. For example, scoring plugins can calculate scores for multiple nodes simultaneously, without having to wait for each node to finish one by one.

type Parallelizer struct {
  Concurrency int
  ch          chan struct{}
}

func NewParallelizer(concurrency int) *Parallelizer {
  return &Parallelizer{
    Concurrency: concurrency,
    ch:          make(chan struct{}, concurrency),
  }
}

type DoWorkerPieceFunc func(piece int)

func (p *Parallelizer) Until(pieces int, f DoWorkerPieceFunc) {
  wg := sync.WaitGroup{}
  for i := 0; i < pieces; i++ {
    p.ch <- struct{}{}
    wg.Add(1)
    go func(i int) {
      defer func() {
        <-p.ch
        wg.Done()
      }()
      f(i)
    }(i)
  }
  wg.Wait()
}

You can use a closure to pass the calculation component’s information, then let the Parallelizer execute them concurrently.

func (f *Framework) RunScorePlugins(nodes []string, pod *Pod) map[string]int {
  scores := make(map[string]int)
  var mu sync.Mutex // Until runs the closure in many goroutines, so guard the map writes
  p := concurrency.NewParallelizer(16)
  p.Until(len(nodes), func(i int) {
    score := f.runScorePlugins(nodes[i], pod.Name)
    mu.Lock()
    scores[nodes[i]] = score
    mu.Unlock()
  })
  // Node binding logic omitted
  return scores
}

This programming paradigm can be applied very well in business scenarios. For example, after recall in recommendation results, you often need to filter and sort through various strategies.

When the strategy orchestration is updated you need hot-reloading, and the data the filters depend on may change as well: blacklists, purchase records, product status, and so on. Tasks that are already running should keep using the old filter chain, while new tasks pick up the new rules.

type Item struct{}

type Filter interface {
  DoFilter(items []Item) []Item
}

// ConstructorFilters
//
//  @Description: A new Filter is constructed each time, and if the cache changes, it is updated. New tasks will use the new filter chain.
//  @return []Filter
func ConstructorFilters() []Filter {
  // The filter strategies here can be read from the config file and then initialized
  return []Filter{
    &BlackFilter{}, // If internal logic changes, it can be updated via the constructor
    &AlreadyBuyFilter{},
  }
}

func RunFilters(items []Item, fs []Filter) []Item {
  for _, f := range fs {
    items = f.DoFilter(items)
  }
  return items
}
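Each new task then simply rebuilds its chain before running, so in-flight tasks keep the chain they started with:

func processTask(items []Item) []Item {
  fs := ConstructorFilters() // snapshot of the current rules for this task
  return RunFilters(items, fs)
}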

Splitting Services Does Not Equal Architecture Design

Splitting services actually just changes coupling between services from code-level coupling to data-level coupling. For example, if a downstream service needs a field modified, the upstream pipeline also needs to handle that field. This is only a localized isolation. But if you don't split services, you can still achieve similar isolation by layering your code—using function inputs and outputs to achieve effects similar to service splitting.

Service splitting is just one way of dividing up a system program, and the service boundary is not the system boundary. The service boundary is closer to a component boundary: a single service can contain multiple types of components, the k8s api-server being one example.

Or, in a recommendation system, both recommendation tasks and recommendation lists may reside in one component. Recommendation tasks can be of many types: pushing products to a group of people, pushing a batch of products to specific users, ad delivery, etc. These are different kinds of recommendation tasks, but they are abstracted within one component. Downstream users of this data don't know what rules were used internally to generate it—they just perceive that a certain product was recommended to a user. This is the abstraction of upstream and downstream boundaries. Changes to logic inside the recommendation task component don't affect the data consumed by downstream services, which always see (user, item) pairs. The recommendation service logic is thus a component, and after being deployed independently, it can be used by both upstream and downstream.
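
The boundary the downstream sees can be captured in a couple of illustrative types (not taken from any real codebase):

// Downstream consumers only ever see (user, item) pairs,
// never the rules inside the recommendation component that produced them.
type Recommendation struct {
  UserID string
  ItemID string
}

type RecommendationSource interface {
  // Fetch returns recommendations regardless of whether they came from
  // audience pushes, per-user pushes, or ad delivery.
  Fetch(userID string) ([]Recommendation, error)
}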

Main Function Startup

You can use cobra to build a structured command:

kubelet --help

Using this command, you can see the optional parameters for the CLI tool.

If your application is a web server, you can change startup behavior—such as which port to listen on or which configuration file to use—by passing parameters.

If your program is a CLI tool, you can expose parameters more flexibly, allowing users to decide the behavior of the command themselves.
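
A minimal cobra sketch for a web server whose listen port and config file are exposed as flags (the command and flag names are illustrative):

package main

import (
  "fmt"
  "os"

  "github.com/spf13/cobra"
)

func main() {
  var port int
  var configFile string

  rootCmd := &cobra.Command{
    Use:   "myserver",
    Short: "Example web server with configurable startup behavior",
    RunE: func(cmd *cobra.Command, args []string) error {
      fmt.Printf("starting on port %d with config %s\n", port, configFile)
      // Start the HTTP server here.
      return nil
    },
  }
  rootCmd.Flags().IntVar(&port, "port", 8080, "port to listen on")
  rootCmd.Flags().StringVar(&configFile, "config", "config.yaml", "path to the configuration file")

  if err := rootCmd.Execute(); err != nil {
    os.Exit(1)
  }
}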


We are Leapcell, your top choice for hosting Go projects.

Leapcell

Leapcell is the Next-Gen Serverless Platform for Web Hosting, Async Tasks, and Redis:

Multi-Language Support

  • Develop with Node.js, Python, Go, or Rust.

Deploy unlimited projects for free

  • pay only for usage — no requests, no charges.

Unbeatable Cost Efficiency

  • Pay-as-you-go with no idle charges.
  • Example: $25 supports 6.94M requests at a 60ms average response time.

Streamlined Developer Experience

  • Intuitive UI for effortless setup.
  • Fully automated CI/CD pipelines and GitOps integration.
  • Real-time metrics and logging for actionable insights.

Effortless Scalability and High Performance

  • Auto-scaling to handle high concurrency with ease.
  • Zero operational overhead — just focus on building.

Explore more in the Documentation!

Try Leapcell

Follow us on X: @LeapcellHQ


Read on our blog

Top comments (1)

Galloway Developer

Nice writeup! For efficient concurrent map access in Go, consider using sync.Map for cases with lots of concurrent reads and writes—it handles some patterns without explicit locking.