Leapcell

Posted on Jun 3

Mastering Go Error Handling: A Practical Guide

#webdev #programming #backend #go

In Go, an error is just a value, and error handling amounts to making decisions based on that value.

Ignore an error only when the business logic genuinely calls for it; handle every other error.

In theory, this design makes programmers consciously handle every error, resulting in more robust programs.

In this article, let's talk about best practices for handling errors properly.

TL;DR

Only ignore errors when business logic requires it; otherwise, handle every error.
Use github.com/pkg/errors to capture stack information when errors are created, print error details precisely, and use a trace_id in distributed systems to link the logs of a single request.
Handle each error only once; logging and fallback mechanisms both count as handling.
Keep error abstraction levels consistent: do not return errors that belong to a layer above the current module.
Reduce the frequency of if err != nil through top-level design.

Accurate Error Logging

Error logs are an important troubleshooting tool, so the logs we print must be unambiguous. How can we get a stack trace out of err to help us track down issues?

Ask yourself regularly: can the errors logged this way really help with troubleshooting?

If we cannot pinpoint the error by looking at the log, it is equivalent to not logging the error at all.

The package github.com/pkg/errors provides us with a wrapper that carries the stack.

func callers() *stack {
 const depth = 32
 var pcs [depth]uintptr
 n := runtime.Callers(3, pcs[:])
 var st stack = pcs[0:n]
 return &st
}

func New(message string) error {
 return &fundamental{
  msg:   message,
  stack: callers(),
 }
}

The stack can be printed because fundamental implements the fmt.Formatter interface.

Then, fmt.Printf("%+v", err) can print the corresponding stack information.

func (f *fundamental) Format(s fmt.State, verb rune) {
 switch verb {
 case 'v':
  if s.Flag('+') {
   io.WriteString(s, f.msg)
   f.stack.Format(s, verb)
   return
  }
  fallthrough
 case 's':
  io.WriteString(s, f.msg)
 case 'q':
  fmt.Fprintf(s, "%q", f.msg)
 }
}

Let's look at a specific example:

func foo() error {
 return errors.New("something went wrong")
}

func bar() error {
 return foo() // Pass the error through; the stack was already captured by errors.New in foo
}

Here, foo calls errors.New to create an error, and bar adds another layer to the call chain.

Next, let's write a test to print out our error:

func TestBar(t *testing.T) {
 err := bar()
 fmt.Printf("err: %+v\n", err)
}

The printed output includes the stack frames of both foo and bar, because the stack is captured at the moment errors.New runs inside foo.

err: something went wrong
golib/examples/writer_good_code/exception.foo
 E:/project/github/go-lib-new/go-lib/examples/writer_good_code/exception/err.go:8
golib/examples/writer_good_code/exception.bar
 E:/project/github/go-lib-new/go-lib/examples/writer_good_code/exception/err.go:12
...

You can see that the first line accurately prints our error message, and the first stack trace points to where the error was generated.

With these two pieces of error information, we can see both the error message and the origin of the error.

Error Tracing in Distributed Systems

We can now print errors accurately on a single machine, but real-world programs usually handle many requests concurrently. How can we tell which error stacks belong to the same request? This requires a trace_id.

You can generate a trace_id based on your own requirements and preferred format, and set it in the context.

func CtxWithTraceId(ctx context.Context, traceId string) context.Context {
 ctx = context.WithValue(ctx, TraceIDKey, traceId)
 return ctx
}

When logging, you can use CtxTraceID to retrieve the traceId.

func CtxTraceID(c context.Context) string {
 if gc, ok := c.(*gin.Context); ok {
  // Get trace id from gin's request ...
 }

 // Get the trace ID from the Go context; the comma-ok assertion avoids
 // a panic if a non-string value was stored under the key
 if traceID, ok := c.Value(TraceIDKey).(string); ok {
  return traceID
 }

 // Generate one if trace id is missing
 return TraceIDPrefix + xid.New().String()
}

By adding a traceID to logs, you can pull the entire chain of error logs for a request, which greatly reduces the difficulty of searching logs during troubleshooting.

Errors Should Be Handled Only Once

An error should only be handled once, and logging is also considered a way of handling an error.

If logging at one place can't solve the problem, the proper approach is to add more context information to the error, making it clear where in the program the problem occurred, instead of handling the error multiple times.

Initially, I misunderstood how to use error logs: I thought I should print error-level logs for certain business errors, such as:

User account/password incorrect
User SMS verification code error

These are caused by bad user input rather than by faults in our program, so we do not need to treat them as errors.

What we really need to care about are errors caused by bugs in the program.

For normal business logic errors, there’s actually no need to output error-level logs; too much error noise will only obscure real issues.

Fallback for Errors

With the methods above, we can now accurately print errors in real projects to help with troubleshooting. However, often we want our program to "self-heal" and resolve errors adaptively.

A common example: after failing to retrieve data from the cache, we should fall back to the source database.

func GetUser() (*User, error) {
 user, err := getUserFromCache()
 if err == nil {
  return user, nil
 }

 user, err = getUserFromDB()
 if err != nil {
  return nil, err
 }
 return user, nil
}

Or, after a transaction fails, we want to initiate a compensation mechanism. For example, after an order is completed, we want to send an in-site message to the user.

We may fail to send the message synchronously, but in this scenario, the real-time requirement is not particularly high, so we can retry sending the message asynchronously.

func CompleteOrder(orderID string) error {
 // Other logic to complete the order...
 message := Message{}
 err := sendUserMessage(message)
 if err != nil {
  asyncRetrySendUserMessage(message)
 }
 return nil
}

Intentionally Ignoring Errors

If an API can avoid returning errors, then callers do not need to spend effort handling errors. So if an error is generated but does not require the caller to take extra action, we don't need to return an error—just let the code execute as normal.

This is the "Null Object Pattern." Originally, we should return an error or a nil object, but to save callers from error handling, we can return an empty struct, thus skipping error handling logic in the caller.

With event handlers, for instance, an event that has no registered handler can be given a null object instead of producing an error.

For example, in a notification system, we might not want to send an actual notification. In this case, we can use a null object to avoid nil checks in the notification logic.

// Define Notifier interface
type Notifier interface {
    Notify(message string)
}

// Concrete implementation of EmailNotifier
type EmailNotifier struct{}

func (n *EmailNotifier) Notify(message string) {
    fmt.Printf("Sending email notification: %s\n", message)
}

// Null notification implementation
type NullNotifier struct{}

func (n *NullNotifier) Notify(message string) {
    // Null implementation, does nothing
}
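
A short usage sketch shows how the caller benefits. The types are redeclared here (with value receivers) so the example runs on its own, and notifierFor is a hypothetical factory function.

```go
package main

import "fmt"

// Notifier is the interface from the example above.
type Notifier interface {
	Notify(message string)
}

// EmailNotifier sends a real notification (simulated with a print).
type EmailNotifier struct{}

func (EmailNotifier) Notify(message string) {
	fmt.Printf("Sending email notification: %s\n", message)
}

// NullNotifier does nothing.
type NullNotifier struct{}

func (NullNotifier) Notify(string) {}

// notifierFor is a hypothetical factory: it never returns nil, so callers
// do not need a nil check or an error check before calling Notify.
func notifierFor(enabled bool) Notifier {
	if enabled {
		return EmailNotifier{}
	}
	return NullNotifier{}
}

func main() {
	for _, enabled := range []bool{true, false} {
		// The calling code is identical in both cases.
		notifierFor(enabled).Notify("your order has shipped")
	}
}
```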

If a method returns an error but, as a caller, we don't want to handle it, the best practice is to use _ to receive the error. This way, other developers will not be confused about whether the error was forgotten or intentionally ignored.

func f() {
  // ...
  _ = notify()
}

func notify() error {
  // ...
}

Wrapping Custom Errors

Transparent errors can reduce the coupling between error handling and error value construction, but they do not allow errors to be used effectively in logic handling.

Handling logic based on errors makes the error a part of the API.

We may need to show different error messages to upper layers depending on the error.

Since Go 1.13, errors.Is is the recommended way to compare an error against a specific value, because it also matches errors that have been wrapped.

Error types can be checked with errors.As, but this means public APIs still need careful error maintenance.

So, is there a way to reduce the coupling introduced by this error handling style?

You can extract error traits into a unified interface, and callers can type-assert errors to this interface to make decisions. The net package handles errors in this way.

type Error interface {
 error
 Timeout() bool // Is the error a timeout?
}

net.OpError implements the Timeout method, so callers can determine whether an error is a timeout and apply the appropriate business logic. (The real net.Error interface also declares a Temporary method, which is now deprecated.)

Error Abstraction Levels

Avoid returning errors whose abstraction level is higher than that of the current module. For example, when fetching data in the DAO layer, if the database fails to find a record, it’s appropriate to return a RecordNotFound error. However, returning an APIError directly from the DAO layer just to save the upper layer from converting errors would not be appropriate.

Similarly, lower-level abstract errors should be wrapped to match the current layer’s abstraction. After wrapping at the upper layer, if the lower layer needs to change how it handles errors, it won’t affect the upper layer.

For example, user login might initially use MySQL as storage. If there’s no match, the error would be “record not found.” Later, if you use Redis to match users, a failed match would be a cache miss. In this case, you don’t want the upper layer to feel the difference in the underlying storage, so you should consistently return a “user not found” error.

Reducing `err != nil`

The frequency of if err != nil can be reduced through top-level design. Some error handling can be encapsulated at lower levels, and there’s no need to expose them to upper layers.

By reducing the cyclomatic complexity of functions, you can reduce the number of repeated checks for if err != nil. For example, by encapsulating function logic, the outer layer only needs to handle the error once.

func CreateUser(user *User) error {
 // Validate in one place and return a single error instead of scattering checks
 if err := ValidateUser(user); err != nil {
  return err
 }

 // Create the user ...
 return nil
}
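
ValidateUser is not shown above. One way it could collect several checks behind a single return value is errors.Join, which requires Go 1.20 or later. The User fields and the validation rules below are assumptions made for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// User is a hypothetical model with two fields to validate.
type User struct {
	Name string
	Age  int
}

// ValidateUser runs every check and returns all failures as one error.
// errors.Join returns nil when there are no failures.
func ValidateUser(u *User) error {
	if u == nil {
		return errors.New("user is nil")
	}
	var errs []error
	if u.Name == "" {
		errs = append(errs, errors.New("name is required"))
	}
	if u.Age < 0 {
		errs = append(errs, errors.New("age must not be negative"))
	}
	return errors.Join(errs...)
}

// CreateUser needs only one error check for all validation rules.
func CreateUser(u *User) error {
	if err := ValidateUser(u); err != nil {
		return fmt.Errorf("create user: %w", err)
	}
	// Persisting the user is omitted in this sketch.
	return nil
}

func main() {
	fmt.Println(CreateUser(&User{Name: "gopher", Age: 3}))
	fmt.Println(CreateUser(&User{Age: -1}))
}
```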

You can also embed the error status in a struct, encapsulate errors inside the struct, and only return the error at the end if something went wrong, allowing the outer layer to handle it uniformly. This avoids inserting multiple if err != nil checks in business logic.

Let’s take a data copy task as an example. You pass in the source and destination configuration, then perform the copy.

type CopyDataJob struct {
 source      *DataSourceConfig
 destination *DataSourceConfig

 err error
}

func (job *CopyDataJob) newSrc() {
 if job.err != nil {
  return
 }

 if job.source == nil {
  job.err = errors.New("source is nil")
  return
 }

 // Instantiate source
}

func (job *CopyDataJob) newDst() {
 if job.err != nil {
  return
 }

 if job.destination == nil {
  job.err = errors.New("destination is nil")
  return
 }

 // Instantiate destination
}

func (job *CopyDataJob) copy() {
 if job.err != nil {
  return
 }

 // Copy data ...
}

func (job *CopyDataJob) Run() error {
 job.newSrc()
 job.newDst()

 job.copy()

 return job.err
}

You can see that once an error occurs, each step in Run will still be executed, but each function checks err immediately and returns if it’s set. Only at the end does job.err get returned to the caller.

Although this approach can reduce the number of err != nil in the main logic, it actually just spreads out the checks and doesn’t truly reduce them. Therefore, in real development, this trick is rarely used.
