Gracefully Restarting a Go Program Without Downtime

Jun 8, 2018 by Russell Jones

Introduction to Graceful Restarts

Being able to deploy a new version of your application or change its configuration in place, without downtime, has become table stakes for modern systems software. This post discusses the different approaches to gracefully restarting an application and includes a functional standalone sample for digging into the details. For those not familiar with Teleport, it’s our SSH and Kubernetes privileged access management solution, written in Go and designed for elastic infrastructure. This post should be of interest to developers and SREs who build and maintain services written in Go.

Background on SO_REUSEPORT vs Duplicating Sockets

Continuing our work on making Teleport highly available, we recently spent some time investigating how to gracefully restart Teleport’s TLS and SSH listeners in GitHub issue #1679. Our goal was to be able to upgrade a Teleport binary without having to take an instance out of service.

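The two common approaches are covered by Marek Majkowski in his blog post “Why does one NGINX worker take all the load?”. To recap, the approaches are:

1. Set SO_REUSEPORT on the socket so that multiple processes can bind to the same port. With this approach each process has its own accept queue.
2. Duplicate the listening socket, pass it to a child process as a file, and re-create the listener in the new process. With this approach a single accept queue feeds all processes.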

During our initial discussions, several drawbacks of SO_REUSEPORT came up. One of our engineers had previously used that approach and noticed that, because each process has its own accept queue, pending TCP connections could sometimes be dropped. In addition, at the time of these discussions Go did not make it easy to set SO_REUSEPORT on a net.Listener. However, it should be noted that within the past few days there has been movement on this issue, and it looks like Go will support setting socket options soon.
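
For comparison, Go 1.11 and later provide net.ListenConfig, whose Control hook runs before bind and can set socket options. The following is a minimal sketch of the SO_REUSEPORT approach under that assumption; it is not what Teleport does. The names listenReusePort and soReusePort are made up for illustration, and the constant uses the Linux value of SO_REUSEPORT, so the sketch is Linux-only.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"syscall"
)

// soReusePort is the value of SO_REUSEPORT on most Linux architectures.
// Assumption: spelled out locally so the sketch does not depend on the
// constant being exported by the syscall package on every platform.
const soReusePort = 0xf

// listenReusePort opens a TCP listener with SO_REUSEPORT set before bind, so
// that several sockets (in one or more processes) can share the port.
func listenReusePort(addr string) (net.Listener, error) {
	lc := net.ListenConfig{
		Control: func(network, address string, c syscall.RawConn) error {
			var sockErr error
			err := c.Control(func(fd uintptr) {
				sockErr = syscall.SetsockoptInt(int(fd), syscall.SOL_SOCKET, soReusePort, 1)
			})
			if err != nil {
				return err
			}
			return sockErr
		},
	}
	return lc.Listen(context.Background(), "tcp", addr)
}

func main() {
	first, err := listenReusePort("127.0.0.1:0")
	if err != nil {
		fmt.Printf("Unable to listen: %v.\n", err)
		return
	}
	defer first.Close()

	// Bind a second, independent socket to the port the first one was given.
	second, err := listenReusePort(first.Addr().String())
	if err != nil {
		fmt.Printf("Unable to share the port: %v.\n", err)
		return
	}
	defer second.Close()

	fmt.Printf("Two sockets are bound to %v.\n", second.Addr())
}
```

Each of the two listeners here is a separate kernel socket with its own accept queue, which is the property behind the dropped-connection concern mentioned above.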

The second approach was also appealing because of its simplicity and because it follows the traditional Unix fork/exec spawning model that most developers are familiar with, in which open files are passed on to the child process. One thing to note: Go does not pass all open files to a child by default. Most likely for safety reasons, the os/exec package passes only stdin, stdout, and stderr unless further files are listed explicitly in Cmd.ExtraFiles. The os package has a lower-level primitive, os.StartProcess, that can be used to pass a file to a child process, and that’s what we do.
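
To make the file-passing mechanism concrete, here is a small sketch that hands a pipe to a child process through os.StartProcess. It is an illustration only: the helper name echoThroughChild is hypothetical, and the child is assumed to be /bin/sh running cat, which simply copies whatever arrives on descriptor 3 to its stdout.

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// echoThroughChild hands the read end of a pipe to a child process as file
// descriptor 3 and returns whatever the child copies from it to its stdout.
func echoThroughChild(msg string) (string, error) {
	inR, inW, err := os.Pipe()
	if err != nil {
		return "", err
	}
	outR, outW, err := os.Pipe()
	if err != nil {
		inR.Close()
		inW.Close()
		return "", err
	}
	defer outR.Close()

	// The index in Files is the descriptor number the child sees:
	// 0 = stdin (nil means closed), 1 = stdout, 2 = stderr, 3 = the extra file.
	p, err := os.StartProcess("/bin/sh", []string{"sh", "-c", "cat <&3"}, &os.ProcAttr{
		Files: []*os.File{nil, outW, os.Stderr, inR},
	})
	// The parent no longer needs the pipe ends that belong to the child.
	inR.Close()
	outW.Close()
	if err != nil {
		inW.Close()
		return "", err
	}

	// Write the message and close the pipe so the child sees end of file.
	_, writeErr := io.WriteString(inW, msg)
	inW.Close()

	out, readErr := io.ReadAll(outR)
	if _, err := p.Wait(); err != nil {
		return "", err
	}
	if writeErr != nil {
		return "", writeErr
	}
	if readErr != nil {
		return "", readErr
	}
	return string(out), nil
}

func main() {
	out, err := echoThroughChild("passed via descriptor 3\n")
	if err != nil {
		fmt.Printf("Unable to run child: %v.\n", err)
		return
	}
	fmt.Print(out)
}
```

The position of a file in ProcAttr.Files determines its descriptor number in the child, which is why the sample at the bottom of this post can tell the child to look for the listener at descriptor 3.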

Using Signals to Switch Socket Process Owner

Before getting into the source, it’s worthwhile providing some more details on how this approach works.

Starting a fresh Teleport process causes it to create a listener socket that receives all inbound network traffic on the assigned ports. For Teleport this is TLS and SSH traffic. We added a signal handler for SIGUSR2 that causes Teleport to duplicate the listener socket and then spawn a new process, which is passed both the listener socket (as a file) and metadata about the socket in its environment variables. Once the new process starts, it re-creates the listener socket from the passed-in file and metadata and starts processing traffic as it arrives.

It should be noted that the duplicated descriptors refer to the same underlying socket and accept queue, so inbound connections are shared between the two processes (how they are distributed is up to the kernel). As can be seen in Figure 1, this means that for a period of time both Teleport processes will be accepting new connections.

Figure 1: Teleport can duplicate itself and share traffic with the duplicate process.

Shutdown of the parent process is the same affair in reverse. Once a Teleport process receives SIGQUIT, it begins the shutdown process: it stops accepting new connections and waits for all existing connections to drain or for a timeout to occur. Once traffic has cleared, the dying parent closes its copy of the listener socket and exits. From then on the kernel only sends traffic to the new process.
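
The drain-with-timeout behaviour described above maps onto http.Server.Shutdown from the standard library. The following sketch assumes an HTTP service like the sample at the end of this post. The names drainAndStop and requestDuringShutdown are hypothetical, and the 200 millisecond handler delay only simulates a request that is still in flight when shutdown begins.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net"
	"net/http"
	"time"
)

// drainAndStop stops accepting new connections and waits up to timeout for
// in-flight requests to finish.
func drainAndStop(srv *http.Server, timeout time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	return srv.Shutdown(ctx)
}

// requestDuringShutdown starts a server, issues one slow request, begins the
// shutdown while that request is in flight, and returns the response body.
func requestDuringShutdown() (string, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}

	started := make(chan struct{})
	mux := http.NewServeMux()
	mux.HandleFunc("/slow", func(w http.ResponseWriter, r *http.Request) {
		close(started) // the request is now in flight
		time.Sleep(200 * time.Millisecond)
		fmt.Fprint(w, "done")
	})
	srv := &http.Server{Handler: mux}
	go srv.Serve(ln)

	type result struct {
		body string
		err  error
	}
	resCh := make(chan result, 1)
	go func() {
		resp, err := http.Get("http://" + ln.Addr().String() + "/slow")
		if err != nil {
			resCh <- result{err: err}
			return
		}
		defer resp.Body.Close()
		body, err := io.ReadAll(resp.Body)
		resCh <- result{string(body), err}
	}()

	// Wait until the handler is running, or bail out if the request failed.
	select {
	case <-started:
	case res := <-resCh:
		return res.body, res.err
	}

	if err := drainAndStop(srv, 5*time.Second); err != nil {
		return "", err
	}
	res := <-resCh
	return res.body, res.err
}

func main() {
	body, err := requestDuringShutdown()
	if err != nil {
		fmt.Printf("Request failed: %v.\n", err)
		return
	}
	fmt.Printf("In-flight request completed with body %q.\n", body)
}
```

If the timeout expires before all connections are idle, Shutdown returns the context’s error, and the caller has to decide whether to exit anyway.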

Figure 2: Once the first process shuts down, all traffic is no longer duplicated.

Sample Graceful Restart Walk-through

We wrote a small application that uses this approach. The source code is at the bottom of this post, and you can try it yourself with the following commands.

First build and start the program.

$ go build restart.go
$ ./restart &
[1] 95147
$ Created listener file descriptor for :8080.

$ curl http://localhost:8080/hello
Hello from 95147!

Send the USR2 signal to the original process. Now when you hit the HTTP endpoint, responses come back from two different PIDs.

$ kill -SIGUSR2 95147
user defined signal 2 signal received.
Forked child 95170.
$ Imported listener file descriptor for :8080.

$ curl http://localhost:8080/hello
Hello from 95170!
$ curl http://localhost:8080/hello
Hello from 95147!

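Kill the original process and you will only get responses from the new PID. Note that the sample only handles SIGHUP, SIGUSR2, SIGINT, and SIGQUIT. SIGTERM, used here, is not handled, so it stops the process immediately rather than draining connections; send SIGQUIT instead to exercise the graceful shutdown path.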

$ kill -SIGTERM 95147
signal: killed
[1]+  Exit 1                  go run restart.go
$ curl http://localhost:8080/hello
Hello from 95170!
$ curl http://localhost:8080/hello
Hello from 95170!

Finally kill the new process and nothing will respond.

$ kill -SIGTERM 95170
$ curl http://localhost:8080/hello
curl: (7) Failed to connect to localhost port 8080: Connection refused

Conclusion and Sample Source Code

As you have seen, once you understand how it works, adding graceful restarts to services written in Go is fairly easy and greatly improves the experience of the consumers of your services. If you’d like to see this in action with Teleport, we invite you to take a peek at our reference AWS SSH and Kubernetes bastion deployment, which includes an Ansible script that leverages in-place graceful restarts to upgrade Teleport binaries without downtime.

Golang Graceful Restart Source Example

package main

import (
	"context"
	"encoding/json"
	"flag"
	"fmt"
	"net"
	"net/http"
	"os"
	"os/signal"
	"path/filepath"
	"syscall"
	"time"
)

type listener struct {
	Addr     string `json:"addr"`
	FD       int    `json:"fd"`
	Filename string `json:"filename"`
}

func importListener(addr string) (net.Listener, error) {
	// Extract the encoded listener metadata from the environment.
	listenerEnv := os.Getenv("LISTENER")
	if listenerEnv == "" {
		return nil, fmt.Errorf("unable to find LISTENER environment variable")
	}

	// Unmarshal the listener metadata.
	var l listener
	err := json.Unmarshal([]byte(listenerEnv), &l)
	if err != nil {
		return nil, err
	}
	if l.Addr != addr {
		return nil, fmt.Errorf("unable to find listener for %v", addr)
	}

	// The file has already been passed to this process, extract the file
	// descriptor and name from the metadata to rebuild/find the *os.File for
	// the listener.
	listenerFile := os.NewFile(uintptr(l.FD), l.Filename)
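	if listenerFile == nil {
		// os.NewFile returns nil for an invalid descriptor. err is nil at this
		// point, so report the descriptor instead.
		return nil, fmt.Errorf("unable to create listener file for descriptor %d", l.FD)
	}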
	defer listenerFile.Close()

	// Create a net.Listener from the *os.File.
	ln, err := net.FileListener(listenerFile)
	if err != nil {
		return nil, err
	}

	return ln, nil
}

func createListener(addr string) (net.Listener, error) {
	ln, err := net.Listen("tcp", addr)
	if err != nil {
		return nil, err
	}

	return ln, nil
}

func createOrImportListener(addr string) (net.Listener, error) {
	// Try and import a listener for addr. If it's found, use it.
	ln, err := importListener(addr)
	if err == nil {
		fmt.Printf("Imported listener file descriptor for %v.\n", addr)
		return ln, nil
	}

	// No listener was imported, that means this process has to create one.
	ln, err = createListener(addr)
	if err != nil {
		return nil, err
	}

	fmt.Printf("Created listener file descriptor for %v.\n", addr)
	return ln, nil
}

func getListenerFile(ln net.Listener) (*os.File, error) {
	switch t := ln.(type) {
	case *net.TCPListener:
		return t.File()
	case *net.UnixListener:
		return t.File()
	}
	return nil, fmt.Errorf("unsupported listener: %T", ln)
}

func forkChild(addr string, ln net.Listener) (*os.Process, error) {
	// Get the file descriptor for the listener and marshal the metadata to pass
	// to the child in the environment.
	lnFile, err := getListenerFile(ln)
	if err != nil {
		return nil, err
	}
	defer lnFile.Close()
	l := listener{
		Addr:     addr,
		FD:       3,
		Filename: lnFile.Name(),
	}
	listenerEnv, err := json.Marshal(l)
	if err != nil {
		return nil, err
	}

	// Pass stdin, stdout, and stderr along with the listener to the child.
	files := []*os.File{
		os.Stdin,
		os.Stdout,
		os.Stderr,
		lnFile,
	}

	// Get current environment and add in the listener to it.
	environment := append(os.Environ(), "LISTENER="+string(listenerEnv))

	// Get current process name and directory.
	execName, err := os.Executable()
	if err != nil {
		return nil, err
	}
	execDir := filepath.Dir(execName)

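	// Spawn child process, forwarding the command line arguments so the child
	// is started with the same flags as the parent.
	p, err := os.StartProcess(execName, append([]string{execName}, os.Args[1:]...), &os.ProcAttr{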
		Dir:   execDir,
		Env:   environment,
		Files: files,
		Sys:   &syscall.SysProcAttr{},
	})
	if err != nil {
		return nil, err
	}

	return p, nil
}

func waitForSignals(addr string, ln net.Listener, server *http.Server) error {
	signalCh := make(chan os.Signal, 1024)
	signal.Notify(signalCh, syscall.SIGHUP, syscall.SIGUSR2, syscall.SIGINT, syscall.SIGQUIT)
	for {
		select {
		case s := <-signalCh:
			fmt.Printf("%v signal received.\n", s)
			switch s {
			case syscall.SIGHUP:
				// Fork a child process.
				p, err := forkChild(addr, ln)
				if err != nil {
					fmt.Printf("Unable to fork child: %v.\n", err)
					continue
				}
				fmt.Printf("Forked child %v.\n", p.Pid)

				// Create a context that will expire in 5 seconds and use this as a
				// timeout to Shutdown.
				ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
				defer cancel()

				// Return any errors during shutdown.
				return server.Shutdown(ctx)
			case syscall.SIGUSR2:
				// Fork a child process.
				p, err := forkChild(addr, ln)
				if err != nil {
					fmt.Printf("Unable to fork child: %v.\n", err)
					continue
				}

				// Print the PID of the forked process and keep waiting for more signals.
				fmt.Printf("Forked child %v.\n", p.Pid)
			case syscall.SIGINT, syscall.SIGQUIT:
				// Create a context that will expire in 5 seconds and use this as a
				// timeout to Shutdown.
				ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
				defer cancel()

				// Return any errors during shutdown.
				return server.Shutdown(ctx)
			}
		}
	}
}

func handler(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintf(w, "Hello from %v!\n", os.Getpid())
}

func startServer(addr string, ln net.Listener) *http.Server {
	http.HandleFunc("/hello", handler)

	httpServer := &http.Server{
		Addr: addr,
	}
	go httpServer.Serve(ln)

	return httpServer
}

func main() {
	// Parse command line flags for the address to listen on.
	var addr string
	flag.StringVar(&addr, "addr", ":8080", "Address to listen on.")
	flag.Parse()

	// Create (or import) a net.Listener and start a goroutine that runs
	// a HTTP server on that net.Listener.
	ln, err := createOrImportListener(addr)
	if err != nil {
		fmt.Printf("Unable to create or import a listener: %v.\n", err)
		os.Exit(1)
	}
	server := startServer(addr, ln)

	// Wait for signals to either fork or quit.
	err = waitForSignals(addr, ln, server)
	if err != nil {
		fmt.Printf("Exiting: %v\n", err)
		return
	}
	fmt.Printf("Exiting.\n")
}

If You’ve Read This Far ..

Teleport is open source, so you are free to dig in further on GitHub. If you are interested in working on Teleport or similar distributed systems software, we are always looking for good engineers.
