Chasing missing SIGINT signals down the SSH rabbit hole
May 24, 2018 by Russell Jones
The initial issue: missing SIGINTs
We recently received an interesting bug report on our issue tracker, #1650: ctrl-c ignored. A Teleport end user reported that our SSH client (the tsh CLI tool) was ignoring the Ctrl-C key press, which normally generates the SIGINT signal, when tailing logs across multiple servers in a cluster. When the user hit Ctrl-C on their keyboard, Teleport would print ^C but continue blocking.
The issue was that Teleport’s certificate-integrated SSH client, tsh, was handling SIGINT incorrectly. Digging into the problem revealed something interesting about SSH internals: precisely where signals are handled depends on the type of SSH connection created.
Background on SSH protocol internals
For those not familiar with SSH internals, the protocol allows the creation of a secure channel between a client and server. When running programs on the remote machine, first a “session” SSH channel is created, then an SSH client, such as Teleport’s tsh, issues either a shell or an exec request in that channel.
When typical users think of SSH, it is these shell requests they think of. This is the type of request that is created when you type ssh [email protected]. A pseudo terminal (“PTY”) will be allocated on instance.us-east-1.compute.amazonaws.com and a program (typically a command shell like Bash) will be launched in it. Programs like Teleport or OpenSSH connect your local PTY to that remote PTY, which allows you to run programs like you would locally.
For shell requests, tsh puts your local PTY in raw mode. In raw mode, the local PTY does not interpret what you type and simply sends everything to the remote PTY. (Note: to see this, you can type stty raw and compare what happens when you hit Ctrl-C to what you normally expect. To return your terminal to sane settings, type stty sane.) When the remote PTY receives the Ctrl-C, it interprets it as SIGINT and issues a SIGINT to the foreground process in the PTY.
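The difference between raw and normal mode can be illustrated with a toy model in Go. This is only a sketch of the idea, not how a real PTY or tsh is implemented: the function name processInput and its return values are made up for illustration, and a real line discipline does far more than this.

```go
package main

import "fmt"

// ctrlC is the byte a terminal produces for Ctrl-C (ETX, 0x03).
const ctrlC = 0x03

// processInput is a hypothetical, heavily simplified model of what a PTY
// does with typed bytes. In raw mode every byte is forwarded untouched.
// Otherwise Ctrl-C is consumed and counted as an interrupt that would be
// delivered to the foreground process as SIGINT.
func processInput(input []byte, raw bool) (forwarded []byte, interrupts int) {
	for _, b := range input {
		if !raw && b == ctrlC {
			interrupts++
			continue
		}
		forwarded = append(forwarded, b)
	}
	return forwarded, interrupts
}

func main() {
	typed := []byte{'l', 's', ctrlC}

	fwd, n := processInput(typed, true)
	fmt.Printf("raw:     forwarded=%q interrupts=%d\n", fwd, n)

	fwd, n = processInput(typed, false)
	fmt.Printf("not raw: forwarded=%q interrupts=%d\n", fwd, n)
}
```

In the raw case the Ctrl-C byte travels on to the other side, which is why the remote PTY ends up being the one that turns it into a SIGINT.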
In the case of this bug, the user created an exec request, which is what happens when you run something like ssh [email protected] tail -f /var/log/kube-apiserver.log. exec requests typically do not allocate a PTY and are used to run commands on a remote server, collect output, and return. (Note: PTY allocation can always be forced, even for exec requests, with the -t flag.) With exec requests, the client and server PTYs are not in raw mode. When the user types Ctrl-C, the local PTY sends a SIGINT to the foreground process, tsh, which then sends a message to the server telling it to terminate the session.
Fixing differences between OpenSSH and Teleport behavior
Here is where things differ between Teleport and OpenSSH. With OpenSSH, the client would send SSH_MSG_DISCONNECT and cause sshd to call exit(1). Unfortunately, the Go SSH library that Teleport uses does not support sending SSH_MSG_DISCONNECT messages, and even if it did, there would still be the issue of how to kill the child process, because Teleport follows a different process spawning model than OpenSSH. Instead, what tsh does is close the channel and client. Once Teleport detects that the channel has been closed, it reaps any resources associated with the connection and then sends a SIGKILL to any running processes.
You can see the difference in behavior in Figures (2) and (3) that show how OpenSSH and Teleport behave respectively.
In the case of #1650, tsh was not killing the connection when SIGINT was received. The fix was to propagate the SIGINT to the code that was running the remote command, kill it, and then finally close the connection. We’ve fixed this issue now, so Teleport users should be able to tail server logs and exit without a problem!