Create an ssh tunnel background service with autossh and systemd (systemctl)

In the previous post, I introduced how to create a reverse tunnel to access your local machine from a remote machine, bypass a firewall, or access a network on behalf of another machine.

In this post, I build on that topic by showing how to create a systemd service that starts the tunnel automatically in the background after the OS boots.


This post lists three general ways to create a startup script. The most reliable of them is used here.

Step 1: Check if you need a monitor port

TL;DR: try running autossh username_at_server@server. If autossh prints its usage message, you need this step. If the command succeeds or fails with some other error (e.g., unauthorized), you can skip this step.

The usage message of autossh looks as follows. If you see it, continue with this section.

usage: autossh [-V] [-M monitor_port[:echo_port]] [-f] [SSH_OPTIONS]

    -M specifies monitor port. May be overridden by environment
       variable AUTOSSH_PORT. 0 turns monitoring loop off.
       Alternatively, a port for an echo service on the remote
       machine may be specified. (Normally port 7.)
    -f run in background (autossh handles this, and does not
       pass it to ssh.)
    -V print autossh version and exit.

Environment variables are:
    AUTOSSH_GATETIME    - how long must an ssh session be established
                          before we decide it really was established
                          (in seconds). Default is 30 seconds; use of -f
                          flag sets this to 0.
    AUTOSSH_LOGFILE     - file to log to (default is to use the syslog
                          facility)
    AUTOSSH_LOGLEVEL    - level of log verbosity
    AUTOSSH_MAXLIFETIME - set the maximum time to live (seconds)
    AUTOSSH_MAXSTART    - max times to restart (default is no limit)
    AUTOSSH_MESSAGE     - message to append to echo string (max 64 bytes)
    AUTOSSH_PATH        - path to ssh if not default
    AUTOSSH_PIDFILE     - write pid to this file
    AUTOSSH_POLL        - how often to check the connection (seconds)
    AUTOSSH_FIRST_POLL  - time before first connection check (seconds)
    AUTOSSH_PORT        - port to use for monitor connection
    AUTOSSH_DEBUG       - turn logging to maximum verbosity and log to
                          stderr

Choose a port such that both it and the port immediately above it (port + 1) are free. For example, if you choose 2230, ports 2230 and 2231 must both be free. In the next step, add -M 2230 to the autossh command.
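One way to check a candidate port before committing to it is to look it up in /proc/net/tcp, the same table the Ubuntu wrapper script quoted further down consults. The following is a minimal sketch, not part of autossh: the function names port_in_use and monitor_pair_free are made up for illustration, the optional second argument only exists so the check can be pointed at a different table file, and only IPv4 sockets are covered (IPv6 sockets are listed in /proc/net/tcp6).

```shell
#!/bin/sh
# Sketch: check that a monitor port and the port above it are both unused.
# Function names are illustrative. Only IPv4 sockets (/proc/net/tcp) are checked.

# port_in_use PORT [TABLE]: succeed if PORT (decimal) is a local port in TABLE.
port_in_use() {
	port_hex=$(printf '%04X' "$1")   # the table stores ports as 4 hex digits
	grep -Eq "^ *[0-9]+: [0-9A-F]{8}:${port_hex} " "${2:-/proc/net/tcp}"
}

# monitor_pair_free PORT [TABLE]: succeed if PORT and PORT + 1 are both unused.
monitor_pair_free() {
	! port_in_use "$1" "${2:-/proc/net/tcp}" &&
		! port_in_use "$(( $1 + 1 ))" "${2:-/proc/net/tcp}"
}
```

For example, monitor_pair_free 2230 && echo "2230 and 2231 are free" prints the message only when neither port shows up in the table. As with the wrapper, another process can still grab the port between the check and the start of autossh.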

Longer reason:

The real autossh binary always requires a monitor port: it sends data on that port and receives data on the port immediately above it (port + 1). Some distributions wrap the real autossh binary in a shell script located at /usr/bin/autossh. If no port is specified, the script tries up to 42 random ports, falls back to port 21021 if none of them is free, and passes the chosen port to the real autossh located at /usr/lib/autossh/autossh. On Ubuntu 20.04, the wrapper looks as follows:

➜  ~ uname -a
Linux transang-ryzen 5.4.0-58-generic #64-Ubuntu SMP Wed Dec 9 08:16:25 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
➜  ~ file $(which autossh)
/usr/bin/autossh: POSIX shell script, ASCII text executable
➜  ~ cat $(which autossh)                                         
#!/bin/sh
# little wrapper to choose a random port for autossh, falling back to $fallback_port

fallback_port="21021"
tcpstat="/proc/net/tcp" 

# take an hex port and check whether it is in use (i.e. locally bound) in
# $tcpstat
# unix command semantics: if in use return 0 else return 1
port_in_use() {
	if egrep -q "^[0-9 ]+: [0-9A-F]{8}:$1" $tcpstat ; then
		return 0
	else
		return 1
	fi
}
	
echo "$@" | egrep -q -- '-f?M ?[0-9]+' # backward compatibility, skip guess if -M is passed

if [ $? -gt 0 ] && [ -z "$AUTOSSH_PORT" ]; then 
	portguess=""
	if [ -r "/dev/urandom" ] && [ -r "$tcpstat" ]; then
		for t in $(seq 1 42); do
			# get a random hex
			randport=$( od -x -N2 -An /dev/urandom | tr -d ' ' )
			
			# increase it a little "bit"
			randport=$( /usr/bin/printf "%04x" $(( 0x$randport | 0x8000 )) )
			randport_1=$( /usr/bin/printf "%04x" $(( 0x$randport + 1 )) )

			# check if port is in use, possibile race condition between here
			# and the exec 
			if ! port_in_use $randport && ! port_in_use $randport_1; then
				portguess=$(( 0x$randport ))
				break	
			fi
		done
	fi

	if [ -z "$portguess" ]; then
		fallback=$( /usr/bin/printf "%04x" $fallback_port )
		fallback_1=$( /usr/bin/printf "%04x" $(( 0x$fallback + 1 )) )
		if ! port_in_use $fallback && ! port_in_use $fallback_1; then
			portguess=$fallback_port
		else
			echo "unable to find a suitable tunnel port"
			exit 1
		fi
	fi

	export AUTOSSH_PORT="$portguess"
fi

exec /usr/lib/autossh/autossh "$@"

On another machine, running Arch Linux, the real autossh binary is located directly at /usr/bin/autossh and requires the monitor port to be given explicitly on the command line.

➜  ~ uname -a
Linux transang 5.10.4-arch2-1 #1 SMP PREEMPT Fri, 01 Jan 2021 05:29:53 +0000 x86_64 GNU/Linux
➜  ~ file $(which autossh)
/usr/bin/autossh: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=9b17b2e4928310c4c59e04f7a29b504e60c3e61a, for GNU/Linux 3.2.0, stripped
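The same distinction can be made in a script by looking at the first two bytes of the file, since a shell script starts with "#!" while an ELF binary does not. This is only a sketch: the function name is_script_wrapper is made up for illustration, and finding a script does not by itself prove that it picks a monitor port the way the Ubuntu wrapper does.

```shell
#!/bin/sh
# Sketch: tell a shell-script wrapper apart from the real autossh binary
# by checking whether the file starts with "#!".
# The function name is illustrative.
is_script_wrapper() {
	[ "$(head -c 2 "$1" 2>/dev/null)" = '#!' ]
}
```

A possible use is is_script_wrapper "$(command -v autossh)" || echo "pass -M explicitly", which prints the hint on systems where autossh is the plain binary.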

Step 2: Register a systemd service

Create a systemd unit file named tunnel.service in /etc/systemd/system:

sudo tee /etc/systemd/system/tunnel.service >/dev/null <<EOF
[Unit]
Description=SSH tunnel service
After=network.target network-online.target sshd.service

[Service]
ExecStart=/usr/bin/autossh -i /home/transang/.ssh/id_rsa -R '*:2222:localhost:22' -NT username_at_server@server

[Install]
WantedBy=multi-user.target
EOF
  • The '*:2222:localhost:22' parameter means that connections to port 2222 on the server are forwarded to port 22 on localhost, that is, on the local machine where this command runs.
  • If step 1 showed that a monitor port is required, specify it with -M. For example: /usr/bin/autossh -M 2230 -i /home/transang/.ssh/id_rsa -R '*:2222:localhost:22' -NT username_at_server@server

For the other parameters, please refer to the previous post mentioned at the beginning of this post.
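Because the two variants of the unit differ only in the -M option, the file can also be generated. The sketch below prints the unit to standard output and adds -M only when a port is passed. The function name render_tunnel_unit is made up for illustration; the key path, user name, and host are the placeholders used above.

```shell
#!/bin/sh
# Sketch: print tunnel.service to stdout, adding -M only when a monitor
# port is passed as the first argument. The function name is illustrative;
# key path, user and host are the placeholders used in this post.
render_tunnel_unit() {
	monitor_opt=${1:+"-M $1 "}   # empty when no port is given
	cat <<EOF
[Unit]
Description=SSH tunnel service
After=network.target network-online.target sshd.service

[Service]
ExecStart=/usr/bin/autossh ${monitor_opt}-i /home/transang/.ssh/id_rsa -R '*:2222:localhost:22' -NT username_at_server@server

[Install]
WantedBy=multi-user.target
EOF
}
```

To install the result, pipe it through tee, for example: render_tunnel_unit 2230 | sudo tee /etc/systemd/system/tunnel.service >/dev/null. Call it without an argument on systems where the wrapper chooses the monitor port.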

Step 3: Authenticate the host

If you have never connected to the server from the root account, run the command from step 2 once with sudo and answer yes at the prompt. The service runs as root, so the server's host key fingerprint must be in root's list of known hosts.

➜  ~ sudo /usr/bin/autossh -M 2230 -i /home/transang/.ssh/id_rsa -R '*:2222:localhost:22' -NT username_at_server@server                     
The authenticity of host '66.42.33.195 (66.42.33.195)' can't be established.
ECDSA key fingerprint is SHA256:Cs2ar5L58USLpDnqC7CKzKXYiWOTie2O1w9UlH1Mtx4.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '66.42.33.195' (ECDSA) to the list of known hosts.

Without this step, starting the service in step 5 might fail with the following error:

➜  ~ sudo systemctl status tunnel 
● tunnel.service - SSH tunnel service
     Loaded: loaded (/etc/systemd/system/tunnel.service; disabled; vendor preset: disabled)
     Active: failed (Result: exit-code) since Thu 2021-01-21 00:11:04 JST; 10s ago
    Process: 75561 ExecStart=/usr/bin/autossh -M 2230 -i /home/transang/.ssh/id_rsa -R *:2222:localhost:22 -NT username_at_server@66.42.33.195 (code=exited, status=1/FAILURE)
   Main PID: 75561 (code=exited, status=1/FAILURE)

Jan 21 00:11:04 transang systemd[1]: Started SSH tunnel service.
Jan 21 00:11:04 transang autossh[75561]: starting ssh (count 1)
Jan 21 00:11:04 transang autossh[75561]: ssh child pid is 75562
Jan 21 00:11:04 transang autossh[75562]: Host key verification failed.
Jan 21 00:11:04 transang autossh[75561]: ssh exited prematurely with status 255; autossh exiting
Jan 21 00:11:04 transang systemd[1]: tunnel.service: Main process exited, code=exited, status=1/FAILURE
Jan 21 00:11:04 transang systemd[1]: tunnel.service: Failed with result 'exit-code'.
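Whether root already trusts the host can be checked up front with ssh-keygen -F, which looks a host name up in a known_hosts file (hashed entries included) and, on current OpenSSH versions, exits with a non-zero status when nothing is found. The sketch below makes two assumptions: the function name host_is_known is made up for illustration, and root's file is taken to be /root/.ssh/known_hosts, which is the usual location but may differ on your system.

```shell
#!/bin/sh
# Sketch: succeed if HOST already has an entry in a known_hosts file.
# The function name is illustrative; the default path assumes that
# root's home directory is /root.
host_is_known() {
	ssh-keygen -F "$1" -f "${2:-/root/.ssh/known_hosts}" >/dev/null 2>&1
}
```

Run as root, host_is_known server || echo "accept the host key first (step 3)" prints the reminder when the entry is missing. The host name must be written exactly as in the ExecStart line, since the lookup is done by name.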

Step 4: Prevent NetworkManager from randomly disconnecting Wi-Fi

Disable Wi-Fi power saving to prevent NetworkManager from randomly disconnecting Wi-Fi.
In /etc/NetworkManager/conf.d/default-wifi-powersave-on.conf, change wifi.powersave = 3 to wifi.powersave = 2. If the file does not exist, create it with the following content:

➜  ~ cat /etc/NetworkManager/conf.d/default-wifi-powersave-on.conf                                  
[connection]
wifi.powersave = 2

Restart the NetworkManager service for the change to take effect: sudo systemctl restart NetworkManager.

The value 2 comes from nm-setting-wireless.h:

/**
 * NMSettingWirelessPowersave:
 * @NM_SETTING_WIRELESS_POWERSAVE_DEFAULT: use the default value
 * @NM_SETTING_WIRELESS_POWERSAVE_IGNORE: don't touch existing setting
 * @NM_SETTING_WIRELESS_POWERSAVE_DISABLE: disable powersave
 * @NM_SETTING_WIRELESS_POWERSAVE_ENABLE: enable powersave
 * These flags indicate whether wireless powersave must be enabled.
 **/
typedef enum {
    NM_SETTING_WIRELESS_POWERSAVE_DEFAULT       = 0,
    NM_SETTING_WIRELESS_POWERSAVE_IGNORE        = 1,
    NM_SETTING_WIRELESS_POWERSAVE_DISABLE       = 2,
    NM_SETTING_WIRELESS_POWERSAVE_ENABLE        = 3,
    _NM_SETTING_WIRELESS_POWERSAVE_NUM, /*< skip >*/
    NM_SETTING_WIRELESS_POWERSAVE_LAST          =  _NM_SETTING_WIRELESS_POWERSAVE_NUM - 1, /*< skip >*/
} NMSettingWirelessPowersave;
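If the configuration file has to be created on several machines, the step can be scripted. The sketch below writes the two lines shown earlier in this step. The function name write_powersave_conf and its optional directory argument are made up for illustration, and the function overwrites an existing file of the same name.

```shell
#!/bin/sh
# Sketch: write the power-save override into a NetworkManager conf.d
# directory. The function name and the optional directory argument are
# illustrative; 2 is NM_SETTING_WIRELESS_POWERSAVE_DISABLE.
write_powersave_conf() {
	conf_dir=${1:-/etc/NetworkManager/conf.d}
	printf '[connection]\nwifi.powersave = 2\n' \
		> "$conf_dir/default-wifi-powersave-on.conf"
}
```

Call write_powersave_conf as root (writing below /etc needs root privileges), then restart NetworkManager as described in this step.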

Step 5: Start the service and enable it at startup

Enable and start the service with sudo systemctl enable --now tunnel.

Step 6: Disable screen lock

There is a bug where the ssh connection drops while, on the remote server, the background sshd process that listens on the reverse-forwarded port (2222 in this example) stays alive. Because that stale process still occupies the port, autossh cannot re-establish the tunnel.

I am still debugging the issue and looking for the best workaround. Disabling the screen lock is a temporary solution. I have been using it for years without any problem.

To prevent the OS from sleeping, suspending, or hibernating, use:

sudo systemctl mask sleep.target suspend.target hibernate.target hybrid-sleep.target

To restore these features:

sudo systemctl unmask sleep.target suspend.target hibernate.target hybrid-sleep.target

Once I find a solution to the root issue, I will update the post.
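Until then, a possible mitigation, not verified against the bug described here, is to let ssh fail fast instead of staying connected without a working forward. ExitOnForwardFailure, ServerAliveInterval, and ServerAliveCountMax are standard OpenSSH client options. The sketch below appends them for the tunnel host to an ssh config file, so the unit file stays unchanged. The function name add_tunnel_ssh_options, the optional file argument, and the values 30 and 3 are illustrative choices, and /root/.ssh/config assumes that the service runs as root with /root as its home directory.

```shell
#!/bin/sh
# Sketch (not verified against the bug described above): append client-side
# options for the tunnel host to an ssh config file.
#   ExitOnForwardFailure yes  ssh exits when the remote port cannot be bound
#   ServerAliveInterval 30    probe the server after 30 s without traffic
#   ServerAliveCountMax 3     disconnect after 3 unanswered probes
# The function name, the file argument and the 30/3 values are illustrative;
# "server" is the placeholder host name used in this post.
add_tunnel_ssh_options() {
	config_file=${1:-/root/.ssh/config}
	cat >> "$config_file" <<'EOF'
Host server
    ExitOnForwardFailure yes
    ServerAliveInterval 30
    ServerAliveCountMax 3
EOF
}
```

After running add_tunnel_ssh_options as root, restart the service with sudo systemctl restart tunnel so that the ssh process started by autossh picks up the options. The Host pattern has to match the host name used in the ExecStart line.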

Step 7: Testing

Check the service status.

➜  ~ sudo systemctl status tunnel
● tunnel.service - SSH tunnel service
     Loaded: loaded (/etc/systemd/system/tunnel.service; enabled; vendor preset: enabled)
     Active: active (running) since Fri 2021-01-01 21:27:57 JST; 2 weeks 5 days ago
   Main PID: 1735 (autossh)
      Tasks: 2 (limit: 154375)
     Memory: 6.3M
     CGroup: /system.slice/tunnel.service
             ├─   1735 /usr/lib/autossh/autossh -i /home/transang/.ssh/id_rsa -R *:2222:localhost:22 -NT username_at_server@66.42.33.195
             └─1033070 /usr/bin/ssh -L 63903:127.0.0.1:63903 -R 63903:127.0.0.1:63904 -i /home/transang/.ssh/id_rsa -R *:2222:localhost:22 -NT username_at_server@66.42.33.195

Jan 01 21:27:57 transang-ryzen systemd[1]: Started SSH tunnel service.
Jan 01 21:27:57 transang-ryzen autossh[1735]: starting ssh (count 1)
Jan 01 21:27:57 transang-ryzen autossh[1735]: ssh child pid is 1777
Jan 20 06:58:26 transang-ryzen autossh[1735]: timeout polling to accept read connection
Jan 20 06:58:26 transang-ryzen autossh[1735]: port down, restarting ssh
Jan 20 06:58:26 transang-ryzen autossh[1735]: starting ssh (count 2)
Jan 20 06:58:26 transang-ryzen autossh[1735]: ssh child pid is 1033070

SSH into the server, then connect back to the local machine from there with ssh username_at_local@localhost -p 2222.