BorgBackup on macOS

June 24, 2020 - June 28, 2020

Does the dandelion drop all its seeds at the base of its stalk? Does the cuckoo lay its eggs in one nest? So long as your backups are in one place, you are vulnerable to the fortunes of the world.

- The Tao of Backup


Time Machine works well for local backups on macOS. Recently, I did an inventory of my data assets and came to the conclusion that it would be best to have an off-site backup of my data, in addition to a local Time Machine backup. It’s a good idea to have some redundancy in case the Time Machine HDD gets lost, damaged or corrupted.

There are two main requirements that I had when choosing an off-site backup tool:

The first requirement excludes services like Dropbox and Google Drive from the search. The second requirement excludes closed-source software that claims to have E2EE (e.g. Arq, Backblaze).

In the end, I decided to learn how to use BorgBackup. BorgBackup (or Borg) is an open source “deduplicating archiver with compression and encryption”. This project has many contributors and turned 10 years old in March 2020. Borg has also been called the “holy grail of backups”; this definitely intrigued me.

There are many good articles describing how Borg works and how to use it, so I won’t focus on that (but I will link to these articles in the process). Instead, this article is a compilation of notes that I took while setting up and automating Borg on macOS Catalina.


FYI this article was written using macOS Catalina 10.15.5 and borg 1.1.13.

Hosting

I decided to go with Hetzner Storage Box since it comes with Borg support. In this case there is no need to worry about setting up a Borg server, only the client on the local machine. The only thing that needs to be done on the server is adding the local machine's SSH public key to ~/.ssh/authorized_keys. Storage Boxes use RAID, which is great since Borg does not add redundancy to deal with hardware malfunction.

In general, any VPS hosting would work; you just need to make sure that both the client and the server have Borg installed.

Installation

There is a Brew cask for Borg so installation on macOS is simple:

brew cask install borgbackup

For the server, there are distribution packages available for most common Linux and BSD distros.

Remote backup setup

Repo initialization

Before a backup can be made, a repository has to be initialized. The main decision at this point is the encryption key mode (this cannot be changed later).

Borg offers 4 options for authenticated encryption with associated data (AEAD): repokey, keyfile, repokey-blake2 and keyfile-blake2.

All 4 options use AES-256-CTR for encryption.

See Borg docs on Encryption and borg init for more information on these options.

Example initialization:

borg init --encryption=repokey-blake2 \
    ssh://username@username.your-storagebox.de:23/./backup/mbp2015
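
In repokey mode the passphrase-protected key is stored inside the repository itself, so it may be worth keeping a copy of it somewhere else. A minimal sketch using borg key export; the helper name export_repo_key and the destination path are placeholders for illustration only:

```shell
#!/bin/sh
# Sketch only: export_repo_key and the paths below are illustrative placeholders.

# Save a copy of the repository key to a local file.
#   $1 = repository location
#   $2 = file to write the key to
export_repo_key() {
    borg key export "$1" "$2"
}

# Example call (commented out so that sourcing this file has no side effects):
# export_repo_key 'ssh://username@username.your-storagebox.de:23/./backup/mbp2015' "$HOME/mbp2015.borgkey"
```

The exported key is still protected by the passphrase, so both are needed to restore from the repository.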

Backup script

I used a combination of sources to write the backup script:

Since both articles give a detailed walkthrough of a typical archive-then-prune workflow, I won’t go into great detail here. Borg usage docs are also very useful, especially when writing command arguments.

Here is my final backup script with detailed comments in case any of the sources goes down. The options are explained above each command, since sh does not allow comments in between backslash-continued arguments:

#!/bin/sh

# default repo location so that we can use '::archive' shorthand notation later
export BORG_REPO='ssh://username@username.your-storagebox.de:23/./backup/mbp2015'
# explicitly specify the SSH key
export BORG_RSH='ssh -i /Users/mmxmb/.ssh/storagebox_key'
# only passphrase is needed for repokey borg repo
export BORG_PASSPHRASE='VERY_LONG_PASSPHRASE'

# some helpers and error handling:
info() { printf "\n%s %s\n\n" "$( date )" "$*" >&2; }
trap 'echo $( date ) Backup interrupted >&2; exit 2' INT TERM

info "Starting backup"

# create a daily backup
#   --verbose                   work on log level INFO
#   --list --filter=AME         output list of items added (A), modified (M);
#                               also output if error happened when accessing a file (E)
#   --stats --show-rc           print stats for the created archive at the end; log the return code (rc)
#   --compression auto,lzma,6   use lzma compression (low speed, high compression);
#                               use a heuristic to decide per chunk whether to compress or not (auto)
#   '::{hostname}-daily-{now}'  archive name; repo name is inferred from $BORG_REPO
#   remaining arguments         directories to backup
/usr/local/bin/borg create \
    --verbose \
    --list --filter=AME \
    --stats --show-rc \
    --compression auto,lzma,6 \
    '::{hostname}-daily-{now}' \
    /Users/mmxmb/my_important_docs \
    /Users/mmxmb/my_photos \
    /Users/mmxmb/Desktop

backup_exit=$?

info "Pruning repository"

# prune the repo
#   --list --stats   output verbose list of archives kept/pruned; display prune stats at the end
#   --prefix         only consider archive names starting with this prefix
#   --keep-*         number of archives to keep for each time interval
#                    example visualisation: https://borgbackup.readthedocs.io/en/stable/usage/prune.html
/usr/local/bin/borg prune \
    --list --stats \
    --prefix '{hostname}-daily-' \
    --show-rc \
    --keep-daily 7 \
    --keep-weekly 5 \
    --keep-monthly 6

prune_exit=$?

# use highest exit code as exit code
global_exit=$(( backup_exit > prune_exit ? backup_exit : prune_exit ))

if [ ${global_exit} -eq 1 ];
then
    info "Backup and/or Prune finished with a warning"
fi

if [ ${global_exit} -gt 1 ];
then
    info "Backup and/or Prune finished with an error"
fi

exit ${global_exit}

This script is adapted mostly from The Practical Administrator article.
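
The exit code handling at the end of the script can also be written as two small helpers. This is only a sketch: highest_exit and describe_rc are hypothetical names, and the mapping relies on Borg's documented return codes (0 for success, 1 for warning, 2 for error):

```shell
#!/bin/sh
# Sketch only: helper names are illustrative, not part of Borg.

# Print the larger of two exit codes, mirroring the global_exit line above.
highest_exit() {
    if [ "$1" -gt "$2" ]; then
        echo "$1"
    else
        echo "$2"
    fi
}

# Translate a Borg return code into a short status word.
# Anything above 1 is treated as an error.
describe_rc() {
    case "$1" in
        0) echo "success" ;;
        1) echo "warning" ;;
        *) echo "error" ;;
    esac
}
```

For example, describe_rc "$(highest_exit "$backup_exit" "$prune_exit")" gives a single word that can be passed to info.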

At this point it is a good idea to stop and play around with Borg to check that everything works smoothly before moving on to the automation step. In my case, I initialized a test Borg repo, used the above script to create an archive of a directory full of text files (so that I don’t have to wait much for compression and encryption), tested extract and prune commands, and finally deleted the repo.
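
A test run along those lines might look like the sketch below. The function name smoke_test, the archive name and both arguments are placeholders; it assumes borg is on the PATH and that BORG_PASSPHRASE is exported (otherwise Borg prompts for it):

```shell
#!/bin/sh
# Sketch of a dry run against a throwaway repository.
# smoke_test, the archive name and both arguments are illustrative placeholders.
#   $1 = path of a scratch repository that does not exist yet
#   $2 = small directory to archive, e.g. a folder of text files
# The chain stops at the first failing step.
# "extract --dry-run" reads the archive without writing any files.
smoke_test() {
    repo=$1
    src=$2
    borg init --encryption=repokey-blake2 "$repo" &&
    borg create --stats "$repo::smoke-test" "$src" &&
    borg list "$repo" &&
    borg extract --dry-run "$repo::smoke-test" &&
    borg prune --list --keep-daily 1 "$repo"
}
```

Afterwards the scratch repository can be removed with borg delete.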

Remote backup automation

Both aforementioned tutorials assume that either cron or systemd is used for scheduling the backup process. Since Apple deprecated cron back in 2005, we need to use launchd, which is kind of like systemd but for macOS.

A daemon job definition for launchd is specified in a special XML file called a property list. Here is a sample Borg backup property list:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
        <key>Label</key>
        <string>mmxmb.borgbackup</string>
        <key>Program</key>
        <string>/path/to/remote/backup/script/backup.sh</string>
        <key>StandardErrorPath</key>
        <string>/path/to/remote/backup/log/backup.log</string>
        <key>StandardOutPath</key>
        <string>/path/to/remote/backup/log/backup.log</string>
        <key>UserName</key>
        <string>mmxmb</string>
        <key>RunAtLoad</key>
        <true/>
        <key>StartCalendarInterval</key>
        <dict>
                <key>Hour</key>
                <integer>12</integer>
                <key>Minute</key>
                <integer>0</integer>
        </dict>
</dict>
</plist>

This file is then saved as /Library/LaunchDaemons/mmxmb.borgbackup.plist (any job.label.plist filename would work).

Here’s an overview of relevant job properties:

For more information on job properties see launchd website.


The job is loaded using:

sudo launchctl load /Library/LaunchDaemons/mmxmb.borgbackup.plist

If RunAtLoad is false, the job can be started after loading using its label:

sudo launchctl start mmxmb.borgbackup

To unload the job use:

sudo launchctl unload /Library/LaunchDaemons/mmxmb.borgbackup.plist

It’s a good idea to keep an eye on /var/log/system.log for possible errors when loading and starting a job.
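
Filtering the log by job label keeps the noise down. A small sketch; job_log_lines is a hypothetical helper name, and the default path is the system log mentioned above:

```shell
#!/bin/sh
# Sketch only: job_log_lines is an illustrative helper name.

# Print the lines of a log file that mention a given launchd job label.
#   $1 = job label, e.g. mmxmb.borgbackup
#   $2 = log file (defaults to /var/log/system.log)
job_log_lines() {
    label=$1
    logfile=${2:-/var/log/system.log}
    # -F treats the label as a fixed string, so the dots are not regex wildcards
    grep -F -- "$label" "$logfile"
}
```

Calling job_log_lines mmxmb.borgbackup right after launchctl load shows whether the job was started and how it exited.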


As an aside, one significant advantage of using launchd as opposed to cron on a laptop is that cron jobs do not execute if the system is turned off or asleep. launchd jobs scheduled with StartCalendarInterval run when the computer wakes up, if it was asleep when the job should have run. However, if the machine is off when the job should have run, the job does not execute until the next designated time occurs. See Apple Developer Documentation Archive: Scheduling Timed Jobs.

Automation pitfalls

I have used cron and systemd extensively but had never used launchd, so I ran into quite a few problems when setting up the launchd job.

Log files permissions

After loading and attempting to run the job, there is a cryptic error message in /var/log/system.log:

Jun 1 01:23:45 mmxmbs-MacBook-Pro com.apple.xpc.launchd[1] (mmxmb.borgbackup[18763]): Service could not initialize: 19F101: xpcproxy + 14521 [XXX][XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX]: 0xd
Jun 1 01:23:45 mmxmbs-MacBook-Pro com.apple.xpc.launchd[1] (mmxmb.borgbackup[18763]): Service exited with abnormal code: 78

This is caused by the user specified under the UserName key not having write permissions for the log files specified in StandardErrorPath and StandardOutPath.
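
One way to catch this before loading the job is to check, as the user named under UserName, whether the log file can be appended to. A minimal sketch; can_write_log is a hypothetical helper, and note that it creates an empty log file if none exists yet:

```shell
#!/bin/sh
# Sketch only: can_write_log is an illustrative helper name.

# Succeed if the current user can append to the given log file.
# ">>" creates the file when it is missing and never truncates existing content.
# The subshell keeps a failed redirection from terminating the calling script.
can_write_log() {
    logfile=$1
    ( : >> "$logfile" ) 2>/dev/null
}
```

If the check fails, adjusting the owner or mode of the log file (or of its directory) for that user should make the launchd error go away.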

SSH key permissions

The job starts but exits shortly afterwards with an error (logged to backup error log) when creating an archive:

Remote: Host key verification failed.
Connection closed by remote host. Is borg working on the server?
terminating with error status, rc 2

When debugging this I make sure I am logged in as the user that is specified in the job definition. I then try to SSH to the server with maximum verbosity and the SSH key specified explicitly:

ssh -vvv -i /Users/mmxmb/.ssh/storagebox_key username@username.your-storagebox.de

This is what I see:

debug1: Next authentication method: publickey
debug1: Offering public key: /Users/mmxmb/.ssh/storagebox_key RSA SHA256:... explicit agent
debug3: send packet: type 50
debug2: we sent a publickey packet, wait for reply
debug3: receive packet: type 60
debug1: Server accepts key: /Users/mmxmb/.ssh/storagebox_key RSA SHA256:... explicit agent
debug3: sign_and_send_pubkey: RSA SHA256:...
debug3: sign_and_send_pubkey: signing using ssh-rsa
debug3: send packet: type 50
debug3: receive packet: type 51
debug1: Authentications that can continue: publickey,password
debug2: we did not send a packet, disable method
debug3: authmethod_lookup password
debug3: remaining preferred: ,password

And this is what I expect:

debug1: Next authentication method: publickey
debug1: Offering public key: /Users/mmxmb/.ssh/storagebox_key RSA SHA256:... explicit agent
debug3: send packet: type 50
debug2: we sent a publickey packet, wait for reply
debug3: receive packet: type 60
debug1: Server accepts key: /Users/mmxmb/.ssh/storagebox_key RSA SHA256:... explicit agent
debug3: sign_and_send_pubkey: RSA SHA256:...
debug3: sign_and_send_pubkey: signing using rsa-sha2-512
debug3: send packet: type 50
debug3: receive packet: type 52
debug1: Authentication succeeded (publickey)

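The server accepts the offered key (packet type 60) but then rejects the signed authentication request: the client receives packet type 51 (authentication failure) instead of 52 (authentication success). The line debug2: we did not send a packet, disable method then shows the client giving up on public key authentication and falling back to password authentication.

In my case the problem is caused by incorrect permissions for the SSH keys used to authenticate with the Borg server. To resolve the problem I create a new key while logged in as the user specified in the job definition and upload the public key to the server.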
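
A quick way to spot overly permissive key files is to look at the mode string that ls prints. The sketch below only checks that group and others have no access (OpenSSH refuses private keys that other users can read); it does not check ownership, and key_is_private is a hypothetical helper name:

```shell
#!/bin/sh
# Sketch only: key_is_private is an illustrative helper name.

# Succeed only if the file is a regular file with no permissions for
# group and others (for example mode 600 or 400).
key_is_private() {
    # First 10 characters of "ls -ld" output: file type plus rwx bits
    mode=$(ls -ld "$1" 2>/dev/null | cut -c1-10)
    case "$mode" in
        -???------) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, key_is_private /Users/mmxmb/.ssh/storagebox_key || chmod 600 /Users/mmxmb/.ssh/storagebox_key tightens the mode only when needed.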

SIP woes

The backup script seems to run well and archive almost everything that is needed. The last message in the backup log:

Tue Jun 1 12:34:56 EDT 2020 Backup and/or Prune finished with a warning

Which turns out to be caused by this error during borg create:

/Users/mmxmb/Desktop: scandir: [Errno 1] Operation not permitted: '/Users/mmxmb/Desktop'
E /Users/mmxmb/Desktop

This error is due to System Integrity Protection (SIP). When a third-party binary is run in the foreground and needs to scan a restricted directory like ~/Desktop, a dialog pops up asking whether it’s OK to give the binary access. But when a binary is run in the background (e.g. as a daemon), its access is denied by default.

This can be fixed by giving the binary full-disk access (FDA) in System Preferences -> Security & Privacy -> Privacy settings. The problem in our case is that the backup script is not a binary; therefore it is not possible to give it FDA. Giving FDA to /usr/local/bin/borg doesn’t work since, from the SIP point of view, the restricted directory gets accessed by the program specified in the launchd job, not borg.

One somewhat hacky workaround that I found in a relevant AskDifferent answer is to create a wrapper binary that calls the backup script. Here’s the code for such a binary written in Go:

// backup provides a binary to run the Borg backup script on macOS Catalina with Full Disk Access
package main

import (
    "log"
    "os"
    "os/exec"
    "path/filepath"
)

func main() {
    ex, err := os.Executable()
    if err != nil {
        log.Fatal(err)
    }
    dir := filepath.Dir(ex)
    script := filepath.Join(dir, "backup.sh")
    cmd := exec.Command("/bin/sh", script)
    cmd.Stdout = os.Stdout
    cmd.Stderr = os.Stderr
    if err := cmd.Run(); err != nil {
        log.Fatal(err)
    }
}

Compile this file and replace the Program property of the backup job property list with the path to this binary. Finally, give FDA to this binary.
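
Because the wrapper looks for backup.sh in its own directory, the binary has to be built into (or moved to) the directory that contains the script. A sketch of the build step, assuming the Go toolchain is installed; build_wrapper, the source file name and the binary name backup are placeholders:

```shell
#!/bin/sh
# Sketch only: build_wrapper and the file names are illustrative placeholders.

# Build the Go wrapper next to backup.sh so that it can locate the script.
#   $1 = path to the Go source file, e.g. backup.go
#   $2 = directory that contains backup.sh
build_wrapper() {
    src=$1
    out_dir=$2
    go build -o "$out_dir/backup" "$src"
}
```

The Program property of the job would then point at the resulting backup binary instead of backup.sh.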

Figure: macOS FDA settings

Now the backup job should run as expected and terminate with success status.

Local backup

At this point why not use Borg for local backups as well?

The way I use my laptop, an external backup HDD can be attached for a few days, while I use the laptop at my desk, and then detached for some time when I need to take my laptop with me somewhere. The fact that the HDD is not always attached to the laptop (as opposed to an always available backup server) requires the backup script and launchd job definition to be adjusted slightly.

In particular, having a local backup run once every 24 hours is nice. This condition is easily achievable with the StartCalendarInterval launchd property, just like in the remote backup job definition. But if a backup drive hasn’t been attached for a few days, it would be ideal if the backup job ran as soon as the drive is re-attached, and not on the next StartCalendarInterval trigger. On Linux, this can be achieved using a udev rule (see Automated backups to a local hard drive tutorial). Since udev doesn’t exist on macOS, here’s one way to achieve similar functionality with another launchd property and some shell scripting.

Local backup launch daemon job definition:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>mmxmb.borgbackup-local</string>
    <key>Program</key>
    <string>/path/to/local/backup/script/backup.sh</string>
    <key>StandardErrorPath</key>
    <string>/path/to/local/backup/log/backup.log</string>
    <key>StandardOutPath</key>
    <string>/path/to/local/backup/log/backup.log</string>
    <key>UserName</key>
    <string>mmxmb</string>
    <key>StartCalendarInterval</key>
    <dict>
        <key>Hour</key>
        <integer>11</integer>
        <key>Minute</key>
        <integer>0</integer>
    </dict>
    <key>WatchPaths</key>
    <array>
        <string>/Volumes/backup-disk</string>
    </array>
</dict>
</plist>

The new property, which is not used in the remote backup job definition, is WatchPaths. Here’s how WatchPaths works when pointed at a directory:

If the path points to a directory, creating and removing this directory, as well as creating, removing and writing files in this directory will start the job. Actions performed in subdirectories of this directory will not be detected.

If the backup volume is named backup-disk and it has a backup directory which contains Borg repos, then the backup job is triggered when the volume is mounted or unmounted. From the point of view of WatchPaths, that is equivalent to the /Volumes/backup-disk directory being created or deleted.

Since the backup needs to start only when the volume is mounted, the backup script needs to handle the case when the job is triggered when the volume is unmounted:

#!/bin/sh

# it seems that sometimes launchd job is triggered on volume mount
# but the disk is not immediately accessible, so sleeping for a bit helps
sleep 5

DISK_NAME=backup-disk
MOUNTPOINT=/Volumes/$DISK_NAME

# some helpers and error handling:
info() { printf "\n%s %s\n\n" "$( date )" "$*" >&2; }
trap 'echo $( date ) Backup interrupted >&2; exit 2' INT TERM

# exit if disk is not mounted; launchd job gets triggered both when disk is mounted/unmounted
if [ ! -d "$MOUNTPOINT" ]; then
  info "The disk $MOUNTPOINT is not mounted. Exiting."
  exit 0
fi

export BORG_REPO="$MOUNTPOINT/backup/mbp2015"
export BORG_PASSPHRASE='VERY_LONG_PASSPHRASE'

# get unix time of the last complete backup
# source: https://projects.torsion.org/witten/borgmatic/issues/86
LAST_BACKUP_TIME=$(/usr/local/bin/borg list --sort-by timestamp --format '{time:%s}{TAB}{name}{NEWLINE}' | grep -v '\.checkpoint$' | tail -1 | cut -f 1)
# fall back to 0 (i.e. always back up) if no complete archive was found
LAST_BACKUP_TIME=${LAST_BACKUP_TIME:-0}

# find time difference between now and last complete backup
NOW_TIME=$(date +"%s")
SECONDS_SINCE_LAST=$((NOW_TIME - LAST_BACKUP_TIME))
SECONDS_IN_DAY=86400

# exit if last backup took place less than 24 hours ago
if [ $SECONDS_SINCE_LAST -lt $SECONDS_IN_DAY ];
then
  info "Last backup happened less than 24 hours ago. Exiting."
  exit 0
fi

info "Starting local backup"

# create a daily backup
/usr/local/bin/borg create \
    --verbose \
    --list --filter=AME \
    --stats --show-rc \
    --compression auto,lzma,6 \
    '::{hostname}-daily-{now}' \
    /Users/mmxmb/my_important_docs \
    /Users/mmxmb/my_photos \
    /Users/mmxmb/Desktop

backup_exit=$?

info "Pruning local repository"

# prune the repo
/usr/local/bin/borg prune \
    --list --stats \
    --prefix '{hostname}-daily-' \
    --show-rc \
    --keep-daily 7 \
    --keep-weekly 5 \
    --keep-monthly 6

prune_exit=$?

# use highest exit code as exit code
global_exit=$(( backup_exit > prune_exit ? backup_exit : prune_exit ))

if [ ${global_exit} -eq 1 ];
then
    info "Backup and/or Prune finished with a warning"
fi

if [ ${global_exit} -gt 1 ];
then
    info "Backup and/or Prune finished with an error"
fi

exit ${global_exit}

This script also contains some logic preventing the backup job from creating a new archive too often, i.e. when a backup drive is re-attached many times throughout the day.
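
The two guards in the script (waiting for the volume and the 24-hour check) can be factored into small helpers. The sketch below is one possible variant rather than the script's actual code: wait_for_dir polls for the mountpoint instead of sleeping for a fixed time, and backup_due treats a missing timestamp as "never backed up". Both helper names and the default values are assumptions:

```shell
#!/bin/sh
# Sketch only: wait_for_dir and backup_due are illustrative helper names.

# Wait until a directory exists, checking once per second.
#   $1 = directory, e.g. /Volumes/backup-disk
#   $2 = maximum number of checks (defaults to 10)
# Returns 0 as soon as the directory exists, 1 if it never appears.
wait_for_dir() {
    dir=$1
    tries=${2:-10}
    while [ "$tries" -gt 0 ]; do
        [ -d "$dir" ] && return 0
        sleep 1
        tries=$((tries - 1))
    done
    return 1
}

# Succeed if a new backup is due.
#   $1 = unix time of the last backup (empty means no backup yet)
#   $2 = current unix time
#   $3 = minimum age in seconds (defaults to 86400, i.e. 24 hours)
backup_due() {
    last=${1:-0}
    now=$2
    min_age=${3:-86400}
    [ $((now - last)) -ge "$min_age" ]
}
```

With these, the early exits become wait_for_dir "$MOUNTPOINT" || exit 0 and backup_due "$LAST_BACKUP_TIME" "$NOW_TIME" || exit 0. The trade-off is that an unmount trigger now waits for the full polling period before exiting.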

SIP woes, again

If the launchd job calls the script directly, there is a problem even before the archive creation starts:

Sat Jun 06 12:34:56 EDT 2020 Starting local backup

Local Exception
Traceback (most recent call last):
  File "borg/locking.py", line 130, in acquire
FileExistsError: [Errno 17] File exists: '/Volumes/backup-disk/backup/mbp2015/lock.exclusive'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "borg/archiver.py", line 4565, in main
  File "borg/archiver.py", line 4497, in run
  File "borg/archiver.py", line 161, in wrapper
  File "borg/repository.py", line 190, in __enter__
  File "borg/repository.py", line 421, in open
  File "borg/locking.py", line 350, in acquire
  File "borg/locking.py", line 363, in _wait_for_readers_finishing
  File "borg/locking.py", line 134, in acquire
  File "borg/locking.py", line 159, in kill_stale_lock
PermissionError: [Errno 1] Operation not permitted: '/Volumes/backup-disk/backup/mbp2015/lock.exclusive'

Platform: Darwin mmxmbs-MacBook-Pro.local 19.5.0 Darwin Kernel Version 19.5.0: Tue May 26 12:34:56 PDT 2020; root:xnu-6153.121.2~2/RELEASE_X86_64 x86_64
Borg: 1.1.13  Python: CPython 3.5.9 msgpack: 0.5.6
PID: 50157  CWD: /
sys.argv: ['/usr/local/bin/borg', 'create', '--verbose', '--list', '--filter=AME', '--stats', '--show-rc', '--compression', 'auto,lzma,6', '::{hostname}-daily-{now}', '/Users/mmxmb/my_important_docs', '/Users/mmxmb/photos', '/Users/mmxmb/Desktop']
SSH_ORIGINAL_COMMAND: None

terminating with error status, rc 2

SIP restricts access to removable volumes, so the trick of creating a wrapper binary and giving it FDA has to be applied here as well.