Configuration
Basic configuration on any host to backup
Stopping and starting processes: ssd-backup
If you want to create a snapshot of all backed up filesystems together and not of each filesystem separately, you need to stop all processes that write to the backed-up locations, temporarily remount the filesystems read-only and then create the snapshots. After the snapshots are created, the filesystems can be remounted back read-write and the processes can be started again. This task of stopping and starting daemons or processes is performed by the ssd-backup script that is part of the Tar-LVM suite.
Let's now look at the ssd-backup configuration. This script is configured by default using the file /usr/local/etc/ssd-backup.conf.
### disable sysv (System V) or systemd (Systemd) init scripts support,
### just comment for the default behaviour, i.e. sysv/systemd support
# format: nosysv ("true"|"false")
# format: nosystemd ("true"|"false")
#nosysv "true"
#nosystemd "true"
### sysv or systemd services to stop or start depending on the mode
# format: stopstart sysv|systemd <service> [[pidfile=...][,][psname=...]]
stopstart systemd sssd.service
stopstart systemd cron.service
stopstart systemd postfix.service
stopstart systemd slapd.service
stopstart systemd denyhosts.service
stopstart sysv rsyslog
stopstart systemd dbus.service
### extended regular expressions specifying the names of processes
### to kill in the stop mode (usually not services)
# format: kill <regexp>
#kill "^console-kit-dae"
### real user names whose processes shouldn't be killed (should survive)
### when the -u option is used, i.e. when all non-root user processes
### should be killed (root is always included and doesn't have
### to be listed)
# format: survruser <user>
survruser message+
survruser ntp
### commands to run at the end of the stop mode
# format: stopcomm <command> <arg1> ... <argN>
#stopcomm echo "SSD stopped..."
### commands to run at the beginning of the start mode
# format: startcomm <command> <arg1> ... <argN>
#startcomm echo "SSD starting..."
Supported directives are concisely described directly in the configuration file.
The nosysv or nosystemd directives can disable the System V or systemd support and can thus allow usage of this script even if no service or systemctl binary is found.
The stopstart directives define all services to be stopped in the ssd-backup stop mode or started in its start mode. The services are stopped in the specified order if they're running and started in the reverse order depending on their initial status. The directives also make it possible to specify an optional PID file and/or process name to identify running services if the operating system itself doesn't provide this information for a certain service.
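For example, if the status of a System V service cannot be determined reliably, a hypothetical entry with both optional fields could look as follows (the PID file path and process name are just an illustration):
stopstart sysv rsyslog pidfile=/var/run/rsyslogd.pid,psname=rsyslogd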
The kill directive makes it possible to kill certain processes based on their names. The survruser entries list all real usernames whose processes shouldn't be killed if the -u argument is used. This argument instructs ssd-backup to kill all user processes, i.e. all non-root processes that are not excluded by survruser.
The stopcomm and startcomm directives can be used to execute arbitrary commands at the end of the stop mode or at the beginning of the start mode. They can therefore be used for any non-standard operation, e.g. remounting filesystems read-only or read-write, cleaning up certain locations, sending a message to users or anything else that's scriptable.
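As a purely illustrative example, the following lines would warn logged-in users before the services go down and write a syslog entry when they come back up (the messages are made up):
stopcomm wall "Backup is starting, services will be down for a while..."
startcomm logger -t ssd-backup "services are being started again"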
See the ssd-backup help for more information and a complete list of its arguments.
ssd-backup -h
It's wise to confirm that ssd-backup is configured properly before continuing, simply by invoking a command sequence similar to the following one. But be prepared for a short downtime of your services.
ssd-backup -u -v stop
mount -o remount,ro /var
mount -o remount,ro /
...
mount -o remount,rw /
mount -o remount,rw /var
ssd-backup -u -v start
All the ssd-backup and mount commands should succeed, of course, if the configuration is correct. If you need to identify the processes that are using files on a certain filesystem, try the lsof or fuser commands.
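For instance, assuming /var is a separate mount point, either of the following commands lists the processes holding files open on it:
lsof /var
fuser -vm /var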
Creating and removing snapshots, backing up: tar-lvm
A snapshot of the writable filesystems backed up from inside a running operating system can be created either by remounting the filesystems read-only, creating LVM snapshots and remounting the filesystems back to their original state, most often read-write, or by keeping the writable filesystems remounted read-only during the whole backup process. However, the downtime of your services is much longer in the latter case.
All these operations are performed by the tar-lvm script. Its pre mode remounts the filesystems read-only, creates LVM snapshots and remounts the LVM filesystems back read-write. Its run mode creates the tar backups and the post mode remounts the non-LVM filesystems back to their initial state.
The tar-lvm script is configured in the file /usr/local/etc/tar-lvm/tar-lvm.conf which has the following syntax.
### suffix of the LVM snapshot names appended to the name of the origin
# format: lvsnapsuffix <suffix>
lvsnapsuffix ".tar-lvm"
### disables ACLs support if set (necessary for older GNU/tar versions),
### just comment for the default behaviour, i.e. ACLs support
# format: noacls ("true"|"false")
#noacls "true"
### filesystems not on LVM
# format: fs <name> (<device-path>|UUID=<uuid>) [<path-to-exclude> ...]
fs "boot" "UUID=621393c4-1827-4b6a-b053-1f249a844626"
### filesystems on top of LVM
# format: lv <name> <group> <snapshot-size>% [<path-to-exclude> ...]
lv "rootfs" "mg-baxic-prod" "20%"
lv "usr" "mg-baxic-prod" "20%"
lv "var" "mg-baxic-prod" "80%"
lv "srv" "mg-baxic-prod" "80%"
# don't backup tmp because it's only temporary location and its contents is
# often deleted on system startup on some systems
#lv "tmp" "mg-baxic-prod" "80%"
lv "vartmp" "mg-baxic-prod" "80%"
lv "varmail" "mg-baxic-prod" "80%"
# don't backup varlock because it's only temporary location and its contents
# is often deleted on system startup on some systems, moreover, varlock cannot
# be remounted read-only because LVM creates file locks in it during snapshot
# creation
#lv "varlock" "mg-baxic-prod" "80%"
lv "home" "mg-baxic-prod" "20%" "./baxic/data"
Supported directives are concisely described directly in the configuration file.
The lvsnapsuffix entry defines the suffix that is appended to the names of the LVM volumes to get the corresponding LVM snapshot volume names.
The noacls entry can disable support for POSIX ACLs. This becomes important on older systems whose GNU tar version doesn't support the --acls option yet.
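A rough way to check whether the installed GNU tar knows the option at all is to search its help output; if the option is missing, set noacls:
tar --help | grep -q -- '--acls' && echo "ACLs supported" || echo "no ACL support, set noacls"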
All other directives, i.e. the fs and lv items, specify the filesystems to back up. The fs directive refers to non-LVM devices with filesystems that are remounted read-only during the whole backup process. The device can be specified either by its device path or by its UUID.
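For instance, the boot filesystem from the example above could equally well be referenced by a device path instead of its UUID; the path below is purely hypothetical:
fs "boot" "/dev/sda1"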
The lv lines refer to LVM logical volumes by their names and the volume groups they belong to. Because a new snapshot logical volume is created for each logical volume to back up in the tar-lvm pre mode, its size must be specified as a percentage of the size of its origin.
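The snapshots are allocated from the free space of the volume group, so it's worth checking beforehand that enough unallocated space is available, e.g. with the vgs command and the volume group from the example above:
vgs -o vg_name,vg_size,vg_free mg-baxic-prod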
Furthermore, both fs and lv entries can contain additional optional arguments that list directories or files that should be excluded from the backup. See the lv entry for the home filesystem above for an example. Each path should always start with a dot and is relative to the filesystem root directory.
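A line excluding several locations could look as follows; the excluded paths are purely illustrative:
lv "srv" "mg-baxic-prod" "80%" "./cache" "./tmp/sessions"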
To confirm the validity of the new configuration, the following command sequence should succeed. Be prepared for a short downtime of your services again if you're using LVM, or for a longer downtime spanning the whole backup process if your read-write filesystems are not on LVM.
mkdir /tmp/tar-lvm-test
ssd-backup -u -v stop
tar-lvm -v pre
tar-lvm -v -f run 0 /tmp/tar-lvm-test
tar-lvm -v post
ssd-backup -u -v start
Automating the backup
Wrapper script for one host backup: tar-lvm-one
The tar-lvm-one wrapper script must be configured to simplify the automation of a backup of a specific host. This script invokes the commands described and configured earlier and also mounts and unmounts the backup device in between.
This wrapper script can be configured either on each host separately in local configuration files or in a shared configuration file located on one host denoted as allhost in the local configuration. The syntax of both files is identical and the directives specified in the local configuration file override the shared directives. The shared configuration doesn't have to be used; this is the case if no allhost directive is present in the local file. However, the local configuration file cannot be omitted and must specify all needed directives locally or at least the allhost directive pointing to the host with the shared configuration.
Let's now look at an example configuration shared among several hosts from one allhost. The local configuration must then contain the allhost directive on each host. All other entries are optional and become useful only if the shared configuration should be overridden by something more specific on a certain host.
The local configuration file /usr/local/etc/tar-lvm/tar-lvm-one.local.conf must contain at least the following part in our example.
# hostname of the machine that contains the shared configuration file
# (and usually runs the tar-lvm-all script if used - hence the name),
# the machine must be accessible by ssh public key authentication, i.e.
# without using a password, comment the allhost line if no shared configuration
# file should be copied to localhost and used
# format: allhost <hostname>[:<hostname_fqdn>]
allhost "baxic-pm:baxic-kvm-1.domain.org"
In this case, the shared configuration file /usr/local/etc/tar-lvm/tar-lvm-one.shared.conf must define all other directives.
# disable sshfs backup filesystem support, just comment for the default
# behaviour, i.e. both block device and sshfs support
# format: nosshfs ("true"|"false")
#nosshfs "true"
# backup device, either local device or remote sshfs filesystem, the local
# device is specified by name (not by whole path) or by UUID and the device
# must be located in the /dev directory, the remote sshfs filesystem is
# specified by the host and optional user and path
# format: dev (<name>|UUID=<uuid>|[<user>@]<host>:[<path>])
#dev "UUID=f3d286d1-aa4a-6f32-a367-ab93e72cbfa8"
dev "root@baxic-nas.domain.org:/backup-data"
# device mapper name used by cryptsetup if the backup device is local and
# if it is encrypted by LUKS, comment the dm line if the backup device isn't
# encrypted
# format: dm <dmname>
dm "backup"
# backup filesystem mount point
# format: mntdir <bkpfsmntdir>
mntdir "/mnt/backup"
# directory on the backup filesystem containing the backup directory tree
# format: rootdir <bkprootdir>
rootdir "tar-lvm"
# defer ssd-backup start after the whole backup is complete or comment
# for the default behaviour, i.e. ssd-backup start after tar-lvm pre
# (ssd-backup stop, tar-lvm pre, ssd-backup start, tar-lvm run, tar-lvm post)
# format: deferssdstart ("true"|"false")
#deferssdstart "true"
The nosshfs entry suppresses usage of the remote SSHFS backup filesystem and also the check that the sshfs and fusermount binaries are present. It should therefore be used on systems without these binaries installed.
As mentioned earlier, the backup device can be either a local device defined by its name or UUID, or a remote SSHFS filesystem. Which of these is used is determined by the syntax of the dev line.
If the backup filesystem is located on a local device and if it is encrypted by LUKS, the dm line is required. It instructs tar-lvm-one to ask for a password and decrypt the device before it is mounted.
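If you're unsure whether the device actually carries a LUKS header, cryptsetup can verify it; the device path below is just a placeholder:
cryptsetup isLuks /dev/sdb1 && echo "LUKS encrypted"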
The mntdir entry specifies the mount point of the backup filesystem, i.e. the directory to which the backup filesystem is mounted during the backup.
The rootdir line specifies a path to an existing directory on the backup filesystem where the backups and logs should be stored.
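Since the directory must already exist, it may have to be created once manually. With the remote SSHFS backup location from the example above, something like the following one-liner would do; adapt the host and paths to your setup:
ssh root@baxic-nas.domain.org mkdir -p /backup-data/tar-lvm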
There's one more optional entry in the configuration file and that's the deferssdstart flag. If it is set to true, it instructs the script to defer ssd-backup start until the whole backup is complete. By default, ssd-backup start is invoked immediately after the LVM snapshots are created, i.e. before the tar backup itself. Deferring the start comes in handy if some read-write filesystem is not located on LVM and no snapshot can thus be created for it.
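For reference, the resulting order of operations is one of the following:
default:       ssd-backup stop, tar-lvm pre, ssd-backup start, tar-lvm run, tar-lvm post
deferssdstart: ssd-backup stop, tar-lvm pre, tar-lvm run, tar-lvm post, ssd-backup start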
The tar-lvm-one configuration and backup can be tested as follows:
tar-lvm-one -f all 0
Choosing the triggering model: centralized or distributed
There are two ways to trigger the backups on all hosts: centralized or distributed. The centralized way must be used if you need to back up a whole physical machine together with separate backups of its KVM virtual machines to a device that is connected directly to the physical machine. The reason is that the backup device needs to be attached to and detached from the virtual machines. However, if you want to back up everything via SSH, with the exception of the machine the device is connected to, you can choose either the distributed or the centralized triggering model.
The main difference between the two models, even when using only SSHFS storage, is that the centralized way is managed by the tar-lvm-all script which controls the whole process: the backups are triggered sequentially and there's a maximum number of parallel backups that can run at once. The distributed way is managed and controlled on each host separately; backups are usually triggered from the Cron scheduler at a specific time and they all run independently of each other. Both models use the tar-lvm-one wrapper script. This script is invoked by tar-lvm-all in the former case, i.e. in the centralized model, and directly from Cron in the latter case, i.e. in the distributed model.
In fact, both models can be combined and used at once, but there's one significant limitation. If tar-lvm-all attaches the backup device to some of its virtual machines that store the backups directly to this device, the distributed model shouldn't be mixed with the centralized one unless it uses a different backup device. When using direct access, the backup device can be mounted by only one machine at a time to avoid data corruption.
Centralized triggering model: tar-lvm-all
If you chose the centralized model of triggering the backups, let's proceed with the configuration of the tar-lvm-all script. This script must run on the physical machine the backup device is connected to if this machine or its KVM virtual machines should access the backup device using the more efficient direct access method instead of SSHFS. Otherwise, if all hosts to back up store their backups remotely using SSHFS, it can run on any host, even on a host that's not going to be backed up.
The tar-lvm-all configuration file is located at /usr/local/etc/tar-lvm/tar-lvm-all.conf and it looks as follows.
# backup device on the physical machine if at least one backup to local
# device should be performed (not needed for SSHFS), the device is
# specified by name (not by whole path) or by UUID and it must be
# located in the /dev directory, it can be either whole disk,
# partition or volume etc.
# format: pmdev (<name>|UUID=<uuid>)
pmdev "UUID=0da345bc"
# backup device to create on the virtual machines if at least one backup
# to local device should be performed (not needed for SSHFS), the device
# is specified by name (not by whole path)
# format: vmdev <name>
vmdev "vdb"
# set to "true" to enable password prompt, the prompt is needed if the backup
# device is encrypted and a password should be therefore passed to tar-lvm-one
# (for each virtual machine), just comment if not needed
# format: passprompt ("true"|"false")
passprompt "true"
# set to "true" if you want to backup the physical machine as well, otherwise
# comment or set to "false", the machine must be accessible as localhost by ssh
# public key authentication, i.e. without using a password
# format: pmbackup ("true"|"false")
pmbackup "true"
# names of the virtual machines to backup, the names are hostnames as well,
# but you can specify different hostname appended to the machine name
# behind a colon, e.g. as "machine:host.domain.org", the machines must
# be accessible by ssh public key authentication, i.e. without using
# a password
# format: vm <hostname>[:<hostname_fqdn>]
vm "baxic-prod"
vm "baxic-prod-old"
# hosts to backup using sshfs, the names are hostnames as well, but
# you can specify different hostname appended to the host name
# behind a colon, e.g. as "host:host.domain.org", hosts must be accessible
# by ssh public key authentication, i.e. without using a password
# format: host <hostname>[:<hostname_fqdn>]
host "baxic-test-1"
host "baxic-test-2"
host "baxic-test-3"
# number of hosts to backup in parallel, remaining hosts must wait until
# the preceding backups finish, this setting doesn't apply to vm's (i.e.
# virtual machines) that are always backed up one by one
# format: parhostnum <number>
parhostnum 2
# notification email addresses
# format: mailto <email>
mailto "backup@domain.org"
# SMTP server if neither mail nor mailx (i.e. local MTA) should be used
# and SMTP should be used directly, simply comment if local MTA should be
# used instead
# format: smtpserver smtp[-tls|s]://[USER:PASS@]SERVER[:PORT]
#smtpserver "smtp-tls://user@gmail.com:secret@smtp.googlemail.com"
# notification email sender if smtpserver is specified
# format: mailfrom <email>
mailfrom "user@gmail.com"
If at least one KVM virtual machine is backed up to a local device connected to the physical machine and not to a remote SSHFS filesystem, both pmdev and vmdev entries must be defined. The pmdev entry defines the backup device on the physical machine that should be connected as the vmdev device to the virtual machines. However, both entries are required only if at least one vm entry is present.
Another optional entry is the passprompt flag. It instructs tar-lvm-all to ask for a password that should be passed to tar-lvm-one if the backup device is encrypted.
The pmbackup flag determines whether to back up the physical machine (i.e. the machine the script runs on) or not. If it is set to true, root@localhost must be accessible from root@localhost by SSH public key authentication, because the backup is performed in the same way as when backing up any other host; only the backup device doesn't have to be attached.
The most important entries are the vm and host directives that specify the nodes to back up, more specifically the virtual machines with direct access to the backup device and the hosts to back up remotely via SSH. The names are especially important in the vm lines because they are equal to the names of the KVM virtual machines. They can also be used as DNS hostnames if they belong to the resolver search domain, but if the names are not resolvable, a colon can be used to append an FQDN, IP address or another resolvable hostname.
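For example, if the KVM domain name baxic-prod didn't resolve in DNS, the vm line could append a resolvable name after a colon; the FQDN below is fictitious:
vm "baxic-prod:baxic-prod.domain.org"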
There's one more entry that influences the way the remote SSHFS backups are triggered - the parhostnum directive. It specifies the maximum number of hosts to back up in parallel; if this number of hosts being backed up is reached, all remaining hosts must wait.
The remaining lines configure email notification and they're mostly self-explanatory. The only important thing that may not be obvious is that if no smtpserver directive is specified, the local MTA is used to deliver the emails. If it is specified, the emails are delivered directly via SMTP to the specified mail server. However, that's not the preferred solution: if the delivery fails, there's no way to report the error and no later delivery attempt is made, so the error output can get lost easily.
If you want to test the backup for all machines and hosts, simply invoke tar-lvm-all on the host that manages the backup process centrally.
tar-lvm-all -v -f all 0
Distributed triggering model: tar-lvm-one
If you chose the distributed model of triggering the backups, there's no need to configure any other tool, because tar-lvm-one is used for this purpose and this script was already configured earlier.
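In the distributed model, each host typically runs tar-lvm-one from its own Cron schedule, e.g. via an /etc/cron.d entry similar to the following sketch; the installation path, schedule and backup level are assumptions and have to be adapted to your setup:
# /etc/cron.d/tar-lvm-one: weekly level 0 backup on Sunday at 02:00 (illustrative)
0 2 * * 0  root  /usr/local/sbin/tar-lvm-one -f all 0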