# Removing Snapshots

The most frequently requested feature for restic is the ability to remove snapshots from the repository. Sometimes restic was even (rightfully) criticised for lacking it.

After about three months of work, PR #518 was merged into the master branch a few days ago. This pull request brings two new commands to restic, forget and prune, which allow you not only to remove a single snapshot manually but also to specify a policy according to which restic automatically removes snapshots (so you don’t have to bother with them). The remainder of this post gives a short introduction to using the new commands to implement your own strategy for limiting the growth of the restic repository.

For all of the following commands the repository location and password have been written to the environment variables RESTIC_REPOSITORY and RESTIC_PASSWORD so that the commands can be run directly. This is how to do it:

Please note that this feature is not yet contained in any released version of restic; you need to compile the code from the current (as of 22 August 2016) master branch yourself.

## Removing a single snapshot

Let’s suppose you have a restic repository and ran a backup at 5:00 in the morning each day this year. Running the snapshots command shows you around 235 snapshots:

The forget command allows removing snapshots. When a snapshot ID, such as 6e001a58 for the first snapshot made on 1 January 2016, is passed as the argument to the command, that snapshot is deleted from the repository:

A snapshot in a restic repository is really just a pointer to the data that was present when the snapshot was made. Removing a snapshot does not remove the data from the repository; unreferenced (and therefore unneeded) data is removed only when the prune command is run:

In this example prune finished quickly, but checking the references for each blob of data can take much longer. Restic combines several blobs of data into so-called “pack” files. When a pack file is found to contain some data that is still referenced and other data that isn’t needed any more, restic creates a new pack file, writes the needed data to it, and then removes the original pack file. This process can also take some time.
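The two steps described above, finding unreferenced blobs and rewriting partially used pack files, can be sketched as a toy model. This is only an illustration in Python, not restic’s actual implementation (restic is written in Go). The function name `prune`, the dict-based representation of snapshots and packs, and the `-repacked` suffix are all assumptions made for this example.

```python
def prune(snapshots, packs):
    """Toy model of prune.

    snapshots: dict mapping snapshot ID -> set of blob IDs it references
    packs:     dict mapping pack ID -> set of blob IDs stored in that pack
    Returns a new packs dict that contains only referenced blobs.
    """
    # Collect every blob that a remaining snapshot still points to.
    referenced = set()
    for blobs in snapshots.values():
        referenced |= blobs

    result = {}
    for pack_id, blobs in packs.items():
        needed = blobs & referenced
        if needed == blobs:
            # Fully used pack: keep it untouched.
            result[pack_id] = blobs
        elif needed:
            # Partially used pack: write the needed blobs to a new pack
            # and drop the original one.
            result[pack_id + "-repacked"] = needed
        # Packs without any referenced blob are dropped entirely.
    return result


# Hypothetical repository state after another snapshot has been forgotten.
snapshots = {"snap-b": {"blob1", "blob2", "blob3"}}
packs = {
    "pack1": {"blob1", "blob2"},  # fully referenced
    "pack2": {"blob3", "blob4"},  # partially referenced
    "pack3": {"blob5"},           # not referenced at all
}
pruned = prune(snapshots, packs)
```

In this model `pruned` contains `pack1` unchanged and a rewritten `pack2-repacked` holding only `blob3`, while `pack3` is gone. The sketch rewrites each pack on its own to keep the idea visible.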

## Applying an expire policy

Removing a single snapshot is useful, but not very convenient. Let’s check out the specific parameters of the forget command:

The most important parameter is --dry-run, which will only print the snapshots that would be removed according to the policy set by the other parameters.

The basic idea is that you run forget with parameters that tell restic which snapshots you want to keep. Restic then goes through the list of snapshots and removes those that do not match the policy.

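Let’s try this with a simple policy: restic should keep the last seven daily snapshots, the last eight weekly snapshots, and one monthly snapshot for each of the last 24 months. With the options explained in the next section, this is forget --keep-daily 7 --keep-weekly 8 --keep-monthly 24: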

You can see that when this command is run without --dry-run, restic will remove a lot of snapshots (213 of 235):

Afterwards, the list of snapshots is a lot shorter:

## How does restic find the snapshots to remove?

It is important to know how forget filters the list of snapshots, so we’ll go through this in detail now. First, restic lists all snapshots and splits them into separate lists, one for each combination of host name and saved directories. In our example above, just one host name (mopped) and one directory (/home/fd0/tmp/data) were saved, so there is just one list to go through.
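The grouping step can be illustrated with a minimal Python sketch. It is not restic’s code. The function name `group_snapshots`, the dict layout of a snapshot, the ID `0a1b2c3d`, and the second host `otherhost` are hypothetical, and sorting the paths before comparing them is an assumption of this example.

```python
from collections import defaultdict


def group_snapshots(snapshots):
    """Group snapshot IDs by (host name, saved directories)."""
    groups = defaultdict(list)
    for snap in snapshots:
        # Assumption: the order of the paths does not matter for grouping.
        key = (snap["hostname"], tuple(sorted(snap["paths"])))
        groups[key].append(snap["id"])
    return dict(groups)


snapshots = [
    {"id": "6e001a58", "hostname": "mopped", "paths": ["/home/fd0/tmp/data"]},
    # Hypothetical second snapshot of the same host and directory.
    {"id": "0a1b2c3d", "hostname": "mopped", "paths": ["/home/fd0/tmp/data"]},
    # Hypothetical second host, not part of the example repository.
    {"id": "00000001", "hostname": "otherhost", "paths": ["/etc"]},
]
groups = group_snapshots(snapshots)
```

Each resulting list would then be filtered by the policy independently of the others.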

Restic then sorts the list from newest to oldest snapshot and does the following, in exactly this order:

When --keep-last is set, e.g. to the value 10, the newest ten snapshots are marked to be kept and taken off the list.

When --keep-hourly is set, e.g. to the value 4, restic finds the four most recent hours in which a snapshot was created. For each of those hours, it marks the last snapshot to be kept and flags the others for removal. It then removes all snapshots for these hours from the list.

It’s easier than it sounds. Consider the following snapshots in a repo:

When forget --keep-hourly 4 is run, restic first finds the two snapshots at 19:24:00 and 19:53:23. They fall into the same hour (starting at 19:00:00 and ending at 19:59:59), and restic keeps only the last snapshot for this hour. This means that 98fb9f00 is kept and d221a465 is removed. The next hour that has a snapshot starts at 18:00:00, the one after that at 05:00:00, and so on. This is the result of running forget --keep-hourly 4:

When --keep-daily is set, e.g. to the value 7, restic applies an approach similar to --keep-hourly: it goes through the list and finds the last seven days on which at least one snapshot was made. For each of those days, it keeps the last snapshot made on that day, flags the others for removal, and removes all snapshots for those days from the list.

The options --keep-weekly, --keep-monthly and --keep-yearly are applied in the same way.
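The whole procedure can be sketched in a few lines of Python. This is a simplified model of the steps as described in this post, not restic’s implementation. The names `apply_policy` and `PERIOD_KEYS` are made up for the example, and the use of ISO calendar weeks for the weekly rule is an assumption. In the sample data only the IDs 98fb9f00 and d221a465 and their times of day come from the example above; the dates and the remaining IDs are hypothetical.

```python
from datetime import datetime

# Two snapshots belong to the same period if their keys are equal.
# The dict order is the order in which the rules are applied.
PERIOD_KEYS = {
    "hourly": lambda t: (t.year, t.month, t.day, t.hour),
    "daily": lambda t: (t.year, t.month, t.day),
    "weekly": lambda t: tuple(t.isocalendar()[:2]),  # assumption: ISO weeks
    "monthly": lambda t: (t.year, t.month),
    "yearly": lambda t: (t.year,),
}


def apply_policy(snapshots, keep_last=0, **keep):
    """Return (kept, removed) for a list of (snapshot_id, datetime) tuples.

    keep maps a period name from PERIOD_KEYS to the number of periods
    to keep, e.g. apply_policy(snaps, daily=7, weekly=8, monthly=24).
    """
    remaining = sorted(snapshots, key=lambda s: s[1], reverse=True)
    kept, removed = [], []

    # --keep-last: keep the newest n snapshots and take them off the list.
    kept.extend(remaining[:keep_last])
    remaining = remaining[keep_last:]

    for period, key in PERIOD_KEYS.items():
        n = keep.get(period, 0)
        if n == 0:
            continue
        # The n most recent periods that contain at least one snapshot.
        periods = []
        for _, t in remaining:
            if key(t) not in periods:
                periods.append(key(t))
        periods = periods[:n]

        seen, rest = set(), []
        for snap in remaining:
            k = key(snap[1])
            if k not in periods:
                rest.append(snap)     # left for the following rules
            elif k in seen:
                removed.append(snap)  # older snapshot in a kept period
            else:
                seen.add(k)
                kept.append(snap)     # last snapshot of its period
        remaining = rest

    # Snapshots that matched no rule do not match the policy.
    return kept, removed + remaining


snaps = [
    ("98fb9f00", datetime(2016, 8, 21, 19, 53, 23)),
    ("d221a465", datetime(2016, 8, 21, 19, 24, 0)),
    ("aaaa0001", datetime(2016, 8, 21, 18, 10, 0)),
    ("aaaa0002", datetime(2016, 8, 21, 5, 0, 0)),
    ("aaaa0003", datetime(2016, 8, 20, 5, 0, 0)),
    ("aaaa0004", datetime(2016, 8, 19, 5, 0, 0)),
]
kept, removed = apply_policy(snaps, hourly=4)
```

With `hourly=4`, the sketch keeps 98fb9f00 and the three hypothetical snapshots from the next three hours that contain a snapshot. It removes d221a465, which shares an hour with 98fb9f00, and the oldest snapshot, which falls outside the four most recent hours.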

## Conclusion

This article described an easy way to remove a single snapshot and also explained how to apply an expire policy for snapshots. This allows regularly removing snapshots from the repository to limit its growth.

The functions to remove snapshots and unneeded data from the repository are new. Please report an issue if you notice any odd behavior or find bugs.