Dealing with GridPane Backups (duplicacy)


Draft Warning

You’ve reached a draft 🤷‍♂️ and unfortunately, it’s a work in progress.

Introduction

GridPane’s backup system is built on duplicacy. This article explains how it works and how to manage backups from the terminal.

This stems from the following live blog:

https://wpguide.io/live-blog/blog-articles/gridpane-duplicacy-leaving-38gb-of-fossils-behind

Investigating Backup Storage Sizes

Reviewing duplicacy Snapshot Sizes

You can run the following command to see every backup for each site and how much storage each one is using.

duplicacy check -tabular | less
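Note that duplicacy commands must be run from inside a repository, i.e. a directory containing a .duplicacy folder. If you’re not sure where that is on a GridPane server, the following sketch (reusing the same find as the one-liner further down) changes into the first repository found under /var/www and runs the check from there:

cd "$(dirname "$(find /var/www/ -name ".duplicacy" -print -quit)")"
duplicacy check -tabular | less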

In the tabular output you will see all the snapshots. The chunks/bytes columns show the total chunks and bytes each snapshot references. The uniq/bytes columns show unique data that can’t be deduplicated away. The new/bytes columns show data duplicacy hadn’t seen before that snapshot. The row at the bottom sums each column, so the figure under chunks/bytes is the total space used in MB.
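If you only want the summary rows rather than paging through every revision, you can filter the output for the totals (this is the same grep the one-liner below uses):

duplicacy check -tabular | grep Total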

What are fossil files?

https://forum.duplicacy.com/t/prune-command-details/1005

The prune command implements the two-step fossil collection algorithm. It will first find fossil collection files from previous runs and check if contained fossils are eligible for permanent deletion (the fossil deletion step). Then it will search for snapshots to be deleted, mark unreferenced chunks as fossils (by renaming) and save them in a new fossil collection file stored locally (the fossil collection step).

For fossils collected in the fossil collection step to be eligible for safe deletion in the fossil deletion step, at least one new snapshot from each snapshot id must be created between two runs of the prune command. However, some repositories may not be set up to back up on a regular schedule, which effectively blocks other repositories from deleting any fossils. Duplicacy by default ignores repositories that have had no new backup in the past 7 days, and you can also use the -ignore option to skip certain repositories when deciding the deletion criteria.
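In practice this means fossils are only permanently deleted once prune has run at least twice, with new backups in between. For reference, a regularly scheduled prune looks something like the sketch below; the -keep retention values here are illustrative, not a GridPane default:

# Delete snapshots older than 180 days, keep one per 7 days for
# snapshots older than 30 days, and one per day for snapshots
# older than 7 days. Run from inside the repository, as above.
duplicacy prune -keep 0:180 -keep 7:30 -keep 1:7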

One-liner to check your fossil file sizes

Run this command on your server from any directory; it will detect and find the .duplicacy repository for you.

echo "\n** Checking duplicacy storage **"; \
echo -n "Total Chunks size of backup chunks: "; du --max-depth="0" -h "/opt/gridpane/backups/duplications/chunks"; \
echo "----"; \
echo -n "Total .fsl files: "; find /opt/gridpane/backups/duplications/chunks -name "*.fsl" | wc -l; \
echo -n "Total .fsl file size: "; find /opt/gridpane/backups/duplications/chunks -type f -name "*.fsl" -print0 | du --files0-from=- -hc | tail -n1; \
echo "----"; \
echo -n "Total normal chunk files: "; find /opt/gridpane/backups/duplications/chunks -type f ! -name "*.fsl" | wc -l; \
echo -n "Total normal chunk file size: "; find /opt/gridpane/backups/duplications/chunks -type f ! -name "*.fsl" -print0 | du --files0-from=- -hc | tail -n1; \
echo "----"; \
echo -n "Duplicacy reporting totals: "; \cd "$(dirname "$(find /var/www/ -name ".duplicacy" | tail -n 1)" )" >> /dev/null; duplicacy check -tabular | grep Total

Pruning Fossil Files

Attention

This may corrupt your backups. The -exclusive flag tells duplicacy it has exclusive access to the storage, so you need to make sure that no backups are running before or during this process.

First, let’s do a dry run, because we always do a dry run before doing something destructive.

duplicacy prune -exhaustive -exclusive -d

Once we’ve confirmed there’s a ton of fossil files and we want to prune them, remove the -d (dry-run) flag like so:

duplicacy prune -exhaustive -exclusive
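Once the prune finishes, it’s worth re-running the earlier checks to confirm the fossils are actually gone, for example:

# The fossil count should now be zero (or close to it).
find /opt/gridpane/backups/duplications/chunks -name "*.fsl" | wc -l
# And duplicacy's reported totals should now sit much closer to the on-disk usage.
duplicacy check -tabular | grep Total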

