  • Every hour. Could do it more frequently if needed.

    It depends on how resource-intensive the backup process is.

    Consider an 800GB Immich instance.

    Using Duplicity or rsync takes 1 hour per backup. 99% of the time is spent traversing the directory structure and checking which files have changed; 1% is spent transferring the differences to the backup. Any backup system that operates on top of the file system will take about this long. In addition, unless you’re using something that can take snapshots of the filesystem, you have to stop Immich during the backup to avoid capturing an invalid app state.
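
    For illustration, one such file-level cycle might look like this (the service name and paths are made up):

        # stop the app so we don't copy a half-written state
        systemctl stop immich
        # rsync has to walk every file to find the changes, hence the hour
        rsync -a --delete /srv/immich/ /mnt/backup/immich/
        systemctl start immich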

    Using ZFS send on the other hand (with syncoid) takes less than 5 seconds to discover the differences and the rest of the time is spent on the data transfer, at 100MB/s in my case. Since ZFS send is based on snapshots, I don’t have to stop the service either.
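
    The syncoid invocation is a one-liner; under the hood it’s an incremental ZFS send piped over SSH. Dataset and host names here are made up:

        # syncoid finds the latest common snapshot and sends only the delta
        syncoid tank/immich backup@nas:backup/immich

        # roughly what it automates for you:
        # zfs send -i tank/immich@prev tank/immich@latest | ssh backup@nas zfs recv backup/immich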

    When I used Duplicity, I would back up once a week because the backup process was long and heavy on the disk array. Since I switched to ZFS send, I do it once an hour because there’s almost no visible impact.
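
    The hourly run is just a cron entry (made-up names again):

        # /etc/cron.d/zfs-backup: replicate on the hour, every hour
        0 * * * * root /usr/sbin/syncoid tank/immich backup@nas:backup/immich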

    I’m now in the process of migrating my laptop to ZFS on root so that I can use ZFS send for regular full-system backups. If that works out, I’ll eventually move all my machines to ZFS on root.
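
    Once the root pool exists, the same tooling should cover the whole machine with a recursive send, something like this (pool and host names assumed):

        # replicate every dataset under the root pool, preserving the hierarchy
        syncoid --recursive rpool backup@nas:backup/laptop-rpool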





  • With 2 disks that would be type mirror in ZFS-speak, completely built-in. Equivalent to RAID1 in terms of hardware fault tolerance.

    You could do a 3-disk mirror, or an n-disk mirror really. The rough RAID5/6 equivalents are called RAIDzN, where N is the number of disk failures they tolerate, e.g. RAIDz1, RAIDz2, etc. You probably want a mirror unless you need more space than a single disk provides.
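
    For reference, creating either layout is a one-liner (disk names below are placeholders; use your real /dev/disk/by-id paths):

        # 2-disk mirror: RAID1-like, survives the loss of 1 disk
        zpool create tank mirror /dev/disk/by-id/ata-DISK1 /dev/disk/by-id/ata-DISK2

        # 4-disk RAIDz1: RAID5-like, also survives the loss of 1 disk
        zpool create tank raidz1 /dev/disk/by-id/ata-DISK{1..4}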


  • Yup, turn it on, let it do a scrub, then turn it off. I’d still use redundancy though, not merely to cover the case of the drive failing, but also to cover the bit rot case. It’s exceedingly unlikely that bits will rot at the exact same spot on two or more disks. When ZFS finds a checksum mismatch during a scrub (which indicates bit rot), it can trivially recover the data from the drive where the checksum matches, and it’ll then rewrite the rotten part.
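
    The scrub itself is two commands (pool name assumed):

        zpool scrub tank       # read every block and verify it against its checksum
        zpool status tank      # the "scan:" line reports anything repaired from redundancy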


  • ZFS with automatic snapshots and scrubbing. It will keep as many and as old snapshots as you like. It’ll ensure the files don’t rot. It’ll ensure the media doesn’t die, so long as you have enough redundancy and you replace disks as they die. This is what I’d trust for long-term storage, because I think I understand how and why it works. It should last as long as I feed it disks. If I delete something, I should be able to restore it from a snapshot. The hardware doesn’t need to be anything fancy. Just a Pi 4/5 with a couple of WD Elements would be fine. You could add more disks for more redundancy; I’m running 2-disk redundancy.
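
    With sanoid, for example, the snapshot policy is a small config file. The dataset name and retention numbers here are just an illustration:

        # /etc/sanoid/sanoid.conf -- sanoid's own INI-style format
        [tank/archive]
                use_template = archive

        [template_archive]
                hourly = 0
                daily = 30
                monthly = 24
                yearly = 10
                autosnap = yes
                autoprune = yes

    Restores are then just file copies out of the dataset’s read-only .zfs/snapshot directory.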

    You don’t have to touch the software if it’s not exposed to the Internet. Whatever works today on it will work 20 years from now, so long as the hardware works. A couple of spare Pis, SD cards and power supplies should let it last for decades.