[HN Gopher] BorgBackup: Deduplicating archiver with compression ...
       BorgBackup: Deduplicating archiver with compression and encryption
       Author : phil294
       Score  : 111 points
       Date   : 2022-12-27 19:01 UTC (3 hours ago)
 (HTM) web link (www.borgbackup.org)
 (TXT) w3m dump (www.borgbackup.org)
       | samuell wrote:
       | Have had a look at both Borg and Restic, but even Restic, which is
       | supposed to be faster iirc, was extremely slow on my computer.
       | Been much happier in my trials with https://kopia.io which
       | also includes an optional cross-platform GUI, in addition to the
       | CLI.
         | beci wrote:
         | I tried all these in my environment, about 2 years ago and
         | kopia wins for me too. Is there any advantage of borg over
         | kopia since then?
           | aborsy wrote:
           | Reliability. Borg has been around for a long time, and is far
           | more mature.
           | I wouldn't trust my backups to Kopia (unless for
           | experimentation).
         | nine_k wrote:
         | Wow, Kopia looks pretty interesting. I first thought that it's
         | a fork of Restic, but it appears to be independent. It has all
         | the features that are key to me: encrypted, deduplicated, works
         | on object storage, can mount the backup as a filesystem.
         | On one hand, one may think that three programs with very similar
         | approaches and features are a waste of resources. On the other
         | hand, this is what refinement of the idea looks like: each
         | project improves over previous attempts.
       | vbezhenar wrote:
       | Can someone suggest an approach to back up a container environment?
       | E.g. running inside Kubernetes.
       | As I see it: I write some kind of configuration.
       | someproject-db is a deployment which runs a postgres db. Tool
       | should connect to this DB, issue some kind of pg_backup command,
       | capture output, retrieve some metadata about previous backup from
       | S3, compute difference with previous run, compress that
       | difference and store it to S3.
       | anotherproject is a deployment which runs an sqlite db. Tool
       | should do the same but with sqlite-specific commands.
       | yetanotherproject-data is a pvc which has attached pv. Tool
       | should find pod which mounted this volume, exec into that pod and
       | retrieve pv data, again find the difference and store it to S3.
       | Of course things should be configurable. Like store difference
       | every 15 minutes, store complete backup every week and so on.
       | I'm fine with manual recover and with manual configuration (I
       | just don't want to write and test all the scripts myself).
       | What I don't want is some kind of magic tool which will back up
       | the entire cluster, etcd and my grandparents automatically in
       | some magic way only for $50k/cpu core.
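       | A minimal sketch of the schedule described above (store a
       | difference every 15 minutes, a complete backup every week). The
       | function name and the two interval constants are hypothetical;
       | it only decides which kind of run is due and backs up nothing.

```python
from datetime import datetime, timedelta
from typing import Optional

# Assumed intervals, taken from the example schedule in the comment above.
DIFF_INTERVAL = timedelta(minutes=15)
FULL_INTERVAL = timedelta(weeks=1)


def due_backup(now: datetime,
               last_full: Optional[datetime],
               last_diff: Optional[datetime]) -> Optional[str]:
    """Return "full", "diff", or None depending on what is due at `now`."""
    # No complete backup yet, or the last one is a week old: take a full one.
    if last_full is None or now - last_full >= FULL_INTERVAL:
        return "full"
    # A difference is measured from the most recent run of either kind.
    last_run = max(last_full, last_diff) if last_diff else last_full
    if now - last_run >= DIFF_INTERVAL:
        return "diff"
    return None
```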
         | m3nu wrote:
         | I recently wrote up my strategy for backing up local containers
         | with Borg & Borgmatic here:
         | https://docs.borgbase.com/setup/borg/containers/
         | Borgmatic will beautifully deal with DB dumps and there is a
         | popular container image to run it. As for the cache ("retrieve
         | some metadata about previous backup from S3"), you don't need
         | to keep it locally. It can be restored from the backup
         | repository.
         | Hope some of this applies to your K8s setup.
         | aborsy wrote:
         | Database and VM snapshot and backup can be tricky.
         | My suggestion is using ZFS.
           | mekster wrote:
           | Doesn't zfs kind of solve all the backup problems alone?
           | Technically, no other backup tool can beat it: being the
           | filesystem itself, it knows more than any external tool can,
           | like instantly knowing which files changed over time without
           | scanning the entire tree.
           | I use Borg as a backup of the backup (zfs snapshots), so I
           | have multiple backup implementations (both at different
           | remote locations) just to be on the safe side.
           | I don't use any other fancier ones as I don't like risking
           | data on less reliable tools.
             | m3nu wrote:
             | Remote ZFS replication kinda does. But the offsite backup
             | wouldn't be encrypted and not everyone is using ZFS. So
             | it's not for all situations.
               | mekster wrote:
               | You can send an encrypted zfs volume in its encrypted form.
               | How does it matter whether anyone else is using zfs? You
               | either use a service that supports a zfs target or run your
               | own Linux instance, which on Ubuntu is just a matter of
               | installing a single package.
               | m3nu wrote:
               | What I meant was that there are people who don't run ZFS,
               | but still need backups. So it won't work for everyone.
               | Even for my own use cases, not every server and system I
               | maintain could use ZFS right away.
               | Still good to know about the encrypted volume feature.
               | Will be sure to test this next year.
         | dpedu wrote:
         | I'm using Velero to do this in my toy kubernetes clusters. It
         | uses Restic under the hood and can store things into S3. By
         | default it will take a filesystem-level copy of whatever is on
         | a pv. It looks like it supports hooks, e.g. to run pg_backup
         | like you mentioned, but I haven't used them.
         | https://github.com/vmware-tanzu/velero
         | seymon wrote:
         | Is there something to back up helm releases, including all
         | k8s manifests, configmaps, secrets and also persistent volumes?
         | Preferably FOSS?
       | m3drano wrote:
       | I recommend using borgmatic to ease the management of Borg
       | backups.
       | mtmail wrote:
       | rsync.net has a special discount when you use borg and "you're an
       | expert" https://www.rsync.net/products/borg.html
       | We're looking to replace our self-written borg backup scripts
       | with https://torsion.org/borgmatic/ which is a wrapper around
       | borg.
       | nine_k wrote:
       | If you prefer a similar approach, but as a single compiled
       | binary, there's Restic: https://github.com/restic/restic
       | Update: yet _another_ take on basically the same approach, also
       | as a self-contained binary: https://github.com/kopia/kopia
         | kova12 wrote:
         | I recall trying to use restic instead of borg a couple of years
         | ago, and some major feature was unavailable. I don't recall
         | what it was; I think it was compression, the lack of which made
         | archives quite large and required larger instances for backup.
           | [deleted]
           | anotherevan wrote:
           | It probably was compression. The good news is compression is
           | now available with Restic!
         | RockRobotRock wrote:
         | Does anyone have a take on Kopia vs Restic?
           | btschaegg wrote:
           | A major reason I wouldn't want to use Kopia (I looked into
           | it) is that it is opinionated with regards to how your
           | system is set up (old-school unix FS layouts in a "pets, not
           | cattle" way). It assumes the location of config files and
           | does not allow you to change the backup path that's stored in
           | a snapshot's metadata.
           | That's bad if you want to use it
           | 1) on NixOS (I don't want backup configs laying around in
           | `~/.config`). As Indy famously said: "That belongs in a Nix
           | expression!"
           | 2) with ZFS snapshots (yes, I'm backing up
           | `/path/to/dataset/.zfs/snapshot/<timestamp>/foo/bar`, but
           | that should not be its path in the metadata!)
           | OTOH, it seems to have the upside that you can apparently
           | alter snapshots after the fact more easily (e.g. if you find
           | out you shouldn't have backed up that gigantic VM image you
           | just moved somewhere temporarily). I leave the decision on
           | whether this is a footgun or not to you.
           | And to be clear: the ZFS snapshot thing is a pain with
           | Restic, too. You can hack around it somewhat better with
           | something like systemd-nspawn, but it _really_ shouldn't be
           | that hard.
         | somishere wrote:
         | Big thumbs up for Kopia and its very simple GUI / strategy.
         | Have been using it for a couple of years now to remote backup
         | hard drives and working folders on a bunch of family macs to
         | B2. Restored twice now - logic board & corrupt hd. Chose it
         | after trialing both Borg and Restic for ease of use and storage
         | cost. My monthly backblaze bill still hovers around $1.40.
         | mekster wrote:
         | In my book, a backup tool doesn't count unless it has been used
         | widely for a while without major issues repeatedly being
         | reported.
         | Kopia is too new to be in that state.
           | Silhouette wrote:
           | That's a reasonable point but the solution - as with all
           | things backup - is diversification. Otherwise if everyone
           | followed your logic then no new backup software or storage
           | service could ever become established no matter how good it
           | actually was.
           | Given that all of the options being discussed here look
           | technically better and in some cases more than an order of
           | magnitude cheaper than other popular backup services and
           | software discussed on HN in the past, you could afford to run
           | full redundant backups with multiple combinations of software
           | and backing storage and still have more options for a much
           | lower price than a few years ago.
         | binaryanomaly wrote:
         | restic is great and simple to use. Use it for archiving my
         | backups to Google Cloud Storage.
         | auxym wrote:
         | restic also seems to have better Windows support.
         | Borg can run in WSL but has seen limited testing there, per
         | their own docs.
       | wyatt_dolores wrote:
       | I had to set up a quick backup to S3 storage to replace an aging
       | rsnapshot setup. I looked at Borg, but Duplicity
       | (https://duplicity.us/) was easier to configure and connect to
       | S3.
       | For syncing S3 storage across providers, I went with rclone
       | (https://rclone.org/). Note that using rclone to sync across
       | providers (e.g. from Amazon to Wasabi) does require the files to
       | be downloaded to the client machine and then uploaded again. Not
       | ideal, but if you have extra bandwidth it is a convenient setup.
         | ThomasWaldmann wrote:
         | I see quite some full backups in your near future. ;-)
         | And that is one of the main reasons why chunk-deduplicating
         | backup tools (like borg, restic, ...) are better than
         | full/incremental style ones.
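         | A minimal sketch of why chunk deduplication avoids repeated
         | full backups. It is a simplification: it cuts fixed-size chunks
         | and keeps them in a plain dict, whereas borg and restic use
         | content-defined chunking and an on-disk repository. All names
         | here (CHUNK_SIZE, store_backup, restore_backup) are hypothetical.

```python
import hashlib
from typing import Dict, List

CHUNK_SIZE = 4  # deliberately tiny so the effect is visible on short inputs


def store_backup(data: bytes, chunk_store: Dict[str, bytes]) -> List[str]:
    """Store `data` as chunks and return the list of chunk ids (the manifest)."""
    manifest = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        chunk_id = hashlib.sha256(chunk).hexdigest()
        # Only chunks not seen in any earlier backup take up new space.
        chunk_store.setdefault(chunk_id, chunk)
        manifest.append(chunk_id)
    return manifest


def restore_backup(manifest: List[str], chunk_store: Dict[str, bytes]) -> bytes:
    """Rebuild the original data by concatenating the referenced chunks."""
    return b"".join(chunk_store[chunk_id] for chunk_id in manifest)
```

         | Every manifest describes a complete backup on its own, so there
         | is no chain of incrementals that has to be reset with a new
         | full backup.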
       | bluedino wrote:
       | Borg works great. I used it at a shop that had lots of servers,
       | but didn't have any real backups, other than when someone would
       | remember to go swap external USB drives and hope the backups
       | actually ran.
       | Set it up on a bunch of servers with a simple cron job, the
       | initial backups went quickly, and the incrementals were really
       | fast. Made great use of an old Dell server that wasn't doing
       | anything else and had lots of slow disks in it.
         | m3nu wrote:
         | Borg is surprisingly fast and memory-efficient, even when
         | compared to Restic, which is written in Go. Recently did a
         | benchmark to test the upcoming Borg v2 and this surprised me
         | the most:
         | https://github.com/borgbase/benchmarks
           | number6 wrote:
           | I am always torn between the two: restic or Borg.. how would
           | I decide?
             | water554 wrote:
             | Use them both. I ended up at borg.
       | ThePhysicist wrote:
       | Have been using Borg for many years now, it saved me several
       | times already when I accidentally deleted stuff I realized I
       | still needed later on. What's great is that you can just mount
       | your backup repository as a FUSE filesystem, Borg then gives you
       | a directory structure containing all your backups over time.
       | Personally I use dates to name my backups, e.g. 2022-10-11, so
       | when I need to restore something from a specific date I just go
       | to the appropriate folder and extract it.
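       | A small sketch of the lookup this naming scheme allows: given
       | archive names such as 2022-10-11, pick the newest one taken on
       | or before the date you care about. The function name is
       | hypothetical, and the list of names is assumed to come from
       | listing the mounted repository's top-level directory.

```python
from datetime import date
from typing import Iterable, Optional


def archive_for(wanted: date, archive_names: Iterable[str]) -> Optional[str]:
    """Return the newest date-named archive taken on or before `wanted`."""
    candidates = []
    for name in archive_names:
        try:
            taken = date.fromisoformat(name)
        except ValueError:
            continue  # skip archives whose name is not an ISO date
        if taken <= wanted:
            candidates.append((taken, name))
    return max(candidates)[1] if candidates else None
```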
       | dang wrote:
       | Related:
       |  _Deduplicating Archiver with Compression and Encryption_ -
       | https://news.ycombinator.com/item?id=27939412 - July 2021 (71
       | comments)
       |  _BorgBackup: Deduplicating Archiver_ -
       | https://news.ycombinator.com/item?id=21642364 - Nov 2019 (103
       | comments)
       | nov21b wrote:
       | Just started using the append only feature to prevent a potential
       | hacker from wiping out backups that live on a remote ssh server.
       | Combined with restricted ssh access this can be made quite
       | secure. I also tested writing backups to my Android phone (as a
       | backup target) using Termux and Wireguard, worked flawlessly with
       | a bit of tuning (keeping the vpn alive)
         | pkulak wrote:
         | Append only modes are brilliant. Is there an easy way to hook
         | into something like Glacier Deep Archive? That would be super
         | cost effective.
       | trulyrandom wrote:
       | I've been using Borg for years. It's great! The deduplication
       | feature allows me to take a "full" backup of my work station
       | _hourly_. Taking frequent backups like this has already saved my
       | bacon a number of times in cases where I accidentally mangled or
       | deleted a file I didn't mean to touch.
       | I recently stumbled upon the release notes for the (WIP) v2:
       | https://www.borgbackup.org/releases/borg-2.0.html. Seems to
       | address quite a few of the pain points of v1.
         | mekster wrote:
         | Too bad this breaking change didn't add S3 endpoints, or any
         | remote target other than SSH; otherwise it would've been the
         | best of the bunch.
           | notpushkin wrote:
           | I'm wondering if there's a less painful way to use Borg with
           | https://rclone.org/ than just maintaining a local Borg repo
           | and then syncing that.
         | m3nu wrote:
         | Yepp. Version 2 got rid of lots of legacy code and cleaned up
         | CLI args a bit. Will be around 5 to 20% faster than the v1.2
         | branch.
         | https://github.com/borgbase/benchmarks
           | mustache_kimono wrote:
           | Yeah, last time I tried, it was impressive, but kinda slow
           | and limited to execution on a single core.
           | Eager to take another look at Borg and Kopia, etc.
         | beci wrote:
         | Why is kopia missing from your benchmark?
           | trulyrandom wrote:
           | Did you mean to reply to
           | https://news.ycombinator.com/item?id=34153119?
             | m3nu wrote:
             | Probably. My benchmark was mostly to compare Borg v1.2 and
             | v2 and some network optimizations. Restic was a stretch
             | goal really.
             | For Kopia, I do try it once a year, but I still find the
             | docs and CLI args confusing. Running the server part behind
             | a reverse proxy needs 2x HTTPS and searching the forum to
             | get it somewhat working. For a webdav target, the progress
             | display doesn't really work and it's not possible to cancel
             | a backup run. So for now I'm observing and will retry next
             | year.
       | aborsy wrote:
       | Borg is very good. The V2 repository format will bring in a lot
       | of improvements, particularly in cryptography.
       | Does anyone know when 2.0 will be out of beta and stable?
         | m3nu wrote:
         | Likely next year after 1-2 RCs. It's at beta4 currently.
       | haunter wrote:
       | What's good for Windows 10 (NTFS) drives? I've been using the
       | Veeam Agent free version [0] for years with no problems
       | whatsoever, but I'm curious what other good options exist.
       | 0, https://www.veeam.com/agent-for-windows-community-
       | edition.ht...
         | xupybd wrote:
         | I use restic on Windows servers but for workstations I use
         | backblaze. They have a backup client. It's just too easy. I
         | don't have to think about it.
         | k8sToGo wrote:
         | For Windows Image Backups I use macrium
       | sleepytimetea wrote:
       | Python source code? No cloud-native API integration? UI?
         | non-nil wrote:
         | There's Vorta: https://github.com/borgbase/vorta which I quite
         | like.
           | eointierney wrote:
           | Deffo recommend Vorta, good ui, very reliable
           | eternityforest wrote:
           | Vorta looks really awesome, maybe awesome enough that I might
           | switch from Back in Time.
       | SoftTalker wrote:
       | Maybe a bit off topic, but what is a good utility for "imaging" a
       | Linux system? I have a task to reprovision a system but we want
       | to keep a complete backup of the current system so that it's
       | possible to restore completely as if it were never touched.
       | This is more than just data backup as we would need to
       | recover disk partitions/LVM metadata, boot records, etc. as well
       | as all the data itself.
         | av8avenger wrote:
         | Take a look at Clonezilla. Used it many times for the exact
         | same purpose. You could either run it on a running system or
         | use the live ISO they provide.
         | https://clonezilla.org/
         | eointierney wrote:
         | Clonezilla is awesome, fast, stable, flexible, and reliable,
         | from the Taiwan Supercomputing Centre. I used it lots over a
         | decade ago to manage Mac, Windows, and Linux workstations and
         | servers.
         | akerl_ wrote:
         | Do you need to do this once, or 10 times, or 1000 times? How
         | big are the disks?
         | The most boring answer is "connect the disks to something else
         | and use `dd` to copy the full blocks from start to finish into
         | a file".
           | SoftTalker wrote:
           | For this specific need, just once. Disk is 1TB but only about
           | 350GB used.
           | 'dd' would have been my thought as well, I've heard of
           | Clonezilla also but never used it and not sure it's really
           | doing anything appreciably different.
           | I like the idea of 'dd' because I have a very clear mental
           | picture of what it does. Just wasn't sure there was something
           | else I might want to look at.
             | dividuum wrote:
             | If you need multiple versions of such a disk image, a tool
             | like restic (or I guess borg too, not sure?) can also
             | compress what's provided to it via stdin. So you'd pipe dd
             | directly into restic and it will delta-compress against
             | earlier backups.
               | akerl_ wrote:
               | Yea; this was the heart of my 1/10/1000 question. Once?
               | I'd probably just use dd and call it a day. 10 times?
               | Probably download clonezilla. 1000 times? Probably
               | automate something w/ restic and some kind of object
               | storage layer so I don't just have a directory full of
               | giant images/deltas somewhere.
         | vbezhenar wrote:
         | Dumbest approach is dd + compress.
         | Slightly smarter approach is dd, then zero unused sectors and
         | compress.
         | Both will produce an image which could be restored with dd (or
         | mounted offline). The second will be smaller.
         | They should be run with unmounted partitions.
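         | A rough sketch of the "dd + compress" approach in Python, as a
         | stand-in for piping dd through gzip. The function names are
         | hypothetical. It reads any path block by block; pointing it at
         | a real block device is an assumption that requires root and, as
         | noted above, unmounted partitions.

```python
import gzip
import shutil

BLOCK_SIZE = 1024 * 1024  # copy 1 MiB at a time, similar to dd's bs=1M


def make_image(source_path: str, image_path: str) -> None:
    """Copy every block of `source_path` into a gzip-compressed image."""
    with open(source_path, "rb") as src, gzip.open(image_path, "wb") as dst:
        shutil.copyfileobj(src, dst, BLOCK_SIZE)


def restore_image(image_path: str, target_path: str) -> None:
    """Decompress the image and write it back out block by block."""
    with gzip.open(image_path, "rb") as src, open(target_path, "wb") as dst:
        shutil.copyfileobj(src, dst, BLOCK_SIZE)
```

         | Zeroing unused sectors first makes the second approach smaller
         | because long runs of zero bytes compress to almost nothing,
         | while stale leftover data does not.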
       | andrewchambers wrote:
       | I am the author of bupstash -
       | https://github.com/andrewchambers/bupstash which has many
       | advantages over borg in my biased opinion (like air gapped
       | decryption keys and better performance). Feel free to check it
       | out.
       (page generated 2022-12-27 23:00 UTC)