[HN Gopher] BorgBackup: Deduplicating archiver with compression ...
___________________________________________________________________
BorgBackup: Deduplicating archiver with compression and encryption
Author : phil294
Score : 111 points
Date : 2022-12-27 19:01 UTC (3 hours ago)
(HTM) web link (www.borgbackup.org)
(TXT) w3m dump (www.borgbackup.org)
| samuell wrote:
| Have had a look at both Borg and Restic, but even Restic, which is
| supposed to be faster iirc, was extremely slow on my computer.
|
| Been much happier with my tries of https://kopia.io, which
| also includes an optional cross-platform GUI, in addition to the
| CLI.
| beci wrote:
| I tried all these in my environment about 2 years ago, and
| kopia wins for me too. Is there any advantage of borg over
| kopia since then?
| aborsy wrote:
| Reliability. Borg has been around for a long time, and is far
| more mature.
|
| I wouldn't trust my backups to Kopia (unless for
| experimentation).
| nine_k wrote:
| Wow, Kopia looks pretty interesting. I first thought that it's
| a fork of Restic, but it appears to be independent. It has all
| the features that are key to me: encrypted, deduplicated, works
| on object storage, can mount the backup as a filesystem.
|
| On one hand, one may think that three programs with very similar
| approaches and features are a waste of resources. On the other
| hand, this is what refinement of the idea looks like: each
| project improves over previous attempts.
| vbezhenar wrote:
| Can someone suggest an approach to back up a container environment?
| E.g. running inside Kubernetes.
|
| As I see it: I write some kind of configuration.
|
| someproject-db is a deployment which runs a postgres db. Tool
| should connect to this DB, issue some kind of pg_backup command,
| capture output, retrieve some metadata about previous backup from
| S3, compute difference with previous run, compress that
| difference and store it to S3.
|
| anotherproject is a deployment which runs an sqlite db.
| Tool should do the same but with sqlite-specific commands.
|
| yetanotherproject-data is a pvc which has an attached pv. Tool
| should find the pod which mounted this volume, exec into that pod and
| retrieve pv data, again find the difference and store it to S3.
|
| Of course things should be configurable. Like store difference
| every 15 minutes, store complete backup every week and so on.
|
| I'm fine with manual recovery and with manual configuration (I
| just don't want to write and test all the scripts myself).
|
| What I don't want is some kind of magic tool which will back up
| the entire cluster, etcd and my grandparents automatically in
| some magic way only for $50k/cpu core.
| m3nu wrote:
| I recently wrote up my strategy for backing up local containers
| with Borg & Borgmatic here:
| https://docs.borgbase.com/setup/borg/containers/
|
| Borgmatic will beautifully deal with DB dumps and there is a
| popular container image to run it. As for the cache ("retrieve
| some metadata about previous backup from S3"), you don't need
| to keep it locally. It can be restored from the backup
| repository.
|
| Hope some of this applies to your K8s setup.
| aborsy wrote:
| Database and VM snapshots and backups can be tricky.
|
| My suggestion is using ZFS.
| mekster wrote:
| Doesn't zfs kind of solve all the backup problems alone?
| Technically, no other backup tool can beat it: being the
| filesystem itself, it knows more than any external tool can,
| like instantly knowing which files changed over time
| without scanning the entire tree.
|
| I use Borg as a backup of the backup (zfs snapshots), so I'll
| have multiple implementations of backups (both are also in
| different remote locations) just to be on the safe side.
|
| I don't use any of the fancier ones as I don't like risking
| data on less reliable tools.
| m3nu wrote:
| Remote ZFS replication kinda does. But the offsite backup
| wouldn't be encrypted and not everyone is using ZFS. So
| it's not for all situations.
| mekster wrote:
| You can send an encrypted zfs volume as encrypted.
|
| How does it matter if anyone else is using zfs? You
| either use a service that supports a zfs target or run your
| own Linux instance, which is just a matter of installing a
| single package on Ubuntu.
| m3nu wrote:
| What I meant was that there are people who don't run ZFS,
| but still need backups. So it won't work for everyone.
|
| Even for my own use cases, not every server and system I
| maintain could use ZFS right away.
|
| Still good to know about the encrypted volume feature.
| Will be sure to test this next year.
| dpedu wrote:
| I'm using Velero to do this in my toy kubernetes clusters. It
| uses Restic under the hood and can store things into S3. By
| default it will take a filesystem-level copy of whatever is on
| a pv. It looks like it supports hooks, e.g. to run pg_backup
| like you mentioned, but I haven't used them.
|
| https://github.com/vmware-tanzu/velero
| seymon wrote:
| Is there something to back up helm releases, including all
| k8s manifests, configmaps, secrets and also persistent volumes?
| Preferably FOSS?
| m3drano wrote:
| I recommend using borgmatic to ease the management of Borg
| backups.
| mtmail wrote:
| rsync.net has a special discount when you use borg and "you're an
| expert" https://www.rsync.net/products/borg.html
|
| We're looking to replace our self-written borg backup scripts
| with https://torsion.org/borgmatic/ which is a wrapper around
| borg.
| nine_k wrote:
| If you prefer a similar approach, but as a single compiled
| binary, there's Restic: https://github.com/restic/restic
|
| Update: yet _another_ take on basically the same approach, also
| as a self-contained binary: https://github.com/kopia/kopia
| kova12 wrote:
| I recall trying to use restic instead of borg a couple years
| ago, and some major feature was unavailable. I don't recall
| what it was, I think it was compression, which made archives
| quite large, and required larger instances for backup.
| [deleted]
| anotherevan wrote:
| It probably was compression. The good news is compression is
| now available with Restic!
| RockRobotRock wrote:
| Does anyone have a take on Kopia vs Restic?
| btschaegg wrote:
| A major reason I wouldn't want to use Kopia (I looked into
| it) is that it is opinionated with regards to how your
| system is set up (old-school unix FS layouts in a "pets, not
| cattle" way). It assumes the location of config files and
| does not allow you to change the backup path that's stored in
| a snapshot's metadata.
|
| That's bad if you want to use it
|
| 1) on NixOS (I don't want backup configs lying around in
| `~/.config`). As Indy famously said: "That belongs in a Nix
| expression!"
|
| 2) with ZFS snapshots (yes, I'm backing up
| `/path/to/dataset/.zfs/snapshot/<timestamp>/foo/bar`, but
| that should not be its path in the metadata!)
|
| OTOH, it seems to have the upside that you can apparently
| alter snapshots after the fact more easily (e.g. if you find
| out you shouldn't have backed up that gigantic VM image you
| just moved somewhere temporarily). I leave the decision on
| whether this is a footgun or not to you.
|
| And to be clear: The ZFS snapshot thing is a pain with
| Restic, too. You can hack around it somewhat better with
| something like systemd-nspawn, but it _really_ shouldn't be
| that hard.
| somishere wrote:
| Big thumbs up for Kopia and its very simple GUI / strategy.
| Have been using it for a couple of years now to remote backup
| hard drives and working folders on a bunch of family macs to
| B2. Restored twice now - logic board & corrupt hd. Chose it
| after trialing both Borg and Restic for ease of use and storage
| cost. My monthly backblaze bill still hovers around $1.40.
| mekster wrote:
| In my book, a backup tool doesn't count unless it has been used
| widely for a while without major issues repeatedly being
| reported.
|
| Kopia is too new for that state.
| Silhouette wrote:
| That's a reasonable point but the solution - as with all
| things backup - is diversification. Otherwise, if everyone
| followed your logic, no new backup software or storage
| service could ever become established no matter how good it
| actually was.
|
| Given that all of the options being discussed here look
| technically better, and in some cases more than an order of
| magnitude cheaper, than other popular backup services and
| software discussed on HN in the past, you could afford to run
| full redundant backups with multiple combinations of software
| and backing storage and still have more options for a much
| lower price than a few years ago.
| binaryanomaly wrote:
| restic is great and simple to use. Use it for archiving my
| backups to Google Cloud Storage.
| auxym wrote:
| restic also seems to have better Windows support.
|
| Borg can run in WSL but has seen limited testing
| there, per their own docs.
| wyatt_dolores wrote:
| I had to set up a quick backup to s3 storage to replace an aging
| rsnapshot setup. I looked at Borg, but Duplicity
| (https://duplicity.us/) was easier to configure and connect to
| S3.
|
| For syncing S3 storage across providers, I went with rclone
| (https://rclone.org/). Note that using rclone to sync across
| providers (e.g. from Amazon to Wasabi) does require the files to
| be downloaded to the client machine and then uploaded again. Not
| ideal, but if you have extra bandwidth it is a convenient setup.
| ThomasWaldmann wrote:
| I see quite some full backups in your near future. ;-)
|
| And that is one of the main reasons why chunk-deduplicating
| backup tools (like borg, restic, ...) are better than
| full/incremental style ones.
| bluedino wrote:
| Borg works great. I used it at a shop that had lots of servers,
| but didn't have any real backups, other than when someone would
| remember to go swap external USB drives and hope they actually
| ran.
|
| Set it up on a bunch of servers with a simple cron job, the
| initial backups went quickly, and the incrementals were really
| fast. Made great use of an old Dell server that wasn't doing
| anything else and had lots of slow disks in it.
| m3nu wrote:
| Borg is surprisingly fast and memory-efficient, even when
| compared to Restic, which is written in Go. Recently did a
| benchmark to test the upcoming Borg v2 and this surprised me
| the most:
|
| https://github.com/borgbase/benchmarks
| number6 wrote:
| I am always torn between the two: restic or Borg... how would
| I decide?
| water554 wrote:
| Use them both. I ended up at borg.
| ThePhysicist wrote:
| Have been using Borg for many years now, it saved me several
| times already when I accidentally deleted stuff I realized I
| still needed later on. What's great is that you can just mount
| your backup repository as a FUSE filesystem, Borg then gives you
| a directory structure containing all your backups over time.
| Personally I use dates to name my backups, e.g. 2022-10-11, so
| when I need to restore something from a specific date I just go
| to the appropriate folder and extract it.
| dang wrote:
| Related:
|
| _Deduplicating Archiver with Compression and Encryption_ -
| https://news.ycombinator.com/item?id=27939412 - July 2021 (71
| comments)
|
| _BorgBackup: Deduplicating Archiver_ -
| https://news.ycombinator.com/item?id=21642364 - Nov 2019 (103
| comments)
| nov21b wrote:
| Just started using the append-only feature to prevent a potential
| hacker from wiping out backups that live on a remote ssh server.
| Combined with restricted ssh access this can be made quite
| secure. I also tested writing backups to my Android phone (as a
| backup target) using Termux and Wireguard, worked flawlessly with
| a bit of tuning (keeping the vpn alive).
| pkulak wrote:
| Append-only modes are brilliant. Is there an easy way to hook
| into something like Glacier Deep Archive? That would be super
| cost effective.
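The chunk-level deduplication that the comments above credit for fast incrementals can be illustrated with a minimal Python sketch. This is not how Borg or Restic are implemented: it assumes fixed-size chunks and SHA-256 chunk IDs for simplicity (both tools use content-defined chunking), and `ChunkStore`, `CHUNK_SIZE`, `backup` and `restore` are hypothetical names chosen for illustration.

```python
import hashlib

# Tiny chunk size so the example is easy to follow (hypothetical value;
# real tools use much larger, variable-sized chunks).
CHUNK_SIZE = 4


class ChunkStore:
    """Toy repository: each unique chunk is stored once, keyed by its SHA-256."""

    def __init__(self):
        self.chunks = {}    # digest -> chunk bytes
        self.archives = {}  # archive name -> ordered list of digests

    def backup(self, name, data):
        """Record data as archive `name`; return how many new chunks were stored."""
        new_chunks = 0
        digests = []
        for offset in range(0, len(data), CHUNK_SIZE):
            chunk = data[offset:offset + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunks:
                # Only chunks the repository has never seen cost storage.
                self.chunks[digest] = chunk
                new_chunks += 1
            digests.append(digest)
        self.archives[name] = digests
        return new_chunks

    def restore(self, name):
        """Reassemble an archive from its list of chunk digests."""
        return b"".join(self.chunks[d] for d in self.archives[name])


store = ChunkStore()
first = store.backup("2022-10-11", b"aaaabbbbccccdddd")   # every chunk is new
second = store.backup("2022-10-12", b"aaaabbbbccccEEEE")  # only the last chunk changed
```

Every archive is a "full" backup in the sense that it can be restored on its own, yet the second run only stores the one chunk that changed. That is why frequent backups stay cheap with this style of tool.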
| trulyrandom wrote:
| I've been using Borg for years. It's great! The deduplication
| feature allows me to take a "full" backup of my workstation
| _hourly_. Taking frequent backups like this has already saved my
| bacon a number of times in cases where I accidentally
| mangled/deleted a file I didn't mean to touch.
|
| I recently stumbled upon the release notes for the (WIP) v2:
| https://www.borgbackup.org/releases/borg-2.0.html. Seems to
| address quite a few of the pain points of v1.
| mekster wrote:
| Too bad they couldn't target S3 endpoints or anything other
| than SSH for remote targets in this breaking change, or else it
| would've been the best of the bunch.
| notpushkin wrote:
| I'm wondering if there's a less painful way to use Borg with
| https://rclone.org/ than just maintaining a local Borg repo
| and then syncing that.
| m3nu wrote:
| Yepp. Version 2 got rid of lots of legacy code and cleaned up
| CLI args a bit. Will be around 5 to 20% faster than the v1.2
| branch.
|
| https://github.com/borgbase/benchmarks
| mustache_kimono wrote:
| Yeah, last time I tried, it was impressive, but kinda slow
| and limited to execution on a single core.
|
| Eager to take another look at Borg and Kopia, etc.
| beci wrote:
| Why is kopia missing from your benchmark?
| trulyrandom wrote:
| Did you mean to reply to
| https://news.ycombinator.com/item?id=34153119?
| m3nu wrote:
| Probably. My benchmark was mostly to compare Borg v1.2 and
| v2 and some network optimizations. Restic was a stretch
| goal really.
|
| For Kopia, I do try it once a year, but I still find the
| docs and CLI args confusing. Running the server part behind
| a reverse proxy needs 2x HTTPS and searching the forum to
| get it somewhat working. For a webdav target, the progress
| display doesn't really work and it's not possible to cancel
| a backup run. So for now I'm observing and will retry next
| year.
| aborsy wrote:
| Borg is very good.
| The V2 repository format will bring in a lot
| of improvements, particularly in cryptography.
|
| Does anyone know when 2.0 will be out of beta and stable?
| m3nu wrote:
| Likely next year after 1-2 RCs. It's at beta4 currently.
| haunter wrote:
| What's good for Windows 10 (NTFS) drives? I've been using the Veeam
| Agent free version [0] for years with no problems whatsoever, but
| I'm curious what other good options there are.
|
| [0] https://www.veeam.com/agent-for-windows-community-edition.ht...
| xupybd wrote:
| I use restic on Windows servers but for workstations I use
| backblaze. They have a backup client. It's just too easy. I
| don't have to think about it.
| k8sToGo wrote:
| For Windows Image Backups I use macrium
| sleepytimetea wrote:
| Python source code? No cloud-native API integration? UI?
| non-nil wrote:
| There's Vorta: https://github.com/borgbase/vorta which I quite
| like.
| eointierney wrote:
| Deffo recommend Vorta, good ui, very reliable
| eternityforest wrote:
| Vorta looks really awesome, maybe awesome enough that I might
| switch from Back in Time.
| SoftTalker wrote:
| Maybe a bit off topic, but what is a good utility for "imaging" a
| linux system? I have a task to reprovision a system but we want
| to keep a complete backup of the current system so that it's
| possible to restore completely as if it were never touched.
|
| This is more than just data backup as we would need to
| recover disk partitions/LVM metadata, boot records, etc. as well
| as all the data itself.
| av8avenger wrote:
| Take a look at Clonezilla. Used it many times for the exact
| same purpose. You could run it either on a running system or
| use the live iso they provide.
|
| https://clonezilla.org/
| eointierney wrote:
| Clonezilla is awesome, fast, stable, flexible, and reliable,
| from the Taiwan Supercomputing Centre. I used it lots over a
| decade ago to manage Mac, Windows, and Linux workstations and
| servers.
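For a one-off image like the one asked about above, the simplest approach is a raw block-for-block copy into a compressed file, which preserves partition tables and boot records along with the data. A minimal Python sketch of that idea follows; the function names and the `/dev/sdX` path in the usage note are hypothetical, the source should not be mounted while it is read, and reading a block device normally requires root.

```python
import gzip
import shutil


def image_to_gzip(source_path, image_path, block_size=1024 * 1024):
    """Copy every byte of source_path into a gzip-compressed image file."""
    with open(source_path, "rb") as src, gzip.open(image_path, "wb") as dst:
        # Stream in fixed-size blocks so a large disk never has to fit in memory.
        shutil.copyfileobj(src, dst, block_size)


def restore_from_gzip(image_path, target_path, block_size=1024 * 1024):
    """Write the decompressed image back out, byte for byte."""
    with gzip.open(image_path, "rb") as src, open(target_path, "wb") as dst:
        shutil.copyfileobj(src, dst, block_size)
```

Usage would look like `image_to_gzip("/dev/sdX", "system.img.gz")`, with `/dev/sdX` standing in for the real device. Unused sectors are copied too, so the image shrinks further if free space has been zeroed first.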
| akerl_ wrote:
| Do you need to do this once, or 10 times, or 1000 times? How
| big are the disks?
|
| The most boring answer is "connect the disks to something else
| and use `dd` to copy the full blocks from start to finish into
| a file".
| SoftTalker wrote:
| For this specific need, just once. Disk is 1TB but only about
| 350GB used.
|
| 'dd' would have been my thought as well, I've heard of
| Clonezilla also but never used it and not sure it's really
| doing anything appreciably different.
|
| I like the idea of 'dd' because I have a very clear mental
| picture of what it does. Just wasn't sure there was something
| else I might want to look at.
| dividuum wrote:
| If you need multiple versions of such a disk image, a tool
| like restic (or I guess borg too, not sure?) can also
| compress what's provided to it via stdin. So you'd dd
| directly into restic and it will delta compress against
| earlier backups.
| akerl_ wrote:
| Yea; this was the heart of my 1/10/1000 question. Once?
| I'd probably just use dd and call it a day. 10 times?
| Probably download clonezilla. 1000 times? Probably
| automate something w/ restic and some kind of object
| storage layer so I don't just have a directory full of
| giant images/deltas somewhere.
| vbezhenar wrote:
| Dumbest approach is dd + compress.
|
| Slightly smarter approach is dd, then zero unused sectors and
| compress.
|
| Both will produce an image which could be restored with dd (or
| mounted offline). Second will be smaller.
|
| They should be run with unmounted partitions.
| andrewchambers wrote:
| I am the author of bupstash -
| https://github.com/andrewchambers/bupstash which has many
| advantages over borg in my biased opinion (like air gapped
| decryption keys and better performance). Feel free to check it
| out.
___________________________________________________________________
(page generated 2022-12-27 23:00 UTC)