[HN Gopher] The various scripts I use to back up my home compute...
___________________________________________________________________
The various scripts I use to back up my home computers using SSH and rsync
Author : tosh
Score  : 138 points
Date   : 2022-12-09 15:27 UTC (7 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| smm11 wrote:
| I gave up on this at home long ago and just use OneDrive for everything. I don't even have "local" files. My stuff is there, and in the event my computer won't start up I lose what's open in the browser. I can handle that.
|
| At work I use Windows backup to write to empty SMB-mounted drives nightly, then write those daily to another drive on an offline Fedora box.
|
| My super critical files are on an encrypted SD card I sometimes put in my phone while its cellular connection is off, and this is periodically backed up to Glacier. The phone (Galaxy) runs DeX and can be my computer when needed to work with these files.
| pmontra wrote:
| His backup rotation algorithm is very close to what rsnapshot does.
|
| https://rsnapshot.org/
| NelsonMinar wrote:
| I use rsnapshot still! It feels very old-fashioned but it works reliably and is easy to understand.
| mekster wrote:
| It's good to keep multiple backups with different implementations, local and remote.
|
| Rsnapshot is hard to break because it builds on very basic primitives: plain files on a file system and hard links. If your file system isn't ZFS, I think it's a viable backup strategy for the local copy, while you use other tools to take remote backups.
| PopAlongKid wrote:
| >I don't use Windows at the moment and don't really mount network drives, either. That might be a good alternative to consider.
|
| Regarding Windows:
|
| I have successfully mirrored a notebook and a desktop[0] (single user) with Windows using _robocopy_, a utility that comes with Windows (it used to be part of the Resource Kit but I think it is now in the base product). When I say "mirror" I mean I can use either machine as my current workstation without any loss of data, as long as I run the "sync" script at each switch.
|
| I use "net use" to temporarily mount a few critical drives on the local network, then _robocopy_ does its work; it has maybe 85% of the functionality of rsync (which I also used extensively when administering corporate servers and workstations). Back in the DOS days I wrote my own very simple version of the same thing in C, but when _robocopy_ came along I was glad to stop maintaining my own effort.
|
| [0]or two desktops, using removable high-capacity media like Iomega Zip drives.
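For illustration, the "sync at each switch" pattern described above might look roughly like this as a batch script; the share name, drive letter, user and paths are hypothetical, not from the comment:

    :: run on the machine you are switching TO, pulling from the one last used
    net use Z: \\othermachine\Users /user:me
    robocopy Z:\me\Documents C:\Users\me\Documents /MIR /R:2 /W:5 /LOG:C:\Temp\sync.log
    net use Z: /delete

Note that /MIR mirrors deletions as well as changes, so the direction matters: always pull from the machine that was used last.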
| gary_0 wrote:
| I use MSYS2 on Windows in order to run regular rsync and other such utilities. It's served me very well for years. I also have some bash scripts that I can conveniently run on either Linux or Windows via MSYS2.
| EvanAnderson wrote:
| Robocopy is very nice but has no delta compression functionality. For things like file server migrations (where I want to preserve ACLs, times, etc.) robocopy is my go-to tool.
|
| I've used the cwRsync[0] binary distribution of rsync on Windows for backups. I found it worked very well for simple file backups. I never did get around to trying to combine it with Volume Shadow Copy to make consistent backups of the registry and applications like Microsoft SQL Server. (I wouldn't expect to get a bootable restore from such a backup, though.)
|
| [0] https://www.itefix.net/cwrsync
| rzzzt wrote:
| I used QtdSync, another frontend backed by a Windows rsync binary. A nice feature was that it supported the "duplicate entire target folder with hard links, then overwrite changes only" style on NTFS volumes, so I could have lots of browseable point-in-time backup folders without consuming extra disk space: https://www.qtdtools.de/page.php?tool=0&sub=1&lang=en
| wereallterrrist wrote:
| I find it very, very hard to go wrong with Syncthing (for stuff I truly need replicated: code/photos/text-records) and ZFS + znapzend + rsync.net (automatic snapshots of `/home` and `/var/lib` on servers).
|
| The only thing missing: I'd like to stop syncing code with Syncthing and instead build some smarter daemon. The daemon would take a manifest of repositories, each with a mapping of worktrees->branches to be actualized and fsmonitored. The daemon would auto-commit changes on those worktrees into a shadow branch and push/pull it. Ideally this could leverage (the very amazing, you must try it) `jj` for continuous committing of the working copy and (in the future, with the native jj format) even handle the likely-never-to-happen conflict scenario. (I'd happily collaborate on a Rust impl and/or donate funds to one.)
|
| Given the number of worktrees I have of some huge repos (nixpkgs, linux, etc.) it would likely mark a significant reduction in CPU/disk usage given what Syncthing is having to do now to monitor/rescan as much as I'm asking it to (given it has to dumb-sync .git, syncs gitignored content, etc., etc.).
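The shadow-branch idea can be approximated today with a small script run from cron or an fsmonitor hook; a rough sketch, where the repository path and remote are hypothetical, and `git stash create` only captures changes to tracked files:

    #!/bin/sh
    # Snapshot a dirty worktree without touching the checked-out branch.
    cd ~/src/nixpkgs || exit 1
    snap=$(git stash create "autosave $(date -u +%FT%TZ)")
    # Empty output means the tree is clean; otherwise push the snapshot
    # commit to a per-host shadow branch (force: snapshots do not chain).
    [ -n "$snap" ] && git push -qf origin "$snap:refs/heads/shadow/$(hostname)"

As the comment suggests, `jj` would make this cleaner, since it already treats the working copy as a commit.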
| hk1337 wrote:
| I use Syncthing between Mac, Windows (I have included Linux in the mix at one point), and with my Synology NAS. Syncthing is more for my short-term backup though. I will either commit it to a repo, save it to a Synology share, or delete it.
|
| *edit* my Gitea server saves its backups to Synology
| than3 wrote:
| I hate to be the one to point out the obvious, but replication isn't a backup. It's for resiliency, just like RAID; the two aren't the same.
| whalesalad wrote:
| What is the actual difference between a backup and replication? If the 1's and 0's are replicated to a different host, is that any different than "backing up" (replicating them) to a piece of external media?
| jjav wrote:
| > What is the actual difference between a backup and replication?
|
| The simplest way to think about it is that a backup must be an immutable snapshot in time. Any changes and deletions which happen after that point in time will never reflect back onto the backup.
|
| That way, any files you accidentally delete or corrupt (or other unwanted changes, like ransomware encrypting them for you) can be recovered by going back to the backup.
|
| Replication is very different: you intentionally want all ongoing changes to replicate to the multiple copies for availability. But it means that unwanted changes or data corruption happily replicate to all the copies, so now all of them are corrupt. That's when you reach for the most recent backup.
|
| That's why you always need to back up, and you'll usually want to replicate as well.
| chrishas35 wrote:
| When those 1s and 0s are deleted and that delete is replicated (or some other catastrophic change happens, such as ransomware), you presumably don't have the ability to restore if all you're doing is replication. A strategy that layers replication + backup/versioning is the goal.
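The distinction can be made concrete in two lines of shell; a rough sketch with hypothetical paths and date. The first command is replication (deletions and corruption propagate); the second freezes a point-in-time copy:

    rsync -a --delete ~/data/ /mnt/backup/current/
    cp -al /mnt/backup/current /mnt/backup/snap-2022-12-09

Because `cp -al` copies hard links rather than data, each frozen copy is nearly free, and rsync's default replace-by-rename behavior means later runs break the links rather than rewriting the old snapshots. This is essentially what rsnapshot, mentioned near the top of the thread, automates.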
| natebc wrote:
| I'll add that _usually_ a backup strategy includes generational backups of some kind, that is: daily, weekly, monthly, etc., to hedge against individually impacted files as mentioned.
|
| Ideally there is also an offsite component, inaccessible from the source, in this strategy. Usually this level of robustness isn't present in a "replication" setup.
| than3 wrote:
| Put more simply, backups account for and mitigate the common risks to data during storage while minimizing costs; ransomware is one of those common risks. It's organization-dependent, based on costs and available budget, so it varies.
|
| Long-term storage usually has some form of Forward Error Correction (FEC) protection scheme (for bitrot), and often backups are segmented, which may be a mix of full and incremental, or delta, backups (to mitigate cost) with corresponding offline components (for ransomware resiliency); but that too is very dependent on the environment, as well as the strategy being used for data minimization.
|
| > Usually this level of robustness isn't present in a "replication" setup.
|
| Exactly, and thinking of replication as a backup often also gives those using it a false sense of security in any BC/DR situation.
| NelsonMinar wrote:
| Syncthing has file versioning but I don't know for sure if it's suitable for backup. https://docs.syncthing.net/users/versioning.html
| reacharavindh wrote:
| Replication to another machine that has a COW file system with snapshots is a backup though :-)
|
| We back up the data storage of an entire HPC cluster, about 2 PiB of it, to a single machine with 4 disk shelves running ZFS with snapshots. It works very well. A simple rsync every night, and snapshotted.
|
| We use the backup as a sort of Time Machine should we need data from the past that we deleted in the primary. Plus, we don't need to wait for tapes to load or anything; it is pretty fast and intuitive.
| jerf wrote:
| The person you're replying to said "Syncthing ... and ZFS + znapzend + rsync.net" though. You're ignoring the rsync.net part.
|
| I have something similar; it's Nextcloud + restic to AWS S3, but it's the same principle. You can give people the convenience and human-comprehensibility of sync-based sharing, but also back that up too, for the best of both worlds. Though in my case the odds of me needing "previous versions" of things approach zero and a full sync is fairly close to a backup, but even so, I do have a full solution here.
| jrm4 wrote:
| But it makes things easy. I have, e.g., a home computer, a server-in-the-closet thing, a laptop and a work computer, all with a shared Syncthing folder.
|
| So to bolster that other thing, I just have a simple bash script that reminds me every 7 days to make a copy of that folder somewhere else on that machine. It's not precise, because I often don't know which machine I will be using, but that creates a natural staggering that I figure should be sufficient if something goes weird and I lose something; I'm likely to have an old copy somewhere.
| killingtime74 wrote:
| For code I just use a self-hosted git server.
| acranox wrote:
| SparkleShare does something kind of similar. It uses git as the backend to automatically sync directories on a few computers. https://www.sparkleshare.org/
| JeremyNT wrote:
| > Given the number of worktrees I have of some huge repos (nixpkgs, linux, etc) it would likely mark a significant reduction in CPU/disk usage given what Syncthing is having to do now to monitor/rescan as much as I'm asking it to (given it has to dumb-sync .git, syncs gitignored content, etc, etc).
|
| Are you really hitting that much of a resource utilization issue with Syncthing though? I use it on lots of small files and git repos, and since it uses inotify there's not really much of a problem. I guess the worst case is switching between very different branches frequently, or committing very large (binary?) files, where it may need to transfer them twice, but this hasn't been a problem in my own experience.
|
| I'm not sure you could really do a whole lot better than Syncthing by being clever, and it strikes me as a lot of effort to optimize for a specific workflow.
|
| Edit: actually, I wonder if you could just exclude the working copies with a clever exclude list in Syncthing, such that you'd ONLY grab .git, so you wouldn't even need the double transfer/storage. You risk losing uncommitted work I suppose.
| fncivivue7 wrote:
| Sounds like you want Borg:
|
| https://borgbackup.readthedocs.io/en/stable/
|
| My two 80%-full 1tb laptops and 1tb desktop back up to around 300-400G after dedupe and compression. Currently I have around 12tb of backups stored in that 300G.
|
| Incremental backups run in about 5 mins, even against the spinning disks they're stored on.
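A minimal Borg session along these lines might look like the following; the repository path and retention values are hypothetical:

    borg init --encryption=repokey /mnt/backup/borg-repo
    borg create --stats --compression zstd \
        /mnt/backup/borg-repo::'{hostname}-{now}' ~/Documents ~/Projects
    borg prune --keep-daily 7 --keep-weekly 4 --keep-monthly 6 /mnt/backup/borg-repo

Deduplication works across all archives in the repository, which is what makes "12tb of backups stored in that 300G" possible.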
| _dain_ wrote:
| They work together. I use Syncthing to keep things synchronized across devices, including to an always-on "master" device that has more storage. Then Borg runs on the master device to create backups.
| 0cf8612b2e1e wrote:
| Python programmer here, but I actually prefer Restic [0]. While it is more or less the same experience, the huge selling point to me is that the backup program is a single executable that can be easily stored alongside the backups. I do not want any dependency/environment issues to assert themselves when restoration is required (which is most likely on a virgin, unconfigured system).
|
| [0] https://restic.net/
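A sketch of the same flow in restic, with a hypothetical repository path (the passphrase comes from a prompt or $RESTIC_PASSWORD):

    restic -r /mnt/backup/restic-repo init
    restic -r /mnt/backup/restic-repo backup ~/Documents ~/Projects
    restic -r /mnt/backup/restic-repo snapshots
    restic -r /mnt/backup/restic-repo restore latest --target /tmp/restore

Since the binary is a single static executable, keeping a copy of it in the repository directory makes the "virgin, unconfigured system" restore scenario a two-step job.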
| SomeoneOnTheWeb wrote:
| You can also take a look at Kopia (https://kopia.io/).
|
| I've been using Borg, Restic and Kopia for a long time, and Kopia is my personal favorite: very fast, very efficient, runs in the background automatically without having to schedule a cron job or anything like that.
|
| The only downside is that the backups are made of a HUGE number of files, so when synchronizing it can sometimes take a bit of time to check the ~5k files.
| klodolph wrote:
| I've been using Kopia, I recommend it.
| wanderingmind wrote:
| Highly recommend Kopia, which has a nice UI and can work with rclone (so any cloud backend).
| codethief wrote:
| I don't think GP was talking about backups (which is what Borg is good for) but about _synchronization_ between machines, which is another issue entirely.
| wereallterrrist wrote:
| No, I distinctly don't want Borg. It doesn't help with or solve anything that Syncthing doesn't. The obsession with borg and bup is pretty baffling to me. We deserve better in this space. (see: Asuran and another whose name I forget...)
|
| Critically, I'm specifically referring to code sync that needs to operate at the git level to get the huge efficiencies I'm thinking of.
|
| Syncthing, or borg, scanning 8 copies of the Linux kernel is pretty horrific compared to something doing a "git commit && git push" and "git pull --rebase" in the background (over-simplifying the shadow-branch process here for brevity).
|
| re: 'we deserve better' -- case in point, see Asuran: there's no real reason that sync and backup have to be distinctly different tools. Given chunking and dedupe and append-logs, we really, really deserve better in this tooling space.
| formerly_proven wrote:
| borg et al. and "git commit" work in essentially the same way: both scan the entire tree for changes using modification timestamps.
| dragonwriter wrote:
| > borg et al and "git commit" work in essentially the same way. Both scan the entire tree for changes using modification timestamps.
|
| But git commit _doesn't_ do that. If you want to do that in git, you typically do it before commit with "git add -A".
| [deleted]
| ww520 wrote:
| Yes. I just let Syncthing sync among devices, using it to create copies of the backup. The daily backup scripts do their thing and create one backup snapshot, then Syncthing picks up the new backup files and propagates them to multiple devices.
| blindriver wrote:
| I use a Synology to back everything up, and then from there I use Hyper Backup to back up to 2 external hard drives every week. When the hard drives get full, I buy a new one that is larger, and I put the old one into my closet and date it.
|
| Now that you've reminded me, it might be best to buy a new larger hard drive if there are any pre-Christmas sales.
| kevstev wrote:
| Have you looked into backing up into the cloud? I used to do this way back in the day, but by using AWS I get legit offsite storage. It's really cheap if you use Glacier, and I was actually looking this week: there is now an even cheaper option called Deep Archive. It costs me about $2 a month to store my stuff there. I just back up the irreplaceable things: my photos, documents, etc. All the other stuff is backed up on TPB or GitHub for me.
| blindriver wrote:
| I don't trust backing up to the cloud; I just do everything on site and hope there's nothing catastrophic!
| kkfx wrote:
| Oh, curious, it's the first backup tool in Clojure I've seen :-)
|
| My personal recipe is less sophisticated:
|
| - znapzend on all home machines sends to a home server regularly (with enough storage), partially replicated between desktops/laptop
|
| - the home server backs itself up via simple incremental zfs send + mbuffer, with one snapshot per day (last 2 days), one per week (last 2 weeks) and one per month (last 1 month), offsite
|
| - manually triggered offline local backups of the home server to external USB drives and a physically mirrored home server, normally on a weekly basis
|
| Nothing more, nothing less. On any major NixOS release update I rebuild one home server and, a month or so later, the second one. Desktop and home-server custom ISOs are built automatically every Sunday and just left there (I know, it simply took too much time checking, so...).
|
| Essentially, in case of a fault of a machine I still have data, config and a ready ISO for a quick reinstall. In case of logical faults (like a direct attack that compromises my data AND ZFS itself) there is not much protection besides different sync times (I do NOT use all desktops/laptops at once; when they are powered off they remain behind, and I normally have plenty of time to see most casual potential attacks).
|
| Long story short, for anyone: when you talk about backups, talk about how you restore, or your backups will probably be just useless bits one day...
| e1g wrote:
| Recent versions of rsync support zstd compression, which can improve speed and reduce the load on both sides. You can check whether your rsync supports it with "rsync -h | grep zstd" and instruct it to use it with "-z --zc=zstd".
|
| However, compression is useful in proportion to how crappy the network is and how compressible the content is (e.g., text files). This repo is about backing up user files to an external SSD with high bandwidth and low latency, and applying compression likely makes the process slower.
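Combined, a zstd-compressed push over the network might look like this; the host and paths are hypothetical:

    rsync -h | grep zstd                              # check that this rsync can do zstd
    rsync -az --zc=zstd ~/data/ backuphost:backups/data/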
| greggyb wrote:
| Compression is useful even with directly attached storage devices. Disk IO is still slower than compression throughput unless you are running very fast storage.
|
| If your workload is IO-bound, then it is quite likely that compression will help. Most people, on their personal machines, would likely see IO performance "improve" with filesystem-level compression.
| arichard123 wrote:
| I'm doing something similar, but running a ZFS pool off a USB dock and using ZFS snapshots instead of hardlinks. USB is slow, but it's still faster than my network, so it's not the bottleneck.
| alchemist1e9 wrote:
| Let's not forget zbackup. An excellent, useful low-level tool.
| proactivesvcs wrote:
| If one uses software meant for backups, like restic, there are so many advantages: independent snapshots, deduplication, compression, encryption, proper methods to verify the backup's integrity, and forgetting snapshots according to a more structured policy. Mount and read any backup by host or snapshot, multi-platform, a single binary, and one can even run its rest-server on the destination to allow for append-only backups. The importance of using the right tool for the job, for something as crucial as backups, cannot be overstated.
| aborsy wrote:
| ZFS send/receive is perfect, except there is almost no ZFS cloud storage on the receive side. You have to set up a ZFS server offsite somewhere, like in a friend's house.
|
| Restic is darn good too! It has integration with many cloud storage providers.
| neilv wrote:
| You can combine this with _restricted_ SSH and server-side software, so that the client being backed up can only add new incremental backups to the server, not delete old ones.
|
| (So, less data loss in the event of a malicious intruder on the client, or some very broken code on the client that gets ahold of the SSH private key.)
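One possible way to get that restriction is the rrsync helper that ships with rsync: on the server, force the client's key into a single write-only directory via ~/.ssh/authorized_keys. A sketch, where the path and key are hypothetical and the -wo (write-only) flag assumes a reasonably recent rsync release:

    command="rrsync -wo /backups/laptop",restrict ssh-ed25519 AAAA...key... backup@laptop

By itself this stops the client from browsing or reading the server, but a compromised client could still overwrite files it already wrote, so it pairs best with immutable server-side snapshots (as with ZFS, discussed elsewhere in the thread) or an append-only mode such as restic's rest-server.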
| pjdesno wrote:
| I've got a solution that I've used to back up machines for my group, but I never did the last 10% to make it something plug-and-play for other folks: https://github.com/pjd-nu/s3-backup
|
| Full and incremental backups of a directory tree go to S3 objects, one per backup, with access to existing backups via FUSE mount. With a bit more scripting (mostly automount) and maybe shifting some cached data from RAM to the local file system, it should be fairly comparable to Apple Time Machine: not designed to restore your disk so much as to let you access its contents at different points in time.
|
| If you're interested in it, feel free to drop me a note; my email is in my GitHub profile, I think.
| LelouBil wrote:
| Speaking of backups, I recently set up a backup process for my home server, including a recovery plan, and that makes me sleep better at night!
|
| I have Duplicati [0], which does a backup of the data of my many self-hosted applications every day, encrypted and stored in a folder on the server itself.
|
| Only the password manager backup is not encrypted by Duplicati, because it's already encrypted using my master password, and it stores all the encryption keys of the other backups.
|
| Then I have a systemd service that runs rclone [1] every day after the backups finish, to sync the backup folder to:
|
| - Backblaze B2
|
| - AWS S3 Glacier Deep Archive
|
| For now I only use the free tier of B2, as I have less than a GB to back up, but that's because I haven't installed Nextcloud yet!
|
| However, I still like using S3 because I am paying for it (even though Deep Archive is very cheap), and I'm pretty sure that if something happens with my account, the fact that I'm a paying customer will prevent AWS from unilaterally removing my data (I have seen posts about Google accounts being closed without any recourse; I hope I'm protected from that with AWS).
|
| Right now I only have CalDAV/CardDAV, my password manager and my configs being backed up, but I plan to use Syncthing to also back up other devices to the home server, to fit inside what I already configured.
|
| If anyone has advice on what I did / did not do / could have done better, please tell me!
|
| [0] https://www.duplicati.com/
|
| [1] https://rclone.org/
| UI_at_80x24 wrote:
| ZFS snapshots + send/receive are an absolute game changer in this regard.
|
| I have my /home in a separate dataset that gets snapshotted every 30 minutes. The snapshots are sent to my primary file server and can be picked up by any system on my network. I do a variation of this with my dotfiles, similar to Stow but with quicker snapshots.
| customizable wrote:
| ZFS is a game changer for quickly and reliably backing up large multi-terabyte PostgreSQL databases as well. In case anyone is interested, here is our experience with PostgreSQL on ZFS, complete with a short backup script: https://lackofimagination.org/2022/04/our-experience-with-po...
| GekkePrutser wrote:
| ZFS send/receive is nice, but it does lack the toolchain to easily extract individual files from a backup. It's more of a disaster recovery thing in terms of backup.
| customizable wrote:
| You can actually extract individual files from a snapshot by using the hidden .zfs directory, like: /mnt-point/.zfs/snapshot/snapshot-name
|
| Another alternative is to create a clone from a snapshot, which also makes the data writable.
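For example, with a hypothetical dataset mounted at /tank/home and a snapshot named autosnap-2022-12-08:

    ls /tank/home/.zfs/snapshot/
    cp /tank/home/.zfs/snapshot/autosnap-2022-12-08/me/notes.txt ~/notes.txt
    # or the writable route: clone the snapshot into a new dataset
    zfs clone tank/home@autosnap-2022-12-08 tank/restore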
| pmarreck wrote:
| Came here to say this. Can you list your example commands for snapshotting, zfs send, restoring single files or entire snapshots, etc.? (Have you tested it out?) I am actually in the position of doing this (I use ZFS on root as of recently, and I have a TrueNAS) but am stuck at the bootstrapping problem (I haven't taken a single snapshot yet; presumably the first one is the only big one? And then how do I send incremental snapshots? And then how do I restore these to, say, a new machine? Do I remotely mount a snapshot somehow, or zfs recv, or...? Do you set up systemd/cron jobs for this?) Also, having auto-snapshotted on Ubuntu in the past, eventually things slowed to a crawl every time I did an apt update... Is this avoidable?
| customizable wrote:
| Yes, the first snapshot is the big one; the rest are incremental. Restoring a snapshot is just one line really. Something like ;)
|
|     sudo zfs send -cRi db/data@2022-12-08T00-00 db/data@2022-12-09T00-00 | ssh me@backup-server "sudo zfs receive -vF db/data"
| Whatarethese wrote:
| I use rsync to back up my iCloud Photos from my local server to a NAS at my parents' house. Works great.
| armoredkitten wrote:
| For anyone using btrfs on their system, I heartily recommend btrbk, which has served me very well for making incremental backups with a customizable retention period: https://github.com/digint/btrbk
| nickersonm wrote:
| I highly recommend this as well, although I just use it for managing snapshots on my NAS.
|
| For backup I use hourly & daily Kopia backups that are then rcloned to an external drive and Backblaze.
| dawnerd wrote:
| I've been using borg + rsync to a Google Drive and S3. Works great. Used it a few weeks ago for recovery and it went smoothly.
| yehia2amer wrote:
| Has anyone tried https://kopia.io/docs/features/ ?
|
| It is awesome! It's very fast (I usually struggle with backup tools on Windows clients) and it ticks all my needs: deduplication, end-to-end encryption, incremental snapshots with error correction, mounting snapshots as a drive and using them normally or to restore specific files/folders, caching. The only thing that could be better is the GUI, but it works.
| mekster wrote:
| Backup tools are nothing until they can prove their reliability, which can only be proven with many years of usage.
|
| In that regard, I don't trust anything but Borg and ZFS.
| yehia2amer wrote:
| ZFS is not an option with Windows clients, and even with most Linux clients. Also, finding this set of features together is really scarce; not sure why! I am using ZFS on my server though!
| falcolas wrote:
| So, a quick trick with rsync that means you don't have to copy everything and then hardlink:
|
|     --link-dest=DIR    hardlink to files in DIR when unchanged
|
| Basically, you list your previous backup dir as the link-dest directory, and if a file hasn't changed, it will be hardlinked from the previous directory into the current directory. Pretty nice for creating Time Machine-style backups with one command and no SSH.
|
| Also works a treat with incremental logical backups of databases.
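Spelled out, one day's run might look like this (dates and paths hypothetical):

    rsync -a --delete --link-dest=/backups/2022-12-08 ~/data/ /backups/2022-12-09/

Each dated directory then looks like a full copy, while unchanged files are hard links into the previous one: the same layout as the 'cp -al' approach discussed below, built in a single command.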
| amelius wrote:
| This is good to know; I used an extra "cp -rl" step in my previous scripts.
| falcolas wrote:
| One thing of note: the file is not transferred, so backups happen faster and consume less bandwidth (important if your target is not network-local to you).
| rsync wrote:
| Yes, they accomplish the same thing.
|
| --link-dest is just an elegant, built-in way to create "hardlink snapshots" the same way that 'cp -al' always did.
|
| But note: a changed file, with even the smallest of changes, breaks the link and causes you to consume (size of file) more space, cascading through your snapshots. Depending on your file sizes and change frequency this can get rather expensive.
|
| We now recommend abandoning hardlink snapshots altogether and doing a "dumb mirror" rsync to your rsync.net account, with no retention or versioning, and letting the ZFS snapshots create your retention.
|
| As opposed to hardlink snapshots, ZFS snapshots diff on a block level, not a file level, so you can change some blocks of a file and not use (that entire file) more space. It can be much more efficient, depending on file sizes.
|
| The other big benefit is that ZFS snapshots are immutable/read-only, so if your backup source is compromised, Mallory can't wipe out all of the offsite backups too.
| falcolas wrote:
| It also reduces the amount of data transferred, making the backup faster.
|
| > We now recommend
|
| Who's we?
| jwiz wrote:
| The poster to whom you replied is affiliated with rsync.net, a popular backup service.
| ndsipa_pomu wrote:
| I'm using BackupPC https://backuppc.github.io/backuppc/ to do these kinds of backups. It does all the deduplication, so the total storage is smaller than you'd expect for multiple machines with lots of identical files.
| ysopex wrote:
| https://github.com/bcpierce00/unison
|
| I use this to keep a few machines synced up, including a machine that does proper daily backups.
| litoE wrote:
| All my backups go, via rsync, to a dedicated backup server running Linux with a large hard disk. But I still lose sleep: what if someone hacks into my home network and encrypts the file systems, including the backup server's? Other than taking the backup server offline, I don't see how I can protect myself from a full-blown intrusion. Any ideas?
| jerezzprime wrote:
| What about more copies? Have a copy or two in cloud storage, across providers. This protects against other failure modes too, like a house fire or theft.
| greggyb wrote:
| ZFS snapshots are immutable, rendering them quite resilient to encryption attacks. This may alleviate some of your concern.
| saltcured wrote:
| There's no perfect answer, since different approaches to this will introduce more complexity and inconvenience at the same time they block some of these threats. You need to consider which kinds of loss/disaster you are trying to mitigate. An overly complex solution introduces new kinds of failure you didn't have before.
|
| As others mention, backup needs more than replication. You recover from a ransomware attack or other data-destruction event by using point-in-time recovery to restore good data that was backed up prior to the event. You need a sufficient retention period for older backups, depending on how long it might take you to recognize a data-loss event and perform recovery. A mere replica is useless, since it does not retain those older copies. With retention, your worry is how to prevent the compromised machines from damaging the older time points in the backup archive.
|
| The traditional method was offline tape backups, so the earlier time points are physically secure. They can only be destroyed if someone goes to the storage and tampers with the tapes. There is no way for the compromised system to automatically access earlier backups. You cannot automate this, because that likely makes it an online archive again. A similar technique in a personal setting might be backing up to removable flash drives and physically rotating these to keep some drives offline. But the inconvenience means you lose protection if you forget to perform the periodic physical rituals.
|
| With the sort of rsync-over-ssh mechanism you are describing, one way to reduce the risk a little bit is to make a highly trusted and secured server and _pull_ backups from specific machines instead of _pushing_. This is under the assumption that your desktops and whatnot are more likely to be hacked and subverted. Have a keypair on the server that is authorized to connect and pull data from the more vulnerable machines. The various machines do not get a key authorized to connect to the server and manipulate storage. However, this depends on a belief that the rsync+ssh protocol is secure against a compromised peer. I'm not sure if this is really true over the long term.
|
| A modern approach is to use an object store like S3 with careful setup of data retention policies and/or access policies. If you can trust the operating model, you can give an automated backup tool permission to write new snapshots without being allowed to delete or modify older snapshots. The restic tool mentioned elsewhere has been designed with this in mind. It effectively builds a content-addressable store of file content (for deduplication) and snapshots as a description of how to compose the contents into a full backup. Building a new snapshot means adding new content objects and snapshot objects to the archive. This process does not need permission to delete or replace existing objects in the archive. Other management tools would need higher privilege to do cleanup maintenance of the archive, e.g. to delete older snapshots or to garbage-collect when some of the archived content is no longer used by any of the snapshots.
|
| The new risk with approaches like restic on S3, or some ZFS snapshot archive with deduplicative storage, is that the tooling itself could fail and prevent you from reconstructing your snapshot during recovery. It is significantly more complex than a traditional file system or tape archive. But it provides a much more convenient abstraction if you can trust it. A very risk-averse and resource-rich operator might use redundant backup methods with different architectures, so that there is a backup for when their backup system fails!
| europeanguy wrote:
| This looks like work. Just get Syncthing and stop complicating your life.
| greensoap wrote:
| I recommend BackupPC for these requirements. Pooling, no software install required on clients, uses rsync, dedupes across clients, and uses client-side hashing to avoid sending files already in the pool.
|
| https://backuppc.github.io/backuppc/index.html
| ranting-moth wrote:
| I used to do similar things until I met Borg Backup. I highly recommend it.
| photochemsyn wrote:
| I've been using command-line git for data backup to an RPi over SSH; once it's set up it's pretty easy to stay on top of, and then every once in a while I rsync both the local storage and the RPi to separate USB drives. Also, every 3-6 months or so, I rsync everything to a new USB drive and set it aside, so that something like a system-wide ransomware attack doesn't corrupt all the backups.
___________________________________________________________________
(page generated 2022-12-09 23:00 UTC)