commonalities and differences
===================================

Just to clarify: this post is about a Unix tool, and will not help
you find your soul-mate :P

There is a quite useful Unix tool that has been practically
forgotten. The tool I am talking about is `comm(1)`, and the reason
it remains mostly unused is that another tool called `diff(1)` covers
some of its use cases [1].

comm(1) compares two sorted files line by line. For instance, it can
be used to quickly check if your backup contains all the files you
wanted to put in there. Imagine you have created a backup of your
gopher folder in a tar.gz archive. Let's get the sorted list of files
in the backup:

  $ tar -ztf mybackup.tar.gz | sort > backup_list.txt

and then the list of files currently contained in your gopher dir:

  $ cd $HOME/gopher; find ./ | sort > gopherdir_list.txt

I did this with my backup on republic, and then ran comm(1):

  $ comm -3 backup_list.txt gopherdir_list.txt
          ./phlog
  ./phlog/
          ./phlog/.20190227_comm.txt.swp
          ./phlog/20190227_comm.txt
          ./phlog/phlogroll
  ./phlog/phlogroll/
          ./stuff
  ./stuff/
  $

What's happening here? The -3 option suppresses the third column,
which would list the lines common to both files. So comm(1) is
reporting in the first column the files unique to backup_list.txt,
and in the second column (separated by a TAB) the files unique to
gopherdir_list.txt. There is something off though: comm(1) is
treating "./phlog" and "./phlog/" as distinct entries. This is due to
the different way in which find(1) and tar(1) list directories: tar(1)
appends a trailing slash, find(1) does not. Easy to solve:

  $ cd $HOME/gopher; find ./ | sed 's:/$::g' | sort > gopherdir_list.txt
  $ tar -ztf mybackup.tar.gz | sed 's:/$::g' | sort > backup_list.txt

and then:

  $ comm -3 backup_list.txt gopherdir_list.txt
          ./phlog/.20190227_comm.txt.swp
          ./phlog/20190227_comm.txt
  $

This means that the second file (gopherdir_list.txt) contains two
files that are not present in the first file (backup_list.txt).
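
If you want to see the column logic without digging out a real backup,
here is a minimal sketch. The two input files (first.txt and
second.txt) and their contents are made up for illustration, and I am
assuming your system has mktemp(1) with the -d option. The numeric
options tell comm(1) which columns to hide: -1 hides the lines unique
to the first file, -2 those unique to the second, and -3 those common
to both.

```shell
# Use a fixed collation so that sort(1) and comm(1) agree on ordering.
LC_ALL=C
export LC_ALL

# Throwaway directory; first.txt and second.txt are made-up names.
tmp=$(mktemp -d)

# comm(1) expects both inputs to be sorted.
printf '%s\n' banana apple cherry | sort > "$tmp/first.txt"
printf '%s\n' date cherry banana  | sort > "$tmp/second.txt"

# -3: hide the common lines, keep what is unique to each file.
unique_each=$(comm -3 "$tmp/first.txt" "$tmp/second.txt")

# -23: hide columns 2 and 3, keep only the lines unique to first.txt.
only_first=$(comm -23 "$tmp/first.txt" "$tmp/second.txt")

# -13: hide columns 1 and 3, keep only the lines unique to second.txt.
only_second=$(comm -13 "$tmp/first.txt" "$tmp/second.txt")

# -12: hide columns 1 and 2, keep only the lines present in both files.
in_both=$(comm -12 "$tmp/first.txt" "$tmp/second.txt")

printf 'comm -3:\n%s\n\n'  "$unique_each"
printf 'comm -23:\n%s\n\n' "$only_first"
printf 'comm -13:\n%s\n\n' "$only_second"
printf 'comm -12:\n%s\n'   "$in_both"

rm -rf "$tmp"
```

With these inputs, "apple" should show up as unique to first.txt,
"date" as unique to second.txt (after a TAB when you use -3), and
"banana" and "cherry" as common to both.
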

Well, those two files could not be in my backup, since one of them is
this post, and the other one is the corresponding swap file created
by vim(1) while I edit it :P

comm(1) can also report the lines unique to either of the input
files, or those present in both. As always, man(1) is your best
friend.

-+-+-+-

comm(1) appeared in UNIXv4 (1973)
find(1) appeared in UNIXv1 (1971)
sort(1) appeared in UNIXv1 (1971)
diff(1) appeared in UNIXv5 (1974)
sed(1) appeared in UNIXv7 (1979)
tar(1) appeared in UNIXv7 (1979) and is not part of POSIX

-+-+-+-

[1] We will most probably talk about diff(1) in the future...