Evan Prodromou at 2013-08-31T21:14:48Z

My files

So, I have 4 laptops that I've used at various times over the last 5 years. I want to sell off most of them, but first I want to make sure I don't lose any files that exist on only one of them.

I've got backups of each, but I don't want to keep around hundreds of gigs of backup for a computer I haven't used in years, just so 5 unique files are still around. I only back up volatile files in /var/, /etc/, /usr/local/, /opt/ and of course /home/, but it's still a lot of data.

So I'd like to identify files that are unique to each computer. Here's how I'm doing it:
  1. For each host, I create a file of SHA2 sums. I'm trying to move away from MD5 sums, although they're probably fine for this application. I use GNU Parallel to keep things going quickly; I think for compute-intensive jobs like crypto sums this makes sense.

    find "$BACKUPDIR" -type f -print0 | parallel -q0 --gnu sha224sum > ~/tmp/${HOSTNAME}_sha224sums.txt

  2. For each host, I make a sorted, uniq'd file of just the sums:

    cut -f1 -d" " ~/tmp/${HOSTNAME}_sha224sums.txt | sort | uniq > ~/tmp/${HOSTNAME}_justsums.txt

  3. My most recent computer is the one I want to keep. So I use "comm" to find checksums that appear on one of the other computers but not on the keeper:

    comm -13 ~/tmp/${KEEPER}_justsums.txt ~/tmp/${HOSTNAME}_justsums.txt > ~/tmp/${HOSTNAME}_uniquesums.txt

    Technically these aren't actually unique; they're just not on the keeper computer.

  4. Finally, I convert the unique sums into filenames by referencing the original sums file:

    for cs in $(<~/tmp/${HOSTNAME}_uniquesums.txt); do grep -m1 "^$cs" ~/tmp/${HOSTNAME}_sha224sums.txt | cut -f3- -d" " >> ~/tmp/${HOSTNAME}_uniquefilenames.txt; done

  5. From there, I sort the unique filenames and then manually (!!) decide what to copy to the "keeper" computer. The signal-to-noise ratio is too low for me to do much automation, except for when I have a directory that I can rsync over in total.

    There are just a ton of files that are too useless to copy - temp files, dot files, etc.

I'd like to get to the point where I keep most of my stuff on sync'd storage using SparkleShare or git-annex, and most other stuff on hosted git servers like gitorious.org or github.com.
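
For anyone who wants to try the checksum comparison in steps 2 through 4 without the intermediate files, here is a minimal sketch that does it in one pass. It is not the exact pipeline above: it swaps "comm" and the grep loop for a single awk script, and the function name unique_files is just a made-up helper. It assumes both inputs are plain sha224sum output (digest, two separator characters, filename) and, like grep -m1, it reports only the first filename for each checksum.

```shell
#!/bin/sh
# unique_files KEEPER_SUMS OTHER_SUMS   (hypothetical helper, not a standard tool)
# Prints the names of files listed in OTHER_SUMS whose checksum does not
# appear anywhere in KEEPER_SUMS. Both files are assumed to be in
# sha224sum output format: "<hex digest>  <filename>".
unique_files() {
    awk '
        # Lines from the first file (the keeper): remember every checksum.
        FILENAME == ARGV[1] { seen[$1] = 1; next }
        # Lines from the second file: report each unknown checksum once.
        !($1 in seen) {
            seen[$1] = 1
            # Drop the digest plus the two-character separator, so the
            # whole filename survives even when it contains spaces.
            print substr($0, length($1) + 3)
        }
    ' "$1" "$2"
}
```

It could then be used along these lines, with the same file naming as in step 1:

    unique_files ~/tmp/${KEEPER}_sha224sums.txt ~/tmp/${HOSTNAME}_sha224sums.txt | sort > ~/tmp/${HOSTNAME}_uniquefilenames.txt

Working from the full sums files rather than the cut-down ones means the filename never has to be looked up again afterwards.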

How do you keep multiple laptops in sync?

No love for fdupes? I'm with moggers87. I buy one laptop, put it under a very aggressive on-site repair warranty, and replace it by selling it on eBay when the warranty expires. I back up files on an external HD or in the "cloud."

D A C at 2013-09-04T16:39:27Z

Dear Evan,

Sorry for writing to you this way, but I'm not able to send you a direct note. Would it be possible to delete my Identi.ca account (praetoriuss@identi.ca)?

Regards, praetoriuss

Martin S. at 2013-10-10T10:31:04Z

I'd like to know myself :)

raito at 2015-04-26T06:07:29Z