Evan Prodromou

My files

So, I have 4 laptops that I've used at various times over the last 5 years. I want to sell off most of them, but first I want to make sure I don't lose any files that exist only on one of them.

I've got backups of each, but I don't want to keep around hundreds of gigs of backup for a computer I haven't used in years, just so 5 unique files are still around. I only back up volatile files in /var/, /etc/, /usr/local/, /opt/ and of course /home/, but it's still a lot of data.

So I'd like to identify files that are unique to each computer. Here's how I'm doing it:
  1. For each host, I create a file of SHA2 sums. I'm trying to move away from MD5 sums, although they're probably fine for this application. I use GNU Parallel to keep things going quickly; I think for compute-intensive jobs like crypto sums this makes sense.

    find "$BACKUPDIR" -type f -print0 | parallel -q0 --gnu sha224sum > ~/tmp/${HOSTNAME}_sha224sums.txt

  2. For each host, I make a sorted, uniq'd file of just the sums:

    cut -f1 -d" " ~/tmp/${HOSTNAME}_sha224sums.txt | sort | uniq > ~/tmp/${HOSTNAME}_justsums.txt

  3. I have one computer that's my most recent that I want to keep. So I use "comm" to find checksums that are on other computers that aren't available on that computer:

    comm -13 ~/tmp/${KEEPER}_justsums.txt ~/tmp/${HOSTNAME}_justsums.txt > ~/tmp/${HOSTNAME}_uniquesums.txt

    Technically these aren't actually unique; they're just not on the keeper computer.

  4. Finally, I convert the unique sums into filenames by referencing the original sums file:

    for cs in `<~/tmp/${HOSTNAME}_uniquesums.txt`; do grep -m1 "$cs" ~/tmp/${HOSTNAME}_sha224sums.txt | cut -f3- -d" " >> ~/tmp/${HOSTNAME}_uniquefilenames.txt; done

  5. From there, I sort the unique filenames and then manually (!!) decide what to copy to the "keeper" computer. The signal-to-noise ratio is too low for me to do much automation, except for when I have a directory that I can rsync over in total.

    There are just a ton of files that are too useless to copy - temp files, dot files, etc.
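
Steps 2 through 4 above can also be collapsed into a single pass. The following is only a sketch, not the method used above: it swaps the sort/comm/grep chain for an awk lookup table, and the function name `not_on_keeper` is made up for illustration. It assumes both inputs are plain sha224sum listings (digest, two spaces or a space and an asterisk, then the path) and that no path needed backslash escaping.

```shell
#!/bin/sh
# Hypothetical helper (name invented for this sketch): print the paths in
# HOST_SUMS whose checksum does not appear anywhere in KEEPER_SUMS.
# Assumes both files are sha224sum-style listings: "<digest>  <path>".
not_on_keeper() {
    keeper_sums=$1
    host_sums=$2
    awk '
        # First file (the keeper): remember every checksum it has.
        FILENAME == ARGV[1] { seen[$1] = 1; next }
        # Second file (the old host): for each checksum the keeper lacks,
        # print the first path only, keeping spaces in the name intact.
        !($1 in seen) {
            seen[$1] = 1
            sub(/^[^ ]+ [ *]/, "")
            print
        }
    ' "$keeper_sums" "$host_sums" | sort
}
```

It would be called with the two full sums files from step 1:

    not_on_keeper ~/tmp/${KEEPER}_sha224sums.txt ~/tmp/${HOSTNAME}_sha224sums.txt > ~/tmp/${HOSTNAME}_uniquefilenames.txt

Like the grep -m1 loop, this reports only one path per checksum, so duplicate copies on the old host show up once.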

I'd like to get to the point where I keep most of my stuff on sync'd storage using SparkleShare or git-annex, and most other stuff on hosted git servers like gitorious.org or github.com.

How do you keep multiple laptops in sync?

No love for fdupes? I'm with moggers87. I buy one laptop, put it under a very aggressive on-site repair warranty, and replace it by selling it on eBay when the warranty expires. I back up files on an external HD or in the "cloud."

DAC at 12 years ago

Dear Evan,

Sorry for writing you this way, but I'm not able to write you a direct note. Would it be possible to delete my Identica - Account (praetoriuss@identi.ca)?

Regards, praetoriuss

Martin S. at 12 years ago

i like to know myself :)

raito at 10 years ago