Tag Archives: synchronize

isync: Because the cloud is unreliable

A while back I mentioned offlineimap, and Curtis mentioned isync in reply.

2014-08-09-6m47421-isync

As you can see there, isync (or perhaps more accurately, mbsync) was quite willing to draw in more than 20,000 messages to my local hard drive. In that sense, isync did much as it was reported to do.

And considering I just stole a configuration file from Henrik Pingel, it was a piece of cake to get it working.

I’m no e-mail expert, but what that suggests is that making a local backup of my cloud-based e-mail services is well within my grasp. Now I won’t have any more excuses for ignoring applications intended for organizing local e-mail collections. Darn. 😡

It also means that if you’ve come to distrust the pie-in-the-sky claims of the past decade (that cloud services are the wave of the future, and that everything will be online in the years to come), you can be a rebel, collect it all, and keep it locally. Print it out. Make a scrapbook. Invite your friends over for a party. 🙄

Of course, the real attraction in something like isync is to pair it with an outgoing message system, much like Ian described long ago, and put yourself in control of the entire process. Step up. Take responsibility. Clean up and move on. Go straight and choose life. 😉

I should mention that Henrik’s configuration will require you to put your password in plain text, unless you encrypt it in a separate file. I also noted that it ignores some of the less interesting folders GMail creates by default, like sent mail or starred mail. Whether you include those in your e-mail coup d’état is up to you.

Personally I realize now the immensity of six or seven years of e-mail messages that are stashed on GMail’s servers, and that’s only one of the four or five accounts I use. I think maybe I shall save a little drive space for now, and let GMail wrangle all that for a while longer. … 😳

unison: An alternative to rsync

I’m more than comfortable with rsync; I count it among some of my favorite tools. However I didn’t know there was an alternative in unison.

2014-06-11-6m47421-unison

From what I have seen, unison works in a similar way to rsync, although the home page claims it has a few advantages over other file synchronization tools.

Although some of the points are a bit esoteric, I do like the idea that unison is “fault-resistant” in cases of dropped connections or power failures. Not to point fingers, but I have had problems in the past where cut connections left folders with garbled temporary files everywhere.

I also see that unison remains at user level, which suggests to me that it doesn’t need special daemons or kernel modules to do its job. I can imagine a few situations, like shared systems or some network arrangements, where it would be nice not to rely on elevated privileges or specific system daemons to make backups or mirror data.

It’s going to take me a while to learn the ins and outs of unison, and it’s unlikely that it will dethrone rsync for me. I’m a little stuck in my ways when it comes to my own rudimentary backup plans.

On the other hand, should the opportunity or need arise, I’ll keep it in mind as an alternative.

rsync: Needs no introduction

I don’t think there’s much I can say about rsync that isn’t already common knowledge or preaching to the choir.

kmandla@6m47421: ~/downloads$ rsync -ah --progress source/ destination/
sending incremental file list
./
sample-01.txt
            925 100%    0.00kB/s    0:00:00 (xfr#1, to-chk=9/11)
sample-02.txt
            835 100%  815.43kB/s    0:00:00 (xfr#2, to-chk=8/11)
sample-03.txt
            892 100%  871.09kB/s    0:00:00 (xfr#3, to-chk=7/11)
sample-04.txt
            901 100%  879.88kB/s    0:00:00 (xfr#4, to-chk=6/11)
sample-05.txt
            893 100%  872.07kB/s    0:00:00 (xfr#5, to-chk=5/11)
sample-06.txt
            900 100%  878.91kB/s    0:00:00 (xfr#6, to-chk=4/11)
sample-07.txt
            886 100%  865.23kB/s    0:00:00 (xfr#7, to-chk=3/11)
sample-08.txt
            832 100%  812.50kB/s    0:00:00 (xfr#8, to-chk=2/11)
sample-09.txt
            883 100%  862.30kB/s    0:00:00 (xfr#9, to-chk=1/11)
sample-10.txt
            888 100%  433.59kB/s    0:00:00 (xfr#10, to-chk=0/11)

kmandla@6m47421: ~/downloads$ 

rsync is, was, and has been one of my favorite tools for a very long time, and short of single-file, single-target copies, it’s the one thing I use to copy, back up, synchronize or just plain double-check.

rsync works across networks, across directories and within file trees. It gives clean progress indicators, can run completely silently, can delete files that aren’t in the source folder, and can skip files that don’t already exist in the destination. Just tell it what you want.
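As a rough sketch of how those behaviours map onto rsync's command-line flags, the snippet below assembles an argument list without running anything. The flags themselves (`-a`, `-h`, `--progress`, `--quiet`, `--delete`, `--existing`, `--dry-run`) are standard rsync options; the helper name `build_rsync_command` and its keyword arguments are made up for illustration.

```python
import shlex


def build_rsync_command(source, destination, *, progress=False, quiet=False,
                        delete=False, existing=False, dry_run=False):
    """Return an rsync argument list (hypothetical helper, nothing is executed)."""
    command = ["rsync", "-ah"]        # archive mode, human-readable sizes
    if progress:
        command.append("--progress")  # per-file progress indicators
    if quiet:
        command.append("--quiet")     # suppress non-error output
    if delete:
        command.append("--delete")    # remove destination files missing from the source
    if existing:
        command.append("--existing")  # skip files not already present in the destination
    if dry_run:
        command.append("--dry-run")   # report what would change, but change nothing
    command += [source, destination]
    return command


if __name__ == "__main__":
    cmd = build_rsync_command("source/", "destination/", delete=True, dry_run=True)
    print(shlex.join(cmd))  # rsync -ah --delete --dry-run source/ destination/
```

Pairing `--delete` with `--dry-run` first is a cautious habit: it lists what would be removed before anything actually is.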

I think that will do for now. Like I said at the start, if you know it, there’s no point in me gloating over it. And if you don’t … waste no time in trying it out. 😉

rdiffdir: A succinct sync

Forgive me if I jump slightly out of order. I wanted to work with rdiffdir today, and I promise to touch on rdiff-backup tomorrow.

Also please forgive me if I don’t have screenshots this time. I think I can adequately explain what’s happening, and rdiffdir isn’t particularly wordy.

I have practical experience with it, albeit a few years out of date. At a time when I quit lugging an ancient laptop back and forth to work to listen to music, rdiffdir made it easy to synchronize my main music archive at home with the remote one at work … without a network connection.

“What witchcraft is this?!” you might howl. I’ll give you the command sequence, and you work out what’s happening. Office machine first:

rdiffdir signature music/ music.signature

Then at home:

rdiffdir delta music.signature music/ music.delta

Carry that back to the office, and …

rdiffdir patch music/ music.delta

And that’s it (or at least what I remember of it). The signature command creates a distinct impression of what’s available on the office machine. The delta creates a file packed with changed material from the home machine, and the patch command merges it with the destination at the office again.

It’s very clever, really. What you avoid is rsyncing entire folders to USB drives, then USB drives to destination folders — hopefully saving time, and space on your intermediary drive.
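To make the signature, delta and patch idea concrete, here is a deliberately simplified model in Python. It is not how rdiffdir works internally: rdiffdir belongs to duplicity, which relies on librsync and can describe changes inside a file, whereas this sketch hashes whole files and copies any file that differs. The function names and the dictionary layout are assumptions made for illustration only.

```python
import hashlib
from pathlib import Path


def _digest(path):
    """Hash one file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def make_signature(root):
    """Office machine: map each relative file path to a hash of its contents."""
    root = Path(root)
    return {p.relative_to(root).as_posix(): _digest(p)
            for p in sorted(root.rglob("*")) if p.is_file()}


def make_delta(signature, root):
    """Home machine: collect files that are new or changed, plus names to delete."""
    root = Path(root)
    current = make_signature(root)
    changed = {name: (root / name).read_bytes()
               for name, digest in current.items()
               if signature.get(name) != digest}
    deleted = sorted(set(signature) - set(current))
    return {"changed": changed, "deleted": deleted}


def apply_patch(root, delta):
    """Office machine again: update the tree using nothing but the delta."""
    root = Path(root)
    for name, data in delta["changed"].items():
        target = root / name
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)
    for name in delta["deleted"]:
        (root / name).unlink(missing_ok=True)  # emptied directories are left behind
```

The point of the three steps is that only the small signature travels one way and only the changed material travels back, so the intermediary drive never has to hold the whole archive.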

I could see where this would also be useful for completely offline backups, where you want to keep the file arrangement and integrity of one machine in step with another that is completely disconnected. Which, in this day and age, isn’t a bad idea. 😯

rdiffdir is part of duplicity, which is available in Debian and Arch. Tomorrow, rdiffdir’s ugly kid brother. 😉

By the way, I should mention that everything I know about rdiffdir I learned years ago from this page. Credit where credit is due. 😀

pssh: Still more parallelized tools

I think so far, every parallelized tool I’ve discovered in the P section has been new to me.

pssh is new to me too, even if it dates back to at least 2009, if not further.

pssh is a collection of ssh-oriented tools written in Python and mimicking a lot of the standard openssh-style fare. There is a strict pssh application, a psshscp tool for scp-ish adventures, a prsync utility and some others, along with a library to assist with creating new tools.

My escapades with the pssh tools were a little less than successful, something I am always willing to blame on myself first.

Part of my difficulty may lie in that the flag options for prsync (and I use that only as an example; some of the other tools also gave difficulty) are very different from vanilla rsync. prsync, as best I can tell, also demands that you declare a host and a user in the command or face error messages.

The odd thing being, if I tried to just sync two folders in my home directory, a la

prsync -r -h kmandla@127.0.0.1 -l kmandla source/ /home/kmandla/dest/

I met with an error exit code of 255 — which I can’t seem to track down in the man pages or on the web site.
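One thing worth checking: the pssh man pages list `-h` as the option for a hosts file and `-H` for a single `[user@]host[:port]` string, so the command above may have been asking prsync to read a file named kmandla@127.0.0.1. Whether that accounts for the 255 is only a guess. ssh itself exits with 255 when it hits an error of its own, so an SSH server that isn't running on localhost would be another candidate. The sketch below only assembles the two forms as argument lists; the helper name `build_prsync_command` is hypothetical.

```python
def build_prsync_command(local, remote, *, hosts=(), hosts_file=None,
                         user=None, recursive=False):
    """Assemble a prsync argument list (illustrative helper, nothing is executed)."""
    command = ["prsync"]
    if recursive:
        command.append("-r")             # recurse into directories
    if hosts_file:
        command += ["-h", hosts_file]    # -h takes a FILE listing hosts, one per line
    for host in hosts:
        command += ["-H", host]          # -H takes a host string, and may be repeated
    if user:
        command += ["-l", user]          # default login name
    command += [local, remote]
    return command


if __name__ == "__main__":
    # Host given directly on the command line, using -H rather than -h.
    print(build_prsync_command("source/", "/home/kmandla/dest/",
                               hosts=["kmandla@127.0.0.1"], recursive=True))
```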

Some other issues too; there were slight inconsistencies in the documentation. The man page for psshscp is titled “pscp.” The man page for pslurp says it’s an application to kill parallelized processes, but the extended description talks about copying and source and destinations and so forth. I admit I was confused. (All this in the Arch version, by the way.)

And beyond that (and I’m not afraid to display my ignorance here), I’m not sure if “parallelized” means “optimized for multiprocessor machines,” as was the case with pbzip2 and pigz, or “optimized for high bandwidth connections,” since most of these are aimed at networking tasks. From what I can tell.

It seems it should be the latter … or at least that’s what I’d be looking for. I’m probably splitting hairs here, but I can say that most of my bottlenecks when I use things like rsync or ssh are not at the processor. And that’s all the more I’ll say, at risk of embarrassing myself.
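For what it's worth, the pssh man page describes the tool as executing ssh in parallel on a number of hosts, which points to a third reading: the same job is sent to many machines at once. A minimal sketch of that pattern follows, using Python's standard `concurrent.futures`. The function `run_on_hosts`, the stand-in task and the limit of 4 are all illustrative assumptions, not pssh's own code or defaults.

```python
from concurrent.futures import ThreadPoolExecutor


def run_on_hosts(task, hosts, max_parallel=4):
    """Call task(host) for every host, with at most max_parallel running at once."""
    hosts = list(hosts)
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # pool.map returns results in the same order as the input hosts.
        return dict(zip(hosts, pool.map(task, hosts)))


if __name__ == "__main__":
    # Stand-in task; a real one would open an ssh connection to the host.
    results = run_on_hosts(lambda host: "checked " + host, ["alpha", "beta", "gamma"])
    print(results)
```

Seen that way, the gain is in not waiting for one host to finish before starting the next, rather than in processor or bandwidth tuning on a single connection.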

I’ll let you give them a try, and see if they behave any better for you, or if their focus is a little more clear. It’s good to know they’re available, and maybe they’ll brighten someone’s day. 😉

offlineimap: With visible potential

Tara sent me a link a few months ago about offlineimap, at a time when I was tinkering with an unrelated e-mail tool. I can say up front that I can see the potential in this.

2014-02-17-lv-r1fz6-offlineimap

If I understand it right, this would allow you to synchronize a remote mail directory (something like GMail, as above) to a local folder, and then use a traditional mail reader, like mutt, to … well … read your e-mail. 🙄

If that’s the case then I may have found a way around my usual need for a locally built e-mail system, and I can start whacking away at other e-mail tools that don’t really apply to Web-based services.

On the other hand, I’m not sure whether offlineimap will help with sending e-mails, although I haven’t really worked much with it beyond what you see above. Science demands an answer.

Something tells me offlineimap has some sort of provision for that. It is, after all, 12 years old.

Sending or receiving, the first place to start would be (and was) the Arch wiki, which has plenty of sample configurations and a special one just for GMail accounts … which worked more or less perfectly for me. I’m such a copy-paster. 😳

Just for future reference, or if you want to tinker with offlineimap too, here’s what I used:

[general]
# List of accounts to be synced, separated by a comma.
accounts = gmail-remote

[Account gmail-remote]
# Identifier for the local repository; e.g. the maildir to be synced via IMAP.
localrepository = main-local
# Identifier for the remote repository; i.e. the actual IMAP, usually non-local.
remoterepository = gmail-remote
# Status cache. Default is plain, which eventually becomes huge and slow.
status_backend = sqlite

[Repository main-local]
# Currently, offlineimap only supports maildir and IMAP for local repositories.
type = Maildir
# Where should the mail be placed?
localfolders = ~/.mail

[Repository gmail-remote]
type = Gmail
remoteuser = k.mandla@gmail.com
# Placeholder: replace with your own password, which is stored here in plain text.
remotepass = password
nametrans = lambda foldername: re.sub ('^\[gmail\]', 'bak',
                               re.sub ('sent_mail', 'sent',
                               re.sub ('starred', 'flagged',
                               re.sub (' ', '_', foldername.lower()))))
# Use a list here: "not in" against a bare string is a substring test, not an exact match.
folderfilter = lambda foldername: foldername not in ['[Gmail]/All Mail']
# Necessary as of OfflineIMAP 6.5.4
sslcacertfile = /etc/ssl/certs/ca-certificates.crt

You will, of course, have to adjust that to your liking. And I’m sure that could be streamlined a bit.
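If the nametrans line looks opaque, it can be unpicked in plain Python. The sketch below applies the same four substitutions in the same order to a few typical GMail folder names (the folder names are assumed examples, not a listing from a real account). It also contrasts two ways of writing a folderfilter, since `not in` behaves differently against a string than against a list.

```python
import re


def nametrans(foldername):
    """Apply the same substitutions as the nametrans lambda, innermost first."""
    name = foldername.lower()
    name = re.sub(' ', '_', name)              # spaces become underscores
    name = re.sub('starred', 'flagged', name)
    name = re.sub('sent_mail', 'sent', name)
    return re.sub(r'^\[gmail\]', 'bak', name)  # leading [gmail] becomes bak


def filter_with_string(foldername):
    # Against a string, "not in" is a substring test.
    return foldername not in '[Gmail]/All Mail'


def filter_with_list(foldername):
    # Against a list, "not in" is an exact-match membership test.
    return foldername not in ['[Gmail]/All Mail']


if __name__ == "__main__":
    for folder in ['INBOX', '[Gmail]/Sent Mail', '[Gmail]/Starred', '[Gmail]/All Mail']:
        print(folder, '->', nametrans(folder))
    # INBOX -> inbox
    # [Gmail]/Sent Mail -> bak/sent
    # [Gmail]/Starred -> bak/flagged
    # [Gmail]/All Mail -> bak/all_mail
    print(filter_with_string('[Gmail]'), filter_with_list('[Gmail]'))  # False True
```

Both filters exclude '[Gmail]/All Mail' itself; they differ only for names that happen to be a fragment of that string.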

I can’t say if offlineimap is any better or worse than anything else available; I do see a lot of Remy’s NoPriv.py in this too though. Perhaps the two are not dissimilar.

I intend to come back and take a look at offlineimap again sometime soon. I can’t say for sure why; I just have a hunch that it will be useful in the months and years to come.