Tag Archives: sync

bitpocket: Your in-house drop box

I have three i686 machines at the moment, all three of which are running Arch. One acts as a wireless relay to the other two (thanks, create_ap), and shares its wired connection with the two others.

It also is set up to sync itself against the Arch repositories once a day, and then I manually rsync between the other two and share the downloaded updates. This saves me the drag of triple-updating machines, since my in-house wireless connection is considerably faster than the wired line.

Anything leftover or specific to one piece of hardware — like video drivers — gets downloaded normally, when a specific machine does its update.

I do end up doing some double-back synchronization, from satellite computers to the central hub. That’s more of an insurance measure or to make backups of individual packages that were downloaded to specific machines.

All that rsyncing back and forth relies on sshd of course, and it gets a little tedious having to re-enter passwords on every rsync, but I don’t know if there’s anything to be done about that. It’s the nature of the beast.

I thought perhaps working a little bit with bitpocket might give me some ideas, since bitpocket is intended as a do-it-yourself Dropbox-style network storage tool, built upon the mighty rsync.

[Screenshot: bitpocket]

It did and it didn’t. bitpocket runs into much the same issue as I had, where I needed to supply a password four times — actually five — to get the proper access across the network and update the master folder on the central machine.

I don’t hold that out as a fault of bitpocket though, since whatever setting or configuration I’m after to solve my own problem isn’t bitpocket’s responsibility. I shall pursue that independently.

bitpocket itself is rather useful, and very easy to set up. Supply a proper folder and identity for your server machine, create a folder for a corresponding copy on the client, and just call bitpocket from that folder. bitpocket can check for updates, synchronize new files, delete old ones and generally handle everything as it should be done.

bitpocket has some other features that are nifty, such as the ability to tell you what will be updated (in other words, what changes exist since the last sync) and delete protection, where it moves “deleted” files into a hidden folder, in case you make a mistake.

If you already do fairly regular rsyncs against a master machine, or if you just want to streamline the process of keeping work folders up to date, I can see where bitpocket might improve upon a long chain of rsync commands.

In my case, I really need to find out how to authenticate rsync without being prompted each time, and follow through with that. (Edit: I figured it out, thanks. I just needed to generate an SSH key pair and copy the public half, id_rsa.pub, over to the server. 😉 )

isync: Because the cloud is unreliable

A while back I mentioned offlineimap, and Curtis mentioned isync in reply.

[Screenshot: isync]

As you can see there, isync (or perhaps more accurately, mbsync) was quite willing to draw in more than 20,000 messages to my local hard drive. In that sense, isync did much as it was reported to do.

And considering I just stole a configuration file from Henrik Pingel, it was a piece of cake to get it working.

I’m no e-mail expert, but what that suggests is making a local backup of my cloud-based e-mail services is well within my grasp. Now I won’t have any more excuses for ignoring applications intended for organizing local e-mail collections. Darn. 😡

It also means that if you’ve come to distrust the pie-in-the-sky claims of the past decade (about cloud services being the wave of the future, and how everything will be online in the years to come), you can be a rebel, collect it all, and keep it locally. Print them out. Make a scrapbook. Invite your friends over for a party. 🙄

Of course, the real attraction in something like isync is to pair it with an outgoing message system, much like Ian described long ago, and put yourself in control of the entire process. Step up. Take responsibility. Clean up and move on. Go straight and choose life. 😉

I should mention that Henrik’s configuration will require you to put your password in plain text, unless you encrypt it in a separate file. I also noted that his default is to ignore some of the less interesting folders GMail uses by default, like sent mail or starred mail. Whether you include those in your e-mail coup d’état is up to you.

Personally I realize now the immensity of six or seven years of e-mail messages that are stashed on GMail’s servers, and that’s only one of the four or five accounts I use. I think maybe I shall save a little drive space for now, and let GMail wrangle all that for a while longer. … 😳

unison: An alternative to rsync

I’m more than comfortable with rsync; I count it among some of my favorite tools. However I didn’t know there was an alternative in unison.

[Screenshot: unison]

From what I have seen, unison works in a similar way to rsync, although the home page claims it has a few advantages over other file synchronization tools.

Although some of the points are a bit esoteric, I do like the idea that unison is “fault-resistant” in cases of dropped connections or power failures. Not to point fingers, but I have had problems in the past where cut connections left folders with garbled temporary files everywhere.

I also see that unison remains at user level, which suggests to me that it doesn’t need special daemons or kernel modules to do its job. I can imagine a few situations, like shared systems or some network arrangements where it would be nice not to rely on elevated privileges or specific system daemons to make backups or mirror data.

It’s going to take me a while to learn the ins and outs of unison, and it’s unlikely that it will dethrone rsync for me. I’m a little stuck in my ways when it comes to my own rudimentary backup plans.

On the other hand, should the opportunity or need arise, I’ll keep it in mind as an alternative.

rsync: Needs no introduction

I don’t think there’s much I can say about rsync that isn’t already common knowledge or preaching to the choir.

kmandla@6m47421: ~/downloads$ rsync -ah --progress source/ destination/
sending incremental file list
./
sample-01.txt
            925 100%    0.00kB/s    0:00:00 (xfr#1, to-chk=9/11)
sample-02.txt
            835 100%  815.43kB/s    0:00:00 (xfr#2, to-chk=8/11)
sample-03.txt
            892 100%  871.09kB/s    0:00:00 (xfr#3, to-chk=7/11)
sample-04.txt
            901 100%  879.88kB/s    0:00:00 (xfr#4, to-chk=6/11)
sample-05.txt
            893 100%  872.07kB/s    0:00:00 (xfr#5, to-chk=5/11)
sample-06.txt
            900 100%  878.91kB/s    0:00:00 (xfr#6, to-chk=4/11)
sample-07.txt
            886 100%  865.23kB/s    0:00:00 (xfr#7, to-chk=3/11)
sample-08.txt
            832 100%  812.50kB/s    0:00:00 (xfr#8, to-chk=2/11)
sample-09.txt
            883 100%  862.30kB/s    0:00:00 (xfr#9, to-chk=1/11)
sample-10.txt
            888 100%  433.59kB/s    0:00:00 (xfr#10, to-chk=0/11)

kmandla@6m47421: ~/downloads$ 

rsync is, was, and has been one of my favorite tools for a very long time, and short of single-file, one-target copies, it’s the one thing I use to copy, back up, synchronize or just plain double-check.

rsync works across networks, across directories and within file trees. It gives clean progress indicators, can run completely silent, can delete destination files that aren’t in the source folder, and can skip files that don’t already exist in the destination. Just tell it what you want.
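A minimal sketch of how those requests map onto flags, written in Python so it only assembles the command line and never runs it. The helper name `rsync_command` and the folder names are made up for illustration; `--quiet`, `--progress`, `--delete`, `--existing` and `--dry-run` are standard rsync options.

```python
import shlex


def rsync_command(source, destination, *, quiet=False, delete=False,
                  existing=False, dry_run=False):
    """Assemble (but do not run) an rsync command line as a list of arguments."""
    args = ["rsync", "-ah"]  # archive mode, human-readable sizes
    # --quiet suppresses non-error output; otherwise show per-file progress.
    args.append("--quiet" if quiet else "--progress")
    if delete:
        args.append("--delete")    # remove destination files missing from the source
    if existing:
        args.append("--existing")  # skip files that are not already in the destination
    if dry_run:
        args.append("--dry-run")   # report what would change without touching anything
    return args + [source, destination]


# Hypothetical folders, named like the ones in the transcript above.
mirror = rsync_command("source/", "destination/", delete=True, dry_run=True)
print(shlex.join(mirror))
```

That prints `rsync -ah --progress --delete --dry-run source/ destination/`. Handing the list to `subprocess.run(mirror, check=True)` would execute it; since `--delete` is destructive, previewing with the dry run first seems the safer habit.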

I think that will do for now. Like I said at the start, if you know it, there’s no point in me gloating over it. And if you don’t … waste no time in trying it out. 😉

pssh: Still more parallelized tools

I think so far, every parallelized tool I’ve discovered in the P section has been new to me.

pssh is new to me too, even if it dates back to at least 2009, if not further.

pssh is a collection of ssh-oriented tools written in Python and mimicking a lot of the standard openssh-style fare. There is a strict pssh application, a psshscp tool for scp-ish adventures, a prsync utility and some others, along with a library to assist with creating new tools.

My escapades with the pssh tools were a little less than successful, something I am always willing to blame on myself first.

Part of my difficulty may lie in that the flag options for prsync (and I use that only as an example; some of the other tools also gave difficulty) are very different from vanilla rsync. prsync, as best I can tell, also demands that you declare a host and a user in the command or face error messages.

The odd thing being, if I tried to just sync two folders in my home directory, a la

prsync -r -h kmandla@127.0.0.1 -l kmandla source/ /home/kmandla/dest/

I met with an error exit code of 255 — which I can’t seem to track down in the man pages or on the web site.

Some other issues too; there were slight inconsistencies in the documentation. The man page for psshscp is titled “pscp.” The man page for pslurp says it’s an application to kill parallelized processes, but the extended description talks about copying and source and destinations and so forth. I admit I was confused. (All this in the Arch version, by the way.)

And beyond that (and I’m not afraid to display my ignorance here), I’m not sure if “parallelized” means “optimized for multiprocessor machines,” as was the case with pbzip2 and pigz, or “optimized for high bandwidth connections,” since most of these are aimed at networking tasks. From what I can tell.

It seems it should be the latter … or at least that’s what I’d be looking for. I’m probably splitting hairs here, but I can say that most of my bottlenecks when I use things like rsync or ssh are not at the processor. And that’s all the more I’ll say, at risk of embarrassing myself.

I’ll let you give them a try, and see if they behave any better for you, or if their focus is a little more clear. It’s good to know they’re available, and maybe they’ll brighten someone’s day. 😉

offlineimap: With visible potential

Tara sent me a link a few months ago about offlineimap, at a time when I was tinkering with an unrelated e-mail tool. I can say up front that I can see the potential in this.

[Screenshot: offlineimap]

If I understand it right, this would allow you to synchronize a remote mail directory (something like GMail, as above) to a local folder, and then use a traditional mail reader, like mutt, to … well … read your e-mail. 🙄

If that’s the case then I may have found a way around my usual need for a locally built e-mail system, and I can start whacking away at other e-mail tools that don’t really apply to Web-based services.

On the other hand, I’m not sure where offlineimap will help with sending e-mails, although I haven’t really worked much with it beyond what you see above. Science demands an answer.

Something tells me offlineimap has some sort of provision for that. It is, after all, 12 years old.

Sending or receiving, the first place to start would be (and was) the Arch wiki, which has plenty of sample configurations and a special one just for GMail accounts … which worked more or less perfectly for me. I’m such a copy-paster. 😳

Just for future reference, or if you want to tinker with offlineimap too, here’s what I used:

[general]
# List of accounts to be synced, separated by a comma.
accounts = gmail-remote

[Account gmail-remote]
# Identifier for the local repository; e.g. the maildir to be synced via IMAP.
localrepository = main-local
# Identifier for the remote repository; i.e. the actual IMAP, usually non-local.
remoterepository = gmail-remote
# Status cache. Default is plain, which eventually becomes huge and slow.
status_backend = sqlite

[Repository main-local]
# Currently, offlineimap only supports maildir and IMAP for local repositories.
type = Maildir
# Where should the mail be placed?
localfolders = ~/.mail

[Repository gmail-remote]
type = Gmail
remoteuser = k.mandla@gmail.com
remotepass = password
nametrans = lambda foldername: re.sub ('^\[gmail\]', 'bak',
                               re.sub ('sent_mail', 'sent',
                               re.sub ('starred', 'flagged',
                               re.sub (' ', '_', foldername.lower()))))
folderfilter = lambda foldername: foldername not in ['[Gmail]/All Mail']
# Necessary as of OfflineIMAP 6.5.4
sslcacertfile = /etc/ssl/certs/ca-certificates.crt

You will, of course, have to adjust that to your liking. And I’m sure that could be streamlined a bit.
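To see what the nametrans and folderfilter lines do to folder names, here is a small Python sketch. It unrolls the same chain of substitutions into a plain function and runs it over a few folder names; the sample names are assumptions for illustration. The filter is written with a list so that `in` tests for an exact match (with a bare string on the right-hand side, Python does a substring test instead). As far as I understand offlineimap's documentation, the filter sees the original remote name, before nametrans is applied.

```python
import re


def nametrans(foldername):
    """Same substitutions as the nametrans lambda, applied innermost first."""
    name = foldername.lower()
    name = re.sub(r" ", "_", name)             # spaces become underscores
    name = re.sub(r"starred", "flagged", name)
    name = re.sub(r"sent_mail", "sent", name)
    name = re.sub(r"^\[gmail\]", "bak", name)  # leading "[gmail]" becomes "bak"
    return name


def folderfilter(foldername):
    """Keep every folder except the All Mail catch-all (exact match on a list)."""
    return foldername not in ["[Gmail]/All Mail"]


# Hypothetical remote folder names, for illustration only.
remote_folders = ["INBOX", "[Gmail]/Sent Mail", "[Gmail]/Starred", "[Gmail]/All Mail"]

# Filter on the untranslated name first, then translate what is left.
synced = {name: nametrans(name) for name in remote_folders if folderfilter(name)}
for remote, local in synced.items():
    print(f"{remote} -> {local}")
```

That prints `INBOX -> inbox`, `[Gmail]/Sent Mail -> bak/sent` and `[Gmail]/Starred -> bak/flagged`, while `[Gmail]/All Mail` is left out entirely.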

I can’t say if offlineimap is any better or worse than anything else available; I do see a lot of Remy’s NoPriv.py in this too though. Perhaps the two are not dissimilar.

I intend to come back and take a look at offlineimap again sometime soon. I can’t say for sure why; I just have a hunch that it will be useful in the months and years to come.

ntp: Including my favorite, ntpdate

I learned about ntp and its cohort, ntpdate, when I ran into a machine so slow and old that it couldn’t keep time properly between reboots.

Its internal battery was shot, and powering it down meant the clock would reset at its next startup. And a reset clock triggered a BIOS warning, and … and … and …

The solution was to immediately synchronize its time across the Internet, as soon as it powered up. To that end, ntpdate became quite useful.

[Screenshot: ntpdate]

A quick

ntpdate -u pool.ntp.org

brings everything back into line. And it’s not just old machines that need nudging now and again. As you can see in the screenshot, even a high(er)-end machine can need adjustment over time.

I don’t bother with regular synchronizing though, just because I don’t think a shift of a few dozen seconds over the course of a couple months is worth the effort.

I must admit I haven’t worked much with ntp beyond ntpdate; it’s one of those tools I know about but don’t seem to have much call to use.

I know it can do quite a bit more, but until the need arises, I am content to leave it as a mystery. 😉

grive: Sync with Google Drive

I’m not a Google Drive user; these days I am more than a little concerned about cloud storage. Always have been.

If you are though, there is an open-source, console-based tool for syncing with your Google Drive folder. grive does a decent job at keeping its promise.

[Screenshots: grive 01, grive 02, grive 03]

Simple before-and-after sequence there. 😉

As I see it, grive considers your local folder to be the master, and makes changes to your Drive account as needed.

If something is missing locally, it deletes it remotely. There may be ways to adjust that behavior, and I just didn’t look hard enough for them.

If the home page is to be believed, grive has a few shortcomings, in terms of what it can and will upload or download.

Dot-files are ignored, as are files containing slashes, because the escape sequence is odd to handle, according to the site.

There are also some restrictions on files located in more than one place. I’m not fully grasping the issue; brush up on it before you rely on grive. You wouldn’t want to lose something.

Other than that, grive does what it promises, cleanly and without too much hullabaloo.

If you can convince me to use Google Drive, in this day and age, I might get to know it better. 🙄