pssh: Still more parallelized tools

I think so far, every parallelized tool I’ve discovered in the P section has been new to me.

pssh is new to me too, even if it dates back to at least 2009, if not further.

pssh is a collection of ssh-oriented tools written in Python and mimicking a lot of the standard openssh-style fare. There is a strict pssh application, a psshscp tool for scp-ish adventures, a prsync utility and some others, along with a library to assist with creating new tools.

My escapades with the pssh tools were a little less than successful, something I am always willing to blame on myself first.

Part of my difficulty may lie in that the flag options for prsync (and I use that only as an example; some of the other tools also gave difficulty) are very different from vanilla rsync. prsync, as best I can tell, also demands that you declare a host and a user in the command or face error messages.

The odd thing being, if I tried to just sync two folders in my home directory, a la

prsync -r -h kmandla@ -l kmandla source/ /home/kmandla/dest/

I met with an error exit code of 255 — which I can’t seem to track down in the man pages or on the web site.

Some other issues too; there were slight inconsistencies in the documentation. The man page for psshscp is titled “pscp.” The man page for pslurp says it’s an application to kill parallelized processes, but the extended description talks about copying and source and destinations and so forth. I admit I was confused. (All this in the Arch version, by the way.)

And beyond that, and I’m not afraid to display my ignorance here, I’m not sure if “parallelized” means “optimized for multiprocessor machines,” as was the case with pbzip2 and pigz, or “optimized for high bandwidth connections,” since most of these are aimed at networking tasks. From what I can tell.

It seems it should be the latter … or at least that’s what I’d be looking for. I’m probably splitting hairs here, but I can say that most of my bottlenecks when I use things like rsync or ssh are not at the processor. And that’s all the more I’ll say, at risk of embarrassing myself.

I’ll let you give them a try, and see if they behave any better for you, or if their focus is a little more clear. It’s good to know they’re available, and maybe they’ll brighten someone’s day. 😉

2 thoughts on “pssh: Still more parallelized tools”

  1. darkstarsword

    Neither, it’s for performing the same operation on two or more remote machines in parallel. So for example, if you wanted to check the uptime on three machines, you could do something like this (Debian names the commands parallel-xxx instead of pxxx):
    ian@dukhat~ [i]> parallel-ssh -H -H -H -i uptime
    [1] 13:57:52 [SUCCESS]
    13:57:50 up 69 days, 1:56, 4 users, load average: 0.42, 0.60, 0.71
    [2] 13:57:55 [SUCCESS]
    22:57:52 up 23:08, 0 users, load average: 0.00, 0.01, 0.05
    [3] 13:57:55 [SUCCESS]
    22:57:49 up 1 day, 16:14, 0 users, load average: 0.00, 0.01, 0.05

    There are plenty of uses for that – imagine wanting to simultaneously reboot every node in a cluster of several hundred servers, for example.

    prsync and pscp are for copying files to several servers at once, though prsync seems to insist that the remote directory is specified as an absolute path, which isn’t very helpful when my home directory is not in the same place on each remote server, like this:
    ian@dukhat~ [i]> cat hosts.txt ian dss dss
    ian@dukhat~ [i]> parallel-rsync -h hosts.txt -ar junk /home/dss/mirror/
    [1] 14:17:18 [FAILURE] Exited with error code 11
    [2] 14:18:56 [SUCCESS]
    [3] 14:19:01 [SUCCESS]

    1. K.Mandla Post author

      Actually, that makes a lot more sense. I had a hard time understanding what use pssh was, but I’ll take another look at it now that I have a better picture of what it’s trying to do.

      Cheers, and thanks! 😉
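
For anyone else who was as puzzled as I was, the idea darkstarsword describes (one command, many machines, all at once) can be sketched in a few lines of Python. This is only an illustration of the fan-out pattern, not how pssh itself is written; run_on_host, run_everywhere and the runner parameter are names made up for this sketch, and it assumes plain ssh with key-based logins already working.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_on_host(host, command, runner=("ssh",)):
    """Run one command on one host; return (host, exit_code, output)."""
    # Hypothetical helper: builds e.g. ["ssh", "somehost", "uptime"].
    argv = [*runner, host, command]
    try:
        proc = subprocess.run(argv, capture_output=True, text=True)
    except OSError as err:
        # The runner program itself could not be started.
        return host, 127, str(err)
    return host, proc.returncode, proc.stdout.strip()


def run_everywhere(hosts, command, runner=("ssh",)):
    """Start the same command on every host at once; results keep host order."""
    with ThreadPoolExecutor(max_workers=max(1, len(hosts))) as pool:
        return list(pool.map(lambda h: run_on_host(h, command, runner), hosts))


if __name__ == "__main__":
    # Hosts come from the command line; with none given, nothing is run.
    for host, code, output in run_everywhere(sys.argv[1:], "uptime"):
        status = "SUCCESS" if code == 0 else "FAILURE"
        print(f"[{host}] [{status}] {output}")
```

Each host gets its own thread, so a slow machine does not hold up the others, which is the whole point of the "parallel" in the name. The runner argument is there only so the sketch can be tried with a harmless local program in place of ssh.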
