I think so far, every parallelized tool I’ve discovered in the P section has been new to me.
pssh is new to me too, even if it dates back to at least 2009, if not further.
pssh is a collection of ssh-oriented tools written in Python and mimicking a lot of the standard openssh-style fare. There is a strict pssh application, a psshscp tool for scp-ish adventures, a prsync utility and some others, along with a library to assist with creating new tools.
My escapades with the pssh tools were a little less than successful, something I am always willing to blame on myself first.
Part of my difficulty may be that the flag options for prsync (and I use that only as an example; some of the other tools also gave me difficulty) are very different from those of vanilla rsync. prsync, as best I can tell, also demands that you declare a host and a user in the command or face error messages.
The odd thing is that when I tried to just sync two folders in my home directory, a la
prsync -r -h email@example.com -l kmandla source/ /home/kmandla/dest/
I met with an error exit code of 255 — which I can’t seem to track down in the man pages or on the web site.
There were some other issues too, mostly slight inconsistencies in the documentation. The man page for psshscp is titled “pscp.” The man page for pslurp says it’s an application to kill parallelized processes, but the extended description talks about copying and sources and destinations and so forth. I admit I was confused. (All this in the Arch version, by the way.)
And beyond that (and I’m not afraid to display my ignorance here), I’m not sure if “parallelized” means “optimized for multiprocessor machines,” as was the case with pbzip2 and pigz, or “optimized for high bandwidth connections,” since most of these are aimed at networking tasks. From what I can tell, anyway.
It seems it should be the latter … or at least that’s what I’d be looking for. I’m probably splitting hairs here, but I can say that most of my bottlenecks when I use things like rsync or ssh are not at the processor. And that’s all the more I’ll say, at risk of embarrassing myself.
I’ll let you give them a try, and see if they behave any better for you, or if their focus is a little more clear. It’s good to know they’re available, and maybe they’ll brighten someone’s day. 😉
Neither. It’s for performing the same operation on two or more remote machines in parallel. So for example, if you wanted to check the uptime on three machines, you could do something like this (Debian names the commands parallel-xxx instead of pxxx):
ian@dukhat~ [i]> parallel-ssh -H firstname.lastname@example.org -H email@example.com -H firstname.lastname@example.org -i uptime
 13:57:52 [SUCCESS] email@example.com
13:57:50 up 69 days, 1:56, 4 users, load average: 0.42, 0.60, 0.71
 13:57:55 [SUCCESS] firstname.lastname@example.org
22:57:52 up 23:08, 0 users, load average: 0.00, 0.01, 0.05
 13:57:55 [SUCCESS] email@example.com
22:57:49 up 1 day, 16:14, 0 users, load average: 0.00, 0.01, 0.05
There are plenty of uses for that – imagine wanting to simultaneously reboot every node in a cluster of several hundred servers, for example.
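To make the "same operation on several machines at once" idea concrete, here is a rough Python sketch of the pattern. It is not how pssh is actually implemented; it only assumes a working ssh client on the local machine, and the names run_ssh and run_everywhere are made up for this illustration.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_ssh(host, command):
    """Run one command on one host through the local ssh client.

    Assumes key-based login already works for `host`.
    """
    proc = subprocess.run(
        ["ssh", host, command],
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stdout.strip()


def run_everywhere(hosts, command, runner=run_ssh):
    """Run the same command on every host concurrently.

    Returns a dict mapping each host to (exit_code, output).
    `runner` is a hypothetical hook so the logic can be tried without ssh.
    """
    if not hosts:
        return {}
    # One worker thread per host, so no host waits for another to finish.
    with ThreadPoolExecutor(max_workers=len(hosts)) as pool:
        futures = {host: pool.submit(runner, host, command) for host in hosts}
        return {host: future.result() for host, future in futures.items()}
```

Calling run_everywhere(["host1", "host2"], "uptime") with real host names would return each machine's exit code and uptime line, and the total wait is roughly that of the slowest host rather than the sum of all of them.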
prsync and pscp are for copying files to several servers at once, though prsync seems to insist that the remote directory be specified as an absolute path, which isn’t very helpful when my home directory is not in the same place on each remote server, like this:
ian@dukhat~ [i]> cat hosts.txt
ian@dukhat~ [i]> parallel-rsync -h hosts.txt -ar junk /home/dss/mirror/
 14:17:18 [FAILURE] firstname.lastname@example.org Exited with error code 11
 14:18:56 [SUCCESS] email@example.com
 14:19:01 [SUCCESS] firstname.lastname@example.org
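The numbers in those failure lines are easier to read with a lookup table. The sketch below assumes the pssh tools simply pass along the exit code of the underlying rsync or ssh process, which is my understanding rather than something I have checked in their source. The descriptions are paraphrased from the rsync and ssh man pages, the list is partial, and explain_exit_code is a made-up helper name.

```python
# Partial list of exit values from the rsync man page.
RSYNC_EXIT_CODES = {
    0: "success",
    1: "syntax or usage error",
    2: "protocol incompatibility",
    3: "errors selecting input/output files or directories",
    10: "error in socket I/O",
    11: "error in file I/O",
    12: "error in rsync protocol data stream",
    23: "partial transfer due to error",
    30: "timeout in data send/receive",
}

# The ssh man page says ssh exits with 255 if an error occurred in ssh itself.
SSH_FAILURE = 255


def explain_exit_code(code):
    """Return a short description for an exit code reported by prsync."""
    if code == SSH_FAILURE:
        return "ssh itself failed (for example, it could not connect or log in)"
    return RSYNC_EXIT_CODES.get(code, "unknown exit code")
```

Under that assumption, the code 11 above points to a file I/O problem on that one server, and a 255 suggests the ssh connection never got as far as running rsync.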
Actually, that makes a lot more sense. I had a hard time understanding what use pssh was, but I’ll take another look at it now that I have a better picture of what it’s trying to do.
Cheers, and thanks! 😉