sort: Deserves better attention

A long time ago, when I was assembling The List, I thought to myself that I should save a little extra time and space for sort. Unfortunately it’s not going to work out, because of some time crunches I have in real life. And that’s a shame, because in no small sense, it’s a really great tool that saves me a lot of hassle.

Here’s an example. What’s your machine say if you ask it this?

lsmod

If it’s anything like mine, and it probably is, you get a huge smattering of modules that are currently inserted into your kernel. That’s what it’s supposed to say.

Now try to pick out the ones that are interrelated. Not so easy, is it? If, for example, I want to see if the ath5k module was inserted when I jammed a PCMCIA wireless card in, sort comes to my rescue.

lsmod | sort

Yes, I know I could use grep, and in some cases I would. But modules tend to be interrelated, and sometimes it’s more useful to have a full list to scan, especially when troubleshooting.

sort can do a couple of interesting things. The -u flag drops duplicate lines, if you’re not interested in repeats. The -h flag sorts by human-readable numeric values, like 4.0K or 1.2M. How is that useful?

ls -hs /etc | sort -h

Now it’s useful. Even better, here are the five biggest files in a directory.

ls -hs /etc | sort -h -r | head -n 5

There are many ways to do that; that’s just one way to skin the proverbial cat.
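If you’d like to see what -u and -h are doing without digging through /etc, here’s a small sketch on throwaway input. The module names and sizes are made up for illustration, and -h assumes a sort that supports it, such as the GNU one in coreutils.

```shell
# Made-up module names with repeats; this is not real lsmod output.
modules=$(printf '%s\n' ath5k cfg80211 ath ath5k mac80211 ath)

# -u prints each distinct line once, so six lines shrink to four.
unique=$(printf '%s\n' "$modules" | sort -u)
printf '%s\n' "$unique"

# Made-up sizes in the style of the first column of ls -hs.
sizes=$(printf '%s\n' 1.5M 12K 4.0K 2G 900K)

# -n reads only the leading number and ignores the suffix, so 900K sorts last.
by_number=$(printf '%s\n' "$sizes" | sort -n)
printf '%s\n' "$by_number"

# -h also weighs the K, M and G suffixes, so 2G sorts last.
by_human=$(printf '%s\n' "$sizes" | sort -h)
printf '%s\n' "$by_human"
```

The contrast between the last two lists is the whole point of -h: a plain numeric sort thinks 900K is bigger than 2G.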

sort gets a little cryptic when you’re not interested in sorting by the first character in a line, but it’s not impossible. Here’s a deliberately screwy text file, tab-separated, and we’re going to sort by the second column. 😯 Trust me. 😉

kmandla@6m47421: ~$ for i in {1..10} ; do echo -e $(shuf -n 1 /usr/share/dict/cracklib-small )"\t"$(shuf -n 1 /usr/share/dict/cracklib-small ) >> test.txt ; done

kmandla@6m47421: ~$ cat test.txt | column -t
sial         nullstellensatz
galloped     codicil
pored        presence
measurer     lane
protective   ocean's
rapport      scotsmen
shrewd       sift
calculation  drafted
parklike     gimmicks
moslem       logo

The trick is to use the field separator flag (-t) to tell sort that fields are split on tabs, and the key flag (-k) to tell it which field to sort on.

kmandla@6m47421: ~$ sort -t$'\t' -k2 test.txt | column -t
galloped     codicil
calculation  drafted
parklike     gimmicks
measurer     lane
moslem       logo
sial         nullstellensatz
protective   ocean's
pored        presence
rapport      scotsmen
shrewd       sift

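The $'\t' is how bash writes a literal tab character, and the -k2 tells sort to start its magic at the second field it finds. Strictly speaking, -k2 runs from the second field to the end of the line; with only two columns that’s the same thing, but -k2,2 would pin it to the second field alone. Voila. That wasn’t so hard, was it?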
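The same two flags stretch a little further. Here’s a rough sketch, on made-up three-column data rather than the test.txt above, of pinning the key to one field with -k2,2 and comparing it as a number by tacking on n. The tab variable is just my stand-in for $'\t', so the snippet also runs in shells that don’t understand that quoting.

```shell
# A literal tab, built with printf so it works outside bash too.
tab=$(printf '\t')

# Hypothetical tab-separated data: name, count, tag.
data=$(printf '%s\t%s\t%s\n' pear 10 z apple 9 y fig 100 x)

# -k2,2 compares the second field as text, so "10" and "100" land before "9".
by_text=$(printf '%s\n' "$data" | sort -t "$tab" -k2,2)
printf '%s\n' "$by_text"

# Adding n compares the same field as a number: 9, then 10, then 100.
by_count=$(printf '%s\n' "$data" | sort -t "$tab" -k2,2n)
printf '%s\n' "$by_count"
```

Without the n, sort compares the digits character by character, which is rarely what you want for a column of counts.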

I had a few other things I hoped to show with sort, but honestly, time these days is very short. Maybe one day we can come back and swap sort war stories. … 😀

P.S.: sort, like all great command line tools, is part of coreutils. 😉