Tag Archives: files

sloccount and sloc2html.py: Because size does matter

I’m not a coder, and I try whenever possible to repeat that, at the top of my lungs.

There are some coders’ tools that I think are nifty though. Way back in the C section we had codemetre, for checking the length and breadth of your efforts and comparing that with the number of commented lines.

In a similar vein, here’s sloccount, which goes a step further by counting lines of code by language, summing them across an entire project, and showing some basic analysis.
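For the curious, here is a rough sketch of the general idea of counting source lines per language across a directory tree. It is not how sloccount itself works: the extension table, the comment prefixes and the function names `count_sloc` and `count_tree` are all made up for illustration, and it only skips blank lines and whole-line comments.

```python
import os
from collections import Counter

# Hypothetical mapping from file extension to a language label; a real
# counter recognizes far more languages than this.
LANGUAGES = {".c": "c", ".h": "c", ".py": "python", ".sh": "sh"}

# Line-comment prefix per language (block comments are ignored in this sketch).
COMMENT_PREFIX = {"c": "//", "python": "#", "sh": "#"}


def count_sloc(text, comment_prefix):
    """Count lines that are neither blank nor whole-line comments."""
    count = 0
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith(comment_prefix):
            count += 1
    return count


def count_tree(root):
    """Walk a directory tree and total the counted lines per language."""
    totals = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            language = LANGUAGES.get(os.path.splitext(name)[1])
            if language is None:
                continue  # unknown file type: skip it
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as handle:
                totals[language] += count_sloc(handle.read(), COMMENT_PREFIX[language])
    return totals
```

Calling `count_tree(".")` returns a `Counter`, so `.most_common()` lists the languages from largest to smallest.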


And that’s what you get if you point it at the source for the 3.14.2 kernel. 😉

sloccount is nifty by itself, no doubt about it. If you’re willing to make a small leap toward the graphical, sloc2html.py will — as you might expect — convert the results of sloccount into something browser-oriented.


This is where things start to break down. The home page for sloccount also hosts a touched-up version of sloc2html.py, but the link to the actual python program is dead. I scraped around the internets and I think I found a reasonable facsimile thereof, here. I have no way of knowing if that’s the right one though. 😕

The problem is that, as with so many python programs I encounter, advances in python proper seem to spawn errors in sloc2html.py. The fewest seemed to occur with Arch’s python2 in charge. What you see above is what sloc2html.py could write out before it crashed and burned.

I think you get an idea of what it should be doing though, from the screenshot. If you are a pythonian, you might be able to get it going with version 3+, with a little effort. I lack the requisite skills for that.

Which is what I mentioned at the start of this post. Clever, aren’t I? 😉

rdfind: Echolocation

I once destroyed — utterly destroyed — a Windows XP installation by playing fast and loose with a utility that sought out duplicate files and arbitrarily removed them.

You can imagine the havoc that caused. I couldn’t tell you the name of the utility now, and it doesn’t really matter except that tinkering with rdfind brought that memory back.


No, I didn’t destroy any Linux installations today, and I daresay that neither Linux nor rdfind would allow me to utterly decimate the system without at least showing some credentials. I get by with a little help from my friends.

It does sound like the author of rdfind may have had similar experiences though, given the explanation on the home page and rdfind’s options for linking files, as well as removing them outright.

I also like that rdfind creates an output file, showing the fruits of its labor and giving you a report on its opinions. Every program should be so polite.
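If you wonder how a duplicate hunter can work in principle, here is a minimal sketch. It is an assumption about the general technique, not rdfind’s actual method: group files by size first, then hash only the ones that share a size. The names `file_digest` and `find_duplicates` are invented for this example, and the code only reports matches; it never links or deletes anything.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Return sorted lists of paths under root whose contents hash the same."""
    # Pass 1: group by size, since files of different sizes cannot match.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    # Pass 2: hash only the files that share a size with another file.
    by_digest = defaultdict(list)
    for paths in by_size.values():
        if len(paths) > 1:
            for path in paths:
                by_digest[file_digest(path)].append(path)

    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
```

Keeping the finding and the acting separate is the cautious design choice here: you get a list to read before anything is touched.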

Seeing as rdfind was Ian Munsie’s suggestion, I suppose I should offer one small note of thanks for pushing me in this direction. I do think it’s a step above fdupes, in technical terms.

Now all I need are some duplicate files to thrash. … 😈

diff: Tools to show you what’s changed

Up front I should say that I had a nice post about diff and cmp and the other tools in diffutils ready, but by some freakish twist of fate, it seems to have vanished.

I blame no one for that, but it does mean that this post is a very abbreviated version — a mere shadow of its former glory.

In short, diff shows what’s different between two files, line by line. It sounds simpler than it is.


As you can see, diff walks through line by line and shows what was added to or removed from one file, as compared to the other.
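To see the same idea without two files on disk, Python’s standard `difflib` module can produce a unified-style diff from two lists of lines. This is only a sketch: the sample lines and the labels `old.txt` and `new.txt` are invented, and the unified format is just one of several layouts diff can print.

```python
import difflib

# Two made-up versions of the same short file, one line per list item.
old = ["apple\n", "banana\n", "cherry\n"]
new = ["apple\n", "blueberry\n", "cherry\n", "date\n"]

# unified_diff yields the lines of a unified-format diff: removed lines
# start with "-", added lines with "+", unchanged context with a space.
delta = list(difflib.unified_diff(old, new, fromfile="old.txt", tofile="new.txt"))
print("".join(delta), end="")
```

Lines that begin with a minus were taken out of the first version, and lines that begin with a plus were added in the second.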

If you’ve worked with diff, it was probably under the pretext of patches for code; that’s where I learned what little I know about it.

Technically diff only works with two files; diff3 should help you sift through three at a time.

At this point, you’re probably thinking that diff alone is only marginally useful, and obviously intended for patching things. After all, its output is hardly readable except by experienced users.

To that end, I offer you sdiff, making life easy since … since … well, anyway:


As you can see, sdiff does you the favor of laying the two files side by side and flagging differing lines with pipe symbols; lines that appear in only one file get a < or > instead. Much easier to absorb, for visual people … like me.
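A side-by-side layout like that can be imitated with `difflib.SequenceMatcher`. Treat this as a loose sketch rather than a copy of sdiff: the function name `side_by_side` and the column width are assumptions, and the pairing of changed lines is simpler than what the real tool does.

```python
import difflib
from itertools import zip_longest


def side_by_side(left, right, width=20):
    """Return rows pairing left and right lines with a marker between them."""
    rows = []
    matcher = difflib.SequenceMatcher(a=left, b=right)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # zip_longest pads the shorter side with None when a changed
        # region has more lines on one side than the other.
        for l, r in zip_longest(left[i1:i2], right[j1:j2]):
            if l is None:
                marker = ">"  # line exists only on the right
            elif r is None:
                marker = "<"  # line exists only on the left
            elif tag == "equal":
                marker = " "  # identical on both sides
            else:
                marker = "|"  # present on both sides but different
            rows.append(f"{l or '':<{width}} {marker} {r or ''}")
    return rows


rows = side_by_side(["a", "b", "c"], ["a", "x", "c", "d"])
print("\n".join(rows))
```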

One last note: cmp comes with diff, but is quite different.


Where diff compares lines of text, cmp works byte by byte. I can only suggest that this might be useful if you’re looking for one or two different characters in two similar files: perhaps a data corruption issue, or something like it. I don’t have quite as much experience with cmp as I do with diff … which isn’t saying much. 🙄
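The byte-by-byte idea is simple enough to sketch in a few lines. This is an illustration of the technique, not cmp’s own code: the names `first_difference` and `compare_files` are hypothetical, and where one file is simply shorter than the other, this sketch reports the position just past the shorter one instead of printing an end-of-file message.

```python
def first_difference(data_a, data_b):
    """Return the 1-based offset of the first differing byte, or None.

    If one input is a prefix of the other, the offset just past the shorter
    input is returned, since that is where the two stop matching.
    """
    for offset, (byte_a, byte_b) in enumerate(zip(data_a, data_b), start=1):
        if byte_a != byte_b:
            return offset
    if len(data_a) != len(data_b):
        return min(len(data_a), len(data_b)) + 1
    return None


def compare_files(path_a, path_b):
    """Read two files as raw bytes and locate their first difference."""
    with open(path_a, "rb") as file_a, open(path_b, "rb") as file_b:
        return first_difference(file_a.read(), file_b.read())
```

Reading whole files into memory keeps the sketch short; for very large files you would compare them chunk by chunk instead.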

In closing I should mention that each of these has a laundry list of flags and options, and a lot more ways to be put to use than what I show here.

But I think I hit the main points of the old post. I’ll dig around some more and see if I can dredge it up, but I have the feeling it’s lost to the ether. Such is life. 😐