A long time ago I mentioned wikipedia2text, and not long after we ran past wikicurses as an alternative. In both of those cases, the goal was to show Wikipedia pages in the console, without so much congealed dreck. wikicurses in particular seemed like a good option.

But considering that much of Wikipedia is put together in a markdown-ish fashion, wouldn’t it make sense to have some sort of conversion between HTML and Wikipedia format? You could conceivably take a dull .html file and send it straight through, coded and set.

Never fear, true believer.


html2wikipedia is a free-ranging program that does very much that. In my case, I grabbed an HTML page, pumped it through html2wikipedia, and got something very close to markdown.
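
For a rough idea of what a conversion like that involves, here is a toy sketch in Python. To be clear, this is not html2wikipedia's code or method, just an illustration: the names WikitextSketch and html_to_wikitext are made up, it assumes reasonably tidy HTML, and it only handles a handful of tags (bold, italic, headings, list items, links and paragraphs). Anything else passes through as plain text.

```python
from html.parser import HTMLParser


class WikitextSketch(HTMLParser):
    """Hypothetical toy converter; not html2wikipedia's implementation."""

    # HTML tag -> wiki markup that wraps the text on both sides
    INLINE = {"b": "'''", "strong": "'''", "i": "''", "em": "''"}
    HEADINGS = {"h1": "=", "h2": "==", "h3": "===", "h4": "===="}

    def __init__(self):
        super().__init__()
        self.out = []  # collected output fragments

    def handle_starttag(self, tag, attrs):
        if tag in self.INLINE:
            self.out.append(self.INLINE[tag])
        elif tag in self.HEADINGS:
            self.out.append("\n" + self.HEADINGS[tag] + " ")
        elif tag == "li":
            self.out.append("\n* ")
        elif tag == "a":
            # External link syntax: [url label]
            href = dict(attrs).get("href") or ""
            self.out.append("[" + href + " ")

    def handle_endtag(self, tag):
        if tag in self.INLINE:
            self.out.append(self.INLINE[tag])
        elif tag in self.HEADINGS:
            self.out.append(" " + self.HEADINGS[tag] + "\n")
        elif tag == "a":
            self.out.append("]")
        elif tag == "p":
            self.out.append("\n\n")  # blank line between paragraphs

    def handle_data(self, data):
        self.out.append(data)


def html_to_wikitext(html):
    """Convert a small subset of HTML to wiki-style markup."""
    parser = WikitextSketch()
    parser.feed(html)
    parser.close()
    return "".join(parser.out).strip()


if __name__ == "__main__":
    print(html_to_wikitext("<h2>Title</h2><p>Some <b>bold</b> text.</p>"))
    # prints:
    # == Title ==
    # Some '''bold''' text.
```

Even this small subset shows why the output of any converter deserves a second look: nested lists, tables and images all need rules of their own, and the sketch ignores every one of them.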

I should mention that it’s not perfect; I wouldn’t blithely slap the results of html2wikipedia straight into a Wikipedia page, mostly because I think the formatting would be off kilter here or there.

But at first glance, it’s certainly in a workable state. The author suggests it should work in Windows too, so if you’re an avid Wiki-gnome (I am not), this might save you time and work in the future.

I don’t see html2wikipedia in either Arch or Debian, but I don’t take the time to go through every distro out there. 😯 Whether it is or isn’t, this is one of those times where it might be quicker and easier to download the source code and build it manually than to download all the other packaging materials that accompany a 59KB executable. 🙄

3 thoughts on “html2wikipedia: Converting back and forth”

  1. darkstarsword

    Wikipedia uses some type of markup format, but it is not markdown. “markup” is just the general term for adding syntax to text to add things like formatting to it, but says nothing about what kind of syntax that is.

    “markdown” is a specific “markup” format (and a pretty good one for many things), but it is not the one used by wikipedia / mediawiki.

    1. K.Mandla Post author

      I couldn’t remember the name of it. 😳 I thought I had read once a long time ago that it was a variation on markdown, which is why I said “markdown-ish,” but I should have looked a little harder for the correct name. 😕

