Tag Archives: download

msdl: Ripping the format of the Evil One

I kid. There’s nothing inherently evil about mms:// format streams. Not that I am aware of, anyway. 😕

And if you should still stumble across an mms:// URL and wish to access it, you might think back to earlier this year, and mimms. For almost a year mimms languished on this blog in a dubious working-or-not-working state, mostly because yours truly, despite razor-sharp Googling skills 🙄, couldn’t find a live, working mms:// stream.

Well all that changed this morning, and I can now vouch for both mimms and … msdl.

2014-11-06-2sjx281-msdl

Better late than never, I suppose. 🙄 In my defense, I’m only partly to blame for that, since Microsoft apparently deprecated mms:// format sometime around a decade ago. So finding a working stream depended on a lot of factors beyond my control.

Aside from all that, msdl seems to work with the same alacrity and wild abandon as mimms. I do notice that they both have speed indicators and basic progress counters, which is good.

mimms adds the .wmv extension to its output, while msdl apparently lets the stream determine its filename. Neither way is an issue for me.

msdl has a healthy number of options available as flags, and I particularly like the speed controls and the verbose option, and the option to stop streaming after a set period of time.

While I’m at it, I should be clear that msdl doesn’t just stream mms:// URLs, but can handle rtsp:// as well as a few specialized formats. So no, K.Mandla is not investigating software that only works with a decades-old, deprecated stream format. 😐

Trust me, I’m a professional. 😉

wikipedia2text: Looking well-preserved, thanks to Debian

Conversion scripts are always good tools to know about, even if I don’t need them frequently enough to keep them installed. wikipedia2text is one that, in spite of its age, still seems sharp.

wikipedia2text

Technically, the script’s name was just “wiki,” and technically the source link listed on the home page is dead. Late in 2005 though, it made its way into Debian, and is still in a source tarball there. So it seems that it is possible to achieve immortality — all you need to do is somehow find your way into Debian. 😉

The script works fine outside of Debian; just decompress it and go. You’ll need to install perl-uri if you’re using Arch. But if you’re in something Debian-ish, it should pull in liburi-perl as a dependency when you install it.

One thing that’s not mentioned outright in the blog post but does appear in the help flag: wikipedia2text will need one of about a half-dozen text-based browsers, to do the actual fetching of the page. I used lynx because … well, just because. Which leads me to this second screenshot.

lynx

At this point I’m wondering if wikipedia2text is an improvement over what a text-based browser can show. After all, lynx is showing multiple colors, uses the full terminal width, and I have the option of following links.

What’s more, wikipedia2text — strangely — offers a flag to display its results in a browser, and in my case it was possible to send the output back into lynx. So if you’re keeping track, I ran a script that called a browser to retrieve a page, then rerouted that page back into the browser for my perusal. 😕 :\

In the absence of any other instruction, wikipedia2text will default to your $PAGER, which I like because mine is set to most, and I prefer that over almost anything else. Perhaps oddly though, if I ask specifically for pager output, wikipedia2text will arbitrarily commandeer less with no option to change that. Without any instruction for a pager, the output is $PAGER. But with the instruction it jumps to less? That’s also a little confusing. …
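The behavior one might expect instead is simple to write down: honor $PAGER whenever it is set, and fall back to less only when it isn’t. Here is a minimal Python sketch of that logic. It is not wikipedia2text’s own code; the function name pick_pager and the choice of less as the fallback are assumptions made for illustration.

```python
import os

def pick_pager(environ=None, fallback="less"):
    """Return the pager to use: $PAGER if it is set and non-empty, else the fallback."""
    env = os.environ if environ is None else environ
    pager = env.get("PAGER", "").strip()
    return pager or fallback

# Whatever the current environment says, with "less" as the last resort.
print(pick_pager())
```

With PAGER=most in the environment, this picks most whether or not pager output was requested explicitly, which is the consistency that seems to be missing.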

Furthermore, I couldn’t get the options for color output to work. And I don’t see a flag or an option to expand the text width beyond what you see in the screenshot, which I believe to be around 80 columns. That alone is almost a dealbreaker for me.

I suppose if I were just looking for a pure text extraction of a page, wikipedia2text has a niche. And it’s definitely worth mentioning that wikipedia2text has a text filtering option with color, which makes for a grep-like effect.

So all in all, wikipedia2text may have a slim focus that you find useful. I might pass it by as an artifact from almost 10 years ago — mostly on the grounds that it has some odd default behavior, and I fail to see a benefit of using this over lynx (or another text-based browser) by itself. 😐

httrack: The website copier

I could have used httrack about four months ago, when I wanted to mirror a fairly large website for my offline perusal, and lacked a proper tool. I tried bew and another graphical webcrawler, and even fell back on wget, but nothing was 100 percent successful. I ended up mass-downloading most of what I needed, and it wasn’t a pretty sight.

httrack might have saved me the trouble, and probably would have done a much better job.

2014-11-04-2sjx281-httrack

httrack is more than capable of patiently stepping through the architecture of a website, and bringing you a copy of everything there.

But on top of that, httrack, like a lot of good network-based software, has so many options, it can be a bit bewildering. If you open the --help flag, be prepared. It’s a couple hundred lines long at least.

For example, there are flags to save files in a cache, to skip files that are available locally, four options for logging, flags to create an index, screen for particular types of files (e.g., HTML only), set directions for following directories (only up or only down), disable bandwidth abuse limits, cap the number of links, continue a broken-off mirror attempt, enter an interactive mode, confine the search to a single site, and dozens upon dozens more.

Most of those other ones are far and beyond anything I would ever need, let alone understand. If you know what they mean, you might find them quite useful. And maybe best of all, httrack has about a dozen shortcuts for common flag combinations, meaning you can ask for just --spider, instead of typing out -p0C0I0t.

The first time you use it, I’d recommend just httrack though, since by itself the command steps you through a simple wizard, letting you pick options menu-style. If you’ve never used httrack before, it’s a good introduction, and will finish with the command line needed to recall the same options you set. Very helpful, if you’re like me and you learn by example. 🙂

Once you get the hang of it, try things like httrack http://example.com -W%v2, which will give you a nice fullscreen progress display and prompt you if it finds any eccentricities. Quite useful.
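If you end up calling httrack from a script rather than by hand, the same command lines are easy to assemble programmatically. What follows is a minimal Python sketch, assuming httrack is on your $PATH. The helper names are made up for illustration, and the only flags used are the ones already shown in this post.

```python
import shlex
import subprocess

def httrack_command(url, *flags):
    """Build the argument list for an httrack run (nothing is executed here)."""
    return ["httrack", url, *flags]

def run_httrack(url, *flags):
    """Run httrack and return its exit status; assumes httrack is installed."""
    return subprocess.run(httrack_command(url, *flags)).returncode

# Print the commands instead of running them, so this is safe to try anywhere.
print(shlex.join(httrack_command("http://example.com", "-W%v2")))
print(shlex.join(httrack_command("http://example.com", "--spider")))
```

Passing the arguments as a list keeps the shell from interpreting anything in the URL, which matters once the addresses come from a file rather than from your own fingers.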

I’m going to go back now and re-mirror the site I mangled back in July, and hope I can get a cleaner, more complete copy. 😉

stftp: The simple terminal FTP client

I’m all about full-screen, intuitive, colorful interfaces to programs. I don’t really care if something has been done before (everything has been done before), unless you’re reinventing the reinventing of the reinventing of the wheel. Just give me a good interface and a decent perspective on the task at hand, and I don’t care if your program was written in 2012 or 1992 — it’ll work for me.

Here’s stftp, which — with the possible omission of color — hits all three of those criteria. Among FTP clients, stftp is in a crowded box. But among full-screen console FTP client applications, it’s standing tall.

2014-10-11-2sjx281-stftp

The bottom line is a status bar. The top is a breadcrumb trail. Everything in between is selectable. Navigation is by arrow keys, with left and right moving you up or down in the tree. Press enter on a folder and you move in, but press enter on a file and you download it. Other one-letter keypresses are for uploading, deleting or filtering.

My trusty memory script says stftp is running on a little over a megabyte of memory, which may or may not vary with the depth of your travels and the lists it needs to manage. Still, +/- 1MB is a svelte number.

I don’t use FTP clients very often, and I know some clients are a lot more feature-full than this (combined FTP and torrent client, anyone?) but I really like stftp for its clean interface and obvious arrangement. I’d jump at the chance to use this over, for example, gFTP, which always irritated me as a rancid excuse for an application. 👿

Now if only we can get some color in stftp. … 😀

gtorrent-ncurses: Not quite ready for prime time

It’s no secret I’ve been an rtorrent fan for nearly a decade now. It has its shortcomings and at times it seems to lack some features that the new kids have. But overall, it has been a reliable standby.

That doesn’t mean it’s the best way of doing things though, and when you stop trying new things, that’s when you get old.

But gtorrent-ncurses — the text-only option to the full gtorrent — might not be the one to take the throne.

2014-09-16-jsgqk71-gtorrent-ncurses

It looks like a good start, and as best I can tell it is actually working. But that interface looks suspiciously broken, and as best I can tell, there are only two controls: “a” for add a torrent, and “q” for quit.

No progress indicator. No bandwidth meters. No throttling controls. No help screens, priority settings, peer lists, sharing ratios, tab completion for adding files … the list goes on.

I only looked briefly at gtorrent’s full graphical interface, so it may be that it’s possible to get those things from the full X-based UI. They are suspiciously missing from the text-only version though, and in this day and age, more than a dozen years after the original BitTorrent, it’s a little hard to overlook.

I’m willing to give gtorrent-ncurses the benefit of a gestation time, and come back to it later. Like I said, it appears to be working, even if the “interface” wasn’t doing much to tell me that. I’ll be back in a little while. 😉

html-xml-utils: A sweet suite

I’m in favor of any tool that can strip away the manure that masquerades as XML files. I have no earthly idea why anyone would use that style or arrangement voluntarily, especially when simpler and cleaner arrangements are so much … cleaner and simpler to work with. :\

So if you hand me a suite of 10 or 12 tools that scrape away at XML and HTML files, I’m like a kid on Christmas Day. Here’s html-xml-utils, which is just a toy box full of goodies. Which unfortunately means I can only show one or two.

hxnormalize, I imagine, improves readability for pages with frequent links. Go from this:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">
<html>
<head>
  <title>Simple page</title>
</head>

<body>

<h1>A simple HTML page</h1>

<p>This is a very simple HTML page, made from scratch for the purpose of testing some <a href="http://www.w3.org/Tools/HTML-XML-utils/man1/" target="_blank">tools</a> in the <a href="http://www.w3.org/Tools/HTML-XML-utils/" target="_blank">html-xml-utils</a> package.

</body>
</html>

to this:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "">

<html>
  <head>
    <title>Simple page</title>
  </head>

  <body>
    <h1>A simple HTML page</h1>

    <p>This is a very simple HTML page, made from scratch for the
      purpose of testing some <a
      href="http://www.w3.org/Tools/HTML-XML-utils/man1/"
      target="_blank">tools</a> in the <a
      href="http://www.w3.org/Tools/HTML-XML-utils/"
      target="_blank">html-xml-utils</a> package.</p>
  </body>
</html>

Not only does every line break at a link, which makes them easy to spot, but some closing tags have been corrected, because I gave hxnormalize the -x flag.

I can re-use my example with hxprintlinks, which will number every link in the document, and add a reference list at the bottom of the page.

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">
<html>
<head>
  <title>Simple page</title>
</head>
<body>
<h1>A simple HTML page</h1>
<p>This is a very simple HTML page, made from scratch for the purpose of testing some <a href="http://www.w3.org/Tools/HTML-XML-utils/man1/" target="_blank">[1]tools</a> in the <a href="http://www.w3.org/Tools/HTML-XML-utils/" target="_blank">[2]html-xml-utils</a> package.

<ol>
<li>http://www.w3.org/Tools/HTML-XML-utils/man1/</li>
<li>http://www.w3.org/Tools/HTML-XML-utils/</li>
</ol>
</body>
</html>

Of course, pipe hxnormalize into hxprintlinks, and some of that will be cleaned up a little. 😉

If you remember xidel or xmlstarlet, you might remember how it’s possible to pull single elements out of an XML file, for further editing. hxextract can do that, and here are the results of hxextract command .config/openbox/rc.xml on my system:

kmandla@6m47421: ~/downloads$ hxextract command rc.xml 
<command>gmrun</command><command>urxvtc -e alpine -d 0</command><command>urxvtc -e wicd-curses</command><command>urxvtc -g 142x60 -e /home/kmandla/.scripts/mc.sh</command><command>/home/kmandla/.scripts/cleanup.sh</command><command>urxvtc -e htop</command><command>urxvtc -e alsamixer</command><command>/home/kmandla/.scripts/volume.sh</command><command>urxvtc -e alsamixer -D equal</command><command>urxvtc -g 142x60 -e elinks</command><command>/home/kmandla/.scripts/browser.sh</command><command>urxvtc -g 35x9 -e tty-clock -x -t -B</command><command>urxvtc -g 24x12 -e clockywock</command><command>urxvtc -e vim</command><command>urxvtc -e sc</command><command>urxvtc -e wyrd</command><command>urxvtc -e tudu</command><command>urxvtc -e mocp</command><command>pidgin</command><command>urxvtc -g 80x24 -title rhapsody -e /home/kmandla/.scripts/chatnews.sh</command><command>urxvtc</command>

Not pretty, but a step forward in terms of finding miscreant keyboard commands in my rc.xml file. 😐
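For comparison, the same kind of extraction can be approximated with nothing but Python’s standard library, printing one command per line. This is only a sketch of the idea, not how hxextract works internally. The function name is invented, and the sample below is a made-up fragment in the general shape of an Openbox rc.xml, with hypothetical key bindings.

```python
import xml.etree.ElementTree as ET

def extract_elements(xml_text, name):
    """Return the text of every element whose local tag name matches `name`."""
    root = ET.fromstring(xml_text)
    found = []
    for element in root.iter():
        # Namespaced tags look like "{namespace}name"; keep only the local part.
        if isinstance(element.tag, str) and element.tag.rsplit("}", 1)[-1] == name:
            found.append((element.text or "").strip())
    return found

SAMPLE = """<openbox_config xmlns="http://openbox.org/3.4/rc">
  <keyboard>
    <keybind key="A-F2"><action name="Execute"><command>gmrun</command></action></keybind>
    <keybind key="W-h"><action name="Execute"><command>urxvtc -e htop</command></action></keybind>
  </keyboard>
</openbox_config>"""

for command in extract_elements(SAMPLE, "command"):
    print(command)
```

Matching on the local name sidesteps the XML namespace, which is usually what trips people up the first time they point ElementTree at a real configuration file.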

There is a lot more — a lot more — available in html-xml-utils that I just don’t have the time and resources to touch on. Look for tools that will convert from XML to asc files, tools that will build tables of contents and bibliographies for entire trees of files, and even a few that transpose tables or just pull out links. That one, hxwls, is mighty clever. …
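To give a feel for what a link lister does, here is a rough Python equivalent built on the standard library’s HTML parser. It is a sketch of the idea only: it looks at nothing but the href of anchor tags, which is an assumption made to keep it short, and the class and function names are invented. The test page is trimmed from the example used earlier in this post.

```python
from html.parser import HTMLParser

class LinkLister(HTMLParser):
    """Collect the href value of every <a> tag, in document order."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def list_links(html_text):
    """Return all anchor targets found in an HTML string."""
    parser = LinkLister()
    parser.feed(html_text)
    parser.close()
    return parser.links

PAGE = """<p>This is a very simple HTML page, made from scratch for the purpose of testing some
<a href="http://www.w3.org/Tools/HTML-XML-utils/man1/" target="_blank">tools</a> in the
<a href="http://www.w3.org/Tools/HTML-XML-utils/" target="_blank">html-xml-utils</a> package."""

for link in list_links(PAGE):
    print(link)
```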

I leave it to you to explore the rest of that suite. If you’re like me and can only scratch your head at the ascent of XML as a data format, this will be fun for you to play with.

Oh, and I almost forgot: Theodore gets credit for mentioning this one. Thanks, Theodore. 😉

gplayer: Get loud with the cloud

It seems cloud-based or Internet-heavy tools are the choice of the gods of shuf today, since the second title for this sunny Saturday is a CLI-based interface for Grooveshark, the online streaming audio service.

2014-09-06-6m47421-gplayer

Like I said earlier this week, I’m in favor of any utility that strips away the worthless scum that coats most Internet services, and I list the noxious, fetid remains of Flash technology among that. So from the start, gplayer wins points for allowing me to sidestep the standard Grooveshark player.

That said, gplayer doesn’t reach the same degree of finesse that soundcloud2000 did. It is worth remembering that what you see in the screenshot is accomplished in approximately 60 lines of code … of course, allowing for the fact that mplayer, that seven-headed-ten-horned beast of media playback, is doing all the heavy work.

It’s still impressive though. gplayer gives you a search function that mimics Grooveshark, returns a list of 20 results, and allows you to cue any of the titles that are listed. From there, mplayer takes over, using its keypresses as controls and its frame progress counter as an onscreen display.

I’ve found a few small incongruities in gplayer, and I’ll note them here just as a matter of record. For one, when a song ends either because it’s over or because the listener gave “q” to mplayer, gplayer never recovers its prompt. I get a dull cursor without any text, and short of pressing CTRL+C, gplayer seems to have stalled.

I don’t think that was what the author intended, since it would make more sense to me, as a casual user, to either get a new search prompt, a prompt to cue another title, a repeat of the previous list, or just be dropped to the shell. As it is, I’m lost somewhere between mplayer finishing and gplayer recovering.

Second, it’s fairly easy to send gplayer into a tailspin over the selection. Any non-numeric character will cause an error, and any out-of-range number will cause an error. It’s just an issue of trapping those entries and preventing gplayer from exploding across the screen.
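Trapping those entries doesn’t take much. The following is a minimal Python sketch of the kind of check that would do it. It is not gplayer’s code, the function name is invented, and it assumes the result list is numbered from 1, which may not match what gplayer actually shows.

```python
def parse_choice(raw, count):
    """Return a zero-based index for a menu entry in 1..count, or None if invalid."""
    raw = raw.strip()
    # Reject empty input, letters, signs, decimals and non-ASCII digits.
    if not (raw.isascii() and raw.isdigit()):
        return None
    number = int(raw)
    # Reject anything outside the list that was actually displayed.
    if not 1 <= number <= count:
        return None
    return number - 1

# A few entries against a list of 20 results.
for entry in ["3", "abc", "21", "0", ""]:
    print(repr(entry), "->", parse_choice(entry, 20))
```

A caller can simply ask again whenever it gets None back, instead of letting the error reach the screen.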

I can stop there since that’s about the limit of gplayer’s functions. If you’re willing to hold its hand for a little bit, and if you can find a way to cue up several songs in a row, and if you’re a fan of Grooveshark in the first place, you’ll probably find a place in your heart for it.

Oh, and I think the author should follow soundcloud2000’s lead, and subtitle gplayer as “Grooveshark without all the stupid css.” Or maybe “without all the stupid Flash.” 😉

googlecl: Cutting corners with everything Google

My office uses Google Documents for almost everything it does. We all have GMail addresses and even our primary site is managed through Google, although the intricacies escape me.

I concede that it does streamline some things, but only because I have to. I’m still no fan of the cloud, and I never have been, and probably never will be.

Having said all that, I can see where googlecl would be very, very useful in our office for bulk management of e-mail lists or contact information. Just as a very brief example:

2014-09-06-6m47421-googlecl-01 2014-09-06-6m47421-googlecl-02

That’s the same example that appears on the home page, so I suppose pixellating much of those images was unnecessary. All the same, I think you should get the point. With something as simple as google contacts add and a little data, I get a corresponding addition to my online Contacts list.

Which is what you would probably expect. And it likewise goes without saying that googlecl can handle not only GMail Contacts, but also Blogger posts, YouTube uploads, Calendar events, additions and edits to Documents, and just about every other aspect of your collective Goo-perience, from the command line.

I can’t go into too much detail on individual commands and configuration, mostly because each Google aspect has its own rasher of options and specifics. If you’re genuinely interested (and again, for my daily workload I already see a few places this can be useful), you’ll need to look closely on your own.

Probably the one thing I like best about googlecl is what you see in the terminal screenshot above: Rather than require a configuration file setup, googlecl simply links you to the API authentication page, and prompts you for the passcode. It does save a step, and gets you moving a little faster with the entire Goo-perience.

And I tip my hat to that. I never have and never will concede my own private and personal information to The Almighty Cloud, and have serious worries on behalf of anyone who does. But I’m taking this to work on Monday, and seeing if it will help cut a few corners. 😐

mps-youtube: All-in-one search, download and play

I had mps-youtube on my list as a “YouTube downloader,” which I realize now was only partly right. Saying mps-youtube is a downloader is like saying a car keeps you dry when it rains — not only is that incomplete, but it’s not really its true purpose.

2014-09-04-6m47421-mps-youtube-01 2014-09-04-6m47421-mps-youtube-02 2014-09-04-6m47421-mps-youtube-03

mps-youtube — which will probably install as mpsyt — can perform simple searches or skim through online playlists, pull down video information and comments (I can’t imagine who would torture themselves with that, though), save and edit local playlists, playback video or audio, download best-quality versions of a link, corroborate its results against MusicBrainz, and a lot more.

I daresay that mps-youtube is what yaydl and yougrabber hoped to be, and what youtube-viewer should have been. In fact, I could suggest mps-youtube if what you really want from youtube-dl is a proper user interface.

Bonus points for excellent use of color, snazzy ASCII graphics title screen, helpful prompts and onboard documentation, and a screen-conscious arrangement.

Nota benes for using Python, which might slow down on some of your older hardware, and relying on mplayer (or mpv) for playback. Then again, if your machine is too weak to handle either of those for audio playback, you probably won’t be using it for mps-youtube.

Downsides, of course, would be that as far as I can tell, it’s a YouTube-only program.

But that’s probably the way it was intended originally, and I can’t fault a program for fulfilling its original intent. And doing so with style deserves a gold: ⭐ Enjoy! 😉

soundcloud2000: A quick listen

I like the way the soundcloud2000 home page describes the utility: “SoundCloud without the stupid css files.” 😁

I’m in favor of any tool that strips away the worthless lard that clings to most Web services, and I’ve been stuck in that mindset for the better part of the past decade. The sooner the general public realizes that lightbox effects and Web2.0 tripe are lipstick on pigs, the better off we’ll all be.

That’s not particularly aimed at SoundCloud; to be honest, I have very, very few dealings with SoundCloud as a whole, just by virtue of my relative disinterest. Listening to random recordings by random people in random places around this random world strikes me as rather … random. 🙄

All the same, if I was stapled to a chair and forced to pick through it, I would much rather have the benefit of soundcloud2000 than be required to navigate through its web interface.

2014-09-03-6m47421-soundcloud2000-01 2014-09-03-6m47421-soundcloud2000-02

Simple enough: Up, down, paging keys and return to listen. Left and right to seek, space to pause. The “u” key gives you user access.

And that’s about it. When they invent photos that can come embedded with sound (note to self: It can be done) maybe I’ll have more to exhibit. As it is now, I give soundcloud2000 a great big thumbs-up for good use of color, intuitive navigation and a snazzy opening screen. Have a star: ⭐

And best of all, no stupid css files. 😀