Google Reader, employing Google’s petabytes of storage, archives every feed item it’s ever pulled for you. This has always amazed me, as I’m sure I and everyone else must be using far more in Reader than the 5 gigs we get from Gmail. Still, they don’t have much of a choice; it wouldn’t do anybody any good if you could only see the 10 or 20 items present in a feed’s XML file at any given time. And even though they’re probably clever enough to store only one copy of each item for that item’s hundreds of thousands of readers, they’ve practically built a third copy of the internet (after their cache).
A nice fallout of this archiving is that whenever content you’ve subscribed to disappears from the web, you’ll still be able to access its (admittedly homogenized) Reader copy, forever; “forever” here meaning “presumably for as long as Google is around.” When (if?) Google dies, will its data die with it? Despite my intuition that Google will long outlast current notions of what computers are and how they work, I still don’t like entrusting important data to other people, not to mention data that is accessible only through the web. I want a local copy.
But they don’t make it easy for you. Reader is all AJAXed out, so even simple page saves don’t work. Copying/pasting would be a nightmare. Screenshots? Too sloppy. Emailing copies of each item? Too time-consuming. Tagging them with a special tag, making that tag’s feed public, then subscribing in, like, Thunderbird or something? Even if that weren’t absurdly roundabout, the public feeds only have twenty or so items.
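If you want to see for yourself how few items a feed file actually carries, here is a minimal Python sketch that counts them. It assumes you’ve already saved the feed’s XML to a string or file yourself; the function name `count_feed_items` and the tiny sample feed are made up for illustration, and it only looks for top-level Atom `entry` elements and RSS `channel/item` elements.

```python
import xml.etree.ElementTree as ET

# Namespace used by Atom feeds (RFC 4287).
ATOM = "{http://www.w3.org/2005/Atom}"


def count_feed_items(xml_text: str) -> int:
    """Count the items in a feed document (hypothetical helper).

    Handles the two common shapes: Atom (<feed><entry>...) and
    RSS 2.0 (<rss><channel><item>...).
    """
    root = ET.fromstring(xml_text)
    atom_entries = root.findall(f"{ATOM}entry")
    rss_items = root.findall("./channel/item")
    return len(atom_entries) + len(rss_items)


# Made-up sample feed with two entries, standing in for a saved feed file.
SAMPLE_FEED = """<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example feed</title>
  <entry><title>First post</title></entry>
  <entry><title>Second post</title></entry>
</feed>"""

if __name__ == "__main__":
    print(count_feed_items(SAMPLE_FEED))
```

Run against a real saved feed, the number it prints is all that a plain feed reader could ever pull from that file, no matter how long the blog has been around.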
I’m talking specifically about a blog I loved, but that up and disappeared one day, completely, leaving the only copies of the lost data scattered throughout Netvibes, Newsgator, Bloglines, and Reader. Google searches turned up nothing like a straightforward guide to saving from Reader, which surprised me. But there were clues, and using only a couple of tools, I finally got it. It’s actually pretty easy: I was able to save 118 items in about ten minutes with this method. Let me show you it.
You need Firefox, the two extensions Greasemonkey and ScrapBook, and the Greasemonkey script Google Reader Print Button. Then it’s just a matter of clicking “Print” for each item you want to save, which opens it in its own tab. Once they’re all open, use ScrapBook’s “Capture All Tabs…” function, which automatically does a “Save Page As, Web Page, complete” into your %AppData% folder for each tab. Finally, and optionally, use ScrapBook’s “Combine Wizard” (in the tools menu of the ScrapBook sidebar [Alt+K]) to put all the items into a single folder with a single index.html file.
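If you’d rather build that single index yourself, the idea can be sketched in a few lines of Python. This is not what the Combine Wizard does internally, just a rough stand-in. It assumes each captured page sits in its own subfolder containing an index.html; check your own capture folder before trusting that. The names `build_index`, `page_title`, and the output file `all-items.html` are all made up for this example.

```python
import html
import re
from pathlib import Path

# Loose match for a page's <title>; good enough for saved pages, not a full HTML parser.
TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def page_title(page: Path) -> str:
    """Return the page's <title> text, or its folder name if no title is found."""
    text = page.read_text(encoding="utf-8", errors="replace")
    match = TITLE_RE.search(text)
    if match:
        title = " ".join(html.unescape(match.group(1)).split())
        if title:
            return title
    return page.parent.name


def build_index(capture_root: Path, out_name: str = "all-items.html") -> Path:
    """Write one HTML file linking to every <subfolder>/index.html under capture_root.

    Assumed layout (verify against your own folder): one subfolder per capture.
    """
    rows = []
    for page in sorted(capture_root.glob("*/index.html")):
        href = page.relative_to(capture_root).as_posix()
        rows.append(
            f'<li><a href="{html.escape(href)}">{html.escape(page_title(page))}</a></li>'
        )
    out = capture_root / out_name
    out.write_text(
        '<!DOCTYPE html>\n<html><head><meta charset="utf-8">'
        "<title>Saved Reader items</title></head>\n<body><ul>\n"
        + "\n".join(rows)
        + "\n</ul></body></html>\n",
        encoding="utf-8",
    )
    return out
```

You’d call it as `build_index(Path("path/to/your/captures"))`, pointing it at wherever your captures actually live. It only writes a list of links next to the captured folders; it doesn’t move or merge the pages themselves.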
The “printing” part is the most cumbersome, but goes by pretty quickly with the repetition of a series of clicks and keystrokes:
- Click “Print”
- Press Esc (to close the print dialogue)
- Press Ctrl+Tab (to get back to Reader)
- Press J (to go to the next feed item)
Do that mindlessly for a couple minutes, and they’ll all be there, waiting to be saved. I’m gonna put the word “disk” in here too so that anybody Googling for a solution might find this.