Technology consultancy Arc90 has released a simple tool that works in any modern web browser and makes reading online a whole new experience. The tool is called Readability, and it performs one task: it removes the clutter from almost any web page, leaving only the featured content. The resulting page is cleanly formatted and easy to read. You even get to choose the fonts, margin spacing and general layout.
Readability is extremely easy to install and use. Follow the steps on Arc90’s installation page (watch the video first, if you like) to place the bookmarklet on your browser’s toolbar. Then, whenever you’re on a page that you want to really read – not just skim – click the bookmarklet and Readability de-clutters the page. It can even remove the distraction of inline text links by moving them all to footnotes at the bottom of the content. When you’re done, click the “Reload Original Page” button and the page is restored to its original state.
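For the curious, the link-to-footnote idea can be sketched in a few lines of Python. This is only an illustration of the concept, not Arc90's code: the function name `links_to_footnotes` and the regular expression are assumptions made for this example, and the pattern only handles simple, double-quoted `href` attributes.

```python
import re

# Hypothetical simplification: matches <a ... href="...">text</a> with a
# double-quoted href. Real pages need a proper HTML parser.
LINK_RE = re.compile(
    r'<a\s+[^>]*?href="([^"]*)"[^>]*>(.*?)</a>',
    re.IGNORECASE | re.DOTALL,
)

def links_to_footnotes(html):
    """Replace each inline link with its text plus a [n] marker.

    Returns the rewritten text and the list of link targets, where
    marker [n] refers to footnotes[n - 1].
    """
    footnotes = []

    def replace(match):
        footnotes.append(match.group(1))          # remember the URL
        return f"{match.group(2)}[{len(footnotes)}]"  # keep the link text

    body = LINK_RE.sub(replace, html)
    return body, footnotes

body, notes = links_to_footnotes(
    'Read <a href="http://example.com/a">this post</a> first.'
)
print(body)
for number, url in enumerate(notes, start=1):
    print(f"[{number}] {url}")
```

The text stays readable in place, and the URLs are collected so they can be listed once at the bottom of the content.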
The developer of Readability, Richard Ziade, was interviewed recently on Rebooting the News, where he explained that developing the technology to correctly identify the featured content on a page and remove everything else was much more difficult than it looks. He started the project in his spare time to meet his own need to reduce the level of distraction that he knew was interfering with his online reading comprehension. The whole program is well worth listening to.
Our special guest is the developer of the Readability plug-in, Richard Ziade. He’s a partner in arc90, a strategic consulting and software development firm. Recently, his product was in the news because Apple’s Safari browser incorporated it, as Dave explained in a post at scripting.com. (It would be a good idea to read that post before listening.)
It’s a great tool and I’ve been using it a lot, but here’s a question that needs to be asked of readers: If you get to control the viewing experience and choose to ignore the ads and other bumf (and who wouldn’t?), what responsibility do you have to replace the revenue that those ads bring in for the publisher? And conversely, if readers are voting with their feet and turning off the ads, how can publishers change their content and revenue models in order to attract readers who are willing to support them? These are questions that all media companies have been grappling with, and programs like Readability (and the new Safari 5 browser, which has similar functionality built in) simply shine a brighter light on how our online consumption has changed the media landscape.
Yeah, the algorithm for weighting the objects based on inner tag types is really impressive. I wish I had thought of that :).
This is also a godsend for web scraping. I've implemented it in several projects after porting it. Props to Arc90.
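The tag-weighting idea mentioned in the comments above can be illustrated with a rough Python sketch. It is not Readability's actual algorithm or a port of it: the weights in `TAG_WEIGHTS`, the `ContainerScorer` class and the `best_container` helper are all made up for this example, and only `<div>` containers are scored. The point is simply that containers full of paragraphs score high, while containers full of links and forms score low.

```python
from html.parser import HTMLParser

# Illustrative weights only; assumptions for this sketch, not Readability's values.
TAG_WEIGHTS = {"p": 3, "blockquote": 2, "pre": 2, "li": -1, "a": -2, "form": -3}

class ContainerScorer(HTMLParser):
    """Scores each <div> by the types of tags found directly inside it."""

    def __init__(self):
        super().__init__()
        self.stack = []   # ids of the <div> containers currently open
        self.scores = {}  # container id -> accumulated score

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            # Use the id attribute if present, otherwise a generated name.
            container_id = dict(attrs).get("id") or f"div-{len(self.scores)}"
            self.scores[container_id] = 0
            self.stack.append(container_id)
        elif self.stack:
            # Credit (or penalise) the innermost open container.
            self.scores[self.stack[-1]] += TAG_WEIGHTS.get(tag, 0)

    def handle_endtag(self, tag):
        if tag == "div" and self.stack:
            self.stack.pop()

def best_container(html):
    """Return the id of the highest-scoring <div>, or None if there is none."""
    scorer = ContainerScorer()
    scorer.feed(html)
    if not scorer.scores:
        return None
    return max(scorer.scores, key=scorer.scores.get)

page = (
    '<div id="nav"><a href="/">Home</a><a href="/about">About</a></div>'
    '<div id="article"><p>First paragraph.</p><p>Second paragraph.</p>'
    '<blockquote>A quote.</blockquote></div>'
)
print(best_container(page))
```

A real implementation would also look at text length, punctuation and class names, but even this crude scoring separates a navigation block from an article body.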