Are you a developer? Do you write Python? Great! We’ve made some big changes and additions to the Readability Python library. We’ve simplified interactions with the Reader API and added support for the Parser API.
First, The (small bit of) Bad News
With this new release, we’re dropping Python 2.5 support. Hopefully, this isn’t a shock. We’d rather put our effort toward supporting Python 3 than worry about supporting past versions of Python.
The Important Stuff
The philosophy behind simplifying the code was to bring HTTP to the surface. We ditched the idea of models and now expose the raw JSON from the server as dicts. Every call returns an instance of httplib2’s Response class with one slight modification: each instance has an added ‘content’ attribute that contains the server’s JSON response, decoded with the standard library’s json.loads. This gives you direct access to the data in a convenient dict.
from readability import ReaderClient
rdb_client = ReaderClient('my_reader_token', 'my_reader_secret', 'user_key', 'user_secret')
bookmarks_response = rdb_client.get_bookmarks(favorite=True)
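To make the idea of a response with an added ‘content’ attribute concrete, here is a minimal sketch that runs without network access. It is not the library’s actual code: FakeResponse, attach_content, and the shape of the JSON payload are all hypothetical stand-ins. The only things taken from the description above are that the response behaves like httplib2’s Response (a dict of headers with a status attribute) and that ‘content’ holds the output of json.loads.

```python
import json


class FakeResponse(dict):
    """Hypothetical stand-in for httplib2's Response: a dict of headers plus a status."""

    def __init__(self, headers, status):
        super().__init__(headers)
        self.status = status


def attach_content(response, raw_body):
    """Decode the JSON body and hang it on the response as a 'content' attribute."""
    response.content = json.loads(raw_body)
    return response


# Assumed payload shape, for illustration only; the real API response may differ.
raw_body = '{"bookmarks": [{"id": 1, "favorite": true}]}'

resp = attach_content(
    FakeResponse({"content-type": "application/json"}, 200),
    raw_body,
)

# HTTP details stay on the surface, and the data is a plain dict.
status = resp.status
content_type = resp["content-type"]
first_bookmark = resp.content["bookmarks"][0]
```

With a real client, the same pattern would presumably apply to bookmarks_response: check the status first, then read the decoded data from bookmarks_response.content.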
If you used past versions of the library, you might notice that XAuth still exists but OAuth is gone. The OAuth implementation was never fully baked. Instead of trying to maintain our own, we decided to let the community lean on the great libraries that are already out there. SimpleGeo’s oauth2 library is one of them. If you’re using Django, we added Readability support to django-social-auth to help with the process.
Parser API Support
Recently, all Readability users were given access to the Parser API.
After the Readability package is installed, getting parsed content is as easy as:
from readability import ParserClient
parser_client = ParserClient('your_parser_token')
parser_response = parser_client.get_article_content('http://www.some-web-page/blog.html')
"content": "I'm idling outside Diamante's, [snip] ...",
"author": "Rafi Kohan",
"short_url": "http://rdd.me/g3jcb1sr", ...
Note that the use of ‘content’ looks a bit redundant. It’s necessary because the decoded response lives in an attribute called ‘content’, and the article body is returned under the key ‘content’.
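The double ‘content’ is easier to see in code. The sketch below does not call the Parser API; it builds a hypothetical stand-in for parser_response from the sample fields shown above, assuming only that the response exposes the decoded JSON through a ‘content’ attribute.

```python
import json
from types import SimpleNamespace

# Sample fields copied from the example output above (other fields omitted).
raw_body = json.dumps({
    "content": "I'm idling outside Diamante's, [snip] ...",
    "author": "Rafi Kohan",
    "short_url": "http://rdd.me/g3jcb1sr",
})

# Hypothetical stand-in for the object returned by get_article_content().
parser_response = SimpleNamespace(status=200, content=json.loads(raw_body))

# First 'content': the attribute holding the decoded JSON dict.
article = parser_response.content

# Second 'content': the key holding the article body itself.
article_body = article["content"]
author = article["author"]
short_url = article["short_url"]
```

Binding parser_response.content to a local name such as article keeps the second lookup readable.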
Take a look at the code on GitHub. Feel free to submit feedback, feature requests, contributions, and bug reports. Don’t forget to let us know if you build something!