Friday, March 23

Hacking blogger's atom.xml feed

So, for the past couple of days, one of my projects has been to figure out why the atom.xml feed for Goodnight Burbank was not listing all 30 of the episodes - we could only see 25, from the most recently modified post down to the 25th most recently modified. This only seemed to start with the advent of the new blogger - so I thought other people must have run into this problem before.

Surprisingly, I discovered that people do not seem to be having this problem - or at least they are not asking about it. For instance:
  1. When I contacted my friendly neighborhood blogger support staff, they pointed me to the google Reader and proved to me that the feed was actually fully there (which, when I looked, it was!). So, obviously, there was nothing wrong - even if I and other feedreaders could not see past the 25th most recently modified post.
  2. When I contacted my engineering friends at google, I was told that "Feeds are not supposed to have all of the episodes/posts on a single feed. Usually atom feeds only have 10 posts/episodes. You are lucky to have 25!" And then when I pointed to the episodes I saw in the google Reader, I was told that, more than likely, the Reader keeps a cache and had already indexed the older posts.
This had me frustrated. Obviously, the constraints of programming were out of my hands, since blogger is being developed and maintained by engineers at google. So, taking one friend's advice, I decided to begin the process of migrating the site to a WordPress installation.

Wordpress.com or my own?
This was a real question. The hosted version of Wordpress is managed by others, which is convenient - but I noticed that modifying templates and other aspects would be a bit of a problem (and I was not sure how much it would cost). So, I downloaded a version of Wordpress 2.1.2 and tried the install.

In following the famous "5 Minute Install Instructions", I learned that the zipped version of the file seemed to be missing the special pages (like wp-config-sample.php and install.php), so I then downloaded the .tar.gz version (fortunately, I was on a Linux server). And, true to the word, once all was in place - it was much less than 5 minutes.

Migrating from blogger to Wordpress
Now, to migrate off of the blogger platform and onto the new Wordpress install. But, how to do it? That took a little bit of detective work, and I found Ady's plugin. I tried to follow the instructions, but could not figure out what to do next after I had installed and activated it.

Well, being an old hacker, I accidentally found the "Plugin Editor" and started reading the code - and discovered the Management options that were now added to my Wordpress installation. So, I went to it and, after reading Ady's post with greater understanding, I brought down the old blog. But, what surprised me was that the script found all 30 posts, not just the 25 that I could see on the atom.xml feed. This was puzzling.

So, again, I went back into the code and discovered an interesting snippet that Ady had recovered from somewhere - and it did not look at all like what I expected from google. The link now says:

http://goodnightburbank.blogspot.com/feeds/posts/full?alt=rss&max-results=&start-index=1

where Ady's code manually figured out the maximum number of posts by paging through the XML feed itself with a loop. Interestingly, I had seen a similar feed link when I looked at my google Reader subscription; I just thought it was a translator for the atom.xml request.
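For the curious, here is roughly how I picture that kind of paging loop. This is my own minimal sketch in Python, not Ady's actual code (I am not reproducing the plugin here). The only things taken from the link above are the alt, max-results and start-index parameters; the function names, the page size of 25, and the idea of stopping when a page comes back short are all my assumptions.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Feed base taken from the link above; everything else here is illustrative.
FEED_BASE = "http://goodnightburbank.blogspot.com/feeds/posts/full"


def build_feed_url(start_index, max_results, base=FEED_BASE):
    """Build one paged feed URL from the parameters seen in the link above."""
    query = urlencode({
        "alt": "rss",
        "max-results": max_results,
        "start-index": start_index,
    })
    return f"{base}?{query}"


def count_items(rss_text):
    """Count the <item> elements in one RSS page."""
    root = ET.fromstring(rss_text)
    return len(root.findall("./channel/item"))


def count_all_posts(fetch, page_size=25):
    """Request page after page until one comes back short; return the total.

    `fetch` is a hypothetical callable that takes a URL and returns the
    RSS document as text or bytes.
    """
    total = 0
    start = 1  # start-index is 1 in the link above, so assume 1-based
    while True:
        found = count_items(fetch(build_feed_url(start, page_size)))
        total += found
        if found < page_size:  # a short (or empty) page means we hit the end
            return total
        start += page_size
```

To try it against a live feed, something like `count_all_posts(lambda url: urllib.request.urlopen(url).read())` should do, assuming the feed still answers to these parameters. Once you know the total, you can put it into max-results and ask for everything in one request.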

Help Pages?
Where was this documented? I spent over three weeks working through this problem. And, when I scoured the blogspot help pages, there was nothing listed - certainly not the new feed format shown above.

Well, to give credit where credit is due, I must point to Sumesh of Digital Dreams for his post on Hacking Blogger, which was the first post I saw that explained the new format.

Note to Blogger Team - can you please publish these APIs (name/value pairs) in a way that is easier to find and reference? Especially since I would rather not have to manually change the number of posts on my feed request when I submit to feedburner for Goodnight Burbank.

Conclusion - sticking with blogger...for now
So, when all was said and done, it was easier to stick with blogger, and make sure that the feed I am feeding to feedburner is correctly modified, than to set up the Wordpress site. Will I change over some other time? Maybe. But not tonight. I have more work to do.
