Week 3 of the Summer of Code has been by far the most productive week yet. The main focus of this week was to parse the AtomPub data into a PHP array, and I’m pleased to say this was a success.
The XML Parser
I started the week off by writing the custom XML parser I talked about last week. To do this, I researched several different methods for utilizing PHP’s xml_parse function. Since the parsing occurs on an established standard where the tag names will not change on me, I decided to parse the tags with a switch on the tag name. This worked well until I started running into nested tags, where the same tag name can mean different things depending on its parent. That problem was quickly resolved with a few class variables that keep track of where the parser is in the document.
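For readers curious what a tag-name switch with class variables can look like, here is a minimal sketch. It is not the importer’s actual code: the class name AtomEntryParser, the array keys, and the handful of tags handled are all assumptions made for illustration. The class variables $entry and $inAuthor are what keep the feed-level title and the nested author name from being confused with the post’s own fields.

```php
<?php
// Hypothetical sketch; class name, keys, and handled tags are illustrative only.
class AtomEntryParser
{
    private $entries = array();  // finished entries
    private $entry = null;       // entry being built, or null outside <entry>
    private $inAuthor = false;   // true while inside a nested <author>
    private $text = '';          // character data collected for the current tag

    public function parse($xml)
    {
        $parser = xml_parser_create();
        // Keep tag names as written instead of upper-casing them.
        xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
        xml_set_element_handler($parser, array($this, 'startTag'), array($this, 'endTag'));
        xml_set_character_data_handler($parser, array($this, 'characterData'));
        // xml_parse returns 1 on success and 0 on a parse error.
        return xml_parse($parser, $xml, true) ? $this->entries : false;
    }

    public function startTag($parser, $name, $attrs)
    {
        $this->text = '';
        switch ($name) {
            case 'entry':
                $this->entry = array('title' => '', 'content' => '', 'author' => '');
                break;
            case 'author':
                $this->inAuthor = true;
                break;
        }
    }

    public function endTag($parser, $name)
    {
        switch ($name) {
            case 'title':
            case 'content':
                // Ignore the feed-level <title>, which sits outside any <entry>.
                if ($this->entry !== null) {
                    $this->entry[$name] = trim($this->text);
                }
                break;
            case 'name':
                // <name> only counts as the author when nested inside <author>.
                if ($this->entry !== null && $this->inAuthor) {
                    $this->entry['author'] = trim($this->text);
                }
                break;
            case 'author':
                $this->inAuthor = false;
                break;
            case 'entry':
                $this->entries[] = $this->entry;
                $this->entry = null;
                break;
        }
        $this->text = '';
    }

    public function characterData($parser, $data)
    {
        // Character data may arrive in several chunks, so append rather than assign.
        $this->text .= $data;
    }
}
```

Calling parse() on a feed string returns an array with one element per entry, or false if the XML could not be parsed.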
Missing Data
Once the XML was in a parsed array, I began looking over the array and envisioned how this data would import into WordPress. During this process, I realized the AtomPub feed was missing two bits of key data: the draft status of posts and a list of trackbacks. I immediately began looking into possible workarounds.
While I investigated solutions, my mentor Lloyd discussed the missing data with Six Apart. We were assured support for app:draft was on their to-do list, but they did not commit to any date for availability. So, Lloyd gave me the go-ahead for workarounds.
To solve the draft problem, I ended up creating a 404 checker. Assuming that drafts are never published, the URL for a draft post should return a 404. Knowing this, I loop through the post URLs during the import and check each HTTP status code. The workaround certainly isn’t the best, as it’s resource-intensive (one extra HTTP request per post), but for the time being it works.
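A rough sketch of the 404 idea follows. The function names last_status_code and looks_like_draft are hypothetical and not taken from the importer. The sketch assumes PHP 7.1 or later (for the context argument of get_headers) and that allow_url_fopen is enabled. A real WordPress plugin would more likely go through WordPress’s own HTTP functions.

```php
<?php
// Hypothetical helpers; names are illustrative, not from the importer itself.

// Return the last HTTP status code found in a list of response header lines,
// or 0 if there is none. Redirects add extra status lines, so the last one wins.
function last_status_code(array $headers)
{
    $code = 0;
    foreach ($headers as $line) {
        if (is_string($line) && preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $line, $m)) {
            $code = (int) $m[1];
        }
    }
    return $code;
}

// Guess whether a post is a draft: an unpublished post's URL should answer 404.
// Returns true or false, or null when the request itself failed.
function looks_like_draft($url)
{
    // A HEAD request asks for the headers only and skips the page body.
    $context = stream_context_create(array('http' => array('method' => 'HEAD')));
    $headers = @get_headers($url, false, $context);
    if ($headers === false) {
        return null; // network failure: status unknown
    }
    return last_status_code($headers) === 404;
}
```

Keeping the header parsing separate from the network call makes the status logic easy to check without hitting a live blog.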
After fixing the draft problem, I looked into solutions for the missing trackbacks. I found this function on TypePad’s XML-RPC developer site; however, my attempts to implement the function call have failed. So, I continued to search for alternatives.
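For context, the call in question is named in the comments below as mt.getTrackbackPings, a Movable Type XML-RPC method that takes a single post ID. Here is a minimal sketch of the request body such a call would send. The helper name build_xmlrpc_call and the post ID 123 are made up for illustration, and the endpoint URL the body would be POSTed to is deliberately left out because it is not given here.

```php
<?php
// Hypothetical helper: builds an XML-RPC request body whose parameters are all strings.
function build_xmlrpc_call($method, array $stringParams)
{
    $xml = '<?xml version="1.0"?>' . "\n";
    $xml .= '<methodCall><methodName>'
        . htmlspecialchars($method, ENT_XML1 | ENT_QUOTES)
        . '</methodName><params>';
    foreach ($stringParams as $value) {
        // Escape each value so characters like < and & cannot break the XML.
        $xml .= '<param><value><string>'
            . htmlspecialchars((string) $value, ENT_XML1 | ENT_QUOTES)
            . '</string></value></param>';
    }
    return $xml . '</params></methodCall>';
}

// Example: the body for fetching the trackbacks of a made-up post ID.
$body = build_xmlrpc_call('mt.getTrackbackPings', array('123'));
```

The response to a request like this would then need to be parsed like any other XML document.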
I found out today that Movable Type has a hidden RSS feed for trackbacks. I tested this, and it is indeed true. Unfortunately, TypePad does not seem to have this feed. My guess is that this is because of their premium pricing model, which removes support for additional and custom templates in the lower tiers. If anyone happens to know the super-secret URL for a post’s trackbacks on TypePad, I would love to know, but I truly believe that feed does not exist.
Therefore, the search for a trackback solution continues, but for the time being it is going on the back burner. When I get some free time over the next couple of weeks, or during the second half of the Summer of Code, I will revisit this problem. At the moment, trackback support is being forgone.
Next Week’s Plan
So, what’s up for next week? Early next week I plan on working on the actual importing of data into WordPress. All the arrays are prepped and the functions are ready, so the importing process should go fairly quickly. I’m actually anticipating finishing up the import code by the middle of next week, but if things don’t go to plan I have until the end of the following week. Should I finish early, I will revisit some of the priority two items accumulated over the past few weeks. With a little luck, next week will bring a functioning importer with some additional fixes.

2 Comments
What problems did you run into with mt.getTrackbackPings on TypePad? I haven’t had any problems with it. Drop me an email and I can send you a simple script that makes the request.
To be honest, I’m not exactly sure what my problem was. I kept on getting a blank XML document in return with no content other than the declared XML tags. Thanks for the offer on the code. I’m sending you an email.