PHP DOMXML: Constructing an (Atom) RSS Feed

// September 21st, 2008 // PHP, Programming

There’s no doubt an easier way of building an RSS feed (XML) than by using DOM XML, but I’m here to show you how to do it the hard way. In fact, you could learn a thing or two by taking the harder, yet still logical, routes in programming.

The Basic Elements of an Atom RSS Feed
First, you have the “rss” element, which declares the RSS version and the Atom namespace (xmlns:atom), as shown below:

<rss xmlns:atom="http://www.w3.org/2005/Atom" version="2.0">
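Here’s a minimal sketch of how that root element could be built with PHP’s DOM classes. It assumes PHP 5 or later; the variable names are my own. I’m using setAttributeNS for the namespace declaration, since xmlns attributes live in a reserved namespace, but treat that as one reasonable way to do it rather than the only one.

```php
<?php
// Create the document; the XML version and encoding go to the constructor.
$doc = new DOMDocument('1.0', 'UTF-8');

// Create the root <rss> element and attach it to the document.
$rss = $doc->createElement('rss');
$doc->appendChild($rss);

// "version" is an ordinary attribute.
$rss->setAttribute('version', '2.0');

// Namespace declarations belong to the reserved xmlns namespace,
// so this sketch declares the atom prefix with setAttributeNS.
$rss->setAttributeNS(
    'http://www.w3.org/2000/xmlns/',
    'xmlns:atom',
    'http://www.w3.org/2005/Atom'
);

// Serialize to a string and print it.
$xml = $doc->saveXML();
echo $xml;
```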

Next comes your “channel” element – essentially the root of your RSS feed – which contains all the RSS feed information, including content. The channel element does not have any attributes, only child elements. The child elements consist of: title, description, lastBuildDate, language, ttl, atom:link, and the RSS item(s) – as shown below:

<rss xmlns:atom="http://www.w3.org/2005/Atom" version="2.0">
    <channel>
        <title>Example RSS Feed</title>
        <description>A mock-up RSS feed.</description>
        <lastBuildDate>Sun, 21 Sep 2008 18:56 -0500</lastBuildDate>
        <language>en-us</language>
        <ttl>60</ttl>
        <atom:link href="http://examplesite.com/feed/news.rss" rel="self" type="application/rss+xml"/>
    </channel>
</rss>
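The channel and its child elements could be put together roughly like this. It’s only a sketch: the $fields array and the loop are my own shorthand, the values are the mock-up ones from the example above, and lastBuildDate is simply set to the current time using PHP’s built-in DATE_RSS format.

```php
<?php
$doc = new DOMDocument('1.0', 'UTF-8');
$doc->formatOutput = true; // indent the serialized output for readability

// Root <rss> element with its version and the atom namespace declaration.
$rss = $doc->createElement('rss');
$rss->setAttribute('version', '2.0');
$rss->setAttributeNS(
    'http://www.w3.org/2000/xmlns/',
    'xmlns:atom',
    'http://www.w3.org/2005/Atom'
);
$doc->appendChild($rss);

// <channel> has no attributes, only child elements.
$channel = $doc->createElement('channel');
$rss->appendChild($channel);

// Mock-up values; a real feed would pull these from its own settings.
$fields = [
    'title'         => 'Example RSS Feed',
    'description'   => 'A mock-up RSS feed.',
    'lastBuildDate' => date(DATE_RSS), // current time, RFC 822 style
    'language'      => 'en-us',
    'ttl'           => '60',
];

foreach ($fields as $name => $value) {
    $element = $doc->createElement($name);
    // A text node escapes special characters such as & for us.
    $element->appendChild($doc->createTextNode($value));
    $channel->appendChild($element);
}

// <atom:link> lives in the Atom namespace, so it is created with createElementNS.
$link = $doc->createElementNS('http://www.w3.org/2005/Atom', 'atom:link');
$link->setAttribute('href', 'http://examplesite.com/feed/news.rss');
$link->setAttribute('rel', 'self');
$link->setAttribute('type', 'application/rss+xml');
$channel->appendChild($link);

$xml = $doc->saveXML();
echo $xml;
```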


And finally, there are the “item” elements (there’s a separate item element for every RSS item... d’oh), each of which consists of: guid, link, title, author, pubDate, and description. A partially complete example is shown below:

<rss xmlns:atom="http://www.w3.org/2005/Atom" version="2.0">
    <channel>
        <title>Example RSS Feed</title>
        <description>A mock-up RSS feed.</description>
        <lastBuildDate>Sun, 21 Sep 2008 18:56 -0500</lastBuildDate>
        <language>en-us</language>
        <ttl>60</ttl>
        <atom:link href="http://examplesite.com/feed/news.rss" rel="self" type="application/rss+xml"/>
        <item>
            <guid isPermaLink="false">1</guid>
            <link>http://www.examplesite.com/article/32424</link>
            <title>Article 32424 - Dog Eats Man</title>
            <author>John Doe</author>
            <pubDate>Sun, 21 Sep 2008 18:56 -0500</pubDate>
            <description><![CDATA[ <strong>A dog, ate a man - wow!</strong> ]]></description>
        </item>
    </channel>
</rss>
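As a rough sketch, a single item could be built like this. The buildItem() function and the keys of the $article array are hypothetical names I made up for illustration; the description element is left out to keep it short. Note that pubDate has to be an RFC 822 style date, which PHP’s DATE_RSS format produces.

```php
<?php
// Hypothetical helper: builds one <item> element from an associative array.
function buildItem(DOMDocument $doc, array $article)
{
    $item = $doc->createElement('item');

    // <guid>, flagged as "not a permalink" because it is just an ID here.
    $guid = $doc->createElement('guid');
    $guid->appendChild($doc->createTextNode((string) $article['id']));
    $guid->setAttribute('isPermaLink', 'false');
    $item->appendChild($guid);

    // Plain text children; text nodes take care of escaping.
    foreach (['link', 'title', 'author'] as $name) {
        $element = $doc->createElement($name);
        $element->appendChild($doc->createTextNode($article[$name]));
        $item->appendChild($element);
    }

    // <pubDate> formatted with the built-in DATE_RSS pattern.
    $pubDate = $doc->createElement('pubDate');
    $pubDate->appendChild(
        $doc->createTextNode($article['published']->format(DATE_RSS))
    );
    $item->appendChild($pubDate);

    return $item;
}

$doc = new DOMDocument('1.0', 'UTF-8');
$doc->formatOutput = true;

$channel = $doc->createElement('channel');
$doc->appendChild($channel);

$item = buildItem($doc, [
    'id'        => 1,
    'link'      => 'http://www.examplesite.com/article/32424',
    'title'     => 'Article 32424 - Dog Eats Man',
    'author'    => 'John Doe',
    'published' => new DateTime('2008-09-21 18:56:00 -0500'),
]);
$channel->appendChild($item);

echo $doc->saveXML();
```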

Notice the isPermaLink attribute on the “guid” element. Set it to “true” only when the guid is itself a permanent URL to the item; here the guid is just an ID, so it is set to “false”.

Also, notice how the “description” element contains a CDATA section. This lets the content include characters such as < > and & without escaping them as entities. Most descriptions call for some HTML, so a CDATA section is needed in most cases; however, you get to decide if you actually need it.
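To see what the CDATA section actually buys you, here’s a small sketch that writes the same HTML twice: once through DOMDocument::createCDATASection and once as a plain text node. The variable names are mine. A feed reader should end up with the same text either way; the difference is only in how the markup looks inside the XML.

```php
<?php
$doc = new DOMDocument('1.0', 'UTF-8');
$item = $doc->createElement('item');
$doc->appendChild($item);

$html = '<strong>A dog ate a man - wow!</strong>';

// Option 1: a CDATA section stores the markup verbatim.
$withCdata = $doc->createElement('description');
$withCdata->appendChild($doc->createCDATASection($html));
$item->appendChild($withCdata);

// Option 2: a plain text node; the serializer escapes the angle brackets.
$withText = $doc->createElement('description');
$withText->appendChild($doc->createTextNode($html));
$item->appendChild($withText);

// Serialize each element on its own to compare the two forms.
$cdataXml = $doc->saveXML($withCdata);
$textXml  = $doc->saveXML($withText);

echo $cdataXml, "\n";
echo $textXml, "\n";
```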

DOMXML Functions: What we will be using
The DOMXML function names are, for the most part, self-explanatory. Each function below is documented on the PHP website, where you can read more about it.

DOMDocument::__construct([string $version [, string $encoding]])
DOMDocument::createElement(string $name [, string $value])
DOMDocument::save(string $filename [, int $options])
DOMElement::setAttribute(string $name, string $value)
DOMNode::appendChild(DOMNode $newnode)
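To show how these five functions fit together, here’s a stripped-down sketch that uses nothing else to build and save a feed. The output path is a placeholder I picked for the example, and the feed is deliberately incomplete (no namespace, no items).

```php
<?php
// DOMDocument::__construct: XML version and encoding.
$doc = new DOMDocument('1.0', 'UTF-8');

// DOMDocument::createElement + DOMNode::appendChild: create a node, then attach it.
$rss = $doc->createElement('rss');
$doc->appendChild($rss);

// DOMElement::setAttribute: add an attribute to an element.
$rss->setAttribute('version', '2.0');

$channel = $doc->createElement('channel');
$rss->appendChild($channel);

// The optional second argument of createElement sets the text content.
// It does not escape a bare &, so these values are kept free of ampersands.
$channel->appendChild($doc->createElement('title', 'Example RSS Feed'));
$channel->appendChild($doc->createElement('description', 'A mock-up RSS feed.'));

// DOMDocument::save: returns the number of bytes written, or false on failure.
$path = sys_get_temp_dir() . '/example_feed.rss'; // placeholder location
$bytes = $doc->save($path);

if ($bytes === false) {
    throw new RuntimeException("Could not write the feed to $path");
}

echo "Wrote $bytes bytes to $path\n";
```

In a real script you’d point $path at wherever your web server serves the feed from.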

The Programming.. Not Explained (in depth, at least)
Explaining every part of the script would take a while, and I don’t really have that much time right now. I did my best to comment on everything that happens in the script. Some of the comments are repeated, so reading the whole document top-to-bottom might seem a little redundant.

I have uploaded the finished script here: http://dev.jstrese.net/Examples/DOMXML_Example_RSS_Feed.html
