The Moonspeaker:
Where Some Ideas Are Stranger Than Others...

HOW TO BUILD AN RSS FEED

There are few websites, no matter how irregularly updated, that lack an RSS feed, and the Moonspeaker is no exception. After all, RSS feeds are a common auto-generated add-on to blogging software, and in a phenomenon that is both bane[1] and boon[2] of the internet, most websites are blogs these days. However, contrary to the impression widely given by the "web 2.0" and "social media" pundits, you don't have to use a blogging platform or a social media feed to have an RSS feed. RSS is an open format that can be coded up in a plain text editor and published along with the rest of the files on your website. If you have a website, it is well worth learning how to write an RSS feed by hand, even if you would never actually do such a thing. Depending on your website platform, the auto-generated feed may be fairly plain, and you might like to add some goodies to it like pictures or additional links, or even RSS-feed-only posts. RSS successfully weathered the loss of a major RSS aggregator and reader in 2013,[3] so you can be reasonably assured that you won't be wasting your time. Based on my own experience, however, the information available out there to help you actually code up or edit your own RSS feed is crap. It is often inaccurate, or the author assumes you have all manner of knowledge you can't possibly have, or merely gives extensive directions for one specific blogging platform. (Hence my lack of diplomacy.) All very well-meaning, but alas almost worse than nothing. So here I am going to try to avoid those issues by writing up the approach that finally worked for me.

After some experimentation and rather more frustration than I like to remember, I have found that the best RSS feed format for the hand coder and/or tweaker of pre-generated feeds is RSS 2.0. It is quite stable, to my knowledge no RSS readers choke on it (unlike Atom) and it allows you to include pictures in your entries as well as a little icon that will show up in the RSS feed list, if you'd like to have that. But let's start with the very basics.

First, you will want to be able to check out your RSS feed as you work on it, and for that you will need an RSS reader of some kind. There are many, many stand-alone readers out there, but I haven't found one that is lightweight or convenient enough for my purposes. Most, if not all, of the major web browsers have stripped out decent RSS reader capabilities for no apparent reason, but luckily many developers have written add-on readers to fill the breach. My main web browser is Firefox these days, and I use the Bamboo feed reader add-on. It's plain, fast, and quite simple to use. Of the less major web browsers, I know for certain that SeaMonkey has a built-in RSS feed reader, as do Midori and IceCat.

Second, you will need to be able to post your RSS feed on-line, because RSS feed readers often can't view a feed that is hosted on your own computer.[4] Even if your website host autogenerates an RSS feed, it is always better to work on a copy. If you decide to make a test feed from scratch, drop the starter code into a plain text file and save it under a name that is different from, but related to, your usual site feed, for example "yoursitetestfeed.xml". The file extension needs to be "xml" and the file must be plain text. If you're not sure whether you have plain text or not, try bolding a word; if you succeed, you'll need to turn "rich text formatting" off.

Okay, at last here is the starter code of a plain RSS feed in the 2.0 format.


1  <?xml version="1.0" encoding="utf-8"?>
2  <rss version="2.0">
3
4  <channel>
5  <title>Website Title</title>
6  <link>http://www.websiteurl.yours/</link>
7  <description>RSS Feed</description>
8  <language>en</language>
9  ...
10 </channel>
11 </rss>

Lines 1 and 2 provide critical information that any RSS reader or aggregator must have. Line 1 says your feed is in xml mark up, and Line 2 gives the xml mark up dialect, rss version 2.0. The "mark-up" part is anything enclosed by angle brackets (also known as less than and greater than signs). There is no need to edit these lines at all, unless for some reason you are using an encoding other than utf-8. If you are and know all about text encodings on the internet, you probably should just skip down to the end of this page and grab the code snippets from the end. Otherwise, let's keep going.

Lines 4 to 8 are also critical information. For the purposes of RSS, your website is a "channel" and that needs to be defined so a person subscribing to the feed can see what it is called and click it to go to your site. Lines 5 to 7 are where you will definitely be making changes to reflect the content and theme of your site. If the primary language of your website is English, Line 8 probably won't change for you. However, if your website is provided in, say, English and French versions, you would have (I hope) a separate feed for each version of the site. In that case, the French version would have "fr" in place of "en" on Line 8.

Line 9 is not code at all, but just a placeholder for future goodies. For the moment, we are still focussing on building the most basic skeleton of the RSS feed. The last two lines are the final parts of that skeleton. They must be present in that order at the very end of the file, no matter what. Without them, your file will be malformed and unreadable. You can add white space to make the code more readable for you, as long as it doesn't break up the tags defined by the angle brackets. If you set up a test RSS feed with the code provided, first delete Line 9, because it isn't good xml and would make an RSS reader cry. After uploading the file you could go ahead and subscribe to it: the test feed will be available, there just won't be any items to look at yet. But now, let's add a simple item, just plain text, no pictures or fancy formatting.


1  <?xml version="1.0" encoding="utf-8"?>
2  <rss version="2.0">
3
4  <channel>
5  <title>Website Title</title>
6  <link>http://www.websiteurl.yours/</link>
7  <description>RSS Feed</description>
8  <language>en</language>
9
10 <image>
11 <url>http://www.moonspeaker.ca/favicon.ico</url>
12 <title>Plain icon of the Moonspeaker.</title>
13 <link>http://www.moonspeaker.ca/</link>
14 <width>32</width>
15 <height>32</height>
16 </image>
17
18 <item>
19 <title>New Random Site of the Week: Satimage - Smile</title>
20 <link>http://www.satimage.fr/software/en/index.html</link>
21 <description>
22 2014-02-20: Item text, two or three sentences or even a full article.
23 </description>
24 <pubDate>Wed, 05 Feb 2014 20:50:00 MDT</pubDate>
25 <guid>http://www.satimage.fr/software/en/index.html</guid>
26 </item>
27
28 </channel>
29 </rss>

Now things are a bit more exciting. Each post to your RSS feed is an item, and it should include the information and tags shown. After the item marker on Line 18, Lines 19 to 23 are fairly self-explanatory. This example happens to have a link on Line 20 pointing to an external page, but this could of course be a link to a page within your own site instead. A link to a page on your own site should be complete, i.e. start with "http://" and include the full domain name. Other sources on this topic suggest that as long as you have provided your website url on Line 6, you can use relative links for items on your own site. However, I have not found that this works consistently, whereas giving the complete url always does.

The code that adds a favicon to your feed is in Lines 10 to 16. I have found that an image in ".ico" format consistently behaves as it should in feed aggregators, whereas ".png" format is hit and miss. Also note the full url pointing to the image file, and the fact that its height and width are only 32 pixels. Here again, different sites claim that the size can be bigger, up to 64 pixels on a side or even larger, and the RSS 2.0 specification itself allows up to 144 pixels wide and 400 high. The size used here definitely works consistently. If you want to be absolutely sure that your site's favicon shows up when your rss feed is loaded, make sure that your top item always points to a part of your own site. Right now this seems to be a fairly common rss feed reader bug, so better safe than sorry. It also has the virtue of giving you one more motivation to keep adding original content to your website. To my mind anyway, one of the best parts of having a website is creating material for it and sharing that content with others on purpose, as opposed to having your personal data mined and shared involuntarily by others.

If you thought including a feed icon would involve the enclosure tag that is used for pictures further on, that would be a reasonable expectation. However, that isn't how it actually works, probably because it is an icon for the feed as a whole, not for an individual item. Regarding the favicon format, creating ".ico" files is no issue for Windows users at all, who should be able to generate them easily using Paint. For the rest of us, there are on-line format conversion options. If you have Photoshop there is a free plugin provided by Telegraphics. The GIMP can do the same, using a command line tool in older versions, or in the latest versions the "export file" option with the "ico" extension (you may need to type it in).[5]

Matters are a bit more complicated on Lines 24 and 25, but only because their tags are not as plain language as their predecessors. Every item should have a "pubDate" to tell the RSS feed reader there is something new to pick up. (Strictly speaking the RSS 2.0 specification makes pubDate and guid optional, but feed readers behave far more predictably when both are present.) The format shown must be used even if the most anyone ever sees of the publication date is something like "Wednesday, 5 February 2014." The 3 letter code at the very end, "MDT," is the time zone designation, here Mountain Daylight Time; a numeric offset from UTC such as "-0600" is also accepted. To my knowledge, there is no reason this couldn't vary from item to item, because there are contributors in many time zones or maybe you're a travelling blogger. However, the potential effects of this aren't something I have ever tested, since there are no other contributors to my site, and if there were, there would be one standard time zone defining publishing times. So the truth is, you can probably "set it and forget it" if you're on your own. Line 25 is just as important, and it needs to be unique: no two items can have the same guid. In general this is no trouble if you use the specific url of the page, or you could use just the numbers from the date and time of your posting. For example, the sample item above could have a guid of "20140205205000," or the date plus the time on your computer's clock right before you finish editing the article. If the guid is not a url, add isPermaLink="false" to the opening guid tag, because feed readers otherwise assume the guid is a link they can open.
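
If typing out the pubDate by hand gets tedious or error-prone, a few lines of script can produce it. What follows is only a minimal sketch in Python, a language chosen here purely for illustration since nothing else in this tutorial depends on it. The function names rss_pub_date and timestamp_guid and the sample posting time are made up for the example. It relies on the standard library's email.utils.format_datetime, which writes the time zone as a numeric offset rather than a 3 letter code.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def rss_pub_date(moment: datetime) -> str:
    """Format a time-zone-aware datetime in the style <pubDate> expects."""
    if moment.tzinfo is None:
        raise ValueError("pubDate needs a time zone; pass an aware datetime")
    return format_datetime(moment)


def timestamp_guid(moment: datetime) -> str:
    """Build a guid from the digits of the posting date and time."""
    return moment.strftime("%Y%m%d%H%M%S")


# Hypothetical posting time: 5 February 2014, 20:50, seven hours behind UTC.
mountain = timezone(timedelta(hours=-7))
posted = datetime(2014, 2, 5, 20, 50, 0, tzinfo=mountain)

print(rss_pub_date(posted))    # Wed, 05 Feb 2014 20:50:00 -0700
print(timestamp_guid(posted))  # 20140205205000
```

The first line printed can be pasted between the pubDate tags, and the second between the guid tags.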

Okay, so what would you do if you wanted to format the text in your item a bit? Originally I wrote "formatting is pleasantly simple, as it happens." This is not quite true, and to find out what was correct and get formatting to work properly at last on my own feed, I had to go back and read the RSS 1.0 standard, which clarified many things. Your RSS feed is an xml file, so you can't just drop any html tag you want into the description. Instead, every <, >, and & in the description must be replaced with its entity (&lt;  &gt;  and &amp;), and ' and " can be replaced as well (&apos;  and &quot;). These five are the entities that xml itself predefines. There is one exception to keep in mind though, and it's an important one: do not replace the double quotes around the url in a link or it won't work. Xml does not actually require quotes to be replaced inside an element's text, so leaving them alone is perfectly legal. The trickiest part is that the angle brackets are an all or nothing deal: they must all be replaced, or none. If none are replaced, and the feed reader is set up to account for such common errors in a friendly way, it will usually ignore the tags. However, if you have replaced only some of them, or left a bare ampersand or an undeclared entity in the text, you will likely get an error such as "illegal character entity". So take care that all of your descriptions have had those characters replaced, and then you can format your items, and probably do mad things like create blinking text, but please don't. The code example below is updated to show what adding some html to format the text and add a link should look like.


1  <?xml version="1.0" encoding="utf-8"?>
2  <rss version="2.0">
3
4  <channel>
5  <title>Website Title</title>
6  <link>http://www.websiteurl.yours/</link>
7  <description>RSS Feed</description>
8  <language>en</language>
9
10 <image>
11 <url>http://www.moonspeaker.ca/favicon.ico</url>
12 <title>Plain icon of the Moonspeaker.</title>
13 <link>http://www.moonspeaker.ca/</link>
14 <width>32</width>
15 <height>32</height>
16 </image>
17
18 <item>
19 <title>New Random Site of the Week: Satimage - Smile</title>
20 <link>http://www.satimage.fr/software/en/index.html</link>
21 <description>
22 2014-02-20: Item text, &lt;i&gt;two or three&lt;/i&gt; sentences or...
23 &lt;a href="http://www.moonspeaker.ca/"&gt;See the Moonspeaker.&lt;/a&gt;
24 </description>
25 <pubDate>Wed, 05 Feb 2014 20:50:00 MDT</pubDate>
26 <guid>http://www.satimage.fr/software/en/index.html</guid>
27 </item>
28
29 </channel>
30 </rss>
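
Replacing the angle brackets by hand in a long description is fiddly, so here is a hedged sketch of letting a script do it. It is in Python, which is an assumption on my part rather than anything this tutorial requires, and the function name escape_for_description is made up. It uses escape from the standard library's xml.sax.saxutils module, which always replaces &, < and >, and is told here to replace apostrophes as well while leaving double quotes alone, as in the sample feed.

```python
from xml.sax.saxutils import escape


def escape_for_description(html_fragment: str) -> str:
    """Escape html so it can sit inside an RSS <description> element.

    escape() replaces &, < and > on its own; the extra mapping adds
    apostrophes. Double quotes are deliberately left untouched.
    """
    return escape(html_fragment, {"'": "&apos;"})


fragment = '<a href="http://www.moonspeaker.ca/">See the Moonspeaker.</a>'
print(escape_for_description(fragment))
```

The printed text should match Line 23 of the sample feed, and it can be pasted straight into a description.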

On Line 22 I have added the html tags that will make the words "two or three" italic when rendered by an RSS feed reader. There is a new line added, Line 23, with a new link to a familiar place.[6] My perusal of the RSS 1.0 standard also revealed how to code accented characters or punctuation beyond the standard ASCII character set in a way that works consistently across browsers and feed readers. It's actually rather simple, although not necessarily clear to people who don't work on defining these things. The key thing to remember is that an RSS 2.0 feed is first of all an xml file, so xml's general mechanism for declaring entities is available to it, just as it is to the RDF-based RSS 1.0. So if you would like to include accented characters by name, you need to add the following lines directly under the xml version line.


1  <?xml version="1.0" encoding="utf-8"?>
2  <!DOCTYPE rss [
3  <!ENTITY % HTMLlat1 PUBLIC
4  "-//W3C//ENTITIES Latin 1 for XHTML//EN"
5  "http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent">
6  %HTMLlat1;
7  ]>
8  <rss version="2.0">
9  ...

This extra code is an "entity declaration", and it tells the feed reader how to interpret named references to accented characters, such as &eacute;, by using entity definitions provided at w3.org. The definitions are just plain text mark up files, so if you'd like to see how entity definitions work, go ahead and download them and have a look. Any plain text editor should be able to open them, or if your computer fusses, just change the ".ent" to ".txt" and you should be able to open the file by double-clicking.

Supposing you'd like to include some Greek text in your descriptions, all you need to do is add entity definitions for those letters, as shown below. Notice how the new lines precede the "]>" in line 11. There is one more thing to keep in mind though: there are feed readers that don't handle !DOCTYPE declarations happily, and if they don't, when they parse those lines out all named accented characters and similar will promptly become illegal as far as that reader is concerned. I don't think this is actually very common, but it's important to know this and decide ahead of time if it's a concern.


1  <?xml version="1.0" encoding="utf-8"?>
2  <!DOCTYPE rss [
3  <!ENTITY % HTMLlat1 PUBLIC
4  "-//W3C//ENTITIES Latin 1 for XHTML//EN"
5  "http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent">
6  %HTMLlat1;
7  <!ENTITY % HTMLsymbol PUBLIC
8  "-//W3C//ENTITIES Symbols for XHTML//EN"
9  "http://www.w3.org/TR/xhtml1/DTD/xhtml-symbol.ent">
10 %HTMLsymbol;
11 ]>
12 <rss version="2.0">
13 ...
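
For feed readers that balk at the !DOCTYPE lines, one alternative worth knowing about is the numeric character reference, such as &#233; for an e with an acute accent, which xml understands without any entity declarations at all. Since the feed is declared as utf-8, typing the characters directly is another option. The sketch below, in Python purely for illustration and with a made-up function name, converts every character outside plain ASCII into its numeric reference using the "xmlcharrefreplace" error handler built into Python's string encoding.

```python
def to_numeric_references(text: str) -> str:
    """Replace each non-ASCII character with an xml numeric character reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


# Hypothetical snippets of description text.
print(to_numeric_references("café"))
print(to_numeric_references("λόγος"))
```

Plain ASCII text passes through unchanged, so it is safe to run a whole description through the function.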

Having sorted out the basics of an item including making it a bit prettier if desired, what about adding a picture? That's a bit more involved again, but as long as you have the basic code to start with it isn't too much trouble. The second item in the sample below is pulled straight from the Moonspeaker's RSS feed, and I've dropped the favicon and entity definition code in order to focus on the task at hand. (I haven't dropped the rest because otherwise you couldn't just drop the code into a test file and try it.) You'll notice that Lines 23, 24, and 29 are all too long for the box they are in, so they take up two lines on the screen, but in an actual file these lines should not be broken up by hitting the return or enter key.


1  <?xml version="1.0" encoding="utf-8"?>
2  <rss version="2.0">
3
4  <channel>
5  <title>Website Title</title>
6  <link>http://www.websiteurl.yours/</link>
7  <description>RSS Feed</description>
8  <language>en</language>
9
10 <item>
11 <title>New Random Site of the Week: Satimage - Smile</title>
12 <link>http://www.satimage.fr/software/en/index.html</link>
13 <description>
14 2014-02-20: Item text, &lt;i&gt;two or three&lt;/i&gt; sentences or...
15 &lt;a href="http://www.moonspeaker.ca/"&gt;See the Moonspeaker.&lt;/a&gt;
16 </description>
17 <pubDate>Wed, 05 Feb 2014 20:50:00 MDT</pubDate>
18 <guid>http://www.satimage.fr/software/en/index.html</guid>
19 </item>
20
21 <item>
22 <title>New Thought Piece: Tearable Puns</title>
23 <link>http://www.moonspeaker.ca/FoundSubjects/thoughtpieces2.html#tearable
       </link>
24 <enclosure url="http://www.moonspeaker.ca/Images/tearablepuns.jpg"
       length="19006" type="image/jpeg" />
25 <description>
26 2014-07-01: Having survived my thesis defence...
27 </description>
28 <pubDate>Tue, 01 Jul 2014 19:45:00 MDT</pubDate>
29 <guid>http://www.moonspeaker.ca/FoundSubjects/thoughtpieces2.html#tearable
       </guid>
30 </item>
31
32 </channel>
33 </rss>

The key line of interest now, assuming you already have a picture ready to go, is Line 24. The "enclosure" tag can be used for all sorts of things, but for the purposes of this tutorial I'll be sticking to its use for adding an image to the RSS feed, basically because this is a use that I have been able to implement successfully and consistently. There are three things needed within the enclosure tag.

  1. The full url of the picture you plan to add. This can be in any format a web browser can handle, which these days definitely includes jpegs, gifs, and pngs.
  2. "length" is the size of the picture file in bytes. Yes, this is somewhat weird. To get the size in bytes, you can use the "Get Info" option in MacOSX, or right-click for "Properties" in Windows. On *NIX systems other than MacOSX, the file manager's properties window will usually show it, as will "ls -l" in a terminal.
  3. Finally, you need to make sure the image format is correct in the "type" setting. This has to be the mime type of the file rather than its bare 3 letter extension, for example "image/jpeg" for a ".jpg" file, "image/png" or "image/gif".
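
The three pieces above can also be gathered by a short script instead of by clicking through file properties. This is a sketch only, written in Python as an illustrative choice; the function name enclosure_tag, the throwaway file and the example.com url are all made up. It uses os.path.getsize for the byte count and mimetypes.guess_type, which guesses the mime type from the file extension.

```python
import mimetypes
import os
import tempfile
from xml.sax.saxutils import quoteattr


def enclosure_tag(local_path: str, public_url: str) -> str:
    """Build an <enclosure> tag from a local copy of the file and its public url."""
    length = os.path.getsize(local_path)             # file size in bytes
    mime_type, _ = mimetypes.guess_type(local_path)  # guessed from the extension
    if mime_type is None:
        raise ValueError(f"cannot guess a mime type for {local_path!r}")
    # quoteattr adds the surrounding quotes and escapes &, < and > in the url.
    return (f"<enclosure url={quoteattr(public_url)} "
            f'length="{length}" type="{mime_type}" />')


# Demonstration with a throwaway file standing in for a real picture.
with tempfile.TemporaryDirectory() as folder:
    sample = os.path.join(folder, "picture.jpg")
    with open(sample, "wb") as handle:
        handle.write(b"\0" * 19006)  # hypothetical size in bytes
    demo_tag = enclosure_tag(sample, "http://www.example.com/Images/picture.jpg")

print(demo_tag)
```

In real use, the first argument would be the path to the picture on your own computer and the second the url it will have once uploaded.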

But wait, there is still one more thing to consider, and that is that a few common RSS feed readers do not display enclosures. In that case, the simplest workaround is to include the image within an html "img" tag in the description, with its angle brackets replaced with the appropriate character entities. In my experiments, the balky RSS feed readers cheerfully ignore the enclosure and display the image as desired when it is included in this way.

From here, you actually have enough to keep adding items. You can simply use the code provided here as a template. There is no need to delete items from the feed, though you can certainly delete older items for convenience. However, if you are tweaking a feed that is generated for you, don't delete anything. The website software will take care of that housekeeping for you, and trying to do its job for it may interfere with the workings of other items such as blog streams and the like.[7]
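
Since a hand-edited feed has to be uploaded before most readers will look at it, it can save a round trip to check first that the file is at least well-formed xml. The following is a minimal sketch in Python, an assumption of convenience rather than a requirement, using the standard library's xml.etree.ElementTree parser. The function name check_feed and the cut-down sample feed are made up for the example, and the check only confirms well-formedness and an rss root element, not that a feed reader will be happy with everything in it.

```python
import xml.etree.ElementTree as ET


def check_feed(feed_text: str) -> list[str]:
    """Parse feed text and return the item titles; raises ParseError if malformed."""
    root = ET.fromstring(feed_text.encode("utf-8"))
    if root.tag != "rss":
        raise ValueError(f"expected an <rss> root element, found <{root.tag}>")
    return [item.findtext("title", default="(untitled)") for item in root.iter("item")]


SAMPLE = """<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
<channel>
<title>Website Title</title>
<link>http://www.websiteurl.yours/</link>
<description>RSS Feed</description>
<language>en</language>
<item>
<title>New Random Site of the Week: Satimage - Smile</title>
<link>http://www.satimage.fr/software/en/index.html</link>
<description>2014-02-20: Item text.</description>
</item>
</channel>
</rss>
"""

print(check_feed(SAMPLE))

try:
    check_feed(SAMPLE.replace("</channel>", ""))  # simulate a forgotten closing tag
except ET.ParseError as problem:
    print("Malformed feed:", problem)
```

To check a real file, read its contents into a string and pass that to check_feed in place of SAMPLE.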

All together then, that is how you can build your own RSS feed, along with the code you need to implement basic things like minor text formatting and adding a picture to an item. It is quite possible to add sound files as well (OpenCulture does this quite a lot) using the enclosure tag. The url should point to a sound or movie file, and the type should be "audio/mpeg", "video/mpeg" or "video/quicktime" as appropriate. For more information about the rss format itself, including optional channel elements and the like, the best site I have found is the RSS Advisory Board. It is not as oddly unreadable as the W3C or many other websites that present web standards right now. If you need to look up a standard mime type, Helena Ahonen-Myka at the Department of Computer Science, University of Helsinki provides a copy of the Apache server mime-types file.

  1. "Bane" because so many people who use blogging software have never learned any of the underlying mark up languages that comprise the actual skeleton of any given website. It isn't necessary to code a website on your own, but to be aware of and able to apply the basics is not only hugely practical, it is empowering. It horrifies me to see so many people, young, old, and in between, whose only means of coping with web pages is accepting whatever WordPress, FrontPage, or something similar coughs up for them. Then when they struggle with problems or feel stuck with an unsuitable layout, instead of feeling they can look up how to modify things to solve those issues, many folks feel humiliated or foiled. This is truly unjust.
  2. "Boon" of course because blogging platforms do lower the barriers to creating a website and producing some content on the internet for whatever reason without being required to learn any mark up languages. That's completely fair, and everyone needs to start with a bit of help, even Netscape Composer in the 1990s. The "bane" part can be overcome and the "boon" part made even bigger by making those tools more helpful for people who are interested in learning html and the like. However, the so-called web 2.0 approach is problematic on so many levels, and they go deeper than you might expect, as Olia Lialina has written about in Rich User Experience, UX and Desktopization of War. The people who spent their time sniping at amateurs who couldn't or didn't see any reason to design webpages to look like content-minimal magazines have a lot to answer for.
  3. Google killed Google Reader in early 2013, for unclear reasons, although at the time the move seemed related to the company's push to drive people committed to the Google ecosystem into using its own answer to Facebook. It's moves of this kind that make me wince a little when heavy Google product users begin inveighing against Apple's walled garden. Apple's walled garden is problematic, no question, but Google also has a walled garden, it's just that right now the walls are still short enough to step over in places.
  4. This may not be a limitation of stand-alone RSS feed readers, though. In web-browser based RSS-readers, the limitation is security related, and even though it is not totally convenient, it is one worth having.
  5. There is a GIMP tutorial on making icons as well.
  6. If you need a quick primer on html, a great site to use is Ross Shannon's HTML Source. His tutorials are brief, thorough, and well documented. Oddly enough, he doesn't seem to have an RSS tutorial though, because if he did this page would be utterly unnecessary.
  7. For those folks who are struggling to figure out how to use the automatic rss generation capabilities of BibDesk, which are wonderful but basically not covered in its documentation yet, a page that sums things up is provided by Michael McCracken, one of the developers, at About RSS. It's not at all complicated, but not quite obvious either.
Copyright © C. Osborne 2017
Last Modified: Friday, January 27, 2017 22:27:08