RSS 2 Dates and Such

Mark Pilgrim: History of RSS date formats.

Go read Mark's points regarding RSS 2 and the use of the Dublin Core elements to get some context for what follows. If you're really new to RSS and the issues around it, Mark has another page with the goods.

As a developer, I totally agree that ISO 8601 is easier to parse. It's just a better way to represent a timestamp for computers to read. That's a technical reason for using the Dublin Core date element, but not a terribly compelling reason to abandon native RSS 2 elements. And, honestly, "either way works, but we like doing it this way" is not a compelling reason either.
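To make the parsing comparison concrete, here is a minimal sketch that reads the same instant in both formats. It assumes Python's standard library as the toolkit (nothing in the discussion prescribes a language), and the timestamp values are illustrative.

```python
from datetime import datetime
from email.utils import parsedate_to_datetime

# RFC 822 style date, the form <pubDate> uses (illustrative value)
pub = parsedate_to_datetime("Sat, 21 Jun 2003 10:00:00 -0500")

# ISO 8601 style date, the form <dc:date> uses (illustrative value)
dc = datetime.fromisoformat("2003-06-21T10:00:00-05:00")

# Both strings describe the same moment, so the parsed values compare equal.
same_instant = pub == dc
print(same_instant)
```

Either format is workable once a parser exists for it; the ISO 8601 form is simply more regular (fixed field order, numeric month, no day or month names).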

I don't think that <pubDate> should be omitted in preference to the Dublin Core counterpart. Why? Because like it or not, as things are, <pubDate> is the defined tag for expressing publication date in the RSS 2.0 spec.

Having said that, I fully realize that <pubDate> is defined as an optional element. And the use of externally defined elements via namespaces is allowed. But I think it's odd to rely upon external elements that offer the same functionality (albeit in a different form) as elements the specification already declares.

A Developer's Perspective

Let's consider the task of RSS 2 implementation from a developer's perspective.

If I were to write a new RSS parser (and believe me, I am not that stupid), I would first and foremost take the specification for RSS 2 and implement that to the letter. Every element defined there would be supported, whether it is optional or not. (Maybe it's just me, but as a developer, the spec is the Bible. It has to be followed or things don't work.)

If I were to stop right there, without support for any of the existing namespaces, my parser would work, but it would be poor indeed. Implementing the RSS 2 spec alone should make it complete (for RSS 2 feeds, that is). But a growing number of RSS 2 feeds today rely on the Dublin Core namespace to express things the RSS 2 spec already has elements for, and my parser would ignore those elements since it implements only the RSS 2 spec. For example, with the new RSS 2 template for Movable Type, any feed it produces would be missing dates and category assignments in the eyes of my RSS 2 reader. Pretty significant, considering the date is the when of blogging.
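A rough sketch of that gap, assuming Python's xml.etree.ElementTree and a made-up feed (the feed content and both function names are hypothetical, for illustration only). The spec-only lookup finds no date on an item that carries just dc:date; a reader has to know about the Dublin Core namespace to recover it.

```python
import xml.etree.ElementTree as ET

# Namespace URI of the Dublin Core element set
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical feed whose only item is dated with dc:date alone
FEED = """<?xml version="1.0"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Example</title>
    <link>http://example.com/</link>
    <description>Hypothetical feed</description>
    <item>
      <title>Dated with Dublin Core only</title>
      <dc:date>2003-06-21T10:00:00-05:00</dc:date>
      <dc:subject>RSS</dc:subject>
    </item>
  </channel>
</rss>"""


def item_date_spec_only(item):
    """Look only at the element the RSS 2 spec defines."""
    return item.findtext("pubDate")


def item_date_with_fallback(item):
    """Prefer <pubDate>, then fall back to the Dublin Core date."""
    return item.findtext("pubDate") or item.findtext(f"{{{DC_NS}}}date")


item = ET.fromstring(FEED).find("channel/item")
print(item_date_spec_only(item))
print(item_date_with_fallback(item))
```

The fallback order here (native element first) is a design choice that matches the argument of this post, not something either specification mandates.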

I think Dave's concern about all this is that the RSS 2 specification is weakened when alternative formats replace elements it already defines. Take this to the extreme: every single element of RSS 2 replaced by an externally defined element. Silly, right? And destructive to the goal of RSS -- using the acronym Dave acknowledges: "Really Simple Syndication".

The Publisher's Perspective

Let's look at this from the other angle. If I were completely new to RSS and wanted to implement a feed for my site, I would first and foremost take the specification for RSS 2 and implement that to the letter. I would use <pubDate>, since dc:date isn't part of the specification. Knowing nothing about the optional namespace extensions, I wouldn't even be aware that dc:date exists.

Having followed the specification, I would expect anyone using an RSS 2 reader to have no problem processing my RSS 2 feed fully. Dates and all. But if some readers don't process <pubDate> and expect Dublin Core date tags instead, that would be a little frustrating to me.

Common elements like date and category should be part of the basic elements defined by RSS 2. And they are. And the publisher should look no further than the simple spec that Dave provides.

So What Then?

<pubDate> should be used as the preferred date tag for RSS 2 feeds. If you want to use ISO 8601 dates, feel free -- but <pubDate> tags should also be present for the benefit of all RSS 2 readers (whether they provide support for the Dublin Core extension or not).
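As a sketch of what "both present" could look like on the publishing side, the snippet below renders one timestamp in each form. It assumes Python's standard library; date_elements is a hypothetical helper, and the surrounding feed would still need to declare the dc namespace prefix on its rss element.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def date_elements(published):
    """Return <pubDate> and <dc:date> lines for one timezone-aware datetime."""
    pub_date = format_datetime(published)  # RFC 822 style, for <pubDate>
    dc_date = published.isoformat()        # ISO 8601 style, for <dc:date>
    return [
        f"<pubDate>{pub_date}</pubDate>",
        f"<dc:date>{dc_date}</dc:date>",
    ]


# Illustrative publication time: June 21, 2003, 10:00 at UTC-5
published = datetime(2003, 6, 21, 10, 0, 0, tzinfo=timezone(timedelta(hours=-5)))
lines = date_elements(published)
print("\n".join(lines))
```

Generating both strings from a single datetime keeps the two elements from drifting apart, so a reader that understands only one of them still sees the same date.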

Here's another interesting tidbit. RSS 2 defines both <pubDate> and <lastBuildDate> elements. Some are using the Dublin Core to replace <pubDate> -- but that tag looks like this: <dc:date>2003-06-21T10:00:00-05:00</dc:date>. This tag doesn't describe what it applies to. Is it meant as the publication date, the last build date, or maybe the author's birthday? Who knows? The RSS 2 spec doesn't define what the Dublin Core "date" tag is used for. The Dublin Core spec doesn't define what the "date" tag means in the context of an RSS 2.0 document. So one can only infer. Hardly a specification.

And no, I'm not taking Dave's side just because he linked to me yesterday.

6 Comments

The Dublin Core specification that the dc module is built on defines the use of date as, "Typically, Date will be associated with the creation or availability of the resource." It could be better defined, but it does resolve a certain amount of potential confusion.

Mark said:

The key fact you're missing is that RSS consumers that care about dates have always needed to support Dublin Core, because it's the only way to express dates in RSS 1.0 feeds.

Dave Winer said:

It was worth it to do the little funky dance with you guys. Smart people, you're figuring it out. Excellent.

PS to Mark: I don't think you do have to support dc:date. Dates are not absolutely required to deal with an RSS feed. After all, at the item level 0.91 didn't have them at all, and a lot of feeds are in that format. If an aggregator is going to work with those feeds, it has to do something meaningful with items with no pubDates.

Brad: People weren't studying the spec, as you say you are doing now. That's what I realized when I looked at how otherwise intelligent people were mangling RSS. Not just well-formed manglings, but (for example) 0.91 feeds with guids (that's not cool, guid didn't come in until 2.0).

Mark: Prior art is everything. The rule for evolution from 0.91 to 0.92 to 2.0 is that everything new is optional and if you put a 2.0 version number on a 0.91 file you get a valid 2.0 file.

Anyway, it feels like this discussion is going somewhere. Happy about that.

Brad said:

Tim: Alright-- can you explain how to use the dc:date tag to identify both RSS 2 date elements defined for the channel element?

Mark: Did I mention RSS 1.0 at all? My hypothetical RSS 2.0 reader was built to just read RSS 2.0 files. Without consideration of namespaces, it wouldn't be able to date entries unless pubDate is present in the feed. In such a case, I would consider that a failing in the RSS 2.0 feed, not in the reader.

Dave: I admit I've been largely oblivious to all this until recently. I've heard bits and pieces of the discussions (I would guess a lot of webloggers fall into this category).

dc:date and dcterms:modified.

dcterms:modified uses the 8601 format like dc:date. I find that the HTTP 1.1 Last-Modified header is more useful than the dcterms:modified date, simply because it's easier to obtain even if it's not totally accurate.

jocbrut said:

thanks for the info

About

This article was published on June 21, 2003 10:43 AM.
