I’ve had a LiveJournal account for some time now, and although I post all my articles to my blog, I have a WordPress plugin that automagically synchronises the two accounts. This is ideal.
I’m also part of a community known as UKNOT. UKNOT has an RSS aggregator that grabs a bunch of blogs and presents them nice and neatly. Only problem is that since my blog’s been on there, the text in the titles has not been parsed correctly – if I have a character in the title that turns into an HTML entity, then it goes all pear-shaped.
Today I tracked down why (which will probably make a bunch of UKNOTters very happy). When my post gets sent to LJ, it has the special characters in the title. Take my last blog entry as an example:
“Therrre’s been a Murrrrrderrrrr”
Fine. Nothing wrong with that. Comes up in the RSS feed as:
<title>&#8220;Therrre&#8217;s been a Murrrrrderrrrr&#8221;</title>
For those of you who aren’t technical, that’s how those special characters are embedded within the RSS feed. They’re parsed as HTML entities and converted by your browser to something pretty. Obviously this means that if I want to put an ampersand into my document, it has to be encoded as &amp;.
Problem is this – look what LJ’s RSS feed does:
<title>&amp;#8220;Therrre&amp;#8217;s been a Murrrrrderrrrr&amp;#8221;</title>
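Yup, it’s converted the ampersands into HTML entities, so the title has been escaped twice and a feed reader that decodes it once ends up showing the raw entity codes. I’ve checked what I can – everything seems fine apart from LJ’s RSS generator. So, LJ, you’re broken. I don’t know why I expected anything different…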
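For anyone who wants to see the double escaping for themselves, here is a rough Python sketch of what I think is going on. To be clear, this is not LJ’s code (I haven’t seen it); the function names and the FEED_TITLE constant are made up for illustration, and the check at the end is only a heuristic, since a title that really does contain a literal entity would trip it too.

```python
import html

# What a well-formed feed carries inside <title>: numeric character references.
# (Illustrative constant, taken from the example title in this post.)
FEED_TITLE = "&#8220;Therrre&#8217;s been a Murrrrrderrrrr&#8221;"


def escape_text(text: str) -> str:
    """Escape &, < and > for use as XML text content."""
    return html.escape(text, quote=False)


def decode_title(raw: str) -> str:
    """Decode entities once, as a feed reader would."""
    return html.unescape(raw)


def looks_double_escaped(raw: str) -> bool:
    """Heuristic: decoding a second time still changes the text."""
    once = html.unescape(raw)
    return html.unescape(once) != once


if __name__ == "__main__":
    # Assumption: the generator escapes text that was already escaped.
    broken = escape_text(FEED_TITLE)
    print(broken)                    # every & has become &amp;
    print(decode_title(FEED_TITLE))  # the pretty curly-quoted title
    print(decode_title(broken))      # the raw entity codes, as seen on the aggregator
    print(looks_double_escaped(FEED_TITLE), looks_double_escaped(broken))
```

An aggregator could use a check like looks_double_escaped to decode a second time as a workaround, but the proper fix is for the generator to escape the title only once.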
[Sidenote: Thanks to Joel for updating the planet link to my blog so quickly. It’s all fixed on Planet now]