In my post on Webslices, I mentioned that the home page of my Flash site uses server-side includes instead of a static HTML file. But it doesn’t really update that often: maybe 3 or 4 times a month. Is it really worth building that file dynamically? Should I switch from SSI to something more powerful, like PHP, that will let me add headers so that repeat visitors won’t have to re-download the whole page except when it’s actually different? Or should I switch to a static file, with the same benefits but simpler? What am I actually building, anyway?

Looking through the code, I find:

Browser upgrade banners. People using old versions of Firefox (currently 1.5 or older) or Internet Explorer (currently 5.5 or older) get an “Upgrade to Firefox 2” banner instead of the thumbnail of the current issue of the comic. This is just as easily done with JavaScript—and is done with JS elsewhere on the site. (I used to make some minor adjustments for other versions of IE, but I converted them all to conditional comments a while back.)
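
For reference, the client-side version amounts to something like this sketch; the element id, link URL, and exact version cutoffs are placeholders rather than the site's actual code:

window.onload = function () {
  var ua = navigator.userAgent;
  var ff = ua.match(/Firefox\/(\d+\.\d+)/);
  var ie = ua.match(/MSIE (\d+\.\d+)/);
  var tooOld = (ff && parseFloat(ff[1]) <= 1.5) || (ie && parseFloat(ie[1]) <= 5.5);
  if (tooOld) {
    // Swap the current-issue thumbnail for an upgrade link.
    var spot = document.getElementById("issue-thumbnail"); // placeholder id
    if (spot) {
      spot.innerHTML = '<a href="http://www.mozilla.com/firefox/">Upgrade to Firefox 2</a>';
    }
  }
};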

Last-modified date in the footer, pulled from the actual file. I’ve already got a script to update this in the static files, so it’s just a matter of adding it to my general update script. A two-minute, one-time change and I’ll never notice the difference.

Latest posts from this blog. Probably better done with an iframe, or maybe using AJAX. Drawback: either method would mean an extra request from the client. On the plus side, repeat visitors would be able to re-use the rest of the page, and only download the 5-item list.
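
A rough sketch of the AJAX route, assuming the blog can emit a small pre-rendered fragment at a URL like /blog/latest.html and the home page has a container waiting for it (both names are made up):

function loadLatestPosts() {
  var xhr = new XMLHttpRequest(); // older IE would need the ActiveX fallback
  xhr.open("GET", "/blog/latest.html", true);
  xhr.onreadystatechange = function () {
    if (xhr.readyState === 4 && xhr.status === 200) {
      // Drop the 5-item list into the otherwise-static page.
      document.getElementById("latest-posts").innerHTML = xhr.responseText;
    }
  };
  xhr.send(null);
}
window.onload = loadLatestPosts;

Either way, the fragment can be cached separately from the page, so repeat visitors only re-fetch the list when it actually changes.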

Unique-per-day spamtrap addresses, hidden where harvesters might pick them up. But only a few of them still accept mail and feed it to filters. Mostly, they just waste spammers’ resources. I could easily either get rid of them or change the script to generate a new address with each update instead of each day.

So really, there isn’t much stopping me from using a static file for the most-viewed page on the site, with all the attendant savings in system resources, bandwidth, etc.
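
The conditional-request savings mentioned earlier boil down to an exchange like this one (headers trimmed, host and dates made up): on a repeat visit the browser asks whether the page has changed, and when it hasn't, the server answers with a tiny 304 instead of the full page.

GET / HTTP/1.1
Host: www.example.com
If-Modified-Since: Sun, 23 Mar 2008 18:00:00 GMT

HTTP/1.1 304 Not Modified
Last-Modified: Sun, 23 Mar 2008 18:00:00 GMT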

On the other hand, I keep contemplating switching to a database-driven system for the whole thing, which would make any changes now meaningless. But since I’ve been thinking about that since around 2000 or so, and haven’t changed it yet, that’s not exactly a blocker!

Update (March 30): I’ve made the conversion to a static file. The blog posts and browser upgrade banners are now done client-side (and run after the rest of the page is loaded), the last-modified date is part of the pre-processing script, and I just removed the daily spamtrap addresses. Now to see whether it actually improves performance.

When the first Firefox 2 beta was released, I looked into Microsummaries, a feature that lets a bookmark automatically update its title with current information from the page it points to. I concluded they were useful, but not for anything I was doing. The main application would be my Flash site, but it already had an RSS feed for updates, and a microsummary could only really include the most recent item.

Now the first IE8 beta supports Webslices. They’re similar in concept, but can include formatted data (not just plain text) and use microformat-like markup on the web page instead of a <link> element in the head.

I figured with two browsers supporting the concept, I’d give it a shot. I adapted the script I use to generate the RSS feed so that it will also take everything on the most recent day and generate a text file, which is used for the Microsummary title. For the Webslice, to start with I just marked up the “Latest Updates” section of the home page. Since I haven’t installed IE8b1 at home, I’m using Daniel Glazman’s experimental Webchunks extension for Firefox to try it out. Unfortunately the extension doesn’t seem to resolve relative links in its current state.
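
For the curious, here's roughly what the two pieces look like, trimmed down to a sketch (the file names, ids, and list items are placeholders): the Microsummary is just a <link> in the head pointing at the generated text file, while the Webslice wraps the existing section in hAtom-style class names.

<!-- In the head: the Microsummary, pointing at the generated text file -->
<link rel="microsummary" href="/updates/latest.txt">

<!-- In the body: the "Latest Updates" section marked up as a Webslice -->
<div class="hslice" id="latest-updates">
  <h2 class="entry-title">Latest Updates</h2>
  <div class="entry-content">
    <ul>
      <li><a href="/comics/example-issue.html">Example issue</a></li>
      <!-- ...the rest of the day's items... -->
    </ul>
  </div>
</div>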

The real question, of course, is whether either technology offers anything better than what feeds can do now.

I think I’ll end up going the external-feed route for the Webslice as well, since it’ll use a lot less bandwidth than having a bunch of IE installations pulling the entire home page once a day. Plus since I’m using SSI on that page, it doesn’t take advantage of conditional requests and caching, and a static file will. But that’ll have to wait. Lost is on in 2 minutes, and after getting up earlier than usual this morning, I’ll probably be going to bed right after the show.

Update: I checked in IE8, and the webslice does work as expected. A few minor differences: Webchunks pulls in external styles, like the background and colors, while IE8b1 only uses styles in the chunk itself. Interesting bit: I’m marking up list items as entries, and IE8 is actually displaying them as a bulleted list, while Webchunks is simply showing the content.

So it at least works. Maybe tonight or Sunday I’ll see if I can refine it a bit.

I’ve been reading High Performance Web Sites and started thinking about how to apply the guidelines to my own sites (not to mention stuff for work). A lot of them are things I already do: minimize external resources, use compression & cache control, etc. Others are a bit out of reach for a personal site [edit: not anymore], like using a content delivery network. It got me looking at the way I use scripts, and reminded me of a change I made about a year and a half ago.

Way back when, I put a simple app on my Flash site: a team-name generator for teams of speedsters. It randomly generated a name from two lists, and provided a button to generate another one. I originally wrote it in PHP.

The funny thing was that it was the most-hit page on the site, because people would sit there and hit the button to generate a new name half a dozen times before moving on. And because it was a server-side script, that meant not just another HTTP hit, but re-downloading the entire web page with only 2 words being different.

Eventually I realized it was much better suited to a client-side app. I rewrote the whole thing in JavaScript, using DOM functions to replace the name on the current page instead of reloading. I left the hooks to the PHP in place, so that it would still work for clients with JavaScript disabled.
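
The client-side version boils down to something like this; the word lists, element id, and function name are stand-ins rather than the real ones:

// Pick one word from each list and swap the result into the page in place,
// instead of asking the server for a whole new copy of the page.
var firstWords = ["Scarlet", "Lightning", "Turbo"];   // stand-in lists
var secondWords = ["Brigade", "Squad", "Racers"];

function pick(list) {
  return list[Math.floor(Math.random() * list.length)];
}

function newTeamName() {
  var target = document.getElementById("team-name");  // stand-in id
  while (target.firstChild) {
    target.removeChild(target.firstChild);
  }
  target.appendChild(document.createTextNode(pick(firstWords) + " " + pick(secondWords)));
}

The "generate another" button calls newTeamName() from its onclick handler, with the original PHP link left in place as the fallback for visitors without JavaScript.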

  • It was much faster — practically instantaneous, in fact.
  • It used a lot less bandwidth — 40 KB (5 KB × 8) vs. 6 KB (5 KB + 1 KB) for a typical 8-name* scenario.
  • Traffic stats more accurately reflected the page’s popularity, as it dropped from #1 to around #30–50.

* Based on a drop from 32,000 hits/month in July 2006 to 4,000 hits/month in September, with the rest of the site staying about the same, it seems people were hitting reload 7 times.

From time to time the idea is put forth that Opera (and Firefox, for that matter) needs to start dealing with bad code. There are two problems with that view:

  1. Opera already deals with quite a bit of “bad code” (but there’s always room for improvement).
  2. Just dealing with bad code isn’t enough: you have to deal with it the same way someone else does.

#2 is the tough part.

The rules for dealing with good code are, for the most part, specific. If you encounter well-formed HTML, you can be reasonably sure you know what the author meant. But there are very few rules for dealing with bad code. Trying to “deal with it” means trying to guess what the author meant, and sometimes different assumptions are equally likely.

Example:


<p><b>Here's some text</i> and here's some more.</p>

Did the author close the non-existent italics by mistake, meaning to close the bold? Or did he open bold by mistake, intending to open italics? Or is the closing italics tag left over from copy-and-paste? Depending on which assumption the browser makes, it could treat that line as if the author had written any one of these:

<p><b>Here's some text</b> and here's some more.</p>

<p><i>Here's some text</i> and here's some more.</p>

<p><b>Here's some text and here's some more.</b></p>

And that’s just a simple example. It gets wilder when you throw in issues like inline vs. block elements. A paragraph should never appear inside a tag for text formatting, like bold or italic. By all rights, starting a new paragraph (or more precisely, ending the previous one) should also revert to plain formatting. But a lot of old pages expect the formatting to continue into the next paragraph, because way back when, a P tag was a double-line break, not a container.
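
Old markup along these lines is typical of what I mean (a made-up snippet):

<!-- Markup from the double-line-break era: B is never closed, and P is used
     as a spacer rather than a container. -->
<b>Here's the first chunk of bold text.
<p>
The author expects this "second paragraph" to stay bold, too.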

Now, suppose that Browser A always makes the first assumption, and Browser B always makes the second. If someone tests their code in Browser A, and it happens to be what they want it to do, they won’t necessarily notice that their code is broken. The result: the site looks wrong in Browser B, and the page author — who thinks the page is fine, since he tested it in Browser A — blames Browser B.

Multiply that scenario by millions of pages and you have a large chunk of the web as we know it today.

So the solution isn’t just to “handle bad code.” It’s to handle that bad code in the same way that the dominant browser handles it. And since there’s no document you can look to for guidance, that means taking every possible chunk of bad code, running it through the other browser, and seeing what it does to it.
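
One crude way to see what a particular browser's error recovery does with a snippet is to let its parser clean the markup up and then read the result back out; run the same thing in two browsers and compare. A sketch, using the example from above:

var snippets = [
  "<p><b>Here's some text</i> and here's some more.</p>"
];
for (var i = 0; i < snippets.length; i++) {
  var box = document.createElement("div");
  box.innerHTML = snippets[i];   // the parser applies its error recovery here
  // The serialized result shows which assumption this browser made.
  alert(box.innerHTML);
}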

And there are a lot of ways to break code!

Even Microsoft did this back when IE was new. At the time, lots of people were writing broken code and testing it by seeing whether it looked right in Netscape. So IE had to make the same assumptions Netscape did on certain things. Once IE became established, they diverged.

Some relevant articles:

This post originally appeared on Confessions of a Web Developer, my blog at the My Opera community.