Tag Archives: useragent

Browser Sniffing Strikes Again!

As the first major web browser to reach a double-digit version, Opera has been testing out alpha releases of version 10 for months now. One of the early problems they encountered was bad browser detection scripts that only looked at the first digit of a version number and decided that Opera 10 was actually Opera 1, and therefore too old to handle modern web pages.
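To make the failure concrete, here is a minimal sketch of the kind of check that trips over a two-digit version. The function names and the sample string are made up for illustration; real detection scripts of the time were usually JavaScript and varied a lot, so treat this as the shape of the bug rather than any particular script.

```python
import re

def naive_major_version(user_agent: str) -> int:
    """Deliberately buggy: keeps only the first digit after 'Opera/'."""
    match = re.search(r"Opera/(\d)", user_agent)
    return int(match.group(1)) if match else 0

def major_version(user_agent: str) -> int:
    """Reads every digit up to the first non-digit, so 10 stays 10."""
    match = re.search(r"Opera/(\d+)", user_agent)
    return int(match.group(1)) if match else 0

# Hypothetical string of the sort an unmodified Opera 10 might send.
sample = "Opera/10.00 (Windows NT 5.1; U; en)"

naive = naive_major_version(sample)   # sees "Opera 1"
fixed = major_version(sample)         # sees "Opera 10"
```

The only difference is `\d` versus `\d+`, which is why this kind of bug went unnoticed for as long as every browser had a single-digit version.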

After extensive testing, they’ve concluded that the best way to work around this is to pretend to be Version 9.80. From now on, all versions of Opera will identify themselves as “Opera/9.80” with the real version appearing later in the user-agent string.

For example:

Opera/9.80 (Macintosh; Intel Mac OS X; U; en) Presto/2.2.15 Version/10.00
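A script that wants the real version has to look past the leading token. The sketch below assumes the layout shown in the example above: a trailing `Version/x.y` token when Opera is masking its version, and nothing but the leading `Opera/x.y` on older releases. The function name `opera_version` is hypothetical, and the older-style string in the usage lines is illustrative only.

```python
import re
from typing import Optional

def opera_version(user_agent: str) -> Optional[str]:
    """Return Opera's real version string, or None for non-Opera agents."""
    if not user_agent.startswith("Opera/"):
        return None
    # Newer releases: the true version sits in a later Version/ token.
    version_token = re.search(r"\bVersion/(\d+(?:\.\d+)*)", user_agent)
    if version_token:
        return version_token.group(1)
    # Older releases: the leading token is the real version.
    leading = re.match(r"Opera/(\d+(?:\.\d+)*)", user_agent)
    return leading.group(1) if leading else None

masked = "Opera/9.80 (Macintosh; Intel Mac OS X; U; en) Presto/2.2.15 Version/10.00"
older = "Opera/9.64 (Windows NT 5.1; U; en) Presto/2.1.1"  # illustrative

real = opera_version(masked)      # "10.00"
legacy = opera_version(older)     # "9.64"
```

Checking for `Version/` first and falling back to the prefix keeps one code path working for both old and new releases.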

Opera's approach is similar to the way all Gecko-based browsers identify themselves as Mozilla/5.0, then list the real browser name and version number later on. That makes me wonder why they didn’t just adopt that increasingly irrelevant prefix, though I suppose any scripts looking specifically for Opera versions might still have picked up Opera/10 later on in the ID.

It’ll be some time before Firefox or Safari runs into this issue, but with Internet Explorer 8 in wide release, you have to wonder…what will Microsoft do when they get to IE 10?

What’s in a User-Agent String?

Some people browse collections. I collect browsers. Mostly I just want to see what they’ll do to my web site, but I have a positively ridiculous number of web browsers installed on my Linux and Windows computers at work and at home, and I’ve installed a half-dozen extra browsers on our PowerBook.

One project I’ve worked on since my days at UCI is a script to identify a web browser. In theory this should be simple, since every browser sends its name along when it requests a page. In practice, it’s not, because there’s no standard way to describe that identity.

Actually, that’s not quite true. There is a standard, described in the specs for HTTP 1.0 and 1.1 (RFC 1945 and RFC 2068, the latter since superseded by RFC 2616), but for reasons I’ll get into later, it’s not adequate for more than the basics, and even those have been subverted. That standard says a browser should identify itself in the following format. (Strictly speaking it applies to any “user agent,” since search robots, downloaders, news readers, proxies, and other programs might also access a site.)

  • Name/version more-details

Additional details often include the operating system or platform the browser is running on, and sometimes the language.
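As a rough illustration of that format, the sketch below splits a user-agent string into `Name/version` products and the parenthesized details. It is a simplified reading, not a full implementation of the RFC grammar: it assumes details are separated by semicolons (a common convention, not something the specs require) and it does not handle nested parentheses. The names `Product` and `parse_user_agent` are mine.

```python
import re
from typing import List, NamedTuple, Optional, Tuple

class Product(NamedTuple):
    name: str
    version: Optional[str]

# Either a parenthesized comment, or a Name with an optional /version.
TOKEN = re.compile(r"\(([^)]*)\)|([^\s/()]+)(?:/(\S+))?")

def parse_user_agent(user_agent: str) -> Tuple[List[Product], List[str]]:
    """Split a user-agent string into products and comment details."""
    products: List[Product] = []
    details: List[str] = []
    for match in TOKEN.finditer(user_agent):
        comment, name, version = match.groups()
        if comment is not None:
            # Assumption: details inside a comment are ';'-separated.
            details.extend(p.strip() for p in comment.split(";") if p.strip())
        else:
            products.append(Product(name, version))
    return products, details

products, details = parse_user_agent(
    "Opera/9.80 (Macintosh; Intel Mac OS X; U; en) Presto/2.2.15 Version/10.00"
)
```

Here `products[0]` is the name and version the browser leads with, and `details` holds the platform and language hints, which is about as far as the standard format gets you.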

Now here are some examples of what browsers call themselves: