Wondering just how many Netscape 4 visitors this site gets, I pulled up some server stats and noticed two very strange patterns.
The first appears to be a spider, calling itself
Mozilla/4.08. It’s already suspicious, since the real Netscape 4 includes the language and OS, as in
Mozilla/4.08 [en] (Windows NT 5.0; U). Then there's the pattern: lots of hits from the same IP, all to actual pages (not a single image, style sheet, or script), plus some interesting mistakes that look like misparsed links.
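Those two tells (a truncated user agent and a pages-only request pattern) are easy to check for mechanically. Here is a rough sketch in Python, not the way I actually dug through the stats. It assumes the log has already been parsed into (ip, path, user_agent) tuples; the function names, the asset extension list, and the 20-hit threshold are all made up for illustration, and the regex is modeled only on the example string above, so it will miss legitimate variants.

```python
import re
from collections import defaultdict

# Modeled on the example above: Mozilla/4.08 [en] (Windows NT 5.0; U)
# Assumption: a genuine Netscape 4 UA has a [language] tag and a (platform) part.
NETSCAPE4_FULL = re.compile(r"^Mozilla/4\.\d+ \[[A-Za-z-]+\] \(.+\)$")

# Hypothetical list of "not a page" extensions; adjust for the site in question.
ASSET_EXTENSIONS = (".gif", ".jpg", ".jpeg", ".png", ".css", ".js", ".ico")


def looks_like_real_netscape4(user_agent):
    """True if the UA has the full shape of the example Netscape 4 string."""
    return bool(NETSCAPE4_FULL.match(user_agent))


def pages_only_clients(hits, min_hits=20):
    """Return the set of IPs with at least min_hits requests and no asset requests.

    hits is an iterable of (ip, path, user_agent) tuples.
    """
    total = defaultdict(int)
    assets = defaultdict(int)
    for ip, path, _user_agent in hits:
        total[ip] += 1
        # Ignore any query string before checking the extension.
        if path.split("?", 1)[0].lower().endswith(ASSET_EXTENSIONS):
            assets[ip] += 1
    return {ip for ip, count in total.items()
            if count >= min_hits and assets[ip] == 0}
```

A bare Mozilla/4.08 fails the first check, and an IP that fetches dozens of pages without ever asking for an image or style sheet turns up in the second. Neither test proves anything on its own (text-mode browsers skip images too), but together they narrow the list down quickly.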
The other pattern showed Netscape 4 requesting favicon.ico. The thing is, Netscape 4 doesn't know about favicons. These requests are scattered across a few visitors from various IP addresses, and they look like actual visitors: they show up, look at a page or two with images and styles, and so on. Versions range from 4.06 to 4.8, and platforms include Windows XP, Linux, BeOS, and, believe it or not, CP/M. Actually, the last set of hits admits to being
Mozilla/4.7 [en] (CP/M; 8-bit; Fake user agent). The only direct reference I can find calls it a robot, but it seems the anonymizing features in Squid use CP/M in their example fake UA.
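The favicon giveaway can be picked out the same way. Another sketch under the same assumptions: hits are (ip, path, user_agent) tuples, the function name is hypothetical, and "claims to be Netscape 4" is approximated as "starts with Mozilla/4.x followed by a bracketed language tag", which keeps MSIE's Mozilla/4.0 (compatible; ...) strings out of the results.

```python
import re

# Assumption: a UA claiming to be Netscape 4 starts like "Mozilla/4.7 [en] ...".
# MSIE's "Mozilla/4.0 (compatible; ...)" has no bracketed language tag, so it is skipped.
CLAIMS_NETSCAPE4 = re.compile(r"^Mozilla/4\.\d+ \[")


def netscape4_favicon_requests(hits):
    """Return sorted (ip, user_agent) pairs that claim Netscape 4 yet fetched favicon.ico.

    hits is an iterable of (ip, path, user_agent) tuples.
    """
    suspects = set()
    for ip, path, user_agent in hits:
        # Drop any query string, then look for favicon.ico in any directory.
        wants_favicon = path.split("?", 1)[0].lower().endswith("/favicon.ico")
        if wants_favicon and CLAIMS_NETSCAPE4.match(user_agent):
            suspects.add((ip, user_agent))
    return sorted(suspects)
```

Anything this returns is presumably a newer browser or a proxy wearing a Netscape 4 label, since the request itself gives the game away.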
So why do browsers and robots fake their identity? Sometimes it's for anonymity. You might not want to be tracked, whether for simple privacy reasons or because you're doing something illicit like harvesting email addresses. Other times it's out of necessity, when sites send different content to different browsers. Some tools mimic another browser exactly, like the User-Agent Switcher extension for Firefox, and others mimic it just enough to get by, like Opera.
Whatever reasons people (or programmers) use, the results are sometimes strange—like MSIE on Linux!