Weird 404 logging error with PHP and mod_rewrite on DreamHost

When I started the category clean-up project on my blog a while back, I decided to start monitoring 404 errors on the blog to see if I missed any incoming links that needed to be redirected. I was surprised to find that the logs showed no 404 errors at all from within the blog structure. Images, sure, but no articles, no tags, no categories. This seemed a bit hard to believe.

I tested it by deliberately hitting a non-existent page, and was dismayed to find that Apache logged the hit as 200 (OK).

Crap! A WordPress update must have broken 404 handling! How long had this been going on? I’d better manually insert a header in the 404 page!

That seemed to work, as far as Chrome’s Developer Tools and curl -I were concerned. I didn’t have time to follow up on the logs right away, so I checked back later…and the logs still showed 200 OK, not 404.
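The curl -I check is easy to script if you want to repeat it against several URLs. Here is a minimal sketch in Python using only the standard library; the function name head_status is my own placeholder, not part of WordPress or any DreamHost tooling.

```python
import urllib.error
import urllib.request


def head_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code the server sends for a HEAD request.

    urlopen follows redirects, so this is the status of the final response.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the exception carries the code.
        return err.code
```

Passing the address of a page that should not exist returns the status the browser would see. That is only half of the picture, since the access log can still disagree.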

WTF?

It turned out that, when a page was served through WordPress, Apache was sending a 404 code to the browser but logging a 200.

Probably a plugin, right?

Not so. I installed a fresh copy of WordPress on a test site and discovered something interesting: 404 codes were logged correctly when using the default /?p=123 permalink structure, but if I changed it to anything readable like /yyyy/title or even /title, the problem recurred.

A little more investigation: I skipped WordPress entirely and just hit a PHP page that served up a 404. When I hit it directly, it logged correctly. But when I used WordPress’ mod_rewrite rules to send a hit to that page, it logged a 200.

So clearly, it was something about mod_rewrite. I don’t run my own Apache server these days (my department at work is mainly a Windows shop), but I was pretty sure it didn’t work that way back when I did.

So I did some testing of different configurations at home and on my webhost. Direct hits always logged the correct status, but with a rewrite rule, here’s what I found (status logged / status sent to the browser):

  • FastCGI & CGI on DreamHost show 200/404.
  • mod_php on my home box shows 404/404.
  • mod_php on DreamHost shows… 200/404.
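A comparison like this can be automated by checking what the client received against what the access log recorded. This is a rough sketch in Python, assuming the log uses Apache's Common or Combined Log Format, where the status code follows the quoted request line. The function names, paths and sample log lines are all invented for illustration.

```python
import re

# Common/Combined Log Format: ... "METHOD /path PROTOCOL" STATUS SIZE ...
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) '
)


def logged_statuses(log_lines):
    """Map each request path to the last status Apache logged for it."""
    statuses = {}
    for line in log_lines:
        match = LOG_LINE.search(line)
        if match:
            statuses[match["path"]] = int(match["status"])
    return statuses


def find_mismatches(sent, log_lines):
    """Return {path: (logged, sent)} where the log disagrees with the client."""
    logged = logged_statuses(log_lines)
    return {
        path: (logged[path], status)
        for path, status in sent.items()
        if path in logged and logged[path] != status
    }


# Invented sample data: what the client saw, and what the log recorded.
sent = {"/no-such-post/": 404, "/test404.php": 404}
sample_log = [
    '203.0.113.5 - - [01/Jan/2024:12:00:00 +0000] '
    '"GET /no-such-post/ HTTP/1.1" 200 5120 "-" "curl/8.0"',
    '203.0.113.5 - - [01/Jan/2024:12:00:05 +0000] '
    '"GET /test404.php HTTP/1.1" 404 312 "-" "curl/8.0"',
]
print(find_mismatches(sent, sample_log))
```

With this sample data, only the rewritten URL is reported, as a (logged, sent) pair of (200, 404). The direct hit on the PHP page matches in both places and is left out.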

At this point I figured there was no point setting up a CGI or FastCGI-based PHP environment on my home box, because it was clearly something about DreamHost’s Apache configuration.

It does log correctly if you use the ErrorDocument directive to point 404 to a PHP script. But IMO that’s abusing the error handler mechanism to do something it wasn’t meant for. (Not that I haven’t done it myself, but only on older IIS servers where ISAPI Rewrite and URL Rewrite weren’t available.)

I’ve added a custom logging snippet to my WordPress 404 page. There are other ways I can capture the data, but that seemed like the least overhead for now.
